DF.validate() performs some basic checks but does not validate everything that Table Schema makes possible. In particular, it does not validate primary keys, and we have noted that this causes other, currently untraced bugs (e.g. loading from a package with invalid primary keys and then dumping it again produces an incomplete package).
We need to explore one of:
The problem with adopting Frictionless is that, AFAIK, it can't be adopted incrementally: the validation is built into the Resource class, and I can't tell just from reading the code where that leads (whether and how it complicates our code when we use different libraries to manage the Frictionless Data specs). It also keeps state in memory (the values already seen for primary keys and foreign keys), and I guess, based on other patterns in Dataflows, we would want to store that data outside the running Python process (e.g. using https://github.com/akariv/kvfile).
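
To make the "seen data" concern concrete, here is a rough sketch of what a primary key uniqueness check could look like if it were written as a plain row generator. This is not how DF.validate() or Frictionless implements it; the names check_primary_key and DuplicatePrimaryKeyError and the seen parameter are all hypothetical. The point is only that the store of seen keys is passed in, so the in-memory set used by default could in principle be swapped for something disk-backed.

```python
from typing import Any, Dict, Iterable, Iterator, List, MutableSet, Optional, Tuple


class DuplicatePrimaryKeyError(ValueError):
    """Hypothetical error raised when two rows share a primary key value."""


def check_primary_key(
    rows: Iterable[Dict[str, Any]],
    primary_key: List[str],
    seen: Optional[MutableSet[Tuple[Any, ...]]] = None,
) -> Iterator[Dict[str, Any]]:
    """Yield rows unchanged, raising on the first repeated primary key.

    `seen` is the store of keys encountered so far. It defaults to an
    in-memory set (an assumption made for this sketch); any object that
    supports `in` and `.add()` could be passed instead.
    """
    if seen is None:
        seen = set()
    for index, row in enumerate(rows):
        # Build a tuple so that composite (multi-field) keys are supported.
        key = tuple(row[field] for field in primary_key)
        if key in seen:
            raise DuplicatePrimaryKeyError(
                f"row {index}: duplicate primary key {key!r}"
            )
        seen.add(key)
        yield row
```

Because the check is a generator, rows are still streamed one at a time; only the keys accumulate. With the default set that accumulation lives in the Python process, which is exactly the part we would probably want to move elsewhere for large resources.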