Harvest interface incompatible with non-JSON config
pdekraker-epa opened this issue · 0 comments
I tried creating a harvester and wanted to store the config in YAML instead of JSON. My validate config function works properly: it checks the items and returns a YAML string. However, before the config is written to the database, the harvest_source_extra_validator function runs and attempts to load the config as JSON. When json.loads fails, the code deletes the config and continues without raising an error (it only leaves a message in the log). Thus, to the user it appears that the harvest source was created, but the user-supplied configuration is lost.
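To make the failure mode concrete, here is a minimal sketch of the behavior as I understand it. It is not the actual ckanext-harvest code: the function name `extra_validator_sketch`, the dict layout, and the log message are assumptions for illustration only.

```python
import json
import logging

log = logging.getLogger(__name__)


def extra_validator_sketch(data):
    """Hypothetical stand-in for the behavior described above, not the real code."""
    config = data.get("config")
    if config:
        try:
            json.loads(config)
        except ValueError:
            # The failure is only logged; no validation error reaches the user.
            log.warning("Could not parse config as JSON, dropping it")
            data = dict(data)
            del data["config"]
    return data


# A YAML config that a harvester's own validation could accept.
yaml_config = "api_version: 2\nforce_all: true\n"
json_config = '{"api_version": 2}'

yaml_result = extra_validator_sketch({"url": "http://example.com", "config": yaml_config})
json_result = extra_validator_sketch({"url": "http://example.com", "config": json_config})
```

In this sketch `yaml_result` no longer has a `config` key, while `json_result` keeps its config unchanged, which matches what I see: the source is saved but the YAML is gone.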
The specification for the config of a harvest object does not appear to require JSON; the only comment about it that I noticed is that the CKANharvester stores its config in JSON.
While exploring this issue, I made a few related observations:
- The harvester interface has an optional method, extra_schema, that does not appear to be documented.
- harvest_source_extra_validator appears to use extra_schema and add the keys it defines to the config (which leads to the issue above).
- The extra_schema fields appear to still be stored in the extras, which seems to make inserting them into the config unnecessary.
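
The observations above can be sketched roughly as follows. This is only my reading of the behavior, not the real implementation: `merge_extras_into_config`, `custom_field`, and the flat dict are hypothetical names. The point is that copying extra keys into the config requires parsing the config as JSON first, and that the same values remain available outside the config anyway.

```python
import json


def merge_extras_into_config(data, extra_keys):
    """Hypothetical sketch: copy extra_schema keys into a JSON config."""
    # This parse step is what forces the config to be JSON.
    config = json.loads(data.get("config") or "{}")
    for key in extra_keys:
        if key in data:
            config[key] = data[key]
    merged = dict(data)
    merged["config"] = json.dumps(config, sort_keys=True)
    return merged


source = {"config": '{"api_version": 2}', "custom_field": "abc"}
merged = merge_extras_into_config(source, ["custom_field"])
```

Here `custom_field` ends up both inside `merged["config"]` and as its own key, which is the duplication the last observation refers to.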