dathere/datapusher-plus

Affiliated CKAN Service Provider jobs: "DataGroomers" that periodically groom datastore data

jqnatividad opened this issue · 0 comments

"Datagroomers", as the name implies, continuously "groom" the data in the background based on certain rules/recipes.

At the moment, I envision them as CKAN service provider jobs.

Several "datagroomers" come to mind:

  • libpostal datagroomer - for normalizing addresses
  • geocoding datagroomer
    • using qsv's built-in, low-resolution geonames geocoder
    • using the user's preferred geocoding service, leveraging qsv fetch
  • auto-tagging datagroomer - for adding tags based on certain domains (e.g. clean-energy tagger, internet of water tagger, etc.)
  • related resources datagroomer - for linking related resources based on their data dictionaries
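
To make the last two ideas a bit more concrete, here is a rough Python sketch of what the "rules" could look like. Everything in it is an assumption for illustration only: the function names `auto_tag` and `related_resources`, the `DOMAIN_KEYWORDS` vocabularies, and the use of Jaccard similarity over field names are all hypothetical, and none of it touches the CKAN or qsv APIs. A real datagroomer would read the data dictionaries from the datastore and run as a scheduled job.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical domain vocabularies; a real tagger would use curated lists.
DOMAIN_KEYWORDS: Dict[str, Set[str]] = {
    "clean-energy": {"solar", "wind", "kwh", "turbine"},
    "internet-of-water": {"streamflow", "watershed", "turbidity", "aquifer"},
}


def auto_tag(field_names: List[str]) -> List[str]:
    """Return the domain tags whose keywords appear in the field names."""
    tokens: Set[str] = set()
    for name in field_names:
        # Split snake_case / kebab-case field names into lowercase tokens.
        tokens.update(name.lower().replace("-", "_").split("_"))
    return sorted(tag for tag, words in DOMAIN_KEYWORDS.items() if tokens & words)


def related_resources(
    dictionaries: Dict[str, List[str]], threshold: float = 0.5
) -> List[Tuple[str, str]]:
    """Pair resources whose field-name sets overlap enough.

    Overlap is measured with Jaccard similarity (shared fields divided by
    all distinct fields), which is just one possible heuristic.
    """
    ids = sorted(dictionaries)
    pairs: List[Tuple[str, str]] = []
    for i, first in enumerate(ids):
        for second in ids[i + 1:]:
            fields_a = set(dictionaries[first])
            fields_b = set(dictionaries[second])
            union = fields_a | fields_b
            if union and len(fields_a & fields_b) / len(union) >= threshold:
                pairs.append((first, second))
    return pairs
```

For example, `auto_tag(["site_id", "solar_kwh", "reading_date"])` would tag a resource as clean-energy, and two resources that share most of their column names would be returned as a candidate pair by `related_resources`.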