Roadmap Tracking Issue - EPIC
jqnatividad opened this issue · 0 comments
OVERALL VISION: To increase the utility and performance of the CKAN Datastore:
- by enriching resources, so that right after DP+ pushes a file, it performs many of the data-wrangling tasks that are typically done manually:
  - much of the metadata is inferred, so the Data Publisher does not have to enter it laboriously
  - descriptive statistics are computed, allowing the Data Publisher and the end-user to better understand the resource
  - location information is automatically normalized and geocoded
  - related datasets/resources are automatically inferred
  - resources are auto-tagged
- by taking advantage of PostgreSQL native features
  - also use it as a Document Database leveraging JSONB?
  - partitioning/sharding?
- by tapping into the rich PostgreSQL extensions ecosystem (in particular PostGIS, Timescale, Citus, CartoDB, Apache AGE and ZomboDB)
- give it "Data Lake"-like capabilities
- enable Datastore API users to issue performant, reliable SQL queries
- #98
- #18
- #11
- Auto-tagging
- Automatic spatial extent calculation
- Automatic processing/recognition of whitelisted common column names (e.g. latitude, longitude, status, open date, closed date, etc.)
- #53
- #47
- #27
- #9
- Auto partitioning
- #60
- Deferred datapush on initial package creation, to allow per-package Datapusher+ configuration
- #87
- #17
- Enabling record-level search
- #8
- #13
- #54
- #10
- #19
- #30
- Native PostGIS support
- Native time-series support with Timescale
- #34
- #35
- #46
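
To make two of the untracked items above more concrete (recognition of whitelisted common column names, and automatic spatial extent calculation), here is a rough Python sketch. The roadmap does not say how either would be implemented, so everything here is an assumption for discussion: the `COLUMN_WHITELIST` contents, the alias spellings, and the function names `normalize`, `recognize_columns` and `spatial_extent` are all hypothetical and are not part of DP+ or CKAN.

```python
import re

# Hypothetical whitelist: canonical role -> accepted (normalized) header spellings.
# The roles come from the examples in the roadmap item; the aliases are assumptions.
COLUMN_WHITELIST = {
    "latitude": {"latitude", "lat"},
    "longitude": {"longitude", "lon", "long", "lng"},
    "status": {"status"},
    "open_date": {"open_date", "date_opened"},
    "closed_date": {"closed_date", "date_closed"},
}


def normalize(header):
    """Lowercase a header and collapse non-alphanumeric runs into underscores."""
    return re.sub(r"[^a-z0-9]+", "_", header.strip().lower()).strip("_")


def recognize_columns(headers):
    """Map each recognized role to the first original header that matches it."""
    roles = {}
    for header in headers:
        key = normalize(header)
        for role, aliases in COLUMN_WHITELIST.items():
            if key in aliases and role not in roles:
                roles[role] = header
    return roles


def spatial_extent(rows, lat_col, lon_col):
    """Return (min_lon, min_lat, max_lon, max_lat), or None if no valid points.

    Rows with missing, non-numeric or out-of-range coordinates are skipped.
    """
    lats, lons = [], []
    for row in rows:
        try:
            lat, lon = float(row[lat_col]), float(row[lon_col])
        except (KeyError, TypeError, ValueError):
            continue
        if -90 <= lat <= 90 and -180 <= lon <= 180:
            lats.append(lat)
            lons.append(lon)
    if not lats:
        return None
    return (min(lons), min(lats), max(lons), max(lats))
```

With something like this, the roles returned by `recognize_columns` could decide whether a resource qualifies for extent calculation at all, and the resulting bounding box could be stored as resource metadata. A real implementation would more likely compute the extent inside PostgreSQL/PostGIS once the data is in the Datastore; the pure-Python version is only meant to pin down the expected behavior.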