[DAPS] Migrate to base ES config, to remove repetition from config
jaklinger opened this issue · 0 comments
- compactify es config format
- add `get_es_config` to `orm_utils`, to replace `get_config` for ES
- change `setup_es` and `increment_version` logic
- find and replace `setup_es` syntax accordingly
- automatically retrieve alias, which is tied to the endpoint name
- Add `endpoint` field to all `ElasticsearchTask`s
- Add `endpoint` field to all `Sql2EsTasks`s
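As a rough illustration of the base-config idea, the sketch below assumes a hypothetical layout in which a `DEFAULT` block holds the shared settings and each endpoint block lists only its overrides. The signature of `get_es_config`, the config keys, the hosts, the index naming convention, and the rule that the alias equals the endpoint name are all assumptions made for this example, not the actual `orm_utils` API.

```python
from copy import deepcopy

# Hypothetical compact config: shared values live once under "DEFAULT",
# and each endpoint lists only its overrides. All keys/values are illustrative.
ES_CONFIG = {
    "DEFAULT": {"port": 443, "type": "_doc", "version": "v1"},
    "arxlive": {"host": "arxlive.example.com"},
    "health-scanner": {"host": "hs.example.com", "version": "v2"},
}


def get_es_config(endpoint, dataset, production=False, config=ES_CONFIG):
    """Return the merged ES settings for one endpoint/dataset pair (sketch)."""
    if endpoint == "DEFAULT" or endpoint not in config:
        raise KeyError(f"Unknown endpoint: {endpoint!r}")
    merged = deepcopy(config["DEFAULT"])
    merged.update(config[endpoint])  # endpoint values override the defaults
    env = "prod" if production else "dev"
    # Assumed index naming convention: <dataset>_<version>_<env>
    merged["index"] = f"{dataset}_{merged['version']}_{env}"
    # Assumption: the alias is derived from the endpoint name,
    # so callers never have to pass it explicitly
    merged["alias"] = endpoint
    return merged
```

With this shape, a task that carries an `endpoint` field could call `get_es_config(self.endpoint, dataset)` instead of repeating host, port and alias in every config section.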
The following close #267 but are tracked here
- re-align mapping naming syntax with new config:
.
├── datasets                    # in the future, these will be for overriding the open ontology (daps2)
│   ├── arxiv_mapping.json
│   ├── companies_mapping.json
│   ├── cordis_mapping.json
│   ├── gtr_mapping.json
│   ├── meetup_mapping.json
│   ├── nih_mapping.json
│   └── patstat_mapping.json
├── defaults                    # e.g. for new analyzers
│   ├── index.json
│   └── settings.json
└── endpoints                   # project specific stuff
    ├── arxlive
    │   └── arxiv_mapping.json
    ├── eurito
    │   ├── arxiv_mapping.json
    │   ├── companies_mapping.json
    │   └── patstat_mapping.json
    └── health-scanner
        ├── aliases.json        # formerly under "aliases/health-scanner.json"
        ├── config.yaml         # currently just to flag that the aliases should be hard
        └── nulls.json          # formerly under "field_null_mappings/health_scanner.json"
- verify that all "new" mappings are the same as the "old" ones (show the diff somewhere)
- add tests:
  - endpoints must not all define identical fields (shared fields should go under "datasets")
  - all fields must match the ontology, as before
  - all aliases must match the mappings, as before
  - all nulls must match the mappings, as before
- remove old mappings
- add documentation for the logic here (also note that in the future, the configuration will drop the version, as this field will be generated automatically from semver+hash)
- rewire `orm_utils` and relevant batchables for aliases and null mappings
- re-run all dev pipelines
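The layered directory structure and the first proposed test could be sketched roughly as follows. This assumes that a mapping is resolved by merging `defaults`, then `datasets`, then `endpoints`, with later layers winning, which the issue implies but does not spell out. The helper names `deep_merge`, `resolve_mapping` and `fields_shared_by_all_endpoints` are hypothetical, and the inputs are plain dicts rather than the real JSON files.

```python
def deep_merge(base, override):
    """Recursively merge `override` into a copy of `base`; `override` wins."""
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged


def resolve_mapping(default, dataset, endpoint):
    """Layer defaults -> dataset -> endpoint (assumed precedence order)."""
    return deep_merge(deep_merge(default, dataset), endpoint)


def fields_shared_by_all_endpoints(endpoint_fields):
    """Return the fields that are defined identically in every endpoint.

    `endpoint_fields` maps endpoint name -> {field name: field definition}.
    A non-empty result suggests those fields belong under "datasets".
    """
    per_endpoint = list(endpoint_fields.values())
    if len(per_endpoint) < 2:
        return set()  # nothing to compare against
    first, rest = per_endpoint[0], per_endpoint[1:]
    return {
        name
        for name, definition in first.items()
        if all(name in other and other[name] == definition for other in rest)
    }


# Illustrative data only; these field names are not taken from the real mappings.
default = {"settings": {"number_of_shards": 1}}
dataset = {"mappings": {"properties": {"title": {"type": "text"}}}}
endpoint = {"mappings": {"properties": {"rank": {"type": "integer"}}}}
mapping = resolve_mapping(default, dataset, endpoint)
```

A test built on `fields_shared_by_all_endpoints` would simply assert that the returned set is empty for each dataset that appears under more than one endpoint.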