Canadian-Geospatial-Platform/stac-to-geocore

Combine the harvester with the translation function

Closed this issue · 2 comments

The enhanced STAC to GeoCore translation requires fields from the collection level. The current harvester works only at the item level and saves the item files in an S3 bucket before the translation. I am proposing a workflow that combines the harvester and the translation without saving a copy of the item and collection files:

1: Check access to the datacube. If access is denied, exit the script; if it is granted, proceed to step 2
2: In the geocore-to-parquet S3 bucket, delete all the item files listed in lastrun.txt, which is stored in a reuse S3 bucket
3: Starting from the root API, harvest and loop through the collections. For each collection, save the needed fields as variables, then harvest and loop through its items
4: Translate each item JSON to GeoCore format
5: Rewrite lastrun.txt so that it records the translated items
6: Expose the hard-coded fields as variables
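
Steps 1, 3, 4 and 5 could be sketched roughly as below. This is only an illustration under several assumptions, not the committed implementation: the function names, the injected `fetch_json` callable, the output field names in `translate_item`, and the one-ID-per-line format of lastrun.txt are all hypothetical. The `/collections` and `/collections/{id}/items` paths follow the usual STAC API layout. Pagination and the S3 calls (step 2 and the upload in step 5) are left out.

```python
from typing import Callable, Dict, List

# Hypothetical type: any callable that takes a URL and returns parsed JSON.
FetchJson = Callable[[str], dict]


def translate_item(item: dict, collection_fields: Dict[str, object]) -> dict:
    """Step 4: merge item-level and collection-level fields.

    The output keys are placeholders, not the real GeoCore schema.
    """
    props = item.get("properties", {})
    return {
        "id": item["id"],
        "collection_id": collection_fields["id"],
        "collection_title": collection_fields.get("title"),
        "datetime": props.get("datetime"),
        "geometry": item.get("geometry"),
    }


def harvest_and_translate(root_url: str, fetch_json: FetchJson) -> List[dict]:
    """Steps 1, 3 and 4: harvest in memory, without saving source files."""
    # Step 1: if the root API cannot be reached, skip the run.
    try:
        fetch_json(root_url)
    except Exception:
        return []

    translated: List[dict] = []
    # Step 3: loop through the collections, keeping the needed fields.
    collections = fetch_json(f"{root_url}/collections").get("collections", [])
    for collection in collections:
        fields = {"id": collection["id"], "title": collection.get("title")}
        items_url = f"{root_url}/collections/{collection['id']}/items"
        # Only the first page of items is read in this sketch.
        for item in fetch_json(items_url).get("features", []):
            translated.append(translate_item(item, fields))
    return translated


def lastrun_text(records: List[dict]) -> str:
    """Step 5: new lastrun.txt content (assumed format: one item ID per line)."""
    return "\n".join(record["id"] for record in records)
```

Passing `fetch_json` in as a parameter keeps the sketch testable without network access; in practice it could be a thin wrapper around an HTTP client that returns the decoded JSON body.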

bo-lu commented

I think this is complete

Yes, this is completed. The code is committed to the main branch: 200c899