Improved crop-mask integration
gabrieltseng opened this issue · 1 comments
gabrieltseng commented
We want to improve the integration between cropharvest and crop-mask.
On the CropHarvest side this consists of:
- Renaming the `tif` files to follow a `<location>_<date>` naming convention instead of coupling them to the `labels.geojson`
- Rewriting the Engineer to handle variable-length tifs, instead of expecting 12-month inputs
- Storing tifs in a Google Cloud bucket
This has the advantage of not requiring tif files to be downloaded before updating the dataset, which should make it easier to contribute new datasets.
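As a rough illustration of the proposed naming convention, here is a minimal sketch of splitting a `<location>_<date>` file name back into its parts. The date format (`YYYY-MM-DD`), the helper name `parse_tif_name`, and the example file name are all assumptions made for illustration; none of them come from the cropharvest codebase.

```python
from datetime import date, datetime
from pathlib import Path
from typing import NamedTuple


class TifName(NamedTuple):
    """Parts recovered from a hypothetical <location>_<date>.tif file name."""

    location: str
    start_date: date


def parse_tif_name(path: str) -> TifName:
    """Split a file name of the form <location>_<date>.tif into its parts.

    Assumption: the date is the final underscore-separated token and is
    written as YYYY-MM-DD. The location itself may contain underscores.
    """
    stem = Path(path).stem
    # Split on the last underscore so locations like "togo_north" survive.
    location, sep, date_str = stem.rpartition("_")
    if not sep or not location:
        raise ValueError(f"Expected <location>_<date>, got {stem!r}")
    start_date = datetime.strptime(date_str, "%Y-%m-%d").date()
    return TifName(location=location, start_date=start_date)


parsed = parse_tif_name("togo_north_2019-01-01.tif")
print(parsed.location, parsed.start_date)
```

With names like this, a tif can be matched to a region and start date without consulting the labels file.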
cc @ivanzvonkov
ivanzvonkov commented
You probably already know this, but the `Engineer` can handle variable timesteps; it does so here: https://github.com/nasaharvest/crop-mask/blob/951b14621838d70eb95f284bf92984aaf35d4cb1/src/ETL/dataset.py#L109
I suppose the real question may be: should we re-export all the unexported tifs in the 24-month timestep style, starting from January, to make it easy to run models from September to September, for example?
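To make the idea concrete, here is a minimal sketch of how a 24-month export starting in January could be sliced into a 12-month window that begins in any month. It assumes one timestep per month and uses a plain list as a stand-in for the time axis of a tif; `select_window` is a hypothetical helper, not part of crop-mask or cropharvest.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def select_window(
    timesteps: Sequence[T], start_month: int, num_months: int = 12
) -> List[T]:
    """Return `num_months` consecutive monthly timesteps.

    Assumptions: `timesteps[0]` is January of the first exported year and
    there is exactly one timestep per month. `start_month` is 1-indexed
    (1 = January, 9 = September).
    """
    if not 1 <= start_month <= 12:
        raise ValueError("start_month must be between 1 and 12")
    start = start_month - 1
    end = start + num_months
    if end > len(timesteps):
        raise ValueError("export is too short for the requested window")
    return list(timesteps[start:end])


# Label each of the 24 timesteps as (year offset, month) for readability.
export = [(i // 12, i % 12 + 1) for i in range(24)]
window = select_window(export, start_month=9)
print(window[0], window[-1])
```

Because any start month plus 12 months fits inside a 24-month export, the same file could serve models with different season boundaries.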