Modify daily pipeline so it fills in missing data for a given time window
franTarkenton opened this issue · 1 comments
franTarkenton commented
Currently the script is set up to run on a daily basis, using the current date.
The issue with this approach is that data is not always available for the current date. Sometimes there can be a 3-4 day lag before data becomes available.
Changes:
- Instead of running for just one specific date, the script will compare the data that already exists in the object store bucket with the data that is currently available. It will then run the pipeline for consecutive days, until all the data that is currently available has been processed.
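
A rough sketch of the comparison described above, assuming both sides can be reduced to a set of dates. The names `date_range` and `missing_dates` are hypothetical and not taken from the actual script, and how the dates get listed from the bucket or the upstream source is left out.

```python
from datetime import date, timedelta


def date_range(start: date, end: date) -> list[date]:
    """Return every day from start to end, inclusive."""
    days = (end - start).days
    return [start + timedelta(days=i) for i in range(days + 1)]


def missing_dates(available: set[date], processed: set[date]) -> list[date]:
    """Dates that are available upstream but not yet in the bucket, oldest first."""
    return sorted(available - processed)


# Hypothetical example: five days are available, two are already in the bucket.
available = set(date_range(date(2023, 5, 1), date(2023, 5, 5)))
processed = {date(2023, 5, 1), date(2023, 5, 2)}

for day in missing_dates(available, processed):
    # The real pipeline would be run once per missing day here.
    print(day.isoformat())
```

Sorting the result keeps the runs in consecutive date order, so a lag of several days is worked through from the oldest missing day forward.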
franTarkenton commented
Code was reworked for the modis flow; however, that ended up being a lot of work, so the viirs flow was not reworked.
Added an action that identifies what data is missing and then passes that on to the downstream jobs. Downstream jobs are set up as matrix builds, which allows them to be processed in parallel.
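
For illustration, a minimal sketch of how a step could hand the missing dates to matrix jobs, assuming the action runs on GitHub Actions and the matrix is keyed by a `date` field. The helper names `build_matrix` and `write_output` and the output name `matrix` are assumptions, not the names used in the repository.

```python
import json
import os
from datetime import date


def build_matrix(missing: list[date]) -> str:
    """Serialize the missing dates as a JSON object with a single 'date' key."""
    return json.dumps({"date": [d.isoformat() for d in missing]})


def write_output(name: str, value: str) -> None:
    """Append name=value to the file named by GITHUB_OUTPUT, or print it locally."""
    output_path = os.environ.get("GITHUB_OUTPUT")
    if output_path:
        with open(output_path, "a", encoding="utf-8") as fh:
            fh.write(f"{name}={value}\n")
    else:
        # Fallback so the script can be tried outside a workflow run.
        print(f"{name}={value}")


# Hypothetical list of missing days found by the upstream check.
missing = [date(2023, 5, 3), date(2023, 5, 4)]
write_output("matrix", build_matrix(missing))
```

A downstream job could then read that output with `fromJSON` in its `strategy.matrix`, which gives one parallel job per missing date.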