nasa/opera-sds-ops

[New Feature]: Identify jobs that failed in a specific pipeline phase and re-submit processing from that phase only

riverma opened this issue · 0 comments

Checked for duplicates

Yes - I've already checked

Alternatives considered

Yes - and alternatives don't suffice

Related problems

@LalaP mentioned that there is currently no OPS procedure to clean the system and relaunch jobs at a specific phase of the R1 processing pipeline. As a result, when a hand-shake with the DAAC fails, for example, OPS has to purge the affected granules from the entire system, re-download them, and rerun every processing phase just to recover from a failed CNM receive message for those granules.

Describe the feature request

@LalaP mentioned the need for OPS to be able to relaunch jobs at any phase of the processing pipeline, i.e. download, ingest, PGE execution, CNM notify, CNM receive.

Things needed to resolve this ticket:

  • An OPS procedure for identifying granules that need reprocessing
  • How to retrigger those granules for specific pipeline phases only (download, ingest, PGE, CNM notify, CNM receive)
  • How to first clean the system of those granules at a given phase
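
To make the request concrete, here is a minimal sketch of the identification step, assuming job records can be exported as (granule, phase, status) tuples. Everything in it is hypothetical and for illustration only: `Phase`, `JobRecord`, `first_failed_phase`, and `resubmission_plan` are not existing OPERA SDS APIs, and the status strings are assumed. It finds the earliest failed phase per granule and lists the phases that would need to be cleaned and re-run from that point onward.

```python
from dataclasses import dataclass
from enum import IntEnum


class Phase(IntEnum):
    """Pipeline phases in execution order (names taken from this ticket)."""
    DOWNLOAD = 1
    INGEST = 2
    PGE_EXECUTION = 3
    CNM_NOTIFY = 4
    CNM_RECEIVE = 5


@dataclass(frozen=True)
class JobRecord:
    """Hypothetical job record; the real system's schema may differ."""
    granule_id: str
    phase: Phase
    status: str  # assumed values: "completed" or "failed"


def first_failed_phase(records):
    """Map each granule to the earliest phase that has a failed job."""
    failed = {}
    for rec in records:
        if rec.status != "failed":
            continue
        current = failed.get(rec.granule_id)
        if current is None or rec.phase < current:
            failed[rec.granule_id] = rec.phase
    return failed


def resubmission_plan(records):
    """For each failed granule, list the phases to clean and re-run,
    starting at the failed phase and skipping everything before it."""
    return {
        granule: [p for p in Phase if p >= phase]
        for granule, phase in first_failed_phase(records).items()
    }


# Illustrative data only; granule IDs are made up.
records = [
    JobRecord("G1", Phase.DOWNLOAD, "completed"),
    JobRecord("G1", Phase.CNM_RECEIVE, "failed"),
    JobRecord("G2", Phase.DOWNLOAD, "completed"),
    JobRecord("G3", Phase.INGEST, "failed"),
]
plan = resubmission_plan(records)
```

With the sample data above, `G1` would only need its CNM receive phase re-run, `G3` would restart at ingest, and `G2` would not appear in the plan at all. The actual cleanup and job submission calls are deliberately left out, since they depend on the OPS tooling this ticket asks for.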