nasa/opera-sds-pcm

[Bug]: DISP-S1 reprocessing will submit for download any set of granules, even if the set doesn't contain all bursts

Checked for duplicates

Yes - I've already checked

Describe the bug

The DISP-S1 reprocessing trigger logic currently makes a naive assumption: any reprocessing trigger that yields at least one burst for a frame-sensing_datetime key is a candidate for processing. That assumption was reasonable during the early development period, when we didn't know that some sensing times wouldn't have a complete set of bursts. Now we know that the logic needs more nuance: the user could initiate reprocessing for a frame-sensing_datetime that does not have all bursts.

In that scenario the triggering logic should drop all of the CSLC granules which do not form the full burst set for that frame-sensing_datetime key.
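
The filtering described above could look roughly like the following. This is a minimal sketch, not the PCM implementation: `burst_id_of`, `keep_complete_burst_sets`, the `(frame_id, day_index)` key shape, and the `expected_bursts_by_frame` mapping are all hypothetical names introduced for illustration, and the burst IDs in the example are made up rather than taken from a real frame definition. The only thing taken from the issue is the granule ID layout seen in the query results, where the burst ID (for example `T042-088905-IW3`) follows the `OPERA_L2_CSLC-S1_` prefix.

```python
import re

# Burst ID as it appears inside a CSLC granule ID, e.g. "T042-088905-IW3".
BURST_ID_RE = re.compile(r"_(T\d{3}-\d{6}-IW[1-3])_")


def burst_id_of(granule_id):
    """Extract the burst ID from a CSLC granule ID (hypothetical helper)."""
    match = BURST_ID_RE.search(granule_id)
    if match is None:
        raise ValueError(f"no burst ID found in {granule_id!r}")
    return match.group(1)


def keep_complete_burst_sets(granules_by_key, expected_bursts_by_frame):
    """Keep only keys whose granules cover every expected burst of the frame.

    granules_by_key: {(frame_id, day_index): [granule_id, ...]}  (assumed shape)
    expected_bursts_by_frame: {frame_id: set of burst IDs}       (assumed shape)
    """
    complete = {}
    for (frame_id, day_index), granule_ids in granules_by_key.items():
        found = {burst_id_of(g) for g in granule_ids}
        # Subset test: every expected burst must be present among the granules.
        if expected_bursts_by_frame[frame_id] <= found:
            complete[(frame_id, day_index)] = granule_ids
    return complete


# Made-up example: frame 1 expects two bursts; day 10 has both, day 22 has one.
expected = {1: {"T001-000001-IW1", "T001-000001-IW2"}}
granules = {
    (1, 10): [
        "OPERA_L2_CSLC-S1_T001-000001-IW1_20200101T000000Z_20240101T000000Z_S1A_VV_v1.1",
        "OPERA_L2_CSLC-S1_T001-000001-IW2_20200101T000001Z_20240101T000000Z_S1A_VV_v1.1",
    ],
    (1, 22): [
        "OPERA_L2_CSLC-S1_T001-000001-IW1_20200113T000000Z_20240101T000000Z_S1A_VV_v1.1",
    ],
}
kept = keep_complete_burst_sets(granules, expected)
```

In this example only the `(1, 10)` entry survives, so no download job would be submitted for the incomplete `(1, 22)` set.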

What did you expect?

n/t

Reproducible steps

python ~/mozart/ops/opera-pcm/data_subscriber/daac_data_subscriber.py query -c OPERA_L2_CSLC-S1_V1 --chunk-size=1 --k=1 --m=1 --job-queue=opera-job_worker-cslc_data_download --processing-mode=reprocessing --start-date=2019-11-25T14:06:51Z --end-date=2019-12-01T14:07:10Z --frame-id=11113 --use-temporal
...
[2024-10-01 22:05:24,872: INFO/async_query_cmr] QUERY RESULTS: Found 12 granules
[2024-10-01 22:05:24,872: INFO/async_query_cmr] QUERY RESULTS 1 to 12 of 12: [('OPERA_L2_CSLC-S1_T042-088905-IW3_20191129T140653Z_20240712T014909Z_S1B_VV_v1.1', 'revision 1'), ('OPERA_L2_CSLC-S1_T042-088905-IW2_20191129T140652Z_20240712T014909Z_S1B_VV_v1.1', 'revision 1'), ('OPERA_L2_CSLC-S1_T042-088905-IW1_20191129T140651Z_20240712T014909Z_S1B_VV_v1.1', 'revision 1'), ('OPERA_L2_CSLC-S1_T042-088904-IW2_20191129T140650Z_20240712T014909Z_S1B_VV_v1.1', 'revision 1'), ('OPERA_L2_CSLC-S1_T042-088904-IW3_20191129T140650Z_20240712T014909Z_S1B_VV_v1.1', 'revision 1'), ('OPERA_L2_CSLC-S1_T042-088904-IW1_20191129T140649Z_20240712T014909Z_S1B_VV_v1.1', 'revision 1'), ('OPERA_L2_CSLC-S1_T042-088903-IW3_20191129T140648Z_20240712T014909Z_S1B_VV_v1.1', 'revision 1'), ('OPERA_L2_CSLC-S1_T042-088903-IW2_20191129T140647Z_20240712T014909Z_S1B_VV_v1.1', 'revision 1'), ('OPERA_L2_CSLC-S1_T042-088903-IW1_20191129T140646Z_20240712T014909Z_S1B_VV_v1.1', 'revision 1'), ('OPERA_L2_CSLC-S1_T042-088902-IW3_20191129T140645Z_20240712T014909Z_S1B_VV_v1.1', 'revision 1'), ('OPERA_L2_CSLC-S1_T042-088902-IW2_20191129T140644Z_20240712T014909Z_S1B_VV_v1.1', 'revision 1'), ('OPERA_L2_CSLC-S1_T042-088902-IW1_20191129T140643Z_20240712T014909Z_S1B_VV_v1.1', 'revision 1')]
...
[2024-10-01 22:05:25,022: INFO/submit_download_job_submissions_tasks] chunk_batch_ids=['f11113_a1206']
[2024-10-01 22:05:25,022: INFO/submit_download_job_submissions_tasks] payload_hash=None
[2024-10-01 22:05:25,022: INFO/create_download_job_params] download_job_params=[{'name': 'batch_ids', 'value': '--batch-ids f11113_a1206', 'from': 'value'}, {'name': 'smoke_run', 'value': '', 'from': 'value'}, {'name': 'dry_run', 'value': '', 'from': 'value'}, {'name': 'endpoint', 'value': '--endpoint=OPS', 'from': 'value'}, {'name': 'start_datetime', 'value': '--start-date=2019-11-25T14:06:51Z', 'from': 'value'}, {'name': 'end_datetime', 'value': '--end-date=2019-12-01T14:07:10Z', 'from': 'value'}, {'name': 'use_temporal', 'value': '--use-temporal', 'from': 'value'}, {'name': 'chunk_size', 'value': '--chunk-size=1', 'from': 'value'}, {'name': 'transfer_protocol', 'value': '--transfer-protocol=auto', 'from': 'value'}, {'name': 'proc_mode', 'value': '--processing-mode=reprocessing', 'from': 'value'}]
Traceback (most recent call last):
  File "/export/home/hysdsops/mozart/ops/opera-pcm/data_subscriber/daac_data_subscriber.py", line 319, in <module>
    main()
  File "/export/home/hysdsops/mozart/ops/opera-pcm/util/exec_util.py", line 35, in wrapper
    status = func(*args, **kwargs)
  File "/export/home/hysdsops/mozart/ops/opera-pcm/data_subscriber/daac_data_subscriber.py", line 56, in main
    run(sys.argv)
  File "/export/home/hysdsops/mozart/ops/opera-pcm/data_subscriber/daac_data_subscriber.py", line 99, in run
    results["query"] = run_query(args, token, es_conn, cmr, job_id, settings)
  File "/export/home/hysdsops/mozart/ops/opera-pcm/data_subscriber/daac_data_subscriber.py", line 137, in run_query
    return cmr_query.run_query(args, token, es_conn, cmr, job_id, settings)
  File "/export/home/hysdsops/mozart/ops/opera-pcm/data_subscriber/query.py", line 111, in run_query
    job_submission_tasks = self.download_job_submission_handler(download_granules, query_timerange)
  File "/export/home/hysdsops/mozart/ops/opera-pcm/data_subscriber/query.py", line 230, in download_job_submission_handler
    job_submission_tasks = self.submit_download_job_submissions_tasks(batch_id_to_urls_map, query_timerange)
  File "/export/home/hysdsops/mozart/ops/opera-pcm/data_subscriber/query.py", line 278, in submit_download_job_submissions_tasks
    if not cslc_dependency.compressed_cslc_satisfied(frame_id, acq_indices[0], self.es_conn.es_util):
  File "/export/home/hysdsops/mozart/ops/opera-pcm/data_subscriber/cslc_utils.py", line 350, in compressed_cslc_satisfied
    if self.get_dependent_compressed_cslcs(frame_id, day_index, eu) == False:
  File "/export/home/hysdsops/mozart/ops/opera-pcm/data_subscriber/cslc_utils.py", line 360, in get_dependent_compressed_cslcs
    prev_day_indices = self.get_prev_day_indices(day_index, frame_id)
  File "/export/home/hysdsops/mozart/ops/opera-pcm/data_subscriber/cslc_utils.py", line 259, in get_prev_day_indices
    list_index = frame.sensing_datetime_days_index.index(day_index)
ValueError: 1206 is not in list
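
The traceback ends in `list.index` raising `ValueError` because day index 1206 is not in the frame's `sensing_datetime_days_index` list. The sketch below reproduces that failure mode on a plain list and shows one way a caller could detect it instead of crashing. `find_day_position` is a hypothetical stand-in, not the body of `get_prev_day_indices`, and the day indices are made up.

```python
def find_day_position(sensing_day_indices, day_index):
    """Return the position of day_index in the list, or None if it is absent.

    Hypothetical helper: list.index raises ValueError for a missing value,
    which is what the traceback above shows for day index 1206.
    """
    try:
        return sensing_day_indices.index(day_index)
    except ValueError:
        return None


# Made-up day indices for a frame whose complete sensing dates skip 1206.
known_days = [1182, 1194, 1218]

position_missing = find_day_position(known_days, 1206)  # None: not a known day
position_present = find_day_position(known_days, 1194)  # 1: second entry
```

A `None` result here would correspond to a frame-sensing_datetime that should be dropped before any download job is submitted.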

Environment

- Version of this software [e.g. vX.Y.Z]
- Operating System: [e.g. MacOSX with Docker Desktop vX.Y]
...