[Bug]: DISP-S1 reprocessing will submit for download any set of granules, even if the set doesn't contain all bursts
philipjyoon commented
Checked for duplicates
Yes - I've already checked
Describe the bug
The DISP-S1 reprocessing trigger logic currently makes a naive assumption: any reprocessing trigger that yields at least one burst for a given frame-sensing_datetime key is a candidate for processing. That assumption was reasonable during the early development period, when we didn't know that some sensing times wouldn't have a complete set of bursts. Now we know that the logic needs a bit more nuance: it's possible for the user to initiate reprocessing for a frame-sensing_datetime that does not have all bursts.
In that scenario the triggering logic should drop all of the CSLC granules which do not form the full burst set for that frame-sensing_datetime key.
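The drop rule described above could be sketched roughly as follows. This is only an illustration, not the project's implementation: `parse_burst_id`, `drop_incomplete_burst_sets`, and the shapes of `granules_by_key` and `expected_bursts_by_frame` are hypothetical, and the burst ID is assumed to be the fourth underscore-separated field of the granule name, as in the query results below.

```python
def parse_burst_id(granule_id: str) -> str:
    """Extract the burst ID, e.g. 'T042-088905-IW3', from a CSLC granule name.

    Assumes the burst ID is the fourth underscore-separated field.
    """
    return granule_id.split("_")[3]


def drop_incomplete_burst_sets(granules_by_key, expected_bursts_by_frame):
    """Keep only the keys whose granules cover every expected burst.

    granules_by_key: {(frame_id, acq_index): [granule_id, ...]}  (hypothetical shape)
    expected_bursts_by_frame: {frame_id: set of burst IDs}       (hypothetical shape)
    """
    complete = {}
    for (frame_id, acq_index), granule_ids in granules_by_key.items():
        found = {parse_burst_id(g) for g in granule_ids}
        # A key qualifies only if every expected burst is present.
        if expected_bursts_by_frame[frame_id] <= found:
            complete[(frame_id, acq_index)] = granule_ids
    return complete
```

With a filter like this, a frame-sensing_datetime key that has only a partial burst set would be dropped before any download job is submitted.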
What did you expect?
n/t
Reproducible steps
python ~/mozart/ops/opera-pcm/data_subscriber/daac_data_subscriber.py query -c OPERA_L2_CSLC-S1_V1 --chunk-size=1 --k=1 --m=1 --job-queue=opera-job_worker-cslc_data_download --processing-mode=reprocessing --start-date=2019-11-25T14:06:51Z --end-date=2019-12-01T14:07:10Z --frame-id=11113 --use-temporal
...
[2024-10-01 22:05:24,872: INFO/async_query_cmr] QUERY RESULTS: Found 12 granules
[2024-10-01 22:05:24,872: INFO/async_query_cmr] QUERY RESULTS 1 to 12 of 12: [('OPERA_L2_CSLC-S1_T042-088905-IW3_20191129T140653Z_20240712T014909Z_S1B_VV_v1.1', 'revision 1'), ('OPERA_L2_CSLC-S1_T042-088905-IW2_20191129T140652Z_20240712T014909Z_S1B_VV_v1.1', 'revision 1'), ('OPERA_L2_CSLC-S1_T042-088905-IW1_20191129T140651Z_20240712T014909Z_S1B_VV_v1.1', 'revision 1'), ('OPERA_L2_CSLC-S1_T042-088904-IW2_20191129T140650Z_20240712T014909Z_S1B_VV_v1.1', 'revision 1'), ('OPERA_L2_CSLC-S1_T042-088904-IW3_20191129T140650Z_20240712T014909Z_S1B_VV_v1.1', 'revision 1'), ('OPERA_L2_CSLC-S1_T042-088904-IW1_20191129T140649Z_20240712T014909Z_S1B_VV_v1.1', 'revision 1'), ('OPERA_L2_CSLC-S1_T042-088903-IW3_20191129T140648Z_20240712T014909Z_S1B_VV_v1.1', 'revision 1'), ('OPERA_L2_CSLC-S1_T042-088903-IW2_20191129T140647Z_20240712T014909Z_S1B_VV_v1.1', 'revision 1'), ('OPERA_L2_CSLC-S1_T042-088903-IW1_20191129T140646Z_20240712T014909Z_S1B_VV_v1.1', 'revision 1'), ('OPERA_L2_CSLC-S1_T042-088902-IW3_20191129T140645Z_20240712T014909Z_S1B_VV_v1.1', 'revision 1'), ('OPERA_L2_CSLC-S1_T042-088902-IW2_20191129T140644Z_20240712T014909Z_S1B_VV_v1.1', 'revision 1'), ('OPERA_L2_CSLC-S1_T042-088902-IW1_20191129T140643Z_20240712T014909Z_S1B_VV_v1.1', 'revision 1')]
...
[2024-10-01 22:05:25,022: INFO/submit_download_job_submissions_tasks] chunk_batch_ids=['f11113_a1206']
[2024-10-01 22:05:25,022: INFO/submit_download_job_submissions_tasks] payload_hash=None
[2024-10-01 22:05:25,022: INFO/create_download_job_params] download_job_params=[{'name': 'batch_ids', 'value': '--batch-ids f11113_a1206', 'from': 'value'}, {'name': 'smoke_run', 'value': '', 'from': 'value'}, {'name': 'dry_run', 'value': '', 'from': 'value'}, {'name': 'endpoint', 'value': '--endpoint=OPS', 'from': 'value'}, {'name': 'start_datetime', 'value': '--start-date=2019-11-25T14:06:51Z', 'from': 'value'}, {'name': 'end_datetime', 'value': '--end-date=2019-12-01T14:07:10Z', 'from': 'value'}, {'name': 'use_temporal', 'value': '--use-temporal', 'from': 'value'}, {'name': 'chunk_size', 'value': '--chunk-size=1', 'from': 'value'}, {'name': 'transfer_protocol', 'value': '--transfer-protocol=auto', 'from': 'value'}, {'name': 'proc_mode', 'value': '--processing-mode=reprocessing', 'from': 'value'}]
Traceback (most recent call last):
File "/export/home/hysdsops/mozart/ops/opera-pcm/data_subscriber/daac_data_subscriber.py", line 319, in <module>
main()
File "/export/home/hysdsops/mozart/ops/opera-pcm/util/exec_util.py", line 35, in wrapper
status = func(*args, **kwargs)
File "/export/home/hysdsops/mozart/ops/opera-pcm/data_subscriber/daac_data_subscriber.py", line 56, in main
run(sys.argv)
File "/export/home/hysdsops/mozart/ops/opera-pcm/data_subscriber/daac_data_subscriber.py", line 99, in run
results["query"] = run_query(args, token, es_conn, cmr, job_id, settings)
File "/export/home/hysdsops/mozart/ops/opera-pcm/data_subscriber/daac_data_subscriber.py", line 137, in run_query
return cmr_query.run_query(args, token, es_conn, cmr, job_id, settings)
File "/export/home/hysdsops/mozart/ops/opera-pcm/data_subscriber/query.py", line 111, in run_query
job_submission_tasks = self.download_job_submission_handler(download_granules, query_timerange)
File "/export/home/hysdsops/mozart/ops/opera-pcm/data_subscriber/query.py", line 230, in download_job_submission_handler
job_submission_tasks = self.submit_download_job_submissions_tasks(batch_id_to_urls_map, query_timerange)
File "/export/home/hysdsops/mozart/ops/opera-pcm/data_subscriber/query.py", line 278, in submit_download_job_submissions_tasks
if not cslc_dependency.compressed_cslc_satisfied(frame_id, acq_indices[0], self.es_conn.es_util):
File "/export/home/hysdsops/mozart/ops/opera-pcm/data_subscriber/cslc_utils.py", line 350, in compressed_cslc_satisfied
if self.get_dependent_compressed_cslcs(frame_id, day_index, eu) == False:
File "/export/home/hysdsops/mozart/ops/opera-pcm/data_subscriber/cslc_utils.py", line 360, in get_dependent_compressed_cslcs
prev_day_indices = self.get_prev_day_indices(day_index, frame_id)
File "/export/home/hysdsops/mozart/ops/opera-pcm/data_subscriber/cslc_utils.py", line 259, in get_prev_day_indices
list_index = frame.sensing_datetime_days_index.index(day_index)
ValueError: 1206 is not in list
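The traceback above ends in `list.index` raising `ValueError` because day index 1206 is not in the frame's `sensing_datetime_days_index`. As a minimal sketch of how a lookup could report a missing index instead of crashing (the helper name `find_day_index_position` is hypothetical and this is not the project's actual fix):

```python
def find_day_index_position(sensing_datetime_days_index, day_index):
    """Return the position of day_index in the list, or None if it is absent."""
    try:
        return sensing_datetime_days_index.index(day_index)
    except ValueError:
        # list.index raises ValueError when the value is not in the list.
        return None
```

A caller could then treat `None` as "this sensing datetime is not valid for the frame" and skip the key rather than failing the whole query.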
Environment
- Version of this software [e.g. vX.Y.Z]
- Operating System: [e.g. MacOSX with Docker Desktop vX.Y]
...