cta-observatory/lstosa

Get run date function to find calibration files is highly inefficient

Closed this issue · 2 comments

The sequencer currently takes several minutes to run; its profile is shown below. The slowness comes from having to guess the 'date directory' in which a given run was taken, which is done by globbing over all the R0 date subdirectories. This will get worse as the number of directories grows over time.

[image: profile of a sequencer run]

Some things that can be done:

  • Implement a --check option in sequencer that only returns the processing status table, without going through the whole building of sequences. This means detaching the checking of sequences from the generation of the workflow (where the names of the calibration files are defined).
  • To avoid duplication, reuse the functions that already exist in lstchain for finding the calibration files of a run (drs4 baseline, time, etc.).
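
As a rough illustration of the first point, the sketch below shows one way a --check flag could short-circuit the expensive step. It is only a sketch: `report_status` and `build_sequences` are hypothetical placeholders, not existing lstosa functions, and it assumes the status table can be produced without resolving calibration files.

```python
import argparse

# Records which steps ran, so the short-circuit is easy to observe.
calls = []


def report_status(date):
    """Hypothetical placeholder: return the rows of the processing status table."""
    calls.append("report_status")
    return [f"status for {date}"]


def build_sequences(date):
    """Hypothetical placeholder: the slow step that resolves calibration files."""
    calls.append("build_sequences")
    return [f"sequence for {date}"]


def main(argv=None):
    parser = argparse.ArgumentParser(prog="sequencer")
    parser.add_argument("date", help="observation date, e.g. 20200115")
    parser.add_argument(
        "--check",
        action="store_true",
        help="only print the processing status table, skip building sequences",
    )
    args = parser.parse_args(argv)

    if args.check:
        # Status only: never touch the workflow generation.
        return report_status(args.date)

    # Full run: build the workflow first, then report on it.
    sequences = build_sequences(args.date)
    return sequences + report_status(args.date)
```

Calling `main(["20200115", "--check"])` would then return the status rows without ever entering `build_sequences`.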

Since there is no run database, a possible solution would be to merge all the RunSummary files into a single one, assigning the date to each run. The sequencer could then look up the 'date directory' of a run in this merged RunSummary file instead of globbing.
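
A minimal sketch of that lookup, using only the standard library, could look like the following. The file naming (`RunSummary_<YYYYMMDD>.ecsv`), the `run_id` column name and the comma delimiter are assumptions made for this example, and `build_run_date_index` and `get_run_date` are hypothetical helpers, not existing lstosa or lstchain functions.

```python
import csv
import re
from pathlib import Path

# Assumed naming for this sketch: RunSummary_<YYYYMMDD>.ecsv
SUMMARY_NAME = re.compile(r"RunSummary_(\d{8})\.ecsv$")


def build_run_date_index(summary_dir, run_column="run_id", delimiter=","):
    """Map each run number to the date (YYYYMMDD) of the summary file listing it."""
    index = {}
    for path in sorted(Path(summary_dir).glob("RunSummary_*.ecsv")):
        match = SUMMARY_NAME.search(path.name)
        if match is None:
            continue
        date = match.group(1)
        with path.open(newline="") as handle:
            # Skip '#'-prefixed metadata lines before the column header.
            rows = (line for line in handle if not line.startswith("#"))
            for row in csv.DictReader(rows, delimiter=delimiter):
                index[int(row[run_column])] = date
    return index


def get_run_date(index, run_id):
    """Return the date directory name for a run, or None if the run is unknown."""
    return index.get(run_id)
```

The index is built once, from one small file per night, so each later lookup is a dictionary access rather than a glob over every R0 date subdirectory. The merged table could also be written to disk so that it does not have to be rebuilt on every call.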

Addressed in #152