nasa/opera-sds-pcm

[Bug]: bach-ui Production Time Summary Report generation for one day is timing out in OPS after 30 mins

Closed · 1 comment

Checked for duplicates

Yes - I've already checked

Describe the bug

https://opera-ops-mozart-fwd.jpl.nasa.gov/bach-api/reports/ProductionTimeSummaryReport?startDateTime=2023-07-17T00:00&endDateTime=2023-07-18T00:00&reportType=sdp&mime=application/zip

If you run that report in OPS right now, it runs for more than 30 minutes and then the proxy server times out. Based on previous optimization efforts, we projected that we should be able to process about 60 days' worth of data in under 30 minutes. However, at least for this particular date, a single day's worth of data is taking more than 30 minutes, so something is wrong.
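To put a number on the request duration without going through a browser, the same call can be timed from a script. This is only a sketch: `build_report_url` and `time_report` are hypothetical helper names, the parameter names are copied from the URL above, and it assumes the endpoint is reachable from where the script runs without extra authentication, which may not hold in OPS.

```python
import time
import urllib.request
from urllib.parse import urlencode

# Base URL taken from the report link in this issue.
BASE = ("https://opera-ops-mozart-fwd.jpl.nasa.gov"
        "/bach-api/reports/ProductionTimeSummaryReport")


def build_report_url(start, end, report_type="sdp",
                     mime="application/zip", base=BASE):
    """Rebuild the report URL for a given start/end (format: YYYY-MM-DDTHH:MM)."""
    query = urlencode(
        {
            "startDateTime": start,
            "endDateTime": end,
            "reportType": report_type,
            "mime": mime,
        },
        safe=":/",  # keep ':' and '/' unescaped so the URL matches the one above
    )
    return f"{base}?{query}"


def time_report(url, timeout_s=1800):
    """Fetch the report and return elapsed wall-clock time and outcome.

    Note: urlopen's timeout applies to each blocking socket operation,
    not to the request as a whole, so this is not a hard 30-minute cap.
    Assumption: no authentication is needed from the calling host.
    """
    started = time.monotonic()
    try:
        with urllib.request.urlopen(url, timeout=timeout_s) as resp:
            body = resp.read()
            return {"ok": True, "status": resp.status, "bytes": len(body),
                    "elapsed_s": time.monotonic() - started}
    except Exception as exc:  # proxy timeout, HTTP error, connection reset, ...
        return {"ok": False, "error": repr(exc),
                "elapsed_s": time.monotonic() - started}
```

Calling `time_report(build_report_url("2023-07-17T00:00", "2023-07-18T00:00"))` would then give an elapsed time that can be compared across dates.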

The ES process shows no CPU activity after the initial query; from then on, all the work is in the gunicorn process. I saw it using 100% CPU the entire time (single-threaded) but, more importantly, its memory usage continued to grow as time went on, and I saw up to 37 GB of memory being used by a single gunicorn process. This is extraordinary for just one day's worth of data.
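One way to narrow down where the memory goes would be to run the report code under Python's `tracemalloc` and list the largest allocation sites. The sketch below is not bach-api code: `profile_allocations` is a hypothetical helper, and the function passed to it stands in for whatever bach-api actually calls to build the report. `tracemalloc` only sees allocations made through Python's allocators and adds noticeable overhead, so it is better suited to a dev environment with a smaller date range.

```python
import tracemalloc


def profile_allocations(func, *args, limit=5, **kwargs):
    """Run func under tracemalloc; return (result, peak_bytes, top_sites).

    `func` is a placeholder for the report-generation entry point
    (hypothetical; the real bach-api function name is not known here).
    """
    tracemalloc.start()
    try:
        result = func(*args, **kwargs)
        # Peak traced memory since start(), in bytes.
        _, peak = tracemalloc.get_traced_memory()
        # Snapshot of allocations still alive after func returns.
        snapshot = tracemalloc.take_snapshot()
    finally:
        tracemalloc.stop()
    # Group live allocations by source line, largest first.
    top_sites = [str(stat) for stat in snapshot.statistics("lineno")[:limit]]
    return result, peak, top_sites
```

If one or two source lines account for most of the traced memory, that points at the data structure that is growing during the single-threaded phase.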

I suspect that dates like this are tripping up some edge case in the bach-api algorithm.
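If the edge case is tied to specific data within the day, requesting smaller time windows one at a time may help isolate it. The helper below is a sketch with a hypothetical name, `split_window`; it assumes the API accepts arbitrary `startDateTime`/`endDateTime` values in the same `YYYY-MM-DDTHH:MM` format as the URL above, which has not been verified. Reports over sub-windows may not add up to the full-day report if products span window boundaries.

```python
from datetime import datetime, timedelta

# Same timestamp format as the startDateTime/endDateTime query parameters.
FMT = "%Y-%m-%dT%H:%M"


def split_window(start, end, step_hours=1):
    """Split [start, end) into consecutive windows of at most step_hours."""
    if step_hours <= 0:
        raise ValueError("step_hours must be positive")
    t_end = datetime.strptime(end, FMT)
    step = timedelta(hours=step_hours)
    windows = []
    cursor = datetime.strptime(start, FMT)
    while cursor < t_end:
        nxt = min(cursor + step, t_end)  # clamp the last window to the end
        windows.append((cursor.strftime(FMT), nxt.strftime(FMT)))
        cursor = nxt
    return windows
```

Running the report for each window in turn and noting which one is slow or memory-hungry would show whether the problem is confined to part of 2023-07-17 or spread across the whole day.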

What did you expect?

n/t

Reproducible steps

1.
2.
3.
...

Environment

- Version of this software [e.g. vX.Y.Z]
- Operating System: [e.g. MacOSX with Docker Desktop vX.Y]
...