Requirements for Parameter DBs
Parameter DBs (at a minimum JobParameters and PilotParameters) need the following:
- Store key-value pairs.
- The keys should not be pre-set, and it should be possible to add new keys at any time.
- It should be possible to search through the values. Practical example: we should be able to answer which job ran on a certain worker node at time X.
- It should be possible to easily create plots in Grafana. Example: given a `{"ModelName": "some_Intel_AMD_bof"}` parameter, I want to see the current "composition" of my Grid, and the composition per site.
- The lifetime of parameters should not be the same as the lifetime of their jobs/pilots.
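To make the requirements above concrete, here is a minimal in-memory sketch in Python. It is not a proposed implementation: the `ParameterStore` class, its method names, and the keys `HostName`, `Site`, `StartTime` and `EndTime` are all hypothetical, chosen only to illustrate free-form keys, the "which job ran on this node at time X" lookup, and the composition counts a Grafana panel would plot.

```python
from collections import Counter, defaultdict


class ParameterStore:
    """Hypothetical in-memory stand-in for a parameter DB (illustration only)."""

    def __init__(self):
        # job_id -> {key: value}; keys are not fixed in advance
        self._records = defaultdict(dict)

    def set_parameters(self, job_id, **params):
        # New keys can be added to an existing job at any time
        self._records[job_id].update(params)

    def jobs_on_node(self, node, at):
        # Jobs whose (assumed) StartTime/EndTime interval covers `at` on `node`
        return sorted(
            job_id
            for job_id, p in self._records.items()
            if p.get("HostName") == node
            and "StartTime" in p
            and "EndTime" in p
            and p["StartTime"] <= at <= p["EndTime"]
        )

    def composition(self, key, group_by=None):
        # Count jobs per value of `key`, optionally split by another key
        if group_by is None:
            return Counter(p[key] for p in self._records.values() if key in p)
        grouped = defaultdict(Counter)
        for p in self._records.values():
            if key in p and group_by in p:
                grouped[p[group_by]][p[key]] += 1
        return dict(grouped)


store = ParameterStore()
store.set_parameters(1, HostName="wn01", Site="SiteA", ModelName="Intel",
                     StartTime=100, EndTime=200)
store.set_parameters(2, HostName="wn02", Site="SiteA", ModelName="AMD",
                     StartTime=150, EndTime=300)
store.set_parameters(3, HostName="wn03", Site="SiteB", ModelName="AMD",
                     StartTime=250, EndTime=400)
store.set_parameters(3, MemoryMB=2048)  # a key nobody declared up front

print(store.jobs_on_node("wn01", at=180))
print(store.composition("ModelName"))
print(store.composition("ModelName", group_by="Site"))
```

The sketch ignores the lifetime requirement entirely; in a real backend the parameters would need their own retention policy, independent of the job/pilot records.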
After playing around with a dump of the LHCb `ElasticJobParametersDB`, I've come to the conclusion that MySQL is not going to play nicely with this use case. The `count(*)` queries for making dashboards are too slow when you have many rows, due to MVCC. I also tried Postgres, and while it has a bunch of features that make it nicer, it still has the same fundamental issue.
I'm sure we could come up with something clever using triggers but it'd be non-trivial and doesn't seem worth it.
IIUC, we are keeping the current OpenSearch-based solution. If that is the case, then at least DIRACGrid/DIRAC#7292 could be evaluated.
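For reference, the "composition" dashboards from the requirements map onto OpenSearch terms aggregations. The Python sketch below only builds the request body; it does not talk to a cluster. The helper name `composition_query` and the field names `ModelName` and `Site` are assumptions, and depending on the index mapping the fields may need to be keyword-typed (for example `ModelName.keyword`) for a terms aggregation to work.

```python
import json


def composition_query(key="ModelName", group_by=None, size=50):
    """Build a search body counting documents per value of `key`.

    Hypothetical helper: field names must match the real index mapping.
    """
    by_value = {"terms": {"field": key, "size": size}}
    if group_by is None:
        aggs = {"by_value": by_value}
    else:
        # Nest the per-value counts inside one bucket per group (e.g. per site)
        aggs = {
            "by_group": {
                "terms": {"field": group_by, "size": size},
                "aggs": {"by_value": by_value},
            }
        }
    # "size": 0 asks for aggregation buckets only, no individual hits
    return {"size": 0, "aggs": aggs}


print(json.dumps(composition_query(group_by="Site"), indent=2))
```

Grafana's OpenSearch data source can produce equivalent bucket aggregations from its query editor, so a body like this is mostly useful for checking the numbers by hand.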