"Aggregate: last" scans the whole data set?
smurfix opened this issue · 5 comments
My incremental import jobs need to ask Akumuli for the date of the last data point in a series.
This takes a very long time; apparently, akumuli scans the whole data set to find that value.
This is obviously somewhat non-optimal, to say the least. Please optimize.
Could you please post your query?
It's rather simple:
{"aggregate": {"power": "last"}, "where": {"type": "meter", "loc": "eg", "phase": 3}}
{"aggregate": {"cpu.irq": "last"}, "where": {"typ": "irq", "sub": "27", "host": "os-dispatch"}}
The query is guaranteed to apply to only one series. The requests take anywhere between one and 300 seconds, during which akumuli is busy reading a lot of records from all its data files. (I verified that with strace.)
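For anyone who wants to reproduce the timing, here is a minimal Python sketch that builds the first query above and posts it over HTTP. The helper names `build_last_query` and `send_query` are hypothetical, and the URL `http://localhost:8181/api/query` is an assumption about a default local setup; adjust it to match your installation.

```python
import json
import urllib.request


def build_last_query(metric, tags):
    """Return the JSON body of an 'aggregate: last' query for one series."""
    return json.dumps({"aggregate": {metric: "last"}, "where": tags})


def send_query(body, url="http://localhost:8181/api/query"):
    """POST the query body and return the raw response text.

    The default URL is an assumption, not taken from this thread.
    The long timeout allows for the slow responses described above.
    """
    req = urllib.request.Request(url, data=body.encode("utf-8"), method="POST")
    with urllib.request.urlopen(req, timeout=600) as resp:
        return resp.read().decode("utf-8")


# Same query as the first example in this comment.
body = build_last_query("power", {"type": "meter", "loc": "eg", "phase": 3})
```

Calling `send_query(body)` against a running server and wrapping it in `time.monotonic()` calls should show whether the response time grows with the size of the data set.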
This is quite surprising. I'll investigate. Thank you for reporting this.
Should this fix the issue? If so, I'll test.
It should fix it, but there is a problem with one unit test. I'll need to investigate before it can be merged.