ActivityWatch/aw-server-rust

List of possible query2 optimizations

johan-bjareholt opened this issue · 4 comments

Processing optimizations:

  • Have the list of events sorted by timestamp, then we could binary search the events intersecting. Doing so would significantly improve performance of transforms like filter_period_intersect and the union_event_split in #179 .

Memory optimizations:

  • Wrap Vec<Event> or whole queries in an allocation arena
  • Reduce the amount of clones in our transforms (already been improved quite a bit the past few months)
  • Add proper lifetimes on variables in query2 to avoid a clone on each variable assignment and reference to it (See #119)

Something else?

@ErikBjare After I benchmarked #183 I realized that memory copying was incredibly fast already and the improvement was still barely noticeable in an extreme benchmark scenario I made up.

I started working on adding tracing instead and got something working quickly, heres the result from a month view (it's running in debug mode though, so significanly slower than usual)

Nov 05 23:31:57.034 TRACE query::function_call::query_bucket: aw_query::functions::qfunctions: close time.busy=0.00ns time.idle=256ms
Nov 05 23:31:57.138 TRACE query::function_call::flood: aw_query::functions::qfunctions: close time.busy=0.00ns time.idle=103ms
Nov 05 23:31:57.179 TRACE query::function_call::query_bucket: aw_query::functions::qfunctions: close time.busy=0.00ns time.idle=4.33ms
Nov 05 23:31:57.181 TRACE query::function_call::flood: aw_query::functions::qfunctions: close time.busy=0.00ns time.idle=1.48ms
Nov 05 23:31:57.183 TRACE query::function_call::filter_keyvals: aw_query::functions::qfunctions: close time.busy=0.00ns time.idle=1.07ms
Nov 05 23:31:57.730 TRACE query::function_call::filter_period_intersect: aw_query::functions::qfunctions: close time.busy=0.00ns time.idle=517ms
Nov 05 23:31:58.503 TRACE query::function_call::categorize: aw_query::functions::qfunctions: close time.busy=0.00ns time.idle=677ms
Nov 05 23:31:59.007 TRACE query::function_call::merge_events_by_keys: aw_query::functions::qfunctions: close time.busy=0.00ns time.idle=400ms
Nov 05 23:31:59.069 TRACE query::function_call::sort_by_duration: aw_query::functions::qfunctions: close time.busy=0.00ns time.idle=14.3ms
Nov 05 23:31:59.121 TRACE query::function_call::merge_events_by_keys: aw_query::functions::qfunctions: close time.busy=0.00ns time.idle=39.2ms
Nov 05 23:31:59.128 TRACE query::function_call::sort_by_duration: aw_query::functions::qfunctions: close time.busy=0.00ns time.idle=397µs
Nov 05 23:31:59.485 TRACE query::function_call::merge_events_by_keys: aw_query::functions::qfunctions: close time.busy=0.00ns time.idle=317ms
Nov 05 23:31:59.531 TRACE query::function_call::sort_by_duration: aw_query::functions::qfunctions: close time.busy=0.00ns time.idle=305µs
Nov 05 23:31:59.532 TRACE query::function_call::limit_events: aw_query::functions::qfunctions: close time.busy=0.00ns time.idle=364µs
Nov 05 23:31:59.558 TRACE query::function_call::limit_events: aw_query::functions::qfunctions: close time.busy=0.00ns time.idle=20.2ms
Nov 05 23:31:59.753 TRACE query::function_call::sum_durations: aw_query::functions::qfunctions: close time.busy=0.00ns time.idle=144ms
Nov 05 23:31:59.799 TRACE query::function_call::query_bucket: aw_query::functions::qfunctions: close time.busy=0.00ns time.idle=698µs
Nov 05 23:31:59.800 TRACE query::function_call::flood: aw_query::functions::qfunctions: close time.busy=0.00ns time.idle=150µs
Nov 05 23:32:00.003 TRACE query::function_call::filter_keyvals: aw_query::functions::qfunctions: close time.busy=0.00ns time.idle=152ms
Nov 05 23:32:00.062 TRACE query::function_call::filter_period_intersect: aw_query::functions::qfunctions: close time.busy=0.00ns time.idle=10.9ms
Nov 05 23:32:00.066 TRACE query::function_call::split_url_events: aw_query::functions::qfunctions: close time.busy=0.00ns time.idle=139µs
Nov 05 23:32:00.066 TRACE query::function_call::concat: aw_query::functions::qfunctions: close time.busy=0.00ns time.idle=129µs
Nov 05 23:32:00.066 TRACE query::function_call::sort_by_timestamp: aw_query::functions::qfunctions: close time.busy=0.00ns time.idle=113µs
Nov 05 23:32:00.269 TRACE query::function_call::query_bucket: aw_query::functions::qfunctions: close time.busy=0.00ns time.idle=203ms
Nov 05 23:32:00.338 TRACE query::function_call::flood: aw_query::functions::qfunctions: close time.busy=0.00ns time.idle=68.9ms
Nov 05 23:32:00.532 TRACE query::function_call::filter_keyvals: aw_query::functions::qfunctions: close time.busy=0.00ns time.idle=133ms
Nov 05 23:32:07.393 TRACE query::function_call::filter_period_intersect: aw_query::functions::qfunctions: close time.busy=0.00ns time.idle=6.78s
Nov 05 23:32:07.740 TRACE query::function_call::split_url_events: aw_query::functions::qfunctions: close time.busy=0.00ns time.idle=261ms
Nov 05 23:32:08.030 TRACE query::function_call::concat: aw_query::functions::qfunctions: close time.busy=0.00ns time.idle=181ms
Nov 05 23:32:08.150 TRACE query::function_call::sort_by_timestamp: aw_query::functions::qfunctions: close time.busy=0.00ns time.idle=120ms
Nov 05 23:32:08.568 TRACE query::function_call::split_url_events: aw_query::functions::qfunctions: close time.busy=0.00ns time.idle=317ms
Nov 05 23:32:09.043 TRACE query::function_call::merge_events_by_keys: aw_query::functions::qfunctions: close time.busy=0.00ns time.idle=323ms
Nov 05 23:32:09.125 TRACE query::function_call::sort_by_duration: aw_query::functions::qfunctions: close time.busy=0.00ns time.idle=18.5ms
Nov 05 23:32:09.176 TRACE query::function_call::limit_events: aw_query::functions::qfunctions: close time.busy=0.00ns time.idle=24.5ms
Nov 05 23:32:09.505 TRACE query::function_call::merge_events_by_keys: aw_query::functions::qfunctions: close time.busy=0.00ns time.idle=266ms
Nov 05 23:32:09.565 TRACE query::function_call::sort_by_duration: aw_query::functions::qfunctions: close time.busy=0.00ns time.idle=2.17ms
Nov 05 23:32:09.570 TRACE query::function_call::limit_events: aw_query::functions::qfunctions: close time.busy=0.00ns time.idle=2.04ms
Nov 05 23:32:09.788 TRACE query::function_call::sum_durations: aw_query::functions::qfunctions: close time.busy=0.00ns time.idle=175ms
Nov 05 23:32:10.049 DEBUG interpret: aw_query::interpret: close time.busy=0.00ns time.idle=13.3s

There is one huge outlier here and that's one call to filter_period_intersect which takes 6.78 seconds out of 13.3 seconds total.

The previous filter_period_intersects are much faster so I assume that this is when we do an intersection between the window events and the browser events as that's when we have the most amount of events.

EDIT: Tried running the tracing with my refcounting too, the same query took 13.0 seconds instead of 13.3 seconds so it's probably not worth it considering the increased complexity.

Apparently the calculate_endtime() function is incredibly slow, I was able to do a 5x optimization by caching the endtime for the filter events in each loop. Will push a PR soon.

Same query on exact same data in the web-ui in the above example went from 13.3 seconds to 7.53 seconds where the big filter_period_intersect went from 6.78 seconds to 1.24 seconds.

I'd consider that a huge win, changes are in #184

Wrapping our event lists in allocation arenas should significantly improve memory usage, would probably completely fix the "leak-like" behavior of the allocator keeping a lot more memory after big queries than it needs to.

Exactly how to do this in a nice manner in Rust though I have to investigate a bit more.