Try re-using threadlocal arrays in ArrayHitCounter
alexklibisz opened this issue · 2 comments
Background
The ArrayHitCounter class gets reinstantiated for every segment in every LSH query. It involves allocating an array of shorts of size equivalent to the number of docs in the segment. Maybe using a "pool" of threadlocal arrays would minimize time spent allocating the new arrays?
Deliverables
- Modify ArrayHitCounter to take its arrays from a ThreadLocal "pool" of arrays
- Benchmark
- Thoroughly test, incl. some form of parallelized testing
Related Issues
No response
I took a first pass at this here: #600, latest commit at time of writing is: 47cede7
The implementation is relatively simple:
I verified (with some logging) that it is actually clearing and reusing the arrays.
Unfortunately, it has not made much of an impact on the Fashion Mnist benchmark. The raw benchmark results are all close enough that it's not obvious anything actually changed:
Perhaps any time saved is spent clearing the arrays?
GC time and memory use are also about the same before and after: #600 (comment)
Closing, see previous comment