Implement WormSet [HPPC-188]
Closed this issue · 4 comments
WormMap was introduced in #218.
This Jira issue is to not forget to add WormSet (derived from WormMap code).
Also remind that currently the benchmark for WORM is not accurate for Set (it is internally creating a HashSet implementation). It is accurate for Map.
Issue: HPPC-188 (migrated from JIRA), created by Bruno Roustant (@bruno-roustant), resolved Jul 31 2023
Comment by Bruno Roustant (@bruno-roustant) (migrated from JIRA)
Of course I'm volunteer to provide a PR soon.
Comment by Bruno Roustant (@bruno-roustant) (migrated from JIRA)
PR added: #11
I ran the benchmarks. They confirm WormSet should be used when there are less than 2M elements, above the perf gain on contains() diminishes while the loss on add() increases.
At the price of a slower add(), perf -30% on average, then contains() is faster and stable with the load, perf +20% on average. Also, IntWormSet uses on average 10% more memory (25% more but the internal array enlarges later).
Benchmark for contains():
B006_HashSet_Contains_Random.contains
Internal array capacity: 2^20
HPPC:
Load Score
0.45 0.569 ± 0.144 s/op
0.5 0.568 ± 0.127 s/op
0.55 0.533 ± 0.024 s/op
0.6 0.554 ± 0.009 s/op
0.65 0.635 ± 0.071 s/op
0.7 0.637 ± 0.041 s/op
0.74 0.629 ± 0.058 s/op
WORM:
Load Score
0.45 0.438 ± 0.063 s/op
0.5 0.493 ± 0.228 s/op
0.55 0.458 ± 0.037 s/op
0.6 0.493 ± 0.186 s/op
0.65 0.460 ± 0.082 s/op
0.7 0.472 ± 0.093 s/op
0.74 0.486 ± 0.052 s/op
FASTUTIL:
Load Score
0.45 0.482 ± 0.136 s/op
0.5 0.484 ± 0.060 s/op
0.55 0.508 ± 0.039 s/op
0.6 0.667 ± 0.274 s/op
0.65 0.603 ± 0.174 s/op
0.7 0.633 ± 0.078 s/op
0.74 0.652 ± 0.133 s/op
KOLOBOKE:
Load Score
0.45 0.461 ± 0.043 s/op
0.5 0.517 ± 0.132 s/op
0.55 0.520 ± 0.056 s/op
0.6 0.542 ± 0.042 s/op
0.65 0.614 ± 0.075 s/op
0.7 0.647 ± 0.115 s/op
0.74 0.671 ± 0.133 s/op
Benchmark for add():
B005_HashSet_Add_Random.add
HPPC:
MB_of_keys Score
1 0.022 ± 0.006 s/op
2 0.049 ± 0.003 s/op
3 0.080 ± 0.003 s/op
4 0.125 ± 0.006 s/op
WORM:
MB_of_keys Score
1 0.029 ± 0.002 s/op
2 0.068 ± 0.004 s/op
3 0.106 ± 0.006 s/op
4 0.184 ± 0.007 s/op
FASTUTIL:
MB_of_keys Score
1 0.022 ± 0.001 s/op
2 0.051 ± 0.007 s/op
3 0.079 ± 0.006 s/op
4 0.125 ± 0.004 s/op
KOLOBOKE:
MB_of_keys Score
1 0.024 ± 0.002 s/op
2 0.053 ± 0.004 s/op
3 0.084 ± 0.003 s/op
4 0.139 ± 0.004 s/op
``
Comment by Bruno Roustant (@bruno-roustant) (migrated from JIRA)
@d
weiss could you close this issue?