"copy ctor" benchmark doesn't scale linearly
camel-cdr opened this issue · 0 comments
I was comparing my implementation against @ktprime's hash_map8, which uses almost the same copy implementation as mine (malloc + memcpy), but for some reason mine was more than 2x slower.
`hash_map8` allocated `32780176` bytes, while my implementation allocated `35651601` bytes, so there shouldn't be a 2x performance difference. (It's even worse when run on godbolt: https://godbolt.org/z/oqe4rY3qq)
After a bunch of testing and help from a friend, I figured out that the malloc mmap threshold (`DEFAULT_MMAP_THRESHOLD_MAX`) happens to be `33554432` (32 MiB), which falls right between the two sizes.
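To make the arithmetic concrete, here is a minimal sketch that checks on which side of the threshold each allocation lands. The helper name `exceeds_mmap_threshold_max` is made up for illustration, the constant is just the value quoted above (it differs on 32-bit builds), and the `>=` comparison is my reading of glibc's behaviour, which also adds a small header to the request first.

```c
#include <stdbool.h>
#include <stddef.h>

/* Value of DEFAULT_MMAP_THRESHOLD_MAX quoted in this issue (32 MiB).
 * Assumption: 64-bit glibc; hard-coded here because glibc does not
 * export the macro in a public header. */
#define MMAP_THRESHOLD_MAX_BYTES ((size_t)33554432)

/* Hypothetical helper: true if a request of this size can never be
 * served from the heap, because it is at or above the largest value
 * the dynamic mmap threshold can reach. */
static bool exceeds_mmap_threshold_max(size_t bytes)
{
    return bytes >= MMAP_THRESHOLD_MAX_BYTES;
}
```

With the two sizes from above, `exceeds_mmap_threshold_max(32780176)` is false and `exceeds_mmap_threshold_max(35651601)` is true, so the two maps can end up on different allocation paths even though they differ by less than 10% in size.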
I don't think it's a good idea to let this influence the benchmark that much, as the constants involved are arbitrary.
I'm not certain what the best way to fix this is, so I haven't written a PR yet.
We could bump the size of the elements to `DEFAULT_MMAP_THRESHOLD_MAX`, so every map is treated the same, or we could lower the threshold with `mallopt`.
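A rough sketch of the `mallopt` option, assuming glibc (`mallopt` and `M_MMAP_THRESHOLD` are glibc-specific and come from `<malloc.h>`). The wrapper name `pin_mmap_threshold` and the 1 MiB value in the usage note are placeholders I picked for illustration, not something the benchmark currently does.

```c
#include <malloc.h>   /* glibc-specific: mallopt, M_MMAP_THRESHOLD */
#include <stdbool.h>

/* Hypothetical helper: fix the mmap threshold at `bytes`.
 * Per the glibc documentation, setting M_MMAP_THRESHOLD explicitly
 * also turns off the dynamic adjustment of the threshold.
 * mallopt returns 1 on success and 0 on error. */
static bool pin_mmap_threshold(int bytes)
{
    return mallopt(M_MMAP_THRESHOLD, bytes) == 1;
}
```

Calling something like `pin_mmap_threshold(1 << 20)` once at the start of the benchmark, before any map is created, should put both the `32780176` and the `35651601` byte allocations on the mmap path. Whether that is the fairer baseline compared to bumping the sizes is still an open question.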