pisa-engine/pisa

`topk_queue` rapidcheck tests fail on Clang 15

elshize opened this issue · 2 comments

We have some property tests implemented that check topk_queue thresholds, and they mysteriously fail on Clang 15.

It seems that recent changes to the pop_heap implementation in libc++ assume that the entire range [first, last) is a valid heap, including the last element that we are pushing. I'm trying to figure out if it's a bug, or if the standard in fact requires that.

On another note, the implementation of pop_heap in both libc++ and libstdc++ use "heapsort with bounce" approach to popping. That is, first a full sift down is performed to find the leaf, and then the heap is fixed by sifting up if necessary (https://en.wikipedia.org/wiki/Heapsort#Bottom-up_heapsort). The idea is that this takes fewer comparisons (see wiki article for details) because the value we sift down-and-up is in the leaf. But in our case this value can be anything that is larger than the top of the heap, so not sure how much we're gaining by doing this hack.

I want to actually do some benchmarks for this.

LLVM commit introducing the change: llvm/llvm-project@79d08e3

LLVM discourse thread I started to find out more: https://discourse.llvm.org/t/binary-heap-pop-push-operation-using-pop-heap/67430

Ok, so it turns out that the standard does state that the entire range must be a valid heap: http://eel.is/c++draft/pop.heap#2

The fix itself is trivial (need to do a pop before appending the new value, and then swap the old with new, and call push), but I will (a) look into whether there is a more efficient alternative, and (b) do some benchmarking.