Indexing with `slice::binary_search_by`'s result should not produce panicking branch
Opened this issue · 8 comments
In this code:
fn binary_search_repro(needle: &str) -> Option<&str> {
let haystack = &["a", "b"];
haystack.binary_search(&needle).ok().map(|i| haystack[i])
}the compiler should be able to use the hint from slice::binary_search_by in order to elide the length check in the index operation (haystack[i]).
Seems the knowledge from this assert_unchecked is lost somehow:
rust/library/core/src/slice/mod.rs
Line 2975 in e10aa88
Godbolt: https://rust.godbolt.org/z/8zqq8vrrv
(I should probably start searching for existing issues and then filing new ones, but...)
Might be considered a duplicate of this: #98258
Just found this codegen test:
rust/tests/codegen-llvm/binary-search-index-no-bound-check.rs
Lines 9 to 14 in e10aa88
Weirdly enough, if I modify my example's haystack to be a slice of u8s, like in this test, the panicking branch does indeed get elided. With a slice of &strs it doesn't.
not addressing the issue, just sidestepping it:
#[unsafe(no_mangle)]
pub fn binary_search_repro(needle: &str) -> Option<&str> {
let haystack = &["a", "b"];
haystack.binary_search(&needle).ok().and_then(|i| haystack.get(i)).map(|v| &**v)
}Yeah, if that has the "ideal" codegen then that sounds a bit like an optimization bug.
The pattern we end up with looks something like this: https://alive2.llvm.org/ce/z/ZGBKEp
This issue is specific to the case where the "haystack" is a small known array.
The pattern we end up with looks something like this: https://alive2.llvm.org/ce/z/ZGBKEp
This issue is specific to the case where the "haystack" is a small known array.
FWIW, this is the real-life place where I encountered this issue:
https://github.com/pulldown-cmark/pulldown-cmark/blob/f4a326d225e79412b5ecabd1c241c851e8160815/pulldown-cmark/src/entities.rs#L2153-L2158
it's not exactly a "small" array, but it is known.
But thanks to your hint, I managed to elide the panicking branch by rewriting the function as such:
pub(crate) fn get_entity(bytes: &[u8]) -> Option<&'static str> {
let entities = black_box(ENTITIES);
entities
.binary_search_by_key(&bytes, |&(key, _value)| key)
.ok()
.map(|i| entities[i].1)
}(also had to make ENTITIES a slice instead of an array)
it's not exactly a "small" array, but it is known.
Oh, I see. I guess the log2(size) needs to be "small", but that of course is also true for quite large arrays.