False positive with hashing
nickbabcock opened this issue · 5 comments
no-panic is causing a panic with:
cargo run --release < Cargo.toml
In a simple project that wraps the highway hash crate:
Cargo.toml:
[package]
name = "tmp"
version = "0.1.0"
edition = "2021"
[dependencies]
no-panic = "0.1"
highway = "1.1"
[profile.release]
lto = "thin" # we call to non-generic non-inline
# functions from a different crate
main.rs:
use std::io::Read;
use highway::{PortableHash, HighwayHash};
use no_panic::no_panic;
#[no_panic]
fn hash_data(mut hasher: PortableHash, data: &[u8]) -> u64 {
hasher.append(data);
1 // hasher.finalize64()
}
fn main() {
let stdin = std::io::stdin();
let mut data = Vec::new();
stdin.lock().read_to_end(&mut data).unwrap();
let hasher = PortableHash::default();
println!("{}", hash_data(hasher, &data));
}
Why do I think this doesn't panic?
- Tracing through
cargo asm
output, I did not see any code related to panics emitted (eg: noslice_end_index_len_fail
/ unwraps, etc):git clone https://github.com/nickbabcock/highway-rs.git cd highway-rs cargo asm --lib highway::portable::PortableHash::append cargo asm --lib highway::portable::PortableHash::update 0
- There are are some references to memcpy, but afaik those don't panic and simplifying the underlying code to remove memcpy is still causing the false positive.
- The rust compiler determined that the highwayhash wasm payload can omit a memory allocator, which afaik the stack unwinding panic handler would otherwise require.
- Followed instructions to run on release, use linker with thin LTO, and not use panic abort.
I know your viewpoint that there can't be false positives, so I'm scratching my head a bit about where the panic is thought to be coming from and why I think there is a discrepancy.
Usually you can help rustc's static analysis by adding #[no_panic] to PortableHash::append and everything called from it. Or if it doesn't fix, this would at least point you to exactly which leaf function contains the panic codepath (possibly in the standard library).
Thanks for the direction, I think I may need a tad more advice. I annotated the library with no_panic
and removing any line from the following diff will cause no_panic
to fail.
diff --git a/src/portable.rs b/src/portable.rs
index 9a99b50..57a7174 100644
--- a/src/portable.rs
+++ b/src/portable.rs
@@ -153,6 +153,7 @@ impl PortableHash {
self.update(permuted);
}
+ #[no_panic::no_panic]
fn update(&mut self, lanes: [u64; 4]) {
for (i, lane) in lanes.iter().enumerate() {
self.v1[i] = self.v1[i].wrapping_add(*lane);
@@ -180,6 +181,7 @@ impl PortableHash {
PortableHash::zipper_merge_and_add(self.v0[3], self.v0[2], &mut self.v1, 3, 2);
}
+ #[inline]
fn zipper_merge_and_add(v1: u64, v0: u64, lane: &mut [u64; 4], add1: usize, add0: usize) {
lane[add0] = lane[add0].wrapping_add(
(((v0 & 0xff00_0000) | (v1 & 0x00ff_0000_0000)) >> 24)
@@ -200,6 +202,7 @@ impl PortableHash {
);
}
+ #[no_panic::no_panic]
fn data_to_lanes(d: &[u8]) -> [u64; 4] {
let mut result = [0u64; 4];
for (x, dest) in d.chunks_exact(8).zip(result.iter_mut()) {
@@ -262,6 +265,7 @@ impl PortableHash {
self.update(PortableHash::data_to_lanes(&packet));
}
+ #[no_panic::no_panic]
fn append(&mut self, data: &[u8]) {
if self.buffer.is_empty() {
let mut chunks = data.chunks_exact(PACKET_SIZE);
The call stack is append -> update -> zipper_merge_and_add
The following diff will cause no_panic to pass, so I'm a bit confused:
cargo asm
shows thatzipper_merge_and_add
already gets inlined intoupdate
, so I'm unsure why#[inline]
is required forno_panic
to pass.- Is it normally required to annotate all functions in the callstack with
no_panic
? data_to_lanes
can also be annotated with#[inline]
to pass, but it requires either#[no_panic]
or#[inline]
(which is a bit odd as it's simple enough for the compiler to inline anyways).
Does this give you some insight? Something seems a tad fishy, but I can't pinpoint it yet (or maybe I'm misunderstanding how to use no_panic
).
The mystery deepens. It looks like the mere inclusion of code that uses AVX2 or SSE 4.1 that implements a trait also implemented by PortableHash
is enough to cause no_panic to fail even if the code is never invoked or even referenced.
I'm at a loss how commenting out unused and unreferenced code can cause no_panic to pass.
// this is the trait implementation that was commented out.
// impl HighwayHash for AvxHash {
// #[inline]
// fn append(&mut self, data: &[u8]) {
// unsafe {
// self.append(data);
// }
// }
// #[inline]
// fn finalize64(mut self) -> u64 {
// unsafe { Self::finalize64(&mut self) }
// }
// #[inline]
// fn finalize128(mut self) -> [u64; 2] {
// unsafe { Self::finalize128(&mut self) }
// }
// #[inline]
// fn finalize256(mut self) -> [u64; 4] {
// unsafe { Self::finalize256(&mut self) }
// }
// }
EDIT: after a few more hours debugging this, I'm not closer to a resolution. I feel like this related to platform intrinsics that are (again) unused and unrefereced in the functions marked as #[no_panic]
-- and it's somehow related to traits 😕
The solution is that I needed to compile with codegen-units = 1
in addition to lto = "thin"
.
Let me know how you want to proceed with this issue, close it or have this mentioned in the docs.
EDIT: to clarify this seems only for no_panic to succeed, this doesn't affect whether a panic actually emitted in the resulting artifact.
EDIT2: And sometimes fat LTO is required too
Good find. I would welcome a PR to mention codegen-units=1 and fat LTO in the readme.