rustwasm/twiggy

Strip hashes from Rust mangled symbols?

Opened this issue · 3 comments

๐Ÿ› Bug Description

Rust symbols have hashes in them, in both the old symbol mangling format and the new one.

  • old format, demangled: wee_alloc::alloc_first_fit::h6af2b7fe0cb0a62f
  • new format, demangled: core[369c8b9e1df8da3a]::slice::sort::recurse::<Foo, <[Foo]>::sort_unstable::{closure#0}>

(Note: you can enable the new symbol format with RUSTFLAGS=-Zsymbol-mangling-version=v0 when building with cargo.)

In the old format, the hash is the only thing differentiating between different monomorphizations of the same generic function. Therefore, it might make sense to keep the hash for the old symbol format.

The new format has the type parameters mangled into the symbol's name (e.g. Foo in the symbol above). In this case, it should be fine to strip the hash all the time (but perhaps with an option to disable the stripping?).
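To make the proposal concrete, here is a rough sketch of what stripping could look like on already-demangled names. This is not twiggy's code: the function names are hypothetical, it uses only the standard library, and it assumes that a legacy hash is a trailing `::h` followed by 16 lowercase hex digits and that a v0 hash appears as `[hex]` directly after a crate name. The second assumption is a heuristic, used so that slice types such as `[f64]` are left alone.

```rust
/// True if `s` is non-empty and consists only of lowercase hex digits.
fn is_lower_hex(s: &str) -> bool {
    !s.is_empty() && s.bytes().all(|b| matches!(b, b'0'..=b'9' | b'a'..=b'f'))
}

/// Old format: drop a trailing `::h<16 hex digits>` segment, if present.
/// (Hypothetical helper; assumes the hash is always 16 digits long.)
fn strip_legacy_hash(sym: &str) -> &str {
    if let Some(idx) = sym.rfind("::h") {
        let hash = &sym[idx + 3..];
        if hash.len() == 16 && is_lower_hex(hash) {
            return &sym[..idx];
        }
    }
    sym
}

/// New (v0) format: drop every `[<hex>]` that directly follows an identifier.
/// Heuristic: brackets that follow `<`, `&`, a space, etc. are treated as
/// slice or array types and kept as they are.
fn strip_v0_hashes(sym: &str) -> String {
    let mut out = String::with_capacity(sym.len());
    let mut rest = sym;
    while let Some(open) = rest.find('[') {
        let (head, tail) = rest.split_at(open); // `tail` starts with '['
        out.push_str(head);
        let follows_ident = out.ends_with(|c: char| c.is_alphanumeric() || c == '_');
        match tail[1..].find(']') {
            Some(len) if follows_ident && is_lower_hex(&tail[1..1 + len]) => {
                // Skip '[', the hash, and ']'.
                rest = &tail[len + 2..];
            }
            _ => {
                // Not a hash: keep the bracket and continue after it.
                out.push('[');
                rest = &tail[1..];
            }
        }
    }
    out.push_str(rest);
    out
}
```

Applied to the two symbols above, this would give wee_alloc::alloc_first_fit and core::slice::sort::recurse::<Foo, <[Foo]>::sort_unstable::{closure#0}>. If I remember correctly, the rustc-demangle crate can already omit the hash when formatting with `{:#}`, which might be a simpler route for the old format.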

twiggy version: master

๐ŸŒ Test Case

fn my_generic<T: Default>() -> T { T::default() }

#[no_mangle]
pub extern fn foo() -> *mut u8 {
    my_generic::<usize> as fn() -> usize as *mut _
}

#[no_mangle]
pub extern fn bar() -> *mut u8 {
    my_generic::<i8> as fn() -> i8 as *mut _
}

👟 Steps to Reproduce

Precise steps describing how to reproduce the issue, including commands and
flags run. For example:

  • rustc --target wasm32-unknown-unknown --crate-type cdylib -C opt-level=3 -C lto=fat test_case.rs -o test_case_old_symbols.wasm
  • rustc -Zsymbol-mangling-version=v0 --target wasm32-unknown-unknown --crate-type cdylib -C opt-level=3 -C lto=fat test_case.rs -o test_case_new_symbols.wasm
  • twiggy top test_case_old_symbols.wasm
  • twiggy top test_case_new_symbols.wasm

😲 Actual Behavior

Function names have hashes

🤔 Expected Behavior

New symbols have hashes removed

(TBD exactly what to do with old symbols' hashes)

I think the hash should remain when there are two versions of the same crate.

Good point, it should probably only be stripped if it is a unique monomorphization. Even within the same crate, we can have multiple duplicate monomorphizations in different codegen units, IIUC.

I believe duplicate monomorphizations have the same symbol, they are just private to their own codegen unit.
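One way to reconcile the points in this thread would be to strip the hash only when doing so is unambiguous, and to treat identical full symbols as a single function. The following is only a sketch under those assumptions: `display_names` and `strip_legacy_hash` are hypothetical helpers rather than anything in twiggy, and only the old format is handled.

```rust
use std::collections::{HashMap, HashSet};

/// Drop a trailing `::h<16 lowercase hex digits>` segment, if present.
/// (Hypothetical helper; assumes the hash is always 16 digits long.)
fn strip_legacy_hash(sym: &str) -> &str {
    if let Some(idx) = sym.rfind("::h") {
        let hash = &sym[idx + 3..];
        let is_hex = hash.bytes().all(|b| matches!(b, b'0'..=b'9' | b'a'..=b'f'));
        if hash.len() == 16 && is_hex {
            return &sym[..idx];
        }
    }
    sym
}

/// For each demangled symbol, return the name to display: the stripped name
/// when only one distinct full symbol maps to it, the full name otherwise.
fn display_names<'a>(symbols: &[&'a str]) -> Vec<&'a str> {
    // Stripped name -> set of distinct full symbols that share it.
    // A set is used so that repeated identical symbols count only once.
    let mut variants: HashMap<&'a str, HashSet<&'a str>> = HashMap::new();
    for &sym in symbols {
        variants
            .entry(strip_legacy_hash(sym))
            .or_default()
            .insert(sym);
    }
    symbols
        .iter()
        .map(|&sym| {
            let stripped = strip_legacy_hash(sym);
            if variants[stripped].len() == 1 {
                stripped
            } else {
                sym
            }
        })
        .collect()
}
```

With this approach, two monomorphizations of my_generic from the test case would keep their hashes because they differ only in the hash, while a symbol that appears once (or several times with an identical hash) would be shown without it.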