memory fault in emmalloc prev_region
cuviper opened this issue · 16 comments
In Fedora, we are building wasi-libc with MALLOC_IMPL=emmalloc
, because dlmalloc's CC0 is problematic (#319). However, in further testing I have run into a null pointer crash in this emmalloc port. My reproducer is:
- Build current wasi-libc with
MALLOC_IMPL=emmalloc
- Point to that build in the rust toolchain config
[target.wasm32-wasi] wasi-root = "path/to/sysroot"
, and./x build library --target wasm32-wasi
. - Start a new project,
cargo new --lib foo; cd foo
. - Build the test,
cargo +stage1 test --no-run --target wasm32-wasi
. - Run
wasmtime -- ./target/wasm32-wasi/debug/deps/foo-*.wasm --help
.
Error: failed to run main module `./target/wasm32-wasi/debug/deps/foo-80180d1f382acf95.wasm`
Caused by:
0: failed to invoke command default
1: error while executing at wasm backtrace:
0: 0x455cd - prev_region
at ~/src/wasi-libc/emmalloc/emmalloc.c:309:27
- attempt_allocate
at ~/src/wasi-libc/emmalloc/emmalloc.c:681:26
[...]
2: memory fault at wasm address 0xfffffffc in linear memory of size 0x120000
3: wasm trap: out of bounds memory access
I get similar crash running with wasmer
too. I only see this when wasi-libc is built with -O2 -DNDEBUG
(and I added -g
for backtraces), but it doesn't crash or hit any assertions without -DNDEBUG
. I also tried with (rebased) #378 and it was no better.
In that line of code, it seems clear that 0xfffffffc
must be a null pointer wrapping around by [-1]
.
Line 309 in aecd368
clang-analyzer
also finds a null pointer error on that line: report-3472e3.html.gz. That error path includes some of the initialization in claim_more_memory
that's different than the original emscripten code, but it's not the same path as what I actually hit at runtime.
Full wasmtime backtrace:
Error: failed to run main module `./target/wasm32-wasi/debug/deps/foo-80180d1f382acf95.wasm`
Caused by:
0: failed to invoke command default
1: error while executing at wasm backtrace:
0: 0x455cd - prev_region
at ~/src/wasi-libc/emmalloc/emmalloc.c:309:27
- attempt_allocate
at ~/src/wasi-libc/emmalloc/emmalloc.c:681:26
1: 0x45156 - allocate_memory
at ~/src/wasi-libc/emmalloc/emmalloc.c:800:19
- emmalloc_memalign
at ~/src/wasi-libc/emmalloc/emmalloc.c:893:15
2: 0x4573c - emmalloc_malloc
at ~/src/wasi-libc/emmalloc/emmalloc.c:915:10
- malloc
at ~/src/wasi-libc/emmalloc/emmalloc.c:920:10
3: 0x435fc - std::sys::wasi::alloc::<impl core::alloc::global::GlobalAlloc for std::alloc::System>::alloc::h30bf2b5c8e81bf65
at ~/rust/library/std/src/sys/wasi/../unix/alloc.rs:14:13
- __rdl_alloc
at ~/rust/library/std/src/alloc.rs:381:13
4: 0x24bb - <unknown>!__rust_alloc
5: 0x38755 - alloc::alloc::alloc::h7d5c328179f6fa67
at ~/rust/library/alloc/src/alloc.rs:100:9
- alloc::alloc::Global::alloc_impl::h1bd1691f491977af
at ~/rust/library/alloc/src/alloc.rs:183:73
- <alloc::alloc::Global as core::alloc::Allocator>::allocate::h12ef56f8ce7e59a1
at ~/rust/library/alloc/src/alloc.rs:243:9
- alloc::alloc::exchange_malloc::h609177d2ac860183
at ~/rust/library/alloc/src/alloc.rs:332:18
- alloc::boxed::Box<T>::new::hb82ce197ac45e458
at ~/rust/library/alloc/src/boxed.rs:217:9
- getopts::Options::usage_items::hc5b79851b290aebc
at ~/.cargo/registry/src/index.crates.io-6f17d22bba15001f/getopts-0.2.21/src/lib.rs:592:9
6: 0x38333 - getopts::Options::usage_with_format::hdda5b0b199d95f00
at ~/.cargo/registry/src/index.crates.io-6f17d22bba15001f/getopts-0.2.21/src/lib.rs:513:24
- getopts::Options::usage::h1649511b98ad5ea4
at ~/.cargo/registry/src/index.crates.io-6f17d22bba15001f/getopts-0.2.21/src/lib.rs:498:9
7: 0x10d5c - test::cli::usage::h13bd525bb03cc06d
at ~/rust/library/test/src/cli.rs:192:17
- test::cli::parse_opts::h4ef2547817c55788
at ~/rust/library/test/src/cli.rs:213:9
8: 0x33a31 - test::test_main::h9b57329eed8a804d
at ~/rust/library/test/src/lib.rs:99:26
9: 0x3475a - test::test_main_static::h8c233200492b5e32
at ~/rust/library/test/src/lib.rs:158:5
10: 0x21c2 - foo::main::h27f72609c412c5ae
at /tmp/tmp.6bCstl/tmp/foo/src/lib.rs:1:1
11: 0x1bf2 - core::ops::function::FnOnce::call_once::hfe53c2cf666d96fa
at ~/rust/library/core/src/ops/function.rs:250:5
12: 0x1d84 - std::sys_common::backtrace::__rust_begin_short_backtrace::h994502d568a74c3c
at ~/rust/library/std/src/sys_common/backtrace.rs:135:18
13: 0xaf4 - std::rt::lang_start::{{closure}}::hcd76d0dca9e3dad5
at ~/rust/library/std/src/rt.rs:166:18
14: 0x3cbc2 - core::ops::function::impls::<impl core::ops::function::FnOnce<A> for &F>::call_once::hffbdde799ca12b5f
at ~/rust/library/core/src/ops/function.rs:284:13
- std::panicking::try::do_call::h4b5d5542f1c86349
at ~/rust/library/std/src/panicking.rs:500:40
- std::panicking::try::h51bfb7e0e56693e6
at ~/rust/library/std/src/panicking.rs:464:19
- std::panic::catch_unwind::h11d480fe44ac6ca4
at ~/rust/library/std/src/panic.rs:142:14
- std::rt::lang_start_internal::{{closure}}::h6e0ffd5421ca9da0
at ~/rust/library/std/src/rt.rs:148:48
- std::panicking::try::do_call::hfa71db6696b40a2e
at ~/rust/library/std/src/panicking.rs:500:40
- std::panicking::try::h63d1edb0b6dbf2fd
at ~/rust/library/std/src/panicking.rs:464:19
- std::panic::catch_unwind::h9e006dc595210a8f
at ~/rust/library/std/src/panic.rs:142:14
- std::rt::lang_start_internal::h2c53e8a6f6be7467
at ~/rust/library/std/src/rt.rs:148:20
15: 0xa91 - std::rt::lang_start::h8dc3ccc2d1a087a6
at ~/rust/library/std/src/rt.rs:165:17
16: 0x21e6 - <unknown>!__main_void
17: 0xa0c - _start
at ~/src/wasi-libc/libc-bottom-half/crt/crt1-command.c:43:13
2: memory fault at wasm address 0xfffffffc in linear memory of size 0x120000
3: wasm trap: out of bounds memory access
clang-analyzer
[...] not the same path as what I actually hit at runtime.
Wild guess: maybe UB-related optimization nerfed the code in the clang-analyzer
reported path, leading to bad state in the backtrace I hit at runtime.
A couple of questions:
- looks like this is a "wasi-libc in Rust" scenario--the issue does not happen with dlmalloc in the same scenario? (I'm guessing it doesn't which is why we haven't seen this yet?)
- also, does this happen in "wasi-libc in C" scenarios?
I have not seen any issues like this with dlmalloc.
I haven't tried with a C program -- I imagine it would have the same issue, because Rust is just calling malloc
/realloc
/free
, but I can try to figure out the allocator pattern to reproduce it in C.
I can reproduce it with this C program:
#include <stdlib.h>
int main() {
char *p = malloc(1);
for (unsigned i = 1; i < 20; ++i) {
p = realloc(p, 1 << i);
}
}
It crashes at i = 16
, although I removed my prints to minimize it here. :)
Error: failed to run main module `crash.wasm`
Caused by:
0: failed to invoke command default
1: error while executing at wasm backtrace:
0: 0x726 - prev_region
at /home/jistone/src/wasi-libc/emmalloc/emmalloc.c:309:27
- attempt_allocate
at /home/jistone/src/wasi-libc/emmalloc/emmalloc.c:681:26
1: 0x3a2 - allocate_memory
at /home/jistone/src/wasi-libc/emmalloc/emmalloc.c:854:19
- emmalloc_memalign
at /home/jistone/src/wasi-libc/emmalloc/emmalloc.c:893:15
2: 0xceb - emmalloc_aligned_realloc
at /home/jistone/src/wasi-libc/emmalloc/emmalloc.c:1142:18
- emmalloc_realloc
at /home/jistone/src/wasi-libc/emmalloc/emmalloc.c:1243:10
- realloc
at /home/jistone/src/wasi-libc/emmalloc/emmalloc.c:1248:10
3: 0x16c - <unknown>!__original_main
4: 0xb5 - _start
at /home/jistone/src/wasi-libc/libc-bottom-half/crt/crt1-command.c:43:13
2: memory fault at wasm address 0xfffffffc in linear memory of size 0x20000
3: wasm trap: out of bounds memory access
This one is a little different:
#include <stdlib.h>
int main() {
char *p = malloc(1);
p = realloc(p, 12);
p = malloc(1);
}
Error: failed to run main module `crash-c.wasm`
Caused by:
0: failed to invoke command default
1: error while executing at wasm backtrace:
0: 0x73b - create_free_region
at /home/jistone/src/wasi-libc/emmalloc/emmalloc.c:376:43
- attempt_allocate
at /home/jistone/src/wasi-libc/emmalloc/emmalloc.c:703:5
1: 0x252 - allocate_memory
at /home/jistone/src/wasi-libc/emmalloc/emmalloc.c:800:19
- emmalloc_memalign
at /home/jistone/src/wasi-libc/emmalloc/emmalloc.c:893:15
2: 0x838 - emmalloc_malloc
at /home/jistone/src/wasi-libc/emmalloc/emmalloc.c:915:10
- malloc
at /home/jistone/src/wasi-libc/emmalloc/emmalloc.c:920:10
3: 0x12f - main
at /home/jistone/bugs/wasi-libc-421/crash.c:5:9
4: 0xb5 - _start
at /home/jistone/src/wasi-libc/libc-bottom-half/crt/crt1-command.c:43:13
2: memory fault at wasm address 0x2109c in linear memory of size 0x20000
3: wasm trap: out of bounds memory access
Ouch, yes, both test programs also crash when compiled with zig cc
, which also links emmalloc
.
As a point of comparison both those programs run fine when build with emscripten + run under node. Perhaps something how about sbrk() differs? Or local diffs with emmalloc itself?
One thing I found is that this calls clz(0)
, which is undefined:
Line 843 in 7018e24
In emscripten, the constructor adds at least one region, but that's commented out here in wasi:
Lines 642 to 643 in 7018e24
I tried adding if (freeRegionBucketsUsed == 0) claim_more_memory(3*sizeof(Region));
near the top of that function that had clz(0)
, but it didn't change the crashes overall.
Just adding a couple printf()
makes the crash go away. So, likely a UB or memory corruption.
By the way, you should declare p
as char * volatile p
, or else it may work just because the compiler optimizes out all the code since allocations are not actually used.
I'm not optimizing the main
code, but ok, I went ahead and added volatile
in my copies.
I have found that -fno-strict-aliasing
(while compiling wasi-libc, including emmalloc) also makes the crash go away. I'm playing with a few asm-barriers to see if I can nail down where aliasing may be misbehaving.
In the Region
struct, defining size
as a uint32_t
instead of size_t
also makes everything work 🤷
Oh, one other difference is that alignof(max_align_t)
differs between wasm32-emscripten and wasm32-wasi. (8 in former 16 in the latter).
This barrier makes it work:
@@ -1046,10 +1046,11 @@ static int attempt_region_resize(Region *region, size_t size)
assert(HAS_ALIGNMENT(newNextRegionStartPtr, sizeof(size_t)));
// Next region does not shrink to too small size?
if (newNextRegionStartPtr + sizeof(Region) <= nextRegionEndPtr)
{
unlink_from_free_list(nextRegion);
+ __asm__ __volatile__("":::"memory"); // barrier
create_free_region(newNextRegionStartPtr, nextRegionEndPtr - newNextRegionStartPtr);
link_to_free_list((Region*)newNextRegionStartPtr);
create_used_region(region, newNextRegionStartPtr - (uint8_t*)region);
return 1;
}
There's quite a bit of type punning throughout -- maybe we should just add -fno-strict-aliasing
to emmalloc's CFLAGS?
__asm__ __volatile__("":::"memory"); // barrier
Shouldn't a barrier be added every time the size of a different region is accessed/modified?
diff --git a/emmalloc/emmalloc.c b/emmalloc/emmalloc.c
index c98e42e..41dd308 100644
--- a/emmalloc/emmalloc.c
+++ b/emmalloc/emmalloc.c
@@ -302,10 +302,13 @@ static int compute_free_list_bucket(size_t allocSize)
return bucketIndex;
}
+#define MEMORY_BARRIER __asm__ __volatile__("":::"memory")
+
#define DECODE_CEILING_SIZE(size) ((size_t)((size) & ~FREE_REGION_FLAG))
static Region *prev_region(Region *region)
{
+ MEMORY_BARRIER;
size_t prevRegionSize = ((size_t*)region)[-1];
prevRegionSize = DECODE_CEILING_SIZE(prevRegionSize);
return (Region*)((uint8_t*)region - prevRegionSize);
@@ -318,6 +321,7 @@ static Region *next_region(Region *region)
static size_t region_ceiling_size(Region *region)
{
+ MEMORY_BARRIER;
return ((size_t*)((uint8_t*)region + region->size))[-1];
}
@@ -363,6 +367,7 @@ static void create_used_region(void *ptr, size_t size)
assert(size >= sizeof(Region));
*(size_t*)ptr = size;
((size_t*)ptr)[(size/sizeof(size_t))-1] = size;
+ MEMORY_BARRIER;
}
static void create_free_region(void *ptr, size_t size)
@@ -374,6 +379,7 @@ static void create_free_region(void *ptr, size_t size)
Region *freeRegion = (Region*)ptr;
freeRegion->size = size;
((size_t*)ptr)[(size/sizeof(size_t))-1] = size | FREE_REGION_FLAG;
+ MEMORY_BARRIER;
}
static void prepend_to_free_list(Region *region, Region *prependTo)
@@ -712,6 +718,7 @@ static void *attempt_allocate(Region *freeRegion, size_t alignment, size_t size)
// region as used memory, not leaving a free memory region behind.
// Initialize the free region as used by resetting the ceiling size to the same value as the size at bottom.
((size_t*)((uint8_t*)freeRegion + freeRegion->size))[-1] = freeRegion->size;
+ MEMORY_BARRIER;
}
#ifdef __EMSCRIPTEN_TRACING__
@@ -986,6 +993,7 @@ void emmalloc_free(void *ptr)
#endif
// Check merging with left side
+ MEMORY_BARRIER;
size_t prevRegionSizeField = ((size_t*)region)[-1];
size_t prevRegionSize = prevRegionSizeField & ~FREE_REGION_FLAG;
if (prevRegionSizeField != prevRegionSize) // Previous region is free?
@@ -1038,6 +1046,7 @@ static int attempt_region_resize(Region *region, size_t size)
// is a free region.
Region *nextRegion = next_region(region);
uint8_t *nextRegionEndPtr = (uint8_t*)nextRegion + nextRegion->size;
+ MEMORY_BARRIER;
size_t sizeAtCeiling = ((size_t*)nextRegionEndPtr)[-1];
if (nextRegion->size != sizeAtCeiling) // Next region is free?
{
@@ -1392,6 +1401,7 @@ static int trim_dynamic_heap_reservation(size_t pad)
return 0; // emmalloc is not controlling any dynamic memory at all - cannot release memory.
uint8_t *previousSbrkEndAddress = listOfAllRegions->endPtr;
assert(sbrk(0) == previousSbrkEndAddress);
+ MEMORY_BARRIER;
size_t lastMemoryRegionSize = ((size_t*)previousSbrkEndAddress)[-1];
assert(lastMemoryRegionSize == 16); // // The last memory region should be a sentinel node of exactly 16 bytes in size.
Region *endSentinelRegion = (Region*)(previousSbrkEndAddress - sizeof(Region));
Shouldn't a barrier be added every time the size of a different region is accessed/modified?
I don't think the barrier is a real solution, more of a hack that I was using to figure out where it was going wrong. It's still UB to violate strict aliasing, but the barrier only works to impede the main optimization opportunity that would take advantage of aliasing. I think -fno-strict-aliasing
is a cleaner solution because it removes the UB, without any chance that we missed somewhere that's aliasing.