WebAssembly/wasi-libc

`errno` is broken on `wasm32-wasi-threads`

dicej opened this issue · 15 comments

dicej commented

This appears to be my fault, specifically the change I made to __errno.h to support shared libraries: https://github.com/WebAssembly/wasi-libc/pull/429/files#diff-2be50b6cfc90ea05ebc180c8532db926692ac7c6e6afc7da5adf716ba3e34d8a

That change was required to avoid libc++.so importing _ZTH5errno (AKA "thread-local initialization routine for errno"). Switching from C++'s thread_local to C's _Thread_local resolved that, but when I was adding support for testing the wasm32-wasi-threads target in the wasi-sdk test suite, I discovered errno isn't working for that target.

At this point, my only idea is to split the wasi-sysroot/include directory into per-target directories (as I was planning to do anyway as part of the wasm32-wasi-preview2 work) and give wasm32-wasi-threads a __errno.h that uses thread_local while the other targets use _Thread_local.
EDIT: turns out the old code was broken too -- we just never tested it until now.

I'm open to other ideas, and if anyone happens to understand why replacing thread_local with _Thread_local broke wasm32-wasi-threads, I'd love to hear the explanation.

Another option here would be to make the errno variable static, and change the errno macro to #define errno (*__errno_location()). That's likely to avoid this issue, and it's how musl normally works on other platforms.

dicej commented

Further testing shows that the original code (i.e. prior to my shared library change) is also broken, meaning:

  • this is apparently not my fault (yay)
  • errno probably never worked on wasm32-wasi-threads (boo)
dicej commented

Also, for the record: I worked with @sunfishcode to implement his above suggestion, but that caused more tests to fail, for reasons we don't yet understand. Could still be a fruitful avenue if we can debug it.

dicej commented

After chatting with @abrown about this, it could be that thread local storage in general is broken for wasm32-wasi-threads (e.g. due to a Wasmtime or LLVM issue), in which case errno is just one example.

yamt commented

do you have a simple app to demonstrate the breakage?
last time i checked it was working fine.

dicej commented

do you have a simple app to demonstrate the breakage? last time i checked it was working fine.

The tests/general/signals.c test in the wasi-sdk repo is failing, and 8 of the libc-test tests used by this repo are failing. You have to enable those tests for wasm32-wasi-threads to see the failures, of course -- currently only wasm32-wasi is tested by CI. I suspect some of the libc-test tests are failing for other reasons, but at least some of them are due to errno.

yamt commented

do you have a simple app to demonstrate the breakage? last time i checked it was working fine.

The tests/general/signals.c test in the wasi-sdk repo is failing, and 8 of the libc-test tests used by this repo are failing. You have to enable those tests for wasm32-wasi-threads to see the failures, of course -- currently only wasm32-wasi is tested by CI. I suspect some of the libc-test tests are failing for other reasons, but at least some of them are due to errno.

i tried libc-test and it passed for me.

cd wasi-libc
make -j8 CC=/opt/wasi-sdk/bin/clang THREAD_MODEL=posix
cd test
ENGINE="toywasm --wasi" CC=/opt/wasi-sdk-21.0/bin/clang CFLAGS="--target=wasm32-wasi-threads -pthread --sysroot=../sysroot" make
yamt commented

well, i guess it actually should be:

ENGINE="toywasm --wasi" CC=/opt/wasi-sdk-21.0/bin/clang CFLAGS="--target=wasm32-wasi-threads -pthread -Wl,--import-memory -Wl,--export-memory" make

but it fails on wasmtime 15.0.1.
i consider it a wasmtime bug.
cf. WebAssembly/wasi-threads#49

dicej commented

Thanks for tracking this down, @yamt. I'll close this issue, since it's a Wasmtime bug.

dicej commented

FWIW, I just re-ran the wasi-libc tests using Wasmtime 16 and using the -pthread when compiling. Turns out the latter is essential. All those tests pass now.

Still having trouble with the wasi-sdk tests, though -- even after adding -pthread. Looks like a different issue now: Wasmtime goes into a busy wait loop when trying to write to stdout. I'll continue to debug.

dicej commented

One last update: with bytecodealliance/wasmtime#7750 all tests are passing in both wasi-libc and wasi-sdk.

To my knowledge, the only remaining unresolved issue is that using clang --target=wasm32-wasi-threads without -pthread silently creates broken modules which don't do thread local storage correctly (as demonstrated by failures in both the wasi-sdk and wasi-libc test suites using any runtime). @sbc100 should we consider that an LLVM bug?

sbc100 commented

I'm not sure I understand what the issue is there. Could you elaborate?

dicej commented

I'm not sure I understand what the issue is there. Could you elaborate?

For example, using https://github.com/WebAssembly/wasi-sdk/blob/main/tests/general/signals.c and any recent version of wasi-sdk (e.g. 21.0) and Wasmtime (e.g. 16.0.0), we get the expected output when using -pthread:

 $ /opt/wasi-sdk-21.0/bin/clang --target=wasm32-wasi-threads -pthread -D_WASI_EMULATED_SIGNAL -lwasi-emulated-signal signals.c -o signals.wasm
 $ wasmtime run -S threads signals.wasm 
psignal message for SIGINT: Interrupt
strsignal for SIGHUP: 'Hangup'
beginning handler test:
handler for signal Window changed
finished handler test

If we omit -pthread, we get an assertion failure:

 $ /opt/wasi-sdk-21.0/bin/clang --target=wasm32-wasi-threads -D_WASI_EMULATED_SIGNAL -lwasi-emulated-signal signals.c -o signals-no-pthread.wasm
 $ wasmtime run -S threads signals-no-pthread.wasm                                                                                              
Assertion failed: raise(_NSIG) == -1 && errno == EINVAL (: main: 31)
Error: failed to run main module `signals-no-pthread.wasm`

Caused by:
    0: failed to invoke command default
    1: error while executing at wasm backtrace:
           0: 0x1072 - <unknown>!abort
           1: 0x1179 - <unknown>!__assert_fail
           2:  0x716 - <unknown>!__original_main
           3:  0x1da - <unknown>!_start
    2: wasm trap: wasm `unreachable` instruction executed

The behavior is consistent across Wasmtime, Toywasm, and Wasmer -- they all run the -pthread build successfully, and they all get the assertion failure when running the non--pthread build due to errno being read as a garbage value rather than the last value written to it.

sbc100 commented

It should be impossible to link object files built with threading support with object files built without threading support. I guess adding --target=wasm32-wasi-threads but not -pthread is somehow enabling some level of threading support? It should really fail to link because signals.c is being built without threading support but libc etc are built with thread support.

I assume building with --target=wasm32-wasi-threads causes pthread-enabled standard libraries to be linked in.. regardless of -pthread?

dicej commented

I assume building with --target=wasm32-wasi-threads causes pthread-enabled standard libraries to be linked in.. regardless of -pthread?

That sounds plausible. I would assume it's using /opt/wasi-sdk-21.0/share/wasi-sysroot/lib/wasm32-wasi-threads/* in the example above.