managarm/mlibc

posix: pthread_cancel test fails intermittently on RISC-V Linux

64 opened this issue · 4 comments

64 commented

On commit 3459cb4, the CI encountered an intermittent failure in the pthread_cancel test on RISC-V (under the Linux sysdeps):

  61/144 mlibc:posix / pthread_cancel                 FAIL            1.10s   killed by signal 6 SIGABRT
――――――――――――――――――――――――――――――――――――― ✀  ―――――――――――――――――――――――――――――――――――――
stderr:
In function main, file ../../../src/mlibc/tests/posix/pthread_cancel.c:94: Assertion '!ret' failed!

I haven't been able to reproduce this anywhere, so it's possible that it is just a qemu-user or toolchain bug. It's also possible that the bug is in the arch/OS-independent code.

64 commented

This was also reproduced on #684 (specifically, RISC-V failing).

As discussed on Discord, the suspected cause is the following sequence of events:

  1. Thread 2 goes to sleep for a second.
  2. Thread 1 cancels thread 2, by first setting tcbCancelTriggerBit in tcb->cancelBits, then, if cancellation is enabled, sending a SIGCANCEL to the thread.
  3. Thread 2 wakes up after tcbCancelTriggerBit was set, but before the signal was sent.
  4. Thread 2 calls sleep again, sees that cancellation was requested, and exits via __mlibc_do_cancel.
  5. Thread 1 finally gets around to sending SIGCANCEL, but it's too late, as thread 2 has quit due to the cancellation request.
  6. Thread 1's call to sys_tgkill fails, and pthread_cancel erroneously forwards the error code.

The solution would be to check tcb->cancelBits & tcbExitingBit when sys_tgkill fails and, if the bit is set, ignore the error.
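A minimal sketch of that check, as a self-contained model rather than the actual mlibc code. The bit values, the Tcb struct layout, the tcbCancelEnableBit name, and the cancel_sketch function are all assumptions made for illustration; the signal-sending step is passed in as a callable standing in for sys_tgkill so the race can be simulated without real threads.

```cpp
#include <atomic>
#include <cerrno>

// Hypothetical bit values; the real constants in mlibc may differ.
constexpr unsigned int tcbCancelEnableBit = 1u << 0;
constexpr unsigned int tcbCancelTriggerBit = 1u << 2;
constexpr unsigned int tcbExitingBit = 1u << 4;

// Hypothetical, stripped-down stand-in for the thread control block.
struct Tcb {
	std::atomic<unsigned int> cancelBits{0};
};

// Models the cancel path described above. send_signal stands in for
// sys_tgkill and returns 0 on success or an errno-style code on failure.
template <typename SendSignal>
int cancel_sketch(Tcb &tcb, SendSignal send_signal) {
	// Step 2: request cancellation first.
	unsigned int old = tcb.cancelBits.fetch_or(tcbCancelTriggerBit);

	// Only signal the target if cancellation is enabled.
	if (!(old & tcbCancelEnableBit))
		return 0;

	int e = send_signal();

	// Proposed fix: if the signal could not be delivered because the
	// target already acted on the request and is exiting, report success.
	if (e != 0 && (tcb.cancelBits.load() & tcbExitingBit))
		return 0;

	return e;
}
```

Passing a callable that sets tcbExitingBit and then returns ESRCH simulates steps 3 to 6: the target exits between the bit being set and the signal being sent, and the sketch returns 0 instead of forwarding the error. A failure without tcbExitingBit set is still forwarded unchanged.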

To reproduce, run yes | parallel strace -e trace=tgkill -e status=failed qemu-riscv64 tests/posix-pthread_cancel on an Ubuntu 20.04 machine. Be aware that the output will be spammed with useless exit-code messages, since the strace in Ubuntu 20.04 is too old to have the quiet option, but we have to match CI.

Tange, O. (2022, May 22). GNU Parallel 20220522 ('NATO').
Zenodo. https://doi.org/10.5281/zenodo.6570228