ivmai/libatomic_ops

SIGSEGV in test_malloc on Alpine linux/s390x

TBK opened this issue · 12 comments

TBK commented

As part of Alpine Linux's (musl libc) packaging process we run make check. Unfortunately test_malloc fails on s390x:

================================================
   libatomic_ops 7.6.10: tests/test-suite.log
================================================

# TOTAL: 5
# PASS:  4
# SKIP:  0
# XFAIL: 0
# FAIL:  1
# XPASS: 0
# ERROR: 0

.. contents:: :depth: 2

FAIL: test_malloc
=================

Performing 1000 reversals of 1000 element lists in 16 threads
Segmentation fault (core dumped)
FAIL test_malloc (exit status: 139)

btw Travis CI also supports arm64, ppc64le and s390x - https://config.travis-ci.com/ref/arch

ivmai commented

It would be good if you provide stack traces of all threads.
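
One common way to collect such traces is to load the core dump into gdb in batch mode, sketched below. This is only an illustration: `dump_all_threads` is a hypothetical helper name, the binary and core file paths are assumptions, and where the core file lands depends on the system's core dump settings.

```shell
# Print a backtrace for every thread recorded in a core dump.
# dump_all_threads is a hypothetical helper, not part of libatomic_ops.
dump_all_threads() {
    binary=$1   # path to the crashing test binary (assumed)
    core=$2     # path to the core file (assumed; location varies by system)
    gdb -batch -ex "thread apply all bt" "$binary" "$core"
}

# Possible usage from the tests directory:
#   ulimit -c unlimited                   # allow core files in this shell
#   ./test_malloc                         # reproduce the crash
#   dump_all_threads ./test_malloc core
```

Building with debug information (for example `CFLAGS=-g`) should make the resulting frames easier to read.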

ivmai commented

@FROGGS: Tobias, you are the only contributor to the s390[x] target (for at least a decade), could you please have a look?

ivmai commented

> btw Travis CI also supports arm64, ppc64le and s390x - https://config.travis-ci.com/ref/arch

I have added arm64, ppc64le and s390x builds (gcc/clang, -O3) to Travis CI (commit 00fe891).

ivmai commented

> As part of Alpine Linux's (musl libc) packaging process we run make check. Unfortunately test_malloc fails on s390x

I can't reproduce the crash on s390x (on Travis CI Ubuntu, not using musl).

TBK commented

> It would be good if you provide stack traces of all threads.

I am uncertain if it is sufficient but I ran strace ./test_malloc - https://gitlab.alpinelinux.org/TBK/aports/-/jobs/79038/raw

ivmai commented

1. How often do you see the crash? I launched more than 40 s390x builds on Travis CI in various configurations, with no failures.
2. Please try the latest master: I've redirected all operations to the GCC built-in atomic ones. (I assume you use gcc-5.4.0 or later.)

ivmai commented

Would be good to get a stack trace of the crash.
Probably relates to #45 (because the stack implementation is almost-lock-free)

ivmai commented

Hello @bhaible,
If possible, could you please check whether this bug is reproducible on Alpine Linux/s390x with libatomic_ops master (or v7.6.10)?
./autogen.sh && ./configure --enable-assertions && make -j check
(make check should probably be run several times; I think 50 times should be enough)

If it is reproducible, what is the stack trace?
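
The repeated run suggested above could be scripted roughly as follows. This is a minimal sketch: `repeat_check` is a hypothetical helper name, and the count of 50 is simply the figure mentioned in this comment.

```shell
# Run a command up to N times, stopping at the first failure.
# repeat_check is a hypothetical helper, not part of libatomic_ops.
repeat_check() {
    runs=$1
    shift
    i=1
    while [ "$i" -le "$runs" ]; do
        if ! "$@"; then
            echo "run $i of $runs failed" >&2
            return 1
        fi
        i=$((i + 1))
    done
    echo "all $runs runs passed"
}

# Possible usage in the build tree, after
# ./autogen.sh && ./configure --enable-assertions:
#   repeat_check 50 make -j check
```

Stopping at the first failure leaves that run's tests/test-suite.log in place for inspection.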

The tests pass on s390x with 7.8.0:

Internal ctest changing into directory: /home/demon/src/aports/community/libatomic_ops/src/libatomic_ops-7.8.0/build
Test project /home/demon/src/aports/community/libatomic_ops/src/libatomic_ops-7.8.0/build
    Start 1: test_atomic
1/5 Test #1: test_atomic ......................   Passed    0.21 sec
    Start 2: test_atomic_generalized
2/5 Test #2: test_atomic_generalized ..........   Passed    0.00 sec
    Start 3: test_atomic_pthreads
3/5 Test #3: test_atomic_pthreads .............   Passed    0.09 sec
    Start 4: test_stack
4/5 Test #4: test_stack .......................   Passed    0.49 sec
    Start 5: test_malloc
5/5 Test #5: test_malloc ......................   Passed    0.02 sec

100% tests passed, 0 tests failed out of 5

ivmai commented

Okay, thank you for checking. Closing it as no longer reproducible.

I confirm: On Alpine Linux/s390x, libatomic_ops-7.8.0, configured with --enable-assertions, passes its tests, even 50 times in a row.