JuliaParallel/MPI.jl

Threads tests segfault on Julia nightly


Unrelated to this PR, something is horribly broken on Julia master (Julia Version 1.12.0-DEV.1641 Commit 7fa26f011ec (2024-11-16 19:20 UTC)) with OpenMPI_jll:

malloc(): unaligned tcache chunk detected

[3920] signal 6 (-6): Aborted
in expression starting at /home/runner/work/MPI.jl/MPI.jl/test/test_threads.jl:18

[3920] signal 11 (1): Segmentation fault
in expression starting at /home/runner/work/MPI.jl/MPI.jl/test/test_threads.jl:18

[3918] signal 15: Terminated
in expression starting at /home/runner/work/MPI.jl/MPI.jl/test/test_threads.jl:39
unknown function (ip: 0x7f63d4691115) at /lib/x86_64-linux-gnu/libc.so.6
pthread_cond_wait at /lib/x86_64-linux-gnu/libc.so.6 (unknown line)
uv_cond_wait at /workspace/srcdir/libuv/src/unix/thread.c:822
jl_parallel_gc_threadfun at /cache/build/builder-amdci5-5/julialang/julia-master/src/gc-stock.c:3544
unknown function (ip: 0x7f63d4694ac2) at /lib/x86_64-linux-gnu/libc.so.6
unknown function (ip: 0x7f63d472684f) at /lib/x86_64-linux-gnu/libc.so.6
unknown function (ip: (nil)) at (unknown file)
unknown function (ip: 0x7f63d4691115) at /lib/x86_64-linux-gnu/libc.so.6
pthread_cond_wait at /lib/x86_64-linux-gnu/libc.so.6 (unknown line)
uv_cond_wait at /workspace/srcdir/libuv/src/unix/thread.c:822
jl_parallel_gc_threadfun at /cache/build/builder-amdci5-5/julialang/julia-master/src/gc-stock.c:3544
unknown function (ip: 0x7f63d4694ac2) at /lib/x86_64-linux-gnu/libc.so.6
unknown function (ip: 0x7f63d472684f) at /lib/x86_64-linux-gnu/libc.so.6
unknown function (ip: (nil)) at (unknown file)
unknown function (ip: 0x7f63d4691115) at /lib/x86_64-linux-gnu/libc.so.6
pthread_cond_wait at /lib/x86_64-linux-gnu/libc.so.6 (unknown line)
uv_cond_wait at /workspace/srcdir/libuv/src/unix/thread.c:822
jl_parallel_gc_threadfun at /cache/build/builder-amdci5-5/julialang/julia-master/src/gc-stock.c:3544
unknown function (ip: 0x7f63d4694ac2) at /lib/x86_64-linux-gnu/libc.so.6
unknown function (ip: 0x7f63d472684f) at /lib/x86_64-linux-gnu/libc.so.6
unknown function (ip: (nil)) at (unknown file)
unknown function (ip: 0x7f63d4691115) at /lib/x86_64-linux-gnu/libc.so.6
pthread_cond_wait at /lib/x86_64-linux-gnu/libc.so.6 (unknown line)
uv_cond_wait at /workspace/srcdir/libuv/src/unix/thread.c:822
ijl_task_get_next at /cache/build/builder-amdci5-5/julialang/julia-master/src/scheduler.c:520
poptask at ./task.jl:1163
wait at ./task.jl:1172
task_done_hook at ./task.jl:839
jfptr_task_done_hook_97036.1 at /opt/hostedtoolcache/julia/nightly/x64/lib/julia/sys.so (unknown line)
jl_apply at /cache/build/builder-amdci5-5/julialang/julia-master/src/julia.h:2240 [inlined]
jl_finish_task at /cache/build/builder-amdci5-5/julialang/julia-master/src/task.c:338
start_task at /cache/build/builder-amdci5-5/julialang/julia-master/src/task.c:1274
unknown function (ip: (nil)) at (unknown file)
unknown function (ip: 0x7f63d4691115) at /lib/x86_64-linux-gnu/libc.so.6
pthread_cond_wait at /lib/x86_64-linux-gnu/libc.so.6 (unknown line)
uv_cond_wait at /workspace/srcdir/libuv/src/unix/thread.c:822
ijl_task_get_next at /cache/build/builder-amdci5-5/julialang/julia-master/src/scheduler.c:520
poptask at ./task.jl:1163
wait at ./task.jl:1172
task_done_hook at ./task.jl:839

[3917] signal 15: Terminated
in expression starting at /home/runner/work/MPI.jl/MPI.jl/test/test_threads.jl:39
unknown function (ip: 0x7fb132e91115) at /lib/x86_64-linux-gnu/libc.so.6
pthread_cond_wait at /lib/x86_64-linux-gnu/libc.so.6 (unknown line)
uv_cond_wait at /workspace/srcdir/libuv/src/unix/thread.c:822
jl_parallel_gc_threadfun at /cache/build/builder-amdci5-5/julialang/julia-master/src/gc-stock.c:3544
unknown function (ip: 0x7fb132e94ac2) at /lib/x86_64-linux-gnu/libc.so.6
unknown function (ip: 0x7fb132f2684f) at /lib/x86_64-linux-gnu/libc.so.6
unknown function (ip: (nil)) at (unknown file)
unknown function (ip: 0x7fb132e91115) at /lib/x86_64-linux-gnu/libc.so.6
pthread_cond_wait at /lib/x86_64-linux-gnu/libc.so.6 (unknown line)
uv_cond_wait at /workspace/srcdir/libuv/src/unix/thread.c:822
jl_parallel_gc_threadfun at /cache/build/builder-amdci5-5/julialang/julia-master/src/gc-stock.c:3544
unknown function (ip: 0x7fb132e94ac2) at /lib/x86_64-linux-gnu/libc.so.6
unknown function (ip: 0x7fb132f2684f) at /lib/x86_64-linux-gnu/libc.so.6
unknown function (ip: (nil)) at (unknown file)
unknown function (ip: 0x7fb132e91115) at /lib/x86_64-linux-gnu/libc.so.6
pthread_cond_wait at /lib/x86_64-linux-gnu/libc.so.6 (unknown line)
uv_cond_wait at /workspace/srcdir/libuv/src/unix/thread.c:822
jl_parallel_gc_threadfun at /cache/build/builder-amdci5-5/julialang/julia-master/src/gc-stock.c:3544
unknown function (ip: 0x7fb132e94ac2) at /lib/x86_64-linux-gnu/libc.so.6
unknown function (ip: 0x7fb132f2684f) at /lib/x86_64-linux-gnu/libc.so.6
unknown function (ip: (nil)) at (unknown file)
unknown function (ip: 0x7fb132e91115) at /lib/x86_64-linux-gnu/libc.so.6
pthread_cond_wait at /lib/x86_64-linux-gnu/libc.so.6 (unknown line)
uv_cond_wait at /workspace/srcdir/libuv/src/unix/thread.c:822
ijl_task_get_next at /cache/build/builder-amdci5-5/julialang/julia-master/src/scheduler.c:520
poptask at ./task.jl:1163
wait at ./task.jl:1172
task_done_hook at ./task.jl:839
jfptr_task_done_hook_97036.1 at /opt/hostedtoolcache/julia/nightly/x64/lib/julia/sys.so (unknown line)
jl_apply at /cache/build/builder-amdci5-5/julialang/julia-master/src/julia.h:2240 [inlined]
jl_finish_task at /cache/build/builder-amdci5-5/julialang/julia-master/src/task.c:338
start_task at /cache/build/builder-amdci5-5/julialang/julia-master/src/task.c:1274
unknown function (ip: (nil)) at (unknown file)
unknown function (ip: 0x7fb132e91115) at /lib/x86_64-linux-gnu/libc.so.6
pthread_cond_wait at /lib/x86_64-linux-gnu/libc.so.6 (unknown line)
uv_cond_wait at /workspace/srcdir/libuv/src/unix/thread.c:822
ijl_task_get_next at /cache/build/builder-amdci5-5/julialang/julia-master/src/scheduler.c:520
poptask at ./task.jl:1163
wait at ./task.jl:1172
task_done_hook at ./task.jl:839
jfptr_task_done_hook_97036.1 at /opt/hostedtoolcache/julia/nightly/x64/lib/julia/sys.so (unknown line)
jl_apply at /cache/build/builder-amdci5-5/julialang/julia-master/src/julia.h:2240 [inlined]
jl_finish_task at /cache/build/builder-amdci5-5/julialang/julia-master/src/task.c:338
start_task at /cache/build/builder-amdci5-5/julialang/julia-master/src/task.c:1274
unknown function (ip: (nil)) at (unknown file)
unknown function (ip: 0x7fb132e91115) at /lib/x86_64-linux-gnu/libc.so.6
pthread_cond_wait at /lib/x86_64-linux-gnu/libc.so.6 (unknown line)
uv_cond_wait at /workspace/srcdir/libuv/src/unix/thread.c:822
ijl_task_get_next at /cache/build/builder-amdci5-5/julialang/julia-master/src/scheduler.c:520
poptask at ./task.jl:1163
wait at ./task.jl:1172
task_done_hook at ./task.jl:839
jfptr_task_done_hook_97036.1 at /opt/hostedtoolcache/julia/nightly/x64/lib/julia/sys.so (unknown line)
jl_apply at /cache/build/builder-amdci5-5/julialang/julia-master/src/julia.h:2240 [inlined]
jl_finish_task at /cache/build/builder-amdci5-5/julialang/julia-master/src/task.c:338
start_task at /cache/build/builder-amdci5-5/julialang/julia-master/src/task.c:1274
unknown function (ip: (nil)) at (unknown file)
clock_nanosleep at /lib/x86_64-linux-gnu/libc.so.6 (unknown line)
__nanosleep at /lib/x86_64-linux-gnu/libc.so.6 (unknown line
[3916] signal 15: Terminated
in expression starting at /home/runner/work/MPI.jl/MPI.jl/test/test_threads.jl:18
unknown function (ip: 0x7fa2b4c91115) at /lib/x86_64-linux-gnu/libc.so.6
pthread_cond_wait at /lib/x86_64-linux-gnu/libc.so.6 (unknown line)
uv_cond_wait at /workspace/srcdir/libuv/src/unix/thread.c:822
jl_parallel_gc_threadfun at /cache/build/builder-amdci5-5/julialang/julia-master/src/gc-stock.c:3544
unknown function (ip: 0x7fa2b4c94ac2) at /lib/x86_64-linux-gnu/libc.so.6
unknown function (ip: 0x7fa2b4d2684f) at /lib/x86_64-linux-gnu/libc.so.6
unknown function (ip: (nil)) at (unknown file)
unknown function (ip: 0x7fa2b4c91115) at /lib/x86_64-linux-gnu/libc.so.6
pthread_cond_wait at /lib/x86_64-linux-gnu/libc.so.6 (unknown line)
uv_cond_wait at /workspace/srcdir/libuv/src/unix/thread.c:822
jl_parallel_gc_threadfun at /cache/build/builder-amdci5-5/julialang/julia-master/src/gc-stock.c:3544
unknown function (ip: 0x7fa2b4c94ac2) at /lib/x86_64-linux-gnu/libc.so.6
unknown function (ip: 0x7fa2b4d2684f) at /lib/x86_64-linux-gnu/libc.so.6
unknown function (ip: (nil)) at (unknown file)
unknown function (ip: 0x7fa2b4c91115) at /lib/x86_64-linux-gnu/libc.so.6
pthread_cond_wait at /lib/x86_64-linux-gnu/libc.so.6 (unknown line)
uv_cond_wait at /workspace/srcdir/libuv/src/unix/thread.c:822
jl_parallel_gc_threadfun at /cache/build/builder-amdci5-5/julialang/julia-master/src/gc-stock.c:3544
unknown function (ip: 0x7fa2b4c94ac2) at /lib/x86_64-linux-gnu/libc.so.6
unknown function (ip: 0x7fa2b4d2684f) at /lib/x86_64-linux-gnu/libc.so.6
unknown function (ip: (nil)) at (unknown file)
unknown function (ip: 0x7fa2b4c91115) at /lib/x86_64-linux-gnu/libc.so.6
pthread_cond_wait at /lib/x86_64-linux-gnu/libc.so.6 (unknown line)
uv_cond_wait at /workspace/srcdir/libuv/src/unix/thread.c:822
ijl_task_get_next at /cache/build/builder-amdci5-5/julialang/julia-master/src/scheduler.c:520
poptask at ./task.jl:1163
wait at ./task.jl:1172
task_done_hook at ./task.jl:839
jfptr_task_done_hook_97036.1 at /opt/hostedtoolcache/julia/nightly/x64/lib/julia/sys.so (unknown line)
jl_apply at /cache/build/builder-amdci5-5/julialang/julia-master/src/julia.h:2240 [inlined]
jl_finish_task at /cache/build/builder-amdci5-5/julialang/julia-master/src/task.c:338
start_task at /cache/build/builder-amdci5-5/julialang/julia-master/src/task.c:1274
unknown function (ip: (nil)) at (unknown file)
unknown function (ip: 0x7fa2b4c91115) at /lib/x86_64-linux-gnu/libc.so.6
pthread_cond_wait at /lib/x86_64-linux-gnu/libc.so.6 (unknown line)
uv_cond_wait at /workspace/srcdir/libuv/src/unix/thread.c:822
ijl_task_get_next at /cache/build/builder-amdci5-5/julialang/julia-master/src/scheduler.c:520
poptask at ./task.jl:1163
wait at ./task.jl:1172
task_done_hook at ./task.jl:839
)
usleep at /lib/x86_64-linux-gnu/libc.so.6 (unknown line)
ompi_mpi_finalize at /home/runner/.julia/artifacts/c519b6f9838786c8d97506fb28f7e10dfc74b9a3/lib/libmpi.so (unknown line)
MPI_Finalize at /home/runner/work/MPI.jl/MPI.jl/src/api/generated_api.jl:1872 [inlined]
Finalize at /home/runner/work/MPI.jl/MPI.jl/src/environment.jl:263
unknown function (ip: 0x7fb0f9b1b51f) at (unknown file)
jl_apply at /cache/build/builder-amdci5-5/julialang/julia-master/src/julia.h:2240 [inlined]
do_call at /cache/build/builder-amdci5-5/julialang/julia-master/src/interpreter.c:125
eval_value at /cache/build/builder-amdci5-5/julialang/julia-master/src/interpreter.c:222
eval_stmt_value at /cache/build/builder-amdci5-5/julialang/julia-master/src/interpreter.c:173 [inlined]
eval_body at /cache/build/builder-amdci5-5/julialang/julia-master/src/interpreter.c:684
jl_interpret_toplevel_thunk at /cache/build/builder-amdci5-5/julialang/julia-master/src/interpreter.c:895
jl_toplevel_eval_flex at /cache/build/builder-amdci5-5/julialang/julia-master/src/toplevel.c:1065
jl_toplevel_eval_flex at /cache/build/builder-amdci5-5/julialang/julia-master/src/toplevel.c:1005
ijl_toplevel_eval at /cache/build/builder-amdci5-5/julialang/julia-master/src/toplevel.c:1076
ijl_toplevel_eval_in at /cache/build/builder-amdci5-5/julialang/julia-master/src/toplevel.c:1118
eval at ./boot.jl:460
include_string at ./loading.jl:2839
_include at ./loading.jl:2899

For the record, the last successful run was with Julia Version 1.12.0-DEV.1263 Commit 17445fe752b (2024-09-29 09:41 UTC) (this may help with bisection). The version of OpenMPI_jll is v5.0.5+0 in both cases, so that doesn't seem to be relevant.

Originally posted by @giordano in #887 (comment)
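
In case someone wants to narrow this down, the two commits quoted above bound the regression, so a `git bisect` over a Julia checkout is one way to go about it. The following is only a rough sketch, not something that was run for this issue: the function name `bisect_threads_segfault` and both path arguments are hypothetical, it assumes a Julia source tree that builds with `make` and a local MPI.jl checkout, and it runs the whole MPI.jl test suite at every step rather than just `test_threads.jl`.

```shell
# Commit hashes taken from this issue.
GOOD=17445fe752b   # Julia 1.12.0-DEV.1263, last known passing run
BAD=7fa26f011ec    # Julia 1.12.0-DEV.1641, run that segfaulted

# Hypothetical helper:
#   $1 = path to a Julia source checkout (assumption)
#   $2 = path to an MPI.jl checkout (assumption)
bisect_threads_segfault() {
    # Mark the bad and good ends of the range.
    git -C "$1" bisect start "$BAD" "$GOOD"

    # At each step: rebuild Julia, then run the MPI.jl tests with that build.
    # Exit code 125 tells git bisect to skip commits that fail to build;
    # any other non-zero exit from the test run marks the commit as bad.
    git -C "$1" bisect run sh -c '
        make >/dev/null 2>&1 || exit 125
        ./julia --project="$1" -e "using Pkg; Pkg.test()"
    ' sh "$2"

    # Return the checkout to where it started.
    git -C "$1" bisect reset
}
```

It would be invoked as `bisect_threads_segfault /path/to/julia /path/to/MPI.jl`. Each step needs a full Julia rebuild, so this is slow. Since the crash shows up as memory corruption, it may also not fail deterministically, so a commit that passes once is not necessarily good.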

Update: tests passed in https://github.com/JuliaParallel/MPI.jl/actions/runs/12728449076/job/35479046456?pr=887 with Julia Version 1.12.0-DEV.1872 Commit db8cc484fd8 (2025-01-10 20:34 UTC), so the issue may have been fixed already. Leaving the ticket open as a reminder to double-check that all is good.