facebookincubator/gloo

Exception during initialization terminates device thread

As part of the fix, the handleEvents function should be marked noexcept if possible...

Backtrace:

#0  0x00007fc6c6854269 in raise (sig=6) at ../sysdeps/unix/sysv/linux/pt-raise.c:35
#1  0x00007fc6ccefe611 in backward::SignalHandling::sig_handler (signo=6, info=0x7fc6545ec6f0, _ctx=0x7fc6545ec5c0) at ../3rdparty/backward/backward.hpp:2094
#2  <signal handler called>
#3  0x00007fc6c5c01428 in __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:54
#4  0x00007fc6c5c0302a in __GI_abort () at abort.c:89
#5  0x00007fc6c65438f7 in ?? () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#6  0x00007fc6c6549a46 in ?? () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#7  0x00007fc6c6549a81 in std::terminate() () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#8  0x00007fc6c6549a3a in std::rethrow_exception(std::__exception_ptr::exception_ptr) () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#9  0x00007fc6cb4b3d4f in gloo::transport::tcp::Pair::signalAndThrowException (this=this@entry=0x7fc628001c00, ex=...) at /workspace/3rdparty/pytorch/third_party/gloo/gloo/transport/tcp/pair.cc:1158
#10 0x00007fc6cb4b3dc1 in gloo::transport::tcp::Pair::signalAndThrowException (this=this@entry=0x7fc628001c00,
    msg="[/workspace/3rdparty/pytorch/third_party/gloo/gloo/transport/tcp/pair.cc:724] connect [100.97.72.168]:166: Connection refused")
    at /workspace/3rdparty/pytorch/third_party/gloo/gloo/transport/tcp/pair.cc:1153
#11 0x00007fc6cb4b4772 in gloo::transport::tcp::Pair::handleConnecting (this=this@entry=0x7fc628001c00) at /workspace/3rdparty/pytorch/third_party/gloo/gloo/transport/tcp/pair.cc:724
#12 0x00007fc6cb4b724a in gloo::transport::tcp::Pair::handleEvents (this=0x7fc628001c00, events=29) at /workspace/3rdparty/pytorch/third_party/gloo/gloo/transport/tcp/pair.cc:683
#13 0x00007fc6cb4a5b22 in gloo::transport::tcp::Device::loop (this=0x7fc6280014b0) at /workspace/3rdparty/pytorch/third_party/gloo/gloo/transport/tcp/device.cc:300
#14 0x00007fc6c65748f0 in ?? () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#15 0x00007fc6c684a6ba in start_thread (arg=0x7fc6545f5700) at pthread_create.c:333
#16 0x00007fc6c5cd341d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109
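
The backtrace shows an exception from handleConnecting propagating through handleEvents into Device::loop, where nothing catches it, so the thread ends in std::terminate. Below is a minimal sketch of one way to keep such an exception off the loop thread: catch it inside the handler, store it as a std::exception_ptr, and rethrow it later on a caller's thread. This is not gloo's code. SketchPair, throwIfFailed, and runOnLoopThread are hypothetical names invented for illustration, and the connect failure is simulated. Note that noexcept alone would not prevent the abort: an exception escaping a noexcept function still calls std::terminate, so the catch block is what does the work here.

```cpp
#include <exception>
#include <mutex>
#include <stdexcept>
#include <string>
#include <thread>

// Hypothetical stand-in for a connection object; not gloo's Pair class.
class SketchPair {
 public:
  explicit SketchPair(bool failConnect) : failConnect_(failConnect) {}

  // Runs on the event-loop thread. Every exception is caught and stored,
  // so nothing can escape and the noexcept promise holds.
  void handleEvents() noexcept {
    try {
      connect();
    } catch (...) {
      std::lock_guard<std::mutex> lock(mutex_);
      error_ = std::current_exception();
    }
  }

  // Runs on a caller's thread: rethrows the stored error, if there is one.
  void throwIfFailed() {
    std::lock_guard<std::mutex> lock(mutex_);
    if (error_) {
      std::rethrow_exception(error_);
    }
  }

 private:
  // Simulated connect; throws when constructed with failConnect == true.
  void connect() {
    if (failConnect_) {
      throw std::runtime_error("connect: Connection refused");
    }
  }

  bool failConnect_;
  std::mutex mutex_;
  std::exception_ptr error_;
};

// Calls handleEvents() on a separate thread, as a device loop would, then
// reports the error message on the calling thread (empty string on success).
std::string runOnLoopThread(SketchPair& pair) {
  std::thread loop([&pair] { pair.handleEvents(); });
  loop.join();
  try {
    pair.throwIfFailed();
  } catch (const std::exception& e) {
    return e.what();
  }
  return "";
}
```

With this shape, runOnLoopThread on a SketchPair constructed with failConnect set to true returns the error text instead of aborting the process, and the loop thread exits normally.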