zeromq/zmqpp

Thread hangs while creating ZMQ context

yogeshj83 opened this issue · 3 comments

In my project I'm using ZMQ 4.2.2 and ZMQPP for communication between two daemons. The platforms are RHEL 6 and RHEL 7.
However, when one daemon tries to create the zmqpp::context object, it goes into a sleep state. Below is the gstack output:
#0 0x00007eff26c26650 in __nanosleep_nocancel () from /lib64/libc.so.6
#1 0x00007eff26c26504 in sleep () from /lib64/libc.so.6
#2 0x00007eff2545f4f1 in randombytes (x=0x7ffda14c8570 "\020(h\001", xlen=4) at src/tweetnacl.c:928
#3 0x00007eff253ea8dc in zmq::ctx_t::ctx_t (this=0x16ee230) at src/ctx.cpp:98
#4 0x00007eff25456260 in zmq_ctx_new () at src/zmq.cpp:163
#5 0x00007eff2653fbf4 in context (this=0x7eff267e0508 <IPCCommunication::getInstance()::Instance+8>) at /home/yogesh_joshi/workspaces/680/build/tools/zmq/4.2.2/include/zmqpp/context.hpp:79
#6 IPCCommunication::IPCCommunication (this=0x7eff267e0500 <IPCCommunication::getInstance()::Instance>) at ../tools/ipc/IPCCommunication.cpp:6

It remains in this state for a long time. I'm not sure what is happening here.
I have a daemon that loads a shared object, which in turn links to libzmq.so and libzmqpp.so.

Any help will be appreciated.

-Yogesh

At a guess, you don't have enough entropy for the randombytes function to return. I don't know enough about tweetnacl, but I do know that this can occur when reading from /dev/random.

I'm not sure what to suggest as a solution here, though. Sorry.
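
One way to test the entropy guess on Linux is to attempt a single read from the kernel random devices without retrying. The sketch below is only a diagnostic: `can_read_random` is a hypothetical helper, not part of libzmq or zmqpp, and it assumes the devices live at the usual /dev/random and /dev/urandom paths.

```cpp
#include <cstddef>
#include <fstream>
#include <string>
#include <vector>

// Hypothetical helper (not part of libzmq or zmqpp): try once to read
// `count` bytes from `device` and report whether all of them arrived.
bool can_read_random(const std::string& device, std::size_t count) {
    std::ifstream in(device, std::ios::binary);
    if (!in) {
        return false;  // device missing or not openable by this process
    }
    std::vector<char> buffer(count);
    in.read(buffer.data(), static_cast<std::streamsize>(count));
    // gcount() is the number of bytes the last read actually delivered.
    return static_cast<std::size_t>(in.gcount()) == count;
}
```

Calling `can_read_random("/dev/urandom", 4)` mirrors the 4-byte request (xlen=4) in frame #2 of the trace. If that call returns true from inside the failing daemon's environment, a plain entropy shortage becomes a less likely explanation. A call on /dev/random that blocks while the /dev/urandom call returns would instead point towards entropy.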

Actually, I doubt it is an entropy issue. As I mentioned, I'm using ZMQ for communication between two daemons on the same machine. The other daemon successfully gets the zmqpp::context, whereas this daemon hangs while doing the same.
The only noticeable difference I can see between the two daemons is how they link: the working daemon links libzmq and libzmqpp directly into its executable, whereas the failing daemon loads a shared library that in turn links to libzmq.so and libzmqpp.so.
I'm not sure if this is relevant, but it is the only difference I can see.
Do you think this issue could be related to the GCC version?

-Yogesh

While going through the ZMQ site, I came across the latest release, 4.2.3. In the release notes I noticed that they fixed a race condition related to tweetnacl:
https://github.com/zeromq/libzmq/releases/tag/v4.2.3
Fixed #2632 - Fix file descriptor leak when using Tweetnacl (internal NACL implementation) instead of Libsodium, and fix race condition when using multiple ZMQ contexts with Tweetnacl

My issue was resolved when I compiled my code with ZMQ 4.2.3.

Regards,
Yogesh
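
Since the fix depends on which libzmq version a process actually ends up using, a small startup check can help confirm it. The sketch below is a minimal example: `has_tweetnacl_fix` is a hypothetical helper name, and the 4.2.3 threshold is taken from the release note quoted above. In a real program the three integers would come from libzmq's `zmq_version(&major, &minor, &patch)`, declared in `zmq.h`, which reports the version of the library loaded at run time.

```cpp
// Hypothetical helper: returns true when the given libzmq version is
// 4.2.3 or newer, the release whose notes mention the tweetnacl fix.
inline bool has_tweetnacl_fix(int major, int minor, int patch) {
    if (major != 4) {
        return major > 4;   // 5.x and later pass, 3.x and earlier fail
    }
    if (minor != 2) {
        return minor > 2;   // 4.3 and later pass, 4.0 and 4.1 fail
    }
    return patch >= 3;      // within 4.2.x, require patch level 3 or higher
}
```

Logging the result from both daemons would show whether the one that loads libzmq indirectly through a shared object resolves the same library version as the one that links it directly.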