open-mpi/hwloc

GL backend hangs if something else is listening to port 600x

bgoglin opened this issue · 0 comments

The GL backend hangs in XOpenDisplay() if the corresponding port (600x for X server :x.0) has been opened by some other process. This has been reported in OMPI open-mpi/ompi#10025 and observed by several other people. It may also be related to #517, which I couldn't debug.

The issue is that the GL backend iterates over displays :0, :1, ... through :9 and has no way to be sure there's a valid X server listening on the port. We could try to check whether /tmp/.X11-unix/X0 exists before accessing :0, but I don't know how reliable/portable this is.

Maybe we should try :1 only if :0 succeeded (or failed with an error saying there's an X server but it couldn't be opened). Or maybe we should only look at the current DISPLAY? I don't know if people are still using different displays nowadays.
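
One way the two ideas could be combined, as a sketch only: if DISPLAY is set, probe just that display; otherwise walk :0, :1, ... and stop at the first display that cannot be opened. The names `probe_fn`, `probe_displays` and `fake_probe` are hypothetical. The probe is passed in as a callback so the policy can be exercised without an X server; in the backend it would wrap XOpenDisplay() and report whether the returned pointer is non-NULL.

```c
#include <stdio.h>

/* Hypothetical callback type: returns nonzero if the display could be
 * opened. In the backend this would wrap XOpenDisplay(name) != NULL. */
typedef int (*probe_fn)(const char *display_name, void *ctx);

/* Returns the number of displays successfully probed.
 * env_display is the value of DISPLAY (may be NULL or empty). */
static unsigned probe_displays(const char *env_display, unsigned max,
                               probe_fn probe, void *ctx)
{
  unsigned n, opened = 0;
  char name[16];

  /* If DISPLAY is set, restrict probing to that display. */
  if (env_display && *env_display)
    return probe(env_display, ctx) ? 1 : 0;

  /* Otherwise try :0, :1, ... and stop at the first failure. */
  for (n = 0; n < max; n++) {
    snprintf(name, sizeof(name), ":%u", n);
    if (!probe(name, ctx))
      break;
    opened++;
  }
  return opened;
}

/* Stand-in probe for illustration: succeeds while the budget stored
 * in ctx is positive, consuming one unit per call. */
static int fake_probe(const char *name, void *ctx)
{
  int *budget = ctx;
  (void)name;
  if (*budget <= 0)
    return 0;
  (*budget)--;
  return 1;
}
```

Stopping at the first failure only limits how many displays are tried. It does not by itself prevent a hang on the first display that is probed, so it would likely need to be paired with some check made before the open call.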

@eile and @marwan-abdellah, do you still care? How do you feel about restricting the code to some displays, at least by default?