brainvisa/anatomist-gpl

headless anatomist segfault in soma-forge workspace

Opened this issue · 4 comments

Describe the bug
Segfault when trying to instantiate headless anatomist within soma-forge workspace.
This problem might be related to this issue.

To Reproduce
Steps to reproduce the behavior:

  1. create a soma-forge workspace
  2. vglrun ipython
  3. run the following script:
import anatomist.headless as hana
a = hana.HeadlessAnatomist() # segfault here
w = a.createWindow('3D')
w.snapshot('snapshot.jpg', width=3000, height=2500)

The output is:

The XKEYBOARD keymap compiler (xkbcomp) reports:
> Internal error:   Could not resolve keysym XF86AudioPreset
> Internal error:   Could not resolve keysym XF86MonBrightnessCycle
> Internal error:   Could not resolve keysym XF86WWAN
> Internal error:   Could not resolve keysym XF86RFKill
> Internal error:   Could not resolve keysym XF86Keyboard
> Internal error:   Could not resolve keysym XF86RotationLockToggle
> Internal error:   Could not resolve keysym XF86FullScreen
Errors from xkbcomp are not fatal to the X server
VirtualGL found.
The XKEYBOARD keymap compiler (xkbcomp) reports:
> Internal error:   Could not resolve keysym XF86AudioPreset
> Internal error:   Could not resolve keysym XF86MonBrightnessCycle
> Internal error:   Could not resolve keysym XF86WWAN
> Internal error:   Could not resolve keysym XF86RFKill
> Internal error:   Could not resolve keysym XF86Keyboard
> Internal error:   Could not resolve keysym XF86RotationLockToggle
> Internal error:   Could not resolve keysym XF86FullScreen
Errors from xkbcomp are not fatal to the X server
The XKEYBOARD keymap compiler (xkbcomp) reports:
> Internal error:   Could not resolve keysym XF86AudioPreset
> Internal error:   Could not resolve keysym XF86MonBrightnessCycle
> Internal error:   Could not resolve keysym XF86WWAN
> Internal error:   Could not resolve keysym XF86RFKill
> Internal error:   Could not resolve keysym XF86Keyboard
> Internal error:   Could not resolve keysym XF86RotationLockToggle
> Internal error:   Could not resolve keysym XF86FullScreen
Errors from xkbcomp are not fatal to the X server
VirtualGL should work.
Running through VirtualGL + Xvfb: this is optimal.
The XKEYBOARD keymap compiler (xkbcomp) reports:
> Internal error:   Could not resolve keysym XF86AudioPreset
> Internal error:   Could not resolve keysym XF86MonBrightnessCycle
> Internal error:   Could not resolve keysym XF86WWAN
> Internal error:   Could not resolve keysym XF86RFKill
> Internal error:   Could not resolve keysym XF86Keyboard
> Internal error:   Could not resolve keysym XF86RotationLockToggle
> Internal error:   Could not resolve keysym XF86FullScreen
Errors from xkbcomp are not fatal to the X server
existing QApplication: 0
build a QApplication
create qapp
Invalid MIT-MAGIC-COOKIE-1 key
[VGL] ERROR: Could not open display :0.

Expected behavior
An empty (and therefore useless) image, snapshot.jpg, is created.

During my tests, a :2 display was created and os.environ['DISPLAY'] had the right value. The error occurs while the AnatomistSip instance is being created. Because the error message mentions display :0, I tried adding ('-display', ':2') to the AnatomistSip args, but it did not change anything.
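Since the display that fails to open (:0) is not the one stored in DISPLAY (:2), it may help to print the display-related variables right before creating the HeadlessAnatomist instance. This is only a minimal sketch: the helper name x11_env is made up for illustration, and whether VGL_DISPLAY (the VirtualGL setting for the 3D X server, which defaults to :0) plays a role here is an assumption, not something confirmed in this report.

```python
import os

# Variables that select the X servers in use. VGL_DISPLAY is VirtualGL's
# setting for the 3D X server; its relevance to this bug is an assumption.
X11_VARS = ("DISPLAY", "XAUTHORITY", "VGL_DISPLAY")


def x11_env(environ=None):
    """Return the X11/VirtualGL-related variables (None when unset)."""
    if environ is None:
        environ = os.environ
    return {name: environ.get(name) for name in X11_VARS}


if __name__ == "__main__":
    # Run this just before hana.HeadlessAnatomist() to see what Qt and
    # VirtualGL will inherit from the current process.
    for name, value in x11_env().items():
        print(f"{name}={value!r}")
```

If VGL_DISPLAY is unset while DISPLAY is :2, the ":0" in the error message would at least be consistent with VirtualGL trying to reach its default 3D X server rather than the Xvfb display.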

You normally don't need to call vglrun ipython; plain python or ipython is enough, because anatomist.headless takes care of loading vglrun (if it can...).
Anyway, I also get an error, which is a bit different (or a bit more informative):

VirtualGL found.
VirtualGL should work.
Running through VirtualGL + Xvfb: this is optimal.
existing QApplication: 0
create qapp
[VGL] ERROR: Could not load GLX/OpenGL functions
[VGL]    /volatile/riviere/casa-distro/conda/workspace/.pixi/envs/default/lib/python3.10/lib-dynload/../../libvglfaker.so: undefined symbol: glXGetProcAddressARB

So VirtualGL fails to find a GLX library that provides the symbol glXGetProcAddressARB.

Interestingly, if I run (from the pixi environment):

vglrun glxgears

or:

vglrun anatomist-bin

they seem to start normally and work. Not sure what vglrun does in this situation, but at least it doesn't complain.
However if I do:

vglrun anatomist

then I get the error:

[VGL] ERROR: Could not load GLX/OpenGL functions
[VGL]    /volatile/riviere/casa-distro/conda/workspace/.pixi/envs/default/bin/../lib/libvglfaker.so: undefined symbol: glXGetProcAddressARB

The difference between the two anatomist runs is that the former is a C++ binary executable, whereas the latter is a Python script which uses pyanatomist in an IPython kernel.
This is just an observation, though; I have no explanation for these behaviors...
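One way to compare the two cases would be to list which GL-related shared objects are actually mapped into the Python process. The sketch below is Linux-only and reads /proc/self/maps; the function name loaded_libraries and the "GL" filter are illustrative choices, not part of anatomist or VirtualGL.

```python
import os


def loaded_libraries(maps_text, needle="GL"):
    """Return sorted unique paths of mapped shared objects whose
    file name contains `needle` (case-sensitive)."""
    paths = set()
    for line in maps_text.splitlines():
        # Fields: address perms offset dev inode [pathname]
        fields = line.split(None, 5)
        if len(fields) < 6:
            continue  # anonymous mapping, no pathname
        path = fields[5].strip()
        name = path.rsplit("/", 1)[-1]
        if path.startswith("/") and ".so" in name and needle in name:
            paths.add(path)
    return sorted(paths)


if __name__ == "__main__" and os.path.exists("/proc/self/maps"):
    with open("/proc/self/maps") as maps:
        for path in loaded_libraries(maps.read()):
            print(path)
```

Calling loaded_libraries() from the same ipython session, after importing the anatomist modules but before the failing call, would show whether libGL and friends come from the host system or from the pixi environment.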

Could there be a C++ dependency that is taken from the host system with different symbols? I am not sure, because the symbol seems to be a C symbol. Anyway, I will try from within a container, to be sure that only libraries from the conda environment are used.

Possibly. I don't know how VirtualGL gets the GL/GLX libs, but anatomist-bin is linked against these:

$ ldd /volatile/riviere/casa-distro/conda/workspace/build/bin/anatomist-bin | fgrep GL
        libQt5OpenGL.so.5 => /volatile/riviere/casa-distro/conda/workspace/.pixi/envs/default/lib/libQt5OpenGL.so.5 (0x00007828f37a5000)
        libGL.so.1 => /lib/x86_64-linux-gnu/libGL.so.1 (0x00007828f2979000)
        libGLdispatch.so.0 => /lib/x86_64-linux-gnu/libGLdispatch.so.0 (0x00007828f1592000)
        libGLX.so.0 => /lib/x86_64-linux-gnu/libGLX.so.0 (0x00007828f3725000)
        libGLU.so.1 => /volatile/riviere/casa-distro/conda/workspace/.pixi/envs/default/lib/libGLU.so.1 (0x00007828f151c000)
        libOpenGL.so.0 => /lib/x86_64-linux-gnu/libOpenGL.so.0 (0x00007828e91d4000)
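
To check whether a given library can resolve glXGetProcAddressARB at all, a small ctypes probe could be used. This is a rough sketch, not how VirtualGL itself looks up the symbol; the helper name exports_symbol is hypothetical, and the library names are simply the ones from the ldd output above.

```python
import ctypes


def exports_symbol(library, symbol):
    """Return True/False depending on whether `symbol` resolves through
    `library` (dlsym also searches the library's own dependencies),
    or None if the library cannot be loaded at all."""
    try:
        handle = ctypes.CDLL(library)
    except OSError:
        return None
    # CDLL raises AttributeError for unknown symbols, so hasattr works here.
    return hasattr(handle, symbol)


if __name__ == "__main__":
    # Bare names are resolved through the normal dynamic loader search
    # path; pass an absolute path to test one specific copy instead.
    for library in ("libGL.so.1", "libGLX.so.0"):
        print(library, exports_symbol(library, "glXGetProcAddressARB"))
```

Running it once with plain python and once under vglrun python, from the pixi environment, might show whether the symbol goes missing only when the VirtualGL faker is preloaded.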