Xingyu-Lin/softgym

Segmentation Fault (OpenGL error)

wilson1yan opened this issue · 3 comments

Hi,

I've been trying to install softgym and am having trouble doing so on a remote machine.

To preface: I have successfully installed + run it on my local machine (Ubuntu 20.04, CUDA 11.3) by compiling in Docker and then running outside of it.

The remote machine is running Ubuntu 18.04 with CUDA 10.2, a setup I believe softgym should install on. I followed the same installation process as on my local machine, but when I try to run examples/random_env.py, I get the following error:

$ python examples/random_env.py --env_name RopeFlatten --headless 1
Waiting to generate environment variations. May take 1 minute for each variation...
libGL: Can't open configuration file /etc/drirc: No such file or directory.
libGL: Can't open configuration file /home/wilson/.drirc: No such file or directory.
libGL: Can't open configuration file /etc/drirc: No such file or directory.
libGL: Can't open configuration file /home/wilson/.drirc: No such file or directory.
OpenGL: glRenderbufferStorageMultisample(GL_RENDERBUFFER, samples, GL_RGBA8, width, height) - error Invalid Operation in /workspace/softgym/PyFlex/bindings/opengl/shadersGL.cpp at line 3464
python: /workspace/softgym/PyFlex/bindings/opengl/shader.cpp:81: void glAssert(const char*, long int, const char*): Assertion `0' failed.
Aborted (core dumped)

I think the main error is from the line OpenGL: glRenderbufferStorageMultisample(GL_RENDERBUFFER, samples, GL_RGBA8, width, height) - error Invalid Operation in /workspace/softgym/PyFlex/bindings/opengl/shadersGL.cpp at line 3464, and I am not sure why it's happening.

It might be that OpenGL is not working correctly for some reason, or that something goes wrong when running over ssh + headless / remote. I'm not too familiar with this, so any insight into this issue would help!

I don't think ssh itself would cause any problem. Most likely it's a compatibility issue with GL on the remote machine, or something wrong with linking. Are you also using Docker on the remote machine? I use Singularity for remote machines that don't support Docker.

Okay, I finally figured out that the issue was that turning on X11 forwarding in my ssh config was messing things up with OpenGL (maybe due to incompatible remote / local versions or something). Works now, thanks for getting back so quickly!
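
For anyone hitting the same thing, here is a rough sketch of how one might check whether the current shell looks like it is inside an ssh session with X11 forwarding turned on. It is only a heuristic and not part of softgym: the function name `looks_like_forwarded_x11` is made up for illustration, and it assumes the common sshd behaviour of setting DISPLAY to something like `localhost:10.0` for forwarded sessions.

```python
import os
import re


def looks_like_forwarded_x11(environ=None):
    """Heuristic check (hypothetical helper, not a softgym API).

    Returns True when the environment looks like an ssh session whose
    DISPLAY was set up by X11 forwarding.
    """
    env = os.environ if environ is None else environ
    display = env.get("DISPLAY", "")
    # sshd exports these variables for sessions opened over ssh.
    in_ssh = "SSH_CONNECTION" in env or "SSH_CLIENT" in env
    # Assumption: forwarded displays usually look like "localhost:10.0".
    # A local X server is normally ":0" or ":1" instead.
    forwarded = re.match(r"^(localhost|127\.0\.0\.1):\d+(\.\d+)?$", display) is not None
    return in_ssh and forwarded


if __name__ == "__main__":
    if looks_like_forwarded_x11():
        print("DISPLAY=%s looks X11-forwarded." % os.environ.get("DISPLAY"))
        print("Consider reconnecting with `ssh -x` or setting `ForwardX11 no`.")
    else:
        print("No forwarded X11 display detected.")
```

If it reports a forwarded display, reconnecting with `ssh -x`, or setting `ForwardX11 no` for that host in `~/.ssh/config`, turns forwarding off. That is the change that fixed the error above.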

Hi,

I have run into a similar problem. I am using Ubuntu 16.04 LTS, CUDA 9.2.148 and Nvidia driver version 440.64, which is the same as the softgym test environment. I am testing on a local machine, entirely outside of Docker.

I got an error similar to the one in this issue:

$ python examples/random_env.py --env_name RopeFlatten --headless 1
Waiting to generate environment variations. May take 1 minute for each variation...
OpenGL: glRenderbufferStorageMultisample(GL_RENDERBUFFER, samples, GL_RGBA8, width, height) - error Invalid Operation in /home/liaolab/softgym_ws/softgym/PyFlex/bindings/opengl/shadersGL.cpp at line 3464
python: /home/liaolab/softgym_ws/softgym/PyFlex/bindings/opengl/shader.cpp:81: void glAssert(const char*, long int, const char*): Assertion `0' failed.
Aborted (core dumped)

In addition, if I do not turn on headless mode, I get the segmentation fault mentioned in other issues:

$ python examples/random_env.py --env_name RopeFlatten
Waiting to generate environment variations. May take 1 minute for each variation...
Could not initialize GL extensions
Reshaping
Segmentation fault (core dumped)

Since I am working on a local machine, I wonder how I could handle this "Invalid Operation" error.

Thank you for any help you can offer!