Gymnasium AsyncVectorEnv for individual metaworld env
Chaoqi-LIU opened this issue · 11 comments
As the title says: when I try to create an async vector env wrapper for Metaworld 2.0.0, it fails, because gymnasium.envs.mujoco.mujoco_rendering
contains
def _import_egl(width, height):
    from mujoco.egl import GLContext
    return GLContext(width, height)
which in turn uses
class GLContext:
    """An EGL context for headless accelerated OpenGL rendering on GPU devices."""
    def __init__(self, max_width, max_height):
        del max_width, max_height  # unused
        num_configs = ctypes.c_long()  # <---- ctypes
        config_size = 1
        config = EGL.EGLConfig()
        EGL.eglReleaseThread()
        EGL.eglChooseConfig(
            EGL_DISPLAY,
            EGL_ATTRIBUTES,
            ctypes.byref(config),
            config_size,
            num_configs)
        if num_configs.value < 1:
            [skip]
As a result, multiprocessing raises ValueError: ctypes objects containing pointers cannot be pickled, which in turn breaks EGL, the viewers, and the renderer. When I use SyncVectorEnv as the wrapper, everything works fine.
Can anyone suggest a way to async-vectorize an individual env in Metaworld? Thanks.
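For reference, the pickling restriction behind that ValueError can be reproduced with the standard library alone. This is only a sketch of the general ctypes behaviour, not of what happens inside GLContext: the helper try_pickle and the Handle structure are hypothetical names made up for the illustration. As far as I can tell, plain value types such as c_long pickle fine, while pointer types and structures holding pointers are rejected.

```python
import ctypes
import pickle


def try_pickle(obj):
    """Return (True, None) if obj pickles, else (False, error message)."""
    try:
        pickle.dumps(obj)
        return True, None
    except ValueError as exc:
        return False, str(exc)


class Handle(ctypes.Structure):
    # Hypothetical structure that stores a raw address.
    _fields_ = [("ptr", ctypes.c_void_p)]


# A plain integer type holds a value, not an address, so it can be pickled.
plain_ok, _ = try_pickle(ctypes.c_long(3))

# Pointer types, and structures that contain them, are refused by ctypes.
ptr_ok, ptr_msg = try_pickle(ctypes.c_void_p())
struct_ok, struct_msg = try_pickle(Handle())

print(plain_ok, ptr_ok, struct_ok)
print(ptr_msg)
```

So anything that has to cross a process boundary (arguments, return values, exceptions put on a queue) should not carry such objects; creating the renderer inside the worker process avoids sending it at all.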
Do you need async rather than sync? If so, please raise the error in Gymnasium, not Metaworld, as we need to fix it there.
Yes, I need async. I'll raise a new issue in Gymnasium. Thanks.
Just FYI, in case this is a mistake on my side rather than a problem with Gymnasium, here's the minimal code to reproduce the error I got:
import metaworld
import gymnasium
from gymnasium.envs.mujoco.mujoco_rendering import MujocoRenderer
from gymnasium.vector.async_vector_env import AsyncVectorEnv

task_name = 'shelf-place-v2-goal-observable'
seed = 42
n_env = 2

class MetaworldEnv(gymnasium.Env):
    metadata = {'render_modes': ['rgb_array', 'depth_array']}

    def __init__(self):
        self.env = metaworld.envs.ALL_V2_ENVIRONMENTS_GOAL_OBSERVABLE[task_name](seed=seed)
        self.observation_space = self.env.observation_space
        self.action_space = self.env.action_space
        self.corner_renderer = MujocoRenderer(
            self.env.model, self.env.data, None,
            128, 128, 1000, None, 'corner', {}
        )
        self.render_mode = 'rgb_array'

    def _get_rgb(self):
        # NOTE: I'm using more than one camera in actual implementation,
        # here is just for error reproduction, so I'm using only one camera
        return {
            'default': self.env.mujoco_renderer.render('rgb_array'),  # <-- This is the default metaworld
            'corner': self.corner_renderer.render('rgb_array'),       #     renderer, and it also cannot
        }                                                             #     work. Mine too.

    def reset(self, **kwargs):
        self.env.reset()
        self.env.reset_model()
        state = self.env.reset()
        obs_dict = self._get_rgb()  # <-- This is the line that causes the error
        obs_dict['full_state'] = state
        return obs_dict

def env_fn():
    return MetaworldEnv()

env = AsyncVectorEnv([env_fn for _ in range(n_env)])
env.reset()
And the error message I got:
Exception ignored in: <function GLContext.__del__ at 0x7ec273eb8430>
Traceback (most recent call last):
File "/home/chaoqi/anaconda3/envs/mrp-beta/lib/python3.9/site-packages/mujoco/egl/__init__.py", line 130, in __del__
self.free()
File "/home/chaoqi/anaconda3/envs/mrp-beta/lib/python3.9/site-packages/mujoco/egl/__init__.py", line 120, in free
if self._context:
AttributeError: 'GLContext' object has no attribute '_context'
Exception ignored in: <function OffScreenViewer.__del__ at 0x7ec37d226790>
Traceback (most recent call last):
File "/home/chaoqi/anaconda3/envs/mrp-beta/lib/python3.9/site-packages/gymnasium/envs/mujoco/mujoco_rendering.py", line 202, in __del__
self.free()
File "/home/chaoqi/anaconda3/envs/mrp-beta/lib/python3.9/site-packages/gymnasium/envs/mujoco/mujoco_rendering.py", line 199, in free
self.opengl_context.free()
AttributeError: 'OffScreenViewer' object has no attribute 'opengl_context'
Exception ignored in: <function GLContext.__del__ at 0x7ec273eb8430>
Traceback (most recent call last):
File "/home/chaoqi/anaconda3/envs/mrp-beta/lib/python3.9/site-packages/mujoco/egl/__init__.py", line 130, in __del__
self.free()
File "/home/chaoqi/anaconda3/envs/mrp-beta/lib/python3.9/site-packages/mujoco/egl/__init__.py", line 120, in free
if self._context:
AttributeError: 'GLContext' object has no attribute '_context'
Exception ignored in: <function OffScreenViewer.__del__ at 0x7ec37d226790>
Traceback (most recent call last):
File "/home/chaoqi/anaconda3/envs/mrp-beta/lib/python3.9/site-packages/gymnasium/envs/mujoco/mujoco_rendering.py", line 202, in __del__
self.free()
File "/home/chaoqi/anaconda3/envs/mrp-beta/lib/python3.9/site-packages/gymnasium/envs/mujoco/mujoco_rendering.py", line 199, in free
self.opengl_context.free()
AttributeError: 'OffScreenViewer' object has no attribute 'opengl_context'
^CTraceback (most recent call last):
File "/home/chaoqi/Desktop/gym_meta_bug/reproduce.py", line 40, in <module>
env.reset()
File "/home/chaoqi/anaconda3/envs/mrp-beta/lib/python3.9/site-packages/gymnasium/vector/async_vector_env.py", line 226, in reset
return self.reset_wait()
File "/home/chaoqi/anaconda3/envs/mrp-beta/lib/python3.9/site-packages/gymnasium/vector/async_vector_env.py", line 299, in reset_wait
self._raise_if_errors(successes)
File "/home/chaoqi/anaconda3/envs/mrp-beta/lib/python3.9/site-packages/gymnasium/vector/async_vector_env.py", line 623, in _raise_if_errors
index, exctype, value = self.error_queue.get()
File "/home/chaoqi/anaconda3/envs/mrp-beta/lib/python3.9/multiprocessing/queues.py", line 103, in get
res = self._recv_bytes()
File "/home/chaoqi/anaconda3/envs/mrp-beta/lib/python3.9/multiprocessing/connection.py", line 216, in recv_bytes
buf = self._recv_bytes(maxlength)
File "/home/chaoqi/anaconda3/envs/mrp-beta/lib/python3.9/multiprocessing/connection.py", line 414, in _recv_bytes
buf = self._recv(4)
File "/home/chaoqi/anaconda3/envs/mrp-beta/lib/python3.9/multiprocessing/connection.py", line 379, in _recv
chunk = read(handle, remaining)
Are you using a system (e.g. an Ubuntu server) that doesn't have a display output by default? If so, this isn't an issue with Metaworld or Gymnasium. It's an issue with your server not having a display driver. You can look into the EGL or OSMesa rendering options.
Hi, I do have display output. With metaworld 0.1.0 + gym + mujoco-py I can render properly with EGL, and on macOS I also have a display. The error "AttributeError: 'OffScreenViewer' object has no attribute 'opengl_context'" is caused by pickle not being able to handle ctypes objects, which breaks the instantiation of GLContext(width, height). See my initial comment: ctypes objects appear during the instantiation of GLContext and cannot be handled.
It would be helpful if someone could run the code I provided and see whether it works for them.
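A side note on reading those "Exception ignored in ... __del__" messages: an AttributeError of that shape usually means __init__ raised before the attribute was assigned, and the destructor then ran on a half-built object. The sketch below shows only this mechanism; FragileContext is a hypothetical stand-in, not the mujoco or gymnasium class, and it says nothing about what made the real constructor fail.

```python
class FragileContext:
    """Hypothetical stand-in for a resource wrapper such as a GL context."""

    def __init__(self, fail=False):
        if fail:
            # Simulates any error raised before the attribute is assigned.
            raise RuntimeError("context creation failed")
        self._context = object()

    def free(self):
        if self._context:  # AttributeError if __init__ never got this far
            self._context = None

    def __del__(self):
        self.free()


def init_error():
    """Build a failing instance and return the name of the original error."""
    try:
        FragileContext(fail=True)
    except RuntimeError as exc:
        return type(exc).__name__
    return None


# The RuntimeError is the real failure; the AttributeError raised from
# __del__ is only reported on stderr as an "Exception ignored in" message.
print(init_error())
```

In other words, the AttributeError lines are a symptom, and the exception that interrupted the constructor is the one worth tracking down.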
What operating system are you using? Your error trace seems to be from a Linux-based system. If it's Linux, are you running an Ubuntu variant? Would it be the desktop or the server variant?
This is Ubuntu 22.04 and macOS 14.4.1, with gymnasium 1.0.0a2, metaworld 2.0.0, and mujoco 2.3.7.
Thanks for redirecting me to this issue. I'm currently busy with some deadlines. I'll try this fix tomorrow and report how macOS and Ubuntu behave with and without multiprocessing.set_start_method('spawn'). Thank you again. 🙂
I'm a bit confused about where to add the line multiprocessing.set_start_method('spawn'). I tried two places: a) at the very beginning of the async vec env file, and b) at the very beginning of the program / main. Both produced exactly the same error. Here's what I got:
- With mp.set_start_method('spawn'), I observe
MetaworldEnv: task=basketball-v2-goal-observable, seed=None
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/home/chaoqi/anaconda3/envs/mrp-beta/lib/python3.9/multiprocessing/spawn.py", line 116, in spawn_main
exitcode = _main(fd, parent_sentinel)
File "/home/chaoqi/anaconda3/envs/mrp-beta/lib/python3.9/multiprocessing/spawn.py", line 125, in _main
prepare(preparation_data)
File "/home/chaoqi/anaconda3/envs/mrp-beta/lib/python3.9/multiprocessing/spawn.py", line 236, in prepare
_fixup_main_from_path(data['init_main_from_path'])
File "/home/chaoqi/anaconda3/envs/mrp-beta/lib/python3.9/multiprocessing/spawn.py", line 287, in _fixup_main_from_path
main_content = runpy.run_path(main_path,
File "/home/chaoqi/anaconda3/envs/mrp-beta/lib/python3.9/runpy.py", line 288, in run_path
return _run_module_code(code, init_globals, run_name,
File "/home/chaoqi/anaconda3/envs/mrp-beta/lib/python3.9/runpy.py", line 97, in _run_module_code
_run_code(code, mod_globals, init_globals,
File "/home/chaoqi/anaconda3/envs/mrp-beta/lib/python3.9/runpy.py", line 87, in _run_code
exec(code, run_globals)
File "/home/chaoqi/Desktop/mrp-dev/test.py", line 12, in <module>
multiprocessing.set_start_method('spawn')
File "/home/chaoqi/anaconda3/envs/mrp-beta/lib/python3.9/multiprocessing/context.py", line 243, in set_start_method
raise RuntimeError('context has already been set')
RuntimeError: context has already been set
Traceback (most recent call last):
File "/home/chaoqi/Desktop/mrp-dev/test.py", line 30, in <module>
env = AsyncVectorEnv(env_fns)
File "/home/chaoqi/Desktop/mrp-dev/modular_policy/gymnasium_util/async_vector_env.py", line 177, in __init__
self._check_spaces()
File "/home/chaoqi/Desktop/mrp-dev/modular_policy/gymnasium_util/async_vector_env.py", line 640, in _check_spaces
pipe.send(("_check_spaces", spaces))
File "/home/chaoqi/anaconda3/envs/mrp-beta/lib/python3.9/multiprocessing/connection.py", line 206, in send
self._send_bytes(_ForkingPickler.dumps(obj))
File "/home/chaoqi/anaconda3/envs/mrp-beta/lib/python3.9/multiprocessing/connection.py", line 405, in _send_bytes
self._send(buf)
File "/home/chaoqi/anaconda3/envs/mrp-beta/lib/python3.9/multiprocessing/connection.py", line 368, in _send
n = write(self._handle, buf)
BrokenPipeError: [Errno 32] Broken pipe
- With mp.set_start_method('spawn', force=True), I observe
MetaworldEnv: task=basketball-v2-goal-observable, seed=None
MetaworldEnv: task=basketball-v2-goal-observable, seed=None
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/home/chaoqi/anaconda3/envs/mrp-beta/lib/python3.9/multiprocessing/spawn.py", line 116, in spawn_main
exitcode = _main(fd, parent_sentinel)
File "/home/chaoqi/anaconda3/envs/mrp-beta/lib/python3.9/multiprocessing/spawn.py", line 125, in _main
prepare(preparation_data)
File "/home/chaoqi/anaconda3/envs/mrp-beta/lib/python3.9/multiprocessing/spawn.py", line 236, in prepare
_fixup_main_from_path(data['init_main_from_path'])
File "/home/chaoqi/anaconda3/envs/mrp-beta/lib/python3.9/multiprocessing/spawn.py", line 287, in _fixup_main_from_path
main_content = runpy.run_path(main_path,
File "/home/chaoqi/anaconda3/envs/mrp-beta/lib/python3.9/runpy.py", line 288, in run_path
return _run_module_code(code, init_globals, run_name,
File "/home/chaoqi/anaconda3/envs/mrp-beta/lib/python3.9/runpy.py", line 97, in _run_module_code
_run_code(code, mod_globals, init_globals,
File "/home/chaoqi/anaconda3/envs/mrp-beta/lib/python3.9/runpy.py", line 87, in _run_code
exec(code, run_globals)
File "/home/chaoqi/Desktop/mrp-dev/test.py", line 30, in <module>
env = AsyncVectorEnv(env_fns)
File "/home/chaoqi/Desktop/mrp-dev/modular_policy/gymnasium_util/async_vector_env.py", line 173, in __init__
process.start()
File "/home/chaoqi/anaconda3/envs/mrp-beta/lib/python3.9/multiprocessing/process.py", line 121, in start
self._popen = self._Popen(self)
File "/home/chaoqi/anaconda3/envs/mrp-beta/lib/python3.9/multiprocessing/context.py", line 284, in _Popen
return Popen(process_obj)
File "/home/chaoqi/anaconda3/envs/mrp-beta/lib/python3.9/multiprocessing/popen_spawn_posix.py", line 32, in __init__
super().__init__(process_obj)
File "/home/chaoqi/anaconda3/envs/mrp-beta/lib/python3.9/multiprocessing/popen_fork.py", line 19, in __init__
self._launch(process_obj)
File "/home/chaoqi/anaconda3/envs/mrp-beta/lib/python3.9/multiprocessing/popen_spawn_posix.py", line 42, in _launch
prep_data = spawn.get_preparation_data(process_obj._name)
File "/home/chaoqi/anaconda3/envs/mrp-beta/lib/python3.9/multiprocessing/spawn.py", line 154, in get_preparation_data
_check_not_importing_main()
File "/home/chaoqi/anaconda3/envs/mrp-beta/lib/python3.9/multiprocessing/spawn.py", line 134, in _check_not_importing_main
raise RuntimeError('''
RuntimeError:
An attempt has been made to start a new process before the
current process has finished its bootstrapping phase.
This probably means that you are not using fork to start your
child processes and you have forgotten to use the proper idiom
in the main module:
if __name__ == '__main__':
freeze_support()
...
The "freeze_support()" line can be omitted if the program
is not going to be frozen to produce an executable.
Traceback (most recent call last):
File "/home/chaoqi/Desktop/mrp-dev/test.py", line 30, in <module>
env = AsyncVectorEnv(env_fns)
File "/home/chaoqi/Desktop/mrp-dev/modular_policy/gymnasium_util/async_vector_env.py", line 177, in __init__
self._check_spaces()
File "/home/chaoqi/Desktop/mrp-dev/modular_policy/gymnasium_util/async_vector_env.py", line 640, in _check_spaces
pipe.send(("_check_spaces", spaces))
File "/home/chaoqi/anaconda3/envs/mrp-beta/lib/python3.9/multiprocessing/connection.py", line 206, in send
self._send_bytes(_ForkingPickler.dumps(obj))
File "/home/chaoqi/anaconda3/envs/mrp-beta/lib/python3.9/multiprocessing/connection.py", line 405, in _send_bytes
self._send(buf)
File "/home/chaoqi/anaconda3/envs/mrp-beta/lib/python3.9/multiprocessing/connection.py", line 368, in _send
n = write(self._handle, buf)
BrokenPipeError: [Errno 32] Broken pipe
I'm on Ubuntu 22.04, using the display and sitting right in front of the computer. You might notice that I'm calling my own customized async vector env, but it is a direct copy and paste from the gymnasium 1.0.0a2 release.
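On the "context has already been set" error in the first trace: set_start_method may only be called once per process unless force=True is passed, and with spawn the child appears to re-run the unguarded top-level call from the main script. A minimal stdlib sketch of that behaviour follows; choose_start_method is a hypothetical helper, and no worker processes are started here.

```python
import multiprocessing as mp


def choose_start_method(method="spawn"):
    """Set the global start method once; report if it was already fixed."""
    try:
        mp.set_start_method(method)
        return "set"
    except RuntimeError:
        # Raised with 'context has already been set' on any later call.
        return "already set"


first = choose_start_method()
second = choose_start_method()

# A context object carries its own start method and leaves the global alone.
ctx = mp.get_context("spawn")
print(first, second, ctx.get_start_method())
```

If I read the Gymnasium signature correctly, AsyncVectorEnv also takes a context argument (for example context="spawn"), which would avoid touching the global start method; please check that against the installed version.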
I think you can fix both issues with the same change: move all of your non-class code into an if __name__ == "__main__":
block. Multiprocessing needs to recreate the full context of your program in each worker for your code to work, and with the spawn start method it does so by re-importing your main module. Right now, every time a new process is created, it re-runs all of your non-class code.
import multiprocessing as mp
import metaworld
import gymnasium
from gymnasium.envs.mujoco.mujoco_rendering import MujocoRenderer
from gymnasium.vector.async_vector_env import AsyncVectorEnv

# class and function definitions from above code

if __name__ == "__main__":
    task_name = 'shelf-place-v2-goal-observable'
    seed = 42
    n_env = 2
    env = AsyncVectorEnv([env_fn for _ in range(n_env)])
    env.reset()
Yes, you are right. I was in a rush and didn't realize this was causing the problem; moving everything into main is the fix. Thanks, I'll close the issue in Gymnasium as well.