NVlabs/mimicgen

robomimic train.py error

Dhanushvarma opened this issue · 3 comments

The robomimic train.py script fails with an error when run with any config file other than bc.json.

The exact command I am running:

python train.py --config robomimic/exps/templates/iris.json --dataset mimicgen_environments/datasets/core/stack_three_d0.hdf5

Error:

============= Model Summary =============
ObservationKeyToModalityDict: value not found, adding value to mapping with assumed low_dim modality!
ObservationKeyToModalityDict: mean not found, adding mean to mapping with assumed low_dim modality!
ObservationKeyToModalityDict: logvar not found, adding logvar to mapping with assumed low_dim modality!

I don't see any error in that stack trace - that output is expected. Was there any other error when running this?

SequenceDataset: loading dataset into memory...
  0%|          | 0/1000 [00:00<?, ?it/s]
run failed with error:
'Unable to synchronously open object (component not found)'

Traceback (most recent call last):
  File "train.py", line 378, in main
    train(config, device=device)
  File "train.py", line 137, in train
    trainset, validset = TrainUtils.load_data_for_training(
  File "/home/dpenmets/LIRA_work/robomimic_vd/robomimic/utils/train_utils.py", line 122, in load_data_for_training
    train_dataset = dataset_factory(config, obs_keys, filter_by_attribute=train_filter_by_attribute)
  File "/home/dpenmets/LIRA_work/robomimic_vd/robomimic/utils/train_utils.py", line 166, in dataset_factory
    dataset = SequenceDataset(**ds_kwargs)
  File "/home/dpenmets/LIRA_work/robomimic_vd/robomimic/utils/dataset.py", line 134, in __init__
    self.hdf5_cache = self.load_dataset_in_memory(
  File "/home/dpenmets/LIRA_work/robomimic_vd/robomimic/utils/dataset.py", line 289, in load_dataset_in_memory
    all_data[ep]["next_obs"] = {k: hdf5_file["data/{}/next_obs/{}".format(ep, k)][()] for k in obs_keys}
  File "/home/dpenmets/LIRA_work/robomimic_vd/robomimic/utils/dataset.py", line 289, in <dictcomp>
    all_data[ep]["next_obs"] = {k: hdf5_file["data/{}/next_obs/{}".format(ep, k)][()] for k in obs_keys}
  File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
  File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
  File "/home/dpenmets/miniconda3/envs/robosuite_vd/lib/python3.8/site-packages/h5py/_hl/group.py", line 357, in __getitem__
    oid = h5o.open(self.id, self._e(name), lapl=self._lapl)
  File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
  File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
  File "h5py/h5o.pyx", line 189, in h5py.h5o.open
KeyError: 'Unable to synchronously open object (component not found)'

Exception ignored in: <function MjRenderContext.__del__ at 0x7fbc1970fd30>
Traceback (most recent call last):
  File "/home/dpenmets/LIRA_work/robosuite_vd/robosuite/utils/binding_utils.py", line 199, in __del__
    self.gl_ctx.free()
  File "/home/dpenmets/LIRA_work/robosuite_vd/robosuite/renderers/context/egl_context.py", line 149, in free
    EGL.eglMakeCurrent(EGL_DISPLAY, EGL.EGL_NO_SURFACE, EGL.EGL_NO_SURFACE, EGL.EGL_NO_CONTEXT)
  File "/home/dpenmets/miniconda3/envs/robosuite_vd/lib/python3.8/site-packages/OpenGL/error.py", line 230, in glCheckError
    raise self._errorClass(
OpenGL.raw.EGL._errors.EGLError: EGLError(
        err = EGL_NOT_INITIALIZED,
        baseOperation = eglMakeCurrent,
        cArguments = (
                <OpenGL._opaque.EGLDisplay_pointer object at 0x7fbc11ae0140>,
                <OpenGL._opaque.EGLSurface_pointer object at 0x7fbc1b835ec0>,
                <OpenGL._opaque.EGLSurface_pointer object at 0x7fbc1b835ec0>,
                <OpenGL._opaque.EGLContext_pointer object at 0x7fbc1b835b40>,
        ),
        result = 0
)
Exception ignored in: <function EGLGLContext.__del__ at 0x7fbc1970fb80>
Traceback (most recent call last):
  File "/home/dpenmets/LIRA_work/robosuite_vd/robosuite/renderers/context/egl_context.py", line 155, in __del__
    self.free()
  File "/home/dpenmets/LIRA_work/robosuite_vd/robosuite/renderers/context/egl_context.py", line 149, in free
    EGL.eglMakeCurrent(EGL_DISPLAY, EGL.EGL_NO_SURFACE, EGL.EGL_NO_SURFACE, EGL.EGL_NO_CONTEXT)
  File "/home/dpenmets/miniconda3/envs/robosuite_vd/lib/python3.8/site-packages/OpenGL/error.py", line 230, in glCheckError
    raise self._errorClass(
OpenGL.raw.EGL._errors.EGLError: EGLError(
        err = EGL_NOT_INITIALIZED,
        baseOperation = eglMakeCurrent,
        cArguments = (
                <OpenGL._opaque.EGLDisplay_pointer object at 0x7fbc11ae0140>,
                <OpenGL._opaque.EGLSurface_pointer object at 0x7fbc1b835ec0>,
                <OpenGL._opaque.EGLSurface_pointer object at 0x7fbc1b835ec0>,
                <OpenGL._opaque.EGLContext_pointer object at 0x7fbc1b835b40>,
        ),
        result = 0
)

This happens because we left "next_obs" out of these datasets to save space. See this link for more information on extracting observations, including the "next_obs" key (which is generally only needed for offline RL methods like IRIS). You can run that observation extraction on the datasets yourself to fix this issue.
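
If you want to confirm this is what is happening before re-extracting anything, a quick check along these lines might help. It is only a rough sketch, not part of robomimic or mimicgen: the helper names `demos_missing_keys` and `check_file` are made up for illustration, and the only thing taken from the traceback above is the `data/<demo>/next_obs/<key>` layout that the loader tries to read. The observation key in the usage comment is a placeholder; use whatever keys your config lists.

```python
from typing import Iterable, List, Mapping


def demos_missing_keys(data_group: Mapping, obs_keys: Iterable[str],
                       group: str = "next_obs") -> List[str]:
    """Return the demos whose `group` is absent or lacks any of `obs_keys`.

    `data_group` can be any nested mapping; an open h5py group such as
    f["data"] supports the same `in`, iteration, and indexing operations.
    """
    obs_keys = list(obs_keys)
    missing = []
    for ep in data_group:
        demo = data_group[ep]
        # Flag the demo if the whole group is missing, or any requested key is.
        if group not in demo or any(k not in demo[group] for k in obs_keys):
            missing.append(ep)
    return missing


def check_file(path: str, obs_keys: Iterable[str]) -> List[str]:
    """Hypothetical convenience wrapper: open an HDF5 dataset and run the check."""
    import h5py  # imported here so the helper above works without h5py installed

    with h5py.File(path, "r") as f:
        return demos_missing_keys(f["data"], obs_keys)


# Example usage (path from the command above; "object" is a placeholder key):
# bad = check_file("mimicgen_environments/datasets/core/stack_three_d0.hdf5", ["object"])
# print(len(bad), "demos have no usable next_obs")
```

If every demo shows up in the returned list, the dataset simply has no "next_obs" data, which matches the KeyError raised in load_dataset_in_memory.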