enajx/HebbianMetaLearning

stack underflow error

NagisaZj opened this issue · 1 comment

Hi. I am trying to reproduce the CarRacing-v0 results with the following command:
CUDA_VISIBLE_DEVICES=5 python train_hebb.py --environment CarRacing-v0 --threads 10

Then it reports the error:

........................................................................

Initilisating Hebbian ES for CarRacing-v0 with ABCD_lr Hebbian rule


........................................................................

Starting Evolution

Run: 1665985483

........................................................................

multiprocessing.pool.RemoteTraceback: 
"""
Traceback (most recent call last):
  File "/home/zj/anaconda3/envs/hebbian2/lib/python3.8/multiprocessing/pool.py", line 125, in worker
    result = (True, func(*args, **kwds))
  File "/home/zj/anaconda3/envs/hebbian2/lib/python3.8/multiprocessing/pool.py", line 48, in mapstar
    return list(map(*args))
  File "/data2/zj/HebbianMetaLearning/evolution_strategy_hebb.py", line 48, in worker_process_hebb_coevo
    r = get_reward_func( hebb_rule,  eng,  init_weights, coeffs, coevolved_parameters) + decay
  File "/data2/zj/HebbianMetaLearning/fitness_functions.py", line 173, in fitness_hebb
    observation, reward, done, info = env.step(action)
  File "/home/zj/anaconda3/envs/hebbian2/lib/python3.8/site-packages/gym/core.py", line 263, in step
    observation, reward, done, info = self.env.step(action)
  File "/home/zj/anaconda3/envs/hebbian2/lib/python3.8/site-packages/gym/core.py", line 263, in step
    observation, reward, done, info = self.env.step(action)
  File "/home/zj/anaconda3/envs/hebbian2/lib/python3.8/site-packages/gym/wrappers/time_limit.py", line 16, in step
    observation, reward, done, info = self.env.step(action)
  File "/home/zj/anaconda3/envs/hebbian2/lib/python3.8/site-packages/gym/envs/box2d/car_racing.py", line 323, in step
    self.state = self.render("state_pixels")
  File "/home/zj/anaconda3/envs/hebbian2/lib/python3.8/site-packages/gym/envs/box2d/car_racing.py", line 399, in render
    self.render_indicators(WINDOW_W, WINDOW_H)
  File "/home/zj/anaconda3/envs/hebbian2/lib/python3.8/site-packages/gym/envs/box2d/car_racing.py", line 469, in render_indicators
    self.score_label.draw()
  File "/home/zj/anaconda3/envs/hebbian2/lib/python3.8/site-packages/pyglet/text/layout.py", line 898, in draw
    self._batch.draw()
  File "/home/zj/anaconda3/envs/hebbian2/lib/python3.8/site-packages/pyglet/graphics/__init__.py", line 557, in draw
    func()
  File "/home/zj/anaconda3/envs/hebbian2/lib/python3.8/site-packages/pyglet/text/layout.py", line 539, in unset_state
    glPopAttrib()
  File "/home/zj/anaconda3/envs/hebbian2/lib/python3.8/site-packages/pyglet/gl/lib.py", line 107, in errcheck
    raise GLException(msg)
pyglet.gl.lib.GLException: b'stack underflow'
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "train_hebb.py", line 51, in <module>
    main(sys.argv)
  File "train_hebb.py", line 45, in main
    es.run(args.generations, print_step=args.print_every, path=args.folder)
  File "/data2/zj/HebbianMetaLearning/evolution_strategy_hebb.py", line 344, in run
    rewards = self._get_rewards_coevolved(pool, population, population_coevolved)                       # Compute population fitness:  Step 6   
  File "/data2/zj/HebbianMetaLearning/evolution_strategy_hebb.py", line 272, in _get_rewards_coevolved
    rewards  = pool.map(worker_process_hebb_coevo, worker_args)
  File "/home/zj/anaconda3/envs/hebbian2/lib/python3.8/multiprocessing/pool.py", line 364, in map
    return self._map_async(func, iterable, mapstar, chunksize).get()
  File "/home/zj/anaconda3/envs/hebbian2/lib/python3.8/multiprocessing/pool.py", line 771, in get
    raise self._value
pyglet.gl.lib.GLException: b'stack underflow'

No errors are reported when I set --threads to 1, so the problem seems to occur only when the environment's rendering and multiprocessing are used together. Do you have any idea what causes it? Here is my pip list (I use Python 3.8):

akro                         0.0.8
astunparse                   1.6.3
atari-py                     0.2.6
Box2D                        2.3.10
certifi                      2022.9.24
cloudpickle                  1.3.0
cma                          2.7.0
contourpy                    1.0.5
cycler                       0.11.0
Deprecated                   1.2.13
dm-tree                      0.1.6
dowel                        0.0.3
flatbuffers                  2.0
fonttools                    4.37.4
future                       0.18.2
garage                       2021.3.0
gast                         0.4.0
google-pasta                 0.2.0
gym                          0.17.1
gym-notices                  0.0.8
importlib-metadata           5.0.0
keras                        2.7.0
Keras-Preprocessing          1.1.2
kiwisolver                   1.4.4
libclang                     12.0.0
llvmlite                     0.32.1
matplotlib                   3.2.1
mkl-fft                      1.3.1
mkl-random                   1.2.2
mkl-service                  2.4.0
numba                        0.49.0
numpy                        1.23.1
opencv-python                4.2.0.34
opt-einsum                   3.3.0
packaging                    21.3
Pillow                       9.2.0
pip                          22.2.2
pybullet                     2.6.6
pyglet                       1.5.0
pyparsing                    3.0.9
python-dateutil              2.8.2
ray                          1.9.0
redis                        4.0.2
scipy                        1.8.1
setproctitle                 1.2.2
setuptools                   63.4.1
six                          1.16.0
tabulate                     0.8.9
tensorboard-data-server      0.6.1
tensorflow                   2.7.0
tensorflow-estimator         2.7.0
tensorflow-io-gcs-filesystem 0.23.1
tensorflow-probability       0.15.0
termcolor                    1.1.0
torch                        1.6.0
torchvision                  0.8.2
typing_extensions            4.4.0
wheel                        0.37.1
zipp                         3.9.0
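Since the crash only shows up with more than one worker, one thing that could be tried is starting the workers with the "spawn" method, so that no worker inherits rendering state from the parent process. The sketch below is only an illustration of that idea, not a confirmed fix: `make_spawn_pool` and `evaluate_in_worker` are hypothetical names that do not exist in the repository, and the environment is assumed to follow the old gym API where `step` returns four values, as in the traceback above.

```python
import multiprocessing as mp


def make_spawn_pool(n_workers):
    """Return a Pool whose workers start as fresh interpreters ("spawn")
    instead of forked copies of the parent process."""
    ctx = mp.get_context("spawn")
    return ctx.Pool(processes=n_workers)


def evaluate_in_worker(args):
    """Hypothetical worker: build the environment inside the worker so that
    any rendering state it creates belongs to this process only.

    args is (env_factory, n_steps). With "spawn", env_factory must be
    picklable, e.g. a function defined at module top level.
    """
    env_factory, n_steps = args
    env = env_factory()  # created only after the worker has started
    try:
        total = 0.0
        env.reset()
        for _ in range(n_steps):
            # Random actions stand in for the real policy (assumption).
            _, reward, done, _ = env.step(env.action_space.sample())
            total += reward
            if done:
                break
        return total
    finally:
        env.close()
```

Usage would look like `make_spawn_pool(10).map(evaluate_in_worker, [(make_env, 1000)] * 10)`, called from under an `if __name__ == "__main__":` guard, where `make_env` is a top-level function (also hypothetical) that returns `gym.make("CarRacing-v0")`.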

I suspect it's caused by package versions. Can you provide the detailed versions of python, gym, pyglet and opencv-python? Thanks a lot.

enajx commented

I can't reproduce the error. If you are running on a headless server, remember to launch the script with xvfb-run.
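For the headless case, a small launcher can decide whether to prefix the command with xvfb-run. This is a rough sketch under two assumptions: that an unset or empty `DISPLAY` variable means there is no X server (the usual convention on Linux), and that `xvfb-run` is on the `PATH`. The helper names `is_headless` and `wrap_with_xvfb` are made up for this example.

```python
import os
import shutil


def is_headless(environ=None):
    """Treat a missing or empty DISPLAY variable as 'no X server available'."""
    environ = os.environ if environ is None else environ
    return not environ.get("DISPLAY")


def wrap_with_xvfb(cmd, environ=None):
    """Prefix cmd with 'xvfb-run -a' when headless and xvfb-run is installed.

    -a asks xvfb-run to pick a free X server number automatically.
    The command is returned unchanged in every other case.
    """
    if is_headless(environ) and shutil.which("xvfb-run"):
        return ["xvfb-run", "-a"] + list(cmd)
    return list(cmd)


train_cmd = ["python", "train_hebb.py",
             "--environment", "CarRacing-v0", "--threads", "10"]
print(" ".join(wrap_with_xvfb(train_cmd)))
```

The resulting list can be passed to `subprocess.run` as-is.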

Here's a minimal pip list that runs without errors:


box2d-py           2.3.8
cloudpickle        2.2.0
gym                0.21.0
gym-notices        0.0.8
importlib-metadata 5.0.0
llvmlite           0.39.1
numba              0.56.3
numpy              1.23.4
opencv-python      4.6.0.66
pip                22.1.1
pybullet           3.2.5
pygame             2.1.2
pyglet             1.5.27
setuptools         62.3.2
swig               4.0.2
torch              1.12.1
typing_extensions  4.4.0
zipp               3.9.0
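
To see at a glance where the two environments differ, the versions from both lists can be compared with a few lines of Python. The dictionaries below are copied by hand from the two pip lists in this thread and only cover packages that appear in both under the same name (Box2D and box2d-py are listed under different names, so they are left out). `version_diff` is just an illustrative helper.

```python
# Versions copied from the two pip lists in this thread.
reporter = {
    "cloudpickle": "1.3.0", "gym": "0.17.1", "llvmlite": "0.32.1",
    "numba": "0.49.0", "numpy": "1.23.1", "opencv-python": "4.2.0.34",
    "pybullet": "2.6.6", "pyglet": "1.5.0", "torch": "1.6.0",
}
maintainer = {
    "cloudpickle": "2.2.0", "gym": "0.21.0", "llvmlite": "0.39.1",
    "numba": "0.56.3", "numpy": "1.23.4", "opencv-python": "4.6.0.66",
    "pybullet": "3.2.5", "pyglet": "1.5.27", "torch": "1.12.1",
}


def version_diff(a, b):
    """Return {package: (version_in_a, version_in_b)} for shared packages
    whose version strings differ."""
    shared = sorted(a.keys() & b.keys())
    return {name: (a[name], b[name]) for name in shared if a[name] != b[name]}


for name, (old, new) in version_diff(reporter, maintainer).items():
    print(f"{name:15s} {old:10s} -> {new}")
```

Of the packages involved in the traceback, gym (0.17.1 vs 0.21.0) and pyglet (1.5.0 vs 1.5.27) both differ, so aligning those two first would be a reasonable starting point.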