Bug in cleanrl_utils/evals

Problem Description

The model evaluation feature cannot be used.

The main reason is that #370 migrated the DQN implementation from gym to gymnasium, while the eval module still uses gym to build the evaluation environment.

Checklist

I have installed dependencies via poetry install (see CleanRL's installation guideline).
I have checked that there is no similar issue in the repo.
I have checked the documentation site and found no relevant information in GitHub issues.

Current Behavior

Expected Behavior

Possible Solution

Update the cleanrl_utils/evals/* module.

Steps to Reproduce

Running the command python dqn_atari_jax.py --save-model True --total-timesteps 2000 --learning-starts 1000 will throw an error:

Traceback (most recent call last):
  File "/home/server/ZYX/cleanrl/dqn_atari_jax.py", line 295, in <module>
    episodic_returns = evaluate(
  File "/home/server/ZYX/cleanrl/cleanrl_utils/evals/dqn_jax_eval.py", line 22, in evaluate
    envs = gym.vector.SyncVectorEnv([make_env(env_id, 0, 0, capture_video, run_name)])
  File "/home/server/anaconda3/envs/zyx_cleanrl_pr344/lib/python3.9/site-packages/gym/vector/sync_vector_env.py", line 64, in __init__
    super().__init__(
  File "/home/server/anaconda3/envs/zyx_cleanrl_pr344/lib/python3.9/site-packages/gym/vector/vector_env.py", line 38, in __init__
    self.observation_space = batch_space(observation_space, n=num_envs)
  File "/home/server/anaconda3/envs/zyx_cleanrl_pr344/lib/python3.9/functools.py", line 888, in wrapper
    return dispatch(args[0].__class__)(*args, **kw)
  File "/home/server/anaconda3/envs/zyx_cleanrl_pr344/lib/python3.9/site-packages/gym/vector/utils/spaces.py", line 40, in batch_space
    raise ValueError(
ValueError: Cannot batch space with type `<class 'gymnasium.spaces.box.Box'>`. The space must be a valid `gym.Space` instance.
free(): invalid pointer
Aborted (core dumped)
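The traceback goes through a functools dispatch wrapper before gym's batch_space raises, which suggests that batch_space picks a handler based on the class of the space and falls back to an error for classes it does not know. The sketch below reproduces that mechanism with the standard library only. GymSpace, GymBox, and GymnasiumBox are hypothetical stand-ins for the real space classes, and the handler body is an assumption made for illustration, not gym's actual code.

```python
from functools import singledispatch


# Hypothetical stand-ins for the two libraries' space classes (not the real ones).
class GymSpace:
    """Plays the role of gym.Space."""


class GymBox(GymSpace):
    """Plays the role of gym.spaces.Box."""


class GymnasiumBox:
    """Plays the role of gymnasium.spaces.Box; it does not inherit from GymSpace."""


@singledispatch
def batch_space(space, n=1):
    # Fallback used for any type that has no registered handler.
    raise ValueError(
        f"Cannot batch space with type `{type(space)}`. "
        "The space must be a valid `gym.Space` instance."
    )


@batch_space.register(GymBox)
def _batch_gym_box(space, n=1):
    # Only the stand-in for gym's own Box type reaches this branch.
    return f"batched GymBox x{n}"


# A space from the "gym" hierarchy is dispatched to the registered handler.
print(batch_space(GymBox(), n=1))

# A space from the other hierarchy falls through to the ValueError fallback.
try:
    batch_space(GymnasiumBox(), n=1)
except ValueError as err:
    print("ValueError:", err)
```

Under this reading, the training script hands the eval module an environment whose spaces come from gymnasium, while the vector env wrapper that batches them comes from gym, so no registered handler matches.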

Ah nice catch! Thanks for the report. Would you be up to creating a fix for this? We should also add end-to-end test cases for the evaluation scripts.

I am working on this and will create a PR later. I haven't figured out how to write the test yet.

Thanks! Regarding tests, we usually just make sure the script runs without errors with something like this:

cleanrl/tests/test_atari_jax.py

Lines 4 to 9 in 9f8b64b

    def test_dqn_jax():
        subprocess.run(
            "python cleanrl/dqn_atari_jax.py --learning-starts 10 --total-timesteps 16 --buffer-size 10 --batch-size 4",
            shell=True,
            check=True,
        )
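Following the same pattern, an end-to-end check for the evaluation path could reuse the existing test command and add the --save-model True flag from the reproduction steps. This is a rough sketch and not the project's actual test: it assumes that --save-model True is what leads the script into evaluate(), as the traceback above suggests. The names EVAL_COMMAND and check_dqn_jax_eval are hypothetical, and the run parameter is there only so the helper can be exercised without launching training.

```python
import subprocess

# Flags are taken from the existing test and from the reproduction command in
# this issue. Assumption: `--save-model True` triggers the evaluation code path.
EVAL_COMMAND = (
    "python cleanrl/dqn_atari_jax.py --save-model True "
    "--learning-starts 10 --total-timesteps 16 --buffer-size 10 --batch-size 4"
)


def check_dqn_jax_eval(run=subprocess.run):
    """Run a short training job followed by evaluation (hypothetical helper)."""
    # check=True raises CalledProcessError if the script exits with a non-zero status.
    return run(EVAL_COMMAND, shell=True, check=True)
```

In a test file this could be wrapped in a function whose name starts with test_, so that pytest collects it alongside test_dqn_jax.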

Then we run GitHub Actions to ensure the script runs on multiple operating systems such as Windows, Linux, and macOS.

cleanrl/.github/workflows/tests.yaml

Lines 18 to 55 in 9f8b64b

        python-version: [3.8]
        poetry-version: [1.3]
        os: [ubuntu-22.04, macos-latest, windows-latest]
    runs-on: ${{ matrix.os }}
    steps:
      - uses: actions/checkout@v2
      - uses: actions/setup-python@v2
        with:
          python-version: ${{ matrix.python-version }}
      - name: Run image
        uses: abatilo/actions-poetry@v2.0.0
        with:
          poetry-version: ${{ matrix.poetry-version }}

      # classic control tests
      - name: Install core dependencies
        run: poetry install -E pytest
      - name: Downgrade setuptools
        run: poetry run pip install setuptools==59.5.0
      - name: Run core tests
        run: poetry run pytest tests/test_classic_control.py
      - name: Install jax
        if: runner.os == 'Linux' || runner.os == 'macOS'
        run: poetry install -E "pytest jax"
      - name: Run gymnasium migration dependencies
        run: poetry run pip install "stable_baselines3==2.0.0a1"
      - name: Run gymnasium tests
        run: poetry run pytest tests/test_classic_control_gymnasium.py
      - name: Run core tests with jax
        if: runner.os == 'Linux' || runner.os == 'macOS'
        run: poetry run pytest tests/test_classic_control_jax.py
      - name: Run gae tests with jax
        if: runner.os == 'Linux' || runner.os == 'macOS'
        run: poetry run pytest tests/test_jax_compute_gae.py
      - name: Install tuner dependencies
        run: poetry install -E "pytest optuna"
      - name: Run tuner tests
        run: poetry run pytest tests/test_tuner.py
