lcswillems/rl-starter-files

Handle full observability

leobix opened this issue · 8 comments

Hi Lucas!

If we use the FullyObsWrapper on a MiniGrid environment, the observation_space changes from Dict(image:Box(7, 7, 3)) to Box(19, 19, 3) (19 is just an example).

In utils/format.py, the get_preprocessor function first checks if re.match("MiniGrid-.*", env_id), which assumes that every MiniGrid environment is partially observable. As a result, it cannot handle a fully observable MiniGrid environment.

We could just swap the order of the if ... and elif ... branches to make it work, but I am not sure this would be optimal, which is why I preferred to open an issue.

Thanks :)

Hi Leobix,

The code is not assuming any size at all. It takes the size given by the Gym MiniGrid environment observation_space.

I think the issue comes instead from the Gym MiniGrid environment, where the observation space is always the same, whether the environment is partially or fully observable.

The observation_space is not the same when using the FullyObsWrapper, it's a box instead of a dict, and usually larger than (7,7,3). The issue is that the preprocessor checks that the environment name starts with MiniGrid- to decide what it does with the observation. It should probably check that the observation space is a dict instead.
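As a rough illustration of that suggestion, here is a minimal sketch that dispatches on the type of the observation space rather than on the environment name. To keep it runnable without gym installed, `Box` and `Dict` below are simplified stand-ins for `gym.spaces.Box` and `gym.spaces.Dict`, and `describe_obs_space` is a hypothetical helper, not the function in utils/format.py.

```python
class Box:
    """Stand-in for gym.spaces.Box (assumption: only the shape matters here)."""
    def __init__(self, shape):
        self.shape = shape


class Dict:
    """Stand-in for gym.spaces.Dict (assumption: wraps a dict of sub-spaces)."""
    def __init__(self, spaces):
        self.spaces = spaces


def describe_obs_space(obs_space):
    """Hypothetical helper: pick the image shape based on the space type."""
    # Dict space: the image lives under the "image" key.
    if isinstance(obs_space, Dict) and "image" in obs_space.spaces:
        return {"image": obs_space.spaces["image"].shape}
    # Bare Box: the space itself describes the image
    # (what this thread reports for FullyObsWrapper).
    if isinstance(obs_space, Box):
        return {"image": obs_space.shape}
    raise ValueError("Unsupported observation space: " + repr(obs_space))


partial = Dict({"image": Box((7, 7, 3))})
full = Box((19, 19, 3))
print(describe_obs_space(partial))
print(describe_obs_space(full))
```

Because the check never looks at the environment id, the same code path would apply to any environment whose observation space has one of these two forms.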

Okay, I see, sorry for my misunderstanding. But does this mean that the FullyObsWrapper observation doesn't contain any instruction?

Indeed, with the FullyObsWrapper you no longer have the instruction or the mission in the observation, just the observation tensor.
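To make the difference concrete, the sketch below handles both observation layouts described in this thread: a dict carrying an image (and possibly a mission string), and a bare tensor. `split_observation` is a hypothetical helper, and plain nested lists stand in for the real arrays.

```python
def split_observation(obs):
    """Hypothetical helper: return (image, mission) for either layout.

    mission is None when the observation is a bare tensor, since there is
    no text to recover in that case.
    """
    if isinstance(obs, dict):
        # Dict layout: image under "image", text under "mission" if present.
        return obs["image"], obs.get("mission")
    # Bare-tensor layout: the observation is the image itself.
    return obs, None


# Stand-in observations (assumption: tiny nested lists instead of arrays).
dict_obs = {"image": [[0, 1], [1, 0]], "mission": "go to the goal"}
bare_obs = [[0, 1], [1, 0]]

print(split_observation(dict_obs))
print(split_observation(bare_obs))
```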

@leobix If you have some working code, could you post it here? Are you sure it is sufficient to swap the if and elif?

Edit: Yes it is sufficient. I will commit soon.

@leobix I have committed. Can you tell me if you still have the issue?

I get this error now:

maximecb@T740p:~/Desktop/rl-starter-files$ python3 -m scripts.train --algo ppo --env MiniGrid-Empty-8x8-v0 --model DoorKey --save-interval 10 --frames 8000000
/home/maximecb/Desktop/rl-starter-files/scripts/train.py --algo ppo --env MiniGrid-Empty-8x8-v0 --model DoorKey --save-interval 10 --frames 8000000

Namespace(algo='ppo', batch_size=256, clip_eps=0.2, discount=0.99, entropy_coef=0.01, env='MiniGrid-Empty-8x8-v0', epochs=4, frames=8000000, frames_per_proc=None, gae_lambda=0.95, log_interval=1, lr=0.0007, max_grad_norm=0.5, mem=False, model='DoorKey', optim_alpha=0.99, optim_eps=1e-05, procs=16, recurrence=1, save_interval=10, seed=1, tb=False, text=False, value_loss_coef=0.5)

Traceback (most recent call last):
  File "/usr/lib/python3.6/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/usr/lib/python3.6/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/home/maximecb/Desktop/rl-starter-files/scripts/train.py", line 105, in <module>
    obs_space, preprocess_obss = utils.get_obss_preprocessor(args.env, envs[0].observation_space, model_dir)
  File "/home/maximecb/Desktop/rl-starter-files/utils/format.py", line 12, in get_obss_preprocessor
    print(obs_space.spaces.keys())
AttributeError: 'Box' object has no attribute 'spaces'

@lcswillems to try the FullyObsWrapper, you only need to add two lines to scripts/train.py:

from gym_minigrid.wrappers import FullyObsWrapper

# Add after gym.make(...)
env = FullyObsWrapper(env)

Then you can test with:

python3 -m scripts.train --algo ppo --env MiniGrid-Empty-8x8-v0 --model DoorKey --save-interval 10 --frames 8000000

Thank you Maxime!

It is fixed now.