lcswillems/rl-starter-files

ZeroDivisionError when running with FullyObsWrapper

ErikVester opened this issue · 4 comments

Hi,

First of all, thank you for this great project!

I am trying to work with a fully observable environment and I am doing so by making the following changes to env.py:

def make_env(env_key, seed=None):
    env = gym.make(env_key)
    env = gym_minigrid.wrappers.FullyObsWrapper(env)
    env.reset()
    env.seed(seed)
    return env

When I run this on the MiniGrid-DoorKey-5x5-v0, MiniGrid-Empty-5x5-v0 and
MiniGrid-Empty-6x6-v0 environments - I think all the small environments - I get the following error:
ZeroDivisionError: float division by zero

This seems to happen during the initialisation of the acmodel:
acmodel = ACModel(obs_space, envs[0].action_space, args.mem, args.text)

I tried this for the MiniGrid-DoorKey-8x8-v0 environment as well and here I do not get this error. Any idea what this could be and how we could solve it?

Kind regards,
Erik

Hi,

I think the problem is that with a fully observable environment such as MiniGrid-DoorKey-5x5-v0, the observation size is 5x5x3, and this size is not suitable for the default convolutional model in model.py, since the height and width shrink too quickly. For n = 5 the formula below gives ((5-1)//2-2) = 0, so image_embedding_size comes out as 0 and initialising the subsequent linear layers fails with the ZeroDivisionError you report. This is the default model in model.py -

# Define image embedding
self.image_conv = nn.Sequential(
    nn.Conv2d(3, 16, (2, 2)),
    nn.ReLU(),
    nn.MaxPool2d((2, 2)),
    nn.Conv2d(16, 32, (2, 2)),
    nn.ReLU(),
    nn.Conv2d(32, 64, (2, 2)),
    nn.ReLU()
)
n = obs_space["image"][0]
m = obs_space["image"][1]
self.image_embedding_size = ((n-1)//2-2)*((m-1)//2-2)*64
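To see the collapse concretely, here is a small standalone check (not repo code) tracing the spatial size of an n x n observation through each layer:

def embedding_cells(n):
    n = n - 1   # Conv2d(3, 16, (2, 2)): 2x2 kernel, stride 1
    n = n // 2  # MaxPool2d((2, 2)): halves the size
    n = n - 1   # Conv2d(16, 32, (2, 2))
    n = n - 1   # Conv2d(32, 64, (2, 2))
    return n

for n in (5, 6, 7, 8):
    print(n, embedding_cells(n))
# 5 -> 0 and 6 -> 0: image_embedding_size is 0, hence the ZeroDivisionError
# 7 -> 1 and 8 -> 1: larger grids survive, matching the 8x8 report above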

It should work if you change the model by reducing the number of convolutional layers (or by introducing padded convolutions) and adjusting the image embedding size accordingly. For example, this works for me -

self.image_conv = nn.Sequential(
    nn.Conv2d(3, 16, (2, 2)),
    nn.ReLU(),
    nn.Conv2d(16, 32, (2, 2)),
    nn.ReLU(),
)
n = obs_space["image"][0]
m = obs_space["image"][1]
self.image_embedding_size = (n-2)*(m-2)*32
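As a quick sanity check (standalone snippet, not repo code), the reduced stack keeps a 5x5 input alive:

import torch
import torch.nn as nn

# Standalone check: the reduced conv stack on a fully observable 5x5 image.
conv = nn.Sequential(
    nn.Conv2d(3, 16, (2, 2)),
    nn.ReLU(),
    nn.Conv2d(16, 32, (2, 2)),
    nn.ReLU(),
)
x = torch.zeros(1, 3, 5, 5)            # batch of one 5x5x3 observation
print(conv(x).shape)                   # torch.Size([1, 32, 3, 3])
print(conv(x).flatten(1).shape[1])     # 288 == (5-2)*(5-2)*32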

@Aditya-Ramesh-10 I did something along these lines to go for full observability. This is in ACModel:

# Define image embedding
self.image_conv = nn.Sequential(
    # Pad the first conv on small grids so 5x5 and 6x6 inputs survive the stack
    nn.Conv2d(3, 16, (2, 2), padding=1 if obs_space['image'][0] < 7 else 0),
    nn.ReLU(),
    nn.MaxPool2d((2, 2)),
    nn.Conv2d(16, 32, (2, 2)),
    nn.ReLU(),
    nn.Conv2d(32, 64, (2, 2)),
    nn.ReLU()
)
n = obs_space["image"][0]
m = obs_space["image"][1]
# The padded branch ends at 1x1x64, so clamp the embedding size to 64
self.image_embedding_size = max(((n-1)//2-2)*((m-1)//2-2)*64, 64)
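For reference, a quick shape check (standalone snippet, not repo code) confirms the padded stack ends at 1x1x64 on a 5x5 input:

import torch
import torch.nn as nn

# Standalone check: padding the first conv keeps a 5x5 input alive,
# ending at 1x1x64 as the max(..., 64) above expects.
conv = nn.Sequential(
    nn.Conv2d(3, 16, (2, 2), padding=1),
    nn.ReLU(),
    nn.MaxPool2d((2, 2)),
    nn.Conv2d(16, 32, (2, 2)),
    nn.ReLU(),
    nn.Conv2d(32, 64, (2, 2)),
    nn.ReLU()
)
print(conv(torch.zeros(1, 3, 5, 5)).shape)  # torch.Size([1, 64, 1, 1])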

This was enough to train the model. However, it doesn't learn well at all! I suspect this allocentric (as opposed to egocentric) observation just isn't well suited to CNNs. We really need a fundamentally different architecture that doesn't use CNNs. I'm working on one now.

@SkittlePox Hi there! Were you ever able to figure out an architecture that worked with the fully observable wrapper? I've been struggling to figure this out myself; I've played around with a couple of MLP architectures that use embedding layers but no luck.
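For concreteness, the rough shape of what I've been trying is below (purely a sketch, untested for learning performance; the class name and sizes are illustrative, assuming FullyObsWrapper's integer-encoded (object, color, state) channels):

import torch
import torch.nn as nn

class GridMLPEncoder(nn.Module):
    # Illustrative encoder: embed each integer channel of the fully
    # observable grid, then flatten into an MLP (no convolutions).
    def __init__(self, width, height, embed_dim=8, hidden=128):
        super().__init__()
        self.object_embed = nn.Embedding(11, embed_dim)  # 11 object types in gym_minigrid
        self.color_embed = nn.Embedding(6, embed_dim)    # 6 colors
        self.state_embed = nn.Embedding(3, embed_dim)    # door states: open/closed/locked
        self.mlp = nn.Sequential(
            nn.Linear(width * height * 3 * embed_dim, hidden),
            nn.ReLU(),
            nn.Linear(hidden, hidden),
            nn.ReLU(),
        )
        self.embedding_size = hidden

    def forward(self, image):
        # image: (batch, width, height, 3) integer observation tensor
        image = image.long()
        x = torch.cat([
            self.object_embed(image[..., 0]),
            self.color_embed(image[..., 1]),
            self.state_embed(image[..., 2]),
        ], dim=-1)                     # (batch, width, height, 3 * embed_dim)
        return self.mlp(x.flatten(1))  # (batch, hidden)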

Hi, has anyone been able to figure out an architecture that works with the fully observable or symbolic wrapper? Any help will be greatly appreciated!