lcswillems/rl-starter-files

Training from image

Closed this issue · 4 comments

I wanted to try training the model from the environment's RGB image.
I added a wrapper adapted from miniworld, like this:

import gym
import numpy as np
import skimage.transform

resolution = 32

class ImageWrapper(gym.core.ObservationWrapper):

    def __init__(self, env):
        super().__init__(env)
        self.__dict__.update(vars(env))  # hack to pass values to the wrapped env
        self.observation_space = gym.spaces.Box(
            low=0,
            high=255,
            shape=(resolution, resolution, 3),
            dtype='uint8'
        )

    def reset(self):
        obs = self.env.reset()
        obs["image"] = self._render_image()
        return obs

    def step(self, action):
        obs, reward, done, info = self.env.step(action)
        obs["image"] = self._render_image()
        return obs, reward, done, info

    def _render_image(self):
        # Render the full frame and downscale it to resolution x resolution.
        img = self.env.render(mode="rgb_array")
        img = skimage.transform.resize(img, (resolution, resolution), anti_aliasing=False)
        # resize returns float64 in [0, 1]; convert back to uint8
        # so the observation matches the declared Box.
        return (img * 255).astype(np.uint8)
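One pitfall with this approach: skimage.transform.resize returns a float64 array in [0, 1] by default, while the declared observation space is a uint8 Box, so the stored image silently violates the space. A minimal numpy-only sketch of the conversion back to uint8 (the helper name to_uint8 is mine):

```python
import numpy as np

def to_uint8(img):
    """Convert a float image in [0, 1] (as returned by
    skimage.transform.resize by default) to uint8 in [0, 255]."""
    return (np.clip(img, 0.0, 1.0) * 255).astype(np.uint8)

# A 4x4 RGB float image standing in for a resized frame.
float_img = np.linspace(0.0, 1.0, 4 * 4 * 3).reshape(4, 4, 3)
u8 = to_uint8(float_img)
print(u8.dtype, u8.min(), u8.max())  # uint8 0 255
```

Alternatively, skimage.transform.resize accepts preserve_range=True to keep the original 0–255 scale, though the result is still float and needs an explicit astype.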

I know this solution is not very elegant, but it is just for testing.
I tried changing the image size from 7 to 32 px, but the results always look similar to:
[screenshot from 2019-03-01: training curve]

When the image size is 7 px, shouldn't I obtain results similar to not using my wrapper at all? However, my results are much worse. Has anyone tried training from an RGB image?

I am sorry but I don't understand what you are doing:

  • which environment do you use?
  • what is your image?
  • what reward do you get, and after how many updates, when you train with 7 px images?

Even if I don't fully understand what you are doing, one possible reason is that your network has more weights and therefore takes longer to learn.
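To give a rough sense of the scale difference (back-of-the-envelope arithmetic, assuming the default 7×7×3 MiniGrid observation versus a 32×32×3 rendered image):

```python
# Number of input values per observation.
small = 7 * 7 * 3      # default MiniGrid observation
large = 32 * 32 * 3    # 32x32 RGB rendering
print(small, large, large // small)  # 147 3072 20
```

So a fully connected first layer sees roughly 20x more inputs, and convolutional feature maps grow similarly; more weights generally means more samples before learning takes off.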

Thank you for your help.

I use the MiniGrid environment MiniGrid-DoorKey-8x8-v0.
My image is the full rendered image of the environment (obtained with the wrapper I wrote above).
As for the reward: the agent learns nothing after 1777664 frames (the reward sometimes gets above 0, but it does not converge):
U 868 | F 1777664 | FPS 0203 | D 7898 | rR:μσmM 0.00 0.00 0.00 0.00 | F:μσmM 640.0 0.0 640.0 640.0 | H 1.946 | V 0.000 | pL 0.000 | vL 0.000 | ∇ 0.000

For this experiment, I use a 7×7×3 image. Since the image is very simple, shouldn't the results be similar to training on the original observations? (Their shape is identical to the images I use.)

I see. I have never tried that, but indeed, I would expect it to learn easily.

Did you try to:

  • print the images the agent sees, to make sure they are correct,
  • increase/decrease the learning rate,
  • try intermediate image sizes to see at which size it stops learning?
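The first check above can be partly automated; a numpy-only sketch (check_obs, the expected shape, and the constant-image heuristic are all illustrative):

```python
import numpy as np

def check_obs(obs, shape=(32, 32, 3)):
    """Sanity-check an image observation before training on it:
    expected shape, uint8 dtype, and a non-constant value range."""
    img = obs["image"]
    assert img.shape == shape, f"unexpected shape {img.shape}"
    assert img.dtype == np.uint8, f"unexpected dtype {img.dtype}"
    # An all-constant image usually means rendering or resizing failed.
    assert img.min() < img.max(), "image is constant"
    return img

# Deterministic stand-in for a wrapped observation.
fake = (np.arange(32 * 32 * 3) % 256).astype(np.uint8).reshape(32, 32, 3)
img = check_obs({"image": fake})
print("ok", img.shape)  # ok (32, 32, 3)
```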

I don't think the issue you have is related to the code in this repository.

I close this issue because I don't have enough information to work on it. Feel free to reopen it if you have more to add (see my previous message).