Training from image
Closed this issue · 4 comments
I wanted to try training the model from the environment's RGB image.
I added a wrapper, modeled on MiniWorld, like this:
import gym
import skimage.transform

resolution = 32

class ImageWrapper(gym.core.ObservationWrapper):
    def __init__(self, env):
        super().__init__(env)
        self.__dict__.update(vars(env))  # hack to pass values to super wrapper
        self.observation_space = gym.spaces.Box(
            low=0,
            high=255,
            shape=(resolution, resolution, 3),
            dtype='uint8'
        )

    def reset(self):
        obs = self.env.reset()
        obs["image"] = self._render_resized()
        return obs

    def step(self, action):
        obs, reward, done, info = self.env.step(action)
        obs["image"] = self._render_resized()
        return obs, reward, done, info

    def _render_resized(self):
        img = self.env.render(mode="rgb_array")
        # resize() returns float64 scaled to [0, 1] by default;
        # preserve_range + astype keeps the uint8 range declared above
        return skimage.transform.resize(
            img, (resolution, resolution),
            anti_aliasing=False, preserve_range=True
        ).astype('uint8')
I know this solution is not very elegant, but it's just for testing.
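One pitfall I ran into while testing, which may or may not apply here (this is a sketch of skimage's default behaviour, not code from this repo): skimage.transform.resize returns a float64 array rescaled into [0, 1] by default, while the observation_space above declares uint8 in [0, 255]:

```python
import numpy as np
import skimage.transform

img = np.random.randint(0, 256, size=(64, 64, 3), dtype=np.uint8)

# Default behaviour: the result is float64, rescaled into [0, 1].
resized = skimage.transform.resize(img, (32, 32), anti_aliasing=False)
print(resized.dtype, float(resized.max()) <= 1.0)

# preserve_range=True keeps the 0-255 scale so we can cast back to uint8.
resized_u8 = skimage.transform.resize(
    img, (32, 32), anti_aliasing=False, preserve_range=True
).astype(np.uint8)
print(resized_u8.dtype, resized_u8.shape)
```

If the network normalises inputs by dividing by 255, feeding it [0, 1] floats would make all pixels nearly zero.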
I tried changing the image size from 7 to 32 px, but the results always look similar to this: [training curve image]
When the image size is 7 px, shouldn't it obtain similar results to not using my wrapper at all? However, my results are much worse. Has anyone tried training from an RGB image?
I am sorry but I don't understand what you are doing:
- which environment do you use?
- what is your image?
- which reward after how many updates do you get when you train with 7px?
Even without understanding exactly what you do, one possible reason is that your network has more weights and hence takes a bit more time to learn.
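To make that concrete with a back-of-the-envelope count (a hypothetical single fully-connected layer, not the actual architecture of this repo): going from a 7x7x3 input to 32x32x3 multiplies the flattened input, and hence the first layer's weight count, by roughly 20:

```python
def fc_params(h, w, c, hidden=64):
    """Parameters of one dense layer fed the flattened h x w x c image."""
    flat = h * w * c
    return flat * hidden + hidden  # weights + biases

small = fc_params(7, 7, 3)    # 7x7x3 input  -> 9472 parameters
large = fc_params(32, 32, 3)  # 32x32x3 input -> 196672 parameters
print(small, large, round(large / small, 1))  # ratio ~ 20.8
```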
Thank you for your help.
I use the MiniGrid environment MiniGrid-DoorKey-8x8-v0.
My image is the full image of the environment (obtained with the wrapper I wrote above).
As for the reward, the agent learns nothing after 1777664 frames (the reward sometimes gets higher than 0, but it never converges):
U 868 | F 1777664 | FPS 0203 | D 7898 | rR:μσmM 0.00 0.00 0.00 0.00 | F:μσmM 640.0 0.0 640.0 640.0 | H 1.946 | V 0.000 | pL 0.000 | vL 0.000 | ∇ 0.000
For this experiment, I use a 7×7×3 image. Since the image is very simple, shouldn't the results be similar to training on the original observations? (Their shape is identical to the images I use.)
I see. I never tried to do that, but indeed, I would expect it to learn easily.
Did you try to:
- print the images the agent sees, to be sure they are correct,
- increase or decrease the learning rate,
- try intermediate image sizes to see at which size it stops learning?
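For the first check, something along these lines could help (a sketch; check_obs and the stand-in observation are hypothetical, replace them with what your wrapper actually returns):

```python
import numpy as np

def check_obs(obs):
    """Print sanity statistics for an image observation."""
    img = np.asarray(obs["image"])
    print("shape:", img.shape, "dtype:", img.dtype,
          "min:", img.min(), "max:", img.max())
    # An all-zero image, or floats stuck in [0, 1] when the network
    # expects 0-255, usually points to a preprocessing bug.
    return img.shape, str(img.dtype), int(img.min()), int(img.max())

# Stand-in for obs = env.reset() with the wrapper applied:
obs = {"image": np.random.randint(0, 256, size=(32, 32, 3), dtype=np.uint8)}
shape, dtype, lo, hi = check_obs(obs)
```

Dumping a few of these images to disk with matplotlib or PIL and looking at them by eye is also worth the five minutes.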
I don't think the issue you have is related to the code in this repository.
I close this issue because I don't have enough information to work on it. Feel free to reopen it if you have more to add (see my previous message).