Denys88/rl_games

The input shape of MLP for discrete observation space

Closed this issue · 4 comments

I am trying to use the BlackjackEnv (https://github.com/openai/gym/blob/master/gym/envs/toy_text/blackjack.py) in rl_games. It seems rl_games doesn't support the discrete observation space like:
spaces.Tuple((spaces.Discrete(32), spaces.Discrete(11), spaces.Discrete(2)))
Any plan to support this feature?

Hi @yuemingl, it should work with
gym.spaces.Tuple([gym.spaces.Discrete(2), gym.spaces.Discrete(3)]).
Could you check your yaml config?
I have a separate model for it:
model:
  name: multi_discrete_a2c
I should merge it with discrete_a2c, but I haven't done that yet.

My problem is that I get an error from the function _calc_input_size() in network_builder.py. The cause of the error is that the shape of the discrete observation space is '()'. I found an issue, openai/gym#791, that discusses discrete observation/action spaces; according to it, returning '()' for a Discrete space is the expected behavior. Do you use one-hot encoding for discrete observation/action spaces? For example, do you expect a shape like '(5,)' instead of '()' for spaces.Discrete(5)?
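To make the shape mismatch concrete, here is a minimal numpy-only sketch (the helper name `one_hot` is illustrative, not part of rl_games): gym's Discrete(n) reports shape (), so a size calculation over the shape tuple gets nothing to multiply, whereas a one-hot encoding yields a usable (n,)-shaped vector.

```python
import numpy as np

def one_hot(value: int, n: int) -> np.ndarray:
    """Encode a discrete value in [0, n) as a one-hot float vector of length n."""
    vec = np.zeros(n, dtype=np.float32)
    vec[value] = 1.0
    return vec

# Example: an observation from spaces.Discrete(5) with value 3
encoded = one_hot(3, 5)
print(encoded.shape)  # (5,) -- unlike Discrete(5).shape, which is ()
```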

Ah, I missed that. Yes, it might not be supported.
There are two things you can do right now.
First, I support multiple observations via a Dict space:
spaces = {
    'pos': gym.spaces.Box(low=0, high=1, shape=(2,), dtype=np.float32),
    'info': gym.spaces.Box(low=0, high=1, shape=(4,), dtype=np.float32),
}
self.observation_space = gym.spaces.Dict(spaces)
But this requires you to create a custom neural network.
Here is a simple test network example: https://github.com/Denys88/rl_games/blob/master/rl_games/envs/test_network.py

Second, and probably your best solution: create a wrapper that merges all observations into one, if possible, and try that with the default setup first.
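A minimal numpy-only sketch of what such a wrapper would compute, assuming the env exposes a Tuple of Discrete spaces like BlackjackEnv's Tuple((Discrete(32), Discrete(11), Discrete(2))); the function name and structure are illustrative, not rl_games API. In practice this logic would live in a gym.ObservationWrapper whose observation_space is a Box of shape (sum(sizes),).

```python
import numpy as np

def flatten_discrete_tuple(obs, sizes):
    """Concatenate one-hot encodings of each discrete component.

    obs   -- tuple of ints, one per Discrete sub-space
    sizes -- tuple of the corresponding Discrete cardinalities
    """
    parts = []
    for value, n in zip(obs, sizes):
        one_hot = np.zeros(n, dtype=np.float32)
        one_hot[value] = 1.0
        parts.append(one_hot)
    return np.concatenate(parts)

# BlackjackEnv-style observation: (player_sum, dealer_card, usable_ace)
flat = flatten_discrete_tuple((20, 9, 1), (32, 11, 2))
print(flat.shape)  # (45,) -- a single flat vector the default MLP can consume
```

With the observation flattened to one Box, the default discrete_a2c model and MLP builder should work without a custom network.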

@yuemingl I hope you made it work.
closing it for now :)