openai/gym

Value Error "Can not squeeze" for CarRacing-v0 environment

Closed this issue · 23 comments

I have tried running PPO, DDPG, and VPG for CarRacing-v0 and continuously receive the same ValueError:

ValueError: Can not squeeze dim[1], expected a dimension of 1, got 96 for 'v/Squeeze' (op: 'Squeeze') with input shapes: [?,96,96,1].

Additionally, the error below also appears earlier in the output as an InvalidArgumentError:

tensorflow.python.framework.errors_impl.InvalidArgumentError: Can not squeeze dim[1], expected a dimension of 1, got 96 for 'v/Squeeze' (op: 'Squeeze') with input shapes: [?,96,96,1].

Below are the command-line invocation I ran and the VPG script I ran. Both returned the same result.

Experiment 3
Changes = vpg

python -m spinup.run vpg --exp_name Car3 --env CarRacing-v0 --epochs 50 --data_dir /Users/brianhaney/Desktop/Gym/CarRacing --dt

#results = returned non-zero exit status 1

from spinup import vpg
import tensorflow as tf
import gym

# Environment constructor passed to the algorithm
env_fn = lambda : gym.make('CarRacing-v0')

# Actor-critic network configuration
ac_kwargs = dict(hidden_sizes=[64,64], activation=tf.nn.relu)

# Logging configuration
logger_kwargs = dict(output_dir='path/to/output_dir', exp_name='experiment_name')

vpg(env_fn=env_fn, ac_kwargs=ac_kwargs, steps_per_epoch=5000, epochs=250,
    logger_kwargs=logger_kwargs)

I am running TensorFlow 1.10.1 on my machine. I'm not sure what the error means, other than that some input dimensions for TensorFlow were off. The closest issue I have found online was on Stack Overflow.

The accepted answer to that Stack Overflow question explains that vector labels must provide a single index for the true class of each row of logits, i.e., they must contain only class indices. But I am not sure that answer is relevant in this case.

Going forward, I have three hypotheses about what is going wrong:

  1. There is an issue with TensorFlow on my machine. However, I'm not sure what that could be.
  2. There is an issue with the way the environment is being loaded on my machine. I have successfully run other Box2D environments, LunarLander and LunarLanderContinuous, on my machine. But this error relates to an input shape, which makes me think it could be an environment problem.
  3. For some reason the code I am writing is buggy. I don't think this is the case because the code is pretty straightforward, I have successfully executed all three algorithms in other environments, and it follows the same format as the documentation. But it is always a possibility.

I would really appreciate any help, ideas, or suggestions in resolving this issue. Thanks!

Have you solved this problem??? I have the same one now :(

@FlavioLorenzi I have not solved the problem yet. I have tried running the code on a new machine, but received the same error. I still need to figure out what the error message:

"Can not squeeze dim[1], expected a dimension of 1, got 96 for 'v/Squeeze' (op: 'Squeeze') with input shapes: [?,96,96,1]."

means. My plan is to review and re-read the TensorFlow documentation to try to develop a solution. I am happy to help in any way I can and will be sure to post a solution once I develop one!

I don't think this is about TensorFlow. I think the problem is a mismatch between the observation size the RL algorithms expect and the CarRacing observation size (a shape with 3 elements: 96, 96, 3); the RL algorithms expect an observation space with a one-dimensional shape.
Working with the SAC algorithm I received another similar error: could not broadcast input array from shape (96,96,3) into shape (3).
I'm going crazy...

It sounds like you are ahead of me in trying to resolve the issue. I tried it with SAC, VPG, DDPG, and PPO and got errors for all of them. What do you mean by "a mismatch between the observation size the RL algorithms expect and the CarRacing observation size (96, 96, 3)"?

If you look at the CarRacing code, the observation space is a Box of shape (STATE_H, STATE_W, 3), with STATE_H = STATE_W = 96 set in the init.
So the observation shape has three elements, which I think describe the image resolution (number of pixels), while the RL algorithms "expect a dimension of 1".

SAC, for example, receives this observation as input in its observation space, but its code is written for environments (like LunarLander) with a one-dimensional observation vector, so those are compatible.
So the problem is only with CarRacing: how do we convert this observation? And how do we "reshape" an array that describes pixels? 😫
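For anyone who wants to see the mismatch directly, a minimal check (assuming a Gym version where both environments are registered):

import gym

# LunarLanderContinuous observations are flat vectors; CarRacing observations are RGB images.
print(gym.make('LunarLanderContinuous-v2').observation_space.shape)  # (8,)
print(gym.make('CarRacing-v0').observation_space.shape)              # (96, 96, 3)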

I hope I was clear; it's difficult to explain all of this... Tell me if you don't understand.
I only started working on this yesterday, so I hope to find a solution soon. If you have any ideas, let me know!

That makes sense. Thanks for the description. I will let you know if I find anything further!

@FlavioLorenzi this thread may be of help. However, I still cannot figure out how to run the modified SAC algorithm in the CarRacing-v0 environment. Let me know what you think.

Yes, it seems that Spinning Up only works with flat observations and not with image-like ones (pixels, as in CarRacing). So we must create a new function that can take this type of image as input.
The problem is that I have no idea how.

@FlavioLorenzi This is a cool problem. I am going to work on it today and see what I can come up with.

I think @FlavioLorenzi is correct here. While Spinning Up does support some of the Box2D environments, CarRacing-v0 uses pixel observations, which it looks like are not supported. This discussion should be moved to @FlavioLorenzi's issue on Spinning Up: openai/spinningup#120

@FlavioLorenzi @christopherhesse The statement, "While Spinning Up does support some of the Box2D environments, CarRacing-v0 uses pixel observations, which 'it looks like' are not supported," is illogical. The description is laced with passive voice, revealing a passive mind and attitude toward the problem, which I took considerable time and effort to define. Solving the problem of running a deep reinforcement learning algorithm for autonomous vehicle control in a 2-D environment does not require Spinning Up or CarRacing-v0, so it is irrelevant whether pixel observations are supported. Indeed, a convolutional neural network, by definition, observes pixels to make predictions about the values of current and future state-action pairs. Flavio's issue is sufficiently different from this question. This question has not been answered and should be reopened.

Did you find any solution for this yet???

No. Nothing.

Me neither. I moved to another environment...

Hi all, author of Spinning Up here. Chiming in to confirm @christopherhesse: the default actor-critic code for SAC (and other algorithms in Spinning Up) creates MLP networks, which expect vector inputs and will produce errors for image-shaped inputs (as you have found). If you want to use the Spinning Up algorithm implementations with CarRacing-v0, you will have to write custom actor-critic functions that are compatible with image-shaped inputs.
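For illustration only, a minimal sketch of the kind of image-compatible feature extractor such a custom actor-critic could build on, assuming TF 1.x and the tf.layers API (the function name cnn_features and the layer sizes here are hypothetical, not part of Spinning Up):

import tensorflow as tf

def cnn_features(x, activation=tf.nn.relu):
    # Map a batch of image observations (N, 96, 96, 3) to a flat feature vector,
    # which the existing MLP policy and value heads could then consume.
    x = tf.layers.conv2d(x, filters=16, kernel_size=8, strides=4, activation=activation)
    x = tf.layers.conv2d(x, filters=32, kernel_size=4, strides=2, activation=activation)
    return tf.layers.flatten(x)

A custom actor-critic function would apply something like this to the observation placeholder before the policy and value MLPs, and be passed to the algorithm through its actor_critic argument.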

@jachiam First of all, your work on SpinningUp is incredible. It is a phenomenal platform and I sincerely appreciate your hard work. Second, thanks for the response - that clears things up.

Glad to help! I also realized shortly after I made that post that there is a second issue for DDPG/TD3/SAC (which I believe is not present in VPG/TRPO/PPO). My DDPG/TD3/SAC code builds replay buffers and placeholders for observations which also assume vector inputs, and these would need to be amended for the code to support image observations. The VPG/TRPO/PPO codes have examples of how to implement those (since their replay buffers and placeholders use space.shape as the basis for their sizes instead of space.shape[0]).
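To make the space.shape versus space.shape[0] distinction concrete, a small NumPy sketch (the buffer size of 1000 is arbitrary):

import numpy as np

obs_shape = (96, 96, 3)   # CarRacing observation shape
size = 1000

# Vector-style allocation, using only shape[0] as described above:
vec_buf = np.zeros([size, obs_shape[0]], dtype=np.float32)   # shape (1000, 96)

# Shape-aware allocation, using the full observation shape:
img_buf = np.zeros((size, *obs_shape), dtype=np.float32)     # shape (1000, 96, 96, 3)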

@jachiam @FlavioLorenzi @christopherhesse Does anyone object to reopening the issue, since the problem hasn't been solved? A solution would require a modified SAC running in the CarRacing environment. I am going to work on developing one. Once I post it, or someone beats me to it, you can mark the issue closed. But my issue still isn't fixed, so this problem is not resolved.

This is the ppo I am running from the command line:

python -m spinup.run ppo --exp_name carr --env CarRacing-v0 --epochs 50 --data_dir /Users/brianhaney/Desktop --dt

This is the error I am getting:

Traceback (most recent call last):
  File "/Users/brianhaney/spinningup/spinup/utils/run_entrypoint.py", line 11, in <module>
    thunk()
  File "/Users/brianhaney/spinningup/spinup/utils/run_utils.py", line 162, in thunk_plus
    thunk(**kwargs)
  File "/Users/brianhaney/spinningup/spinup/algos/ppo/ppo.py", line 187, in ppo
    pi, logp, logp_pi, v = actor_critic(x_ph, a_ph, **ac_kwargs)
  File "/Users/brianhaney/spinningup/spinup/algos/ppo/core.py", line 103, in mlp_actor_critic
    v = tf.squeeze(mlp(x, list(hidden_sizes)+[1], activation, None), axis=1)
  File "/anaconda3/envs/spinningup/lib/python3.7/site-packages/tensorflow/python/util/deprecation.py", line 507, in new_func
    return func(*args, **kwargs)
  File "/anaconda3/envs/spinningup/lib/python3.7/site-packages/tensorflow/python/ops/array_ops.py", line 3146, in squeeze
    return gen_array_ops.squeeze(input, axis, name)
  File "/anaconda3/envs/spinningup/lib/python3.7/site-packages/tensorflow/python/ops/gen_array_ops.py", line 9043, in squeeze
    "Squeeze", input=input, squeeze_dims=axis, name=name)
  File "/anaconda3/envs/spinningup/lib/python3.7/site-packages/tensorflow/python/framework/op_def_library.py", line 788, in _apply_op_helper
    op_def=op_def)
  File "/anaconda3/envs/spinningup/lib/python3.7/site-packages/tensorflow/python/util/deprecation.py", line 507, in new_func
    return func(*args, **kwargs)
  File "/anaconda3/envs/spinningup/lib/python3.7/site-packages/tensorflow/python/framework/ops.py", line 3300, in create_op
    op_def=op_def)
  File "/anaconda3/envs/spinningup/lib/python3.7/site-packages/tensorflow/python/framework/ops.py", line 1823, in __init__
    control_input_ops)
  File "/anaconda3/envs/spinningup/lib/python3.7/site-packages/tensorflow/python/framework/ops.py", line 1662, in _create_c_op
    raise ValueError(str(e))
ValueError: Can not squeeze dim[1], expected a dimension of 1, got 96 for 'v/Squeeze' (op: 'Squeeze') with input shapes: [?,96,96,1].

PPO Analysis

I am looking at the array_ops.py file and the ops.py file, but I don't know how these files relate to the gym environment or to ppo.py. One of the error lines says:

packages/tensorflow/python/ops/gen_array_ops.py", line 9043, in squeeze "Squeeze", input=input, squeeze_dims=axis, name=name)

But the file isn't 9043 lines long, so I don't know where to keep looking for the bug.
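One note that may help here: the gen_*_ops.py files are wrappers generated when TensorFlow is built, so the copy being inspected may not correspond to the traceback line numbers; the call that actually matters is the tf.squeeze(..., axis=1) in core.py shown in the traceback. The failure can be reproduced in isolation with TF 1.x (a minimal sketch):

import tensorflow as tf

# For image inputs, the value network's output has shape (batch, 96, 96, 1);
# squeezing axis 1 fails at graph construction because that axis is 96, not 1.
v = tf.zeros([1, 96, 96, 1])
tf.squeeze(v, axis=1)   # ValueError: Can not squeeze dim[1] ... got 96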

This is the SAC I am running:

spinningup brianhaney$ python -m spinup.run sac --exp_name carr0 --env CarRacing-v0 --epochs 50 --data_dir /Users/brianhaney/Desktop

This is the error I am getting:

2019-03-17 20:14:43.643281: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
Track generation: 923..1158 -> 235-tiles track
Traceback (most recent call last):
  File "/Users/brianhaney/spinningup/spinup/utils/run_entrypoint.py", line 11, in <module>
    thunk()
  File "/Users/brianhaney/spinningup/spinup/utils/run_utils.py", line 162, in thunk_plus
    thunk(**kwargs)
  File "/Users/brianhaney/spinningup/spinup/algos/sac/sac.py", line 259, in sac
    replay_buffer.store(o, a, r, o2, d)
  File "/Users/brianhaney/spinningup/spinup/algos/sac/sac.py", line 24, in store
    self.obs1_buf[self.ptr] = obs
ValueError: could not broadcast input array from shape (96,96,3) into shape (96)

SAC Analysis
In both instances there is an issue with an input array or shape. In SAC's case, the error comes from line 24:

import numpy as np

class ReplayBuffer:
    def __init__(self, obs_dim, act_dim, size):
        self.obs1_buf = np.zeros([size, obs_dim], dtype=np.float32)
        self.obs2_buf = np.zeros([size, obs_dim], dtype=np.float32)
        self.acts_buf = np.zeros([size, act_dim], dtype=np.float32)
        self.rews_buf = np.zeros(size, dtype=np.float32)
        self.done_buf = np.zeros(size, dtype=np.float32)
        self.ptr, self.size, self.max_size = 0, 0, size

    # In sac.py, `def store` begins on line 23; its first statement (line 24, below) is where the ValueError occurs
    def store(self, obs, act, rew, next_obs, done):
        self.obs1_buf[self.ptr] = obs
        self.obs2_buf[self.ptr] = next_obs
        self.acts_buf[self.ptr] = act
        self.rews_buf[self.ptr] = rew
        self.done_buf[self.ptr] = done
        self.ptr = (self.ptr+1) % self.max_size
        self.size = min(self.size+1, self.max_size)

    def sample_batch(self, batch_size=32):
        idxs = np.random.randint(0, self.size, size=batch_size)
        return dict(obs1=self.obs1_buf[idxs],
                    obs2=self.obs2_buf[idxs],
                    acts=self.acts_buf[idxs],
                    rews=self.rews_buf[idxs],
                    done=self.done_buf[idxs])

I am not sure what the purpose of this line is or how it relates to the ValueError and the input arrays. Any advice would be appreciated.

Thanks
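As an aside on why store fails here: obs1_buf is allocated in __init__ with obs_dim taken from the environment, and for CarRacing that ends up being 96 (the first element of the observation shape) rather than the full (96, 96, 3), so each buffer row can hold only 96 floats. A minimal reproduction (the buffer size of 10 is arbitrary):

import numpy as np

obs = np.zeros((96, 96, 3))      # one CarRacing observation
obs1_buf = np.zeros([10, 96])    # row length 96, as when obs_dim == observation_space.shape[0]
obs1_buf[0] = obs                # ValueError: could not broadcast input array from shape (96,96,3) into shape (96)

Allocating the buffer with the full observation shape, e.g. np.zeros((size, 96, 96, 3)), avoids this particular error, though the networks would still need to handle image inputs, as discussed above.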

Guys, can you try the following?

import numpy as np

# assumes env = gym.make('CarRacing-v0') has already been created
observation = env.reset()
image = observation

# Crop the image to 84x84
image = image[0:84, :]
image = image[:, 96-84:]

# Convert the image to grayscale
image = np.dot(image[..., :3], [0.299, 0.587, 0.114])

# Normalize the image to [-1, 1]
min_image = np.min(image)
max_image = np.max(image)
if min_image != max_image:
    image = 2 * (image - min_image) / (max_image - min_image) - 1.0
else:
    image[:, :] = 0
image = image[:, :, np.newaxis]

# Image shape will be (84, 84, 1)
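A hedged follow-up to this suggestion: even after preprocessing to (84, 84, 1), Spinning Up's default MLP networks still expect a flat vector, so the frame would either need to be flattened or fed to a custom CNN actor-critic as discussed earlier. For example:

# Flatten the preprocessed frame into a 1-D vector an MLP can accept.
flat_obs = image.reshape(-1)   # shape (7056,), i.e. 84*84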

@Bhaney44 @FlavioLorenzi @christopherhesse I generally hate pulling the rank in this fashion, but, as a DRI for the gym, I declare this issue closed and object to reopening it here, because it has nothing to do with the gym. As suggested by @christopherhesse and supported by the author of Spinning Up, @jachiam, the Spinning Up code assumes vector observations, whereas the CarRacing environment produces image observations. Ergo, in order to make the Spinning Up code work with CarRacing, one needs to write custom actor-critic functions compatible with Spinning Up on one end and with image observations on the other end (and modify the replay buffers per @jachiam's suggestion).

At the risk of repeating myself: the issue is not related to OpenAI Gym, or to TensorFlow for that matter, only to the way Spinning Up is currently structured. If need be, this discussion can be moved into either openai/spinningup#120 or a new issue on Spinning Up.

@Bhaney44 If you are using PPO and don't want to figure out how to make Spinning Up work with image observations, I would recommend using an implementation of PPO that is compatible with images out of the box, for instance OpenAI Baselines (https://github.com/openai/baselines) or Stable Baselines (https://github.com/hill-a/stable-baselines).
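For reference, a sketch of the kind of Baselines command this recommendation points to (the flags follow the openai/baselines README; verify them against the installed version):

python -m baselines.run --alg=ppo2 --env=CarRacing-v0 --network=cnn --num_timesteps=1e6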

@pzhokhov You "generally hate pulling the rank in this fashion" because you are trying to abuse whatever pseudo-authority you believe you have. In addition to your atrocious use of phrases like "in order to", your response is wrong. As I clearly stated, "the problem hasn't been solved." I have worked hard at solving this problem, I am not finished, and no one else has proposed a solution. With that said, I have no interest in continuing to interact with you or anyone else at your "non-profit." I will solve future issues on my own.

Brian S. Haney