
Can the `Data cardinality is ambiguous` error in TensorFlow 2.4 or 2.5 be solved as follows?

Hi, thanks very much for your work. I use Docker to build an environment for studying it. When I create a container with FROM tensorflow/tensorflow:2.3.3-gpu-jupyter and run the examples

python InvertedPendulumBulletEnv-v0
python InvertedDoublePendulumBulletEnv-v0 -n 5000
python HalfCheetahBulletEnv-v0 -n 5000 -b 5

all the tests pass. But when I use a newer image, for instance FROM tensorflow/tensorflow:2.4.2-gpu-jupyter, I get the ValueError: Data cardinality is ambiguous error shown below.

$ python InvertedPendulumBulletEnv-v0
['/home/wezardlza/workspace/trpo', '/usr/lib/', '/usr/lib/python3.6', '/usr/lib/python3.6/lib-dynload', '/home/wezardlza/.local/lib/python3.6/site-packages', '/usr/local/lib/python3.6/dist-packages', '/usr/lib/python3/dist-packages', '/home/wezardlza/workspace']
pybullet build time: Jun 22 2021 23:31:53
Traceback (most recent call last):
  File "", line 351, in <module>
  File "", line 317, in main
    policy.update(observes, actions, advantages, logger)  # update policy
  File "/home/wezardlza/workspace/trpo/", line 61, in update
    old_means, old_logvars, old_logp])
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/", line 1725, in train_on_batch
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/", line 1513, in single_batch_iterator
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/", line 1529, in _check_data_cardinality
    raise ValueError(msg)
ValueError: Data cardinality is ambiguous:
  x sizes: 369, 369, 369, 369, 1, 369
Make sure all arrays contain the same number of samples.
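
To make the message easier to read, here is a minimal sketch of the kind of check that produces it. It is a simplified imitation written for illustration, not Keras's actual code; the helper name `check_same_num_samples` and the array shapes are assumptions. In the traceback, the input list ends with old_means, old_logvars, old_logp, so the lone 1 in the fifth position corresponds to old_logvars.

```python
import numpy as np


def check_same_num_samples(arrays):
    """Return the shared first-dimension size, or raise if the sizes differ.

    Hypothetical helper that mimics the spirit of the Keras check.
    """
    sizes = [int(a.shape[0]) for a in arrays]
    if len(set(sizes)) > 1:
        raise ValueError(
            "Data cardinality is ambiguous:\n"
            "  x sizes: " + ", ".join(str(s) for s in sizes)
        )
    return sizes[0]


n = 369  # number of samples reported in the traceback
# Six arrays; the fifth has a single row, like old_logvars in the error.
inputs = [np.zeros((n, 1)) for _ in range(4)] + [np.zeros((1, 1)), np.zeros((n, 1))]

try:
    check_same_num_samples(inputs)
except ValueError as err:
    print(err)
```

Once every array has the same number of rows, the check passes and returns that number.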

After some checks, I found that the code below in file ./trpo/ causes the mismatched batch size:

class PolicyNN(Layer):
    """ Neural net for policy approximation function.

    Policy parameterized by Gaussian means and variances. NN outputs mean
     action based on observation. Trainable variables hold log-variances
     for each action dimension (i.e. variances not determined by NN).
    """

    def build(self, input_shape):
        # The batch size is captured once, when the layer is built.
        self.batch_sz = input_shape[0]

    def call(self, inputs, **kwargs):
        y = self.dense1(inputs)
        y = self.dense2(y)
        y = self.dense3(y)
        means = self.dense4(y)
        logvars = K.sum(self.logvars, axis=0, keepdims=True) + self.init_logvar
        # Tiles with the batch size stored in build(), not the size of `inputs`.
        logvars = K.tile(logvars, (self.batch_sz, 1))
        return [means, logvars]

This always sets the first dimension of logvars to 1 at runtime, while the first dimension of inputs appears to vary. So, with the code above, the first dimension of means differs from that of logvars, which causes the error at

  File "/home/wezardlza/workspace/trpo/", line 61, in update
    old_means, old_logvars, old_logp])

So I made the following changes. In file ./trpo/, add

from tensorflow import shape

and change logvars = K.tile(logvars, (self.batch_sz, 1)) to logvars = K.tile(logvars, (shape(inputs)[0], 1)). These changes let me pass the example

python InvertedPendulumBulletEnv-v0

but it seems self.batch_sz is then no longer used. Perhaps we can just replace self.batch_sz with shape(inputs)[0] in the K.tile call and remove the build() method above? I am new to TensorFlow and would like to know whether my changes could cause problems or errors in the TRPO results. Thanks for your help!
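
For reference, the proposed change can be tried in isolation with a toy layer. This is a minimal sketch, not the repository's PolicyNN: the class name `TiledLogvars`, the single Dense layer, and the dimensions are assumptions made for illustration. It only shows that tiling with `tf.shape(inputs)[0]` gives logvars the same number of rows as means for any batch size.

```python
import tensorflow as tf
from tensorflow.keras.layers import Dense, Layer


class TiledLogvars(Layer):
    """Toy stand-in for PolicyNN (hypothetical, for illustration only)."""

    def __init__(self, act_dim, **kwargs):
        super().__init__(**kwargs)
        self.act_dim = act_dim
        self.dense = Dense(act_dim)

    def build(self, input_shape):
        # One trainable row of log-variances shared by every sample.
        self.logvars = self.add_weight(
            name="logvars", shape=(1, self.act_dim), initializer="zeros"
        )
        super().build(input_shape)

    def call(self, inputs, **kwargs):
        means = self.dense(inputs)
        # tf.shape(inputs)[0] is evaluated for the tensor actually passed in,
        # so the tiling follows the current batch size on every call.
        batch = tf.shape(inputs)[0]
        logvars = tf.tile(self.logvars, tf.stack([batch, 1]))
        return [means, logvars]


layer = TiledLogvars(act_dim=2)
for n in (1, 5, 369):
    means, logvars = layer(tf.zeros((n, 4)))
    print(n, means.shape, logvars.shape)
```

In this sketch the tiled values are the same row repeated, so only the row count depends on the batch. Whether the change affects the TRPO results in the actual repository is not something the sketch can confirm.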

@wezardlza I have the same issue.

You need to reshape old_logvars right after the line 'old_means, old_logvars = self.policy(observes)'.

You can do that by adding the line below.

old_logvars = K.tile(old_logvars, (observes.shape[0], 1))

With this change, I can see the mean reward increase to 1000. (:
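
One caveat with the tiling workaround: it assumes old_logvars has exactly one row. If it already had one row per sample, tiling again would multiply the row count. A guarded version might look like the sketch below, written with NumPy instead of K.tile so it runs on its own; the helper name `tile_to_batch` and the array shapes are assumptions for illustration.

```python
import numpy as np


def tile_to_batch(old_logvars, batch_size):
    """Repeat a single row of log-variances so it has `batch_size` rows.

    Hypothetical helper; leaves the array alone if it is already the right size.
    """
    old_logvars = np.asarray(old_logvars)
    rows = old_logvars.shape[0]
    if rows == batch_size:
        return old_logvars  # already one row per sample
    if rows != 1:
        raise ValueError(
            "expected 1 or %d rows, got %d" % (batch_size, rows)
        )
    return np.tile(old_logvars, (batch_size, 1))


observes = np.zeros((369, 5))         # made-up batch of observations
old_logvars = np.full((1, 1), -1.0)   # single row, as in the error message
tiled = tile_to_batch(old_logvars, observes.shape[0])
print(tiled.shape)
```

Calling the helper a second time on the result returns it unchanged, which makes the workaround safe on TensorFlow versions where the layer already returns one row per sample.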