keras-team/keras

Bug with BatchNormalization(axis=1): ValueError: Shape must be rank 1 but is rank 4 for 'batch_normalization_1/cond/FusedBatchNorm'

Closed this issue · 43 comments

Current master version of keras (commit b3cb261), TensorFlow 1.8.0

BatchNormalization(axis=1) for 'channels_first' seems to fail.

import os
os.environ['KERAS_BACKEND'] = 'tensorflow'
import keras.backend as K
from keras.layers import Activation, Conv2D, Input
from keras.layers.normalization import BatchNormalization

# declare network model with channels first: ERROR
K.set_image_data_format('channels_first')
input = Input(shape=(3, 1001, 1001), dtype='float32')
x = Conv2D(filters=64, kernel_size=(3, 3), strides=1, padding='same')(input)
x = BatchNormalization(axis=1)(x)
x = Activation('relu')(x)

gives the error

Traceback (most recent call last):
  File "/home/rcasero/.conda/envs/cytometer_tensorflow/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 1567, in _create_c_op
    c_op = c_api.TF_FinishOperation(op_desc)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Shape must be rank 1 but is rank 4 for 'batch_normalization_1/cond/FusedBatchNorm' (op: 'FusedBatchNorm') with input shapes: [?,64,1001,1001], [1,64,1,1], [1,64,1,1], [1,64,1,1], [1,64,1,1].
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "<input>", line 11, in <module>
  File "/home/rcasero/.conda/envs/cytometer_tensorflow/lib/python3.6/site-packages/keras/engine/base_layer.py", line 459, in __call__
    output = self.call(inputs, **kwargs)
  File "/home/rcasero/.conda/envs/cytometer_tensorflow/lib/python3.6/site-packages/keras/layers/normalization.py", line 204, in call
    training=training)
  File "/home/rcasero/.conda/envs/cytometer_tensorflow/lib/python3.6/site-packages/keras/backend/tensorflow_backend.py", line 3069, in in_train_phase
    x = switch(training, x, alt)
  File "/home/rcasero/.conda/envs/cytometer_tensorflow/lib/python3.6/site-packages/keras/backend/tensorflow_backend.py", line 3004, in switch
    else_expression_fn)
  File "/home/rcasero/.conda/envs/cytometer_tensorflow/lib/python3.6/site-packages/tensorflow/python/util/deprecation.py", line 432, in new_func
    return func(*args, **kwargs)
  File "/home/rcasero/.conda/envs/cytometer_tensorflow/lib/python3.6/site-packages/tensorflow/python/ops/control_flow_ops.py", line 2072, in cond
    orig_res_f, res_f = context_f.BuildCondBranch(false_fn)
  File "/home/rcasero/.conda/envs/cytometer_tensorflow/lib/python3.6/site-packages/tensorflow/python/ops/control_flow_ops.py", line 1913, in BuildCondBranch
    original_result = fn()
  File "/home/rcasero/.conda/envs/cytometer_tensorflow/lib/python3.6/site-packages/keras/layers/normalization.py", line 165, in normalize_inference
    epsilon=self.epsilon)
  File "/home/rcasero/.conda/envs/cytometer_tensorflow/lib/python3.6/site-packages/keras/backend/tensorflow_backend.py", line 1894, in batch_normalization
    is_training=False
  File "/home/rcasero/.conda/envs/cytometer_tensorflow/lib/python3.6/site-packages/tensorflow/python/ops/nn_impl.py", line 904, in fused_batch_norm
    name=name)
  File "/home/rcasero/.conda/envs/cytometer_tensorflow/lib/python3.6/site-packages/tensorflow/python/ops/gen_nn_ops.py", line 3429, in _fused_batch_norm
    is_training=is_training, name=name)
  File "/home/rcasero/.conda/envs/cytometer_tensorflow/lib/python3.6/site-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper
    op_def=op_def)
  File "/home/rcasero/.conda/envs/cytometer_tensorflow/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 3392, in create_op
    op_def=op_def)
  File "/home/rcasero/.conda/envs/cytometer_tensorflow/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 1734, in __init__
    control_input_ops)
  File "/home/rcasero/.conda/envs/cytometer_tensorflow/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 1570, in _create_c_op
    raise ValueError(str(e))
ValueError: Shape must be rank 1 but is rank 4 for 'batch_normalization_1/cond/FusedBatchNorm' (op: 'FusedBatchNorm') with input shapes: [?,64,1001,1001], [1,64,1,1], [1,64,1,1], [1,64,1,1], [1,64,1,1].

Meanwhile, BatchNormalization(axis=3) for 'channels_last' works.

import os
os.environ['KERAS_BACKEND'] = 'tensorflow'
import keras.backend as K
from keras.layers import Activation, Conv2D, Input
from keras.layers.normalization import BatchNormalization

# declare network model with channels last: NO ERROR
K.set_image_data_format('channels_last')
input = Input(shape=(1001, 1001, 3), dtype='float32')
x = Conv2D(filters=64, kernel_size=(3, 3), strides=1, padding='same')(input)
x = BatchNormalization(axis=3)(x)
x = Activation('relu')(x)

doesn't give any error.

@rcasero For the BatchNormalization layer, if you set axis=1, the internal implementation broadcasts the mean/var/beta/gamma tensors to 4 dimensions, but tf.nn.fused_batch_norm only accepts them as 1-dimensional tensors, which leads to the exception. I have sent #10684 to fix it; I hope it helps you as well.
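
To make the shape mismatch concrete, here is a minimal pure-Python sketch that needs no TensorFlow. The helper name `to_rank1` is hypothetical and is not part of Keras or TensorFlow; it only illustrates how a broadcast parameter shape such as [1, 64, 1, 1] from the error message relates to the rank-1 shape [64] that the fused op asks for.

```python
def to_rank1(broadcast_shape, axis):
    """Collapse a broadcast parameter shape to the rank-1 shape [channels].

    Hypothetical helper for illustration only. It assumes every dimension
    other than `axis` has size 1, as in the shapes shown in the traceback.
    """
    for i, dim in enumerate(broadcast_shape):
        if i != axis and dim != 1:
            raise ValueError("expected size 1 on axis %d, got %d" % (i, dim))
    return [broadcast_shape[axis]]


# Shape of mean/var/beta/gamma after broadcasting for axis=1 (from the error).
broadcast = [1, 64, 1, 1]
rank1 = to_rank1(broadcast, axis=1)

# Print the rank before and after, plus the collapsed shape.
print(len(broadcast), "->", len(rank1), rank1)
```

This is only a picture of the shapes involved, not the code from #10684.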

Hi, I'm having the same problem. Is there any workaround while still using TF-GPU as backend? Thanks

@rcasero how did you handle it?

@damnko , if you check the last part of my comment above, the bug doesn't happen for 'channels_last'.

Thanks for your reply @rcasero. I am currently using 'channels_last' with axis = 1, and I am not sure what I should do to bypass the problem. It works normally on my local PC running on CPU.
Do you think I should try to reshape the input and use 'channels_first'?

@damnko The problem for me only happens with 'channels_first'. Have you run my two small test scripts above? Do you get the problem with both?

The scripts above behave just as you say when running on GPU, but they both work on my laptop running on CPU.
The fact is my input shape is something like [?, 128, 6, 1], where 6 is the number of channels and 128 is the length of the data along the time axis. My input data are sensor measurements and I am using axis=1 for the normalization.

Do you have any further suggestions? Thanks again

Yes, the problem, as you say, happens only when running on GPU.

Sorry, @damnko , I haven't worked with temporal data. Why do you have the final 1 in [?, 128, 6, 1]? Can't your data be [?, 128, 6]?

That's because Conv2D expects a 4D tensor with shape (batch, rows, cols, channels). And I can't use Conv1D since it will convolve only along the temporal direction.

I will do some other tests and update the thread eventually. Thanks again for your support

Not sure what you are trying to do, but in any case, I think your data should have shape [?, 128, 1, 6], i.e. the channels should be last.

I'm still in the process of learning so I might be wrong, but if I reshape it as per your suggestion I will be able to convolve only on [128, 1] which is [rows, cols] which is not what I want. I want to be able to convolve along the temporal direction (for example with a [5,1] kernel) and do cross correlation among different sensors and time (for example with a [5,2] kernel).
As far as I understood, if I have an input shape like [?, 128, 1, 6] this last operation won't be possible since I won't be able to convolve along the channels. Hopefully this makes sense.

Aside from that, the workaround that seems to work for me at the moment is to use an input with shape [?, 1, 128, 6] and then

K.set_image_data_format('channels_first')
...
X = BatchNormalization(axis = 2, name='bn0')(X)

which hopefully is the same as having an input with shape [?, 128, 6, 1] and using 'channels_last' with normalization on axis = 1
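
One way to sanity-check that hope without Keras is to compare which elements each layout averages together. The sketch below is pure Python under stated assumptions: toy sizes stand in for the real ones, the helper `per_slice_means` is hypothetical, and it only models "mean over every axis except the chosen one", which is how the normalization statistics are grouped. It does not run BatchNormalization itself.

```python
from itertools import product


def per_slice_means(values, shape, axis):
    """Mean of a flat, row-major tensor over every axis except `axis`.

    Hypothetical helper for illustration; returns one mean per index
    along `axis`.
    """
    sums = [0.0] * shape[axis]
    counts = [0] * shape[axis]
    # product() enumerates indices in row-major order, matching `values`.
    for flat_pos, index in enumerate(product(*(range(n) for n in shape))):
        sums[index[axis]] += values[flat_pos]
        counts[index[axis]] += 1
    return [s / c for s, c in zip(sums, counts)]


# Toy sizes standing in for batch=?, time=128, sensors=6.
batch, time_steps, sensors = 2, 4, 3
values = [float(i) for i in range(batch * time_steps * sensors)]

# Same flat data in two layouts that differ only in where the size-1 axis sits.
stats_last = per_slice_means(values, (batch, time_steps, sensors, 1), axis=1)
stats_first = per_slice_means(values, (batch, 1, time_steps, sensors), axis=2)

print(stats_last == stats_first)
```

Because the extra axis has size 1, moving it does not change the row-major order of the data, so both layouts group the same elements per time step.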

Is this still being investigated?
Do I really need to go back to keras 2.1.6 to solve this?

Closing this issue since the updated versions of TensorFlow and Keras fix it. Feel free to reopen if you are still running into problems. Thanks!

see-- commented

@ymodak Could you please say which version is fixed? I have the same problem with Keras 2.2.4 and tf 1.9.0-rc0. Thanks!

@see-- Try upgrading your TensorFlow version to 1.10 or 1.11

@ymodak still not working with tf 1.12 and keras 2.2.4

see-- commented

@ymodak Did not help. Could you show the commit?

import tensorflow as tf; print(tf.__version__)
# 1.11.0-rc1
import keras; print(keras.__version__)
# 2.2.4
from keras import backend as K
from keras.layers import Input, Conv2D, BatchNormalization
K.set_image_data_format('channels_first')
input = Input(shape=(3, 1001, 1001), dtype='float32')
x = Conv2D(filters=64, kernel_size=(3, 3), strides=1, padding='same')(input)
x = BatchNormalization(axis=1)(x)

ValueError: Shape must be rank 1 but is rank 0 for 'batch_normalization_1/cond/Reshape_4' (op: 'Reshape') with input shapes: [1,64,1,1], [].

see-- commented

Fixed in e3a2f7d. @yaarsh The fix is not part of 2.2.4 (from pip). You can just install the latest Keras master.

@see-- thx, fixed.

@rcasero @see-- @yaarsh @ymodak @davideboschetto @yanboliang @damnko Thank you for your comments. How can I install Keras master?

@kazemSafari just change your tensorflow_backend.py

just change your tensorflow_backend.py

users should do that too?

@iperov I did that, and it works. The link from @see-- shows how to change it.

@jacmule I can do that too. I am talking about users :D

How long do we have to wait for a new pip release?

Installing by master version worked for me, thanks !!

better to change network to channels_last

@jacmule How does one just change tensorflow_backend.py?

please fix that

@stiv-yakovenko why you need that?

https://github.com/kbardool/keras-frcnn fails with the latest Keras, so I have to roll it back.

This one works correctly:

Name: Keras
Version: 2.1.2
Summary: Deep Learning for Python
Home-page: https://github.com/fchollet/keras
Author: Francois Chollet
Author-email: francois.chollet@gmail.com
License: MIT
Location: c:\users\steve\miniconda3\lib\site-packages
Requires: six, numpy, pyyaml, scipy
Required-by:

@stiv-yakovenko You can't use the GPU?

The problem happens on both CPU and GPU.

@kazemSafari just change your tensorflow_backend.py

How can I change tensorflow_backend.py when using a Jupyter notebook (Anaconda)?

I am getting this error with Keras 2.2.4 and TensorFlow 1.13.1. Can anyone (@iperov @see--) please help?

Please change tensorflow_backend.py in Keras yourself, as suggested by @see--.
In Keras 2.2.4, in keras/backend/tensorflow_backend.py, change "()" to "[ ]" on lines 1908, 1910, 1914, and 1918.
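
A plausible reason this edit helps, assuming the affected lines pass `(-1)` as the target shape of a reshape call (the thread does not quote those lines): in Python, parentheses without a trailing comma only group an expression, so `(-1)` is the integer -1 rather than a one-element sequence. That would match the later error in this thread, where the second input to 'Reshape' has shape [] (rank 0). The `rank` helper below is hypothetical and exists only for this illustration.

```python
def rank(shape_arg):
    """Rank of a value used as a shape argument.

    Hypothetical helper: a bare int counts as rank 0 (a scalar),
    a list or tuple counts as rank 1.
    """
    return 1 if isinstance(shape_arg, (list, tuple)) else 0


paren_form = (-1)     # parentheses only group; this is the int -1
bracket_form = [-1]   # a one-element list
tuple_form = (-1,)    # the trailing comma makes a real one-element tuple

for name, value in [("(-1)", paren_form), ("[-1]", bracket_form), ("(-1,)", tuple_form)]:
    print(name, type(value).__name__, "rank", rank(value))
```

Under that assumption, replacing "()" with "[ ]" turns a scalar shape argument into a rank-1 one, which is what the error message asks for.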

@rcasero @see-- @yaarsh @ymodak @davideboschetto @yanboliang @damnko Thank you for your comments. How can I install Keras master?

pipenv install -e git+https://github.com/keras-team/keras.git@master#egg=keras

Please change tensorflow_backend.py in Keras yourself, as suggested by @see--.
In Keras 2.2.4, in keras/backend/tensorflow_backend.py, change "()" to "[ ]" on lines 1908, 1910, 1914, and 1918.

This worked for me, thank you.

Please change tensorflow_backend.py in Keras yourself, as suggested by @see--.
In Keras 2.2.4, in keras/backend/tensorflow_backend.py, change "()" to "[ ]" on lines 1908, 1910, 1914, and 1918.

This worked like magic LOOL hope this gets fixed

ntlex commented

Please change tensorflow_backend.py in Keras yourself, as suggested by @see--.
In Keras 2.2.4, in keras/backend/tensorflow_backend.py, change "()" to "[ ]" on lines 1908, 1910, 1914, and 1918.

Thank you!! This worked like a charm. Although it's just super sad that you have to manually modify keras' core files to fix this.

Workaround: downgrade Keras. Version 2.2.0 works on CPU, and version 2.1.6 works on GPU.

It works well with "cuda=9.0, tensorflow=1.11.0, keras=2.2.5"

Please change tensorflow_backend.py in Keras yourself, as suggested by @see--.
In Keras 2.2.4, in keras/backend/tensorflow_backend.py, change "()" to "[ ]" on lines 1908, 1910, 1914, and 1918.

Why does this work?

[ ]

Thank you very much! Excellent 👍