Bug with BatchNormalization(axis=1): ValueError: Shape must be rank 1 but is rank 4 for 'batch_normalization_1/cond/FusedBatchNorm'

Question

Closed this issue 6 years ago · 43 comments

Current master version of keras (commit b3cb261), TensorFlow 1.8.0

BatchNormalization(axis=1) for 'channels_first' seems to fail.

import os
os.environ['KERAS_BACKEND'] = 'tensorflow'
import keras.backend as K
from keras.layers import Activation, Conv2D, Input
from keras.layers.normalization import BatchNormalization

# declare network model with channels first: ERROR
K.set_image_data_format('channels_first')
input = Input(shape=(3, 1001, 1001), dtype='float32')
x = Conv2D(filters=64, kernel_size=(3, 3), strides=1, padding='same')(input)
x = BatchNormalization(axis=1)(x)
x = Activation('relu')(x)

gives the error

Traceback (most recent call last):
  File "/home/rcasero/.conda/envs/cytometer_tensorflow/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 1567, in _create_c_op
    c_op = c_api.TF_FinishOperation(op_desc)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Shape must be rank 1 but is rank 4 for 'batch_normalization_1/cond/FusedBatchNorm' (op: 'FusedBatchNorm') with input shapes: [?,64,1001,1001], [1,64,1,1], [1,64,1,1], [1,64,1,1], [1,64,1,1].
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "<input>", line 11, in <module>
  File "/home/rcasero/.conda/envs/cytometer_tensorflow/lib/python3.6/site-packages/keras/engine/base_layer.py", line 459, in __call__
    output = self.call(inputs, **kwargs)
  File "/home/rcasero/.conda/envs/cytometer_tensorflow/lib/python3.6/site-packages/keras/layers/normalization.py", line 204, in call
    training=training)
  File "/home/rcasero/.conda/envs/cytometer_tensorflow/lib/python3.6/site-packages/keras/backend/tensorflow_backend.py", line 3069, in in_train_phase
    x = switch(training, x, alt)
  File "/home/rcasero/.conda/envs/cytometer_tensorflow/lib/python3.6/site-packages/keras/backend/tensorflow_backend.py", line 3004, in switch
    else_expression_fn)
  File "/home/rcasero/.conda/envs/cytometer_tensorflow/lib/python3.6/site-packages/tensorflow/python/util/deprecation.py", line 432, in new_func
    return func(*args, **kwargs)
  File "/home/rcasero/.conda/envs/cytometer_tensorflow/lib/python3.6/site-packages/tensorflow/python/ops/control_flow_ops.py", line 2072, in cond
    orig_res_f, res_f = context_f.BuildCondBranch(false_fn)
  File "/home/rcasero/.conda/envs/cytometer_tensorflow/lib/python3.6/site-packages/tensorflow/python/ops/control_flow_ops.py", line 1913, in BuildCondBranch
    original_result = fn()
  File "/home/rcasero/.conda/envs/cytometer_tensorflow/lib/python3.6/site-packages/keras/layers/normalization.py", line 165, in normalize_inference
    epsilon=self.epsilon)
  File "/home/rcasero/.conda/envs/cytometer_tensorflow/lib/python3.6/site-packages/keras/backend/tensorflow_backend.py", line 1894, in batch_normalization
    is_training=False
  File "/home/rcasero/.conda/envs/cytometer_tensorflow/lib/python3.6/site-packages/tensorflow/python/ops/nn_impl.py", line 904, in fused_batch_norm
    name=name)
  File "/home/rcasero/.conda/envs/cytometer_tensorflow/lib/python3.6/site-packages/tensorflow/python/ops/gen_nn_ops.py", line 3429, in _fused_batch_norm
    is_training=is_training, name=name)
  File "/home/rcasero/.conda/envs/cytometer_tensorflow/lib/python3.6/site-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper
    op_def=op_def)
  File "/home/rcasero/.conda/envs/cytometer_tensorflow/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 3392, in create_op
    op_def=op_def)
  File "/home/rcasero/.conda/envs/cytometer_tensorflow/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 1734, in __init__
    control_input_ops)
  File "/home/rcasero/.conda/envs/cytometer_tensorflow/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 1570, in _create_c_op
    raise ValueError(str(e))
ValueError: Shape must be rank 1 but is rank 4 for 'batch_normalization_1/cond/FusedBatchNorm' (op: 'FusedBatchNorm') with input shapes: [?,64,1001,1001], [1,64,1,1], [1,64,1,1], [1,64,1,1], [1,64,1,1].

Meanwhile, BatchNormalization(axis=3) for 'channels_last' works.

import os
os.environ['KERAS_BACKEND'] = 'tensorflow'
import keras.backend as K
from keras.layers import Activation, Conv2D, Input
from keras.layers.normalization import BatchNormalization

# declare network model with channels last: NO ERROR
K.set_image_data_format('channels_last')
input = Input(shape=(1001, 1001, 3), dtype='float32')
x = Conv2D(filters=64, kernel_size=(3, 3), strides=1, padding='same')(input)
x = BatchNormalization(axis=3)(x)
x = Activation('relu')(x)

doesn't give any error.

Answer 1 · 2018-07-15T04:46:55.000Z

@rcasero For the BatchNormalization layer, if you set axis=1, the internal implementation broadcasts the mean/var/beta/gamma tensors to 4 dimensions, but tf.nn.fused_batch_norm only accepts them as 1-dimensional tensors, which leads to the exception. I have sent #10684 to fix it; I hope it helps you as well.
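
To make the shape mismatch concrete, here is a minimal NumPy sketch, not the Keras or TensorFlow code itself. It assumes a small hypothetical channels_first tensor and shows that per-channel statistics stored in the broadcast shape (1, C, 1, 1) carry the same information as rank-1 vectors of length C, which is the form the error message says the fused op expects. The tensor sizes, the `eps` value, and all variable names are illustrative assumptions.

```python
import numpy as np

# Hypothetical small tensor in channels_first (NCHW) layout:
# batch=2, channels=4, height=5, width=5.
rng = np.random.default_rng(0)
x = rng.normal(size=(2, 4, 5, 5)).astype(np.float32)

# Per-channel statistics kept in the broadcast shape (1, C, 1, 1),
# analogous to the [1,64,1,1] inputs listed in the traceback.
mean_4d = x.mean(axis=(0, 2, 3), keepdims=True)
var_4d = x.var(axis=(0, 2, 3), keepdims=True)

# Flatten them to rank-1 vectors of length C.
mean_1d = mean_4d.reshape([-1])
var_1d = var_4d.reshape([-1])

eps = 1e-3  # illustrative epsilon, not taken from the thread

# Normalization using the rank-4 (broadcast) parameters.
y_broadcast = (x - mean_4d) / np.sqrt(var_4d + eps)

# The same normalization using the rank-1 parameters,
# re-expanded along the channel axis.
y_flat = (x - mean_1d[None, :, None, None]) / np.sqrt(
    var_1d[None, :, None, None] + eps
)

print(mean_4d.shape, mean_1d.shape)
print(np.allclose(y_broadcast, y_flat))
```

Both forms give the same normalized output, so flattening the parameters before handing them to a rank-1-only kernel loses nothing.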

Answer 2 · 2018-08-12T06:44:58.000Z

Hi, I'm having the same problem. Is there any workaround while still using TF-GPU as backend? Thanks

@rcasero how did you handle it?

Answer 3 · 2018-08-12T10:52:14.000Z

@damnko , if you check the last part of my comment above, the bug doesn't happen for 'channels_last'.

Answer 4 · 2018-08-12T11:13:45.000Z

Thanks for your reply @rcasero. I am currently using 'channels_last' with axis=1. I am not sure what I should do to bypass the problem. It works normally on my local PC running on CPU.
Do you think I should try to reshape the input and use 'channels_first'?

Answer 5 · 2018-08-12T11:59:36.000Z

@damnko The problem for me only happens with 'channels_first'. Have you run my two small test scripts above? Do you get the problem with both?

Answer 6 · 2018-08-12T12:33:26.000Z

The scripts above behave just as you say when running on GPU, but they both work on my laptop running on CPU.
The fact is, my input shape is something like [?, 128, 6, 1], where 6 is the number of channels and 128 is the data along the time domain. My input data are sensor measurements and I am using axis=1 for the normalization.

Do you have any further suggestions? Thanks again

Answer 7 · 2018-08-12T13:17:17.000Z

Yes, the problem, as you say, happens only when running on GPU.

Sorry, @damnko , I haven't worked with temporal data. Why do you have the final 1 in [?, 128, 6, 1]? Can't your data be [?, 128, 6]?

Answer 8 · 2018-08-12T13:29:06.000Z

That's because Conv2D expects a 4D tensor with shape (batch, rows, cols, channels). And I can't use Conv1D, since it will convolve only along the temporal direction.

I will do some other tests and update the thread eventually. Thanks again for your support

Answer 9 · 2018-08-12T13:49:36.000Z

Not sure what you are trying to do, but in any case, I think your data should have shape [?, 128, 1, 6], i.e. the channels should be last.

Answer 10 · 2018-08-12T14:38:02.000Z

I'm still in the process of learning so I might be wrong, but if I reshape it as per your suggestion I will be able to convolve only on [128, 1] which is [rows, cols] which is not what I want. I want to be able to convolve along the temporal direction (for example with a [5,1] kernel) and do cross correlation among different sensors and time (for example with a [5,2] kernel).
As far as I understood, if I have an input shape like [?, 128, 1, 6] this last operation won't be possible since I won't be able to convolve along the channels. Hopefully this makes sense.

Aside from that, the workaround that seems to work for me at the moment is to use an input with shape [?, 1, 128, 6] and then

K.set_image_data_format('channels_first')
...
X = BatchNormalization(axis = 2, name='bn0')(X)

which hopefully is the same as having an input with shape [?, 128, 6, 1] and using 'channels_last' with normalization on axis=1
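
The equivalence hoped for above can be checked on the statistics alone. The following NumPy sketch assumes that BatchNormalization(axis=k) keeps one mean and variance per index of axis k and reduces over every other axis. It only compares those statistics for the two layouts and does not run Keras. The batch size of 8 and the variable names are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# channels_last layout from the thread: (batch, 128 time steps, 6 sensors, 1).
x_last = rng.normal(size=(8, 128, 6, 1))

# The same data rearranged to (batch, 1, 128, 6), the workaround layout.
x_first = np.transpose(x_last, (0, 3, 1, 2))

# axis=1 on the channels_last layout: keep axis 1, reduce over the rest.
mean_last = x_last.mean(axis=(0, 2, 3))
var_last = x_last.var(axis=(0, 2, 3))

# axis=2 on the workaround layout: keep axis 2, reduce over the rest.
mean_first = x_first.mean(axis=(0, 1, 3))
var_first = x_first.var(axis=(0, 1, 3))

print(mean_last.shape, mean_first.shape)
print(np.allclose(mean_last, mean_first), np.allclose(var_last, var_first))
```

Under that assumption both layouts normalize over the same groups of values (one group per time step), so the two setups should agree as long as the data is transposed consistently.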

Answer 11 · 2018-10-31T14:32:11.000Z

Is this still being investigated?
Do I really need to go back to Keras 2.1.6 to solve this?

Answer 12 · 2018-11-20T19:44:34.000Z

Closing this issue since the updated versions of TensorFlow and Keras fix it. Feel free to reopen if you are still running into problems. Thanks!

Answer 13 · 2018-11-21T09:35:01.000Z

@ymodak Could you please say which version is fixed? I have the same problem with Keras 2.2.4 and tf 1.9.0-rc0. Thanks!

Answer 14 · 2018-11-21T17:54:38.000Z

@see-- Try upgrading your TensorFlow version to 1.10 or 1.11

Answer 15 · 2018-11-22T12:30:50.000Z

@ymodak still not working with tf 1.12 and keras 2.2.4

Answer 16 · 2018-11-22T12:59:17.000Z

@ymodak Did not help. Could you show the commit?

import tensorflow as tf; print(tf.__version__)
# 1.11.0-rc1
import keras; print(keras.__version__)
# 2.2.4

from keras import backend as K
from keras.layers import Input, Conv2D, BatchNormalization
K.set_image_data_format('channels_first')
input = Input(shape=(3, 1001, 1001), dtype='float32')
x = Conv2D(filters=64, kernel_size=(3, 3), strides=1, padding='same')(input)
x = BatchNormalization(axis=1)(x)

ValueError: Shape must be rank 1 but is rank 0 for 'batch_normalization_1/cond/Reshape_4' (op: 'Reshape') with input shapes: [1,64,1,1], [].

Answer 17 · 2018-11-22T17:27:35.000Z

Fixed in e3a2f7d. @yaarsh The fix is not part of 2.2.4 (from pip). You can just install the latest Keras master.

Answer 18 · 2018-11-26T12:09:49.000Z

@see-- thx, fixed.

Answer 19 · 2018-12-08T00:59:53.000Z

@rcasero @see-- @yaarsh @ymodak @davideboschetto @yanboliang @damnko Thank you for your comments. How can I install Keras master?

Answer 20 · 2019-01-18T06:59:32.000Z

@kazemSafari just change your tensorflow_backend.py

Answer 21 · 2019-02-16T12:33:59.000Z

just change your tensorflow_backend.py

users should do that too?

Answer 22 · 2019-02-16T13:01:40.000Z

@iperov I did that, and it works. The link from @see-- shows how to change it.

Answer 23 · 2019-02-16T13:23:54.000Z

@jacmule I can do that too. I am talking about users :D

Answer 24 · 2019-03-08T12:35:38.000Z

How long do we have to wait for a new pip release?

Answer 25 · 2019-03-22T17:41:55.000Z

Installing the master version worked for me, thanks!

Answer 26 · 2019-03-22T17:44:36.000Z

It is better to change the network to channels_last.

Answer 27 · 2019-04-03T19:18:26.000Z

@jacmule How does one just change tensorflow_backend.py?

Answer 28 · 2019-05-14T11:48:34.000Z

please fix that

Answer 29 · 2019-05-14T12:10:38.000Z

@stiv-yakovenko Why do you need that?

Answer 30 · 2019-05-14T14:13:01.000Z

https://github.com/kbardool/keras-frcnn fails with the latest Keras, so I have to roll it back.

Answer 31 · 2019-05-14T19:20:29.000Z

This one works correctly:

Name: Keras
Version: 2.1.2
Summary: Deep Learning for Python
Home-page: https://github.com/fchollet/keras
Author: Francois Chollet
Author-email: francois.chollet@gmail.com
License: MIT
Location: c:\users\steve\miniconda3\lib\site-packages
Requires: six, numpy, pyyaml, scipy
Required-by:

Answer 32 · 2019-05-14T20:35:15.000Z

@stiv-yakovenko Can't you use the GPU?

Answer 33 · 2019-05-14T20:41:02.000Z

The problem happens on both CPU and GPU.

Answer 34 · 2019-06-11T15:04:59.000Z

@kazemSafari just change your tensorflow_backend.py

How can I change tensorflow_backend.py in a Jupyter notebook (Anaconda)?

I am getting this error with Keras 2.2.4 and TensorFlow 1.13.1. Can anyone (@iperov, @see--) please help?

Answer 35 · 2019-06-18T04:04:26.000Z

Please change tensorflow_backend.py in Keras yourself, as suggested by @see--.
In Keras 2.2.4, in backend/tensorflow_backend.py, change "()" to "[ ]" on lines 1908, 1910, 1914, and 1918.
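
A sketch of why swapping the brackets can matter, assuming the affected lines are reshape calls whose target shape is written as `(-1)`. That is an assumption consistent with the "rank 1 but is rank 0" Reshape error reported earlier in the thread, not something confirmed here. In Python, parentheses without a trailing comma only group an expression, so `(-1)` is the scalar -1 (rank 0), while `[-1]` is a one-element list (rank 1). The variable names below are illustrative.

```python
import numpy as np

paren_shape = (-1)    # parentheses only group: this is the plain int -1
list_shape = [-1]     # a one-element list
tuple_shape = (-1,)   # the trailing comma makes a one-element tuple

# Rank of each candidate "shape" argument when read as a tensor.
print(type(paren_shape).__name__, np.ndim(paren_shape))
print(type(list_shape).__name__, np.ndim(list_shape))
print(type(tuple_shape).__name__, np.ndim(tuple_shape))

# Flattening broadcast statistics of shape (1, 64, 1, 1) with a rank-1 shape.
stats = np.ones((1, 64, 1, 1))
flat = stats.reshape(list_shape)
print(flat.shape)
```

NumPy also accepts a bare -1 in reshape, so it is more lenient than the TensorFlow shape check that raised the error. The sketch only illustrates the rank difference between the two spellings.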

Answer 36 · 2019-06-19T21:20:58.000Z

@rcasero @see-- @yaarsh @ymodak @davideboschetto @yanboliang @damnko Thank you for your comments. How can I install Keras master?

pipenv install -e git+https://github.com/keras-team/keras.git@master#egg=keras

Answer 37 · 2019-07-01T16:23:43.000Z

Please change tensorflow_backend.py in Keras yourself, as suggested by @see--.
In Keras 2.2.4, in backend/tensorflow_backend.py, change "()" to "[ ]" on lines 1908, 1910, 1914, and 1918.

This worked for me, thank you.

Answer 38 · 2019-07-08T03:44:04.000Z

Please change tensorflow_backend.py in Keras yourself, as suggested by @see--.
In Keras 2.2.4, in backend/tensorflow_backend.py, change "()" to "[ ]" on lines 1908, 1910, 1914, and 1918.

This worked like magic LOOL hope this gets fixed

Answer 39 · 2019-08-08T09:50:28.000Z

Please change tensorflow_backend.py in Keras yourself, as suggested by @see--.
In Keras 2.2.4, in backend/tensorflow_backend.py, change "()" to "[ ]" on lines 1908, 1910, 1914, and 1918.

Thank you!! This worked like a charm. Although it's just super sad that you have to manually modify keras' core files to fix this.

Answer 40 · 2019-08-15T09:57:10.000Z

Workaround: downgrade Keras. Version 2.2.0 works on CPU, and version 2.1.6 works on GPU.

Answer 41 · 2019-08-30T13:14:37.000Z

It works well with "cuda=9.0, tensorflow=1.11.0, keras=2.2.5"

Answer 42 · 2020-09-04T09:49:04.000Z

Please change tensorflow_backend.py in Keras yourself, as suggested by @see--.
In Keras 2.2.4, in backend/tensorflow_backend.py, change "()" to "[ ]" on lines 1908, 1910, 1914, and 1918.

Why does this work?

Answer 43 · 2020-09-06T15:55:47.000Z

[ ]

Thank you very much! Excellent 👍