Arsey/keras-transfer-learning-for-oxford102

Error in weights file I guess...I am at a loss

shirishr opened this issue · 13 comments

ValueError: Input dimension mis-match. (input[0].shape[1] = 7, input[1].shape[1] = 64)

Here is the trace:

Creating model...
C:\Users\shiri\Documents\keras-transfer-learning-for-oxford102-master\models\resnet50.py:33: UserWarning: Update your `Model` call to the Keras 2 API: `Model(outputs=Softmax.0, inputs=/input_1)`
  self.model = Model(input=base_model.input, output=predictions)
Model is created
Fine tuning...
Freezing 80 layers
Found 6149 images belonging to 102 classes.
Found 1020 images belonging to 102 classes.
./models\base_model.py:43: UserWarning: Update your `fit_generator` call to the Keras 2 API: `fit_generator(<keras.pre..., validation_steps=1020, callbacks=[<keras.ca..., epochs=1000, steps_per_epoch=192, validation_data=<keras.pre..., class_weight={0: 2.4765...)`
  class_weight=self.class_weight)
Epoch 1/1000
Input dimension mis-match. (input[0].shape[1] = 7, input[1].shape[1] = 64)
Apply node that caused the error: Elemwise{add,no_inplace}(AbstractConv2d{convdim=2, border_mode='valid', subsample=(2, 2), filter_flip=True, imshp=(None, 3, 230, 230), kshp=(64, 3, 7, 7), filter_dilation=(1, 1)}.0, Reshape{4}.0)
Toposort index: 3158
Inputs types: [TensorType(float32, 4D), TensorType(float32, (True, False, True, True))]
Inputs shapes: [(32, 7, 114, 84), (1, 64, 1, 1)]
Inputs strides: [(1066128, 152304, 1336, 8), (256, 4, 4, 4)]
Inputs values: ['not shown', 'not shown']
Outputs clients: [[AbstractBatchNormInference{axes=(0, 2, 3)}(Elemwise{add,no_inplace}.0, Reshape{4}.0, Reshape{4}.0, Reshape{4}.0, Reshape{4}.0, TensorConstant{0.00100000..0474974513}), AbstractBatchNormTrain{axes=(0, 2, 3)}(Elemwise{add,no_inplace}.0, InplaceDimShuffle{x,0,x,x}.0, InplaceDimShuffle{x,0,x,x}.0, TensorConstant{0.00100000..0474974513}, TensorConstant{0.10000000149011612})]]

Backtrace when the node is created(use Theano flag traceback.limit=N to make it longer):
  File "C:\Anaconda3\envs\py35\lib\site-packages\spyder\utils\site\sitecustomize.py", line 102, in execfile
    exec(compile(f.read(), filename, 'exec'), namespace)
  File "C:/Users/shiri/Documents/keras-transfer-learning-for-oxford102-master/train.py", line 50, in <module>
    model.train()
  File "./models\base_model.py", line 48, in train
    self._create()
  File "C:\Users\shiri\Documents\keras-transfer-learning-for-oxford102-master\models\resnet50.py", line 21, in _create
    base_model = KerasResNet50(include_top=False, input_tensor=self.get_input_tensor())
  File "C:\Anaconda3\envs\py35\lib\site-packages\keras\applications\resnet50.py", line 207, in ResNet50
    x = Conv2D(64, (7, 7), strides=(2, 2), name='conv1')(x)
  File "C:\Anaconda3\envs\py35\lib\site-packages\keras\engine\topology.py", line 554, in __call__
    output = self.call(inputs, **kwargs)
  File "C:\Anaconda3\envs\py35\lib\site-packages\keras\layers\convolutional.py", line 178, in call
    data_format=self.data_format)
  File "C:\Anaconda3\envs\py35\lib\site-packages\keras\backend\theano_backend.py", line 1938, in bias_add
    x += reshape(bias, (1, bias.shape[0], 1, 1))

HINT: Use the Theano flag 'exception_verbosity=high' for a debugprint and storage map footprint of this apply node.
Traceback (most recent call last):
  File "C:\Anaconda3\envs\py35\lib\site-packages\theano\compile\function_module.py", line 884, in __call__
    self.fn() if output_subset is None else\
ValueError: Input dimension mis-match. (input[0].shape[1] = 7, input[1].shape[1] = 64)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:/Users/shiri/Documents/keras-transfer-learning-for-oxford102-master/train.py", line 50, in <module>
    model.train()
  File "./models\base_model.py", line 51, in train
    self._fine_tuning()
  File "./models\base_model.py", line 43, in _fine_tuning
    class_weight=self.class_weight)
  File "C:\Anaconda3\envs\py35\lib\site-packages\keras\legacy\interfaces.py", line 88, in wrapper
    return func(*args, **kwargs)
  File "C:\Anaconda3\envs\py35\lib\site-packages\keras\engine\training.py", line 1876, in fit_generator
    class_weight=class_weight)
  File "C:\Anaconda3\envs\py35\lib\site-packages\keras\engine\training.py", line 1620, in train_on_batch
    outputs = self.train_function(ins)
  File "C:\Anaconda3\envs\py35\lib\site-packages\keras\backend\theano_backend.py", line 1094, in __call__
    return self.function(*inputs)
  File "C:\Anaconda3\envs\py35\lib\site-packages\theano\compile\function_module.py", line 898, in __call__
    storage_map=getattr(self.fn, 'storage_map', None))
  File "C:\Anaconda3\envs\py35\lib\site-packages\theano\gof\link.py", line 325, in raise_with_op
    reraise(exc_type, exc_value, exc_trace)
  File "C:\Anaconda3\envs\py35\lib\site-packages\six.py", line 685, in reraise
    raise value.with_traceback(tb)
  File "C:\Anaconda3\envs\py35\lib\site-packages\theano\compile\function_module.py", line 884, in __call__
    self.fn() if output_subset is None else\
ValueError: Input dimension mis-match. (input[0].shape[1] = 7, input[1].shape[1] = 64)
Apply node that caused the error: Elemwise{add,no_inplace}(AbstractConv2d{convdim=2, border_mode='valid', subsample=(2, 2), filter_flip=True, imshp=(None, 3, 230, 230), kshp=(64, 3, 7, 7), filter_dilation=(1, 1)}.0, Reshape{4}.0)
Toposort index: 3158
Inputs types: [TensorType(float32, 4D), TensorType(float32, (True, False, True, True))]
Inputs shapes: [(32, 7, 114, 84), (1, 64, 1, 1)]
Inputs strides: [(1066128, 152304, 1336, 8), (256, 4, 4, 4)]
Inputs values: ['not shown', 'not shown']
Outputs clients: [[AbstractBatchNormInference{axes=(0, 2, 3)}(Elemwise{add,no_inplace}.0, Reshape{4}.0, Reshape{4}.0, Reshape{4}.0, Reshape{4}.0, TensorConstant{0.00100000..0474974513}), AbstractBatchNormTrain{axes=(0, 2, 3)}(Elemwise{add,no_inplace}.0, InplaceDimShuffle{x,0,x,x}.0, InplaceDimShuffle{x,0,x,x}.0, TensorConstant{0.00100000..0474974513}, TensorConstant{0.10000000149011612})]]

Backtrace when the node is created(use Theano flag traceback.limit=N to make it longer):
  File "C:\Anaconda3\envs\py35\lib\site-packages\spyder\utils\site\sitecustomize.py", line 102, in execfile
    exec(compile(f.read(), filename, 'exec'), namespace)
  File "C:/Users/shiri/Documents/keras-transfer-learning-for-oxford102-master/train.py", line 50, in <module>
    model.train()
  File "./models\base_model.py", line 48, in train
    self._create()
  File "C:\Users\shiri\Documents\keras-transfer-learning-for-oxford102-master\models\resnet50.py", line 21, in _create
    base_model = KerasResNet50(include_top=False, input_tensor=self.get_input_tensor())
  File "C:\Anaconda3\envs\py35\lib\site-packages\keras\applications\resnet50.py", line 207, in ResNet50
    x = Conv2D(64, (7, 7), strides=(2, 2), name='conv1')(x)
  File "C:\Anaconda3\envs\py35\lib\site-packages\keras\engine\topology.py", line 554, in __call__
    output = self.call(inputs, **kwargs)
  File "C:\Anaconda3\envs\py35\lib\site-packages\keras\layers\convolutional.py", line 178, in call
    data_format=self.data_format)
  File "C:\Anaconda3\envs\py35\lib\site-packages\keras\backend\theano_backend.py", line 1938, in bias_add
    x += reshape(bias, (1, bias.shape[0], 1, 1))

HINT: Use the Theano flag 'exception_verbosity=high' for a debugprint and storage map footprint of this apply node.

I ran the train.py with these theano flags

optimizer=fast_compile
exception_verbosity=high

to get some more information:

Model is created
Fine tuning...
Freezing 80 layers
Found 6149 images belonging to 102 classes.
Found 1020 images belonging to 102 classes.
./models\base_model.py:43: UserWarning: Update your `fit_generator` call to the Keras 2 API: `fit_generator(<keras.pre..., epochs=1000, class_weight={0: 2.4765..., validation_data=<keras.pre..., validation_steps=1020, callbacks=[<keras.ca..., steps_per_epoch=192)`
  class_weight=self.class_weight)
Epoch 1/1000

*GpuDnnConv images and kernel must have the same stack size*

Apply node that caused the error: GpuDnnConv{algo='small', inplace=False}(GpuContiguous.0, GpuContiguous.0, GpuAllocEmpty.0, GpuDnnConvDesc{border_mode='valid', subsample=(2, 2), conv_mode='conv', precision='float32'}.0, Constant{1.0}, Constant{0.0})
Toposort index: 2266
Inputs types: [CudaNdarrayType(float32, 4D), CudaNdarrayType(float32, 4D), CudaNdarrayType(float32, 4D), <theano.gof.type.CDataType object at 0x000001E9F08BC3C8>, Scalar(float32), Scalar(float32)]
Inputs shapes: [(32, 3, 230, 230), (7, 7, 3, 64), (32, 7, 114, 84), 'No shapes', (), ()]
Inputs strides: [(158700, 52900, 230, 1), (1344, 192, 64, 1), (67032, 9576, 84, 1), 'No strides', (), ()]
Inputs values: ['not shown', 'not shown', 'not shown', <capsule object NULL at 0x000001EA0BD67540>, 1.0, 0.0]
Inputs type_num: ['', '', '', '', 11, 11]
Inputs name: ('image', 'kernel', 'output', 'descriptor', 'alpha', 'beta')
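
One possible reading of the shapes in this log, offered as an assumption rather than a confirmed diagnosis: the `conv1` kernel arrives as (7, 7, 3, 64), which looks like a (rows, cols, input channels, filters) layout, while the Theano convolution expects (filters, input channels, rows, cols) = (64, 3, 7, 7). The sketch below applies the usual "valid" convolution size arithmetic to both kernels. The helper name `conv_output_shape` is hypothetical and is not part of Keras or Theano.

```python
def conv_output_shape(image_shape, kernel_shape, stride):
    """Output shape of a 'valid' 2-D convolution.

    Assumes image_shape is (batch, channels, rows, cols) and
    kernel_shape is (filters, channels, rows, cols).
    """
    batch, _, rows, cols = image_shape
    filters, _, k_rows, k_cols = kernel_shape
    out_rows = (rows - k_rows) // stride[0] + 1
    out_cols = (cols - k_cols) // stride[1] + 1
    return (batch, filters, out_rows, out_cols)


image = (32, 3, 230, 230)        # 'image' input shape from the log
expected_kernel = (64, 3, 7, 7)  # kshp reported by AbstractConv2d
loaded_kernel = (7, 7, 3, 64)    # 'kernel' input shape from the log

print(conv_output_shape(image, expected_kernel, (2, 2)))  # (32, 64, 112, 112)
print(conv_output_shape(image, loaded_kernel, (2, 2)))    # (32, 7, 114, 84)
```

Under that assumption the second result matches the (32, 7, 114, 84) tensor in the error, and a 7-filter output cannot be added to a 64-element bias. The kernel's second axis (7) also differs from the image's 3 channels, which would be consistent with the "same stack size" message.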

Arsey commented

@shirishr which Python and TensorFlow versions are you using?

Sasha,
On the same machine I have a virtual machine running Ubuntu, where I installed Python 2.7 and Theano (no GPU). Everything seems to work there, but training is so slow that it's like watching grass grow.

Do you have a model trained with Python 3.5? Should we ask @fchollet?

Arsey commented

@shirishr and what's your Keras version?
BTW, I don't think we should mention fchollet here. Also, the weights issue is backend-specific.

I see. My Keras version is 2.0.2

Hello Sasha,

I upgraded Keras to version 2.0.6 using:

pip install git+git://github.com/fchollet/keras.git --upgrade --no-deps

That eliminated the error, so it is safe to close the issue.

I also suggest a minor code change at lines 65~66. The current lines are:

def get_input_tensor(self):
    return Input(shape=(3,) + self.img_size)

which should be changed to:

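def get_input_tensor(self):
    # The TensorFlow backend defaults to channels-last image data;
    # the other branch keeps the original channels-first shape.
    if keras.backend.backend() == 'tensorflow':
        return Input(shape=self.img_size + (3,))
    else:
        return Input(shape=(3,) + self.img_size)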

I am assuming Theano and CNTK use the same image_data_format.
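
A variant of the same idea, sketched under assumptions: instead of branching on the backend name, the shape can be chosen from the image data format string, which Keras 2 exposes as `keras.backend.image_data_format()` (it returns `'channels_first'` or `'channels_last'`). The helper name `input_shape_for` is hypothetical and is not part of this repository or of Keras.

```python
def input_shape_for(img_size, data_format):
    """Return the (unbatched) input shape for RGB images.

    img_size is (height, width); data_format is the string Keras uses,
    either 'channels_first' or 'channels_last'.
    """
    if data_format == "channels_first":
        return (3,) + tuple(img_size)
    if data_format == "channels_last":
        return tuple(img_size) + (3,)
    raise ValueError("unknown data format: {!r}".format(data_format))


# With Keras 2 installed, get_input_tensor could then read (assumption):
#   from keras import backend as K
#   from keras.layers import Input
#   return Input(shape=input_shape_for(self.img_size, K.image_data_format()))

print(input_shape_for((224, 224), "channels_first"))  # (3, 224, 224)
print(input_shape_for((224, 224), "channels_last"))   # (224, 224, 3)
```

This avoids having to guess which data format each backend uses, since the setting comes from keras.json rather than from the backend name alone.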

Arsey commented

@shirishr thanks for your patience. I'm in the process of adapting the code for
(Keras 1 or Keras 2) + (Theano or TensorFlow) + (Python 2 or Python 3)

Arsey commented

@shirishr I'm glad you like it.
Speaking of Theano, try setting the device with an environment variable, THEANO_FLAGS='device=gpu0' or THEANO_FLAGS='device=cuda', and check the warnings printed when you run the training process.
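
A minimal sketch of setting these flags from Python instead of the shell, assuming the script sets them before Theano is first imported (Theano reads `THEANO_FLAGS` at import time). The helper name `theano_flags` is hypothetical. The flag values are the ones mentioned in this thread.

```python
import os


def theano_flags(**flags):
    """Join keyword arguments into Theano's comma-separated key=value format."""
    return ",".join("{}={}".format(key, value) for key, value in flags.items())


# Must run before the first `import theano` (or `import keras` with the
# Theano backend), otherwise the flags are not picked up.
os.environ["THEANO_FLAGS"] = theano_flags(
    device="cuda",
    optimizer="fast_compile",
    exception_verbosity="high",
)

print(os.environ["THEANO_FLAGS"])
```

Whether `device=cuda` or `device=gpu0` applies depends on the installed Theano version and GPU backend, so treat the device value as a placeholder.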

I have tried setting the backend to "theano" in keras.json and device = "gpu" in .theanorc, and saw some compiler errors (gcc++ not found, or MinGW or cl.exe not found; I don't remember exactly). I have abandoned Theano for the time being, as I am afraid I may break something else if I change my current setup. Maybe one of these days I will set myself a target and fix that issue as well.
Thanks

Arsey commented

If TensorFlow works for you, just use it; it's even faster and more promising.