Arsey/keras-transfer-learning-for-oxford102

train error: ValueError: Input arrays should have the same number of samples as target arrays

mikeylj opened this issue · 11 comments

Input arrays should have the same number of samples as target arrays. Found 8837 input samples and 297 target samples.
Traceback (most recent call last):
  File "train.py", line 39, in <module>
    model_module.train(class_weight=class_weight)
  File "/media/hlgdeep/_data/keras_project/flower-oxford102/models/vgg16.py", line 216, in train
    train_top_model(class_weight=class_weight)
  File "/media/hlgdeep/_data/keras_project/flower-oxford102/models/vgg16.py", line 146, in train_top_model
    class_weight=class_weight)
  File "/usr/local/lib/python2.7/dist-packages/keras/engine/training.py", line 1406, in fit
    batch_size=batch_size)
  File "/usr/local/lib/python2.7/dist-packages/keras/engine/training.py", line 1308, in _standardize_user_data
    _check_array_lengths(x, y, sample_weights)
  File "/usr/local/lib/python2.7/dist-packages/keras/engine/training.py", line 229, in _check_array_lengths
    'and ' + str(list(set_y)[0]) + ' target samples.')
ValueError: Input arrays should have the same number of samples as target arrays. Found 8837 input samples and 297 target samples.

Arsey commented

Probably a corrupted dataset.

Did you manage to get this working? I have the following error in bootstrap.py:

move_files('train', labels[idx_test, :])
IndexError: too many indices

Any ideas? Do you have a fully working version you could send please?

Check vgg16.py. ImageDataGenerator.flow_from_directory() has a default batch_size, and predict_generator() takes a steps parameter. In the script, steps=config.nb_train_samples, which means you are getting nb_train_samples * batch_size samples from the generator. You can set batch_size=1 or steps=config.nb_train_samples / batch_size.
I guess in older versions of Keras this parameter actually took the total number of samples to be generated. I'm using Keras 2.0.4.
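A minimal sketch of both options, assuming the datagen, model, and config objects from the repo's save_bottleneck_features():

# Option 1: one sample per batch, so steps equals the sample count
generator = datagen.flow_from_directory(
    config.train_dir,
    target_size=config.img_size,
    batch_size=1,
    shuffle=False,
    classes=config.classes)
bottleneck_features_train = model.predict_generator(
    generator, steps=config.nb_train_samples)

# Option 2: keep the default batch size and divide the step count
# (nb_train_samples should be divisible by the batch size)
steps = config.nb_train_samples // generator.batch_size
bottleneck_features_train = model.predict_generator(generator, steps=steps)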

I'm using Keras 2.0.4. We have this code in save_bottleneck_features() in vgg16.py:

generator = datagen.flow_from_directory(
    config.train_dir,
    target_size=config.img_size,
    shuffle=False,
    classes=config.classes)
bottleneck_features_train = model.predict_generator(generator, config.nb_train_samples)

What do I need to change? Do I do the same for validation?

@djones4487169
Add batch_size=1 to the parameters, and do the same for validation. You may also need to modify tune(). You can check keras.io for how those functions work.

Making these modifications will get you through save_bottleneck_features() and train_top_model(); there are some other problems when running tune() that I haven't looked into. You may want to consider using an older version of Keras to run the scripts directly.
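For the validation side, the change would look something like this (a sketch; config.validation_dir and config.nb_validation_samples are assumed names following the repo's conventions):

generator = datagen.flow_from_directory(
    config.validation_dir,      # assumed config attribute for the validation set
    target_size=config.img_size,
    batch_size=1,
    shuffle=False,
    classes=config.classes)
bottleneck_features_validation = model.predict_generator(
    generator, steps=config.nb_validation_samples)  # assumed sample count attribute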

I got it working up to model fitting and the 1st epoch, but then:

ResourceExhaustedError: OOM when allocating tensor with shape [4096,4096]

Any ideas how to solve this or free up memory to allow it to fit the model?

Got it training by changing the 4096 units in both dense layers to 1024 and then 256.
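A sketch of the smaller top model, assuming the bottleneck-feature setup from save_bottleneck_features() (layer sizes as above; everything else is an assumption, not the repo's exact code):

from keras.models import Sequential
from keras.layers import Flatten, Dense, Dropout

model = Sequential()
model.add(Flatten(input_shape=bottleneck_features_train.shape[1:]))
# 1024/256 units instead of 4096/4096 avoids allocating the
# [4096, 4096] weight matrix that triggered the OOM error
model.add(Dense(1024, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(256, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(len(config.classes), activation='softmax'))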

Error in tune():

Negative dimension size caused by subtracting 2 from 1 for 'block2_pool_1/MaxPool' with input shapes: [?,1,112,128]

I'm using the TensorFlow backend.

Changed:
base_model = VGG16(weights='imagenet', include_top=False, input_tensor=Input(shape=(3,) + config.img_size))

To:
base_model = VGG16(weights='imagenet', include_top=False, input_tensor=Input(shape=(224, 224, 3)))
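The original shape (3,) + config.img_size is channels-first (Theano ordering), while the TensorFlow backend expects channels-last. A backend-agnostic sketch, assuming config.img_size is a (height, width) tuple like (224, 224):

from keras import backend as K
from keras.layers import Input
from keras.applications.vgg16 import VGG16

if K.image_data_format() == 'channels_first':
    input_shape = (3,) + config.img_size   # Theano ordering: (3, 224, 224)
else:
    input_shape = config.img_size + (3,)   # TensorFlow ordering: (224, 224, 3)

base_model = VGG16(weights='imagenet', include_top=False,
                   input_tensor=Input(shape=input_shape))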

Any ideas about this error in get_top_model_for_VGG16() in vgg16.py:

you called 'set_weights' on layer 'fc1' with a weight list of length 1, but it was expecting 2 weights

Arsey commented

@djones4487169, the code does not support TensorFlow; please use Theano.
@cottontail7, I didn't test the code with Keras 2. I would suggest using Keras 1.2, as Keras 2 has some issues with the weights of pre-trained models.
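For context on the 'fc1' error above: a Dense layer's set_weights() expects a two-element list, kernel and bias, so passing a single array fails. A minimal illustration, not the repo's code (the shapes match VGG16's fc1 layer):

import numpy as np
from keras.models import Sequential
from keras.layers import Dense

model = Sequential([Dense(4096, input_shape=(25088,), name='fc1')])
fc1 = model.get_layer('fc1')

kernel = np.zeros((25088, 4096))
bias = np.zeros((4096,))

# fc1.set_weights([kernel])       # raises: expected 2 weights, got 1
fc1.set_weights([kernel, bias])   # OK: kernel and bias supplied together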