Arsey/keras-transfer-learning-for-oxford102

ValueError: input samples have a different number of arrays than target samples.

TriLoo opened this issue · 8 comments

Hi Arsey,
After extracting the bottleneck features of the Oxford-102 dataset and using these features to train a small fully-connected model (bottlenecks.py / train_top_model()), I got a ValueError:

Using Theano backend.
K.image_dim_ordering: th
Found 6149 images belonging to 102 classes.
Found 1020 images belonging to 102 classes.
Traceback (most recent call last):
File "train.py", line 38, in
bottlenecks.train_top_model()
File "/home/smher/Desktop/ML_Work/myVGG16_v1/bottlenecks.py", line 61, in train_top_model
callbacks=callbacks_list)
File "/home/smher/anaconda2/lib/python2.7/site-packages/keras/engine/training.py", line 1068, in fit
batch_size=batch_size)
File "/home/smher/anaconda2/lib/python2.7/site-packages/keras/engine/training.py", line 993, in _standardize_user_data
check_array_lengths(x, y, sample_weights)
File "/home/smher/anaconda2/lib/python2.7/site-packages/keras/engine/training.py", line 182, in check_array_lengths
'and ' + str(list(set_y)[0]) + ' target samples.')
ValueError: Input arrays should have the same number of samples as target arrays. Found 6149 input samples and 8660 target samples.

How can I fix this error? Any advice?

Arsey commented

ValueError: Input arrays should have the same number of samples as target arrays. Found 6149 input samples and 8660 target samples. - It's possible that your data directory structure changed after save_bottleneck_features() completed.

TriLoo commented

I think I've found the cause of this error.
In bottlenecks.py, the training labels are built like this:
train_labels += [k] * len(os.listdir('{}/{}'.format(config.train_dir, i)))
So if there are non-image files in the train directory, this line counts them as images, which inflates the number of target samples.
In fact, I had used this dataset in a SIFT + SVM pipeline, so there were some .sift files in the train data directory. I've deleted them and will try again...

Arsey commented

@TriLoo I've just fixed the issue

@Arsey Can you please tell me how you fixed the issue?

This is unfair, tell us how you fixed it so we can learn.

Arsey commented

@stobasa basically the bottleneck extraction step is not necessary and no longer exists in the code. Please read the code to get the idea. Simply put, the steps are:

  1. get pretrained model without the last fully-connected layer + its weights
  2. add a fully-connected layer for classification with a number of neurons that is equal to the number of classes you have
  3. freeze N layers of the original pretrained model
  4. start training with small learning rate

All these steps are already implemented.
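The four steps above can be sketched roughly as follows, using the modern tensorflow.keras API (the repository originally used Keras 1.x with Theano, so names differ); NUM_CLASSES and N_FROZEN are illustrative values, not from the repository:

```python
from tensorflow.keras.applications import VGG16
from tensorflow.keras.layers import Dense, Flatten
from tensorflow.keras.models import Model
from tensorflow.keras.optimizers import SGD

NUM_CLASSES = 102   # Oxford-102 has 102 flower classes
N_FROZEN = 15       # how many pretrained layers to freeze (a tunable choice)

# 1. Pretrained model without its top fully-connected layers.
#    In practice pass weights='imagenet'; None here avoids the large
#    weight download in this self-contained sketch.
base = VGG16(weights=None, include_top=False, input_shape=(224, 224, 3))

# 2. Add a classification head with one neuron per class.
x = Flatten()(base.output)
outputs = Dense(NUM_CLASSES, activation='softmax')(x)
model = Model(inputs=base.input, outputs=outputs)

# 3. Freeze the first N layers of the pretrained base.
for layer in base.layers[:N_FROZEN]:
    layer.trainable = False

# 4. Compile with a small learning rate before training.
model.compile(optimizer=SGD(learning_rate=1e-4, momentum=0.9),
              loss='categorical_crossentropy', metrics=['accuracy'])
```

Training then proceeds with model.fit(...) on the flower images; the small learning rate keeps fine-tuning from destroying the pretrained weights.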

@Arsey can you please post the code here? I have been stuck on this for a whole day; it would help me greatly.

Arsey commented

@stobasa code for what?