Low transfer learning accuracy

Question

mickes27 opened this issue 6 years ago · 2 comments

I need to classify seeds by defect type. I have around 14k images in 7 classes (the classes are imbalanced: some have more photos, some fewer). I trained Inception V3 from scratch and got around 90% accuracy. Then I tried transfer learning with a model pretrained on ImageNet weights: I imported inception_v3 from applications without the top fully connected layers, then added my own as shown in the documentation. I ended up with this code:

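# Imports (not shown in the original post; assumed from the names used below)
from keras.applications.inception_v3 import InceptionV3, preprocess_input
from keras.callbacks import EarlyStopping, ModelCheckpoint
from keras.layers import Dense, GlobalAveragePooling2D
from keras.models import Model
from keras.preprocessing.image import ImageDataGenerator
from keras.utils import plot_model

# Setting dimensions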
img_width = 454
img_height = 227

###########################
# PART 1 - Creating Model #
###########################

# Creating InceptionV3 model without Fully-Connected layers
base_model = InceptionV3(weights='imagenet', include_top=False, input_shape = (img_height, img_width, 3))

# Adding layers which will be fine-tuned
x = base_model.output
x = GlobalAveragePooling2D()(x)
x = Dense(1024, activation='relu')(x)
predictions = Dense(7, activation='softmax')(x)

# Creating final model
model = Model(inputs=base_model.input, outputs=predictions)

# Plotting model
plot_model(model, to_file='inceptionV3.png')

# Freezing Convolutional layers
for layer in base_model.layers:
    layer.trainable = False

# Summarizing layers
print(model.summary())

# Compiling the CNN
model.compile(optimizer = 'adam', loss = 'categorical_crossentropy', metrics = ['accuracy'])

############################################
# PART 2 - Image Preprocessing and Fitting #
############################################

# Fitting the CNN to the images

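# Note: ImageDataGenerator applies preprocessing_function before rescale, so the
# output of preprocess_input is additionally multiplied by 1/255 here
# (the same applies to valid_datagen below).
train_datagen = ImageDataGenerator(rescale = 1./255,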
                                   rotation_range=30,
                                   width_shift_range=0.2,
                                   height_shift_range=0.2,
                                   shear_range = 0.2,
                                   zoom_range = 0.2,
                                   horizontal_flip = True,
                                   preprocessing_function=preprocess_input,)

valid_datagen = ImageDataGenerator(rescale = 1./255,
                                   preprocessing_function=preprocess_input,)

train_generator = train_datagen.flow_from_directory("dataset/training_set",
                                                    target_size=(img_height, img_width),
                                                    batch_size = 4,
                                                    class_mode = "categorical",
                                                    shuffle = True,
                                                    seed = 42)

valid_generator = valid_datagen.flow_from_directory("dataset/validation_set",
                                                    target_size=(img_height, img_width),
                                                    batch_size = 4,
                                                    class_mode = "categorical",
                                                    shuffle = True,
                                                    seed = 42)

STEP_SIZE_TRAIN = train_generator.n//train_generator.batch_size
STEP_SIZE_VALID = valid_generator.n//valid_generator.batch_size

# Save the model according to the conditions  
checkpoint = ModelCheckpoint("inception_v3_1.h5", monitor='val_acc', verbose=1, save_best_only=True, save_weights_only=False, mode='auto', period=1)
early = EarlyStopping(monitor='val_acc', min_delta=0, patience=10, verbose=1, mode='auto')

#Training the model
history = model.fit_generator(generator=train_generator,
                         steps_per_epoch=STEP_SIZE_TRAIN,
                         validation_data=valid_generator,
                         validation_steps=STEP_SIZE_VALID,
                         epochs=25,
                         callbacks = [checkpoint, early])

But the results are terrible: 45% accuracy. I expected better. I plotted the training history: training accuracy rises to ~80%, while validation accuracy fluctuates between 20% and 40%. What's wrong?

Answer 1 · 2018-12-04T14:54:56.000Z

Same here.
I followed the same process with a generator and got the same kind of results.

It seems weird, but doing the task in two steps seems to work:

first, I use InceptionV3 without the prediction layer to extract features
second, I train a single Dense layer of size 2 on those features with the fit method

Finally, I build the full model the same way you did, and set the weights of its last layer to those of the Dense layer fitted previously. That works.
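
The two-step workaround described above could look roughly like the sketch below. It is written against `tf.keras` rather than the standalone Keras 2.2 used in this thread, and the helper names (`extract_features`, `build_head`, `train_head`, `stack`) are made up for illustration. Instead of copying weights into a freshly built model as described, it reuses the trained head object inside the combined model, which should have the same effect.

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers


def extract_features(base_model, images, batch_size=32):
    """Step 1: run the base once and pool to one feature vector per image."""
    inputs = keras.Input(shape=base_model.input_shape[1:])
    # training=False keeps layers such as BatchNormalization in inference mode.
    x = base_model(inputs, training=False)
    x = layers.GlobalAveragePooling2D()(x)
    extractor = keras.Model(inputs, x)
    return extractor.predict(images, batch_size=batch_size, verbose=0)


def build_head(feature_dim, num_classes):
    """Small classifier that is trained directly on the extracted features."""
    return keras.Sequential([
        keras.Input(shape=(feature_dim,)),
        layers.Dense(1024, activation="relu"),
        layers.Dense(num_classes, activation="softmax"),
    ])


def train_head(head, features, labels, epochs=5, batch_size=32):
    """Step 2: fit the head on the features with the plain fit method."""
    head.compile(optimizer="adam", loss="categorical_crossentropy",
                 metrics=["accuracy"])
    head.fit(features, labels, epochs=epochs, batch_size=batch_size, verbose=0)
    return head


def stack(base_model, head):
    """Step 3: combine the frozen base and the trained head into one model."""
    base_model.trainable = False
    inputs = keras.Input(shape=base_model.input_shape[1:])
    x = base_model(inputs, training=False)
    x = layers.GlobalAveragePooling2D()(x)
    outputs = head(x)  # the head keeps the weights learned in step 2
    return keras.Model(inputs, outputs)
```

With the model from the question, `base_model` would be the `InceptionV3(weights='imagenet', include_top=False, ...)` instance, `labels` would be one-hot encoded, and `images` would need the same preprocessing for feature extraction and for the combined model.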

Answer 2 · 2018-12-25T15:37:44.000Z

I'm experiencing the same with Keras 2.2.3. Do you have a solution for this problem?

@RomainCendre I got the same result as you. When I use a pre-trained model, freeze its layers, and add a simple dense layer, the validation accuracy is really bad (in my case no better than random). When I first extract the features and then feed them to a new model, the validation accuracy is more or less the same as the training accuracy.

@mickes27 The workaround I'm using is the technique that @RomainCendre recommends: I then save the weights of both models and load them again once I've combined the models.
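
The save-and-reload step could be sketched as follows, again assuming `tf.keras`. The helper names, the layer name `head_predictions`, and the single-layer head are illustrative choices, not taken from the thread. Only the head weights are reloaded here, since the base already carries its pretrained weights.

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers


def make_head(feature_dim, num_classes):
    """Classifier head; rebuilt with the same architecture before loading weights."""
    return keras.Sequential([
        keras.Input(shape=(feature_dim,)),
        layers.Dense(num_classes, activation="softmax", name="head_predictions"),
    ])


def attach_saved_head(base_model, weights_path, num_classes):
    """Build base + pooling + head end to end, then load the saved head weights."""
    base_model.trainable = False
    inputs = keras.Input(shape=base_model.input_shape[1:])
    x = base_model(inputs, training=False)  # frozen base in inference mode
    x = layers.GlobalAveragePooling2D()(x)
    head = make_head(int(x.shape[-1]), num_classes)
    head.load_weights(weights_path)  # weights fitted earlier on extracted features
    return keras.Model(inputs, head(x))


# Usage (hypothetical file name): after fitting a head on extracted features,
#   trained_head.save_weights("head.weights.h5")
#   model = attach_saved_head(base_model, "head.weights.h5", num_classes=7)
```

The `.weights.h5` suffix is used because newer Keras versions require it for `save_weights`; older versions accept it as an ordinary HDF5 file name.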