titu1994/keras-SRU

Overfitting too fast

Closed this issue · 1 comment

I ran imdb_sru.py, but I found it overfits far too easily. My log is here:

Epoch 1/100
7s - loss: 0.6368 - acc: 0.6280 - val_loss: 0.5955 - val_acc: 0.6673
Epoch 2/100
5s - loss: 0.5224 - acc: 0.7412 - val_loss: 0.6085 - val_acc: 0.6791
Epoch 3/100
5s - loss: 0.4561 - acc: 0.7827 - val_loss: 0.6453 - val_acc: 0.6871
Epoch 4/100
5s - loss: 0.3931 - acc: 0.8183 - val_loss: 0.6873 - val_acc: 0.7012
Epoch 5/100
5s - loss: 0.3277 - acc: 0.8527 - val_loss: 0.7497 - val_acc: 0.7072
Epoch 6/100
5s - loss: 0.2661 - acc: 0.8853 - val_loss: 0.8440 - val_acc: 0.7120
Epoch 7/100
5s - loss: 0.2133 - acc: 0.9116 - val_loss: 0.9658 - val_acc: 0.7123
Epoch 8/100
5s - loss: 0.1696 - acc: 0.9330 - val_loss: 1.1144 - val_acc: 0.7143
Epoch 9/100
5s - loss: 0.1312 - acc: 0.9496 - val_loss: 1.3357 - val_acc: 0.7074
Epoch 10/100
5s - loss: 0.1020 - acc: 0.9623 - val_loss: 1.5486 - val_acc: 0.7066

As you can see, val_loss keeps increasing while training loss falls. The model is overfitting.

To avoid the overfitting, I tried the following:
outputs = SRU(batch_size, dropout=0.2, recurrent_dropout=0.2)(prev_input)

opt = Adam(lr=0.001, clipnorm=0.03)
model.compile(loss='binary_crossentropy',
              optimizer=opt,
              metrics=['accuracy'])
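For context, here is a minimal self-contained sketch of that setup. The import path (this repo's sru.py), the vocabulary size, sequence length, and layer widths are illustrative assumptions, not values taken from this thread:

from keras.datasets import imdb
from keras.layers import Dense, Embedding, Input
from keras.models import Model
from keras.optimizers import Adam
from keras.preprocessing import sequence

from sru import SRU  # assumption: the SRU layer defined in this repo's sru.py

max_features = 20000  # illustrative vocabulary size
maxlen = 80           # illustrative sequence length

(x_train, y_train), (x_test, y_test) = imdb.load_data(num_words=max_features)
x_train = sequence.pad_sequences(x_train, maxlen=maxlen)
x_test = sequence.pad_sequences(x_test, maxlen=maxlen)

ip = Input(shape=(maxlen,))
x = Embedding(max_features, 128)(ip)
# Note: the first positional argument of SRU is the number of units,
# not the batch size.
x = SRU(128, dropout=0.2, recurrent_dropout=0.2)(x)
out = Dense(1, activation='sigmoid')(x)

model = Model(ip, out)
opt = Adam(lr=0.001, clipnorm=0.03)
model.compile(loss='binary_crossentropy', optimizer=opt, metrics=['accuracy'])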

As you can see, neither dropout nor clipnorm stopped the overfitting. Why? Please help me.

The IMDB dataset is simply too small and too easy to overfit; it's not meant to be trained for that many epochs. Use a very low batch size and just 5-6 epochs, and the model will fit sufficiently.

Any more and it will overfit.
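One way to follow that advice automatically is Keras's EarlyStopping callback, which halts training once val_loss stops improving. A minimal sketch, continuing from the model above (the patience and batch size values are illustrative assumptions, not from this thread):

from keras.callbacks import EarlyStopping

# Stop as soon as val_loss stops improving, instead of running all 100 epochs.
# patience and batch_size here are illustrative assumptions.
early_stop = EarlyStopping(monitor='val_loss', patience=2, verbose=1)

model.fit(x_train, y_train,
          batch_size=32,  # a "very low batch size", per the advice above
          epochs=100,     # upper bound; EarlyStopping ends training much earlier
          validation_data=(x_test, y_test),
          callbacks=[early_stop])

With the log posted above, this would stop training around epoch 5-6, right where val_loss starts climbing.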