qfgaohao/pytorch-ssd

Re-training SSD-Mobilenet - loss going up and down

Ufosek opened this issue · 1 comment

Hi,

I am re-training SSD-Mobilenet with transfer learning, as described here.
My dataset contains 8000+ annotated images of sport players. I have a grayscale camera, so all images are grayscale (edit: converted to RGB by copying the single channel into all three).
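The grayscale-to-RGB conversion mentioned above can be done by replicating the single channel. A minimal sketch (the `gray_to_rgb` helper name is hypothetical, not from the repo):

```python
import numpy as np

def gray_to_rgb(gray: np.ndarray) -> np.ndarray:
    """Replicate a single (H, W) grayscale channel into an (H, W, 3)
    image, since the SSD-Mobilenet pipeline expects 3-channel input."""
    return np.stack([gray, gray, gray], axis=-1)

img = np.zeros((300, 300), dtype=np.uint8)  # dummy grayscale frame
rgb = gray_to_rgb(img)
print(rgb.shape)  # -> (300, 300, 3)
```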

EDIT - dataset split sizes:

test: 827
train: 5947
trainval: 7434
val: 1488

I used this script to generate the splits with:

trainval_percent = 0.9
train_percent = 0.8

I see that the loss goes down until about epoch 100, but then it spikes up, and after exactly 200 more epochs it reaches a new minimum.

  1. I am wondering what this means (overfitting, or just normal optimization behavior)?
  2. After each spike there is a new minimum (epoch 100 - 1.47, 300 - 1.41, 500 - 1.39, 700 - 1.38). Which checkpoint should I use? The lowest (at 700), or the one at 100 (since later training may not actually be improving, or may even be hurting)?

[attached image: training loss curve]

I would appreciate some help!
Regards

Change `t_max` if you are using the default scheduler.
The cosine scheduler varies the learning rate along a cosine curve, so with the default `t_max=120` the learning rate periodically climbs back up to its maximum, which is what produces the loss spikes you see.

parser.add_argument('--t_max', default=120, type=float,
                    help='T_max value for Cosine Annealing Scheduler.')
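The schedule behind that flag can be written in closed form (this is the standard cosine annealing formula that PyTorch's `CosineAnnealingLR` implements; the `cosine_annealing_lr` function here is a self-contained sketch, not code from the repo):

```python
import math

def cosine_annealing_lr(t, lr_max, lr_min=0.0, t_max=120):
    """Learning rate at epoch t under cosine annealing:
    lr(t) = lr_min + (lr_max - lr_min) * (1 + cos(pi * t / t_max)) / 2."""
    return lr_min + 0.5 * (lr_max - lr_min) * (1 + math.cos(math.pi * t / t_max))

# With t_max=120 the LR reaches lr_min at epoch 120 and is back at
# lr_max by epoch 240 -- the full cycle has period 2 * t_max, which is
# why the loss spikes and new minima repeat on a fixed schedule.
for epoch in (0, 60, 120, 180, 240):
    print(epoch, round(cosine_annealing_lr(epoch, lr_max=0.01), 6))
```

Raising `t_max` stretches the cycle out (fewer restarts over the same number of epochs), which should smooth out the periodic spikes.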