Loss quickly stabilizes around 8 and does not converge
Hi @markshih91
I have used your model to train on the COCO dataset. I found one error in your script *ssdlite_praph.py: one of the link definitions was missing. Unfortunately, the loss stabilizes quickly, at around 5000 steps. I also tested with the scales modified in train.py, as suggested by @pierluigiferrari, but saw no real difference. Did you train with the version you have posted? Can you share your training experience here? Thanks.
Hi @MirzaAnoush, can you help me with the missing link definition? Did you define one more or remove one?
Oh never mind I figured it out.
Hello @M4gicT0, sorry for my late arrival. Did you train the model?
Yes I am training it (with transfer learning) on my custom dataset. I am unsure why I obtain a lot of predicted coordinates outside the image though...
@M4gicT0 I want to try training the model on my dataset, which has two classes. Can you tell me how to sub-sample the weight tensors of all the classification layers?
The original writer of this SSD implementation has a very nice tutorial for that: https://github.com/pierluigiferrari/ssd_keras/blob/master/weight_sampling_tutorial.ipynb
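For reference, the core of the sub-sampling step is picking which output channels of each classification layer to keep. Here is a minimal sketch of that index computation only, assuming the channels along the last axis are laid out box by box (channel = box * n_classes_source + class_id). The function name, the number of boxes and the class IDs are all hypothetical and chosen for illustration; check the tutorial and your own model for the real values.

```python
def subsample_class_indices(n_boxes, n_classes_source, classes_to_keep):
    """Return the channel indices to keep along the last axis of a
    classification layer's kernel (and along its bias, if it has one).

    Assumption: channels are ordered box by box, so the channel for
    class c of anchor box b is b * n_classes_source + c.
    """
    for c in classes_to_keep:
        if not 0 <= c < n_classes_source:
            raise ValueError(f"class id {c} is outside 0..{n_classes_source - 1}")
    return [
        box * n_classes_source + c
        for box in range(n_boxes)
        for c in classes_to_keep
    ]


# Hypothetical setup: 81 source classes (80 COCO classes + background),
# 3 anchor boxes per cell, and we keep background (0) plus two
# illustrative class IDs for a two-class dataset.
indices = subsample_class_indices(
    n_boxes=3, n_classes_source=81, classes_to_keep=[0, 1, 3]
)
print(len(indices))  # 3 boxes * 3 kept classes
print(indices)
```

The resulting list would then be used to select entries along the last axis of each classification layer's weight tensors, so a layer that predicted 3 * 81 channels ends up predicting 3 * 3.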
Yes, I have tried it. I changed the layer name from conv4_3_norm_mbox_conf to ssd_cls1conv2, but when I run it, it reports that bias:0 does not exist. When I use ssd_cls1conv2_bn instead, it reports that kernel:0 does not exist. Is the name not written correctly?