How to initialize the weights when training KD stage 2?
--weights=WSDDN/****.caffemodel (layer names are modified), kd_stage1/modesl_iter_400000.caffemodel
Check the last issue and make sure the layer names of the T-WDet model have been modified.
The size of kd_stage1/modesl_iter_400000.caffemodel is almost twice that of WSDDN/****.caffemodel because it contains both the detection and the classification network. The detection network is exactly the same as in WSDDN/****.caffemodel, including the layer names.
If WSDDN/****.caffemodel is loaded first, the detection-network weights in kd_stage1/modesl_iter_400000.caffemodel will overwrite the weights loaded from WSDDN/****.caffemodel.
So the arguments should be as follows, which loads kd_stage1 first:
--weights=kd_stage1/modesl_iter_400000.caffemodel, WSDDN/****.caffemodel (layer names are modified)
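A minimal sketch of the overwrite concern described above, assuming the files listed in --weights are loaded in order and that weights are copied by matching layer names, so the last file containing a given layer name wins. The dictionaries, layer names, and the `load_in_order` helper are hypothetical stand-ins for illustration, not the real caffemodel contents or a Caffe API.

```python
def load_in_order(net, snapshots):
    """Copy parameters into `net` from each snapshot in turn, matching on layer name."""
    for snapshot in snapshots:
        for layer_name, params in snapshot.items():
            # Layers that do not exist in the target net are skipped.
            if layer_name in net:
                net[layer_name] = params
    return net


def empty_net():
    # Hypothetical target net with one detection layer and one classification layer.
    return {"det_layer": None, "cls_layer": None}


# Hypothetical snapshots: the stage1 file holds both networks, the WSDDN file
# holds only the detection network under the same layer name.
stage1 = {"det_layer": "detection weights from stage1", "cls_layer": "classification weights"}
wsddn = {"det_layer": "detection weights from WSDDN"}

wsddn_first = load_in_order(empty_net(), [wsddn, stage1])
stage1_first = load_in_order(empty_net(), [stage1, wsddn])

print(wsddn_first["det_layer"])
print(stage1_first["det_layer"])
```

Under these assumptions, the order only matters for layer names that appear in both files, and only if the two files hold different values for them.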
Please check that the weights of T-WDet in stage1 are frozen. So the loading result should actually be independent of the order of the arguments in --weights.
So the only pretrained model that stage2 training needs is the stage1 model.
Is that right?
No. Please check the prototxt of s2: T-WDet there has two classifiers (fc_wsd_8_c and fc_wsd_8_d).