Yochengliu/MLIC-KD-WSD

How to initialize the weights when training KD stage 2?

Closed this issue · 6 comments

ztwe commented

How should the arguments be filled in when calling this command?
./caffe-MLIC/build/tools/caffe train --solver=... --weights= --gpu 0

I tried initializing the weights with the following argument. Is that right?
--weights=kd_stage1/modesl_iter_400000.caffemodel, pretrain/VGG_ILSVRC_16_layers.caffemodel

--weights=WSDDN/****.caffemodel (layer names are modified), kd_stage1/modesl_iter_400000.caffemodel

Check the previous issue and make sure the layer names of the T-WDet model have been modified.

ztwe commented

The size of kd_stage1/modesl_iter_400000.caffemodel is almost twice that of WSDDN/****.caffemodel because it contains both the detection and the classification network. The detection network is exactly the same as in WSDDN/****.caffemodel, including the layer names.
If WSDDN/****.caffemodel is loaded first, the detection-network weights in kd_stage1/modesl_iter_400000.caffemodel will overwrite the weights loaded from WSDDN/****.caffemodel.

So the arguments should be as follows, which loads kd_stage1 first:
--weights=kd_stage1/modesl_iter_400000.caffemodel, WSDDN/****.caffemodel (layer names are modified)
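The ordering concern above can be illustrated with a minimal Python sketch. It assumes that the files listed in --weights are applied one after another and that parameters are copied into layers with matching names, so a later file overwrites an earlier one wherever the names collide. The layer names and values below are hypothetical placeholders, not taken from the repository, and the sketch only covers the case where the names actually collide.

```python
def load_weights_in_order(net_layers, weight_files):
    """Simulate name-matched weight loading from several files, applied in order."""
    params = {name: None for name in net_layers}
    for file_layers in weight_files:
        for name, value in file_layers.items():
            if name in params:  # layers that the net does not define are skipped
                params[name] = value
    return params

# Hypothetical layer names and contents, for illustration only.
net = ["det_fc8", "cls_fc8"]
kd_stage1 = {"det_fc8": "det@stage1", "cls_fc8": "cls@stage1"}
wsddn = {"det_fc8": "det@wsddn"}

stage1_then_wsddn = load_weights_in_order(net, [kd_stage1, wsddn])
wsddn_then_stage1 = load_weights_in_order(net, [wsddn, kd_stage1])

print(stage1_then_wsddn)  # the colliding layer holds the WSDDN value
print(wsddn_then_stage1)  # the colliding layer holds the stage 1 value
```

If the layer names in the WSDDN file have been modified so that they no longer collide with the stage 1 names, neither file overwrites the other and the order has no effect in this sketch.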

ztwe commented

The initial loss:
image

The loss after 38000 iter with batchsize=3:
image

Please check that the weights of T-WDet are frozen in stage 1. If they are, the loaded result should not depend on the order of the arguments in --weights.
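A small Python sketch of why frozen weights would make the order irrelevant: if the layers shared by two files hold identical values, merging them by name gives the same result in either order. The layer names and numbers are hypothetical, and the dictionary merge only stands in for name-matched weight copying; it is not the actual Caffe loader.

```python
from itertools import permutations

def merge_by_name(weight_files):
    """Apply files in order; later files overwrite earlier ones on name collisions."""
    merged = {}
    for layers in weight_files:
        merged.update(layers)
    return merged

def order_matters(weight_files):
    """Return True if any ordering of the files yields a different merged result."""
    results = [merge_by_name(p) for p in permutations(weight_files)]
    return any(r != results[0] for r in results[1:])

# Hypothetical values, for illustration only.
wsddn = {"det_fc8": [0.1, 0.2]}
stage1_frozen = {"det_fc8": [0.1, 0.2], "cls_fc8": [0.5]}   # detector unchanged in stage 1
stage1_drifted = {"det_fc8": [0.3, 0.4], "cls_fc8": [0.5]}  # detector was not frozen

frozen_case = order_matters([wsddn, stage1_frozen])
drifted_case = order_matters([wsddn, stage1_drifted])
print(frozen_case, drifted_case)
```

A final loss that changes with the argument order would therefore suggest that the shared layers differ between the two files, for example because the detector was not actually frozen.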

ztwe commented

So the only pretrained model that stage 2 training needs is the stage 1 model.
Is that right?

No. Please note that T-WDet has two classifiers (fc_wsd_8_c and fc_wsd_8_d) in the stage 2 prototxt.