Reproduce MSDNet paper's results
Hi @gaohuang,
I have recently been trying to reproduce the results from the MSDNet paper, but I cannot match the pretrained model's accuracy. I would very much appreciate it if you could check whether the following setup (for k=4) is correct:
th main.lua -dataset imagenet -data <imagenet_dir> -gen gen -nGPU 4 -nBlocks 5 -base 7 -step 4 -batchSize 256 -nEpochs 90
vs.
th main.lua -dataset imagenet -data <imagenet_dir> -gen gen -nGPU 4 -nBlocks 5 -base 7 -step 4 -batchSize 256 -retrain msdnet--step\=4--block\=5--growthRate\=16.t7 -testOnly true
The discrepancy in top-1 accuracy ranges from about 1% (first classifier) to 6% (last classifier).
According to your paper:
On ImageNet, we use MSDNets with four scales, and the ith classifier operates on the (k×i+3)th layer (with i=1, . . . , 5 ), where k=4, 6 and 7. For simplicity, the losses of all the classifiers are weighted equally during training.
[...]
We apply the same optimization scheme to the ImageNet dataset, except that we increase the mini-batch size to 256, and all the models are trained for 90 epochs with learning rate drops after 30 and 60 epochs.
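To make sure I am reading the quoted setup correctly, here is a small sketch of what I understand it to imply: the i-th classifier sits on layer k×i+3, and the learning rate drops after epochs 30 and 60. The base learning rate of 0.1 and the drop factor of 10 are my assumptions (taken from the usual fb.resnet.torch defaults), not values stated in the quote:

```python
def classifier_layers(k, n_classifiers=5):
    # i-th classifier operates on the (k*i + 3)-th layer, i = 1..5 (per the paper)
    return [k * i + 3 for i in range(1, n_classifiers + 1)]

def lr_at(epoch, base_lr=0.1, factor=0.1):
    # Assumed step schedule: multiply by `factor` after epochs 30 and 60.
    # base_lr and factor are guesses based on fb.resnet.torch defaults.
    return base_lr * factor ** ((epoch >= 30) + (epoch >= 60))

for k in (4, 6, 7):
    print(f"k={k}: classifier layers {classifier_layers(k)}")
```

If my reading of either the classifier placement or the schedule is off, that would already explain part of the gap.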
Am I missing something in the training parameters for ImageNet?
Hi @gaohuang,
I was wondering whether you have any ideas about what might be going wrong in my reproduction of the paper's results.
Is there a way to double-check the hyperparameters used for the distributed pretrained model? Could the discrepancy be a consequence of training on multiple GPUs?
I would greatly appreciate your reply.