gaohuang/MSDNet

Reproduce MSDNet paper's results


Hi @gaohuang,

I have been trying lately to reproduce the results from the MSDNet paper, but I could not match the pretrained model's accuracy. I would very much appreciate it if you could check whether the following setup (for k=4) is correct:

th main.lua -dataset imagenet -data <imagenet_dir> -gen gen -nGPU 4 -nBlocks 5 -base 7 -step 4 -batchSize 256 -nEpochs 90

vs.

th main.lua -dataset imagenet -data <imagenet_dir> -gen gen -nGPU 4 -nBlocks 5 -base 7 -step 4 -batchSize 256 -retrain msdnet--step\=4--block\=5--growthRate\=16.t7 -testOnly true

The discrepancy in top-1 accuracy ranges from about -1% (first classifier) to -6% (last classifier) relative to the pretrained model.

According to your paper:

On ImageNet, we use MSDNets with four scales, and the ith classifier operates on the (k×i+3)th layer (with i = 1, ..., 5), where k = 4, 6 and 7. For simplicity, the losses of all the classifiers are weighted equally during training.

[...]

We apply the same optimization scheme to the ImageNet dataset, except that we increase the mini-batch size to 256, and all the models are trained for 90 epochs with learning rate drops after 30 and 60 epochs.
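To make sure I read that correctly, this is the classifier placement and learning-rate schedule I assumed (a quick sanity check of my own, not code from the repo; the initial learning rate of 0.1 and the divide-by-10 decay are my assumption, since the quoted text does not restate them):

-- My reading of the quoted setup: classifier i sits at layer k*i + 3,
-- and the learning rate (assumed to start at 0.1 and decay by 10x,
-- as in the CIFAR scheme) drops after epochs 30 and 60 over 90 epochs.
local k = 4
for i = 1, 5 do
   print(string.format('classifier %d -> layer %d', i, k * i + 3))
end

local function learningRate(epoch)
   local lr = 0.1                      -- assumed initial learning rate
   if epoch > 60 then lr = lr * 0.01
   elseif epoch > 30 then lr = lr * 0.1 end
   return lr
end
print(learningRate(1), learningRate(31), learningRate(61))  -- 0.1  0.01  0.001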

Am I missing something in the training parameters for ImageNet?

Hi @gaohuang,

I was wondering whether you have any ideas about what could have gone wrong in reproducing the paper's results.
Is there a way to double-check the hyper-parameters used for the distributed pretrained model? Could the gap be a consequence of training on multiple GPUs?
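For example, I tried probing the released file along these lines to at least recover its architecture (my own check, not a repo script; it assumes the .t7 stores the full nn model, which seems to be the case given that -testOnly loads it directly):

-- Load the released checkpoint and dump its module tree, so the
-- block/step structure can be compared against my own trained network.
-- The training hyper-parameters themselves may not be stored in the file.
require 'torch'
require 'nn'
require 'cunn'    -- the released model may be CUDA-typed

local model = torch.load('msdnet--step=4--block=5--growthRate=16.t7')
print(torch.type(model))
print(model)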

I would greatly appreciate your reply.