cifar10 baseline performance
bobye opened this issue · 10 comments
Hi,
I was trying to reproduce the baseline performance of ConvNet (reported in your NIPS paper). I used the default setting (cifar10_full_multistep_solver.prototxt) provided in the directory, but only got around 79–80% accuracy. Can you give me some advice on how to recover the extra 2 percentage points on the test data?
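For reference, this is how I launched training (a sketch; paths assume the usual Caffe layout and may need adjusting):

```bash
# From the Caffe root, after building and preparing the lmdb data
./build/tools/caffe train \
    --solver=examples/cifar10/cifar10_full_multistep_solver.prototxt
```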
I also tried to train the SSL model from scratch (without initializing from a baseline model) and still got 79–80% accuracy.
Jianbo
@wenwei202 Hi, which script should I run to prepare the data: create_cifar10.sh or create_padded_cifar10.sh? I used the former to create the lmdb.
create_cifar10.sh is good.
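Roughly, the standard Caffe data preparation (run from the Caffe root):

```bash
# Download the CIFAR-10 binaries, then convert them to lmdb
./data/cifar10/get_cifar10.sh
./examples/cifar10/create_cifar10.sh
```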
@wenwei202 Should I use a very small learning rate in the later phase of training? I observed that the test accuracy did not improve for 200000 steps with cifar10_full_multistep_solver.prototxt.
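For context, the multistep policy drops the learning rate at fixed iterations; the schedule has roughly this shape (values here are illustrative, not necessarily the exact ones in the file):

```protobuf
# Illustrative multistep schedule: lr is multiplied by gamma at each stepvalue
base_lr: 0.001
lr_policy: "multistep"
gamma: 0.1
stepvalue: 100000
stepvalue: 150000
max_iter: 200000
```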
@wenwei202 Thanks in advance. In your paper "Learning Structured Sparsity in Deep Neural Networks", you conduct the CIFAR-10 experiment with "ConvNet" in Caffe; is it the same "cifar10_full" net architecture as in Caffe?
@upupnoel No, it's different; we added a dropout layer. @bobye Did you finally duplicate the results? The cifar10 results should be easy to duplicate.
@wenwei202 The dropout is added after the last layer, but the last layer only has 10 neurons, which seems a little strange. Does this boost your performance in practice?
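In prototxt terms, I read the structure as something like this (a sketch based on the standard cifar10_full layer names; not copied from your repo):

```protobuf
# Final 10-way classifier followed by dropout, as I understand the description
layer {
  name: "ip1"
  type: "InnerProduct"
  bottom: "pool3"
  top: "ip1"
  inner_product_param { num_output: 10 }
}
layer {
  name: "drop1"
  type: "Dropout"
  bottom: "ip1"
  top: "ip1"
  dropout_param { dropout_ratio: 0.5 }
}
```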
@wenwei202 Not yet :< Can you give a standalone script that reproduces the result on your machine? I tried cifar10_full_multistep_solver.prototxt but have not waited for another 100000 steps. It has also been helpful that you wrote some documentation on the usage.
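In case it helps others: rather than restarting, I resume interrupted runs from the solver state (the snapshot filename below is hypothetical; it depends on snapshot_prefix in the solver):

```bash
# Resume training from a saved solver state
./build/tools/caffe train \
    --solver=examples/cifar10/cifar10_full_multistep_solver.prototxt \
    --snapshot=examples/cifar10/cifar10_full_iter_100000.solverstate
```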
@bobye I checked those scripts and found that the net and solver were both for SSL. Please use cifar10_full_baseline_multistep_solver.prototxt to duplicate the baseline; it is the same as the one in https://github.com/BVLC/caffe/ but with the number of iterations prolonged a bit.
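That is, something like (adjust paths to your checkout):

```bash
# Train the baseline from scratch with the baseline solver
./build/tools/caffe train \
    --solver=examples/cifar10/cifar10_full_baseline_multistep_solver.prototxt
```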
@upupnoel The dropout helped for SSL, but it didn't seem to help the baseline much.
@wenwei202 Thx, I will try them out!