CNN
carpedm20 opened this issue · 12 comments
I've been working on an implementation of the CNN portion of this paper, and I may be able to help w/ the CNN model and cell searches if you're interested.
One issue I don't think they address in the paper is how they're handling spatial dimensions -- do you have any thoughts on this? I'm guessing they pad s.t. the input and output of each layer is the same?
Sure! I'm taking a break for now, so any help would be appreciated. Yes, I guess they used padding to keep the dimensions consistent, like:
pool = nn.MaxPool2d(kernel_size=3, stride=1, padding=1)
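For reference, a minimal sketch of the size arithmetic behind that line. It uses the standard output-size formula for convolution and pooling layers with dilation 1; the helper names `out_size` and `same_padding` are made up for illustration and are not part of PyTorch or this repo.

```python
def out_size(in_size, kernel_size, stride=1, padding=0):
    """Output length along one spatial axis for a conv/pool layer (dilation 1)."""
    return (in_size + 2 * padding - kernel_size) // stride + 1


def same_padding(kernel_size):
    """Padding that preserves the size for stride 1 and an odd kernel size."""
    return (kernel_size - 1) // 2


size = 32  # CIFAR-10 images are 32x32

# The pooling layer above: kernel 3, stride 1, padding 1 keeps the size.
pooled = out_size(size, kernel_size=3, stride=1, padding=1)

# A 5x5 convolution needs padding 2 to do the same.
conv5 = out_size(size, kernel_size=5, stride=1, padding=same_padding(5))

# Without padding, a 3x3 kernel shrinks each axis by 2.
unpadded = out_size(size, kernel_size=3, stride=1, padding=0)

print(pooled, conv5, unpadded)
```

With stride 1 and an odd kernel size k, padding (k - 1) // 2 leaves height and width unchanged, so the outputs of different ops in a cell can be added or concatenated without resizing.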
Great -- I'll fork and do some work over the weekend. By the way -- how long does the RNN experiment run for, and what's the final PPL you're getting? Is it similar to what they report in the paper?
40 epochs (train PPL=56) took 6 hours on a 980 GPU, so 150 epochs should take about 22 hours. I haven't run it to the end yet, and I think the scale of the reward and loss might need some changes.
I've implemented some of the micro-CNN search space, though in a different project that's not totally compatible with this one. I'm going to clean it up over the next couple of days and I'll post a link here when it'd be reasonable for other people to take a look at it.
I'm currently having trouble reproducing the results from the paper -- the ENAS CNN training seems very unstable. I need to do some further experiments to understand how weight sharing affects the convergence of the individual architectures.
@bkj Did you manage to reproduce the results? I also implemented it from scratch but am only getting around 82% accuracy.
No -- I have not been able to reproduce the results. I moved from using an RL controller to something simpler (random search, basically) and have trained models w/ ENAS-style parameter sharing to 92% test accuracy, while my baseline preactivation ResNet18 gets > 93% when trained w/ the same settings.
~ Ben
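To make "random search with ENAS-style parameter sharing" concrete, here is a toy sketch of the idea. It is not taken from ripenet or from this repo: the op list, the node count, and the names `shared_params`, `sample_architecture`, and `train_step` are all hypothetical, and the "weights" are plain floats rather than tensors. It also ignores the choice of input connections that a real cell search space would include.

```python
import random

OPS = ["conv3x3", "conv5x5", "maxpool3x3", "identity"]  # hypothetical op set
NUM_NODES = 4                                           # hypothetical cell size

# One shared slot per (node, op) pair. Every sampled architecture reuses
# these slots instead of allocating its own weights.
shared_params = {(node, op): 0.0 for node in range(NUM_NODES) for op in OPS}
update_counts = {key: 0 for key in shared_params}


def sample_architecture(rng):
    """Pick one op per node uniformly at random (stands in for an RL controller)."""
    return [rng.choice(OPS) for _ in range(NUM_NODES)]


def train_step(arch, lr=0.1):
    """Toy update: touch only the shared slots that this architecture uses."""
    for node, op in enumerate(arch):
        shared_params[(node, op)] += lr
        update_counts[(node, op)] += 1


rng = random.Random(0)
for _ in range(100):
    train_step(sample_architecture(rng))

total_updates = sum(update_counts.values())
print(len(shared_params), total_updates)
```

Each step trains one sampled architecture, but the update lands in slots that other architectures will reuse later. That coupling between architectures is what the convergence question earlier in the thread is about.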
Thanks! I am doing the same but getting ~82%. Have you open-sourced your code (or can you please share your code)?
Yes it's here -- https://github.com/bkj/ripenet
No documentation yet; open an issue if you have questions.
~ Ben
@carpedm20 @bkj @karandwivedi42 @dukebw Hi, can you run this code successfully? When I run it with:
python main.py --network_type cnn --dataset cifar10 --controller_optim momentum --controller_lr_cosine=True --controller_lr_max 0.05 --controller_lr_min 0.0001 --entropy_coeff 0.1
I get some errors. What I want to do is find CNN architectures and visualize them. Could you please tell me what changes I should make to the code before running it? Thanks for your reply.
@karandwivedi42 Thank you. I have successfully run the code linked in the README of this repo. What I want to do now is visualize the CNN architectures found by the search.