Question about arbitrary width in a universally slimmable network.
sseung0703 opened this issue · 8 comments
Hi, thank you for your great work. :)
I have a question about arbitrary width in a universally slimmable network.
In your paper, you mention that you sample a random width ratio for each sub-network, which is an improvement over your previous work because the ratio is not discrete. However, I think the universally slimmable network is still discrete, just with a fine step (0.025).
Can you explain why you didn't use a continuous random ratio?
It is continuous. Please check the code:
Line 414 in 4bb2a62
Don't go by the YAML files; they are used for inference only, i.e., we show a finite set of sub-networks at a step of 0.025.
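To make the distinction concrete, here is a minimal sketch (not the repository's code) contrasting a continuous width draw during training with a fixed 0.025 grid for evaluation. The function names and the range [0.25, 1.0] are assumptions chosen for illustration; only the 0.025 step comes from this thread.

```python
import random


def sample_training_width(min_width=0.25, max_width=1.0, rng=random):
    """Draw one width ratio from a continuous uniform distribution.

    min_width and max_width are assumed values, not taken from the repo.
    """
    return rng.uniform(min_width, max_width)


def inference_width_grid(min_width=0.25, max_width=1.0, step=0.025):
    """List the discrete width ratios that would be evaluated at inference."""
    num_steps = round((max_width - min_width) / step)
    # Round each entry to avoid floating-point drift such as 0.30000000000000004.
    return [round(min_width + i * step, 3) for i in range(num_steps + 1)]


if __name__ == "__main__":
    rng = random.Random(0)
    print([sample_training_width(rng=rng) for _ in range(3)])  # continuous values
    print(inference_width_grid()[:4])  # grid values spaced 0.025 apart
```

The training draw can land anywhere in the interval, while the grid only determines which sub-networks get reported.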
Thank you for the rapid reply. I missed that line. Thanks. :)
No problem! Replies on my GitHub issues usually take longer these days, since I graduated and started working full-time. I was just cleaning up the open issues. :)
That's good news for me :D. I'm currently struggling to implement your work in TF, so your replies will be very helpful.
I have another question about arbitrary width. As I understand it, you use a single width ratio for each sub-network during training, because the default value of FLAGS.nonuniform is False.
Did you ever try using a fully arbitrary width ratio for each layer?
Implementing this in TF will take a lot of time, as TF uses a static graph.
I have already implemented AutoSlim in TF2, but I want to make sure the whole configuration is right. For MobileNetV2 on CIFAR-10, the overall training time is less than 4 hours on a single GPU, which does not seem too heavy.
If you want to visit my repository, you can find it at the link below :).
https://github.com/sseung0703/Autoslim_TF2
By the way, would you kindly reply to my above question?
@sseung0703 nonuniform is true for AutoSlim. See branch v3.0.0.
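As a rough illustration of the two modes discussed here (a sketch under assumptions, not the code on branch v3.0.0), the difference between a single shared ratio and a per-layer ratio can be written as below. The `nonuniform` argument mirrors the flag named in this thread; `sample_layer_widths`, `channels_for`, and the range [0.25, 1.0] are hypothetical.

```python
import random


def sample_layer_widths(num_layers, min_width=0.25, max_width=1.0,
                        nonuniform=False, rng=random):
    """Return one width ratio per layer.

    nonuniform=False: every layer shares the same sampled ratio.
    nonuniform=True: each layer draws its own ratio independently.
    """
    if nonuniform:
        return [rng.uniform(min_width, max_width) for _ in range(num_layers)]
    ratio = rng.uniform(min_width, max_width)
    return [ratio] * num_layers


def channels_for(base_channels, ratios):
    """Scale each layer's channel count by its ratio, keeping at least 1 channel.

    Rounding to a hardware-friendly multiple is omitted here for brevity.
    """
    return [max(1, int(c * r)) for c, r in zip(base_channels, ratios)]


if __name__ == "__main__":
    rng = random.Random(0)
    base = [32, 64, 128]  # assumed full-width channel counts
    print(channels_for(base, sample_layer_widths(3, rng=rng)))
    print(channels_for(base, sample_layer_widths(3, nonuniform=True, rng=rng)))
```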
Thanks, I found what you said :).