JiahuiYu/slimmable_networks

Question about arbitrary width in a universally slimmable network.

sseung0703 opened this issue · 8 comments

Hi, thank you for your great work. :)
I have a question about arbitrary width in a universally slimmable network.
In your paper, you mention that you sample a random width ratio for each sub-network, which is an advantage over your previous works because the width is not discrete. However, I think the universally slimmable network still uses discrete widths with a fine step (0.025).

Can you explain why you didn't use a continuous random ratio?

It is continuous. Please check the code:

```python
random.uniform(min_width, max_width)
```

Don't look at the yaml files; they are only used for inference, i.e., we show some sub-networks with a finite step of 0.025.
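To illustrate, here is a minimal sketch of the sandwich-rule sampling described in the paper (the bounds and the number of random widths below are illustrative, not taken from the repo):

```python
import random

min_width, max_width = 0.25, 1.0  # illustrative bounds
n_random = 2  # extra random widths per step; value is illustrative

# Sandwich rule: always train the smallest and largest sub-networks,
# plus a few widths sampled *continuously* from [min_width, max_width].
widths = [min_width, max_width]
widths += [random.uniform(min_width, max_width) for _ in range(n_random)]
# The 0.025-step widths in the yaml files are only the sub-networks
# evaluated at inference time; training is not restricted to them.
```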

Thank you for the quick reply. I missed that line. :)

No problem! It usually takes me longer and longer to reply to my GitHub issues since I graduated and started working full-time. I was just trying to clean up the issues. :)

That's good news for me :D. I'm currently struggling to implement your work in TF, so your replies are very helpful.

I have another question about arbitrary width. In my understanding, you use only one width ratio for the whole sub-network during training, because the default value of FLAGS.nonuniform is False.
Have you ever tried using a fully arbitrary width ratio for each layer?

Implementing this with TF will cost a lot of time, as TF uses a static graph.

I already implemented AutoSlim with TF2, but I just want to make sure all of my configuration is right. For MobileNetV2 on CIFAR-10, the overall training time is less than 4 hours on a single GPU, which doesn't look too heavy.
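TF2 makes this much easier thanks to eager execution: a layer can simply slice its full-width kernel at call time. A minimal sketch of the idea (a hypothetical layer, not code from either repository):

```python
import tensorflow as tf

class SlimmableConv2D(tf.keras.layers.Layer):
    # Hypothetical sketch of a width-adjustable conv layer: it stores a
    # full-width kernel and slices it to the active width at call time,
    # which is straightforward under TF2's eager execution.
    def __init__(self, max_filters, kernel_size, **kwargs):
        super().__init__(**kwargs)
        self.max_filters = max_filters
        self.kernel_size = kernel_size

    def build(self, input_shape):
        self.kernel = self.add_weight(
            name="kernel",
            shape=(self.kernel_size, self.kernel_size,
                   int(input_shape[-1]), self.max_filters),
            initializer="glorot_uniform",
        )

    def call(self, x, width_ratio=1.0):
        # Keep the first round(width_ratio * max_filters) output channels;
        # input channels follow whatever the previous layer produced.
        out_ch = max(1, int(round(self.max_filters * width_ratio)))
        kernel = self.kernel[:, :, :x.shape[-1], :out_ch]
        return tf.nn.conv2d(x, kernel, strides=1, padding="SAME")
```

A call like layer(x, width_ratio=0.5) then runs the layer at half width.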

If you want to visit my repository, you can find it at the link below :).
https://github.com/sseung0703/Autoslim_TF2

By the way, would you kindly reply to my question above?

@sseung0703 nonuniform is true for AutoSlim. See branch v3.0.0.
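In other words, the distinction is roughly the following (a sketch with illustrative values; FLAGS.nonuniform is the actual flag in the repo):

```python
import random

min_width, max_width = 0.25, 1.0
num_layers = 20  # illustrative

# FLAGS.nonuniform == False (universally slimmable training):
# one ratio shared by every layer of the sampled sub-network.
shared = random.uniform(min_width, max_width)
uniform_widths = [shared] * num_layers

# FLAGS.nonuniform == True (the AutoSlim branch, v3.0.0):
# an independent ratio drawn for each layer.
per_layer_widths = [random.uniform(min_width, max_width)
                    for _ in range(num_layers)]
```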

Thanks, I found it :).