How to select the subset of input channels in each head?

Question

How to select the subset of input channels in each head?

Opened this issue 4 years ago · 1 comment

Thanks for your great work! According to the paper, the input channels are first re-weighted by the SE block, and then the top-k subset is selected and passed to a normal convolution to produce the output of each head. However, your code passes all of the re-weighted input channels to the normal convolution, whose weight has shape (C_out // num_heads, C_in, k, k). In that case, neither the amount of computation nor the number of parameters decreases, and I can't find where the selection happens. Could you please explain?

Answer 1 · 2020-12-02T18:57:52.000Z

Yes, all the weights are passed to the normal convolution, but some of them are set to 0. Please see https://github.com/zhuogege1943/dgc/blob/ba074863dc289f5875202288aa286ca22b94e15b/layers.py#L123
The zeroed weights do not contribute to the output. This is done only for convenience during training.
At test time, the filters with 0 weights can be pruned without affecting the results.
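
To illustrate why masking and pruning give the same result, here is a minimal sketch in plain Python. It is not the repository's code: it assumes the zeros come from a top-k mask over per-channel saliency scores (as the question describes), it uses a 1x1 convolution evaluated at a single pixel to keep the arithmetic short, and the names `topk_mask` and `pointwise_conv` are hypothetical helpers made up for this example.

```python
import random

def topk_mask(saliency, k):
    """Return a 0/1 mask that keeps the k channels with the largest saliency."""
    order = sorted(range(len(saliency)), key=lambda c: saliency[c], reverse=True)
    mask = [0.0] * len(saliency)
    for c in order[:k]:
        mask[c] = 1.0
    return mask

def pointwise_conv(x, weight):
    """1x1 convolution at one pixel: x is [C_in], weight is [C_out][C_in]."""
    return [sum(w * v for w, v in zip(row, x)) for row in weight]

random.seed(0)
c_in, c_out, k = 8, 4, 3  # illustrative sizes, not taken from the repository

x = [random.uniform(-1, 1) for _ in range(c_in)]   # input features at one pixel
saliency = [random.random() for _ in range(c_in)]  # stand-in for the SE block output
weight = [[random.uniform(-1, 1) for _ in range(c_in)] for _ in range(c_out)]

mask = topk_mask(saliency, k)

# Training-style path: re-weight every channel, zero the unselected ones,
# then run the dense convolution over all C_in channels.
x_masked = [v * s * m for v, s, m in zip(x, saliency, mask)]
dense_out = pointwise_conv(x_masked, weight)

# Test-style path: keep only the selected channels and the matching
# weight columns, so the convolution touches k channels instead of C_in.
kept = [c for c in range(c_in) if mask[c] == 1.0]
x_small = [x[c] * saliency[c] for c in kept]
w_small = [[row[c] for c in kept] for row in weight]
pruned_out = pointwise_conv(x_small, w_small)

max_diff = max(abs(a - b) for a, b in zip(dense_out, pruned_out))
print("weights per filter, dense vs pruned:", c_in, "vs", len(w_small[0]))
print("max output difference:", max_diff)
```

The dense path is easier to batch and differentiate, which is presumably what "convenient for training" refers to; the pruned path is where the savings in computation and parameters actually show up.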