Official Review
micronet-challenge-submissions opened this issue · 11 comments
Hi! Thanks for the updates!
Our only outstanding question is about the counting of the mask overhead for sparse weight matrices (1 bit per parameter, including zero-valued parameters). Unless I'm missing something, it doesn't look like this is taken into account in your counting script.
Thanks!
Trevor
Thanks for your feedback. It helped us find and fix our mistake.
As you can see in the revised Score_MicroNet.ipynb, we have changed the scoring method, and we believe this resolves the overhead issue you mentioned.
The main counting logic is in the 'Counting' directory, and 'count_hooks.py' is the main file. The issue you raised may be related to the conv counting:
def count_convNd(m, x, y):
    # m: conv module, x: input tuple, y: output tensor
    x = x[0]
    # multiplications per output element: spatial kernel size times input channels per group
    kernel_ops = m.weight.size()[2:].numel() * m.in_channels // m.groups
    # always count one bias add: conv has no bias here, but the batchnorm bias is folded into the conv term
    bias_ops = 1
    # non_sparsity(w) is the fraction of non-zero weights in w
    total_add_ops = y.nelement() * (kernel_ops * non_sparsity(m.weight) - 1) + y.nelement() * bias_ops
    total_mul_ops = y.nelement() * kernel_ops * non_sparsity(m.weight)
    # non-zero weights plus one (unpruned) bias per output channel
    total_params = m.weight.numel() * non_sparsity(m.weight) + m.weight.shape[0]
    m.total_add_ops += torch.Tensor([total_add_ops])
    m.total_mul_ops += torch.Tensor([total_mul_ops])
    m.total_params += torch.Tensor([total_params])
The only place overhead could occur is the bias operation. We do not use a bias in the conv layers, only in batchnorm, so we fold the bias count into the conv term: the 'y.nelement() * bias_ops' and 'm.weight.shape[0]' terms above account for it.
We do not apply sparsity to this bias part. In detail, during the pruning process we did not prune the biases (including their 1-bit parameters), so we believe there is no sparsity in the 1-bit parameter terms.
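For reference, non_sparsity is not shown above; a minimal sketch, assuming it simply returns the fraction of weights that survived pruning (non-zero):

def non_sparsity(weight):
    # fraction of non-zero entries in a (pruned) weight tensor
    return float((weight != 0).sum().item()) / weight.numel()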
Thanks,
Taehyeon Kim
Thanks for the quick reply.
One point of confusion: we trained the network parameters in FP32. Because all parameters have the same precision, we had decided a bitmask was not needed here.
For the freebie quantization, we apply it in the Jupyter notebook file.
Do you mean we should also apply a bitmask in the counting file even though this is a freebie?
We are not sure we understand this correctly, but we have uploaded a new Jupyter notebook file for scoring.
In this code, we add the following term:
def bitmask(net):
    # count weight tensors only (skip 1-D tensors: biases and batchnorm params),
    # charging 1 bit per parameter, expressed in 32-bit parameter equivalents
    num = 0
    for module in net.parameters():
        if module.ndimension() != 1:
            num += module.numel()
    return num / 32
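As a quick sanity check of the arithmetic: a hypothetical 32x16x3x3 conv weight (4,608 entries) would contribute 4608/32 = 144 parameter-equivalents to the count.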
This function computes the bitmask overhead, and the score function becomes:
def micro_score(net, precision='Freebie'):
    input = torch.randn(1, 3, 32, 32).to(net.device)
    addflops, multflops, params = count(net, inputs=(input,))
    # freebie quantization: 16-bit storage and multiplies count at half cost
    if precision == 'Freebie':
        multflops = multflops / 2
        params = params / 2
    # add the 1-bit-per-parameter mask overhead
    params += bitmask(net)
    # normalize by the challenge baseline: 36.5M parameters and 10.49B math ops
    score = params / 36500000 + (addflops + multflops) / 10490000000
    print('Score: {}, flops: {}, params: {}'.format(score, addflops + multflops, params))
    return score
With the bitmask included, the score function changes as shown above.
The new score is 0.0054.
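A minimal usage sketch (assuming `net` is the trained model loaded in the notebook, exposing the `.device` attribute that micro_score expects):

score = micro_score(net, precision='Freebie')
# prints e.g. 'Score: 0.0054, flops: ..., params: ...'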
Looks good! Thanks for the fix! Two quick questions:
- Do you still want to submit your "ver2" model? I ran it and checked the score in your revised colab and got 0.0056, which is an excellent score.
- When I run your updated colab I get an error passing "expansion = 3" to the MicroNet class. When I remove this, everything appears to work fine. Just want to make sure this isn't important.
Thanks!
Trevor
Also, what name would you like your entries posted under when the results are revealed?
Trevor
Thanks for the reply.
First, if the ver2 network can also be accepted, we would like to submit it. However, ver1 has the better score, so if only one of ver1 and ver2 may be submitted, we will submit ver1.
Otherwise, we would like to submit both.
Second, the expansion argument isn't important. Sorry for the confusion.
Do you mean the team name?
Our team name is 'KAIST AI', and we would prefer the results to be posted under that name.
If you don't mind, could you give an approximate current ranking for CIFAR-100?
Taehyeon Kim