YuxianMeng/Matrix-Capsules-pytorch

there may be some mistakes in your code

Closed this issue · 1 comment

  1. For the loss function in 'class CapsNet' (train.py), you use mask = u.ge(0).float() to implement the max operation. After checking this function (tensor.ge), I think we should not use it: we want to preserve the values that are >= 0 and set the values that are < 0 to 0, but tensor.ge turns every value that is >= 0 into 1. I think this is a mistake.
  2. loss = ((mask*u)**2).sum()/b - m**2  # float
     For this code, why do we mask u, and why do we subtract m**2? I could not find an explanation of this in equation (5) of the paper.

Thanks for your kind help!

So, addressing the second question: the point of dividing by b and subtracting m**2 is to normalize the loss. You can remove that part and the code will still run fine, but you'll find your loss taking much larger values. As for the first point, we use the tensor.ge() operation because when we multiply our value u by the mask, it in effect computes equation (3) in the paper Matrix Capsules With EM Routing.
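
To make that concrete, here is a minimal sketch (the variable shapes and the assumption about what u holds are mine for illustration, not the repo's exact code) of how multiplying u by u.ge(0).float() reproduces the max(0, ·) term in the spread loss:

```python
import torch

# Hypothetical margin terms; in the spread loss of eq. (3) these would be
# m - (a_t - a_i) for the non-target classes (shapes chosen only for illustration).
u = torch.randn(4, 9)
b = u.size(0)  # batch size

mask = u.ge(0).float()   # 1.0 where u >= 0, 0.0 where u < 0
masked = mask * u        # keeps the values >= 0, zeroes out the negatives, i.e. max(0, u)

# The mask trick and a plain clamp/ReLU give the same result.
assert torch.allclose(masked, torch.clamp(u, min=0))

# Average the squared margins over the batch, as in the repo's loss line
# (the extra "- m**2" normalization term is omitted here).
loss = (masked ** 2).sum() / b
```

In other words, the mask by itself is only an indicator; it is the subsequent multiplication by u that turns it into the max operation, which is why the ge() call is not a bug.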