The annealing optimization strategy for A-Softmax loss
taey16 commented
Thanks for your nice repo.
I'm trying out your code.
My question is about the annealing optimization strategy for the A-Softmax loss, which the paper implements by introducing a lambda parameter.
Here, your implementation is:
```python
self.lamb = max(self.LambdaMin, self.LambdaMax / (1 + 0.1 * self.it))
output = cos_theta * 1.0
output[index] -= cos_theta[index] * (1.0 + 0) / (1 + self.lamb)
output[index] += phi_theta[index] * (1.0 + 0) / (1 + self.lamb)
```
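If I expand these in-place updates, the target-class logits become `(self.lamb * cos_theta[index] + phi_theta[index]) / (1 + self.lamb)`, while the non-target logits stay plain `cos_theta`; `self.lamb` itself decays from `LambdaMax` toward `LambdaMin` as `self.it` grows.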
But I think the cos term should be scaled by a factor of lambda, such that:
```python
output = cos_theta * self.lamb
output[index] -= cos_theta[index] * self.lamb / (1 + self.lamb)
output[index] += phi_theta[index] * 1.0 / (1 + self.lamb)
```
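For comparison, here is a minimal, self-contained sketch that evaluates both variants on dummy values (the toy tensors, the target index, and the lambda value are made up for illustration, not taken from the repo):

```python
# Minimal sketch comparing the two variants on toy values; the tensors,
# target class, and lambda below are made-up examples, not repo code.
import torch

cos_theta = torch.tensor([[0.3, 0.8, -0.2]])   # cos(theta_j) for 3 classes
phi_theta = torch.tensor([[0.3, -0.5, -0.2]])  # psi(theta_j) for 3 classes
index = torch.zeros_like(cos_theta, dtype=torch.bool)
index[0, 1] = True                             # class 1 is the target
lamb = 5.0

# Repo version: non-target logits stay cos(theta);
# target logit = (lamb * cos + phi) / (1 + lamb)
out_repo = cos_theta * 1.0
out_repo[index] -= cos_theta[index] * 1.0 / (1 + lamb)
out_repo[index] += phi_theta[index] * 1.0 / (1 + lamb)

# Proposed version: non-target logits become lamb * cos(theta);
# target logit works out to (lamb**2 * cos + phi) / (1 + lamb)
out_prop = cos_theta * lamb
out_prop[index] -= cos_theta[index] * lamb / (1 + lamb)
out_prop[index] += phi_theta[index] * 1.0 / (1 + lamb)

print(out_repo)  # tensor([[ 0.3000,  0.5833, -0.2000]])
print(out_prop)  # tensor([[ 1.5000,  3.2500, -1.0000]])
```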
Please share your thoughts.
Thanks!