Jianlong-Fu/Recurrent-Attention-CNN

The effect of softmax loss and rank loss

Opened this issue · 21 comments

When training the APN layers, it seems that the rank loss tries to force the softmax losses of the three scales into descending order and to enlarge the gaps between them. On the contrary, when training only the convolutional/classification layers with the sum of softmax losses, the softmax loss of every scale tends toward the same value, which means the gaps between them are narrowed.

Is this reasonable for training? Although each training stage only updates its corresponding parameters, I still doubt whether the two stages cancel each other out.
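For reference, the pairwise ranking loss in the RA-CNN paper is L_rank(s) = max(0, p_t(s) - p_t(s+1) + margin), where p_t(s) is the predicted probability of the ground-truth class t at scale s. A minimal sketch of the two competing objectives (function and variable names are mine):

```python
import numpy as np

def softmax_sum_loss(probs_per_scale, label):
    # Sum of the three per-scale cross-entropy losses; minimizing it
    # pulls every scale toward an equally low loss on the true class.
    return sum(-np.log(p[label]) for p in probs_per_scale)

def rank_loss(probs_per_scale, label, margin=0.05):
    # Pairwise hinge between adjacent scales: the finer scale (s+1) is
    # pushed to predict the true class with a HIGHER probability than
    # the coarser scale s, i.e. it widens the inter-scale gap.
    return sum(max(0.0, coarse[label] - fine[label] + margin)
               for coarse, fine in zip(probs_per_scale, probs_per_scale[1:]))
```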

The source code does not include the rank loss. Can you tell me where I can get it, or can you send it to me? Thanks a lot.

@zanghao2 Here is a simple implementation. Hope it helps. https://gist.github.com/QQQYang/e535f336813b44d72d3b1d6184bf4586
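If you would rather not compile a C++ layer, a rough PyCaffe sketch of the same idea might look like the following. The (N, 3*C) blob layout, the class name, and the fixed margin are my assumptions, not necessarily what the gist above does:

```python
import caffe
import numpy as np

class RankLossLayer(caffe.Layer):
    """Pairwise rank loss over three scales.
    bottom[0]: (N, 3*C) concatenated per-scale softmax probabilities
    bottom[1]: (N,) ground-truth class labels
    """
    def setup(self, bottom, top):
        self.margin = 0.05  # assumed fixed margin

    def reshape(self, bottom, top):
        top[0].reshape(1)   # scalar loss

    def forward(self, bottom, top):
        probs = bottom[0].data
        labels = bottom[1].data.astype(int).ravel()
        n, dim = probs.shape
        c = dim // 3
        self.diff = np.zeros_like(probs)
        loss = 0.0
        for i in range(n):
            t = labels[i]
            for j in range(2):  # adjacent scale pairs (0,1) and (1,2)
                gap = probs[i, j * c + t] - probs[i, (j + 1) * c + t] + self.margin
                if gap > 0:     # hinge active: accumulate loss and gradient
                    loss += gap
                    self.diff[i, j * c + t] += 1.0
                    self.diff[i, (j + 1) * c + t] -= 1.0
        top[0].data[0] = loss / n

    def backward(self, top, propagate_down, bottom):
        if propagate_down[0]:
            bottom[0].diff[...] = top[0].diff[0] * self.diff / bottom[0].shape[0]
```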

@QQQYang Hello, I want to retrain the project, but I am still a beginner. Would you be willing to help me by sharing your train.prototxt? Thank you very much!

@chenfeima This is my train_cnn.prototxt. But I have not achieved good performance on my own dataset, so it may need some fixes. If you find any errors in the prototxt file, please let me know. Thank you.
https://gist.github.com/QQQYang/3b8b564554c02fc55325dc026747bdb6

@QQQYang Thank you very much! Is the RankLoss the one at https://gist.github.com/QQQYang/e535f336813b44d72d3b1d6184bf4586 ? If not, I also need your RankLoss. My own train.prototxt and rank loss perform very badly; I only get 77% accuracy on CUB-200 with scale 1 + scale 2.

@chenfeima I have updated the prototxt file to keep it consistent with the RankLoss above. You can check the train.prototxt again.

@QQQYang Thank you very much!

@QQQYang Here is what I have done: 1. Fix (freeze) the APN layers and optimize the conv/fc layers with softmax loss. 2. Fix the conv/fc layers and optimize the APN with your RankLoss. 3. Fix the APN layers again and optimize the conv/fc layers with softmax loss. I only get a 0.8% accuracy improvement at scale 2. Is my strategy wrong? What strategy and results did you get? Or is the RankLoss not quite right?

@chenfeima My strategy is the same as yours. I did not test on public datasets, but I got poor performance on my own dataset. The RankLoss was written according to the original paper and passed the gradient check, but there may still be something wrong with it. I have not debugged this project for a while.

@QQQYang Do I need to compile the AttentionCrop layer and the rank loss into my own Caffe first, before I can use the train.prototxt?

@cocowf Yes, you have to compile them into Caffe first (on Linux). After that, feel free to use the train.prototxt.
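In case it helps: for stock BVLC Caffe, the usual recipe for adding a custom C++ layer is to copy the .hpp into include/caffe/layers/, the .cpp/.cu into src/caffe/layers/, add any new layer parameters to src/caffe/proto/caffe.proto, and then rebuild Caffe (and pycaffe if you use the Python interface). That is the generic procedure for custom layers, not anything specific to this repo.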

@QQQYang Hello! How do you adjust the parameters (ek, margins, learning rate) when optimizing the APN with RankLoss? And when do you stop optimizing the APN with RankLoss and switch to optimizing scale 2 with softmax loss?

@chenfeima I did not spend much time adjusting hyperparameters, so I cannot give any advice there. What I did was prepare two train.prototxt files, each with its own learning rate. In each prototxt file I adopted parameters and strategies similar to traditional networks, such as learning-rate decay and a fixed margin. When training the whole network, the two files are used alternately.
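A minimal sketch of that alternation with PyCaffe solvers, assuming the softmax-loss prototxt freezes the APN via lr_mult: 0 and the RankLoss prototxt freezes the conv/fc layers. The solver file names and step counts below are placeholders:

```python
import caffe

caffe.set_mode_gpu()

# Hypothetical solver files, one per train.prototxt.
cls_solver = caffe.SGDSolver('solver_cls.prototxt')  # softmax losses only
apn_solver = caffe.SGDSolver('solver_apn.prototxt')  # RankLoss only

for stage in range(5):                        # alternate until both losses converge
    cls_solver.step(10000)                    # update conv/fc, APN frozen
    cls_solver.net.save('swap.caffemodel')    # hand the weights over
    apn_solver.net.copy_from('swap.caffemodel')
    apn_solver.step(10000)                    # update APN, conv/fc frozen
    apn_solver.net.save('swap.caffemodel')
    cls_solver.net.copy_from('swap.caffemodel')
```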

@QQQYang When I train on my own data, the rank loss keeps increasing, while loss1/2/3 and the accuracy just fluctuate around fixed values. I would like to know your learning rate and how you change the margin. In addition, would it be convenient for you to leave another contact, such as QQ? My QQ is 597512150.

@QQQYang Hello, I think your RankLoss is not consistent with the original paper. It computes `Pred[label[i] + i*dim + dim/3*j] - Pred[label[i] + i*dim + dim/3*(j+1)]`.
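For context, here is how I read that indexing (with dim = 3*C per sample; the helper name is mine). Please correct me if I have the layout wrong:

```python
def true_class_prob(pred, i, j, label, dim):
    # pred is a flat buffer of length N*dim, where each sample holds the
    # three scales' class probabilities back to back in blocks of dim/3.
    return pred[i * dim + (dim // 3) * j + label[i]]

# The pairwise term in question is then
#   max(0, true_class_prob(..., j) - true_class_prob(..., j + 1) + margin)
# i.e. p_t(j) - p_t(j+1), which matches the paper's hinge only if
# scale j is the coarser of the pair.
```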

@QQQYang
Thanks for your contribution in implementing the RankLoss. Have you reproduced the results from the paper, or did you just train on your own dataset?

@lhCheung1991 I just tested on my own dataset.

@QQQYang
OK. Could you share the alternating-training script for RA-CNN? I would really appreciate it.

@QQQYang I am trying to train the RA-CNN; could you send me the rank_loss? I think the loss at https://gist.github.com/QQQYang/e535f336813b44d72d3b1d6184bf4586 is not correct.

Hello, I have added the rank_loss2_layer you provided to the RA-CNN code released by the original author, but even after training for a long time the loss does not change. Have you solved this problem?

I can't download the source code. Could you send me the source code together with Caffe?