thunlp/EntityDuetNeuralRanking

Add batch norm for a nice boost

Closed this issue · 4 comments

As explained in previous messages, my training stops learning quite rapidly (after seeing a few thousand queries).
Performance is good, but I had the feeling there was some overfitting and that regularization might help.
I tried many tricks, including dropout, as mentioned in a previous message.
Finally, I added batch norm on each CNN -> +8 absolute points on MAP!! (I manually checked the results afterwards.)
Of course I have absolutely no idea how well it generalizes to other datasets; I only saw a small boost on the WikiQA dataset, but the training loss reached 0 during the first epoch, so I think it is slightly too strong a regularizer for such a small dataset.
Maybe you want to try it on the Sogou and Bing logs...
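
For concreteness, here is a minimal PyTorch sketch of what I mean, assuming a Conv-KNRM-style n-gram encoder (the module and parameter names are mine for illustration, not the repo's actual code):

```python
import torch.nn as nn

# Illustrative sketch: a convolutional n-gram encoder with BatchNorm1d
# inserted after the convolution. Names and sizes are assumptions.
class ConvEncoderWithBN(nn.Module):
    def __init__(self, embed_dim=300, num_filters=128, kernel_size=3):
        super().__init__()
        self.conv = nn.Conv1d(embed_dim, num_filters, kernel_size,
                              padding=kernel_size // 2)
        # The added layer: batch norm over the filter (channel) dimension.
        self.bn = nn.BatchNorm1d(num_filters)
        self.act = nn.ReLU()

    def forward(self, x):
        # x: (batch, seq_len, embed_dim); Conv1d expects (batch, channels, seq_len)
        x = x.transpose(1, 2)
        x = self.act(self.bn(self.conv(x)))
        return x.transpose(1, 2)
```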

Two things:

  • in my case, increasing the momentum of batch norm stabilizes the CNN version: the MAP on the dev set increases slowly, reaches its maximum, then decreases very slowly. I set the momentum to 0.5. Since I no longer fear missing the maximum, I increased the evaluation step (fewer measurements), which sped up training.
  • I get much better/more stable results when the embedding is frozen (a sketch of both tweaks follows this list)
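
A rough sketch of both tweaks, with illustrative sizes (note that in PyTorch the batch-norm momentum is the weight given to the new batch statistics, so a higher value means faster-adapting running estimates):

```python
import torch.nn as nn

vocab_size, embed_dim, num_filters = 50000, 300, 128  # illustrative sizes

# 1) Higher batch-norm momentum: PyTorch uses it as the weight of the new
#    batch statistics when updating the running mean/var (default 0.1),
#    so 0.5 makes the running estimates track recent batches more closely.
bn = nn.BatchNorm1d(num_filters, momentum=0.5)

# 2) Frozen embedding: exclude the embedding weights from gradient updates.
embedding = nn.Embedding(vocab_size, embed_dim)
embedding.weight.requires_grad = False
```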

If you try it, don't forget to put the model in eval() mode before evaluation.
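
Something like this (a minimal sketch; `model` and `dev_loader` are placeholders for your own model and dev-set loader):

```python
import torch

def evaluate(model, dev_loader):
    # eval() makes batch norm use its running statistics (and disables
    # dropout); forgetting it means dev scores are computed with
    # per-batch statistics and look erratic.
    model.eval()
    scores = []
    with torch.no_grad():
        for queries, docs in dev_loader:  # hypothetical (query, doc) batches
            scores.append(model(queries, docs))
    model.train()  # restore training mode before resuming optimization
    return torch.cat(scores)
```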


Thank you for your suggestion. I will run some experiments on it. And please keep following our work.

Will you publish your next work live in this repo?

I ran more experiments. I used two other of our website logs, both using relevance measures (BM25 from Elasticsearch); batch normalization didn't help on either. I think it was very specific to the first dataset (lots of noise in the results, very rapid learning). I am closing the issue, as the trick does not seem to generalize.