gngdb opened this issue 7 years ago · 0 comments
Training should use SGD with two learning-rate decay steps, in the traditional way.
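
For reference, here is a minimal sketch of what a two-step decay schedule with plain SGD might look like. The issue does not give any numbers, so the base learning rate, the milestone epochs (100 and 150), and the decay factor of 0.1 are all assumptions chosen for illustration. The names `step_decay_lr` and `sgd_step` are hypothetical helpers, not part of this repo.

```python
def step_decay_lr(epoch, base_lr=0.1, milestones=(100, 150), gamma=0.1):
    """Piecewise-constant learning rate with one drop per milestone.

    All default values are assumptions; the issue does not specify them.
    """
    # Count how many milestones have been reached by this epoch.
    n_decays = sum(1 for m in milestones if epoch >= m)
    # Multiply the base rate by gamma once per milestone reached.
    return base_lr * gamma ** n_decays


def sgd_step(params, grads, lr):
    """Plain SGD update (no momentum): w <- w - lr * g for each parameter."""
    return [w - lr * g for w, g in zip(params, grads)]


if __name__ == "__main__":
    # Toy objective f(w) = w^2, whose gradient is 2w, just to exercise the loop.
    params = [1.0]
    for epoch in range(200):
        lr = step_decay_lr(epoch)
        grads = [2.0 * w for w in params]
        params = sgd_step(params, grads, lr)
    print(params)
```

With these assumed defaults the rate stays at 0.1 until epoch 100, drops to roughly 0.01, and drops again to roughly 0.001 at epoch 150. In a framework such as PyTorch the same schedule would normally come from the built-in multi-step scheduler rather than a hand-written function.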