LindgeW opened this issue 3 years ago · 2 comments
Does the optimizer have to be SGD, or can Adam be used instead? (I am applying L2R to a different task.)
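Another question: line 10 of the algorithm figure takes the gradient of Lg with respect to eps, but the two do not seem to be connected. Wouldn't this raise the error "One of the differentiated Tensors appears to not have been used in the graph"?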
@sailist
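
To make the second question concrete, here is a minimal PyTorch sketch of the situation being asked about. It is a toy written under assumptions, not the repository's code: the linear model, the data, and the names `theta`, `eps`, `l_f`, `l_g`, and `lr` are all hypothetical, and the virtual update is assumed to be a plain SGD step. It shows one way a validation loss can depend on `eps` (through a parameter update built with `create_graph=True`), and the case where no such path exists and the quoted error is raised.

```python
import torch

torch.manual_seed(0)

# Hypothetical toy setup: a single weight vector stands in for the model.
theta = torch.randn(3, requires_grad=True)
x_train, y_train = torch.randn(4, 3), torch.randn(4)
x_val, y_val = torch.randn(2, 3), torch.randn(2)
lr = 0.1  # assumed step size for the virtual SGD update

# Per-example weights, initialised to zero.
eps = torch.zeros(4, requires_grad=True)

# Weighted training loss: this is where eps enters the graph.
per_example = (x_train @ theta - y_train) ** 2
l_f = (eps * per_example).sum()

# create_graph=True keeps the gradient itself differentiable w.r.t. eps.
(grad_theta,) = torch.autograd.grad(l_f, theta, create_graph=True)

# Virtual SGD step: theta_hat depends on eps through grad_theta.
theta_hat = theta - lr * grad_theta

# Validation loss evaluated at theta_hat, so it has a path back to eps.
l_g = ((x_val @ theta_hat - y_val) ** 2).mean()
(grad_eps,) = torch.autograd.grad(l_g, eps)

# Contrast: a validation loss computed from the original theta never
# touches eps, so differentiating it w.r.t. eps raises a RuntimeError.
l_g_unlinked = ((x_val @ theta - y_val) ** 2).mean()
try:
    torch.autograd.grad(l_g_unlinked, eps)
    unused_error = False
except RuntimeError:
    unused_error = True
```

In this sketch the link between the validation loss and `eps` exists only because the validation forward pass uses `theta_hat`. If the validation loss is computed from parameters that were not produced by the weighted update, or the first gradient is taken without `create_graph=True`, that path is missing.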