Ablation study of auxiliary losses?
joeyz0z opened this issue · 1 comment
joeyz0z commented
Hello,
I was wondering about the role of the auxiliary losses applied to each intermediate decoder layer. Do they help accelerate model convergence, or do they serve some other purpose?
Thanks!
ttengwang commented
The intermediate losses split the learning into multiple steps and may ease the learning process. I observed that they improve both localization and captioning performance, but I don't recall them helping convergence.
The design follows DETR and Deformable DETR, and you may find more analysis in those papers.
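For readers unfamiliar with the idea, here is a rough sketch of how a loss over intermediate decoder layers is typically assembled. It is not this repository's code: `l1_loss`, `deeply_supervised_loss`, and `aux_weight` are hypothetical names, and the simple L1 criterion is only a stand-in for the real matching-based localization and captioning losses.

```python
from typing import Callable, List, Sequence, Tuple


def l1_loss(pred: Sequence[float], target: Sequence[float]) -> float:
    """Mean absolute error; a placeholder for the real criterion (assumption)."""
    return sum(abs(p - t) for p, t in zip(pred, target)) / len(target)


def deeply_supervised_loss(
    per_layer_preds: Sequence[Sequence[float]],
    target: Sequence[float],
    criterion: Callable[[Sequence[float], Sequence[float]], float] = l1_loss,
    aux_weight: float = 1.0,
) -> Tuple[float, float, List[float]]:
    """Apply the same criterion to every decoder layer's prediction.

    per_layer_preds is ordered from the first decoder layer to the last.
    The last entry gives the main loss; earlier entries give auxiliary losses.
    """
    *aux_preds, final_pred = per_layer_preds
    final = criterion(final_pred, target)
    aux = [criterion(p, target) for p in aux_preds]
    # Total objective: main loss plus the weighted sum of auxiliary losses.
    total = final + aux_weight * sum(aux)
    return total, final, aux


# Toy example: three decoder layers whose predictions get closer to the target.
preds = [[0.0, 0.0], [0.5, 1.0], [1.0, 2.0]]
total, final, aux = deeply_supervised_loss(preds, target=[1.0, 2.0])
print(total, final, aux)
```

Setting `aux_weight=0.0` in this sketch recovers training with the final-layer loss only, which is the comparison an ablation of the auxiliary losses would make.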