train.sh
November198 opened this issue · 2 comments
Since the backbone of this reproduction is DINO, how are the two loss functions proposed in the paper reflected in the code? If MODEL.TWO_STREAM is set to False in train.sh, the two loss functions proposed in the paper do not appear to be used. Thank you very much for your consideration.
@Dorothy-h357h thank you for your interest in our work.
In our code, during training we pass our two global views through the teacher (here) and use its outputs as targets to guide the student outputs. The self-distillation loss is defined entirely in the loss class, following the DINO loss (from their paper).
The two losses in our paper (local-to-global and global-to-global) are implemented together in the single for loop inside the forward function of the DINO loss class. We generate the multiple views in our dataset class, here and here.
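For readers following along, here is a minimal sketch of how one loop can cover both loss terms. It is not the repository's code: it uses plain Python instead of PyTorch, and the function name `dino_style_loss`, the temperature defaults, and the convention that global views come first in the student list are all assumptions made for illustration.

```python
import math


def softmax(logits, temp):
    """Temperature-scaled softmax over one logit vector."""
    scaled = [x / temp for x in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def log_softmax(logits, temp):
    """Temperature-scaled log-softmax over one logit vector."""
    scaled = [x / temp for x in logits]
    m = max(scaled)
    lse = m + math.log(sum(math.exp(s - m) for s in scaled))
    return [s - lse for s in scaled]


def dino_style_loss(student_out, teacher_out, center,
                    student_temp=0.1, teacher_temp=0.04):
    """Hypothetical DINO-style loss.

    student_out: one logit vector per view, global views first (assumption).
    teacher_out: one logit vector per global view only.
    """
    n_global = len(teacher_out)
    # Teacher targets: centered, then sharpened with a low temperature.
    targets = [softmax([t - c for t, c in zip(view, center)], teacher_temp)
               for view in teacher_out]
    log_probs = [log_softmax(view, student_temp) for view in student_out]

    g2g, l2g = [], []
    for iq, q in enumerate(targets):
        for v, log_p in enumerate(log_probs):
            if v == iq:
                continue  # never compare a view with itself
            ce = -sum(qi * lpi for qi, lpi in zip(q, log_p))
            # Student views with index < n_global are global, the rest local.
            (g2g if v < n_global else l2g).append(ce)

    total = (sum(g2g) + sum(l2g)) / (len(g2g) + len(l2g))
    return total, g2g, l2g


# Two global views followed by two local views (made-up logits).
student = [[1.0, 0.0, -1.0], [0.5, 0.2, -0.3],
           [0.0, 1.0, 0.0], [-1.0, 0.0, 1.0]]
teacher = [[1.2, 0.1, -0.8], [0.4, 0.3, -0.2]]
total, g2g, l2g = dino_style_loss(student, teacher, center=[0.0, 0.0, 0.0])
print(len(g2g), len(l2g))  # 2 global-to-global terms, 4 local-to-global terms
```

The sketch keeps the two kinds of terms in separate lists only to make them visible; a DINO-style implementation would typically accumulate them into a single running sum inside the same loop.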
Hope this clarifies your question.
Thank you very much! That clears up a lot.