baaivision/EVA

Different weight decay between code and paper for EVA-CLIP-18B

Closed · 1 comment

In the paper, the weight decay (wd) is 0, while in the codebase wd is left at its default value of 0.02.

@Novestars Appreciate you bringing this to our attention. The wd is set to 0 during the training of both EVA-CLIP-18B and EVA-CLIP-8B models.
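
For anyone reproducing these runs, here is a minimal sketch of the mismatch described above: a training script whose weight decay defaults to 0.02 unless it is overridden explicitly. The `--wd` flag name, the parser, and the `resolve_weight_decay` helper are assumptions made for illustration and are not taken from the repository; only the two values (0.02 and 0) come from this thread.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical parser: the flag name and default mirror the values
    # quoted in this issue, not the repository's actual argument list.
    parser = argparse.ArgumentParser()
    parser.add_argument("--wd", type=float, default=0.02, help="weight decay")
    return parser


def resolve_weight_decay(argv: list[str]) -> float:
    # Return the weight decay a run would use for the given CLI arguments.
    return build_parser().parse_args(argv).wd


# Without an override, the default (0.02) is used.
default_wd = resolve_weight_decay([])

# Passing the flag explicitly gives the value reported for the paper (0).
paper_wd = resolve_weight_decay(["--wd", "0"])

print(default_wd, paper_wd)
```

Under these assumptions, a run only matches the reported setting if the weight decay is passed explicitly rather than left at the default.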