Question about the text adapter
zhangbw21 opened this issue · 1 comments
zhangbw21 commented
Hello, your paper says that you use an adapter in both the visual and the text streams, but in your code I can only find the visual one. Which one is correct? Thanks a lot.
vishaal27 commented
I believe the code implements the first variant of CLIP-Adapter, which they describe in Section 4.1.1 of the paper (Training Settings). The exact quote from the paper is:
"The first variant of CLIP-Adapter is adopted by default if not specified, which fine-tunes the image feature while freezes the classifier weight. In other words, it only implements CLIP-Adapter for the visual adapter".
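To make the distinction concrete, here is a minimal, framework-free sketch of what "visual adapter only" means: the image feature is blended with its adapted version, while the text features (the classifier weights) are used as they are. This is only an illustration, not the repository's code. The names `toy_adapter`, `visual_only_logits`, `ratio`, and `scale` are hypothetical, the default values are placeholders, and `toy_adapter` merely stands in for the learned adapter network.

```python
import math


def normalize(vec):
    """Scale a vector to unit L2 norm."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def toy_adapter(feat):
    """Hypothetical stand-in for the learned visual adapter (here just a ReLU)."""
    return [max(0.0, x) for x in feat]


def visual_only_logits(image_feat, text_feats, adapter=toy_adapter,
                       ratio=0.2, scale=100.0):
    """Adapt only the image feature; text features stay frozen.

    ratio and scale are illustrative values, not taken from the paper.
    """
    adapted = adapter(image_feat)
    # Residual blend of the adapted and the original image feature.
    blended = [ratio * a + (1.0 - ratio) * o
               for a, o in zip(adapted, image_feat)]
    blended = normalize(blended)

    logits = []
    for text_feat in text_feats:
        # No adapter on the text side: only normalization is applied.
        frozen = normalize(text_feat)
        logits.append(scale * sum(a * b for a, b in zip(blended, frozen)))
    return logits


# Example call with two class text features.
logits = visual_only_logits([3.0, -4.0], [[1.0, 0.0], [0.0, 1.0]])
```

A variant that also adapts the text stream would apply a second adapter and the same kind of residual blend to each `text_feat` before normalization. That step is what is absent in the visual-only variant.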