mlfoundations/wise-ft

Does fine-tuning only update the image encoder?

yushuinanrong opened this issue · 2 comments

First of all, thanks for sharing the codebase.
I briefly went through the code, and it seems that you only fine-tune the image encoder. Is that right? If so, I'm curious whether you have tried fine-tuning both the image and text encoders.

Thank you for your question. We have only tried fine-tuning the image encoder together with the linear classifier that is output by the text encoder; we have not tried fine-tuning the text encoder itself.
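For readers who want to see what that setup might look like, here is a minimal PyTorch sketch, not the repository's actual code. It assumes toy stand-in modules (`image_encoder`, `text_encoder`, `head` are hypothetical names, not wise-ft classes), assumes normalized features, and uses arbitrary sizes and hyperparameters. The text encoder is used once to produce the classifier weights and stays frozen; only the image encoder and the linear head are given to the optimizer.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

# Hypothetical stand-ins for the two CLIP towers (assumption: toy modules,
# not the classes used in wise-ft).
EMBED_DIM, NUM_CLASSES = 16, 5
image_encoder = nn.Sequential(nn.Flatten(), nn.Linear(3 * 8 * 8, EMBED_DIM))
text_encoder = nn.Embedding(NUM_CLASSES, EMBED_DIM)  # one "prompt" embedding per class

# The text encoder only produces the classifier weights; it is frozen.
for p in text_encoder.parameters():
    p.requires_grad_(False)

with torch.no_grad():
    class_embeddings = F.normalize(text_encoder(torch.arange(NUM_CLASSES)), dim=-1)

# Linear head initialized from the text-encoder output, trainable afterwards.
head = nn.Linear(EMBED_DIM, NUM_CLASSES, bias=False)
with torch.no_grad():
    head.weight.copy_(class_embeddings)

# Only the image encoder and the head are passed to the optimizer.
trainable = list(image_encoder.parameters()) + list(head.parameters())
optimizer = torch.optim.AdamW(trainable, lr=1e-3)

# Snapshots so we can check which parameters moved after one step.
text_before = text_encoder.weight.clone()
head_before = head.weight.detach().clone()
enc_before = image_encoder[1].weight.detach().clone()

# One training step on random data (assumption: placeholder batch).
images = torch.randn(4, 3, 8, 8)
labels = torch.tensor([0, 1, 2, 3])
features = F.normalize(image_encoder(images), dim=-1)
loss = F.cross_entropy(head(features), labels)
optimizer.zero_grad()
loss.backward()
optimizer.step()

text_unchanged = torch.equal(text_encoder.weight, text_before)
head_changed = not torch.equal(head.weight, head_before)
encoder_changed = not torch.equal(image_encoder[1].weight, enc_before)
```

In this sketch, fine-tuning the text encoder as well would amount to leaving its parameters trainable and recomputing the class embeddings on every step instead of copying them into a fixed head once.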

Thanks for the clarification.