chenyangzhu1/MultiBooth

A question

Closed this issue · 1 comment

At the position of the red arrow, S* has already become an embedding. Does it still need to go through the CLIP text encoder again for encoding?
[Image not preserved: the figure with the red arrow referenced above]

Yes.
As indicated in the paper, S* functions as a placeholder string. It is combined with other words to form a prompt (e.g., “a photo of a S* dog”), and the full prompt is then sent into the text encoder.
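
To make the flow concrete, here is a minimal toy sketch in plain Python. It is not the MultiBooth code and does not use the real CLIP model: the vocabulary, the random embedding table, and the `toy_text_encoder` function (a single self-attention step with no learned weights) are all hypothetical stand-ins. It only illustrates that the embedding looked up for S* is an input to the text encoder, and that the encoder output at that position depends on the other words in the prompt.

```python
import math
import random

# Hypothetical toy vocabulary; "S*" is the placeholder string.
VOCAB = {"a": 0, "photo": 1, "of": 2, "S*": 3, "dog": 4}
DIM = 4

random.seed(0)
# Toy token-embedding table (the encoder's input layer).
# The row for "S*" stands in for the learned concept embedding.
EMBED = [[random.uniform(-1.0, 1.0) for _ in range(DIM)] for _ in VOCAB]


def tokenize(prompt):
    """Whitespace tokenizer for the toy vocabulary."""
    return [VOCAB[word] for word in prompt.split()]


def toy_text_encoder(token_ids):
    """Stand-in for the CLIP text encoder: one self-attention step.

    The real encoder has learned projections, positional embeddings,
    a causal mask, and many layers; none of that is modeled here.
    """
    x = [EMBED[t] for t in token_ids]
    outputs = []
    for query in x:
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(DIM)
            for key in x
        ]
        top = max(scores)
        weights = [math.exp(s - top) for s in scores]
        total = sum(weights)
        weights = [w / total for w in weights]
        # Each output vector mixes information from every token in the prompt.
        outputs.append(
            [sum(weights[j] * x[j][d] for j in range(len(x))) for d in range(DIM)]
        )
    return outputs


prompt = "a photo of a S* dog"
ids = tokenize(prompt)
pos = ids.index(VOCAB["S*"])

raw_embedding = EMBED[VOCAB["S*"]]   # what S* is before encoding
encoded = toy_text_encoder(ids)      # the full prompt goes through the encoder
contextual = encoded[pos]            # S* after encoding, now prompt-dependent

# The same placeholder in a different prompt yields a different encoded vector.
other_ids = tokenize("a S* dog")
other_contextual = toy_text_encoder(other_ids)[other_ids.index(VOCAB["S*"])]
```

In this sketch, `raw_embedding` plays the role of S* at the red arrow, while `contextual` is what the downstream model would actually be conditioned on after the whole prompt has been encoded.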