wenwenyu/TCM

Where is the implementation of Meta Query in the code?


I'm confused about the implementation of the Language Prompt Module. According to Figure 4 and Sec. 3.2.3, a Meta Query is learned to generate the implicit conditional cue cc via the Language Prompt Module. However, according to Fig. 5 and the code below, the conditional cue cc seems to be generated from the global image feature rather than from a Meta Query. These two descriptions look contradictory to me.
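To make my reading of Figure 4 / Sec. 3.2.3 concrete, this is roughly what I expected to find in the code (a minimal sketch only; `meta_query` and `language_prompt_module` are hypothetical names and layers, not taken from the repo):

```python
import torch
import torch.nn as nn

B, C = 2, 512  # batch size and embedding dim, arbitrary values for illustration

# A learnable Meta Query that does not depend on the input image,
# decoded by the Language Prompt Module into the conditional cue cc.
meta_query = nn.Parameter(torch.randn(1, C))          # hypothetical learnable query
language_prompt_module = nn.Linear(C, C)              # stand-in for the actual module
cc = language_prompt_module(meta_query).expand(B, -1)  # (B, C), shared across all images
```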

So my question is: where can I find the implementation of the Meta Query in the code? And, relatedly, what is the difference between the CoOp-like learnable prompts described in Sec. 3.2.2 and the Meta Query?
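For reference, my understanding of the CoOp-like learnable prompts in Sec. 3.2.2 is roughly the following (again a sketch with assumed names and shapes, not the repo's code):

```python
import torch
import torch.nn as nn

N, D, K = 4, 512, 1  # N learnable context tokens, token-embedding dim D, K classes

# CoOp-style prompt: N learnable context embeddings shared by all classes,
# prepended to the (frozen) token embedding of each class name before the
# CLIP text encoder; only `contexts` is optimized.
contexts = nn.Parameter(torch.randn(1, N, D))
class_name_emb = torch.randn(K, 1, D)  # e.g. the embedding of the word "text"
prompt = torch.cat([contexts.expand(K, -1, -1), class_name_emb], dim=1)  # (K, N + 1, D)
```

The snippet below is the part of the repo I am referring to: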

# texts is the set of class-name embeddings, contexts is the learnable prompt embedding
prompt_gen = None
# text prompting
if self.prompt_generator is not None:  # text prompt generator
    prompt_gen = self.prompt_generator(global_feat)  # (B, C)
contexts = self.contexts if self.use_learnable_prompt else None  # (1, N, C)
# (B, K, C), last time step t as output, (BKLC -> BKC)
# (1, K, D) -> (B, K, D)
text_embeddings = self.text_encoder(
    self.texts.to(global_feat.device),
    contexts,
    use_learnable_prompt_only=self.use_learnable_prompt_only,
    prompt_gen=prompt_gen).expand(B, -1, -1)
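
From this snippet, my guess (an assumption on my part, not something I verified in the repo) is that self.prompt_generator is just a small network over the pooled image feature, so the cue fed to the text encoder is image-conditioned rather than produced from a separate learned query, e.g.:

```python
import torch
import torch.nn as nn

B, C_vis, C_txt = 2, 1024, 512  # visual / text embedding dims, arbitrary values

# Hypothetical prompt generator: maps the global image feature to a cue that is
# later injected on the text-encoder side (e.g. added to the prompt embeddings).
prompt_generator = nn.Sequential(
    nn.Linear(C_vis, C_vis // 4),
    nn.ReLU(inplace=True),
    nn.Linear(C_vis // 4, C_txt),
)
global_feat = torch.randn(B, C_vis)
prompt_gen = prompt_generator(global_feat)  # (B, C_txt) -- what I read as cc in Fig. 5
```

If that reading is correct, I cannot see where a standalone Meta Query parameter enters this path, hence my question.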
