ShengcaiLiao/TransMatcher

Questions about input of TransMatcher and Variable Naming in implementation

turtleman99 opened this issue · 1 comments

Hi @ShengcaiLiao ,

Thank you so much for the impressive work!

I'm not familiar with person identification tasks, but I found another paper of yours that says, "The detection sub-task is to determine the presence of the probe subject in the gallery, and the identification sub-task is to determine which person in the gallery has the same identity as the accepted probe." So I assume the memory here (in the TransMatcher instance initialization) should be the gallery features.

Let's look at the forward function of TransMatcher:

    def forward(self, features):
        score = self.decoder(self.memory, features)
        return score

The first input is memory, and the second is features. However, the TransformerDecoder definition goes as follows:

    def forward(self, tgt: Tensor, memory: Tensor) -> Tensor:
        r"""Pass the inputs through the decoder layer in turn.
        Args:
            tgt: the sequence to the decoder (required).
            memory: the sequence from the last layer of the encoder (required).
        Shape:
            tgt: [q, h, w, d*n], where q is the query length, d is d_model, n is num_layers, and (h, w) is feature map size
            memory: [k, h, w, d*n], where k is the memory length
        """

The tgt and memory variables here confuse me. Which should be the probe (query) features, and which should be the gallery features?

Thank you for your reply in advance.

Hi Guangyuan,

Thank you for your interest. I'm sorry for the confusion, but don't worry, the two inputs are mostly interchangeable. You can forget about gallery and probe/query. Basically, TransMatcher here does pairwise image matching, or pairwise distance learning, so by definition the two input images should be swappable, or at least they stand on equal footing and do not need to be differentiated.
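As a rough illustration of why the order matters little in pairwise matching, here is a minimal toy sketch in plain Python. It is not the TransMatcher code: `pair_score` and `score_matrix` are hypothetical helpers, each image is reduced to a short list of local descriptors, and the scoring rule (best match in both directions, summed) is an assumption chosen because it is exactly symmetric. The real decoder is only "mostly" symmetric, so treat this as intuition rather than a description of the implementation.

```python
from typing import List

Vector = List[float]
Image = List[Vector]  # one local descriptor per feature-map location


def dot(u: Vector, v: Vector) -> float:
    return sum(a * b for a, b in zip(u, v))


def pair_score(x: Image, y: Image) -> float:
    """Toy symmetric matching score between two images (hypothetical)."""
    # Correlation between every location of x and every location of y.
    corr = [[dot(u, v) for v in y] for u in x]
    best_for_x = sum(max(row) for row in corr)        # x locations matched into y
    best_for_y = sum(max(col) for col in zip(*corr))  # y locations matched into x
    return best_for_x + best_for_y


def score_matrix(first: List[Image], second: List[Image]) -> List[List[float]]:
    """Score every image in `first` against every image in `second`."""
    return [[pair_score(a, b) for b in second] for a in first]


# Two "memory" images and three "features" images, 2 locations x 2 dims each.
memory = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[2.0, 1.0], [1.0, 3.0]],
]
features = [
    [[1.0, 1.0], [0.0, 2.0]],
    [[3.0, 0.0], [1.0, 1.0]],
    [[0.0, 1.0], [2.0, 2.0]],
]

s1 = score_matrix(memory, features)   # rows: memory, columns: features
s2 = score_matrix(features, memory)   # rows: features, columns: memory
transposed = [list(row) for row in zip(*s2)]

# Swapping the two inputs only transposes the score matrix.
print(s1 == transposed)
```

With a symmetric score like this, which side is called memory (or gallery) and which is called features (or probe) only decides whether a given pair's score sits at row i, column j or at row j, column i.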

By the way, what are you aiming to do with TransMatcher?