Question about epipolar_fusion algorithm and code.

Opened this issue 6 months ago · 0 comments

NanCheng2001 commented 6 months ago

Generally speaking, aren't the keys and values in a Transformer both derived from image features? The key you use comes from src_feature, but why does the value come from nn.pos_embed (an embedding)? This seems to contradict the common pattern. The paper describes it the same way, which leaves me a bit puzzled.
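
To make the contrast concrete, here is a minimal sketch of the two patterns being compared. It is not the repository's code: only the names src_feature and pos_embed come from the question, while the `attend` helper, the toy numbers, and the single-query, single-head setup are assumptions made purely for illustration.

```python
import math

def softmax(xs):
    # Numerically stable softmax over a list of floats.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def attend(query, keys, values):
    """Single-query scaled dot-product attention over plain Python lists."""
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    weights = softmax(scores)
    dim = len(values[0])
    return [sum(w * v[d] for w, v in zip(weights, values)) for d in range(dim)]

# Toy data (made up): 3 samples along an epipolar line, feature dim 2.
src_feature = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
# Hypothetical stand-in for a learned per-position embedding table.
pos_embed = [[0.0], [0.5], [1.0]]
query = [0.0, 1.0]

# Common pattern: keys and values both come from image features.
common = attend(query, src_feature, src_feature)

# Pattern asked about: keys from image features, values from pos_embed.
asked = attend(query, src_feature, pos_embed)
```

In the second call the attention weights are still computed from feature similarity, but the output is a weighted mix of position embeddings, so it encodes where along the line the query matched rather than what the matched features look like. Whether that is the motivation behind the design in the paper is exactly what the question is asking.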