wzzheng/TPVFormer

Question about Cross-view Hybrid attention

jianingwangind opened this issue · 3 comments

Thanks for sharing the great work.

Regarding cross-view hybrid attention, is it only applied to the HW (top) plane?

The query is the plane itself, and key and value are both None, while later in cross-view hybrid attention the value is set to the query concatenated with itself:

value = torch.cat([query, query], 0)
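For reference, a minimal runnable sketch of what this collapses to, using a plain nn.MultiheadAttention in place of the repo's deformable attention (the tensor shapes here are hypothetical, not the actual config values):

```python
import torch
import torch.nn as nn

bs, num_query, embed_dims = 1, 100 * 100, 256  # hypothetical HW-plane sizes

# Only the HW-plane queries are available; key and value are None in the config.
query = torch.randn(num_query, bs, embed_dims)

# The value is just the query duplicated, so it carries no features
# from the ZH or WZ planes.
value = torch.cat([query, query], dim=0)

# Attending to duplicated copies of the query gives the same result
# as self-attention over the HW plane alone.
attn = nn.MultiheadAttention(embed_dims, num_heads=8)
out, _ = attn(query, value, value)
print(out.shape)  # torch.Size([10000, 1, 256])
```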

I have the same question; it looks like there is no interaction between the features of the three planes.

Thanks for your interest in our work.
Your understanding of the code is correct. That is, in TPVFormer04, cross-view hybrid attention is enabled only in the HW plane, thus degrading to self-attention.
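To make the degradation concrete, here is an illustrative contrast (not the repo's actual deformable implementation; the plane sizes and names are made up): if the value also carried the ZH and WZ plane queries, the HW queries would aggregate features from the other planes, but with only the HW queries available the same attention call is ordinary self-attention.

```python
import torch
import torch.nn as nn

# Hypothetical shapes for illustration only.
bs, embed_dims = 1, 256
hw_query = torch.randn(100 * 100, bs, embed_dims)  # HW plane
zh_query = torch.randn(8 * 100, bs, embed_dims)    # ZH plane (hypothetical size)
wz_query = torch.randn(100 * 8, bs, embed_dims)    # WZ plane (hypothetical size)

attn = nn.MultiheadAttention(embed_dims, num_heads=8)

# What cross-view interaction could look like: HW queries attend to all planes.
cross_value = torch.cat([hw_query, zh_query, wz_query], dim=0)
cross_out, _ = attn(hw_query, cross_value, cross_value)

# What TPVFormer04 does for this layer: only HW queries are available,
# so the attention degenerates to self-attention within the HW plane.
self_out, _ = attn(hw_query, hw_query, hw_query)
```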

@huang-yh Thanks for your reply. May I further ask about the idea behind this? Was the performance similar when the attention in the other two planes was disabled?