SHI-Labs/Neighborhood-Attention-Transformer

Is it possible to do upsampling using NAT?

jimmysue opened this issue · 2 comments

As the title says, how can I apply NAT to upsample a feature map?

Hello and thank you for your interest.

Just to clarify: NA is the attention pattern, while NAT is the hierarchical transformer model built on it.

If you're asking how to upsample feature maps using NA, we have not yet explored direct upsampling using NA.

One simple way to upsample with attention is to upsample only the query tensor in self attention (via interpolation or any other upsampling operation), so that the output takes on the query's higher resolution. However, this still has time and space complexity that is quadratic in the number of tokens.
While it's unlikely that there is a theoretical obstacle to doing something similar with NA, NA is tied to its implementation in NATTEN, which requires the input tensors (query, key and value) to have the same shape. It is therefore not possible to upsample the query feature map and then apply NA: that would require a mapping between the two feature map sizes, and the implementation would have to support it.
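
To make the query-upsampling idea concrete, here is a minimal NumPy sketch of ordinary (global) attention in which only the queries are upsampled. It is not NA and does not use NATTEN; the function names, the single-head layout, the `(H, W, C)` feature map shape, and the nearest-neighbor upsampling are all assumptions chosen for illustration.

```python
import numpy as np


def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def upsample_nearest(x, scale):
    # Nearest-neighbor upsampling of an (H, W, C) map to (scale*H, scale*W, C).
    return x.repeat(scale, axis=0).repeat(scale, axis=1)


def upsampled_query_attention(feat, w_q, w_k, w_v, scale=2):
    """Single-head global attention with upsampled queries (illustrative only).

    feat: (H, W, C) feature map; w_q, w_k, w_v: (C, D) projection matrices.
    Returns a (scale*H, scale*W, D) feature map.
    """
    h, w, c = feat.shape
    tokens = feat.reshape(h * w, c)

    # Project at the original resolution.
    q = (tokens @ w_q).reshape(h, w, -1)
    k = tokens @ w_k                      # (H*W, D)
    v = tokens @ w_v                      # (H*W, D)

    # Upsample only the queries, then flatten them into tokens.
    q_up = upsample_nearest(q, scale).reshape(h * scale * w * scale, -1)

    # Every upsampled query attends to every original key, so the score
    # matrix has (scale^2 * H*W) x (H*W) entries.
    scores = q_up @ k.T / np.sqrt(q_up.shape[-1])
    out = softmax(scores, axis=-1) @ v
    return out.reshape(h * scale, w * scale, -1)
```

The score matrix grows with the product of the query and key token counts, which is the quadratic cost mentioned above. With plain nearest-neighbor upsampling the duplicated queries also produce duplicated outputs, so in practice an interpolation or a learned upsampling of the queries would likely be more useful.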

If we were to recommend anything, it would be upsampling everything either before or after applying NA, which is pretty trivial. You could also replace the QKV projections or the final projection with an upsampling layer.
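
The two orderings suggested above can be sketched as follows. This is only an outline in NumPy: `attn` stands for any shape-preserving attention operation on an `(H, W, C)` feature map (for example an NA layer), and `placeholder_attn` is a hypothetical identity stand-in, not a NATTEN function. The helper names and the nearest-neighbor upsampling are likewise assumptions.

```python
import numpy as np


def upsample_nearest(x, scale=2):
    # Nearest-neighbor upsampling of an (H, W, C) map to (scale*H, scale*W, C).
    return x.repeat(scale, axis=0).repeat(scale, axis=1)


def upsample_then_attend(feat, attn, scale=2):
    # Attention runs at the target resolution, on scale^2 times as many tokens.
    return attn(upsample_nearest(feat, scale))


def attend_then_upsample(feat, attn, scale=2):
    # Attention runs at the original resolution; the result is upsampled after.
    return upsample_nearest(attn(feat), scale)


def placeholder_attn(x):
    # Hypothetical stand-in for a shape-preserving attention layer.
    # Replace with a real layer whose query, key and value share one shape.
    return x


feat = np.arange(2 * 3 * 4, dtype=float).reshape(2, 3, 4)
before = upsample_then_attend(feat, placeholder_attn)   # shape (4, 6, 4)
after = attend_then_upsample(feat, placeholder_attn)    # shape (4, 6, 4)
```

In both orderings the attention operation sees query, key and value of the same shape, so the same-shape requirement is respected. Attending first is cheaper because attention runs on the smaller map, while upsampling first lets attention operate at the output resolution.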

I hope this answers your question.

Closing due to inactivity. Feel free to reopen if you still have questions.