yinjunbo/3DVID

Questions regarding the Spatial Attention


Hi @yinjunbo ,

First of all, congratulations and thanks a lot for the great work!

I am trying to implement the Spatial Attention layer, but I am running into CUDA out-of-memory errors. I would like to confirm how the attention weights in the paper are computed: is every pixel in the downsampled BEV grid compared against every other pixel in the grid? If so, wouldn't this be quite memory-intensive during gradient computation, particularly with the VoxelNet model? Did you run into similar memory issues during training?
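
For context, here is the rough estimate behind my concern. It is only a sketch: the 200 x 176 grid size, float32 storage, and a single dense N x N weight matrix per sample are my own assumptions, not values taken from the paper or this repo, and `attention_matrix_bytes` is just a helper name I made up.

```python
def attention_matrix_bytes(height, width, bytes_per_elem=4, batch_size=1):
    """Bytes needed to store one dense pairwise attention matrix.

    Assumes every BEV cell attends to every other cell, so the
    matrix has (height * width) ** 2 entries per sample.
    """
    num_cells = height * width
    return batch_size * num_cells * num_cells * bytes_per_elem


# Hypothetical downsampled BEV grid (assumed, not from the paper).
H, W = 200, 176
size_bytes = attention_matrix_bytes(H, W)
print(f"{H * W} cells -> {size_bytes / 2**30:.2f} GiB per sample")
```

Under these assumptions the weight matrix alone is about 4.6 GiB per sample, and the backward pass has to keep it (plus the softmax input) around, which seems consistent with the out-of-memory errors I am seeing.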

Could you please clarify my queries?

Thanks!