Question of the learnable matrix of 3D Fourier positional embeddings

Question

TianhangXiang opened this issue a year ago · 2 comments

Thank you for the great work!
In the LL3DA paper, the matrix B used for the click position is described as learnable. In the code, however, the positional embeddings are computed inside `with torch.no_grad()`, so it seems that no gradient reaches the matrix. Am I missing something, or does keeping the matrix frozen lead to better results?

Answer 1 · 2024-04-27T12:40:35.000Z

You are right: the Gaussian matrix B is randomly initialized from $\mathcal{N}(0, 1)$ and kept frozen throughout training: https://github.com/Open3DA/LL3DA/blob/main/models/ll3da/position_embedding.py#L36. You are welcome to try training these parameters and see whether that leads to better performance.
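
For anyone who wants to try this, here is a minimal sketch of what switching between a frozen and a trainable B could look like. It is not the code from the repository: the class name `FourierPositionEmbedding`, the `learnable` flag, and the default sizes are assumptions made up for illustration. Note that making B an `nn.Parameter` is not enough on its own; the forward pass also has to run outside `torch.no_grad()`, otherwise no gradient is recorded.

```python
import math

import torch
import torch.nn as nn


class FourierPositionEmbedding(nn.Module):
    """Hypothetical Gaussian Fourier embedding for 3D points (illustration only)."""

    def __init__(self, d_in=3, d_out=256, learnable=False):
        super().__init__()
        assert d_out % 2 == 0, "d_out is split evenly between sin and cos"
        B = torch.randn(d_in, d_out // 2)  # entries drawn from N(0, 1)
        if learnable:
            # Registered as a parameter: the optimizer sees it and it gets gradients.
            self.B = nn.Parameter(B)
        else:
            # Registered as a buffer: saved in the state dict, never updated.
            self.register_buffer("B", B)

    def forward(self, xyz):
        # xyz: (..., d_in) -> (..., d_out). No torch.no_grad() here, so that
        # gradients can flow back to B when it is a parameter.
        proj = 2 * math.pi * xyz @ self.B
        return torch.cat([proj.sin(), proj.cos()], dim=-1)


# Usage: random "click" coordinates, purely for demonstration.
clicks = torch.rand(4, 3)
trainable = FourierPositionEmbedding(learnable=True)
trainable(clicks).sum().backward()
print(trainable.B.grad is not None)

frozen = FourierPositionEmbedding(learnable=False)
print(len(list(frozen.parameters())))
```

With `learnable=False` the matrix stays in the checkpoint but is invisible to the optimizer, which matches the frozen behaviour described above. With `learnable=True` it is trained like any other weight.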

Answer 2 · 2024-04-27T12:48:14.000Z

Thank you for your response!