zhyever/SimIPU

Have you tried not cropping the gradient of f^{\alpha} in Eq. 7?

Hiusam opened this issue · 1 comment

Hi, I like your good work!
I am wondering whether you have tried not cropping (i.e., not stopping) the gradient of $f^{\alpha}$ in Eq. 7.
If the gradient is cropped, it seems that the point branch cannot learn anything from the image branch during pre-training.

Hi,

Thanks for your attention to our work.

Please remember that the goal of SimIPU is to learn spatial-aware visual representations. The point branch is only used to capture spatial information and transfer it to the image encoder during the pre-training stage. After that, the point branch is discarded, and only the image encoder is fine-tuned on downstream tasks. So during pre-training, there is no need for the point branch to learn anything from the image branch. The knowledge the point branch learns from equivalence under global transformations is its key contribution.
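To make the effect of cropping the gradient concrete, here is a toy sketch in plain Python. It is not the SimIPU code and does not reproduce Eq. 7: the scalar "encoders", the squared-error alignment loss, and the names `Value` and `alignment_grads` are all assumptions made for illustration, and it assumes (as the question implies) that $f^{\alpha}$ is the point-branch feature. In PyTorch the analogous operation would be `Tensor.detach()`.

```python
class Value:
    """Minimal scalar reverse-mode autodiff node (toy, illustration only)."""

    def __init__(self, data, parents=(), grad_fns=()):
        self.data = float(data)
        self.grad = 0.0
        self._parents = parents    # upstream Value nodes
        self._grad_fns = grad_fns  # maps output grad to each parent's grad

    def __mul__(self, other):
        return Value(self.data * other.data, (self, other),
                     (lambda g: g * other.data, lambda g: g * self.data))

    def __sub__(self, other):
        return Value(self.data - other.data, (self, other),
                     (lambda g: g, lambda g: -g))

    def detach(self):
        # Same value, but no parents: gradients cannot flow past this node.
        return Value(self.data)

    def backward(self):
        # Build a topological order (parents before children), then
        # propagate gradients from the output back to the leaves.
        order, seen = [], set()

        def visit(v):
            if id(v) not in seen:
                seen.add(id(v))
                for parent in v._parents:
                    visit(parent)
                order.append(v)

        visit(self)
        self.grad = 1.0
        for v in reversed(order):
            for parent, fn in zip(v._parents, v._grad_fns):
                parent.grad += fn(v.grad)


def alignment_grads(stop_grad_point):
    """Return (loss, dL/dw_img, dL/dw_pt) for a toy cross-modal loss."""
    w_img, w_pt = Value(0.5), Value(2.0)  # one scalar weight per "encoder"
    x, p = Value(3.0), Value(1.0)         # paired image / point inputs
    f_img = w_img * x                     # image-branch feature
    f_pt = w_pt * p                       # point-branch feature
    # Cropping the gradient: the point feature is used as a fixed target.
    target = f_pt.detach() if stop_grad_point else f_pt
    diff = f_img - target
    loss = diff * diff                    # squared-error stand-in for Eq. 7
    loss.backward()
    return loss.data, w_img.grad, w_pt.grad


print(alignment_grads(stop_grad_point=True))   # point weight gets zero grad
print(alignment_grads(stop_grad_point=False))  # both weights get a grad
```

In this toy setting, detaching the point feature leaves the image weight's gradient unchanged and zeroes the point weight's gradient for this loss term, so the point branch is updated only by its own (intra-modal) objective.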

Indeed, the point branch could learn some semantic, color, or texture information from the image branch, which could benefit point-branch pre-training. While this is not our goal in SimIPU, it would be interesting to explore LiDAR encoder pre-training strategies based on the SimIPU pipeline. Feel free to discuss further.