Questions about the training mechanism of AST
Taron001 opened this issue · 2 comments
AST is very useful. May I ask the authors how you came up with it? Thank you!
Hi, thanks for your interest.
In my early experiments, I found that the time-interpolation task on real-world datasets, such as HyperNeRF/Nerfies, produces many jittery frames. This jitter essentially exposes the inaccuracy of the camera poses in dynamic real-world scenes, since they are estimated with COLMAP. To mitigate the frame-to-frame jitter, I introduced AST, which uses smoothing to reduce overfitting to specific timestamps during training. To some extent, this also reveals how sensitive explicit modeling methods (like 3D-GS) are to inaccurate poses. In contrast, HyperNeRF, which is NeRF-based and therefore an implicit representation, does not exhibit such pronounced jitter.
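As a rough illustration of the smoothing idea, here is a minimal sketch and not the repository's actual code. It assumes the smoothing takes the form of zero-mean Gaussian noise added to the time input, with a scale that decays linearly to zero over training. The function name `smoothed_timestamp` and the parameters `beta`, `anneal_iters`, and `time_interval` are hypothetical names chosen for this example.

```python
import random


def smoothed_timestamp(t, iteration, anneal_iters, time_interval,
                       beta=0.1, rng=random):
    """Return t plus zero-mean Gaussian noise with a linearly annealed scale.

    All parameter names are illustrative assumptions:
      t             -- timestamp of the current training frame
      iteration     -- current training iteration
      anneal_iters  -- iteration at which the noise reaches zero
      time_interval -- mean gap between neighbouring frame timestamps
      beta          -- overall noise strength
    """
    # Linear decay from 1 at iteration 0 to 0 at anneal_iters and beyond.
    decay = max(0.0, 1.0 - iteration / anneal_iters)
    noise = rng.gauss(0.0, 1.0) * beta * time_interval * decay
    return t + noise


if __name__ == "__main__":
    for it in (0, 10_000, 20_000):
        print(it, smoothed_timestamp(0.5, it, 20_000, 0.01))
```

Early in training the model sees slightly perturbed timestamps, so it cannot fit each frame's (possibly wrong) pose at one exact time value. Once the noise has decayed to zero, the exact timestamps are used and fine detail can be recovered.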
Using 3D-GS (or other explicit representations) to robustly handle inaccurate poses in real-world scenes, and to improve the accuracy of camera pose estimation itself, will undoubtedly be an important research direction in the future.