nerfstudio-project/nerfstudio

Transform conversion from colmap `transform.json` to `camera_path.json`

For the same frame in a dataset:

Result from transforms_train.json (ns-export camera):

        {
            "file_path": "images/frame_00001.jpg",
            "transform_matrix": [
                [
                    0.9837186500788239,
                    -4.931517566535353e-07,
                    0.17971537910499882,
                    -3.0586481911644254
                ],
                [
                    0.17957862323929175,
                    -0.039001623760136724,
                    -0.9829701884693925,
                    -6.690857223027059
                ],
                [
                    0.0070096763532386825,
                    0.999239147223441,
                    -0.03836653611669037,
                    -0.8800965851079441
                ],
                [
                    0.0,
                    0.0,
                    0.0,
                    1.0
                ]
            ],
            "colmap_im_id": 1
        },

Result from colmap transform.json:

    {
        "file_path": "processed/ARRecording_DCF861EF-2ACC-4834-81FF-11E558FC4F83/images_2/frame_00001.jpg",
        "transform": [
            [
                0.9834007024765015,
                -0.03150152042508125,
                0.17869171500205994,
                -0.48375824093818665
            ],
            [
                0.18097257614135742,
                0.09910257160663605,
                -0.9784823060035706,
                -1.0
            ],
            [
                0.013114869594573975,
                0.99457848072052,
                0.10315844416618347,
                -0.0028430684469640255
            ]
        ]
    },

Does transforms_train.json (from ns-export camera) share the same coordinate system as camera_path.json from splatfacto?
What is the conversion from colmap coordinates to camera_path.json coordinates? Is there any scale difference?

My understanding is that there is an additional transformation between the pose in transform.json and the pose nerfstudio actually uses during training. That transformation is applied in nerfstudio_dataparser.py:L236-249. As I understand it, these lines do the following to the original colmap camera pose:

    camera_pose = transform @ camera_pose      # apply the dataparser transform first
    camera_pose[:3, 3] *= scale_factor         # then scale only the translation column

The transformation matrix and scale factor can be found in dataparser_transforms.json in the output folder. I think the applied_transform field in transform.json is not used.
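
To make the two steps above concrete, here is a minimal sketch in plain Python (no numpy, so it runs anywhere). It is not nerfstudio's own code. The helper names are hypothetical, and I am assuming dataparser_transforms.json stores the matrix under a `transform` key (3x4) and the scale factor under a `scale` key, so check your own file before relying on that.

```python
import json


def to_4x4(matrix):
    """Copy a 3x4 or 4x4 pose into a 4x4 list of floats (adds [0, 0, 0, 1] if needed)."""
    rows = [[float(v) for v in row] for row in matrix]
    if len(rows) == 3:
        rows.append([0.0, 0.0, 0.0, 1.0])
    return rows


def matmul4(a, b):
    """Multiply two 4x4 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def apply_dataparser_transform(camera_pose, transform, scale_factor):
    """Hypothetical helper mirroring the two lines above:
    pose = transform @ pose, then pose[:3, 3] *= scale_factor."""
    pose = matmul4(to_4x4(transform), to_4x4(camera_pose))
    for i in range(3):
        pose[i][3] *= scale_factor  # only the translation is scaled
    return pose


def load_dataparser_transform(path):
    """Read the matrix and scale from dataparser_transforms.json.
    The key names "transform" and "scale" are an assumption."""
    with open(path) as f:
        data = json.load(f)
    return data["transform"], data["scale"]


if __name__ == "__main__":
    # Toy values, not taken from the dataset in this issue.
    identity_3x4 = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0]]
    pose_3x4 = [[1, 0, 0, 2.0], [0, 1, 0, 4.0], [0, 0, 1, 6.0]]
    for row in apply_dataparser_transform(pose_3x4, identity_3x4, 0.5):
        print(row)
```

If this understanding is right, the rotation part of the pose is changed only by the `transform` matrix, while the translation is both transformed and rescaled, which would account for a scale difference between the two files.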