Issues with extrinsics
Hello, love your work on this repo.
I have an issue: I use a modified version of the Stanford render script for my car .obj, but when I predict on cars with your pre-trained model, I don't see any prediction in the gt_compare output.
Is this occurring because Blender's coordinate system is not the OpenCV one? How should we approach this issue?
Intrinsics and extrinsics are taken from this script https://blender.stackexchange.com/questions/38009/3x4-camera-matrix-from-blender-camera
I also tried the colmap_wrapper you provided in DeepVoxels. This estimates the poses, but after I feed the poses with the corresponding images to the pretrained cars model, the car renders off-center (i.e., the center of the car is not the center of rotation).
Similarly, I have also generated intrinsics and extrinsics using the above script. My understanding is that the contents of the intrinsics directory (a 9-vector) are the K matrix flattened, and the contents of the pose directory (a 16-vector) are the RT matrix flattened, followed by (0, 0, 0, 1). However, so far the predicted normals look wrongly rotated. The cars should be on a turntable at 0 elevation.
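To make that layout concrete, here is a minimal numpy sketch of the flattening described above (the values and file names are made up, and note that the replies below point out that only intrinsics.txt is actually read, and that the pose still needs the cam2world fix discussed further down):

```python
import numpy as np

# Example values only: a 3x3 K and a 3x4 world->camera RT as produced by the Blender script.
K = np.array([[525., 0., 64.],
              [0., 525., 64.],
              [0., 0., 1.]])
RT = np.hstack([np.eye(3), np.zeros((3, 1))])

# intrinsics directory entry: the 9-vector is K flattened row-major.
intrinsic_vec = K.reshape(-1)

# pose directory entry: the 16-vector is RT flattened, followed by (0, 0, 0, 1).
pose_4x4 = np.vstack([RT, [0., 0., 0., 1.]])
pose_vec = pose_4x4.reshape(-1)

# Then write one file per image, e.g. (directories and names are illustrative):
# np.savetxt('intrinsics/000000.txt', intrinsic_vec[None])
# np.savetxt('pose/000000.txt', pose_vec[None])
```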
Hey @ebartrum, from my study of the code, it actually only uses the intrinsics.txt file and not the intrinsics folder. The extrinsics seem to be the RT matrix flattened, but intrinsics.txt is written like below (a small writer sketch follows the layout):
```
alpha_u u_0 v_0 0.
0. 0. 0.
1.
resolution_x_in_px resolution_y_in_px
```
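As promised, a minimal sketch of writing intrinsics.txt in that four-line layout, assuming the 3x3 K matrix and the render resolution are already known (the function name and example values are mine):

```python
# Minimal sketch: write intrinsics.txt in the four-line layout above.
# alpha_u, u_0, v_0 come from the 3x3 K matrix; the resolution is assumed known.
def write_intrinsics_txt(path, K, res_x, res_y):
    alpha_u, u_0, v_0 = K[0][0], K[0][2], K[1][2]
    with open(path, 'w') as f:
        f.write(f"{alpha_u} {u_0} {v_0} 0.\n")
        f.write("0. 0. 0.\n")
        f.write("1.\n")
        f.write(f"{res_x} {res_y}\n")

# Example usage with made-up values:
# write_intrinsics_txt('intrinsics.txt', K, 128, 128)
```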
My issue is that my object is not centered in the camera's view during prediction.
Thanks @feemthan. I think I now see another problem that we are having. From the README: 'Camera poses are assumed to be in a "camera2world" format, i.e., they denote the matrix transform that transforms camera coordinates to world coordinates.' However, the matrices that we are using transform world coordinates to camera coordinates: an image pixel (u, v) is generated from world coordinates (x, y, z) through a 3x4 matrix P using projective coordinates, kx = PX.
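In other words, if the script gives a world2cam matrix built from a rotation R and translation t, the cam2world matrix the README expects is its inverse, which in closed form is:

```latex
P_{\text{world2cam}} = \begin{bmatrix} R & t \\ \mathbf{0}^{\top} & 1 \end{bmatrix}
\qquad
P_{\text{cam2world}} = P_{\text{world2cam}}^{-1}
  = \begin{bmatrix} R^{\top} & -R^{\top} t \\ \mathbf{0}^{\top} & 1 \end{bmatrix}
```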
@feemthan the results seem to be working now for me! Here's what I did: take the RT matrix from the script you linked and append [0, 0, 0, 1] so it is 4x4; this is world2cam. Now invert this matrix to get cam2world. Flatten it to a 16-vector and write it in the pose directory, one file per image. For intrinsics.txt, I followed your instructions.
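For anyone hitting the same thing, a minimal sketch of that conversion, assuming RT is the 3x4 world2cam matrix from the Blender script (the file paths are illustrative):

```python
import numpy as np

def world2cam_to_cam2world_vec(RT):
    """RT: 3x4 world->camera matrix. Returns a flat 16-vector in cam2world format."""
    world2cam = np.vstack([RT, [0., 0., 0., 1.]])   # append homogeneous row -> 4x4
    cam2world = np.linalg.inv(world2cam)            # invert to get camera->world
    return cam2world.reshape(-1)                    # flatten row-major to a 16-vector

# Example: write one pose file per rendered image (paths are just examples).
# RT = np.loadtxt('RT/000000.txt').reshape(3, 4)
# np.savetxt('pose/000000.txt', world2cam_to_cam2world_vec(RT)[None])
```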