Point triangulation solving
nisace opened this issue · 8 comments
Hi all,
I'm trying to use the triangulation method on my own dataset, but I struggle to understand how the third component of the 2D homogeneous keypoints can be ignored.
The triangulation problem aims to solve the equation AX=0, where A is constructed from the known projection matrices P and the image coordinates of the 2D keypoints x.
The matrix A is constructed from the fact that we want X such that x ^ PX = 0, where ^ denotes the cross product between two vectors.
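The construction described above can be sketched as follows. This is only a minimal illustration with NumPy, not the repository's implementation: the function name `triangulate_dlt` and the two example cameras are made up for this sketch. Each view contributes two independent rows of x ^ PX = 0, and X is taken as the right singular vector of A with the smallest singular value.

```python
import numpy as np

def triangulate_dlt(proj_matrices, points_h):
    """Hypothetical helper: solve A X = 0 built from x ^ (P X) = 0.

    proj_matrices: (n, 3, 4) projection matrices P.
    points_h:      (n, 3) homogeneous 2D points x = [u, v, w].
    Returns the 3D point [x/w, y/w, z/w].
    """
    rows = []
    for P, (u, v, w) in zip(proj_matrices, points_h):
        # First two components of the cross product x ^ (P X);
        # the third is a linear combination of these two.
        rows.append(v * P[2] - w * P[1])
        rows.append(w * P[0] - u * P[2])
    A = np.stack(rows)
    # Null vector of A (up to scale): last row of V^T from the SVD.
    _, _, vt = np.linalg.svd(A)
    X = vt[-1]
    return X[:3] / X[3]

# Two made-up cameras: identity pose and a shift along the x axis.
P1 = np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = np.hstack([np.eye(3), np.array([[-1.0], [0.0], [0.0]])])
proj_matrices = np.stack([P1, P2])

X_true = np.array([0.5, 0.2, 4.0])
points_h = proj_matrices @ np.append(X_true, 1.0)  # (2, 3), w is not 1 here

X_rec = triangulate_dlt(proj_matrices, points_h)
# Rescaling each x leaves x ^ (P X) = 0 unchanged, so the result is the same.
X_rec_scaled = triangulate_dlt(proj_matrices, points_h / points_h[:, 2:3])
```

Because the cross-product constraint is homogeneous in x, passing [u, v, w] or [u/w, v/w, 1] should give the same 3D point in this sketch.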
As far as I understand, x=[u, v, w], where u and v are the image coordinates of the 2D keypoints, but I don't understand how to get the value of w.
From the code, I understand that w is assumed to be equal to 1, but in that case u and v are only known up to a factor from the 2D keypoints. Is that correct? If so, how can I treat w as an additional unknown and solve for it alongside the 3D keypoint coordinates?
Thanks
Hi, @nisace! Thank you for the thoughtful question.
You're completely right, except that X is not [u, v, w] (a 3-dim vector). It's actually a 4-dim vector X=[x, y, z, w] (a 3D homogeneous vector).
You can refer to this function for more details about algebraic triangulation.
Thanks @karfly for your answer.
I agree that X=[x, y, z, w] is the solution of AX=0 and that we can get the 3D point coordinates as [x/w, y/w, z/w].
My question is about the 2D keypoint coordinates x (or the points argument of the function). These coordinates are of dimension 3: [u, v, w]. However, the function takes points of shape (n, 2) because it assumes that x can be built from the image coordinates u and v, with w=1.
It's the w=1 that I don't understand. Where does this assumption come from?
Thanks again for your quick response.
Yes, I saw that, but the question then is: why would the pixel coordinates of the 2D keypoints on the images correspond to a w value of 1?
I ran an experiment where I simulate multiple images by projecting a 3D point with different matrices, which gives me a set of "2D" points of dimension 3. Then I give the matrices and the 2D points as input to your function to see if I can reconstruct the 3D point.
If I understand correctly, the first two components of the vector given by the projection are the pixel coordinates. The third component can be ignored (see https://www.khronos.org/registry/OpenGL-Refpages/gl2.1/xhtml/gluProject.xml). Thus I thought your function would expect the first two components as inputs. However the result is wrong if I do so.
I need to divide the first two components of the 2D points by the third component in order to get the correct result.
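The experiment described above can be reproduced along these lines. This is a hedged sketch only: `triangulate_w1` is a hypothetical stand-in for a triangulation routine that takes (n, 2) points and assumes w=1, not the repository's actual function, and the intrinsics, camera poses, and 3D point are invented for illustration.

```python
import numpy as np

def triangulate_w1(proj_matrices, points_2d):
    """Hypothetical stand-in: DLT triangulation that assumes x = [u, v, 1]."""
    rows = []
    for P, (u, v) in zip(proj_matrices, points_2d):
        rows.append(u * P[2] - P[0])
        rows.append(v * P[2] - P[1])
    _, _, vt = np.linalg.svd(np.stack(rows))
    X = vt[-1]
    return X[:3] / X[3]

# Made-up intrinsics and three cameras translated along the x axis.
K = np.array([[800.0, 0.0, 320.0],
              [0.0, 800.0, 240.0],
              [0.0, 0.0, 1.0]])

def camera(tx):
    Rt = np.hstack([np.eye(3), np.array([[tx], [0.0], [0.0]])])
    return K @ Rt

proj_matrices = np.stack([camera(0.0), camera(-0.5), camera(0.5)])

X_true = np.array([0.3, -0.2, 5.0])
projected = proj_matrices @ np.append(X_true, 1.0)  # (3, 3): rows [u*w, v*w, w]

# Dividing by the third component gives pixel coordinates with w = 1.
pixels = projected[:, :2] / projected[:, 2:3]
X_divided = triangulate_w1(proj_matrices, pixels)

# Dropping the third component without dividing breaks the w = 1 assumption.
X_raw = triangulate_w1(proj_matrices, projected[:, :2])
```

In this sketch only the points divided by their third component recover the original 3D point, which matches the observation above.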
Let's look at the equation x = PX. Here X is a 3D homogeneous point X=[x,y,z,1]. The element P[-1][-1] always equals 1 => the last component of x is always 1 (x=[u,v,w]=[u,v,1]).
If you want to use some w not equal to 1 in the triangulation method above, then you should also change the w in the 3D point X.
Why is P[-1][-1] equal to 1? I understand that P is constructed as follows (from http://ksimek.github.io/2013/08/13/intrinsic/) and your code seems to follow this convention.
And even P[-1][-1] = 1 does not imply w = 1; we need P[-1] = [0, 0, 0, 1] for that.
Or maybe you have some assumptions on the matrix [ R | t ]?
Thanks