hbb1/2d-gaussian-splatting

Questions About Function `compute_aabb()`

xbillowy opened this issue · 4 comments

Thanks for open sourcing such great work!

I have some questions about how the `compute_aabb()` function works. If I understand correctly, `T = glm::transpose(splat2world) * world2ndc * ndc2pix;`, computed in the `compute_transmat()` function, should be the transformation matrix from the 2D Gaussian tangent plane to the 2D image plane. Hence, the `T3` extracted in `compute_aabb()` should be the projection of the center of the Gaussian onto the 2D image plane, which I suppose would be `point_image`? I'm somewhat confused about the computational logic within `compute_aabb()`. Is there an underlying mathematical derivation that I might have missed?

Specifically, I do not quite understand what the variable `float3 temp_point = {1.0f, 1.0f, -1.0f};` is for. In which coordinate system is it defined, and does its value `[1.0, 1.0, -1.0]` have any special meaning? What does the `distance` derived from it denote, and why is it calculated in this manner? Furthermore, the calculations of `point_image` and `extent` are both somewhat baffling to me.

hbb1 commented

Hi, the variable names are indeed somewhat confusing, and I believe the mathematical derivation is much clearer. Please see here for an explanation. You can also check the accuracy of the bbox in our Python script.

I'm not quite sure about this step in the derivation: $\frac{Ax+By+Cz+D}{\sqrt{A^2+B^2+C^2}}=\frac{h_3}{\sqrt{(h_0)^2+(h_1)^2}}=1$. What do $h_3$, $h_0$, and $h_1$ represent, and why does $\sqrt{A^2+B^2+C^2}$ become $\sqrt{(h_0)^2+(h_1)^2}$?

hbb1 commented

This is the point-to-plane distance. The third entry of the plane is always zero, so the $C^2$ term vanishes.

So $h_0, h_1, h_2, h_3$ are the entries of the transformed plane (now expressed in the Gaussian tangent plane, originally defined in the image plane). Since the third entry of the original x- or y-plane is always 0, it remains 0 after the transformation.
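
For anyone else reading along, here is a minimal sketch of that reduction in plain Python. It is not code from the repository. It assumes the distance is measured from the origin of the tangent plane (taken here to be the Gaussian center, which is an assumption from this thread and not confirmed by the authors), and the plane entries are made-up numbers. It also takes the absolute value of the numerator, which the formula above leaves out.

```python
import math


def point_plane_distance(plane, point=(0.0, 0.0, 0.0)):
    """Distance from `point` to the plane A*x + B*y + C*z + D = 0."""
    a, b, c, d = plane
    x, y, z = point
    norm = math.sqrt(a * a + b * b + c * c)
    if norm == 0.0:
        raise ValueError("plane normal must be non-zero")
    return abs(a * x + b * y + c * z + d) / norm


# Hypothetical transformed plane (h0, h1, h2, h3); h2 is zero as discussed above.
h = (3.0, 4.0, 0.0, 5.0)

# General formula, evaluated at the origin (assumed to be the Gaussian center).
full = point_plane_distance(h)

# Reduced form: with h2 == 0 and the point at the origin, only h3 survives in
# the numerator and only h0, h1 survive in the denominator.
reduced = abs(h[3]) / math.hypot(h[0], h[1])

print(full, reduced)
```

Both expressions give the same value for this plane, and that value is 1, which is the condition the derivation sets the distance equal to.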