Question about Semantic-guided Spatial-temporal Consistency Regularization
fang196 opened this issue · 0 comments
Thanks for the great work!
I have three questions about Semantic-guided Spatial-temporal Consistency Regularization.
- What is the reason for dividing the complete stitched point cloud into regular grids rather than using short-term temporality directly?
- What does the symbol `*` represent in Equation 3? Does it denote a cross product?
- It is stated that the image is matched to the first frame of the point cloud $P_1$ using pixel-point correspondences $\{\hat{x}_i^1, \hat{p}_i^1\}_{i=1}^{\hat{M}}$. This implies that for $k$ ranging from 1 to $K$, we have $t_{\hat{i}}^k = t_{\hat{i}}^1$ and $\hat{x}_{\hat{i}}^k = \hat{x}_{\hat{i}}^1$. However, in Equation 4, the text embeddings are denoted as $t_{\hat{i}}^1$, while the image embeddings are denoted as $\hat{x}_{\hat{i}}^{\hat{k}}$. Why is this the case?