POSTECH-CVLab/point-transformer

A few queries regarding the code in `pointops.py`

JeS24 opened this issue · 0 comments

JeS24 commented

I went through your code and have a few questions about some segments. I have compiled them below and would really appreciate it if you could clarify them:

  1. In the queryandgroup() function in pointops.py, there are these lines (93 - 94):

        grouped_xyz = xyz[idx.view(-1).long(), :].view(m, nsample, 3) # (m, nsample, 3)
        grouped_xyz -= new_xyz.unsqueeze(1) # (m, nsample, 3)

My question is: why is the second line, which takes the difference, necessary? As I understand it, grouped_xyz comes from xyz, the point set of the current layer / stage, and new_xyz is the point set for the next layer / stage. grouped_xyz is only returned when the use_xyz parameter of queryandgroup() is set to True, which is the case in one call in PointTransformerLayer and another in TransitionDown. In PointTransformerLayer, the result represents displacement vectors between each point and its neighbours. In TransitionDown, however, it represents displacement vectors between the points obtained via Furthest Point Sampling (in new_xyz) and their neighbours obtained via kNN (in xyz[idx...], and hence grouped_xyz). I am unable to understand why the latter operation is performed. What is its significance?
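To make the question concrete, here is a minimal NumPy sketch of how I read those two lines. It is not the repository's code: `group_relative` and `knn_idx` are hypothetical helpers, the brute-force kNN stands in for the CUDA op, and picking three rows of `xyz` stands in for Furthest Point Sampling.

```python
import numpy as np

def knn_idx(xyz, new_xyz, nsample):
    """Brute-force kNN (assumed stand-in for the CUDA kNN): (m, nsample) indices into xyz."""
    d2 = ((new_xyz[:, None, :] - xyz[None, :, :]) ** 2).sum(axis=-1)  # (m, n)
    return np.argsort(d2, axis=1)[:, :nsample]

def group_relative(xyz, new_xyz, idx):
    """Gather the neighbours of each query point and express them relative to it."""
    m, nsample = idx.shape
    # Same gather as line 93: absolute coordinates of the neighbours.
    grouped_xyz = xyz[idx.reshape(-1), :].reshape(m, nsample, 3)
    # Same subtraction as line 94: offsets from each query point.
    return grouped_xyz - new_xyz[:, None, :]

rng = np.random.default_rng(0)
xyz = rng.random((16, 3))        # points of the current stage
new_xyz = xyz[[0, 5, 9]]         # stand-in for the FPS-sampled subset
idx = knn_idx(xyz, new_xyz, nsample=4)
rel = group_relative(xyz, new_xyz, idx)

# Shift the whole cloud and regroup with the same neighbour indices.
shift = np.array([10.0, -3.0, 2.5])
rel_shifted = group_relative(xyz + shift, new_xyz + shift, idx)
```

In this sketch, `rel` and `rel_shifted` agree up to floating-point error, so the subtraction makes the grouped coordinates independent of where the cloud sits in space. Whether that is the intended reason for the subtraction in TransitionDown is exactly what I am asking.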

  2. As I understand it, the interpolation() function in pointops.py works as follows:
    • For each point in new_xyz, which has n (> m) points, 3 neighbours are found using kNN in xyz, which has m points.
    • Then, the inverse of each distance is taken and normalized over the 3 neighbours to get the weight.
    • Finally, in lines 176 - 177, the features of each neighbour are multiplied by the corresponding weight and summed over the neighbours to get new_feat.
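The three steps above can be sketched in NumPy as follows. This is my own reconstruction of the summary, not the code from pointops.py: `interpolate_idw` is a hypothetical name, the kNN is brute force, and the small `eps` added to the distances to avoid division by zero is an assumption on my part.

```python
import numpy as np

def interpolate_idw(xyz, new_xyz, feat, k=3, eps=1e-8):
    """Inverse-distance-weighted interpolation of features.

    xyz:     (m, 3) coarse points that carry features
    new_xyz: (n, 3) dense points that should receive features
    feat:    (m, c) features at xyz
    returns: (n, c) interpolated features at new_xyz
    """
    # Step 1: k nearest neighbours of each new point among the coarse points.
    dist_all = np.linalg.norm(new_xyz[:, None, :] - xyz[None, :, :], axis=-1)  # (n, m)
    idx = np.argsort(dist_all, axis=1)[:, :k]                                  # (n, k)
    dist = np.take_along_axis(dist_all, idx, axis=1)                           # (n, k)

    # Step 2: inverse distances, normalized so each row of weights sums to 1.
    dist_recip = 1.0 / (dist + eps)  # eps is an assumed guard against zero distance
    weight = dist_recip / dist_recip.sum(axis=1, keepdims=True)                # (n, k)

    # Step 3: weighted sum of the neighbours' features.
    return (feat[idx] * weight[:, :, None]).sum(axis=1)                        # (n, c)

rng = np.random.default_rng(0)
xyz = rng.random((5, 3))
feat = rng.random((5, 4))
new_xyz = rng.random((12, 3))
new_feat = interpolate_idw(xyz, new_xyz, feat)

# A new point that coincides with a coarse point should recover its feature.
same_point_feat = interpolate_idw(xyz, xyz[2:3], feat)
# Constant input features should stay constant after interpolation.
const_feat = interpolate_idw(xyz, new_xyz, np.ones((5, 4)))
```

Because the weights in each row sum to 1, every output feature is a convex combination of its neighbours' features, and a new point that sits exactly on a coarse point takes (almost exactly) that point's feature.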

I would like to know whether this is a correct summary of the function, or whether I am missing or misunderstanding something. Also, why was k set to 3 neighbours here? Is it an arbitrary choice? Thanks and regards.