slothfulxtx/cxtrack3d

Runtime Error

StrivedTye opened this issue · 5 comments

Hi,
Thanks for your great work!
I ran into a problem when reproducing the project on my own PC.
I replaced pytorch3d's knn_points and knn_gather with pointnet2_utils.knn_point and grouping_operation.
Thanks!

Hi, thanks for your interest. My suggestion is to use pytorch3d instead of pointnet2_utils😀. Compared with pointnet-ops, pytorch3d has broader, longer-term support. For the in-place modification problem, I think these blogs 1 2 are helpful. Could you please provide more information about which line raises the error?
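One way to find the offending line, as a minimal sketch: PyTorch's anomaly detection mode adds the forward-pass traceback of the failing operation to the error output. The toy tensors below are illustrative assumptions and are not taken from this repository; they only reproduce the general class of error.

```python
import torch

message = ""
# Anomaly mode records where each forward op was called, so a failing
# backward pass can point back to the responsible forward line.
with torch.autograd.detect_anomaly():
    x = torch.randn(3, requires_grad=True)
    y = x.exp()   # exp keeps its output for the backward pass
    y.add_(1)     # in-place edit invalidates that saved output
    try:
        y.sum().backward()
    except RuntimeError as err:
        message = str(err)

print(message)
```

In a real training script the same context manager can wrap a single training step. It slows training down, so it is usually switched off again once the line is found.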

I modified the code in backbone.py and only changed the get_graph_feature function. The error occurs only in the training phase; testing runs correctly after loading your pre-trained weights.
The modified code is as follows:

    def get_graph_feature(self, new_xyz, new_feat, xyz, feat, k, use_xyz=False):
        bs = xyz.size(0)
        device = torch.device('cuda')
        feat = feat.permute(0, 2, 1).contiguous() if feat is not None else None
        new_feat = new_feat.permute(0, 2, 1).contiguous() if new_feat is not None else None
        if use_xyz:
            feat = torch.cat([feat, xyz], dim=-1) if feat is not None else xyz
            new_feat = torch.cat([new_feat, new_xyz], dim=-1) if new_feat is not None else new_xyz # b, n, c
        
        # Authors with pytorch3d
        # _, knn_idx, _ = pytorch3d.ops.knn_points(new_xyz, xyz, K=k, return_nn=True)
        # knn_feat = pytorch3d.ops.knn_gather(feat, knn_idx)  # b,n1,k,c
        # feat_tiled = new_feat.unsqueeze(-2).repeat(1, 1, k, 1)
        # edge_feat = torch.cat([knn_feat-feat_tiled, feat_tiled], dim=-1)
        # return edge_feat.permute(0, 3, 1, 2).contiguous()

        # tye with pointnet2_ops
        knn_idx = pointnet2_utils.knn_point(k, new_xyz, xyz)  # (B, npoint, k)
        knn_feat = pointnet2_utils.grouping_operation(feat.permute(0, 2, 1).contiguous(), knn_idx) #[b, c, n1, k]
        feat_tiled = new_feat.unsqueeze(-2).repeat(1, 1, k, 1).permute(0, 3, 1, 2).contiguous()
        edge_feat = torch.cat([knn_feat-feat_tiled, feat_tiled], dim=1)
        return edge_feat

I don't think pointnet2_ops is the cause, because the error still occurs during training when I use a PointNet++ backbone instead of DGCNN.
In addition, I found what looks like an in-place operation at Line 107 of cxtrack_task.py, but I am not sure whether this line causes the runtime error. The slice refined_bboxes[:, :, :4] shown below seems to be an in-place operation.
Thanks!! Looking forward to further discussion.

        loss_refined = loss_func(refined_bboxes[:, :, :4], search_bbox_gt[:, None, :4].expand_as(
            refined_bboxes[:, :, :4]), reduction='none')
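As a quick check of that hypothesis, here is a minimal sketch with made-up shapes. The use of `F.smooth_l1_loss` for `loss_func` is an assumption for illustration only. Reading a slice creates a view and does not write to the tensor, so on its own it should not trigger an in-place error.

```python
import torch
import torch.nn.functional as F

# Hypothetical shapes: batch of 2, 3 proposals, 7 box parameters.
refined_bboxes = torch.randn(2, 3, 7, requires_grad=True)
search_bbox_gt = torch.randn(2, 7)

# Slicing for reading returns a view; nothing is modified in place.
pred = refined_bboxes[:, :, :4]
target = search_bbox_gt[:, None, :4].expand_as(pred)

# smooth_l1_loss is a stand-in for the repository's loss_func (assumption).
loss_refined = F.smooth_l1_loss(pred, target, reduction='none')
loss_refined.mean().backward()

# Gradients reach only the first four box parameters.
print(refined_bboxes.grad[..., :4].abs().sum() > 0)
print(refined_bboxes.grad[..., 4:].abs().sum() == 0)
```

The backward pass completes without error here, which suggests looking elsewhere for the in-place write.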

You should change dropout(inplace=True) to dropout(inplace=False) in the transformer layer.
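If editing the layer definitions by hand is inconvenient, the same change can be applied at runtime. This is a minimal sketch: `disable_inplace_dropout` and `toy_block` are hypothetical names, and the small `nn.Sequential` only stands in for a transformer feed-forward block, not the actual layer in this repository.

```python
import torch
import torch.nn as nn

def disable_inplace_dropout(model: nn.Module) -> int:
    """Set inplace=False on every nn.Dropout in `model`; return how many changed."""
    changed = 0
    for module in model.modules():
        if isinstance(module, nn.Dropout) and module.inplace:
            module.inplace = False
            changed += 1
    return changed

# Hypothetical stand-in for a transformer feed-forward block.
toy_block = nn.Sequential(
    nn.Linear(8, 16),
    nn.ReLU(),
    nn.Dropout(p=0.1, inplace=True),
    nn.Linear(16, 8),
)
num_changed = disable_inplace_dropout(toy_block)

# With out-of-place dropout the backward pass runs normally.
x = torch.randn(4, 8, requires_grad=True)
toy_block(x).sum().backward()
print(num_changed, x.grad.shape)
```

Only `nn.Dropout` modules are touched; dropout called through `torch.nn.functional.dropout(..., inplace=True)` would still need to be changed in the source.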

@AlexWang1900 Thank you very much!!