Offset loss and angle loss are all zero
Closed this issue · 14 comments
Hi, when I run training, all losses are zero except vote_loss and cls_loss. Is there a setting I need to tweak in the config?
@zye1996 I have the same issue.
While some of the losses are being calculated, pmask is all zero.
The problem may be in pmask: the combined count of ones in pmask and nmask is not correct.
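To illustrate why an all-zero pmask would make some losses vanish, here is a minimal sketch. It assumes the regression-style losses (offset, angle) are averaged over positive points only; the function name `masked_regression_loss` and the tensor shapes are hypothetical and are not taken from the repo.

```python
import torch
import torch.nn.functional as F


def masked_regression_loss(pred, target, pmask):
    """Smooth-L1 loss averaged over positive points only.

    pred, target: (B, N, C) float tensors (hypothetical shapes).
    pmask:        (B, N) bool tensor, True for positive points.
    """
    # Per-point loss, summed over the C regression channels -> (B, N)
    per_point = F.smooth_l1_loss(pred, target, reduction="none").sum(dim=-1)
    weights = pmask.float()
    # Clamp so an empty mask does not cause a division by zero
    num_pos = weights.sum().clamp(min=1.0)
    return (per_point * weights).sum() / num_pos


pred = torch.ones(2, 4, 3)
target = torch.zeros(2, 4, 3)

empty_mask = torch.zeros(2, 4, dtype=torch.bool)  # no positives at all
one_pos = empty_mask.clone()
one_pos[0, 1] = True                              # a single positive point

loss_empty = masked_regression_loss(pred, target, empty_mask).item()
loss_one = masked_regression_loss(pred, target, one_pos).item()
print(loss_empty, loss_one)
```

Under this assumption, every loss that is weighted by pmask comes out as exactly zero when the mask is empty, while losses that do not depend on it still produce values. That would be consistent with the symptom described above.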
I am not sure whether this is due to a bug in labeling or because all sampled points fall outside the objects. I am still looking for a solution and will keep you updated. Please also let me know if you find any clues.
Some progress here: changing line 244 in target_assigner.py to
pmask = torch.logical_and(pmask.unsqueeze(-1), dist_mask)
recovered the losses.
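For anyone checking this change, here is a small sketch of how the broadcast in that line behaves, together with a sanity check on the number of positives. The shapes are assumptions for illustration: pmask as (B, N) and dist_mask as (B, N, 1). The actual shapes in target_assigner.py may differ.

```python
import torch

# Hypothetical shapes: B=2 samples, N=4 candidate points.
pmask = torch.tensor([[True, False, True, False],
                      [False, False, True, True]])            # (B, N)
dist_mask = torch.tensor([[[True], [True], [False], [True]],
                          [[True], [False], [True], [False]]])  # (B, N, 1)

# unsqueeze(-1) turns pmask into (B, N, 1) so it lines up with dist_mask.
combined = torch.logical_and(pmask.unsqueeze(-1), dist_mask)

# Sanity check: if this is 0, every pmask-weighted loss will be 0 as well.
num_pos = int(combined.sum().item())
print(tuple(combined.shape), num_pos)
```

Printing the positive count once per batch during training is a cheap way to confirm whether the mask is still empty after the change.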
After the model is trained, the evaluation results are not as expected:
Car AP@0.70, 0.70, 0.70:
bbox AP:10.6542, 10.0903, 10.0903
bev AP:9.9624, 10.7949, 10.7949
3d AP:3.4869, 9.0909, 9.0909
aos AP:10.65, 10.09, 10.09
Car AP_R40@0.70, 0.70, 0.70:
bbox AP:5.2180, 4.1630, 4.1630
bev AP:6.5877, 4.1919, 4.1919
3d AP:1.1803, 0.8444, 0.8444
aos AP:5.22, 4.16, 4.16
Car AP@0.70, 0.50, 0.50:
bbox AP:10.6542, 10.0903, 10.0903
bev AP:18.2366, 14.7968, 14.7968
3d AP:10.8402, 10.2320, 10.2320
aos AP:10.65, 10.09, 10.09
Car AP_R40@0.70, 0.50, 0.50:
bbox AP:5.2180, 4.1630, 4.1630
bev AP:13.5625, 9.3519, 9.3519
3d AP:6.4555, 5.5059, 5.5059
aos AP:5.22, 4.16, 4.16
Hi, do you know the reason for these low AP values?
There are plenty of bugs in the repo, and I am trying to find the cause. One bug I have found so far is here, where the last function should be:
int furthest_point_sampling_with_dist_wrapper(int b, int n, int m,
    at::Tensor points_tensor, at::Tensor temp_tensor, at::Tensor idx_tensor) {
    const float *points = points_tensor.data<float>();
    float *temp = temp_tensor.data<float>();
    int *idx = idx_tensor.data<int>();
    cudaStream_t stream = THCState_getCurrentStream(state);
    // Call the "with_dist" launcher here
    furthest_point_sampling_kernel_with_dist_launcher(b, n, m, points, temp, idx, stream);
    return 1;
}
My understanding is that the repo is pretty much unfinished, so be careful.
OK, thanks. I'll check the original implementation of 3DSSD and try to find out what's wrong.
Many thanks if you can let me know what you find out about the rest. After I fixed the bug I mentioned above, the loss is unstable and keeps fluctuating.
Made some progress. I will post all updates here, since a pull request may not make sense if the author is not responding.
I reimplemented 3DSSD with TensorFlow v2 here.
It might be easier to debug than the TensorFlow v1 version.
I cannot thank @shuto-keio enough😂
This is what I am getting right now; I am still trying to fix some small problems:
Car AP@0.70, 0.70, 0.70:
bbox AP:88.3067, 87.4015, 87.4015
bev AP:87.1555, 83.6703, 83.6703
3d AP:81.8364, 75.6352, 75.6352
aos AP:88.31, 87.40, 87.40
Car AP_R40@0.70, 0.70, 0.70:
bbox AP:92.6265, 89.2507, 89.2507
bev AP:89.0312, 84.9429, 84.9429
3d AP:83.2703, 74.8559, 74.8559
aos AP:92.63, 89.25, 89.25
Car AP@0.70, 0.50, 0.50:
bbox AP:88.3067, 87.4015, 87.4015
bev AP:88.4671, 87.8447, 87.8447
3d AP:88.4409, 87.7542, 87.7542
aos AP:88.31, 87.40, 87.40
Car AP_R40@0.70, 0.50, 0.50:
bbox AP:92.6265, 89.2507, 89.2507
bev AP:92.9291, 91.6493, 91.6493
3d AP:92.8714, 91.3892, 91.3892
aos AP:92.63, 89.25, 89.25