linxuewu/Sparse4D

NMS implementation for multiple false positives

samueleruffino99 opened this issue · 30 comments

Hello, really nice work!
I have tested your demo, and I have seen that it is really dependent on the confidence threshold.
Would you suggest also implementing NMS? Or is it already handled in some way during tracking?
For example, in this image you have multiple predictions for the same pedestrian.
Screenshot 2024-03-18 174908

Is there a threshold when visualizing? I suggest setting a threshold of score >= 0.3. The repeated boxes all have low scores.
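
For illustration, a minimal sketch of score-based filtering before visualization. The `boxes_3d` key appears later in this thread; `scores_3d` is an assumption following the common mmdet3d-style result dict, so adjust to your pipeline:

# Minimal sketch: keep only confident predictions before drawing.
# Key names follow the common mmdet3d-style result dict; adjust as needed.
score_thresh = 0.3
bbox_res = results[0]["img_bbox"]
keep = bbox_res["scores_3d"] >= score_thresh
boxes_to_draw = bbox_res["boxes_3d"][keep]
scores_to_draw = bbox_res["scores_3d"][keep]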

This is actually the visual for threshold=0.35, so I was wondering how it is handled during tracking.

Please refer to the Sparse4D v3 paper; we don't need to handle duplicate boxes.

Additionally, I haven't observed such severe duplicate-box issues; significant duplication indicates that the detection model is not well trained.
image

Btw, running the model I get these results, where especially the rotations are wrong (I think the closer the object is, the bigger the error):
Screenshot 2024-03-19 093817
Is it correct like that, or am I doing something wrong?

Yes, I have simply cloned the repo and I am using that code (uncommenting out_dir="./vis" in the config).

Is the mmdet3d version >= 1.0.0?

I have mmdet==2.28.2 as in requirement.txt. Do you get more accurate visualizations on your end?

Which version of the code are you using?

If you're using the v2 code, please note that the required version of mmdet3d is >=1.0.0rc0.

If you're using the v3 code, please proceed to the new repository.
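
To confirm which versions are active, a quick check along these lines (a sketch; mmdet3d may legitimately be absent in a v3 environment):

# Print the versions that matter here; mmdet3d is optional for the v3 code.
import mmcv, mmdet
print("mmcv:", mmcv.__version__)
print("mmdet:", mmdet.__version__)
try:
    import mmdet3d
    print("mmdet3d:", mmdet3d.__version__)
except ImportError:
    print("mmdet3d not installed (expected for the v3 repo)")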

The visualization on my end is all correct.

Actually, I am using v3 on the new repository (I wrongly opened an issue here). But I am facing these problems in visualization/detection. Have you run visualization on the new repository as well?
Should I open an issue on that repo?

image
I just tried it, and it worked correctly

Is it with the new repo? Another strange thing is that I am getting a lot of different colors in the visualizations. Maybe I have older packages or something.

Yes, new repo

The different colors are for tracking visualizations.

image

Ok, makes sense, so it definitely works for you.
This is how I set up my env:

sparse4d_path="path/to/sparse4d"
cd ${sparse4d_path}
conda install pytorch==1.8.0 torchvision==0.9.0 torchaudio==0.8.0 cudatoolkit=11.1 -c pytorch -c conda-forge
pip3 install --upgrade pip
pip3 install -r requirement.txt

I had to use an older version of torch because of CUDA.
Then I am using the sparse4dv3_r50.pth checkpoint as the model (Sparse4Dv3 in the GitHub table).
Might it be a problem with the anchors?
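
For reference, this is how I check the torch/CUDA pairing (standard torch attributes):

# Confirm the installed torch build and the CUDA runtime it was compiled against.
import torch
print("torch:", torch.__version__)
print("built for CUDA:", torch.version.cuda)
print("CUDA available:", torch.cuda.is_available())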

It's difficult to find where the problem lies. Please first align the package versions.

By the way, what evaluation metric did you use for testing?

Do you mean torch/torchvision or mmdet3d? Because I cannot upgrade torch due to CUDA limitations.
This is the bash file I am launching (--eval bbox):

export PYTHONPATH=$PYTHONPATH:./
export CUDA_VISIBLE_DEVICES=0
export PORT=29532

# split the visible devices into an array to count the GPUs
gpus=(${CUDA_VISIBLE_DEVICES//,/ })
gpu_num=${#gpus[@]}

config=projects/configs/$1.py
checkpoint=$2

echo "number of gpus: "${gpu_num}
echo "config file: "${config}
echo "checkpoint: "${checkpoint}

if [ ${gpu_num} -gt 0 ]; then
    echo "CUDA GPU is available."
else
    echo "CUDA GPU is not available."
fi

# multi-GPU: distributed test; single GPU: plain test.py
# "${@:3}" forwards only the extra args (plain $@ would repeat $1 and $2)
if [ ${gpu_num} -gt 1 ]
then
    bash ./tools/dist_test.sh \
        ${config} \
        ${checkpoint} \
        ${gpu_num} \
        --eval bbox \
        "${@:3}"
else
    python ./tools/test.py \
        ${config} \
        ${checkpoint} \
        --eval bbox \
        "${@:3}"
fi

I have rerun everything after pulling the latest repo, but I still have all wrong rotations. I really do not know what it might be.
But I have seen that there is no specification for mmdet3d in the new repo because you have mmdet3d_plugin; might that be the issue?

Sparse4D v3 does not rely on mmdet3d.

How about the validation results?

I have seen that I am receiving this message:

/home/samuele/anaconda3/envs/mmsparse4d/lib/python3.8/site-packages/mmdet/models/backbones/resnet.py:401: UserWarning: DeprecationWarning: pretrained is deprecated, please use "init_cfg" instead
  warnings.warn('DeprecationWarning: pretrained is deprecated, '

Maybe I should change the mmdet version?

Here are the results instead:

### Final results ###

Per-class results:
                AMOTA   AMOTP   RECALL  MOTAR   GT      MOTA    MOTP    MT      ML      FAF     TP      FP      FN      IDS     FRAG    TID     LGD
bicycle         0.278   1.132   0.244   1.000   41      0.244   0.306   1       5       0.0     10      0       31      0       0       1.00    1.00
bus             1.000   0.912   1.000   1.000   33      1.000   0.912   1       0       0.0     33      0       0       0       0       0.00    0.00
car             0.670   0.728   0.767   0.811   2188    0.620   0.489   58      23      391.4   1674    317     510     4       9       1.68    1.85
motorcy         0.617   1.059   0.652   0.944   224     0.603   0.714   3       0       16.3    143     8       78      3       2       3.05    3.25
pedestr         0.623   0.901   0.692   0.832   1088    0.563   0.642   32      18      187.9   737     124     335     16      7       1.02    1.52
trailer         nan     nan     nan     nan     nan     nan     nan     nan     nan     nan     nan     nan     nan     nan     nan     nan     nan
truck           0.613   0.930   0.684   1.000   95      0.684   0.344   2       2       0.0     65      0       30      0       0       1.50    1.50

Aggregated results:
AMOTA   0.633
AMOTP   0.944
RECALL  0.673
MOTAR   0.931
GT      611
MOTA    0.619
MOTP    0.568
MT      97
ML      48
FAF     99.3
TP      2662
FP      449
FN      984
IDS     23
FRAG    18
TID     1.37
LGD     1.52
Eval time: 51.9s

Here instead are the mmdet and mmcv versions:

mmcv-full                 1.7.1                    pypi_0    pypi
mmdet                     2.28.2                   pypi_0    pypi

How about the detection validation results? Especially the mAOE metric.

I am not sure whether these are the correct metrics:

Saving metrics to: /tmp/tmpd3xqvrad/results/img_bbox
mAP: 0.4219
mATE: 0.6111
mASE: 0.4501
mAOE: 0.8322
mAVE: 0.4943
mAAE: 0.2908
NDS: 0.4431
Eval time: 4.5s

Per-class results:
Object Class    AP      ATE     ASE     AOE     AVE     AAE
car     0.717   0.360   0.159   0.667   0.118   0.075
truck   0.582   0.243   0.144   0.720   0.043   0.000
bus     0.596   0.857   0.130   0.392   1.025   0.058
trailer 0.000   1.000   1.000   1.000   1.000   1.000
construction_vehicle    0.000   1.000   1.000   1.000   1.000   1.000
pedestrian      0.638   0.492   0.251   0.631   0.292   0.193
motorcycle      0.577   0.578   0.273   1.393   0.058   0.000
bicycle 0.306   0.460   0.198   0.686   0.418   0.000
traffic_cone    0.804   0.121   0.348   nan     nan     nan
barrier 0.000   1.000   1.000   1.000   nan     nan

Moreover, here is the result of results[0]['img_bbox']['boxes_3d'][0] inside single_gpu_test on v1.0-mini:

tensor([  0.1711, -20.7155,  -1.9648,   4.7794,   1.9362,   1.5364,   1.5752,
         -0.0595,   7.6563,  -0.0904])

All other metrics are normal; only the yaw is incorrect. This issue occurred previously due to an incorrect version of mmdet3d, but the latest repo no longer relies on mmdet3d.
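
If it helps isolate the problem, a hedged way to eyeball the yaw channel. Index 6 as yaw assumes the nuScenes-style [x, y, z, w, l, h, yaw, ...] layout, which is an assumption here, so verify it against the decoder:

# Compare predicted vs. ground-truth yaw on the same sample.
# Index 6 as the yaw channel is an assumption; check the decoder's layout.
pred_yaw = results[0]["img_bbox"]["boxes_3d"][:, 6]
gt_yaw = data["gt_bboxes_3d"][0][:, 6]
print("pred yaw (first 5):", pred_yaw[:5])
print("gt yaw (first 5):", gt_yaw[:5])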

This is too strange... How about you visualize the ground truth to see if it's normal?

Please ensure that the versions of other packages are consistent with requirement.txt, such as numpy/pyquaternion/torch.

torch==1.13.0
numpy==1.23.5
mmcv_full==1.7.1
mmdet==2.28.2
pyquaternion==0.9.9

I have torch==1.8.0 because I have support for CUDA 11.1 only, and I have issues with CUDA 11.3 and CUDA 11.6. The other packages have the correct versions.
So you are saying that the predictions are wrong (yaw only)?
I had computed the anchors using nuscenes-mini; now I am using the train split, maybe it was that.
Also, computing the anchors with the train split I am getting this vis, which is not the same as in your tutorial:
Screenshot 2024-03-19 164046
And here my data["projection_mat"][0][0] is:

tensor([[ 5.4692e+02,  3.6989e+02,  1.4416e+01, -1.5591e+02],
        [-6.3702e+00,  9.6405e+01, -5.4680e+02, -2.2414e+02],
        [-1.1703e-02,  9.9847e-01,  5.4022e-02, -4.2520e-01],
        [ 0.0000e+00,  0.0000e+00,  0.0000e+00,  1.0000e+00]], device='cuda:0')
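
As a sanity check on my side, I project a 3D point through this matrix by hand (standard homogeneous projection; the sample coordinates are made up, and which frame the point lives in follows the repo's convention):

# Project a homogeneous 3D point into the image through the 4x4 matrix above.
import torch

P = data["projection_mat"][0][0]            # 4x4 matrix for camera 0
pt = torch.tensor([2.0, 10.0, 0.0, 1.0], device=P.device)
uvw = P @ pt
u, v = uvw[0] / uvw[2], uvw[1] / uvw[2]     # perspective divide
print("pixel:", u.item(), v.item())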

And also here you can see that in the tutorial I am not getting the same bboxes (with the same threshold = 0.35):
Screenshot 2024-03-19 163912

I have also printed the GT (in tutorial.ipynb) like this:

import matplotlib.pyplot as plt

plt.figure(figsize=(20, 10))
gt_bboxes = data["gt_bboxes_3d"][0]
num_gt = gt_bboxes.shape[0]
# draw the GT boxes in green; the appended red entries are only used
# if a second set of boxes is concatenated for comparison
img = draw_lidar_bbox3d(
    gt_bboxes,
    raw_imgs,
    data["projection_mat"][0],
    color=[(0, 255, 0)] * num_gt + [(255, 0, 0)] * num_gt,
)
plt.imshow(img)

output
I really do not know what the problem might be; it seems impossible to have the same models and dataset but different results.

I might know the reason. I updated the code a few days ago, but the old checkpoint doesn't match the new code. Try commenting out this line of code.
https://github.com/HorizonRobotics/Sparse4D/blob/main/projects/mmdet3d_plugin/models/detection3d/detection3d_blocks.py#L297

Yes, it worked! Thank you very much for your help, really appreciated!!
Screenshot 2024-03-20 082433
Actually, in the tutorial they are still different. I think you also changed the way you compute the anchors? here
Moreover, maybe due to the anchors, I am not getting the same detections in the tutorial:
Screenshot 2024-03-20 082928
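
For anyone landing here: by "computing the anchors" I mean roughly a K-means over the GT boxes of the chosen split, along these lines (a sketch with hypothetical file names; the repo's own anchor-generation script is authoritative):

# Rough sketch of anchor generation: cluster GT boxes from the train split.
# File names are hypothetical; use the repo's script for the real thing.
import numpy as np
from sklearn.cluster import KMeans

gt_boxes = np.load("nuscenes_train_gt_boxes.npy")   # (N, D) dumped GT boxes
kmeans = KMeans(n_clusters=900, n_init=10).fit(gt_boxes)
np.save("nuscenes_anchors_900.npy", kmeans.cluster_centers_)

Using mini vs. the full train split changes these cluster centers, which would explain why detections differ from the tutorial's.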