Lower results when evaluating released BEVDet checkpoint
Hello, I tried to evaluate the released BEVDet checkpoint as-is on my setup, but I get:
mAP: 0.2751
mATE: 0.7179
mASE: 0.2738
mAOE: 0.5512
mAVE: 0.8747
mAAE: 0.2205
NDS: 0.3737
Eval time: 107.4s
Per-class results:
Object Class AP ATE ASE AOE AVE AAE
car 0.441 0.631 0.167 0.131 1.037 0.254
truck 0.197 0.757 0.225 0.125 0.828 0.227
bus 0.283 0.680 0.185 0.139 1.895 0.350
trailer 0.132 1.053 0.224 0.463 0.547 0.068
construction_vehicle 0.066 0.795 0.484 1.174 0.095 0.358
pedestrian 0.301 0.788 0.305 1.320 0.848 0.412
motorcycle 0.235 0.704 0.262 0.612 1.437 0.090
bicycle 0.182 0.607 0.265 0.875 0.310 0.006
traffic_cone 0.445 0.616 0.333 nan nan nan
barrier 0.468 0.547 0.287 0.122 nan nan
which is lower than the expected 30.8/40.4 mAP/NDS.
I am using A6000 GPUs, torch 1.10.1, cudatoolkit 11.3. Do you know what might be the issue?
I find that I have the exact same numbers as #15 @BoLang615, but I believe I am using the latest version. I would appreciate any pointers for this.
Thank you!
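As a sanity check on the numbers above, the NDS can be recomputed from the mean metrics. This is a minimal sketch that assumes the standard nuScenes detection score definition (mAP weighted by 5, each true-positive error clipped to 1 and turned into a score); the function name `nds` is only illustrative and is not part of the BEVDet code.

```python
def nds(m_ap, tp_errors):
    """nuScenes detection score from mAP and the five mean TP errors.

    tp_errors is expected in the order mATE, mASE, mAOE, mAVE, mAAE.
    """
    # Each error is clipped to at most 1 and converted to a score (1 - error).
    tp_scores = [1.0 - min(1.0, err) for err in tp_errors]
    # mAP carries weight 5, each TP score weight 1; the total is normalised by 10.
    return (5.0 * m_ap + sum(tp_scores)) / 10.0


# Mean metrics copied from the evaluation log in this post.
reported = nds(0.2751, [0.7179, 0.2738, 0.5512, 0.8747, 0.2205])
print(f"NDS = {reported:.4f}")
```

With the values from this post, the result agrees with the logged NDS of 0.3737 up to rounding, so the gap to the expected 40.4 comes from the individual metrics rather than from how they are aggregated.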
@Divadi Did you train this with 4 GPUs, a total batch size of 8x4=32, and lr=1e-4?
This is without re-training; I just loaded & evaluated the released checkpoint.
I'm running a separate training job with 4 GPUs, a batch size of 16x4=64, and the original lr, but it has not completed yet.
Hmm...
When I load your pkl and compare it with mine:
>>> a = pickle.load(open("check.pkl", 'rb')); b = pickle.load(open("check_divadi.pkl", 'rb'))
>>> a.keys()
dict_keys(['points', 'pred_bboxes', 'out_dir', 'file_name', 'bbox_pts', 'img_metas'])
>>> a['file_name']
'n015-2018-07-11-11-54-16+0800__LIDAR_TOP__1531281629949213'
>>> b['img_metas'][0]['pts_filename']
'datasets/nuscenes/samples/LIDAR_TOP/n015-2018-07-11-11-54-16+0800__LIDAR_TOP__1531281439800013.pcd.bin'
The first file path itself is different; the predictions are different as well. Is what you sent me the first sample as loaded by the pipeline?
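The comparison above can be written as a small helper, so that the two dumps are checked for referring to the same sample before their predictions are compared. This is only a sketch: the field names follow the REPL session above, the helper names `sample_stem` and `same_sample` are hypothetical, and the two dictionaries below contain just the fields needed for the check.

```python
import os


def sample_stem(path):
    """Drop directories and extensions so names from different fields compare equal."""
    return os.path.basename(path).split(".")[0]


def same_sample(dump_a, dump_b):
    """Return (match, stem_a, stem_b) for the two dumped samples."""
    # One dump stores a bare 'file_name'; the other stores the full path
    # under img_metas[0]['pts_filename'], as seen in the session above.
    stem_a = sample_stem(dump_a["file_name"])
    stem_b = sample_stem(dump_b["img_metas"][0]["pts_filename"])
    return stem_a == stem_b, stem_a, stem_b


# Stand-ins for the two loaded pickles, using the values printed above.
a = {"file_name": "n015-2018-07-11-11-54-16+0800__LIDAR_TOP__1531281629949213"}
b = {"img_metas": [{"pts_filename": (
    "datasets/nuscenes/samples/LIDAR_TOP/"
    "n015-2018-07-11-11-54-16+0800__LIDAR_TOP__1531281439800013.pcd.bin")}]}

match, stem_a, stem_b = same_sample(a, b)
print("same sample:", match)
```

If the check reports a mismatch, the predictions in the two files are not comparable and any difference between them says nothing about the model.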
Also, for reference, I saw that nuscenes_converter was not different from mmdetection3d's pre-coordinate-change version, so I just used those pkl files.
I set workers_per_gpu=0.
Here is the md5sum of my test pkl, you can check this as well:
efd90b7e93c43fc18e98a0cf0ec8b1c4 /nuscenes_infos_val.pkl
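For anyone without `md5sum` available, the same checksum can be computed in Python. This is a minimal sketch using only the standard library; `file_md5` is an illustrative name, and the path you pass in is wherever your own copy of the info file lives.

```python
import hashlib


def file_md5(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        # Read fixed-size chunks so large pkl files do not need to fit in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Calling `file_md5` on your own `nuscenes_infos_val.pkl` and comparing the result with the digest above shows whether both setups evaluate on identical annotation files.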
Hmm, I apologize for mixing up 'test.pkl' with 'check.pkl' and 'img_feats' with 'img_metas'.
Here is the corrected pkl:
check.zip.zip
I will check the pkl & zip further when I get home.
The results of training myself are as follows:
mAP: 0.3050
mATE: 0.6869
mASE: 0.2754
mAOE: 0.5599
mAVE: 0.8782
mAAE: 0.2481
NDS: 0.3876
Eval time: 120.7s
Per-class results:
Object Class AP ATE ASE AOE AVE AAE
car 0.503 0.542 0.160 0.109 0.929 0.228
truck 0.209 0.721 0.224 0.172 0.813 0.228
bus 0.300 0.731 0.188 0.093 1.747 0.440
trailer 0.170 1.048 0.242 0.385 0.617 0.112
construction_vehicle 0.055 0.894 0.485 1.118 0.106 0.392
pedestrian 0.325 0.743 0.302 1.343 0.861 0.495
motorcycle 0.262 0.678 0.259 0.670 1.680 0.075
bicycle 0.218 0.544 0.275 1.030 0.272 0.015
traffic_cone 0.503 0.501 0.332 nan nan nan
barrier 0.506 0.468 0.288 0.119 nan nan
@Divadi mAVE and mAAE are a bit worse than expected. Some 'abnormal' examples can be found in issue #21 (I suspect others do not report their results when they look fine).
Maybe epoch 18 is better...
@HuangJunJie2017
Whew... I think I found the issue: I had Pillow 9.2.0 installed, which probably made some of the image-transform operations (loading.py) behave slightly differently from your Pillow 8.4.0. As a consequence, the difference between your loaded images and mine looked like this:
After downgrading to Pillow 8.4.0, the difference is nil:
Updated results:
mAP: 0.3082
mATE: 0.6648
mASE: 0.2729
mAOE: 0.5330
mAVE: 0.8287
mAAE: 0.2052
NDS: 0.4036
Eval time: 98.1s
Per-class results:
Object Class AP ATE ASE AOE AVE AAE
car 0.508 0.535 0.159 0.127 0.947 0.232
truck 0.222 0.671 0.216 0.123 0.834 0.220
bus 0.311 0.760 0.195 0.086 1.592 0.301
trailer 0.150 0.987 0.229 0.443 0.518 0.054
construction_vehicle 0.073 0.720 0.482 1.093 0.103 0.342
pedestrian 0.336 0.738 0.301 1.326 0.861 0.409
motorcycle 0.262 0.704 0.262 0.595 1.450 0.075
bicycle 0.213 0.525 0.270 0.885 0.325 0.009
traffic_cone 0.506 0.518 0.331 nan nan nan
barrier 0.502 0.490 0.284 0.119 nan nan
Thank you for your help!
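For anyone hitting the same mismatch, a small check like the following can flag a Pillow version that differs from the one mentioned in this thread before running evaluation. It is a sketch under the assumption that pinning to 8.4.0 is what you want; `check_pin` is a hypothetical helper and not part of BEVDet or mmdetection3d.

```python
from importlib import metadata


def check_pin(package, expected):
    """Return (installed_version_or_None, matches_expected) for a package."""
    try:
        installed = metadata.version(package)
    except metadata.PackageNotFoundError:
        # The package is not installed in this environment at all.
        installed = None
    return installed, installed == expected


installed, ok = check_pin("Pillow", "8.4.0")
print(f"Pillow installed: {installed}, matches 8.4.0: {ok}")
```

A mismatch here does not prove the results will differ, but in my case it was enough to shift the loaded images and lower mAP/NDS.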
@Divadi Nice job! Thank you so much for the information!