filaPro/oneformer3d

Metrics for the released ScanNet checkpoint on test are much lower than those in the paper?

yxchng opened this issue · 3 comments

I downloaded the ScanNet checkpoint and evaluated it with the following command:

python tools/test.py configs/oneformer3d_1xb4_scannet.py \
    work_dirs/oneformer3d_1xb4_scannet/epoch_512.pth
+----------------+---------+---------+--------+-----------+----------+
| classes        | AP_0.25 | AP_0.50 | AP     | Prec_0.50 | Rec_0.50 |
+----------------+---------+---------+--------+-----------+----------+
| cabinet        | 0.8088  | 0.7122  | 0.4896 | 0.7542    | 0.7258   |
| bed            | 0.8378  | 0.8131  | 0.6108 | 0.9848    | 0.8025   |
| chair          | 0.9826  | 0.9584  | 0.8291 | 0.9571    | 0.9305   |
| sofa           | 0.8960  | 0.8518  | 0.5913 | 0.9146    | 0.7732   |
| table          | 0.8969  | 0.8560  | 0.6588 | 0.8814    | 0.7880   |
| door           | 0.8745  | 0.7704  | 0.5673 | 0.8491    | 0.7124   |
| window         | 0.8096  | 0.6196  | 0.4199 | 0.7511    | 0.6099   |
| bookshelf      | 0.8565  | 0.7026  | 0.3991 | 0.7176    | 0.7922   |
| picture        | 0.8570  | 0.7959  | 0.6218 | 0.8571    | 0.7385   |
| counter        | 0.8100  | 0.6985  | 0.3877 | 0.7037    | 0.7451   |
| desk           | 0.8724  | 0.7428  | 0.4567 | 0.7619    | 0.7559   |
| curtain        | 0.8346  | 0.7158  | 0.4743 | 0.9375    | 0.6716   |
| refrigerator   | 0.7564  | 0.7375  | 0.6045 | 0.9024    | 0.6491   |
| showercurtrain | 0.8418  | 0.7446  | 0.6091 | 0.6857    | 0.8571   |
| toilet         | 1.0000  | 1.0000  | 0.9423 | 1.0000    | 1.0000   |
| sink           | 0.9501  | 0.8310  | 0.6196 | 0.8791    | 0.8333   |
| bathtub        | 0.9176  | 0.9032  | 0.8190 | 1.0000    | 0.9032   |
| otherfurniture | 0.7987  | 0.7230  | 0.5724 | 0.8027    | 0.6856   |
+----------------+---------+---------+--------+-----------+----------+
| Overall        | 0.8667  | 0.7876  | 0.5930 | 0.8522    | 0.7763   |
+----------------+---------+---------+--------+-----------+----------+
03/21 04:01:15 - mmengine - INFO - Epoch(test) [312/312]    miou: 0.7643  all_ap: 0.5930  all_ap_50%: 0.7876  all_ap_25%: 0.8667  pq: 0.7074  data_time: 0.0371  time: 0.1451

vs. the table in the paper:
[Image: Screenshot from 2024-03-21 11-04-35]

What am I doing wrong?

This part of the table is for the hidden test leaderboard; the checkpoint and the first part of the table are for the validation split.

So python tools/test.py is actually doing the same thing as validation during training?
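
One way to check this yourself is to compare what the val and test dataloaders in the config point at. The snippet below is only a sketch: the field names (`ann_file`, `test_mode`), the file name `infos_val.pkl`, and the helper `eval_split_fields` are assumptions for illustration, not values taken from this repo's config, and real configs may wrap the dataset in extra layers.

```python
from typing import Any, Mapping


def eval_split_fields(cfg: Mapping[str, Any], loader_key: str) -> dict:
    """Collect the dataset fields that usually reveal which split a dataloader reads."""
    dataset = cfg[loader_key]["dataset"]
    # Assumed field names; a real config may name or nest these differently.
    return {k: dataset[k] for k in ("ann_file", "test_mode") if k in dataset}


# Hypothetical stand-in for a loaded config. With mmengine installed, the real
# file could instead be loaded with mmengine.config.Config.fromfile(path),
# which supports the same dict-style access.
example_cfg = {
    "val_dataloader": {"dataset": {"ann_file": "infos_val.pkl", "test_mode": True}},
    "test_dataloader": {"dataset": {"ann_file": "infos_val.pkl", "test_mode": True}},
}

val_fields = eval_split_fields(example_cfg, "val_dataloader")
test_fields = eval_split_fields(example_cfg, "test_dataloader")
same_split = val_fields == test_fields

print("val :", val_fields)
print("test:", test_fields)
print("same split:", same_split)
```

If both dataloaders resolve to the same annotation file, tools/test.py is evaluating on the same data as the validation loop, which would match the explanation above.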