coco iou
ProjectDisR opened this issue · 11 comments
Using the provided COCO checkpoint to generate CAMs on train_1250_id.txt, the mIoU is only ~7%, and the IoU is nearly 0% for every class starting from "street sign".
Wondering why that is?
I already used the provided script to prepare the data,
and ran the command:
python evaluation.py --list coco/train_1250_id.txt \
--data-path data \
--type npy \
--predict_dir WeakTr_results_coco/WeakTr/attn-patchrefine-npy-ms \
--out-dir WeakTr_results_coco/WeakTr/pseudo-mask-ms-crf \
--num_classes 91 \
--start 40 \
--t 42 &
Should num_classes be set to 91?
Is it because of this line, "c = self.CAT_LIST.index(cat)", which should perhaps be "c = cat"?
Also, the performance does not seem to reach the reported mIoU.
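For context on that last question: COCO defines 91 category slots but only 80 of them carry annotations, so the IDs are sparse, and "CAT_LIST.index(cat)" compacts dataset category IDs into contiguous label indices while "c = cat" keeps the raw ID. A minimal sketch of the difference, using a hypothetical CAT_LIST (an ordered list of the dataset's category IDs):

# Hypothetical 5-entry CAT_LIST; the real one would list all category IDs.
# The two choices place a given category into different channels of the
# score map, which only agree when the IDs happen to be contiguous.
CAT_LIST = [0, 1, 2, 4, 7]

for cat in (2, 4, 7):
    c_compact = CAT_LIST.index(cat)  # contiguous index: 2, 3, 4
    c_raw = cat                      # raw dataset ID:   2, 4, 7
    print(cat, c_compact, c_raw)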
Thank you for your interest in our research. Regarding the COCO-CAM results in our paper, you can reproduce them from the pseudo masks using the following command:
python evaluation.py --list coco/train_id.txt \
--data-path data \
--type png \
--predict_dir data/coco/voc_format/WeakTr_CAMlb_wCRF_COCO \
--num_classes 91
As for the difficulty in reproducing our results with the provided COCO-CAM checkpoint: our latest version of WeakTr-CAM includes optimizations to the model architecture. We have already updated the VOC checkpoint for the newest architecture, and we anticipate updating the COCO CAM checkpoint to match the latest code within this week.
Once again, we appreciate your interest and your feedback.
Greetings! We have uploaded the newest checkpoints to the cloud. You can download the COCO-CAM checkpoint and try the following commands to get an mIoU of 42.6% on the train_id split:
# Generate CAM
python main.py --model deit_small_WeakTr_patch16_224 \
--data-path data \
--data-set COCOMS \
--img-ms-list coco/train_id.txt \
--scales 1.0 0.8 1.2 \
--gen_attention_maps \
--cam-npy-dir WeakTr_results_coco/WeakTr/attn-patchrefine-npy-ms \
--output_dir WeakTr_results_coco/WeakTr \
--reduction 8 \
--pool-type max \
--resume WeakTr_results_coco/WeakTr/checkpoint_best_mIoU.pth
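For readers wondering what the COCOMS data-set flag does with --scales: the model is run at each listed scale and the per-scale CAMs are fused before the CRF step below. A minimal sketch of that fusion idea, under our assumptions rather than the repository's exact code:

import numpy as np
import torch
import torch.nn.functional as F

def fuse_multiscale_cams(cams, target_hw):
    """Average per-scale CAMs after resizing them to a common resolution.
    cams: list of (C, h_i, w_i) numpy arrays, one entry per scale."""
    resized = []
    for cam in cams:
        t = torch.from_numpy(cam).float().unsqueeze(0)   # (1, C, h, w)
        t = F.interpolate(t, size=target_hw, mode="bilinear",
                          align_corners=False)
        resized.append(t.squeeze(0))
    return torch.stack(resized).mean(dim=0).numpy()      # (C, H, W)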
# CRF post-processing
python evaluation.py --list coco/train_id.txt \
--data-path data \
--type npy \
--predict_dir WeakTr_results_coco/WeakTr/attn-patchrefine-npy-ms \
--out-dir WeakTr_results_coco/WeakTr/pseudo-mask-ms-crf \
--t 42 \
--num_classes 91 \
--out-crf
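For reference, --out-crf applies dense CRF post-processing, which sharpens the CAM-derived masks along image boundaries. A minimal sketch of the standard pydensecrf recipe (illustrative parameters, not necessarily those hard-coded in evaluation.py):

import numpy as np
import pydensecrf.densecrf as dcrf
from pydensecrf.utils import unary_from_softmax

def crf_refine(img, probs, iters=10):
    """img: (H, W, 3) contiguous uint8 RGB; probs: (C, H, W) class scores
    normalized to probabilities. Returns an (H, W) refined label map."""
    c, h, w = probs.shape
    d = dcrf.DenseCRF2D(w, h, c)
    d.setUnaryEnergy(unary_from_softmax(probs))
    d.addPairwiseGaussian(sxy=3, compat=3)                         # smoothness
    d.addPairwiseBilateral(sxy=80, srgb=13, rgbim=img, compat=10)  # appearance
    q = np.array(d.inference(iters)).reshape(c, h, w)
    return q.argmax(axis=0)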
When training on COCO with the following parameters, the mIoU evaluated on train_1250_id.txt was around 8%. Similar to the issue mentioned above, the IoU is basically 0% for every class after "street sign". Can you tell me what structural changes were made to the model compared to before?
# Training
python main.py --model deit_small_WeakTr_patch16_224 \
--data-path data \
--data-set COCO \
--img-ms-list coco/train_1250_id.txt \
--gt-dir voc_format/class_labels \
--cam-npy-dir WeakTr_results_coco/WeakTr/attn-patchrefine-npy \
--output_dir WeakTr_results_coco/WeakTr \
--reduction 8 \
--pool-type max \
--lr 2e-4 \
--weight-decay 0.03
@SecretplayeRava Thanks for your interest!
- In fact, we achieved a 23% mIoU on train_1250_id.txt after training for 4 epochs, as shown below. We also provide the complete training-history curve.
- The IoU for the street sign category being 0% (or marked as NaN%) is reasonable, because there are no street sign labels in the ground truth (GT) of the COCO 2014 dataset (see the sketch below).
- We suggest updating the WeakTr code to the latest version and trying again.
Feel free to communicate with us if there are more problems.
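To make the NaN convention concrete: a class that never appears in the GT and is never predicted has an empty union, so its IoU is undefined and is skipped when averaging. A minimal numpy sketch of that bookkeeping, assuming the usual confusion-matrix evaluation:

import numpy as np

def per_class_iou(hist):
    """hist: (C, C) confusion matrix with rows = GT, columns = prediction."""
    inter = np.diag(hist).astype(float)
    union = hist.sum(axis=0) + hist.sum(axis=1) - np.diag(hist)
    with np.errstate(divide="ignore", invalid="ignore"):
        return inter / union        # union == 0 yields NaN (absent class)

# mIoU then averages only over classes that actually occur:
# miou = np.nanmean(per_class_iou(hist))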
Thank you for your reply!! I will try again. Another issue: the evaluation results for the WeakTr_CAMlb_wCRF_COCO labels downloaded from Step 1 (COCO14 CAM Label) are as follows:
background: 78.471% person: 52.280%
bicycle: 47.058% car: 34.611%
motorcycle: 62.461% airplane: 60.815%
bus: 60.625% train: 54.598%
truck: 44.861% boat: 31.966%
traffic light: 30.339% fire hydrant: 70.073%
street sign: 0.000% stop sign: 0.000%
parking meter: 0.016% bench: 0.042%
bird: 0.008% cat: 0.139%
dog: 0.046% horse: 0.007%
sheep: 0.175% cow: 0.000%
elephant: 0.000% bear: 0.002%
zebra: 0.027% giraffe: 0.000%
hat: 0.000% backpack: 1.238%
umbrella: 0.000% shoe: 0.000%
eye glasses: 0.000% handbag: 0.000%
tie: 0.000% suitcase: 0.000%
frisbee: 0.000% skis: 0.000%
snowboard: 0.000% sports ball: 0.001%
kite: 0.005% baseball bat: 0.000%
baseball glove: 0.001% skateboard: 0.000%
surfboard: 0.000% tennis racket: 0.000%
bottle: 0.075% plate: 0.000%
wine glass: 0.024% cup: 0.041%
fork: 0.005% knife: 0.059%
spoon: 0.006% bowl: 0.192%
banana: 0.009% apple: 0.000%
sandwich: 0.025% orange: 0.000%
broccoli: 0.011% carrot: 0.000%
hot dog: 0.000% pizza: 0.001%
donut: 0.002% cake: 1.067%
chair: 0.003% couch: 0.117%
potted plant: 0.004% bed: 0.001%
mirror: 0.000% dining table: 0.001%
window: 0.000% desk: 0.000%
toilet: 0.000% door: 0.000%
tv: 0.002% laptop: 0.000%
mouse: 0.507% remote: 0.010%
keyboard: 0.006% cell phone: 0.021%
microwave: 0.002% oven: 0.000%
toaster: 0.000% sink: 0.000%
refrigerator: 0.000% blender: nan%
book: 0.000% clock: 0.000%
vase: 0.000% scissors: 0.000%
teddy bear: 0.000% hair drier: 0.000%
toothbrush: 0.000%
mIoU: 7.023%
FP = 50.92535027974361, FN = 42.05178966478575
Prediction = 9.996101270785202, Recall = 10.579217918915957
The evaluation command used is:
python evaluation.py --list coco/train_id.txt \
--data-path data \
--type png \
--predict_dir data/coco/voc_format/WeakTr_CAMlb_wCRF_COCO \
--num_classes 91
Thanks for your question!
- We tried evaluating again, using the CAM Label and Ground Truth provided in WeakTr_CAMlb_wCRF_COCO and class_labels, respectively.
Note: We apologize for the issue with the way class_labels is obtained for COCO 2014. In fact, the Ground Truth we used for the COCO 2014 dataset is sourced from PMM, and you can download it from the link we provided above (a rough conversion sketch follows the results below).
- The command we used for evaluation is shown below:
python evaluation.py --list coco/train_id.txt \
--data-path data \
--type png \
--predict_dir data/coco/voc_format/WeakTr_CAMlb_wCRF_COCO \
--num_classes 91
- The results we obtained are shown below:
background: 78.471% person: 52.291%
bicycle: 46.570% car: 34.370%
motorcycle: 61.962% airplane: 60.793%
bus: 59.976% train: 54.657%
truck: 44.976% boat: 32.052%
traffic light: 30.331% fire hydrant: 70.038%
street sign: nan% stop sign: 75.582%
parking meter: 62.229% bench: 38.021%
bird: 56.910% cat: 72.963%
dog: 72.499% horse: 64.872%
sheep: 64.672% cow: 68.417%
elephant: 75.607% bear: 77.029%
zebra: 74.445% giraffe: 71.189%
hat: nan% backpack: 21.465%
umbrella: 63.845% shoe: nan%
eye glasses: nan% handbag: 12.160%
tie: 23.986% suitcase: 48.190%
frisbee: 34.093% skis: 11.729%
snowboard: 24.124% sports ball: 7.579%
kite: 41.729% baseball bat: 3.229%
baseball glove: 2.157% skateboard: 11.165%
surfboard: 39.573% tennis racket: 9.564%
bottle: 30.714% plate: nan%
wine glass: 27.689% cup: 29.457%
fork: 10.022% knife: 14.906%
spoon: 8.765% bowl: 33.181%
banana: 67.220% apple: 52.411%
sandwich: 54.550% orange: 67.917%
broccoli: 58.257% carrot: 51.784%
hot dog: 59.786% pizza: 68.727%
donut: 60.234% cake: 49.545%
chair: 22.941% couch: 41.213%
potted plant: 33.433% bed: 53.363%
mirror: nan% dining table: 24.115%
window: nan% desk: nan%
toilet: 49.828% door: nan%
tv: 49.309% laptop: 41.132%
mouse: 12.514% remote: 27.475%
keyboard: 39.447% cell phone: 42.907%
microwave: 38.491% oven: 30.251%
toaster: 17.178% sink: 19.368%
refrigerator: 48.354% blender: nan%
book: 36.281% clock: 39.714%
vase: 31.402% scissors: 36.389%
teddy bear: 63.098% hair drier: 19.815%
toothbrush: 28.875%
======================================================
mIoU: 42.563%
FP = 35.34603837587358, FN = 22.09138763809436
Prediction = 55.79337973497383, Recall = 63.39359639675423
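As referenced in the note above: for anyone regenerating class_labels themselves, here is a minimal sketch of rasterizing COCO 2014 annotations into VOC-format PNG label maps with pycocotools. This is our assumption of such a pipeline; PMM's released ground truth may have been produced differently.

from pycocotools.coco import COCO
import numpy as np
from PIL import Image

def coco_to_voc_masks(ann_file, out_dir):
    """Rasterize COCO instance annotations into single-channel PNG label
    maps, with pixel value = category ID and 0 = background."""
    coco = COCO(ann_file)
    for img_id in coco.getImgIds():
        info = coco.loadImgs(img_id)[0]
        mask = np.zeros((info["height"], info["width"]), dtype=np.uint8)
        for ann in coco.loadAnns(coco.getAnnIds(imgIds=img_id)):
            mask[coco.annToMask(ann) == 1] = ann["category_id"]
        Image.fromarray(mask).save(
            f"{out_dir}/{info['file_name'].replace('.jpg', '.png')}")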
It works, thanks!
I will close this issue. If there are more questions, you are welcome to raise issues :)