facebookresearch/MaskFormer

Result in log file.

chhluo opened this issue · 9 comments

The test result (PQ = 40.5) in https://dl.fbaipublicfiles.com/maskformer/panoptic-coco/maskformer_panoptic_R50_bs64_554k/metrics.json is different from the result (PQ = 46.5) in the table.

Duplicate of issue #12:
"These metric files mainly serve as a reference for training losses. Please always refer to PQ numbers in the table."

I have another question: on the last page of the Mask2Former paper, MaskFormer trained with batch_size=16 for 75 epochs is said to give a result similar to the config with batch_size=64 and 300 epochs. How big is the gap between these two results?

I don't get the question. Table XI (c) of the Mask2Former paper reports the results with batch size 16.

So, is the first row in Table XI (c) of the Mask2Former paper the result of MaskFormer trained with batch_size=16, epoch=75 rather than with batch_size=64, epoch=300?

Yes, Table XI (c) uses the parameters in Table XI (a), which is batch_size=16, epoch=75. Note that the total number of iterations is the same; we simply decrease the batch_size, which is why the number of epochs decreases by a factor of 4. So you can use the same MaskFormer config and only change the batch size from 64 to 16, with no other modification.
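
As a sanity check, here is a quick back-of-the-envelope calculation of the two schedules (a sketch only; it assumes COCO train2017's 118,287 images, which is not stated anywhere in this thread):

```python
# Rough epoch count for the two schedules, assuming COCO train2017
# (118,287 images) and the 554k-iteration schedule from the bs64 config name.
coco_train_images = 118_287
max_iter = 554_000

for batch_size in (64, 16):
    epochs = batch_size * max_iter / coco_train_images
    print(f"batch_size={batch_size}: ~{epochs:.0f} epochs over {max_iter} iterations")

# batch_size=64: ~300 epochs
# batch_size=16: ~75 epochs
# With the total iterations fixed, quartering the batch size quarters the epochs.
```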

Thank you.

I am reimplementing MaskFormer based on mmdetection (see the PR). When training with the R50 config, batch_size=16, epoch=75, I get PQ=46.9, which is 0.4 better than the result reported in the Mask2Former paper. Is this within the normal fluctuation range?


Besides the different batch_size, the weight_decay is also different: in the MaskFormer (R50) config, weight_decay is 0.0001, while in Table XI (a) of the Mask2Former paper, weight_decay is 0.0005 for MaskFormer.

Thanks a lot for reimplementing it in mmdetection!

I used weight decay 0.0001; the 0.0005 in the paper is a typo (thanks for pointing it out). If you train the model with weight decay 0.0005, it could lead to a slight increase in performance.
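
For anyone matching the setup, here is a minimal sketch of the solver settings discussed above, using standard detectron2 config keys (MaskFormer's own added config options and the base config loading are omitted):

```python
# Sketch of the solver overrides discussed in this thread, expressed with
# standard detectron2 config keys; MaskFormer-specific keys are not shown.
from detectron2.config import get_cfg

cfg = get_cfg()
cfg.SOLVER.IMS_PER_BATCH = 16       # batch size 16 (Table XI (a)); the released config uses 64
cfg.SOLVER.MAX_ITER = 554_000       # total iterations kept unchanged
cfg.SOLVER.WEIGHT_DECAY = 0.0001    # value actually used; the 0.0005 in the paper is a typo
```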

I think the reimplementation should be good as long as it performs no worse than the original MaskFormer (https://github.com/facebookresearch/MaskFormer/blob/main/MODEL_ZOO.md#panoptic-segmentation-models). Please remember to document the differences in parameters (batch size, weight decay, etc.) in the README.