The LVIS AP measured with the public model is inconsistent with the AP reported in the paper.

Question

yaohusama opened this issue a year ago · 1 comment

When evaluating AP on the LVIS dataset, do you combine the output of HQ-SAM with the output mask of SAM itself? Is SAM's predicted score combined with the detector's score for each box when selecting predictions? And when using ViTDet to get the detection boxes, is the detection head Mask R-CNN or Cascade R-CNN? I used Mask R-CNN as the detection head and got an AP of 45.289 for HQ-SAM-L, while the paper reports 43.9. Why does my measurement differ from the one in the paper?
Thanks.

Answer 1 · 2023-12-26T23:37:23.000Z

Hi, we use Cascade R-CNN with this config.
For evaluation, we simply use all predicted bounding boxes as prompts, without combining scores or using the output mask as an additional prompt.
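
The evaluation protocol described above can be sketched roughly as follows. This is only an illustration, not the authors' evaluation script: `boxes_to_results`, `make_box_fill_predictor`, and the result dictionary layout are hypothetical names chosen for this example, and the mask predictor is a dummy stand-in for HQ-SAM that simply fills the prompted box. The point is that every detected box is used as a prompt exactly once, and the detector's score is carried over unchanged.

```python
from typing import Callable, Dict, List, Sequence

Box = Sequence[float]      # [x0, y0, x1, y1] in pixels
Mask = List[List[int]]     # binary mask, H rows by W columns


def boxes_to_results(
    image_id: int,
    detections: List[Dict],
    predict_mask: Callable[[Box], Mask],
) -> List[Dict]:
    """Prompt the mask model once per detected box, with no filtering or rescoring."""
    results = []
    for det in detections:
        # The box is the only prompt; no previous output mask is fed back in.
        mask = predict_mask(det["bbox"])
        results.append({
            "image_id": image_id,
            "category_id": det["category_id"],
            "score": det["score"],  # detector score kept as-is, not combined
            "mask": mask,
        })
    return results


def make_box_fill_predictor(height: int, width: int) -> Callable[[Box], Mask]:
    """Hypothetical stand-in for HQ-SAM: returns a mask that fills the box."""
    def predict(box: Box) -> Mask:
        x0, y0, x1, y1 = (int(round(v)) for v in box)
        return [
            [1 if (x0 <= x < x1 and y0 <= y < y1) else 0 for x in range(width)]
            for y in range(height)
        ]
    return predict


# Toy detections; low-scoring boxes are still kept and prompted.
detections = [
    {"bbox": [0, 0, 2, 2], "category_id": 3, "score": 0.9},
    {"bbox": [1, 1, 4, 3], "category_id": 7, "score": 0.05},
]
results = boxes_to_results(1, detections, make_box_fill_predictor(4, 4))
```

In a real run, `predict_mask` would wrap the HQ-SAM predictor with the box as its prompt, and the masks would be converted to the format the LVIS evaluator expects before computing AP.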