ChenRocks/UNITER

RefCOCO training / evaluation details

j-min opened this issue · 2 comments

j-min commented

Hello,
I have some questions regarding RefCOCO/+/g training / evaluation details.

  1. Are you going to upload the RefCOCO/+/g training/evaluation code?
  2. Which boxes did you finetune UNITER on?
  3. Which boxes did you use to evaluate on the val, test, val^d, and test^d splits, respectively? Did you use the Mask R-CNN boxes from MAttNet?

Table from UNITER

It seems the ViLBERT-MT authors finetuned their model on 100 BUTD boxes + the Mask R-CNN boxes from MAttNet -> code.
They then used the 100 BUTD boxes during evaluation -> code

I calculated oracle scores on the RefCOCOg val split: a sample counts as correct if there exists a candidate box with IoU(candidate, target) > 0.5.

  - Mask R-CNN boxes from MAttNet -> 86.10%
  - MS COCO ground-truth boxes -> 99.6%
  - ViLBERT-MT's 100 BUTD boxes -> 96.53%
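For reference, the oracle score above can be computed as sketched below. This is my own reconstruction of the procedure described, not code from this repo; `samples` and the box format (`x1, y1, x2, y2`) are assumptions.

```python
def iou(box_a, box_b):
    """IoU of two boxes given as (x1, y1, x2, y2) tuples."""
    x1 = max(box_a[0], box_b[0])
    y1 = max(box_a[1], box_b[1])
    x2 = min(box_a[2], box_b[2])
    y2 = min(box_a[3], box_b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def oracle_score(samples):
    """samples: list of (candidate_boxes, target_box) pairs.

    A sample is a hit if any candidate box overlaps the target
    with IoU > 0.5; the oracle score is the fraction of hits.
    """
    hits = sum(
        any(iou(c, target) > 0.5 for c in candidates)
        for candidates, target in samples
    )
    return hits / len(samples)
```

This oracle score is an upper bound on the grounding accuracy any model can reach with a given candidate-box set, which is why the coverage gap between box sources matters for the comparison.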

Since the BUTD boxes have better coverage than the Mask R-CNN boxes from MAttNet, I don't think this is a fair comparison to MAttNet. It is also inconsistent with the ViLBERT-MT paper.

Paragraph from ViLBERT-MT

The ViLBERT-MT authors compared ViLBERT-MT and UNITER on test^d. I wonder which boxes you used for UNITER finetuning and evaluation.

Table from ViLBERT-MT

We finetuned on the ground-truth (COCO-annotated) boxes, whose features were extracted with BUTD, and ran inference on

  1. the ground-truth boxes
  2. MAttNet's detected boxes
j-min commented

Thank you for the clarification!