
Question about MViT pre-training.

Table 1 of the paper states that the LMDet dataset used for MViT pre-training does not contain COCO. However, the original MViT paper states that LMDet contains data from Flickr30k, MS-COCO, and Visual Genome (VG). That means the pre-trained proposal detector MViT has already seen the novel classes during training. Is it reasonable to use a pre-trained model that has already seen the label and box information of the novel classes?