pzzhang/VinVL

Pre-extracted image features

Opened this issue · 2 comments

Hi!

I'm trying to use the VinVL model and the scripts described in https://github.com/pzzhang/VinVL/blob/main/DOWNLOAD.md to extract image features from COCO, but the features I extract (detections and labels) differ from the pre-extracted image features.

Could you provide the pretrained VinVL model and/or the config that was used to produce the pre-extracted features, so the extraction can be reproduced?

rgtjf commented

+1

Please refer to the repo https://github.com/microsoft/scene_graph_benchmark for feature extraction.

If you find only minor differences in the predictions (for example, by visualizing them with tools/demo/demo_image.py), then the extracted features should be fine.
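For anyone who wants a rough number rather than a visual check, here is a minimal sketch of one way to compare the two sets of detections. It is not part of VinVL or scene_graph_benchmark: the function names `iou` and `detection_agreement`, the `(label, box)` input layout, and the 0.5 IoU threshold are all assumptions made for illustration, and you would need to load your own detections and the pre-extracted ones into this layout yourself.

```python
from typing import List, Sequence, Tuple

Box = Sequence[float]          # assumed layout: (x1, y1, x2, y2) in pixels
Detection = Tuple[str, Box]    # assumed layout: (class label, box)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = max(0.0, a[2] - a[0]) * max(0.0, a[3] - a[1])
    area_b = max(0.0, b[2] - b[0]) * max(0.0, b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def detection_agreement(
    extracted: List[Detection],
    reference: List[Detection],
    iou_threshold: float = 0.5,  # assumed cutoff, not a VinVL setting
) -> float:
    """Fraction of reference detections that overlap some extracted
    detection with the same label at IoU >= iou_threshold."""
    if not reference:
        return 1.0 if not extracted else 0.0
    matched = 0
    for ref_label, ref_box in reference:
        best = max(
            (iou(ref_box, box) for label, box in extracted if label == ref_label),
            default=0.0,
        )
        if best >= iou_threshold:
            matched += 1
    return matched / len(reference)


# Toy example with made-up boxes: one label agrees, one does not.
reference = [("dog", (10, 10, 50, 50)), ("cat", (60, 60, 100, 100))]
extracted = [("dog", (12, 11, 50, 49)), ("person", (60, 60, 100, 100))]
print(detection_agreement(extracted, reference))  # 0.5
```

Note that the matching is not one-to-one (a single extracted box can cover several reference boxes), so treat the score as a coarse sanity check. A value close to 1.0 over many images would be consistent with "minor differences"; what counts as close enough is a judgment call.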