jnhwkim/ban-vqa

How to get labels for objects?

qinzzz opened this issue · 2 comments

Hi, I am very interested in your BAN model on Flickr30k. Do you provide labels for the detected objects along with the bounding boxes and features, as Faster R-CNN or bottom-up attention would? Since I am not sure how you prepared your dataset, I am afraid that if I use pre-trained models to predict the labels myself, the dataloader pipeline would run into problems. Thanks!

Could you specify what kind of labels you mean? I need some elaboration. For the visual features of Flickr30kEntities, we just use the same pre-trained bottom-up attention model as for VQA.

Let me redirect you to this project we used.