ChenRocks/UNITER

Vocabulary and single image-question pair prediction

foxm79 opened this issue · 2 comments

  1. Is the vocabulary available that takes the words of the questions and converts them to 'input_ids'?
  2. Is there a function that does this for an input question?
  3. Is there code that takes a single image-question pair and predicts the answer?

  1. Refer to prepro.py in scripts.

Yes, that is what I followed eventually. Thanks for replying!
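
For readers with the same first two questions, here is a minimal sketch of what turning a question into 'input_ids' generally involves. It assumes a BERT-style WordPiece vocabulary, which seems likely given that UNITER builds on BERT, but it is not the code in prepro.py. `TOY_VOCAB`, `wordpiece`, and `question_to_input_ids` are hypothetical names made up for illustration; a real run would load the vocabulary from the pretrained tokenizer instead.

```python
# Toy WordPiece vocabulary (hypothetical). A real vocabulary would come
# from the pretrained BERT tokenizer, not be written by hand.
TOY_VOCAB = {
    "[UNK]": 0, "what": 1, "color": 2, "is": 3,
    "the": 4, "play": 5, "##ing": 6, "dog": 7,
}


def wordpiece(word, vocab):
    """Greedy longest-match-first split of one word into sub-word pieces."""
    pieces, start = [], 0
    while start < len(word):
        end = len(word)
        piece = None
        while start < end:
            candidate = word[start:end]
            if start > 0:
                # Continuation pieces carry the "##" prefix.
                candidate = "##" + candidate
            if candidate in vocab:
                piece = candidate
                break
            end -= 1
        if piece is None:
            # No piece matched: the whole word maps to the unknown token.
            return ["[UNK]"]
        pieces.append(piece)
        start = end
    return pieces


def question_to_input_ids(question, vocab):
    """Lowercase, split on whitespace, and map each sub-word piece to its id.

    Assumption: whitespace splitting only. A full BERT tokenizer also
    separates punctuation and normalizes text before this step.
    """
    ids = []
    for word in question.lower().strip().split():
        ids.extend(vocab[p] for p in wordpiece(word, vocab))
    return ids


print(question_to_input_ids("What is the dog playing", TOY_VOCAB))
# -> [1, 3, 4, 7, 5, 6]
```

Words missing from the vocabulary fall back to `[UNK]` (id 0 here). Whether to lowercase depends on the pretrained vocabulary in use (cased or uncased), so treat the `.lower()` call as an assumption too.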