Evaluation Metrics

Question

Closed this issue 6 years ago · 1 comments

Hi, thanks for sharing.
I'm a little confused, though. Do you evaluate with only 20 candidate answers?

Answer 1 · 2018-11-02T03:16:39.000Z

No. For the quantitative metrics we used the entire validation set of the Visual Dialog v0.9 dataset (~40k images), while for the quantitative human evaluation we used ~100 samples. Any part of the code that uses only 20 answers for evaluation is probably left over from last-minute debugging.