What is the meaning of frame_num and answer_num?

Question

aixiaodewugege opened this issue 2 years ago · 6 comments

aixiaodewugege commented 2 years ago

Thanks for your brilliant work!

I can't find an explanation of these two configuration options: frame_num and answer_num. Could you please help me?

Answer 1 · 2023-05-26T04:12:46.000Z

Thanks for your interest in our work! Here are explanations of those parameters:

model.frame_num: number of keyframes selected
datasets.nextqa.vis_processor.train.n_frms: number of frames sampled as candidates for keyframe selection
model.answer_num: number of multiple-choice options (e.g. NeXT-QA has 5 options per question, STAR has 4)
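
To make the relationship between the three parameters concrete, here is a minimal Python sketch. It is not SeViLA's code: the numeric values, the helper names `sample_candidate_indices` and `select_keyframes`, the uniform sampling, and the top-k selection over random scores are all assumptions made for illustration.

```python
import random

# Hypothetical values mirroring the three config keys; the numbers are
# illustrative and are not claimed to be the repo's defaults.
config = {
    "n_frms": 32,     # datasets.nextqa.vis_processor.train.n_frms: candidate frames per video
    "frame_num": 4,   # model.frame_num: keyframes kept after selection
    "answer_num": 5,  # model.answer_num: options per question (5 for NeXT-QA, 4 for STAR)
}


def sample_candidate_indices(total_frames, n_frms):
    """Return n_frms evenly spaced frame indices (assumed sampling scheme)."""
    step = total_frames / n_frms
    return [int(step * i + step / 2) for i in range(n_frms)]


def select_keyframes(scores, frame_num):
    """Keep the frame_num highest-scoring candidates, returned in temporal order."""
    top = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:frame_num]
    return sorted(top)


rng = random.Random(0)
candidates = sample_candidate_indices(total_frames=960, n_frms=config["n_frms"])
scores = [rng.random() for _ in candidates]  # stand-in for real relevance scores
keyframes = [candidates[i] for i in select_keyframes(scores, config["frame_num"])]
options = [f"option{i + 1}" for i in range(config["answer_num"])]
```

In this sketch n_frms must be at least frame_num, since the keyframes are a subset of the sampled candidates.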

Answer 2 · 2023-05-26T04:22:19.000Z

Thanks for your reply!

I have tested your web demo extensively, but the zero-shot results on my dataset are not very good.

The model always outputs option1. Any idea what the problem might be? I only have one GPU; is there any way to test it outside the web demo?

Answer 3 · 2023-05-26T05:00:16.000Z

We have instructions in this repo for running the Gradio demo locally and for running the evaluation.
SeViLA requires at least 12 GB of memory to load the model and run inference with batch size 1.

Answer 4 · 2023-05-26T05:05:13.000Z

Sorry for the unclear wording. I already have it running locally with Gradio. What I meant is: does it support a model.predict_answers() function, like BLIP2, for inference? That way I could test on a dataset and inspect the output.

Also, could you help me with how to use SeViLA without setting options? Should I change sevila.generate_demo to sevila.generate or sevila.predict_answers?

Answer 5 · 2023-05-26T05:16:00.000Z

Yes, you can use the generate() function to test on multi-choice QA datasets.
For open-ended answer generation, you can input only the question and decode the FlanT5 output (check here).
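
The difference between the two modes can be sketched in plain Python. Nothing here calls SeViLA: `build_prompt` and `parse_choice` are hypothetical helpers, and the prompt wording is an assumption, not the template the repo uses. The sketch only shows that multi-choice input carries the options and maps the output back to an option index, while open-ended input carries the question alone.

```python
def build_prompt(question, options=None):
    """Hypothetical prompt builder; not the template SeViLA uses internally."""
    lines = [f"Question: {question}"]
    if options:
        # Multi-choice: list the options and ask for a letter.
        for letter, option in zip("ABCDE", options):
            lines.append(f"Option {letter}: {option}")
        lines.append("Answer with the letter of the correct option.")
    else:
        # Open-ended: question only, the decoded text is the answer.
        lines.append("Answer:")
    return "\n".join(lines)


def parse_choice(decoded, num_options):
    """Map decoded text such as 'B' or 'c. because' to a 0-based option index.

    Returns None when the text does not start with a valid standalone letter.
    """
    valid = "ABCDE"[:num_options]
    text = decoded.strip().upper()
    if text and text[0] in valid and (len(text) == 1 or not text[1].isalpha()):
        return valid.index(text[0])
    return None


multi_choice = build_prompt("Why did the boy pick up the ball?", ["to throw it", "to hide it"])
open_ended = build_prompt("Why did the boy pick up the ball?")
```

Returning None for unparseable output, rather than defaulting to index 0, keeps parsing failures from being counted as predictions of the first option.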

Answer 6 · 2023-09-10T17:33:46.000Z

Same question. When I feed NeXT-QA data into the model, I always get option1 in response.
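
One way to confirm that predictions really collapse onto a single option is to tally them against the ground-truth labels. The sketch below is a generic diagnostic, not part of SeViLA; `option_distribution` is a hypothetical helper and the prediction and label lists are made-up placeholders.

```python
from collections import Counter


def option_distribution(indices, answer_num):
    """Fraction of items assigned to each of the answer_num options (0-based)."""
    counts = Counter(indices)
    total = len(indices)
    return [counts.get(i, 0) / total if total else 0.0 for i in range(answer_num)]


# Placeholder data: replace with real predicted and ground-truth option indices.
preds = [0, 0, 0, 0, 0, 0]
labels = [2, 0, 4, 1, 3, 0]

pred_dist = option_distribution(preds, answer_num=5)
label_dist = option_distribution(labels, answer_num=5)
print("predicted:", pred_dist)
print("labels:   ", label_dist)
```

If the predicted distribution puts all its mass on the first option while the labels are spread out, the collapse is real rather than an artifact of the dataset; checking that answer_num matches the number of options actually passed in is a reasonable first step.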