
Unable to Reproduce videochatgpt Benchmark Results


vhzy commented

Hello,

Thank you for your open-source contribution. I trained a model using the code you provided, but my results on the videochatgpt benchmark do not match those reported in the paper. My scores across the five metrics are 2.72, 2.47, 3.11, 2.29, and 2.71, giving an average of 2.66, whereas the paper reports an average of 2.89.
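
For reference, the 2.66 above is just the plain mean of the five per-metric scores; here is a minimal, standalone Python sketch (not using any repo code) to double-check the gap:

```python
# Per-metric videochatgpt benchmark scores from my run (listed above).
my_scores = [2.72, 2.47, 3.11, 2.29, 2.71]

# Plain mean over the five metrics.
my_avg = sum(my_scores) / len(my_scores)
reported_avg = 2.89  # average reported in the paper

print(f"my average:       {my_avg:.2f}")                 # -> 2.66
print(f"reported average: {reported_avg:.2f}")           # -> 2.89
print(f"gap:              {reported_avg - my_avg:.2f}")  # -> 0.23
```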
Since different versions of ChatGPT may affect the evaluation scores, could you please provide a pretrained model for testing? That would help verify the results. Thank you for your assistance.