lucas-ventura/CoVR

Question about the Increase from 1.2M Paired Videos to 1.6M Triplets

Thank you for sharing your research results.
I have a question related to the data generation process.

According to the paper, after going through the "Filtering caption pairs" step, 1.2M paired videos remained, and modifications were created using them. Subsequently, after filtering the video pairs, a total of 1.6M triplets were produced.

The count has increased by 0.4M compared to the paired videos. Could you explain how this happened?
My guess is that the pairs were used bidirectionally to create triplets (1.2M → 2.4M), and then decreased after filtering (2.4M → 1.6M).

Your clarification on this would be greatly appreciated!

Hi @leeesangwon, thank you for your question!

You're right: the triplet count exceeds the number of paired videos because each pair is used bidirectionally (1.2M → 2.4M). The count then drops (2.4M → 1.6M) after video filtering, as explained in the section "Filtering video pairs".
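As a rough illustration of the arithmetic only (not the actual CoVR pipeline code), here is a minimal sketch. The helper `bidirectional_candidates` and the variable names are hypothetical, and the counts are the rounded figures quoted in this thread.

```python
def bidirectional_candidates(pairs):
    """Expand each (video_a, video_b) pair into two directed
    (query, target) candidates, one per direction.

    Hypothetical helper for illustration; the real pipeline also
    generates a modification text for each direction.
    """
    candidates = []
    for video_a, video_b in pairs:
        candidates.append((video_a, video_b))  # a is the query, b the target
        candidates.append((video_b, video_a))  # b is the query, a the target
    return candidates


# Rounded counts quoted in this thread.
n_pairs = 1_200_000             # paired videos after caption-pair filtering
n_candidates = 2 * n_pairs      # each pair used in both directions
n_final = 1_600_000             # triplets left after video-pair filtering

n_removed = n_candidates - n_final
retention = n_final / n_candidates

print(f"candidates: {n_candidates:,}")
print(f"removed by video filtering: {n_removed:,}")
print(f"retention: {retention:.1%}")
```

With these rounded figures, video filtering removes roughly 0.8M of the 2.4M directed candidates, so about two thirds are kept.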

I apologize if this wasn't clear in the paper; we will clarify it in the next version. If you have any more questions or need further clarification, please don't hesitate to ask.

Thank you for the quick response! It was very helpful.
Have a great day! 😄