microsoft/UniVL

How to run the captioning task on my own video datasets?

Kevinkaiyan opened this issue · 1 comment

Hi,
Impressive work! How can I extract features from my own video-text datasets to fine-tune the model?

Hi @17321010162, plz see here.