Extension to Video Datasets

Question

kanji95 opened this issue 2 years ago · 1 comment

How do we extend X-Decoder to video datasets? The appendix mentions that the model generalizes to generic segmentation and referring segmentation on videos.

Answer 1 · 2023-01-01T09:37:43.000Z

Thanks for your question. The video datasets are simply evaluated frame by frame. For referring segmentation, the referring phrase acts as a natural tracking ID, since the same phrase is queried on every frame. For generic segmentation, we didn't apply any tracking.
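
To make the frame-by-frame idea concrete, here is a minimal sketch rather than the actual X-Decoder evaluation code. The names `segment_video_by_phrase`, `segment_frame`, and `dummy_segmenter` are hypothetical, and the "model" is a brightness threshold standing in for a real image-level referring segmenter. It only illustrates how the phrase can serve as the track ID when each frame is processed independently.

```python
from typing import Callable, Dict, List, Sequence

Frame = List[List[int]]   # toy grayscale frame: rows of pixel intensities
Mask = List[List[bool]]   # binary mask with the same shape as the frame

# Hypothetical interface: (frame, phrase) -> mask. Not an X-Decoder API.
ReferringSegmenter = Callable[[Frame, str], Mask]


def segment_video_by_phrase(
    frames: Sequence[Frame],
    phrases: Sequence[str],
    segment_frame: ReferringSegmenter,
) -> Dict[str, List[Mask]]:
    """Run an image-level referring segmenter on every frame independently.

    No temporal model or matching step is used. The masks for one phrase,
    collected across frames, form that object's track.
    """
    tracks: Dict[str, List[Mask]] = {phrase: [] for phrase in phrases}
    for frame in frames:
        for phrase in phrases:
            tracks[phrase].append(segment_frame(frame, phrase))
    return tracks


def dummy_segmenter(frame: Frame, phrase: str) -> Mask:
    """Stand-in for a real model: marks bright pixels and ignores the phrase."""
    return [[pixel > 128 for pixel in row] for row in frame]


if __name__ == "__main__":
    video = [
        [[0, 200], [50, 255]],  # frame 0
        [[255, 0], [0, 0]],     # frame 1
    ]
    result = segment_video_by_phrase(video, ["the dog"], dummy_segmenter)
    print(len(result["the dog"]))  # one mask per frame
```

For generic (non-referring) segmentation, the same per-frame loop applies, but without a phrase there is nothing linking a segment in one frame to a segment in the next, which matches the note above that no tracking was applied.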