To extract Video Swin Transformer features, please refer to: https://github.com/SwinTransformer/Video-Swin-Transformer
To extract LaViLa features, follow: https://github.com/facebookresearch/LaViLa