video-text-retrieval

There are 16 repositories under the video-text-retrieval topic.

ArrowLuo/CLIP4Clip
An official implementation for "CLIP4Clip: An Empirical Study of CLIP for End to End Video Clip Retrieval"
Language: Python
Paranioar/Awesome_Matching_Pretraining_Transfering
A paper list covering large multi-modality models, parameter-efficient finetuning, vision-language pretraining, and conventional image-text matching, for preliminary insight.
microsoft/UniVL
An official implementation for "UniVL: A Unified Video and Language Pre-Training Model for Multimodal Understanding and Generation"
Language: Python
whwu95/Cap4Video
【CVPR'2023 Highlight & TPAMI】Cap4Video: What Can Auxiliary Captions Do for Text-Video Retrieval?
Language: Python
salesforce/ALPRO
Align and Prompt: Video-and-Language Pre-training with Entity Prompts
Language: Python
m-bain/CondensedMovies
Story-Based Retrieval with Contextual Embeddings. Largest freely available movie video dataset. [ACCV'20]
Language: Python
xuguohai/X-CLIP
An official implementation for "X-CLIP: End-to-End Multi-grained Contrastive Learning for Video-Text Retrieval"
Language: Python
alipay/Ant-Multi-Modal-Framework
Research Code for Multimodal-Cognition Team in Ant Group
Language: Python
amazon-science/crossmodal-contrastive-learning
CrossCLR: Cross-modal Contrastive Learning For Multi-modal Video Representations, ICCV 2021
Language: Python
LeapLabTHU/Cross-Modal-Adapter
[arXiv] Cross-Modal Adapter for Text-Video Retrieval
RenShuhuai-Andy/TESTA
[EMNLP 2023] TESTA: Temporal-Spatial Token Aggregation for Long-form Video-Language Understanding
Language: Python
shufangxun/MAC
An end-to-end masked contrastive video-and-language pre-training framework
unitaryai/VTC
VTC: Improving Video-Text Retrieval with User Comments
Language: Python
rn-snehapriya/Automatic-Note-Taking-From-Video-Using-Tesseract-OCR
Text from the video is extracted and saved into a .docx file in the form of notes.
Language: Jupyter Notebook
Jazz1996/tech_review
Survey of state-of-the-art video-text retrieval methods.
unitaryai/VTC-dataset
Language: Python
