video-captioning
There are 86 repositories under the video-captioning topic.
YehLi/xmodaler
X-modaler is a versatile and high-performance codebase for cross-modal analytics (e.g., image captioning, video captioning, vision-language pre-training, visual question answering, visual commonsense reasoning, and cross-modal retrieval).
xiadingZ/video-caption.pytorch
PyTorch implementation of video captioning
scopeInfinity/Video2Description
Video to Text: natural-language description generator for a given video. [Video Captioning]
tomchang25/whisper-auto-transcribe
Automatic transcription tool based on Whisper
antoyang/VidChapters
[NeurIPS 2023 D&B] VidChapters-7M: Video Chapters at Scale
jayleicn/recurrent-transformer
[ACL 2020] PyTorch code for MART: Memory-Augmented Recurrent Transformer for Coherent Video Paragraph Captioning
vijayvee/video-captioning
This repository contains the code for a video captioning system inspired by *Sequence to Sequence -- Video to Text*. The system takes a video as input and generates an English caption describing it.
JasonYao81000/MLDS2018SPRING
Machine Learning and Having It Deep and Structured (MLDS), spring 2018
jpthu17/EMCL
[NeurIPS 2022 Spotlight] Expectation-Maximization Contrastive Learning for Compact Video-and-Language Representations
jssprz/video_captioning_datasets
Summary about Video-to-Text datasets. This repository is part of the review paper *Bridging Vision and Language from the Video-to-Text Perspective: A Comprehensive Review*
terry-r123/Awesome-Captioning
A curated list of multimodal captioning research (including image captioning, video captioning, and text captioning)
bytedance/Shot2Story
A new multi-shot video understanding benchmark, Shot2Story, with comprehensive video summaries and detailed shot-level captions.
jayleicn/TVCaption
[ECCV 2020] PyTorch code of MMT (a multimodal transformer captioning model) on TVCaption dataset
Kamino666/Video-Captioning-Transformer
A video captioning deep learning model built on PyTorch with the Transformer architecture. The video captioning task takes a video as input and outputs a sentence describing its content (assuming the video is short enough to be described in one sentence). The main goal of this repo is to help visually impaired people enjoy online videos and perceive their surroundings, promoting the development of accessible video.
nasib-ullah/video-captioning-models-in-Pytorch
A PyTorch implementation of state-of-the-art video captioning models from 2015–2019 on the MSVD and MSR-VTT datasets.
UARK-AICV/VLTinT
[AAAI 2023 Oral] VLTinT: Visual-Linguistic Transformer-in-Transformer for Coherent Video Paragraph Captioning
ParitoshParmar/MTL-AQA
What and How Well You Performed? A Multitask Learning Approach to Action Quality Assessment [CVPR 2019]
amazon-science/crossmodal-contrastive-learning
CrossCLR: Cross-modal Contrastive Learning For Multi-modal Video Representations, ICCV 2021
jacobswan1/Video2Commonsense
Video captioning baseline models on the Video2Commonsense dataset.
lvapeab/ABiViRNet
Attention Bidirectional Video Recurrent Net
imshaikot/srt-webvtt
Convert SRT-formatted subtitles to WebVTT on the fly in an HTML5/browser environment
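The conversion this kind of tool performs can be sketched in a few lines (a minimal Python sketch, not the repository's actual browser-side implementation): WebVTT is essentially SRT with a `WEBVTT` header prepended and `.` instead of `,` as the millisecond separator in cue timestamps.

```python
import re

def srt_to_vtt(srt_text: str) -> str:
    """Convert SRT subtitle text to WebVTT (minimal sketch).

    WebVTT requires a "WEBVTT" header line and uses '.' rather than ','
    before the milliseconds in cue timing lines.
    """
    # Rewrite only timestamp patterns, so commas in cue text are untouched.
    vtt_body = re.sub(
        r"(\d{2}:\d{2}:\d{2}),(\d{3})",
        r"\1.\2",
        srt_text,
    )
    return "WEBVTT\n\n" + vtt_body

srt = "1\n00:00:01,000 --> 00:00:04,000\nHello, world\n"
print(srt_to_vtt(srt))
```

Real converters also handle BOMs, styling tags, and malformed cues; this sketch only covers the header and timestamp rewrite.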
pochih/Video-Cap
🎬 Video Captioning: ICCV '15 paper implementation
LuoweiZhou/densecap
Dense video captioning in PyTorch
tsujuifu/pytorch_empirical-mvm
A PyTorch implementation of EmpiricalMVM
TXH-mercury/COSA
[ICLR 2024] Code and models for COSA: Concatenated Sample Pretrained Vision-Language Foundation Model
WingsBrokenAngel/delving-deeper-into-the-decoder-for-video-captioning
Source code for Delving Deeper into the Decoder for Video Captioning
acherstyx/CoCap
[ICCV 2023] Accurate and Fast Compressed Video Captioning
xiadingZ/video-caption-openNMT.pytorch
Video captioning implementation based on OpenNMT
willyfh/awesome-video-text-datasets
A curated list of video-text datasets in a variety of languages. These datasets can be used for video captioning (video description) or video retrieval.
mlvlab/MELTR
MELTR: Meta Loss Transformer for Learning to Fine-tune Video Foundation Models (CVPR 2023)
jssprz/visual_syntactic_embedding_video_captioning
Source code of the paper titled *Improving Video Captioning with Temporal Composition of a Visual-Syntactic Embedding*
zjr2000/LLMVA-GEBC
Winning solution to the Generic Event Boundary Captioning task in the LOVEU Challenge (CVPR 2023 workshop)
UARK-AICV/VLCAP
[ICIP 2022] VLCap: Vision-Language with Contrastive Learning for Coherent Video Paragraph Captioning
rohit-gupta/Video2Language
Generating video descriptions using deep learning in Keras
yangbang18/CARE
(TIP'2023) Concept-Aware Video Captioning: Describing Videos with Effective Prior Information
thtang/ADLxMLDS2017
Deep learning coursework for ADLxMLDS (CSIE 5431) at NTU