video-understanding
There are 182 repositories under the video-understanding topic.
open-mmlab/mmaction2
OpenMMLab's Next Generation Video Understanding Toolbox and Benchmark
jinwchoi/awesome-action-recognition
A curated list of action recognition and related area resources
OpenGVLab/Ask-Anything
[CVPR2024 Highlight][VideoChatGPT] ChatGPT with video understanding! And many more supported LMs such as miniGPT4, StableLM, and MOSS.
showlab/Awesome-Video-Diffusion
A curated list of recent diffusion models for video generation, editing, restoration, understanding, etc.
mit-han-lab/temporal-shift-module
[ICCV 2019] TSM: Temporal Shift Module for Efficient Video Understanding
open-mmlab/mmaction
An open-source toolbox for action understanding based on PyTorch
yjxiong/temporal-segment-networks
Code & Models for Temporal Segment Networks (TSN) in ECCV 2016
PaddlePaddle/PaddleVideo
Awesome video understanding toolkits based on PaddlePaddle. It supports video data annotation tools, lightweight RGB and skeleton based action recognition model, practical applications for video tagging and sport action detection.
MCG-NJU/VideoMAE
[NeurIPS 2022 Spotlight] VideoMAE: Masked Autoencoders are Data-Efficient Learners for Self-Supervised Video Pre-Training
yjxiong/tsn-pytorch
Temporal Segment Networks (TSN) in PyTorch
OpenGVLab/InternVideo
Video Foundation Models & Data for Multimodal Understanding
TheShadow29/awesome-grounding
awesome grounding: A curated list of research papers in visual grounding
PKU-YuanGroup/Chat-UniVi
[CVPR 2024 Highlight] Chat-UniVi: Unified Visual Representation Empowers Large Language Models with Image and Video Understanding
yjxiong/action-detection
temporal action detection with SSN
henghuiding/MeViS
[ICCV 2023] MeViS: A Large-scale Benchmark for Video Segmentation with Motion Expressions
yoosan/video-understanding-dataset
A collection of recent video understanding datasets, under construction!
chihyaoma/Activity-Recognition-with-CNN-and-RNN
Temporal Segments LSTM and Temporal-Inception for Activity Recognition
OpenGVLab/VideoMAEv2
[CVPR 2023] VideoMAE V2: Scaling Video Masked Autoencoders with Dual Masking
Vision-CAIR/MiniGPT4-video
Official code for MiniGPT4-video
MCG-NJU/TDN
[CVPR 2021] TDN: Temporal Difference Networks for Efficient Action Recognition
v-iashin/SpecVQGAN
Source code for "Taming Visually Guided Sound Generation" (Oral at the BMVC 2021)
movienet/movienet-tools
Tools for movie and video research
JunweiLiang/Multiverse
Dataset, code and model for the CVPR'20 paper "The Garden of Forking Paths: Towards Multi-Future Trajectory Prediction". And for the ECCV'20 SimAug paper.
NVlabs/STEP
STEP: Spatio-Temporal Progressive Learning for Video Action Detection. CVPR'19 (Oral)
hustvl/TeViT
Temporally Efficient Vision Transformer for Video Instance Segmentation, CVPR 2022, Oral
rohitgirdhar/ActionVLAD
ActionVLAD for video action classification (CVPR 2017)
alibaba-mmai-research/TAdaConv
[ICLR 2022] TAda! Temporally-Adaptive Convolutions for Video Understanding. This codebase provides solutions for video classification, video representation learning and temporal detection.
rlleshi/phar
deep learning sex position classifier
whwu95/Cap4Video
【CVPR'2023 Highlight & TPAMI】 Cap4Video: What Can Auxiliary Captions Do for Text-Video Retrieval?
whwu95/Text4Vis
【AAAI'2023 & IJCV】 Transferring Vision-Language Models for Visual Recognition: A Classifier Perspective
wangheda/youtube-8m
The 2nd place solution to the YouTube-8M Video Understanding Challenge by Team Monkeytyping (based on TensorFlow)
SoccerNet/sn-gamestate
SoccerNet Game State Reconstruction: End-to-End Athlete Tracking and Identification on a Minimap (CVPR24 - CVSports workshop)
fabienbaradel/object_level_visual_reasoning
PyTorch implementation of "Object level Visual Reasoning in Videos", F. Baradel, N. Neverova, C. Wolf, J. Mille, G. Mori, ECCV 2018
chinancheng/awesome-activity-prediction
Paper list of activity prediction and related area
antoyang/TubeDETR
[CVPR 2022 Oral] TubeDETR: Spatio-Temporal Video Grounding with Transformers
antoyang/VidChapters
[NeurIPS 2023 D&B] VidChapters-7M: Video Chapters at Scale