video-recognition

There are 75 repositories under the video-recognition topic.

  • kenshohara/3D-ResNets-PyTorch

    3D ResNets for Action Recognition (CVPR 2018)

    Language: Python
  • jinwchoi/awesome-action-recognition

    A curated list of action recognition and related area resources

  • PaddlePaddle/PaddleVideo

    Awesome video understanding toolkits based on PaddlePaddle. It supports video data annotation tools, lightweight RGB and skeleton-based action recognition models, and practical applications for video tagging and sport action detection.

    Language: Python
  • SwinTransformer/Video-Swin-Transformer

    This is an official implementation of "Video Swin Transformer".

    Language: Python
  • subho406/OmniNet

    Official PyTorch implementation of "OmniNet: A unified architecture for multi-modal multi-task learning" | Authors: Subhojeet Pramanik, Priyanka Agrawal, Aman Hussain

    Language: Python
  • edenai/edenai-apis

    Eden AI: simplify the use and deployment of AI technologies by providing a unique API that connects to the best possible AI engines

    Language: Python
  • apoorva-dave/LicensePlateDetector

    Detects a car's license plate and recognizes its characters

    Language: Python
  • datamllab/autovideo

    AutoVideo: An Automated Video Action Recognition System

    Language: Python
  • Atze00/MoViNet-pytorch

    MoViNets PyTorch implementation: Mobile Video Networks for Efficient Video Recognition

    Language: Jupyter Notebook
  • tea1528/Non-Local-NN-Pytorch

    PyTorch implementation of Non-Local Neural Networks (https://arxiv.org/pdf/1711.07971.pdf)

    Language: Python
  • whwu95/Text4Vis

    【AAAI'2023 & IJCV】Transferring Vision-Language Models for Visual Recognition: A Classifier Perspective

    Language: Python
  • whwu95/GPT4Vis

    GPT4Vis: What Can GPT-4 Do for Zero-shot Visual Recognition?

    Language: Python
  • whwu95/BIKE

    【CVPR'2023】Bidirectional Cross-Modal Knowledge Exploration for Video Recognition with Pre-trained Vision-Language Models

    Language: Python
  • cooperdk/YAPO-e-plus

    YAPO e+ - Yet Another Porn Organizer (extended)

    Language: Python
  • kenshohara/3D-ResNets

    3D ResNets for Action Recognition

    Language: Lua
  • ldkong1205/TranSVAE

    [NeurIPS 2023] Unsupervised Video Domain Adaptation for Action Recognition: A Disentanglement Perspective

    Language: Jupyter Notebook
  • rohitgirdhar/CATER

    CATER: A diagnostic dataset for Compositional Actions and TEmporal Reasoning

    Language: Python
  • Ha0Tang/HandGestureRecognition

    [Neurocomputing 2019] Fast and Robust Dynamic Hand Gesture Recognition via Key Frames Extraction and Feature Fusion

    Language: C++
  • DmitryRyumin/WACV-2024-Papers

    WACV 2024 Papers: Discover cutting-edge research from WACV 2024, the leading computer vision conference. Stay updated on the latest in computer vision and deep learning, with code included. ⭐ support visual intelligence development!

    Language: Python
  • yanbeic/CCL

    PyTorch implementation of the paper [CVPR 2021] Distilling Audio-Visual Knowledge by Compositional Contrastive Learning

    Language: Python
  • Fl1s/turron

    A search system that analyzes short video snippets (2–5 seconds) and finds highly accurate matches using keyframe-based perceptual hashing. A self-hosted Shazam for video.

    Language: Java
  • BeSpontaneous/FFN-pytorch

    Frame Flexible Network (CVPR2023)

    Language: Python
  • fmahoudeau/MiCT-Net-PyTorch

    Video Recognition using Mixed Convolutional Tube (MiCT) on PyTorch with a ResNet backbone

    Language: Python
  • karolzak/conv3d-video-action-recognition

    My experimentation around action recognition in videos. Contains a Keras implementation of the C3D network based on the original paper "Learning Spatiotemporal Features with 3D Convolutional Networks" (Tran et al.) and includes video processing pipelines coded using the mPyPl package. The model is benchmarked on the popular UCF101 dataset and achieves results similar to those reported by the authors.

    Language: Python
  • bytedance/Portrait-Mode-Video

    Video dataset dedicated to portrait-mode video recognition.

    Language: Python
  • MrinalJain17/Human-Activity-Recognition

    Recognizing human activities using Deep Learning

    Language: Jupyter Notebook
  • Nasdin/VideoRecognition-realtime-autotrainer-alerts

    State-of-the-art real-time object detection using the YOLOv3 algorithm, augmented with a process that allows easy training of the classifier as a plug-and-play solution. Provides an alert if an item on an alert list is detected.

    Language: Python
  • mgalushka/pedestrians-traffic-calc

    Tracking and counting pedestrians from a webcam video stream #douhack Donetsk

    Language: Processing
  • martinetoering/ViCC

    [WACV'22] Code repository for the paper "Self-supervised Video Representation Learning with Cross-Stream Prototypical Contrasting", https://arxiv.org/abs/2106.10137.

    Language: Python
  • gorjanradevski/revisiting-spatial-temporal-layouts

    Codebase for "Revisiting spatio-temporal layouts for compositional action recognition" (Oral at BMVC 2021).

    Language: Python
  • ZJCV/TSM

    [ICCV 2019] TSM: Temporal Shift Module for Efficient Video Understanding

    Language: Python
  • ZJCV/X3D

    [CVPR 2020] X3D: Expanding Architectures for Efficient Video Recognition

    Language: Python
  • JunweiLiang/MultiTrain

    Code and model for "Multi-dataset Training of Transformers for Robust Action Recognition", NeurIPS 2022 Spotlight

    Language: Python
  • reddyav1/RoCoG-v2

    RoCoG-v2 (Robot Control Gestures) is a dataset intended to support the study of synthetic-to-real and ground-to-air video domain adaptation.

    Language: Python
  • fcogidi/X3D-tf

    An implementation of the X3D video recognition architecture in TensorFlow/Keras

    Language: Python
  • wjun0830/MOVE

    Official PyTorch Repository of "Minority-Oriented Vicinity Expansion with Attentive Aggregation for Video Long-Tailed Recognition" (AAAI 2023 Oral Paper) and Imbalanced-MiniKinetics200 dataset.

    Language: Python