pytorch-video-understanding

This codebase provides a comprehensive video understanding solution for video classification and temporal detection. It is the official PyTorch implementation of TAda! Temporally-Adaptive Convolutions for Video Understanding.


Key features:

  • Video classification: State-of-the-art video models, with self-supervised representation learning approaches for pre-training and a supervised classification pipeline for fine-tuning.
  • Video temporal detection: Strong features ready for both feature-level classification and localization, as well as a standard pipeline that uses these features for temporal action detection.

The approaches implemented in this repo include but are not limited to the following papers:

Latest

[2021-10] Code and models are released!

Model Zoo

Our pre-trained models are listed in MODEL_ZOO.md.

Feature Zoo

Strong features for HACS and Epic-Kitchens-100 are provided in FEATURE_ZOO.md.

Guidelines

Using this repo generally involves three steps: installation, data preparation, and running. See GUIDELINES.md for details.

Contributors

This codebase is written and maintained by Ziyuan Huang, Zhiwu Qing and Xiang Wang.

If you find our codebase useful, please consider citing the respective works :).

Upcoming

ParamCrop: Parametric Cubic Cropping for Video Contrastive Learning.