Awesome Action Recognition

A curated list of resources for action recognition and related areas (e.g. object recognition, pose estimation), inspired by awesome-computer-vision.

Action Recognition

Deep Learning for Videos: A 2018 Guide to Action Recognition - A summary of landmark action recognition research papers up to 2018.

Datasets

HACS
Moments in Time, paper
AVA, paper, [INRIA web] for missing videos
Kinetics, paper, download toolkit
YouTube-8M, technical report
YouTube-BB, technical report
DALY - Daily Action Localization in YouTube videos. Note: a weakly supervised action detection dataset. Annotations consist of the start and end time of each action, with one bounding box per action per video.
20BN-JESTER, 20BN-SOMETHING-SOMETHING
ActivityNet - Note: a download script and evaluation code are provided here.
Charades
Charades-Ego, paper - Dataset of aligned first-person and third-person videos.
EPIC-Kitchens, paper - First-person videos recorded in kitchens. Note: download scripts and a Python library are provided here.
Sports-1M - Large scale action recognition dataset.
THUMOS14 - Note: it overlaps with the UCF-101 dataset.
THUMOS15 - Note: it overlaps with the UCF-101 dataset.
HOLLYWOOD2: Spatio-Temporal annotations
UCF-101, annotations provided by THUMOS-14, corrupted annotation list, UCF-101 corrected annotations, and a different version of the annotations. Some pre-computed spatiotemporal action detection results are also available.
UCF-50
UCF-Sports - Note: the train/test split link on the official website is broken. Instead, you can download it from here.
HMDB
J-HMDB
LIRIS-HARL
KTH
MSR Action - Note: it overlaps with the KTH dataset.
Sports Videos in the Wild
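
The DALY note above describes each annotation as a start time, an end time, and one bounding box per action. As a rough illustration only, here is how such a record might be held in Python. The class name, field names, box convention (`x1, y1, x2, y2`), and the `temporal_iou` helper are all assumptions made for this sketch; they are not taken from the DALY release, whose actual file format should be checked on the dataset page.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class ActionAnnotation:
    """Hypothetical record for one annotated action (not the official DALY schema)."""
    video_id: str
    label: str
    start_sec: float  # action start time in seconds
    end_sec: float    # action end time in seconds
    box_xyxy: Tuple[float, float, float, float]  # assumed (x1, y1, x2, y2) layout

    def __post_init__(self) -> None:
        # Reject empty or reversed temporal extents early.
        if self.end_sec <= self.start_sec:
            raise ValueError("end_sec must be greater than start_sec")

    @property
    def duration_sec(self) -> float:
        return self.end_sec - self.start_sec


def temporal_iou(a: ActionAnnotation, b: ActionAnnotation) -> float:
    """Intersection-over-union of two temporal extents (boxes are ignored)."""
    inter = max(0.0, min(a.end_sec, b.end_sec) - max(a.start_sec, b.start_sec))
    union = a.duration_sec + b.duration_sec - inter
    return inter / union


if __name__ == "__main__":
    gt = ActionAnnotation("video_0001", "drinking", 2.0, 6.0, (10.0, 20.0, 110.0, 220.0))
    pred = ActionAnnotation("video_0001", "drinking", 4.0, 8.0, (12.0, 18.0, 108.0, 215.0))
    print(gt.duration_sec, temporal_iou(gt, pred))
```

Temporal overlap of this kind is a common way to compare a predicted action segment with a ground-truth one; each benchmark defines its own evaluation protocol, so treat the helper as a generic example rather than any dataset's official metric.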

Object Detection

Deformable Convolutional Networks - J. Dai et al., ICCV2017. [official code]
Detectron - Open source object detection framework from Facebook AI Research. Includes Mask R-CNN, FPN, etc. Caffe2 implementation.
Mask R-CNN - K. He et al, [Detectron], [TensorFlow + Keras], [MXNet], [TensorFlow], [PyTorch] - State-of-the-art object detection/instance segmentation algorithm.
Faster R-CNN - S. Ren et al, NIPS2015. [official MatCaffe code], [PyCaffe], [TensorFlow], [Another TF implementation] [Keras] - State-of-the-art object detector.
YOLO - J. Redmon et al, CVPR2016. [official code], [TensorFlow] - Fast object detector.
YOLO9000 - J. Redmon and A. Farhadi, CVPR2017. [official code] - State-of-the-art object detector which can detect over 9000 object categories in real time.
SSD - W. Liu et al, ECCV2016. [official PyCaffe code], [TensorFlow], [Keras] - State-of-the-art object detector with realtime processing speed.
RetinaNet - T.-Y. Lin et al. (Facebook AI Research), ICCV2017. [Keras] - State-of-the-art object detector with realtime processing speed.

Video Object Detection

Detect to Track and Track to Detect - C. Feichtenhofer et al., ICCV2017. [code], [project web]
Flow-Guided Feature Aggregation for Video Object Detection (FGFA) - X. Zhu et al., ICCV2017. [code]

Pose Estimation

AlphaPose - PyTorch based realtime and accurate pose estimation and tracking tool from SJTU.
Detect-and-Track: Efficient Pose Estimation in Videos - R. Girdhar et al., arXiv2017.
OpenPose Library - Caffe based realtime pose estimation library from CMU.
Realtime Multi-Person 2D Pose Estimation using Part Affinity Fields - Z. Cao et al, CVPR2017. [code] depends on the [caffe RT pose] - Earlier version of OpenPose from CMU.
DensePose [code] - Dense human pose estimation in the wild, implemented in the Detectron framework.
MultiPoseNet: Fast Multi-Person Pose Estimation using Pose Residual Network - M. Kocabas et al, ECCV2018. [code]

To the extent possible under law, Jinwoo Choi has waived all copyright and related or neighboring rights to this work.

Please read the contribution guidelines, then feel free to send me pull requests or email (jinchoi@vt.edu) to add links.