Collection of papers and other resources for object detection and tracking using deep learning
- Region Proposal
- RCNN
- Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks (tpami17) (pdf, notes)
- RFCN - Object Detection via Region-based Fully Convolutional Networks (nips16) (pdf, notes) [Microsoft Research]
- Mask R-CNN (iccv17) (pdf, notes, arxiv, code (keras), code (tensorflow)) [Facebook AI Research]
- YOLO
- SSD
- RetinaNet
- Misc
- Tubelet
- FGFA
- RNN
- Deep Learning
- Tracking The Untrackable: Learning To Track Multiple Cues with Long-Term Dependencies (ax1704/iccv17) (Stanford) (pdf, arxiv, project page, notes)
- Reinforcement Learning
- Learning to Track: Online Multi-object Tracking by Decision Making (iccv15) (Stanford) (pdf, code (Matlab), project page, notes)
- Network Flow
- Near-Online Multi-target Tracking with Aggregated Local Flow Descriptor (iccv15) (NEC Labs) (pdf, author page, notes)
- Deep Network Flow for Multi-Object Tracking (cvpr17) (NEC Labs) (pdf, supplementary, notes)
- Graph Optimization
- A Multi-cut Formulation for Joint Segmentation and Tracking of Multiple Objects (arxiv July 2016) (highest MT on MOT2015) (University of Freiburg, Germany) (pdf, arxiv, author page, notes)
- Baseline
- Reinforcement Learning
- Deep Reinforcement Learning for Visual Object Tracking in Videos (arxiv April 2017) (UC Santa Barbara, Samsung Research) (pdf, arxiv, author page, notes)
- Visual Tracking by Reinforced Decision Making (arxiv February 2017) (Seoul National University, Chung-Ang University) (pdf, arxiv, author page, notes)
- Action-Decision Networks for Visual Tracking with Deep Reinforcement Learning (cvpr17) (Seoul National University) (pdf, supplementary, project page, notes)
- End-to-end Active Object Tracking via Reinforcement Learning (arxiv 30 May 2017) (Peking University, Tencent AI Lab) (pdf, arxiv)
- Siamese
- Video Frame Interpolation via Adaptive Convolution (cvpr17 / iccv17) (pdf (cvpr17), pdf (iccv17), ppt)
- Variational
- Multi Object Tracking
- IDOT
- UA-DETRAC Benchmark Suite
- GRAM Road-Traffic Monitoring
- Stanford Drone Dataset
- Ko-PER Intersection Dataset
- TRANCOS
- Urban Tracker
- DARPA VIVID / PETS 2005 (Non-stationary camera)
- KIT-AKS (No ground truth)
- CBCL StreetScenes Challenge Framework (No top-down viewpoint)
- MOT 2015 (mostly street-level camera viewpoint)
- MOT 2016 (mostly street-level camera viewpoint)
- MOT 2017 (mostly street-level camera viewpoint)
- PETS 2009 (No vehicles)
- PETS 2017 (Low density; mostly pedestrians)
- KITTI Tracking Dataset (No top-down viewpoint; non-stationary camera)
- The WILDTRACK Seven-Camera HD Dataset (pedestrian detection and tracking)
- 3D Traffic Scene Understanding from Movable Platforms (intersection traffic/stereo setup/moving camera)
- Video Understanding / Activity Recognition
- Video Detection
- Static Detection
- Static Segmentation
- Video Segmentation
- Classification
- Optical Flow
- Datasets
- Single Object Tracking
- Multi Object Tracking
- Deep Compressed Sensing
- Misc
- Static Detection
- Deep Learning for Object Detection: A Comprehensive Review
- Review of Deep Learning Algorithms for Object Detection
- A Simple Guide to the Versions of the Inception Network
- R-CNN, Fast R-CNN, Faster R-CNN, YOLO - Object Detection Algorithms
- A gentle guide to deep learning object detection
- The intuition behind RetinaNet
- YOLO—You only look once, real time object detection explained
- Understanding Feature Pyramid Networks for object detection (FPN)
- Fast object detection with SqueezeDet on Keras
- Region of interest pooling explained
- Video Detection
- Deep RL
- Autoencoders
- Multi Object Tracking
- Globally-optimal greedy algorithms for tracking a variable number of objects [cvpr11] [matlab] [author]
- Continuous Energy Minimization for Multitarget Tracking [cvpr11 / iccv11 / tpami 2014] [matlab]
- Discrete-Continuous Energy Minimization for Multi-Target Tracking [cvpr12] [matlab] [project]
- The way they move: Tracking multiple targets with similar appearance [iccv13] [matlab]
- 3D Traffic Scene Understanding from Movable Platforms [2d_tracking] [pami14/kit13/iccv13/nips11] [C++/matlab]
- Multiple target tracking based on undirected hierarchical relation hypergraph [cvpr14] [C++] [author]
- Robust online multi-object tracking based on tracklet confidence and online discriminative appearance learning [cvpr14] [matlab] (project)
- Learning to Track: Online Multi-Object Tracking by Decision Making [iccv15] [matlab]
- Joint Tracking and Segmentation of Multiple Targets [cvpr15] [matlab]
- Multiple Hypothesis Tracking Revisited [iccv15] [highest MT on MOT2015 among open source trackers] [matlab]
- Simple Online and Realtime Tracking [icip 2016] [python]
- Deep SORT: Simple Online and Realtime Tracking with a Deep Association Metric [icip 2017] [python]
- Combined Image- and World-Space Tracking in Traffic Scenes [icra 2017] [c++]
- High-Speed Tracking-by-Detection Without Using Image Information [avss 2017] [python]
- Single Object Tracking
- A collection of common tracking algorithms (2003-2012) [c++/matlab]
- SenseTime Research platform for single object tracking, implementing algorithms like SiamRPN and SiamMask [pytorch]
- In Defense of Color-based Model-free Tracking [cvpr15] [c++]
- Hierarchical Convolutional Features for Visual Tracking [iccv15] [matlab]
- Visual Tracking with Fully Convolutional Networks [iccv15] [matlab]
- DeepTracking: Seeing Beyond Seeing Using Recurrent Neural Networks [aaai 2016] [torch 7]
- Learning Multi-Domain Convolutional Neural Networks for Visual Tracking [cvpr16] [vot2015 winner] [matlab/matconvnet]
- Beyond Correlation Filters: Learning Continuous Convolution Operators for Visual Tracking [eccv 2016] [matlab]
- Fully-Convolutional Siamese Networks for Object Tracking [eccvw 2016] [matlab/matconvnet] [project]
- DCFNet: Discriminant Correlation Filters Network for Visual Tracking [arxiv1704] [matlab/matconvnet] [pytorch]
- End-to-end representation learning for Correlation Filter based tracking [cvpr17] [matlab/matconvnet] [tensorflow/inference_only] [project]
- RATM: Recurrent Attentive Tracking Model [cvprw17] [python]
- ROLO: Spatially Supervised Recurrent Convolutional Neural Networks for Visual Object Tracking [iscas 2017] [tensorflow]
- ECO: Efficient Convolution Operators for Tracking [cvpr17] [matlab]
- Detect to Track and Track to Detect [iccv17] [matlab]
- High Performance Visual Tracking with Siamese Region Proposal Network [cvpr18] [pytorch] [pytorch/reimplementation]
- Distractor-aware Siamese Networks for Visual Object Tracking [eccv18] [vot18 winner] [pytorch]
- Fast Online Object Tracking and Segmentation: A Unifying Approach [cvpr19] [pytorch] [project]
- Video Detection
- Flow-Guided Feature Aggregation for Video Object Detection [nips 2016 / iccv17] [python/cuda]
- T-CNN: Tubelets with Convolutional Neural Networks [cvpr16] [python]
- TPN: Tubelet Proposal Network [cvpr17] [python]
- Mobile Video Object Detection with Temporally-Aware Feature Maps [cvpr18] [Google] [tensorflow]
- Static Detection and Matching
- Frameworks
- Region Proposal
- MCG: Multiscale Combinatorial Grouping - Object Proposals and Segmentation (project) [tpami16/cvpr14] [python]
- COB: Convolutional Oriented Boundaries (project) [tpami18/eccv16] [matlab/caffe]
- FPN
- Feature Pyramid Networks for Object Detection [caffe/python]
- RCNN
- RFCN (author) [caffe/matlab]
- RFCN-tensorflow [tensorflow]
- PVANet: Lightweight Deep Neural Networks for Real-time Object Detection
- Mask R-CNN - TensorFlow, Keras
- Light-head R-CNN [cvpr18] [TensorFlow]
- Evolving Boxes for Fast Vehicle Detection [icme18] [Caffe/Python]
- Cascade R-CNN (cvpr18) - Detectron, Caffe
- SSD
- SSD-Tensorflow [tensorflow]
- SSD-Tensorflow (tf.estimator) [tensorflow]
- SSD-Tensorflow (tf.slim) [tensorflow]
- SSD-Keras [keras]
- SSD-Pytorch [pytorch]
- Enhanced SSD with Feature Fusion and Visual Reasoning [NCA18] [TensorFlow]
- RefineDet - Single-Shot Refinement Neural Network for Object Detection [cvpr18] [caffe]
- YOLO
- Darknet: Convolutional Neural Networks [c/python]
- YOLO9000: Better, Faster, Stronger - Real-Time Object Detection. 9000 classes! [c/python]
- Darkflow [tensorflow]
- Pytorch Yolov2 [pytorch]
- Yolo-v3 and Yolo-v2 for Windows and Linux [c/python]
- YOLOv3 in PyTorch [pytorch]
- pytorch-yolo-v3 [pytorch] [no training] [tutorial]
- YOLOv3_TensorFlow [tensorflow]
- tensorflow-yolo-v3 [tensorflow slim]
- tensorflow-yolov3 [tensorflow slim]
- keras-yolov3 [keras]
- Relation Networks for Object Detection [cvpr18] [MXNet]
- DeNet: Scalable Real-time Object Detection with Directed Sparse Sampling [iccv17(poster)] [theano]
- SNIPER: Efficient Multi-Scale Training [cvpr18 / nips18] [mxnet]
- Multi-scale Location-aware Kernel Representation for Object Detection [cvpr18] [caffe/python]
- Matching
- Boundary Detection
- Optical Flow
- FlowNet 2.0: Evolution of Optical Flow Estimation with Deep Networks (cvpr17) - caffe, pytorch/nvidia
- SPyNet: Spatial Pyramid Network for Optical Flow (cvpr17) - lua, pytorch
- Guided Optical Flow Learning (cvprw17) - caffe, tensorflow
- Fast Optical Flow using Dense Inverse Search (DIS) [eccv16] [C++]
- A Filter Formulation for Computing Real Time Optical Flow [ral16] [c++/cuda - matlab,python wrappers]
- PatchBatch - a Batch Augmented Loss for Optical Flow [cvpr16] [python/theano]
- Piecewise Rigid Scene Flow [iccv13/eccv14/ijcv15] [c++/matlab]
- DeepFlow v2 (iccv13) - c++/python/matlab, project
- An Evaluation of Data Costs for Optical Flow [gcpr13] [matlab]
- Instance Segmentation
- Fully Convolutional Instance-aware Semantic Segmentation [cvpr17] [coco16 winner] [mxnet]
- Autoencoders
- β-VAE: Learning Basic Visual Concepts with a Constrained Variational Framework [iclr 2017] [deepmind] [tensorflow] [tensorflow] [pytorch]
- Disentangling by Factorising [arxiv 2018/06] [pytorch]
- Classification
- Learning Efficient Convolutional Networks Through Network Slimming [iccv17] [pytorch]
- Deep RL
- Misc