Deep-Learning-for-Tracking-and-Detection

Collection of papers and other resources for object tracking and detection using deep learning

Static Detection

  • Region Proposal
    • Scalable Object Detection Using Deep Neural Networks (cvpr14) (pdf, notes)
    • Selective Search for Object Recognition (ijcv2013) (pdf, notes)
  • RCNN
    • Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks (tpami17) (pdf, notes)
    • R-FCN: Object Detection via Region-based Fully Convolutional Networks (nips16) (pdf, notes) [Microsoft Research]
    • Mask R-CNN (iccv17) (pdf, notes, arxiv, code (keras), code (tensorflow)) [Facebook AI Research]
  • YOLO
    • You Only Look Once: Unified, Real-Time Object Detection (ax1605) (pdf, notes)
    • YOLO9000: Better, Faster, Stronger (ax1612) (pdf, notes)
    • YOLOv3: An Incremental Improvement (ax1804) (pdf, notes)
  • SSD
    • SSD: Single Shot MultiBox Detector (ax1612/eccv16) (pdf, notes)
    • DSSD: Deconvolutional Single Shot Detector (ax1701) (pdf, notes)
  • RetinaNet
    • Feature Pyramid Networks for Object Detection (ax1704) (pdf, notes)
    • Focal Loss for Dense Object Detection (ax180207/iccv17) (pdf, notes)
  • Misc
    • OverFeat: Integrated Recognition, Localization and Detection using Convolutional Networks (ax1402/iclr14) (pdf, notes)
    • LSDA: Large Scale Detection through Adaptation (ax1411/nips14) (pdf, notes)

Video Detection

  • Tubelet
    • Object Detection from Video Tubelets with Convolutional Neural Networks (cvpr16) (pdf, notes)
    • Object Detection in Videos with Tubelet Proposal Networks (ax1704/cvpr17) (pdf, notes)
  • FGFA
    • Deep Feature Flow for Video Recognition (cvpr17) (pdf, arxiv, code) [Microsoft Research]
    • Flow-Guided Feature Aggregation for Video Object Detection (ax1708/iccv17) (pdf, notes)
    • Towards High Performance Video Object Detection (ax1711) (Microsoft) (pdf, notes)
  • RNN
    • Online Video Object Detection using Association LSTM (iccv17) (pdf, notes)
    • Context Matters: Refining Object Detection in Video with Recurrent Neural Networks (bmvc16) (pdf, notes)

Multi Object Tracking

  • Deep Learning
    • Tracking The Untrackable: Learning To Track Multiple Cues with Long-Term Dependencies (ax1704/iccv17) (Stanford) (pdf, arxiv, project page, notes)
  • Reinforcement Learning
  • Network Flow
    • Near-Online Multi-target Tracking with Aggregated Local Flow Descriptor (iccv15) (NEC Labs) (pdf, author page, notes)
    • Deep Network Flow for Multi-Object Tracking (cvpr17) (NEC Labs) (pdf, supplementary, notes)
  • Graph Optimization
    • A Multi-cut Formulation for Joint Segmentation and Tracking of Multiple Objects (arxiv July 2016) (highest MT on MOT2015) (University of Freiburg, Germany) (pdf, arxiv, author page, notes)
  • Baseline
    • Simple Online and Realtime Tracking (icip16) (pdf, notes, code)
    • High-Speed Tracking-by-Detection Without Using Image Information (avss17) (pdf, notes, code)

Single Object Tracking

  • Reinforcement Learning
    • Deep Reinforcement Learning for Visual Object Tracking in Videos (arxiv April 2017) (UC Santa Barbara, Samsung Research) (pdf, arxiv, author page, notes)
    • Visual Tracking by Reinforced Decision Making (arxiv February 2017) (Seoul National University, Chung-Ang University) (pdf, arxiv, author page, notes)
    • Action-Decision Networks for Visual Tracking with Deep Reinforcement Learning (cvpr17) (Seoul National University) (pdf, supplementary, project page, notes)
    • End-to-end Active Object Tracking via Reinforcement Learning (arxiv 30 May 2017) (Peking University, Tencent AI Lab) (pdf, arxiv)
  • Siamese
    • High Performance Visual Tracking with Siamese Region Proposal Network (cvpr18) (pdf, author, notes)

Deep Learning

  • Do Deep Nets Really Need to be Deep? (NIPS 2014) (pdf, notes)
  • Synthetic Gradients
    • Decoupled Neural Interfaces using Synthetic Gradients (arxiv August 2016) (pdf, notes)
    • Understanding Synthetic Gradients and Decoupled Neural Interfaces (arxiv March 2017) (pdf, notes)

Unsupervised Learning

  • Learning Features by Watching Objects Move (cvpr17) (pdf, notes)

Interpolation

Autoencoder

  • Variational
    • beta-VAE: Learning Basic Visual Concepts with a Constrained Variational Framework (iclr17) (pdf, notes)
    • Disentangling by Factorising (ax1806) (pdf, notes)

Datasets

Collections

Tutorials

Code