
List of Machine Learning papers

MIT License

Machine Learning Papers

Introduction

RNN (Recurrent neural network)

DCGAN (Deep Convolutional Generative Adversarial Networks)

Computer Vision

Dataset

  • Microsoft COCO: Common Objects in Context: New image recognition, segmentation, and captioning dataset. Link to COCO Dataset
    • Image annotation with Amazon's Mechanical Turk
    • Bounding Box Detection with DPMv5-P/DPMv5-C
  • YouTube-BoundingBoxes: A Large High-Precision Human-Annotated Data Set for Object Detection in Video: New large-scale data set of video URLs with densely sampled object bounding box annotations (approximately 380,000 video segments, each about 19 s long).
  • AVA: A Video Dataset of Spatio-temporally Localized Atomic Visual Actions: New video dataset. Every person is localized with a bounding box, and the attached labels describe the actions that person is performing. Each person has one action for their pose (standing, sitting, walking, swimming, etc.) and may have additional actions for interactions with objects or with other people. The main differences from existing video datasets are:
    • the definition of atomic visual actions, which avoids collecting data for each and every complex action
    • precise spatio-temporal annotations with possibly multiple annotations for each human
    • the use of diverse, realistic video material (movies)
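
The AVA labeling scheme described above (one bounding box per person, exactly one pose action, plus optional interaction actions) can be sketched as a small record type. This is a minimal illustration only: the class name, field names, box convention, and label strings are hypothetical and are not taken from the dataset's actual file format.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class PersonAnnotation:
    """One localized person at one annotated moment (hypothetical schema)."""

    video_id: str
    timestamp_s: float
    # Assumed box convention: (x_min, y_min, x_max, y_max), normalized to [0, 1].
    box: Tuple[float, float, float, float]
    # Exactly one pose action per person.
    pose_action: str
    # Zero or more additional atomic actions.
    object_interactions: List[str] = field(default_factory=list)
    person_interactions: List[str] = field(default_factory=list)

    def all_actions(self) -> List[str]:
        """Return the pose action followed by any interaction actions."""
        return [self.pose_action, *self.object_interactions, *self.person_interactions]


def group_by_frame(
    annotations: List[PersonAnnotation],
) -> Dict[Tuple[str, float], List[PersonAnnotation]]:
    """Group annotations so that each (video, timestamp) lists every person in it."""
    frames: Dict[Tuple[str, float], List[PersonAnnotation]] = {}
    for ann in annotations:
        frames.setdefault((ann.video_id, ann.timestamp_s), []).append(ann)
    return frames


# Two people in the same frame, each with their own box and labels.
a = PersonAnnotation("clip_001", 902.0, (0.1, 0.2, 0.4, 0.9), "sitting",
                     object_interactions=["hold (an object)"])
b = PersonAnnotation("clip_001", 902.0, (0.5, 0.2, 0.8, 0.9), "standing",
                     person_interactions=["talk to"])
frames = group_by_frame([a, b])
```

Keeping the pose action as a required field and the interactions as optional lists mirrors the "one pose, possibly several interactions" structure, and grouping by frame reflects that a single moment can carry annotations for several people.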

Meta-architecture & feature extractor

Models

Others

Face recognition

Dataset

Feature extractor

  • FaceNet: A Unified Embedding for Face Recognition and Clustering: Directly learns a mapping from face images to a compact Euclidean space in which distances correspond to a measure of face similarity. The benefit of this approach is much greater representational efficiency: the authors achieve state-of-the-art face recognition performance using only 128 bytes per face.
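
To make the "distance corresponds to similarity" idea concrete, the sketch below compares face embeddings by squared Euclidean distance and decides whether two faces match by thresholding it. It assumes the embeddings are already produced by some model; the toy 4-dimensional vectors, the function names, the L2-normalization step, and the threshold of 1.0 are illustrative assumptions, not values from the paper.

```python
import math
from typing import List, Sequence


def l2_normalize(v: Sequence[float]) -> List[float]:
    """Scale a vector to unit length so distances are comparable across faces."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in v]


def squared_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two embeddings of equal length."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same dimension")
    return sum((x - y) ** 2 for x, y in zip(a, b))


def same_identity(a: Sequence[float], b: Sequence[float], threshold: float = 1.0) -> bool:
    """Treat two faces as the same person if their embeddings are close enough.

    The threshold is an assumed value; in practice it would be tuned on
    held-out pairs of faces.
    """
    return squared_distance(l2_normalize(a), l2_normalize(b)) < threshold


# Toy embeddings standing in for model outputs (hypothetical values).
anchor = [0.9, 0.1, 0.0, 0.1]
same_person = [0.8, 0.2, 0.1, 0.1]
other_person = [0.0, 0.1, 0.9, 0.2]

match = same_identity(anchor, same_person)
mismatch = same_identity(anchor, other_person)
```

Because the comparison reduces to a distance between fixed-size vectors, verification, recognition (nearest neighbor), and clustering can all reuse the same embedding without a separate classifier per identity.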

Models

Comparisons & Benchmark

Neural style algorithm