
Tracing back and presenting in chronological order the main ideas in the field of deep learning, to help everyone better understand the current wave of intense research in AI.

A chronology of deep learning

Hey everyone who is reading this!

So what is the hype around deep learning? Why is everyone talking about it? What happened? Well, over the last three decades a lot of awesome ideas came out, leading to exceptional breakthroughs on the general benchmark tasks used to evaluate the performance of AI systems, such as image classification, speech recognition, etc. To give the bigger picture, this repository tries to list in chronological order the main papers about deep learning. This list of 70 selected papers covers all deep learning applications and research areas, including image recognition, machine translation, speech recognition, optimization and meta-learning. The number of citations is given according to Google Scholar stats.

Before the 1980s

1980s

1990s

Despite promising breakthroughs in the late 1980s, AI entered a new Winter era in the 1990s, during which there were few developments (especially compared to what happened in the 2010s). Deep learning approaches were discredited because of their lackluster performance, mostly due to a lack of training data and computational power.

  • Bengio's team was the first to show how hard it can be to learn dependencies over long time spans:
    Learning long-term dependencies with gradient descent is difficult, Bengio et al., 1994, IEEE, 2418 citations
  • The wake-sleep algorithm inspired the autoencoder family of neural networks:
    The wake-sleep algorithm for unsupervised neural networks, Hinton et al., 1995, Science, 942 citations
  • Convolutional neural networks (CNNs) were developed in the early 1990s, mostly by Yann LeCun, and their broad applicability was described here:
    Convolutional networks for images, speech, and time-series, Yann LeCun & Yoshua Bengio, 1995, The Handbook of Brain Theory and Neural Networks, 1550 citations
  • LSTMs, still widely used today for sequence modeling, are actually quite an old invention:
    Long short-term memory, Hochreiter et al., 1997, Neural Computation, 9811 citations
  • Roughly around the same time as LSTMs came the idea of training RNNs in both directions, so that the hidden states have access to input elements from both the past and the future (see the short sketch after this list):
    Bidirectional recurrent neural networks, Schuster et al., 1997, IEEE Transactions on Signal Processing, 1167 citations
  • At the end of the 1990s, Yoshua Bengio and Yann LeCun, regarded today as two of the godfathers of deep learning, generalized document recognition via neural networks trained by gradient descent, and introduced Graph Transformer Networks:
    Gradient-based learning applied to document recognition, LeCun et al., 1998, IEEE, 12546 citations (!)
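
To make the bidirectional idea above a bit more concrete, here is a minimal NumPy sketch (not taken from any of the papers listed; function and parameter names such as rnn_pass, W_x and W_h are purely illustrative). The backward direction simply runs the same recurrence on the reversed sequence, and each timestep's representation is the concatenation of the forward and backward hidden states:

```python
import numpy as np

def rnn_pass(inputs, W_x, W_h, b):
    """Run a simple tanh RNN over a sequence and return all hidden states."""
    h = np.zeros(W_h.shape[0])
    states = []
    for x in inputs:
        h = np.tanh(W_x @ x + W_h @ h + b)
        states.append(h)
    return states

def bidirectional_rnn(inputs, fwd_params, bwd_params):
    """Concatenate forward states with time-reversed backward states, so each
    timestep's representation depends on both past and future inputs."""
    fwd = rnn_pass(inputs, *fwd_params)
    bwd = rnn_pass(inputs[::-1], *bwd_params)[::-1]
    return [np.concatenate([f, b]) for f, b in zip(fwd, bwd)]

# Toy usage: 5 timesteps of 3-dimensional inputs, 4 hidden units per direction.
rng = np.random.default_rng(0)
seq = [rng.standard_normal(3) for _ in range(5)]
make_params = lambda: (rng.standard_normal((4, 3)), rng.standard_normal((4, 4)), np.zeros(4))
outputs = bidirectional_rnn(seq, make_params(), make_params())
print(len(outputs), outputs[0].shape)  # 5 timesteps, each an 8-dimensional (4+4) vector
```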

2000s

This AI Winter continued until roughly 2006, when research in deep learning started to flourish again.

2010s

2010-2011

2012

2013

2014

2014 was really a seminal year for deep learning, with major contributions from a broad variety of groups.

2015

2016

2017