Interesting NLP papers

Language Modeling

Recurrent Neural Network based models

  • An Empirical Exploration of Recurrent Network Architectures (Jozefowicz et al.) [pdf] [github] [summary]
  • Regularizing and Optimizing LSTM Language Models (Merity et al.) [pdf] [github] [summary]
  • Improving Language Modeling using Densely Connected Recurrent Neural Networks (Godin et al.) [pdf] [github] [summary]
  • Grow and Prune Compact, Fast, and Accurate LSTMs (Dai et al.) [pdf] [github] [summary]
  • Improving Neural Language Models with a Continuous Cache (Grave et al., 2016) [pdf] [github] [summary] (see the cache sketch after this list)
  • An Analysis of Neural Language Modeling at Multiple Scales (Merity et al.) [pdf] [github] [summary]
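
The continuous-cache model of Grave et al. above boils down to interpolating the LM's softmax output with a distribution built from recently seen (hidden state, next word) pairs. Below is a minimal PyTorch sketch of that interpolation step, assuming a pretrained LM that exposes per-step hidden states; the function name and the theta/lam values are illustrative, not taken from the linked repository.

```python
import torch

def cache_interpolate(p_vocab, hidden, cache_h, cache_words, vocab_size,
                      theta=0.3, lam=0.1):
    """Blend the base LM distribution with a continuous-cache distribution.

    p_vocab     : (V,) softmax output of the base LM at the current step
    hidden      : (H,) current hidden state h_t
    cache_h     : (N, H) hidden states stored for the previous N steps
    cache_words : (N,) long tensor of token ids that followed each stored state
    """
    if cache_h.numel() == 0:
        return p_vocab
    # Similarity of the current state to every cached state, scaled by theta.
    scores = torch.softmax(theta * (cache_h @ hidden), dim=0)        # (N,)
    # Accumulate the similarity mass onto the tokens that followed each state.
    p_cache = torch.zeros(vocab_size).scatter_add_(0, cache_words, scores)
    # Linear interpolation between the parametric LM and the cache.
    return (1.0 - lam) * p_vocab + lam * p_cache
```

At each step the caller would append the pair (h_t, x_{t+1}) to cache_h / cache_words, optionally keeping only a fixed-size window of recent history.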

Regularization Techniques

  • Recurrent Neural Network Regularization (Zaremba et al.) [pdf] [github] [summary]
  • Regularizing and Optimizing LSTM Language Models (Merity et al.) [pdf] [github] [summary]
  • A Theoretically Grounded Application of Dropout in Recurrent Neural Networks (Gal et al.) [pdf] [github] [summary] (see the locked-dropout sketch after this list)
  • FreezeOut: Accelerate Training by Progressively Freezing Layers (Brock et al.) [pdf]
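
The dropout variant analysed by Gal et al. (and used heavily in the AWD-LSTM work listed above) samples a single mask per sequence and reuses it at every time step, rather than resampling per step. Here is a minimal sketch of that idea, often called locked or variational dropout; the class name and the (seq_len, batch, features) layout are assumptions, not code from the linked repositories.

```python
import torch.nn as nn

class LockedDropout(nn.Module):
    """Dropout that samples one mask per sequence and reuses it at every step."""

    def __init__(self, p=0.5):
        super().__init__()
        self.p = p

    def forward(self, x):
        # x: (seq_len, batch, features)
        if not self.training or self.p == 0.0:
            return x
        # One mask per (batch, feature) position, broadcast over all time steps.
        mask = x.new_empty((1, x.size(1), x.size(2))).bernoulli_(1 - self.p)
        return x * mask / (1 - self.p)   # inverted-dropout scaling
```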

Transformer-based Language Models

  • Generalization Through Memorization: Nearest Neighbor Language Models (Khandelwal et al., 2020) [pdf] [github] [summary] (see the kNN interpolation sketch below)
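
At inference time the kNN-LM of Khandelwal et al. interpolates the base model's distribution with one induced by the nearest neighbours of the current context representation in a datastore of (context, next token) pairs. The sketch below shows that single step, assuming the datastore fits in memory as dense tensors (the paper uses a FAISS index instead); k, lam, and all names here are illustrative.

```python
import torch

def knn_lm_step(p_model, query, keys, values, vocab_size, k=8, lam=0.25):
    """Interpolate the LM distribution with a nearest-neighbour distribution.

    p_model : (V,) base LM probabilities for the next token
    query   : (H,) context representation at the current position
    keys    : (N, H) stored context representations from the datastore
    values  : (N,) long tensor of token ids that followed each stored context
    """
    # Negative squared L2 distance to every datastore entry.
    dists = -((keys - query) ** 2).sum(dim=1)                        # (N,)
    top = torch.topk(dists, k=min(k, keys.size(0)))
    # Softmax over the (negative) distances of the retrieved neighbours.
    weights = torch.softmax(top.values, dim=0)                       # (k,)
    p_knn = torch.zeros(vocab_size).scatter_add_(0, values[top.indices], weights)
    # Fixed-lambda interpolation between the datastore and the parametric model.
    return lam * p_knn + (1.0 - lam) * p_model
```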

Optimization Techniques

  • Efficient Softmax Approximation for GPUs (Grave et al., 2016) [pdf] [github] [summary]
  • Adaptive Input Representations for Neural Language Modeling (Baevski & Auli, 2019) [pdf] [github] [summary]
  • Tying Word Vectors and Word Classifiers: A Loss Framework for Language Modeling (Inan et al., 2016) [pdf] [github] [summary]
  • Using the Output Embedding to Improve Language Models (Press & Wolf, 2017) [pdf] [github] [summary] (see the weight-tying sketch after this list)
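
Both the Inan et al. and Press & Wolf papers tie the input embedding matrix to the output softmax weights. A minimal sketch of how that tying typically looks in a PyTorch LSTM language model follows; the class name and dimensions are illustrative, and tying requires emb_dim == hidden_dim.

```python
import torch.nn as nn

class TiedLSTMLM(nn.Module):
    """LSTM language model whose output projection shares the embedding matrix."""

    def __init__(self, vocab_size, emb_dim=400, hidden_dim=400, num_layers=2):
        super().__init__()
        assert emb_dim == hidden_dim, "tying requires matching dimensions"
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.lstm = nn.LSTM(emb_dim, hidden_dim, num_layers)
        self.decoder = nn.Linear(hidden_dim, vocab_size)
        # Weight tying: the softmax weights and the input embeddings are one matrix.
        self.decoder.weight = self.embed.weight

    def forward(self, tokens, state=None):
        # tokens: (seq_len, batch) of token ids
        out, state = self.lstm(self.embed(tokens), state)
        return self.decoder(out), state
```

For the adaptive softmax entry above, note that PyTorch already ships torch.nn.AdaptiveLogSoftmaxWithLoss, which implements the approximation described by Grave et al.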

Waiting to be titled and categorized