/nlp-journey

Documents, papers, and code related to NLP, covering topic models, word embeddings, named entity recognition, text classification, text generation, text similarity, machine translation, and more. It touches on a range of NLP algorithms and is based on TensorFlow 2.0.

Primary language: Python

nlp journey

Your Journey to NLP Starts Here! Pull requests are sincerely welcome!

Fully embracing TensorFlow 2: all code has been migrated to TensorFlow 2.0.

I. Fundamentals

II. Classic Books (Baidu Cloud, extraction code: b5qq)

III. Must-Read Papers

01) Models and Optimization

  • LSTM (Long Short-Term Memory). Link
  • Sequence to Sequence Learning with Neural Networks. Link
  • Learning Phrase Representations using RNN Encoder-Decoder for Statistical Machine Translation. Link
  • Dropout (Improving neural networks by preventing co-adaptation of feature detectors). Link
  • Residual Network (Deep Residual Learning for Image Recognition). Link
  • Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift. Link
  • How transferable are features in deep neural networks? Link
  • A Critical Review of Recurrent Neural Networks for Sequence Learning. Link
  • Ordered Neurons: Integrating Tree Structures into Recurrent Neural Networks. Link
  • Distilling the Knowledge in a Neural Network. Link

02) Survey Papers

  • An overview of gradient descent optimization algorithms. Link
  • Analysis Methods in Neural Language Processing: A Survey. Link
  • Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer. Link

03) Text Augmentation

  • EDA: Easy Data Augmentation Techniques for Boosting Performance on Text Classification Tasks. Link

04) Text Pre-training

  • A Neural Probabilistic Language Model. Link
  • word2vec Parameter Learning Explained. Link
  • Language Models are Unsupervised Multitask Learners. Link
  • An Empirical Study of Smoothing Techniques for Language Modeling. Link
  • Efficient Estimation of Word Representations in Vector Space. Link
  • Distributed Representations of Sentences and Documents. Link
  • Enriching Word Vectors with Subword Information (FastText). Link. Commentary
  • GloVe: Global Vectors for Word Representation. Official site
  • ELMo (Deep contextualized word representations). Link
  • BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. Link
  • Pre-Training with Whole Word Masking for Chinese BERT. Link
  • XLNet: Generalized Autoregressive Pretraining for Language Understanding. Link

05) Text Classification

  • Bag of Tricks for Efficient Text Classification (FastText). Link
  • A Sensitivity Analysis of (and Practitioners’ Guide to) Convolutional Neural Networks for Sentence Classification. Link
  • Convolutional Neural Networks for Sentence Classification. Link
  • Attention-Based Bidirectional Long Short-Term Memory Networks for Relation Classification. Link

06) Text Generation

  • A Deep Ensemble Model with Slot Alignment for Sequence-to-Sequence Natural Language Generation. Link
  • SeqGAN: Sequence Generative Adversarial Nets with Policy Gradient. Link
  • Generative Adversarial Text to Image Synthesis. Link

07) Text Similarity

  • Learning to Rank Short Text Pairs with Convolutional Deep Neural Networks. Link
  • Learning Text Similarity with Siamese Recurrent Networks. Link
  • A Deep Architecture for Matching Short Texts. Link

08) Question Answering

  • A Question-Focused Multi-Factor Attention Network for Question Answering. Link
  • The Design and Implementation of XiaoIce, an Empathetic Social Chatbot. Link
  • A Knowledge-Grounded Neural Conversation Model. Link
  • Neural Generative Question Answering. Link
  • Sequential Matching Network: A New Architecture for Multi-turn Response Selection in Retrieval-Based Chatbots. Link
  • Modeling Multi-turn Conversation with Deep Utterance Aggregation. Link
  • Multi-Turn Response Selection for Chatbots with Deep Attention Matching Network. Link
  • Deep Reinforcement Learning For Modeling Chit-Chat Dialog With Discrete Attributes. Link

09) Machine Translation

  • Learning Phrase Representations using RNN Encoder–Decoder for Statistical Machine Translation. Link
  • Neural Machine Translation by Jointly Learning to Align and Translate. Link
  • Transformer (Attention Is All You Need). Link
  • Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context. Link

10) Automatic Summarization

  • Get To The Point: Summarization with Pointer-Generator Networks. Link
  • Deep Recurrent Generative Decoder for Abstractive Text Summarization. Link

11) Relation Extraction

  • Distant Supervision for Relation Extraction via Piecewise Convolutional Neural Networks. Link
  • Neural Relation Extraction with Multi-lingual Attention. Link
  • FewRel: A Large-Scale Supervised Few-Shot Relation Classification Dataset with State-of-the-Art Evaluation. Link
  • End-to-End Relation Extraction using LSTMs on Sequences and Tree Structures. Link

12) Recommender Systems

  • Deep Neural Networks for YouTube Recommendations. Link
  • Behavior Sequence Transformer for E-commerce Recommendation in Alibaba. Link
  • MV-DSSM: A Multi-View Deep Learning Approach for Cross Domain User Modeling in Recommendation Systems. Link

IV. Must-Read Blog Posts

  • How to Learn Natural Language Processing (Comprehensive Edition). Link
  • The Illustrated Transformer. Link
  • Attention-based-model. Link
  • Modern Deep Learning Techniques Applied to Natural Language Processing. Link
  • BERT Explained. Link
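  • Unbelievable! LSTM and GRU Have Never Been Explained So Clearly (Animations + Video). Link
  • Optimization Methods in Deep Learning. Link
  • From Language Models to Seq2Seq: Transformer Is All Play, and It All Comes Down to the Mask. Link
  • Applying word2vec to Recommenders and Advertising. Link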

V. Related Outstanding GitHub Projects

VI. Related Outstanding Blogs