nlp journey

Papers and code for NLP, covering topic models, word embeddings, named entity recognition (NER), text classification, text generation, text similarity, and other NLP-related algorithms, implemented with Keras and TensorFlow.

Your journey to NLP starts here! Pull requests are sincerely welcome!

Basics

Classic books (Baidu Cloud; extraction code: b5qq)

Getting started with algorithms

Deep learning

  • Deep Learning. Must-read for deep learning. Book link
  • Neural Networks and Deep Learning. Must-read introduction. Book link
  • Neural Networks and Deep Learning (in Chinese), by Prof. Xipeng Qiu, Fudan University. Book link

Natural language processing

  • Speech and Language Processing (3rd edition), Stanford: must-read for NLP. Book link
  • CS224d: Deep Learning for Natural Language Processing. Slides

Must-read papers

Models and optimization

  • LSTM (Long Short-Term Memory). Link
  • Sequence to Sequence Learning with Neural Networks. Link
  • Learning Phrase Representations using RNN Encoder-Decoder for Statistical Machine Translation. Link
  • Dropout (Improving neural networks by preventing co-adaptation of feature detectors). Link
  • Residual Network (Deep Residual Learning for Image Recognition). Link
  • Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift. Link
  • How transferable are features in deep neural networks? Link
  • A Critical Review of Recurrent Neural Networks for Sequence Learning. Link

Survey papers

  • Analysis Methods in Neural Language Processing: A Survey. Link
  • Neural Text Generation: Past, Present and Beyond. Link

Language models

  • word2vec Parameter Learning Explained. Link
  • A Neural Probabilistic Language Model. Link
  • Language Models are Unsupervised Multitask Learners. Link

Text augmentation

  • EDA: Easy Data Augmentation Techniques for Boosting Performance on Text Classification Tasks. Link (a minimal sketch of two EDA operations follows below)
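As a rough illustration of the EDA paper listed above, here is a minimal sketch of two of its four operations, random swap and random deletion; synonym replacement and random insertion are omitted because they need a synonym resource such as WordNet. The function names and parameters are mine, not taken from the paper's released code.

```python
import random

def random_swap(tokens, n_swaps=1):
    """EDA 'random swap': swap the positions of two randomly chosen tokens, n_swaps times."""
    tokens = tokens[:]
    for _ in range(n_swaps):
        if len(tokens) < 2:
            break
        i, j = random.sample(range(len(tokens)), 2)
        tokens[i], tokens[j] = tokens[j], tokens[i]
    return tokens

def random_deletion(tokens, p=0.1):
    """EDA 'random deletion': drop each token with probability p, keeping at least one token."""
    kept = [t for t in tokens if random.random() > p]
    return kept if kept else [random.choice(tokens)]

if __name__ == "__main__":
    sentence = "easy data augmentation boosts text classification".split()
    print(random_swap(sentence, n_swaps=2))
    print(random_deletion(sentence, p=0.2))
```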

Text pre-training

  • Efficient Estimation of Word Representations in Vector Space. Link (see the gensim sketch after this list)
  • Distributed Representations of Sentences and Documents. Link
  • Enriching Word Vectors with Subword Information (FastText). Link. Explanation
  • GloVe: Global Vectors for Word Representation. Official site
  • ELMo (Deep contextualized word representations). Link
  • BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. Link
  • Pre-Training with Whole Word Masking for Chinese BERT. Link
  • XLNet: Generalized Autoregressive Pretraining for Language Understanding. Link
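As a quick, hedged example of the word-embedding end of this list (the skip-gram model from the first paper), the sketch below trains a toy Word2Vec model with gensim. It assumes gensim 4.x (where the dimension argument is `vector_size`) and a tiny in-memory corpus; it only shows the API shape, not this repository's training scripts.

```python
from gensim.models import Word2Vec

# Tiny toy corpus: a list of tokenized sentences.
sentences = [
    ["natural", "language", "processing", "with", "word", "embeddings"],
    ["word", "embeddings", "map", "words", "to", "dense", "vectors"],
    ["skip", "gram", "predicts", "context", "words", "from", "a", "center", "word"],
]

# sg=1 selects the skip-gram architecture; sg=0 would be CBOW.
model = Word2Vec(sentences, vector_size=50, window=3, min_count=1, sg=1, epochs=50)

# Inspect the learned vector and nearest neighbours for a word.
print(model.wv["word"].shape)                # (50,)
print(model.wv.most_similar("word", topn=3))
```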

Text classification

  • Bag of Tricks for Efficient Text Classification (FastText). Link
  • A Sensitivity Analysis of (and Practitioners' Guide to) Convolutional Neural Networks for Sentence Classification. Link
  • Convolutional Neural Networks for Sentence Classification. Link (a Keras TextCNN sketch follows this list)
  • Attention-Based Bidirectional Long Short-Term Memory Networks for Relation Classification. Link
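Since the repository is built on Keras/TensorFlow, here is a hedged sketch of the Kim-style TextCNN described in the sentence-classification papers above: parallel 1-D convolutions with different kernel sizes over an embedding layer, max-over-time pooling, then a softmax. Vocabulary size, sequence length, and filter settings are placeholder values, not the repository's actual configuration.

```python
from tensorflow.keras import layers, Model

def build_textcnn(vocab_size=20000, max_len=100, embed_dim=128,
                  kernel_sizes=(3, 4, 5), num_filters=100, num_classes=2):
    """Kim (2014)-style TextCNN: parallel Conv1D branches with max-over-time pooling."""
    inputs = layers.Input(shape=(max_len,), dtype="int32")
    x = layers.Embedding(vocab_size, embed_dim)(inputs)
    pooled = []
    for k in kernel_sizes:
        conv = layers.Conv1D(num_filters, k, activation="relu")(x)
        pooled.append(layers.GlobalMaxPooling1D()(conv))  # max-over-time pooling per branch
    h = layers.Concatenate()(pooled)
    h = layers.Dropout(0.5)(h)
    outputs = layers.Dense(num_classes, activation="softmax")(h)
    return Model(inputs, outputs)

model = build_textcnn()
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"])
model.summary()
```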

Text generation

  • A Deep Ensemble Model with Slot Alignment for Sequence-to-Sequence Natural Language Generation. Link
  • SeqGAN: Sequence Generative Adversarial Nets with Policy Gradient. Link
  • Generative Adversarial Text to Image Synthesis. Link

Text similarity

  • Learning to Rank Short Text Pairs with Convolutional Deep Neural Networks. Link
  • Learning Text Similarity with Siamese Recurrent Networks. Link (a Siamese BiLSTM sketch follows this list)
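To illustrate the Siamese recurrent-network idea from the second paper, here is a hedged Keras sketch: one shared BiLSTM encoder applied to both sentences, with cosine similarity as the output score. The hyperparameters are placeholders, and a plain MSE loss stands in for the contrastive loss used in the paper.

```python
from tensorflow.keras import layers, Model

def build_siamese(vocab_size=20000, max_len=50, embed_dim=128, lstm_units=64):
    """Siamese text-similarity model: a shared BiLSTM encoder plus cosine similarity."""
    # Shared encoder: the same weights embed and encode both sentences.
    enc_in = layers.Input(shape=(max_len,), dtype="int32")
    x = layers.Embedding(vocab_size, embed_dim)(enc_in)
    x = layers.Bidirectional(layers.LSTM(lstm_units))(x)
    encoder = Model(enc_in, x, name="shared_encoder")

    left = layers.Input(shape=(max_len,), dtype="int32", name="left_text")
    right = layers.Input(shape=(max_len,), dtype="int32", name="right_text")
    # Dot with normalize=True yields the cosine similarity of the two sentence encodings.
    score = layers.Dot(axes=1, normalize=True)([encoder(left), encoder(right)])
    return Model([left, right], score)

model = build_siamese()
model.compile(optimizer="adam", loss="mse")
model.summary()
```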

Short-text matching

  • A Deep Architecture for Matching Short Texts. Link

Question answering

  • A Question-Focused Multi-Factor Attention Network for Question Answering. Link
  • The Design and Implementation of XiaoIce, an Empathetic Social Chatbot. Link
  • A Knowledge-Grounded Neural Conversation Model. Link
  • Neural Generative Question Answering. Link
  • Sequential Matching Network: A New Architecture for Multi-turn Response Selection in Retrieval-Based Chatbots. Link
  • Modeling Multi-turn Conversation with Deep Utterance Aggregation. Link
  • Multi-Turn Response Selection for Chatbots with Deep Attention Matching Network. Link
  • Deep Reinforcement Learning For Modeling Chit-Chat Dialog With Discrete Attributes. Link

Machine translation

  • Learning Phrase Representations using RNN Encoder-Decoder for Statistical Machine Translation. Link
  • Neural Machine Translation by Jointly Learning to Align and Translate. Link
  • Transformer (Attention Is All You Need). Link (a scaled dot-product attention sketch follows this list)
  • Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context. Link
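As a small aside on the Transformer paper above, the following is a hedged NumPy sketch of its core scaled dot-product attention, softmax(QK^T / sqrt(d_k)) V. It is written directly from the formula in the paper, not taken from this repository's code.

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V, as in 'Attention Is All You Need'."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                    # (len_q, len_k) similarity scores
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)     # row-wise softmax
    return weights @ V                                 # weighted sum of the value vectors

# Toy example: 4 query positions, 6 key/value positions, d_k = d_v = 8.
rng = np.random.default_rng(0)
Q, K, V = rng.normal(size=(4, 8)), rng.normal(size=(6, 8)), rng.normal(size=(6, 8))
print(scaled_dot_product_attention(Q, K, V).shape)     # (4, 8)
```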

Automatic summarization

  • Get To The Point: Summarization with Pointer-Generator Networks. Link
  • Deep Recurrent Generative Decoder for Abstractive Text Summarization. Link

Event extraction

  • Event Extraction via Dynamic Multi-Pooling Convolutional Neural Networks. Link

Relation extraction

  • Distant Supervision for Relation Extraction via Piecewise Convolutional Neural Networks. Link
  • Neural Relation Extraction with Multi-lingual Attention. Link
  • FewRel: A Large-Scale Supervised Few-Shot Relation Classification Dataset with State-of-the-Art Evaluation. Link
  • End-to-End Relation Extraction using LSTMs on Sequences and Tree Structures. Link

Recommender systems

  • Deep Neural Networks for YouTube Recommendations. Link
  • Behavior Sequence Transformer for E-commerce Recommendation in Alibaba. Link
  • MV-DSSM: A Multi-View Deep Learning Approach for Cross Domain User Modeling in Recommendation Systems. Link

Search

  • DSSM: Learning Deep Structured Semantic Models for Web Search using Clickthrough Data. Link
  • CLSM: A Latent Semantic Model with Convolutional-Pooling Structure for Information Retrieval. Link
  • DSSM-LSTM: Semantic Modelling with Long-Short-Term Memory for Information Retrieval. Link

Must-read blog posts

  • How to learn natural language processing (comprehensive guide). Link
  • The Illustrated Transformer. Link
  • Attention-based-model. Link
  • Modern Deep Learning Techniques Applied to Natural Language Processing. Link
  • BERT explained. Link
  • XLNet: how it works and how it differs from BERT. Link
  • Unbelievable! LSTM and GRU explained more clearly than ever (animations + video). Link
  • Applying word2vec to Recommenders and Advertising. Link
  • A brief introduction to computing BLEU scores for text in Python. Link (an NLTK sketch follows this list)
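Related to the last post on computing BLEU in Python, here is a hedged sketch using NLTK's sentence-level BLEU. The sentences are made-up toy data, and smoothing is enabled because short sentences otherwise yield zero counts for higher-order n-grams.

```python
from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction

reference = [["the", "cat", "sat", "on", "the", "mat"]]   # list of reference token lists
candidate = ["the", "cat", "is", "on", "the", "mat"]      # system output, tokenized

# Smoothing avoids a zero score when some higher-order n-grams have no matches.
score = sentence_bleu(reference, candidate,
                      smoothing_function=SmoothingFunction().method1)
print(f"BLEU: {score:.4f}")
```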

Implementation code (a fastText usage sketch follows the list below)

  • fasttext(skipgram+cbow)
  • gensim(word2vec)
  • eda
  • svm
  • fasttext
  • textcnn
  • bilstm+attention
  • rcnn
  • han
  • bert
  • bilstm+crf
  • siamese
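As a usage sketch for the fasttext entries in the list above: this uses the official fasttext Python package rather than this repository's code, and the file paths and hyperparameters are placeholders.

```python
import fasttext

# Unsupervised word vectors: corpus.txt has one whitespace-tokenized sentence per line.
skipgram = fasttext.train_unsupervised("corpus.txt", model="skipgram", dim=100)
cbow = fasttext.train_unsupervised("corpus.txt", model="cbow", dim=100)
print(skipgram.get_word_vector("nlp").shape)              # (100,)

# Supervised classification: each line of train.txt is "__label__<class> <text>".
classifier = fasttext.train_supervised("train.txt", lr=0.5, epoch=25, wordNgrams=2)
print(classifier.predict("this movie was surprisingly good"))
```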

Related GitHub projects

Related blogs

Related conferences

  • Association for Computational Linguistics. ACL
  • Empirical Methods in Natural Language Processing. EMNLP
  • International Conference on Computational Linguistics. COLING
  • Neural Information Processing Systems. NIPS
  • AAAI Conference on Artificial Intelligence. AAAI
  • International Joint Conference on Artificial Intelligence. IJCAI
  • International Conference on Machine Learning. ICML