/roadmap_nlp

Roadmap for NLP: covers NLP theory, application scenarios, and engineering practice.


OVERVIEW


1. NLP Overview

Paper, Course, Code, Book, Website, Article, Case, etc.

2. Corpus, Data and Tool

Data, Embedding, Tool, Practice.

3. Traditional Method

TF-IDF, TextRank, AP Clustering, etc.

4. Basic Problem

Vocabulary, OOV, Segmentation, Dependency Parsing, Small Data, New Word, Disambiguation, etc.

THEORY


5. Language Model

N-gram, BoW, NNLM, Char-level Model, etc.

6. Embedding

Models: Word2Vec, GloVe, Char-level Embedding, Ngram2Vec, Sentence2Vec, Doc2Vec, Paragraph2Vec, StarSpace, Item2Vec, Node2Vec, Wiki2Vec, Tweet2Vec, etc.

Embeddings Dimensionality Reduction.

7. Pretrained Model

Models: ELMo, BERT, GPT1, GPT2, ULMFiT, Flair, ERNIE, CoVe, XLM, XLNet, etc.

8. Seq2Seq, Attention and Transformer

Models: Seq2Seq, Encoder-Decoder, Attention, HAN, LuongAttention, Transformer, Transformer-XL, etc.

9. Transfer Learning and Multi-task Learning

Transfer Learning, Multi-task Learning.

APPLICATION


10. Text Classification

Overview: Paper, Code, Practice, Competition, Article, Library, Multi-label Classification

Models: FastText, TextCNN, TextRNN, TextRCNN, VDCNN, DRNN, DPCNN, multiChannelCNN, DeepCNN, LSTM-CNN, Tree-LSTM, etc.

Traditional Method:

11. Text Clustering

Overview: Paper, Code, Practice, Article, etc.

12. Text Similarity

Overview: Paper, Tool, Practice, Competition

13. Text Matching and Entailment

Overview: Paper, Code, Article, Book, etc.

Models: MatchZoo, ESIM, ABCNN, etc.

14. Text Summary

Overview: Article, Competition, Practice

Models: Deep Learning Papers

15. Pairwise Input

Overview: Paper, Practice, etc.

Models: BiLSTMTextRelation, twoCNNTextRelation, BiLSTMTextRelationTwoRNN, etc.

16. Knowledge Graph

Overview: Paper, Code, Competition, etc.

Models: Representation Learning, Entity Extraction, Relation Extraction, End-to-end, etc.

17. Sentiment Analysis

Overview: Paper, Practice, Competition, Article, etc.

Models: Deep Learning models, Topic Model, Rule

18. Named Entity Recognition (NER)

Overview: Article

Models: HMM&CRF, RNN, RNN+CRF, CNN, etc.

19. Part-of-Speech Tagging (POS)

Models: MEM, RNN, RNN+CRF, etc.

20. Machine Translation

Overview: Paper

21. Machine Comprehension

Overview: Paper, Data, Competition

Models: Memory Network, R-NET, Recurrent Entity Network, etc.

22. Dialogue Systems

Overview: Paper, Practice, Competition

23. NLP with Image

Overview: Competition, Image Captioning, etc.

24. Other Application

Law: Paper, Practice, etc.

25. Engineering and Tricks

etc.