NLP

Distributed Representation

Neural Language Model

A Neural Probabilistic Language Model

Pretrained Word Embedding

Word2Vec

  1. from gensim.models import Word2Vec
  2. Distributed Representations of Words and Phrases and Their Compositionality
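
Expanding item 1, a minimal gensim usage sketch (gensim >= 4, where the parameter is `vector_size`); the toy corpus and hyperparameters are illustrative, not recommendations:

```python
from gensim.models import Word2Vec

# Toy corpus: a list of pre-tokenized sentences (illustrative only).
sentences = [
    ["the", "cat", "sat", "on", "the", "mat"],
    ["the", "dog", "sat", "on", "the", "rug"],
]

# Skip-gram (sg=1) with negative sampling, as in Mikolov et al.
model = Word2Vec(sentences, vector_size=50, window=2, min_count=1,
                 sg=1, negative=5)

vec = model.wv["cat"]                # 50-d embedding for "cat"
print(model.wv.most_similar("cat"))  # nearest neighbors by cosine similarity
```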

GloVe

  1. Source Code
  2. GloVe: Global Vectors for Word Representation
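
The core of GloVe is a weighted least-squares objective over the co-occurrence matrix: J = sum_ij f(X_ij) (w_i . w~_j + b_i + b~_j - log X_ij)^2. A NumPy sketch of that loss (function and variable names are mine, not from the released source):

```python
import numpy as np

def glove_loss(W, W_tilde, b, b_tilde, X, x_max=100.0, alpha=0.75):
    """GloVe objective: sum_ij f(X_ij) * (w_i . w~_j + b_i + b~_j - log X_ij)^2.

    W, W_tilde: (V, d) word and context embedding matrices.
    b, b_tilde: (V,) bias vectors.  X: (V, V) co-occurrence counts.
    """
    # f(x) down-weights rare pairs and caps the influence of frequent ones.
    f = np.where(X < x_max, (X / x_max) ** alpha, 1.0)
    mask = X > 0                                   # log(0) is undefined
    log_X = np.log(np.where(mask, X, 1.0))
    err = W @ W_tilde.T + b[:, None] + b_tilde[None, :] - log_X
    return np.sum(f * mask * err ** 2)
```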

Pretrained Language Model

ELMo

Paper: Deep contextualized word representations

  1. Bidirectional language model (biLSTM)
  2. Residual connections
  3. Character-level token embeddings (via a character CNN)
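
The task-specific representation in the paper is a learned softmax-weighted sum of the biLM layers, scaled by a learned gamma. A PyTorch sketch of just that combination step (shapes and names are illustrative):

```python
import torch

def elmo_combine(layer_states, s_logits, gamma):
    """ELMo_k = gamma * sum_j softmax(s)_j * h_{k,j}.

    layer_states: (L, seq_len, dim) hidden states of the L biLM layers
                  (layer 0 being the character-CNN token embedding).
    s_logits:     (L,) learned scalars, softmax-normalized into weights.
    gamma:        learned task-specific scale.
    """
    s = torch.softmax(s_logits, dim=0)
    return gamma * (s[:, None, None] * layer_states).sum(dim=0)

# Toy usage: 3 layers, 5 tokens, 1024-dim states.
h = torch.randn(3, 5, 1024)
out = elmo_combine(h, torch.zeros(3), torch.tensor(1.0))  # (5, 1024)
```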

GPT

Improving Language Understanding by Generative Pre-Training [GitHub]

  1. Left-to-right (unidirectional) LM
  2. Multi-layer Transformer decoder as the language model; the same pretrained network is fine-tuned with task-specific input transformations
  3. BPE (byte pair encoding) tokens (sketched below)
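
For the BPE point, a compact sketch of the merge-learning loop from Sennrich et al.; GPT builds its subword vocabulary with the same idea. The toy vocabulary and function name are illustrative:

```python
import re
from collections import Counter

def learn_bpe(vocab, num_merges):
    """Learn BPE merge rules from {word as space-joined symbols: count}."""
    merges = []
    for _ in range(num_merges):
        # Count adjacent symbol pairs, weighted by word frequency.
        pairs = Counter()
        for word, freq in vocab.items():
            symbols = word.split()
            for pair in zip(symbols, symbols[1:]):
                pairs[pair] += freq
        if not pairs:
            break
        best = max(pairs, key=pairs.get)
        merges.append(best)
        # Merge the most frequent pair wherever it occurs as whole symbols.
        pattern = re.compile(r"(?<!\S)" + re.escape(" ".join(best)) + r"(?!\S)")
        vocab = {pattern.sub("".join(best), w): c for w, c in vocab.items()}
    return merges

print(learn_bpe({"l o w </w>": 5, "l o w e r </w>": 2}, 3))
```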

BERT

BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding [GitHub]

  1. Bidirectional LM via masked tokens (80/10/10 corruption; sketched below)
  2. WordPiece (subword-level) tokens
  3. Masked LM trained jointly (multi-task) with Next Sentence Prediction (NSP)
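
A sketch of BERT's masked-LM corruption rule: about 15% of positions are selected; of those, 80% become [MASK], 10% a random token, 10% stay unchanged. Simplified to plain string tokens (real BERT works on WordPiece ids and skips special tokens):

```python
import random

def mask_tokens(tokens, vocab, mask_prob=0.15):
    """Return (corrupted inputs, labels); labels are None where no prediction."""
    inputs, labels = list(tokens), [None] * len(tokens)
    for i, tok in enumerate(tokens):
        if random.random() < mask_prob:
            labels[i] = tok                       # predict the original token
            r = random.random()
            if r < 0.8:
                inputs[i] = "[MASK]"              # 80%: replace with [MASK]
            elif r < 0.9:
                inputs[i] = random.choice(vocab)  # 10%: random token
            # else: 10% keep the original token unchanged
    return inputs, labels

print(mask_tokens(["the", "cat", "sat"], vocab=["dog", "mat", "ran"]))
```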

GPT-2

Language Models are Unsupervised Multitask Learners [GitHub]

  1. Left-to-right LM
  2. Larger and deeper than GPT
  3. BPE tokens
  4. A few modifications to the Transformer block (e.g., Layer Normalization moved to the input of each sub-block; sketched below)
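
On point 4: GPT-2 applies Layer Normalization at the input of each sub-block (pre-LN) rather than after the residual add. A minimal PyTorch block showing that placement; a sketch, not the released code, with dimensions and the causal mask left to the caller:

```python
import torch.nn as nn

class PreLNBlock(nn.Module):
    """Transformer block with GPT-2-style (pre-LN) layer-norm placement."""
    def __init__(self, d_model, n_heads):
        super().__init__()
        self.ln1 = nn.LayerNorm(d_model)
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.ln2 = nn.LayerNorm(d_model)
        self.mlp = nn.Sequential(
            nn.Linear(d_model, 4 * d_model), nn.GELU(),
            nn.Linear(4 * d_model, d_model),
        )

    def forward(self, x, attn_mask=None):
        h = self.ln1(x)                                     # LN before the sub-block
        x = x + self.attn(h, h, h, attn_mask=attn_mask)[0]  # residual around attention
        x = x + self.mlp(self.ln2(x))                       # residual around MLP
        return x
```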

XLNet

XLNet: Generalized Autoregressive Pretraining for Language Understanding

  1. Bidirectional context via random permutations of the factorization order instead of [MASK] corruption (no pretrain/finetune mismatch, and better suited to text generation; see the sketch below)
  2. Two-stream self-attention (a content stream and a query stream)
  3. Drops the Next Sentence Prediction (NSP) objective
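
A sketch of the permutation idea from point 1: sample one factorization order and let position i attend only to positions that precede it in that order (ignoring XLNet's two-stream and memory details):

```python
import torch

def permutation_attention_mask(seq_len):
    """mask[i, j] = True iff i may attend to j under a random factorization order."""
    perm = torch.randperm(seq_len)        # random order over positions
    rank = torch.empty(seq_len, dtype=torch.long)
    rank[perm] = torch.arange(seq_len)    # rank[i] = i's place in the order
    return rank.unsqueeze(1) > rank.unsqueeze(0)

print(permutation_attention_mask(5))
```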

RoBERTa

RoBERTa: A Robustly Optimized BERT Pretraining Approach
Toolkit: FAIRSEQ: A Fast, Extensible Toolkit for Sequence Modeling [GitHub]

  1. Bidirectional LM with dynamic masking: a fresh mask pattern is sampled on every pass over the data (more robust than BERT's static, preprocess-once masking; see the sketch below)
  2. Experiments on NSP with various input formats and objectives show that removing the NSP loss slightly improves downstream task performance
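
The static-vs-dynamic contrast in point 1, sketched with plain position sampling (names and the toy sentence are illustrative):

```python
import random

def sample_mask_positions(tokens, mask_prob=0.15):
    """Pick a fresh random set of positions to mask."""
    return [i for i in range(len(tokens)) if random.random() < mask_prob]

sentence = ["the", "cat", "sat", "on", "the", "mat"]
static = sample_mask_positions(sentence)       # BERT: fixed once at preprocessing
for epoch in range(3):
    dynamic = sample_mask_positions(sentence)  # RoBERTa: re-sampled every pass
    print(epoch, "static:", static, "dynamic:", dynamic)
```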

KnowBert

Knowledge Enhanced Contextual Word Representations TODO

Classification

Tasks

  1. Sentiment Classification
  2. Text Classification
  3. Textual entailment
  4. Paraphrase Identification (detection)

Convolutional Neural Networks in NLP

N-gram features of tokens (word/character level)
Paper: Convolutional Neural Networks for Sentence Classification
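
A PyTorch sketch in the spirit of Kim (2014): parallel convolutions over several kernel sizes extract n-gram features, and max-over-time pooling keeps the strongest activation per filter (hyperparameters are illustrative):

```python
import torch
import torch.nn as nn

class TextCNN(nn.Module):
    def __init__(self, vocab_size, emb_dim=128, n_filters=100,
                 kernel_sizes=(3, 4, 5), n_classes=2):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb_dim)
        self.convs = nn.ModuleList(
            nn.Conv1d(emb_dim, n_filters, k) for k in kernel_sizes)
        self.fc = nn.Linear(n_filters * len(kernel_sizes), n_classes)

    def forward(self, x):                    # x: (batch, seq_len) token ids
        e = self.emb(x).transpose(1, 2)      # (batch, emb_dim, seq_len)
        # Each conv extracts k-gram features; max-over-time pools per filter.
        pooled = [c(e).relu().max(dim=2).values for c in self.convs]
        return self.fc(torch.cat(pooled, dim=1))
```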

Recurrent Neural Networks in NLP

Long-range dependency features in text
Paper: Recurrent Convolutional Neural Networks for Text Classification
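
To illustrate the long-range-dependency point, a minimal BiLSTM classifier; this is not the exact RCNN of the cited paper, which additionally concatenates each word's embedding with its recurrent left/right context:

```python
import torch.nn as nn

class BiLSTMClassifier(nn.Module):
    def __init__(self, vocab_size, emb_dim=128, hidden=128, n_classes=2):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb_dim)
        self.rnn = nn.LSTM(emb_dim, hidden, batch_first=True, bidirectional=True)
        self.fc = nn.Linear(2 * hidden, n_classes)

    def forward(self, x):                  # x: (batch, seq_len) token ids
        out, _ = self.rnn(self.emb(x))     # states can carry distant context
        return self.fc(out.mean(dim=1))    # mean-pool over time, then classify
```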

CNN + RNN

N-gram features of tokens + contextual features
Paper: A C-LSTM Neural Network for Text Classification
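
A sketch of the C-LSTM combination: a convolution first maps token windows to n-gram feature vectors, then an LSTM reads that feature sequence for long-range context (hyperparameters are illustrative):

```python
import torch.nn as nn

class CLSTM(nn.Module):
    def __init__(self, vocab_size, emb_dim=128, n_filters=100,
                 kernel_size=3, hidden=128, n_classes=2):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb_dim)
        self.conv = nn.Conv1d(emb_dim, n_filters, kernel_size)
        self.rnn = nn.LSTM(n_filters, hidden, batch_first=True)
        self.fc = nn.Linear(hidden, n_classes)

    def forward(self, x):                        # x: (batch, seq_len)
        e = self.emb(x).transpose(1, 2)          # (batch, emb_dim, seq_len)
        f = self.conv(e).relu().transpose(1, 2)  # sequence of n-gram features
        _, (h, _) = self.rnn(f)                  # final hidden state
        return self.fc(h[-1])
```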

ELMo (biLSTM) => pretrained features

Deep contextualized word representations

Transformer

Sequence2Sequence (Encoder-Decoder)

Tasks

  1. Machine Translation [Statistical MT] [Neural MT]
  2. Text Summarization
  3. Grammar Error Correction
  4. Question Answering (QA)

Models

Seq2Seq Based on RNN

Attention
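
The attention step, in its scaled dot-product form from the Transformer paper; RNN seq2seq attention uses the same weighted-sum idea with different score functions (e.g. Bahdanau/Luong):

```python
import math
import torch

def scaled_dot_product_attention(q, k, v, mask=None):
    """softmax(Q K^T / sqrt(d_k)) V, with an optional boolean mask."""
    scores = q @ k.transpose(-2, -1) / math.sqrt(q.size(-1))
    if mask is not None:
        scores = scores.masked_fill(~mask, float("-inf"))  # block positions
    return torch.softmax(scores, dim=-1) @ v

# Toy shapes: batch 1, 4 positions, dim 8.
q = k = v = torch.randn(1, 4, 8)
out = scaled_dot_product_attention(q, k, v)  # (1, 4, 8)
```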

Tensor2Tensor Based on Transformer

GPT (pretrained + finetune)

BERT (pretrained + finetune)

NLP Fundamental Tasks

Chinese Word Segmentation

Part-of-Speech Tagging

Parsing

  1. Dependency Parsing
  2. Constituency Parsing

Semantic Role Labeling

Named Entity Recognition

Paraphrase Identification

etc.