A Neural Probabilistic Language Model csdn
Distributed Representations of Words and Phrases and Their Compositionality (word2vec)
- gensim implementation: `from gensim.models import Word2Vec` (usage sketch below)
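A minimal usage sketch, assuming gensim >= 4.0 (where the embedding size parameter is `vector_size`; older 3.x releases call it `size`) and a toy corpus for illustration:

```python
from gensim.models import Word2Vec

# tiny illustrative corpus: a list of tokenized sentences
sentences = [
    ["the", "cat", "sat", "on", "the", "mat"],
    ["the", "dog", "sat", "on", "the", "rug"],
]

model = Word2Vec(sentences, vector_size=50, window=2, min_count=1, sg=1)  # sg=1 -> skip-gram
vector = model.wv["cat"]                       # 50-dim embedding for "cat"
print(model.wv.most_similar("cat", topn=3))    # nearest neighbours in embedding space
```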
Paper: Deep contextualized word representations
- Bidirectional language model (biLM); downstream tasks use a learned weighted sum of the biLM layer representations (sketch below)
- Residual connections between the LSTM layers
- Character-level (character-CNN) token embeddings
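A minimal sketch (not the authors' implementation) of the task-specific layer mixing, assuming PyTorch; `layer_states` stands in for the biLM's per-layer activations:

```python
import torch
import torch.nn as nn

class ScalarMix(nn.Module):
    """Weighted sum over biLM layers: ELMo_k = gamma * sum_j softmax(s)_j * h_{k,j}."""
    def __init__(self, num_layers: int):
        super().__init__()
        self.scalars = nn.Parameter(torch.zeros(num_layers))  # layer weights s_j
        self.gamma = nn.Parameter(torch.ones(1))               # overall task-specific scale

    def forward(self, layer_states):
        # layer_states: list of num_layers tensors, each (batch, seq_len, dim)
        weights = torch.softmax(self.scalars, dim=0)
        mixed = sum(w * h for w, h in zip(weights, layer_states))
        return self.gamma * mixed

# usage with random stand-in activations for a 3-layer biLM
states = [torch.randn(2, 7, 512) for _ in range(3)]
elmo_like = ScalarMix(num_layers=3)(states)   # (2, 7, 512)
```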
Improving Language Understanding by Generative Pre-Training github
- Left-to-right (unidirectional) language model (causal-mask sketch below)
- Multi-layer Transformer decoder as the language model; the same pre-trained network is fine-tuned for downstream tasks with task-specific input transformations
- BPE (byte-pair encoding) tokenization
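A minimal sketch, assuming PyTorch, of the causal mask that enforces left-to-right attention in a decoder-style language model (names are illustrative, not from the paper's code):

```python
import torch

def causal_mask(seq_len: int) -> torch.Tensor:
    # True where attention is allowed: position i may attend only to positions j <= i
    return torch.tril(torch.ones(seq_len, seq_len, dtype=torch.bool))

seq_len = 5
scores = torch.randn(seq_len, seq_len)                             # toy attention logits
scores = scores.masked_fill(~causal_mask(seq_len), float("-inf"))
attn = torch.softmax(scores, dim=-1)                               # each row only looks leftward
```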
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding github
- Bidirectional LM via masked language modeling (MLM; masking sketch below)
- WordPiece (subword) tokenization
- MLM trained jointly (multi-task) with Next Sentence Prediction (NSP)
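A minimal sketch (plain Python, illustrative helper) of BERT's MLM corruption: roughly 15% of tokens are selected for prediction; of those, 80% become [MASK], 10% a random token, and 10% stay unchanged:

```python
import random

def mask_tokens(tokens, vocab, select_prob=0.15):
    """Return (corrupted tokens, labels); the label is None where no prediction is made."""
    corrupted, labels = [], []
    for tok in tokens:
        if random.random() < select_prob:
            labels.append(tok)                           # model must recover the original token
            r = random.random()
            if r < 0.8:
                corrupted.append("[MASK]")               # 80%: replace with [MASK]
            elif r < 0.9:
                corrupted.append(random.choice(vocab))   # 10%: replace with a random token
            else:
                corrupted.append(tok)                    # 10%: keep unchanged
        else:
            corrupted.append(tok)
            labels.append(None)
    return corrupted, labels

vocab = ["the", "cat", "sat", "on", "mat", "dog"]
print(mask_tokens(["the", "cat", "sat", "on", "the", "mat"], vocab))
```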
Language Models are Unsupervised Multitask Learners github
- Left-to-right language model
- Larger and deeper than GPT
- BPE tokenization
- A few modifications to the Transformer (e.g., Layer Normalization moved to the input of each sub-block; pre-LN sketch below)
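A minimal sketch, assuming PyTorch, of a pre-LN Transformer block in the GPT-2 style, where LayerNorm precedes each sub-block instead of following it (module names and sizes are illustrative):

```python
import torch
import torch.nn as nn

class PreLNBlock(nn.Module):
    """Transformer block with LayerNorm applied before attention and MLP (pre-LN)."""
    def __init__(self, dim: int, heads: int):
        super().__init__()
        self.ln1 = nn.LayerNorm(dim)
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.ln2 = nn.LayerNorm(dim)
        self.mlp = nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim))

    def forward(self, x, attn_mask=None):
        h = self.ln1(x)                                   # normalize the sub-block input first
        x = x + self.attn(h, h, h, attn_mask=attn_mask, need_weights=False)[0]
        x = x + self.mlp(self.ln2(x))
        return x

out = PreLNBlock(dim=64, heads=4)(torch.randn(2, 10, 64))  # (2, 10, 64)
```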
XLNet: Generalized Autoregressive Pretraining for Language Understanding
- Bidirectional context via random permutation of the factorization order instead of masking: no [MASK] tokens, so training and inference are consistent, which also suits text generation better (permutation-mask sketch below)
- Two-stream attention (a content stream and a query stream)
- Removes Next Sentence Prediction (NSP)
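A minimal sketch (illustrative, not the paper's implementation) of how a sampled factorization order becomes an attention mask: a position may attend only to positions that come earlier in the permutation:

```python
import torch

def permutation_mask(seq_len: int) -> torch.Tensor:
    """Boolean mask where entry (i, j) is True if position i may attend to position j."""
    order = torch.randperm(seq_len)                 # random factorization order
    rank = torch.empty(seq_len, dtype=torch.long)
    rank[order] = torch.arange(seq_len)             # rank[i] = where position i falls in the order
    return rank.unsqueeze(1) > rank.unsqueeze(0)    # attend only to strictly earlier ranks

print(permutation_mask(5))
```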
RoBERTa: A Robustly Optimized BERT Pretraining Approach
toolkit: FAIRSEQ: A Fast, Extensible Toolkit for Sequence Modeling github
- Bidirectional LM with dynamic masking: a different mask pattern is sampled in every epoch (more robust than BERT's original static masking; sketch below)
- Experiments on NSP with various input formats and objectives show that removing the NSP loss slightly improves downstream performance
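A minimal sketch of dynamic masking (simplified: it omits BERT's 80/10/10 replacement rule): the mask is re-sampled each time a sequence is served, rather than fixed once at preprocessing time:

```python
import random

def dynamic_epochs(tokens, num_epochs, mask_prob=0.15):
    """Yield a freshly masked copy of the sequence for each epoch."""
    for _ in range(num_epochs):
        yield [("[MASK]" if random.random() < mask_prob else tok) for tok in tokens]

for masked in dynamic_epochs(["the", "cat", "sat", "on", "the", "mat"], num_epochs=3):
    print(masked)   # a different mask pattern every epoch
```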
Knowledge Enhanced Contextual Word Representations TODO
- Sentiment Classification
- Text Classification
- Textual Entailment
- Paraphrase Identification (detection)
N-gram features of tokens (word/character); TextCNN sketch below
Paper: Convolutional Neural Networks for Sentence Classification
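A minimal sketch, assuming PyTorch and illustrative hyperparameters, in the spirit of this paper: parallel convolutions over token embeddings capture n-gram features, followed by max-over-time pooling:

```python
import torch
import torch.nn as nn

class TextCNN(nn.Module):
    def __init__(self, vocab_size, emb_dim=128, num_classes=2, kernel_sizes=(3, 4, 5), channels=100):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb_dim)
        self.convs = nn.ModuleList(nn.Conv1d(emb_dim, channels, k) for k in kernel_sizes)
        self.fc = nn.Linear(channels * len(kernel_sizes), num_classes)

    def forward(self, token_ids):                         # token_ids: (batch, seq_len)
        x = self.emb(token_ids).transpose(1, 2)           # (batch, emb_dim, seq_len)
        feats = [conv(x).relu().max(dim=2).values for conv in self.convs]  # max-over-time pooling
        return self.fc(torch.cat(feats, dim=1))           # class logits

logits = TextCNN(vocab_size=1000)(torch.randint(0, 1000, (4, 20)))  # (4, 2)
```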
Long-term dependency features in text; RCNN sketch below
Paper: Recurrent Convolutional Neural Networks for Text Classification
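A minimal sketch (illustrative, not the authors' code) of the recurrent-convolutional idea: a bidirectional RNN supplies left/right context for each word, the context is concatenated with the word embedding, and max-pooling over time produces the text representation:

```python
import torch
import torch.nn as nn

class RCNN(nn.Module):
    def __init__(self, vocab_size, emb_dim=128, hidden=100, num_classes=2):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb_dim)
        self.rnn = nn.LSTM(emb_dim, hidden, bidirectional=True, batch_first=True)
        self.proj = nn.Linear(emb_dim + 2 * hidden, hidden)
        self.fc = nn.Linear(hidden, num_classes)

    def forward(self, token_ids):                          # (batch, seq_len)
        e = self.emb(token_ids)
        ctx, _ = self.rnn(e)                               # left/right contexts from the biLSTM
        h = torch.tanh(self.proj(torch.cat([e, ctx], dim=2)))
        return self.fc(h.max(dim=1).values)                # max-pooling over time -> class logits

logits = RCNN(vocab_size=1000)(torch.randint(0, 1000, (4, 20)))  # (4, 2)
```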
N-gram features of tokens + contextual features; C-LSTM sketch below
Paper: A C-LSTM Neural Network for Text Classification
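A minimal sketch of the C-LSTM idea under illustrative hyperparameters: a convolution extracts n-gram features and the resulting feature sequence is fed to an LSTM to model longer-range dependencies:

```python
import torch
import torch.nn as nn

class CLSTM(nn.Module):
    def __init__(self, vocab_size, emb_dim=128, channels=100, hidden=100, num_classes=2, kernel=3):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb_dim)
        self.conv = nn.Conv1d(emb_dim, channels, kernel)    # n-gram feature extractor
        self.lstm = nn.LSTM(channels, hidden, batch_first=True)
        self.fc = nn.Linear(hidden, num_classes)

    def forward(self, token_ids):                           # (batch, seq_len)
        x = self.emb(token_ids).transpose(1, 2)             # (batch, emb_dim, seq_len)
        feats = self.conv(x).relu().transpose(1, 2)         # (batch, seq_len - kernel + 1, channels)
        _, (h_n, _) = self.lstm(feats)                      # last hidden state summarizes the sequence
        return self.fc(h_n[-1])                             # class logits

logits = CLSTM(vocab_size=1000)(torch.randint(0, 1000, (4, 20)))  # (4, 2)
```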
- Machine Translation [Statistical MT] [Neural MT]
- Text Summarization
- Grammar Error Correction
- Question Answering (Q&A)
- Dependency Parsing
- Constituency Parsing