- visualization of word vectors
- naive co-occurrence matrix
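A minimal sketch of the naive approach: count how often each pair of words appears within a fixed window and store the counts in a symmetric matrix. The helper name `cooccurrence_matrix` and the toy corpus are mine, not from the assignment.

```python
import numpy as np

def cooccurrence_matrix(corpus, window=2):
    """Build a symmetric word-word co-occurrence matrix from a list of
    tokenized sentences, counting neighbors within a fixed window."""
    vocab = sorted({w for sent in corpus for w in sent})
    idx = {w: i for i, w in enumerate(vocab)}
    M = np.zeros((len(vocab), len(vocab)), dtype=np.int64)
    for sent in corpus:
        for i, w in enumerate(sent):
            lo, hi = max(0, i - window), min(len(sent), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    M[idx[w], idx[sent[j]]] += 1
    return vocab, M

corpus = [["i", "like", "nlp"], ["i", "like", "deep", "learning"]]
vocab, M = cooccurrence_matrix(corpus, window=1)
```

In practice the rows of `M` (optionally after SVD dimensionality reduction) serve as the naive word vectors that the visualization step plots.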
- GloVe
- analogy exercise
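The analogy exercise ("a is to b as c is to ?") can be sketched as a nearest-neighbor search on `b - a + c` in embedding space. The 2-D toy vectors below are hand-picked so the classic example works; real runs use pretrained GloVe vectors.

```python
import numpy as np

def analogy(a, b, c, vectors):
    """Return the word d maximizing cosine similarity to b - a + c,
    excluding the three query words themselves."""
    target = vectors[b] - vectors[a] + vectors[c]
    target = target / np.linalg.norm(target)
    best, best_sim = None, -np.inf
    for w, v in vectors.items():
        if w in (a, b, c):
            continue
        sim = (v @ target) / np.linalg.norm(v)
        if sim > best_sim:
            best, best_sim = w, sim
    return best

# toy vectors chosen by hand so man:king :: woman:queen holds
vecs = {
    "man":   np.array([1.0, 0.0]),
    "woman": np.array([0.0, 1.0]),
    "king":  np.array([1.0, 0.5]),
    "queen": np.array([0.1, 1.5]),
    "apple": np.array([0.9, -0.2]),
}
```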
- Word2vec (Skip Gram)
- naive softmax
- negative sampling
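The two skip-gram objectives above can be sketched in numpy as follows (function names are mine): the naive softmax normalizes over the whole vocabulary, while negative sampling replaces that sum with K sampled outside vectors.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def naive_softmax_loss(v_c, o, U):
    """-log P(o | c) with a full softmax over all outside vectors U."""
    scores = U @ v_c
    scores = scores - scores.max()  # numerical stability
    return -(scores[o] - np.log(np.exp(scores).sum()))

def neg_sampling_loss(v_c, u_o, U_neg):
    """-log sigma(u_o . v_c) - sum_k log sigma(-u_k . v_c):
    push the true outside word up, the K sampled words down."""
    pos = -np.log(sigmoid(u_o @ v_c))
    neg = -np.log(sigmoid(-U_neg @ v_c)).sum()
    return pos + neg
```

Negative sampling makes each update O(K) instead of O(|V|), which is why it is the practical choice for training.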
- Dataset: Stanford Sentiment Treebank
- Dependency parsing using a fully connected network
- Data preprocessing: build single-step transition decisions from the ground-truth parse tree
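One way to sketch that preprocessing step, assuming an arc-standard transition system with a static oracle (the function name and the 0-as-ROOT convention are mine): given the gold head of every token, each parser configuration determines one correct SHIFT / LEFT-ARC / RIGHT-ARC decision, and unrolling the gold tree yields the training examples for the network.

```python
def oracle_step(stack, buffer, heads):
    """One static-oracle decision for arc-standard parsing.
    heads maps token index -> gold head index (0 = ROOT)."""
    if len(stack) >= 2:
        s1, s2 = stack[-1], stack[-2]  # top two stack items
        if s2 != 0 and heads[s2] == s1:
            return "LEFT-ARC"
        # RIGHT-ARC only when s1 has collected all its dependents
        if heads[s1] == s2 and not any(heads[b] == s1 for b in buffer):
            return "RIGHT-ARC"
    return "SHIFT"
```

For the two-word sentence "He ate" (1=He, 2=ate, with "ate" headed by ROOT), the oracle produces SHIFT, SHIFT, LEFT-ARC, RIGHT-ARC.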
- NMT with attention: pad all positions after the <END> token
- beam search
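Beam search can be sketched generically: keep the `beam_size` highest-scoring partial hypotheses, expand each with the model's next-token log-probabilities, and retire a hypothesis when it emits the end token. The `toy_step` scoring function below is a stand-in for the real decoder.

```python
import math

def beam_search(start, step_fn, beam_size, max_len, end_token):
    """Keep the beam_size best partial sequences by summed log-prob;
    step_fn(seq) -> {token: log_prob} plays the role of the decoder."""
    beams = [([start], 0.0)]
    finished = []
    for _ in range(max_len):
        candidates = []
        for seq, score in beams:
            for tok, lp in step_fn(seq).items():
                if tok == end_token:
                    finished.append((seq + [tok], score + lp))
                else:
                    candidates.append((seq + [tok], score + lp))
        if not candidates:
            break
        beams = sorted(candidates, key=lambda x: x[1], reverse=True)[:beam_size]
    return max(finished + beams, key=lambda x: x[1])

def toy_step(seq):
    """Hand-made next-token distribution standing in for a real model."""
    last = seq[-1]
    if last == "<s>":
        return {"a": math.log(0.4), "b": math.log(0.6)}
    if last == "b":
        return {"<END>": math.log(0.3), "a": math.log(0.7)}
    return {"<END>": math.log(0.9), "b": math.log(0.1)}
```

With this toy model, beam search finds the sequence b, a, <END> with probability 0.6 * 0.7 * 0.9, which greedy decoding happens to find too; larger beams matter when the locally best token leads to a worse full sequence.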
- BLEU calculation
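A sketch of sentence-level BLEU against a single reference: the geometric mean of modified n-gram precisions (clipped by reference counts) times the brevity penalty. Real evaluations usually use corpus-level BLEU with smoothing; this minimal version returns 0 whenever any n-gram order has no overlap.

```python
import math
from collections import Counter

def ngrams(tokens, n):
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def bleu(candidate, reference, max_n=4):
    """Geometric mean of modified 1..max_n-gram precisions,
    scaled by the brevity penalty for short candidates."""
    if not candidate:
        return 0.0
    log_prec = 0.0
    for n in range(1, max_n + 1):
        cand, ref = ngrams(candidate, n), ngrams(reference, n)
        overlap = sum(min(c, ref[g]) for g, c in cand.items())
        if overlap == 0:
            return 0.0
        log_prec += math.log(overlap / sum(cand.values())) / max_n
    bp = min(1.0, math.exp(1 - len(reference) / len(candidate)))
    return bp * math.exp(log_prec)
```

The count clipping (`min(c, ref[g])`) is what stops a candidate from scoring well by repeating a common reference word.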
- Mostly based on Assignment 4
- Attention NMT with both word-level and character-level CNN and LSTM components
- character-level CNN word-embedding module for all input words
- character-level RNN decoder for all output words, using greedy decoding
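The character-level CNN embedding module can be sketched in numpy (a toy version: random parameters, no character padding, and the dimensions e_char=4, k=3, f=8 are my choices): look up a vector per character, slide a width-k convolution over the sequence, apply ReLU, then max-pool over time to produce one fixed-size vector per word.

```python
import numpy as np

def char_cnn_embed(word, char_emb, W, b, k=3):
    """Char-CNN word embedding: char lookup -> width-k conv ->
    ReLU -> max-over-time pooling. Assumes len(word) >= k (no padding)."""
    X = np.stack([char_emb[c] for c in word])           # (len, e_char)
    windows = [X[i:i + k].reshape(-1) for i in range(len(word) - k + 1)]
    conv = np.stack(windows) @ W + b                    # (len-k+1, f)
    return np.maximum(conv, 0.0).max(axis=0)            # (f,)

rng = np.random.default_rng(0)
char_emb = {c: rng.normal(size=4) for c in "abcdefghijklmnopqrstuvwxyz"}
W = rng.normal(size=(3 * 4, 8))  # k * e_char inputs -> f = 8 filters
b = rng.normal(size=8)
vec = char_cnn_embed("hello", char_emb, W, b)
```

Because every word, even an out-of-vocabulary one, is built from its characters, this module replaces the fixed word-embedding lookup on the encoder side.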