PyTorch implementations for Natural Language Processing. This repository contains paper implementations for the following tasks.
- Classification
- Language Model
- Named Entity Recognition
- Machine Translation
- Question Answering
- charcnn: Character-level Convolutional Networks for Text Classification (blog)
- deepcnn: Very Deep Convolutional Networks for Text Classification (blog)
- lstmcnn: Efficient Character-level Document Classification by Combining Convolution and Recurrent Layers (blog)
Running the commands below downloads the Amazon Review dataset and trains the model you choose. Training each of these models took about one day.
cd classification
python train.py --name 'name of logs' --model 'which model to run' --gpu 'which gpu to use'
- charcnn
- number of parameters: 11,339,013
- batch time: 0.251s (batch size 512)
- accuracy: 60.30%
- deepcnn
- number of parameters: 16,444,005
- batch time: 0.138s (batch size 128)
- accuracy: 62.85%
- lstmcnn
- number of parameters: 501,381
- batch time: 0.353s (batch size 512)
- accuracy: 59.61%
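For reference, character-level models like charcnn encode each document as a sequence of one-hot character vectors over a fixed alphabet, truncated or padded to a fixed length, before feeding it to the convolutions. A minimal sketch of that quantization step (the alphabet and length 1014 follow the Character-level Convolutional Networks paper; the function and variable names here are illustrative, not the identifiers used in this repository):

```python
# Sketch of the character quantization used by charcnn-style models.
# Alphabet and sequence length follow the original paper; out-of-alphabet
# characters and padding positions become all-zero rows.
ALPHABET = "abcdefghijklmnopqrstuvwxyz0123456789-,;.!?:'\"/\\|_@#$%^&*~`+-=<>()[]{}"
CHAR_TO_IDX = {c: i for i, c in enumerate(ALPHABET)}
MAX_LEN = 1014  # fixed document length from the paper

def quantize(text, max_len=MAX_LEN):
    """Encode text as a max_len x len(ALPHABET) one-hot matrix."""
    matrix = [[0.0] * len(ALPHABET) for _ in range(max_len)]
    for pos, ch in enumerate(text.lower()[:max_len]):
        idx = CHAR_TO_IDX.get(ch)
        if idx is not None:  # skip characters outside the alphabet
            matrix[pos][idx] = 1.0
    return matrix

encoded = quantize("Great product!")
```

Longer documents are simply cut off at `MAX_LEN`, so the convolutional input always has the same fixed shape.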
- Character-Aware Neural Language Models: paper
- reference
Running the commands below downloads the Penn Treebank dataset and trains the model. Training took about 30 minutes.
cd language_model
python train.py --name 'name of logs' --gpu 'which gpu to use'
cd language_model
python test.py --model_path 'path to trained model' --gpu 'which gpu to use'
- character-aware neural language model
- number of parameters: 5,312,485
- batch time: 0.031s (batch size 20)
- perplexity on test dataset: 89.850 (paper: 92.3)
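Perplexity here is the exponential of the average per-token cross-entropy (negative log-likelihood) on the test set, so a test perplexity of 89.850 corresponds to an average loss of roughly 4.498 nats per token. A minimal sketch of that relationship (the function name is illustrative):

```python
import math

def perplexity(nll_per_token):
    """Perplexity is exp of the mean negative log-likelihood per token."""
    return math.exp(sum(nll_per_token) / len(nll_per_token))

# An average loss of ~4.498 nats per token gives a perplexity of ~89.85;
# a model that assigns probability 1 to every token (zero loss) gives 1.0.
```

Lower perplexity means the model spreads less probability mass over wrong next tokens, which is why it is the standard metric for language models.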