- A sentiment analysis classifier implemented in PyTorch.
- The data directory contains the datasets: the *.txt files come from sent-conv-torch, and the *.conll.out files were generated by our lab's parser.
- CR
- MPQA
- TREC
- MR
- SUBJ
- Data processed by preprocess.sh is saved in processed_data.
- The models directory contains the following models.
- Pooling
- CNN
- Multi_Channel_CNN
- Multi_Layer_CNN
- CharCNN
- GRU
- LSTM
- LSTM_CNN
- TreeLSTM
- CNN_TreeLSTM
- LSTM_TreeLSTM
- Transformer(TODO)
- Logs are saved in log.
- The utils directory contains the following scripts.
- Alphabet.py builds the dictionary.
- Common.py stores the unk-key and pad-key.
- Embedding.py loads pre-trained embeddings (by Yang Song).
- Evaluate.py calculates the F1.
- Feature.py defines a sentence's features, including word, word_id, label, root, and so on.
- build-batch.py builds mini-batches from the data.
- log.py saves the log.
- opts.py defines the command-line arguments with argparse.
- trainer.py trains a model on the data.
- tree.py implements the tree methods.
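
The dictionary, unk/pad keys, and mini-batch steps listed above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code in utils: the names `PAD_KEY`, `UNK_KEY`, `build_alphabet`, and `build_batches`, the key strings, and the id ordering are all hypothetical.

```python
from collections import Counter

# Hypothetical special keys; Common.py is described as storing the real ones.
PAD_KEY = "<pad>"
UNK_KEY = "<unk>"


def build_alphabet(sentences, min_freq=1):
    """Map each word seen at least min_freq times to an integer id."""
    counts = Counter(word for sent in sentences for word in sent)
    word2id = {PAD_KEY: 0, UNK_KEY: 1}
    # Sort by descending frequency, then alphabetically, so ids are stable.
    for word, freq in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0])):
        if freq >= min_freq:
            word2id[word] = len(word2id)
    return word2id


def build_batches(sentences, labels, word2id, batch_size):
    """Yield (padded_ids, lengths, labels) for consecutive mini-batches."""
    unk_id, pad_id = word2id[UNK_KEY], word2id[PAD_KEY]
    for start in range(0, len(sentences), batch_size):
        chunk = sentences[start:start + batch_size]
        # Unknown words fall back to the unk id.
        ids = [[word2id.get(w, unk_id) for w in sent] for sent in chunk]
        lengths = [len(seq) for seq in ids]
        max_len = max(lengths)
        # Pad every sequence in the batch to the longest one.
        padded = [seq + [pad_id] * (max_len - len(seq)) for seq in ids]
        yield padded, lengths, labels[start:start + batch_size]


# Toy usage: the last sentence contains a word ("great") not in the dictionary.
train = [["a", "good", "movie"], ["bad"], ["a", "bad", "movie"]]
vocab = build_alphabet(train)
batches = list(build_batches(train + [["great", "movie"]], [1, 0, 0, 1], vocab, 2))
```

In a PyTorch pipeline the padded id lists would typically be wrapped in a `torch.LongTensor` before being passed to a model; that step is left out here so the sketch runs without any dependencies.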
python : 3.5+
pytorch : 0.4.0
cuda : 8.0 (optional, for GPU support)
- First step: sh preprocess.sh
- Second step: sh run.sh
Model | TREC | SUBJ | MR | CR | MPQA |
---|---|---|---|---|---|
Pooling | 76.12 | 89.58 | 74.51 | 80 | 86.43 |
CNN | 90.4 | 91.98 | 77.73 | 84.38 | 88.96 |
Char_CNN | 91.52 | 93.33 | 78.91 | 84.38 | 86.33 |
Multi_Channel_CNN | 89.06 | 94.06 | 79.59 | 83.12 | 88.48 |
Multi_Layer_CNN | 91.74 | 93.65 | 79 | 84.06 | 89.36 |
LSTM | 89.73 | 92.5 | 80.57 | 83.75 | 89.65 |
LSTM_CNN | 92.63 | 92.19 | 81.05 | 83.12 | 89.16 |
GRU | 89.06 | 92.6 | 79.1 | 83.44 | 89.75 |
TreeLSTM | 89.18 | 91.09 | 78.54 | 81.4 | 89.28 |
CNN_TreeLSTM | - | - | 79.11 | - | - |
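
Evaluate.py is described as calculating the F1. As a rough illustration of what such a score involves, the sketch below computes per-class and macro-averaged F1 from gold and predicted labels. The function name `f1_scores` and the choice of macro averaging are assumptions, not taken from the repository, and this README does not state which metric the table reports.

```python
def f1_scores(gold, pred):
    """Return ({label: f1}, macro_f1) for two equal-length label lists."""
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same length")
    per_class = {}
    for label in sorted(set(gold) | set(pred)):
        # Count true positives, false positives, and false negatives.
        tp = sum(1 for g, p in zip(gold, pred) if g == label and p == label)
        fp = sum(1 for g, p in zip(gold, pred) if g != label and p == label)
        fn = sum(1 for g, p in zip(gold, pred) if g == label and p != label)
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the denominator is 0.
        denom = 2 * tp + fp + fn
        per_class[label] = 2 * tp / denom if denom else 0.0
    macro = sum(per_class.values()) / len(per_class) if per_class else 0.0
    return per_class, macro


# Toy usage with two classes.
per_class, macro = f1_scores([1, 1, 0, 0], [1, 0, 0, 0])
```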
If you have any questions about this code, please contact me at zenrran@qq.com.