TensorFlow implementation of a Recurrent Neural Network (RNN) for sentiment analysis, one of the text classification problems. Three types of RNN cells are supported: 1) vanilla RNN, 2) Long Short-Term Memory (LSTM), and 3) Gated Recurrent Unit (GRU).
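To make the difference between the three cell types concrete, here is a minimal NumPy sketch of each cell's single-step update. This is illustrative only; the repository uses TensorFlow's built-in cells, and all weight names below are hypothetical.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def vanilla_step(x, h, Wx, Wh, b):
    # h_t = tanh(W_x x_t + W_h h_{t-1} + b)
    return np.tanh(x @ Wx + h @ Wh + b)

def gru_step(x, h, Wz, Wr, Wc):
    # z: update gate, r: reset gate, c: candidate state
    xh = np.concatenate([x, h])
    z = sigmoid(xh @ Wz)
    r = sigmoid(xh @ Wr)
    c = np.tanh(np.concatenate([x, r * h]) @ Wc)
    return (1 - z) * h + z * c

def lstm_step(x, h, c, Wf, Wi, Wo, Wc):
    xh = np.concatenate([x, h])
    f = sigmoid(xh @ Wf)               # forget gate
    i = sigmoid(xh @ Wi)               # input gate
    o = sigmoid(xh @ Wo)               # output gate
    c_new = f * c + i * np.tanh(xh @ Wc)  # updated cell state
    return o * np.tanh(c_new), c_new   # (new hidden state, new cell state)
```

The LSTM carries a separate cell state `c` alongside the hidden state, while the GRU merges the two; the vanilla cell has neither gating mechanism, which is why it struggles with long-range dependencies.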
- Movie reviews with one sentence per review. Classification involves detecting positive/negative reviews (Pang and Lee, 2005)
- Download "sentence polarity dataset v1.0" at the Official Download Page
- Located in "data/rt-polaritydata/" in my repository
- rt-polarity.pos contains 5331 positive snippets
- rt-polarity.neg contains 5331 negative snippets
- Positive data is located in "data/rt-polaritydata/rt-polarity.pos"
- Negative data is located in "data/rt-polaritydata/rt-polarity.neg"
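Loading the two files reduces to reading one snippet per line and attaching a label per file. A minimal sketch (the function name is illustrative, not the repository's actual data-helper API):

```python
def load_polarity_data(pos_path, neg_path, encoding="latin-1"):
    # Read one snippet per line from each polarity file.
    with open(pos_path, encoding=encoding) as f:
        pos = [line.strip() for line in f if line.strip()]
    with open(neg_path, encoding=encoding) as f:
        neg = [line.strip() for line in f if line.strip()]
    texts = pos + neg
    labels = [1] * len(pos) + [0] * len(neg)  # 1 = positive, 0 = negative
    return texts, labels
```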
- "GoogleNews-vectors-negative300" is used as the pre-trained word2vec model
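A common way to use such pre-trained vectors is to build an embedding matrix where vocabulary words found in the word2vec model keep their 300-d vector and unseen words get a small random vector. A sketch under that assumption (the actual loading of the binary file, e.g. via gensim's `KeyedVectors.load_word2vec_format`, is represented here by a plain dict):

```python
import numpy as np

def build_embedding_matrix(vocab, pretrained, dim=300, seed=0):
    # pretrained: dict mapping word -> 300-d vector (stand-in for a
    # loaded word2vec model); out-of-vocabulary words are initialized
    # uniformly in [-0.25, 0.25], a common convention.
    rng = np.random.default_rng(seed)
    matrix = np.zeros((len(vocab), dim), dtype=np.float32)
    for i, word in enumerate(vocab):
        vec = pretrained.get(word)
        matrix[i] = vec if vec is not None else rng.uniform(-0.25, 0.25, dim)
    return matrix
```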
- Display help message:
  $ python train.py --help
- Train Example:
  $ python train.py --cell_type "vanilla" \
      --pos_dir "data/rt-polaritydata/rt-polarity.pos" \
      --neg_dir "data/rt-polaritydata/rt-polarity.neg" \
      --word2vec "GoogleNews-vectors-negative300.bin"
  $ python train.py --cell_type "lstm" \
      --pos_dir "data/rt-polaritydata/rt-polarity.pos" \
      --neg_dir "data/rt-polaritydata/rt-polarity.neg" \
      --word2vec "GoogleNews-vectors-negative300.bin"
  $ python train.py --cell_type "gru" \
      --pos_dir "data/rt-polaritydata/rt-polarity.pos" \
      --neg_dir "data/rt-polaritydata/rt-polarity.neg" \
      --word2vec "GoogleNews-vectors-negative300.bin"
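For reference, the flags in the commands above could be parsed as follows. This is a sketch with `argparse` and assumed defaults; the repository may use TensorFlow's `tf.flags` instead.

```python
import argparse

def build_parser():
    # Flag names mirror the train commands shown above; defaults are assumptions.
    p = argparse.ArgumentParser(description="Train an RNN sentiment classifier")
    p.add_argument("--cell_type", choices=["vanilla", "lstm", "gru"],
                   default="vanilla", help="which RNN cell to use")
    p.add_argument("--pos_dir", default="data/rt-polaritydata/rt-polarity.pos")
    p.add_argument("--neg_dir", default="data/rt-polaritydata/rt-polarity.neg")
    p.add_argument("--word2vec", default=None,
                   help="path to the pre-trained word2vec binary (optional)")
    return p
```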
- The Movie Review dataset has no test data.
- If you want to evaluate, you should build a test dataset from the training data or use cross-validation; however, cross-validation is not implemented in this project.
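One simple way to hold out a test set from the training data, as suggested above, is a seeded shuffled split (a plain sketch, not part of the repository):

```python
import random

def split_train_test(texts, labels, test_ratio=0.1, seed=42):
    # Shuffle indices deterministically, then carve off the first
    # test_ratio fraction as the held-out test set.
    indices = list(range(len(texts)))
    random.Random(seed).shuffle(indices)
    n_test = int(len(indices) * test_ratio)
    test_idx, train_idx = indices[:n_test], indices[n_test:]
    take = lambda idx: ([texts[i] for i in idx], [labels[i] for i in idx])
    return take(train_idx), take(test_idx)
```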
- The example below simply uses the full rt-polarity dataset, the same data used for training.
- Evaluation Example:
  $ python eval.py \
      --pos_dir "data/rt-polaritydata/rt-polarity.pos" \
      --neg_dir "data/rt-polaritydata/rt-polarity.neg" \
      --checkpoint_dir "runs/1523902663/checkpoints"
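After restoring a checkpoint (the TensorFlow-specific part, omitted here), evaluation boils down to comparing predicted labels against true labels. A minimal sketch of that final step:

```python
def accuracy(predictions, labels):
    # Fraction of examples where the predicted label matches the true label.
    assert len(predictions) == len(labels), "length mismatch"
    correct = sum(int(p == y) for p, y in zip(predictions, labels))
    return correct / len(labels)
```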
- Seeing stars: Exploiting class relationships for sentiment categorization with respect to rating scales (ACL 2005), B. Pang and L. Lee [paper]
- Long short-term memory (Neural Computation 1997), S. Hochreiter and J. Schmidhuber [paper]
- Learning Phrase Representations using RNN Encoder–Decoder for Statistical Machine Translation (EMNLP 2014), K Cho et al. [paper]
- Understanding LSTM Networks [blog]
- Recurrent Neural Networks (RNN) – Part 2: Text Classification [blog]