- This is a model for multi-class short text classification.
- The model is built in PyTorch with a word embedding layer, an LSTM, and a fully connected layer.
- Mini-batches are created by zero-padding and processed with torch.nn.utils.rnn.PackedSequence.
- Cross-entropy loss + Adam optimizer.
- Embedding --> Dropout --> LSTM --> Dropout --> FC (a minimal sketch is shown below).
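A minimal sketch of this architecture, assuming single-layer unidirectional LSTM and illustrative hyperparameter names (`vocab_size`, `embed_dim`, `hidden_dim`, `num_classes` are placeholders, not the defaults used by `main.py`):

```python
import torch
import torch.nn as nn
from torch.nn.utils.rnn import pack_padded_sequence


class LSTMClassifier(nn.Module):
    """Embedding --> Dropout --> LSTM --> Dropout --> FC."""

    def __init__(self, vocab_size, embed_dim=128, hidden_dim=128,
                 num_classes=10, dropout=0.5, pad_idx=0):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_dim, padding_idx=pad_idx)
        self.dropout = nn.Dropout(dropout)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.fc = nn.Linear(hidden_dim, num_classes)

    def forward(self, padded_seqs, lengths):
        # padded_seqs: (batch, max_len) zero-padded token ids
        embedded = self.dropout(self.embedding(padded_seqs))
        # Pack the padded batch so the LSTM skips padding positions.
        packed = pack_padded_sequence(embedded, lengths.cpu(),
                                      batch_first=True, enforce_sorted=False)
        _, (hidden, _) = self.lstm(packed)
        # hidden[-1]: last hidden state for each sequence in the batch.
        return self.fc(self.dropout(hidden[-1]))
```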
- The following command downloads the dataset used in *Learning to Classify Short and Sparse Text & Web with Hidden Topics from Large-scale Data Collections* from here and preprocesses it for training.

```
python preprocessing.py
```
- The following command starts training. Run it with `-h` for optional arguments.

```
python main.py
```
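For reference, the zero-padded mini-batches and the cross-entropy + Adam setup described above roughly correspond to the sketch below; the actual batching, field names, and hyperparameters in `main.py` may differ.

```python
import torch
import torch.nn as nn
from torch.nn.utils.rnn import pad_sequence
from torch.utils.data import DataLoader


def collate(batch):
    # batch: list of (token_id_tensor, label) pairs of varying length.
    seqs, labels = zip(*batch)
    lengths = torch.tensor([len(s) for s in seqs])
    padded = pad_sequence(seqs, batch_first=True, padding_value=0)  # zero padding
    return padded, lengths, torch.tensor(labels)


def train(model, dataset, epochs=10, lr=1e-3, batch_size=64):
    loader = DataLoader(dataset, batch_size=batch_size, shuffle=True,
                        collate_fn=collate)
    criterion = nn.CrossEntropyLoss()
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    for _ in range(epochs):
        for padded, lengths, labels in loader:
            optimizer.zero_grad()
            loss = criterion(model(padded, lengths), labels)
            loss.backward()
            optimizer.step()
```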