This repository implements the technique from the paper *W-RNN: News Text Classification based on a Weighted RNN* and compares it against standard LSTM-based and Bi-LSTM-based baselines. An additional model is also proposed that improves on the WRNN model's performance.
Dataset: 20 News Groups
Team Members: Aviral Sethi, Manav Kaushik, Neelabh Sinha
- Final_CODE.ipynb : Contains our source code for all models
- Generator.ipynb : Contains the code for word2vec training and dataset generation
- train_data.p : Training dataset after cleaning
- desired_train_data.p : Training labels (desired values)
- test_data.p : Testing dataset after cleaning
- desired_test_data.p : Testing labels (desired values)
- Predictions : The prediction list for 100 documents
- wordvectors1.kv : word2vec-generated keyed vectors
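The `.p` files are assumed to be standard Python pickles, as the extension suggests. A minimal round-trip sketch of how such a file would be written and read back (the sample data here is a placeholder, not the actual cleaned dataset):

```python
import os
import pickle
import tempfile

# Placeholder standing in for the cleaned training documents
sample = ["cleaned document one", "cleaned document two"]

# Write the list out the same way train_data.p would have been produced
path = os.path.join(tempfile.mkdtemp(), "train_data.p")
with open(path, "wb") as f:
    pickle.dump(sample, f)

# Read it back, as the notebooks would do with the repository's .p files
with open(path, "rb") as f:
    train_data = pickle.load(f)
```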
- Install the Anaconda Distribution and all relevant libraries
- Open up the terminal and type:
$ git clone https://github.com/neelabhsinha/News-text-Classification-Based-on-Weighted-LSTMs.git
$ cd News-text-Classification-Based-on-Weighted-LSTMs
$ jupyter notebook
- Additional Model Proposed: Bidirectional WRNN (Replaces the LSTM unit with Bi-LSTM unit)
Results of training all models can be found inside the Graphs and Results folder (plots shown for the WRNN and Bi-WRNN models only).
- Introduced regularization and recurrent dropout in the LSTM unit
- Used TimeDistributed layers to weigh intermediate outputs instead of a Conv1D layer
- Used Dropout and L1/L2 regularization in the Dense layers
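The TimeDistributed weighting step above can be sketched in NumPy: one dense layer's weights score every timestep's LSTM output, and a softmax over the time axis turns those scores into per-timestep weights used to pool the sequence. The shapes and the softmax pooling are illustrative assumptions, not the exact notebook code:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical shapes: batch of 2 documents, 5 timesteps, 8 LSTM units
B, T, H = 2, 5, 8
outputs = rng.random((B, T, H))  # per-timestep LSTM outputs
W = rng.random((H, 1))           # dense weights shared across timesteps

# TimeDistributed(Dense(1)) applies the same weights at every timestep,
# yielding one scalar score per timestep
scores = outputs @ W                                                  # (B, T, 1)
weights = np.exp(scores) / np.exp(scores).sum(axis=1, keepdims=True)  # softmax over time
weighted = (outputs * weights).sum(axis=1)                            # pooled output, (B, H)
```

Unlike a Conv1D over the sequence, this applies an identical learned transform at each timestep, so the number of parameters is independent of sequence length.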
S. No. | Model Name | Accuracy | F1-score (Obtained) | F1-score (Quoted in the Paper) |
---|---|---|---|---|
1 | RNN (Baseline 1) | 67.8% | 64.5% | 78% |
2 | BiLSTM (Baseline 2) | 87.9% | 87.5% | 75% |
3 | WRNN (Paper Model) | 89.5% | 88.8% | 84% |
4 | Bi-WRNN (Additional Model) | 89.2% | 88.7% | - |
- Significant improvement over the results quoted in the paper
- The additional model performs on par with the WRNN model
- Results improved due to better generalization from proper use of L1/L2 regularization and Dropout