Feedback_Prize_Effectiveness
A model that classifies argumentative elements (Lead, Position, Claim, Counterclaim, Rebuttal, Evidence, and Concluding Statement) as "effective," "adequate," or "ineffective" in essays written by U.S. students in grades 6-12. The Kaggle competition can be found here.
Installation and Dependencies:
We recommend installing the dependencies in a conda environment. Install (or load) conda, then run the following commands:
conda create -n dl_project python=3.9
conda activate dl_project
conda install pytorch torchvision torchaudio -c pytorch
conda install -c pytorch torchtext
conda install matplotlib
pip install pyyaml
pip install datasets
pip install transformers
torchtext may occasionally raise an error such as:
ModuleNotFoundError: No module named 'torchtext.legacy'
If this happens, try downgrading torchtext:
pip install torchtext==0.10.0
Next, install spaCy, which is our default tokenizer:
pip install -U pip setuptools wheel
pip install -U spacy
python -m spacy download en_core_web_sm
Or, if you are on Apple silicon (ARM/M1), run:
pip install -U pip setuptools wheel
pip install -U 'spacy[apple]'
python -m spacy download en_core_web_sm
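After installation, a quick check that every dependency is visible to the active environment can save debugging time later. The following is a minimal sketch, not part of the repository: the helper `missing_modules` and the list `REQUIRED_MODULES` are hypothetical names, and the list assumes the packages installed by the commands above.

```python
import importlib.util

# Import names for the packages installed above. The import name can differ
# from the install name (pyyaml -> yaml, pytorch -> torch).
REQUIRED_MODULES = [
    "torch", "torchtext", "matplotlib", "yaml",
    "datasets", "transformers", "spacy", "en_core_web_sm",
]


def missing_modules(names):
    """Return the module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_modules(REQUIRED_MODULES)
if missing:
    print("Missing modules:", ", ".join(missing))
else:
    print("All required modules are importable.")
```

This only checks that each module can be located; it does not import them, so it will not detect version conflicts such as the `torchtext.legacy` error described above.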
Code Hierarchy Table
- Inside the code folder
File | Description |
---|---|
review_data.ipynb | Plots several bar charts to better understand the training data |
preprocessing.py | Prepares train, validation, and test data for the classic recurrent networks: RNN, LSTM, and GRU |
models.py | Defines the RNN, LSTM, and GRU models |
bert_preprocess.py | Prepares train, validation, and test data for the BERT and DeBERTa models, including tokenizing the data and concatenating discourse_type with discourse_text |
config.yml | Sets the hyperparameters for loading data, initializing models, and training |
train.py | Defines functions that train the model, plot loss/accuracy for the train/validation datasets, and run inference |
run.py | Trains the model and plots loss and accuracy |
text_processing.py | Removes stop words and stems words |
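The concatenation step described for bert_preprocess.py can be illustrated with a short sketch. This is an assumption about the general approach, not the repository's actual code: the function name `build_model_input`, the `[SEP]` separator, and the whitespace cleanup are all hypothetical choices made for illustration.

```python
def build_model_input(discourse_type, discourse_text, sep_token="[SEP]"):
    """Join the discourse type and its text into one input string."""
    # Collapse runs of whitespace so stray newlines do not reach the tokenizer.
    cleaned_text = " ".join(discourse_text.split())
    return f"{discourse_type} {sep_token} {cleaned_text}"


example = build_model_input("Claim", "Students should  have\nlonger recess.")
print(example)  # Claim [SEP] Students should have longer recess.
```

Prepending the discourse type lets the model condition on what kind of argumentative element it is rating. The resulting string would then be passed to the model's tokenizer.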
Model Performance
Model | Bidirectional | Last Hidden | Loss on train | Loss on validation |
---|---|---|---|---|
RNN | True | True | 0.825 | 0.883 |
RNN | True | False | 0.860 | 0.888 |
RNN | False | True | 0.936 | 0.929 |
RNN | False | False | 0.873 | 0.892 |
LSTM | True | True | 0.849 | 0.876 |
LSTM | True | False | 0.842 | 0.880 |
LSTM | False | True | 0.843 | 0.885 |
LSTM | False | False | 0.836 | 0.888 |
GRU | True | True | 0.830 | 0.879 |
GRU | True | False | 0.833 | 0.878 |
GRU | False | True | 0.848 | 0.883 |
GRU | False | False | 0.854 | 0.884 |
BERT | N/A | N/A | N/A | 0.733 |
DeBERTa | N/A | N/A | N/A | 0.730 |
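To make the comparison easier to read, the validation losses can be ranked programmatically. The sketch below copies the validation column from the table; the names `RESULTS` and `best_by_validation` are hypothetical and are not part of the repository.

```python
# Validation losses copied from the table above:
# (model, bidirectional, last_hidden, validation_loss)
RESULTS = [
    ("RNN", True, True, 0.883),
    ("RNN", True, False, 0.888),
    ("RNN", False, True, 0.929),
    ("RNN", False, False, 0.892),
    ("LSTM", True, True, 0.876),
    ("LSTM", True, False, 0.880),
    ("LSTM", False, True, 0.885),
    ("LSTM", False, False, 0.888),
    ("GRU", True, True, 0.879),
    ("GRU", True, False, 0.878),
    ("GRU", False, True, 0.883),
    ("GRU", False, False, 0.884),
    ("BERT", None, None, 0.733),     # flags do not apply (N/A)
    ("DeBERTa", None, None, 0.730),  # flags do not apply (N/A)
]


def best_by_validation(rows):
    """Return the row with the lowest validation loss."""
    return min(rows, key=lambda row: row[3])


# Recurrent models are the rows where the bidirectional flag applies.
recurrent = [row for row in RESULTS if row[1] is not None]
best_recurrent = best_by_validation(recurrent)
best_overall = best_by_validation(RESULTS)
relative_gain = (best_recurrent[3] - best_overall[3]) / best_recurrent[3]

print("Best recurrent model:", best_recurrent)
print("Best overall model:", best_overall)
print(f"Relative reduction in validation loss: {relative_gain:.1%}")
```

On these numbers, the best recurrent configuration is the bidirectional LSTM using the last hidden state (0.876), and the transformer models reduce validation loss by roughly 17% relative to it.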
Report
The project report is available here.