Reproduce

We provide the reproduce code for proposed SBM model Before reproducing, make sure the below python packages installed

torch 
    Version: 1.5.0
json
tqdm
pickle
argparse
transformers
    Version: 2.5.1

How to train

To trained our proposed model, follow the step below

Download the software.zip and data.zip we upload
unzip the tow zipfile above, and put data/ under software/ directory
cd in software/
Download train-v2.0.json and dev-v2.0.json from SQuAD2.0 webpage and put them into data/
preprocessing all data, run ./make_data.sh
To train the model, run ./_run_train.sh You can chage the using GPU and other hyperparameters in this shell script
The best model(with lowest valid loss) will happen in the second or third epochs, check the output when training to select the best model
The model will be store in software/model_merge3/ and in that directory, model_0.pickle is the first epoch model, model_1.pickle is the second, and so on.

How to extract rationales

After training is dene, the model will be saved in software/model_merge3/ , pick the best model from output valid loss and change the --predict_model in _run_beer_reason.sh and _run_movie_reason.sh

Beer rationalizing

To extract the rationales using our model, follow the steps below

run appearance(0), aroma(1), palate(2) ./_run_beer_reason.sh 0 The first input argument is the choosing aspects, 0 is appearance, 1 is aroma, 2 is palate. The output will be store in software/predict_beer_[0/1/2].json
The output from stdout will contain AVG Len, you can scale the --want_len in ./_run_beer_reason.sh script to acquire rationales with different highlighted ratio. For example, since the avg context length is 127, if you want a 10% highlighted rationales, the output AVG Len should be 12. If 20% highlighted rationales is expected, the output AVG Len should be 25.

Movie rationalizing

To extract the rationales using our model, simply execute the command below:

./_run_movie_reason.sh

The output will be store in software/predict_movie.json

Data description

Before running code, please download the data/ directory and move it under software/ directory, both are provided as a zip file when submit the paper. And make sure the data list below was download and placed in the data/ directory

SQuAD dataset

train-v2.0.json (Not provided)
- Download it from https://rajpurkar.github.io/SQuAD-explorer/
dev-v2.0.json (Not provided)
- Download it from https://rajpurkar.github.io/SQuAD-explorer/

Beer Advocate dataset

We use the data provided by the paper ("Rationalizing Neural Predictions". Tao Lei, Regina Barzilay and Tommi Jaakkola. EMNLP 2016), the link of data can be find in http://people.csail.mit.edu/taolei/beer/

annotations.json (Provided)
- Directly download from the link above, is already in the data directory
reviews.mixed3.txt (Provided)
- randomly picked 20000 data from reviews.aspect0.train.txt, reviews.aspect1.train.txt, reviews.aspect2.train.txt and merge together
reviews.mixed_heldout.txt (Provided)
- randomly picked 20000 data from reviews.aspect0.heldout.txt, reviews.aspect1.heldout.txt, reviews.aspect2.heldout.txt and merge together

Movie reviews dataset

We use the data provide from https://www.kaggle.com/lakshmi25npathi/imdb-dataset-of-50k-movie-reviews as training and valid data, and use the data provided in (ERASER benchmark)http://www.eraserbenchmark.com as test data

IMDB_dataset.csv (Provided)
- Downloaded directly from the above link, contains 50000 data for training and validation
movie_sentence_layer_test.json (Provided)
- We split each sentence in the test data provided by ERASER to form a dataset in SQuAD 2.0 format. This is used to feed into our model for rationalizing
movie_rationale.pickle (Provided)
- We extract the labeled rationales in test data and store it into a dictionary

Code description

all codes needed for reproduce our experiment is provided in software/

script/train.py
- Include model structure, training process and predict process
script/data_preproc.py
- script forr data preprocessing
script/evaluate_qa.py
- script contains evaluation metrics
script/process_beer.py
- preprocessing for beer training and valid data
script/process_movie.py
- preprocessing for movie training and valid data
script/ask_beer_reason.py
- Creating SQuAD format data for beer rationalization
make_data.sh
- run to preprocces data for training and testing
_run_train.sh
- run to trained ou
_run_beer_reason.sh
- shell script for beer rationale prediction, and evaluate the IOU F1 and Token F1 scores
_run_movie_reason.sh
- shell script for movie rationale prediction, and evaluate the IOU F1 and Token F1 scores

billkunghappy/ADL-Final-Report