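AICUP 2021 Ministry of Education National Collegiate Artificial Intelligence Competition, Spring Contest: Doctor-Patient Decision Prediction and Question Answering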

Installation

Download the trained models:
- By script (download and unzip):
```
bash download_model.sh
```
- Direct download:
  - rc_model.zip: https://drive.google.com/file/d/1GHO_PUPwRSYaHLgmWb8wKUD8dvsYw50w/view?usp=sharing
  - qa_model.zip: https://drive.google.com/file/d/1OH2J6m9j_sUmecbpsW3aYrUDCgxJ-W-P/view?usp=sharing
  - Unzip and place the models under ckpt/.

We recommend placing the test data at data/rc/test.csv for Risk Classification and data/qa/test.json for QA.
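
Before predicting, it can help to confirm that the layout described above is in place. The following is a minimal sketch, not part of this repository: the helper name `missing_paths` and the list `EXPECTED_PATHS` are hypothetical, and the paths are simply the ones recommended above.

```python
from pathlib import Path

# Paths taken from this README; adjust them if your layout differs.
EXPECTED_PATHS = [
    "ckpt",               # unzipped rc_model.zip / qa_model.zip go under here
    "data/rc/test.csv",   # Risk Classification test data
    "data/qa/test.json",  # QA test data
]


def missing_paths(root, expected=EXPECTED_PATHS):
    """Return the expected paths that do not exist under `root`."""
    root = Path(root)
    return [p for p in expected if not (root / p).exists()]


if __name__ == "__main__":
    missing = missing_paths(".")
    if missing:
        print("Missing:", ", ".join(missing))
    else:
        print("All expected files are in place.")
```

Run it from the repository root; it only checks that the paths exist, not that their contents are valid.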

QA data must be preprocessed before prediction:
```
python query_qa.py \
    --data_path data/qa/test.json \
    --model_name model_test.pkl \
    --processed_data_path data/qa/processed_test.json
```

The training data must also be preprocessed before training:
```
python query_qa.py \
    --data_path data/qa/train.json \
    --model_name model_train.pkl \
    --processed_data_path data/qa/processed_train.json
```
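
The two preprocessing commands differ only in the split name, so they can be generated from one template. This is a rough sketch assuming the flags and file-naming pattern shown in the commands above; `build_preprocess_cmd` and `preprocess` are hypothetical helpers, not functions provided by this repository.

```python
import subprocess
import sys


def build_preprocess_cmd(split):
    """Build the query_qa.py command line for a split ("train" or "test")."""
    return [
        sys.executable, "query_qa.py",
        "--data_path", f"data/qa/{split}.json",
        "--model_name", f"model_{split}.pkl",
        "--processed_data_path", f"data/qa/processed_{split}.json",
    ]


def preprocess(splits=("train", "test")):
    """Run preprocessing for each split, stopping at the first failure."""
    for split in splits:
        # check=True raises CalledProcessError if query_qa.py exits non-zero.
        subprocess.run(build_preprocess_cmd(split), check=True)
```

Calling `preprocess()` from the repository root runs both commands in sequence; pass `splits=("test",)` to preprocess only the test data.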

We also use C3 dialog data to boost performance.
- Download c3-d-train.json, c3-d-dev.json, and c3-d-test.json from https://github.com/nlpdata/c3/tree/master/data and place them at data/c3/train.json, data/c3/dev.json, and data/c3/test.json, respectively.
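
The download and renaming step can be scripted. The sketch below is an illustration only: the raw-file base URL is an assumption derived from the repository link above, and `download_c3` is a hypothetical helper that is not part of this repository. Files that already exist are left untouched.

```python
from pathlib import Path
from urllib.request import urlretrieve

# Assumption: raw-file URL derived from the GitHub repository link above.
C3_RAW_BASE = "https://raw.githubusercontent.com/nlpdata/c3/master/data"

# File name in the C3 repository -> destination expected by this project.
C3_FILES = {
    "c3-d-train.json": "data/c3/train.json",
    "c3-d-dev.json": "data/c3/dev.json",
    "c3-d-test.json": "data/c3/test.json",
}


def download_c3(root=".", base_url=C3_RAW_BASE):
    """Download the three C3 dialog files, skipping those already present."""
    for name, dest in C3_FILES.items():
        target = Path(root) / dest
        if target.exists():
            continue
        target.parent.mkdir(parents=True, exist_ok=True)
        urlretrieve(f"{base_url}/{name}", str(target))
```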

To train:
```
python train_qa.py
```
- 10% of the training data is automatically held out as validation data.
  - The split requires data/rc/train.csv in order to reproduce the same split used for risk classification, so make sure the file exists.
- The program saves the model with the best validation accuracy (the metric can be changed with --metric_for_best).
- Models are saved to --ckpt_dir (default: ckpt/qa).
- Training uses cuda:0 by default (can be changed with --device). Running on CPU is untested.
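
This README does not describe how the shared validation split is computed. As a purely illustrative sketch of one way two tasks can end up with the same 10% split, the hypothetical `split_ids` below derives the split from a common list of ids with a fixed seed. The function name, the seed, and the idea of splitting by id are assumptions, not the behavior of train_qa.py.

```python
import random


def split_ids(ids, val_ratio=0.1, seed=0):
    """Deterministically split ids into (train_ids, val_ids)."""
    # Sort first so the result does not depend on the input order.
    ordered = sorted(set(ids))
    random.Random(seed).shuffle(ordered)
    n_val = max(1, round(len(ordered) * val_ratio))
    val_ids = set(ordered[:n_val])
    train_ids = [i for i in ordered if i not in val_ids]
    return sorted(train_ids), sorted(val_ids)
```

Because the result depends only on the set of ids and the seed, any task that reads the same id list gets an identical split, which is one reason a shared file would have to be present at training time.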