This repo hosts the code for the paper, Crystal: Introspective Reasoners Reinforced with Self-Feedback, presented at EMNLP 2023.
Model: Our Crystal models are now available on the Hugging Face model hub: [large] [3b] [11b]
Usage: Please see Crystal's Hugging Face model card.
Create and activate the Conda environment:
conda env create -f environment.yml
conda activate crystal
Install gsutil.
Download the UQA data: go to /data/ and run python download_uqa.py
The Crystal model is trained in two stages. For simplicity, we show the process and code for training the Crystal-large model.
We trained Stage I using 8x V100 GPUs, each with 32 GB of memory.
First, generate silver knowledge from GPT-3.
To use our pre-generated knowledge instead of generating it yourself, go to /data/ and run gdown 10c2Mmjd3Nc6FgoWkUjqy-Ysy41D7qzch
Alternatively, you can download the knowledge_gkp.zip file from our Google Drive folder, unzip it, and put it under /data/
Otherwise, you can generate the knowledge yourself by going to the /scripts/ directory and running:
sh generate_knowledge_gkp.sh
Remember to set the OPENAI_API_KEY environment variable beforehand, and be ready to spend a lot of money ;)
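Because the generation script calls a paid API, it can help to fail fast when the key is missing. The following is a minimal sketch, assuming the key is read from the OPENAI_API_KEY environment variable as described above; the helper name `require_env` is hypothetical and is not part of this repo.

```python
import os


def require_env(name: str) -> str:
    """Return the value of environment variable `name`.

    Raises RuntimeError if the variable is unset or contains only whitespace.
    `require_env` is an illustrative helper, not a function from this repo.
    """
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(
            f"{name} is not set; export it before running generate_knowledge_gkp.sh"
        )
    return value
```

Calling `require_env("OPENAI_API_KEY")` at the top of a small wrapper, before launching the shell script, surfaces a missing key immediately rather than partway through generation.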
Then, you can start Stage I training by going to the /sbatch/ directory and running:
sh train_imitation_large.sh
You can track the training in wandb.
The best model checkpoint will be saved under /runs_stageI/.
We trained Stage II using 8x V100 GPUs, each with 32 GB of memory.
To train Stage II with the default setting, go to the /sbatch/ directory and run:
sh train_crystal_large.sh
Before you run this script, make sure to edit it and fill in the path to the best Stage I model checkpoint.
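To find the path to fill in, you may want to list the checkpoint files that Stage I produced. The sketch below is an assumption-laden helper, not part of this repo: the function name `list_checkpoints` is hypothetical, and the file suffixes are guesses, since the checkpoint naming scheme is not described here. It only lists candidates, newest first; it does not determine which checkpoint is the best one.

```python
from pathlib import Path
from typing import List, Sequence


def list_checkpoints(
    run_dir: str,
    suffixes: Sequence[str] = (".pth", ".pt", ".ckpt"),  # assumed suffixes
) -> List[Path]:
    """Recursively list files under `run_dir` whose suffix is in `suffixes`.

    Results are sorted by modification time, newest first. Returns an empty
    list if `run_dir` does not exist or is not a directory.
    """
    root = Path(run_dir)
    if not root.is_dir():
        return []
    candidates = [
        p for p in root.rglob("*") if p.is_file() and p.suffix in suffixes
    ]
    return sorted(candidates, key=lambda p: p.stat().st_mtime, reverse=True)
```

For example, printing each path returned by `list_checkpoints` for your Stage I run directory gives the candidates to paste into train_crystal_large.sh.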
You can track the training in wandb.
The best model checkpoint will be saved under /runs/.
To run inference with the default setting, go to the /sbatch/ directory and run:
sh eval_crystal_large.sh
This will evaluate the model on the dev split of all seen datasets.
You can view the output knowledge in [PATH_TO_MODEL_CKPT]/../knowledge/ and the inference results in [PATH_TO_MODEL_CKPT]/../inference/.
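The two output locations are both siblings of the checkpoint path. As a minimal sketch, assuming the `..` in the paths above is resolved lexically, the directories can be computed from a checkpoint path as follows; the function name `output_dirs` is hypothetical and not part of this repo.

```python
import os
from typing import Dict


def output_dirs(ckpt_path: str) -> Dict[str, str]:
    """Resolve the knowledge/ and inference/ directories next to a checkpoint.

    The resolution is purely lexical: `ckpt_path/..` is normalized without
    touching the filesystem, so the checkpoint does not need to exist.
    """
    parent = os.path.normpath(os.path.join(ckpt_path, ".."))
    return {
        "knowledge": os.path.join(parent, "knowledge"),
        "inference": os.path.join(parent, "inference"),
    }
```

Passing the same checkpoint path you gave to eval_crystal_large.sh returns the two directories to inspect after evaluation finishes.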
If you find this repo useful, please cite our paper:
@article{Liu2023CrystalIR,
title={Crystal: Introspective Reasoners Reinforced with Self-Feedback},
author={Jiacheng Liu and Ramakanth Pasunuru and Hannaneh Hajishirzi and Yejin Choi and Asli Celikyilmaz},
journal={ArXiv},
year={2023},
volume={abs/2310.04921}
}