SAT-LM

Code for SatLM: SATisfiability-Aided Language Models using Declarative Prompting (NeurIPS 2023).

Setup

python==3.8
requirements: pip install -r requirements.txt
Set the OpenAI API key: export KEY=yourkey
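Before launching a long experiment, it can help to confirm that the key is visible to Python. The snippet below is a minimal sketch, not code from this repository: the helper name get_openai_key is hypothetical, and it only assumes that the key is exported in the KEY environment variable as described above.

```python
import os


def get_openai_key(env_var: str = "KEY") -> str:
    """Return the API key stored in `env_var`, or raise a clear error.

    `get_openai_key` is a hypothetical helper for illustration; the only
    assumption taken from the setup step is the variable name KEY.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(
            f"Environment variable {env_var} is not set; "
            f"run: export {env_var}=yourkey"
        )
    return key
```

Calling get_openai_key() at the top of a script fails fast with a readable message instead of surfacing an authentication error midway through a run.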

Experiments

Preparation:
mkdir misc tmp

Since OpenAI will no longer support code-davinci-002, we provide cached outputs generated by code-davinci-002.

Run the following command to use the cached code-davinci-002 outputs:
unzip aux/cached_code-002_outputs.zip -d .
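To see what the archive would write into the working directory before extracting it, the entries can be listed with Python's standard zipfile module. This is an illustrative sketch: list_archive is a hypothetical helper, and nothing is assumed about the archive's contents beyond the path given in the command above.

```python
import zipfile
from typing import List


def list_archive(path: str = "aux/cached_code-002_outputs.zip") -> List[str]:
    """Return the names of all entries in the zip archive at `path`.

    Hypothetical helper for illustration; it reads the archive index only
    and does not extract anything.
    """
    with zipfile.ZipFile(path) as zf:
        return zf.namelist()
```

Printing the first few names returned by list_archive() shows where the cached outputs will land relative to the repository root.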

Experiments on Arithmetic Reasoning

GSM:
sh exp_scripts/gsm.sh test

GSM-system:
sh exp_scripts/gsm.sh system

Algebra:
sh exp_scripts/gsm.sh algebra

Experiments on Logical Reasoning

ARLSAT:
sh exp_scripts/arlsat.sh

BoardgameQA:
sh exp_scripts/boarddp1.sh # depth 1
sh exp_scripts/boarddp2.sh # depth 2
sh exp_scripts/boarddp3.sh # depth 3

CLUTRR:
sh exp_scripts/clutrr.sh

ProofWriter:
sh exp_scripts/proofd5.sh

Prompts

The prompts used in our experiments are stored as JSON Lines files in manual_prompts/.
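A JSON Lines file holds one JSON value per line, so the prompt files can be inspected with the standard library alone. The sketch below is illustrative and makes two labeled assumptions: the helper names load_jsonl and load_prompt_dir are hypothetical, and the glob pattern "*.jsonl*" is a guess at the file extension that may need adjusting. The structure of each record is not assumed.

```python
import json
from pathlib import Path
from typing import Any, Dict, List


def load_jsonl(path) -> List[Any]:
    """Parse a JSON Lines file: one JSON value per non-empty line."""
    records = []
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if line:  # skip blank lines
                records.append(json.loads(line))
    return records


def load_prompt_dir(prompt_dir="manual_prompts",
                    pattern: str = "*.jsonl*") -> Dict[str, List[Any]]:
    """Map each matching file name in `prompt_dir` to its parsed records.

    The default `pattern` is an assumption about the file extension,
    not something stated in this README.
    """
    return {
        p.name: load_jsonl(p)
        for p in sorted(Path(prompt_dir).glob(pattern))
    }
```

Calling load_prompt_dir() from the repository root returns a dictionary keyed by file name, which makes it easy to check how many records each prompt file contains.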

Citation

@InProceedings{Ye-Et-Al:2023:SAT,
  title = {SatLM: Satisfiability-Aided Language Models Using Declarative Prompting},
  author = {Xi Ye and Qiaochu Chen and Isil Dillig and Greg Durrett},
  booktitle = {Proceedings of NeurIPS},
  year = {2023},
}