HumanPrompt is a framework for easier human-in-the-loop design, management, sharing, and usage of prompts and prompt methods. It is specially designed for researchers. It is still in progress, and we warmly welcome new contributions on methods and modules. Check out our proposal here.
First, clone this repo, then run:
pip install -e .
This installs the humanprompt package and adds a soft link for hub at ./humanprompt/artifacts/hub.
Then set the required environment variables, such as the OpenAI API key (note that the shell does not allow spaces around `=`):
export OPENAI_API_KEY="YOUR_OPENAI_API_KEY"
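Before calling any method, it can help to confirm that the key is actually visible to Python. The following is a minimal sketch of such a check; the helper name `require_env` is hypothetical and not part of humanprompt.

```python
import os


def require_env(name: str) -> str:
    """Return the value of environment variable `name`, or raise if it is unset or empty."""
    value = os.environ.get(name, "")
    if not value:
        raise RuntimeError(f"{name} is not set; export it before running humanprompt.")
    return value


if __name__ == "__main__":
    # Hypothetical check, run once at the start of a script.
    try:
        require_env("OPENAI_API_KEY")
        print("OPENAI_API_KEY is set")
    except RuntimeError as err:
        print(err)
```

Failing early with a clear message is usually easier to debug than an authentication error raised from inside an API client.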
What comes next depends on how you plan to use this repo. For now, its mission is to help researchers verify their ideas, so we made it flexible to extend and use.
A minimal example of running a method follows.
Usage is simple and will feel familiar if you have used Hugging Face transformers before.
For example, use the Chain-of-Thought on CommonsenseQA:
from humanprompt.methods.auto.method_auto import AutoMethod
from humanprompt.tasks.dataset_loader import DatasetLoader
# Get one built-in method
method = AutoMethod.from_config(method_name="cot")
# Get one dataset, select one example for demo
data = DatasetLoader.load_dataset(dataset_name="commonsense_qa", dataset_split="test")
data_item = data[0]
# Adapt the raw data to the method's input format (we will improve this part later)
data_item["context"] = "Answer choices: {}".format(
    " ".join(
        [
            "({}) {}".format(label.lower(), text.lower())
            for label, text in zip(
                data_item["choices"]["label"], data_item["choices"]["text"]
            )
        ]
    )
)
# Run the method
result = method.run(data_item)
print(result)
print(data_item)
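The adaptation step in the example above can be tried without downloading a dataset or calling an API. This is a small sketch that wraps the same formatting logic in a function; `format_choices` is a hypothetical helper name, and the sample record is made up to mirror the CommonsenseQA layout used above.

```python
from typing import Dict, List


def format_choices(choices: Dict[str, List[str]]) -> str:
    """Build the "Answer choices: ..." context string used in the example above."""
    parts = [
        "({}) {}".format(label.lower(), text.lower())
        for label, text in zip(choices["label"], choices["text"])
    ]
    return "Answer choices: {}".format(" ".join(parts))


# Made-up record with labels and texts stored as parallel lists.
sample = {
    "question": "Where would you store a spare blanket?",
    "choices": {"label": ["A", "B", "C"], "text": ["Closet", "Oven", "Mailbox"]},
}
sample["context"] = format_choices(sample["choices"])
print(sample["context"])
# Answer choices: (a) closet (b) oven (c) mailbox
```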
Zero-shot text2SQL:
import os
from humanprompt.methods.auto.method_auto import AutoMethod
from humanprompt.tasks.dataset_loader import DatasetLoader
# Get the built-in text2SQL method
method = AutoMethod.from_config("db_text2sql")
# Load Spider and select one example for the demo
data = DatasetLoader.load_dataset(dataset_name="spider", dataset_split="validation")
data_item = data[0]
# Point the method at the example's SQLite database file
data_item["db"] = os.path.join(
    data_item["db_path"], data_item["db_id"], data_item["db_id"] + ".sqlite"
)
# Run the method
result = method.run(data_item)
print(result)
print(data_item)
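The database path in the text2SQL example follows a `<db_path>/<db_id>/<db_id>.sqlite` layout. As a sketch, the path construction can be isolated and checked before the method runs; `sqlite_path` is a hypothetical helper, and the directory and database names below are made-up placeholders.

```python
import os


def sqlite_path(db_root: str, db_id: str) -> str:
    """Return <db_root>/<db_id>/<db_id>.sqlite, the layout used in the example above."""
    return os.path.join(db_root, db_id, db_id + ".sqlite")


# Made-up values; real records carry "db_path" and "db_id" as in the example above.
item = {"db_path": "data/spider/database", "db_id": "concert_singer"}
item["db"] = sqlite_path(item["db_path"], item["db_id"])
print(item["db"])
# Worth checking before calling method.run, since a wrong path only fails later.
print(os.path.exists(item["db"]))
```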
We adopt a "one config, one experiment" paradigm to facilitate research, especially when benchmarking different prompting methods.
In each experiment's config file (.yaml) under examples/configs/, you can configure the dataset, prompting method, and metrics.
The following is an example config file for the Chain-of-Thought method on GSM8K:
---
dataset:
  dataset_name: "gsm8k"                  # dataset name, aligned with huggingface dataset if loaded from it
  dataset_split: "test"                  # dataset split
  dataset_subset_name: "main"            # dataset subset name, null if not used
  dataset_key_map:                       # mapping original dataset keys to humanprompt task keys to unify the interface
    question: "question"
    answer: "answer"
method:
  method_name: "cot"                     # method name to initialize the prompting method class
  method_config_file_path: null          # method config file path, null if not used (will be overridden by method_args)
  method_args:
    client_name: "openai"                # LLM API client name, adopted from github.com/HazyResearch/manifest
    transform: "cot.gsm8k.transform_cot_gsm8k.CoTGSM8KTransform"  # user-defined transform class to build the prompts
    extract: "cot.gsm8k.extract_cot_gsm8k.CoTGSM8KExtract"        # user-defined extract class to extract the answers from output
    extraction_regex: ".*The answer is (.*).\n?"                  # user-defined regex to extract the answer from output
    prompt_file_path: "cot/gsm8k/prompt.txt"                      # prompt file path
    max_tokens: 512                      # max generated tokens
    temperature: 0                       # temperature for generated tokens
    engine: code-davinci-002             # LLM engine
    stop_sequence: "\n\n"                # stop sequence for generation
metrics:
  - "exact_match"                        # metrics to evaluate the results
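To see what the `extraction_regex` above captures, it can be tried on a sample output with Python's `re` module. This is only a sketch: how humanprompt applies the pattern internally is not shown here, so the use of `re.search` and the whitespace-trimming `exact_match` function are assumptions, and both helper names are hypothetical.

```python
import re
from typing import Optional

# Same pattern as extraction_regex in the config; in YAML double quotes, "\n" is a newline.
EXTRACTION_REGEX = r".*The answer is (.*).\n?"


def extract_answer(output: str) -> Optional[str]:
    """Return the text captured after "The answer is", or None if nothing matches."""
    match = re.search(EXTRACTION_REGEX, output)
    return match.group(1).strip() if match else None


def exact_match(prediction: Optional[str], reference: str) -> bool:
    """Assumed metric: string equality after trimming surrounding whitespace."""
    return prediction is not None and prediction.strip() == reference.strip()


output = "She has 16 - 3 - 4 = 9 eggs. 9 * 2 = 18. The answer is 18.\n"
answer = extract_answer(output)
print(answer)                     # 18
print(exact_match(answer, "18"))  # True
```

Note that the final `.` in the pattern is unescaped, so it consumes one character of any kind. An output ending in "The answer is 18" with no trailing period would therefore yield "1" rather than "18".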
Users can create their own transform and extract classes to customize prompt generation and answer extraction.
The prompt file can be replaced or specified according to the user's needs.
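As a rough illustration of the division of labor between the two kinds of classes, here is a standalone sketch. It deliberately does not subclass anything from humanprompt: the class names, method names, and signatures below are hypothetical, and real transform and extract classes should follow the base classes under humanprompt/components.

```python
class SimpleCoTTransform:
    """Hypothetical transform: append one question to a few-shot prompt."""

    def __init__(self, few_shot_prompt: str) -> None:
        self.few_shot_prompt = few_shot_prompt

    def transform(self, item: dict) -> str:
        # Few-shot examples first, then the new question with an open answer slot.
        return "{}\n\nQ: {}\nA:".format(self.few_shot_prompt.strip(), item["question"])


class FinalAnswerExtract:
    """Hypothetical extract: take the text after the last "The answer is"."""

    def extract(self, output: str) -> str:
        marker = "The answer is"
        tail = output.rsplit(marker, 1)[-1] if marker in output else output
        return tail.strip().rstrip(".")


few_shot = "Q: What is 2 + 3?\nA: 2 + 3 = 5. The answer is 5."
prompt = SimpleCoTTransform(few_shot).transform({"question": "What is 4 + 4?"})
print(prompt)
print(FinalAnswerExtract().extract("4 + 4 = 8. The answer is 8."))  # 8
```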
To run experiments, specify the experiment name and other meta configs on the command line from the examples/ directory.
For example, run the following command to run Chain-of-Thought on GSM8K:
python run_experiment.py \
  --exp_name cot-gsm8k \
  --num_test_samples 300
For a new combination of methods and tasks, simply add a new config file under examples/configs/ and run the command.
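How run_experiment.py handles its flags internally is not shown here. The sketch below only illustrates one way the two flags from the command above could be parsed and tied to a config file, assuming each experiment name maps to `configs/<exp_name>.yaml`. That mapping and the function names `parse_args` and `config_path` are assumptions.

```python
import argparse
from pathlib import Path
from typing import List, Optional


def parse_args(argv: Optional[List[str]] = None) -> argparse.Namespace:
    """Parse the two flags used in the command above."""
    parser = argparse.ArgumentParser(description="Sketch of an experiment launcher")
    parser.add_argument("--exp_name", required=True, help="experiment name, e.g. cot-gsm8k")
    parser.add_argument(
        "--num_test_samples",
        type=int,
        default=None,
        help="number of test samples to evaluate; None means all",
    )
    return parser.parse_args(argv)


def config_path(exp_name: str, config_dir: str = "configs") -> Path:
    # Assumption: one YAML file per experiment, named after the experiment.
    return Path(config_dir) / "{}.yaml".format(exp_name)


args = parse_args(["--exp_name", "cot-gsm8k", "--num_test_samples", "300"])
print(config_path(args.exp_name).as_posix())  # configs/cot-gsm8k.yaml
print(args.num_test_samples)                  # 300
```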
.
├── examples
│   ├── configs                # config files for experiments
│   ├── main.py                # one sample demo script
│   └── run_experiment.py      # experiment script
├── hub                        # hub contains static files for methods and tasks
│   ├── cot                    # method Chain-of-Thought
│   │   ├── gsm8k              # task GSM8K, containing prompt file and transform/extract classes, etc.
│   │   └── ...
│   ├── ama_prompting          # method Ask Me Anything
│   ├── binder                 # method Binder
│   ├── db_text2sql            # method text2sql
│   ├── react                  # method ReAct
│   ├── standard               # method standard prompting
│   └── zero_shot_cot          # method zero-shot Chain-of-Thought
├── humanprompt                # humanprompt package, containing building blocks for the complete prompting pipeline
│   ├── artifacts
│   │   ├── artifact.py
│   │   └── hub
│   ├── components             # key components for the prompting pipeline
│   │   ├── aggregate          # aggregate classes to aggregate the answers
│   │   ├── extract            # extract classes to extract the answers from output
│   │   ├── post_hoc.py        # post-hoc processing
│   │   ├── prompt.py          # prompt classes to build the prompts
│   │   ├── retrieve           # retrieve classes to retrieve in-context examples
│   │   └── transform          # transform classes to transform the raw data to the method's input format
│   ├── evaluators             # evaluators
│   │   └── evaluator.py       # evaluator class to evaluate the dataset results
│   ├── methods                # prompting methods, usually one method is related to one paper
│   │   ├── ama_prompting      # Ask Me Anything (https://arxiv.org/pdf/2210.02441.pdf)
│   │   ├── binder             # Binder (https://arxiv.org/pdf/2210.02875.pdf)
│   │   └── ...
│   ├── tasks                  # dataset loading and preprocessing
│   │   ├── add_sub.py         # AddSub dataset
│   │   ├── wikitq.py          # WikiTableQuestions dataset
│   │   └── ...
│   ├── third_party            # third party packages
│   └── utils                  # utils
│       ├── config_utils.py
│       └── integrations.py
└── tests                      # test scripts
    ├── conftest.py
    ├── test_datasetloader.py
    └── test_method.py
This repository is designed to give researchers quick access to different prompt methods and an easy way to manipulate them. We spent a lot of time making it easy to extend and use, and we hope you will contribute to this repo.
If you are interested in contributing your method to this framework, you can:
- Open an issue describing the method you need, and we will add it to our TODO list and implement it as soon as possible.
- Add your method to the humanprompt/methods folder yourself. To do that, follow these steps:
  - Clone the repo.
  - Create a branch from the main branch, named after your method.
  - Commit your code to your branch. You need to:
    - add code in ./humanprompt/methods, placing your method in the ./humanprompt/methods/your_method_name folder,
    - create a hub for your method in ./hub/your_method_name,
    - make sure to have an ./examples folder in ./hub/your_method_name to configure the basic usage of this method,
    - provide a minimal demo in ./examples for running and testing your method.
  - Create a demo of usage in the ./examples folder.
  - Open a PR to merge your branch into the main branch.
  - We will handle the last few steps for you to make sure your method is well integrated into this framework.
We use pre-commit to control code quality. Before you commit, run the commands below to check your code and fix any issues.
pip install pre-commit
pre-commit install # install all hooks
pre-commit run --all-files # trigger all hooks
You can use git commit --no-verify to skip the hooks and let us handle that later on.
If you find this repo useful, please cite our project and manifest:
@software{humanprompt,
author = {Tianbao Xie and
Zhoujun Cheng and
Yiheng Xu and
Peng Shi and
Tao Yu},
title = {A framework for human-readable prompt-based method with large language models},
howpublished = {\url{https://github.com/hkunlp/humanprompt}},
year = 2022,
month = oct
}
@misc{orr2022manifest,
author = {Orr, Laurel},
title = {Manifest},
year = {2022},
publisher = {GitHub},
howpublished = {\url{https://github.com/HazyResearch/manifest}},
}