- Install the required packages:
pip install -r requirements.txt
- Set up a .env file in this folder. A template can be found at .env_template
- Add the repository to PYTHONPATH:
export PYTHONPATH=${PYTHONPATH}:/path/to/llm_feedback
Generating the outputs:
python llm_feedback/pilot/run_pilot_generation.py \
--generation_llm gpt-3.5-turbo-0301 \
--task example \
--max_num_examples 50 \
--output_dir /path/to/dir
Evaluating the outputs:
python llm_feedback/pilot/run_pilot_evaluation.py \
--model_outputs_path /path/to/dir/gpt-3.5-turbo-0301__gpt-3.5-turbo-0301__gpt-3.5-turbo-0301__example__train__outputs.jsonl \
--task example \
--output_dir /path/to/dir
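
Before running the evaluation, it can help to check what the generation step wrote. The following is a minimal sketch for inspecting the outputs file, assuming only that it is in JSON Lines format (one JSON object per line, as the .jsonl extension suggests). The helper names read_jsonl and summarize are hypothetical and not part of llm_feedback, and no particular field names are assumed.

```python
import json
from pathlib import Path


def read_jsonl(path):
    """Read a JSON Lines file into a list of dicts, skipping blank lines."""
    records = []
    with Path(path).open(encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if line:
                records.append(json.loads(line))
    return records


def summarize(records):
    """Return the record count and the sorted set of keys seen across records."""
    keys = sorted({key for record in records for key in record})
    return len(records), keys
```

Calling summarize(read_jsonl(...)) on the path passed to --model_outputs_path gives the number of examples and the fields available in each record.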
- Create a new Python file under llm_feedback/pilot/tasks/
- Implement a subclass of llm_feedback.pilot.tasks.base.BaseTask, specifically the following methods:
  - get_dataset: load the dataset and return an iterable of examples
  - get_chain: return a LangChain chain
  - process (optional): apply the chain to the example. Override if special processing (e.g. renaming keys) is needed
  - evaluate: evaluate a list of model outputs. Evaluate both initial and refinement outputs if necessary.
  - See llm_feedback/pilot/tasks/example.py and llm_feedback/pilot/tasks/mathqa.py for examples.
- Add the task to llm_feedback/pilot/tasks/__init__.py
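
The shape of a task can be sketched roughly as follows. This is an illustration under stated assumptions, not the repository's actual interface: the BaseTask below is a local stand-in so the snippet runs on its own, the method signatures are guesses based on the descriptions above, and a plain callable stands in for the LangChain chain. EchoTask, run_task, and all dictionary keys are hypothetical. Check llm_feedback/pilot/tasks/base.py and example.py for the real signatures.

```python
from abc import ABC, abstractmethod


class BaseTask(ABC):
    """Local stand-in for llm_feedback.pilot.tasks.base.BaseTask (signatures assumed)."""

    @abstractmethod
    def get_dataset(self):
        """Load the dataset and return an iterable of examples."""

    @abstractmethod
    def get_chain(self):
        """Return the chain to apply to each example."""

    def process(self, chain, example):
        # Default behavior: apply the chain to the example unchanged.
        return chain(example)

    @abstractmethod
    def evaluate(self, outputs):
        """Evaluate a list of model outputs."""


class EchoTask(BaseTask):
    """Hypothetical toy task: uppercase a string and score exact matches."""

    def get_dataset(self):
        return [
            {"text": "hello", "target": "HELLO"},
            {"text": "abc", "target": "ABD"},
        ]

    def get_chain(self):
        # A real task would build and return a LangChain chain here;
        # a plain callable stands in so the sketch has no dependencies.
        def chain(inputs):
            return {"initial_output": inputs["question"].upper()}

        return chain

    def process(self, chain, example):
        # Special processing: rename the dataset key to the key the chain expects.
        result = chain({"question": example["text"]})
        return {**example, **result}

    def evaluate(self, outputs):
        correct = sum(o["initial_output"] == o["target"] for o in outputs)
        return {"initial_accuracy": correct / len(outputs)}


def run_task(task):
    """Run every example through the task's chain, then score the outputs."""
    chain = task.get_chain()
    outputs = [task.process(chain, example) for example in task.get_dataset()]
    return task.evaluate(outputs)
```

Here run_task(EchoTask()) scores one of the two toy examples as correct. In the real pipeline the generation and evaluation scripts drive these methods, so a helper like run_task is only useful for quick local checks.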