HuggingFace LLM multiagent debate

llm-debate

Runs multi-agent debate with open-source HuggingFace models on the arithmetic task.

gen_math.py runs debates with a HuggingFace model (e.g. mistralai/Mistral-7B-Instruct-v0.2).
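As a rough illustration of what a debate round involves, here is a minimal sketch of the general loop: every agent answers the question, then in each later round sees the other agents' previous answers and may revise its own. It is not the code in gen_math.py. The names `run_debate`, `majority_answer`, `stubborn`, and `follower` are hypothetical, and the two stub agents stand in for real model calls so the example runs without downloading a model.

```python
from collections import Counter
from typing import Callable, List

# Hypothetical agent type: maps a prompt string to a reply string.
Agent = Callable[[str], str]


def run_debate(agents: List[Agent], question: str, rounds: int) -> List[List[str]]:
    """Return history[r][i], the reply of agent i in round r."""
    history: List[List[str]] = []
    for r in range(rounds):
        replies = []
        for i, agent in enumerate(agents):
            if r == 0:
                prompt = question
            else:
                # Show this agent what the *other* agents said last round.
                others = [rep for j, rep in enumerate(history[-1]) if j != i]
                prompt = (
                    question
                    + "\nOther agents answered:\n"
                    + "\n".join(others)
                    + "\nUsing these as additional advice, give an updated answer."
                )
            replies.append(agent(prompt))
        history.append(replies)
    return history


def majority_answer(replies: List[str]) -> str:
    """Pick the most common reply of a round as the final answer."""
    return Counter(replies).most_common(1)[0][0]


# Stub agents (assumptions for the demo, not real models).
def stubborn(prompt: str) -> str:
    return "14"


def follower(prompt: str) -> str:
    marker = "Other agents answered:\n"
    if marker not in prompt:
        return "13"  # initial (wrong) guess
    # Adopt the first peer answer listed in the prompt.
    return prompt.split(marker)[1].splitlines()[0]


history = run_debate([stubborn, follower, follower], "What is 2+3*4?", rounds=2)
final = majority_answer(history[-1])
print(history, final)
```

With a real model, each agent would wrap a text-generation call; the loop structure stays the same while the number of agents and rounds varies.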

To reproduce the figure in the original paper (scaling with rounds and agents):

  1. Set the number of agents and rounds, then run the ./gen_math.sh training script.
  2. Generate the figures in outputs.ipynb.

gen_math_panel.py and ./gen_math.sh are modified to run a panel experiment with several different HuggingFace models. Models are selected from a list of available options by passing their indices as command-line arguments.
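Selecting models by index could look roughly like the sketch below. This is an assumption about the general pattern, not the actual interface of gen_math_panel.py: the function name `select_models`, the positional `indices` argument, and the contents of `AVAILABLE_MODELS` (beyond the Mistral model named above) are illustrative.

```python
import argparse
from typing import List, Optional

# Illustrative list; the repository's real list of options may differ.
AVAILABLE_MODELS = [
    "mistralai/Mistral-7B-Instruct-v0.2",
    "meta-llama/Llama-2-7b-chat-hf",
    "HuggingFaceH4/zephyr-7b-beta",
]


def select_models(argv: Optional[List[str]] = None) -> List[str]:
    """Map integer indices from the command line to model names."""
    parser = argparse.ArgumentParser(description="Pick panel models by index.")
    parser.add_argument(
        "indices", type=int, nargs="+",
        help="one index into AVAILABLE_MODELS per panel agent",
    )
    args = parser.parse_args(argv)
    for i in args.indices:
        if not 0 <= i < len(AVAILABLE_MODELS):
            parser.error(f"index {i} out of range 0..{len(AVAILABLE_MODELS) - 1}")
    return [AVAILABLE_MODELS[i] for i in args.indices]


panel = select_models(["0", "2"])
print(panel)
```

Repeating an index (for example `0 0 1`) would put the same model on the panel more than once under this scheme.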

Scaling agents and rounds

image

Reproduced figures

image

Panel experiment

Testing with a diverse panel of open-source HuggingFace models.

image

| Model | Arithmetic (%) | Std |
| --- | --- | --- |
| Single Agent (Mistral) | 16 | 7.3 |
| Single Agent Panel (Mistral) | 28 | 8.9 |
| Multi Agent Panel | 24 | 8.5 |
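The table does not state how the std column was computed. One common choice for an accuracy figure is the standard error of the mean over per-problem 0/1 scores, sketched below. The function name `accuracy_with_stderr` and the 25-problem run with 4 correct answers are illustrative assumptions, not data taken from this repository.

```python
import math
import statistics
from typing import List, Tuple


def accuracy_with_stderr(scores: List[int]) -> Tuple[float, float]:
    """Return (accuracy %, standard error %) for a list of 0/1 scores."""
    mean = statistics.fmean(scores)
    # Population std of the scores divided by sqrt(n) gives the standard error.
    stderr = statistics.pstdev(scores) / math.sqrt(len(scores))
    return 100 * mean, 100 * stderr


# Hypothetical run: 4 correct answers out of 25 problems.
scores = [1] * 4 + [0] * 21
acc, err = accuracy_with_stderr(scores)
print(f"{acc:.1f} +/- {err:.1f}")  # prints "16.0 +/- 7.3"
```

With so few problems the standard error is large, so differences of a few percentage points between rows may not be significant under this assumption.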

Original paper Github