A repository for benchmarking the performance of agents, regardless of how they are set up or how they work.
Agent scores will be listed here, both overall and by category.
- Auto-GPT
- gpt-engineer
- mini-agi
- smol-developer
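
As a rough illustration of "overall and by category" scoring, here is a minimal sketch that assumes each benchmark result is recorded as an `(agent, category, passed)` tuple. The record layout, the `summarize` function, the agent names, and the category names are all hypothetical and not taken from this repo; the sample values are placeholders, not real benchmark results.

```python
from collections import defaultdict


def summarize(results):
    """Return (overall, by_category) pass rates.

    results: iterable of (agent, category, passed) tuples (hypothetical layout).
    overall maps agent -> pass rate; by_category maps (agent, category) -> pass rate.
    """
    overall = defaultdict(lambda: [0, 0])      # agent -> [passed, total]
    by_category = defaultdict(lambda: [0, 0])  # (agent, category) -> [passed, total]
    for agent, category, passed in results:
        # Update both the agent-wide counter and the per-category counter.
        for counter in (overall[agent], by_category[(agent, category)]):
            counter[0] += int(passed)
            counter[1] += 1

    def rate(counter):
        return counter[0] / counter[1]

    return (
        {agent: rate(c) for agent, c in overall.items()},
        {key: rate(c) for key, c in by_category.items()},
    )


# Placeholder records for illustration only.
sample_results = [
    ("agent-a", "code", True),
    ("agent-a", "code", False),
    ("agent-a", "memory", True),
    ("agent-b", "code", False),
]
overall_scores, category_scores = summarize(sample_results)
```

Pass rate is only one possible metric; a weighted or partial-credit score could be swapped in by changing what is accumulated in the counters.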
Language: Python. License: MIT.