
BENCHY

Benchmarks you can feel

We all love benchmarks, but there's nothing like a hands-on vibe check. What if we could meet somewhere in the middle?

Enter BENCHY: a chill, live benchmark tool that lets you see the performance, price, and speed of LLMs in a side-by-side comparison for SPECIFIC use cases.
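Conceptually, a side-by-side comparison boils down to capturing the same few numbers for every model. A minimal sketch of such a record (hypothetical field names for illustration, not BENCHY's actual data model, which lives in src/store/*):

```python
from dataclasses import dataclass

@dataclass
class ComparisonRow:
    """One model's result for a single prompt (illustrative only)."""
    model: str          # e.g. "gpt-4o" or "claude-3-5-sonnet"
    latency_ms: float   # wall-clock time for the completion
    cost_usd: float     # estimated price of the call
    response: str       # the model's raw output

row = ComparisonRow(model="gpt-4o", latency_ms=812.4, cost_usd=0.0031, response="42")
print(f"{row.model}: {row.latency_ms} ms, ${row.cost_usd}")
```

Rendering a list of these rows next to each other is essentially what the live view does for you.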

Watch the walkthrough video here

Live Benchmark Tools

Important Files

  • .env - Environment variables for API keys
  • server/.env - Environment variables for API keys
  • package.json - Front end dependencies
  • server/pyproject.toml - Server dependencies
  • src/store/* - Stores all front end state and prompt
  • src/api/* - API layer for all requests
  • server/server.py - Server routes
  • server/modules/llm_models.py - All LLM models
  • server/modules/openai_llm.py - OpenAI LLM
  • server/modules/anthropic_llm.py - Anthropic LLM
  • server/modules/gemini_llm.py - Gemini LLM
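The per-provider modules above (openai_llm.py, anthropic_llm.py, gemini_llm.py) suggest a shared call shape that llm_models.py can dispatch to. A hedged sketch of what such an interface could look like (names are illustrative, not the repo's actual API):

```python
from abc import ABC, abstractmethod

class LLMProvider(ABC):
    """Common interface each provider module might implement (hypothetical)."""

    @abstractmethod
    def prompt(self, text: str, model: str) -> str:
        """Send a prompt to the given model and return the raw text response."""

class EchoProvider(LLMProvider):
    """Stand-in provider to show the shape; real modules would call vendor SDKs."""

    def prompt(self, text: str, model: str) -> str:
        return f"[{model}] {text}"

# A registry keyed by provider name lets the server route requests uniformly.
providers = {"echo": EchoProvider()}
print(providers["echo"].prompt("ping", model="test-model"))  # → [test-model] ping
```

Keeping every provider behind one interface is what makes adding a new vendor a single-file change.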

Setup

Get API Keys

Client Setup

# Install dependencies using bun (recommended)
bun install

# Or using npm
npm install

# Or using yarn
yarn install

# Start development server
bun dev  # or npm run dev / yarn dev

Server Setup

# Move into server directory
cd server

# Create and activate virtual environment using uv
uv sync

# Set up environment variables
cp .env.sample .env

# Fill in EVERY .env key with your API keys
ANTHROPIC_API_KEY=
OPENAI_API_KEY=
GEMINI_API_KEY=

# Start server
uv run python server.py

# Run tests
uv run pytest  # beware: tests hit live APIs and cost money
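Before booting the server, it can help to fail fast if any of the three keys above is missing. A small check along these lines (not part of the repo; only the key names come from the .env sample):

```python
import os

REQUIRED_KEYS = ("ANTHROPIC_API_KEY", "OPENAI_API_KEY", "GEMINI_API_KEY")

def missing_keys(env=os.environ):
    """Return the required API key names that are unset or empty in env."""
    return [k for k in REQUIRED_KEYS if not env.get(k)]

# Usage: call once at startup and bail out early if anything is absent, e.g.
# if missing_keys():
#     raise SystemExit(f"Missing API keys: {', '.join(missing_keys())}")
```

Failing at startup beats discovering a missing key mid-benchmark, when half the models have already responded.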

Dev Notes & Caveats

  • See src/components/DevNotes.vue for limitations

Resources