OpenRouter Framework

A comprehensive Python framework for interacting with AI models through the OpenRouter API with intelligent cost management and free model prioritization.

🌟 Key Features

Core Features

  • 🎆 Free Model Priority: Defaults to free models to minimize costs
  • 💰 Cost Awareness: Clear warnings and protection against unexpected charges
  • 🔄 Smart Fallbacks: Automatic fallback from paid to free models on failures
  • 🛡️ Error Protection: Prevents accidental paid model usage in development
  • 📚 Multiple Interaction Patterns: Flexible APIs for different use cases
  • 🎯 14 Free Models: Including Llama 3.3/4, Gemini 2.0/2.5, DeepSeek R1, and more
  • ⚡ Easy Integration: Simple setup and intuitive API design

🚀 NEW: Advanced Features (Major Update)

  • πŸ“ API Call Logging: Comprehensive request/response logging with timing and token usage
  • πŸ’Ύ Response Caching: In-memory and persistent caching with TTL and LRU eviction
  • πŸ“‘ Streaming Support: Real-time streaming responses with Server-Sent Events
  • ⚑ Async/Await Support: Full asynchronous support with aiohttp for concurrent operations
  • πŸ”„ Dynamic Model Fetching: Live model data fetching from OpenRouter API (300+ models)
  • πŸ”— Feature Integration: All features work seamlessly together

🚀 Quick Start

Installation

  1. Clone or download the framework
  2. Install dependencies (Updated with new async support):
    pip install python-dotenv requests PyYAML aiohttp
  3. Set up your API key in .env:
    OPENROUTER_API_KEY="sk-or-v1-your-key-here"

Simplest Usage (FREE by default)

from framework import AIInteraction, quick_free_response

# One-liner free response
response = quick_free_response("What is Python programming?")

# Default interaction (uses free models with new caching + logging)
interaction = AIInteraction()
response = interaction.generate_response("Explain machine learning")

🚀 NEW: Advanced Features Usage

# Streaming responses
from framework import stream_to_console
result = stream_to_console("Tell me a story", show_chunks=True)

# Async support
import asyncio
from async_framework import quick_async_free_response

async def main():
    response = await quick_async_free_response("Async question")

asyncio.run(main())

# Caching configuration
from caching import CacheConfig
from client import OpenRouterClient
cache_config = CacheConfig(enabled=True, max_size=1000, ttl_seconds=3600)
client = OpenRouterClient(cache_config=cache_config, enable_logging=True)

Guaranteed Free Usage (Development/Testing)

from framework import FreeModelInteraction

# Only uses free models - perfect for development
interaction = FreeModelInteraction()
response = interaction.generate_response(
    "Write a Python function to sort a list",
    default_prompt_name="code_generator"
)

πŸ—οΈ Architecture Overview

openrouter-framework/
├── .env                     # API key configuration
├── .gitignore              # Git ignore file
├── requirements.txt        # Python dependencies (Updated: aiohttp)
├── config.py              # Environment and API configuration
├── models.py              # AI model definitions + dynamic model fetching
├── prompts.py             # Prompt management system
├── default_prompts.yaml   # Pre-defined system prompts
├── client.py              # OpenRouter API client + logging + caching + streaming
├── async_client.py        # NEW: Async OpenRouter client with aiohttp
├── framework.py           # Main interaction classes + streaming utilities
├── async_framework.py     # NEW: Async interaction classes and utilities
├── caching.py             # NEW: Response caching system (in-memory + persistent)
├── main.py                # Comprehensive demo and examples
├── CLAUDE.md             # Development guide (Updated)
└── README.md             # This file (Updated)

Core Design Principles

  1. Free Model Priority: Framework defaults to free models to prevent unexpected costs
  2. Explicit Paid Usage: Paid models require intentional selection with clear warnings
  3. Cost Awareness: Visual and programmatic alerts for cost-incurring operations
  4. Smart Fallbacks: Graceful degradation from paid to free models on failures
  5. Developer Safety: Multiple safeguards against accidental charges during development (see the sketch below)
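
For example, the free-only safeguard fails fast rather than silently incurring charges. A minimal sketch, relying on the documented behavior that FreeModelInteraction rejects paid models with a ValueError:

from framework import FreeModelInteraction
from models import GPT35Turbo

try:
    interaction = FreeModelInteraction(GPT35Turbo())  # paid model slips in
except ValueError as e:
    print(f"Rejected as expected: {e}")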

🎯 Model Categories

Free Models (14 Available)

  • Meta Llama 3.3 8B Instruct (Default free model)
  • Meta Llama 4 Maverick (400B params, 17B active)
  • Meta Llama 4 Scout (109B params, 17B active)
  • Meta Llama 3.3 70B Instruct (High-performance free option)
  • Google Gemini 2.5 Pro Experimental (Latest Google model)
  • Google Gemini 2.0 Flash Experimental
  • Google Gemini 2.0 Flash Thinking Experimental
  • DeepSeek Chat V3 (Excellent reasoning)
  • DeepSeek R1 (Advanced reasoning model)
  • DeepSeek R1 Distill Llama 70B
  • Mistral 7B Instruct
  • Mistral Small 3.1 24B Instruct
  • Qwen QwQ 32B
  • NVIDIA Llama 3.1 Nemotron Ultra 253B

Paid Models (9 Available)

  • OpenAI GPT-4o (Premium performance)
  • OpenAI GPT-4o Mini (Cost-effective premium)
  • OpenAI GPT-3.5 Turbo (Default paid model)
  • Anthropic Claude 3 Opus (Advanced reasoning)
  • Anthropic Claude 3 Sonnet (Balanced performance)
  • Anthropic Claude 3 Haiku (Fast responses)
  • Google Gemma 7B Instruct
  • Meta Llama 3 8B Instruct
  • Meta Llama 3 70B Instruct

🔄 Interaction Patterns

1. Default Interaction (Free Priority)

from framework import AIInteraction

# Automatically uses free models
interaction = AIInteraction()
response = interaction.generate_response("Your question here")

# Check current model
info = interaction.get_model_info()
print(f"Using: {info['display_name']} (Cost: {info['cost_tier']})")

2. Free-Only Interaction (Development Safe)

from framework import FreeModelInteraction

# Guaranteed free - rejects paid models
interaction = FreeModelInteraction()
response = interaction.generate_response(
    "Generate Python code",
    default_prompt_name="code_generator"
)

3. Cost-Aware Interaction (Smart Fallbacks)

from framework import CostAwareInteraction

# Smart paid→free fallback
cost_aware = CostAwareInteraction(prefer_free=True, auto_fallback=True)

# Force free model
response = cost_aware.generate_response("Question", force_free=True)

# Allow paid model (with warnings)
response = cost_aware.generate_response("Question", force_paid=True)

4. Quick Utilities

from framework import quick_free_response, compare_free_vs_paid, list_model_costs

# One-liner free response
response = quick_free_response("What is AI?")

# Compare free vs paid quality
comparison = compare_free_vs_paid("Explain quantum computing")
print(f"Free: {comparison['free_response']}")
print(f"Paid: {comparison['paid_response']}")

# List available models
costs = list_model_costs()
print(f"Free models: {len(costs['free'])}")
print(f"Paid models: {len(costs['paid'])}")

🧩 Framework Components

config.py - Configuration Management

Handles environment variables and API configuration:

from config import OPENROUTER_API_KEY, OPENROUTER_BASE_URL

Features:

  • Loads API key from .env file
  • Validates configuration on startup
  • Provides base URL for OpenRouter API
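
The module itself is small; a minimal sketch of what the loading logic might look like (the internals are an assumption, though the error text matches the troubleshooting section below):

import os
from dotenv import load_dotenv

load_dotenv()  # read .env from the project root

OPENROUTER_API_KEY = os.getenv("OPENROUTER_API_KEY")
OPENROUTER_BASE_URL = "https://openrouter.ai/api/v1"

if not OPENROUTER_API_KEY:
    raise ValueError("OPENROUTER_API_KEY not found in environment variables")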

models.py - Model Definitions

Defines all available AI models with cost tiers:

from models import (
    get_default_model, get_free_models, get_paid_models,
    Llama33_8B_Free, GPT35Turbo, CostTier
)

# Get default models
free_model = get_default_model(prefer_free=True)
paid_model = get_default_model(prefer_free=False)

# Check model properties
print(f"Is free: {free_model.is_free}")
print(f"Cost tier: {free_model.cost_tier}")

Key Classes:

  • AIModel - Abstract base class for all models
  • CostTier - Enumeration for FREE/PAID classification
  • Model-specific classes for each supported model
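
A rough sketch of how these pieces fit together, with property names taken from the Model Properties reference below (the base-class internals themselves are an assumption):

from abc import ABC, abstractmethod
from enum import Enum

class CostTier(Enum):
    FREE = "free"
    PAID = "paid"

class AIModel(ABC):
    @property
    @abstractmethod
    def model_name(self) -> str: ...   # OpenRouter API model name

    @property
    @abstractmethod
    def cost_tier(self) -> CostTier: ...

    @property
    def is_free(self) -> bool:
        return self.cost_tier == CostTier.FREE

    @property
    def is_paid(self) -> bool:
        return self.cost_tier == CostTier.PAID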

prompts.py - Prompt Management

Manages default system prompts from YAML configuration:

from prompts import PromptManager

pm = PromptManager()
prompt = pm.get_prompt("code_generator")
available = pm.list_prompts()

Available Prompts:

  • code_generator - For code generation tasks
  • creative_writer - For creative writing
  • data_analyst - For data analysis tasks
  • general_assistant - General purpose assistant
  • technical_writer - For technical documentation

client.py - OpenRouter API Client

Handles all communication with the OpenRouter API:

from client import OpenRouterClient

client = OpenRouterClient()
response = client.send_request(
    model_name="meta-llama/llama-3.3-8b-instruct:free",
    messages=[{"role": "user", "content": "Hello!"}]
)
content = client.get_response_content(response)

Features:

  • Automatic retry logic (see the sketch below)
  • Error handling and validation
  • Response parsing and content extraction
  • Support for all OpenRouter API parameters
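
A hedged sketch of the retry idea (the retry count, backoff schedule, and helper name here are illustrative, not the client's actual internals):

import time
import requests

def send_with_retries(url: str, headers: dict, payload: dict, max_retries: int = 3) -> dict:
    # naive exponential backoff; the real client's policy may differ
    for attempt in range(max_retries):
        try:
            resp = requests.post(url, headers=headers, json=payload, timeout=30)
            resp.raise_for_status()
            return resp.json()
        except requests.RequestException:
            if attempt == max_retries - 1:
                raise
            time.sleep(2 ** attempt)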

framework.py - Main Framework

Core interaction classes and utilities:

from framework import (
    AIInteraction, FreeModelInteraction, CostAwareInteraction,
    MultiModelInteraction, quick_free_response, compare_free_vs_paid
)

Core Classes:

  • AIInteraction - Main interaction class with cost awareness
  • FreeModelInteraction - Free-only interaction (development safe)
  • CostAwareInteraction - Smart cost management with fallbacks
  • MultiModelInteraction - Multiple model management

📚 Usage Examples

Basic Examples

Simple Question Answering

from framework import quick_free_response

# Quick free response
answer = quick_free_response("What is the capital of France?")
print(answer)  # "Paris"

Code Generation

from framework import FreeModelInteraction

interaction = FreeModelInteraction()
code = interaction.generate_response(
    "Create a function to calculate factorial",
    default_prompt_name="code_generator"
)
print(code)

Creative Writing

from framework import AIInteraction

interaction = AIInteraction()
story = interaction.generate_response(
    "A robot discovers emotions for the first time",
    default_prompt_name="creative_writer"
)
print(story)

Advanced Examples

Multi-Turn Conversation

from framework import FreeModelInteraction

interaction = FreeModelInteraction()

conversation = [
    {"role": "system", "content": "You are a helpful programming tutor."},
    {"role": "user", "content": "What is recursion?"}
]

response1 = interaction.chat(conversation)
print(f"AI: {response1}")

# Continue conversation
conversation.extend([
    {"role": "assistant", "content": response1},
    {"role": "user", "content": "Can you give me an example?"}
])

response2 = interaction.chat(conversation)
print(f"AI: {response2}")

Model Comparison

from framework import compare_free_vs_paid

comparison = compare_free_vs_paid(
    "Explain machine learning in simple terms",
    default_prompt_name="general_assistant"
)

print(f"Free Model ({comparison['free_model']}):")
print(comparison['free_response'])
print(f"\nPaid Model ({comparison['paid_model']}):")
print(comparison['paid_response'])

Smart Cost Management

from framework import CostAwareInteraction

# Initialize with preferences
cost_aware = CostAwareInteraction(
    prefer_free=True,      # Default to free models
    auto_fallback=True     # Fallback to free on paid failures
)

# Regular usage (uses free model)
response = cost_aware.generate_response("Explain Python decorators")

# Force premium model when needed
premium_response = cost_aware.generate_response(
    "Complex analysis task",
    force_paid=True,
    default_prompt_name="data_analyst"
)

# Switch models dynamically
cost_aware.switch_to_paid()   # Switch to paid model (with warning)
cost_aware.switch_to_free()   # Switch back to free model

Batch Processing

from framework import FreeModelInteraction
from models import get_free_models

# Process multiple questions with different free models
questions = [
    "What is Python?",
    "Explain machine learning",
    "How do neural networks work?"
]

free_models = get_free_models()[:3]  # Use first 3 free models

for i, question in enumerate(questions):
    interaction = FreeModelInteraction(free_models[i])
    response = interaction.generate_response(question)
    print(f"Model {i+1}: {response[:100]}...")

💰 Cost Management

Cost Protection Features

  1. Default Free Models: Framework defaults to free models automatically
  2. Clear Warnings: Visual alerts when using paid models
  3. Error Prevention: FreeModelInteraction rejects paid models
  4. Smart Fallbacks: Automatic fallback to free models on failures

Cost Warning System

from framework import AIInteraction
from models import GPT35Turbo

# This triggers a cost warning
interaction = AIInteraction(GPT35Turbo())
# Output: ⚠️  COST WARNING: You are using a PAID model...

Development vs Production

Development (Cost-Free)

# Use these patterns during development
from framework import FreeModelInteraction, quick_free_response

# Guaranteed free
interaction = FreeModelInteraction()

# Quick testing
response = quick_free_response("test prompt")

Production (Cost-Aware)

# Use these patterns in production
from framework import AIInteraction, CostAwareInteraction

# Smart cost management
cost_aware = CostAwareInteraction(
    prefer_free=True,
    auto_fallback=True
)

# Explicit model choice when needed
from models import GPT35Turbo
premium_interaction = AIInteraction(GPT35Turbo())  # Shows warning

⚙️ Configuration

Environment Variables

Create a .env file in the project root:

OPENROUTER_API_KEY="sk-or-v1-your-openrouter-api-key-here"

Default Prompts Configuration

Edit default_prompts.yaml to customize system prompts:

prompts:
  custom_prompt:
    role: "system"
    content: "You are a specialized assistant for my domain."
  
  code_reviewer:
    role: "system"
    content: "You are an expert code reviewer. Analyze code for best practices."

Model Configuration

Add new models in models.py:

class NewFreeModel(AIModel):
    @property
    def model_name(self) -> str:
        return "provider/new-model:free"
    
    @property
    def display_name(self) -> str:
        return "New Free Model"
    
    @property
    def cost_tier(self) -> CostTier:
        return CostTier.FREE
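
The new class can then be passed anywhere a model instance is accepted:

from framework import FreeModelInteraction

interaction = FreeModelInteraction(NewFreeModel())
response = interaction.generate_response("Hello from the new model!")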

📖 API Reference

AIInteraction Class

Main interaction class with cost awareness.

class AIInteraction:
    def __init__(self, model=None, prompt_manager=None, client=None, warn_on_paid=True)
    def generate_response(self, user_prompt, default_prompt_name=None, **kwargs) -> str
    def chat(self, conversation, **kwargs) -> str
    def switch_model(self, new_model) -> None
    def get_model_info(self) -> Dict[str, Any]
    def get_available_prompts(self) -> List[str]

FreeModelInteraction Class

Free-only interaction class for development.

class FreeModelInteraction(AIInteraction):
    def __init__(self, model=None, **kwargs)
    # Inherits all AIInteraction methods
    # Rejects paid models with ValueError

CostAwareInteraction Class

Smart cost management with fallbacks.

class CostAwareInteraction:
    def __init__(self, prefer_free=True, auto_fallback=True)
    def generate_response(self, user_prompt, force_free=False, force_paid=False, **kwargs) -> str
    def switch_to_free(self) -> None
    def switch_to_paid(self) -> None
    def get_current_model_info(self) -> Dict[str, Any]

Utility Functions

# Quick responses
def quick_free_response(user_prompt, default_prompt_name=None) -> str

# Model comparison
def compare_free_vs_paid(user_prompt, default_prompt_name=None) -> Dict[str, str]

# Model information
def list_model_costs() -> Dict[str, List[str]]

# Model utilities (from models.py)
def get_default_model(prefer_free=True) -> AIModel
def get_free_models() -> List[AIModel]
def get_paid_models() -> List[AIModel]
def is_model_free(model_name: str) -> bool
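
For example, is_model_free() checks an OpenRouter model name directly (the paid model name below is illustrative):

from models import is_model_free

print(is_model_free("meta-llama/llama-3.3-8b-instruct:free"))  # True
print(is_model_free("openai/gpt-3.5-turbo"))                   # False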

Model Properties

All models implement the AIModel interface:

@property
def model_name(self) -> str          # OpenRouter API model name
def display_name(self) -> str        # Human-readable name
def cost_tier(self) -> CostTier      # FREE or PAID
def is_free(self) -> bool           # True if free model
def is_paid(self) -> bool           # True if paid model
def default_temperature(self) -> float     # Default sampling temperature
def default_max_tokens(self) -> Optional[int]  # Default max tokens
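
These properties make it easy to inspect the catalog programmatically:

from models import get_free_models

for model in get_free_models():
    print(f"{model.display_name}: {model.model_name} "
          f"(temperature={model.default_temperature})")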

🚀 Advanced Features

πŸ“ API Call Logging

Comprehensive logging system with request/response tracking:

from client import OpenRouterClient

# Enable logging
client = OpenRouterClient(enable_logging=True)

# Make requests (automatically logged)
response = client.send_request(model_name, messages)

# Get call history with timing and token usage
history = client.get_call_history()
for call in history:
    print(f"Status: {call['status']}, Duration: {call['duration']:.3f}s")
    print(f"Tokens: {call.get('usage', {}).get('total_tokens', 0)}")

# Clear history
client.clear_call_history()

💾 Response Caching

Intelligent caching system with TTL and LRU eviction:

from caching import CacheConfig, InMemoryCache
from client import OpenRouterClient

# Configure caching
cache_config = CacheConfig(
    enabled=True,
    max_size=1000,           # Maximum cache entries
    ttl_seconds=3600,        # 1 hour TTL
    persist_to_disk=True,    # Save cache to disk
    cache_dir=".cache"       # Cache directory
)

# Create client with caching
client = OpenRouterClient(cache_config=cache_config)

# First request (cached)
response1 = client.send_request(model_name, messages, use_cache=True)

# Second request (uses cache - much faster)
response2 = client.send_request(model_name, messages, use_cache=True)

# Cache management
stats = client.get_cache_stats()
print(f"Cache hits: {stats['hits']}, misses: {stats['misses']}")
print(f"Hit rate: {stats['hit_rate']:.2%}")

client.clear_cache()
client.set_cache_enabled(False)  # Disable caching

📡 Streaming Support

Real-time streaming responses with Server-Sent Events:

from framework import stream_to_console
from client import OpenRouterClient

# Stream to console with real-time display
result = stream_to_console(
    "Tell me a story about a robot",
    show_chunks=True  # Shows chunk boundaries
)

# Manual streaming for custom handling
client = OpenRouterClient()
for chunk in client.send_streaming_request(model_name, messages):
    content = chunk.get('content', '')
    full_content = chunk.get('full_content', '')
    
    if content:
        print(content, end='', flush=True)
    
    # Custom logic for each chunk
    if len(full_content) > 100:
        break  # Stop after 100 characters

⚡ Async/Await Support

Full asynchronous support with concurrent operations:

import asyncio
from async_framework import (
    AsyncAIInteraction, AsyncOpenRouterClient,
    quick_async_free_response, async_stream_to_console
)

async def main():
    # Basic async interaction
    async with AsyncAIInteraction() as interaction:
        response = await interaction.generate_response("Question")
    
    # Quick async responses
    response = await quick_async_free_response("Quick question")
    
    # Async streaming
    result = await async_stream_to_console("Stream question")
    
    # Concurrent requests for faster processing
    tasks = [
        quick_async_free_response("Question 1"),
        quick_async_free_response("Question 2"),
        quick_async_free_response("Question 3")
    ]
    
    # All requests run concurrently
    results = await asyncio.gather(*tasks)
    print(f"Processed {len(results)} requests concurrently")
    
    # Async client for advanced usage
    async with AsyncOpenRouterClient() as client:
        response = await client.send_request(model_name, messages)
        
        # Async streaming
        async for chunk in client.send_streaming_request(model_name, messages):
            print(chunk['content'], end='', flush=True)

# Run async code
asyncio.run(main())

🔄 Dynamic Model Fetching

Live model data fetching from OpenRouter API:

from models import DynamicModelManager

# Create model manager
manager = DynamicModelManager()

# Fetch live model data (300+ models)
models_data = manager.fetch_models_from_api()
print(f"Found {len(models_data['data'])} models")

# Create dynamic model classes
dynamic_models = []
for model_data in models_data['data'][:10]:  # First 10 models
    ModelClass = manager._create_dynamic_model_class(model_data)
    model_instance = ModelClass()
    dynamic_models.append(model_instance)
    
    print(f"{model_instance.display_name}: {model_instance.cost_tier.value}")

# Use dynamic models with framework
from framework import AIInteraction
if dynamic_models:
    free_models = [m for m in dynamic_models if m.is_free]
    if free_models:
        interaction = AIInteraction(free_models[0])
        response = interaction.generate_response("Test with dynamic model")

🔗 Feature Integration

All features work seamlessly together:

import asyncio
from caching import CacheConfig
from async_client import AsyncOpenRouterClient
from models import get_default_model

async def integrated_example():
    # Combine all features
    cache_config = CacheConfig(
        enabled=True,
        max_size=500,
        ttl_seconds=1800,
        persist_to_disk=True
    )
    
    # Async client with caching and logging
    # (assuming AsyncOpenRouterClient accepts the same cache_config parameter as the sync client)
    async with AsyncOpenRouterClient(cache_config=cache_config, enable_logging=True) as client:
        model = get_default_model(prefer_free=True)
        messages = [{"role": "user", "content": "Integrated features test"}]
        
        # Request with caching (first time)
        response1 = await client.send_request(model.model_name, messages)
        
        # Request with caching (cached response - faster)
        response2 = await client.send_request(model.model_name, messages)
        
        # Async streaming with logging
        full_response = ""
        async for chunk in client.send_streaming_request(model.model_name, messages):
            content = chunk.get('content', '')
            if content:
                full_response += content
                print(content, end='', flush=True)
        
        # Check call history and cache stats
        history = client.get_call_history()
        print(f"\nProcessed {len(history)} requests")

asyncio.run(integrated_example())

Custom Interaction Patterns

Create specialized interaction classes:

from framework import AIInteraction
from models import get_default_model

class CodeReviewInteraction(AIInteraction):
    def __init__(self):
        super().__init__(get_default_model(prefer_free=True))
    
    def review_code(self, code: str) -> str:
        return self.generate_response(
            f"Review this code:\n\n{code}",
            default_prompt_name="code_generator"
        )

# Usage
reviewer = CodeReviewInteraction()
feedback = reviewer.review_code("def factorial(n): return n * factorial(n-1)")

Multi-Model Workflows

from framework import MultiModelInteraction
from models import get_free_models, get_paid_models

# Create multi-model interaction
multi = MultiModelInteraction(
    models=get_free_models()[:3],  # Use 3 free models
    warn_on_paid=False
)

# Get different perspectives
question = "What are the pros and cons of microservices?"
for model_name in multi.get_available_models():
    response = multi.generate_response(model_name, question)
    print(f"{model_name}: {response[:100]}...")

Error Handling and Retry Logic

from framework import CostAwareInteraction
import time

def robust_generate_response(prompt: str, max_retries: int = 3) -> str:
    interaction = CostAwareInteraction(prefer_free=True, auto_fallback=True)
    
    for attempt in range(max_retries):
        try:
            return interaction.generate_response(prompt)
        except Exception as e:
            if attempt == max_retries - 1:
                raise e
            time.sleep(2 ** attempt)  # Exponential backoff
    
    raise Exception("All retry attempts failed")

# Usage
response = robust_generate_response("Explain Python decorators")

Custom Prompt Templates

from prompts import PromptManager

# Load custom prompts
pm = PromptManager("custom_prompts.yaml")

# Create dynamic prompts
def create_domain_prompt(domain: str) -> dict:
    return {
        "role": "system",
        "content": f"You are an expert in {domain}. Provide detailed, accurate information."
    }

# Use with interaction
from framework import FreeModelInteraction
interaction = FreeModelInteraction()

# Add custom prompt dynamically
prompt = create_domain_prompt("machine learning")
messages = [prompt, {"role": "user", "content": "Explain neural networks"}]
response = interaction.chat(messages)

🔧 Troubleshooting

Common Issues

1. API Key Issues

Error: OPENROUTER_API_KEY not found in environment variables

Solution: Create .env file with your OpenRouter API key:

OPENROUTER_API_KEY="sk-or-v1-your-key-here"
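
Since config.py validates the configuration on startup, a one-liner confirms the key is actually being picked up:

from config import OPENROUTER_API_KEY

print("Key loaded:", bool(OPENROUTER_API_KEY))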

2. 404 Client Error

Error: 404 Client Error: Not Found for url: https://openrouter.ai/api/v1/chat/completions

Possible causes:

  • Invalid API key
  • Model name not available
  • Network connectivity issues

Solution: Test with a known working model:

from framework import AIInteraction
from models import GPT35Turbo

# Test with GPT-3.5 Turbo (reliable paid model)
interaction = AIInteraction(GPT35Turbo(), warn_on_paid=False)
response = interaction.generate_response("Hello!")

3. Model Rejection in FreeModelInteraction

ValueError: FreeModelInteraction only accepts free models

Solution: Use only free models with FreeModelInteraction:

from framework import FreeModelInteraction
from models import Llama33_8B_Free

# Correct usage
interaction = FreeModelInteraction(Llama33_8B_Free())

4. Import Errors

ModuleNotFoundError: No module named 'yaml'

Solution: Install required dependencies:

pip install python-dotenv requests PyYAML aiohttp

Debug Mode

Enable debug information:

import logging
logging.basicConfig(level=logging.DEBUG)

from framework import AIInteraction
interaction = AIInteraction()
# Will show detailed debug information

Testing Connectivity

Test your setup:

from framework import AIInteraction
from models import GPT35Turbo

def test_connectivity():
    try:
        interaction = AIInteraction(GPT35Turbo(), warn_on_paid=False)
        response = interaction.generate_response("Say 'Connection successful!'")
        print(f"βœ… Success: {response}")
        return True
    except Exception as e:
        print(f"❌ Failed: {e}")
        return False

# Run test
test_connectivity()

Performance Optimization

For high-throughput applications:

from client import OpenRouterClient
from models import Llama33_8B_Free

# Reuse client instance
client = OpenRouterClient()
model = Llama33_8B_Free()

# Batch processing
def process_batch(prompts: list) -> list:
    responses = []
    for prompt in prompts:
        messages = [{"role": "user", "content": prompt}]
        response = client.send_request(model.model_name, messages)
        content = client.get_response_content(response)
        responses.append(content)
    return responses
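
The loop above runs requests one at a time; for genuine throughput, the async utilities shown earlier let the whole batch run concurrently. A sketch built on quick_async_free_response:

import asyncio
from async_framework import quick_async_free_response

async def process_batch_async(prompts: list) -> list:
    # fire all requests at once and await them together
    tasks = [quick_async_free_response(p) for p in prompts]
    return await asyncio.gather(*tasks)

responses = asyncio.run(process_batch_async(["What is Python?", "What is AI?"]))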

🤝 Contributing

Adding New Models

  1. Research the model on OpenRouter
  2. Determine cost tier (free or paid)
  3. Add model class in models.py:
    class NewModel(AIModel):
        @property
        def model_name(self) -> str:
            return "provider/model-name"
        
        @property
        def display_name(self) -> str:
            return "New Model Display Name"
        
        @property
        def cost_tier(self) -> CostTier:
            return CostTier.FREE  # or CostTier.PAID
  4. Add to model collections in models.py
  5. Test the model with the framework (see the smoke test below)
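
A quick smoke test for step 5, following the connectivity pattern from the troubleshooting section:

from framework import AIInteraction

interaction = AIInteraction(NewModel(), warn_on_paid=False)
print(interaction.generate_response("Say hello!"))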

Adding New Features

  1. Follow existing patterns in the codebase
  2. Maintain cost awareness for new features
  3. Add comprehensive tests for new functionality
  4. Update documentation in README.md and CLAUDE.md

Code Style

  • Use type hints for all functions
  • Follow existing naming conventions
  • Add docstrings for all public methods
  • Maintain backward compatibility

📄 License

This project is provided as-is for educational and development purposes. Please ensure compliance with OpenRouter's terms of service when using their API.

🆘 Support

For issues and questions:

  1. Check this README for common solutions
  2. Review the troubleshooting section above
  3. Test with known working models (GPT-3.5 Turbo)
  4. Verify your API key and network connectivity
  5. Check OpenRouter status and model availability

🎉 Conclusion

The OpenRouter Framework provides a robust, cost-aware solution for AI model interactions. With intelligent free model prioritization, comprehensive error handling, and flexible interaction patterns, it's designed to minimize costs while maximizing functionality.

Key Benefits:

  • 🎆 Cost-effective: Defaults to free models
  • 🛡️ Safe: Prevents accidental charges
  • 🚀 Flexible: Multiple interaction patterns
  • 📚 Comprehensive: 23 models supported
  • ⚡ Easy: Simple setup and usage

Start with quick_free_response() for immediate results, then explore the advanced features as your needs grow!