
CircuitStream 🌐⚡

A universal Language Model relay system that makes it seamless to integrate with models across different platforms. Designed to help developers achieve interoperability without hassle.

This kind of relay is also useful for organizations sharing a single account across multiple teams or developers, since rate limits can be configured per model.

If you're running tests against multiple LLMs, this is also a great way to generate centralized analytics.



Features 🌟

  • Unified Interface: One API to rule them all! Easily call different models through a standardized interface.
  • Configurable: Add or modify endpoints without touching the core logic.
  • Rate Limiting: Safeguard against excessive requests with built-in rate limiting.
  • Analytics: Dive deep into request analytics with logging and insights.
  • Cross-Origin Resource Sharing (CORS): Built-in CORS support, making it browser friendly.
  • Automated instrumentation: Automated observability and analytics for every model via Langfuse.
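
The per-model rate limiting described above can be pictured as one token bucket per model. The sketch below is purely illustrative (the class, model names, and limits are assumptions, not the project's actual implementation):

```python
import time

class TokenBucket:
    """Minimal per-model token-bucket limiter (illustrative only)."""

    def __init__(self, rate_per_sec, capacity, clock=time.monotonic):
        self.rate = rate_per_sec      # tokens refilled per second
        self.capacity = capacity      # maximum burst size
        self.tokens = capacity
        self.clock = clock
        self.last = clock()

    def allow(self):
        """Return True and consume a token if the request is within the limit."""
        now = self.clock()
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

# One bucket per model name (hypothetical limits):
limits = {"gpt-4": TokenBucket(2, 5), "claude-2": TokenBucket(1, 2)}
```

A relay would then check `limits[model_name].allow()` before forwarding each request.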

Getting Started 🚀

Prerequisites

  • Python 3.8+
  • Virtual Environment (recommended)

Setup & Installation

  1. Clone the repository:
git clone https://github.com/balgan/CircuitStream.git
cd CircuitStream
  2. Create and activate a virtual environment (optional but recommended):
python -m venv venv
source venv/bin/activate  # On Windows use `venv\Scripts\activate`
  3. Install the dependencies:
pip install -r requirements.txt
  4. Modify the config.json file to match your desired configuration for the language models and platforms you want to relay to.
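
The exact schema of config.json is defined by the repository; the fragment below is only a hypothetical illustration of what a per-platform, per-model entry with a rate limit might look like (all field names here are assumptions — check the shipped config.json for the real keys):

```json
{
  "openai": {
    "endpoint": "https://api.openai.com/v1/chat/completions",
    "models": {
      "gpt-4": { "rate_limit_per_minute": 60 }
    }
  }
}
```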

  5. Create a secrets.json file with your Langfuse configuration in the following format:

{
    "ENV_PUBLIC_KEY": "pk-XXX",
    "ENV_SECRET_KEY": "sk-XXX",
    "ENV_HOST": "http://xxx.xxx.xxx.xxx"
}
  6. Run the relay:
python llm_relay.py
  7. Run a web server to serve the index page:
python -m http.server 8081

Your CircuitStream relay is now running at http://localhost:8000/, and you can browse http://localhost:8081 to view analytics.


Usage 🖥️

Here's how to send a request:

curl -X POST http://localhost:8000/callmodel \
  -H "Content-Type: application/json" \
  -d '{
    "project_name": "openai",
    "model_name": "gpt-4",
    "prompt": "Hello World!",
    "api_token": "your-api-key-here"
  }'

curl -X POST http://localhost:8000/callmodel \
  -H "Content-Type: application/json" \
  -d '{
    "project_name": "anthropic",
    "model_name": "claude-2",
    "prompt": "\n\nHuman: Hello, world!\n\nAssistant:",
    "api_token": "your-api-key-here"
  }'

For additional routes and functionality, refer to the interactive API documentation that FastAPI generates automatically (served at /docs by default).
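
The same request can be sent from Python using only the standard library. The payload fields below mirror the curl examples above; everything else (helper names, response handling) is a sketch, and no particular response shape is assumed:

```python
# Minimal Python client for the relay's /callmodel route.
import json
import urllib.request

def build_payload(project_name, model_name, prompt, api_token):
    """Assemble the JSON body expected by /callmodel."""
    return {
        "project_name": project_name,
        "model_name": model_name,
        "prompt": prompt,
        "api_token": api_token,
    }

def call_model(base_url, **fields):
    """POST the payload to the relay and return the decoded JSON response."""
    req = urllib.request.Request(
        f"{base_url}/callmodel",
        data=json.dumps(build_payload(**fields)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Example call: `call_model("http://localhost:8000", project_name="openai", model_name="gpt-4", prompt="Hello World!", api_token="your-api-key-here")`.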


Contributing 🤝

We love contributions! If you have any improvements or feature suggestions, please:

  1. Fork the Project
  2. Create your Feature Branch (git checkout -b feature/FeatureName)
  3. Commit your Changes (git commit -m 'Add some FeatureName')
  4. Push to the Branch (git push origin feature/FeatureName)
  5. Open a Pull Request

License 📜

Distributed under the MIT License. See LICENSE for more information.


Support 🌐

Having issues? Open an issue and let's debug it together!