unify: A Python repository from Unify

Fully hackable LLMOps. Build custom interfaces for: logging, evals, guardrails, labelling, tracing, agents, human-in-the-loop, hyperparam sweeps, and anything else you can think of ✨

Just unify.log your data, and add an interface using the four building blocks:

tables 🔢
views 🔍
plots 📊
editor 🕹️ (coming soon)

Every LLM product has unique and changing requirements, as do its users. Your infra should reflect this!

We've tried to make Unify as (a) simple, (b) modular and (c) hackable as possible, so you can quickly probe, analyze, and iterate on the data that's important to you, your product and your users ⚡

Quickstart

import unify
from random import randint, choice

# initialize project
unify.activate("Maths Assistant")

# build agent
client = unify.Unify("o3-mini@openai", traced=True)
client.set_system_message(
    "You are a helpful maths assistant, "
    "tasked with adding and subtracting integers."
)

# add test cases
qs = [
    f"{randint(0, 100)} {choice(['+', '-'])} {randint(0, 100)}"
    for _ in range(10)
]

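# define evaluator
@unify.traced
def evaluate_response(question: str, response: str) -> float:
    # questions are generated locally above, so eval only sees trusted input
    correct_answer = eval(question)
    try:
        # parse the last token of the response, keeping a minus sign so
        # that negative results (e.g. "5 - 10") can be scored as correct
        last_token = response.split(" ")[-1]
        response_int = int(
            "".join(
                [
                    c for c in last_token
                    if c.isdigit() or c == "-"
                ]
            ),
        )
        return float(correct_answer == response_int)
    except ValueError:
        # no parsable integer in the last token
        return 0.0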

# define evaluation
@unify.traced
def evaluate(q: str):
    response = client.copy().generate(q)
    score = evaluate_response(q, response)
    unify.log(
        question=q,
        response=response,
        score=score
    )

# execute + log your evaluation
with unify.Experiment():
    unify.map(evaluate, qs)

Check out our Quickstart Video for a guided walkthrough.
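
Before spending API calls, it can help to check the scoring logic offline against a few canned responses. The snippet below is a minimal sketch that does not use Unify at all: `score_response` and the `canned` list are hypothetical names for illustration, and the scorer swaps the last-token digit filter from the quickstart for a regular expression that picks the last signed integer in the response.

```python
import re


def score_response(question: str, response: str) -> float:
    """Return 1.0 if the last integer in `response` equals the expected result."""
    # Questions have the form "<int> <op> <int>" with op in {+, -},
    # so the expected value can be computed without calling eval.
    left, op, right = question.split()
    expected = int(left) + int(right) if op == "+" else int(left) - int(right)
    # Take the last signed integer that appears anywhere in the response.
    matches = re.findall(r"-?\d+", response)
    if not matches:
        return 0.0
    return float(int(matches[-1]) == expected)


# Hand-written (question, response) pairs standing in for model output.
canned = [
    ("3 + 4", "The answer is 7."),
    ("5 - 10", "5 - 10 = -5"),
    ("2 + 2", "I am not sure."),
    ("9 - 1", "That would be 7"),
]
scores = [score_response(q, r) for q, r in canned]
print(scores)  # [1.0, 1.0, 0.0, 0.0]
```

Once the scorer behaves as expected on cases like these, the same logic can be dropped into the traced evaluator from the quickstart.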

Focus on your product, not the LLM 🎯

Despite all of the hype, abstractions, and jargon, the process for building quality LLM apps is pretty simple.

create simplest possible agent 🤖
while True:
    create/expand unit tests (evals) 🗂️
    while run(tests) failing: 🧪
        Analyze failures, understand the root cause 🔍
        Vary system prompt, in-context examples, tools etc. to rectify 🔀
    Beta test with users, find more failures 🚦
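
The loop above can be sketched in plain Python with a toy stand-in for the agent. Everything here is hypothetical and for illustration only: `make_agent`, `run`, the `config` dict, and the hard-coded "beta testing" rounds are assumptions, not part of Unify, and the "fix" step is a single hand-written rule rather than real prompt iteration.

```python
from typing import Callable, List, Tuple

Test = Tuple[str, str]  # (question, expected answer)


def make_agent(config: dict) -> Callable[[str], str]:
    """Build a toy agent whose behaviour depends on its config."""
    def agent(question: str) -> str:
        left, op, right = question.split()
        a, b = int(left), int(right)
        if op == "-" and not config["handles_subtraction"]:
            return str(a + b)  # deliberate flaw in the simplest agent
        return str(a + b if op == "+" else a - b)
    return agent


def run(agent: Callable[[str], str], tests: List[Test]) -> List[Test]:
    """Return the tests that the agent currently fails."""
    return [(q, expected) for q, expected in tests if agent(q) != expected]


# Create the simplest possible agent.
config = {"handles_subtraction": False}
tests: List[Test] = [("1 + 2", "3")]
history = []

# Each round stands in for "beta test with users, find more failures".
for new_tests in ([("4 + 4", "8")], [("5 - 7", "-2")]):
    tests += new_tests  # create/expand unit tests
    failures = run(make_agent(config), tests)
    while failures:
        # Analyze failures, then vary the config to address the root cause.
        if any(" - " in q for q, _ in failures):
            config["handles_subtraction"] = True
        failures = run(make_agent(config), tests)
    history.append((len(tests), dict(config)))

print(history)
# [(2, {'handles_subtraction': False}), (3, {'handles_subtraction': True})]
```

In a real project the inner loop would usually be bounded or driven by a person, since there is no guarantee that a given change makes the failing tests pass.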

We've tried to strip away all of the excessive LLM jargon, so you can focus on your product, your users, and the data you care about, and nothing else 📈

Unify takes inspiration from:

PostHog / Grafana / LogFire for powerful observability 🔬
LangSmith / BrainTrust / Weave for LLM abstractions 🤖
Notion / Airtable for composability and versatility 🧱

Whether you're technical or non-technical, we hope Unify can help you to rapidly build top-notch LLM apps, and to remain fully focused on your product (not the LLM).

Learn More

Check out our docs, and if you have any questions, feel free to reach out to us on Discord 👾

Unify is under active development 🚧, and feedback of all shapes and sizes is very welcome! 🙏

Happy prompting! 🧑‍💻
