API changes for Chat and Embed

Question

monarchwadia opened this issue 9 months ago · 4 comments

We should change the return API of the c.chat and e.embed methods to support things like: access to the original request and response objects, including API headers; token counts and rate-limiting information; and utility functions such as getLastMessage, which would make life easier (instead of writing messages.at(-1)?.text, which is a mouthful).

These changes are needed to make Ragged more usable (I have some immediate use cases that demand them).

So I think Chat should turn into something like the following... and Embed will also follow suit (not shown below).

// Not a final API, just a sketch of some possibilities.
const {
  history,
  incomingMessages,
  rawResponse,
  rawRequest,
  getLastMessage
} = await c.chat("What is a rickroll?")
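
To make the sketch above more concrete, here is one way the returned object could be typed and assembled. This is only an illustration: the Message shape, the ChatResult type, and the buildChatResult helper are assumptions made for this example, not existing Ragged code, and the raw request and response are typed as unknown to keep the snippet self-contained.

```typescript
// Assumed Message shape for illustration; Ragged's real type may differ.
type Message = { type: "user" | "bot" | "system"; text: string | null };

type ChatResult = {
  history: Message[];          // full conversation, including the new messages
  incomingMessages: Message[]; // only the messages added by this call
  rawRequest: unknown;         // e.g. the fetch Request that was sent
  rawResponse: unknown;        // e.g. the fetch Response, headers included
  getLastMessage(): Message | undefined;
};

// Hypothetical helper that assembles the result object from its parts.
function buildChatResult(
  previous: Message[],
  incomingMessages: Message[],
  rawRequest: unknown,
  rawResponse: unknown
): ChatResult {
  const history = [...previous, ...incomingMessages];
  return {
    history,
    incomingMessages,
    rawRequest,
    rawResponse,
    // Replaces the messages.at(-1) idiom with a named accessor.
    getLastMessage: () => history[history.length - 1],
  };
}

const result = buildChatResult(
  [{ type: "user", text: "What is a rickroll?" }],
  [{ type: "bot", text: "A bait-and-switch prank link." }],
  null,
  null
);
console.log(result.getLastMessage()?.text);
```

Keeping getLastMessage on the result object means callers do not need to know whether the newest message lives in history or in incomingMessages.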

What properties and methods would you like to see on here?

Anything that's missing?

Anything new that needs to be added?

Anything that needs to be changed, or maybe that needs to be explained more?

Answer 1 · 2024-07-03T19:45:18.000Z

One more consideration: It has occurred to me that the Chat instance could just be rewritten as an instance of the BaseChatAdapter class.

Seen this way, it opens up avenues for vertical composability of adapters.

For example, if Chat simply extended BaseChatAdapter and added a few extra bells and whistles such as history persistence, then that persistence could be one layer in a vertical stack of adapters.

This could find use in composition patterns: essentially, Ragged becomes a middleware system. It could replace Langchain's LCEL system with something that is potentially a lot more generic, standardized, and composable, in a way that is actually easy to reason about.

Just a thought.
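
A rough sketch of what such a vertical stack could look like, under simplified assumptions. The ChatAdapter interface here is a cut-down stand-in for BaseChatAdapter, and echoAdapter, withHistory, and withLogging are hypothetical names invented for this example.

```typescript
type Message = { type: "user" | "bot"; text: string };
type ChatRequest = { history: Message[] };
type ChatResponse = { history: Message[] };

// Simplified stand-in for BaseChatAdapter.
interface ChatAdapter {
  chat(request: ChatRequest): Promise<ChatResponse>;
}

// Innermost layer: stands in for a provider adapter; echoes instead of calling an API.
const echoAdapter: ChatAdapter = {
  async chat(request) {
    const last = request.history[request.history.length - 1];
    const reply: Message = { type: "bot", text: `echo: ${last?.text ?? ""}` };
    return { history: [...request.history, reply] };
  },
};

// Middleware layer: persists history between calls, then delegates inward.
function withHistory(inner: ChatAdapter): ChatAdapter {
  let stored: Message[] = [];
  return {
    async chat(request) {
      const response = await inner.chat({ history: [...stored, ...request.history] });
      stored = response.history;
      return response;
    },
  };
}

// Another layer: records how many messages each call sends inward.
function withLogging(inner: ChatAdapter, log: string[]): ChatAdapter {
  return {
    async chat(request) {
      log.push(`sending ${request.history.length} message(s)`);
      return inner.chat(request);
    },
  };
}

// Layers compose because each one is itself a ChatAdapter.
const log: string[] = [];
const stack = withHistory(withLogging(echoAdapter, log));
```

Because every layer implements the same interface, the order of the layers can be changed, or a layer dropped, without touching the others.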

Answer 2 · 2024-07-05T02:36:05.000Z

WHAT WE HOPE TO GET

Ability to add rich information and methods to the response of the c.chat() and e.embed() calls. Right now we only return Message[], which is not sufficient.
Type safety for all fields.
Provider-specific fields, with type safety, for certain properties (such as rate limiting).
Composable adapters, which open up interesting possibilities such as reusable middleware that can be applied across all default adapters, and across adapters in general.
A nicer API.

To support rate limits, we need a stronger type system where Adapter types seamlessly flow up to Chat. Once this is done, each adapter can define its own rate limit body, etc. And finally, we can make Chat just another implementation of the BaseChatAdapter interface.

To do this requires a few refactors. Here are the general steps.

Answer 3 · 2024-07-05T02:38:08.000Z

Here is a sketch of the updated ChatAdapter interface.

This will also become part of the public interface, so that ChatAdapters are composable. At that point, Ragged's Chat instance will just be another adapter, nothing special.

import { Message } from "../Chat.types";
import { Tool } from "../../tools/Tools.types";

// ==================== Request types ====================

export type ChatAdapterRequest = {
    history: Message[];
    tools?: Tool[];
    model?: string;
}

// ==================== Response types ====================

interface GChatAdapterGenerics {
    Response: {
        RateLimits: unknown;
    }
}

export type ChatAdapterResponse<G extends GChatAdapterGenerics = GChatAdapterGenerics> = {
    history: Message[];
    rateLimits: G['Response']['RateLimits'];
    meta: {
        chatAdapterRequest: ChatAdapterRequest;
        rawFetchRequest: Request;
        rawFetchResponse: Response;
    }
}

// ==================== Adapter types ====================

export interface BaseChatAdapter<G extends GChatAdapterGenerics = GChatAdapterGenerics> {
    chat(request: ChatAdapterRequest): Promise<ChatAdapterResponse<G>>;
}

abstract class Cool {
    protected abstract cool(): void;
}
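
As an illustration of how a provider-specific rate-limit type could flow up through these generics, here is a self-contained sketch. The types are trimmed copies of the ones above (the meta field is omitted), and ExampleProviderGenerics, ExampleProviderAdapter, the rate-limit field names, and the generic Chat wrapper are all hypothetical.

```typescript
type Message = { type: "user" | "bot"; text: string };
type ChatAdapterRequest = { history: Message[]; model?: string };

interface GChatAdapterGenerics {
  Response: { RateLimits: unknown };
}

type ChatAdapterResponse<G extends GChatAdapterGenerics = GChatAdapterGenerics> = {
  history: Message[];
  rateLimits: G["Response"]["RateLimits"];
};

interface BaseChatAdapter<G extends GChatAdapterGenerics = GChatAdapterGenerics> {
  chat(request: ChatAdapterRequest): Promise<ChatAdapterResponse<G>>;
}

// Hypothetical provider-specific shape; the field names are invented for this sketch.
interface ExampleProviderGenerics extends GChatAdapterGenerics {
  Response: { RateLimits: { remainingRequests: number; resetSeconds: number } };
}

class ExampleProviderAdapter implements BaseChatAdapter<ExampleProviderGenerics> {
  async chat(request: ChatAdapterRequest): Promise<ChatAdapterResponse<ExampleProviderGenerics>> {
    // A real adapter would call the provider and parse its rate-limit headers here.
    const reply: Message = { type: "bot", text: "stub reply" };
    return {
      history: [...request.history, reply],
      rateLimits: { remainingRequests: 59, resetSeconds: 1 },
    };
  }
}

// Chat forwards the adapter's generic, so it is just another BaseChatAdapter.
class Chat<G extends GChatAdapterGenerics> implements BaseChatAdapter<G> {
  private adapter: BaseChatAdapter<G>;
  constructor(adapter: BaseChatAdapter<G>) {
    this.adapter = adapter;
  }
  chat(request: ChatAdapterRequest): Promise<ChatAdapterResponse<G>> {
    return this.adapter.chat(request);
  }
}

// The generic is passed explicitly so rateLimits keeps its provider-specific type.
const c = new Chat<ExampleProviderGenerics>(new ExampleProviderAdapter());
```

With this arrangement, code that awaits c.chat(...) can read rateLimits.remainingRequests as a number without a cast, while adapters that do not define rate limits fall back to unknown.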

Answer 4 · 2024-07-08T02:10:52.000Z

On second thoughts... I'm kind of rethinking the above. I think we already have a good base with Chat and Embed. The adapters don't necessarily have to compose all the way to the top; I actually can't think of any good use cases for that. And it would be extremely awkward, and a lot of work, to do that at this point.

I think a simpler approach could be modifying the BaseChatAdapter Request to instead be a Context object... something like the following:

export type ChatAdapterRequest = {
    apiClient: ApiClient;
    request: {
        history: Message[];
        tools?: Tool[];
        model?: string;
    };
}

This way, Chat and Embed can pass down various utilities that are fully controlled & provided at the top layer. The adapters then have to do less work in order to be functional.
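
A small sketch of how an adapter might consume such a context object. The thread does not define ApiClient, so the single post method below is an assumption, as are exampleAdapter, fakeClient, the placeholder URL, the placeholder model name, and the reply field in the stubbed payload.

```typescript
type Message = { type: "user" | "bot"; text: string };

// Hypothetical ApiClient: the thread does not define its shape.
interface ApiClient {
  post(url: string, body: unknown): Promise<unknown>;
}

type ChatAdapterRequest = {
  apiClient: ApiClient;
  request: { history: Message[]; model?: string };
};

// The adapter only maps between Ragged's types and the provider's wire format;
// HTTP concerns live in the client passed down from the top layer.
const exampleAdapter = {
  async chat(ctx: ChatAdapterRequest): Promise<Message[]> {
    const raw = (await ctx.apiClient.post("https://example.invalid/chat", {
      model: ctx.request.model ?? "example-model",
      messages: ctx.request.history,
    })) as { reply: string };
    const reply: Message = { type: "bot", text: raw.reply };
    return [...ctx.request.history, reply];
  },
};

// Because the top layer owns the client, a test can swap in a fake one.
const calls: string[] = [];
const fakeClient: ApiClient = {
  async post(url, _body) {
    calls.push(url);
    return { reply: "stubbed" };
  },
};
```

One consequence of this design is that retries, header capture, or rate-limit bookkeeping could be implemented once in the client rather than in every adapter.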

Hmm.... will sleep on it.