monarchwadia/ragged

API changes for Chat and Embed

We should change the return API of the c.chat and e.embed methods so that callers can get at the original request and response objects (including API headers), token counts, and rate-limiting information. The return value should also carry some utility functions such as getLastMessage, which would make life easier than writing messages.at(-1)?.text (a mouthful).

These changes are needed to make Ragged more usable; I have some immediate use cases that demand them.

So I think Chat should turn into something like the following... and Embed will also follow suit (not shown below).

// Not a final API, just a sketch of some possibilities.
const {
  history,
  incomingMessages,
  rawResponse,
  rawRequest,
  getLastMessage
} = await c.chat("What is a rickroll?")
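To make the sketch above a bit more concrete, here is one way the returned object could be assembled. Everything in it is an assumption for discussion: the simplified Message type, the ChatResponse name, and the buildChatResponse helper are hypothetical and not part of Ragged today. getLastMessage is written as a closure so that it still works after destructuring.

```typescript
// Simplified stand-in for Ragged's Message type (assumption for this sketch).
type Message = { type: "user" | "bot"; text: string };

// Hypothetical response shape; field names follow the sketch above.
type ChatResponse = {
  history: Message[];
  incomingMessages: Message[];
  rawRequest: unknown;
  rawResponse: unknown;
  getLastMessage(): Message | undefined;
};

// Hypothetical helper that builds the response from the previous history
// and the messages that just came back from the provider.
function buildChatResponse(
  previous: Message[],
  incoming: Message[],
  rawRequest: unknown,
  rawResponse: unknown
): ChatResponse {
  const history = [...previous, ...incoming];
  return {
    history,
    incomingMessages: incoming,
    rawRequest,
    rawResponse,
    // Closes over `history`, so it does not depend on `this`.
    getLastMessage: () => history[history.length - 1],
  };
}
```

With this shape, const { getLastMessage } = response; getLastMessage()?.text replaces the messages.at(-1)?.text pattern.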

What properties and methods would you like to see on here?

Anything that's missing?

Anything new that needs to be added?

Anything that needs to be changed, or maybe that needs to be explained more?

One more consideration: It has occurred to me that the Chat instance could just be rewritten as an instance of the BaseChatAdapter class.

If looked at this way, it opens up avenues for vertical composability of adapters.

For example, if Chat simply extended BaseChatAdapter and added a few extras such as history persistence, then that persistence could be one layer in a vertical stack of adapters.

This could be useful for composition patterns: essentially, Ragged becomes a middleware system. It could replace Langchain's LCEL system with something that's potentially a lot more generic, standardized, and composable in a way that is actually easy to reason about.
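A rough sketch of what such a vertical stack might look like, assuming every layer implements the same one-method interface. The ChatAdapter interface here is a simplified stand-in, and EchoAdapter and HistoryAdapter are hypothetical names invented for illustration; none of this is existing Ragged code.

```typescript
// Simplified stand-ins (assumptions for this sketch).
type Message = { type: "user" | "bot"; text: string };
type ChatRequest = { history: Message[] };
type ChatResponse = { history: Message[] };

interface ChatAdapter {
  chat(request: ChatRequest): Promise<ChatResponse>;
}

// Bottom of the stack: stands in for a real provider adapter.
class EchoAdapter implements ChatAdapter {
  async chat(request: ChatRequest): Promise<ChatResponse> {
    const last = request.history[request.history.length - 1];
    const reply: Message = { type: "bot", text: `echo: ${last ? last.text : ""}` };
    return { history: [...request.history, reply] };
  }
}

// Middleware layer: persists history between calls, then delegates downward.
class HistoryAdapter implements ChatAdapter {
  private saved: Message[] = [];
  private readonly inner: ChatAdapter;

  constructor(inner: ChatAdapter) {
    this.inner = inner;
  }

  async chat(request: ChatRequest): Promise<ChatResponse> {
    const response = await this.inner.chat({
      history: [...this.saved, ...request.history],
    });
    this.saved = response.history;
    return response;
  }
}

// Layers compose by wrapping: persistence on top of the provider adapter.
const stack: ChatAdapter = new HistoryAdapter(new EchoAdapter());
```

Because each layer only knows about the ChatAdapter interface, the same HistoryAdapter could wrap any provider adapter, which is the reusable-middleware idea.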

Just a thought.

WHAT WE HOPE TO GET

  • Ability to add rich information & methods to the response of the c.chat() and e.embed() calls. Right now we only return Message[], and this is not sufficient.
  • Type safety for all fields.
  • Provider-specific fields, with type safety, for certain properties (such as rate limiting).
  • Composable adapters, which open up interesting possibilities such as reusable middleware that can be applied across all default adapters and composed with adapters in general.
  • A nicer API.

To support rate limits, we need a stronger type system where Adapter types seamlessly flow up to Chat. Once this is done, each adapter can define its own rate limit body, etc. And finally, we can make Chat just another implementation of the BaseChatAdapter interface.

To do this requires a few refactors. Here are the general steps.

  • New BaseChatAdapter interface (because we will need a new response body to chat.chat() anyway, plus we need generics for adapter-specific data, so this is a good time to do this.)
    • Define AbstractChatAdapter (so we get some foundational functional changes out of the way)
      • AbstractChatAdapter implements the existing BaseChatAdapter.
      • All official classes implement AbstractChatAdapter.
      • Export AbstractChatAdapter in Ragged's types.
      • Update documentation.
    • Modify BaseChatAdapter (so we get the interface changes in for all the adapters)
      • Rename to IChatAdapter
      • IChatAdapter::chat() should now exactly mirror the Chat.chat() method's entire overloaded signature. But it should also contain the exploded (i.e. detailed) chat parameter.
      • AbstractChatAdapter should now have a protected abstract handleChat method that must be implemented internally in all official chat adapters. This method is NOT overloaded; instead, it takes the exploded/detailed parameter from the overloaded Chat.chat() signatures. AbstractChatAdapter.chat() handles the mapping/conversion from the overloaded parameter into the exploded parameter, and passes it into handleChat.
    • Generics
      • Update the IChatAdapter so that it has a mapped generic type that defines RateLimit as always undefined
      • Update the AbstractChatAdapter so that it is also generic.
      • Update all child adapters, if necessary, to set RateLimit to undefined.
    • Rate Limit Mapping
      • All adapters will now map rate limit info.
    • Raw request & response typing
      • Add a meta field containing chatAdapterRequest, rawFetchRequest and rawFetchResponse, all typed as any for now. This gives us request & response info, untyped at first.
      • Generify the requests & responses for each adapter.
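The overload-to-exploded mapping in the steps above could be sketched roughly as follows. The three overloads shown are illustrative only and are not claimed to match Chat.chat()'s real signature list; the Message type is simplified, and FakeAdapter is a hypothetical adapter used just to show what a child class would implement.

```typescript
// Simplified stand-ins (assumptions for this sketch).
type Message = { type: "user" | "bot"; text: string };
type ChatAdapterRequest = { history: Message[]; model?: string };
type ChatAdapterResponse = { history: Message[] };

// Illustrative overloads; the real list would mirror Chat.chat().
interface IChatAdapter {
  chat(userMessage: string): Promise<ChatAdapterResponse>;
  chat(history: Message[]): Promise<ChatAdapterResponse>;
  chat(request: ChatAdapterRequest): Promise<ChatAdapterResponse>;
}

abstract class AbstractChatAdapter implements IChatAdapter {
  chat(userMessage: string): Promise<ChatAdapterResponse>;
  chat(history: Message[]): Promise<ChatAdapterResponse>;
  chat(request: ChatAdapterRequest): Promise<ChatAdapterResponse>;
  chat(input: string | Message[] | ChatAdapterRequest): Promise<ChatAdapterResponse> {
    // Convert whichever overload was used into the exploded request.
    return this.handleChat(AbstractChatAdapter.explode(input));
  }

  // Not overloaded: child adapters only ever see the exploded form.
  protected abstract handleChat(request: ChatAdapterRequest): Promise<ChatAdapterResponse>;

  private static explode(input: string | Message[] | ChatAdapterRequest): ChatAdapterRequest {
    if (typeof input === "string") {
      return { history: [{ type: "user", text: input }] };
    }
    if (Array.isArray(input)) {
      return { history: input };
    }
    return input;
  }
}

// Hypothetical child adapter: implements handleChat and nothing else.
class FakeAdapter extends AbstractChatAdapter {
  protected async handleChat(request: ChatAdapterRequest): Promise<ChatAdapterResponse> {
    const reply: Message = { type: "bot", text: `got ${request.history.length} message(s)` };
    return { history: [...request.history, reply] };
  }
}
```

The point of the split is that overload handling lives in one place, so each official adapter only has to deal with a single, fully specified request shape.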

Here is a sketch of the updated ChatAdapter interface.

This will also be used as part of the public interface, so that ChatAdapters are composable. At that point, Ragged's Chat instance will just be another adapter. Nothing special.

import { Message } from "../Chat.types";
import { Tool } from "../../tools/Tools.types"

// ==================== Request types ====================

export type ChatAdapterRequest = {
    history: Message[];
    tools?: Tool[];
    model?: string;
}

// ==================== Response types ====================

interface GChatAdapterGenerics {
    Response: {
        RateLimits: unknown;
    }
}

export type ChatAdapterResponse<G extends GChatAdapterGenerics = GChatAdapterGenerics> = {
    history: Message[];
    rateLimits: G['Response']['RateLimits'];
    meta: {
        chatAdapterRequest: ChatAdapterRequest;
        rawFetchRequest: Request;
        rawFetchResponse: Response;
    }
}

// ==================== Adapter types ====================

export interface BaseChatAdapter<G extends GChatAdapterGenerics = GChatAdapterGenerics> {
    chat(request: ChatAdapterRequest): Promise<ChatAdapterResponse<G>>;
}

abstract class Cool {
    protected abstract cool(): void;
}
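As a rough illustration of how the generics above might let adapter-specific rate limit types flow upward, here is a trimmed-down copy of the sketched types (the meta field is left out to keep it short) together with a made-up adapter. ExampleGenerics, ExampleAdapter, and the remainingRequests / resetInSeconds fields are hypothetical and do not describe any real provider's rate limit format.

```typescript
// Trimmed copies of the sketched types (meta omitted; Message simplified).
type Message = { type: "user" | "bot"; text: string };
type ChatAdapterRequest = { history: Message[]; model?: string };

interface GChatAdapterGenerics {
  Response: {
    RateLimits: unknown;
  };
}

type ChatAdapterResponse<G extends GChatAdapterGenerics = GChatAdapterGenerics> = {
  history: Message[];
  rateLimits: G["Response"]["RateLimits"];
};

interface BaseChatAdapter<G extends GChatAdapterGenerics = GChatAdapterGenerics> {
  chat(request: ChatAdapterRequest): Promise<ChatAdapterResponse<G>>;
}

// Hypothetical provider-specific rate limit shape.
interface ExampleGenerics extends GChatAdapterGenerics {
  Response: {
    RateLimits: { remainingRequests: number; resetInSeconds: number };
  };
}

// Hypothetical adapter; the numbers are hard-coded placeholders.
class ExampleAdapter implements BaseChatAdapter<ExampleGenerics> {
  async chat(request: ChatAdapterRequest): Promise<ChatAdapterResponse<ExampleGenerics>> {
    return {
      history: [...request.history, { type: "bot", text: "ok" }],
      rateLimits: { remainingRequests: 59, resetInSeconds: 1 },
    };
  }
}

// Callers get the concrete type without casting.
async function remaining(adapter: ExampleAdapter): Promise<number> {
  const response = await adapter.chat({ history: [] });
  return response.rateLimits.remainingRequests;
}
```

An adapter with no rate limit information would supply a generics type whose RateLimits is undefined, as in the Generics step of the plan.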

On second thoughts... I'm kind of rethinking the above. I think we already have a good base with Chat and Embed. The adapters don't necessarily have to compose all the way to the top -- I actually can't think of any good use cases for that. And it'll be extremely awkward and a lot of work to do that at this point.

I think a simpler approach could be modifying the BaseChatAdapter Request to instead be a Context object... something like the following:

export type ChatAdapterRequest = {
    apiClient: ApiClient;
    request: {
        history: Message[];
        tools?: Tool[];
        model?: string;
    };
}

This way, Chat and Embed can pass down various utilities that are fully controlled & provided at the top layer. The adapters then have to do less work in order to be functional.
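A minimal sketch of how an adapter might consume such a context object. The ApiClient interface below is an assumption with a single invented post method, not Ragged's actual ApiClient; the endpoint URL, the reply field, and the ExampleAdapter and RecordingClient classes are likewise hypothetical.

```typescript
// Simplified stand-ins (assumptions for this sketch).
type Message = { type: "user" | "bot"; text: string };

// Hypothetical minimal client interface, provided by the top layer.
interface ApiClient {
  post(url: string, body: unknown): Promise<unknown>;
}

type ChatAdapterRequest = {
  apiClient: ApiClient;
  request: {
    history: Message[];
    model?: string;
  };
};

// Hypothetical adapter: does no networking itself, only uses the context.
class ExampleAdapter {
  async chat(context: ChatAdapterRequest): Promise<Message[]> {
    const raw = await context.apiClient.post("https://example.invalid/chat", {
      model: context.request.model ?? "example-model",
      messages: context.request.history,
    });
    // Assumed response body shape for this sketch: { reply: string }.
    const text = (raw as { reply: string }).reply;
    return [...context.request.history, { type: "bot", text }];
  }
}

// The top layer controls the client, so a recording fake can be swapped in.
class RecordingClient implements ApiClient {
  calls: { url: string; body: unknown }[] = [];

  async post(url: string, body: unknown): Promise<unknown> {
    this.calls.push({ url, body });
    return { reply: "canned reply" };
  }
}
```

Since the client comes from above, concerns like header capture or retries could live in Chat and Embed once, rather than being reimplemented in every adapter.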

Hmm.... will sleep on it.