API changes for Chat and Embed
monarchwadia opened this issue · 4 comments
We should change the return API for the c.chat and e.embed methods so that they can expose things like the original request & response objects, including API headers; token counts and rate limiting information; and some additional utility functions like getLastMessage, which would make life easier (instead of doing messages.at(-1)?.text, which is a mouthful).
These are very much required in order to make Ragged more usable (I have some immediate use cases that demand this, for example.)
So I think Chat should turn into something like the following... and Embed will also follow suit (not shown below).
// Not a final API, just a sketch of some possibilities.
const {
  history,
  incomingMessages,
  rawResponse,
  rawRequest,
  getLastMessage
} = await c.chat("What is a rickroll?");
What properties and methods would you like to see on here?
Anything that's missing?
Anything new that needs to be added?
Anything that needs to be changed, or maybe that needs to be explained more?
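To make the sketch above a bit more concrete, here is one possible shape for the returned object, written in TypeScript. This is only a rough illustration, not Ragged's actual API: the `Message` type is a simplified stand-in, and `ChatResult` and `buildChatResult` are hypothetical names invented for this example.

```typescript
// Simplified stand-in for Ragged's Message type (assumption: only `type` and `text`).
type Message = { type: "user" | "bot"; text: string };

// Hypothetical result shape, following the destructuring sketch above.
type ChatResult = {
  history: Message[];          // full conversation, including the new messages
  incomingMessages: Message[]; // only the messages produced by this call
  rawRequest: unknown;         // provider request, left untyped in this sketch
  rawResponse: unknown;        // provider response, left untyped in this sketch
  getLastMessage(): Message | undefined;
};

// Hypothetical helper: assembles the result from pieces an adapter would supply.
function buildChatResult(
  previous: Message[],
  incoming: Message[],
  rawRequest: unknown,
  rawResponse: unknown
): ChatResult {
  const history = [...previous, ...incoming];
  return {
    history,
    incomingMessages: incoming,
    rawRequest,
    rawResponse,
    // Replaces the `messages.at(-1)?.text` pattern at call sites.
    getLastMessage: () => history[history.length - 1],
  };
}
```

With something like this, a caller could write `result.getLastMessage()?.text` and still reach the raw request and response when needed.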
One more consideration: it has occurred to me that the Chat instance could just be rewritten as an instance of the BaseChatAdapter class.
Looked at this way, it opens up avenues for vertical composability of adapters. For example, if Chat actually just extended BaseChatAdapter and added a few extra bells and whistles like history persistence, then that persistence could be one part of a vertical stack of adapters.
This could find use in composition patterns: essentially, Ragged becomes a middleware system. This could replace Langchain's LCEL system with something that's potentially a lot more generic, standardized, and composable in a way that is actually easy to reason about.
Just a thought.
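As a rough illustration of the "vertical stack of adapters" idea, the sketch below wraps one adapter inside another. Everything here is assumed for the sake of the example: the `BaseChatAdapter` interface is reduced to a single `chat` method, and `EchoAdapter` and `HistoryAdapter` are hypothetical classes, not part of Ragged.

```typescript
// Simplified stand-ins (assumptions, not Ragged's real types).
type Message = { type: "user" | "bot"; text: string };
type ChatRequest = { history: Message[] };
type ChatResponse = { history: Message[] };

interface BaseChatAdapter {
  chat(request: ChatRequest): Promise<ChatResponse>;
}

// Hypothetical leaf adapter: stands in for a real provider adapter.
class EchoAdapter implements BaseChatAdapter {
  async chat(request: ChatRequest): Promise<ChatResponse> {
    const last = request.history[request.history.length - 1];
    const reply: Message = { type: "bot", text: `echo: ${last ? last.text : ""}` };
    return { history: [...request.history, reply] };
  }
}

// Hypothetical middleware adapter: persists history across calls,
// then delegates to whatever adapter sits below it in the stack.
class HistoryAdapter implements BaseChatAdapter {
  private readonly inner: BaseChatAdapter;
  private persisted: Message[] = [];

  constructor(inner: BaseChatAdapter) {
    this.inner = inner;
  }

  async chat(request: ChatRequest): Promise<ChatResponse> {
    const response = await this.inner.chat({
      history: [...this.persisted, ...request.history],
    });
    this.persisted = response.history;
    return response;
  }
}
```

Because both classes satisfy the same interface, `new HistoryAdapter(new EchoAdapter())` is itself an adapter and could be wrapped again by another layer.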
WHAT WE HOPE TO GET
- Ability to add rich information & methods on the response of the c.chat() and e.embed() calls. Right now we only return Message[] and this is not sufficient.
- Typesafety for all fields
- Provider-specific fields, with typesafety, for certain properties (such as rate limiting)
- Composable adapters, which give us a few interesting possibilities such as reusable middleware and composable functionality that can be applied across all default adapters, and can be composed for all adapters in general.
- A nicer API.
To support rate limits, we need a stronger type system where Adapter types seamlessly flow up to Chat. Once this is done, each adapter can define its own rate limit body, etc. And finally, we can make Chat just another implementation of the BaseChatAdapter interface.
To do this requires a few refactors. Here are the general steps.
- New BaseChatAdapter interface (because we will need a new response body to chat.chat() anyway, plus we need generics for adapter-specific data, so this is a good time to do this.)
  - Define AbstractChatAdapter (so we get some foundational functional changes out of the way)
    - AbstractChatAdapter implements the existing BaseChatAdapter.
    - All official classes implement AbstractChatAdapter.
    - Export AbstractChatAdapter in Ragged's types.
    - Update documentation.
  - Modify BaseChatAdapter (so we get the interface changes in for all the adapters)
    - Rename to IChatAdapter
    - IChatAdapter::chat() should now exactly mirror the Chat.chat() method's entire overloaded signature. But it should also contain the exploded (i.e. detailed) chat parameter.
    - AbstractChatAdapter should now have a protected abstract handleChat method that must be implemented internally in all official chat adapters. This method is NOT overloaded; instead, it uses the exploded/detailed one from all the overloaded Chat.chat() methods. AbstractChatAdapter.chat() handles the mapping/conversion from the overloaded parameter into the exploded parameter, and passes it into handleChat
  - Generics
    - Update the IChatAdapter so that it has a mapped generic type that defines RateLimit as always undefined
    - Update the AbstractChatAdapter so that it is also generic.
    - Update all child adapters if necessary, to set ratelimit to undefined.
  - Rate Limit Mapping
    - All adapters will now map rate limit info.
  - Raw request & response typing
    - Add a meta field that has chatAdapterRequest, rawFetchRequest and rawFetchResponse that all get set as any. Now we have request & response info as any.
    - Generify the requests & responses for each adapter.
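The overload-to-handleChat step in the list above could look roughly like the following. This is a sketch under assumptions: the real Chat.chat() overloads are not shown in this thread, so two illustrative overloads (a plain string and the exploded request object) stand in for them, and `FakeAdapter` is a hypothetical child adapter.

```typescript
// Simplified stand-ins (assumptions, not Ragged's real types).
type Message = { type: "user" | "bot"; text: string };
type ChatAdapterRequest = { history: Message[]; model?: string };
type ChatAdapterResponse = { history: Message[] };

// Assumed overload set: the actual Chat.chat() signature may differ.
interface IChatAdapter {
  chat(userMessage: string): Promise<ChatAdapterResponse>;
  chat(request: ChatAdapterRequest): Promise<ChatAdapterResponse>;
}

abstract class AbstractChatAdapter implements IChatAdapter {
  chat(userMessage: string): Promise<ChatAdapterResponse>;
  chat(request: ChatAdapterRequest): Promise<ChatAdapterResponse>;
  chat(input: string | ChatAdapterRequest): Promise<ChatAdapterResponse> {
    // Map every overload onto the single exploded request shape.
    const request: ChatAdapterRequest =
      typeof input === "string"
        ? { history: [{ type: "user", text: input }] }
        : input;
    return this.handleChat(request);
  }

  // Not overloaded: child adapters only ever see the exploded request.
  protected abstract handleChat(
    request: ChatAdapterRequest
  ): Promise<ChatAdapterResponse>;
}

// Hypothetical child adapter, standing in for an official provider adapter.
class FakeAdapter extends AbstractChatAdapter {
  protected async handleChat(
    request: ChatAdapterRequest
  ): Promise<ChatAdapterResponse> {
    const reply: Message = {
      type: "bot",
      text: `received ${request.history.length} message(s)`,
    };
    return { history: [...request.history, reply] };
  }
}
```

The point of the split is that each provider adapter implements one method with one parameter shape, while the overload handling lives in a single place.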
Here is a sketch of the updated ChatAdapter interface.
This will also get used as part of the public interface, so that ChatAdapters are composable... and at that point, Ragged's Chat instance will just be another adapter. Nothing special.
import { Message } from "../Chat.types";
import { Tool } from "../../tools/Tools.types";

// ==================== Request types ====================
export type ChatAdapterRequest = {
  history: Message[];
  tools?: Tool[];
  model?: string;
}

// ==================== Response types ====================
interface GChatAdapterGenerics {
  Response: {
    RateLimits: unknown;
  }
}

export type ChatAdapterResponse<G extends GChatAdapterGenerics = GChatAdapterGenerics> = {
  history: Message[];
  rateLimits: G['Response']['RateLimits'];
  meta: {
    chatAdapterRequest: ChatAdapterRequest;
    rawFetchRequest: Request;
    rawFetchResponse: Response;
  }
}

// ==================== Adapter types ====================
export interface BaseChatAdapter<G extends GChatAdapterGenerics = GChatAdapterGenerics> {
  chat(request: ChatAdapterRequest): Promise<ChatAdapterResponse<G>>;
}

abstract class Cool {
  protected abstract cool(): void;
}
}
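To show how the generics parameter might carry provider-specific rate limit types up to the caller, here is a self-contained sketch. It redeclares trimmed versions of the types above (the `meta` field is omitted for brevity), and `ExampleProviderGenerics`, `ExampleProviderAdapter`, and the rate limit field names are all hypothetical.

```typescript
// Simplified stand-in for Ragged's Message type (assumption).
type Message = { type: "user" | "bot"; text: string };

interface GChatAdapterGenerics {
  Response: { RateLimits: unknown };
}

// Trimmed version of the response type above; `meta` is left out here.
type ChatAdapterResponse<G extends GChatAdapterGenerics = GChatAdapterGenerics> = {
  history: Message[];
  rateLimits: G["Response"]["RateLimits"];
};

interface BaseChatAdapter<G extends GChatAdapterGenerics = GChatAdapterGenerics> {
  chat(request: { history: Message[] }): Promise<ChatAdapterResponse<G>>;
}

// Hypothetical provider-specific generics: field names are invented.
interface ExampleProviderGenerics extends GChatAdapterGenerics {
  Response: {
    RateLimits: { remainingRequests: number; resetSeconds: number };
  };
}

class ExampleProviderAdapter implements BaseChatAdapter<ExampleProviderGenerics> {
  async chat(request: {
    history: Message[];
  }): Promise<ChatAdapterResponse<ExampleProviderGenerics>> {
    const reply: Message = { type: "bot", text: "ok" };
    // A real adapter would read these values from response headers;
    // they are hard-coded here to keep the sketch runnable.
    return {
      history: [...request.history, reply],
      rateLimits: { remainingRequests: 59, resetSeconds: 1 },
    };
  }
}
```

A caller of `ExampleProviderAdapter` gets `rateLimits.remainingRequests` typed as `number` without casting, while an adapter with no rate limit data could declare `RateLimits: undefined` in its generics instead.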
On second thoughts... I'm kind of rethinking the above. I think we already have a good base with Chat and Embed. The adapters don't necessarily have to compose all the way to the top -- I actually can't think of any good use cases for that. And it'll be extremely awkward and a lot of work to do that at this point.
I think a simpler approach could be modifying the BaseChatAdapter Request to instead be a Context object... something like the following:
export type ChatAdapterRequest = {
  apiClient: ApiClient;
  request: {
    history: Message[];
    tools?: Tool[];
    model?: string;
  };
}
This way, Chat and Embed can pass down various utilities that are fully controlled & provided at the top layer. The adapters then have to do less work in order to be functional.
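A minimal sketch of how an adapter might consume such a context object follows. ApiClient is named above but not defined in this thread, so the one-method `ApiClient` interface below is an assumption, as are `ExampleAdapter`, the placeholder URL, the fallback model name, and the `{ reply: string }` response shape. The `tools` field is left out for brevity.

```typescript
// Simplified stand-in for Ragged's Message type (assumption).
type Message = { type: "user" | "bot"; text: string };

// Hypothetical: a minimal client interface supplied by the top layer.
interface ApiClient {
  post(url: string, body: unknown): Promise<unknown>;
}

type ChatAdapterRequest = {
  apiClient: ApiClient;
  request: {
    history: Message[];
    model?: string;
  };
};

// Hypothetical adapter: it never builds its own HTTP client,
// it only uses the one handed down in the context object.
class ExampleAdapter {
  async chat(context: ChatAdapterRequest): Promise<Message[]> {
    const raw = await context.apiClient.post("https://api.example.invalid/chat", {
      model: context.request.model ?? "example-model",
      messages: context.request.history,
    });
    // Assumed response shape for this sketch only.
    const text = (raw as { reply: string }).reply;
    const reply: Message = { type: "bot", text };
    return [...context.request.history, reply];
  }
}
```

One side effect of this design is that the top layer can swap in a fake client for tests, or a client that records headers and rate limit data, without touching any adapter.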
Hmm.... will sleep on it.