dottxt-ai/prompts

Track and format chat history

rlouf opened this issue

The chat format consists of multi-turn conversations between the user and the large language model. Each turn is associated with a role: system, user, or assistant. A chat history can thus be represented as an iterable of tuples:

history = [
    ("system", "This is a system message"),
    ("user", "A user asking a question to the model"),
    ("assistant", "A model responding to the user")
]

We need a function which, given a model, will return the corresponding prompt in the correct format. For instance, for Zephyr the corresponding prompt should be:

<|system|>
This is a system message</s> 
<|user|>
A user asking a question to the model</s> 
<|assistant|>
A model responding to the user</s>

The simplest way to go about this is to define a function which, given a model name and a sequence of interactions, returns the corresponding prompt.

from prompts import to_chat_format

history = [
    ("system", "This is a system message"),
    ("user", "A user asking a question to the model"),
]
prompt = to_chat_format("HuggingFaceH4/zephyr-7b-beta", history)
print(prompt)
# <|system|>
# This is a system message</s> 
# <|user|>
# A user asking a question to the model</s> 

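A minimal sketch of how to_chat_format could work, assuming we delegate to the chat template shipped with the model's tokenizer in transformers (this is an illustration, not a committed design):

from transformers import AutoTokenizer


def to_chat_format(model_name, history):
    # Sketch: rely on the chat template bundled with the model's tokenizer,
    # which already encodes the model-specific formatting (assumes the model,
    # e.g. HuggingFaceH4/zephyr-7b-beta, ships such a template).
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    messages = [{"role": role, "content": content} for role, content in history]
    return tokenizer.apply_chat_template(messages, tokenize=False)
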
However, when we want to pass the prompt to the model for generation, we need to append the <|assistant|> token. For this we can define a new function, to_chat_prompt:

from prompts import to_chat_prompt

history = [
    ("system", "This is a system message"),
    ("user", "A user asking a question to the model"),
]
prompt = to_chat_prompt("HuggingFaceH4/zephyr-7b-beta", history)
print(prompt)
# <|system|>
# This is a system message</s> 
# <|user|>
# A user asking a question to the model</s> 
# <|assistant|>

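Under the same sketch, to_chat_prompt only differs in asking the template to append the generation prompt (the <|assistant|> header in Zephyr's case), which is what the add_generation_prompt flag of apply_chat_template does:

from transformers import AutoTokenizer


def to_chat_prompt(model_name, history):
    # Sketch: same as to_chat_format, but append the generation prompt so the
    # returned string is ready to be passed to the model.
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    messages = [{"role": role, "content": content} for role, content in history]
    return tokenizer.apply_chat_template(
        messages, tokenize=False, add_generation_prompt=True
    )
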
This system is still cumbersome: we need to append </s> to the response ourselves, and we need to call to_chat_format with the model name before every call to the model. Could we simplify this?

from typing import List, Optional, Tuple


class Chat:
    history: List[Tuple[str, str]]

    def __init__(self, model_name: str, system: Optional[str] = None, tools=None, documents=None):
        raise NotImplementedError

    def __str__(self):
        """Render the prompt corresponding to the history.

        If we want to be able to use the prompt with any library this should
        return the prompt with the "assistant" keyword?

        As with `Template` we need to be able to specify the model we are
        working with.

        """
        raise NotImplementedError

    def __getitem__(self, i):
        return self.history[i]

    def user(self, string: str):
        """Add user input to the history."""
        self.history.append(("user", string))

    def assistant(self, string: str):
        """Add the model's response to the history."""
        self.history.append(("assistant", string))

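For illustration only, here is one way __init__ and __str__ could be filled in if rendering is delegated to the tokenizer's chat template; the tools and documents arguments are left out and the name ChatSketch is hypothetical:

from typing import List, Optional, Tuple

from transformers import AutoTokenizer


class ChatSketch:
    history: List[Tuple[str, str]]

    def __init__(self, model_name: str, system: Optional[str] = None):
        # Keep the tokenizer around so __str__ can use its chat template.
        self.tokenizer = AutoTokenizer.from_pretrained(model_name)
        self.history = [("system", system)] if system is not None else []

    def __str__(self):
        # Render the history and append the generation prompt (e.g.
        # <|assistant|> for Zephyr) so the string can be passed to the model.
        messages = [{"role": role, "content": content} for role, content in self.history]
        return self.tokenizer.apply_chat_template(
            messages, tokenize=False, add_generation_prompt=True
        )
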
So a session would look like:

from outlines import generate, models


model = models.transformers("HuggingFaceH4/zephyr-7b-beta")

session = Chat("HuggingFaceH4/zephyr-7b-beta", "This is a system message")
session.user("A user asking a question to the model")

result = generate.text(model)(session)
session.assistant(result)

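With a __str__ along the lines of the sketch above, rendering the session right before generation would give something like:

print(session)
# <|system|>
# This is a system message</s>
# <|user|>
# A user asking a question to the model</s>
# <|assistant|>
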
The obvious downside of this design is the necessity to specify the model name twice.

str(chat) must return the correctly formatted prompt so that a Chat instance can be used to interface with any library.

This issue (and arguably the repository) was triggered by this issue in Outlines and inspired by this repository opened by a member of the Outlines community.