nomic-ai/pygpt4all

Is it possible to abort the generation process from the callback?

quadrismegistus opened this issue · 6 comments

If I'm writing a chatbot dialogue prompt, for example,

HUMAN: What is it like to be an AI?

AI:

Is it possible to terminate the generation process once it goes beyond `HUMAN:` and starts generating the human's side of the dialogue itself (as interesting as that is!)?

Does the model object have the ability to terminate the generation? Or is there some way to do it from the callback? I believe model.res keeps an up-to-date string which the callback could watch for `HUMAN:` (in the example above).
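Concretely, the idea would be something like this sketch (the `new_text_callback` hook follows the package's examples; whether anything inside it can actually halt generation is the question):

    # hypothetical sketch: watch the accumulated output for the antiprompt
    generated = []

    def new_text_callback(text):
        generated.append(text)
        if "HUMAN:" in "".join(generated):
            pass  # <-- is there any way to abort generation from here?

    model.generate(prompt, n_predict=256, new_text_callback=new_text_callback)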

In any case: Thanks for such a fantastic package! This is helping democratize and scrutinize GPT technology!

Hi Ryan! This is an interesting idea. I don't think this is possible from Python at the moment: the generation happens inside a loop in the C code that watches a locally scoped variable, and it only breaks when it hits a special EOS token (one key moment is here: https://github.com/nomic-ai/pyllamacpp/blob/815c32fe2db43b63d4cafcdec90e1287f5d2ceb3/src/main.cpp#L547). But maybe there's another route.
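To make the constraint concrete, here is a rough Python analogue of that C loop (`sample_next_token`, `token_to_text`, and `EOS_TOKEN` are stand-ins, not real API): the only exit condition is local to the loop, so nothing the Python callback does can reach it.

    # rough analogue of the C-side generation loop
    while remaining_tokens > 0:
        token = sample_next_token()               # stand-in for the C sampling step
        new_text_callback(token_to_text(token))   # the Python callback only observes output
        if token == EOS_TOKEN:                    # the only break condition lives here
            break
        remaining_tokens -= 1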

In any case, I like the idea of using the callback as a spot where you could set other halting conditions besides the EOS token. One tricky thing is that you would usually want to insert an EOS manually in these cases, I think: even if the model sometimes starts generating dialogues without EOS tokens separating the turns, it probably works best when the EOS tokens are there, as in the training data.

@quadrismegistus, you could still set interactive=True and add an antiprompt to stop the generation when you reach it, but as @bmschmidt said, the generation happens inside the C code, so it is not very convenient.

I am planning to push some updates soon to support this interactive mode.

@abdeladim-s this and/or a stop-word sequence (#21) would be a huge boost for app development, since prompt templates often rely on a specific stop-word sequence that is inconvenient to train into the model.

@hinthornw Yes, I agree. I am working on it and will let you know once it's done.

Take a look at this very simple generation loop: https://github.com/mkinney/discord_aibot/blob/main/bot.py#L32

    ans = ""
    for token in model.generate(question, temp=0.28, n_predict=4096):
        # <insert condition here>
        ans += token

You could add a condition (a timer, an nth-token cutoff, or whatever) in that for loop, as in the sketch below.
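For example, a minimal sketch using the `HUMAN:` antiprompt from the original question (adjust the stop string to your own template):

    ans = ""
    for token in model.generate(question, temp=0.28, n_predict=4096):
        ans += token
        # stop once the model begins a new "HUMAN:" turn
        if "HUMAN:" in ans:
            ans = ans.split("HUMAN:")[0]  # keep only the AI's reply
            break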

Hi @quadrismegistus,

As @mkinney said, you can get the tokens one by one from the generate function and break the loop whenever you want to stop. Please take a look at the interactive dialogue tutorial on the README page.
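For reference, a minimal sketch of that dialogue pattern, assuming the generator-style generate shown above (the import path, model file, and prompt template here are placeholders, not taken from the tutorial):

    from pygpt4all import GPT4All

    model = GPT4All("./models/ggml-gpt4all-l13b-snoozy.bin")  # example path

    while True:
        question = input("HUMAN: ")
        prompt = f"HUMAN: {question}\nAI:"
        answer = ""
        for token in model.generate(prompt, n_predict=256):
            answer += token
            if "HUMAN:" in answer:  # the model started a new turn: stop here
                answer = answer.split("HUMAN:")[0]
                break
        print("AI:" + answer.rstrip())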