Adriankhl/godot-llm

Is it possible to exclude the `<|eot_id|>` token?

Closed this issue · 6 comments

I know there are already some options for llama.cpp to exclude some tokens, but I'm currently facing the `<|eot_id|>` token with Llama3 Instruct in Interactive mode.

My options look like this:
[image: screenshot of the options]

But I'm still getting the `<|eot_id|>` token (`should_output_eos` has no effect).

[image: screenshot of the output]

I could do some string filtering, but before I set up a complicated system of string caching and live-detection, I thought I'd ask.
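For reference, the kind of string filtering I have in mind is roughly the following. It is only a minimal sketch in Python rather than GDScript, and `StopTokenFilter`, `feed` and `flush` are names I made up, not part of godot-llm or llama.cpp. It assumes the text arrives in arbitrary chunks, so the marker may be split across two or more of them.

```python
class StopTokenFilter:
    """Strip a stop marker from text that arrives in streamed chunks."""

    def __init__(self, marker="<|eot_id|>"):
        self.marker = marker
        self.pending = ""  # held-back text that might be the start of the marker

    def feed(self, chunk):
        """Return the part of the stream that is safe to display now."""
        text = (self.pending + chunk).replace(self.marker, "")
        # Hold back the longest suffix that is a proper prefix of the marker,
        # because the next chunk might complete it.
        keep = 0
        for n in range(min(len(self.marker) - 1, len(text)), 0, -1):
            if text.endswith(self.marker[:n]):
                keep = n
                break
        self.pending = text[len(text) - keep:]
        return text[:len(text) - keep]

    def flush(self):
        """Return whatever is still held back once generation has finished."""
        out, self.pending = self.pending, ""
        return out
```

Each generated chunk would go through `feed()` before being displayed, and `flush()` would be called once when generation ends so that a trailing partial match is not lost.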

I think Should Output Eos is actually the third Should Output... option, so you may still have it on 😆

Yep, that's Should Output Eos. I've tried it both on and off, and the token is shown either way.
Should I prepare a sample project or is there another way?

I have tested on my side and it definitely works here. Could you download the new Godot LLM Template, either from the asset library or from here, then open the application -> Text Generation -> change None to Person -> Generate, and see whether `<|eot_id|>` is there or not?

It could also well be this issue: ggerganov/llama.cpp#6772
So the problem may come from the model side, although I think it has already been fixed.
If the issue persists, could you try one of the "fixed" models here?

You're right, it was a model issue. The model you linked fixed the issue.
What's weird is that I used the linked Meta-Llama-3-8B-Instruct-Q5_K_M.gguf from the README.
So on the one hand, I would think that it's best to change it, but on the other hand you didn't have the issue with the same model.

It may well be that we downloaded different versions of the model from the same repo. I have added a FAQ section to the README to point out that model versions can be a problem even if the model is the same.
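One way to check whether two downloads of the same file name are really the same version is to compare checksums. The following is only a sketch in plain Python, not anything provided by godot-llm; `sha256_of_file` is a hypothetical helper name, and the path you pass in would be wherever your copy of the `.gguf` file lives.

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks.

    Reading in chunks keeps memory use low for multi-gigabyte model files.
    """
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(chunk_size), b""):
            digest.update(block)
    return digest.hexdigest()
```

If two people run this on their copies of the model and get different digests, they have different versions of the file, even though the file name matches.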