Adriankhl/godot-llm

When I use Chinese input, I get the error "Missing 2 UTF-8 continuation byte(s)". What could be the cause of this?

Closed this issue · 4 comments


Ah, this happens because the UTF-8 encoding of a Chinese character can get split up during generation. Unlike std::string, a Godot String cannot combine the pieces correctly when they arrive separately, i.e., when they are streamed through on_generate_text_updated. Can you check whether the text is displayed correctly in the final on_generate_text_finished signal?
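To illustrate the cause described above, here is a minimal Python sketch (not the plugin's code). The split point is a made-up example of where a chunk boundary might fall, and `decodes_alone` is a hypothetical helper name.

```python
# "你" is three bytes in UTF-8 (E4 BD A0), so "你好" is six bytes in total.
data = "你好".encode("utf-8")

# Assumed chunk boundary: it falls right after the first byte of "你".
chunk_a, chunk_b = data[:1], data[1:]


def decodes_alone(chunk: bytes) -> bool:
    """Return True if the chunk is valid UTF-8 when decoded by itself."""
    try:
        chunk.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# chunk_a is a lead byte with its continuation bytes missing.
print(decodes_alone(chunk_a))
# chunk_b starts with continuation bytes that have no lead byte.
print(decodes_alone(chunk_b))
# Joined back together, the bytes decode without error.
print((chunk_a + chunk_b).decode("utf-8"))
```

Neither piece is valid on its own, which matches the "missing continuation byte(s)" kind of error, while the concatenated bytes are fine.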

Let me figure out a way to detect an incomplete UTF-8 sequence and send the text out only once it is valid.
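One way such detection could work is sketched below in Python. This is only an illustration under my own assumptions, not the fix that went into the plugin: `expected_length`, `split_complete`, and `Utf8Streamer` are hypothetical names, and the idea is to hold back a trailing sequence whose lead byte promises more bytes than have arrived so far.

```python
def expected_length(lead: int) -> int:
    """Sequence length implied by a UTF-8 lead byte (1 for ASCII or other bytes)."""
    if lead >> 5 == 0b110:
        return 2
    if lead >> 4 == 0b1110:
        return 3
    if lead >> 3 == 0b11110:
        return 4
    return 1


def split_complete(buf: bytes) -> tuple[bytes, bytes]:
    """Split buf into (complete prefix, trailing incomplete sequence)."""
    # The lead byte of the last sequence is at most 4 bytes from the end.
    for back in range(1, min(4, len(buf)) + 1):
        byte = buf[-back]
        if byte & 0b1100_0000 == 0b1000_0000:
            continue  # continuation byte, keep looking for its lead byte
        if expected_length(byte) > back:
            return buf[:-back], buf[-back:]  # last sequence is cut off
        break
    return buf, b""


class Utf8Streamer:
    """Buffers bytes and releases only text that is complete so far."""

    def __init__(self) -> None:
        self.pending = b""

    def feed(self, chunk: bytes) -> str:
        complete, self.pending = split_complete(self.pending + chunk)
        # errors="replace" is an assumption: malformed input becomes U+FFFD.
        return complete.decode("utf-8", errors="replace")


streamer = Utf8Streamer()
data = "你好".encode("utf-8")
# Assumed chunking that cuts both characters in the middle.
outputs = [streamer.feed(c) for c in (data[:1], data[1:4], data[4:])]
print(outputs)
```

Each call to `feed` returns only the characters that are complete at that point, so a signal handler would never receive half of a character.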

Now it has been fixed on the main branch 😄 Thanks for the report.

I have released the new version and it is fixed there. Let me know if you have any further problems.

It seems there is still a problem. When I define n_predict, does it cut the output stream? The last word may still be corrupted, as shown in this screenshot.
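One possible explanation, not confirmed in this thread, is that a generation limit stops the output in the middle of a multi-byte character, so the buffered tail can never be completed. The Python sketch below shows one way to handle that at the end of a stream, using the standard library's incremental decoder. `decode_stream` and its `on_truncated` parameter are hypothetical names, and the truncation is simulated by dropping the last byte.

```python
import codecs


def decode_stream(chunks, on_truncated="ignore"):
    """Decode byte chunks incrementally, then flush the decoder at the end.

    on_truncated is passed as the codec error handler: "ignore" drops an
    unfinished trailing character, "replace" emits U+FFFD in its place.
    """
    decoder = codecs.getincrementaldecoder("utf-8")(errors=on_truncated)
    pieces = [decoder.decode(chunk) for chunk in chunks]
    # final=True tells the decoder that no more bytes will arrive.
    pieces.append(decoder.decode(b"", final=True))
    return "".join(pieces)


data = "你好世界".encode("utf-8")
truncated = data[:-1]  # simulate a stream cut inside the last character
chunks = [truncated[i:i + 4] for i in range(0, len(truncated), 4)]

dropped = decode_stream(chunks, on_truncated="ignore")
marked = decode_stream(chunks, on_truncated="replace")
print(dropped)
print(marked)
```

Dropping the unfinished character keeps the displayed text clean, while the replacement character makes the truncation visible. Which one is preferable depends on the application.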