Use tokenizer to fit context window

Question

McPatate opened this issue a year ago · 0 comments

llm-ls does not check if the prompt fits the context window.

We should use https://github.com/huggingface/tokenizers to count the tokens as they are added to the prompt, so that a request does not fail by exceeding the model's context window.
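
A rough sketch of what this could look like, not a proposal for the actual implementation. It assumes the prompt is built from the lines before the cursor and that the nearest lines matter most. `fit_prefix` and `whitespace_count` are hypothetical names; the word-based counter is only a stand-in so the example runs without dependencies, and a real version would count tokens with the `tokenizers` crate instead.

```rust
/// Hypothetical helper: keep as many trailing lines (the ones closest to
/// the cursor) as fit within `max_tokens`.
///
/// `count_tokens` is a placeholder for a real tokenizer. With the
/// `tokenizers` crate it could be roughly
/// `|s| tokenizer.encode(s, false).unwrap().get_ids().len()`.
fn fit_prefix<'a, F>(lines: &[&'a str], max_tokens: usize, count_tokens: F) -> Vec<&'a str>
where
    F: Fn(&str) -> usize,
{
    let mut used = 0;
    let mut kept = Vec::new();
    // Walk backwards from the cursor so the nearest context is kept first.
    for &line in lines.iter().rev() {
        let cost = count_tokens(line);
        if used + cost > max_tokens {
            // Adding this line would overflow the budget: stop here.
            break;
        }
        used += cost;
        kept.push(line);
    }
    // Restore the original top-to-bottom order.
    kept.reverse();
    kept
}

/// Stand-in counter (an assumption for this sketch, not a real tokenizer):
/// one token per whitespace-separated word.
fn whitespace_count(text: &str) -> usize {
    text.split_whitespace().count()
}
```

With lines costing 3, 2 and 1 tokens and a budget of 3, only the last two lines are kept. Counting per line is an approximation: a real tokenizer may produce a different total when the lines are joined, so leaving some headroom below the actual context size would be safer.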