Use tokenizer to fit context window
McPatate opened this issue · 0 comments
llm-ls does not check whether the prompt fits within the model's context window.
We should use https://github.com/huggingface/tokenizers to count the tokens as they are added to the prompt, so that a request does not error because the prompt is too long.
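
A rough sketch of what this could look like, not the actual llm-ls code: walk backwards from the cursor, count the tokens for each line, and stop adding lines once the budget is used up. The names `fit_to_context`, `max_tokens`, and `whitespace_count` are made up for illustration, and the whitespace counter is only a stand-in so the example runs without dependencies. With the `tokenizers` crate, the counter would presumably be something like `tokenizer.encode(text, false)` followed by `get_ids().len()` on the resulting encoding.

```rust
/// Builds a prompt from the lines closest to the cursor (the end of `lines`),
/// adding lines backwards until the next one would exceed `max_tokens`.
/// `count_tokens` is a hypothetical hook: any function that returns the
/// number of tokens in a piece of text.
fn fit_to_context<F>(lines: &[&str], max_tokens: usize, count_tokens: F) -> String
where
    F: Fn(&str) -> usize,
{
    let mut used = 0;
    let mut kept: Vec<&str> = Vec::new();

    // Start at the line nearest the cursor and move towards the file start.
    for &line in lines.iter().rev() {
        let cost = count_tokens(line);
        if used + cost > max_tokens {
            break; // adding this line would overflow the context window
        }
        used += cost;
        kept.push(line);
    }

    // Lines were collected in reverse order; restore the original order.
    kept.reverse();
    kept.join("\n")
}

/// Stand-in token counter (assumption): one token per whitespace-separated
/// word. A real model tokenizer will give different counts.
fn whitespace_count(text: &str) -> usize {
    text.split_whitespace().count()
}
```

For example, calling `fit_to_context(&["a b c", "d e", "f"], 3, whitespace_count)` keeps only the last two lines, since adding the first one would go over the budget of 3. The sketch ignores the tokens contributed by newlines and by any special tokens the model adds, so a real implementation would need to leave some margin or count the joined prompt.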