saulpw/aipl

tracking tokenization and cost for different LLMs

Opened this issue · 1 comments

Estimating the cost of each LLM call is handy, but if we start supporting non-OpenAI models we'll want to do it in a more general way. Specifically, GooseAI (being added in #30) doesn't return a count of tokens used, so we'd have to count the tokens ourselves from the prompt and completion, which means knowing which tokenizer the model uses. The pricing math also differs from OpenAI's, so if we want to improve or expand this feature it might make sense to have provider-specific functions that calculate the cost. Alternatively, we could use a heuristic (usually just dividing the character count by a constant) and verify it stays within an acceptable error tolerance.

For now we'll probably just have cost estimation for OpenAI only.
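To make the idea above concrete, here's a minimal sketch of what provider-specific cost estimation might look like. All names and prices here are hypothetical (not part of aipl, and not current rates): it prefers the token count the provider reports, and falls back to the character-count heuristic (~4 characters per token for English text) for providers like GooseAI that don't report one.

```python
from typing import Optional

# Illustrative $/1K-token prices -- placeholders, not real current rates.
PRICE_PER_1K_TOKENS = {
    "openai/gpt-3.5-turbo": 0.002,
    "gooseai/gpt-neo-20b": 0.00265,
}


def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Heuristic token count for when the provider reports none."""
    return max(1, round(len(text) / chars_per_token))


def estimate_cost(model: str, prompt: str, completion: str,
                  reported_tokens: Optional[int] = None) -> float:
    """Cost in dollars, preferring the provider-reported token count."""
    tokens = reported_tokens
    if tokens is None:
        # Provider didn't report usage (e.g. GooseAI): fall back to heuristic.
        tokens = estimate_tokens(prompt) + estimate_tokens(completion)
    return tokens / 1000 * PRICE_PER_1K_TOKENS[model]
```

A real implementation would swap the heuristic for the model's actual tokenizer (e.g. tiktoken for OpenAI models), but the provider-specific dispatch would look about the same.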

Goose AI Tokenizer documented here: https://goose.ai/tokenizer

We also lose the ability to compute costs easily if we use the streaming API for OpenAI, since the streamed response doesn't include token usage. I compute token costs in chatcli like this:

https://github.com/cthulahoops/chatcli/blob/main/chatcli_gpt/conversation.py#L48
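For the streaming case, one option is to accumulate the streamed text deltas and count tokens over the full completion afterwards. A small sketch (the function name and default heuristic counter are my own, not from chatcli; in practice you'd pass a real tokenizer such as tiktoken's as `count_tokens`):

```python
from typing import Callable, Iterable, Tuple


def count_streamed_completion(
    chunks: Iterable[str],
    count_tokens: Callable[[str], int] = lambda s: max(1, round(len(s) / 4)),
) -> Tuple[str, int]:
    """Join streamed text deltas, then count tokens over the whole
    completion, since the streaming response carries no usage field."""
    text = "".join(chunks)
    return text, count_tokens(text)


# Usage with fake streamed chunks:
text, n_tokens = count_streamed_completion(["Hel", "lo wo", "rld!"])
```

This keeps the counting logic independent of how the response arrives, so the same per-provider cost functions work for both streamed and non-streamed calls.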