niieani/gpt-tokenizer

Huge memory consumption of isWithinTokenLimit

Opened this issue · 4 comments

I am experiencing a 200 MB increase in memory consumption after adding gpt-tokenizer. The only function I am using from this library is isWithinTokenLimit. Here is an image of my memory consumption before and after deployment.
Here is how I am using it:

import { isWithinTokenLimit } from 'gpt-tokenizer'
import type { ChatCompletionRequestMessage } from 'openai' // openai v3 SDK type

function getRequestTokenCount(req: ChatCompletionRequestMessage[]) {
  // rough per-message overhead added by the chat prompt format
  const extraTokensDueToPromptForEachMessage = 7
  return req.reduce((acc, curr) => {
    // isWithinTokenLimit returns the token count while under the limit (false otherwise),
    // so passing Infinity effectively counts tokens, with 99999 as a fallback
    const tokensInText = isWithinTokenLimit(curr.content, Infinity) || 99999
    return acc + tokensInText + extraTokensDueToPromptForEachMessage
  }, 0)
}

[image: memory consumption before and after deployment]
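For reference, passing Infinity is just my way of getting a raw count; the same thing could presumably be done with the library's countTokens export, assuming the installed version has it (a minimal sketch, not my deployed code):

import { countTokens } from 'gpt-tokenizer'

// counts tokens directly instead of passing Infinity as the limit to isWithinTokenLimit
function getMessageTokenCount(text: string): number {
  return countTokens(text)
}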

same here, and what seems even weirder is that it made me run out of memory just by importing it, without even using it

hi @aminsol and @olboghgc, can you provide a little more context about the platform?
is this under Node, Bun, or the browser?
are you using a bundler of any sort?
which GPT encoding are you trying to use?

@niieani Node 18 / TypeScript / Babel 7.0.0 in a Firebase Cloud Function, with GPT-3.5 and GPT-4o (but that shouldn't really matter, since a simple import already causes a spike in memory usage)

btw I was able to more or less work around this issue by doing a conditional require, so I don't always pay the memory cost when I don't need to call it
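Roughly like this (a sketch of the idea assuming a CommonJS build; the helper and variable names are just for illustration):

type IsWithinTokenLimit = (text: string, tokenLimit: number) => number | false

// cache the function after the first require, so gpt-tokenizer's encoder data
// is only loaded into memory once a token count is actually needed
let cached: IsWithinTokenLimit | undefined

function lazyIsWithinTokenLimit(text: string, tokenLimit: number): number | false {
  const isWithinTokenLimit: IsWithinTokenLimit =
    cached ?? (cached = require('gpt-tokenizer').isWithinTokenLimit)
  return isWithinTokenLimit(text, tokenLimit)
}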