Continuous Batching
Closed this issue · 1 comment
Meowmix42069 commented
Hello, I'm a user of a llama.rn derivative app, and I'm wondering why continuous batching is not included in your implementation. As I understand it, continuous batching should be enabled by default for all server launches. What would be the easiest way to implement this feature?
jhen0409 commented
Is this what you mean? https://github.com/ggerganov/llama.cpp/blob/2b1f616b208a4a21c4ee7a7eb85d822ff1d787af/examples/server/README.md?plain=1#L162-L167
If so, we already have #30 tracking it. Also, since we've recently needed this internally as well, I'm sure we'll be supporting it soon.
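For readers following along, the linked README section describes the llama.cpp server's continuous batching feature: a fixed pool of sequence slots is refilled from a request queue as soon as a sequence finishes, and every decode step runs a single batch containing the next token of each active sequence. The sketch below is a conceptual illustration of that scheduling loop only; all names in it (`Request`, `Slot`, `decodeBatch`, `sampleToken`, `EOS`, `NUM_SLOTS`) are hypothetical placeholders, not the llama.rn or llama.cpp API, and prompt prefill plus KV-cache management are heavily simplified.

```typescript
// Conceptual sketch of a continuous-batching loop. All types and functions
// here are hypothetical stand-ins, not the llama.rn or llama.cpp API.

interface Request {
  promptTokens: number[];
  onToken: (token: number) => void;
  onDone: () => void;
}

interface Slot {
  id: number;          // maps to a sequence id in the KV cache
  request?: Request;   // undefined while the slot is free
  tokens: number[];    // tokens seen so far for this request
}

const EOS = -1;        // placeholder end-of-sequence marker

// Stand-in for one forward pass over the whole batch (one call per step).
function decodeBatch(items: { token: number; seqId: number; pos: number }[]): void {
  // A real implementation would run the model once over all items,
  // writing KV-cache entries per seqId.
}

// Stand-in sampler: random token id, occasionally EOS to end a sequence.
function sampleToken(seqId: number): number {
  return Math.random() < 0.1 ? EOS : Math.floor(Math.random() * 32000);
}

const NUM_SLOTS = 4;
const slots: Slot[] = Array.from({ length: NUM_SLOTS }, (_, id): Slot => ({ id, tokens: [] }));
const queue: Request[] = [];

function step(): void {
  // 1. Admit queued requests into free slots as soon as they open up --
  //    this is what makes the batching "continuous" rather than static.
  for (const slot of slots) {
    if (!slot.request && queue.length > 0) {
      const req = queue.shift()!;
      slot.request = req;
      slot.tokens = [...req.promptTokens];
    }
  }

  // 2. Build one batch holding the latest token of every active sequence.
  //    (Prompt prefill is simplified: a real implementation would first feed
  //    all prompt tokens of newly admitted sequences.)
  const items = slots
    .filter((s) => s.request)
    .map((s) => ({
      token: s.tokens[s.tokens.length - 1],
      seqId: s.id,
      pos: s.tokens.length - 1,
    }));
  if (items.length === 0) return;

  // 3. Single forward pass for all active sequences.
  decodeBatch(items);

  // 4. Sample per sequence; a finished sequence frees its slot immediately,
  //    so a queued request can join on the very next step.
  for (const slot of slots) {
    if (!slot.request) continue;
    const next = sampleToken(slot.id);
    if (next === EOS) {
      slot.request.onDone();
      slot.request = undefined;
      slot.tokens = [];
    } else {
      slot.tokens.push(next);
      slot.request.onToken(next);
    }
  }
}

// Example driver: enqueue a request and run until everything drains.
queue.push({
  promptTokens: [1, 2, 3],
  onToken: (t) => console.log("token", t),
  onDone: () => console.log("done"),
});
while (slots.some((s) => s.request) || queue.length > 0) step();
```

The key property the sketch tries to show is that new requests never wait for the whole batch to finish; they only wait for any one slot to free up, which is what distinguishes continuous batching from static batching.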