Batching means computing multiple input streams at the same time. This vectorizes better (the weights are loaded once and reused across all streams) and can thus speed up inference per token.
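
A minimal sketch of the idea, using NumPy (the names `n_embd`, `batch`, and the shapes here are illustrative, not this project's actual code): stacking the per-stream input vectors into one matrix turns many matrix-vector products into a single matrix-matrix product, which the underlying BLAS kernel can vectorize and block far more effectively.

```python
import numpy as np

n_embd, batch = 768, 8  # illustrative sizes, not taken from this repo
rng = np.random.default_rng(0)
W = rng.standard_normal((n_embd, n_embd)).astype(np.float32)  # one layer's weights

# Unbatched: one mat-vec per input stream; W is re-read from memory each time.
xs = [rng.standard_normal(n_embd).astype(np.float32) for _ in range(batch)]
ys_unbatched = [W @ x for x in xs]

# Batched: stack the streams and do a single mat-mat product.
X = np.stack(xs)   # shape (batch, n_embd)
Y = X @ W.T        # shape (batch, n_embd); row i equals W @ xs[i]

assert all(np.allclose(y, Y[i], atol=1e-4) for i, y in enumerate(ys_unbatched))
```

The win comes from arithmetic intensity: per-token inference is typically memory-bound on loading the weights, so amortizing each weight load over `batch` streams raises throughput even though the work per stream is unchanged.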