certik/fastGPT

Implement batching

certik opened this issue · 0 comments

Batching means that multiple input streams are being computed at the same time, which can vectorize better and thus speedup the inference per token.