FMInference/FlexLLMGen

Question about the num-gpu-batches and gpu-batch-size

young-chao opened this issue · 0 comments

According to batch_size_table.md, the total batch size 144 equals 48 x 3 (the 144 comes from batch_size_table.md, and the 48 x 3 from bench_suite.py), so I gather that the overall batch size in FlexGen is the product of num-gpu-batches and gpu-batch-size. But I don't understand the actual meaning of these two parameters. Shouldn't num-gpu-batches be the number of batches, and gpu-batch-size the size of each batch?
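My current understanding can be sketched as follows (a minimal sketch; the variable names and comments are my own reading of the flags, not FlexGen's internals):

```python
# My reading of the two FlexGen flags, based on batch_size_table.md
# and bench_suite.py (names below are mine, for illustration only):
gpu_batch_size = 48   # --gpu-batch-size: sequences processed per GPU batch
num_gpu_batches = 3   # --num-gpu-batches: how many such batches are used

# If my understanding is right, the "batch size" reported in
# batch_size_table.md is the product of the two:
effective_batch_size = gpu_batch_size * num_gpu_batches
print(effective_batch_size)  # 144, matching batch_size_table.md
```

Is this multiplication the intended relationship, or do the two flags mean something else?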