Is dynamic batch supported?
Closed this issue · 2 comments
audreyeternal commented
Hi, thank you for developing this very powerful quantization toolkit!
I have a question about converting the quantized onnx file to TensorRT engine. Does mqbench support dynamic batch?
Because in my project, the batch size is not fixed at inference time. However, I didn't find anything about dynamic batch in the toolkit. It seems we need to manually set the `--batch-size` flag in the `onnx2trt.py` file.
Thank you!
audreyeternal commented
I have solved the problem, and I will share my solution for enabling dynamic batch in MQBench.
- First, the `--batch-size` flag in the `onnx2trt.py` file has nothing to do with TRT engine generation.
- When we use the `convert_deploy()` function to convert the quantized module to an ONNX file, we should pass `**extra_kwargs`:

```python
# axis 0 (batch dimension) as dynamic:
dynamic_axes = {'input_1': {0: 'batch_size'}, 'output_1': {0: 'batch_size'}, 'output_2': {0: 'batch_size'}}
extra_kwargs = dict(input_names=["input_1"], output_names=["output_1", "output_2"], dynamic_axes=dynamic_axes)
convert_deploy(..., **extra_kwargs)
```
- Add an optimization profile for the input binding in the `onnx2trt.py` file:

```python
profile = builder.create_optimization_profile()
profile.set_shape("input_1", (1, 2, 224, 224), (4, 2, 224, 224), (8, 2, 224, 224))  # min, opt, max shapes
config.add_optimization_profile(profile)
engine = builder.build_engine(network, config)
```
- When doing inference, keep in mind to select the active optimization profile:

```python
self.context = self.engine.create_execution_context()
self.context.active_optimization_profile = 0
```
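One detail the snippet above leaves out: with an explicit-batch engine, TensorRT also needs the concrete input shape set on the execution context once the actual batch size is known, otherwise the dynamic axis stays unresolved. A hedged sketch of that inference path (buffer allocation and the `engine` object are assumed to come from the earlier steps; this is not runnable without a GPU and a built engine):

```python
context = engine.create_execution_context()
context.active_optimization_profile = 0  # select the profile added at build time

batch = 4  # any value within the [min, max] range of the profile
context.set_binding_shape(0, (batch, 2, 224, 224))  # resolve the dynamic axis

# Device buffers should then be sized from context.get_binding_shape(i)
# for each binding i, and inference run with e.g. context.execute_v2(bindings).
```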
github-actions commented
This issue has not received any updates in 120 days. Please reply to this issue if it is still unresolved!