NVIDIA/FasterTransformer

How to serve multi-GPU inference?

Alone-wl opened this issue · 1 comment

How do I serve multi-GPU inference?

mpirun -np ${num_of_GPU} python xxx.py
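
For context, the command above starts `${num_of_GPU}` copies of the same script, and each copy has to work out which rank it is and which GPU to use. The sketch below is only an illustration of that idea, not FasterTransformer's own code: it assumes an Open MPI launcher (which sets the `OMPI_COMM_WORLD_*` variables), falls back to `PMI_RANK`/`PMI_SIZE` as set by MPICH-style launchers, and uses the hypothetical names `rank_info`, `pick_gpu`, and `GPUS_PER_NODE`. The example scripts shipped with the repository may already handle rank and device setup internally.

```python
import os


def rank_info(env=None):
    """Return (rank, world_size, local_rank) from launcher-set variables.

    Assumption: Open MPI sets OMPI_COMM_WORLD_*; PMI_RANK / PMI_SIZE are
    used as a fallback. Without a launcher this degrades to a single rank.
    """
    env = os.environ if env is None else env
    rank = int(env.get("OMPI_COMM_WORLD_RANK", env.get("PMI_RANK", 0)))
    world_size = int(env.get("OMPI_COMM_WORLD_SIZE", env.get("PMI_SIZE", 1)))
    # If no node-local rank is exported, fall back to the global rank.
    local_rank = int(env.get("OMPI_COMM_WORLD_LOCAL_RANK", rank))
    return rank, world_size, local_rank


def pick_gpu(local_rank, gpus_per_node):
    """Map a node-local rank onto a GPU index (hypothetical helper)."""
    return local_rank % gpus_per_node


if __name__ == "__main__":
    rank, world_size, local_rank = rank_info()
    # GPUS_PER_NODE is a hypothetical variable for this sketch; by default
    # it assumes all ranks run on one node with one GPU each.
    gpus_per_node = int(os.environ.get("GPUS_PER_NODE", world_size))
    gpu = pick_gpu(local_rank, gpus_per_node)
    print(f"rank {rank}/{world_size} -> GPU {gpu}")
```

Saved as, say, `xxx.py` and launched with `mpirun -np 2 python xxx.py`, each process would report its own rank and GPU index; the actual model loading and inference calls would go where the `print` is.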