Multi-GPU runtime error
ecilay opened this issue · 4 comments
Say I have two AIT converted models, model0 on cuda0 and model1 on cuda1.
Even if I use cudaSetDevice to load each model on its own CUDA device, at runtime, after running inference on model0 on cuda0, model1 fails to run. Once I move both models onto the same device, the problem goes away.
Is this expected? Is there any possible short-term fix? I ran the experiment on an A10G instance with 4 GPUs.
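For context, the loading pattern I'm using is roughly the following (a simplified sketch; the .so paths are placeholders, and torch.cuda.set_device is used here as the Python-side equivalent of cudaSetDevice):

```python
import torch
from aitemplate.compiler.model import Model

# Select the target GPU before loading each compiled module so the
# AIT runtime binds to the intended device.
torch.cuda.set_device(0)
model0 = Model("./model0/test.so")  # placeholder path to the compiled .so

torch.cuda.set_device(1)
model1 = Model("./model1/test.so")  # placeholder path to the compiled .so
```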
Hi @ecilay, thanks for reporting the issue. What's the error message that you got?
File "/home/test/runtime/runtime/ait/eps_ait.py", line 485, in __call__ return self.forward(
File "/home/test/runtime/runtime/ait/eps_ait.py", line 791, in forward noise_pred = self.dispatch_resolution_forward(inputs)
File "/home/test/runtime/runtime/ait/eps_ait.py", line 890, in dispatch_resolution_forward cur_engines[f"{h}x{w}"].run_with_tensors(inputs, ys, graph_mode=False)
File "/opt/conda/envs/test/lib/python3.10/site-packages/aitemplate/compiler/model.py", line 587, in run_with_tensors outputs_ait = self.run(
File "/opt/conda/envs/test/lib/python3.10/site-packages/aitemplate/compiler/model.py", line 490, in run return self._run_impl(
File "/opt/conda/envs/test/lib/python3.10/site-packages/aitemplate/compiler/model.py", line 429, in _run_impl self.DLL.AITemplateModelContainerRun(
File "/opt/conda/envs/test/lib/python3.10/site-packages/aitemplate/compiler/model.py", line 196, in _wrapped_func raise RuntimeError(f"Error in function: {method.__name__}")
RuntimeError: Error in function: AITemplateModelContainerRun
Thanks, @ecilay! Hmm, that doesn't give us much of a clue. If possible, could you share a small repro that would help us investigate? Thanks!
@chenyang78 I think you can repro by taking any two AIT models (or even the same model twice), loading them on different GPUs, and running inference to see if it works. If it does, I'd appreciate you sharing your inference scripts, thanks.
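A rough sketch of the kind of repro I have in mind (hypothetical paths, input/output names, dtypes, and shapes; adjust them to whatever the compiled graphs actually expect):

```python
import torch
from aitemplate.compiler.model import Model


def run_on_device(model, device_id, shape):
    # Hypothetical single-input/single-output graph; the tensor names
    # must match the names used when the model was compiled.
    torch.cuda.set_device(device_id)
    x = torch.randn(shape, dtype=torch.float16, device=f"cuda:{device_id}")
    y = torch.empty_like(x)
    model.run_with_tensors({"input0": x}, {"output0": y}, graph_mode=False)
    return y


torch.cuda.set_device(0)
model0 = Model("./model0/test.so")  # placeholder path
torch.cuda.set_device(1)
model1 = Model("./model1/test.so")  # placeholder path

out0 = run_on_device(model0, 0, (1, 64, 64, 4))  # runs fine on cuda0
out1 = run_on_device(model1, 1, (1, 64, 64, 4))  # fails after model0 has run on cuda0
```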