[question] gloo all_reduce on GPU
Closed this issue · 1 comments
Stonesjtu commented
Hi there, I used Gloo as a backend in PyTorch, but I found that when I initialize the Gloo backend, NCCL rings are created at the same time.
Does that mean NCCL all reduce is used in this configuration?
pietern commented
Hi @Stonesjtu! If you use the Gloo backend in PyTorch, there is currently no dependency on NCCL. Perhaps the NCCL rings are created elsewhere, for example by the initialization of torch.cuda.comm, which sets them up for within-process usage?
Closing the issue, as there is nothing that I can do about this on the Gloo side. If this is unexpected behavior, and you need guidance or a fix, please open an issue on the PyTorch repository.
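For reference, a minimal sketch of what "using the Gloo backend in PyTorch" looks like, without any NCCL involvement. This is a hypothetical single-process setup (world_size=1, loopback rendezvous address) purely for illustration; a real job would launch multiple ranks:

```python
import os
import torch
import torch.distributed as dist

# Hypothetical single-process rendezvous for illustration only.
os.environ.setdefault("MASTER_ADDR", "127.0.0.1")
os.environ.setdefault("MASTER_PORT", "29500")

# Gloo is a CPU-capable backend; initializing it does not require NCCL.
dist.init_process_group(backend="gloo", rank=0, world_size=1)

t = torch.ones(4)
# Sum across all ranks; with a single rank the tensor is unchanged.
dist.all_reduce(t)
print(t)

dist.destroy_process_group()
```

With more than one rank, each process would run the same script with its own `rank`, and `all_reduce` would sum the tensors across processes over Gloo's transport.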