The `nccl` package on conda-forge does not support stream capture
Closed this issue · 2 comments
leofang commented
To support stream capture, NCCL needs to be built with CUDA 11.3 or later:
https://github.com/NVIDIA/nccl/blob/30ca3fcacf8a73c48d7b8f7aaa54ae8bff89e884/src/enqueue.cc#L1083-L1085
See also https://docs.nvidia.com/deeplearning/nccl/user-guide/docs/usage/cudagraph.html#using-nccl-with-cuda-graphs.
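For illustration, here is a rough sketch of the version check implied above. It assumes the build-time CUDA runtime version is available as the usual `CUDART_VERSION` integer (encoded as `major * 1000 + minor * 10`, so CUDA 11.3 is `11030`). The helper names are hypothetical and are not part of NCCL or of the conda-forge recipe.

```python
from typing import Tuple

# Threshold taken from the requirement above (CUDA 11.3+), written in
# CUDART_VERSION encoding: major * 1000 + minor * 10.
MIN_CAPTURE_CUDART_VERSION = 11030


def decode_cudart_version(version: int) -> Tuple[int, int]:
    """Split a CUDART_VERSION-style integer into (major, minor)."""
    major = version // 1000
    minor = (version % 1000) // 10
    return major, minor


def built_with_capture_support(build_cudart_version: int) -> bool:
    """Hypothetical helper: True if a build made against this CUDA runtime
    version would meet the CUDA 11.3+ requirement for stream capture."""
    return build_cudart_version >= MIN_CAPTURE_CUDART_VERSION


if __name__ == "__main__":
    # Compare a few example build versions against the threshold.
    for v in (11020, 11030, 11080):
        print(decode_cudart_version(v), built_with_capture_support(v))
```

The point is that the check depends on the CUDA version used at build time, not on the version installed at run time, so a package built with CUDA 11.2 would lack stream capture even on a newer runtime.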
leofang commented
I think that for CUDA 11.x we should build with the latest CUDA 11 release and make the package compatible with all CUDA 11.x versions.