NVIDIA/FasterTransformer

Could NOT find NCCL

arnavdixit opened this issue · 1 comments

While trying to build with PyTorch, I am getting a CMake error.

CMake Error at /opt/conda/envs/fastertransformer/lib/python3.8/site-packages/cmake/data/share/cmake-3.26/Modules/FindPackageHandleStandardArgs.cmake:230 (message):
  Could NOT find NCCL (missing: NCCL_INCLUDE_DIRS NCCL_LIBRARIES)
Call Stack (most recent call first):
  /opt/conda/envs/fastertransformer/lib/python3.8/site-packages/cmake/data/share/cmake-3.26/Modules/FindPackageHandleStandardArgs.cmake:600 (_FPHSA_FAILURE_MESSAGE)
  cmake/Modules/FindNCCL.cmake:126 (find_package_handle_standard_args)
  CMakeLists.txt:84 (find_package)

I checked, and NCCL does appear to be installed. Here is the command I ran and its output:
python -c "import torch;print(torch.cuda.nccl.version())"
Output: (2, 14, 3)

I'm not sure what the issue is.

Set the environment variables that FindNCCL.cmake reads: https://github.com/NVIDIA/FasterTransformer/blob/main/cmake/Modules/FindNCCL.cmake#L93

In my case, they were:

$ export NCCL_INCLUDE_DIR=/usr/local/nccl2/include
$ export NCCL_LIB_DIR=/usr/local/nccl2/lib
$ export NCCL_VERSION=2
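
If you don't know where NCCL lives on your machine, something like the sketch below may help. It searches the directories you give it for nccl.h and libnccl.so* and exports the two path variables from the exports above. The helper name find_nccl and the search roots are my own assumptions, not part of FasterTransformer, so adjust them to your system.

```shell
#!/bin/sh
# Sketch only: find_nccl is a hypothetical helper, not part of FasterTransformer.
# It searches the given directories for the NCCL header and shared library
# and exports NCCL_INCLUDE_DIR and NCCL_LIB_DIR when both are found.
find_nccl() {
    # Take the first match of each; with several NCCL installs, pick by hand.
    header=$(find "$@" -name nccl.h 2>/dev/null | head -n 1)
    lib=$(find "$@" -name 'libnccl.so*' 2>/dev/null | head -n 1)

    if [ -z "$header" ] || [ -z "$lib" ]; then
        echo "NCCL header or library not found under: $*" >&2
        return 1
    fi

    NCCL_INCLUDE_DIR=$(dirname "$header")
    NCCL_LIB_DIR=$(dirname "$lib")
    export NCCL_INCLUDE_DIR NCCL_LIB_DIR
}

# Example call (search roots are assumptions; adjust as needed):
#   find_nccl /usr /usr/local && echo "$NCCL_INCLUDE_DIR $NCCL_LIB_DIR"
```

If the search finds nothing, the NCCL development files are probably not installed system-wide. The version reported by torch.cuda.nccl.version() is the NCCL that PyTorch was built with, which does not by itself mean that nccl.h and libnccl are somewhere CMake can see them. After exporting the paths, set NCCL_VERSION as above and re-run cmake from a clean build directory so stale cache entries are not reused.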