bjing2016/alphaflow

Issues with CUDA 12 and/or G++17

Raul-araya-secchi opened this issue · 2 comments

Hi,

I'm trying to install AlphaFlow on a machine with A30 GPUs and CUDA 12.1. Even though I found a compatible PyTorch version, I get the following error after running the command: pip install 'openfold @ git+https://github.com/aqlaboratory/openfold.git@103d037':

"In file included from /home/raraya/.conda/envs/alpha_flow/lib/python3.9/site-packages/torch/include/torch/extension.h:5,
from openfold/utils/kernel/csrc/softmax_cuda_kernel.cu:18:
/home/raraya/.conda/envs/alpha_flow/lib/python3.9/site-packages/torch/include/torch/csrc/api/include/torch/all.h:4:2: error: #error C++17 or later compatible compiler is required to use PyTorch.
4 | #error C++17 or later compatible compiler is required to use PyTorch.
| ^~~~~
In file included from /home/raraya/.conda/envs/alpha_flow/lib/python3.9/site-packages/torch/include/c10/util/string_view.h:4,
from /home/raraya/.conda/envs/alpha_flow/lib/python3.9/site-packages/torch/include/c10/util/StringUtil.h:6,
from /home/raraya/.conda/envs/alpha_flow/lib/python3.9/site-packages/torch/include/c10/util/Exception.h:5,
from /home/raraya/.conda/envs/alpha_flow/lib/python3.9/site-packages/torch/include/c10/core/Device.h:5,
from /home/raraya/.conda/envs/alpha_flow/lib/python3.9/site-packages/torch/include/ATen/core/TensorBody.h:11,
from /home/raraya/.conda/envs/alpha_flow/lib/python3.9/site-packages/torch/include/ATen/core/Tensor.h:3,
from /home/raraya/.conda/envs/alpha_flow/lib/python3.9/site-packages/torch/include/ATen/Tensor.h:3,
from /home/raraya/.conda/envs/alpha_flow/lib/python3.9/site-packages/torch/include/torch/csrc/autograd/function_hook.h:3,
from /home/raraya/.conda/envs/alpha_flow/lib/python3.9/site-packages/torch/include/torch/csrc/autograd/cpp_hook.h:2,
from /home/raraya/.conda/envs/alpha_flow/lib/python3.9/site-packages/torch/include/torch/csrc/autograd/variable.h:6,
from /home/raraya/.conda/envs/alpha_flow/lib/python3.9/site-packages/torch/include/torch/csrc/autograd/autograd.h:3,
from /home/raraya/.conda/envs/alpha_flow/lib/python3.9/site-packages/torch/include/torch/csrc/api/include/torch/autograd.h:3,
from /home/raraya/.conda/envs/alpha_flow/lib/python3.9/site-packages/torch/include/torch/csrc/api/include/torch/all.h:7,
from /home/raraya/.conda/envs/alpha_flow/lib/python3.9/site-packages/torch/include/torch/extension.h:5,
from openfold/utils/kernel/csrc/softmax_cuda_kernel.cu:18:
/home/raraya/.conda/envs/alpha_flow/lib/python3.9/site-packages/torch/include/c10/util/C++17.h:27:2: error: #error You need C++17 to compile PyTorch
27 | #error You need C++17 to compile PyTorch
| ^~~~~
In file included from /home/raraya/.conda/envs/alpha_flow/lib/python3.9/site-packages/torch/include/torch/csrc/api/include/torch/types.h:3,
from /home/raraya/.conda/envs/alpha_flow/lib/python3.9/site-packages/torch/include/torch/csrc/api/include/torch/data/dataloader_options.h:4,
from /home/raraya/.conda/envs/alpha_flow/lib/python3.9/site-packages/torch/include/torch/csrc/api/include/torch/data/dataloader/base.h:3,
from /home/raraya/.conda/envs/alpha_flow/lib/python3.9/site-packages/torch/include/torch/csrc/api/include/torch/data/dataloader/stateful.h:4,
from /home/raraya/.conda/envs/alpha_flow/lib/python3.9/site-packages/torch/include/torch/csrc/api/include/torch/data/dataloader.h:3,
from /home/raraya/.conda/envs/alpha_flow/lib/python3.9/site-packages/torch/include/torch/csrc/api/include/torch/data.h:3,
from /home/raraya/.conda/envs/alpha_flow/lib/python3.9/site-packages/torch/include/torch/csrc/api/include/torch/all.h:9,
from /home/raraya/.conda/envs/alpha_flow/lib/python3.9/site-packages/torch/include/torch/extension.h:5,
from openfold/utils/kernel/csrc/softmax_cuda_kernel.cu:18:
/home/raraya/.conda/envs/alpha_flow/lib/python3.9/site-packages/torch/include/ATen/ATen.h:4:2: error: #error C++17 or later compatible compiler is required to use ATen.
4 | #error C++17 or later compatible compiler is required to use ATen.
| ^~~~~
error: command '/usr/local/cuda/bin/nvcc' failed with exit code 1"

My gcc version is 11.3.
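A quick way to narrow this down is to check which C++ standard the host compiler uses when no -std flag is passed. GCC 11 defaults to C++17, so if the check below reports 201703L, the compiler itself is probably not the limiting factor, and the error more likely comes from an explicit older -std flag somewhere in the extension's build settings (that last part is an assumption, not something the log confirms). The function name default_cxx_standard is made up for this sketch.

```shell
#!/usr/bin/env bash
# Print the C++ standard a compiler uses when no -std flag is given.
# default_cxx_standard is a hypothetical helper name, not part of any tool.
default_cxx_standard() {
    local cxx="${1:-g++}"
    if ! command -v "$cxx" >/dev/null 2>&1; then
        echo "$cxx: not found on PATH"
        return 0
    fi
    # -dM -E dumps the predefined macros; __cplusplus encodes the standard
    # (201402L = C++14, 201703L = C++17).
    "$cxx" -dM -E -x c++ /dev/null | grep '__cplusplus'
}

default_cxx_standard g++
```

If the reported value is already 201703L or higher, upgrading gcc is unlikely to change the outcome on its own.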

If I'm understanding correctly, CUDA versions can be installed through conda as well [1].

In my case, it was kind of straightforward [2] for me to have an environment with CUDA 11.3 through this command:
conda install nvidia/label/cuda-11.3.1::cuda

I'm not sure whether it conflicts with a global CUDA installation, though.

[1] - https://twitter.com/jeremyphoward/status/1697435246415974747
[2] - apart from some openfold installation errors that I encountered like aqlaboratory/openfold#293 (comment)
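On the question of a conda CUDA clashing with a global one: the log in the original post shows the build calling /usr/local/cuda/bin/nvcc, i.e. the system-wide toolkit rather than anything inside the conda environment. A minimal sketch for checking which nvcc would be picked up is below. It assumes the build honours CUDA_HOME and otherwise falls back to whatever nvcc is on PATH, which is my understanding of how PyTorch's extension builder behaves but is worth verifying for your version. The function name report_nvcc is made up for this sketch.

```shell
#!/usr/bin/env bash
# Show which CUDA compiler and environment a build is likely to see.
# report_nvcc is a hypothetical helper name, not part of any tool.
report_nvcc() {
    if command -v nvcc >/dev/null 2>&1; then
        echo "nvcc: $(command -v nvcc)"
        # The last lines of `nvcc --version` carry the release number.
        nvcc --version | tail -n 2
    else
        echo "nvcc: not found on PATH"
    fi
    # If CUDA_HOME is unset, the build may fall back to a system path.
    echo "CUDA_HOME=${CUDA_HOME:-<unset>}"
    echo "CONDA_PREFIX=${CONDA_PREFIX:-<unset>}"
}

report_nvcc
```

If the nvcc path is outside CONDA_PREFIX after activating the environment, the conda-installed toolkit is probably not the one being used, and pointing CUDA_HOME at the environment before running pip install may be worth trying.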

The README has now been updated with detailed instructions for (hopefully) reproducible installation of CUDA 11.8 in a Conda environment!