pytorch/extension-cpp

CUDA version Error

hao6699 opened this issue · 3 comments

When submitting a bug report, please include the following information (where relevant):

  • OS: Ubuntu 18.04
  • PyTorch version: 1.2.0
  • How you installed PyTorch (conda, pip, source): pip
  • Python version: 3.6.9
  • CUDA/cuDNN version: 10.1/7.4.2
  • GPU models and configuration:
  • GCC version (if compiling from source): gcc version 7.4.0

Using /tmp/torch_extensions as PyTorch extensions root...
Detected CUDA files, patching ldflags
Emitting ninja build file /tmp/torch_extensions/lltm_cuda/build.ninja...
Building extension module lltm_cuda...
[1/3] /usr/local/cuda-9.0/bin/nvcc -DTORCH_EXTENSION_NAME=lltm_cuda -DTORCH_API_INCLUDE_EXTENSION_H -isystem /usr/local/lib/python3.6/dist-packages/torch/include -isystem /usr/local/lib/python3.6/dist-packages/torch/include/torch/csrc/api/include -isystem /usr/local/lib/python3.6/dist-packages/torch/include/TH -isystem /usr/local/lib/python3.6/dist-packages/torch/include/THC -isystem /usr/local/cuda-9.0/include -isystem /usr/include/python3.6m -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr --compiler-options '-fPIC' -std=c++11 -c /home/youmi/PycharmProjects/roi_pooling-master/lltm_cuda_kernel.cu -o lltm_cuda_kernel.cuda.o
FAILED: lltm_cuda_kernel.cuda.o
/usr/local/cuda-9.0/bin/nvcc -DTORCH_EXTENSION_NAME=lltm_cuda -DTORCH_API_INCLUDE_EXTENSION_H -isystem /usr/local/lib/python3.6/dist-packages/torch/include -isystem /usr/local/lib/python3.6/dist-packages/torch/include/torch/csrc/api/include -isystem /usr/local/lib/python3.6/dist-packages/torch/include/TH -isystem /usr/local/lib/python3.6/dist-packages/torch/include/THC -isystem /usr/local/cuda-9.0/include -isystem /usr/include/python3.6m -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr --compiler-options '-fPIC' -std=c++11 -c /home/youmi/PycharmProjects/roi_pooling-master/lltm_cuda_kernel.cu -o lltm_cuda_kernel.cuda.o
/bin/sh: 1: /usr/local/cuda-9.0/bin/nvcc: not found
[2/3] c++ -MMD -MF lltm_cuda.o.d -DTORCH_EXTENSION_NAME=lltm_cuda -DTORCH_API_INCLUDE_EXTENSION_H -isystem /usr/local/lib/python3.6/dist-packages/torch/include -isystem /usr/local/lib/python3.6/dist-packages/torch/include/torch/csrc/api/include -isystem /usr/local/lib/python3.6/dist-packages/torch/include/TH -isystem /usr/local/lib/python3.6/dist-packages/torch/include/THC -isystem /usr/local/cuda-9.0/include -isystem /usr/include/python3.6m -D_GLIBCXX_USE_CXX11_ABI=0 -fPIC -std=c++11 -c /home/youmi/PycharmProjects/roi_pooling-master/lltm_cuda.cpp -o lltm_cuda.o
FAILED: lltm_cuda.o
c++ -MMD -MF lltm_cuda.o.d -DTORCH_EXTENSION_NAME=lltm_cuda -DTORCH_API_INCLUDE_EXTENSION_H -isystem /usr/local/lib/python3.6/dist-packages/torch/include -isystem /usr/local/lib/python3.6/dist-packages/torch/include/torch/csrc/api/include -isystem /usr/local/lib/python3.6/dist-packages/torch/include/TH -isystem /usr/local/lib/python3.6/dist-packages/torch/include/THC -isystem /usr/local/cuda-9.0/include -isystem /usr/include/python3.6m -D_GLIBCXX_USE_CXX11_ABI=0 -fPIC -std=c++11 -c /home/youmi/PycharmProjects/roi_pooling-master/lltm_cuda.cpp -o lltm_cuda.o
/home/youmi/PycharmProjects/roi_pooling-master/lltm_cuda.cpp:78:16: error: expected constructor, destructor, or type conversion before ‘(’ token
PYBIND11_MODULE(TORCH_EXTENSION_NAME, m) {
^
ninja: build stopped: subcommand failed.

I installed CUDA 10.1, so why does building lltm_cuda_kernel.cu automatically use CUDA 9.0?

For your CUDA version problem:

You most likely have both CUDA 10 and CUDA 9 installed.
What is the output of nvcc --version?
See here about how the PyTorch extension tries to find CUDA.

Apart from uninstalling CUDA 9, you can try setting the CUDA_HOME environment variable to your CUDA 10.1 directory. I'll assume it's the usual /usr/local/cuda-10.1:

export CUDA_HOME=/usr/local/cuda-10.1
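
To make it clearer why setting CUDA_HOME helps, here is a rough sketch of the kind of lookup the extension builder performs. It is an approximation, not PyTorch's actual code: the function name guess_cuda_home and its injectable parameters are hypothetical, and the order of the last two fallbacks differs between PyTorch releases. The one point it relies on is that an explicit environment variable is consulted first.

```python
import os
import shutil


def guess_cuda_home(env=None, exists=os.path.exists, which=shutil.which):
    """Approximate CUDA root lookup (hypothetical helper, not PyTorch's own code)."""
    env = os.environ if env is None else env

    # 1. An explicit CUDA_HOME (or CUDA_PATH) wins, even if it points at a
    #    directory that no longer exists.
    home = env.get("CUDA_HOME") or env.get("CUDA_PATH")
    if home:
        return home

    # 2. Otherwise derive the root from nvcc on PATH: <root>/bin/nvcc -> <root>.
    nvcc = which("nvcc")
    if nvcc:
        return os.path.dirname(os.path.dirname(nvcc))

    # 3. Fall back to the conventional default location, if present.
    default = "/usr/local/cuda"
    return default if exists(default) else None


if __name__ == "__main__":
    print("CUDA root guess:", guess_cuda_home())
```

If this model is right, a leftover CUDA_HOME=/usr/local/cuda-9.0 from an earlier install would be one plausible explanation for the nvcc path in the log above. The value PyTorch itself resolved can be read from torch.utils.cpp_extension.CUDA_HOME.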

/bin/sh: 1: /usr/local/cuda-9.0/bin/nvcc: not found

There is a : after the 1.
That is the cause of this problem. I have encountered it before.
So you need the following steps:

  1. vi ~/.bashrc
  2. add export CUDA_HOME="/usr/local/cuda-10.1"
  3. add export PATH="$PATH:/usr/local/cuda/bin"
  4. add export LD_LIBRARY_PATH="$LD_LIBRARY_PATH:/usr/local/cuda/lib64"
  5. source ~/.bashrc
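
After sourcing ~/.bashrc, it may help to confirm that the variables actually point at a toolkit that contains nvcc. The following is a minimal sketch for that check; check_cuda_env is a hypothetical helper, and it only inspects CUDA_HOME and PATH, not LD_LIBRARY_PATH.

```python
import os


def check_cuda_env(env=None):
    """Return a list of problems found in CUDA_HOME and PATH (hypothetical helper)."""
    env = os.environ if env is None else env
    problems = []

    # CUDA_HOME should be set and contain bin/nvcc.
    home = env.get("CUDA_HOME")
    if not home:
        problems.append("CUDA_HOME is not set")
    else:
        nvcc = os.path.join(home, "bin", "nvcc")
        if not os.path.isfile(nvcc):
            problems.append("no nvcc at " + nvcc)

    # At least one PATH entry should contain an nvcc binary.
    path_dirs = [d for d in env.get("PATH", "").split(os.pathsep) if d]
    if not any(os.path.isfile(os.path.join(d, "nvcc")) for d in path_dirs):
        problems.append("nvcc not found on PATH")

    return problems


if __name__ == "__main__":
    for problem in check_cuda_env() or ["CUDA_HOME and PATH look consistent"]:
        print(problem)
```

An empty list means both checks passed. A "no nvcc at ..." entry corresponds to the "not found" line in the build log.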

Thanks bro! You saved my sleep tonight 🥹.