rachtsingh/lgamma

build issues - cuda headers

Closed this issue · 1 comment

Hey, thanks for setting this up!
I can't seem to build the project, though: compilation keeps failing with an error about the location of the CUDA headers.

$ ./make.sh
Compiling functions using nvcc...
Compiled, now linking...
Generating sanitized versions of internals for C compilation...
Building python interface to CUDA code
Including CUDA code.
generating /tmp/tmpb1XVOD/_functions.c
running build_ext
building '_functions' extension
creating home
creating home/<u>
creating home/<u>/repos
creating home/<u>/repos/lgamma
creating home/<u>/repos/lgamma/src
x86_64-pc-linux-gnu-gcc -pthread -fPIC -DWITH_CUDA -I/home/<u>/.virtualenvs/pytorch/lib/python2.7/site-packages/torch/utils/ffi/../../lib/include -I/home/<u>/.virtualenvs/pytorch/lib/python2.7/site-packages/torch/utils/ffi/../../lib/include/TH -I/home/<u>/.virtualenvs/pytorch/lib/python2.7/site-packages/torch/utils/ffi/../../lib/include/THC -I/usr/include/python2.7 -c _functions.c -o ./_functions.o -std=gnu11
In file included from /home/<u>/.virtualenvs/pytorch/lib/python2.7/site-packages/torch/utils/ffi/../../lib/include/THC/THC.h:4:0,
                 from _functions.c:434:
/home/<u>/.virtualenvs/pytorch/lib/python2.7/site-packages/torch/utils/ffi/../../lib/include/THC/THCGeneral.h:9:18: fatal error: cuda.h: No such file or directory
 #include "cuda.h"
                  ^
compilation terminated.
Traceback (most recent call last):
  File "build.py", line 44, in <module>
    ffi.build()
  File "/home/<u>/.virtualenvs/pytorch/lib/python2.7/site-packages/torch/utils/ffi/__init__.py", line 164, in build
    _build_extension(ffi, cffi_wrapper_name, target_dir, verbose)
  File "/home/<u>/.virtualenvs/pytorch/lib/python2.7/site-packages/torch/utils/ffi/__init__.py", line 100, in _build_extension
    ffi.compile(tmpdir=tmpdir, verbose=verbose, target=libname)
  File "/home/<u>/.virtualenvs/pytorch/lib/python2.7/site-packages/cffi/api.py", line 684, in compile
    compiler_verbose=verbose, debug=debug, **kwds)
  File "/home/<u>/.virtualenvs/pytorch/lib/python2.7/site-packages/cffi/recompiler.py", line 1484, in recompile
    compiler_verbose, debug)
  File "/home/<u>/.virtualenvs/pytorch/lib/python2.7/site-packages/cffi/ffiplatform.py", line 20, in compile
    outputfilename = _build(tmpdir, ext, compiler_verbose, debug)
  File "/home/<u>/.virtualenvs/pytorch/lib/python2.7/site-packages/cffi/ffiplatform.py", line 56, in _build
    raise VerificationError('%s: %s' % (e.__class__.__name__, e))
cffi.error.VerificationError: CompileError: command 'x86_64-pc-linux-gnu-gcc' failed with exit status 1

I did change make.sh to point at the correct CUDA location as well. (I also removed the hard-coded GPU architecture so the compiler can autodetect it; that's what has worked for me before.)
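As an aside, if hard-coding ever becomes necessary again, the -arch value can be derived from the card's compute capability (as reported by deviceQuery or torch.cuda.get_device_capability()) rather than guessed. A minimal sketch of the string mapping; the helper name and the single-GPU assumption are mine:

```shell
# Hypothetical helper: turn a compute-capability string such as "5.2"
# into the matching nvcc flag value "sm_52" by stripping the dot.
cap_to_arch() {
    printf 'sm_%s\n' "$(printf '%s' "$1" | tr -d '.')"
}

# e.g.  nvcc -c -dc --shared ... -arch="$(cap_to_arch 5.2)" ...
cap_to_arch "5.2"   # prints sm_52
```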

diff --git a/make.sh b/make.sh
index c26d274..a514923 100755
--- a/make.sh
+++ b/make.sh
@@ -11,13 +11,13 @@ rm -f internals_s.c internals_s.h
 echo "Compiling functions using nvcc..."
 
 # force compilation in CUDA/C++ mode
-nvcc -c -dc --shared functions_cuda_kernel.cu -x cu -arch=sm_35 -Xcompiler -fPIC -lcudadevrt -lcudart -o functions_cuda_kernel.cu.o -D __BOTH__='__device__ __host__'
-nvcc -c -dc --shared internals.c -x cu -arch=sm_35 -Xcompiler -fPIC -lcudadevrt -lcudart -o internals.cu.o -D __BOTH__='__device__ __host__' -include cfloat
+nvcc -c -dc --shared functions_cuda_kernel.cu -x cu -Xcompiler -fPIC -lcudadevrt -lcudart -o functions_cuda_kernel.cu.o -D __BOTH__='__device__ __host__'
+nvcc -c -dc --shared internals.c -x cu -Xcompiler -fPIC -lcudadevrt -lcudart -o internals.cu.o -D __BOTH__='__device__ __host__' -include cfloat
 
 echo "Compiled, now linking..."
 
 # required intermediate device code link step
-nvcc -arch=sm_35 -dlink functions_cuda_kernel.cu.o internals.cu.o -o functions.link.cu.o -Xcompiler -fPIC -lcudadevrt -lcudart
+nvcc -dlink functions_cuda_kernel.cu.o internals.cu.o -o functions.link.cu.o -Xcompiler -fPIC -lcudadevrt -lcudart
 
 echo "Generating sanitized versions of internals for C compilation..."
 
@@ -27,4 +27,4 @@ sed "s/__BOTH__//" internals.h > internals_s.h
 cd ../
 
 echo "Building python interface to CUDA code"
-python build.py
+python build.py --cuda-path /opt/cuda/lib

It doesn't appear to matter whether the path is /opt/cuda, /opt/cuda/lib, or even the PyTorch-specific CUDA path from the locate call below; I get the same error shown above either way.

$ locate libcudart.so
/home/<u>/.virtualenvs/pytorch/lib/python2.7/site-packages/torch/lib/libcudart.so
/home/<u>/.virtualenvs/pytorch/lib/python2.7/site-packages/torch/lib/libcudart.so.7.5
/opt/cuda/lib/libcudart.so
/opt/cuda/lib/libcudart.so.7.5
/opt/cuda/lib/libcudart.so.7.5.18
/opt/cuda/lib64/libcudart.so
/opt/cuda/lib64/libcudart.so.7.5
/opt/cuda/lib64/libcudart.so.7.5.18
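One thing worth checking here: the fatal error is about a header (cuda.h), so the path the compiler is missing is an include directory, not any of the lib directories above. A minimal check along these lines (the helper name is my own):

```shell
# Return success iff a CUDA prefix actually ships the header gcc wants.
has_cuda_h() {
    [ -f "$1/include/cuda.h" ]
}

# e.g.  has_cuda_h /opt/cuda && echo "pass -I/opt/cuda/include to gcc"
```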

Closing this issue.
Creating a symlink at /usr/local/cuda pointing to /opt/cuda fixed things.
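The fix itself is a one-liner; sketched here as a function so the same command can be tried against other layouts (the function name and the root-privileges caveat are mine):

```shell
# Expose the real CUDA install at the conventional prefix the build
# scripts expect. On this machine that means linking /usr/local/cuda
# to /opt/cuda; creating the link usually needs root (e.g. via sudo).
link_cuda_prefix() {
    ln -s "$1" "$2"   # $1 = real install dir, $2 = expected prefix
}

# e.g.  link_cuda_prefix /opt/cuda /usr/local/cuda
```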

Here's what I changed in make.sh to get things working after setting up the symlink.

diff --git a/make.sh b/make.sh
index c26d274..2bf1ebf 100755
--- a/make.sh
+++ b/make.sh
@@ -1,30 +1,30 @@
 #!/usr/bin/env bash
 
-CUDA_PATH=/usr/local/cuda/
+CUDA_PATH=/usr/local/cuda/lib64
 
-cd src
+cd src || exit
 
 # clean everything from before
-rm -f *.o *.so
+rm -f ./*.o ./*.so
 rm -f internals_s.c internals_s.h
 
 echo "Compiling functions using nvcc..."
 
 # force compilation in CUDA/C++ mode
-nvcc -c -dc --shared functions_cuda_kernel.cu -x cu -arch=sm_35 -Xcompiler -fPIC -lcudadevrt -lcudart -o functions_cuda_kernel.cu.o -D __BOTH__='__device__ __host__'
-nvcc -c -dc --shared internals.c -x cu -arch=sm_35 -Xcompiler -fPIC -lcudadevrt -lcudart -o internals.cu.o -D __BOTH__='__device__ __host__' -include cfloat
+nvcc -c -dc --shared functions_cuda_kernel.cu -x cu -arch=sm_52 -Xcompiler -fPIC -lcudadevrt -lcudart -o functions_cuda_kernel.cu.o -D __BOTH__='__device__ __host__'
+nvcc -c -dc --shared internals.c -x cu -arch=sm_52 -Xcompiler -fPIC -lcudadevrt -lcudart -o internals.cu.o -D __BOTH__='__device__ __host__' -include cfloat
 
 echo "Compiled, now linking..."
 
 # required intermediate device code link step
-nvcc -arch=sm_35 -dlink functions_cuda_kernel.cu.o internals.cu.o -o functions.link.cu.o -Xcompiler -fPIC -lcudadevrt -lcudart
+nvcc -arch=sm_52 -dlink functions_cuda_kernel.cu.o internals.cu.o -o functions.link.cu.o -Xcompiler -fPIC -lcudadevrt -lcudart
 
 echo "Generating sanitized versions of internals for C compilation..."
 
 echo "#include <float.h>" | cat - internals.c | sed "s/__BOTH__//g" | sed "s/internals.h/internals_s.h/g" > internals_s.c
 sed "s/__BOTH__//" internals.h > internals_s.h
 
-cd ../
+cd ../ || exit
 
 echo "Building python interface to CUDA code"
-python build.py
+python build.py --cuda-path=$CUDA_PATH
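On top of that, a guard like the following would have surfaced the wrong CUDA_PATH immediately, instead of letting it fail deep inside the cffi build (function name is my own):

```shell
# Fail fast if the configured CUDA_PATH has no runtime library in it,
# printing a message instead of letting nvcc/gcc error out much later.
check_cuda_path() {
    if [ ! -f "$1/libcudart.so" ]; then
        echo "libcudart.so not found under $1" >&2
        return 1
    fi
}

# e.g. near the top of make.sh:  check_cuda_path "$CUDA_PATH" || exit 1
```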