CMake+nvcc+msvc==pure_chaos. I learned it the hard way so you don't have to.
A starting point for GPU-accelerated Python libraries.
Adapted from the original work at https://github.com/PWhiddy/pybind11-cuda
This version uses a modern CMake/CUDA approach.
CUDA
Python 3.6 or greater
CMake >= 3.18 (for CUDA support and the new FindPython3 module)
You can set the CMake variable CMAKE_CUDA_ARCHITECTURES instead of passing architecture flags through CUDAFLAGS:
mkdir build; cd build
# provide a default cuda hardware architecture to build for
cmake -DCMAKE_CUDA_ARCHITECTURES="75" ..
make
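If you are unsure which value to pass for CMAKE_CUDA_ARCHITECTURES, the following is a minimal sketch of a helper that derives it from the installed GPU. It is not part of this repository. It assumes a reasonably recent NVIDIA driver whose `nvidia-smi` supports the `compute_cap` query field; the function names are hypothetical, and the fallback value "75" simply mirrors the example above.

```python
import re
import subprocess


def arch_from_compute_cap(cap):
    """Convert a compute capability such as "7.5" into the CMake form "75"."""
    match = re.fullmatch(r"(\d+)\.(\d+)", cap.strip())
    if match is None:
        raise ValueError(f"unexpected compute capability: {cap!r}")
    return match.group(1) + match.group(2)


def detect_archs():
    """Ask nvidia-smi for the compute capability of each visible GPU.

    Returns a sorted list of unique architecture strings, or an empty list
    when nvidia-smi is missing, fails, or does not support the query.
    """
    cmd = ["nvidia-smi", "--query-gpu=compute_cap", "--format=csv,noheader"]
    try:
        result = subprocess.run(
            cmd, capture_output=True, text=True, check=True, timeout=10
        )
    except (OSError, subprocess.SubprocessError):
        return []
    archs = set()
    for line in result.stdout.splitlines():
        try:
            archs.add(arch_from_compute_cap(line))
        except ValueError:
            continue  # skip blank lines or error text
    return sorted(archs, key=int)


def cmake_arch_flag(archs, default="75"):
    """Build the -D flag; fall back to `default` when nothing was detected."""
    value = ";".join(archs) if archs else default
    return "-DCMAKE_CUDA_ARCHITECTURES=" + value


if __name__ == "__main__":
    # Prints a flag that can be passed to cmake in place of the fixed "75".
    print(cmake_arch_flag(detect_archs()))
```

When several architectures are detected the value contains a semicolon, so quote the flag when passing it to cmake from a shell.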
Test it with
python3 ./src/test_cxx_module.py
- Compiles out of the box with CMake, even on Windows with MSVC
- Easy-to-modify demos that offer a modern C++ experience through libraries such as Thrust and cutlass
- NumPy integration
- C++ templating for composable kernels with generic data types
- The search order for cuDNN in cutlass is a bit surprising as of now (v2.10.0). It is recommended to copy your desired version of cuDNN into your current CUDA directory, and to take note of the detected path reported by cutlass's CMakeLists.txt.
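Regarding the NumPy integration listed above: pybind11 bindings that take `py::array_t<T>` typically convert an array of a different dtype or memory layout into a temporary copy, so results written in place may never reach the caller's array. The sketch below shows one way to normalize arrays on the Python side first. It assumes a kernel that expects float32 data, and `scale_in_place` is a hypothetical CPU stand-in, not a function exported by this repository.

```python
import numpy as np


def as_kernel_input(values, dtype=np.float32):
    """Return a C-contiguous array of `dtype`, copying only when needed."""
    return np.ascontiguousarray(values, dtype=dtype)


def scale_in_place(vec, scalar):
    """CPU stand-in for a hypothetical in-place GPU binding."""
    vec *= scalar


if __name__ == "__main__":
    # A transposed float64 view: wrong dtype and not C-contiguous.
    raw = np.arange(6, dtype=np.float64).reshape(2, 3).T
    vec = as_kernel_input(raw)
    scale_in_place(vec, 2.0)
    print(vec.dtype, vec.flags["C_CONTIGUOUS"])
```

Keeping a reference to the normalized array (`vec` here) matters: it is the object an in-place kernel would modify, not the original `raw`.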
Originally based on https://github.com/torstem/demo-cuda-pybind11