See the accompanying post on the NVIDIA Developer Blog here.
These notebooks demonstrate how to accelerate Python code on the GPU using Cython and nvc++ with stdpar.
- First, you'll need the NVIDIA HPC SDK, which provides the `nvc++` compiler. Version 20.9 or later is required to run these examples. Unless your NVIDIA driver supports CUDA 11.0, download the SDK package that is bundled with the two previous CUDA versions (10.1 and 10.2). Once it is installed, make sure the `nvc++` executable is in your PATH. Your GPU must also have CUDA compute capability >= 6.0 to use the `-stdpar` feature.

- You will also need the development version of Cython. The simplest way to get the minimum required version is to use `pip`:

      python -m pip install git+https://github.com/cython/cython@90684ac416f0349761074e242be4d981de40ce0f

- Install the Python dependencies:

      python -m pip install numpy pandas matplotlib

- This step is optional. To run the CPU parallel benchmarks, you will need `gcc >= 9.1` as well as the TBB library. On Ubuntu 20.04, `gcc-9` should already be the default, and I ran `apt install libtbb-dev` to get TBB.
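The prerequisites above can be sanity-checked before opening the notebooks. The script below is only a sketch and is not part of this repository: the helper names (`parse_version`, `meets_minimum`, `nvcpp_version`, `check_prerequisites`) are made up for illustration, and it assumes that `nvc++ --version` prints a release number such as `20.9` as the first `major.minor` pair in its output. It does not check the GPU's compute capability or the TBB installation.

```python
import importlib.util
import re
import shutil
import subprocess

# Minimum HPC SDK release named in the steps above.
MIN_NVCPP = (20, 9)


def parse_version(text):
    """Return the first 'major.minor' pair in text as a tuple of ints, or None."""
    match = re.search(r"(\d+)\.(\d+)", text)
    return (int(match.group(1)), int(match.group(2))) if match else None


def meets_minimum(version, minimum):
    """True if a parsed version tuple is at least the required minimum."""
    return version is not None and version >= minimum


def nvcpp_version():
    """Ask nvc++ for its version; returns None if nvc++ is not on PATH.

    Assumption: the release number is the first 'major.minor' pair printed.
    """
    exe = shutil.which("nvc++")
    if exe is None:
        return None
    out = subprocess.run([exe, "--version"], capture_output=True, text=True).stdout
    return parse_version(out)


def check_prerequisites():
    """Return a dict mapping each prerequisite to True or False."""
    results = {
        "nvc++ on PATH": shutil.which("nvc++") is not None,
        "nvc++ >= 20.9": meets_minimum(nvcpp_version(), MIN_NVCPP),
        "gcc on PATH (optional)": shutil.which("gcc") is not None,
    }
    # find_spec locates a top-level module without importing it.
    for module in ("Cython", "numpy", "pandas", "matplotlib"):
        results[f"{module} installed"] = importlib.util.find_spec(module) is not None
    return results


if __name__ == "__main__":
    for name, ok in check_prerequisites().items():
        print(f"{'ok     ' if ok else 'MISSING'}  {name}")
```

Run it with the same Python interpreter you used for the `pip` commands, so that the module checks look at the right environment.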