Numba CUDA intro
Numba CUDA tutorial for the parallel computing class at the HKA
Up-to-date version: https://github.com/daniel-vera-g/numba-cuda-intro
- TL;DR 👉: Online Quickstart
- Explore in an online editor (Google account needed to run 💡):
Requirements
- Python 3 with at least miniconda installed
- Optionally a CUDA GPU
Setup
Automated:
- Create conda environment and install packages if needed:
./run.sh
- Same as above and start jupyter notebook:
./run.sh --jupyter
Manual:
- Create an environment and install the dependencies:
conda env create --name parallele -f environment.yml
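NOTE: If this step fails, create an empty environment instead (conda env create reads environment.yml by default, so leaving out the -f flag alone changes nothing):
conda create --name parallele
Then activate the environment and install the necessary dependencies manually:
conda install numba numpy jupyterlab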
- If not already done, activate the environment:
conda activate parallele
- Start jupyter notebook:
jupyter-notebook
Additional notes
- To save newly installed dependencies, run
conda env export > environment.yml
- If you don't have a CUDA-capable GPU, you can enable Numba's CUDA simulator by starting Jupyter Notebook with the environment variable NUMBA_ENABLE_CUDASIM=1 set:
NUMBA_ENABLE_CUDASIM=1 jupyter-notebook
Contents
CUDA concepts in Numba:
- What Kernels are and how they work: Kernels
- How to manage memory when doing operations: Memory management
- How to debug Numba code: Debugging
- Other useful CUDA features in Numba: Other Numba features
References
This tutorial uses the following references:
- https://numba.pydata.org/numba-doc/dev/index.html
- https://people.duke.edu/~ccc14/sta-663/CUDAPython.html
- https://nyu-cds.github.io/python-numba/05-cuda/
- Used image: https://www.researchgate.net/figure/Figure-2-Execution-model-of-a-CUDA-program-on-NVidias-GPU-Hierarchy-grid-blocks-and_fig2_321666991
Additional:
- Thread indexing cheat sheet: https://cs.calvin.edu/courses/cs/374/CUDA/CUDA-Thread-Indexing-Cheatsheet.pdf