This is my experiment with CUDA Multi-Device Synchronization, done before attempting to make this functionality available in Julia.
First, there are only three CUDA files on GitHub (useful or not) that actually use CUDA multi-grid synchronization:
- the official conjugateGradientMultiDeviceCG.cu by NVIDIA.
- an experiment by another user.
- a test by another user.
and now additionally:
- this repo you are looking at.
To learn how to use cudaLaunchCooperativeKernelMultiDevice, which is quite different from cudaLaunchCooperativeKernel or cudaLaunchKernel, go to C.4. Multi-Device Synchronization in the CUDA Toolkit Documentation.
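As a rough orientation before reading that section, here is a minimal sketch of what a multi-device cooperative launch can look like. The kernel name syncDemo, the per-device counter, and the grid and block sizes are hypothetical and are not taken from this repo. The sketch assumes compilation with relocatable device code (for example `nvcc -rdc=true -arch=sm_61`) and a toolkit that still ships this API, which has been marked deprecated since CUDA 11.3.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>
#include <cooperative_groups.h>

namespace cg = cooperative_groups;

// Hypothetical kernel: one thread per device bumps a counter,
// then every block on every participating device waits at one barrier.
__global__ void syncDemo(int *counter) {
    cg::multi_grid_group mg = cg::this_multi_grid();
    if (blockIdx.x == 0 && threadIdx.x == 0) {
        atomicAdd(counter, 1);
    }
    mg.sync();  // barrier across all grids of the multi-device launch
}

int main() {
    int numDevices = 0;
    if (cudaGetDeviceCount(&numDevices) != cudaSuccess || numDevices < 1) {
        printf("no CUDA devices found\n");
        return 1;
    }

    // One entry per device; vectors are sized up front so element addresses stay valid.
    std::vector<cudaLaunchParams> params(numDevices);
    std::vector<cudaStream_t> streams(numDevices);
    std::vector<int *> counters(numDevices);
    std::vector<void *> kernelArgs(numDevices);

    for (int i = 0; i < numDevices; ++i) {
        cudaSetDevice(i);
        cudaStreamCreate(&streams[i]);           // the default (NULL) stream is not allowed
        cudaMalloc(&counters[i], sizeof(int));
        cudaMemset(counters[i], 0, sizeof(int));

        kernelArgs[i] = &counters[i];            // pointer to the single kernel argument
        params[i].func = (void *)syncDemo;       // same kernel on every device
        params[i].gridDim = dim3(1);             // grid must be small enough to be co-resident
        params[i].blockDim = dim3(32);
        params[i].args = &kernelArgs[i];
        params[i].sharedMem = 0;
        params[i].stream = streams[i];
    }

    // Launch on all devices at once; flags = 0 keeps the default stream ordering.
    cudaError_t err =
        cudaLaunchCooperativeKernelMultiDevice(params.data(), numDevices, 0);
    printf("launch: %s\n", cudaGetErrorString(err));

    for (int i = 0; i < numDevices; ++i) {
        cudaSetDevice(i);
        cudaStreamSynchronize(streams[i]);
        cudaFree(counters[i]);
        cudaStreamDestroy(streams[i]);
    }
    return err == cudaSuccess ? 0 : 1;
}
```

Unlike cudaLaunchKernel, the call takes an array of cudaLaunchParams, one per device, and each entry must use its own non-default stream on a different device while sharing the same kernel, grid size, block size, and shared memory size.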
Before you get into the details, make sure you have access to a node/host/machine with at least two GPUs installed. These GPUs must be identical and must have compute capability greater than 6.1.
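One way to check a machine against these requirements is to query each device through the CUDA runtime. The following is a small sketch, not part of this repo. It prints the name and compute capability of every visible GPU, along with the cudaDevAttrCooperativeMultiDeviceLaunch attribute, which the runtime uses to report whether a device supports multi-device cooperative launches. Whether the listed GPUs count as identical is left for you to judge from the output.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

int main() {
    int count = 0;
    if (cudaGetDeviceCount(&count) != cudaSuccess) {
        printf("could not query CUDA devices\n");
        return 1;
    }
    printf("%d device(s) visible\n", count);

    for (int dev = 0; dev < count; ++dev) {
        cudaDeviceProp prop;
        if (cudaGetDeviceProperties(&prop, dev) != cudaSuccess) {
            printf("device %d: could not read properties\n", dev);
            continue;
        }

        // Non-zero means the device reports support for multi-device cooperative launch.
        int multiDeviceCoop = 0;
        cudaDeviceGetAttribute(&multiDeviceCoop,
                               cudaDevAttrCooperativeMultiDeviceLaunch, dev);

        printf("device %d: %s, compute capability %d.%d, multi-device cooperative launch: %s\n",
               dev, prop.name, prop.major, prop.minor,
               multiDeviceCoop ? "yes" : "no");
    }
    return 0;
}
```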