multigputests

Testing different multi-GPU parallelism strategies


What This Is

This is a toy simulation code for testing different ways of achieving multi-GPU parallelism. There is a GPUDirect version (native CUDA) and an MPI version (CUDA-aware MPI). In the future we could look at solutions that incorporate Thrust/NCCL, as well as a native CUDA version that uses remote memory access instead of GPU-to-GPU communications.
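As a rough illustration of what the GPUDirect (native CUDA) approach looks like, here is a minimal peer-to-peer sketch. The buffer names and sizes are illustrative assumptions, not taken from the actual test code:

    // Sketch of native CUDA peer-to-peer (GPUDirect-style) communication.
    // Assumes two GPUs (device 0 and device 1) in the same node.
    #include <cstdio>
    #include <cuda_runtime.h>

    int main() {
        const size_t N = 1 << 20;
        int canAccess = 0;

        // Check whether device 0 can directly access device 1's memory.
        cudaDeviceCanAccessPeer(&canAccess, 0, 1);
        if (!canAccess) {
            printf("Peer access between GPU 0 and GPU 1 is not supported.\n");
            return 1;
        }

        float *buf0, *buf1;
        cudaSetDevice(0);
        cudaMalloc(&buf0, N * sizeof(float));
        cudaDeviceEnablePeerAccess(1, 0);   // let GPU 0 address GPU 1's memory

        cudaSetDevice(1);
        cudaMalloc(&buf1, N * sizeof(float));
        cudaDeviceEnablePeerAccess(0, 0);   // and vice versa

        // Copy between GPUs; with peer access enabled this goes directly
        // over NVLink/PCIe without staging through host memory.
        cudaMemcpyPeer(buf1, 1, buf0, 0, N * sizeof(float));
        cudaDeviceSynchronize();

        cudaFree(buf1);
        cudaSetDevice(0);
        cudaFree(buf0);
        return 0;
    }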

To run the MPI code, execute the following command:

/usr/local/mpi-cuda/bin/mpirun -np 1 --mca btl openib,self mpi_test 
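For context, the CUDA-aware MPI version relies on the ability to pass device pointers straight to MPI calls, letting the MPI library handle the GPU-to-GPU transfer. The sketch below is a hedged, minimal example of that pattern (buffer names and sizes are assumptions, and it expects at least two ranks, unlike the single-rank command above):

    // Sketch of CUDA-aware MPI: device pointers are passed directly to
    // MPI_Send/MPI_Recv. Requires an MPI build with CUDA-aware support.
    #include <mpi.h>
    #include <cuda_runtime.h>

    int main(int argc, char** argv) {
        MPI_Init(&argc, &argv);

        int rank;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        // Map each rank to its own GPU (assumes one GPU per rank on the node).
        int numDevices = 0;
        cudaGetDeviceCount(&numDevices);
        cudaSetDevice(rank % numDevices);

        const int N = 1 << 20;
        float* d_buf;
        cudaMalloc(&d_buf, N * sizeof(float));

        // With CUDA-aware MPI, no explicit host staging buffers are needed.
        if (rank == 0) {
            MPI_Send(d_buf, N, MPI_FLOAT, 1, 0, MPI_COMM_WORLD);
        } else if (rank == 1) {
            MPI_Recv(d_buf, N, MPI_FLOAT, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        }

        cudaFree(d_buf);
        MPI_Finalize();
        return 0;
    }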

Helpful Links

Link to version of Open MPI used:

How to build Open MPI with CUDA-Aware support:

NVIDIA docs about CUDA-Aware MPI:

Other helpful/interesting links:

Links to info about multi-GPU programming:

Helpful CUDA wrappers for future reference:

How to run the CUDA samples (there is a multi-GPU sample):