NCCL and MPI examples
Compile: mpicc -O3 mpi_test.c -o mpi_test
Run: mpirun -n 4 mpi_test
nvcc -o first_cuda singleProcess.cpp -I/usr/local/nccl/include -L/usr/local/nccl/lib -lnccl
mpicc -O3 oneDevicePerprocess.cpp -o cpi -I/usr/local/cuda/include -L/usr/local/cuda/lib64 -lcudart -lcuda -I/usr/local/nccl/include -L/usr/local/nccl/lib -lnccl
#Compile a file with run.sh: ./run.sh [source.cpp] output
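
The contents of run.sh are not shown here. As a rough sketch only, assuming it simply wraps the mpicc command for oneDevicePerprocess.cpp and takes the source file and the output name as its two arguments, it could look like the following. The compile_cmd function and the CUDA_HOME / NCCL_HOME variables are hypothetical names introduced for this illustration; their defaults are the install paths used in the commands above.

```shell
#!/bin/sh
# Hypothetical sketch of run.sh; the real script may differ.
# Assumed install prefixes (same paths as the manual commands); override via the environment.
CUDA_HOME=${CUDA_HOME:-/usr/local/cuda}
NCCL_HOME=${NCCL_HOME:-/usr/local/nccl}

# Print the compile command for source file $1 and output name $2.
compile_cmd() {
    printf 'mpicc -O3 %s -o %s -I%s/include -L%s/lib64 -lcudart -lcuda -I%s/include -L%s/lib -lnccl\n' \
        "$1" "$2" "$CUDA_HOME" "$CUDA_HOME" "$NCCL_HOME" "$NCCL_HOME"
}

# Invoked as ./run.sh source.cpp output: echo the command, then run it.
# Assumes file names without spaces. With no arguments nothing is executed.
if [ "$#" -eq 2 ]; then
    compile_cmd "$1" "$2"
    sh -c "$(compile_cmd "$1" "$2")"
fi
```

With this sketch, ./run.sh oneDevicePerprocess.cpp cpi would print and then run the same mpicc command as the manual build line, and the resulting binary could be started with mpirun in the same way as mpi_test.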