P3-miniapps are designed to evalute Performance, Portability and Productivity (P3) of mini-applications with C++ parallel algorithms (stdpar). We have implemented 3D heat equation solver and 4D (2D space and 2D velocity space) Vlasov-Poisson solver. These mini-apps are parallelized with MPI + "X" programming model in C++. "X" includes stdpar, OpenMP, OpenACC, OpenMP4.5, Kokkos, Thrust, CUDA, and HIP. As well as the language standard parallelization (stdpar), we also focus on the language standard high dimensional array support mdspan.
This repository includes the following mini-apps:
For questions or comments, please find us in the AUTHORS file.
Firstly, you need to git clone on your environment as
git clone --recursive https://github.com/yasahi-hpc/P3-miniapps.git
This software relies on external libraries including Kokkos, mdspan and fftw. Kokkos and mdspan are included as submodules. CUDA-Aware-MPI or ROCm-Aware-MPI are also needed for Nvidia and AMD GPUs. In the following, we assume that fftw and MPI libraries are appropriately installed.
We rely on CMake to build the applications. 4 mini applications heat3d
, heat3d_mpi
, vlp4d
, and vlp4d_mpi
are provided. You will compile with the following CMake command. For -DAPPLICATION
option, <app_name>
should be choosen from the application names provided above. To enable test, you should set -DBUILD_TESTING=ON
. Following table summarizes the allowed combinations of <programming_model>
, <compiler_name>
, and <backend>
for each DEVICE
.
cmake -DCMAKE_CXX_COMPILER=<compiler_name> \
-DCMAKE_BUILD_TYPE=<build_type> \
-DBUILD_TESTING=OFF \
-DPROGRAMMING_MODEL=<programming_model> \
-DBACKEND=<backend> \
-DAPPLICATION=<app_name>
DEVICE | programming_model | compiler_name | backend |
---|---|---|---|
IceLake | OPENMP THRUST KOKKOS STDPAR |
icpc g++ icpc nvc++ |
OPENMP |
P100 V100 A100 |
CUDA OPENMP OPENACC THRUST KOKKOS STDPAR |
g++ nvc++ nvc++ g++ nvcc_wrapper nvc++ |
CUDA |
MI100 | HIP OPENMP THRUST KOKKOS |
hipcc hipcc hipcc hipcc |
HIP |
- For Nvidia Devices, you need to add
-DCMAKE_CUDA_ARCHITECTURES=<device>
(device
=60
,70
,80
for P100, V100 and A100, respetively) to compile for<programmind_model>=[THRUST, CUDA]
. For<programmind_model>=KOKKOS
and<backend>=CUDA
, one further needs to add Kokkos options for installation like
-DKokkos_ENABLE_CUDA=On \
-DKokkos_ENABLE_CUDA_LAMBDA=On \
-DKokkos_ARCH_AMPERE80=On
- For AMD Devices, you also need to pass
-DCMAKE_HIP_ARCHITECTURES=<device>
(device
=gfx908
for MI100) to compile for<programming_model>=OPENMP
and<backend>=HIP
. For<programmind_model>=KOKKOS
and<backend>=HIP
, one further needs to add Kokkos options for installation like
-DKokkos_ENABLE_HIP=On \
-DKokkos_ARCH_VEGA908=On
To run the applications, several command line arguments are necessary. Following table summarizes the list of commad line arguments for each mini-app.
app_name | Arguments | Examples |
---|---|---|
heat3d | nx, ny, nz, nbiter, freq_diag | heat3d --nx 512 --ny 512 --nz 512 --nbiter 1000 --freq_diag 0 |
heat3d_mpi | px, py, pz, nx, ny, nz, nbiter, freq_diag | heat3d_mpi --px 2 --py 2 --pz 2 --nx 256 --ny 256 --nz 256 --nbiter 1000 --freq_diag 0 |
vlp4d | (file_name) | vlp4d SLD10_large.dat |
vlp4d_mpi | f | vlp4d_mpi -f SLD10.dat |
For the Kokkos version, you additionally need to give "num_threads", "teams", "device", "num_gpus", and "device_map", e.g. vlp4d_mpi --num_threads 1 --teams 1 --device 0 --num_gpus 8 --device_map 1 -f SLD10.dat
.
More details are given in the docs for heat3d and vlp4d.
@INPROCEEDINGS{Asahi2021,
author={Asahi, Yuuichi and Latu, Guillaume and Bigot, Julien and Grandgirard, Virginie},
booktitle={2021 International Workshop on Performance, Portability and Productivity in HPC (P3HPC)},
title={Optimization strategy for a performance portable Vlasov code},
year={2021},
volume={},
number={},
pages={79-91},
doi={10.1109/P3HPC54578.2021.00011}}
@INPROCEEDINGS{Asahi2019,
author = {Asahi, Yuuichi and Latu, Guillaume and Grandgirard, Virginie and Bigot, Julien},
title = {Performance Portable Implementation of a Kinetic Plasma Simulation Mini-App},
booktitle = {Accelerator Programming Using Directives},
year = {2020},
editor = {Wienke, Sandra and Bhalachandra, Sridutt},
series = {series},
pages = {117--139},
address = {Cham},
publisher = {Springer International Publishing},
}