- cuQuantum (requires compute capability 7.0+)
- OpenMPI/MPICH
- cmake >= 3.18
- NCCL
- CUDA
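cuQuantum's compute-capability requirement can be checked with `nvidia-smi` (a quick sketch; the `compute_cap` query field needs a reasonably recent driver):

```bash
# Should report 7.0 or higher for every GPU.
nvidia-smi --query-gpu=name,compute_cap --format=csv
```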
Atlas supports two simulation modes. The first is distributed GPU-based simulation (`USE_LEGION=OFF`). The second is CPU-offload-enabled simulation (`USE_LEGION=ON`), which supports simulating more qubits on a single machine. Note that the second mode has not been tested for multi-node execution.
In addition, please replace all hard-coded paths (starting with `/global/homes/m/mingkuan`) with your home directory.
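One way to locate and rewrite those paths (a sketch: it assumes the hard-coded paths live under `scripts/`, so broaden the search if needed, and review the `grep` matches before running the in-place `sed`):

```bash
# Find every file that mentions the hard-coded home directory.
grep -rl "/global/homes/m/mingkuan" scripts/
# Replace it with your own home directory, in place.
grep -rl "/global/homes/m/mingkuan" scripts/ | xargs sed -i "s|/global/homes/m/mingkuan|$HOME|g"
```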
```bash
cd deps/quartz/external/HiGHS
mkdir build
cd build
cmake ..
make -j 12
cd ../../../../.. # cd $TORQUE_HOME
```
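A quick sanity check that the HiGHS build produced the solver binary (a sketch; the path mirrors the `export PATH` line used later in this guide):

```bash
# Run from $TORQUE_HOME; the binary should exist after the build above.
ls deps/quartz/external/HiGHS/build/bin/highs
```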
```bash
mkdir build
cd build
# module load nccl # You may need this on Perlmutter
bash ../config/config.linux
make -j 12
```
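To confirm the build produced the example binaries used below (a sketch; depending on the `USE_LEGION` setting chosen at configure time, one or both may be present):

```bash
# Distributed GPU-based mode (USE_LEGION=OFF):
ls build/examples/mpi-based/simulate
# CPU-offload mode (USE_LEGION=ON):
ls build/examples/legion-based/test
```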
There are `sbatch` scripts for running simulations with Atlas in `scripts/perlmutter/bench`. Run them with:

```bash
sbatch xxx.sh
```
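To monitor a submitted job, the standard Slurm commands apply (nothing Atlas-specific here):

```bash
# List your queued and running jobs.
squeue -u $USER
# Follow a job's output once it starts; slurm-<jobid>.out is
# Slurm's default output name unless the script overrides it.
tail -f slurm-<jobid>.out
```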
- Allocate nodes:

```bash
salloc --nodes 2 -q regular --time 00:20:00 --constraint gpu --gpus-per-node 4 --account=YOUR_ACCOUNT
```
- Load modules and set up the environment:

```bash
module load nccl
module load cudatoolkit
conda activate qs
# HiGHS_HOME should point at deps/quartz/external/HiGHS (see the build step above).
export PATH=$PATH:$HiGHS_HOME/build/bin
export MPICH_GPU_SUPPORT_ENABLED=1
```
- Distributed GPU-based simulation is launched with `srun`, for example:

```bash
srun -u \
  --ntasks="$(( SLURM_JOB_NUM_NODES ))" \
  --ntasks-per-node=1 \
  $TORQUE_HOME/build/examples/mpi-based/simulate --import-circuit qft --n 31 --local 28 --device 4 --use-ilp
```
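The example above uses 2 nodes x 4 GPUs = 8 GPU shards, which matches the gap between `--n 31` and `--local 28` (2^3 = 8). That relation is inferred from the example rather than documented here, so treat the following larger-node variant as a sketch and verify the flags against the example program's usage before relying on it:

```bash
# Hypothetical 4-node variant (4 nodes x 4 GPUs = 16 shards,
# hence --n = --local + 4). Values are illustrative only.
salloc --nodes 4 -q regular --time 00:20:00 --constraint gpu --gpus-per-node 4 --account=YOUR_ACCOUNT
srun -u \
  --ntasks="$(( SLURM_JOB_NUM_NODES ))" \
  --ntasks-per-node=1 \
  $TORQUE_HOME/build/examples/mpi-based/simulate --import-circuit qft --n 32 --local 28 --device 4 --use-ilp
```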
- Create a Python 3.8 environment with PuLP:

```bash
conda create --name pulp python=3.8
conda activate pulp
pip install pulp
```
- Make sure the `setenv("PYTHONPATH", ...)` call in `examples/legion-based/test_sim_legion.cc` points to the correct location (see the verification sketch after this list).
- Build and run in interactive mode:

```bash
cd build
make -j 12
cd ../scripts/perlmutter/bench
salloc --nodes 1 -q regular --time 00:30:00 --constraint gpu --gpus-per-node 4 --account=YOUR_ACCOUNT
bash offload.sh # takes around 25 minutes
```
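If the offload run fails to find PuLP, two quick checks (a sketch: the `grep` only locates the line to edit, and importing `pulp` assumes the environment created above is active):

```bash
# Confirm PuLP is importable in the active environment.
python -c "import pulp; print('PuLP OK')"
# Locate the PYTHONPATH setenv call that must point at that environment.
grep -n 'setenv("PYTHONPATH"' $TORQUE_HOME/examples/legion-based/test_sim_legion.cc
```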
To run the scalability test on an AWS p3.8xlarge instance (an example invocation follows the flag descriptions below):

```bash
cd $TORQUE_HOME/build/examples/legion-based/
./test -ll:gpu NUM_GPU -ll:fsize F_SIZE -ll:zsize Z_SIZE --local-qubits LOCAL_QUBITS_NUM --all-qubits ALL_QUBITS_NUM
```
- `-ll:gpu`: the number of GPUs to use for the simulation.
- `-ll:fsize`: GPU memory available to the simulation, in MB (e.g., 15000).
- `-ll:zsize`: zero-copy DRAM size, in MB (e.g., 100000).
- `--local-qubits`: the number of local qubits (at most 28 for a 16 GB GPU).
- `--all-qubits`: the total number of qubits.
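Putting the flags together, a run on a p3.8xlarge (4 GPUs with 16 GB each) might look like the following; the qubit counts are illustrative values consistent with the limits above, not benchmarked settings:

```bash
# Illustrative values only: 4 GPUs, ~15 GB framebuffer each,
# ~100 GB zero-copy DRAM, 28 local qubits out of 30 total.
./test -ll:gpu 4 -ll:fsize 15000 -ll:zsize 100000 --local-qubits 28 --all-qubits 30
```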