Pinned Repositories
ComScribe
ComScribe is a tool to identify communication among all GPU-GPU and CPU-GPU pairs in a single-node multi-GPU system.
CPU-Free-model
Source code for the CPU-Free model, a fully autonomous execution model for multi-GPU applications that completely excludes the involvement of the CPU beyond the initial kernel launch.
CPU-Free-Model-Compiler
DaCe - Data Centric Parallel Programming
mixed-and-multi-spmv
Mixed and Multi-Precision SpMV for GPUs with Row-wise Precision Selection.
multi-GPU-comm-bench
parcorelab.github.io
ParCoreTools
ReuseTracker
A fast and accurate reuse distance analyzer for multi-threaded applications. It leverages existing hardware features in commodity CPUs.
Snoopie
Multi-GPU communication profiler and visualizer
SpTRSV_Framework
The SpTRSV prediction framework is an automated prediction framework for the fastest sparse triangular solve (SpTRSV) algorithm for a given input sparse matrix on a CPU-GPU platform.
ParCoreLab's Repositories
ParCoreLab/ComScribe
ComScribe is a tool to identify communication among all GPU-GPU and CPU-GPU pairs in a single-node multi-GPU system.
ParCoreLab/Snoopie
Multi-GPU communication profiler and visualizer
ParCoreLab/CPU-Free-model
Source code for the CPU-Free model, a fully autonomous execution model for multi-GPU applications that completely excludes the involvement of the CPU beyond the initial kernel launch.
ParCoreLab/ReuseTracker
A fast and accurate reuse distance analyzer for multi-threaded applications. It leverages existing hardware features in commodity CPUs.
ParCoreLab/SpTRSV_Framework
The SpTRSV prediction framework is an automated prediction framework for the fastest sparse triangular solve (SpTRSV) algorithm for a given input sparse matrix on a CPU-GPU platform.
ParCoreLab/mixed-and-multi-spmv
Mixed and Multi-Precision SpMV for GPUs with Row-wise Precision Selection.
ParCoreLab/multi-GPU-comm-bench
ParCoreLab/Split_SpTRSV
The split execution framework automatically determines whether an SpTRSV is suitable for split execution, finds the appropriate split point, and executes the SpTRSV in a split fashion using two SpTRSV algorithms while automatically managing any required inter-platform communication. The model is implemented as a C++/CUDA library supporting multiple CPU-GPU algorithms.
ParCoreLab/ParCoreTools
ParCoreLab/BeyondMoore
BeyondMoore has an ambitious goal to develop a software framework that performs static and dynamic optimizations, issues accelerator-initiated data transfers, and reasons about parallel execution strategies that exploit both processor and memory heterogeneity.
ParCoreLab/gpu-fusion
GPU fusion code and algorithm
ParCoreLab/pardnn
ParCoreLab/PES-artifact
ParCoreLab/.github
Homepage README.
ParCoreLab/accuracy-verification-microbenchmarks
Microbenchmarks used to verify the accuracy of ComDetective.
ParCoreLab/cha-aware-result-parser
ParCoreLab/CPU-Free-Model-Compiler
DaCe - Data Centric Parallel Programming
ParCoreLab/hpctoolkit-externals
HPCToolkit performance tools: essential third party libraries for hpctoolkit
ParCoreLab/parcorelab.github.io
ParCoreLab/pes-benchs
ParCoreLab/AMD_IBS_Toolkit
AMD Research Instruction Based Sampling Toolkit
ParCoreLab/barnes
ParCoreLab/gpucommanalyzer
ParCoreLab/hpctoolkit
HPCToolkit performance tools: measurement and analysis components
ParCoreLab/snoopie-ucx-tracking-ucx
Modified UCX library to track communications
ParCoreLab/snoopie-visualiser
ParCoreLab/splash2
SPLASH-2 benchmarks