jrmadsen
HPC C/C++/Python developer with interest in performance portable solutions. Background in radiation transport. Senior Member of Technical Staff at AMD
AMD, Austin, TX
Pinned Repositories
Caliper
Caliper is a flexible application introspection system
compile-time-perf
Measures high-level timing and memory usage metrics during compilation
madthreading
A low-overhead, task-based threading API using a thread-pool of C++11 threads
omnitrace
Omnitrace: Application Profiling, Tracing, and Analysis
PTL
Parallel Tasking Library (PTL) - Lightweight C++11 multithreading tasking system featuring thread-pool, task-groups, and lock-free task queue
pyctest
Python bindings of select portions of CMake/CTest package -- enabling generation of CTest test files from Python without a CMake build system
Vectorization-Example
An example testing SIMD with AVX/AVX2 Intrinsics vs. OpenMP SIMD vs. compiler (gcc) auto-vectorization
pykokkos-base
Python bindings for data interoperability with Kokkos (View, DynRankView)
timemory
Modular C++ Toolkit for Performance Analysis and Logging. Profiling API and Tools for C, C++, CUDA, Fortran, and Python. The C++ template API is essentially a framework for creating tools: it is designed to provide a unifying interface for recording various performance measurements alongside data logging and interfaces to other tools.
jrmadsen's Repositories
jrmadsen/compile-time-perf
Measures high-level timing and memory usage metrics during compilation
jrmadsen/timemory-testing
Scripts for extended testing
jrmadsen/amrex
AMReX: Software Framework for Block Structured AMR
jrmadsen/benchmark
A microbenchmark support library
jrmadsen/CDash
An open source, web-based software testing server
jrmadsen/dataset
Easy-to-use data handling for SQL data stores with support for implicit table creation, bulk loading, and transactions.
jrmadsen/dyninst-examples
Example usage of Dyninst
jrmadsen/HPCInfo
Information about many aspects of high-performance computing. Wiki content moved to ~/docs.
jrmadsen/kokkos-miniapps
Mini-applications that exclusively use the Kokkos programming model
jrmadsen/likwid
Performance monitoring and benchmarking suite
jrmadsen/line_profiler
Line-by-line profiling for Python
jrmadsen/matplotplusplus
Matplot++: A C++ Graphics Library for Data Visualization 📊🗾
jrmadsen/nccl-tests
NCCL Tests
jrmadsen/numba
NumPy aware dynamic Python compiler using LLVM
jrmadsen/perfetto
Performance instrumentation and tracing for Android, Linux and Chrome (read-only mirror of https://android.googlesource.com/platform/external/perfetto/)
jrmadsen/pykokkos
Provides Kokkos performance portable parallel programming in Python.
jrmadsen/pytorch
Tensors and Dynamic neural networks in Python with strong GPU acceleration
jrmadsen/robin-hood-hashing
Fast & memory efficient hashtable based on robin hood hashing for C++11/14/17/20
jrmadsen/ROC_SHMEM
ROC_SHMEM intra-kernel networking runtime for AMD dGPUs on the ROCm platform.
jrmadsen/ROCm-CompilerSupport
The compiler support repository provides various Lightning Compiler related services.
jrmadsen/rocprofiler
ROC profiler library. Profiling with perf-counters and derived metrics.
jrmadsen/roctracer
ROCm Tracer Callback/Activity Library for performance tracing of AMD GPUs
jrmadsen/scikit-build
Improved build system generator for CPython C, C++, Cython and Fortran extensions
jrmadsen/slate-roofline
jrmadsen/spot2_container
The container infrastructure for the SPOT performance visualization tool
jrmadsen/staged-recipes
A place to submit conda recipes before they become fully fledged conda-forge feedstocks
jrmadsen/timemory-feedstock
A conda-smithy repository for timemory.
jrmadsen/timemory-ping
jrmadsen/TinyInst
A lightweight dynamic instrumentation library
jrmadsen/tracy
C++ frame profiler