DiamonDinoia
Software Engineer | High Performance Computing | Monte Carlo simulations
Simons Foundation, New York
Pinned Repositories
allcaps
altmin
atan2
Testing different implementations of atan2
benchmark-elementary-functions
This repo aims to test the performance and accuracy of different elementary functions (e.g. log, sin, cos)
benchmark_arch_optimization_flags
I'm testing the differences between gcc/llvm with various optimization flags. Both performance and assembly are analyzed.
cpu-performance-tests
This repository contains the code to benchmark CPU cache miss latency and branch misprediction penalty
jacobi
This is a parallel implementation of the Jacobi algorithm.
mcss
mixmaxCUDA
morton-span
This repository implements a Morton transform for mdspan
DiamonDinoia's Repositories
DiamonDinoia/cpu-performance-tests
This repository contains the code to benchmark CPU cache miss latency and branch misprediction penalty
DiamonDinoia/mcss
DiamonDinoia/mixmaxCUDA
DiamonDinoia/morton-span
This repository implements a Morton transform for mdspan
DiamonDinoia/altmin
DiamonDinoia/aocl-libm-ose
AMD LIBM
DiamonDinoia/arrayfire
ArrayFire: a general purpose GPU library.
DiamonDinoia/benchmark-elementary-functions
This repo aims to test the performance and accuracy of different elementary functions (e.g. log, sin, cos)
DiamonDinoia/cpp-learning
This repo tests assorted C++ features
DiamonDinoia/chebtest
Basic nanobench project for messing around with polynomial evaluation
DiamonDinoia/cmake-minimal
A minimal cmake-based C++ project setup
DiamonDinoia/cuda-variant
variant type for CUDA
DiamonDinoia/ducc
Fork of https://gitlab.mpcdf.mpg.de/mtr/ducc to simplify external contributions
DiamonDinoia/fft_bench
More benchmarks of various fft implementations
DiamonDinoia/fftw3
DO NOT CHECK OUT THESE FILES FROM GITHUB UNLESS YOU KNOW WHAT YOU ARE DOING. (See below.)
DiamonDinoia/finufft
Non-uniform fast Fourier transform library of types 1,2,3 in dimensions 1,2,3
DiamonDinoia/finufft-native
DiamonDinoia/geant4
Geant4 toolkit for the simulation of the passage of particles through matter - NIM A 506 (2003) 250-303
DiamonDinoia/highway
Performance-portable, length-agnostic SIMD with runtime dispatch
DiamonDinoia/nanobind_example
A nanobind example project
DiamonDinoia/online-alt-min
Source code for paper Choromanska et al. -- Beyond Backprop: Online Alternating Minimization with Auxiliary Variables -- http://proceedings.mlr.press/v97/choromanska19a.html
DiamonDinoia/optimized-routines
Optimized implementations of various library functions for ARM architecture processors
DiamonDinoia/Optional
optional (nullable) objects for C++14
DiamonDinoia/philox
Implementation of the Philox RNG for CPU and GPU (CUDA, HIP)
DiamonDinoia/profiling
DiamonDinoia/test-cuda
Testing some stuff in CUDA
DiamonDinoia/uk-visa-calculator
DiamonDinoia/winamp
Iconic media player
DiamonDinoia/xsimd
C++ wrappers for SIMD intrinsics and parallelized, optimized mathematical functions (SSE, AVX, AVX512, NEON, SVE)
DiamonDinoia/yagit
Library for efficient comparison of 2D, 3D DICOM images using gamma index