high-performance-computing
There are 991 repositories under the high-performance-computing topic.
taskflow/taskflow
A General-purpose Task-parallel Programming System using Modern C++
Netflix/metaflow
Open Source Platform for developing, scaling and deploying serious ML, AI, and data science systems
google/tf-quant-finance
High-performance TensorFlow library for quantitative finance.
ProjectPhysX/FluidX3D
The fastest and most memory-efficient lattice Boltzmann CFD software, running on all GPUs and CPUs via OpenCL. Free for non-commercial use.
parallel101/course
High-Performance Parallel Programming and Optimization - Course Materials
alpa-projects/alpa
Training and serving large-scale neural networks with auto parallelization.
merrymercy/awesome-tensor-compilers
A list of awesome compiler projects and papers for tensor computation and deep learning.
flame/blis
BLAS-like Library Instantiation Software Framework
BOINC/boinc
Open-source software for volunteer computing and grid computing.
kokkos/kokkos
Kokkos C++ Performance Portability Programming Ecosystem: The Programming Model - Parallel Execution and Memory Abstraction
chapel-lang/chapel
A Productive Parallel Programming Language
mfem/mfem
Lightweight, general, scalable C++ library for finite element methods
hermit-os/hermit-rs
Hermit for Rust.
Maratyszcza/NNPACK
Acceleration package for neural networks on multi-core CPUs
AdaptiveCpp/AdaptiveCpp
Implementation of SYCL and C++ standard parallelism for CPUs and GPUs from all vendors: The independent, community-driven compiler for C++-based heterogeneous programming models. Lets applications adapt themselves to all the hardware in the system - even at runtime!
ropensci/drake
An R-focused pipeline toolkit for reproducibility and high-performance computing
mratsim/Arraymancer
A fast, ergonomic and portable tensor library in Nim with a deep learning focus for CPU, GPU and embedded devices via OpenMP, CUDA and OpenCL backends
trilinos/Trilinos
Primary repository for the Trilinos Project
hermit-os/kernel
A Rust-based, lightweight unikernel.
sail-sg/envpool
C++-based high-performance parallel environment execution engine (vectorized env) for general RL environments.
uncomplicate/neanderthal
Fast Clojure Matrix Library
ropensci/targets
Function-oriented Make-like declarative workflows for R
mateogianolio/vectorious
Linear algebra in TypeScript.
Liu-xiandong/How_to_optimize_in_GPU
A series of GPU optimization topics that explains in detail how to optimize CUDA kernels. It covers several basic kernel optimizations, including elementwise, reduce, sgemv, sgemm, and others. The performance of these kernels is at or near the theoretical limit.
openmc-dev/openmc
OpenMC Monte Carlo Code
precice/precice
A coupling library for partitioned multi-physics simulations, including, but not restricted to, fluid-structure interaction and conjugate heat transfer simulations.
zanellia/prometeo
An experimental Python-to-C transpiler and domain specific language for embedded high-performance computing
Geant4/geant4
Geant4 toolkit for the simulation of the passage of particles through matter - NIM A 506 (2003) 250-303
austinksmith/Hamsters.js
100% Vanilla JavaScript Multithreading & Parallel Execution Library
AMReX-Codes/amrex
AMReX: Software Framework for Block Structured AMR
LLNL/sundials
Official development repository for SUNDIALS - a SUite of Nonlinear and DIfferential/ALgebraic equation Solvers. Pull requests are welcome for bug fixes and minor changes.
spcl/dace
DaCe - Data Centric Parallel Programming
brucefan1983/GPUMD
Graphics Processing Units Molecular Dynamics
DeveloperPaul123/thread-pool
A modern, fast, lightweight thread pool library based on C++20
pypr/pysph
A framework for Smoothed Particle Hydrodynamics in Python
3dem/relion
Image-processing software for cryo-electron microscopy