high-performance-computing

There are 1045 repositories under the high-performance-computing topic.

taskflow/taskflow
A General-purpose Task-parallel Programming System using Modern C++
Language: C++ 10.7k 253 4851.3k
Netflix/metaflow
Build, Manage and Deploy AI/ML Systems
Language: Python 8.7k 293 695822
google/tf-quant-finance
High-performance TensorFlow library for quantitative finance.
Language: Python 4.8k 170 56605
ProjectPhysX/FluidX3D
The fastest and most memory efficient lattice Boltzmann CFD software, running on all GPUs and CPUs via OpenCL. Free for non-commercial use.
Language: C++ 4.3k 62 209375
parallel101/course
High-Performance Parallel Programming and Optimization - Course Materials
Language: C++ 3.9k 56 31550
alpa-projects/alpa
Training and serving large-scale neural networks with auto parallelization.
Language: Python 3.1k 46 297361
merrymercy/awesome-tensor-compilers
A list of awesome compiler projects and papers for tensor computation and deep learning.
2.5k 116 1308
bshoshany/thread-pool
BS::thread_pool: a fast, lightweight, modern, and easy-to-use C++17 / C++20 / C++23 thread pool library
Language: C++ 2.5k 38 121276
flame/blis
BLAS-like Library Instantiation Software Framework
Language: C 2.4k 79 458376
kokkos/kokkos
Kokkos C++ Performance Portability Programming Ecosystem: The Programming Model - Parallel Execution and Memory Abstraction
Language: C++ 2.2k 86 3.1k 450
BOINC/boinc
Open-source software for volunteer computing and grid computing.
Language: PHP 2.1k 118 3.2k 471
chapel-lang/chapel
A Productive Parallel Programming Language
Language: Chapel 1.9k 62 7.1k 427
mfem/mfem
Lightweight, general, scalable C++ library for finite element methods
Language: C++ 1.9k 129 2.3k 519
hermit-os/hermit-rs
Hermit for Rust.
Language: Rust 1.8k 18 11991
Maratyszcza/NNPACK
Acceleration package for neural networks on multi-core CPUs
Language: C 1.7k 100 196315
AdaptiveCpp/AdaptiveCpp
Compiler for multiple programming models (SYCL, C++ standard parallelism, HIP/CUDA) for CPUs and GPUs from all vendors: The independent, community-driven compiler for C++-based heterogeneous programming models. Lets applications adapt themselves to all the hardware in the system - even at runtime!
Language: C++ 1.6k 42 631191
mratsim/Arraymancer
A fast, ergonomic and portable tensor library in Nim with a deep learning focus for CPU, GPU and embedded devices via OpenMP, Cuda and OpenCL backends
Language: Nim 1.4k 37 35095
ropensci/drake
An R-focused pipeline toolkit for reproducibility and high-performance computing
Language: R 1.3k 34 1.1k 129
trilinos/Trilinos
Primary repository for the Trilinos Project
Language: C++ 1.3k 114 5.4k 582
hermit-os/kernel
A Rust-based, lightweight unikernel.
Language: Rust 1.3k 13 24392
sail-sg/envpool
C++-based high-performance parallel environment execution engine (vectorized env) for general RL environments.
Language: C++ 1.1k 22 145108
uncomplicate/neanderthal
Fast Clojure Matrix Library
Language: Clojure 1.1k 39 10256
ropensci/targets
Function-oriented Make-like declarative workflows for R
Language: R 978 16 54374
Liu-xiandong/How_to_optimize_in_GPU
This is a series of GPU optimization topics. Here we will introduce how to optimize the CUDA kernel in detail. I will introduce several basic kernel optimizations, including: elementwise, reduce, sgemv, sgemm, etc. The performance of these kernels is basically at or near the theoretical limit.
Language: Cuda 968 13 16151
mateogianolio/vectorious
Linear algebra in TypeScript.
Language: TypeScript 926 17 12644
openmc-dev/openmc
OpenMC Monte Carlo Code
Language: Python 820 70 1.2k 536
precice/precice
A coupling library for partitioned multi-physics simulations, including, but not restricted to fluid-structure interaction and conjugate heat transfer simulations.
Language: C++ 789 38 891187
Geant4/geant4
Geant4 toolkit for the simulation of the passage of particles through matter - NIM A 506 (2003) 250-303
Language: C++ 663 54 0 327
zanellia/prometeo
An experimental Python-to-C transpiler and domain specific language for embedded high-performance computing
Language: Python 632 15 1033
austinksmith/Hamsters.js
100% Vanilla Javascript Multithreading & Parallel Execution Library
Language: JavaScript 591 27 6233
AMReX-Codes/amrex
AMReX: Software Framework for Block Structured AMR
Language: C++ 586 56 731378
LLNL/sundials
Official development repository for SUNDIALS - a SUite of Nonlinear and DIfferential/ALgebraic equation Solvers. Pull requests are welcome for bug fixes and minor changes.
Language: C 561 35 214140
MarioSieg/magnetron
(WIP) A small but powerful, homemade PyTorch from scratch.
Language: C++ 542 4 0 23
brucefan1983/GPUMD
Graphics Processing Units Molecular Dynamics
Language: Cuda 533 26 207129
spcl/dace
DaCe - Data Centric Parallel Programming
Language: Python 516 16 382133
DeveloperPaul123/thread-pool
A modern, fast, lightweight thread pool library based on C++20
Language: C++ 488 11 3141
