tgujar's Stars
mikael-s-persson/templight
Templight is a Clang-based tool to profile the time and memory consumption of template instantiations and to perform interactive debugging sessions to gain introspection into the template instantiation process.
simdjson/simdjson
Parsing gigabytes of JSON per second: used by Facebook/Meta Velox, the Node.js runtime, ClickHouse, WatermelonDB, Apache Doris, Milvus, StarRocks
facebookincubator/nimble
New file format for storage of large columnar datasets.
yallop/ocaml-flap
A deterministic parser with fused lexing
rapidsai/kvikio
KvikIO - High Performance File IO
Cloud-Code-AI/kaizen
AI-powered tool to help software teams with Quality Assurance
cmu-db/optd
CMU-DB's Cascades optimizer framework
NVlabs/CGBN
CGBN: CUDA Accelerated Multiple Precision Arithmetic (Big Num) using Cooperative Groups
NVIDIA/jitify
A single-header C++ library for simplifying the use of CUDA Runtime Compilation (NVRTC).
duckdb/duckdb
DuckDB is an analytical in-process SQL database management system
andreasfertig/cppinsights
C++ Insights - See your source code with the eyes of a compiler
NVIDIA/MatX
An efficient C++17 GPU numerical computing library with Python-like syntax
facebook/zstd
Zstandard - Fast real-time compression algorithm
RoaringBitmap/CRoaring
Roaring bitmaps in C (and C++), with SIMD (AVX2, AVX-512 and NEON) optimizations: used by Apache Doris, ClickHouse, and StarRocks
flame/blis
BLAS-like Library Instantiation Software Framework
NVIDIA/cutlass
CUDA Templates for Linear Algebra Subroutines
dendibakh/perf-ninja
This is an online course where you can learn and master the skill of low-level performance analysis and tuning.
async-profiler/async-profiler
Sampling CPU and HEAP profiler for Java featuring AsyncGetCallTrace + perf_events
Steamgjk/Nezha
Nezha: Deployable and High-Performance Consensus Using Synchronized Clocks
rapidsai/rmm
RAPIDS Memory Manager
toddwschneider/nyc-taxi-data
Import public NYC taxi and for-hire vehicle (Uber, Lyft) trip data into a PostgreSQL or ClickHouse database
ParRes/Kernels
This is a set of simple programs that can be used to explore the features of a parallel platform.
fenbf/AwesomePerfCpp
A curated list of awesome C/C++ performance optimization resources: talks, articles, books, libraries, tools, sites, blogs. Inspired by awesome.
olcf/cuda-training-series
Training materials associated with NVIDIA's CUDA Training Series (www.olcf.ornl.gov/cuda-training-series/)
NVIDIA/cccl
CUDA C++ Core Libraries
NVIDIA/cuCollections
NVIDIA/nvbench
CUDA Kernel Benchmarking Library
CERT-Polska/phobos-cuda-decryptor-poc
NVIDIA/spark-rapids
Spark RAPIDS plugin - accelerate Apache Spark with GPUs
NVIDIA/nvbandwidth
A tool for bandwidth measurements on NVIDIA GPUs.