mananshah99

@kumo-ai San Francisco, CA

mananshah99's Stars

golang/go
The Go programming language
Language:Go125k 3.4k 64.9k17.8k
kubernetes/kubernetes
Production-Grade Container Scheduling and Management
Language:Go112k 3.2k 46.4k40k
mingrammer/diagrams
:art: Diagram as Code for prototyping cloud system architectures
Language:Python40k 406 5132.6k
kilimchoi/engineering-blogs
A curated list of engineering blogs
Language:Ruby32k 1k 951.7k
karpathy/llm.c
LLM training in simple, raw C/CUDA
Language:Cuda24.9k 252 1412.8k
simdjson/simdjson
Parsing gigabytes of JSON per second: used by Facebook/Meta Velox, the Node.js runtime, ClickHouse, WatermelonDB, Apache Doris, Milvus, StarRocks
Language:C++19.5k 242 8491k
ahmetb/kubectx
Faster way to switch between clusters and namespaces in kubectl
Language:Go18k 134 2331.3k
benfred/py-spy
Sampling profiler for Python programs
Language:Rust13.1k 110 375439
adam-maj/tiny-gpu
A minimal GPU design in Verilog to learn how GPUs work from the ground up
Language:SystemVerilog7.3k 69 24553
google/gemma.cpp
lightweight, standalone C++ inference engine for Google's Gemma models.
Language:C++6.1k 41 88520
ReagentX/imessage-exporter
Export iMessage data + run iMessage Diagnostics
Language:Rust3.3k 27 160141
axboe/liburing
Library providing helpers for the Linux kernel io_uring support
Language:C2.9k 107 799411
unitycatalog/unitycatalog
Open, Multi-modal Catalog for Data & AI
Language:Java2.6k 52 280416
unum-cloud/ucall
Web Serving and Remote Procedure Calls at 50x lower latency and 70x higher bandwidth than FastAPI, implementing JSON-RPC & REST over io_uring ☎️
Language:C1.2k 19 3044
ianlancetaylor/libbacktrace
A C library that may be linked into a C/C++ program to produce symbolic backtraces
Language:C1k 36 97234
Liu-xiandong/How_to_optimize_in_GPU
This is a series of GPU optimization topics. Here we will introduce how to optimize the CUDA kernel in detail. I will introduce several basic kernel optimizations, including: elementwise, reduce, sgemv, sgemm, etc. The performance of these kernels is basically at or near the theoretical limit.
Language:Cuda876 13 15138
roma-glushko/awesome-distributed-system-projects
🚀 List of distributed system projects for inspiration and learning to build distributed services from real world examples
768 8 272
PABannier/bark.cpp
Suno AI's Bark model in C/C++ for fast text-to-speech generation
Language:C++756 39 9162
baidu-research/baidu-allreduce
Language:Cuda572 69 7114
BobMcDear/attorch
A subset of PyTorch's neural network modules, written in Python using OpenAI's Triton.
Language:Python502 11 726
mattgodbolt/pt-three-ways
Path tracing, done three ways
Language:C++193 17 320
siboehm/ShallowSpeed
Small scale distributed training of sequential deep learning models, built on Numpy and MPI.
Language:Python111 3 04
eschluntz/PytorchBridge
Designing bridge trusses with Pytorch autograd
Language:Jupyter Notebook61 3 14
jkomoros/card-web
The web app behind thecompendium.cards
Language:TypeScript56 6 6448
AIWintermuteAI/whispercpp
Pybind11 bindings for Whisper.cpp
Language:C++48 4 39
okuvshynov/llama_duo
asynchronous/distributed speculative evaluation for llama3
Language:C++37 2 10
srush/anynp
Proof-of-concept of global switching between numpy/jax/pytorch in a library.
Language:Python18 2 0
evelynmitchell/shouldersOfGiants.rs
I have no idea what I'm doing, but llm.c in Rust
Language:Python12 2 00
jpetazzo/color
Language:Go11 3 02
Tigerrr07/How_to_optimize_in_GPU
This is a series of GPU optimization topics. Here we will introduce how to optimize the CUDA kernel in detail. I will introduce several basic kernel optimizations, including: elementwise, reduce, sgemv, sgemm, etc. The performance of these kernels is basically at or near the theoretical limit.
Language:Cuda1 0 00
