Merle-Zhang's Stars
gpu-mode/resource-stream
GPU programming related news and material links
variar/klogg
Really fast log explorer based on the glogg project
linkedin/Liger-Kernel
Efficient Triton Kernels for LLM Training
gpu-mode/lectures
Material for gpu-mode lectures
srush/Triton-Puzzles
Puzzles for learning Triton
rougier/numpy-100
100 numpy exercises (with solutions)
DefTruth/CUDA-Learn-Notes
🎉 Modern CUDA Learn Notes with PyTorch: fp32, fp16, bf16, fp8/int8, flash_attn, sgemm, sgemv, warp/block reduce, dot, elementwise, softmax, layernorm, rmsnorm.
systemdesignfightclub/SDFC
Roadmap and Resource Compilation for System Design Fight Club
openxla/community
Stores documents and resources used by the OpenXLA developer community
nod-ai/techtalks
azl397985856/leetcode
LeetCode Solutions: A Record of My Problem Solving Journey.
maybe-finance/maybe
The OS for your personal finances
TodePond/DreamBerd
perfect programming language
danswer-ai/danswer
Gen-AI Chat for Teams - Think ChatGPT if it had access to your team's unique knowledge.
Jokeren/Awesome-GPU
Awesome resources for GPUs
SJTU-IPADS/PowerInfer
High-speed Large Language Model Serving on PCs with Consumer-grade GPUs
practical-tutorials/project-based-learning
Curated list of project-based tutorials
SamyPesse/How-to-Make-a-Computer-Operating-System
How to Make a Computer Operating System in C++
openvinotoolkit/npu_plugin
OpenVINO NPU Plugin
AnthonyCalandra/modern-cpp-features
A cheatsheet of modern C++ language and library features.
checkcheckzz/system-design-interview
System design interview for IT companies
tzheng/SystemDesign
This is an interview preparation guide for software engineers. It covers behavioral interviews, system design, and coding (in Chinese).
sampsyo/bril
an educational compiler intermediate representation
mlc-ai/mlc-zh
merrymercy/awesome-tensor-compilers
A list of awesome compiler projects and papers for tensor computation and deep learning.
Liu-xiandong/How_to_optimize_in_GPU
This is a series of GPU optimization topics that explains in detail how to optimize CUDA kernels. It covers several basic kernel optimizations, including elementwise, reduce, sgemv, and sgemm. The performance of these kernels is at or near the theoretical limit.
BBuf/tvm_mlir_learn
A collection of compiler learning resources.
googlefonts/compute-shader-101
Sample code for compute shader 101 training
squidfunk/mkdocs-material
Documentation that simply works
OI-wiki/OI-wiki
🌟 Wiki of OI / ICPC for everyone. (An online strategy guide for a certain large-scale game, featuring dazzling arithmetic magic)