AllenFind's Stars
hahnyuan/LLM-Viewer
Analyze the inference of Large Language Models (LLMs). Analyze aspects like computation, storage, transmission, and hardware roofline model in a user-friendly interface.
ggerganov/ggml
Tensor library for machine learning
jssonx/awesome-gemm
📚 A curated list of awesome matrix-matrix multiplication (A * B = C) frameworks, libraries and software
mirage-project/mirage
Mirage: Automatically Generating Fast GPU Kernels without Programming in Triton/CUDA
OI-wiki/OI-wiki
:star2: Wiki of OI / ICPC for everyone. (An online strategy guide for a certain large-scale game, featuring dazzling arithmetic magic)
wangzyon/NVIDIA_SGEMM_PRACTICE
Step-by-step optimization of CUDA SGEMM
siboehm/SGEMM_CUDA
Fast CUDA matrix multiplication from scratch
cchan/tccl
Extensible collectives library in Triton
3b1b/manim
Animation engine for explanatory math videos
voideditor/void
wolfpld/tracy
Frame profiler
mobiusml/gemlite
Fast low-bit matmul kernels in Triton
Lightning-AI/lightning-thunder
Make PyTorch models up to 40% faster! Thunder is a source-to-source compiler for PyTorch. It enables using different hardware executors at once, across one or thousands of GPUs.
CoffeeBeforeArch/CoffeeBeforeArch.github.io
AleksaMCode/cache-simulator
Trace-driven cache memory simulator with LRU, MRU, RR and Belady replacement policies.
itcharge/LeetCode-Py
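⛽️ "Algorithm Clearance Handbook": a highly detailed introductory tutorial on algorithms and data structures that teaches algorithms from scratch, with detailed solutions to 850+ LeetCode problems and 200 popular interview problems from major tech companies.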
mlcommons/inference
Reference implementations of MLPerf™ inference benchmarks
acmbpdc/openlib.cs
📚 A Collection of Free & Open Resources for University Coursework in Computer Science.
huihut/interview
📚 A summary of fundamental knowledge for C/C++ technical interviews, aimed at job seekers and beginners. It covers the language, libraries, data structures, algorithms, systems, networking, and linking, loading, and libraries, along with interview experience, recruitment, and referral information.
srush/Triton-Puzzles
Puzzles for learning Triton
igrek51/wat
Deep inspection of Python objects
e3b0c442/keywords
A list and count of keywords in programming languages.
srush/Tensor-Puzzles
Solve puzzles. Improve your pytorch.
puttsk/cuda-tutorial
A set of hands-on tutorials for CUDA programming
CisMine/Parallel-Computing-Cuda-C
CUDA Learning guide
numba/numba
NumPy-aware dynamic Python compiler using LLVM
gpu-mode/lectures
Material for gpu-mode lectures
shap/shap
A game-theoretic approach to explain the output of any machine learning model.
mfontanini/presenterm
A markdown terminal slideshow tool
maaslalani/slides
Terminal based presentation tool