qinsiyuan-cool
I am an undergraduate majoring in software engineering. Communication and guidance are welcome.
qinsiyuan-cool's Stars
karpathy/nano-llama31
nanoGPT style version of Llama 3.1
zjhellofss/KuiperInfer
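A great project for campus recruitment (autumn and spring hiring) and internships! Build a high-performance deep learning inference library from scratch, with support for inference of large models such as llama2, as well as Unet, Yolov5, Resnet, and other models. Implement a high-performance deep learning inference library step by step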
DefTruth/Awesome-LLM-Inference
📖A curated list of Awesome LLM/VLM Inference Papers with codes, such as FlashAttention, PagedAttention, Parallelism, etc. 🎉🎉
zjhellofss/KuiperLLama
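A great project for campus recruitment (autumn and spring hiring) and internships: build, hands-on and from scratch, a large-model inference framework that supports LLama2/3 and Qwen2.5.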
openmlsys/openmlsys-cuda
Tutorials for writing high-performance GPU operators in AI frameworks.
InfiniTensor/InfiniTensor
wangzhaode/llm-export
llm-export can export LLM models to ONNX.
HeKun-NVIDIA/CUDA-Programming-Guide-in-Chinese
This is a Chinese translation of the CUDA programming guide
karpathy/llm.c
LLM training in simple, raw C/CUDA
Liu-xiandong/How_to_optimize_in_GPU
This is a series of GPU optimization topics. Here we will introduce how to optimize the CUDA kernel in detail. I will introduce several basic kernel optimizations, including: elementwise, reduce, sgemv, sgemm, etc. The performance of these kernels is basically at or near the theoretical limit.
Cjkkkk/CUDA_gemm
A simple high performance CUDA GEMM implementation.
AIoT-MLSys-Lab/Efficient-LLMs-Survey
[TMLR 2024] Efficient Large Language Models: A Survey
wgwang/awesome-LLMs-In-China
**Large models
forthespada/InterviewGuide
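🔥🔥 "InterviewGuide" is a record of Axiu's years of self-taught computer science from campus to the workplace, together with a collection of articles in which junior students summarize their campus and autumn recruitment experience for computing jobs. It includes, but is not limited to, study notes on C/C++, Golang, JavaScript, Vue, operating systems, data structures, computer networks, MySQL, and Redis. Keep learning, keep growing!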
guaguaupup/cpp_interview
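C++ backend server development interview experiences and summaries of standard interview questions! (Both deep and broad, unlike summary articles that only cover concepts!)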
nndeploy/nndeploy
nndeploy is an end-to-end model deployment framework. Based on multi-terminal inference and directed acyclic graph model deployment, it is committed to providing users with a cross-platform, easy-to-use, and high-performance model deployment experience.
Tony-Tan/CUDA_Freshman
sunface/rust-course
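"Voted the most loved language in the world for eight years running; no GC and no manual memory management; extremely high performance and safety; procedural/OO/functional programming; excellent package management; a future cornerstone of JS": try Rust as your second language outside of work. This book offers comprehensive and in-depth explanations, vivid and apt examples, and content as silky smooth as Dove chocolate; it may be the most carefully crafted Chinese-language Rust tutorial available / Book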
sunface/rust-by-practice
Learning Rust By Practice, narrowing the gap between beginner and skilled-dev through challenging examples, exercises and projects.
jafioti/luminal
Deep learning at the speed of light.
BBuf/giantpandacv.com
www.giantpandacv.com
mlabonne/llm-course
Course to get into Large Language Models (LLMs) with roadmaps and Colab notebooks.
kamranahmedse/developer-roadmap
Interactive roadmaps, guides and other educational content to help developers grow in their careers.
AniZpZ/AutoSmoothQuant
An easy-to-use package for implementing SmoothQuant for LLMs
DefTruth/CUDA-Learn-Notes
📚200+ Tensor/CUDA Cores Kernels, ⚡️flash-attn-mma, ⚡️hgemm with WMMA, MMA and CuTe (98%~100% TFLOPS of cuBLAS/FA2 🎉🎉).
pku-minic/koopa
Library for generating/parsing/optimizing Koopa IR.
BBuf/tvm_mlir_learn
A collection of compiler learning resources.
royalneverwin/My-Compiler
A compiler from the SysY language to RISC-V instructions
chenguokai/acwj-rv
munificent/craftinginterpreters
Repository for the book "Crafting Interpreters"