qinsiyuan-cool
I am an undergraduate majoring in software engineering. Communication and guidance are welcome.
qinsiyuan-cool's Stars
inside-compiler/llvm-project
The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.
inside-compiler/Inside-LLVM-Code-Gen
Companion example code for the book 《深入理解LLVM代码生成》 (Inside LLVM Code Generation)
0voice/introduce_c-cpp_manual
A beginner-oriented collection for learning C/C++, gathering developers' open-source mini projects, tools, frameworks, and games, plus videos, books, interview and algorithm questions, and technical articles.
LearningInfiniTensor/TinyInfiniTensor
hahnyuan/LLM-Viewer
Analyze the inference of Large Language Models (LLMs). Analyze aspects like computation, storage, transmission, and hardware roofline model in a user-friendly interface.
liguodongiot/llm-action
This project shares the technical principles behind large language models along with hands-on experience (LLM engineering and production deployment).
Dao-AILab/flash-attention
Fast and memory-efficient exact attention
AdvancedCompiler/AdvancedCompiler
Homepage of the Advanced Compiler Lab
0voice/interview_internal_reference
An up-to-date (2023) compilation of technical interview questions from Alibaba, Tencent, Baidu, Meituan, ByteDance (Toutiao), and other companies, with answers and analysis from expert interviewers.
ModelTC/lightllm
LightLLM is a Python-based LLM (Large Language Model) inference and serving framework, notable for its lightweight design, easy scalability, and high-speed performance.
Bruce-Lee-LY/cuda_hgemm
Several optimization methods for half-precision general matrix multiplication (HGEMM) using Tensor Cores, via the WMMA API and MMA PTX instructions.
THU-DSP-LAB/llvm-project
LLVM OpenCL C compiler suite for ventus GPGPU
SqueezeAILab/LLMCompiler
[ICML 2024] LLMCompiler: An LLM Compiler for Parallel Function Calling
zjin-lcf/HeCBench
ai-dawang/PlugNPlay-Modules
hustvl/Vim
[ICML 2024] Vision Mamba: Efficient Visual Representation Learning with Bidirectional State Space Model
NVlabs/instant-ngp
Instant neural graphics primitives: lightning fast NeRF and more
microsoft/BitBLAS
BitBLAS is a library to support mixed-precision matrix multiplications, especially for quantized LLM deployment.
microsoft/T-MAC
Low-bit LLM inference on CPU with lookup table
karpathy/nano-llama31
nanoGPT-style version of Llama 3.1
zjhellofss/KuiperInfer
A great project for campus recruitment (fall/spring hiring) and internships! Implement a high-performance deep learning inference library from scratch, step by step, with support for inference on Llama 2, UNet, YOLOv5, ResNet, and other models.
DefTruth/Awesome-LLM-Inference
📖A curated list of Awesome LLM/VLM Inference Papers with codes, such as FlashAttention, PagedAttention, Parallelism, etc. 🎉🎉
zjhellofss/KuiperLLama
A great project for campus recruitment (fall/spring hiring) and internships: build, from scratch, an LLM inference framework supporting Llama 2/3 and Qwen2.5.
openmlsys/openmlsys-cuda
Tutorials for writing high-performance GPU operators in AI frameworks.
InfiniTensor/InfiniTensor
wangzhaode/llm-export
llm-export can export LLM models to ONNX.
HeKun-NVIDIA/CUDA-Programming-Guide-in-Chinese
This is a Chinese translation of the CUDA Programming Guide
karpathy/llm.c
LLM training in simple, raw C/CUDA
Liu-xiandong/How_to_optimize_in_GPU
A series of GPU optimization topics that introduces in detail how to optimize CUDA kernels, covering several basic kernel optimizations including elementwise, reduce, SGEMV, and SGEMM. The performance of these kernels is at or near the theoretical limit.
Cjkkkk/CUDA_gemm
A simple, high-performance CUDA GEMM implementation.