zejia-lin
Ph.D. student @sysu @arcsysu. GPU, Compiler, MLSys. φ(^∇^*) 🎶
Sun Yat-sen University, Guangzhou
zejia-lin's Stars
efeslab/Nanoflow
A throughput-oriented high-performance serving framework for LLMs
Predidit/Kazumi
An anime-collection app driven by custom rules, with support for online streaming and danmaku (bullet comments).
google-research/vision_transformer
weishengying/tiny-flash-attention
A minimal flash-attention implementation built with CUTLASS, written for educational purposes.
xdit-project/xDiT
xDiT: A Scalable Inference Engine for Diffusion Transformers (DiTs) on multi-GPU Clusters
tspeterkim/flash-attention-minimal
Flash Attention in ~100 lines of CUDA (forward pass only)
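Both flash-attention entries above (tiny-flash-attention and flash-attention-minimal) implement the same core idea: compute attention tile by tile with an online softmax so the full N×N score matrix is never materialized. A minimal NumPy sketch of that forward pass, for illustration only (this is not code from either repo, and the function names are my own; the real kernels run these loops in on-chip SRAM):

```python
import numpy as np

def attention_tiled(Q, K, V, block=32):
    """Attention forward pass over K/V tiles with an online softmax.
    Equivalent to softmax(Q K^T / sqrt(d)) V, but never forms the
    full N x N score matrix at once."""
    N, d = Q.shape
    scale = 1.0 / np.sqrt(d)
    O = np.zeros_like(V, dtype=float)   # running (unnormalized) output
    m = np.full(N, -np.inf)             # running row-wise max of scores
    l = np.zeros(N)                     # running softmax denominator
    for j in range(0, N, block):        # iterate over tiles of K and V
        S = (Q @ K[j:j + block].T) * scale       # scores for this tile
        m_new = np.maximum(m, S.max(axis=1))     # updated row max
        p = np.exp(S - m_new[:, None])           # tile probabilities
        correction = np.exp(m - m_new)           # rescale previous state
        l = l * correction + p.sum(axis=1)
        O = O * correction[:, None] + p @ V[j:j + block]
        m = m_new
    return O / l[:, None]               # normalize at the end

def attention_ref(Q, K, V):
    """Standard (materialized) attention, used as a reference."""
    S = Q @ K.T / np.sqrt(Q.shape[1])
    P = np.exp(S - S.max(axis=1, keepdims=True))
    P /= P.sum(axis=1, keepdims=True)
    return P @ V
```

The rescaling by `exp(m - m_new)` is what lets the running sums stay numerically correct as new tiles shift the row maximum; the CUDA versions add the memory-hierarchy tiling that gives Flash Attention its speed.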
NVIDIA/TensorRT-Incubator
Experimental projects related to TensorRT
NVIDIA/cuda-python
CUDA Python Low-level Bindings
travitch/whole-program-llvm
A wrapper script to build whole-program LLVM bitcode files
MetaCubeX/mihomo
A simple rule-based tunnel in Go (formerly Clash Meta).
HPMLL/BurstGPT
A ChatGPT(GPT-3.5) & GPT-4 Workload Trace to Optimize LLM Serving Systems
alibaba/rtp-llm
RTP-LLM: Alibaba's high-performance LLM inference engine for diverse applications.
Hannibal046/Awesome-LLM
Awesome-LLM: a curated list of Large Language Model resources.
microsoft/sarathi-serve
A low-latency & high-throughput serving engine for LLMs
microsoft/unilm
Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities
EfficientLLMSys/MuxServe
microsoft/chunk-attention
pytorch/workshops
This is a repository for all workshop related materials.
j2kun/mlir-tutorial
MLIR For Beginners tutorial
Whisky-App/Whisky
A modern Wine wrapper for macOS built with SwiftUI
siyan-zhao/prepacking
The source code of our work "Prepacking: A Simple Method for Fast Prefilling and Increased Throughput in Large Language Models"
hahnyuan/LLM-Viewer
Analyze the inference of Large Language Models (LLMs). Analyze aspects like computation, storage, transmission, and hardware roofline model in a user-friendly interface.
pengsida/learning_research
The author's research experience and advice.
zjhellofss/KuiperInfer
A great project for campus recruiting (autumn/spring hiring) and internships! Implement a high-performance deep learning inference library from scratch, step by step, supporting inference for models such as Llama 2, U-Net, YOLOv5, and ResNet.
jeffreysijuntan/lloco
The official repo for "LLoCo: Learning Long Contexts Offline"
karpathy/llm.c
LLM training in simple, raw C/CUDA
intel-analytics/ipex-llm
Accelerate local LLM inference and finetuning (LLaMA, Mistral, ChatGLM, Qwen, Baichuan, Mixtral, Gemma, Phi, MiniCPM, etc.) on Intel XPU (e.g., local PC with iGPU and NPU, discrete GPU such as Arc, Flex and Max); seamlessly integrate with llama.cpp, Ollama, HuggingFace, LangChain, LlamaIndex, GraphRAG, DeepSpeed, vLLM, FastChat, Axolotl, etc.
karpathy/nanoGPT
The simplest, fastest repository for training/finetuning medium-sized GPTs.
chenzomi12/AISystem
AISystem covers AI systems as a whole: AI chips, AI compilers, inference and training frameworks, and other full-stack low-level AI technologies.
DefTruth/Awesome-LLM-Inference
📖A curated list of Awesome LLM Inference Paper with codes, TensorRT-LLM, vLLM, streaming-llm, AWQ, SmoothQuant, WINT8/4, Continuous Batching, FlashAttention, PagedAttention etc.