Pinned Repositories
bcc
BCC - Tools for BPF-based Linux IO analysis, networking, monitoring, and more
cgroup-icmp-drop
This is a simple eBPF cgroup program, used only for learning eBPF
eBPF-learning
I'm a beginner with eBPF, and this project records my path to learning it
exl2-for-all
EXL2 quantization generalized to other models.
GenZ-LLM-Analyzer
LLM Inference analyzer for different hardware platforms
gobpf
Go bindings for creating BPF programs.
QuIP-for-all
QuIP quantization
quip-sharp
vllm-gptq
A high-throughput and memory-efficient inference and serving engine for LLMs
LLaMA-Factory
Unified Efficient Fine-Tuning of 100+ LLMs (ACL 2024)
ChuanhongLi's Repositories
ChuanhongLi/bcc
BCC - Tools for BPF-based Linux IO analysis, networking, monitoring, and more
ChuanhongLi/cgroup-icmp-drop
This is a simple eBPF cgroup program, used only for learning eBPF
ChuanhongLi/eBPF-learning
I'm a beginner with eBPF, and this project records my path to learning it
ChuanhongLi/exl2-for-all
EXL2 quantization generalized to other models.
ChuanhongLi/GenZ-LLM-Analyzer
LLM Inference analyzer for different hardware platforms
ChuanhongLi/gobpf
Go bindings for creating BPF programs.
ChuanhongLi/QuIP-for-all
QuIP quantization
ChuanhongLi/quip-sharp
ChuanhongLi/vllm-gptq
A high-throughput and memory-efficient inference and serving engine for LLMs
ChuanhongLi/Awesome-Efficient-LLM
A curated list for Efficient Large Language Models
ChuanhongLi/CacheBlend
ChuanhongLi/marlin
FP16xINT4 LLM inference kernel that can achieve near-ideal ~4x speedups up to medium batch sizes of 16-32 tokens.