Pinned Repositories
CUHKSZzxy.github.io
My personal blog!
Hello-World
The first repository that I created on GitHub.
KIVI
KIVI: A Tuning-Free Asymmetric 2bit Quantization for KV Cache
lmdeploy
LMDeploy is a toolkit for compressing, deploying, and serving LLMs.
Machine-Learning
SnapKV
RouteLLM
A framework for serving and evaluating LLM routers - save LLM costs without compromising quality!
GEAR
GEAR: An Efficient KV Cache Compression Recipe for Near-Lossless Generative Inference of LLM
KVQuant
[NeurIPS 2024] KVQuant: Towards 10 Million Context Length LLM Inference with KV Cache Quantization
CUHKSZzxy's Repositories
CUHKSZzxy/CUHKSZzxy.github.io
My personal blog!
CUHKSZzxy/Hello-World
The first repository that I created on GitHub.
CUHKSZzxy/KIVI
KIVI: A Tuning-Free Asymmetric 2bit Quantization for KV Cache
CUHKSZzxy/lmdeploy
LMDeploy is a toolkit for compressing, deploying, and serving LLMs.
CUHKSZzxy/Machine-Learning