mit-han-lab/qserve
QServe: W4A8KV4 Quantization and System Co-design for Efficient LLM Serving
Python, Apache-2.0
Stargazers
- A-suozhang (Tsinghua University)
- abacaj (software eng building things)
- cat538
- chuanmingliu (Westeros)
- DD-DuDa (Data Science and Analytic Thrust, Information Hub, HKUST(GZ))
- DefTruth (Statistics Department of JNU)
- dongxiaolong
- elejke
- Flying-Cloud (Tsinghua University)
- Guangxuan-Xiao (MIT)
- HakeemDemi (London UK)
- HaoKang-Timmy (Georgia Institute of Technology)
- happierpig (UC Berkeley)
- Harahan (SenseTime Research)
- hkeee21
- Iris-Liu96
- jiaxiaosong1002 (Shanghai Jiao Tong University)
- kentang-mit (Cambridge, Massachusetts, United States)
- lmxyy (Massachusetts Institute of Technology)
- lvdongxu (Shanghai Jiao Tong University)
- mikeshi80 (Shanghai Hyron Software Co Ltd.)
- Minami-su
- ncstiles
- peterjc123 (Shanghai, China)
- qipengwang (Peking University)
- ruikangliu
- Sakits (MIT, EECS)
- shadowpa0327 (Cornell University)
- shieldforever (Shanghai Jiao Tong University)
- TGLTommy
- Xiuyu-Li (UC Berkeley)
- Xu-Kai (National University of Singapore)
- ys-2020 (MIT)
- yvonwin
- zhuoyang20 (MIT)
- ZiruiOu