tlc-pack/cutlass_fpA_intB_gemm
A standalone GEMM kernel for fp16 activation and quantized weight, extracted from FasterTransformer
C++ · Apache-2.0
Stargazers
- BBuf (SkyWork)
- chhzh123 (Cornell University)
- Dazz993 (University of Toronto)
- fengyuentau (Shenzhen, Guangdong, China)
- fishelegs
- fly51fly (PRIS)
- frankxyy (Shanghai, China)
- fxmarty (Hugging Face)
- hhy3 (@zilliztech)
- hyaihjq
- ice-tong (Beijing)
- JianchaoTan (George Mason University, CS Department)
- JiaoYanMoGu (Xiaomi Corporation)
- kfeng123
- l1nkr (Nankai University)
- lambda7xx (Shanghai Jiao Tong University)
- LeiWang1999 (Institute of Computing Technology, UCAS)
- liangzelang (Jilin University)
- lianxintao
- MARD1NO (SiliconFlow)
- MetaBlues (Beijing, China)
- niexiaokun123
- Oliver-ss (Duke University)
- peterjc123 (Shanghai, China)
- petrex (Mountain View, California)
- q121q
- Raphael-Hao (Shanghai Jiao Tong University)
- Sandalots (Volcanak)
- senlyu163
- songkq
- SushantDaga
- tqchen (CMU, OctoML)
- vicwer
- Yanxing-Shi (AMD.inc)
- zhangjun (Beijing)
- ZhW-loop