PotatoSpudowski/fastLLaMa
fastLLaMa: An experimental high-performance framework for running Decoder-only LLMs with 4-bit quantization in Python using a C/C++ backend.
C · MIT
Stargazers
- Franky1 (Germany)
- TimLeeGee
- liuyixin-louis
- sj-data
- dandelin (South Korea)
- ckoshka
- puddleglum56
- Technetium1 (USA)
- seoeaa
- ellonde
- SatoshiVR
- sequoiar (SHANGHAI, CHINA)
- MarkSchmidty (Detroit, Michigan, USA)
- thethiny
- GNamTr
- shaunabanana (Shanghai, China)
- fly51fly (BeiJing)
- Sandalots
- dsx-ai (shanghai)
- ericxsun
- imranraad07
- cbqin (suzhou)
- qhduan (Beijing, China)
- xyangk
- ronithk
- hieultp (Viet Nam, Ho Chi Minh City)
- yanqiangmiffy (Beijing)
- Varun0801 (Tamil Nadu, India)
- unnikrishnannambiar (India)
- ElleLeonne (Minnesota, USA)
- tuna2134 (Japan)
- rmallof (Spain)
- ptsalexndr
- eterna2 (Singapore)
- dahlej (Alabama, USA)
- daviddavid