PotatoSpudowski/fastLLaMa

fastLLaMa: An experimental high-performance framework for running Decoder-only LLMs with 4-bit quantization in Python using a C/C++ backend.

CMIT

Readme
48 Issues
408 Stargazers
10 Watchers

Watchers

eemailme
l0d0v1c (France)
lin72h
miolini (SentientWave Inc.)
PotatoSpudowski (Quillbot)
Qubitium (ModelCloud.ai)
SatoshiVR
sultanmirzapydev
tiendung
YannCat