PotatoSpudowski/fastLLaMa
fastLLaMa: An experimental high-performance framework for running Decoder-only LLMs with 4-bit quantization in Python using a C/C++ backend.
C · MIT
Stargazers
- achuthasubhash (GUNTUR, INDIA)
- amirrezasalimi (Toolstack)
- amrrs
- bailoo (Lazy IITian)
- balsulami
- Banguiskode
- Chensl9393
- chittiman
- cosimoiaia (Berlin)
- covexp (Barcelona)
- dsindex (https://github.com/kakaobrain)
- enod
- Frostchomp
- Gochomer
- GuangDai (China)
- gurv234 (New Delhi)
- ianrowan (MindBuilder AI)
- joe-barhouch
- jvicentem (Madrid, Spain)
- karelnagel (Asius)
- MachineLemonade (Cincinnati)
- MagnusPetersen (FIAS)
- migperfer (Ph.D. Student at Pompeu Fabra University)
- notmehul (Downtown Coolsville)
- ntkrnl
- rahulkolasseri
- RaulKite (University of Murcia)
- remghoost (California)
- shyamsn97
- SupreethRao99 (GrapheneAI)
- thtran02
- tiendung
- UglyStupidHonest (Haarlem Netherlands)
- vigsivan (Altis Labs)
- xids2016
- zotona