
prima.cpp: Speeding up 70B-scale LLM inference on low-resource everyday home clusters

Primary language: C++
License: MIT

This repository is not active