/fastgen

A Fast and Scalable Generative Model Inference on Distributed Multi-GPU Environment (KCC 2023)

Primary Language: Cuda
License: MIT
