A Fast and Scalable Generative Model Inference on Distributed Multi-GPU Environment (KCC 2023)
Primary language: Cuda. License: MIT.