Paella: Low-latency Model Serving with Virtualized GPU Scheduling
Primary Language: C++