co-serving
There is 1 repository under the co-serving topic.
FineInfer
Deferred Continuous Batching in Resource-Efficient Large Language Model Serving (EuroMLSys 2024)