James-QiuHaoran/FineInfer
Deferred Continuous Batching in Resource-Efficient Large Language Model Serving (EuroMLSys 2024)
Python · MIT