LLMServingSim: A HW/SW Co-Simulation Infrastructure for LLM Inference Serving at Scale
Primary language: Python · License: MIT