drm-charlie's Stars
aws-samples/reinvent2021-aim408-high-performance-cost-effective-model-deployment-amazon-sagemaker
This repo contains demo code for the re:Invent 2021 session AIM408, "Achieve high performance and cost-effective model deployment"
predibase/lorax
Multi-LoRA inference server that scales to 1000s of fine-tuned LLMs
S-LoRA/S-LoRA
S-LoRA: Serving Thousands of Concurrent LoRA Adapters