drm-charlie

drm-charlie's Stars

aws-samples/reinvent2021-aim408-high-performance-cost-effective-model-deployment-amazon-sagemaker
This repo contains demo code for re:Invent 2021 session AIM408: Achieve high performance and cost-effective model deployment
Language: Jupyter Notebook · 103
predibase/lorax
Multi-LoRA inference server that scales to 1000s of fine-tuned LLMs
Language: Python · 2.1k stars · 141 forks
S-LoRA/S-LoRA
S-LoRA: Serving Thousands of Concurrent LoRA Adapters
Language: Python · 1.7k stars · 92 forks