model-inference-service
There are 4 repositories under the model-inference-service topic.
bentoml/BentoML
The easiest way to serve AI apps and models - build model inference APIs, job queues, LLM apps, multi-model pipelines, and more!
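BentoML turns decorated Python classes into inference endpoints. Below is a minimal sketch using the 1.2-style `@bentoml.service` and `@bentoml.api` decorators; the `EchoUpper` class and its stand-in "model" are hypothetical placeholders, not anything shipped with BentoML.

```python
import bentoml

# Hypothetical minimal service: the class name and the fake
# "model" (str.upper) are illustrative placeholders.
@bentoml.service
class EchoUpper:
    @bentoml.api
    def predict(self, text: str) -> str:
        # A real service would run a loaded model here.
        return text.upper()
```

Assuming the file is saved as service.py, something like `bentoml serve service:EchoUpper` starts an HTTP server that exposes the method as a `/predict` route.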
bentoml/CLIP-API-service
CLIP as a service - embed images and sentences, with support for object recognition, visual reasoning, image classification, and reverse image search
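The service is built around CLIP's joint text/image embedding space. The sketch below illustrates that idea generically with the Hugging Face transformers CLIP classes rather than the CLIP-API-service client itself; the checkpoint is the standard OpenAI release and photo.jpg is a placeholder path.

```python
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

# Encode one image and two candidate captions into the shared space.
image = Image.open("photo.jpg")  # placeholder path
inputs = processor(
    text=["a photo of a dog", "a photo of a cat"],
    images=image,
    return_tensors="pt",
    padding=True,
)
with torch.no_grad():
    outputs = model(**inputs)

# Similarity scores between the image and each caption; the same
# embeddings power zero-shot classification and reverse image search.
print(outputs.logits_per_image)
```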
bentoml/transformers-nlp-service
Online inference API for NLP Transformer models - summarization, text classification, sentiment analysis, and more
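For the tasks listed above, a plain Hugging Face pipeline call captures what such an online inference API does per request. This is a generic sketch: the default checkpoints these pipelines download are not necessarily the models the repo deploys.

```python
from transformers import pipeline

# Task pipelines matching the repo's advertised capabilities.
sentiment = pipeline("sentiment-analysis")
summarizer = pipeline("summarization")

print(sentiment("The new release is fast and easy to deploy."))

article = (
    "Model inference services wrap trained models behind network "
    "endpoints so applications can request predictions on demand, "
    "scale workers independently, and batch requests for throughput."
)
print(summarizer(article, max_length=40))
```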
ksm26/Efficiently-Serving-LLMs
Learn the ins and outs of efficiently serving Large Language Models (LLMs). Dive into optimization techniques, including KV caching and Low-Rank Adapters (LoRA), and gain hands-on experience with Predibase’s LoRAX inference server.
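KV caching is the central optimization here: the attention keys and values computed for the prompt are stored so that each generated token costs one forward step over a single token instead of re-encoding the whole sequence. The sketch below shows the mechanic with plain transformers calls rather than LoRAX (which additionally multiplexes LoRA adapters across requests); gpt2 is an arbitrary small stand-in model.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

ids = tok("Serving LLMs efficiently", return_tensors="pt").input_ids
with torch.no_grad():
    # Prefill: run the full prompt once and keep the K/V cache.
    out = model(ids, use_cache=True)
    past = out.past_key_values
    next_id = out.logits[:, -1:].argmax(-1)

    # Decode step: feed only the new token plus the cached K/V,
    # avoiding recomputation over the prompt.
    out = model(next_id, past_key_values=past, use_cache=True)
```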