model-inference-service

There are 4 repositories under the model-inference-service topic.

  • bentoml/BentoML

    The easiest way to serve AI apps and models - Build Model Inference APIs, Job queues, LLM apps, Multi-model pipelines, and more!

    Language: Python
  • bentoml/CLIP-API-service

    CLIP as a service - Embed image and sentences, object recognition, visual reasoning, image classification and reverse image search

    Language: Jupyter Notebook
  • bentoml/transformers-nlp-service

    Online Inference API for NLP Transformer models - summarization, text classification, sentiment analysis and more

    Language: Python
  • ksm26/Efficiently-Serving-LLMs

    Learn the ins and outs of efficiently serving Large Language Models (LLMs). Dive into optimization techniques, including KV caching and Low Rank Adapters (LoRA), and gain hands-on experience with Predibase’s LoRAX framework inference server.

    Language: Jupyter Notebook
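The last repository mentions two serving optimizations, KV caching and LoRA. As a rough illustration of the underlying ideas (not code from any of the repositories above; dimensions, names, and values are hypothetical toy choices), here is a minimal pure-Python sketch of both:

```python
import math

# --- LoRA (Low-Rank Adaptation), minimal sketch ---
# In practice W is a large frozen weight matrix and only the small
# low-rank factors A and B are trained; here everything is tiny.

def matvec(M, x):
    return [sum(m * v for m, v in zip(row, x)) for row in M]

def lora_forward(W, A, B, x, alpha=1.0):
    # y = W x + alpha * B (A x): the adapter adds a rank-r update to W
    return [base + alpha * delta
            for base, delta in zip(matvec(W, x),
                                    matvec(B, matvec(A, x)))]

# --- KV caching, minimal sketch ---
# Each autoregressive decode step appends one (key, value) pair
# instead of recomputing keys/values for the whole prefix.

def attend(q, ks, vs):
    # scaled dot-product attention of one query over cached keys/values
    scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(len(q))
              for k in ks]
    m = max(scores)
    ws = [math.exp(s - m) for s in scores]
    z = sum(ws)
    return [sum(w / z * v[i] for w, v in zip(ws, vs))
            for i in range(len(vs[0]))]

class KVCache:
    def __init__(self):
        self.ks, self.vs = [], []

    def step(self, q, k, v):
        self.ks.append(k)
        self.vs.append(v)
        return attend(q, self.ks, self.vs)

# Toy usage: identity base weight with a rank-1 adapter.
W = [[1.0, 0.0], [0.0, 1.0]]   # frozen 2x2 base weight
A = [[1.0, 1.0]]               # rank-1 down-projection
B = [[0.5], [0.5]]             # rank-1 up-projection
print(lora_forward(W, A, B, [1.0, 2.0]))  # -> [2.5, 3.5]
```

The cache trades memory for compute: attention at step t still looks at all t cached pairs, but the per-step cost of building keys and values drops from O(t) to O(1). Frameworks such as LoRAX additionally swap many trained (A, B) adapter pairs over one shared base model.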