Productionizing AI and LLM Apps with Ray Serve

ODSC West 2023

© 2023, Anyscale Inc. All Rights Reserved

Overview

Once our AI/ML models are ready for deployment, that's when the fun really starts. We need our AI-powered services to be resilient and efficient, to scale with demand, and to adapt to heterogeneous environments (for example, by using GPUs or TPUs as effectively as possible). Moreover, when we build applications around online inference, we often need to integrate multiple components: several models, data sources, business logic, and more.

Ray Serve was built so that we can easily overcome all of those challenges.

In this class we'll learn to use Ray Serve to compose online inference applications that meet all of these requirements and more. We'll build services that integrate with each other while autoscaling individually, each with its own hardware and software requirements -- all in regular Python, often with just one new line of code.
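
To give a feel for what that looks like, here is a minimal sketch of a Ray Serve deployment (the `Greeter` class is a hypothetical placeholder for a real model): a single decorator turns an ordinary Python class into a scalable deployment served over HTTP.

```python
from ray import serve
from starlette.requests import Request


@serve.deployment  # the "one new line of code"
class Greeter:
    async def __call__(self, request: Request) -> str:
        name = request.query_params.get("name", "world")
        return f"Hello, {name}!"


# Bind the deployment into an application and start serving it over HTTP.
serve.run(Greeter.bind())
```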

Motivating Scenario: Multilingual LLM Chat

For our example use case, we'll see how to use Ray Serve to host an LLM chat model and how to enhance it with additional services that support multilingual interaction.
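
As a preview of the composition pattern (a rough sketch, not the workshop's exact code), one deployment can hold a handle to another and call it like a regular async Python object. Here the chat and translation logic are stubbed out with placeholders, and the call assumes the `DeploymentHandle` API:

```python
from ray import serve


@serve.deployment
class Chat:
    def reply(self, prompt: str) -> str:
        # Placeholder for a real LLM call.
        return f"(LLM answer to: {prompt})"


@serve.deployment
class MultilingualChat:
    def __init__(self, chat):
        # `chat` arrives as a handle to the downstream Chat deployment.
        self._chat = chat

    async def __call__(self, request) -> str:
        text = request.query_params.get("text", "")
        # In the real app: translate `text` to English here...
        answer = await self._chat.reply.remote(text)
        # ...and translate `answer` back to the user's language.
        return answer


# Wire the two deployments together and serve the composed application.
serve.run(MultilingualChat.bind(Chat.bind()))
```

Each deployment in the graph scales independently, so the translation layer and the chat model can each have their own replica counts and resource requirements.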

Learning Outcomes

  • Develop an understanding of the various architectural components of Ray Serve.
  • Use the deployment and deployment graph APIs to serve machine learning models in production environments for online inference.
  • Combine multiple models to build complex logic, allowing for a more sophisticated machine learning pipeline.

Topics discussed

  • Context of Ray Serve
  • Deployments
  • Service resources (e.g., CPU/GPU/...)
  • Runtime environments and dependencies
  • Composing deployments to build more complex applications
  • Architecture / Under-the-hood
  • Scaling, performance, batching, and more production patterns (several of these are sketched in code below)
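
Several of these topics show up as configuration on the same deployment decorator. A hedged sketch follows; the `Embedder` class and all values are illustrative placeholders, not recommendations:

```python
from typing import List

from ray import serve


@serve.deployment(
    ray_actor_options={"num_gpus": 1},  # service resources (a runtime_env can also go here)
    autoscaling_config={"min_replicas": 1, "max_replicas": 4},  # scale with demand
)
class Embedder:
    @serve.batch(max_batch_size=8)  # transparently batch concurrent requests
    async def embed(self, texts: List[str]) -> List[str]:
        # Placeholder: a real model would process the whole batch in one pass.
        return [f"embedding({t})" for t in texts]

    async def __call__(self, request) -> str:
        text = request.query_params.get("text", "")
        # Callers send one item at a time; Ray Serve groups them into batches.
        return await self.embed(text)
```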

Connect with the Ray community

You can learn more and get involved with the Ray community of developers and researchers: