
BentoML

From a model in ipython notebook to production API service in 5 minutes.


BentoML is a python library for packaging and deploying machine learning models. It provides high-level APIs for defining an ML service and packaging its artifacts, source code, dependencies, and configurations into a production-system-friendly format that is ready for deployment.


Feature Highlights

  • Multiple Distribution Formats - Easily package your Machine Learning models into the format that works best with your inference scenario:

    • Docker Image - deploy as containers running REST API Server
    • PyPI Package - integrate into your python applications seamlessly
    • CLI tool - put your model into Airflow DAG or CI/CD pipeline
    • Spark UDF - run batch serving on large datasets with Spark
    • Serverless Function - host your model with serverless cloud platforms
  • Multiple Framework Support - BentoML supports a wide range of ML frameworks out-of-the-box, including TensorFlow, PyTorch, Scikit-Learn, and XGBoost, and can be easily extended to work with new or custom frameworks.

  • Deploy Anywhere - BentoML-bundled ML services can be easily deployed with platforms such as Docker, Kubernetes, Serverless, Airflow, and Clipper, on cloud platforms including AWS Lambda/ECS/SageMaker, Google Cloud Functions, and Azure ML.

  • Custom Runtime Backend - Easily integrate your python preprocessing code with a high-performance deep learning model runtime backend (such as TensorFlow Serving) to deploy low-latency serving endpoints.

Installation


pip install bentoml

Verify installation:

bentoml --version

Getting Started

Let's get started with a simple scikit-learn model as an example:

from sklearn import svm
from sklearn import datasets

clf = svm.SVC(gamma='scale')
iris = datasets.load_iris()
X, y = iris.data, iris.target
clf.fit(X, y)

To package this model with BentoML, you don't need to change anything in your training code. Simply define a new service by subclassing BentoService:

%%writefile iris_classifier.py
from bentoml import BentoService, api, env, artifacts
from bentoml.artifact import PickleArtifact
from bentoml.handlers import DataframeHandler

# You can also import your own python module here and BentoML will automatically
# figure out the dependency chain and package all those python modules

@artifacts([PickleArtifact('model')])
@env(conda_pip_dependencies=["scikit-learn"])
class IrisClassifier(BentoService):

    @api(DataframeHandler)
    def predict(self, df):
        # arbitrary preprocessing or feature fetching code can be placed here 
        return self.artifacts.model.predict(df)

The @artifacts decorator here tells BentoML what artifacts are required when packaging this BentoService. Besides PickleArtifact, BentoML also provides TfKerasModelArtifact, PytorchModelArtifact, TfSavedModelArtifact, and more.
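
The core idea behind PickleArtifact can be illustrated with a minimal stdlib sketch (illustration only, not BentoML's actual implementation): the trained model object is serialized at packaging time and deserialized again when the service is loaded.

```python
import pickle

# ToyModel is a hypothetical stand-in for a trained model object
class ToyModel:
    def predict(self, values):
        return [v * 2 for v in values]

# "pack": serialize the trained model object into bytes
serialized = pickle.dumps(ToyModel())

# "load": restore the same model object inside the deployed service
restored = pickle.loads(serialized)
print(restored.predict([1, 2, 3]))  # → [2, 4, 6]
```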

@env is designed for specifying the system environment this BentoService needs in order to load. Other ways you can use this decorator:

  • If you already have a requirement.txt file listing all python libraries you need:
@env(requirement_txt='../myproject/requirement.txt')
  • Or if you are running this code in a Conda environment that matches the desired production environment:
@env(with_current_conda_env=True)

Lastly @api adds an entry point for accessing this BentoService. Each api will be translated into a REST endpoint when deploying as API server, or a CLI command when running as a CLI tool.

Each API also requires a Handler for defining the expected input format. In this case, DataframeHandler will transform either an HTTP request or CLI command arguments into a pandas DataFrame and pass it down to the user-defined API function. BentoML also supports JsonHandler, ImageHandler and TensorHandler.
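
Conceptually, a Handler normalizes raw input from different channels into one structure before it reaches your API function. A stdlib-only sketch of that idea (function names here are hypothetical, not BentoML APIs):

```python
import csv
import io
import json

def rows_from_csv(raw_text):
    """Parse CSV text (e.g. from a CLI --input file) into rows of floats."""
    reader = csv.reader(io.StringIO(raw_text))
    return [[float(v) for v in row] for row in reader]

def rows_from_json(raw_text):
    """Parse a JSON array of rows (e.g. from an HTTP request body)."""
    return json.loads(raw_text)

# Both input channels converge on the same structure before predict() runs:
print(rows_from_csv("5.1,3.5,1.4,0.2"))      # → [[5.1, 3.5, 1.4, 0.2]]
print(rows_from_json("[[5.1, 3.5, 1.4, 0.2]]"))  # → [[5.1, 3.5, 1.4, 0.2]]
```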

Next, save your trained model for production use with this custom BentoService class:

# 1) import the custom BentoService defined above
from iris_classifier import IrisClassifier

# 2) `pack` it with required artifacts
svc = IrisClassifier.pack(model=clf)

# 3) save packed BentoService as archive
svc.save('./bento_archive', version='v0.0.1')
# archive will be saved to ./bento_archive/IrisClassifier/v0.0.1/

That's it. You've just created your first BentoArchive. It's a directory containing all the source code, data, and configuration files required to load and run a BentoService. You will also find three 'magic' files generated within the archive directory:

  • bentoml.yml - a YAML file containing all metadata related to this BentoArchive
  • Dockerfile - for building a Docker Image exposing this BentoService as REST API endpoint
  • setup.py - the config file that makes a BentoArchive 'pip' installable

Deployment & Inference Scenarios

Serving via REST API

To expose your model as an HTTP API endpoint, you can simply use the bentoml serve command:

bentoml serve ./bento_archive/IrisClassifier/v0.0.1/

Note that you must ensure the pip and conda dependencies are available in your python environment when using the bentoml serve command. More commonly, we recommend running the BentoML API server with Docker:

You can build a Docker Image for running API server hosting your BentoML archive by using the archive folder as docker build context:

cd ./bento_archive/IrisClassifier/v0.0.1/

docker build -t iris-classifier .

Next, you can docker push the image to your choice of registry for deployment, or run it locally for development and testing:

docker run -p 5000:5000 iris-classifier
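
Once the server is running, you can send it a prediction request. A sketch of what such a request looks like, assuming the /predict route mirrors the API function name and that DataframeHandler accepts a JSON list of rows (adjust for your own service):

```python
import json
import urllib.request

# One Iris flower measurement, shaped as a single dataframe row
sample = [[5.1, 3.5, 1.4, 0.2]]

# Build the POST request against the locally running API server
request = urllib.request.Request(
    "http://localhost:5000/predict",
    data=json.dumps(sample).encode("utf-8"),
    headers={"Content-Type": "application/json"},
    method="POST",
)

# Uncomment the next line with the server running to get the prediction:
# response = urllib.request.urlopen(request)
print(request.full_url, request.get_method())  # → http://localhost:5000/predict POST
```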

Loading BentoService in Python

bentoml.load is the essential API for loading a BentoArchive into your python application:

import bentoml

# yes it works with BentoArchive saved to s3 ;)
bento_svc = bentoml.load('s3://my-bento-svc/iris_classifier/')
bento_svc.predict(X[0])

Use as PyPI Package

BentoML also supports distributing a BentoService as a PyPI package, using the generated setup.py file. A BentoArchive can be installed with pip:

pip install ./bento_archive/IrisClassifier/v0.0.1/
import IrisClassifier

installed_svc = IrisClassifier.load()
installed_svc.predict(X[0])

With the setup.py config, a BentoArchive can also be uploaded to pypi.org as a public python package, or to your organization's private PyPI index for all developers in your organization to use:

cd ./bento_archive/IrisClassifier/v0.0.1/

# You will need a ".pypirc" config file before doing this:
# https://docs.python.org/2/distutils/packageindex.html
python setup.py sdist upload

Use as CLI tool

When you pip install a BentoML archive, it also provides you with a CLI tool for accessing your BentoService's APIs from the command line:

pip install ./bento_archive/IrisClassifier/v0.0.1/

IrisClassifier info  # this will also print out all APIs available

IrisClassifier predict --input='./test.csv'

Alternatively, you can use the bentoml CLI to load and run a BentoArchive directly:

bentoml info ./bento_archive/IrisClassifier/v0.0.1/

bentoml predict ./bento_archive/IrisClassifier/v0.0.1/ --input='./test.csv'

More About BentoML

We build BentoML because we think there should be a much simpler way for machine learning teams to ship models for production. They should not wait for engineering teams to re-implement their models for production environment or build complex feature pipelines for experimental models.

Our vision is to empower Machine Learning scientists to build and ship their own models end-to-end as production services, just like software engineers do. BentoML is essentially the missing 'build tool' for Machine Learning projects.

Examples

All examples can be found in the BentoML/examples directory.

Releases and Contributing

BentoML is under active development. The current version is a beta release; APIs may change in future releases.

Want to help build BentoML? Check out our contributing documentation.

License

BentoML is GPL-3.0 licensed, as found in the COPYING file.