From a model in an IPython notebook to a production API service in 5 minutes.
BentoML is a Python library for packaging and deploying machine learning models. It provides high-level APIs for defining an ML service and packaging its artifacts, source code, dependencies, and configurations into a production-system-friendly format that is ready for deployment.
- Installation
- Getting Started
- Documentation (Coming soon!)
- Examples
- Releases and Contributing
- License
- Multiple Distribution Formats - Easily package your machine learning models into the format that works best with your inference scenario:
- Docker Image - deploy as containers running REST API Server
- PyPI Package - integrate into your python applications seamlessly
- CLI tool - put your model into Airflow DAG or CI/CD pipeline
- Spark UDF - run batch serving on large datasets with Spark
- Serverless Function - host your model with serverless cloud platforms
- Multiple Framework Support - BentoML supports a wide range of ML frameworks out-of-the-box, including TensorFlow, PyTorch, scikit-learn, and XGBoost, and can be easily extended to work with new or custom frameworks.
- Deploy Anywhere - BentoML-bundled ML services can be easily deployed with platforms such as Docker, Kubernetes, Serverless, Airflow, and Clipper, on cloud platforms including AWS Lambda/ECS/SageMaker, Google Cloud Functions, and Azure ML.
- Custom Runtime Backend - Easily integrate your Python preprocessing code with a high-performance deep learning model runtime backend (such as tensorflow-serving) to deploy low-latency serving endpoints.
pip install bentoml
Verify installation:
bentoml --version
Let's get started with a simple scikit-learn model as an example:
from sklearn import svm
from sklearn import datasets

# train a support vector classifier on the Iris dataset
clf = svm.SVC(gamma='scale')
iris = datasets.load_iris()
X, y = iris.data, iris.target
clf.fit(X, y)
To package this model with BentoML, you don't need to change anything in your training code. Simply create a new service by subclassing BentoService:
%%writefile iris_classifier.py
from bentoml import BentoService, api, env, artifacts
from bentoml.artifact import PickleArtifact
from bentoml.handlers import DataframeHandler
# You can also import your own python module here and BentoML will automatically
# figure out the dependency chain and package all those python modules
@artifacts([PickleArtifact('model')])
@env(conda_pip_dependencies=["scikit-learn"])
class IrisClassifier(BentoService):

    @api(DataframeHandler)
    def predict(self, df):
        # arbitrary preprocessing or feature fetching code can be placed here
        return self.artifacts.model.predict(df)
The `@artifacts` decorator here tells BentoML what artifacts are required when packaging this BentoService. Besides `PickleArtifact`, BentoML also provides `TfKerasModelArtifact`, `PytorchModelArtifact`, `TfSavedModelArtifact`, etc.
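For instance, a Keras model could be packaged the same way by swapping in a different artifact type. Below is a minimal sketch, assuming `TfKerasModelArtifact` takes an artifact name in its constructor just like `PickleArtifact` does:

from bentoml import BentoService, api, env, artifacts
from bentoml.artifact import TfKerasModelArtifact
from bentoml.handlers import DataframeHandler

# assumes the same constructor convention as PickleArtifact
@artifacts([TfKerasModelArtifact('model')])
@env(conda_pip_dependencies=["tensorflow"])
class KerasClassifier(BentoService):

    @api(DataframeHandler)
    def predict(self, df):
        # the artifact takes care of saving and loading the Keras model
        return self.artifacts.model.predict(df.values)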
`@env` is designed for specifying the desired system environment in order for this BentoService to load. Other ways you can use this decorator:
- If you already have a requirement.txt file listing all the Python libraries you need:
@env(requirement_txt='../myproject/requirement.txt')
- Or if you are running this code in a Conda environment that matches the desired production environment:
@env(with_current_conda_env=True)
Lastly, `@api` adds an entry point for accessing this BentoService. Each `api` will be translated into a REST endpoint when deploying as an API server, or a CLI command when running as a CLI tool.
Each API also requires a `Handler` for defining the expected input format. In this case, `DataframeHandler` will transform either an HTTP request or CLI command arguments into a pandas DataFrame and pass it down to the user-defined API function. BentoML also supports `JsonHandler`, `ImageHandler`, and `TensorHandler`.
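As a rough sketch of another handler, an API accepting JSON input might look like the following; this assumes `JsonHandler` passes the parsed JSON payload straight to the API function (the exact payload shape is an assumption, not confirmed here):

from bentoml import BentoService, api, artifacts
from bentoml.artifact import PickleArtifact
from bentoml.handlers import JsonHandler

@artifacts([PickleArtifact('model')])
class IrisJsonClassifier(BentoService):

    @api(JsonHandler)
    def predict_json(self, parsed_json):
        # assumes the handler delivers the parsed request body,
        # e.g. {"features": [5.1, 3.5, 1.4, 0.2]}
        return self.artifacts.model.predict([parsed_json['features']])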
Next, to save your trained model for production use with this custom BentoService class:
# 1) import the custom BentoService defined above
from iris_classifier import IrisClassifier
# 2) `pack` it with required artifacts
svc = IrisClassifier.pack(model=clf)
# 3) save packed BentoService as archive
svc.save('./bento_archive', version='v0.0.1')
# the archive will be saved to ./bento_archive/IrisClassifier/v0.0.1/
That's it. You've just created your first BentoArchive. It's a directory containing all the source code, data, and configuration files required to load and run a BentoService. You will also find three 'magic' files generated within the archive directory:
- `bentoml.yml` - a YAML file containing all metadata related to this BentoArchive
- `Dockerfile` - for building a Docker Image exposing this BentoService as REST API endpoints
- `setup.py` - the config file that makes a BentoArchive 'pip' installable
To expose your model as an HTTP API endpoint, you can simply use the `bentoml serve` command:
bentoml serve ./bento_archive/IrisClassifier/v0.0.1/
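Once the server is up, you can send it a prediction request. Here is a minimal sketch using the `requests` library; it assumes the server listens on the default port 5000 (as in the Docker example below), that each `api` is exposed under its function name, and that `DataframeHandler` accepts a JSON list of rows:

import requests

# one row of iris features; the exact JSON layout expected by
# DataframeHandler is an assumption here
response = requests.post(
    'http://localhost:5000/predict',
    json=[[5.1, 3.5, 1.4, 0.2]],
)
print(response.text)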
Note that you must ensure the pip and conda dependencies are available in your Python environment when using the `bentoml serve` command. More commonly, we recommend running the BentoML API server with Docker:
You can build a Docker Image for running an API server hosting your BentoML archive by using the archive folder as the docker build context:
cd ./bento_archive/IrisClassifier/v0.0.1/
docker build -t iris-classifier .
Next, you can `docker push` the image to your choice of registry for deployment, or run it locally for development and testing:
docker run -p 5000:5000 iris-classifier
`bentoml.load` is the essential API for loading a BentoArchive into your Python application:
import bentoml
# yes it works with BentoArchive saved to s3 ;)
bento_svc = bentoml.load('s3://my-bento-svc/iris_classifier/')
bento_svc.predict(X[0])
BentoML also supports distributing a BentoService as a PyPI package, using the generated `setup.py` file. A BentoArchive can be installed with `pip`:
pip install ./bento_archive/IrisClassifier/v0.0.1/
import IrisClassifier
installed_svc = IrisClassifier.load()
installed_svc.predict(X[0])
With the `setup.py` config, a BentoArchive can also be uploaded to pypi.org as a public Python package, or to your organization's private PyPI index so that all of its developers can use it:
cd ./bento_archive/IrisClassifier/v0.0.1/
# You will need a ".pypirc" config file before doing this:
# https://docs.python.org/2/distutils/packageindex.html
python setup.py sdist upload
When you `pip install` a BentoML archive, it also provides you with a CLI tool for accessing your BentoService's APIs from the command line:
pip install ./bento_archive/IrisClassifier/v0.0.1/
IrisClassifier info # this will also print out all APIs available
IrisClassifier predict --input='./test.csv'
Alternatively, you can use the `bentoml` CLI to load and run a BentoArchive directly:
bentoml info ./bento_archive/IrisClassifier/v0.0.1/
bentoml predict ./bento_archive/IrisClassifier/v0.0.1/ --input='./test.csv'
We built BentoML because we think there should be a much simpler way for machine learning teams to ship models to production. They should not have to wait for engineering teams to re-implement their models for the production environment, or to build complex feature pipelines for experimental models.
Our vision is to empower Machine Learning scientists to build and ship their own models end-to-end as production services, just like software engineers do. BentoML is essentially the missing 'build tool' for Machine Learning projects.
All examples can be found in the BentoML/examples directory.
- Quick Start with sklearn
- Sentiment Analysis with Scikit-Learn
- Text Classification with Tensorflow Keras
- Fashion MNIST classification with Pytorch
- Fashion MNIST classification with Tensorflow Keras
- Deploy with Serverless framework
- More examples coming soon!
BentoML is under active development. The current version is a beta release; we may change APIs in future releases.
Want to help build BentoML? Check out our contributing documentation.
BentoML is GPL-3.0 licensed, as found in the COPYING file.