bodywork-core

ML pipeline orchestration and model deployments on Kubernetes.

Primary language: Python. License: GNU Affero General Public License v3.0 (AGPL-3.0).


Bodywork is a command-line tool that deploys machine learning pipelines to Kubernetes. It takes care of everything to do with containers and orchestration, so that you don't have to.

Who is this for?

Bodywork is aimed at teams who want a solution for running ML pipelines and deploying models to Kubernetes. It is a lighter and simpler alternative to Kubeflow, or to building your own platform around a workflow orchestration tool such as Apache Airflow, Argo Workflows or Dagster.

Pipeline = Jobs + Services

Any stage in a Bodywork pipeline can do one of two things:

  • run a batch job, to prepare features, train models, compute batch predictions, etc.
  • start a long-running process, like a Flask app that serves model predictions via HTTP.

You can use these to compose pipelines for many common ML use-cases, from serving pre-trained models to running continuous training on a schedule.
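As a rough illustration of the second kind of stage, the sketch below is a long-running process that serves predictions over HTTP. The text above mentions Flask; this version uses only the Python standard library so that it has no dependencies. The predict function, the JSON field name x and the port number are hypothetical choices made for the example and are not part of Bodywork.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def predict(x: float) -> float:
    # Stand-in for a trained model (assumption): a fixed linear function.
    return 0.5 * x + 1.0


class PredictionHandler(BaseHTTPRequestHandler):
    def do_POST(self) -> None:
        # Parse a JSON request body such as {"x": 2.0}.
        length = int(self.headers.get("Content-Length", 0))
        try:
            x = float(json.loads(self.rfile.read(length))["x"])
        except (ValueError, KeyError, TypeError):
            self.send_error(400, "expected JSON body with numeric field x")
            return
        body = json.dumps({"prediction": predict(x)}).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def make_server(port: int = 5000) -> HTTPServer:
    # Listen on all interfaces so the process is reachable inside a container.
    return HTTPServer(("0.0.0.0", port), PredictionHandler)
```

A module used as a service stage would end by calling make_server().serve_forever(), which blocks until the process is stopped.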

No Boilerplate Code Required

Defining a stage is as simple as developing an executable Python module or Jupyter notebook that performs the required tasks, and then committing it to your project's Git repository. You are free to structure your codebase as you wish and there are no new APIs to learn.
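To make this concrete, here is a minimal sketch of what an executable Python module for a batch stage could look like. It is plain Python with no Bodywork imports. The toy data, the fit_line helper and the printed output are hypothetical; a real stage would load features from storage and persist the trained model somewhere.

```python
from statistics import mean
from typing import Sequence, Tuple


def fit_line(xs: Sequence[float], ys: Sequence[float]) -> Tuple[float, float]:
    """Ordinary least squares fit of y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    covariance = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    variance = sum((x - x_bar) ** 2 for x in xs)
    slope = covariance / variance
    return slope, y_bar - slope * x_bar


def main() -> None:
    # Toy data standing in for features a real stage would load (assumption).
    xs = [0.0, 1.0, 2.0, 3.0]
    ys = [1.0, 3.0, 5.0, 7.0]
    slope, intercept = fit_line(xs, ys)
    print(f"slope={slope:.2f} intercept={intercept:.2f}")


if __name__ == "__main__":
    main()
```

Because the module runs to completion and then exits, it fits the batch-job kind of stage described earlier.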

[Figure: Git project structure]

Easy to Configure

Stages are assembled into DAGs that define your pipeline's workflow. The DAG and all other key configuration live in a single bodywork.yaml file.
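The idea of assembling stages into a DAG can be sketched in a few lines of Python. The notation used here is an assumption made for illustration: steps are separated by >> and run in sequence, while stages within a step are separated by commas and may run in parallel. The stage names are hypothetical, and the Bodywork documentation remains the reference for the actual bodywork.yaml schema.

```python
from typing import List


def parse_dag(dag: str) -> List[List[str]]:
    """Split a DAG string into ordered steps, each a list of stage names."""
    steps: List[List[str]] = []
    for step in dag.split(">>"):
        stages = [name.strip() for name in step.split(",") if name.strip()]
        if not stages:
            raise ValueError(f"empty step in DAG: {dag!r}")
        steps.append(stages)
    return steps


# Hypothetical pipeline: one preparation stage, two training stages that can
# run side by side, then a serving stage.
plan = parse_dag("prepare_data >> train_model, train_baseline >> serve_model")
for number, stages in enumerate(plan, start=1):
    print(f"step {number}: {', '.join(stages)}")
```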

Simplified DevOps for ML

Bodywork removes the need for you to build and manage container images for any stage of your pipeline. It works by running all stages using Bodywork's custom container image. Each stage starts by pulling all the required files directly from your project's Git repository (e.g., from GitHub), pip-installing any required dependencies, and then running the stage's designated Python module (or Jupyter notebook).
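The three steps described above (pull the files from Git, pip-install the dependencies, run the module) can be sketched as follows. This is not Bodywork's implementation, only an illustration of the sequence; the function names, the working directory name and the example URL in the comment are hypothetical.

```python
import subprocess
import sys
from typing import List, Sequence


def stage_commands(
    repo_url: str,
    branch: str,
    requirements: Sequence[str],
    module: str,
    workdir: str = "project",
) -> List[List[str]]:
    """Build the commands for one stage: clone, install, run."""
    commands = [
        ["git", "clone", "--depth", "1", "--branch", branch, repo_url, workdir]
    ]
    if requirements:
        commands.append([sys.executable, "-m", "pip", "install", *requirements])
    # The module path is relative to the cloned project directory.
    commands.append([sys.executable, module])
    return commands


def run_stage(
    repo_url: str,
    branch: str,
    requirements: Sequence[str],
    module: str,
    workdir: str = "project",
) -> None:
    """Execute the commands in order, stopping at the first failure."""
    clone, *rest = stage_commands(repo_url, branch, requirements, module, workdir)
    subprocess.run(clone, check=True)
    for command in rest:
        # Later commands run inside the cloned repository.
        subprocess.run(command, check=True, cwd=workdir)


# Example call (hypothetical repository, not executed here):
# run_stage("https://example.com/org/project.git", "main", ["numpy"], "train.py")
```

Separating command construction from execution keeps the sequence easy to inspect without touching the network.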

Get Started

Bodywork is distributed as a Python package on PyPI and can be installed with pip (pip install bodywork).

Add a bodywork.yaml file to your Python project’s Git repo. The contents of this file describe how your project will be deployed.

Bodywork is used from the command line to deploy projects to Kubernetes clusters. With a single command, you can start Bodywork containers (using the image we host on Docker Hub) that pull Python modules directly from your project’s Git repo and run them.

You don’t need to build Docker images or understand how to configure Kubernetes resources. Bodywork fills the gap between executable Python modules and operational jobs and services on Kubernetes.

If you’re new to Kubernetes, check out our guide to Kubernetes for ML. We’ll have you up and running with a test cluster on your laptop in under 10 minutes.

Documentation

The documentation for bodywork-core can be found here. This is the best place to start.

Deployment Templates

To accelerate your project's journey to production, we provide deployment templates for common use-cases.

We Want Your Feedback

If Bodywork sounds like a useful tool, then please send us a signal with a GitHub ★.

Contacting Us

If you:

  • Have a question that these pages haven't answered, or need help getting started with Kubernetes, then please use our discussion board.
  • Have found a bug, then please open an issue.
  • Would like to contribute, then please talk to us first at info@bodyworkml.com.
  • Would like to commission new functionality, then please contact us at info@bodyworkml.com.

Bodywork is brought to you by Bodywork Machine Learning.