/mlflow-minio-h2o-example

MLflow-tracking server example with Minio and H2O

Primary LanguageJupyter Notebook

MLflow-tracking example with Minio and H2O

  • MLflow tracking in server mode requires a cloud storage bucket in order to store models and artifacts
  • Minio provides an AWS-S3-like object storage bucket that can be deployed on a private cloud
  • This repo contains an example of using mlflow-tracking w/ minio to run an ML binary classifier experiment using the H2O framework
  • H2O model artifacts will be accessible in MLflow including: plots of scoring history, roc curve, confusion matrix, and variable importance

Step-by-step guide

[1] start Minio server for example use docker and choose access and secret keys:

docker run -p 9000:9000 --name minio1 \
  -e MINIO_ACCESS_KEY=minio_access_key \
  -e MINIO_SECRET_KEY=minio_secret_key \
  -v /mnt/data:/data \
  -v /mnt/config:/root/.minio \
  minio/minio server /data 

[2]

conda env create -f environment.yml

[3]

source activate mlflow_minio

[4] set environment variables:

export MLFLOW_S3_ENDPOINT_URL=minio_url
export AWS_ACCESS_KEY_ID=minio_access_key
export AWS_SECRET_ACCESS_KEY=minio_secret_key
export MLFLOW_URL=mlflow_url

[5] create a Minio bucket. For example see: minio_create_bucket.ipynb

[6] start the mlflow-server. for example, set the env variables and Minio bucket name of this Dockerfile

[7] run mlflow_tracking_with_h2o_minio.ipynb. make sure step 4 was completed before running the notebook.

references: