ML Workflow Playground (Deprecated)

This repo is a playground for implementing machine learning workflows that are structured as DAGs of dependent tasks. It's inspired by ideas from the blog post Patterns for Research in Machine Learning and the FBLearner Flow project, as well as personal experience working on the Raster Vision project. To keep things concrete yet simple, the example workflow implements object recognition with neural networks in PyTorch on the MNIST and CIFAR10 datasets. Hopefully, this repo will be a good starting point for new projects, even ones that use a different framework. The goals are to make it easy to:

  • add and switch between problem types, datasets, model architectures, and hyperparameters
  • run complex workflows and cache computations shared between tasks (see the sketch after this list)
  • switch between running locally and in the cloud
  • test workflows in a pragmatic fashion
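
To make the caching goal concrete, here is a minimal sketch of the task pattern, not the repo's actual API (run_task and compute_fn are hypothetical names): each task writes its output into its own directory under $DATA_DIR/results/<namespace>/<task>, and is skipped when that output already exists, so work shared between tasks is computed only once.

import os
from os.path import join

def run_task(namespace, task_name, compute_fn):
    """Run compute_fn once, caching its output under DATA_DIR/results/<namespace>/<task_name>."""
    out_dir = join(os.environ['DATA_DIR'], 'results', namespace, task_name)
    done_marker = join(out_dir, '.done')
    if os.path.exists(done_marker):
        print('{}: output is cached, skipping'.format(task_name))
        return out_dir
    os.makedirs(out_dir, exist_ok=True)
    compute_fn(out_dir)  # write plots, weights, metrics, etc. into out_dir
    open(done_marker, 'w').close()
    return out_dir

Downstream tasks depend on upstream tasks simply by reading from their output directories, which is what turns a set of tasks into a DAG.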

Getting started

Install Docker and run ./scripts/update, which builds the Docker image. Create a directory to hold datasets and results, and point an environment variable called DATA_DIR at it. Then run ./scripts/run to start the container.
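
Put together, the setup might look like this (the DATA_DIR path here is just an example):

./scripts/update                      # build the Docker image
export DATA_DIR=~/ml-workflow-data    # directory for datasets and results (example path)
mkdir -p $DATA_DIR
./scripts/run                         # start the container

To run an example task, run the following, which will generate $DATA_DIR/results/recog/plot_data/mnist_default_train.png: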

python -m pt.recog.tasks.plot_data --namespace recog --dataset mnist \
  --split train --nimages 12

To run an entire Makefile-based workflow, run the following, which will generate a set of directories inside $DATA_DIR/results/recog/mnist_tiny, with each subdirectory containing the output of one task.

./workflows/recog/tests/mnist_tiny all
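
A Makefile-based workflow like this can be thought of as a set of make targets, one per task, where each target is the task's output directory and depends on the output directories of the tasks it consumes, so make skips anything that already exists. Here is a minimal sketch of that idea, not the repo's actual Makefile (the train_model task and its flags are hypothetical; recipes must be indented with tabs):

$(DATA_DIR)/results/recog/plot_data:
	python -m pt.recog.tasks.plot_data --namespace recog --dataset mnist --split train --nimages 12

$(DATA_DIR)/results/recog/train_model: $(DATA_DIR)/results/recog/plot_data
	python -m pt.recog.tasks.train_model --namespace recog --dataset mnist

all: $(DATA_DIR)/results/recog/train_model

.PHONY: all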

TODO

  • Implement the object recognition workflow using command-line arguments and Makefiles
  • Add testing with a code coverage tool
  • Run in the cloud using AWS Batch
  • Switch to using Luigi or Airflow
  • Support hyperparameter sweeps, ensembling, and multiple post-processing tasks