2019-minicourse-submarine

DESIGN AND IMPLEMENTATION OF A MACHINE LEARNING PLATFORM

2019-minicourse-submarine slide, doc

What is Apache Submarine?

Apache Submarine is a unified AI platform that allows engineers and data scientists to run machine learning and deep learning workloads in a distributed cluster.

Goals of Submarine:

  • Allows jobs to easily access data/models in HDFS and other storage systems.
  • Can launch services to serve TensorFlow/PyTorch models.
  • Supports running distributed TensorFlow jobs with simple configs.
  • Supports running user-specified Docker images.
  • Supports specifying GPUs and other resources.
  • Supports launching TensorBoard for training jobs if the user requests it.
  • Supports customized DNS names for roles (e.g. tensorboard.$user.$domain:6006).

Prerequisites

  • Maven 3.3 or later (3.6.2 is known to fail, see SUBMARINE-273)
  • JDK 1.8

Install mini-submarine

git clone https://github.com/apache/submarine.git
cd submarine
mvn clean install package -DskipTests
cd dev-support/mini-submarine 
./build_mini-submarine.sh

Pull from Docker Hub (no Maven or Java required)

docker pull hadoopsubmarine/mini-submarine:0.3.0-SNAPSHOT 

Run mini-submarine

# Use hadoopsubmarine/mini-submarine:0.3.0-SNAPSHOT instead if you pulled the image from Docker Hub
docker run -it -h submarine-dev --net=bridge --privileged -P local/mini-submarine:0.3.0-SNAPSHOT /bin/bash

# In the container, bootstrap HDFS and YARN as the root user
/tmp/hadoop-config/bootstrap.sh

# Switch to the yarn user and run distributed training on Hadoop
su yarn
cd && cd submarine && ./run_submarine_mnist_tony.sh

In a real-world ML system, the ML code is only a small fraction of the surrounding infrastructure.

Deep learning use cases in the real world.

Machine Learning Platform

About this tutorial

After this tutorial, you will know:

Apache Submarine - a cloud-native machine learning platform.

Apache Airflow - a platform to programmatically author, schedule, and monitor workflows; a minimal DAG sketch follows.
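
The sketch below assumes a two-step pipeline; the task names and callables are hypothetical, while the tutorial's real pipeline lives under ./dags/:

# Minimal Airflow DAG sketch (hypothetical tasks; the real pipeline is in ./dags/)
from datetime import datetime

from airflow import DAG
from airflow.operators.python_operator import PythonOperator

def download_data():
    print("download the Kaggle dataset here")

def train_model():
    print("train and log the model here")

with DAG(dag_id="house_prices_pipeline",
         start_date=datetime(2019, 1, 1),
         schedule_interval=None) as dag:
    download = PythonOperator(task_id="download_data",
                              python_callable=download_data)
    train = PythonOperator(task_id="train_model",
                           python_callable=train_model)
    download >> train  # train only after the download task succeeds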

Kaggle - an online community of data scientists and machine learning practitioners. Kaggle lets users find and publish datasets, explore and build models in a web-based data science environment, work with other data scientists and machine learning engineers, and enter competitions to solve data science challenges.

Jupyter Notebook - an open-source web application that lets you create and share documents containing live code, equations, visualizations, and narrative text.

MLflow - an open source platform for managing the machine learning lifecycle; a tracking sketch follows.
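
The tracking API is tiny; a minimal, illustrative snippet (the parameter and metric names here are made up):

import mlflow

with mlflow.start_run():                     # one run = one experiment trial
    mlflow.log_param("learning_rate", 0.05)  # record a hyperparameter
    mlflow.log_metric("rmse", 0.12)          # record an evaluation result

Each logged run then shows up in the MLflow UI, where runs can be compared side by side.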

Prerequisites

  • Ubuntu >= 16.04
  • Docker
  • Docker Compose
  • Memory >= 5 GB

Installation

sudo apt-get install docker.io # install Docker
sudo apt install docker-compose # install Docker Compose
service docker status # verify the Docker daemon is running

Install Docker Desktop on Mac

See the Docker docs.

Join Kaggle Competition

House Prices: Advanced Regression Techniques

Set your Kaggle username and API key in kaggle.json

Create a Kaggle API token (from your Kaggle account settings), then fill in the credentials:

cd airflow
vim kaggle.json
# {"username":"<Kaggle account username>", "key":"<API key>"}

Build

sudo docker-compose build

Usage

sudo docker-compose -f docker-compose.yml up

UI Links

  • MLflow: localhost:5000
  • Jupyter Notebook: localhost:7000
  • Airflow: localhost:8080

1. Turn on the Airflow DAG

2. Trigger the DAG

3. Open data_visualization.ipynb and start visualizing the data
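
The notebook's exploration boils down to plots along these lines (column names come from the House Prices dataset; the exact plots in data_visualization.ipynb may differ):

import pandas as pd
import matplotlib.pyplot as plt

train = pd.read_csv("train.csv")                  # House Prices training data
train.plot.scatter(x="GrLivArea", y="SalePrice")  # living area vs. sale price
plt.show()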

4. Compare ML experiments in MLflow
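
Besides the UI at localhost:5000, runs can be compared programmatically; a sketch, assuming training.py logs a metric named "rmse":

import mlflow

runs = mlflow.search_runs(order_by=["metrics.rmse ASC"])  # runs in the active experiment
print(runs[["run_id", "metrics.rmse"]].head())            # best runs first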

5. Try to optimize the ML model

# Open ./dags/src/training.py and tune the parameters below.
# These are XGBoost hyperparameters passed to the regressor.
params = {
    "colsample_bytree": 0.4603,
    "gamma": 0.0468,
    "learning_rate": 0.05,
    "max_depth": 20,
    "min_child_weight": 2,
    "n_estimators": 2200,
    "reg_alpha": 0.4640,
    "reg_lambda": 0.8571,
    "subsample": 0.5213,
    "random_state": 7,
    "nthread": -1,
}
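
To make this step concrete, here is a hedged sketch of how such a params dict might be evaluated and logged; the data path, feature selection, and metric name are assumptions, and the tutorial's real training code lives in ./dags/src/training.py:

import mlflow
import pandas as pd
import xgboost as xgb
from sklearn.model_selection import cross_val_score

train = pd.read_csv("train.csv")                               # assumed data path
X = train.drop(columns=["SalePrice"]).select_dtypes("number")  # numeric features only, for brevity
y = train["SalePrice"]

model = xgb.XGBRegressor(**params)  # `params` is the dict shown above
mse = -cross_val_score(model, X, y, cv=5,
                       scoring="neg_mean_squared_error").mean()

with mlflow.start_run():
    mlflow.log_params(params)              # record the hyperparameters
    mlflow.log_metric("rmse", mse ** 0.5)  # and the cross-validated RMSE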

6. Check your score on the Kaggle Leaderboard
