Run Deep Learning Jobs on Kubernetes in An Easy Way.

Primary language: Go. License: Apache License 2.0 (Apache-2.0).

Arena

Overview

Arena is a command-line interface that lets data scientists run and monitor machine learning training jobs and check their results easily. It currently supports solo and distributed TensorFlow training. On the backend it is built on Kubernetes, Helm, and Kubeflow, but data scientists need very little knowledge of Kubernetes to use it.

End users also need GPU resource and node management, so Arena provides a top command to check the available GPU resources in the Kubernetes cluster.

In short, Arena's goal is to make data scientists feel as if they are working on a single machine while actually using the power of a GPU cluster.

Setup

Follow the Installation guide.

User Guide

Arena is a command-line interface to run and monitor machine learning training jobs and check their results easily. It currently supports solo and distributed training.
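The run-and-monitor workflow described above can be sketched as a short shell session. This is an illustration, not the official user guide: the job name tf-demo, the image tag, the training command, and the helper functions run and arena_session are placeholders invented for this example. The subcommands and flags shown (top node, submit tf with --name, --gpus, --image, list, get, logs) are written from memory of the Arena CLI and should be confirmed with arena --help for the installed version.

```shell
#!/bin/sh
# Hypothetical job name, used only for this illustration.
JOB_NAME="tf-demo"

# Print each command; execute it only when DRY_RUN is unset or empty.
run() {
    printf '%s\n' "+ $*"
    if [ -z "${DRY_RUN:-}" ]; then
        "$@"
    fi
}

# One pass through a typical session: inspect GPUs, submit, then monitor.
arena_session() {
    # Show GPU resources per node in the cluster.
    run arena top node
    # Submit a solo TensorFlow job (image and command are placeholders).
    run arena submit tf --name="$JOB_NAME" --gpus=1 \
        --image=tensorflow/tensorflow:1.5.0-devel-gpu \
        "python train.py"
    # List jobs, then inspect the status and logs of the submitted job.
    run arena list
    run arena get "$JOB_NAME"
    run arena logs "$JOB_NAME"
}
```

Calling DRY_RUN=1 arena_session prints the commands without contacting a cluster, which is a safe way to review them first. Calling arena_session with DRY_RUN unset executes them and requires an installed arena binary and a configured Kubernetes cluster.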

Demo

Developing

Prerequisites:

  • Go >= 1.8
mkdir -p $GOPATH/src/github.com/kubeflow
cd $GOPATH/src/github.com/kubeflow
git clone https://github.com/AliyunContainerService/arena.git
cd arena
make

The arena binary is then available in arena/bin.
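The Go >= 1.8 prerequisite can be checked before running make. The following is a minimal sketch rather than part of the Arena build: version_ge and check_go are hypothetical helper names, and the comparison assumes a sort that supports the -V (version sort) option, as GNU coreutils does.

```shell
#!/bin/sh
# Return 0 when version $1 is greater than or equal to version $2.
# Version sort puts the smaller version first, so $1 >= $2 exactly
# when the first line of the sorted pair is $2.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Return 0 when the go toolchain on PATH is at least version $1
# (default 1.8, the minimum stated in the prerequisites).
check_go() {
    min="${1:-1.8}"
    if ! command -v go >/dev/null 2>&1; then
        echo "go not found in PATH" >&2
        return 1
    fi
    # "go version" prints something like: go version go1.21.3 linux/amd64
    current=$(go version | awk '{print $3}' | sed 's/^go//')
    version_ge "$current" "$min"
}
```

For example, check_go && make runs the build only when the installed Go version meets the minimum.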

RoadMap

See RoadMap