Repository for the MPI operator.

Primary language: Go. License: Apache License 2.0 (Apache-2.0).

MPI Operator

The MPI Operator makes it easy to run allreduce-style distributed training on Kubernetes.

Deploy

kubectl create -f deploy/

Test

Launch a multi-node TensorFlow benchmark training job:

kubectl create -f examples/tensorflow-benchmarks.yaml

Once all pods have started, the training logs are available from the launcher pod.
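
One way to find the launcher pod and follow its logs is sketched below. This is a minimal sketch, not part of the repository: `pick_launcher` is a hypothetical helper, and it assumes the launcher pod's name contains the string "launcher", which this README does not state. Adjust the pattern if your pod names differ.

```shell
# Hypothetical helper (not shipped with the operator): read pod names from
# stdin, one per line, and print the first one whose name contains "launcher".
# Assumption: the launcher pod's name includes the word "launcher".
# A leading "pod/" prefix, as printed by `kubectl get pods -o name`, is stripped.
pick_launcher() {
  awk '/launcher/ { sub(/^pod\//, ""); print; exit }'
}

# Usage against a live cluster (commented out so the snippet is safe to source):
#   POD=$(kubectl get pods -o name | pick_launcher)
#   kubectl logs -f "$POD"
```

If no pod name matches, the helper prints nothing, so check that `$POD` is non-empty before calling `kubectl logs`.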