Basic MNIST Example

This project implements a beginner classification task on the MNIST dataset with a Convolutional Neural Network (CNN or ConvNet) model. It is a port of pytorch/examples/mnist, adapted to run on FloydHub.

Usage

Training/Evaluating script:

usage: main.py [-h] [--dataroot DATAROOT] [--evalf EVALF] [--outf OUTF]
               [--ckpf CKPF] [--batch-size N] [--test-batch-size N]
               [--epochs N] [--lr LR] [--momentum M] [--no-cuda] [--seed S]
               [--log-interval N] [--train] [--evaluate]

PyTorch MNIST Example

optional arguments:
  -h, --help           show this help message and exit
  --dataroot DATAROOT  path to dataset
  --evalf EVALF        path to evaluate sample
  --outf OUTF          folder to output images and model checkpoints
  --ckpf CKPF          path to model checkpoint file (to continue training)
  --batch-size N       input batch size for training (default: 64)
  --test-batch-size N  input batch size for testing (default: 1000)
  --epochs N           number of epochs to train (default: 10)
  --lr LR              learning rate (default: 0.01)
  --momentum M         SGD momentum (default: 0.5)
  --no-cuda            disables CUDA training
  --seed S             random seed (default: 1)
  --log-interval N     how many batches to wait before logging training status
  --train              training a ConvNet model on MNIST dataset
  --evaluate           evaluate a [pre]trained model
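
For readers who want to see how such a command line is typically wired up, here is a minimal argparse sketch that mirrors the help text above. It is an illustration, not the project's actual main.py: the function name build_parser is hypothetical, the path options are left without defaults because the help text does not state any, and the --log-interval default of 10 is an assumption.

```python
import argparse


def build_parser():
    """Build a parser that mirrors the help text (a sketch, not the real main.py)."""
    parser = argparse.ArgumentParser(prog="main.py", description="PyTorch MNIST Example")
    # Path options: the help text shows no defaults, so none are set here.
    parser.add_argument("--dataroot", help="path to dataset")
    parser.add_argument("--evalf", help="path to evaluate sample")
    parser.add_argument("--outf", help="folder to output images and model checkpoints")
    parser.add_argument("--ckpf", help="path to model checkpoint file (to continue training)")
    # Hyperparameters with the defaults listed in the help text.
    parser.add_argument("--batch-size", type=int, default=64, metavar="N")
    parser.add_argument("--test-batch-size", type=int, default=1000, metavar="N")
    parser.add_argument("--epochs", type=int, default=10, metavar="N")
    parser.add_argument("--lr", type=float, default=0.01, metavar="LR")
    parser.add_argument("--momentum", type=float, default=0.5, metavar="M")
    parser.add_argument("--no-cuda", action="store_true", help="disables CUDA training")
    parser.add_argument("--seed", type=int, default=1, metavar="S")
    # Assumption: the help text gives no default for --log-interval; 10 is a placeholder.
    parser.add_argument("--log-interval", type=int, default=10, metavar="N")
    # Mode switches.
    parser.add_argument("--train", action="store_true")
    parser.add_argument("--evaluate", action="store_true")
    return parser


# Example: train for 3 epochs, keeping every other default.
args = build_parser().parse_args(["--train", "--epochs", "3"])
print(args.epochs, args.batch_size, args.train)
```

Note that argparse converts dashes in option names to underscores, so --batch-size is read as args.batch_size.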

To choose which GPUs the script can use, set the CUDA_VISIBLE_DEVICES environment variable when you run it:

# CUDA_VISIBLE_DEVICES=2 python main.py  # e.g. use only the GPU with id 2
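
The same variable can also be set from a Python launcher. The sketch below is only an illustration: the helper name env_with_gpus is hypothetical and not part of this project. It builds an environment in which CUDA_VISIBLE_DEVICES holds a comma-separated list of GPU ids, which is the format CUDA expects when exposing more than one device.

```python
import os


def env_with_gpus(gpu_ids, base_env=None):
    """Return a copy of the environment with CUDA_VISIBLE_DEVICES set.

    gpu_ids: iterable of integer GPU ids, e.g. [0, 2].
    base_env: mapping to copy from; defaults to the current process environment.
    """
    env = dict(os.environ if base_env is None else base_env)
    # Several GPUs are listed as comma-separated ids, e.g. "0,2".
    env["CUDA_VISIBLE_DEVICES"] = ",".join(str(i) for i in gpu_ids)
    return env


# Possible usage (not executed here):
#   import subprocess, sys
#   subprocess.run([sys.executable, "main.py", "--train"], env=env_with_gpus([0, 2]))
```

The variable must be set before the process initializes CUDA, which is why the sketch passes it to a child process rather than changing it after startup.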

MNIST CNN Architecture

Run on FloydHub

Here are the commands for training, evaluating and serving your MNIST ConvNet model on FloydHub.

Project Setup

Before you start, log in to FloydHub with the floyd login command, then clone and initialize the project (make sure you have already created the project on FloydHub):

$ git clone https://github.com/floydhub/mnist.git
$ cd mnist
$ floyd init mnist

Training

This project automatically downloads and processes the MNIST dataset for you. I have also uploaded it as a FloydHub dataset so that you can get familiar with the --data parameter, which mounts the specified volume (dataset or model) inside the container of your FloydHub instance.

Now it's time to run the training on FloydHub. In this example we train the model for 10 epochs on a GPU instance with CUDA enabled. Note: If you want to mount or create a dataset, look at the docs.

$ floyd run --gpu --env pytorch-1.0 --data redeipirati/datasets/pytorch-mnist/1:input "python main.py --train"

Note:

--gpu runs your job on a FloydHub GPU instance.
--env pytorch-1.0 selects the PyTorch 1.0 environment on Python 3.
--data redeipirati/datasets/pytorch-mnist/1 mounts the PyTorch MNIST dataset in the /input folder inside the container for our job, so that we do not need to download it at training time.

You can follow the progress with the logs command. Training should take about 2 minutes on a GPU instance and about 15 minutes on a CPU one.

Evaluating

It's time to evaluate our model with some images:

floyd run --gpu --env pytorch-1.0 --data <REPLACE_WITH_JOB_OUTPUT_NAME>:resume "python main.py --evaluate --ckpf /resume/<REPLACE_WITH_MODEL_CHECKPOINT_PATH> --evalf ./test"

Notes:

I've prepared some images in the test folder that you can use to evaluate your model. Feel free to add more handwritten images, either downloaded from the web or created by you.
Remember to evaluate images drawn from a distribution similar to the training data; otherwise performance will suffer because of distribution mismatch.

Try our pre-trained model

We provide a pre-trained model, trained for 10 epochs, with an accuracy of 98%.

floyd run --gpu --env pytorch-1.0  --data redeipirati/datasets/pytorch-mnist-10-epochs-model/2:/model "python main.py --evaluate --ckpf /model/mnist_convnet_model_epoch_10.pth --evalf ./test"

Serve model through REST API

FloydHub supports a serving mode for demo and testing purposes. If you run a job with the --mode serve flag, FloydHub will run the app.py file in your project and attach it to a dynamic service endpoint:

floyd run --gpu --mode serve --env pytorch-1.0  --data <REPLACE_WITH_JOB_OUTPUT_NAME>:input

The above command will print a service endpoint for this job in your terminal console. Alternatively, you can use the friendlier static serving URL that you will find in the Model API tab of your project (https://www.floydlabs.com/serve/<USERNAME>/projects/<PROJECT_NAME>).

The service endpoint will take a couple of minutes to become ready. Once it's up, you can interact with the model by sending a handwritten image file in a POST request, and the model will classify it:

# Template
# curl -X POST -F "file=@<HANDWRITTEN_IMAGE>" -F "ckp=<MODEL_CHECKPOINT>" <SERVICE_ENDPOINT>

# Example POST request
curl -X POST -F "file=@./test/images/1.png" https://www.floydlabs.com/serve/BhZCFAKom6Z8RptVKskHZW
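
The curl call above can also be issued from Python. The following sketch uses only the standard library and encodes the same two form fields as the template (file and ckp) as multipart/form-data. The function names build_multipart and classify are hypothetical, and since the response format of app.py is not documented here, the sketch simply returns the raw response text.

```python
import io
import os
import urllib.request
import uuid


def build_multipart(fields, files):
    """Encode form data as multipart/form-data.

    fields: dict of name -> str value.
    files: dict of name -> (filename, bytes content).
    Returns (body_bytes, content_type_header_value).
    """
    boundary = uuid.uuid4().hex
    buf = io.BytesIO()
    for name, value in fields.items():
        buf.write(f"--{boundary}\r\n".encode())
        buf.write(f'Content-Disposition: form-data; name="{name}"\r\n\r\n'.encode())
        buf.write(value.encode() + b"\r\n")
    for name, (filename, data) in files.items():
        buf.write(f"--{boundary}\r\n".encode())
        disposition = f'Content-Disposition: form-data; name="{name}"; filename="{filename}"\r\n'
        buf.write(disposition.encode())
        buf.write(b"Content-Type: application/octet-stream\r\n\r\n")
        buf.write(data + b"\r\n")
    # The closing boundary ends the body.
    buf.write(f"--{boundary}--\r\n".encode())
    return buf.getvalue(), f"multipart/form-data; boundary={boundary}"


def classify(endpoint, image_path, checkpoint=None):
    """POST an image (and optionally a checkpoint name) to the service endpoint."""
    with open(image_path, "rb") as fh:
        image = fh.read()
    fields = {"ckp": checkpoint} if checkpoint else {}
    files = {"file": (os.path.basename(image_path), image)}
    body, content_type = build_multipart(fields, files)
    request = urllib.request.Request(
        endpoint, data=body, headers={"Content-Type": content_type}, method="POST"
    )
    with urllib.request.urlopen(request) as response:
        return response.read().decode()


# Possible usage (needs a running endpoint, so it is not executed here):
#   print(classify("<SERVICE_ENDPOINT>", "./test/images/1.png"))
```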

Any job running in serving mode will stay up until it reaches its maximum runtime, so once you are done testing, remember to shut down the job!

More resources

Some useful resources on MNIST and ConvNet:

Contributing

For any questions, bugs (even typos) and/or feature requests, do not hesitate to contact me or open an issue!