Keras Container on Nautilus

This project lets you use Keras (as an importable package) from a Jupyter notebook on Nautilus. With this project, we can train Keras models on the Nautilus cloud.

Getting Started

These instructions will get you a copy of the project up and running on your namespace.

Prerequisites

- A Nautilus namespace
- An Nvidia GPU

Components

The project has the following components:

- Dockerfile (Dockerfile)
- Continuous Integration YAML (.gitlab-ci.yml)
- An example Jupyter notebook (ClassificationExample.ipynb)
- Nautilus deployment YAML (kerasDeloyment.yaml)

Dockerfile

This file builds the environment needed to run Keras in a Jupyter notebook. Unless truly needed, please avoid editing this file.

Continuous Integration YAML

This file uses GitLab's continuous integration feature. Nautilus builds images with Kaniko instead of Docker; to switch back to a Docker-based build, replace the current .gitlab-ci.yml with the dockerBased-ci.yml file.
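The repository's actual .gitlab-ci.yml may differ, but a canonical Kaniko build job (adapted from GitLab's own CI documentation) looks roughly like this; the destination tag is a placeholder:

```yaml
build:
  stage: build
  image:
    name: gcr.io/kaniko-project/executor:debug
    entrypoint: [""]
  script:
    # Kaniko builds the image from the Dockerfile without a Docker daemon,
    # then pushes it to the project's registry.
    - /kaniko/executor
      --context "${CI_PROJECT_DIR}"
      --dockerfile "${CI_PROJECT_DIR}/Dockerfile"
      --destination "${CI_REGISTRY_IMAGE}:${CI_COMMIT_SHORT_SHA}"
```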

Jupyter notebook

This is the notebook I used to train a wildfire classification model. Its structure and import commands can be reused to access Keras from
other notebooks. I will go over the specific details below.

Nautilus Deployment YAML

If you are planning to use this implementation in another Nautilus namespace, this portion of the README is especially important. Here are the important aspects of this YAML:

  1. Change the name and namespace

    Change the name and the namespace entries to a suitable name and the current working namespace.

  2. Change the resource requests

    Change the resource limits and requests, adjusting the numbers to suit the task.

  3. Mount a volume

    Mount a volume onto a path if one has already been created. To find out how to create a persistent volume claim, refer to the Nautilus documentation.
    This is very important for crash resistance: I highly recommend saving all work to the mounted directory.

  4. Choose the GPU type

    Choose the GPU type carefully: if doing intensive training, choose larger/more expensive GPUs.
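Putting those four points together, a deployment might be sketched as below. All names, the image, the PVC name, and the GPU node label are placeholders (the label key used to pin a GPU type varies; check the Nautilus documentation), not values taken from this repository:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: keras-jupyter            # 1. a suitable name
  namespace: my-namespace        # 1. your current working namespace
spec:
  replicas: 1
  selector:
    matchLabels:
      app: keras-jupyter
  template:
    metadata:
      labels:
        app: keras-jupyter
    spec:
      containers:
      - name: keras
        image: <your-registry-image>
        resources:               # 2. tune limits/requests to the task
          requests:
            cpu: "2"
            memory: 8Gi
            nvidia.com/gpu: 1
          limits:
            cpu: "4"
            memory: 16Gi
            nvidia.com/gpu: 1
        volumeMounts:            # 3. mount the PVC for crash resistance
        - name: kerasdata
          mountPath: /userdata/kerasData
      volumes:
      - name: kerasdata
        persistentVolumeClaim:
          claimName: <your-pvc-name>
      affinity:                  # 4. pin a GPU type (label key is illustrative)
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: gpu-type
                operator: In
                values: ["1080Ti"]
```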

Using the Components

Starting the deployment and accessing Jupyter Notebook

  1. Go into the kerasDeloyment.yaml file

  2. Choose the RAW file format

  3. Copy the URL of the RAW file

  4. Apply the YAML file on your Nautilus namespace (for example, kubectl apply -f <raw-file-url>)

  5. Exec into the Nautilus pod (for example, kubectl exec -it <pod-name> -- /bin/bash)

  6. Navigate to /userdata/kerasData and start Jupyter Notebook (for example, jupyter notebook --no-browser --ip=0.0.0.0 --port=8888)


    Note: The choice of port number does not matter as long as no other process is running on that port. If a port is already in use, Jupyter will automatically assign another one. Make sure to match the port number in the next step.


    (Screenshot in the original README: what happens when a wrong port is chosen.)

  7. Go to your local terminal and start port-forwarding, matching the port in the pod (for example, kubectl port-forward <pod-name> 8888:8888)

  8. Open the localhost address (for example, http://localhost:8888) in your browser

  9. Test for Keras: create a new notebook or use the ClassificationExample.ipynb file

  • Run the following tests

    Make sure that the outputs return True or a device name.
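The exact test cells are not reproduced here; a minimal check, assuming TensorFlow-backed Keras (the helper name below is mine, not from the notebook), looks like this:

```python
def gpu_check():
    """Return the GPU device names TensorFlow can see, or None if TF is absent."""
    try:
        import tensorflow as tf
    except ImportError:
        return None  # TensorFlow is not installed in this environment
    # An empty list here means the pod did not receive a GPU.
    return [d.name for d in tf.config.experimental.list_physical_devices("GPU")]

print(gpu_check())
```

In the notebook you can also run tf.test.is_gpu_available(), which should print True when the pod's GPU is visible.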
You are now ready to use Keras in a Jupyter notebook hosted on Kubernetes.

Using Keras in Notebook

EXTREMELY IMPORTANT!

To prevent Keras from allocating too much GPU memory and stalling training efforts later on, run the following:
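The snippet itself is missing from this copy of the README. A common way to achieve this with TensorFlow-backed Keras (TF 2.x API; the helper name is mine) is to enable memory growth, so TensorFlow allocates GPU memory on demand instead of grabbing it all up front:

```python
def enable_gpu_memory_growth():
    """Ask TensorFlow to allocate GPU memory on demand rather than all at once."""
    try:
        import tensorflow as tf
    except ImportError:
        return False  # TensorFlow is not installed
    for gpu in tf.config.experimental.list_physical_devices("GPU"):
        tf.config.experimental.set_memory_growth(gpu, True)
    return True

enable_gpu_memory_growth()
```

Memory growth must be set before anything else touches the GPU, which is why the next step says to shut down and retry on an error.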
If you see an error, shut down the notebook server and try again.

If nvidia-smi shows the memory allocation at 0, you have succeeded in resetting the GPU.

Please refer to the Keras documentation for instructions and information.

I used the notebook for the following:

  • Training a CNN in the notebook for reference
  • Using a LearningRateFinder to find the optimal learning rate
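A learning rate finder works by sweeping the learning rate exponentially from a tiny value to a large one over successive training batches while recording the loss. A sketch of the schedule itself (my function, not the notebook's; the actual LearningRateFinder from pyimagesearch is more elaborate):

```python
def lr_sweep(start_lr=1e-10, end_lr=1e1, num_batches=100):
    """Exponentially spaced learning rates for a learning-rate range test."""
    factor = (end_lr / start_lr) ** (1.0 / num_batches)
    return [start_lr * factor ** i for i in range(num_batches + 1)]

rates = lr_sweep()
# The optimal initial LR is typically read off a plot of loss vs. these
# rates, at the point where the loss starts dropping steeply.
```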

Using the Fire-Classification training

  1. Write the network using Keras layers


  2. Set the paths

    The following must be set:
  • FIRE_PATH = path of the directory with the fire images
  • Non_FIRE_PATH = path of the directory with images without fire
  • MODEL_PATH = path where the saved model file should go
  • LRFIND_PLOT_PATH = where the learning rate finder graph should go
  • TRAINING_PLOT_PATH = where the training plot (loss & accuracy graphs) should go
  3. Loading Data

    There shouldn't be a need to edit this unless another data-loading solution is desired. This section also splits the data into training and testing sets.

  4. Image Load Tester

    Tests the images to see whether the loading worked.

  5. Model Initialization

  • The width, height, and depth describe the data format. Classes are the number of conditions in the data; in our case: ["Fire", "Not-Fire"].
  • Change the optimization function only if you know what you are doing. We are using a standard SGD.
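As a rough illustration of what the initialization controls, here is a toy stand-in (not the actual network from the notebook): width/height/depth set the input shape, classes sets the output size, and a standard SGD optimizer is used.

```python
from tensorflow.keras import layers, models, optimizers

def build_model(width=128, height=128, depth=3, classes=2):
    """Toy CNN: width/height/depth give the input shape, classes the outputs."""
    model = models.Sequential([
        layers.Input(shape=(height, width, depth)),
        layers.Conv2D(16, (3, 3), activation="relu"),
        layers.MaxPooling2D((2, 2)),
        layers.Flatten(),
        layers.Dense(classes, activation="softmax"),
    ])
    # Standard SGD, as in the notebook; the learning rate here is a placeholder
    # until the learning rate finder suggests a better INIT_LR.
    model.compile(optimizer=optimizers.SGD(learning_rate=1e-2),
                  loss="categorical_crossentropy",
                  metrics=["accuracy"])
    return model
```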
  6. Learning Rate Finder

    Run to find the point where the network starts to learn.

    More information is available here: pyimagesearch

    Finally, fill in INIT_LR with what you learned above.
7. Train

8. Get results

You will find the accuracy measures in the results table. The trained model is saved as fire_detection.model.

Contributors

Acknowledgments

  • The Dockerfile is from the Dockerhub of the Keras team
  • The fire CNN and the learning rate finder are adapted from Adrian's excellent blog on fire detection - Pyimagesearch