Building a CNN from Scratch - Lab

Introduction

Now that you have background knowledge regarding how CNNs work and how to implement them via Keras, its time to pratice those skills a little more independently in order to build a CNN on your own to solve a image recognition problem. In this lab, you'll practice building an image classifier from start to finish using a CNN.

Objectives

You will be able to:

  • Transform images into tensors
  • Build a CNN model for image recognition

Loading the Images

The data for this lab concerns classifying lung xray images for pneumonia. The original dataset is from kaggle. We have downsampled this dataset in order to reduce trainging time for you when you design and fit your model to the data. It is anticipated that this process will take approximately 1 hour to run on a standard machine, although times will vary depending on your particular computer and set up. At the end of this lab, you are welcome to try training on the complete dataset and observe the impact on the model's overall accuracy.

You can find the initial downsampled dataset in a subdirectory, chest_xray, of this repository.

#Your code here; load the images; be sure to also preprocess these into tensors.

Designing the Model

Now it's time to design your CNN! Remember a few things when doing this:

  • You should alternate convolutional and pooling layers
  • You should have later layers have a larger number of parameters in order to detect more abstract patterns
  • Add some final dense layers to add a classifier to the convolutional base
#Your code here; design and compile the model

Training and Evaluating the Model

Remember that training deep networks is resource intensive: depending on the size of the data, even a CNN with 3-4 successive convolutional and pooling layers is apt to take a hours to train on a high end laptop. Using 30 epochs and 8 layers (alternating between convolutional and pooling), our model took about 40 minutes to run on a year old macbook pro.

If you are concerned with runtime, you may want to set your model to run the training epochs overnight.

If you are going to run this process overnight, be sure to also script code for the following questions concerning data augmentation. Check your code twice (or more) and then set the notebook to run all, or something equivalent to have them train overnight.

#Set the model to train; see warnings above
# Plot history

Save the Model

#Your code here; save the model for future reference.

Data Augmentation

Recall that data augmentation is typically always a necessary step when using a small dataset as this one which you have been provided. As such, if you haven't already, implement a data augmentation setup.

Warning: This process took nearly 4 hours to run on a relatively new macbook pro. As such, it is recommended that you simply code the setup and compare to the solution branch, or set the process to run overnight if you do choose to actually run the code.

#Add data augmentation to the model setup and set the model to train; 
#See warnings above if you intend to run this block of code

Final Evaluation

Now use the test set to perform a final evaluation on your model of choice.

# Your code here; perform a final evaluation using the test set..

Extension: Adding Data to the Model

As discussed, the current dataset we worked with is a subset of a dataset hosted on Kaggle. Increasing the data that we use to train the model will result in additional performance gains but will also result in longer training times and be more resource intensive.

It is estimated that training on the full dataset will take approximately 4 hours (and potentially significantly longer) depending on your computer's specifications.

In order to test the impact of training on the full dataset, start by downloading the data from kaggle here: https://www.kaggle.com/paultimothymooney/chest-xray-pneumonia.

#Optional extension; Your code here

Summary

Well done! In this lab, you practice building your own CNN for image recognition which drastically outperformed our previous attempts using a standard deep learning model alone. In the upcoming sections, we'll continue to investigate further techniques associated with CNNs including visualizing the representations they learn and techniques to further bolster their performance when we have limited training data such as here.