In this lab, you'll practice everything you learned during the lecture. We know there is quite a bit of math involved, but don't worry! Using Python and trying things out yourself will make many things much clearer. Before we start, let's load some necessary libraries so we can import our data.
In this lab you will:
- Import images using Keras
- Build a "shallow" neural network from scratch
As usual, we'll start by importing the necessary packages that we'll use in this lab.
!pip install pillow
from keras.preprocessing.image import ImageDataGenerator, array_to_img, img_to_array, load_img
import numpy as np
import os
In this lab, you'll import a bunch of images and correctly classify them as "Santa", meaning that Santa is present in the image, or "not Santa", meaning that something else is in the image.
If you have a look at this GitHub repository, you'll notice that the images are simply stored as `.jpeg` files under the folder `/data`. Luckily, `keras` has great modules that make importing images stored in this format easy. We'll do this for you in the cell below.

The images in the `/data` folder have various resolutions. We will reshape them so they are all 64 x 64 pixels.
# Directory path
train_data_dir = 'data/train'
test_data_dir = 'data/validation'
# Get all the data in the directory data/validation (132 images), and reshape them
test_generator = ImageDataGenerator().flow_from_directory(
test_data_dir,
target_size=(64, 64), batch_size=132)
# Get all the data in the directory data/train (790 images), and reshape them
train_generator = ImageDataGenerator().flow_from_directory(
train_data_dir,
target_size=(64, 64), batch_size=790)
# Create the datasets
train_images, train_labels = next(train_generator)
test_images, test_labels = next(test_generator)
Note that we have four numpy arrays now: `train_images`, `train_labels`, `test_images`, and `test_labels`. We'll need to make some changes to the data in order to work with them, but before we do anything else, let's have a look at some of the images we loaded into `train_images`. You can use `array_to_img()` from `keras.preprocessing.image` on any image (select any image using `train_images[index]` to look at it).
# Preview an image
# Preview another image
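For instance, you could preview the first two images like this (a sketch; any valid index works):

array_to_img(train_images[0])
array_to_img(train_images[1])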
Now, let's use `np.shape()` to look at what these numpy arrays look like.
# Preview the shape of both the images and labels for both the train and test sets (4 objects total)
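One way to do this (a sketch):

print(np.shape(train_images), np.shape(train_labels))  # expect (790, 64, 64, 3) and (790, 2)
print(np.shape(test_images), np.shape(test_labels))    # expect (132, 64, 64, 3) and (132, 2)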
Let's start with `train_images`. From the lesson, you might remember that the expected input shape is $(n \times l)$, where $n$ is the number of elements per observation and $l$ is the number of observations; the number of observations in `train_images` is 790. So, translated to this example, we need to transform our (790, 64, 64, 3) array into a (64 * 64 * 3, 790) matrix!
Hint: You should use the `.reshape()` method and then transpose the result using `.T`.
# Reshape the train images
train_img_unrow = None
Verify that the shape of the newly created `train_img_unrow` is correct.
# Preview the shape of train_img_unrow
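If you get stuck, here is one possible approach (a sketch; note that 64 * 64 * 3 = 12288):

train_img_unrow = train_images.reshape(790, -1).T
np.shape(train_img_unrow)  # should be (12288, 790)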
Next, let's transform `test_images` in a similar way. Note that the dimensions are different here! Where we needed a matrix of shape (64 * 64 * 3, 790) for `train_images`, for `test_images` we need to get to a shape of (64 * 64 * 3, 132).
# Define appropriate m
m = None
test_img_unrow = test_images.reshape(m, -1).T
# Preview the shape of test_img_unrow
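Here, `m` is simply the number of test observations (a sketch):

m = 132
test_img_unrow = test_images.reshape(m, -1).T
np.shape(test_img_unrow)  # should be (12288, 132)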
Earlier, you noticed that `train_labels` and `test_labels` have shapes of (790, 2) and (132, 2), respectively. Let's have a closer look.
# Run this cell; no need to edit
train_labels
Looking at this, it's clear that for each observation (or image), `train_labels` doesn't simply have an output of 1 or 0, but a pair: either `[0, 1]` or `[1, 0]`.

Having this information, we still don't know which pair corresponds with `santa` versus `not_santa`. Luckily, this was stored by `keras.preprocessing.image`, and you can get more info using the command `train_generator.class_indices`.
# Run this cell; no need to edit
train_generator.class_indices
Index 0 (the first column) represents `not_santa`, and index 1 represents `santa`. Select the `santa` column and transpose the result such that you get a (1, 790) row vector in which 1 represents `santa`.
# Your code here
train_labels_final = None
# Run this cell; no need to edit
np.shape(train_labels_final)
# Your code here
test_labels_final = None
# Run this cell; no need to edit
np.shape(test_labels_final)
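One possible solution (a sketch; column 1 is the `santa` column according to `class_indices`):

train_labels_final = train_labels.T[[1]]  # keep row 1 (santa) as a (1, 790) array
test_labels_final = test_labels.T[[1]]    # same idea, giving a (1, 132) array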
As a final sanity check, look at an image and the corresponding label, so we're sure that santa is indeed stored as 1.

- First, use `array_to_img()` again on the original `train_images` with index 240 to look at this particular image
- Then, use `train_labels_final` to get the 240th label
# Preview train image at index 240
# Preview train label at index 240
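For example (a sketch):

array_to_img(train_images[240])
train_labels_final[0, 240]  # 1.0 means santa, 0.0 means not_santa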
This seems to be correct! Feel free to try out other indices as well.
Remember that each RGB pixel in an image takes a value between 0 and 255. In deep learning, it is very common to standardize and/or center your dataset. For images, a common approach is to make sure each pixel value is between 0 and 1, which can be done by dividing the entire matrix by 255. Do this here for `train_img_unrow` and `test_img_unrow`.
# Your code here
train_img_final = None
test_img_final = None
type(test_img_unrow)
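This is a simple elementwise division (a sketch):

train_img_final = train_img_unrow / 255
test_img_final = test_img_unrow / 255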
Now we can go ahead and build our own basic logistic regression-based neural network to distinguish images with Santa from images without Santa. You saw in the lesson that logistic regression can actually be represented as a very simple neural network.
Remember that we defined that, for each $x^{(i)}$:

$\hat y^{(i)} = \sigma(w^T x^{(i)} + b)$, where $\sigma(z) = \dfrac{1}{1+e^{-z}}$

The cost function is then given by:

$J(w,b) = -\dfrac{1}{l}\displaystyle\sum_{i=1}^{l}y^{(i)}\log(\hat y^{(i)})+(1-y^{(i)})\log(1-\hat y^{(i)})$
In the remainder of this lab, you'll do the following:
- Initialize the parameters of the model
- Perform forward propagation, and calculate the current loss
- Perform backward propagation (which is basically calculating the current gradient)
- Update the parameters (gradient descent)
- Remember that $b$ is a scalar.
- $w$, however, is a vector of shape $(n \times 1)$, with $n$ being `horizontal_pixel x vertical_pixel x 3`.
Initialize $b$ as 0.
# Your code here
Define a function `init_w()` with a parameter `n`. The function should return an array of zeros with shape $(n \times 1)$.
# Define your function
# Call your function using appropriate parameters
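If you get stuck, here is a minimal sketch of both steps (using the $(n \times 1)$ shape from the note above):

# Initialize b as a scalar
b = 0

# init_w(n) returns an (n, 1) array of zeros
def init_w(n):
    return np.zeros((n, 1))

w = init_w(64 * 64 * 3)  # n = horizontal_pixel x vertical_pixel x 3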
In forward propagation, you:

- get `x`
- compute `y_hat`: $\hat y^{(i)} = \sigma(w^T x^{(i)} + b)$
- calculate the `cost` function: $J(w,b) = -\dfrac{1}{l}\displaystyle\sum_{i=1}^{l}y^{(i)}\log(\hat y^{(i)})+(1-y^{(i)})\log(1-\hat y^{(i)})$
Here are the two formulas you will be using to compute the gradients. Don't be scared by the mathematics; they simply correspond with what we derived in the lesson!

$$\dfrac{\partial J}{\partial w} = \dfrac{1}{l} x (\hat y - y)^T$$

$$\dfrac{\partial J}{\partial b} = \dfrac{1}{l}\displaystyle\sum_{i=1}^{l} (\hat y^{(i)} - y^{(i)})$$
# Define the propagation function
# Use the propagation function
dw, db, cost = None
print(dw)
print(db)
print(cost)
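If you get stuck, here is a minimal sketch of what `propagation()` could look like, together with an example call. The exact signature isn't spelled out above, so this sketch assumes `propagation(w, b, x, y)`, matching how it is called inside the `optimization()` loop below, and assumes you already initialized `w` and `b`:

def propagation(w, b, x, y):
    l = x.shape[1]
    # Forward: y_hat = sigmoid(w.T x + b)
    y_hat = 1 / (1 + np.exp(-(np.dot(w.T, x) + b)))
    cost = -(1 / l) * np.sum(y * np.log(y_hat) + (1 - y) * np.log(1 - y_hat))
    # Backward: the gradient formulas given above
    dw = (1 / l) * np.dot(x, (y_hat - y).T)
    db = (1 / l) * np.sum(y_hat - y)
    return dw, db, cost

dw, db, cost = propagation(w, b, train_img_final, train_labels_final)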
Next, in the optimization step, we have to update $w$ and $b$ as follows:

$$w := w - \alpha \dfrac{\partial J}{\partial w}$$

$$b := b - \alpha \dfrac{\partial J}{\partial b}$$

where $\alpha$ is the learning rate. Note that this `optimization()` function uses the `propagation()` function: it calls `propagation()` in each iteration, and updates both $w$ and $b$ afterwards.
# Complete the function below using your propagation function to define dw, db and cost
# Then use the formula above to update w and b in the optimization function
def optimization(w, b, x, y, num_iterations, learning_rate, print_cost = False):
costs = []
for i in range(num_iterations):
dw, db, cost = None
w = None
b = None
# Record the costs and print them every 50 iterations
if i % 50 == 0:
costs.append(cost)
if print_cost and i % 50 == 0:
print ("Cost after iteration %i: %f" %(i, cost))
return w, b, costs
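One possible completion of the three blanks inside the loop (a sketch, assuming the `propagation()` signature used above):

    dw, db, cost = propagation(w, b, x, y)
    w = w - learning_rate * dw
    b = b - learning_rate * db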
# Run this block of code as is
w, b, costs = optimization(w, b, train_img_final, train_labels_final,
num_iterations= 151, learning_rate = 0.0001, print_cost = True)
Next, let's create a function that makes label predictions. We'll use this later when we look at our Santa pictures. What we want is a label equal to 1 when the predicted $\hat y$ is greater than 0.5, and equal to 0 otherwise.
def prediction(w, b, x):
l = x.shape[1]
y_prediction = None
w = w.reshape(x.shape[0], 1)
y_hat = None
p = y_hat
for i in range(y_hat.shape[1]):
# Transform the probability into a binary classification using 0.5 as the cutoff
return y_prediction
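If you get stuck, here is one possible completion of the function above (a sketch, using the 0.5 cutoff from the comment):

def prediction(w, b, x):
    l = x.shape[1]
    y_prediction = np.zeros((1, l))
    w = w.reshape(x.shape[0], 1)
    y_hat = 1 / (1 + np.exp(-(np.dot(w.T, x) + b)))
    for i in range(y_hat.shape[1]):
        # Transform the probability into a binary classification using 0.5 as the cutoff
        y_prediction[0, i] = 1 if y_hat[0, i] > 0.5 else 0
    return y_prediction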
Let's try this out on a small example. Make sure you have 4 predictions in your output here!
# Run this block of code as is
w = np.array([[0.035], [0.123], [0.217]])
b = 0.2
x = np.array([[0.2, 0.4, -1.2, -2],
[1, -2., 0.1, -1],
[0.2, 0.4, -1.2, -2]])
prediction(w, b, x)
Now, let's build the overall model!
# Review this code carefully
def model(x_train, y_train, x_test, y_test, num_iterations = 2000, learning_rate = 0.5, print_cost = False):
b = 0
w = init_w(np.shape(x_train)[0])
# Gradient descent (≈ 1 line of code)
w, b, costs = optimization(w, b, x_train, y_train, num_iterations, learning_rate, print_cost)
y_pred_test = prediction(w, b, x_test)
y_pred_train = prediction(w, b, x_train)
# Print train/test errors
print('train accuracy: {} %'.format(100 - np.mean(np.abs(y_pred_train - y_train)) * 100))
print('test accuracy: {} %'.format(100 - np.mean(np.abs(y_pred_test - y_test)) * 100))
output = {'costs': costs,
'y_pred_test': y_pred_test,
'y_pred_train' : y_pred_train,
'w' : w,
'b' : b,
'learning_rate' : learning_rate,
'num_iterations': num_iterations}
return output
# Run the model!
# ⏰ Expect your code to take several minutes to run
output = model(train_img_final, train_labels_final, test_img_final, test_labels_final,
               num_iterations=2000, learning_rate=0.005, print_cost=True)
Well done! In this lab you built your first neural network to identify images of Santa! In the upcoming labs, you'll see how to extend your neural networks to include a larger number of layers, and how to then successively prune these more complex architectures to improve train and test accuracy.