MakeModel
Preprocessing the dataset
The greyscale value assigned to each pixel within an image ranges from 0 to 255. We will want to rescale (smoosh… scale…) this range to 0-1. To achieve this rescaling, we will exploit the data structure that our images are stored in: arrays. You see, each image is stored as a 2-dimensional array where each numerical value in the array is the greyscale code of a particular pixel. Conveniently, if we divide an entire array by a scalar, we generate a new array whose elements are the original elements divided by the scalar.
>>> train_images = train_images / 255.0
>>> test_images = test_images / 255.0
>>>
Two vital notes about the above.
- Use the value "255.0". This value is a floating point number (float), so algebraic operations involving it return floats. In Python 3, the / division operator always returns a float rather than rounding; but that is not true for all programming languages (or for Python 2), so it's a good habit to include the decimal point because it explicitly makes the number a float.
- Do not rescale the train_labels or test_labels arrays. These values are already in the range 0-9, as they should be!
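The element-wise division described above can be tried on a tiny array. This is a minimal sketch that assumes NumPy is installed; `toy_image` is made-up data standing in for a real 28x28 image and is not part of the dataset.

```python
import numpy as np

# Hypothetical 2x2 "image" of greyscale codes (0-255), standing in for a 28x28 one.
toy_image = np.array([[0, 51], [102, 255]], dtype=np.uint8)

# Dividing by a float scalar divides every element and yields a float array.
scaled = toy_image / 255.0

print(scaled.dtype)
print(scaled.min(), scaled.max())
```

The smallest and largest possible greyscale codes map to 0.0 and 1.0, and everything else lands in between.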
Enter a comment (TRUE or FALSE) about the following statement:
We need to rescale both the images and labels, so they are on the same scale.
FALSE
Nailed it!
Remember, the label arrays are only used to associate images with their labels.
Model Generation
Every NN is constructed from a series of connected layers that are full of connection nodes. Simple mathematical operations are undertaken at each node in each layer, yet through the volume of connections and operations, these ML models can perform impressive and complex tasks.
Our model will be constructed from 3 layers. The first layer, often referred to as the Input Layer, will intake an image and format the data structure in a way acceptable to the subsequent layers. In our case, this first layer will be a Flatten layer that intakes a multi-dimensional array and produces an array of a single dimension; this places all the pixel data on an equal depth during input. Both of the next layers will be simple fully connected layers, referred to as Dense layers, with 128 and 10 nodes respectively. These fully connected layers are the simplest layers to understand, yet allow for the greatest number of layer-to-layer connections and relationships.
The final bit of hyper-technical knowledge you'll need to learn is that each layer can have its own particular mathematical operation. These activation functions determine the form and relationship between the information provided by the layer. The first dense layer will feature a Rectified Linear Unit (ReLU) activation function, which outputs zero for negative inputs and passes positive inputs through unchanged, so its outputs start at zero and have no upper limit; mathematically, the activation function behaves like f(x)=max(0,x). The final layer uses the softmax activation function. This function produces values in the 0-1 range AND generates these values such that the sum of the outputs will be 1! This makes the softmax a layer that is excellent at outputting probabilities.
>>> model = keras.Sequential([
...     keras.layers.Flatten(input_shape=(28,28)),
...     keras.layers.Dense(128, activation=tf.nn.relu),
...     keras.layers.Dense(10, activation=tf.nn.softmax)])
WARNING: Logging before flag parsing goes to stderr.
W0824 22:50:02.551490 8392 deprecation.py:506] From C:\Users\ross.hoehn\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow\python\ops\init_ops.py:1251: calling VarianceScaling.__init__ (from tensorflow.python.ops.init_ops) with dtype is deprecated and will be removed in a future version.
Instructions for updating:
Call initializer instance with the dtype argument instead of passing it to the constructor
>>>
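To make the two activation functions a little more concrete, here is a small NumPy sketch of each. It is an illustration under assumptions, not the code TensorFlow runs internally: the function names `relu` and `softmax` and the `raw` input values are made up for this example.

```python
import numpy as np

def relu(x):
    # f(x) = max(0, x), applied element-wise.
    return np.maximum(0.0, x)

def softmax(x):
    # Subtracting the max before exponentiating is a common numerical-stability trick.
    exps = np.exp(x - np.max(x))
    return exps / exps.sum()

raw = np.array([-2.0, 0.5, 3.0])   # made-up values

relu_out = relu(raw)
softmax_out = softmax(raw)
print(relu_out)
print(softmax_out, softmax_out.sum())
```

ReLU zeroes the negative entry and leaves the positive ones unchanged, while every softmax output lies between 0 and 1 and the outputs add up to 1.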
Enter a comment (TRUE or FALSE) about the following statement:
The softmax activation function will generate values which will add up to 1
FALSE
Sorry! 😭 "FALSE" is not the right answer. The correct answer is: "TRUE" Leave a comment with the right answer to continue.😄
TRUE
That's right! ✔️
Softmax activation not only squashes each value into the range between 0 and 1 but also scales everything to add up to 1.
Training the Model
Models must be both compiled and trained prior to use. When compiling, we must define a few more parameters that control how the model is updated during training (optimizer), how the model's error is measured during training (loss function), and what is reported to track the model's performance (metrics). These values were selected for this project, yet generally depend on the model's intent and its expected input and output.
>>> model.compile( optimizer = 'adam', loss = 'sparse_categorical_crossentropy', metrics = ['accuracy'])
>>>
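For a feel of what the 'sparse_categorical_crossentropy' loss measures, the sketch below computes the same quantity by hand in NumPy: the average negative log of the probability the model gave to the correct class. It is a simplified illustration, not Keras's actual implementation, and the `probs` and `labels` values are made up.

```python
import numpy as np

# Made-up softmax outputs for two images over three classes (each row sums to 1).
probs = np.array([[0.7, 0.2, 0.1],
                  [0.1, 0.3, 0.6]])
labels = np.array([0, 1])   # integer class labels, in the same style as train_labels

# Pick out the probability assigned to the correct class of each example...
p_correct = probs[np.arange(len(labels)), labels]
# ...and average the negative log of those probabilities.
loss = -np.log(p_correct).mean()
print(p_correct, loss)
```

The loss shrinks toward 0 as the probability given to the correct class approaches 1. "Sparse" refers to the labels being plain integers rather than one-hot arrays.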
Now we can begin training our model! Having already generated and compiled the model, the code required to train it is a single line.
>>> model.fit(train_images, train_labels, epochs=5)
2019-08-24 22:56:32.884249: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2
Epoch 1/5
60000/60000 [==============================] - 2s 40us/sample - loss: 0.4985 - acc: 0.8264
Epoch 2/5
60000/60000 [==============================] - 2s 36us/sample - loss: 0.3787 - acc: 0.8632
Epoch 3/5
60000/60000 [==============================] - 2s 36us/sample - loss: 0.3368 - acc: 0.8766
Epoch 4/5
60000/60000 [==============================] - 2s 35us/sample - loss: 0.3122 - acc: 0.8863
Epoch 5/5
60000/60000 [==============================] - 2s 35us/sample - loss: 0.2962 - acc: 0.8901
<tensorflow.python.keras.callbacks.History object at 0x00000133F219C470>
>>>
This single line completes the entire job of training our model, but let's take a brief look at the arguments provided to the model.fit command.
- The first argument is input data, and recall that our input Flatten layer takes a (28,28) array, conforming to the dimensionality of our images.
- The second argument, train_labels, provides the correct classification for each of the training examples.
- The final argument is the number of epochs undertaken during training; each epoch is a training cycle over all the training data. Setting the epoch value to 5 means that the model will be trained over all 60,000 training examples 5 times. After each epoch, we get both the value of the loss function and the model's accuracy (89.01% after epoch 5).
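As a rough sketch of what those 5 epochs amount to, the arithmetic below counts the weight updates performed during training. It assumes the Keras default batch size of 32, since the call above does not pass a batch_size; the variable names are only for illustration.

```python
import math

num_examples = 60000   # training examples, as shown in the training log
epochs = 5             # value passed to model.fit
batch_size = 32        # assumption: Keras default when batch_size is not given

# Each epoch walks through the data once, one batch at a time.
steps_per_epoch = math.ceil(num_examples / batch_size)
total_updates = steps_per_epoch * epochs
print(steps_per_epoch, total_updates)   # 1875 9375
```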
Leave a comment with the answer to this question:
Which argument in the model.fit method is used to classify our data into categories (1, 2, or 3)?
train_images
Sorry! 😭 "train_images" is not the right answer. The correct answer is: "2" Leave a comment with the right answer to continue.😄
2
That's correct!
This is why we used the train_labels array as the second argument.
Evaluating Our Model
Now we are working with a functional and trained NN model. Following our logic from the top, we have built a NN that intakes a (28,28) array and flattens the data into a (784) array, then compiled and trained it with its 2 dense layers; the softmax activation function of the final output layer provides a probability that the image belongs to each of the 10 label categories.
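The sizes mentioned above can be checked with a little arithmetic. This sketch counts the trainable parameters in each Dense layer, assuming every unit has one weight per input plus one bias (the Keras default); `dense_params` is a hypothetical helper written for this example.

```python
image_shape = (28, 28)
flat_size = image_shape[0] * image_shape[1]   # 784 values after the Flatten layer

def dense_params(inputs, units):
    # One weight per input-unit pair, plus one bias per unit.
    return inputs * units + units

hidden = dense_params(flat_size, 128)   # first Dense layer (ReLU)
output = dense_params(128, 10)          # final Dense layer (softmax)
total = hidden + output
print(flat_size, hidden, output, total)   # 784 100480 1290 101770
```

The Flatten layer itself has no parameters; it only reshapes the input.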
Our model can be evaluated using the model.evaluate command, which takes in the images and labels so that it can compare its predictions to the ground truth provided by the labels. model.evaluate provides two outputs: the value of the loss function over the testing examples, and the accuracy of the model over this testing population. The important output for us is the model's accuracy.
>>> test_loss, test_acc = model.evaluate(test_images, test_labels)
10000/10000 [==============================] - 0s 27us/sample - loss: 0.3543 - acc: 0.8721
>>> print(test_acc)
0.8721
>>>
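The accuracy reported above is the fraction of test images whose most probable class matches the label. A minimal NumPy sketch of that comparison follows; `toy_predictions` and `toy_labels` are made-up data, and this is not how model.evaluate is implemented internally.

```python
import numpy as np

# Made-up softmax outputs for four images over three classes.
toy_predictions = np.array([[0.8, 0.1, 0.1],
                            [0.2, 0.7, 0.1],
                            [0.3, 0.3, 0.4],
                            [0.6, 0.3, 0.1]])
toy_labels = np.array([0, 1, 2, 1])   # made-up ground truth

predicted_classes = toy_predictions.argmax(axis=1)    # most probable class per image
accuracy = (predicted_classes == toy_labels).mean()   # fraction that match the labels
print(predicted_classes, accuracy)
```

Here three of the four toy predictions match their labels, so the accuracy is 0.75.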
This is great! Our model performs at an accuracy of 87.21%. As good as that is, it is lower than the training accuracy reported above (89.01%). This lower performance is likely due to the model overfitting on the training data. Overfitting occurs when there are too many parameters within the model when compared to the number of training instances; this allows the model to over learn on those limited examples. Overfitting leads to worse model performance over non-training data.
That said, 87.21% is a decent number! Let's finally learn how you can feed our model the series of test examples from the test_images array, and have it provide its predictions.
>>> predictions = model.predict(test_images)
>>> predictions[0]
array([5.1039719e-04, 1.4324225e-07, 6.3209918e-06, 1.4587535e-07,
7.1591121e-06, 3.9024312e-02, 3.2491367e-05, 9.4579764e-02,
1.8918892e-05, 8.6582035e-01], dtype=float32)
As we can see, most of the entries in our prediction array are very close to 0. They are written in scientific notation, with the value after the e being the number of decimal places to shift the value (for example, 5.1e-04 is actually 0.00051). The entry that stands out is predictions[0][9] at 0.8658, or 86.58% certainty that this image should be classified as a boot!
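If the scientific notation is hard to read, the values can be printed in fixed-point form. The sketch below copies the ten numbers from the output above into a NumPy array (so it runs without the trained model) and also checks that they add up to roughly 1, as softmax outputs should.

```python
import numpy as np

# The prediction for the first test image, copied from the output above.
first_prediction = np.array([
    5.1039719e-04, 1.4324225e-07, 6.3209918e-06, 1.4587535e-07,
    7.1591121e-06, 3.9024312e-02, 3.2491367e-05, 9.4579764e-02,
    1.8918892e-05, 8.6582035e-01], dtype=np.float32)

# Print each entry in fixed-point form instead of scientific notation.
for index, value in enumerate(first_prediction):
    print(index, f"{value:.6f}")

# The softmax outputs should add up to (approximately) 1.
total = float(first_prediction.sum())
print(total)
```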
If you prefer not to look through a list to determine the class label, we can simplify the output with:
>>> numpy.argmax(predictions[0])
9
>>>
Finally, we can verify this prediction by looking at the label ourselves:
>>> test_labels[0]
9
>>>
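The label 9 is just an integer, so a list of names is needed to turn it into something readable. The sketch below assumes the dataset is Fashion-MNIST and uses its standard class order. The `class_names` list is not stored in the label arrays and is an assumption here, and the prediction values are copied from the output above.

```python
import numpy as np

# Assumption: the standard Fashion-MNIST class order (index = label value).
class_names = ["T-shirt/top", "Trouser", "Pullover", "Dress", "Coat",
               "Sandal", "Shirt", "Sneaker", "Bag", "Ankle boot"]

# The prediction for the first test image, copied from the output above.
first_prediction = np.array([
    5.1039719e-04, 1.4324225e-07, 6.3209918e-06, 1.4587535e-07,
    7.1591121e-06, 3.9024312e-02, 3.2491367e-05, 9.4579764e-02,
    1.8918892e-05, 8.6582035e-01], dtype=np.float32)

best = int(np.argmax(first_prediction))
print(best, class_names[best])

# The three most probable classes, highest first.
top3 = np.argsort(first_prediction)[::-1][:3]
for i in top3:
    print(class_names[i], f"{first_prediction[i]:.4f}")
```

Under that assumption, the runners-up are the other two kinds of footwear, which suggests the model has picked up on something sensible.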
To complete this course, leave a comment with the letter (A,B,C,D) that best answers the following question:
In the prediction array generated by our model:
>>> predictions = model.predict(test_images)
>>> predictions[0]
array([5.1039719e-04, 1.4324225e-07, 6.3209918e-06, 1.4587535e-07,
7.1591121e-06, 3.9024312e-02, 3.2491367e-05, 9.4579764e-02,
1.8918892e-05, 8.6582035e-01], dtype=float32)
each number represents:
A. The probability that the image is a boot
B. Average pixel values (between 0 and 1)
C. The probability that the image matches the corresponding label in our set of labels
D. The accuracy of our model
C
You're right! Remember we had ten potential articles of clothing that we were testing.
There you have it! You have built and trained your first neural network from scratch, and properly classified a boot as a boot!
Next Steps:
- Try using the model on an item of clothing outside the dataset (make sure to preprocess it first so it is on the same scale as the other images).
- Find another image dataset to try this out on.
- Make an interface that responds with a label when you select a clothing image.
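As a starting point for the first suggestion, here is a minimal sketch of the preprocessing step for a single outside image. `prepare_image` is a hypothetical helper, and `new_image` is random stand-in data. Loading a real photo and resizing it to 28x28 greyscale is left out and would need an imaging library.

```python
import numpy as np

def prepare_image(pixels):
    """Hypothetical helper: scale one 28x28 greyscale image and add a batch axis."""
    pixels = np.asarray(pixels)
    if pixels.shape != (28, 28):
        raise ValueError("expected a 28x28 greyscale image")
    scaled = pixels / 255.0          # same 0-1 rescaling used for the training images
    return scaled[np.newaxis, ...]   # shape (1, 28, 28): a batch containing one image

# Made-up stand-in for an image loaded from elsewhere.
new_image = np.random.randint(0, 256, size=(28, 28), dtype=np.uint8)
batch = prepare_image(new_image)
print(batch.shape)   # (1, 28, 28)
```

Passing `batch` to model.predict should then return one row of ten probabilities. If the dataset is Fashion-MNIST, its items appear light on a dark background, so a typical photo may also need its greyscale inverted before this step.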