Build a Traffic Sign Recognition Project
The goals / steps of this project are the following:
- Load the data set
- Explore, summarize and visualize the data set
- Design, train and test a model architecture
- Use the model to make predictions on new images
- Analyze the softmax probabilities of the new images
- Summarize the results with a written report
Here I will consider the rubric points individually and describe how I addressed each point in my implementation.
You are reading my writeup/README. Here is a link to my project code, and here is a link to my report.html.
I used the `shape` attribute and the `len()` function to calculate summary statistics of the traffic signs data set:
- The size of training set is 34799
- The size of the validation set is 4410
- The size of test set is 12630
- The shape of a traffic sign image is (32, 32, 3)
- The number of unique classes/labels in the data set is 43
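The statistics above can be computed in a few lines of NumPy. This sketch uses placeholder arrays with the reported shapes in place of the actual loaded pickle data (the variable names `X_train`, `y_train`, etc. are assumptions):

```python
import numpy as np

# Placeholder arrays standing in for the real loaded data set
X_train = np.zeros((34799, 32, 32, 3), dtype=np.uint8)
X_valid = np.zeros((4410, 32, 32, 3), dtype=np.uint8)
X_test = np.zeros((12630, 32, 32, 3), dtype=np.uint8)
y_train = np.arange(34799) % 43  # synthetic labels covering all 43 classes

n_train = len(X_train)               # size of the training set
n_valid = len(X_valid)               # size of the validation set
n_test = len(X_test)                 # size of the test set
image_shape = X_train[0].shape       # shape of one traffic sign image
n_classes = len(np.unique(y_train))  # number of unique classes/labels
```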
Here is an exploratory visualization of the data set. It is a bar chart showing the distribution of the data.
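A bar chart like the one shown can be produced by counting samples per class with `np.bincount` and plotting the counts. This is a minimal sketch with synthetic labels standing in for the real `y_train`:

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so this runs headless
import matplotlib.pyplot as plt

# Synthetic labels standing in for the real y_train
y_train = np.arange(34799) % 43

counts = np.bincount(y_train, minlength=43)  # number of samples per class
plt.figure(figsize=(10, 4))
plt.bar(np.arange(43), counts)
plt.xlabel("class id")
plt.ylabel("number of training samples")
plt.title("Distribution of traffic sign classes")
plt.savefig("class_distribution.png")
```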
As a first step, I decided to convert the images to grayscale because this task is color-independent; grayscale reduces computation and training time.
Here is an example of a traffic sign image before and after grayscaling.
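The grayscale conversion can be done with the standard luminosity weights. This is a sketch (the helper name `rgb_to_gray` is my own, not from the project code):

```python
import numpy as np

def rgb_to_gray(images):
    """Convert a batch of RGB images (N, 32, 32, 3) to grayscale (N, 32, 32, 1)
    using the standard luminosity weights for the three channels."""
    gray = np.dot(images[..., :3], [0.299, 0.587, 0.114])
    return gray[..., np.newaxis]

# Example: a white image stays white after conversion
X = np.ones((2, 32, 32, 3), dtype=np.float32) * 255
X_gray = rgb_to_gray(X)
print(X_gray.shape)  # (2, 32, 32, 1)
```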
My final model consisted of the following layers:
Layer | Description |
---|---|
Input | 32x32x1 Gray image |
Convolution 5x5 | 1x1 stride, VALID padding, outputs 28x28x20 |
RELU | |
Max pooling | 2x2 stride, outputs 14x14x20 |
Convolution 3x3 | 1x1 stride, VALID padding, outputs 12x12x40 |
RELU | |
Dropout | 50% keep probability |
Convolution 3x3 | 1x1 stride, VALID padding, outputs 10x10x80 |
RELU | |
Max pooling | 2x2 stride, outputs 5x5x80 |
Flatten | outputs 2000 |
Fully connected | outputs 120 |
RELU | |
Fully connected | outputs 84 |
RELU | |
Fully connected | outputs 43 |
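The table above can be sketched in `tf.keras` as follows. This is my reconstruction of the layer stack, not the project's original TensorFlow code; the filter counts and output sizes follow the table (28x28x20 → 14x14x20 → 12x12x40 → 10x10x80 → 5x5x80 → flatten to 2000):

```python
import tensorflow as tf

def build_model(n_classes=43):
    """Sketch of the architecture table above using tf.keras."""
    return tf.keras.Sequential([
        tf.keras.Input(shape=(32, 32, 1)),                                   # grayscale input
        tf.keras.layers.Conv2D(20, 5, padding="valid", activation="relu"),   # 28x28x20
        tf.keras.layers.MaxPooling2D(2),                                     # 14x14x20
        tf.keras.layers.Conv2D(40, 3, padding="valid", activation="relu"),   # 12x12x40
        tf.keras.layers.Dropout(0.5),                                        # 50% keep probability
        tf.keras.layers.Conv2D(80, 3, padding="valid", activation="relu"),   # 10x10x80
        tf.keras.layers.MaxPooling2D(2),                                     # 5x5x80
        tf.keras.layers.Flatten(),                                           # 2000
        tf.keras.layers.Dense(120, activation="relu"),
        tf.keras.layers.Dense(84, activation="relu"),
        tf.keras.layers.Dense(n_classes),                                    # logits for 43 classes
    ])

model = build_model()
```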
To train the model, I used the following hyperparameters:
Hyperparameter | Description |
---|---|
epochs | 40 |
batch size | 128 |
dropout keep probability | 50% |
learning rate | 0.001 |
optimizer | Adam |
Adam is an adaptive-learning-rate optimizer.
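With the hyperparameters from the table, the training setup looks roughly like this in `tf.keras`. The model here is a tiny stand-in (the real one is described above), the data is random placeholder data, and I run a single epoch so the sketch finishes quickly; in the actual project, 40 epochs with batch size 128 were used:

```python
import numpy as np
import tensorflow as tf

# Tiny stand-in model; the real one is the architecture described above
model = tf.keras.Sequential([
    tf.keras.Input(shape=(32, 32, 1)),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(43),
])

# Hyperparameters from the table: Adam optimizer, learning rate 0.001
model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=0.001),
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    metrics=["accuracy"],
)

# Random placeholder data; the project used 40 epochs and batch size 128
X_train = np.random.rand(256, 32, 32, 1).astype("float32")
y_train = np.random.randint(0, 43, 256)
history = model.fit(X_train, y_train, epochs=1, batch_size=128, verbose=0)
```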
My final model results were:
- training set accuracy of 0.999
- validation set accuracy of 0.964
- test set accuracy of 0.948
I chose the well-known LeNet architecture and added a dropout layer to prevent overfitting.
Here are five German traffic signs that I found on the web:
The fourth image might be difficult to classify because it has a different shape and some watermarks.
Here are the results of the prediction:
Image | Prediction |
---|---|
Speed limit (30km/h), class 1 | Speed limit (30km/h), class 1 |
No entry, class 17 | No entry, class 17 |
Road work, class 25 | Road work, class 25 |
Road work, class 25 | Keep right, class 38 |
Right-of-way at the next intersection, class 11 | Right-of-way at the next intersection, class 11 |
The model was able to correctly guess 4 of the 5 traffic signs, which gives an accuracy of 80%.
Why was the fourth image misclassified as Keep right (class 38) instead of Road work (class 25)? The distribution plot above shows that the classes are imbalanced, and classes with more training images can bias the model toward them. Data augmentation could generate extra examples for under-represented classes, balancing the distribution and reducing this bias.
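A minimal sketch of such augmentation-based rebalancing is below. The helper names (`augment`, `balance_classes`) are mine, not from the project code, and the augmentation here is only a random shift; a fuller pipeline would also add small rotations and brightness jitter:

```python
import numpy as np

def augment(image, rng):
    """Toy augmentation: random shift of up to 2 pixels in each direction."""
    dx, dy = rng.integers(-2, 3, size=2)
    return np.roll(image, shift=(dy, dx), axis=(0, 1))

def balance_classes(X, y, n_target, rng):
    """Oversample each class to n_target examples using augmented copies."""
    X_out, y_out = [], []
    for c in np.unique(y):
        idx = np.where(y == c)[0]
        picks = rng.choice(idx, size=n_target, replace=True)
        X_out.extend(augment(X[i], rng) for i in picks)
        y_out.extend([c] * n_target)
    return np.array(X_out), np.array(y_out)

rng = np.random.default_rng(0)
X = np.random.rand(10, 32, 32, 1)
y = np.array([0] * 7 + [1] * 3)  # imbalanced toy labels
X_bal, y_bal = balance_classes(X, y, n_target=7, rng=rng)
print(X_bal.shape)  # (14, 32, 32, 1)
```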
The code for making predictions on my final model is located in the 14th cell of the IPython notebook.
For the correctly predicted images, the model is very confident; for the fourth image, it is much less certain.
The top five softmax probabilities were:
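Extracting the top five softmax probabilities can be done with a numerically stable softmax followed by a sort. This is a NumPy sketch (the project itself used `tf.nn.top_k`); the toy logits below simulate a confident prediction of class 1, Speed limit (30km/h):

```python
import numpy as np

def top5_softmax(logits):
    """Convert a 1-D logits vector to probabilities and return
    the top-5 (probability, class id) pairs, highest first."""
    e = np.exp(logits - logits.max())  # subtract max for numerical stability
    probs = e / e.sum()
    top = np.argsort(probs)[::-1][:5]
    return [(float(probs[i]), int(i)) for i in top]

# Toy logits for a confident prediction of class 1
logits = np.zeros(43)
logits[1] = 8.0
for p, c in top5_softmax(logits):
    print(f"class {c}: {p:.4f}")
```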