PyTorch implementation of a dog breed classifier using convolutional neural nets. Part of Udacity's Deep Learning Nanodegree program.
The dataset contains 8352 images in total, divided across 133 breeds and already split into training, validation, and test sets. On average, there are 50 training images per breed, with a minimum of 26 for the Norwegian buhund and a maximum of 77 for the Alaskan malamute.
- Download the dog dataset. Unzip the folder and place it in this project's home directory, at `/dogImages`.
- Download the human dataset. Unzip the folder and place it in the home directory, at `/lfw`.
The repository consists of the following three main files:
- from_scratch.ipynb: Here I analyze in depth how different augmentation techniques and architectural improvements affect the performance of a dog breed classifier trained from scratch (i.e. without transfer learning). This notebook also questions the usual choice of replacing a fully connected layer with global average pooling: I show that, for this dataset, global max pooling substantially outperforms global average pooling, and I argue that it is a better fit for the philosophy of convolutional networks.
- misc.py: Most of the imports, functions, and class definitions used in the notebooks above.
- dog_app.ipynb: The Udacity project. It consists of the following steps:
- Assessing the performance of a Haar cascade classifier for human face detection.
- Assessing the performance of six different pretrained networks for dog detection, irrespective of the breed.
- Creating an architecture for classifying dog breeds from scratch. Here I use the results of the from_scratch notebook.
- Using transfer learning for classifying dog breeds.
- Putting it all together.
- fix_truncated_images.ipynb: One of the training images is truncated. More info and fix in this notebook.
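The global pooling comparison at the heart of from_scratch.ipynb can be sketched in a few lines of PyTorch. The network below is a minimal illustration, not the notebook's actual architecture: the layer sizes and the `TinyBreedNet` name are assumptions, and only the swap between `nn.AdaptiveAvgPool2d` and `nn.AdaptiveMaxPool2d` reflects the choice discussed above.

```python
import torch
import torch.nn as nn

class TinyBreedNet(nn.Module):
    """Illustrative CNN: global pooling replaces a large fully connected layer."""

    def __init__(self, num_classes=133, pooling="max"):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        # Global pooling collapses each feature map to a single value,
        # so the classifier size no longer depends on the input resolution.
        self.pool = (nn.AdaptiveMaxPool2d(1) if pooling == "max"
                     else nn.AdaptiveAvgPool2d(1))
        self.classifier = nn.Linear(64, num_classes)

    def forward(self, x):
        x = self.features(x)          # (N, 64, H/4, W/4)
        x = self.pool(x).flatten(1)   # (N, 64)
        return self.classifier(x)     # (N, num_classes)

x = torch.randn(2, 3, 224, 224)
print(TinyBreedNet(pooling="max")(x).shape)  # torch.Size([2, 133])
```

With max pooling, each of the 64 channels reports whether its feature fired anywhere in the image, whereas average pooling dilutes localized activations; this is the intuition the notebook tests empirically.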
The easiest way to run everything is to create a conda environment from Requirements.txt with the command:

conda create -n <environment_name_here> --file Requirements.txt