ImageRecognitionInKeras
A few simple scripts to help you train and evaluate Transfer Learning-based custom image recognition models.
Scripts
There are three scripts in this module - one for splitting your image data, one for training your model, and the final for scoring your model and getting a picture of the confusion matrix.
train_test_split.py
- Splits an existing directory of image files (with subdirectories named for the labels)
- Uses hashing on the filenames to ensure that even if you add more files, existing files will hash into the same buckets they did previously
train_keras.py
- Trains a Keras model for image recognition based on Transfer Learning. See Keras Applications for a sense of the pre-built models that are available.
- As new models are launched or you develop/find them, it's easy to add them to this script and use them as base models for your own transfer-learned versions.
- Models are automatically named based on the hyperparameters given, and can also have a timestamp appended. However, this can be overridden if needed.
- Data augmentation flags are supported, and include the methods supported by Keras (flipping horizontally and vertically, rotation, zooming, and shearing).
- You can bundle training and scoring if desired.
- As part of the training, a Markdown file describing the model is produced. If you are also doing scoring, details on the performance will also be included.
score_keras.py
- Takes an existing trained model and runs multiple passes over the test-set of images, generating a confusion matrix and precision, recall and F-score values.
- All scripts support the
--help
flag to give you details on usage, and should guide you in required vs. optional parameters. If you have any issues, please feel free to reach out.
Examples
train_test_split.py
python -m scripts.train_test_split --image_dir data/my_photos --output_dir data/my_split --pct_test 10 --pct_validation 20 --seed 1337
- Splits images in
data/my_photos
into output directorydata/my_split
- 10% of the images in each class go into
testing
, 20% intovalidation
, and the other 70% intotraining
- Splits images in
train_keras.py
python -m scripts.train_keras --image_dir data/my_split --model_type InceptionV3 --batch_size 8 --learning_rates 0.001 0.001 0.0005 --epochs 1 1 1 --use_weights True --score True --num_batches_to_score 20 --model_dir ./models --gpu 1
- Trains a new model based on a pre-trained InceptionV3 instance.
- Since
output_model
is not specified, the name is imputed from the hyperparameters. - Uses class-weights based on inverse class distribution.
- Scores the results and writes out confusion matrix and description markdown file.
score_keras.py
python -m scripts.score_keras --model_dir ./models --model_name InceptionV3_avg-1024_relu_lr001-001-0005_wts_e1-1-1 --num_batches 20 --image_dir ./data/my_split
- Scores an existing model (
./models/InceptionV3_avg-1024_relu_lr001-001-0005_wts_e1-1-1.h5
) - Uses images from
./data/my_split/testing
for evaluation. - Stores confusion matrix and scores into
./models
, into a.png
and.csv
respectively.
- Scores an existing model (
LICENSE
See LICENSE
. Overall code is licensed under MIT license. train_test_split.py
is Apache v2.0 licensed because it is adapted from TensorFlow code (see comments in that file).