Authors: Olivier Moindrot and Guillaume Genthial
Take the time to read the tutorials.
Note: all scripts must be run in the folder tensorflow/vision.
We recommend using python3 and a virtual env. See instructions here.
virtualenv -p python3 .env
source .env/bin/activate
pip install -r requirements.txt
When you're done working on the project, deactivate the virtual environment with deactivate.
Given an image of a hand doing a sign representing 0, 1, 2, 3, 4 or 5, predict the correct label.
For the vision example, we will use the SIGNS dataset created for this class. The dataset is hosted on Google Drive; download it here.
This will download the SIGNS dataset (~1.1 GB), containing photos of hand signs representing the numbers 0 to 5. Here is the structure of the data:
SIGNS/
train_signs/
0_IMG_5864.jpg
...
test_signs/
0_IMG_5942.jpg
...
The images are named following {label}_IMG_{id}.jpg, where the label is in [0, 5].
The training set contains 1,080 images and the test set contains 120 images.
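Given this naming convention, the label can be recovered directly from the filename. A minimal sketch (the helper name `label_from_filename` is our own, not part of the provided code):

```python
import os

def label_from_filename(path):
    """Return the integer label encoded in a '{label}_IMG_{id}.jpg' filename."""
    name = os.path.basename(path)   # e.g. "0_IMG_5864.jpg"
    return int(name.split("_")[0])  # the leading digit is the label

print(label_from_filename("SIGNS/train_signs/0_IMG_5864.jpg"))  # prints 0
```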
Once the download is complete, move the dataset into data/SIGNS.
Run the script build_dataset.py, which will resize the images to size (64, 64). The new resized dataset will be located by default in data/64x64_SIGNS:
python build_dataset.py --data_dir data/SIGNS --output_dir data/64x64_SIGNS
- Build the dataset of size 64x64: make sure you complete this step before training
python build_dataset.py --data_dir data/SIGNS --output_dir data/64x64_SIGNS
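Under the hood, the resizing step amounts to something like the following sketch using Pillow (the actual build_dataset.py may differ in details such as the interpolation mode):

```python
import os
from PIL import Image

SIZE = 64  # output images are SIZE x SIZE

def resize_and_save(filename, output_dir, size=SIZE):
    """Resize one image and save it under output_dir with the same name."""
    image = Image.open(filename)
    # BILINEAR is a reasonable default when downsampling photos
    image = image.resize((size, size), Image.BILINEAR)
    image.save(os.path.join(output_dir, os.path.basename(filename)))
```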
- Your first experiment: We created a base_model directory for you under the experiments directory. It contains a file params.json which sets the parameters for the experiment. It looks like
{
"learning_rate": 1e-3,
"batch_size": 32,
"num_epochs": 10,
...
}
For every new experiment, you will need to create a new directory under experiments with a similar params.json file.
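Reading such a file only takes a few lines; here is a minimal sketch of the idea (the class name Params is illustrative, the starter code's own helper may be organized differently), exposing each JSON key as an attribute:

```python
import json

class Params:
    """Minimal sketch: expose the keys of a params.json file as attributes."""
    def __init__(self, json_path):
        with open(json_path) as f:
            # each JSON key becomes an attribute, e.g. params.learning_rate
            self.__dict__.update(json.load(f))

# usage: params = Params("experiments/base_model/params.json")
#        params.learning_rate, params.batch_size, ...
```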
- Train your experiment. Simply run
python train.py --data_dir data/64x64_SIGNS --model_dir experiments/base_model
It will instantiate a model and train it on the training set, following the parameters specified in params.json. It will also evaluate some metrics on the development set.
- Your first hyperparameters search: We created a new directory learning_rate in experiments for you. Now, run
python search_hyperparams.py --data_dir data/64x64_SIGNS --parent_dir experiments/learning_rate
It will train and evaluate a model with the different values of learning rate defined in search_hyperparams.py and create a new directory for each experiment under experiments/learning_rate/.
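Conceptually, the search just writes one params.json per value and launches train.py on each resulting directory. A hedged sketch of that loop (the function name and directory naming scheme are illustrative, not taken from search_hyperparams.py):

```python
import json
import os

def create_jobs(parent_dir, base_params, learning_rates):
    """Create one experiment directory, with its params.json, per learning rate."""
    for lr in learning_rates:
        model_dir = os.path.join(parent_dir, "learning_rate_{}".format(lr))
        os.makedirs(model_dir, exist_ok=True)
        # copy the base hyperparameters, overriding only the learning rate
        params = dict(base_params, learning_rate=lr)
        with open(os.path.join(model_dir, "params.json"), "w") as f:
            json.dump(params, f, indent=4)
        # search_hyperparams.py would then launch, for each model_dir:
        #   python train.py --data_dir data/64x64_SIGNS --model_dir <model_dir>
```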
- Display the results of the hyperparameters search in a nice format
python synthesize_results.py --parent_dir experiments/learning_rate
- Evaluation on the test set: Once you've run many experiments and selected your best model and hyperparameters based on the performance on the development set, you can finally evaluate the performance of your model on the test set. Run
python evaluate.py --data_dir data/64x64_SIGNS --model_dir experiments/base_model
We recommend reading through train.py to get a high-level overview of the steps:
- loading the hyperparameters for the experiment (the params.json)
- getting the filenames / labels
- creating the input of our model by zipping the filenames and labels together (input_fn(...)), reading the images, and performing batching and shuffling
- creating the model (= nodes / ops of the tf.Graph()) by calling model_fn(...)
- training the model for a given number of epochs by calling train_and_evaluate(...)
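To see what input_fn(...) does conceptually, here is a plain-Python analogue of the zip / shuffle / batch steps. The real code builds the equivalent ops with tf.data; this sketch is ours and only illustrates the data flow:

```python
import random

def batches(filenames, labels, batch_size, shuffle=True, seed=0):
    """Yield (filenames, labels) batches, mimicking the tf.data pipeline."""
    samples = list(zip(filenames, labels))  # zip filenames and labels together
    if shuffle:
        random.Random(seed).shuffle(samples)
    for i in range(0, len(samples), batch_size):
        batch = samples[i:i + batch_size]
        # in the real pipeline, images would be read and decoded here
        yield [f for f, _ in batch], [l for _, l in batch]
```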
Once you get the high-level idea, depending on your dataset, you might want to modify:
- model/model_fn.py to change the model
- model/input_fn.py to change the way you read data
- train.py and evaluate.py if some changes in the model or input require changes here
If you want to compute new metrics for which you can find a TensorFlow implementation, you can define them in model_fn.py (add them to the metrics dictionary). They will automatically be updated during training and displayed at the end of each epoch.
Once you get something working for your dataset, feel free to edit any part of the code to suit your own needs.
Introduction to the tf.data pipeline