- This model has not been tested for potential bias based on race, skin color, etc.; as such, it is not designed for real-world use unless comprehensively tested and reformed.
- Be aware of the ethical issues behind sex/gender classification before using this code.
I have used the VGG Face Descriptor model for transfer learning, training a new model to classify sex in the Adience dataset.
After training for 3 epochs on my laptop (~7 hours):
Set | Accuracy | Binary Crossentropy
---|---|---
Training set | 0.9670 | 0.0897
Validation set | 0.9275 | 0.2631
We can see it is slightly overfitting to the training data. Adding more data augmentation or using a bigger training set could help with this. Another option would be to start from a lower layer of the VGG-face model when building the new classifier on top of it. Currently I'm building on top of layer 30 of the 53 total layers in VGG-face (53 layers when counting padding and activation layers separately). See the explanations below for more on this.
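For example, heavier augmentation could be requested from TF's ImageDataGenerator. A minimal sketch (the parameter values here are illustrative, not the ones used in this code):

```python
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Illustrative augmentation settings; the exact values would need tuning.
train_datagen = ImageDataGenerator(
    horizontal_flip=True,    # faces are roughly left/right symmetric
    rotation_range=10,       # small random rotations (degrees)
    width_shift_range=0.1,   # random horizontal shifts (fraction of width)
    height_shift_range=0.1,  # random vertical shifts (fraction of height)
    zoom_range=0.1,          # random zoom in/out
)
```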
The code provides three options for running: classify, train, and evaluate.
Use the argument classify when running the command:
Example:
```
$ python3.7 train_adience_gender.py classify -m trained_gender_classifier.h5 -i ../data/adience/combined/valid/11_F/landmark_aligned_face.957.12059888826_929090d81b_o.jpg
****
output class is female. (sigmoid value=0.025847017765045166)
****
```
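For reference, the printed label comes from thresholding the sigmoid output. A minimal sketch of that decision rule, assuming (as the output above suggests) that values below 0.5 map to female:

```python
def label_from_sigmoid(p, threshold=0.5):
    # Low sigmoid values map to "female", per the example output above.
    return "female" if p < threshold else "male"

print(label_from_sigmoid(0.025847017765045166))  # -> female
```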
Use the argument train when running the command:
Example:
```
$ python3.7 gender_classifier.py train -w ../../vgg_face_torch/VGG_FACE.t7 -i ../../data/adience/combined -o ~/output_model.h5 -e 1
```
- input argument for "-w" specifies the file for weights of the pretrained model, downloadable from here
- input argument for "-i" specifies the directory for the images of the Adience dataset. It includes two subdirectories: aligned and valid. It can be downloaded from here.
- "-o" specifies the filename and the path to where the trained model should be stored
- "-e" is the number of epochs to train, 1-3 should be enough. By default it is set to 1.
Other optional arguments include:
- "-b1" for training set batch size
- "-b2" for validation set batch size
- "-m" whether to use low-memory or high-memory setting. The high memory setting, loads all the images in memory, whereas the low-memory loads them batch by batch through ImageDataGenerator
- more options -such as whether to perform data augmentation- are provided in the code interface but not through the command line
Use the argument evaluate when running the command:
Example:
```
$ python3.7 train_adience_gender.py evaluate -m trained_gender_classifier.h5 -i ../data/adience/combined
Found 29437 images belonging to 2 classes.
Found 3681 images belonging to 2 classes.
evaluating on validation data...
116/116 [==============================] - 985s 8s/step - loss: 0.2631 - accuracy: 0.9275
evaluating on training data...
920/920 [==============================] - 985s 8s/step - loss: 0.0897 - accuracy: 0.9670
```
I have implemented the VGG-face model in TF2. The implementation can be found in the src/vgg_face.py file and is based on the architecture described in this paper.
The weights are downloaded from the same webpage. After unzipping this file, there is a t7 file containing the trained weights for the Torch implementation, which I use as input and convert to TF2 weights.
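To give a flavor of the conversion's first step, the t7 file can be parsed with the third-party torchfile package. This is only a sketch, assuming the file deserializes to an nn.Sequential-like object exposing its modules with numpy parameter arrays:

```python
import torchfile

# Parse the serialized Torch model; the result mirrors the original
# nn.Sequential structure, with numpy arrays for each parameterized layer.
torch_model = torchfile.load('VGG_FACE.t7')

for module in torch_model.modules:
    weight = getattr(module, 'weight', None)
    if weight is not None:
        # Torch stores conv weights as (out_ch, in_ch, kH, kW).
        print(weight.shape, module.bias.shape)
```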
The main challenges in this step are:
- Although the paper calls the last few layers "fully connected", they are not dense layers. As the paper describes: "They are the same as a convolutional layer, but the size of the filters matches the size of the input data, such that each filter “senses” data from the entire image."
- The default data format in Torch is channels-first, as opposed to TF/TF2 where it is channels-last. Additionally, the images were trained in "BGR" channel order. The axis transpositions are important; see the sketch below.
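A minimal sketch of these fixes, assuming a Torch conv weight array of shape (out_ch, in_ch, kH, kW) and an input image loaded in RGB order:

```python
import numpy as np

def torch_conv_to_tf(w_torch):
    # Torch stores conv weights as (out_ch, in_ch, kH, kW);
    # TF2/Keras expects (kH, kW, in_ch, out_ch).
    return np.transpose(w_torch, (2, 3, 1, 0))

def to_bgr(img_rgb):
    # The pretrained weights expect BGR channel order, so flip the channel axis.
    return img_rgb[..., ::-1]
```

The converted kernel (together with its unchanged bias) can then be assigned to the corresponding Keras layer with set_weights(). The "fully connected" layers convert the same way, since they are implemented as convolutions whose kernels span the whole incoming feature map.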
The data are downloaded from this link (provided in the email). After unzipping, there are two subdirectories: "aligned" and "valid", with no overlap between them. I have used the "aligned" subdirectory for training and "valid" for validation.
I have created a separate class for this data. To instantiate it, one provides the path to the data. The object will include the data and a data generator that can later be used as input to the fit() function of the model.
I have also provided two options for an object of this class: "low-memory" and "high-memory". By default it uses low-memory, which is how I ran the code on my laptop. I provided the "high-memory" option since I noticed the image data are not too big and, given reasonable computing resources, could possibly be loaded into memory all at once, speeding up training.
I chose not to implement my own data generator and instead to use the ImageDataGenerator from TF, since I realized I wouldn't be adding anything beyond what it already provides.
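As an illustration of the low-memory path, the setup looks roughly like this (the directory path and batch size here are illustrative; 224x224 is VGG-face's input resolution):

```python
from tensorflow.keras.preprocessing.image import ImageDataGenerator

datagen = ImageDataGenerator()  # augmentation options could be passed here

# Streams images from disk batch by batch instead of holding them all in memory.
train_gen = datagen.flow_from_directory(
    '../data/adience/combined/aligned',  # illustrative path
    target_size=(224, 224),              # VGG-face input resolution
    batch_size=32,                       # illustrative batch size
    class_mode='binary',                 # two classes
)
# train_gen can then be passed to the model's fit() function.
```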
This model is built on top of the pretrained VGG-face model: I keep the bottom 30 layers of VGG-face and add 2 dense layers on top of them.
There are two arguments in src/adience_model.py. The first, frozen, specifies how many layers should have their weights frozen and not trainable. The other, add_on_top, specifies how many layers of the VGG-face model should be kept, with the new layers added on top of them. Both default to 30.
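A condensed sketch of how these two arguments could act, assuming vgg_face is the converted TF2 Keras model (the real implementation is in src/adience_model.py; the dense layer width here is illustrative):

```python
import tensorflow as tf

def build_adience_model(vgg_face, add_on_top=30, frozen=30):
    # Keep only the bottom `add_on_top` layers of the pretrained model.
    base = tf.keras.Model(inputs=vgg_face.input,
                          outputs=vgg_face.layers[add_on_top - 1].output)
    # Freeze the bottom `frozen` layers so their weights stay fixed.
    for layer in base.layers[:frozen]:
        layer.trainable = False
    # Two new dense layers on top; a single sigmoid unit for the binary output.
    x = tf.keras.layers.Flatten()(base.output)
    x = tf.keras.layers.Dense(256, activation='relu')(x)  # width is illustrative
    out = tf.keras.layers.Dense(1, activation='sigmoid')(x)
    return tf.keras.Model(base.input, out)
```

A model built this way would be compiled with binary crossentropy, matching the loss reported in the results table above.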