This application trains a RetinaNet to automatically crop faces of varying scale and position out of pictures and save each crop as a new image file.
If you want to know more about RetinaNet, I'd suggest checking this kernel; it's pretty intuitive.
I used a pre-trained RetinaNet with ResNet50 as the backbone and fine-tuned it to classify and detect human faces (that is, to draw a bounding box around each face). The bounding box output is then used to crop the detected object(s) out of the image.
The model is fairly robust to changes in scale and pose.
The training dataset consisted of images from Flickr and Shutterstock.
The workflow is outlined as follows:
- Use labelImg to annotate all the objects in each image (label them and specify the bounding box coordinates) in Pascal VOC format
- Run a script to convert the Pascal VOC annotations (XML) to CSV, as that's what RetinaNet expects
- Load the pretrained RetinaNet from Keras and all its dependencies and navigate to the main file directory
- Train the pretrained RetinaNet by specifying a backbone (I used ResNet50) and save the learned parameters after each epoch
- Convert the saved model to an inference graph to test on unseen data
- Run the model on the test data and save the predicted bounding boxes for each image as a Pandas DataFrame
- Use OpenCV's imwrite function to save the cropped image to a folder
- This can be used to build an Image Dataset or Database
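
The XML-to-CSV conversion step above can be sketched roughly as follows. This is a minimal sketch, not the script used in this project: it assumes labelImg's usual Pascal VOC layout (`filename`, then one `object` per box with `name` and `bndbox`) and assumes the trainer wants header-less rows of `path,x1,y1,x2,y2,class_name`. The function names `voc_to_rows` and `convert_folder` and the `image_dir` parameter are made up for illustration.

```python
import csv
import xml.etree.ElementTree as ET
from pathlib import Path


def voc_to_rows(xml_path, image_dir=""):
    """Return one [image_path, x1, y1, x2, y2, class_name] row per annotated object."""
    root = ET.parse(xml_path).getroot()
    image_path = str(Path(image_dir) / root.findtext("filename"))
    rows = []
    for obj in root.iter("object"):
        box = obj.find("bndbox")
        # Go through float() first so values written as "10.0" are tolerated.
        coords = [int(float(box.findtext(tag)))
                  for tag in ("xmin", "ymin", "xmax", "ymax")]
        rows.append([image_path, *coords, obj.findtext("name")])
    return rows


def convert_folder(xml_dir, csv_path, image_dir=""):
    """Write every *.xml annotation under xml_dir into a single header-less CSV."""
    with open(csv_path, "w", newline="") as f:
        writer = csv.writer(f)
        for xml_path in sorted(Path(xml_dir).glob("*.xml")):
            writer.writerows(voc_to_rows(xml_path, image_dir))
```

Calling `convert_folder("annotations", "train.csv", image_dir="images")` would then produce one line per labelled face; the folder and file names here are placeholders.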

Dependencies:
- Python 3.7
- TensorFlow 2.x
- Keras
- NumPy
- OpenCV
- Matplotlib
- Pillow
- Pandas
P.S.: I had to tune the training process a little so that the cropped image keeps a little background.
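
For reference, the cropping step itself can be sketched as below. Note that this is not how the background was kept in this project (that was done by tuning training); the sketch shows a different, simpler option, which is to pad each predicted box by a fraction of its size at crop time. The names `expand_box` and `save_crops`, the `margin` parameter, and the output file naming are all hypothetical.

```python
import os


def expand_box(x1, y1, x2, y2, width, height, margin=0.1):
    """Grow a box by `margin` (fraction of its size) per side, clamped to the image."""
    pad_x = int((x2 - x1) * margin)
    pad_y = int((y2 - y1) * margin)
    return (max(0, x1 - pad_x), max(0, y1 - pad_y),
            min(width, x2 + pad_x), min(height, y2 + pad_y))


def save_crops(image_path, boxes, out_dir, margin=0.1):
    """Crop each (x1, y1, x2, y2) box out of the image and write it to out_dir."""
    import cv2  # imported here so expand_box stays usable without OpenCV

    image = cv2.imread(image_path)
    if image is None:
        raise FileNotFoundError(image_path)
    height, width = image.shape[:2]
    os.makedirs(out_dir, exist_ok=True)
    stem = os.path.splitext(os.path.basename(image_path))[0]
    saved = []
    for i, box in enumerate(boxes):
        x1, y1, x2, y2 = expand_box(*map(int, box), width, height, margin)
        out_path = os.path.join(out_dir, f"{stem}_face{i}.jpg")
        cv2.imwrite(out_path, image[y1:y2, x1:x2])  # rows index y, columns index x
        saved.append(out_path)
    return saved
```

With the predictions in a DataFrame, the boxes for one image could be passed in as a list of `(x1, y1, x2, y2)` tuples taken from that image's rows; setting `margin=0` reproduces a tight crop.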