
Face recognition in Go using MTCNN and QMagFace


GoFaceRec

This repository uses tfgo to perform face recognition on an image from file. After a lot of effort, I came to the conclusion that anyone who wants to load a PyTorch or JAX deep learning model in Go should think twice before committing serious effort to it. It is far easier to first convert the model to TensorFlow and then work with tfgo.

In this repo, the input image is first processed, and then its embedding is compared against the ones already computed from our dataset. To compute and save embeddings for an arbitrary dataset, one can use the QMagFace repo. Once the embeddings are ready, this repo uses Go to do the face recognition. If the similarity between embeddings falls below a specific threshold, the face is considered unknown; otherwise, the matching label is printed.
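
To make the matching step concrete, here is a minimal Go sketch of comparing a query embedding against registered embeddings with a similarity threshold. The function names, the threshold value, and the use of plain cosine similarity are illustrative assumptions, not this repo's exact implementation (QMagFace additionally folds a face-quality term into its score):

package main

import (
    "fmt"
    "math"
)

// cosineSimilarity is plain cosine similarity; QMagFace's actual score
// additionally weighs in a quality term.
func cosineSimilarity(a, b []float32) float64 {
    var dot, na, nb float64
    for i := range a {
        dot += float64(a[i]) * float64(b[i])
        na += float64(a[i]) * float64(a[i])
        nb += float64(b[i]) * float64(b[i])
    }
    return dot / (math.Sqrt(na) * math.Sqrt(nb))
}

// recognize returns the best-matching label, or "unknown" when even the
// best match scores below the threshold. Names and threshold are illustrative.
func recognize(query []float32, registered map[string][]float32, threshold float64) string {
    best, bestLabel := -1.0, "unknown"
    for label, emb := range registered {
        if s := cosineSimilarity(query, emb); s > best {
            best, bestLabel = s, label
        }
    }
    if best < threshold {
        return "unknown"
    }
    return bestLabel
}

func main() {
    registered := map[string][]float32{
        "alice": {0.1, 0.9, 0.2},
        "bob":   {0.8, 0.1, 0.3},
    }
    fmt.Println(recognize([]float32{0.12, 0.88, 0.21}, registered, 0.7)) // prints "alice"
}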

Requirements

This project is tested with Go 1.17 on Ubuntu 20.04. Except for tfgo, the latest versions of the other packages were used.

For gocv, the installed OpenCV version is 4.7. For tfgo, I installed this version instead of the official one.

Installing

Run the following command in your project to install this package:

go get github.com/modanesh/GoFaceRec@v0.1.1

Converting models

There are many ways to convert a non-TF model to a TF one. I used ONNX as an intermediary to convert the QMagFace model from PyTorch to TF.

Use the model_converter.py script to convert the PyTorch model to ONNX first, and then the ONNX model to TF.

Some of the code in model_converter.py is taken from the official QMagFace implementation.

For this project, you may download the MTCNN and MagFace TensorFlow models from the following URLs:

Model     URL
MTCNN     Google Drive
MagFace   Google Drive

Extracting layers

In order to run the model using tfgo, you need to know the names of the input and output layers, and the saved_model_cli command can extract this information. A model exported with tf.saved_model.save() automatically comes with the "serve" tag, since the SavedModel file format is designed for serving. This tag contains the exported functions, and among them the "serving_default" signature_def is always present. This signature works exactly like a TF 1.x graph: get the input tensor and the output tensor, then use them as the placeholder to feed and the output to fetch, respectively.

To inspect the contents of a SavedModel, the best tool is saved_model_cli, which comes with the TensorFlow Python package. For example:

saved_model_cli show --all --dir output/keras
gives, among other things, this info:

signature_def['serving_default']:
The given SavedModel SignatureDef contains the following input(s):
  inputs['inputs_input'] tensor_info:
      dtype: DT_FLOAT
      shape: (-1, 28, 28, 1)
      name: serving_default_inputs_input:0
The given SavedModel SignatureDef contains the following output(s):
  outputs['logits'] tensor_info:
      dtype: DT_FLOAT
      shape: (-1, 10)
      name: StatefulPartitionedCall:0
Method name is: tensorflow/serving/predict

Knowing the input and output layers' names, serving_default_inputs_input:0 and StatefulPartitionedCall:0, is essential to run the model in tfgo.
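
As a sketch of how those names are used, the following minimal Go program loads the SavedModel above and runs one forward pass with tfgo. The import paths assume the galeone forks mentioned in the Requirements section; adjust them to match your installation:

package main

import (
    "fmt"

    tf "github.com/galeone/tensorflow/tensorflow/go"
    tg "github.com/galeone/tfgo"
)

func main() {
    // "serve" is the tag automatically attached by tf.saved_model.save().
    model := tg.LoadModel("output/keras", []string{"serve"}, nil)

    // A zero-filled batch matching the (-1, 28, 28, 1) input shape above.
    input, _ := tf.NewTensor([1][28][28][1]float32{})

    // Feed the input layer and fetch the output layer found by saved_model_cli.
    results := model.Exec([]tf.Output{
        model.Op("StatefulPartitionedCall", 0),
    }, map[tf.Output]*tf.Tensor{
        model.Op("serving_default_inputs_input", 0): input,
    })

    fmt.Println(results[0].Value()) // the (1, 10) logits
}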

Running the model

This project uses MTCNN for face detection and QMagFace for face recognition. For MTCNN, the three stages (PNet, RNet, ONet) are used in a fashion similar to FaceNet's implementation, and each stage is implemented in its own function, as sketched below.
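
The data flow through the cascade looks roughly like the following sketch. The stage functions here are stubs with hypothetical names and signatures, meant only to show how each stage feeds the next, not this repo's actual code:

package main

import "fmt"

// Box is a simplified detection box for this sketch.
type Box struct{ X1, Y1, X2, Y2, Score float32 }

// Stub stages with hypothetical signatures; the real repo runs a
// TensorFlow model inside each one.
func runPNet(imgPath string) []Box                              { return []Box{{0, 0, 10, 10, 0.9}} }
func runRNet(imgPath string, boxes []Box) []Box                 { return boxes }
func runONet(imgPath string, boxes []Box) ([]Box, [][2]float32) { return boxes, nil }

func main() {
    // Stage 1 (PNet): scan an image pyramid and propose candidate windows.
    candidates := runPNet("IMAGE.jpg")
    // Stage 2 (RNet): reject false positives and refine the survivors.
    refined := runRNet("IMAGE.jpg", candidates)
    // Stage 3 (ONet): final boxes plus facial landmarks used for alignment.
    boxes, landmarks := runONet("IMAGE.jpg", refined)
    fmt.Println(boxes, landmarks)
}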

You may download the models from the Google Drive URLs listed above.

After face detection comes face alignment. Alignment is performed by pImgs := alignFace(thirdPickedBoxes, pickedPoints, img), which imitates the steps from here.
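
For intuition, here is a simplified Go stand-in for that alignment step using gocv: it estimates a similarity transform mapping the five detected landmarks onto fixed reference positions, then warps the face to a 112x112 crop. The reference coordinates are the ones commonly used by ArcFace/MagFace-style pipelines, and the helper name and signature are illustrative, not the repo's:

package main

import (
    "image"

    "gocv.io/x/gocv"
)

// Reference landmark positions for a 112x112 aligned crop, as commonly used
// by ArcFace/MagFace-style pipelines (assumed, not taken from this repo).
var reference = []gocv.Point2f{
    {X: 38.2946, Y: 51.6963}, // left eye
    {X: 73.5318, Y: 51.5014}, // right eye
    {X: 56.0252, Y: 71.7366}, // nose tip
    {X: 41.5493, Y: 92.3655}, // left mouth corner
    {X: 70.7299, Y: 92.2041}, // right mouth corner
}

// alignFaceSketch warps img so the detected landmarks land on the reference
// positions; a simplified stand-in for this repo's alignFace.
func alignFaceSketch(img gocv.Mat, landmarks []gocv.Point2f) gocv.Mat {
    src := gocv.NewPoint2fVectorFromPoints(landmarks)
    dst := gocv.NewPoint2fVectorFromPoints(reference)
    defer src.Close()
    defer dst.Close()

    m := gocv.EstimateAffinePartial2D(src, dst) // similarity transform
    defer m.Close()

    aligned := gocv.NewMat()
    gocv.WarpAffine(img, &aligned, m, image.Pt(112, 112))
    return aligned
}

func main() {
    img := gocv.IMRead("IMAGE.jpg", gocv.IMReadColor)
    defer img.Close()
    // With the reference points as input, the warp is (near) identity.
    aligned := alignFaceSketch(img, reference)
    defer aligned.Close()
}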

Finally, once the face is detected and aligned, the recognition phase can start. It happens at this line: recognizeFace(pImgs, qmfModel, regEmbeddings, bSize, regFiles).

Use the command below to run the code:

go run main.go IMAGE.jpg path/to/REGISTERED_IMAGES path/to/EMBEDDINGS.npy path/to/MTCNN_MODELS_DIR path/to/MAGFACE_MODEL_DIR 

where:

  • IMAGE.jpg: path to the input image
  • path/to/REGISTERED_IMAGES: directory containing the registered images
  • path/to/EMBEDDINGS.npy: the embeddings extracted from the registered images using the Python QMagFace implementation
  • path/to/MTCNN_MODELS_DIR: directory containing the TensorFlow models for MTCNN
  • path/to/MAGFACE_MODEL_DIR: directory containing the TensorFlow model for MagFace
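
For example, an invocation could look like this (all paths here are illustrative):

go run main.go test.jpg ./registered ./embeddings.npy ./models/mtcnn ./models/magface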

Challenges

The main challenge thus far has been converting between gocv.Mat, tfgo.Tensor, gonum, and Go's native slices. The conversions are required because some matrix transformations are only available in gocv and others only in tfgo. Moreover, the input to a tfgo model must be of type tfgo.Tensor, so the image read by gocv inevitably needs to be converted. On top of that, some matrix operations are not available in any of these packages, so I had to implement them from scratch using Go's native slices. As a result, conversions between these types are frequent throughout the code.

For example, the function adjustInput(), besides doing some scaling, also converts a gocv.Mat to Go's [][][][]float32. In addition, at this line: inputBufTensor, _ := tf.NewTensor(inputBuf), a [][][][]float32 slice is converted to a tfgo.Tensor.
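
As a sketch of that kind of conversion, the helper below (hypothetical, not the repo's adjustInput) turns a BGR gocv.Mat into a nested float32 batch and hands it to tf.NewTensor; the (x - 127.5) * 0.0078125 scaling is the normalization MTCNN implementations typically apply, and the import paths assume the galeone forks:

package main

import (
    "fmt"

    tf "github.com/galeone/tensorflow/tensorflow/go"
    "gocv.io/x/gocv"
)

// matToTensor is a hypothetical helper showing one way to convert a BGR
// gocv.Mat into the nested float32 slices tf.NewTensor accepts.
func matToTensor(img gocv.Mat) (*tf.Tensor, error) {
    rows, cols := img.Rows(), img.Cols()
    buf := make([][][][]float32, 1) // a batch of one image
    buf[0] = make([][][]float32, rows)
    for y := 0; y < rows; y++ {
        buf[0][y] = make([][]float32, cols)
        for x := 0; x < cols; x++ {
            v := img.GetVecbAt(y, x) // B, G, R bytes
            // Scale to roughly [-1, 1], as MTCNN preprocessing typically does.
            buf[0][y][x] = []float32{
                (float32(v[0]) - 127.5) * 0.0078125,
                (float32(v[1]) - 127.5) * 0.0078125,
                (float32(v[2]) - 127.5) * 0.0078125,
            }
        }
    }
    return tf.NewTensor(buf)
}

func main() {
    img := gocv.IMRead("IMAGE.jpg", gocv.IMReadColor)
    defer img.Close()
    t, err := matToTensor(img)
    if err != nil {
        panic(err)
    }
    fmt.Println(t.Shape()) // [1 rows cols 3]
}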

In contrast, these type conversions are quick and easy in Python.

ToDo

  • Check why the recognition model takes so long for a forward pass. In Python, a forward pass takes about 0.5 milliseconds, while in Go it takes about 5500 milliseconds. In Go, the session instantiation on the first run takes a long time; subsequent runs are fast. Take a look at this issue. The fakeRun() function exists for that purpose (a sketch of the idea follows this list).
  • Upload the models
  • Create a Go package
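
The warm-up trick referenced above can be as simple as running one throwaway forward pass right after loading the model, so the expensive first-session setup does not count against the real request. A minimal sketch, assuming tfgo and using made-up layer names and input shape (not this repo's actual fakeRun):

package main

import (
    tf "github.com/galeone/tensorflow/tensorflow/go"
    tg "github.com/galeone/tfgo"
)

// fakeWarmUp runs one throwaway inference so TensorFlow's lazy session
// initialization happens before the first real request. The layer names and
// the 112x112x3 input shape are assumptions, not this repo's actual values.
func fakeWarmUp(model *tg.Model) {
    zeros, _ := tf.NewTensor([1][112][112][3]float32{})
    model.Exec([]tf.Output{
        model.Op("StatefulPartitionedCall", 0),
    }, map[tf.Output]*tf.Tensor{
        model.Op("serving_default_input", 0): zeros,
    })
}

func main() {
    model := tg.LoadModel("path/to/MAGFACE_MODEL_DIR", []string{"serve"}, nil)
    fakeWarmUp(model) // subsequent real inferences run fast
}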