habrman/FaceRecognition

What algorithm did you use in this project?

Closed this issue · 13 comments

wetit commented

Hello Sir, I would like to know which algorithm you used. If possible, I would also like to know about the working process from start to end. Thank you.

@wetit It seems he used the "Facenet" architecture, because of the triplet loss function.

@wetit you can find the information in the readme under the section "Inspiration". I'm using Multi-task Cascaded Convolutional Networks for facial and landmark detection and Inception Resnet to calculate the embeddings. To train the Resnet please look into this repository.

I start by detecting faces and calculating embeddings for all images in the ids folder. The live pipeline can then be split into three main parts:

  1. Facial detection
  2. Calculating embeddings for detected faces
  3. Matching ids by comparing the new embeddings to the precalculated embeddings
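As a rough illustration of step 3, a nearest-neighbour match on embedding distances could look like the sketch below. This is not code from the repo; `match_id`, the names, and the tiny 4-d embeddings are made up for illustration:

```python
import numpy as np

def match_id(new_embedding, id_embeddings, id_names, threshold=1.0):
    """Hypothetical helper illustrating the matching step: the closest
    precalculated id wins, but only if its Euclidean distance is below
    the threshold; otherwise the face is treated as unknown."""
    distances = np.linalg.norm(id_embeddings - new_embedding, axis=1)
    best = int(np.argmin(distances))
    if distances[best] < threshold:
        return id_names[best], float(distances[best])
    return None, float(distances[best])  # no match: unknown face

# Toy example with 3 precalculated 4-dimensional embeddings
ids = np.array([[0.0, 0.0, 0.0, 0.0],
                [1.0, 1.0, 1.0, 1.0],
                [2.0, 2.0, 2.0, 2.0]])
names = ["alice", "bob", "carol"]
name, dist = match_id(np.array([0.1, 0.0, 0.1, 0.0]), ids, names)
```

Here the new embedding lies closest to the first id, well under the threshold, so it matches "alice".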

Please let me know if you have any more questions

@habrman Kindly help in understanding this project a bit more clearly.

  1. Since we are not training the network and are instead using a pretrained Inception Resnet model, can I run the main.py module on a CPU?
  2. Secondly, I am currently getting an average fps of 6. Is there any way to improve it?
  3. Thirdly, if I want to do all this in an Android app and convert the pretrained Inception Resnet model using Tensorflow Lite, how would I package detect_and_align.py, id_data.py and main.py in an Android app?
  4. And what does the def test_run(pnet, rnet, onet, sess, images_placeholder, phase_train_placeholder, embeddings, id_dataset, test_folder): function in the main.py module do?
  1. Yes, it will probably be quite a bit slower but it works. You can even train on the CPU if you want to.

  2. I didn't really spend any time on optimizing the code. I wrote this since I did my master's thesis on facial recognition and wanted to make a fun demo. I was also about to start a new job where I would be writing Python and Tensorflow, so I wanted to refresh my knowledge. Looking at the code now that my knowledge has increased, I can find a lot to improve. You could optimize the code, maybe multithread/multiprocess it, and let the Inception Resnet work on the previous frame so that it runs in parallel with the cascades. Another way to make it much faster would be to actually start tracking the faces. That way it would be enough to run the Inception Resnet only once per track to find the id and then just track it over time. I might spend some time on this in the future, but right now I'm spending my spare time on generative adversarial networks.

  3. Unfortunately I have not worked with deep learning on Android so I can't help you here. But a quick google search shows some potential for running python in an Android app.

  4. It's a function I used to test the matching. Given a folder with images of people, it basically tries to match the images to the ids in the id folder.

@habrman Thanks for the answer. Yes, I tried running it with Tensorflow on the CPU and it works, but the fps is slow: an average of 3 fps compared to an average of 6 when running on the GPU. Yes, we can do some multithreading to optimize the code.
But the problem appears when I grow the ./ids/ folder. With 3 different people, each with only 1 example image, it works fine. But if the number of people in the ./ids/ folder increases (I tried 20 different people with only 1 example image each), the system misclassifies. How do I solve this?
Would giving more image samples per person solve this?
Another thing: if ./ids/ grows to 100 different people, I get an out-of-memory error. Does it require that much memory? My GPU VRAM is 2GB and on-board RAM is 8GB. Running it on the CPU also results in the same error.

For missed/wrong classifications you could add more images per id. You could also try increasing the distance threshold if you can't find matches. If you get wrong matches you should instead lower the threshold.
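The threshold trade-off can be sketched numerically. This is a toy illustration, not code from the repo, and the distance values are made up; a match is declared when the embedding distance falls below the threshold:

```python
import numpy as np

# Hypothetical distances between embeddings, e.g. collected from a
# labeled test folder: genuine = same-person pairs, impostor = pairs
# of different people.
genuine = np.array([0.4, 0.5, 0.6, 0.7])
impostor = np.array([1.1, 1.2, 1.3, 1.4])

def errors_at(threshold, genuine, impostor):
    # A match is declared when distance < threshold.
    false_rejects = np.mean(genuine >= threshold)   # missed matches
    false_accepts = np.mean(impostor < threshold)   # wrong matches
    return false_rejects, false_accepts

# Raising the threshold finds more matches but risks wrong ones;
# lowering it does the opposite.
fr_low, fa_low = errors_at(0.5, genuine, impostor)     # strict
fr_high, fa_high = errors_at(1.25, genuine, impostor)  # lenient
```

With these made-up numbers the strict threshold misses most genuine matches but never matches the wrong person, while the lenient one accepts every genuine pair but also half the impostors.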

Do you know if you run out of GPU memory or RAM? It might be that I keep some unnecessary things in memory when loading all of the ids.

Since there seems to be some interest in this repo, I might spend some time on rewriting the code to make it a bit faster and also nicer to read and use.

@habrman Very much appreciated. Thanks a lot, need help ASAP!

Do you want it to run at 25-30 fps? That's probably not possible since the Inception Resnet takes more time than that. Maybe you could change it to run on every 4th frame in a separate thread? Then you should reach a reasonable fps without sacrificing too much.

You say that you need help asap. The main problems you have are that you run out of memory and you want to increase the speed?

@habrman Yes, Inception Resnet is a pretty big model. I'm not really bothered about the fps; being able to recognise more faces accurately is more crucial.
Check the pretrained model folder, it's about 230MB. So when I grow the ids folder to 100 different people I get an out-of-memory error. I have 8GB of RAM; is 8GB not enough?
How many different faces are you able to recognise accurately?

I haven't tried a lot of people myself, but as I said earlier you could use more images per person and also decrease the matching threshold. This way you can reduce the number of false positives.

Regarding the memory, it might be a bug and I will look into it. Can you post your entire error message? How many images do you have in your id folder when you run out?

I just rewrote large parts of the code. I also tried it with 375 images in the id folders and it uses less than 2.5 GB of RAM and runs on my GPU with 4 GB of memory. Can you try the new patch and see if it solves the problem? I'm closing this issue since I can't reproduce it myself and I think I've answered the original question. If the patch doesn't work, please open a new issue and post your error message.

Thanks @habrman. By 375 images, do you mean 375 different people in the ids folder or 375 images in total? How many different people can it recognize properly? It works perfectly with 2 people, but it gets confused when the number of people increases.

I had about 4 different people with up to 100 images each. I haven't really tried recognizing a lot of different people, but you should take a look at the distance matrix to see what's going on. Images of the same person should have a small distance while images of different people should have larger distances. By looking at the matrix you should be able to find a suitable threshold to separate the different identities.
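Computing such a distance matrix from the embeddings is straightforward with broadcasting. A minimal sketch with made-up 2-d embeddings (real Facenet embeddings are much higher-dimensional):

```python
import numpy as np

# Same-person pairs should show small distances, different-person
# pairs larger ones; a good threshold sits between the two groups.
embeddings = np.array([
    [0.0, 0.0],  # person A, image 1
    [0.1, 0.0],  # person A, image 2
    [1.0, 1.0],  # person B, image 1
])

# Pairwise Euclidean distances via broadcasting: (N, 1, D) - (1, N, D)
diff = embeddings[:, None, :] - embeddings[None, :, :]
dist_matrix = np.linalg.norm(diff, axis=-1)
```

In this toy matrix the A-to-A distance (0.1) is clearly separated from the A-to-B distances (above 1.3), so any threshold in between would split the identities cleanly.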