mpatacchiola/deepgaze

What is the range of angles for the head pose estimation?

nyck33 opened this issue · 3 comments

Does this go as far as full profile? Sorry, I'm being a bit lazy here: I could clone the updated 2.0 version and test it out on the i-Bug menpo dataset with full-profile faces. I was just wondering because the demo videos seem to lose tracking at about 45 degrees. Also, I think the demo videos show that the Haar cascade face detector is not robust enough, so when it cannot detect the face, dlib and the head pose estimator never even get a chance to show their stuff.
Anyways, I would not mind trying this out on my project and supplying the scripts and videos if you can add me as a minor contributor.
Thanks.

> Anyways, I would not mind trying this out on my project and supplying the scripts and videos if you can add me as a minor contributor.

You are welcome to contribute. If you think you can provide an additional feature or do some useful tests, then do it! Based on the contribution we will decide whether to add you as a major or minor contributor.

Yes, the bottleneck is the face detector. The OpenCV Haar cascades can recognize frontal and profile faces; a slight rotation around the vertical axis is also tolerated.
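For example, this is roughly what the underlying OpenCV cascades do on their own (a sketch using the XML files shipped with opencv-python; the exact file names may differ in your installation):

# Sketch: frontal + profile detection with the stock OpenCV Haar cascades.
# Assumes the opencv-python package, which ships the XML files under cv2.data.haarcascades.
import cv2

frontal = cv2.CascadeClassifier(cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
profile = cv2.CascadeClassifier(cv2.data.haarcascades + "haarcascade_profileface.xml")

img = cv2.imread("face.jpg")
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

# The frontal cascade tolerates only a small rotation of the head.
faces = frontal.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5, minSize=(30, 30))
if len(faces) == 0:
    # The profile cascade was trained on left profiles only,
    # so the image is flipped to also catch right profiles.
    faces = profile.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5, minSize=(30, 30))
    if len(faces) == 0:
        flipped = cv2.flip(gray, 1)
        faces = profile.detectMultiScale(flipped, scaleFactor=1.1, minNeighbors=5, minSize=(30, 30))

for (x, y, w, h) in faces:
    cv2.rectangle(img, (x, y), (x + w, y + h), (0, 255, 0), 2)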

There are a few parameters in the returnFacePosition() method that you can adjust so that Deepgaze allocates more resources to tilted and scaled faces. Take a look at this method (also reported below).

For instance, if you change the values of rotationAngleCCW and rotationAngleCW, the detector will look for faces rotated at different angles. By lowering minSizeX and minSizeY you allow the detector to look for smaller faces.

The problem is that, since the detector is called multiple times, the method becomes slow and it is difficult to guarantee real-time performance. We were working on something faster based on deep learning, but that feature has never been finished.

Directly from the description of the method:

def returnFacePosition(self, inputImg, 
                           runFrontal=True, runFrontalRotated=True, 
                           runLeft=True, runRight=True, 
                           frontalScaleFactor=1.1, rotatedFrontalScaleFactor=1.1, 
                           leftScaleFactor=1.1, rightScaleFactor=1.1,
                           minSizeX=30, minSizeY=30, 
                           rotationAngleCCW=30, rotationAngleCW=-30, 
                           lastFaceType=0):
        """Find a face (frontal or profile) in the input image 
        Find a face and return the position. To find the right profile the input 
        image is vertically flipped, this is done because the training 
        file for profile faces was trained only on left profile. When all the
        classifiers are working the computation can be slow. To solve the problem
        it is possible to accurately tune the minSize and ScaleFactor parameters.
        @param inputImg the image where the cascade will be called
        @param runFrontal if True it looks for frontal faces
        @param runFrontalRotated if True it looks for frontal rotated faces
        @param runLeft if True it looks for left profile faces
        @param runRight if True it looks for right profile faces
        @param frontalScaleFactor=1.1
        @param rotatedFrontalScaleFactor=1.1
        @param leftScaleFactor=1.1
        @param rightScaleFactor=1.1
        @param minSizeX=30
        @param minSizeY=30
        @param rotationAngleCCW (positive) angle for rotated face detector
        @param rotationAngleCW (negative) angle for rotated face detector
        @param lastFaceType to speed up the chain of classifier
        Return code for face_type variable: 1=Frontal, 2=FrontRotLeft, 
        3=FronRotRight, 4=ProfileLeft, 5=ProfileRight.
        """

@mpatacchiola
I'll definitely try replacing the Haar face detector with MTCNN or Tencent DSFD (the latter is good but very slow since, like the approach you mentioned, it rotates, resizes, and calls detect() multiple times). That said, MTCNN did not detect a frontal face lying sideways without a rotation step, so it will probably be slow as well.
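For what it's worth, this is roughly how I plan to drop MTCNN in as the detection stage (a sketch based on the mtcnn pip package, not on Deepgaze's own API):

# Sketch: using the `mtcnn` pip package as the face detection stage,
# then cropping the face for the downstream head pose estimator.
import cv2
from mtcnn import MTCNN

detector = MTCNN()

img_bgr = cv2.imread("frame.jpg")
img_rgb = cv2.cvtColor(img_bgr, cv2.COLOR_BGR2RGB)  # MTCNN expects RGB input

results = detector.detect_faces(img_rgb)  # list of dicts: 'box', 'confidence', 'keypoints'
if results:
    x, y, w, h = results[0]["box"]
    x, y = max(0, x), max(0, y)
    face_crop = img_bgr[y:y + h, x:x + w]
    # face_crop can now be passed on to the head pose estimation stage.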
Thank you for all the info above. By the way, I am just learning PnP and found this: https://www.learnopencv.com/head-pose-estimation-using-opencv-and-dlib/ Are the fundamentals behind your implementation similar?

Yes, that code is pretty similar to the one I have implemented. I think the author took inspiration from Deepgaze to create that blog post. I think MTCNN may be a good choice. Good luck with your experiments!
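For reference, the core of the PnP approach is a single call to cv2.solvePnP: you match a handful of 2D landmarks (e.g. from dlib) to a generic 3D face model and recover the rotation. Here is a bare-bones sketch, where the 3D model coordinates and 2D points are illustrative placeholders rather than the exact values used in Deepgaze or in that post:

# Minimal PnP head pose sketch: 2D landmarks + generic 3D face model -> rotation.
# The 3D model coordinates and the 2D points below are illustrative placeholders.
import cv2
import numpy as np

# Generic 3D model points (nose tip, chin, eye corners, mouth corners).
model_points = np.array([
    (0.0,    0.0,    0.0),     # nose tip
    (0.0,   -330.0, -65.0),    # chin
    (-225.0, 170.0, -135.0),   # left eye, left corner
    (225.0,  170.0, -135.0),   # right eye, right corner
    (-150.0, -150.0, -125.0),  # left mouth corner
    (150.0,  -150.0, -125.0),  # right mouth corner
], dtype=np.float64)

# Corresponding 2D landmarks detected in the image (e.g. with dlib), in pixels.
image_points = np.array([
    (359, 391), (399, 561), (337, 297),
    (513, 301), (345, 465), (453, 469),
], dtype=np.float64)

# Approximate camera matrix: focal length ~ image width, principal point = image center.
width, height = 640, 480
camera_matrix = np.array([
    [width, 0,     width / 2],
    [0,     width, height / 2],
    [0,     0,     1],
], dtype=np.float64)
dist_coeffs = np.zeros((4, 1))  # assume no lens distortion

ok, rvec, tvec = cv2.solvePnP(model_points, image_points,
                              camera_matrix, dist_coeffs,
                              flags=cv2.SOLVEPNP_ITERATIVE)
rotation_matrix, _ = cv2.Rodrigues(rvec)  # axis-angle -> 3x3 rotation (the head pose)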