How to test
woo1 opened this issue · 1 comment
I followed these steps and everything ran correctly:
https://github.com/crisie/RecurrentGaze#test
But it doesn't work with new images.
It seems that I need some other files.
How can I create these files? ("CCS_3D_info.txt", "landmarks.txt", "calibration.txt")
It looks like there is no code to generate these files for inference.
The README.md only mentions that you used OpenFace.
Please tell me the format of these three files or how to create them.
Dear woo1,
So sorry for the late reply.
CCS_3D_info.txt
is created using OpenFace, but it could be created with other libraries or methods as well. There is one row per frame. You can follow the header to know which features are needed (see the sketch after this list):
- OF_head_pos_X,Y,Z: 3D head position wrt camera coordinate system (X,Y,Z).
- OF_head_rot_X,Y,Z: Euler angles for head rotation (X,Y,Z).
- 3Dcam_landmarks_<n_landmark>_X,Y,Z: 3D face landmarks wrt camera coordinate system, one value per coordinate and landmark (64 landmarks).
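A minimal sketch of assembling such a row from OpenFace output. It assumes the CSV produced by OpenFace 2.x's FeatureExtraction tool (column names pose_Tx..pose_Rz and X_0/Y_0/Z_0 onwards are my assumption about that version); adapt the column mapping to your OpenFace version.

```python
# Sketch: map OpenFace 2.x FeatureExtraction CSV columns to the
# CCS_3D_info.txt fields described above (one row per frame).
import pandas as pd

df = pd.read_csv("openface_output.csv")   # hypothetical OpenFace output file
df.columns = df.columns.str.strip()       # OpenFace pads column names with spaces

with open("CCS_3D_info.txt", "w") as f:
    for _, row in df.iterrows():
        vals = [row["pose_Tx"], row["pose_Ty"], row["pose_Tz"],   # 3D head position
                row["pose_Rx"], row["pose_Ry"], row["pose_Rz"]]   # head rotation (Euler)
        for i in range(64):               # 64 landmarks, X,Y,Z each
            vals += [row[f"X_{i}"], row[f"Y_{i}"], row[f"Z_{i}"]]
        f.write(" ".join(str(v) for v in vals) + "\n")
```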
landmarks.txt
is created using Bulat et al.'s face alignment code (link in the README file). There is one row per frame, and 64 landmarks (X,Y,Z) per row, so that, for one frame:
landmark1_X landmark1_Y landmark1_Z landmark2_X landmark2_Y .......
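A minimal sketch of producing such a row, assuming Bulat et al.'s face-alignment package (pip install face-alignment). Note the package returns 68 points while the format above mentions 64; which subset the network expects is not specified here, so selecting it is left to the reader.

```python
import face_alignment
from skimage import io

# In face-alignment >= 1.4 the enum is LandmarksType.THREE_D instead of _3D.
fa = face_alignment.FaceAlignment(face_alignment.LandmarksType._3D,
                                  flip_input=False)

frame = io.imread("frame_0001.png")     # hypothetical input frame
preds = fa.get_landmarks(frame)[0]      # (68, 3): X, Y and a pixel-scale "Z"

# One row per frame: landmark1_X landmark1_Y landmark1_Z landmark2_X ...
with open("landmarks.txt", "a") as f:
    f.write(" ".join(f"{v:.4f}" for v in preds.flatten()) + "\n")
```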
One may ask why landmarks are computed twice, once for the former file and once for the latter. At the time this code was implemented, Bulat et al. showed better performance than OpenFace for close-to-profile faces. However, it only provides 3D landmarks wrt pixels, that is, not real 3D landmarks in 3D space, which are needed to compute real-world head position and rotation.

As head position and rotation were already provided by EYEDIAP for the paper experiments, we just used Bulat et al.'s face alignment code to extract the "3D" landmark info, which is sufficient for the geometry information needed by the network; the network is therefore trained with that "3D" landmark format. However, to apply the gaze estimation code to images outside EYEDIAP, we do need the real 3D landmarks to compute head position and rotation wrt the camera coordinate system, while still providing the Bulat et al. landmark format to comply with how the network was trained. Therefore, we use OpenFace to compute head position and rotation, and Bulat et al. for the face geometry info.

The network could in fact be retrained with OpenFace's landmark format to get rid of the Bulat et al. dependency, but due to time constraints it is not on my current roadmap.
And finally, calibration.txt
refers to the camera calibration:
- resolution: image resolution.
- intrinsic: camera matrix (intrinsic parameters).
- distortion: distortion parameters.
If you have a calibrated camera, you can replace the file values with your own. If you don't know your camera parameters, you can use a dummy matrix. The results won't be as accurate as with a calibrated camera, but they will be a reasonable approximation. You can use the following dummy matrix, assuming a 720x576 image as in the example file, with the principal point in the middle of the image:
1.2*720.0, 0.0,       360.0
0.0,       1.2*720.0, 288.0
0.0,       0.0,       1.0
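A minimal NumPy sketch of building that dummy matrix for an arbitrary resolution. The 1.2*width focal-length heuristic is the one suggested above; the exact layout of calibration.txt is not reproduced here.

```python
import numpy as np

width, height = 720, 576                  # image resolution from the example
K = np.array([[1.2 * width, 0.0,         width / 2.0],
              [0.0,         1.2 * width, height / 2.0],
              [0.0,         0.0,         1.0]])
dist = np.zeros(5)                        # assume no distortion for an uncalibrated camera

print(K)
```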