3D-Morphable-Model (Write Up)

Project implementing features from the following paper: http://www-inst.eecs.berkeley.edu/~cs194-26/fa14/Papers/BlanzVetter99.pdf

The paper describes a technique for creating morphable 3D face models from a limited dataset of 3D facial scans of real individuals. The dataset I used came from the Basel Face Model (BFM), described at http://faces.cs.unibas.ch/bfm/main.php?nav=1-1-0&id=details.

The geometry of the BFM consists of 53,490 3D vertices connected by 160,470 triangles. Faces of different identities can be composed as linear combinations of 199 principal components.

The data included in this model is given by the following (a rough data-layout sketch follows the list):
  • The average shape
  • The principal shape components
  • The shape variance
  • The mesh topology
  • The average texture
  • The principal texture components
  • The texture variance
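
To make this layout concrete, here is a minimal sketch of how the data could be held in memory. The struct name, field names, and the use of Eigen are illustrative assumptions on my part; the BFM download has its own file format and my code may organize things differently.

```cpp
#include <Eigen/Dense>
#include <array>
#include <vector>

// Hypothetical container for the Basel Face Model data (names are illustrative).
struct MorphableModel {
    // Shape: 3 values (x, y, z) per vertex, 53,490 vertices per face vector.
    Eigen::VectorXd meanShape;        // average shape, length 3 * numVertices
    Eigen::MatrixXd shapePCs;         // principal shape components, (3 * numVertices) x 199
    Eigen::VectorXd shapeVariances;   // per-component shape variance, length 199

    // Texture: 3 values (r, g, b) per vertex, same vertex ordering as the shape.
    Eigen::VectorXd meanTexture;      // average texture
    Eigen::MatrixXd texturePCs;       // principal texture components
    Eigen::VectorXd textureVariances; // per-component texture variance

    // Mesh topology: 160,470 triangles, each indexing three vertices.
    std::vector<std::array<int, 3>> triangles;
};
```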

Essentially, every face is represented by a shape vector and a texture vector. The shape vector is the concatenation of the (x, y, z) coordinates of all vertices of the 3D scan, and the texture vector holds the corresponding (r, g, b) values for each vertex in the same order. Running Principal Components Analysis on this data (which essentially amounts to an SVD decomposition of the sample data matrix, a technique we learned in class), we can extract m - 1 = 199 principal components of the faces, where m = 200 is the number of real scanned faces used for training: a sample of 100 males and 100 females.
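
As an aside, this PCA-via-SVD step can be sketched in a few lines with Eigen. This is an illustrative sketch of the idea only (the BFM already ships with the components precomputed), not my training code:

```cpp
#include <Eigen/Dense>

// Sketch: extract principal components from m face vectors stacked as the
// columns of a data matrix. After mean-centering, at most m - 1 = 199
// components carry any variance.
void computePCA(const Eigen::MatrixXd& faces,   // (3 * numVertices) x m
                Eigen::VectorXd& mean,
                Eigen::MatrixXd& components,    // columns are principal components
                Eigen::VectorXd& variances)
{
    const int m = static_cast<int>(faces.cols());
    mean = faces.rowwise().mean();
    Eigen::MatrixXd centered = faces.colwise() - mean;

    // Thin SVD of the centered data: centered = U * S * V^T.
    Eigen::JacobiSVD<Eigen::MatrixXd> svd(centered, Eigen::ComputeThinU);
    components = svd.matrixU();
    Eigen::VectorXd s = svd.singularValues();
    variances = s.array().square() / double(m - 1);  // eigenvalues of the covariance
}
```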

Arbitrary faces can be generated as linear combinations of the principal components, as you can see in my code, with each principal component weighted according to its eigenvalue. The eigenvalues represent the variances if we think of PCA as fitting a Gaussian distribution to the dataset of faces, i.e. finding the directions in the vector space of faces that maximize the variance. Each principal component has the same shape as a face or texture vector, but is not an actual face; the components can be thought of as capturing the most "information" from the set of faces.
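
In the notation of the paper (section 2), a new face is the average shape and texture plus a weighted sum of the principal components, with the coefficient priors given by the per-component variances that PCA provides; roughly:

```latex
S_{model} = \bar{S} + \sum_{i=1}^{m-1} \alpha_i\, s_i, \qquad
T_{model} = \bar{T} + \sum_{i=1}^{m-1} \beta_i\, t_i, \qquad
p(\vec{\alpha}) \propto \exp\!\Big[-\tfrac{1}{2}\sum_i \alpha_i^2 / \sigma_i^2\Big]
```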

By adding weighted combinations of the principal components to the mean, we can generate random (but realistic looking) faces that were never actually scanned. This is really cool, as these faces could be used in many applications, such as creating new training or testing data for facial recognition, etc.

In the images in the repo, I first show the mean face from different angles (so no principal components added). Then I generate a random vector of weights sampled from a normal distribution to create a more "realistic" face; a variance of 0.5 gave good results for generating realistic looking faces, as you can see in the images after the mean images. This random face generation was the first step to understanding the data.
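
A minimal sketch of that sampling step, assuming the hypothetical MorphableModel layout sketched above (function and variable names are mine, not necessarily what my code uses):

```cpp
#include <Eigen/Dense>
#include <cmath>
#include <random>

// Sample a random face: the mean shape plus the principal components weighted
// by normally distributed coefficients scaled by each component's standard
// deviation (the same recipe applies to the texture).
Eigen::VectorXd randomShape(const Eigen::VectorXd& meanShape,
                            const Eigen::MatrixXd& shapePCs,      // columns are components
                            const Eigen::VectorXd& shapeVariances,
                            double sampleVariance = 0.5)          // 0.5 looked most realistic
{
    static std::mt19937 rng{std::random_device{}()};
    std::normal_distribution<double> gauss(0.0, std::sqrt(sampleVariance));

    Eigen::VectorXd coeffs(shapePCs.cols());
    for (int i = 0; i < coeffs.size(); ++i)
        coeffs[i] = gauss(rng) * std::sqrt(shapeVariances[i]);    // scale by sigma_i

    return meanShape + shapePCs * coeffs;
}
```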

The next feature from the paper I (tried to) implement was matching the morphable model to a face image, as described in section 4 of the paper. This involves minimizing a loss function with respect to the weights on the principal components that determine the shape of the face, the weights on the texture components, and another set of parameters describing the camera, rotation, and positioning needed to fit the 3D model to the image. The paper says this is a simple differentiation, but unfortunately my gradient descent algorithm failed miserably to converge to a good fit. (These are the Putin images you see in the repo.) First, I created a system where the user can fit the mean face approximately to the sample face, using the mouse to rotate the face and set the initial parameters; gradient descent is then run using the analytic derivatives of the loss function with respect to the shape weights, texture weights, and the various camera, angle, and positioning parameters to get a good fit to the face.
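
For reference, the core term of the loss in section 4 of the paper is the per-pixel difference between the input image and the image rendered from the current model parameters (the paper also adds prior terms on the shape, texture, and rendering parameters, which I omit here):

```latex
E_I = \sum_{x,y} \left\| I_{input}(x, y) - I_{model}(x, y) \right\|^2
```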

As you can see, once I aligned the mean face to the sample face, it was just a matter of running the gradient descent algorithm, but this gave poor results (I think the derivatives I calculated must have been wrong). I plan on getting this to work in the future, once I can figure out what the weight updates should be :o...
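
One generic way to sanity-check the updates would be to replace the analytic derivatives with finite differences. This is a sketch of such a step over a stacked parameter vector (shape weights, texture weights, camera/pose parameters), not the paper's analytic scheme; the loss callback is a hypothetical stand-in for rendering the model and computing the error:

```cpp
#include <Eigen/Dense>
#include <functional>

// Generic finite-difference gradient descent step over a stacked parameter
// vector. 'loss' is a hypothetical callback that renders the model with the
// given parameters and returns the image-matching error.
Eigen::VectorXd gradientStep(const Eigen::VectorXd& params,
                             const std::function<double(const Eigen::VectorXd&)>& loss,
                             double stepSize = 1e-3,
                             double h = 1e-5)
{
    const double base = loss(params);
    Eigen::VectorXd grad(params.size());
    for (int i = 0; i < params.size(); ++i) {
        Eigen::VectorXd perturbed = params;
        perturbed[i] += h;
        grad[i] = (loss(perturbed) - base) / h;   // forward difference
    }
    return params - stepSize * grad;              // plain gradient descent update
}
```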

Ideally, once an image can be matched, we could take two images, say of a smiling face and a frowning face, fit the model to each, and take the difference between the two fitted faces. That difference defines a component which, when blended from one face toward the other, morphs a smile into a frown and back.
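
A minimal sketch of that blending, assuming two fitted face vectors with the same vertex ordering (names are illustrative):

```cpp
#include <Eigen/Dense>

// Blend between two fitted faces: t = 0 gives faceA (e.g. frowning),
// t = 1 gives faceB (e.g. smiling); intermediate values of t morph between them.
Eigen::VectorXd morph(const Eigen::VectorXd& faceA,
                      const Eigen::VectorXd& faceB,
                      double t)
{
    Eigen::VectorXd delta = faceB - faceA;   // expression difference vector
    return faceA + t * delta;                // works for shape and texture alike
}
```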