
Emotion-Detection using OpenCV, Keras, and TensorFlow

Uses your front-facing camera to detect emotion based on facial features. By reading your live camera or webcam feed, the model can detect some common emotions and send back a classification for live interaction.

Some examples

Here are some examples with different backgrounds and lighting.

Smiling

The model can still classify with some disturbance (e.g., your annoying cat trying to jump into your photos)

No smile, with noise

Glasses on

Not smiling

Glasses off and hoodie on

Smile with noise

Features

• If you are smiling and you say 'Cheese!' the camera will snap and save a picture. This is a hands-free approach to getting high-quality and happy selfies.

• Works with friends! (multiple people supported)

• Live feed classification. This model accesses the camera using OpenCV so it can classify emotions very quickly and return the results to the screen.

• Still functional with different lighting and noise in the picture (e.g., with a cat in frame or glasses on)

Technical

Implementation Details

I used a sequential neural network with 4 hidden layers to classify the emotions. The model is built with Keras, TensorFlow's high-level API. The first 3 hidden layers each have 300 neurons and use the ReLU activation function. The 4th hidden layer has 50 neurons and also uses ReLU. The final output layer uses the sigmoid activation function to determine whether the user is smiling or not.
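As a rough sketch, that architecture maps onto a Keras Sequential model along these lines. The layer sizes and activations come from the description above; the input size (a flattened 32x32 grayscale mouth crop), optimizer, and loss are assumptions, since the training code isn't included here.

```python
from tensorflow import keras
from tensorflow.keras import layers

# Sketch of the architecture described above.
# Input shape is an assumption: a flattened 32x32 grayscale mouth crop (1024 values).
model = keras.Sequential([
    layers.Input(shape=(1024,)),
    layers.Dense(300, activation="relu"),   # hidden layer 1
    layers.Dense(300, activation="relu"),   # hidden layer 2
    layers.Dense(300, activation="relu"),   # hidden layer 3
    layers.Dense(50, activation="relu"),    # hidden layer 4
    layers.Dense(1, activation="sigmoid"),  # smiling vs. not smiling
])

# Optimizer and loss are assumptions; binary cross-entropy is the usual choice
# for a single sigmoid output.
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.summary()
```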

The pipeline preprocesses the video in several stages. The first step is to continuously capture frames from the video to process. One by one (very quickly, of course), the frames pass through several classifiers. The first classifier determines whether there are any faces in the frame using a Haar cascade. If faces are found, the frames are then sent to other facial-feature recognizers, including mouth recognition. Once a mouth is recognized, the image is cropped around it; this is important so that the model can focus on a specific region of the image. The cropped mouth images are then converted from RGB to grayscale, which reduces the dimensionality of the input. At this point, the images are fed into the model.
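A minimal sketch of that preprocessing loop is below, using the Haar cascade files that ship with OpenCV. Note the assumptions: the bundled smile cascade stands in for whatever mouth recognizer the private supporting files actually use, the 32x32 crop size is illustrative, and this sketch converts the frame to grayscale before detection (Haar cascades operate on single-channel images), whereas the write-up above grayscales after cropping; either way the model receives a grayscale mouth crop.

```python
import cv2

# Stock Haar cascades bundled with OpenCV; the real project may use different ones.
face_cascade = cv2.CascadeClassifier(cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
smile_cascade = cv2.CascadeClassifier(cv2.data.haarcascades + "haarcascade_smile.xml")

cap = cv2.VideoCapture(0)  # front-facing camera / webcam

while True:
    ok, frame = cap.read()
    if not ok:
        break

    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    faces = face_cascade.detectMultiScale(gray, scaleFactor=1.3, minNeighbors=5)

    for (x, y, w, h) in faces:
        face_roi = gray[y:y + h, x:x + w]
        # Look for mouths/smiles only inside the detected face region.
        mouths = smile_cascade.detectMultiScale(face_roi, scaleFactor=1.7, minNeighbors=20)
        for (mx, my, mw, mh) in mouths:
            mouth_crop = face_roi[my:my + mh, mx:mx + mw]
            # Resize to the model's input size (32x32 assumed), flatten, scale to [0, 1].
            mouth_input = cv2.resize(mouth_crop, (32, 32)).reshape(1, -1) / 255.0
            # prediction = model.predict(mouth_input)

    if cv2.waitKey(1) & 0xFF == ord("q"):
        break

cap.release()
cv2.destroyAllWindows()
```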

After the model is trained, it is saved in the HDF5 (.h5) format so that other files may import it and use it without any training required.
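Saving and reloading an HDF5 model is a one-liner each way in Keras; the filename here is just illustrative.

```python
# Save the trained model so other scripts can use it without retraining.
model.save("emotion_model.h5")

# Later, from another file:
from tensorflow import keras
model = keras.models.load_model("emotion_model.h5")
```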

Using TensorBoard to visualize some of the runs' epochs
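For reference, TensorBoard logging in Keras goes through a callback roughly like the one below. The log directory, epoch count, and placeholder training data are assumptions, since the real training script and dataset are not part of this repo.

```python
import datetime
import numpy as np
from tensorflow import keras

# Placeholder data standing in for the real (private) mouth-crop dataset.
x_train = np.random.rand(200, 1024).astype("float32")
y_train = np.random.randint(0, 2, size=(200, 1))

# Log each run to its own timestamped directory so runs show up side by side in TensorBoard.
log_dir = "logs/fit/" + datetime.datetime.now().strftime("%Y%m%d-%H%M%S")
tensorboard_cb = keras.callbacks.TensorBoard(log_dir=log_dir, histogram_freq=1)

# `model` is the Sequential network sketched earlier.
model.fit(x_train, y_train, validation_split=0.2, epochs=30, callbacks=[tensorboard_cb])

# View the curves with: tensorboard --logdir logs/fit
```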

Epoch Accuracy

We can see that increasing the number of epochs doesn't necessarily increase accuracy. It is important to find a good balance for the number of epochs so that you aren't wasting time and resources training the model once it stops improving.


Epoch Loss


Alternative Runs

Epoch accuracy for an alternative run, plus accuracy and loss over the first few epochs.

We can see that the model learns quickly over the first few epochs.


Full Training Curves


ToDo: Implement early stopping for more efficient training

This was not initially implemented because the model already trains very quickly and ends up with good accuracy. I am working with a small amount of data, so the training process is very fast.
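If early stopping is added later, the standard Keras callback would look roughly like this; the monitored metric and patience value are assumptions.

```python
from tensorflow import keras

# Stop once validation loss stops improving, keeping the best weights seen so far.
early_stop = keras.callbacks.EarlyStopping(
    monitor="val_loss",
    patience=5,
    restore_best_weights=True,
)

# `model`, x_train, and y_train are the same placeholders as in the TensorBoard sketch;
# `epochs` just sets an upper bound, since training stops once val_loss plateaus.
model.fit(x_train, y_train, validation_split=0.2, epochs=100, callbacks=[early_stop])
```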

Please note that only the model is uploaded at this time. The other supporting files used to interact with the camera and data are being kept private for now.