/resnet-sign-language-recognition

Keras Functional API implementation of the 50-layer residual neural network (ResNet-50) and its application to sign language digit recognition

Primary LanguageJupyter Notebook

ResNet: Sign language recognition

Recognizing multiple classes of objects from images is a common computer vision task. Here we have a 6-class problem where we want to recognize 6 different digits (0, 1, 2, 3, 4, 5) of American Sign Language given an image of a hand. We build a convolutional neural network for this multi-class classification task, in particular the very deep 50-layer residual neural network (ResNet-50) using TensorFlow Keras Functional API. I did this project in the Convolutional Neural Networks course as part of the Deep Learning Specialization.

Datasets

We have 1080 training examples and 120 test examples, where each example is of shape (64, 64, 3) with each of RGB channel image is of size 64x64. The examples are labeled as one of 0, 1, 2, 3, 4, and 5 for the corresponding digits. These labels are one-hot encoded to be used as the target output of the recognizer, as shown below.

One-hot encoding of class labels

Residual neural network architecture

We used TensorFlow Keras Functional API to build the ResNet-50 model depicted below.

ResNet-50 architecture

We trained the convolutional neural network for 10 epochs with the Keras model's .fit() method and evaluated its performance with the model's evaluate() method. The model shows the training accuracy of 0.93 and the test accuracy of 0.94.