
Emotion Recognition Android App


Quickstart

EmotionApp is a simple image emotion classification application that demonstrates how to embed our pretrained emotion recognition model in your own Android app. The application runs the TorchScript-serialized pretrained model on a static image that is packaged inside the app as an Android asset.

1. Cloning from GitHub

git clone https://github.com/alvin870203/EmotionApp.git
cd EmotionApp

We recommend opening this project in Android Studio 3.5.1 or later. At the moment, PyTorch Android and the demo application use Android Gradle plugin version 3.5.0, which is supported only by Android Studio 3.5.1 and higher. With Android Studio, you can install the Android NDK and Android SDK through its UI.

2. Prepare Pre-built Model

If you don't want to build the TorchScript model from source yourself as described in Step 0 (you probably don't need to), just download our pre-built, scripted and optimized emotion recognition model, EmotionRecognition_scripted.pt, from Google Drive and place it in the app/src/main/assets folder of EmotionApp.

You can find more details about TorchScript in the tutorials on pytorch.org.

3. Gradle Dependencies

PyTorch Android is added to EmotionApp as Gradle dependencies in build.gradle:

repositories {
    jcenter()
}

dependencies {
    implementation 'org.pytorch:pytorch_android_lite:1.10.0'
    implementation 'org.pytorch:pytorch_android_torchvision:1.9.0'
}

Here, org.pytorch:pytorch_android_lite is the main dependency. It provides the PyTorch Android API and includes the libtorch native library for all four Android ABIs (armeabi-v7a, arm64-v8a, x86, x86_64).

org.pytorch:pytorch_android_torchvision is an additional library with utility functions for converting android.media.Image and android.graphics.Bitmap to tensors.

4. Reading image from Android Asset

All the logic happens in org.pytorch.emotion.MainActivity. As a first step, we read test.jpg into an android.graphics.Bitmap using the standard Android API. (You can replace it with one of the other images provided in the assets folder, or with any other image that suits your purpose.)

Bitmap bitmap = BitmapFactory.decodeStream(getAssets().open("test.jpg"));

5. Loading TorchScript Model

Module module = LiteModuleLoader.load(assetFilePath(this, "EmotionRecognition_scripted.pt"));

org.pytorch.Module represents torch::jit::script::Module. It is loaded with the load method, which takes the file path of the model serialized to a file.
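The assetFilePath helper called above is not shown in this section. Because load expects a file path and assets are read as streams, a helper of this kind presumably copies the asset to a regular file and returns that file's absolute path. The following is a minimal plain-Java sketch of that idea under that assumption. The class name AssetCopy and the method copyToFile are hypothetical, and an InputStream stands in for what the app would obtain from getAssets().open(...).

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;

// Hypothetical helper (not part of PyTorch Android): a plain-Java analogue
// of what an assetFilePath-style method is assumed to do.
class AssetCopy {

    /**
     * Copies the stream to a file called name inside dir and returns the
     * file's absolute path. A non-empty file with that name is reused
     * without reading the stream. The caller is responsible for closing in.
     */
    static String copyToFile(InputStream in, File dir, String name) {
        File file = new File(dir, name);
        if (file.exists() && file.length() > 0) {
            // Already copied on an earlier run: skip the copy.
            return file.getAbsolutePath();
        }
        try (OutputStream os = new FileOutputStream(file)) {
            byte[] buffer = new byte[4 * 1024];
            int read;
            // Copy in 4 KiB chunks until the stream is exhausted.
            while ((read = in.read(buffer)) != -1) {
                os.write(buffer, 0, read);
            }
            os.flush();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return file.getAbsolutePath();
    }
}
```

In an Android activity, dir would typically be the directory returned by getFilesDir(), and the returned path would be passed to the model loader.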

6. Preparing Input

Tensor inputTensor = TensorImageUtils.bitmapToFloat32Tensor(bitmap,
    TensorImageUtils.TORCHVISION_NORM_MEAN_RGB, TensorImageUtils.TORCHVISION_NORM_STD_RGB);

org.pytorch.torchvision.TensorImageUtils is part of the org.pytorch:pytorch_android_torchvision library. The TensorImageUtils#bitmapToFloat32Tensor method creates tensors in the torchvision format using an android.graphics.Bitmap as the source.

All pre-trained models expect input images normalized in the same way, i.e. mini-batches of 3-channel RGB images of shape (3 x H x W), where H and W are expected to be at least 224. The images have to be loaded into a range of [0, 1] and then normalized using mean = [0.485, 0.456, 0.406] and std = [0.229, 0.224, 0.225].

inputTensor's shape is 1x3xHxW, where H and W are the bitmap height and width, respectively.
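To make the normalization concrete, here is a minimal sketch of the arithmetic in plain Java. It is an illustration only, not the library's implementation: the class NormalizeSketch and the method toNormalizedChw are hypothetical, and the input is assumed to be packed ARGB pixels in row-major order, as Bitmap.getPixels returns them. Each channel value is scaled to [0, 1], then the channel mean is subtracted and the result divided by the channel std, and the output is laid out channel by channel (all R values, then all G, then all B).

```java
// Hypothetical illustration of the normalization described above;
// not the code used by TensorImageUtils.
class NormalizeSketch {
    static final float[] MEAN = {0.485f, 0.456f, 0.406f};
    static final float[] STD = {0.229f, 0.224f, 0.225f};

    /**
     * Converts packed ARGB pixels (row-major, width * height entries) into
     * a normalized float array in channel-first (3 x H x W) order.
     */
    static float[] toNormalizedChw(int[] argbPixels, int width, int height) {
        int plane = width * height;
        if (argbPixels.length != plane) {
            throw new IllegalArgumentException("expected " + plane + " pixels");
        }
        float[] out = new float[3 * plane];
        for (int i = 0; i < plane; i++) {
            int c = argbPixels[i];
            // Extract 8-bit channels and scale them to [0, 1].
            float r = ((c >> 16) & 0xff) / 255.0f;
            float g = ((c >> 8) & 0xff) / 255.0f;
            float b = (c & 0xff) / 255.0f;
            // Normalize each channel and write it into its own plane.
            out[i] = (r - MEAN[0]) / STD[0];
            out[plane + i] = (g - MEAN[1]) / STD[1];
            out[2 * plane + i] = (b - MEAN[2]) / STD[2];
        }
        return out;
    }
}
```

For a pure white pixel, the red channel becomes (1 - 0.485) / 0.229, roughly 2.249; for a pure black pixel it becomes (0 - 0.485) / 0.229, roughly -2.118.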

7. Run Inference

Tensor outputTensor = module.forward(IValue.from(inputTensor)).toTensor();
float[] scores = outputTensor.getDataAsFloatArray();

The org.pytorch.Module.forward method runs the loaded module's forward method and returns the result as an org.pytorch.Tensor, outputTensor. Its shape is 1x7 if a face is detected, or 1x1 if no face is detected in the image.

8. Processing results

The content of outputTensor is retrieved with the org.pytorch.Tensor.getDataAsFloatArray() method, which returns a Java array of floats with a score for every emotion class if a face is detected.

After that, we find the index with the maximum score and retrieve the predicted class name from the EmotionClasses.EMOTION_CLASSES array, which contains all emotion classes.

If no face is detected, the returned Java array contains only one float.

String className = "";
if (scores.length == 1) {
    className = "No face detected";
} else {
    // searching for the index with maximum score
    float maxScore = -Float.MAX_VALUE;
    int maxScoreIdx = -1;
    for (int i = 0; i < scores.length; i++) {
        if (scores[i] > maxScore) {
            maxScore = scores[i];
            maxScoreIdx = i;
        }
    }
    className = EmotionClasses.EMOTION_CLASSES[maxScoreIdx];
}
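The same decision logic can be pulled into a small pure function, which makes it testable off-device without a model or an Android runtime. The sketch below is one way to do that. The class EmotionDecoder, the method decode, and the placeholder labels in the usage note are hypothetical; the real class names live in EmotionClasses.EMOTION_CLASSES, which is not shown here. The extra length checks are an addition of this sketch, not something the app is known to do.

```java
// Hypothetical wrapper around the result-processing logic shown above.
class EmotionDecoder {

    /**
     * Returns "No face detected" for a single-element score array,
     * otherwise the label at the index of the highest score.
     */
    static String decode(float[] scores, String[] labels) {
        if (scores.length == 1) {
            // The model signals "no face" with a 1x1 output.
            return "No face detected";
        }
        if (scores.length == 0 || scores.length != labels.length) {
            throw new IllegalArgumentException(
                "expected " + labels.length + " scores, got " + scores.length);
        }
        // Linear scan for the index of the maximum score.
        int best = 0;
        for (int i = 1; i < scores.length; i++) {
            if (scores[i] > scores[best]) {
                best = i;
            }
        }
        return labels[best];
    }
}
```

With seven placeholder labels such as "class0" through "class6", a score array whose largest entry is at index 2 decodes to "class2", and a one-element array decodes to "No face detected".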

APK

You can also download the APK we built to install and run EmotionApp.

Screenshots

Light         Dark