Mediapipe Avatar

Mediapipe Avatar is an avatar blendshape library based on Mediapipe and Kalidokit. This module works in the following two modes.

Note: The Mediapipe JS solution is omitted because of the size of the package.

image

Motivation

Running multiple TensorflowJS/Mediapipe models on the GPU is slow, possibly because Chrome handles them sequentially (ref). On the other hand, wasm-simd TFLite is faster than the GPU for small models (ref), and the models used by Kalidokit are relatively small. So my idea is simple: run them in parallel on webworkers with wasm-simd TFLite to speed things up.

Evaluation

I used my own PC (CPU: Intel(R) Core(TM) i9-9900KF CPU @ 3.60GHz, GPU: NVIDIA GeForce RTX 2080 Ti).

Definition: Frame update time is the interval between successive updates of any body part.

(A) With webworker, the average frame update time is about 20 msec. (B) With the Mediapipe JS Solution, the average frame update time is about 70 msec. The webworker setup is faster than the Mediapipe JS Solution.
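As a rough illustration of how such an average could be computed, here is a minimal sketch. It assumes you record one timestamp (for example from `performance.now()`) each time any part is updated; the function name `averageFrameUpdateTime` is hypothetical and is not part of this library.

```javascript
// Average interval (in msec) between consecutive update timestamps.
// `timestamps` is assumed to be an ordered array with one entry per update.
function averageFrameUpdateTime(timestamps) {
  // At least two updates are needed to form one interval.
  if (timestamps.length < 2) return NaN;
  let total = 0;
  for (let i = 1; i < timestamps.length; i++) {
    total += timestamps[i] - timestamps[i - 1];
  }
  return total / (timestamps.length - 1);
}

// Usage: four updates spaced 20 msec apart.
console.log(averageFrameUpdateTime([0, 20, 40, 60])); // 20
```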

image

On webworker with wasm-simd TFLite

image

With mediapipe JS Solutions

image

Demo

demo demo(slow)

Module Usage

For more detailed usage, please see the demo source.

install

npm install @dannadori/mediapipe-avatar-js

use motion detector

(1) initialize

detector = new MotionDetector();
detector.initializeManagers(); // initialize

(2) configure

detector.setUseTFLiteWebWorker(value); // whether to use the webworker with wasm-simd TFLite
detector.setUseMediapipe(value); // whether to use Mediapipe
detector.setEnableFullbodyCapture(value); // whether to enable full body capture


(3) predict

const { faceRig, leftHandRig, rightHandRig, poseRig, faceRigMP, leftHandRigMP, rightHandRigMP, poseRigMP } = await detector.predict(snap);

Each "~Rig" is the Kalidokit output for the input image (snap). The prefix names the body part, and the suffix "MP" means the result comes from Mediapipe. If Mediapipe is not used, each "~RigMP" is null. If TFLite is not used, each "~Rig" is null.
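Because either set of rigs can be null depending on the configuration, it may help to pick one set with a fallback before using it. The sketch below is only an illustration: `selectRigs` and `preferMediapipe` are hypothetical names, and the property names are taken from the destructuring shown above.

```javascript
// Body part keys as they appear in the predict() result.
const PARTS = ["faceRig", "poseRig", "leftHandRig", "rightHandRig"];

// Pick one rig per part, falling back to the other source when the
// preferred one is null or missing.
function selectRigs(result, preferMediapipe = false) {
  const selected = {};
  for (const part of PARTS) {
    const tflite = result[part] ?? null;
    const mediapipe = result[part + "MP"] ?? null;
    selected[part] = preferMediapipe ? (mediapipe ?? tflite) : (tflite ?? mediapipe);
  }
  return selected;
}
```

With only Mediapipe enabled, for example, every "~Rig" is null and `selectRigs(result)` returns the "~RigMP" values under the plain part names.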

control avatar

(1) initialize

avatar = new MediapipeAvator(vrm);

vrm is the avatar VRM loaded by GLTFLoader.

(2) moving

avatar.updatePose(faceRig, poseRig, leftHandRig, rightHandRig);

Each "~Rig" is the output of the motion detector. If you want to use the Mediapipe data, pass the corresponding "~RigMP" values instead.
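Putting detection and avatar control together, one iteration of a render loop might look like the sketch below. It only uses `predict(snap)` and `updatePose(...)` as shown above; the helper name `stepFrame` and the choice to skip the update when every rig is null are assumptions made for illustration. Whether `updatePose` accepts null for individual parts is not stated here, so check the demo source.

```javascript
// One detect-then-apply step. `detector` and `avatar` are expected to expose
// predict(snap) and updatePose(faceRig, poseRig, leftHandRig, rightHandRig).
async function stepFrame(detector, avatar, snap) {
  const { faceRig, poseRig, leftHandRig, rightHandRig } = await detector.predict(snap);
  // Assumption: skip the update when no part was detected for this frame.
  if (!faceRig && !poseRig && !leftHandRig && !rightHandRig) return false;
  avatar.updatePose(faceRig, poseRig, leftHandRig, rightHandRig);
  return true;
}
```

In a browser this could be called from a `requestAnimationFrame` callback, passing the current video frame as `snap`.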