BAYC-Animated-BoredApes: Speaker-Aware Talking-Head Animation

This is the code repository implementing the paper:

MakeItTalk: Speaker-Aware Talking-Head Animation

Yang Zhou, Xintong Han, Eli Shechtman, Jose Echevarria, Evangelos Kalogerakis, Dingzeyu Li

SIGGRAPH Asia 2020

Abstract: We present a method that generates expressive talking-head videos from a single facial image with audio as the only input. In contrast to previous attempts to learn direct mappings from audio to raw pixels for creating talking faces, our method first disentangles the content and speaker information in the input audio signal. The audio content robustly controls the motion of lips and nearby facial regions, while the speaker information determines the specifics of facial expressions and the rest of the talking-head dynamics. Another key component of our method is the prediction of facial landmarks reflecting the speaker-aware dynamics. Based on this intermediate representation, our method works with many portrait images in a single unified framework, including artistic paintings, sketches, 2D cartoon characters, Japanese mangas, and stylized caricatures. In addition, our method generalizes well for faces and characters that were not observed during training. We present extensive quantitative and qualitative evaluation of our method, in addition to user studies, demonstrating generated talking-heads of significantly higher quality compared to prior state-of-the-art methods.

[Project page] [Paper] [Video] [Arxiv] [Colab Demo] [Colab Demo TL;DR]

Installation:

Create environment and activate it.

conda create -n makeittalk_env python=3.6
conda activate makeittalk_env

Install the FFmpeg tool.

sudo apt-get install ffmpeg
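
Before moving on, it can help to confirm that the ffmpeg executable is actually visible on your PATH. The following is a minimal sketch using only the Python standard library; the helper name tool_available is hypothetical and not part of this repository.

```python
import shutil


def tool_available(name: str) -> bool:
    """Return True if an executable called `name` is found on PATH.

    Hypothetical helper for illustration; it is not part of this repository.
    """
    return shutil.which(name) is not None


if tool_available("ffmpeg"):
    print("ffmpeg found at", shutil.which("ffmpeg"))
else:
    print("ffmpeg not found; install it before running main.py")
```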

Install all the relevant packages.

pip install -r requirements.txt

Wine is not required for this implementation; the dependency has been removed.

Download the following pre-trained models to the models/ folder to test your own animations.

Model	Link to the model
Voice Conversion	Link
Speech Content Module	Link
Speaker-aware Module	Link
Image2Image Translation Module	Link

Download the pre-trained embedding [here] and save it to the models/dump folder.

Usage Details

Connect to the machine using Chrome Remote Desktop: https://remotedesktop.google.com/access/. Follow all the instructions there to install Chrome Remote Desktop and access your GCP machine.

To produce samples:

Place the source files generated by the Landmarking Tool into ./input/character_data/
Remove all audio files from ./input/audio/ and add only the one you want to use.
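
To see which files would need to be cleared out of ./input/audio/, a small dry-run listing can be used. This is only a sketch: the helper extra_audio_files and the file name my_audio.wav are hypothetical placeholders, and nothing is deleted.

```python
from pathlib import Path

# Folder named in the step above, relative to the project root.
AUDIO_DIR = Path("input/audio")


def extra_audio_files(audio_dir, keep_name):
    """List files in audio_dir other than keep_name. Nothing is deleted.

    Hypothetical helper for illustration; it is not part of this repository.
    """
    audio_dir = Path(audio_dir)
    if not audio_dir.is_dir():
        return []
    return sorted(
        p for p in audio_dir.iterdir() if p.is_file() and p.name != keep_name
    )


# "my_audio.wav" is a placeholder for the audio file you want to keep.
for path in extra_audio_files(AUDIO_DIR, "my_audio.wav"):
    print("would remove:", path)
```

Once the printed list looks right, the listed files can be removed by hand.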

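In main.py, change the following:

Set char_name to the name of the image file, without the .jpg extension.
Set image_input_dir to the directory that contains the character's source files.
Set audio_name to the name of the audio file to be used, without its extension.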
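
As a rough illustration of the values involved, the sketch below derives the extension-free names with the standard library. The variable names char_name, image_input_dir, and audio_name come from the step above; the file names ape_01.jpg and my_audio.wav, the directory value, and the helper name_without_extension are assumptions made for this example and may not match how main.py is actually laid out.

```python
from pathlib import Path


def name_without_extension(filename: str) -> str:
    """Strip the final extension, e.g. 'ape_01.jpg' -> 'ape_01'.

    Hypothetical helper for illustration; it is not part of this repository.
    """
    return Path(filename).stem


# Placeholder file names; substitute your own image and audio files.
char_name = name_without_extension("ape_01.jpg")
image_input_dir = "input/character_data/"  # assumed: the folder used in the step above
audio_name = name_without_extension("my_audio.wav")

print(char_name, image_input_dir, audio_name)
```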

Run the following command in the root directory of the project.

python main.py

Samples

https://drive.google.com/drive/folders/1p9-LWWVvVxB31GEYU-9aQ89KYqOlaVbf?usp=sharing
