Webcam Live Portrait

🔥🔥🔥🔥🔥🔥🔥🔥🔥🔥🔥🔥🔥🔥🔥🔥🔥🔥🔥🔥🔥🔥🔥🔥🔥

Webcam result

My_img.mp4

Live_Portrait_Monitor

You can find its GitHub repo here.

Demo videos: video6105138957194890677.mp4, video6105138957194890678.mp4, video6105138957194890680.mp4, video6105138957194890682.mp4, video6105138957194890681.mp4

🔥 Updates

2024/07/10: 🔥 Released the initial version of the webcam inference code. More updates are coming, stay tuned!

Introduction

This repo, named Webcam Live Portrait, contains the official PyTorch implementation of the authors' paper LivePortrait: Efficient Portrait Animation with Stitching and Retargeting Control. I am actively updating and improving this repository. If you find any bugs or have suggestions, please raise an issue or submit a pull request (PR) 💖. The webcam_live_portrait and Live_Portrait_Monitor GitHub repos are hosted in two separate directories.

🔥 Getting Started

1. Clone the code and prepare the environment

git clone https://github.com/Mrkomiljon/Webcam_Live_Portrait.git
cd Webcam_Live_Portrait

# create env using conda
conda create -n LivePortrait python==3.9.18
conda activate LivePortrait
# install dependencies with pip
pip install -r requirements.txt

2. Download pretrained weights

Download the pretrained LivePortrait weights and the InsightFace face detection models from Google Drive or Baidu Yun. All weights are packed in one directory 😊. Unzip them and place them in ./pretrained_weights, ensuring the directory structure is as follows:

pretrained_weights
├── insightface
│   └── models
│       └── buffalo_l
│           ├── 2d106det.onnx
│           └── det_10g.onnx
└── liveportrait
    ├── base_models
    │   ├── appearance_feature_extractor.pth
    │   ├── motion_extractor.pth
    │   ├── spade_generator.pth
    │   └── warping_module.pth
    ├── landmark.onnx
    └── retargeting_models
        └── stitching_retargeting_module.pth
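
Before running inference, it can help to confirm that the unzipped weights match the tree above. The following is a minimal sketch, not part of the repository: the helper name missing_weights and the constant EXPECTED_FILES are hypothetical, and the file list is simply copied from the directory structure shown here.

```python
from pathlib import Path

# Relative paths copied from the directory tree above.
EXPECTED_FILES = [
    "insightface/models/buffalo_l/2d106det.onnx",
    "insightface/models/buffalo_l/det_10g.onnx",
    "liveportrait/base_models/appearance_feature_extractor.pth",
    "liveportrait/base_models/motion_extractor.pth",
    "liveportrait/base_models/spade_generator.pth",
    "liveportrait/base_models/warping_module.pth",
    "liveportrait/landmark.onnx",
    "liveportrait/retargeting_models/stitching_retargeting_module.pth",
]


def missing_weights(root="pretrained_weights"):
    """Return the expected files that are not present under `root`."""
    root = Path(root)
    return [rel for rel in EXPECTED_FILES if not (root / rel).is_file()]


if __name__ == "__main__":
    missing = missing_weights()
    if missing:
        print("Missing weight files:")
        for rel in missing:
            print("  " + rel)
    else:
        print("All expected weight files are present.")
```

Run it from the repository root; an empty result means every file in the tree was found.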

3. Inference 🚀

python inference.py

If the script runs successfully, you will get an output mp4 file named animations/s6--d0_concat.mp4. This file includes the following results: driving video, input image, and generated result.
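
If you script around inference.py, you may want to know where the result will land. The sketch below guesses the output path under one assumption: that the file name follows the pattern "<source stem>--<driving stem>_concat.mp4" inside animations/, which is inferred only from the example file names in this README. The function expected_output_path is hypothetical, and so is the input path s6.jpg used in the example.

```python
from pathlib import Path


def expected_output_path(source, driving, output_dir="animations"):
    """Guess the concatenated output path for a source/driving pair.

    Assumption: the name follows '<source stem>--<driving stem>_concat.mp4',
    inferred from the example file names in this README.
    """
    name = f"{Path(source).stem}--{Path(driving).stem}_concat.mp4"
    return (Path(output_dir) / name).as_posix()


# Hypothetical inputs that would produce the file name mentioned above.
print(expected_output_path("assets/examples/source/s6.jpg",
                           "assets/examples/driving/d0.mp4"))
# prints: animations/s6--d0_concat.mp4
```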

Non-real-time result

My_photo--d6_concat.mp4

Alternatively, change the input with the -s (source image) and -d (driving video) arguments. In the first command below, -d is omitted and the driving input comes from the webcam:

python inference.py -s assets/examples/source/MY_photo.jpg

# or disable pasting back
python inference.py -s assets/examples/source/s9.jpg -d assets/examples/driving/d0.mp4 --no_flag_pasteback

# see all available options
python inference.py -h
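
To run the commands above from another Python program, the argument list can be assembled and passed to subprocess. This is a minimal sketch that uses only the flags shown in this section (-s, -d, --no_flag_pasteback); the helper names build_inference_command and run_inference are hypothetical and not part of the repository.

```python
import subprocess
import sys


def build_inference_command(source, driving=None, paste_back=True):
    """Build the argument list for inference.py using only the flags shown above."""
    cmd = [sys.executable, "inference.py", "-s", source]
    if driving is not None:
        cmd += ["-d", driving]
    if not paste_back:
        cmd.append("--no_flag_pasteback")
    return cmd


def run_inference(source, driving=None, paste_back=True):
    """Run inference.py from the repository root.

    Raises subprocess.CalledProcessError if the script exits with an error.
    """
    subprocess.run(build_inference_command(source, driving, paste_back), check=True)


if __name__ == "__main__":
    # Print the command only; call run_inference(...) to actually execute it.
    print(" ".join(build_inference_command(
        "assets/examples/source/s9.jpg",
        "assets/examples/driving/d0.mp4",
        paste_back=False,
    )))
```

Passing the arguments as a list avoids shell quoting problems when paths contain spaces.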

4. Gradio interface

A Gradio interface is also provided for a better experience. Run it with:

python app.py

5. Inference speed evaluation 🚀🚀🚀

We have also provided a script to evaluate the inference speed of each module:

python speed.py

Below are the results of inferring one frame on an RTX 4090 GPU using the native PyTorch framework with torch.compile:

Model	Parameters(M)	Model Size(MB)	Inference(ms)
Appearance Feature Extractor	0.84	3.3	0.82
Motion Extractor	28.12	108	0.84
Spade Generator	55.37	212	7.59
Warping Module	45.53	174	5.21
Stitching and Retargeting Modules	0.23	2.3	0.31

Note: the values listed for the Stitching and Retargeting Modules are the combined parameter count and the total sequential inference time of three MLP networks.
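
As a rough reading of the table, the per-module numbers can be summed to estimate the model-only cost per frame. The sketch below assumes every module runs once per frame, one after another; it ignores face detection, cropping, pasting back, and video I/O, and in practice some modules may not need to run on every frame, so the resulting frame rate is only an optimistic upper bound and not a measured figure.

```python
# (parameters in millions, inference time in ms) copied from the table above.
MODULES = {
    "Appearance Feature Extractor": (0.84, 0.82),
    "Motion Extractor": (28.12, 0.84),
    "Spade Generator": (55.37, 7.59),
    "Warping Module": (45.53, 5.21),
    "Stitching and Retargeting Modules": (0.23, 0.31),
}

# Assumption: modules run sequentially, once per frame.
total_params_m = round(sum(p for p, _ in MODULES.values()), 2)
total_ms = round(sum(t for _, t in MODULES.values()), 2)
fps_upper_bound = 1000.0 / total_ms

print(f"total parameters: {total_params_m} M")        # 130.09 M
print(f"total module time: {total_ms} ms")            # 14.77 ms
print(f"throughput upper bound: {fps_upper_bound:.1f} frames/s")  # 67.7 frames/s
```

The Spade Generator and Warping Module together account for most of the summed time, so they are the natural first targets for any optimization.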

Acknowledgements

I would like to thank the main authors, as well as the contributors of the FOMM, Open Facevid2vid, SPADE, and InsightFace repositories, for their open research.