Bring portraits to life via webcam!

Primary language: Python. License: MIT.

Webcam Live Portrait

🔥🔥🔥🔥🔥🔥🔥🔥🔥🔥🔥🔥🔥🔥🔥🔥🔥🔥🔥🔥🔥🔥🔥🔥🔥

Webcam result

My_img.mp4

🔥 Updates

  • 2024/07/10: 🔥 Released the initial version of the webcam inference code. More updates are on the way, stay tuned!

Introduction

This repo, named Webcam Live Portrait, contains the official PyTorch implementation of the paper LivePortrait: Efficient Portrait Animation with Stitching and Retargeting Control. I am actively updating and improving this repository. If you find a bug or have a suggestion, feel free to open an issue or submit a pull request (PR) 💖.

🔥 Getting Started

1. Clone the code and prepare the environment

git clone https://github.com/Mrkomiljon/Webcam_Live_Portrait.git
cd Webcam_Live_Portrait

# create env using conda
conda create -n LivePortrait python==3.9.18
conda activate LivePortrait
# install dependencies with pip
pip install -r requirements.txt

2. Download pretrained weights

Download the pretrained LivePortrait weights and the InsightFace face detection models from Google Drive or Baidu Yun. All weights are packed in one directory 😊. Unzip them and place them in ./pretrained_weights, ensuring the directory structure is as follows:

pretrained_weights
├── insightface
│   └── models
│       └── buffalo_l
│           ├── 2d106det.onnx
│           └── det_10g.onnx
└── liveportrait
    ├── base_models
    │   ├── appearance_feature_extractor.pth
    │   ├── motion_extractor.pth
    │   ├── spade_generator.pth
    │   └── warping_module.pth
    ├── landmark.onnx
    └── retargeting_models
        └── stitching_retargeting_module.pth
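
Before running inference, it can help to confirm that the unzipped files match the layout above. The following is a minimal sketch and not part of the repository: the helper name `find_missing_weights` is hypothetical, and the file list is simply copied from the tree shown above.

```python
from pathlib import Path

# Relative paths copied from the directory tree above.
EXPECTED_WEIGHTS = [
    "insightface/models/buffalo_l/2d106det.onnx",
    "insightface/models/buffalo_l/det_10g.onnx",
    "liveportrait/base_models/appearance_feature_extractor.pth",
    "liveportrait/base_models/motion_extractor.pth",
    "liveportrait/base_models/spade_generator.pth",
    "liveportrait/base_models/warping_module.pth",
    "liveportrait/landmark.onnx",
    "liveportrait/retargeting_models/stitching_retargeting_module.pth",
]


def find_missing_weights(root="pretrained_weights"):
    """Hypothetical helper: list the expected files that are absent under root."""
    root = Path(root)
    return [rel for rel in EXPECTED_WEIGHTS if not (root / rel).is_file()]


if __name__ == "__main__":
    missing = find_missing_weights()
    if missing:
        print("Missing weight files:")
        for rel in missing:
            print("  " + rel)
    else:
        print("All expected weight files are present.")
```

Run it from the repository root; an empty result means every file in the tree was found.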

3. Inference 🚀

python inference.py

If the script runs successfully, you will get an output mp4 file named animations/s6--d0_concat.mp4. This file includes the following results: driving video, input image, and generated result.

Non-real-time result

My_photo--d6_concat.mp4
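
The two example file names in this section (s6--d0_concat.mp4 and My_photo--d6_concat.mp4) suggest that the output is named after the source and driving file stems. The sketch below only illustrates that apparent pattern so you know where to look for results. The helper `expected_concat_path` is hypothetical, and the actual naming logic lives in the repository's inference code and may differ.

```python
from pathlib import Path


def expected_concat_path(source, driving, output_dir="animations"):
    """Hypothetical helper: guess the output path from the input file names.

    Assumes the pattern <source stem>--<driving stem>_concat.mp4, inferred
    from the example names in this README rather than from the code.
    """
    src_stem = Path(source).stem
    drv_stem = Path(driving).stem
    return Path(output_dir) / f"{src_stem}--{drv_stem}_concat.mp4"


if __name__ == "__main__":
    print(expected_concat_path(
        "assets/examples/source/s9.jpg",
        "assets/examples/driving/d0.mp4",
    ).as_posix())
```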

Alternatively, change the input with the -s (source) and -d (driving) arguments; the driving input can also come from the webcam:

python inference.py -s assets/examples/source/MY_photo.jpg 

# or disable pasting back
python inference.py -s assets/examples/source/s9.jpg -d assets/examples/driving/d0.mp4 --no_flag_pasteback

# see all available options
python inference.py -h
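
If you want to launch these commands from another Python script (for example, to process several source images in a row), one option is to assemble the argument list and call the script with `subprocess`. This is a sketch under assumptions: `build_inference_cmd` and `run_inference` are hypothetical helpers, and only the flags shown above (-s, -d, --no_flag_pasteback) are used.

```python
import subprocess
import sys


def build_inference_cmd(source=None, driving=None, pasteback=True):
    """Hypothetical helper: build the argument list for inference.py."""
    cmd = [sys.executable, "inference.py"]
    if source is not None:
        cmd += ["-s", str(source)]
    if driving is not None:
        cmd += ["-d", str(driving)]
    if not pasteback:
        # Same flag as in the "disable pasting back" example above.
        cmd.append("--no_flag_pasteback")
    return cmd


def run_inference(**kwargs):
    """Run inference.py; call this from the repository root."""
    # check=True raises CalledProcessError if the script exits with an error.
    return subprocess.run(build_inference_cmd(**kwargs), check=True)


if __name__ == "__main__":
    print(" ".join(build_inference_cmd(
        source="assets/examples/source/s9.jpg",
        driving="assets/examples/driving/d0.mp4",
        pasteback=False,
    )))
```

Passing the arguments as a list avoids shell quoting problems with paths that contain spaces.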

4. Gradio interface

We also provide a Gradio interface for a better experience. Just run:

python app.py

5. Inference speed evaluation 🚀🚀🚀

We have also provided a script to evaluate the inference speed of each module:

python speed.py
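
For readers who want to time their own code in a similar way, the usual pattern is to discard a few warm-up runs and average the remaining ones. The snippet below is a framework-agnostic sketch of that pattern, not the contents of speed.py; the helper `time_callable` and its default run counts are assumptions.

```python
import time
from statistics import mean


def time_callable(fn, warmup=3, runs=10):
    """Hypothetical helper: average wall-clock time of fn() in milliseconds."""
    # Warm-up runs are not measured (caches, lazy initialization, compilation).
    for _ in range(warmup):
        fn()
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        # When timing GPU work with PyTorch, call torch.cuda.synchronize()
        # here, before reading the clock, because CUDA kernels run asynchronously.
        samples.append((time.perf_counter() - start) * 1000.0)
    return mean(samples)


if __name__ == "__main__":
    avg_ms = time_callable(lambda: sum(range(100_000)))
    print(f"average: {avg_ms:.3f} ms")
```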

Below are the results of inferring one frame on an RTX 4090 GPU using the native PyTorch framework with torch.compile:

Model                              Parameters (M)  Model Size (MB)  Inference (ms)
---------------------------------  --------------  ---------------  --------------
Appearance Feature Extractor       0.84            3.3              0.82
Motion Extractor                   28.12           108              0.84
Spade Generator                    55.37           212              7.59
Warping Module                     45.53           174              5.21
Stitching and Retargeting Modules  0.23            2.3              0.31

Note: the listed values of Stitching and Retargeting Modules represent the combined parameter counts and the total sequential inference time of three MLP networks.
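
To get a rough sense of the overall cost, the per-module numbers can be added up. The sketch below does only that arithmetic on the values from the table above. It assumes every module runs once per frame, one after another, and it ignores face detection, cropping, and video I/O, so the frame rate it prints is an optimistic estimate rather than a measured result.

```python
# (parameters in M, model size in MB, inference time in ms), from the table above.
MODULES = {
    "Appearance Feature Extractor": (0.84, 3.3, 0.82),
    "Motion Extractor": (28.12, 108.0, 0.84),
    "Spade Generator": (55.37, 212.0, 7.59),
    "Warping Module": (45.53, 174.0, 5.21),
    "Stitching and Retargeting Modules": (0.23, 2.3, 0.31),
}


def summarize(modules):
    """Return total parameters (M), size (MB), time (ms), and implied FPS."""
    total_params = sum(p for p, _, _ in modules.values())
    total_size = sum(s for _, s, _ in modules.values())
    total_ms = sum(t for _, _, t in modules.values())
    # Assumption: modules run sequentially, once per frame.
    return total_params, total_size, total_ms, 1000.0 / total_ms


if __name__ == "__main__":
    params, size, ms, fps = summarize(MODULES)
    print(f"{params:.2f} M parameters, {size:.1f} MB")
    print(f"{ms:.2f} ms per frame, about {fps:.1f} FPS (model time only)")
```

With the table's values this comes to about 130.09 M parameters, 499.6 MB, and 14.77 ms of model time per frame.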

Acknowledgements

I would like to thank the contributors of the FOMM, Open Facevid2vid, SPADE, and InsightFace repositories, as well as the main authors, for their open research.