SocialBook-AnimateAnyone

We are SocialBook. You can explore our other products through the links below.

SocialBook DreamPal
The first complete Animate Anyone code repository

Shunran Jia, Xuanhong Chen, Chen Wang, Chenxi Yan

We plan to release a complete set of Animate Anyone training code and high-quality training data in the next few days, to help the community train its own high-performance Animate Anyone models.

Overview

SocialBook-AnimateAnyone is an image-to-video generative model, specifically designed to create pose-driven virtual human videos.

We implemented this model based on the Animate Anyone paper and further developed it on top of Moore-AnimateAnyone. We are very grateful for their contributions.

Our contributions include:

  • Secondary development on Moore-AnimateAnyone: we applied additional tricks and different training parameters and strategies compared to Moore, resulting in more stable generation.
  • Pose alignment, which yields better consistency across different facial expressions and characters during inference.
  • We plan to open-source our model along with detailed training procedures.

Demos

demo1.mp4
demo2.mp4
demo3.mp4
demo4.mov

Try it online

You can try it out on our demo page now!

Click to try

TODO

  • Release Inference Code
  • Gradio Demo
  • Add Face Enhancement
  • Build online test page
  • Release Training Code And Data

News

  • [05/27/2024] Release Inference Code
  • [05/31/2024] Add a Gradio Demo
  • [06/03/2024] Add face restoration
  • [06/05/2024] Release a demo page

Getting Started

Installation

Clone repo

git clone git@github.com:arceus-jia/SocialBook-AnimateAnyone.git --recursive

Setup environment

conda create -n aa python=3.10
conda activate aa
pip install -r requirements.txt
pip install -U openmim
mim install mmengine
mim install "mmcv>=2.0.1"
mim install "mmdet>=3.1.0"
mim install "mmpose>=1.1.0"

Download weights

python tools/download_weights.py

# optional: weights used for face restoration (inswapper + GFPGAN)
mkdir -p pretrained_weights/inswapper
wget -O pretrained_weights/inswapper/inswapper_128.onnx  https://github.com/facefusion/facefusion-assets/releases/download/models/inswapper_128.onnx

mkdir -p pretrained_weights/gfp
wget -O pretrained_weights/gfp/GFPGANv1.4.pth https://github.com/TencentARC/GFPGAN/releases/download/v1.3.0/GFPGANv1.4.pth

The pretrained_weights directory structure should be:

./pretrained_weights/
|-- public_full
|   |-- denoising_unet.pth
|   |-- motion_module.pth
|   |-- pose_guider.pth
|   └── reference_unet.pth
|-- stable-diffusion-v1-5
|   └── unet
|       |-- config.json
|       └── diffusion_pytorch_model.bin
|-- image_encoder
|   |-- config.json
|   └── pytorch_model.bin
└── sd-vae-ft-mse
    |-- config.json
    └── diffusion_pytorch_model.bin
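
To verify the layout programmatically, a small convenience check like the following (a sketch, not part of the repository) walks the expected files:

# optional: check that all expected weight files from the layout above exist
from pathlib import Path

expected = [
    "public_full/denoising_unet.pth",
    "public_full/motion_module.pth",
    "public_full/pose_guider.pth",
    "public_full/reference_unet.pth",
    "stable-diffusion-v1-5/unet/config.json",
    "stable-diffusion-v1-5/unet/diffusion_pytorch_model.bin",
    "image_encoder/config.json",
    "image_encoder/pytorch_model.bin",
    "sd-vae-ft-mse/config.json",
    "sd-vae-ft-mse/diffusion_pytorch_model.bin",
]
root = Path("pretrained_weights")
missing = [p for p in expected if not (root / p).exists()]
print("All weights found." if not missing else f"Missing: {missing}")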

Quickstart

Inference

Prepare Data

Place your reference image, dance video, and aligned dance image into the 'images', 'videos', and 'align_images' folders under the 'data' directory. (In general, the aligned dance image is a representative frame from the dance video showing the person in a standard pose; see the sketch after the directory layout below for one way to extract it.)

./data/
|-- images
|   └── human.jpg
|-- videos
|   └── dance.mp4
└── align_images
    └── dance.jpg
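
If you do not yet have an aligned dance image, one simple way to create one (a sketch assuming OpenCV is installed; the frame index is a placeholder you should choose yourself) is to grab a frame from the dance video in which the person is in a clear, standard pose:

# extract one frame from the dance video to serve as the aligned pose image
# (hypothetical helper; pick a frame where the full body is clearly visible)
import cv2

cap = cv2.VideoCapture("data/videos/dance.mp4")
cap.set(cv2.CAP_PROP_POS_FRAMES, 0)  # frame index 0 is just a placeholder
ok, frame = cap.read()
cap.release()

if ok:
    cv2.imwrite("data/align_images/dance.jpg", frame)
else:
    print("Could not read a frame from the video.")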

Then modify the 'script/test_video.yaml' file according to your configuration.

Run inference

cd script
python test_video.py -L 48 --grid

Parameters:

-L: number of frames to generate
--grid: enable a grid overlay with the pose and the original image
--seed: random seed
-W: video width
-H: video height
--skip: frame interpolation

The output results are saved in ./output/.
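
If you want to compare generation stability across random seeds, you can wrap the documented command in a small loop (a convenience sketch run from the 'script' directory; it only uses the flags listed above, and how outputs are named is up to test_video.py itself):

# run inference several times with different seeds to compare results
# (convenience sketch; uses only the flags documented above)
import subprocess

for seed in (42, 123, 2024):
    subprocess.run(
        ["python", "test_video.py", "-L", "48", "--grid", "--seed", str(seed)],
        check=True,
    )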

If you want to perform face restoration on a video (only for videos of real people), run:

python restore_face.py --ref_image xxx.jpg --input xxx.mp4 --output xxx.mp4

Gradio (beta, under development)

python app.py

Training