Shurong Yang*, Huadong Li*, Juhao Wu*, Minhao Jing*†, Linze Li, Renhe Ji‡, Jiajun Liang‡, Haoqiang Fan
MEGVII Technology
*Equal contribution †Project lead ‡Corresponding author
- [✅ 2024.05.24] Inference settings are released.
- [❌] Data curation pipeline to be released.
- [❌] Training setup to be released.
MegActor is an intermediate-representation-free portrait animator that uses the original video, rather than intermediate features, as the driving factor to generate realistic and vivid talking head videos. Specifically, we utilize two UNets: one extracts the identity and background features from the source image, while the other accurately generates and integrates motion features directly derived from the original videos. MegActor can be trained on low-quality, publicly available datasets and excels in facial expressiveness, pose diversity, subtle controllability, and visual quality.
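The sketch below illustrates this two-UNet data flow: one stream encodes the source portrait for identity and background, the other encodes a raw driving frame for motion, and the two are fused to produce the output frame. The module names, layer widths, and the simple concatenation-based fusion are illustrative assumptions for clarity, not the released MegActor architecture.

```python
# Minimal sketch of the two-UNet idea described above (illustrative only;
# names, shapes, and the fusion strategy are assumptions, not MegActor code).
import torch
import torch.nn as nn

class TinyUNet(nn.Module):
    """Stand-in encoder-decoder used only to illustrate the data flow."""
    def __init__(self, in_ch: int, feat_ch: int = 64):
        super().__init__()
        self.enc = nn.Sequential(
            nn.Conv2d(in_ch, feat_ch, 3, stride=2, padding=1), nn.SiLU(),
            nn.Conv2d(feat_ch, feat_ch, 3, stride=2, padding=1), nn.SiLU(),
        )
        self.dec = nn.Sequential(
            nn.ConvTranspose2d(feat_ch, feat_ch, 4, stride=2, padding=1), nn.SiLU(),
            nn.ConvTranspose2d(feat_ch, feat_ch, 4, stride=2, padding=1),
        )

    def forward(self, x):
        feats = self.enc(x)            # bottleneck features
        return self.dec(feats), feats

class DualUNetAnimator(nn.Module):
    """Source image -> identity/background stream; raw driving frame ->
    motion stream; the two streams are fused to predict the output frame."""
    def __init__(self, feat_ch: int = 64):
        super().__init__()
        self.reference_unet = TinyUNet(3, feat_ch)  # identity / background
        self.motion_unet = TinyUNet(3, feat_ch)     # raw driving video frame
        self.fuse = nn.Conv2d(2 * feat_ch, 3, 1)    # naive fusion head

    def forward(self, source_img, driving_frame):
        ref_out, _ = self.reference_unet(source_img)
        mot_out, _ = self.motion_unet(driving_frame)
        return self.fuse(torch.cat([ref_out, mot_out], dim=1))

if __name__ == "__main__":
    model = DualUNetAnimator()
    src = torch.randn(1, 3, 256, 256)   # source portrait
    drv = torch.randn(1, 3, 256, 256)   # one frame of the driving video
    print(model(src, drv).shape)        # torch.Size([1, 3, 256, 256])
```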
Demo videos: demo.mp4, demo4.mp4, demo6.mp4
Environments
Detailed environment settings can be found in environment.yaml. To set up the environment, run:
conda env create -f environment.yaml
pip install -U openmim
mim install mmengine
mim install "mmcv>=2.0.1"
mim install "mmdet>=3.1.0"
mim install "mmpose>=1.1.0"
conda install -c conda-forge cudatoolkit-dev -y
Dataset
To be released.
Pretrained weights
Please find our pretrained weights at https://huggingface.co/HVSiniX/RawVideoDriven, or simply run:
git clone https://huggingface.co/HVSiniX/RawVideoDriven && ln -s RawVideoDriven/weights weights
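Alternatively, the weights can be fetched from the Python side. A minimal sketch using huggingface_hub (the repo id comes from the link above; everything else is illustrative):

```python
# Optional: download the weights with the huggingface_hub API instead of git.
from huggingface_hub import snapshot_download

local_dir = snapshot_download(repo_id="HVSiniX/RawVideoDriven")
print("Weights downloaded to:", local_dir)
```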
Training
To be released.
Inference
Currently, only single-GPU inference is supported:
CUDA_VISIBLE_DEVICES=0 python eval.py --config configs/infer12_catnoise_warp08_power_vasa.yaml --source {source image path} --driver {driving video path}
For the Gradio interface, please run
python demo/run_gradio.py
Many thanks to the authors of mmengine, MagicAnimate, Controlnet_aux, and Detectron2.
If you have any questions, feel free to open an issue or contact us at 15066146083@163.com, lihuadong@megvii.com or wujuhao@megvii.com.