DisCoHead: Audio-and-Video-Driven Talking Head Generation by Disentangled Control of Head Pose and Facial Expressions
You can set up the required environment using the commands below:
git clone https://github.com/deepbrainai-research/discohead
cd discohead
conda create -n discohead python=3.7
conda activate discohead
conda install pytorch==1.10.0 torchvision==0.11.1 torchaudio==0.10.0 cudatoolkit=10.2 -c pytorch
pip install -r requirements.txt
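To quickly verify the installation, a short check along these lines can be run inside the discohead environment (an optional sketch, not part of the original instructions):
# Optional sanity check of the installed versions (illustrative, not from the original instructions).
import torch, torchvision, torchaudio
print("torch:", torch.__version__)              # expected 1.10.0
print("torchvision:", torchvision.__version__)  # expected 0.11.1
print("torchaudio:", torchaudio.__version__)    # expected 0.10.0
print("CUDA available:", torch.cuda.is_available())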
- Download the pre-trained checkpoints from Google Drive and put them into the weight folder.
- Download dataset.zip from Google Drive and unzip it into the dataset folder.
- The DisCoHead directory should then have the following structure (a quick check is sketched after the tree):
DisCoHead/
├── dataset/
│ ├── grid/
│ │ ├── demo1/
│ │ ├── demo2/
│ ├── koeba/
│ │ ├── demo1/
│ │ ├── demo2/
│ ├── obama/
│ │ ├── demo1/
│ │ ├── demo2/
├── weight/
│ ├── grid.pt
│ ├── koeba.pt
│ ├── obama.pt
├── modules/
...
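Before running a demo, the layout above can be verified with a minimal check like the following, run from the DisCoHead directory (an illustrative sketch based on the structure shown, not part of the repository):
from pathlib import Path
# Illustrative check that the expected checkpoints and demo folders are present.
for w in ["grid.pt", "koeba.pt", "obama.pt"]:
    assert (Path("weight") / w).is_file(), f"missing weight/{w}"
for d in ["grid", "koeba", "obama"]:
    for demo in ["demo1", "demo2"]:
        assert (Path("dataset") / d / demo).is_dir(), f"missing dataset/{d}/{demo}"
print("All expected checkpoints and demo folders are in place.")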
- The --mode argument is used to specify which demo video you want to generate:
python test.py --mode {mode}
- Available modes: obama_demo1, obama_demo2, grid_demo1, grid_demo2, koeba_demo1, koeba_demo2
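To generate all of the listed demos in one pass, a simple loop over the modes can be used, for example (a sketch assuming test.py needs only the --mode argument shown above):
# Illustrative loop (not part of the repository) that runs test.py once per demo mode listed above.
import subprocess
modes = ["obama_demo1", "obama_demo2", "grid_demo1", "grid_demo2", "koeba_demo1", "koeba_demo2"]
for mode in modes:
    subprocess.run(["python", "test.py", "--mode", mode], check=True)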
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License. You may not use this work for commercial purposes, you may not distribute modified versions of it, and you must give appropriate credit and provide a link to the license.
@INPROCEEDINGS{10095670,
author={Hwang, Geumbyeol and Hong, Sunwon and Lee, Seunghyun and Park, Sungwoo and Chae, Gyeongsu},
booktitle={ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)},
title={DisCoHead: Audio-and-Video-Driven Talking Head Generation by Disentangled Control of Head Pose and Facial Expressions},
year={2023},
volume={},
number={},
pages={1-5},
doi={10.1109/ICASSP49357.2023.10095670}}