The motion transfer task is essentially an image/video synthesis task driven by motion information. It can be divided into two sub-problems: first, finding a representation that properly expresses the motion information; second, finding the mapping function between that motion representation and the image domain.
The original paper, Everybody Dance Now, uses a 2D skeleton model as the motion representation; here we try to use a part map instead. In hindsight, this may not be a good idea, because a part map carries more identity information (such as height and body shape) from the source person to the target person. Intuitively, the final result will not look like the target person.
Usually we need a pose estimation model and a generative model to reach this goal.
In this project we use BodyPix + Pix2PixHD. The source code of both BodyPix and Pix2PixHD is included; you can find it in `src/`.
We need to install dependencies and prepare the build directory by running `yarn` in `src/body-pix/demos`. Please make sure you have installed `yarn` before running the command.
If you still want to use the 2D skeleton model as the motion information representation, please refer to EverybodyDanceNow_reproduce_pytorch. That repository uses an 18-keypoint model as the motion information representation. If you want to produce the same labels as the original paper (including face keypoints, hand keypoints and body keypoints), please refer to the OpenPose Video Installation Guide.
Put the source video in the `src/body-pix/demos` folder and rename it to `source_video.mp4`.
Put the target video in the `src/body-pix/demos` folder and rename it to `target_video.mp4`.
- Run `python 0. prepare_folder.py` to build the `data` folder structure, which is used to store our data. If you run this command again, the `data` folder will be removed and all existing data will be deleted. (A hedged sketch of what this script might look like appears after this step list.)
- Prepare the source dataset and the target dataset.
  - Target dataset:
    - To obtain the video frames:
      - Step 1: Find `segmentBodyInRealTime()` in `src/body-pix/demos/index.js`. Comment out `const canvas = document.getElementById('output');` and uncomment `const canvas = document.getElementById('main');`. Comment out line 654 to line 713.
      - Step 2: Open `index.html` in `src/body-pix/demos/` and change the `src` attribute's value to `target_video.mp4`.
      - Step 3: Set your browser's download location to `data/target/ori_images/`.
      - Step 4: In the `src/body-pix/demos` folder, run `yarn watch`. [Be careful: make sure you have enough space on your hard drive, not only the disk where this project is placed but also the boot disk (normally `C:`), because the browser cache will be generated there!]
    - To obtain the corresponding labels:
      - Step 1: Find `segmentBodyInRealTime()` in `src/body-pix/demos/index.js`. Uncomment `const canvas = document.getElementById('output');` and comment out `const canvas = document.getElementById('main');`. Uncomment line 654 to line 713.
      - Step 2: Set your browser's download location to `data/target/label_images/`.
      - Step 3: In the `src/body-pix/demos` folder, run `yarn watch`. [Be careful again!]
  - Source dataset: put the labels of every frame of the source video in `data/source/label_images/`.
    - To obtain the labels:
      - Step 1: Find `segmentBodyInRealTime()` in `src/body-pix/demos/index.js`. Uncomment `const canvas = document.getElementById('output');` and comment out `const canvas = document.getElementById('main');`. Uncomment line 654 to line 713.
      - Step 2: Set your browser's download location to `data/source/label_images/`.
      - Step 3: In the `src/body-pix/demos` folder, run `yarn watch`. [Be careful again and again!]

  After all of the steps above are done:
  ① Run `python 1.1 resize_image.py` for `data/source/label_images/`, `data/target/label_images/` and `data/target/ori_images/`. [Remember to change the path in the script.] Check the results in `data/source/resized_label_images/`, `data/target/resized_label_images/` and `data/target/resized_ori_images/`. (See the resizing sketch after this list.)
  ② Run `python 1.2 make_label.py` for `data/source/resized_label_images/` and `data/target/resized_label_images/`. [Remember to change the path in the script.] Check the results in `data/source/final_label_images/` and `data/target/final_label_images/`. (See the label-conversion sketch after this list.)
  ③ Run `python 1.3 remake_label.py` for `data/source/final_label_images/`. Check the results in the same place, `data/source/final_label_images/`. Why do we need this step? Because the labels of the source video and the labels of the target video are not in one-to-one correspondence! For example, the value of the LEFT UPPER HAND region in the target video is 7, but the value 7 in the source video may represent the RIGHT UPPER LEG region. We fix this by listing the correspondence between values and regions for both the source video and the target video, and then correcting the label values in `data/source/final_label_images/`. [You need to make changes in the script.] (See the remapping sketch after this list.)
- Run `python 2. train.py` to train the generative model.
- Run `python 3. normalize.py` to normalize the labels in `data/source/final_label_image/`; check the result in `data/source/final_label_image_norm/`.
- Run `python 4. transfer.py` to get the generated frames; you can check them in `results/`.
- Run `python 5. make_gif.py` to get the final GIF result, `output.gif`, in the current folder. (See the GIF assembly sketch after this list.)
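
For reference, the folder layout implied by the steps above can be summarized in code. Below is a minimal sketch of what `0. prepare_folder.py` might look like; the folder list is reconstructed from the paths mentioned in this README, and the actual script may differ.

```python
import os
import shutil

# Folders referenced in the steps above (assumed structure).
FOLDERS = [
    "data/source/label_images",
    "data/source/resized_label_images",
    "data/source/final_label_images",
    "data/source/final_label_image_norm",
    "data/target/ori_images",
    "data/target/label_images",
    "data/target/resized_label_images",
    "data/target/resized_ori_images",
    "data/target/final_label_images",
]

if __name__ == "__main__":
    # Running this again wipes the whole data folder, as noted above.
    if os.path.exists("data"):
        shutil.rmtree("data")
    for folder in FOLDERS:
        os.makedirs(folder)
```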
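
A minimal sketch of the resizing step performed by `1.1 resize_image.py`. The input/output folders and the target resolution below are assumptions; edit them to match the script you actually run (the real script hard-codes its own paths, which is why the step says to change the path).

```python
import os
from PIL import Image

INPUT_DIR = "data/target/ori_images"          # assumed input path
OUTPUT_DIR = "data/target/resized_ori_images"  # assumed output path
SIZE = (512, 512)                              # placeholder resolution

os.makedirs(OUTPUT_DIR, exist_ok=True)
for name in sorted(os.listdir(INPUT_DIR)):
    if not name.lower().endswith((".png", ".jpg", ".jpeg")):
        continue
    img = Image.open(os.path.join(INPUT_DIR, name))
    # NEAREST keeps label values intact when resizing label images;
    # for ordinary video frames a smoother filter such as BILINEAR is fine.
    img = img.resize(SIZE, Image.NEAREST)
    img.save(os.path.join(OUTPUT_DIR, name))
```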
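
A heavily hedged sketch of the idea behind `1.2 make_label.py`, assuming the downloaded BodyPix frames are color-coded part maps that have to be converted into single-channel integer label images. The per-image color indexing below is only illustrative; the real script presumably uses a fixed color table so that values stay consistent across frames.

```python
import os
import numpy as np
from PIL import Image

INPUT_DIR = "data/target/resized_label_images"   # assumed path
OUTPUT_DIR = "data/target/final_label_images"    # assumed path

os.makedirs(OUTPUT_DIR, exist_ok=True)
for name in sorted(os.listdir(INPUT_DIR)):
    rgb = np.array(Image.open(os.path.join(INPUT_DIR, name)).convert("RGB"))
    # Assign one integer per distinct color found in this part map.
    # The real script presumably uses a fixed BodyPix color table instead,
    # so that the same region always gets the same value in every frame.
    _, inverse = np.unique(rgb.reshape(-1, 3), axis=0, return_inverse=True)
    label = inverse.reshape(rgb.shape[:2]).astype(np.uint8)
    Image.fromarray(label).save(os.path.join(OUTPUT_DIR, name))
```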
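
A minimal sketch of the value remapping that `1.3 remake_label.py` performs, following the explanation in step ③. The `SOURCE_TO_TARGET` dictionary is a placeholder; fill it in after listing the actual value-to-region correspondence of your own source and target labels.

```python
import os
import numpy as np
from PIL import Image

SOURCE_DIR = "data/source/final_label_images"

# Placeholder mapping from source label values to target label values.
# Example from step ③: if value 7 means RIGHT UPPER LEG in the source labels
# while the target uses 7 for LEFT UPPER HAND, map source 7 to whatever value
# the target labels use for RIGHT UPPER LEG. Fill this in for every region.
SOURCE_TO_TARGET = {
    0: 0,
    # 7: 13,  # hypothetical: source RIGHT UPPER LEG -> target RIGHT UPPER LEG value
}

# Build a 256-entry lookup table so the remap is one vectorized indexing step.
lookup = np.arange(256, dtype=np.uint8)
for src_value, tgt_value in SOURCE_TO_TARGET.items():
    lookup[src_value] = tgt_value

for name in sorted(os.listdir(SOURCE_DIR)):
    path = os.path.join(SOURCE_DIR, name)
    label = np.array(Image.open(path))
    # Overwrite in place, matching "check the results in the same place" above.
    Image.fromarray(lookup[label]).save(path)
```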
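
A minimal sketch of the GIF assembly done by `5. make_gif.py`, assuming the generated frames sit in `results/` as individual image files; the frame rate is a placeholder.

```python
import os
import imageio

RESULT_DIR = "results"  # assumed layout: generated frames as individual images
FPS = 15                # placeholder frame rate

# Collect frames in filename order and write them out as one animated GIF.
frames = []
for name in sorted(os.listdir(RESULT_DIR)):
    if name.lower().endswith((".png", ".jpg", ".jpeg")):
        frames.append(imageio.imread(os.path.join(RESULT_DIR, name)))

# imageio v2 API; newer versions may prefer `duration` over `fps`.
imageio.mimsave("output.gif", frames, fps=FPS)
```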
Reference: EverybodyDanceNow_reproduce_pytorch