prasunroy/pose-transfer

Training on custom dataset

Closed this issue · 11 comments

Hi Prasun, thanks for open sourcing the project. I am interested in learning about neural pose transfer and came across your repository.
I wanted to know what steps need to be followed to train and test on a custom dataset.

Also, for generating test_img_pairs.csv and train_img_pairs.csv, did you use a script similar to the one used in
https://github.com/tengteng95/Pose-Transfer/blob/master/tool/create_pairs_dataset.py

Hi, thank you for your interest in this work.

At present, the README is not fully updated with detailed instructions. However, I have outlined the essential instructions in a similar issue. Please check #2 (comment).

In our paper, we directly use the train-test split provided by the authors of PATN. This enables us to directly compare our results with PATN. So yes, the image pairs are obtained from the official repository of PATN.

If you would like to use a custom dataset with our code, you need to prepare the dataset as follows:

  • img is the root directory of all the images in the dataset. In our case, these images come from the DeepFashion dataset.
  • train_img_list.csv and test_img_list.csv contain the lists of all train and test images, respectively. In our case, these are the same as in PATN.
  • train_img_pairs.csv and test_img_pairs.csv contain the lists of all train and test image pairs, respectively. In our case, these are the same as in PATN.
  • train_img_keypoints.csv and test_img_keypoints.csv contain the x and y coordinates of 18 body keypoints for every image listed in train_img_list.csv and test_img_list.csv, respectively. The keypoints are estimated with OpenPose. An occluded keypoint is denoted as (-1, -1) and a visible keypoint is denoted as (x, y). A utility script for estimating keypoints can be found at https://github.com/prasunroy/pose-transfer/blob/main/utils/estimate_keypoints.py.
  • train_pose_maps and test_pose_maps directories contain 18-channel pose heatmaps for every image listed in train_img_list.csv and test_img_list.csv, respectively. These heatmaps are generated from the respective keypoints. A utility script for generating pose heatmaps can be found at https://github.com/prasunroy/pose-transfer/blob/main/utils/generate_posemaps.py.
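The keypoints-to-heatmaps conversion described in the last bullet can be sketched as follows. This is a minimal illustration, not the actual implementation in generate_posemaps.py; the Gaussian spread (sigma) is an assumption:

```python
import numpy as np

def keypoints_to_heatmaps(keypoints, height, width, sigma=6.0):
    """Convert 18 (x, y) keypoints into an 18-channel Gaussian heatmap.

    An occluded keypoint, denoted (-1, -1), produces an all-zero channel.
    """
    heatmaps = np.zeros((18, height, width), dtype=np.float32)
    ys, xs = np.mgrid[0:height, 0:width]
    for i, (x, y) in enumerate(keypoints):
        if x < 0 or y < 0:  # occluded keypoint -> leave channel empty
            continue
        # Gaussian bump centered at the keypoint location
        heatmaps[i] = np.exp(-((xs - x) ** 2 + (ys - y) ** 2) / (2.0 * sigma ** 2))
    return heatmaps
```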

A generic workflow on this repository looks like:

  • Copy all images in the dataset into img directory.
  • Create train_img_list.csv and test_img_list.csv.
  • Create train_img_pairs.csv and test_img_pairs.csv.
  • Create train_img_keypoints.csv and test_img_keypoints.csv from train_img_list.csv and test_img_list.csv respectively with estimate_keypoints.py.
  • Create train_pose_maps and test_pose_maps from train_img_keypoints.csv and test_img_keypoints.csv respectively with generate_posemaps.py.
  • Train with train.py.
  • Test with test.py.
  • Evaluate with eval.py.

Thanks for the reply.
In estimate_keypoints.py, the method being used is OpenPose. If I wanted to try another keypoint estimation algorithm such as MediaPipe, what changes would need to be made?
Also, does this repo include code for generating train_img_pairs.csv? In PATN, those pairs were created after the keypoints were estimated. So shouldn't the steps be: first create train.csv, obtain keypoints using that CSV, and then create pairs.csv?

The utility script estimate_keypoints.py is based on an OpenPose API wrapper. You can certainly use other algorithms, but we do not provide direct support for other keypoint estimators in this utility script. Please refer to the respective algorithm's API documentation.

Please note that if the number of keypoints is other than 18, you need to make some minor changes in the codebase.
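For example, with MediaPipe Pose (33 landmarks) one option is to remap its output to the 18-keypoint OpenPose layout so that the rest of the pipeline stays unchanged. A rough sketch, where the landmark indices follow MediaPipe's documented ordering and the neck, which MediaPipe does not provide, is approximated as the shoulder midpoint:

```python
# MediaPipe landmark indices mapped onto OpenPose (COCO-18) slots.
# OpenPose order: nose, neck, Rsho, Relb, Rwri, Lsho, Lelb, Lwri,
#                 Rhip, Rkne, Rank, Lhip, Lkne, Lank, Reye, Leye, Rear, Lear
MP_TO_OP18 = [0, None, 12, 14, 16, 11, 13, 15, 24, 26, 28, 23, 25, 27, 5, 2, 8, 7]

def mediapipe_to_openpose18(landmarks):
    """Remap 33 MediaPipe (x, y) landmarks to the 18-keypoint format.

    `landmarks` is a list of 33 (x, y) tuples, with missing points
    already set to (-1, -1). The neck (slot 1) is synthesized as the
    midpoint of the two shoulders when both are visible.
    """
    kps = []
    for mp_idx in MP_TO_OP18:
        if mp_idx is None:  # neck: midpoint of shoulders 11 and 12
            (lx, ly), (rx, ry) = landmarks[11], landmarks[12]
            if lx < 0 or rx < 0:
                kps.append((-1, -1))
            else:
                kps.append(((lx + rx) / 2.0, (ly + ry) / 2.0))
        else:
            kps.append(landmarks[mp_idx])
    return kps
```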

No, this repo does not provide a script for generating the image pairs because we use the exact same image pairs provided by the authors of PATN. Also, keypoint estimation and image pair creation are two independent tasks, so the order of execution does not affect the workflow.
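That said, if you need to generate pairs for a custom dataset yourself, one simple approach is to group images by person and emit all ordered pairs within each group. A sketch, where the CSV column headers and the filename-prefix person-id rule are assumptions that will differ per dataset:

```python
import csv
from itertools import permutations

def make_pairs_csv(img_list, out_csv, person_id=lambda name: name.rsplit('_', 1)[0]):
    """Write all ordered (condition, target) pairs of images sharing a person id.

    The default person_id rule treats everything before the last '_' as the
    identity; adapt it to your own naming convention.
    """
    groups = {}
    for name in img_list:
        groups.setdefault(person_id(name), []).append(name)
    with open(out_csv, 'w', newline='') as f:
        writer = csv.writer(f)
        writer.writerow(['imgA', 'imgB'])  # header names are an assumption
        for names in groups.values():
            for a, b in permutations(names, 2):  # ordered pairs, no self-pairs
                writer.writerow([a, b])
```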

Ok. Is it okay if I keep this issue open so that I can get back to you in case of any issues?
Also, in PATN the keypoint and pose map files were written in Python 2. Does the same apply to your codebase?

Feel free to keep the issue open.
This codebase is fully implemented and tested in Python 3. You should be able to run it without any major problems.

Could you please upload train_pose_maps again? Seems that it's not getting extracted.

[UPDATE] Was able to resolve it.

Is it possible to resume training from a certain checkpoint? Looking at train.py, it seems there is no functionality for resuming training. Currently, two models are saved in the output directory after every 500 iterations: one for the generator and one for the discriminator. So if I wanted to resume training from a certain iteration, how should I go about it, and what changes should I make in the code?
Also, how much time did it take you to train the full model for 270K iterations? And what steps can I take to reduce the training time?

[UPDATE] After looking at the code in some detail and with some patience, it seems that the load_checkpoint function answers my query. Thanks.
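For anyone else landing here, resuming in a typical PyTorch training setup looks roughly like the following. This is a hypothetical sketch, not the repository's actual load_checkpoint; the checkpoint file names and layout are assumptions:

```python
import torch

def load_checkpoint(generator, discriminator, ckpt_dir, ckpt_id, device='cpu'):
    """Load generator and discriminator weights saved at iteration ckpt_id.

    Assumes plain state_dict files named netG_<id>.pth and netD_<id>.pth.
    Returns ckpt_id so the training loop can offset its iteration counter.
    """
    g_state = torch.load(f'{ckpt_dir}/netG_{ckpt_id}.pth', map_location=device)
    d_state = torch.load(f'{ckpt_dir}/netD_{ckpt_id}.pth', map_location=device)
    generator.load_state_dict(g_state)
    discriminator.load_state_dict(d_state)
    return ckpt_id
```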

My current loss curves after 2 epochs look like this. Are these consistent? Could you share the final loss curves that you obtained?

[plot: Loss_D]

[plot: Loss_G]

[plot: LossG_GAN]

By the looks of it, it seems that both losses are decreasing. Isn't the generator loss supposed to increase while the discriminator loss decreases?

Hi,
I was able to perform testing and obtain the results on DeepFashion. The results look good. Could you suggest what changes should be made in the code for single-image inference?

I am assuming the following needs to be done:

  1. Create a csv that includes a single (condition image, target image) pair.
  2. For both images obtain keypoints and create pose maps.
  3. Feed image and pose map to model to obtain prediction as shown in test.py.

Also, as mentioned in #2, in order to resume training we need to specify ckpt_id and ckpt_dir. As such, if the previous checkpoint was trained for 25500 iterations, then when we resume training shouldn't the epoch start from 2?

For inference on a single sample, you can certainly follow the steps you have mentioned, but there are better ways to implement such a function. A cleaner approach is as follows:

  1. Initialize a generator network and load the trained checkpoint. There is no need to initialize the discriminator because we are only performing inference.
  2. Preprocess the condition image in the same way as the data loader does during training.
  3. Estimate the condition and target keypoints, or load them if already computed.
  4. Compute pose heatmaps and preprocess them in the same way as the data loader does during training.
  5. Call the generator's forward() method with the condition image and pose heatmaps as inputs.
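The steps above might look roughly like this in PyTorch. Everything here is a sketch: the generator's exact input signature, whether the condition and target heatmaps are stacked, and the preprocessing all follow the repository's data loader, which you should consult for the real details:

```python
import torch

def infer_single(generator, condition_image, condition_heatmaps, target_heatmaps,
                 device='cpu'):
    """Run one pose-transfer inference pass (sketch).

    condition_image:    float tensor (3, H, W), preprocessed as in training
    condition_heatmaps: float tensor (18, H, W)
    target_heatmaps:    float tensor (18, H, W)
    """
    generator.eval()
    with torch.no_grad():  # no discriminator and no gradients at inference
        x = condition_image.unsqueeze(0).to(device)
        # Assumption: condition and target heatmaps are stacked channel-wise.
        p = torch.cat([condition_heatmaps, target_heatmaps]).unsqueeze(0).to(device)
        out = generator(x, p)
    return out.squeeze(0).cpu()
```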

I will try to provide a code snippet when I get time. Unfortunately, I am currently travelling, so this may be delayed.

While resuming training, the print function does not account for the epoch offset. This is a minor bug which can be fixed by adding ckpt_id as the epoch offset when printing debug messages. For example, if you are resuming from epoch 25500, then iteration 2 actually means iteration 25502 (25500 + 2). You can fix this by modifying line 85 of train.py.
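A minimal illustration of that offset fix (the function and variable names here are hypothetical, not the actual ones in train.py):

```python
def format_iteration(iteration, ckpt_id=0):
    """Report the global iteration by adding the checkpoint offset.

    ckpt_id is the iteration count of the resumed checkpoint (0 for a
    fresh run), so local iteration 2 resumed from 25500 reports 25502.
    """
    return f'iteration {ckpt_id + iteration}'
```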

Hi, Prasun. I am stuck at creating the image_pairs.csv file.
I am a bit confused by the naming conventions used in the original PATN make_pairs.py.
In PATN's keypoint.csv, the format is: Person ID, Keypoint Y, Keypoint X.
Whereas in the keypoint estimation file used in your repo, the format is: File ID, Image width, Image height, Keypoints p0-p17.
It seems that PATN's make_pairs.py can't be used directly, so if you could suggest another way to create the image_pairs.csv file, that would be helpful. Apart from that, I was able to make everything work on the custom dataset.

Hi @sparshgarg23, is it working for you on custom images without any face loss?