xrenaa/Music-Dance-Video-Synthesis

How can I use my own dataset to train the network?

pstflw opened this issue · 5 comments

Hello. First of all, thank you for the great work. I'm very impressed by this project and am trying to train the network on my own dataset. However, I don't know how to convert my dataset into a JSON file that 'train.py' can read. How should I process my dataset? Thank you.

Sorry for reopening this. I thought I could solve my problem by referring to #4, but I failed. I ran OpenPose on my own dataset to produce skeleton JSON files, and it generated a large number of keypoint JSON files. How can I process these keypoint JSON files to train the network? I checked data_usage.ipynb but wasn't able to find the answer. Thank you.
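
For readers with the same pile of per-frame keypoint files, here is a minimal sketch of stacking them into one pose sequence. It assumes OpenPose was run with its JSON output option and the 18-joint COCO model, so that each file holds a "people" list whose entries carry a flat "pose_keypoints_2d" array of x, y, confidence triples. The function names, the `*_keypoints.json` filename pattern, and the choice to reuse the previous pose when no person is detected are illustrative assumptions, not part of this repository.

```python
import json
from pathlib import Path

# Assumption: 18 joints (OpenPose COCO model), matching the (100, 18, 2)
# shape mentioned later in this thread.
NUM_JOINTS = 18


def frame_to_joints(frame, num_joints=NUM_JOINTS):
    """Return [[x, y], ...] for the first detected person, or None if nobody was detected."""
    people = frame.get("people", [])
    if not people:
        return None
    # Flat list laid out as x0, y0, c0, x1, y1, c1, ... (c = confidence).
    flat = people[0]["pose_keypoints_2d"]
    return [[flat[3 * j], flat[3 * j + 1]] for j in range(num_joints)]


def load_pose_sequence(keypoint_dir, num_joints=NUM_JOINTS):
    """Read every per-frame JSON file in a directory, in filename order.

    Returns a nested list shaped (frames, joints, 2). Frames without a
    detection reuse the previous pose (or zeros for the very first frame)
    so that the frame count stays aligned with the audio.
    """
    sequence = []
    # Assumed filename pattern; zero-padded frame numbers make sorted() chronological.
    for path in sorted(Path(keypoint_dir).glob("*_keypoints.json")):
        with open(path) as f:
            joints = frame_to_joints(json.load(f), num_joints)
        if joints is None:
            joints = sequence[-1] if sequence else [[0.0, 0.0] for _ in range(num_joints)]
        sequence.append(joints)
    return sequence
```

Calling `load_pose_sequence("path/to/keypoints")` on a hypothetical output directory yields one entry per video frame; the result still has to be resampled to the frame rate the training data uses and written into the JSON layout shown in dataset_usage.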

Same question.
@xrenaa @pstflw
If I have a music file and the OpenPose result JSON files,
what should I do to create a JSON file like "ballet_revised_pose_pairs.json"?
It seems to encode the music as audio_sequence.

Any help would be appreciated.

@wtnan2003 Please refer to dataset_usage. It shows the structure of the JSON file and how to load it. Actually, you can just split the music and the OpenPose output into 5 s pieces and then pair them.
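
The split-and-pair step could be sketched roughly as follows. This is an illustration under assumptions, not the code used to build the released dataset: the 16 kHz sample rate and 10 fps pose rate are inferred from figures quoted in this thread (80000 samples per 5 s clip, 100 pose frames per 10 s), the key names "audio_sequence" and "joint_coors" are borrowed from names mentioned in the thread, and the exact JSON layout should be checked against dataset_usage.

```python
def split_and_pair(audio, poses, sample_rate=16000, fps=10, seconds=5):
    """Cut an audio sample list and a pose frame list into aligned fixed-length clips.

    audio: flat list of samples covering the whole recording.
    poses: list of per-frame poses covering the same recording.
    sample_rate and fps are assumptions inferred from the thread.
    """
    audio_len = sample_rate * seconds  # samples per clip (80000 by default)
    pose_len = fps * seconds           # pose frames per clip (50 by default)
    # Keep only complete clips that exist in both streams.
    n_clips = min(len(audio) // audio_len, len(poses) // pose_len)
    pairs = []
    for i in range(n_clips):
        pairs.append({
            # Key names are illustrative; match them to dataset_usage.
            "audio_sequence": audio[i * audio_len:(i + 1) * audio_len],
            "joint_coors": poses[i * pose_len:(i + 1) * pose_len],
        })
    return pairs
```

Any trailing audio or pose frames that do not fill a whole 5 s clip are dropped, which keeps every pair the same length.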

@xrenaa Thanks for the quick reply!

I still got some questions about the dataset_usage:

  1. joint_coors has shape (100, 18, 2) for 10 s, which means the dance video is 10 fps? So the shape means (frames, joints, joint coordinates)?

  2. Is the code x_coor=(temp_pose[:,:,0]/320)-1 for normalization?

  3. The code slices1=d[0:80000].view(50,1600) cuts the 10 s pieces into 5 s, but why reshape to (50,1600)? Is it just to match the shape of the joints?

Thanks
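
The arithmetic behind these three questions can be checked with a small sketch. It only works through the numbers quoted above and is not confirmed by the maintainer: it assumes 80000 samples cover 5 s (which would imply 16 kHz audio), that poses run at 10 fps, and that 320 is half of a 640-pixel frame width. The helper names are hypothetical.

```python
SAMPLES_PER_CLIP = 80000  # from d[0:80000]
FRAMES_PER_CLIP = 50      # from .view(50, 1600); 5 s at an assumed 10 fps


def audio_to_frames(samples, frames=FRAMES_PER_CLIP):
    """Row-major split of a flat sample list into `frames` equal rows.

    This mirrors what a (50, 1600) view of 80000 samples does: row i holds
    the audio samples that play during pose frame i.
    """
    per_frame = len(samples) // frames
    return [samples[i * per_frame:(i + 1) * per_frame] for i in range(frames)]


def normalize_x(x, half_width=320):
    """Map a pixel coordinate from [0, 2 * half_width] to [-1, 1]."""
    return x / half_width - 1


rows = audio_to_frames(list(range(SAMPLES_PER_CLIP)))
print(len(rows), len(rows[0]))                           # 50 1600
print(normalize_x(0), normalize_x(320), normalize_x(640))  # -1.0 0.0 1.0
```

Under these assumptions each of the 50 pose frames lines up with 1600 audio samples (0.1 s of sound), which would explain the (50,1600) shape, and dividing by 320 then subtracting 1 rescales x coordinates to the range [-1, 1].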

Hi, I have a question about processing the audio. Is splitting it into 5 s pieces all I need to do? Don't I need to extract any audio features, such as onsets or a mel spectrogram? It seems there is no code for extracting audio features. Thanks!