GENRE-CONDITIONED LONG-TERM 3D DANCE GENERATION DRIVEN BY MUSIC (ICASSP 2022)

Code for the ICASSP 2022 paper "Genre-Conditioned Long-Term 3D Dance Generation Driven by Music".


Abstract

Dancing to music is an artistic human behavior, yet making machines generate dances from music remains challenging. Most existing works have made progress on motion prediction conditioned on music, but they rarely consider the importance of the musical genre. In this paper, we focus on generating long-term 3D dance from music with a specific genre. Specifically, we construct a pure transformer-based architecture to correlate motion features and music features. To utilize the genre information, we propose to embed the genre categories into the transformer decoder so that they can guide every frame. Moreover, unlike previous inference schemes, we introduce motion queries to output the dance sequence in parallel, which significantly improves efficiency.
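As a rough sketch of this design (not the paper's exact implementation; the layer sizes, frame count, genre count, and the 219-dimensional pose format below are assumptions), a genre-conditioned transformer decoder with parallel motion queries might look like:

```python
import torch
import torch.nn as nn

class GenreConditionedDecoder(nn.Module):
    """Illustrative sketch (not the official code): a transformer decoder
    that attends to music features, adds a genre embedding to every query
    frame, and decodes all output frames in parallel."""

    def __init__(self, num_genres=10, d_model=256, n_frames=240,
                 motion_dim=219, n_heads=8, n_layers=6):
        super().__init__()
        # one learned query vector per output frame -> parallel decoding
        self.motion_queries = nn.Parameter(torch.randn(n_frames, d_model))
        # genre id -> embedding added to every decoder query
        self.genre_embed = nn.Embedding(num_genres, d_model)
        layer = nn.TransformerDecoderLayer(d_model, n_heads, batch_first=True)
        self.decoder = nn.TransformerDecoder(layer, n_layers)
        self.to_motion = nn.Linear(d_model, motion_dim)  # pose per frame

    def forward(self, music_feats, genre_id):
        # music_feats: (batch, T_music, d_model); genre_id: (batch,)
        b = music_feats.size(0)
        queries = self.motion_queries.unsqueeze(0).expand(b, -1, -1)
        queries = queries + self.genre_embed(genre_id).unsqueeze(1)
        hidden = self.decoder(tgt=queries, memory=music_feats)
        return self.to_motion(hidden)  # (batch, n_frames, motion_dim)
```

Because every output frame has its own query, the decoder produces the whole sequence in one forward pass rather than autoregressively, which is the source of the efficiency gain mentioned above.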

Network

Data Preparation

Download AIST++ from its official website. Run ./utils/extrac_audio.py to split the original audio into 240-second sequences, and run ./utils/ext_audio_features_raw.py to save the cached features (a generic sketch of this kind of feature extraction follows the layout below). The final data layout is as follows:


├── AIST
│   ├── ext_audio (downloaded from AIST)
│   ├── audio_sequence (split from ./ext_audio)
│   ├── motions (downloaded from AIST)
│   ├── audio_sequence_features_raw (cached features extracted from ./audio_sequence)
│   ├── wav (downloaded from AIST, for testing)
│   ├── wav_features_aist (cached features extracted from ./wav)
│   └── ...
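The exact features cached by ./utils/ext_audio_features_raw.py are defined by that script; the snippet below is only a minimal sketch of the kinds of per-frame music features commonly used for this task (MFCC, chroma, onset envelope, and onset-peak one-hots) extracted with librosa. The sample rate, hop length, and output naming are assumptions, not the repository's actual settings.

```python
import librosa
import numpy as np

def extract_music_features(wav_path, sr=15360, hop=512):
    """Minimal sketch of music feature caching (not the actual
    ./utils/ext_audio_features_raw.py): MFCC + chroma + onset features."""
    audio, _ = librosa.load(wav_path, sr=sr)
    mfcc = librosa.feature.mfcc(y=audio, sr=sr, n_mfcc=20, hop_length=hop)
    chroma = librosa.feature.chroma_cqt(y=audio, sr=sr, hop_length=hop)
    envelope = librosa.onset.onset_strength(y=audio, sr=sr, hop_length=hop)
    # mark onset peaks as a one-hot beat indicator per frame
    peaks = librosa.util.peak_pick(envelope, pre_max=10, post_max=10,
                                   pre_avg=10, post_avg=10,
                                   delta=0.05, wait=10)
    peak_onehot = np.zeros_like(envelope)
    peak_onehot[peaks] = 1.0
    # align frame counts before stacking (extractors can differ by a frame)
    n = min(mfcc.shape[1], chroma.shape[1], envelope.shape[0])
    feats = np.concatenate([mfcc[:, :n], chroma[:, :n],
                            envelope[None, :n], peak_onehot[None, :n]], axis=0)
    np.save(wav_path.replace('.wav', '_raw.npy'), feats.T)  # (frames, dims)
    return feats.T
```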

Run

For testing, you should download wav.zip and use ./utils/ext_audio_features_raw.py to extract the cached features. Then set the correct data paths in ./config/configs_train.py and ./configs/configs_test.py; a hypothetical illustration of these path entries follows.
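The actual variable names in the config files may differ; the snippet below is a purely hypothetical illustration, reusing the directory layout above.

```python
# Hypothetical illustration only -- check the actual variable names in
# ./configs/configs_test.py before editing.
music_dir = '/path/to/AIST/wav'                     # raw test audio
music_feat_dir = '/path/to/AIST/wav_features_aist'  # cached audio features
motion_dir = '/path/to/AIST/motions'                # ground-truth motions
ckpt_path = '/path/to/model_v12_weights.pth'        # downloaded v12 weights
```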

Note that model v12 is the one used in the paper; we now also provide a new model v13, in which the motion query is obtained by projecting the initial motion with a linear layer (see the sketch below). You can change the model version in run_cmtr.sh and run_test2.sh. For testing, please download the model weights for model v12: https://pan.baidu.com/s/1fGA9INeAQA0FAMbmLxMQdg?pwd=2050
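As a minimal sketch of the difference between the two versions (all names, shapes, and dimensions here are assumptions, not the repository's code): in v12 the motion queries are learned parameters shared across inputs, while in v13 they are produced by linearly projecting the initial motion.

```python
import torch
import torch.nn as nn

class MotionQuery(nn.Module):
    """Sketch of the two query schemes (names and shapes are assumptions)."""

    def __init__(self, version='v13', n_frames=240, motion_dim=219, d_model=256):
        super().__init__()
        self.version = version
        if version == 'v12':
            # v12: one learned query per output frame, shared across inputs
            self.queries = nn.Parameter(torch.randn(n_frames, d_model))
        else:
            # v13: project the initial (seed) motion frames into query space
            self.proj = nn.Linear(motion_dim, d_model)

    def forward(self, init_motion):
        # init_motion: (batch, n_frames, motion_dim) seed motion
        if self.version == 'v12':
            return self.queries.unsqueeze(0).expand(init_motion.size(0), -1, -1)
        return self.proj(init_motion)
```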


For training:
sh run_cmtr.sh

For testing:
sh run_test2.sh

Results

Q&A

If you have any questions, please contact huangai@nudt.edu.cn.

Thanks

Thanks to Shuyan Liu for providing the base code and to Yu Sun for the SMPL visualization.