Free-form Body Motion Generation from Speech (freeMo)

This is the repo for our paper "Free-form Body Motion Generation from Speech".

Video Demo

Directory Structure

|--src //source code
|   |--backup //A runnable implementation of our model
|   |
|   |--repro_nets //other baseline models will be updated soon
|   |      | //same model structure as backup
|   |      | //Audio to Body Dynamics
|   |      | //Speech2Gesture & SpeechDrivenTemplates
|   |
|   |--nets //Some modifications to *repro_nets* for further experiments
|   |   | //Similar macro structure to backup, but some details differ
|   |   | //Some design choices differ from freeMo_old
|   |   ...
|   |
|   |--visualise 
|   |--data_utils
|   |--scripts // &
|   |--trainer //args and trainer
  • code
  • data preparation


python src/backup/ --model_name test --model_path pretrained_models/ --initial_pose sample_initial_pose/bill_initial.npy --audio_path sample_audio/clip000040_TWeBl1yQ1oI.wav --textgrid_path sample_audio/clip000040_TWeBl1yQ1oI.TextGrid --audio_decoding --normalization --noise_size 512 --sample_index 0 10 20

The results differ each time you run the script. They are saved in "results/[model_name]" and include a JSON file of 64 randomly generated motion sequences along with the visualized videos.
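
To inspect the saved output without opening the videos, something like the following sketch may help. It only assumes that the results directory contains one or more files ending in ".json"; the helper name load_generated_motions is hypothetical (it is not part of this repo), and the internal layout of the JSON is not documented here, so the sketch reports only the top-level type and size.

```python
import json
from pathlib import Path


def load_generated_motions(results_dir):
    """Load every *.json file found directly inside results_dir.

    Hypothetical helper, not part of the repo. Returns a dict that maps
    each file name to its parsed JSON content.
    """
    results_dir = Path(results_dir)
    motions = {}
    for json_path in sorted(results_dir.glob("*.json")):
        with json_path.open("r", encoding="utf-8") as f:
            motions[json_path.name] = json.load(f)
    return motions


if __name__ == "__main__":
    # "test" matches the --model_name value used in the command above.
    loaded = load_generated_motions(Path("results") / "test")
    for name, content in loaded.items():
        # The inner layout is an unknown here, so only summarize the top level.
        size = len(content) if isinstance(content, (list, dict)) else 1
        print(name, type(content).__name__, size)
```

If the directory does not exist or holds no JSON files, the function returns an empty dict, so the loop prints nothing.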

For explanation of the flags, see here.


The will be usable once I upload the processed data. You can also modify the code to use a publicly available gesture dataset.