Inferring Speaking Styles from Multi-modal Conversational Context by Multi-scale Relational Graph Convolutional Networks

This repository contains the source code for "Inferring Speaking Styles from Multi-modal Conversational Context by Multi-scale Relational Graph Convolutional Networks" (ACM Multimedia 2022).

This project also contains a re-implementation of "Enhancing Speaking Styles in Conversational Text-to-Speech Synthesis with Graph-based Multi-modal Context Modeling" (ICASSP 2022).

The source code for the multi-scale speaking style enhanced FastSpeech 2 is available at thuhcsi/mst-fastspeech2.