Pinned Repositories
Baselines
A personal baseline
StyleDubber
[ACL 2024] This is the PyTorch code for our paper "StyleDubber: Towards Multi-Scale Style Learning for Movie Dubbing"
NeuralSpeech
StyleTTS2
StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models
FlashSpeech
FlashSpeech: Efficient Zero-Shot Speech Synthesis
Avatar
Avatar: An easy-to-use digital portrait PPT presentation video generation system based on Gradio
fine-grained-music-discriminators
[ICPR 2024] Official implementation of the paper "Generating High-Quality Symbolic Music Using Fine-Grained Discriminators"
Speaker2Dubber
[ACM MM24] Official implementation of the paper "From Speaker to Dubber: Movie Dubbing with Prosody and Duration Consistency Learning"
ZZDoog's Repositories
ZZDoog/Avatar
Avatar: An easy-to-use digital portrait PPT presentation video generation system based on Gradio
ZZDoog/Speaker2Dubber
[ACM MM24] Official implementation of the paper "From Speaker to Dubber: Movie Dubbing with Prosody and Duration Consistency Learning"
ZZDoog/fine-grained-music-discriminators
[ICPR 2024] Official implementation of the paper "Generating High-Quality Symbolic Music Using Fine-Grained Discriminators"
ZZDoog/StyleTTS2
StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models