ZZDoog

Visual Voice Cloning/TTS & Symbolic Music Generation

Hangzhou Dianzi University

Pinned Repositories

Baselines
a personal baseline for myself
Language: Python 0 0 00
StyleDubber
[ACL 2024] This is the PyTorch code for our paper "StyleDubber: Towards Multi-Scale Style Learning for Movie Dubbing"
Language: Python 33 5 22
NeuralSpeech
Language: Python 1.4k 33 124185
StyleTTS2
StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models
Language: Python 4.8k 78 191391
FlashSpeech
FlashSpeech: Efficient Zero-Shot Speech Synthesis
Language: Python 74 26 33
Avatar
Avatar: An easy-to-use digital portrait PPT presentation video generation system based on Gradio
Language: Jupyter Notebook 17 1 11
fine-grained-music-discriminators
[ICPR2024] Official implementation of paper "Generating High-Quality Symbolic Music Using Fine-Grained Discriminators"
Language: Python 40
Speaker2Dubber
[ACM MM24] Official implementation of paper "From Speaker to Dubber: Movie Dubbing with Prosody and Duration Consistency Learning"
Language: Python 10 3 01
StyleTTS2
StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models
Language: Python 00
