Pinned Repositories
Attentron_FastSpeech2
An implementation of Attentron based on a FastSpeech2 backbone
audio2mel_preprocessor
A tool for converting speech datasets to mel-spectrograms.
bert_phoneme_CN_Taco2
CDFSE_FastSpeech2
The Official Implementation of “Content-Dependent Fine-Grained Speaker Embedding for Zero-Shot Speaker Adaptation in Text-to-Speech Synthesis”
FastSpeech2
GNN_SemanticTaco2
The code of "Dependency Parsing based Semantic Representation Learning with Graph Neural Network for Enhancing Expressiveness of Text-to-Speech"
MultiSpeaker_FastSpeech2
PyTorch_phoneme_CN_Taco2
A PyTorch implementation of phoneme-based Tacotron2 for Chinese/Mandarin
Tacotron_VAE
Multi-Speaker Tacotron2 with VAE
text2semantic_t5
Labmem-Zhouyx's Repositories
Labmem-Zhouyx/CDFSE_FastSpeech2
The Official Implementation of “Content-Dependent Fine-Grained Speaker Embedding for Zero-Shot Speaker Adaptation in Text-to-Speech Synthesis”
Labmem-Zhouyx/FastSpeech2
Labmem-Zhouyx/Attentron_FastSpeech2
An implementation of Attentron based on a FastSpeech2 backbone
Labmem-Zhouyx/GNN_SemanticTaco2
The code of "Dependency Parsing based Semantic Representation Learning with Graph Neural Network for Enhancing Expressiveness of Text-to-Speech"
Labmem-Zhouyx/PyTorch_phoneme_CN_Taco2
A PyTorch implementation of phoneme-based Tacotron2 for Chinese/Mandarin
Labmem-Zhouyx/text2semantic_t5
Labmem-Zhouyx/audio2mel_preprocessor
A tool for converting speech datasets to mel-spectrograms.
Labmem-Zhouyx/MultiSpeaker_FastSpeech2
Labmem-Zhouyx/ParallelWaveGAN
Unofficial Parallel WaveGAN (+ MelGAN & Multi-band MelGAN) with PyTorch
Labmem-Zhouyx/PyTorch_character_CN_Taco2
A PyTorch implementation of character-based Tacotron2 for Chinese/Mandarin
Labmem-Zhouyx/WebDemo_Audio
Labmem-Zhouyx/bert_dependencyparsing_taco2
Labmem-Zhouyx/Chinese_TN_Dataset
Labmem-Zhouyx/continual-learning
PyTorch implementation of various methods for continual learning (XdG, EWC, online EWC, SI, LwF, GR, GR+distill, RtF, ER, A-GEM, iCaRL).
Labmem-Zhouyx/FastSpeech2-1
An implementation of Microsoft's "FastSpeech 2: Fast and High-Quality End-to-End Text to Speech"
Labmem-Zhouyx/FastSpeech2_ACL2022_reproducibility
Labmem-Zhouyx/forced-alignment-tools
A collection of links and notes on forced alignment tools
Labmem-Zhouyx/hifi-gan
HiFi-GAN: Generative Adversarial Networks for Efficient and High Fidelity Speech Synthesis
Labmem-Zhouyx/icassp2022-zeroshot-tts-finegrained-spkemb
Labmem-Zhouyx/interspeech2022-cdfse-tts
Please visit: https://thuhcsi.github.io/interspeech2022-cdfse-tts
Labmem-Zhouyx/interspeech2022-dependency-semantic-tts
Please visit: https://thuhcsi.github.io/interspeech2022-dependency-semantic-tts
Labmem-Zhouyx/Meta-TTS
Official repository of https://arxiv.org/abs/2111.04040v1
Labmem-Zhouyx/pytorch-loss
label-smooth, amsoftmax, partial-fc, focal-loss, triplet-loss, lovasz-softmax. Maybe useful
Labmem-Zhouyx/Real-Time-Voice-Cloning
Clone a voice in 5 seconds to generate arbitrary speech in real-time
Labmem-Zhouyx/resume
Labmem-Zhouyx/speech-synthesis-paper
List of speech synthesis papers.
Labmem-Zhouyx/STYLER
STYLER: Style Factor Modeling with Rapidity and Robustness via Speech Decomposition for Expressive and Controllable Neural Text to Speech, INTERSPEECH 2021
Labmem-Zhouyx/tacotron
PyTorch implementation of Tacotron and Tacotron2
Labmem-Zhouyx/TTS-Eval
Labmem-Zhouyx/unilm
Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities