Pinned Repositories
asr_guided_tacotron
Use LAS to improve the performance of Tacotron, especially when speaker labels are unavailable.
auorange
Audio LPC (linear predictive coding) computed from the mel spectrogram, compatible with LPCNet
Bert-VITS2
VITS2 backbone with BERT
BigCiDian
Pronunciation lexicon covering both English and Chinese languages for Automatic Speech Recognition.
bit-rnn
Quantize weights and activations in Recurrent Neural Networks.
ICASSP2021_demo
ICASSP2022_demo
TacoLPCNet-demo
TASLP
zhrtvc
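Chinese real-time voice cloning (VC) and Chinese text-to-speech (TTS). An easy-to-use Chinese voice cloning and speech synthesis system, comprising a speech encoder, a speech synthesizer, a vocoder, and a visualization module.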
gongchenghhu's Repositories
gongchenghhu/TASLP
gongchenghhu/ICASSP2022_demo
gongchenghhu/Bert-VITS2
VITS2 backbone with BERT
gongchenghhu/Comprehensive-E2E-TTS
A non-autoregressive end-to-end text-to-speech (text-to-wav) system, supporting a family of SOTA unsupervised duration modeling methods. This project grows with the research community, aiming to achieve the ultimate E2E-TTS.
gongchenghhu/Comprehensive-Transformer-TTS
A non-autoregressive Transformer-based TTS, supporting a family of SOTA transformers with supervised and unsupervised duration modeling methods. This project grows with the research community, aiming to achieve the ultimate TTS.
gongchenghhu/emotion2vec
Official PyTorch code for extracting features and training downstream models with emotion2vec: Self-Supervised Pre-Training for Speech Emotion Representation
gongchenghhu/emotional-vits
An emotion-controllable speech synthesis model that requires no emotion annotations, based on VITS
gongchenghhu/espnet
End-to-End Speech Processing Toolkit
gongchenghhu/few-shot-transformer-tts
Byte-based multilingual transformer TTS for low-resource/few-shot language adaptation.
gongchenghhu/FG-transformer-TTS
Official implementation for the paper Fine-grained style control in transformer-based text-to-speech synthesis.
gongchenghhu/gongchenghhu.github.io
Portuguese audio samples
gongchenghhu/house
A full-version PDF is available for download.
gongchenghhu/IMS-Toucan
Text-to-Speech Toolkit of the Speech and Language Technologies Group at the University of Stuttgart. Objectives of the development are simplicity, modularity, controllability and multilinguality.
gongchenghhu/iSTFTNet-pytorch
iSTFTNet: Fast and Lightweight Mel-spectrogram Vocoder Incorporating Inverse Short-time Fourier Transform
gongchenghhu/Italian-demo
Low-resource results for Italian
gongchenghhu/MSMC-TTS
Official implementation of Multi-Stage Multi-Codebook (MSMC) TTS
gongchenghhu/NANSY
gongchenghhu/Polish-demo
gongchenghhu/Portuguese.github.io
Portuguese audio samples
gongchenghhu/PortugueseAudios
gongchenghhu/PortugueseAudios.github.io
gongchenghhu/radtts
Provides training, inference, and voice conversion recipes for RADTTS and RADTTS++: flow-based TTS models with Robust Alignment Learning, Diverse Synthesis, and Generative Modeling and Fine-Grained Control over Low-Dimensional (F0 and Energy) Speech Attributes.
gongchenghhu/SpeechT5
Unified-Modal Speech-Text Pre-Training for Spoken Language Processing
gongchenghhu/supplementary-results
gongchenghhu/TASLP-demo
Multilingual and multi-speaker audio samples
gongchenghhu/test
gongchenghhu/w2v2-how-to
How to use our public wav2vec2 dimensional emotion model
gongchenghhu/wav2vec2-codebook-indices
gongchenghhu/ZeroSpeech
VQ-VAE for Acoustic Unit Discovery and Voice Conversion
gongchenghhu/ZMM-TTS
ZMM-TTS: Zero-shot Multilingual and Multispeaker Speech Synthesis Conditioned on Self-supervised Discrete Speech Representations