Labmem-Zhouyx's Stars
CompVis/latent-diffusion
High-Resolution Image Synthesis with Latent Diffusion Models
divymurli/VAEs
Variational autoencoders: VAE, Gaussian mixture VAE (GMVAE), and a basic ladder VAE (LVAE)
ShannonAI/ChineseBert
Code for ACL 2021 paper "ChineseBERT: Chinese Pretraining Enhanced by Glyph and Pinyin Information"
Voice-Privacy-Challenge/Voice-Privacy-Challenge-2022
Baseline Recipe for VoicePrivacy Challenge 2022: anonymization systems and evaluation software
CoinCheung/pytorch-loss
label-smooth, amsoftmax, partial-fc, focal-loss, triplet-loss, lovasz-softmax. Maybe useful
NeuroWave-ai/CUCVAE-TTS
NATSpeech/NATSpeech
A Non-Autoregressive Text-to-Speech (NAR-TTS) framework, including official PyTorch implementation of PortaSpeech (NeurIPS 2021) and DiffSpeech (AAAI 2022)
yerfor/SyntaSpeech
SyntaSpeech: Syntax-aware Generative Adversarial Text-to-Speech; IJCAI 2022; Official code
yl4579/StarGANv2-VC
StarGANv2-VC: A Diverse, Unsupervised, Non-parallel Framework for Natural-Sounding Voice Conversion
thuhcsi/tacotron
PyTorch implementation of Tacotron and Tacotron2
Labmem-Zhouyx/CDFSE_FastSpeech2
The Official Implementation of “Content-Dependent Fine-Grained Speaker Embedding for Zero-Shot Speaker Adaptation in Text-to-Speech Synthesis”
MLNLP-World/Paper-Writing-Tips
A repository compiled by the MLNLP community to help everyone avoid small mistakes in paper submissions. Paper Writing Tips
resemble-ai/Resemblyzer
A python package to analyze and compare voices with deep learning
thuhcsi/VAENAR-TTS
The official implementation of VAENAR-TTS, a VAE based non-autoregressive TTS model.
thuhcsi/SpanPSP
keonlee9420/STYLER
Official repository of STYLER: Style Factor Modeling with Rapidity and Robustness via Speech Decomposition for Expressive and Controllable Neural Text to Speech, INTERSPEECH 2021
Labmem-Zhouyx/FastSpeech2
RookieJunChen/FullSubNet-plus
The official PyTorch implementation of "FullSubNet+: Channel Attention FullSubNet with Complex Spectrograms for Speech Enhancement".
microsoft/unilm
Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities
MontrealCorpusTools/Montreal-Forced-Aligner
Command line utility for forced alignment using Kaldi
jik876/hifi-gan
HiFi-GAN: Generative Adversarial Networks for Efficient and High Fidelity Speech Synthesis
SungFeng-Huang/Meta-TTS
Official repository of https://doi.org/10.1109/TASLP.2022.3167258. More up-to-date code is in "refactor" branch.
pettarin/forced-alignment-tools
A collection of links and notes on forced alignment tools
ming024/FastSpeech2
An implementation of Microsoft's "FastSpeech 2: Fast and High-Quality End-to-End Text to Speech"
Labmem-Zhouyx/GNN_SemanticTaco2
The code of "Dependency Parsing based Semantic Representation Learning with Graph Neural Network for Enhancing Expressiveness of Text-to-Speech"
Labmem-Zhouyx/ParallelWaveGAN
Unofficial Parallel WaveGAN (+ MelGAN & Multi-band MelGAN) with PyTorch
Labmem-Zhouyx/audio2mel_preprocessor
A tool for converting a speech dataset to mel-spectrograms.
Labmem-Zhouyx/PyTorch_character_CN_Taco2
A PyTorch implementation of character-based Tacotron2 for Chinese/Mandarin
Labmem-Zhouyx/PyTorch_phoneme_CN_Taco2
A PyTorch implementation of phoneme-based Tacotron2 for Chinese/Mandarin
Labmem-Zhouyx/bert_phoneme_CN_Taco2