Pinned Repositories
adversarial-examples-pytorch
Implementation of Papers on Adversarial Examples
AlignTTS
Implementation of the AlignTTS
AnimeGAN
A TensorFlow implementation of AnimeGAN for fast photo animation! This is the open-source code for the paper 「AnimeGAN: a novel lightweight GAN for photo animation」, which uses the GAN framework to transform real-world photos into anime images.
animegan2-pytorch
PyTorch implementation of AnimeGANv2
AnimeGANv2
[Open Source]. The improved version of AnimeGAN. Landscape photos/videos to anime
attentions
PyTorch implementation of some attentions for Deep Learning Researchers.
Attentions-in-Tacotron
AudioLDM
AudioLDM: Generate speech, sound effects, music and beyond, with text.
audiolm-pytorch
Implementation of AudioLM, a SOTA Language Modeling Approach to Audio Generation out of Google Research, in PyTorch
AuxiliaryASR
Joint CTC-S2S Phoneme-level ASR for Voice Conversion and TTS (Text-Mel Alignment)
tsaifangsheng's Repositories
tsaifangsheng/AudioLDM
AudioLDM: Generate speech, sound effects, music and beyond, with text.
tsaifangsheng/audiolm-pytorch
Implementation of AudioLM, a SOTA Language Modeling Approach to Audio Generation out of Google Research, in PyTorch
tsaifangsheng/AuxiliaryASR
Joint CTC-S2S Phoneme-level ASR for Voice Conversion and TTS (Text-Mel Alignment)
tsaifangsheng/bddm
BDDM: Bilateral Denoising Diffusion Models for Fast and High-Quality Speech Synthesis
tsaifangsheng/bark
🔊 Text-Prompted Generative Audio Model
tsaifangsheng/CDiffuSE
Conditional Diffusion Probabilistic Model for Speech Enhancement
tsaifangsheng/Comprehensive-Transformer-TTS
A Non-Autoregressive Transformer-based Text-to-Speech, supporting a family of SOTA transformers with supervised and unsupervised duration modeling. This project grows with the research community, aiming to achieve the ultimate TTS
tsaifangsheng/ControlNet
Let us control diffusion models!
tsaifangsheng/diffsptk
A differentiable version of SPTK
tsaifangsheng/diffusion_distiller
🚀 PyTorch Implementation of "Progressive Distillation for Fast Sampling of Diffusion Models" (v-diffusion)
tsaifangsheng/FastDiff
PyTorch Implementation of FastDiff (IJCAI'22)
tsaifangsheng/GeneFace
GeneFace: Generalized and High-Fidelity 3D Talking Face Synthesis; ICLR 2023; Official code
tsaifangsheng/google-research
Google Research
tsaifangsheng/GST-Tacotron
A PyTorch implementation of Style Tokens: Unsupervised Style Modeling, Control and Transfer in End-to-End Speech Synthesis
tsaifangsheng/iSTFTNet-pytorch
iSTFTNet: Fast and Lightweight Mel-spectrogram Vocoder Incorporating Inverse Short-time Fourier Transform
tsaifangsheng/MB-iSTFT-VITS
Lightweight and High-Fidelity End-to-End Text-to-Speech with Multi-Band Generation and Inverse Short-Time Fourier Transform
tsaifangsheng/MQTTS
tsaifangsheng/MSMC-TTS
Official Implementation of Multi-Stage Multi-Codebook (MSMC) TTS
tsaifangsheng/NeuralSVB
Learning the Beauty in Songs: Neural Singing Voice Beautifier; ACL 2022 (Main conference); Official code
tsaifangsheng/nix-tts
🐤 Nix-TTS: An Incredibly Lightweight End-to-End Text-to-Speech Model via Non End-to-End Distillation
tsaifangsheng/nnsvs
Neural network-based singing voice synthesis library for research
tsaifangsheng/PitchExtractor
Deep Neural Pitch Extractor for Voice Conversion and TTS Training
tsaifangsheng/ProDiff
PyTorch Implementation of ProDiff (ACM-MM'22) with an extremely fast diffusion speech synthesis pipeline
tsaifangsheng/self-supervised-phone-segmentation
Phoneme segmentation using pre-trained speech models
tsaifangsheng/stable-diffusion
A latent text-to-image diffusion model
tsaifangsheng/StarGANv2-VC
StarGANv2-VC: A Diverse, Unsupervised, Non-parallel Framework for Natural-Sounding Voice Conversion
tsaifangsheng/state-spaces
Sequence Modeling with Structured State Spaces
tsaifangsheng/StyleFlow
StyleFlow: Attribute-conditioned Exploration of StyleGAN-generated Images using Conditional Continuous Normalizing Flows (ACM TOG 2021)
tsaifangsheng/UUVC
tsaifangsheng/valle
Zero-Shot Text-To-Speech