nangongmujd

Pinned Repositories

Amphion
Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audio, music, and speech generation research and development.
Language: Python
athena
An open-source implementation of a sequence-to-sequence based speech processing engine
Language: Python
bark
🔊 Text-Prompted Generative Audio Model
Language: Python
Bert-VITS2
VITS2 backbone with BERT
Language: Python
DenoiseNet
An implementation of DenoiseNet https://arxiv.org/pdf/1701.01687.pdf
Language: Python
DiffSinger
DiffSinger: Singing Voice Synthesis via Shallow Diffusion Mechanism (SVS & TTS); AAAI 2022; Official code
Language: Python
FaceSwap
3D face swapping implemented in Python
Language: Python
ForwardTacotron
⏩ Generating speech in a single forward pass without any attention!
Language: Python
GPT-SoVITS
1 min of voice data can also be used to train a good TTS model!
PaddleSpeech
Easy-to-use Speech Toolkit including SOTA/Streaming ASR with punctuation, influential TTS with text frontend, Speaker Verification System, End-to-End Speech Translation and Keyword Spotting. Won NAACL2022 Best Demo Award.
Language: Python

nangongmujd's Repositories

nangongmujd/GPT-SoVITS
1 min of voice data can also be used to train a good TTS model!
nangongmujd/PaddleSpeech
Easy-to-use Speech Toolkit including SOTA/Streaming ASR with punctuation, influential TTS with text frontend, Speaker Verification System, End-to-End Speech Translation and Keyword Spotting. Won NAACL2022 Best Demo Award.
Language: Python
nangongmujd/Amphion
Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audio, music, and speech generation research and development.
Language: Python
nangongmujd/bark
🔊 Text-Prompted Generative Audio Model
Language: Python
nangongmujd/Bert-VITS2
VITS2 backbone with BERT
Language: Python
nangongmujd/DenoiseNet
An implementation of DenoiseNet https://arxiv.org/pdf/1701.01687.pdf
Language: Python
nangongmujd/DiffSinger
DiffSinger: Singing Voice Synthesis via Shallow Diffusion Mechanism (SVS & TTS); AAAI 2022; Official code
Language: Python
nangongmujd/FaceSwap
3D face swapping implemented in Python
Language: Python
nangongmujd/ForwardTacotron
⏩ Generating speech in a single forward pass without any attention!
Language: Python
nangongmujd/FullSubNet
PyTorch implementation of "FullSubNet: A Full-Band and Sub-Band Fusion Model for Real-Time Single-Channel Speech Enhancement."
Language: Python
nangongmujd/GenerSpeech
PyTorch Implementation of GenerSpeech (NeurIPS'22): a text-to-speech model towards zero-shot style transfer of OOD custom voice.
Language: Python
nangongmujd/hifi-gan
HiFi-GAN: Generative Adversarial Networks for Efficient and High Fidelity Speech Synthesis
Language: Python
nangongmujd/iSTFTNet-pytorch
iSTFTNet: Fast and Lightweight Mel-spectrogram Vocoder Incorporating Inverse Short-time Fourier Transform
Language: Python
nangongmujd/OpenTransformer
A No-Recurrence Sequence-to-Sequence Model for Speech Recognition
Language: Python
nangongmujd/megatts2
Unofficial implementation of Megatts2
nangongmujd/MiniGPT-4
MiniGPT-4: Enhancing Vision-language Understanding with Advanced Large Language Models
Language: Python
nangongmujd/NATSpeech
A Non-Autoregressive Text-to-Speech (NAR-TTS) framework, including official PyTorch implementation of PortaSpeech (NeurIPS 2021) and DiffSpeech (AAAI 2022)
nangongmujd/polyglot
Multilingual text (NLP) processing toolkit
nangongmujd/so-vits-svc
SoftVC VITS Singing Voice Conversion
nangongmujd/sru
Training RNNs as Fast as CNNs (https://arxiv.org/abs/1709.02755)
Language: Python
nangongmujd/StyleSpeech
Official implementation of Meta-StyleSpeech and StyleSpeech
nangongmujd/StyleTTS2
StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models
nangongmujd/SyntaSpeech
SyntaSpeech: Syntax-aware Generative Adversarial Text-to-Speech; IJCAI 2022; Official code
nangongmujd/TransformerTTS
🤖💬 Transformer TTS: Implementation of a non-autoregressive Transformer based neural network for text to speech.
nangongmujd/TTS
🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
nangongmujd/vall-e
An unofficial PyTorch implementation of the audio LM VALL-E, WIP
Language: Python
nangongmujd/VITS-fast-fine-tuning
This repo is a pipeline of VITS finetuning for fast speaker adaptation TTS, and many-to-many voice conversion
nangongmujd/vits2_pytorch
Unofficial VITS2-TTS implementation in PyTorch
nangongmujd/VocGAN
VocGAN: A High-Fidelity Real-time Vocoder with a Hierarchically-nested Adversarial Network
nangongmujd/voxelmorph
Unsupervised Learning for Image Registration