Pinned Repositories
AcademiCodec
AcademiCodec: An Open Source Audio Codec Model for Academic Research
Amphion
Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audio, music, and speech generation research and development.
beaqlejs
*BeaqleJS* provides a framework for creating browser-based listening tests and is built purely on open web standards such as HTML5 and JavaScript.
book-text-to-speech
A book about Text-to-Speech (TTS) in Chinese.
ClariNet
A PyTorch implementation of ClariNet
Concatenate_wav
Concatenate WAV files (for unit selection)
CosyVoice
Multilingual large voice generation model with full-stack support for inference, training, and deployment.
F5-TTS
Official code for "F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching"
FastSpeech2
An implementation of Microsoft's "FastSpeech 2: Fast and High-Quality End-to-End Text to Speech"
FloWaveNet
A PyTorch implementation of "FloWaveNet: A Generative Flow for Raw Audio"
sunxh16's Repositories
sunxh16/AcademiCodec
AcademiCodec: An Open Source Audio Codec Model for Academic Research
sunxh16/Amphion
Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audio, music, and speech generation research and development.
sunxh16/book-text-to-speech
A book about Text-to-Speech (TTS) in Chinese.
sunxh16/ClariNet
A PyTorch implementation of ClariNet
sunxh16/Concatenate_wav
Concatenate WAV files (for unit selection)
sunxh16/CosyVoice
Multilingual large voice generation model with full-stack support for inference, training, and deployment.
sunxh16/F5-TTS
Official code for "F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching"
sunxh16/FastSpeech2
An implementation of Microsoft's "FastSpeech 2: Fast and High-Quality End-to-End Text to Speech"
sunxh16/FloWaveNet
A PyTorch implementation of "FloWaveNet: A Generative Flow for Raw Audio"
sunxh16/GPT-SoVITS
Just 1 minute of voice data is enough to train a good TTS model (few-shot voice cloning).
sunxh16/NeuralVoicePuppetry
This GitHub repository contains the network architectures of NeuralVoicePuppetry.
sunxh16/NNPACK
Acceleration package for neural networks on multi-core CPUs
sunxh16/nonparaSeq2seqVC_code
Implementation code of non-parallel sequence-to-sequence VC
sunxh16/onnxruntime
ONNX Runtime
sunxh16/ParallelWaveGAN
Unofficial Parallel WaveGAN (+ MelGAN & Multi-band MelGAN) with PyTorch
sunxh16/Python-Wrapper-for-World-Vocoder
A Python wrapper for the high-quality vocoder "World"
sunxh16/rigl
End-to-end training of sparse deep neural networks with little-to-no performance loss.
sunxh16/SincNet
SincNet is a neural architecture for efficiently processing raw audio samples.
sunxh16/so-vits-svc
SoftVC VITS Singing Voice Conversion
sunxh16/sp2si-code
Contains code for our work on speech to singing conversion (ICASSP 2020)
sunxh16/SqueezeWave
sunxh16/tacotron2_v1
DeepMind's Tacotron-2 TensorFlow implementation
sunxh16/tacotron2_v2
Tacotron 2 - PyTorch implementation with faster-than-realtime inference
sunxh16/TTS
🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
sunxh16/vits
VITS: Conditional Variational Autoencoder with Adversarial Learning for End-to-End Text-to-Speech
sunxh16/vocos
Vocos: Closing the gap between time-domain and Fourier-based neural vocoders for high-quality audio synthesis
sunxh16/voice_conversion
sunxh16/wav2letter
Facebook AI Research Automatic Speech Recognition Toolkit
sunxh16/waveglow
A Flow-based Generative Network for Speech Synthesis
sunxh16/World
A high-quality speech analysis, manipulation and synthesis system