Hongjiang-Yu's Stars
fishaudio/fish-speech
Brand new TTS solution
WenetSpeech4TTS/wenetspeech4tts
MatsuriDayo/NekoBoxForAndroid
NekoBox for Android / sing-box / universal proxy toolchain for Android
githubvpn007/ClashX
ClashX, ClashX tutorial, ClashX configuration tutorial, ClashX for Mac
jasonppy/VoiceCraft
Zero-Shot Speech Editing and Text-to-Speech in the Wild
facebookresearch/ConvNeXt-V2
Code release for ConvNeXt V2 model
NVIDIA/BigVGAN
Official PyTorch implementation of BigVGAN (ICLR 2023)
maum-ai/univnet
Unofficial PyTorch Implementation of UnivNet Vocoder (https://arxiv.org/abs/2106.07889)
2noise/ChatTTS
A generative speech model for daily dialogue.
TeaPoly/Conformer-Athena
Dynamic Chunk Streaming and Offline Conformer based on athena-team/Athena.
wenet-e2e/wenet
Production First and Production Ready End-to-End Speech Recognition Toolkit
sony/bigvsan
PyTorch implementation of BigVSAN
keithito/tacotron
A TensorFlow implementation of Google's Tacotron speech synthesis with pre-trained model (unofficial)
facebookresearch/xformers
Hackable and optimized Transformers building blocks, supporting a composable construction.
facebookresearch/audiocraft
Audiocraft is a library for audio processing and generation with deep learning. It features the state-of-the-art EnCodec audio compressor / tokenizer, along with MusicGen, a simple and controllable music generation LM with textual and melodic conditioning.
csukuangfj/kaldifeat
Kaldi-compatible online & offline feature extraction with PyTorch, supporting CUDA, batch processing, chunk processing, and autograd - Provide C++ & Python API
k2-fsa/icefall
descriptinc/descript-audio-codec
State-of-the-art audio codec with 90x compression factor. Supports 44.1kHz, 24kHz, and 16kHz mono/stereo audio.
lifeiteng/vall-e
PyTorch implementation of VALL-E (Zero-Shot Text-To-Speech). Reproduced demo: https://lifeiteng.github.io/valle/index.html
X-LANCE/UniCATS-CTX-vec2wav
[AAAI 2024] Code for CTX-vec2wav in UniCATS
Azure-Samples/cognitive-services-speech-sdk
Sample code for the Microsoft Cognitive Services Speech SDK
lhotse-speech/lhotse
Tools for handling speech data in machine learning projects.
pseeth/argbind
Simple package for binding functions to CLI or config files.
sh-lee-prml/HierSpeechpp
The official implementation of HierSpeech++
huggingface/transformers
🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.
haoheliu/AudioLDM2
Text-to-Audio/Music Generation
facebookresearch/seamless_communication
Foundational Models for State-of-the-Art Speech and Text Translation
yl4579/StyleTTS2
StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models
coqui-ai/TTS
🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
facebookresearch/encodec
State-of-the-art deep learning based audio codec supporting both mono 24 kHz audio and stereo 48 kHz audio.