Moonmore's Stars
Hannibal046/Awesome-LLM
Awesome-LLM: a curated list of Large Language Models
naklecha/llama3-from-scratch
llama3 implementation one matrix multiplication at a time
SWivid/F5-TTS
Official code for "F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching"
huggingface/parler-tts
Inference and training library for high-quality TTS models.
FunAudioLLM/SenseVoice
Multilingual Voice Understanding Model
libAudioFlux/audioFlux
A library for audio and music analysis and feature extraction.
haoheliu/AudioLDM
AudioLDM: Generate speech, sound effects, music and beyond, with text.
Vaibhavs10/open-tts-tracker
lenML/Speech-AI-Forge
🍦 Speech-AI-Forge is a project developed around TTS generation model, implementing an API Server and a Gradio-based WebUI.
jishengpeng/WavTokenizer
SOTA discrete acoustic codec models with 40 tokens per second for audio language modeling
gemelo-ai/vocos
Vocos: Closing the gap between time-domain and Fourier-based neural vocoders for high-quality audio synthesis
ga642381/speech-trident
Awesome speech/audio LLMs, representation learning, and codec models
sihyun-yu/REPA
Official PyTorch Implementation of Representation Alignment for Generation: Training Diffusion Transformers Is Easier Than You Think
google/visqol
Perceptual Quality Estimator for speech and audio
ddlBoJack/emotion2vec
[ACL 2024] Official PyTorch code for extracting features and training downstream models with emotion2vec: Self-Supervised Pre-Training for Speech Emotion Representation
csteinmetz1/pyloudnorm
Flexible audio loudness meter in Python with implementation of ITU-R BS.1770-4 loudness algorithm
AI-Hobbyist/Genshin_Datasets
Genshin Datasets For SVC/SVS/TTS
yangdongchao/AcademiCodec
AcademiCodec: An Open Source Audio Codec Model for Academic Research
facebookresearch/textlesslib
Library for Textless Spoken Language Processing
lucidrains/e2-tts-pytorch
Implementation of E2-TTS, "Embarrassingly Easy Fully Non-Autoregressive Zero-Shot TTS", in PyTorch
huggingface/dataspeech
b04901014/MQTTS
innnky/ar-vits
text to speech using autoregressive transformer and VITS
TongTong313/rectified-flow
Flow Matching (Rectified Flow) hand-built from scratch
maum-ai/phaseaug
ICASSP 2023 Accepted
scutcsq/Neural-Transducers-for-Two-Stage-Text-to-Speech-via-Semantic-Token-Prediction
Unofficial PyTorch reproduction of the paper "Utilizing Neural Transducers for Two-Stage Text-to-Speech via Semantic Token Prediction" (arXiv:2401.01498)
asappresearch/simple-tts
Contains the code associated with the ICLR submission for our text-to-speech diffusion model
nii-yamagishilab/PartialSpoof
omine-me/LaughterSegmentation
Latest laughter detection & segmentation model. Paper: "Robust Laughter Segmentation with Automatic Diverse Data Synthesis", Interspeech 2024
YangAi520/APCodec