dariadiatlova's Stars
Textualize/rich
Rich is a Python library for rich text and beautiful formatting in the terminal.
netease-youdao/EmotiVoice
EmotiVoice 😊: a Multi-Voice and Prompt-Controlled TTS Engine
metavoiceio/metavoice-src
Foundational model for human-like, expressive TTS
Stability-AI/stable-audio-tools
Generative models for conditional audio generation
lucidrains/voicebox-pytorch
Implementation of Voicebox, a new SOTA text-to-speech network from Meta AI, in PyTorch
shivammehta25/Matcha-TTS
[ICASSP 2024] 🍵 Matcha-TTS: A fast TTS architecture with conditional flow matching
TaoRuijie/ECAPA-TDNN
Unofficial reimplementation of ECAPA-TDNN for speaker recognition (EER=0.86 on Vox1_O when trained only on Vox2)
facebookresearch/textlesslib
Library for Textless Spoken Language Processing
audeering/w2v2-how-to
How to use our public wav2vec2 dimensional emotion model
X-LANCE/VoiceFlow-TTS
[ICASSP 2024] This is the official code for "VoiceFlow: Efficient Text-to-Speech with Rectified Flow Matching"
dongzhuoyao/awesome-flow-matching
A summary of related work on flow matching and stochastic interpolants
fschmid56/EfficientAT
This repository aims at providing efficient CNNs for Audio Tagging. We provide AudioSet pre-trained models ready for downstream training and extraction of audio embeddings.
p0p4k/pflowtts_pytorch
Unofficial implementation of NVIDIA P-Flow TTS paper
jishengpeng/Languagecodec
Language-Codec: Reducing the Gaps Between Discrete Codec Representation and Speech Language Models
sony/bigvsan
PyTorch implementation of BigVSAN
keonlee9420/DailyTalk
Official repository of DailyTalk: Spoken Dialogue Dataset for Conversational Text-to-Speech, ICASSP 2023
corl-team/rebased
Official implementation of the paper "Linear Transformers with Learnable Kernel Functions are Better In-Context Models"
X-LANCE/UniCATS-CTX-vec2wav
[AAAI 2024] Code for CTX-vec2wav in UniCATS
nii-yamagishilab/ZMM-TTS
ZMM-TTS: Zero-shot Multilingual and Multispeaker Speech Synthesis Conditioned on Self-supervised Discrete Speech Representations
theodorblackbird/lina-speech
lina-speech: linear-attention-based text-to-speech
seastar105/pflow-encodec
Implementation of TTS model based on NVIDIA P-Flow TTS Paper
DanielLin94144/StyleTalk
Official release of StyleTalk dataset.
shang0712/HierTTS
ECNU-Cross-Innovation-Lab/ShiftSER
[ICASSP 2023] Mingling or Misalignment? Temporal Shift for Speech Emotion Recognition with Pre-trained Representations
deepvk/NISQA-s
Lallapallooza/fast-audiomentations
⚡ Blazing fast audio augmentation in Python, powered by GPU for high-efficiency processing in machine learning and audio analysis tasks.
nivibilla/efficient-vits-finetuning
Finetuning VITS Efficiently
EMOsuperb/EMO-SUPERB-submission
EMO-SUPERB submission
MSP-UTD/MSP-Podcast_Challenge
MSP-Podcast Challenge Baseline Code
deepvk/muse
🎵 muse: Music Separation