windforestfiremountain's Stars
2noise/ChatTTS
A generative speech model for daily dialogue.
speechbrain/speechbrain
A PyTorch-based Speech Toolkit
FunAudioLLM/CosyVoice
Multilingual large voice generation model with full-stack support for inference, training, and deployment.
gpt-omni/mini-omni
An open-source multimodal large language model that can hear and talk while thinking, featuring real-time end-to-end speech input and streaming audio output for conversation.
r9y9/wavenet_vocoder
WaveNet vocoder
LCAV/pyroomacoustics
Pyroomacoustics is a package for audio signal processing for indoor applications. It was developed as a fast prototyping platform for beamforming algorithms in indoor scenarios.
BytedanceSpeech/seed-tts-eval
yeyupiaoling/VoiceprintRecognition-Pytorch
This project implements a variety of advanced voiceprint recognition models, including EcapaTdnn, ResNetSE, ERes2Net, and CAM++, and more models may be supported in the future. It also supports the MelSpectrogram and Spectrogram data preprocessing methods.
facebookresearch/WavAugment
A library for speech data augmentation in the time domain
OlaWod/FreeVC
FreeVC: Towards High-Quality Text-Free One-Shot Voice Conversion
HarryVolek/PyTorch_Speaker_Verification
PyTorch implementation of "Generalized End-to-End Loss for Speaker Verification" by Wan, Li et al.
schmiph2/pysepm
Python implementation of performance metrics in Loizou's Speech Enhancement book
LSimon95/megatts2
Unofficial implementation of Megatts2
tarepan/SpeechMOS
Easy-to-Use Speech MOS predictors
yistLin/FragmentVC
Any-to-any voice conversion by end-to-end extracting and fusing fine-grained voice fragments with attention
sarulab-speech/UTMOSv2
UTokyo-SaruLab MOS Prediction System
unilight/s3prl-vc
S3PRL-VC: A Voice Conversion Toolkit based on S3PRL
audiolabs/rir-generator
yistLin/universal-vocoder
A PyTorch implementation of the universal neural vocoder
XiangLi2022/CM-TTS
[Findings of NAACL 2024] Source code of paper CM-TTS: Enhancing Real Time Text-to-Speech Synthesis Efficiency through Weighted Samplers and Consistency Models
YuanGongND/python-compute-eer
Simple Python script to compute equal error rate (EER) for machine learning model evaluation.
OlaWod/PitchVC
PitchVC: Pitch Conditioned Any-to-Many Voice Conversion
theshi-1128/llm-defense
An easy-to-use Python framework to defend against jailbreak prompts.
SandyPanda-MLDL/-Evaluation-Metrics-Used-For-The-Performance-Evaluation-of-Voice-Conversion-VC-Models
Evaluation Metrics Used For The Performance Evaluation of Voice Conversion (VC) Models
Edresson/Speech2Phone
Speech2Phone: A Multilingual and Text Independent Speaker Identification Model
daved01/Adversarial_Examples
Review and analysis of selected adversarial attacks. We implement common attack methods and evaluate them with a GoogleNet network on ImageNet-like data.
gino0950150/RW_VoiceShield
MaxMax2016/DeepSpeaker_RawNet_GE2E
Experimental comparison of three end-to-end voiceprint model frameworks (Deep Speaker, RawNet, GE2E) on three standard public datasets: VCTK, AISHELL1, and VoxCeleb1.
VoicePrivacy/Adeversarial-Speech-with-YourTTS
ztMotaLee/previous_homepage
Baiang Li's homepage.