aaaqeczyh's Stars
2noise/ChatTTS
A generative speech model for daily dialogue.
microsoft/LoRA
Code for loralib, an implementation of "LoRA: Low-Rank Adaptation of Large Language Models"
netease-youdao/EmotiVoice
EmotiVoice 😊: a Multi-Voice and Prompt-Controlled TTS Engine
andrewyng/translation-agent
snakers4/silero-vad
Silero VAD: pre-trained enterprise-grade Voice Activity Detector
megvii-research/NAFNet
The state-of-the-art image restoration model without nonlinear activation functions.
jik876/hifi-gan
HiFi-GAN: Generative Adversarial Networks for Efficient and High Fidelity Speech Synthesis
swz30/Restormer
[CVPR 2022--Oral] Restormer: Efficient Transformer for High-Resolution Image Restoration. SOTA for motion deblurring, image deraining, denoising (Gaussian/real data), and defocus deblurring.
LAION-AI/CLAP
Contrastive Language-Audio Pretraining
haoheliu/voicefixer
General Speech Restoration
TencentGameMate/chinese_speech_pretrain
Chinese speech pretrained models
descriptinc/melgan-neurips
GAN-based Mel-Spectrogram Inversion Network for Text-to-Speech Synthesis
NVIDIA/BigVGAN
Official PyTorch implementation of BigVGAN (ICLR 2023)
gemelo-ai/vocos
Vocos: Closing the gap between time-domain and Fourier-based neural vocoders for high-quality audio synthesis
lmnt-com/diffwave
DiffWave is a fast, high-quality neural vocoder and waveform synthesizer.
pltrdy/rouge
A full Python Implementation of the ROUGE Metric (not a wrapper)
seungwonpark/melgan
MelGAN vocoder (compatible with NVIDIA/tacotron2)
jitsi/jiwer
Evaluate your speech-to-text system with similarity measures such as word error rate (WER)
TaoRuijie/ECAPA-TDNN
Unofficial reimplementation of ECAPA-TDNN for speaker recognition (EER=0.86 on Vox1_O when trained only on Vox2)
OlaWod/FreeVC
FreeVC: Towards High-Quality Text-Free One-Shot Voice Conversion
sp-uhh/sgmse
Score-based Generative Models (Diffusion Models) for Speech Enhancement and Dereverberation
haoheliu/voicefixer_main
General Speech Restoration
google-research-datasets/cvss
CVSS: A Massively Multilingual Speech-to-Speech Translation Corpus
AndreevP/wvmos
MOS score prediction by fine-tuned wav2vec2.0 model
YouTaoBaBa/Chinese-Dialogue-Dataset
A collection of currently available open-source Chinese dialogue datasets
epfl-dlab/llm-latent-language
Repo accompanying our paper "Do Llamas Work in English? On the Latent Language of Multilingual Transformers".
fpaissan/tinyCLAP
Implementation of tinyCLAP.
sunzewei2715/Doc2Doc_NMT
The repository for the paper: Rethinking Document-level Neural Machine Translation
google/df-conformer
Audio samples accompanying publications related to DF-Conformer, a speech enhancement model.
PKU-ONELab/Themis
The official repository for our NLG evaluation LLM Themis and the paper Themis: Towards Flexible and Interpretable NLG Evaluation.