hwRG's Stars
openai/whisper
Robust Speech Recognition via Large-Scale Weak Supervision
meta-llama/llama
Inference code for Llama models
Stability-AI/stablediffusion
High-Resolution Image Synthesis with Latent Diffusion Models
suno-ai/bark
🔊 Text-Prompted Generative Audio Model
ggerganov/whisper.cpp
Port of OpenAI's Whisper model in C/C++
openai/CLIP
CLIP (Contrastive Language-Image Pretraining): predict the most relevant text snippet given an image
mlfoundations/open_clip
An open source implementation of CLIP.
triton-inference-server/server
The Triton Inference Server provides an optimized cloud and edge inferencing solution.
lixin4ever/Conference-Acceptance-Rate
Acceptance rates for the major AI conferences
enhuiz/vall-e
An unofficial PyTorch implementation of the audio LM VALL-E
davabase/whisper_real_time
Real time transcription with OpenAI Whisper.
wiseman/py-webrtcvad
Python interface to the WebRTC Voice Activity Detector
lifeiteng/vall-e
PyTorch implementation of VALL-E (Zero-Shot Text-To-Speech). Reproduced demo: https://lifeiteng.github.io/valle/index.html
tsurumeso/vocal-remover
Vocal Remover using Deep Neural Networks
Edresson/YourTTS
YourTTS: Towards Zero-Shot Multi-Speaker TTS and Zero-Shot Voice Conversion for everyone
NVIDIA/BigVGAN
Official PyTorch implementation of BigVGAN (ICLR 2023)
maum-ai/nuwave
NU-Wave: A Diffusion Probabilistic Model for Neural Audio Upsampling @ INTERSPEECH 2021
andersonba/yve-bot
Smart rule-based bot. For Browser & Node.
junhsss/consistency-models
A Toolkit for OpenAI's Consistency Models.
hayeong0/DDDM-VC
Official PyTorch Implementation for "DDDM-VC: Decoupled Denoising Diffusion Models with Disentangled Representation and Prior Mixup for Verified Robust Voice Conversion" (AAAI 2024)
keonlee9420/StyleSpeech
PyTorch Implementation of Meta-StyleSpeech: Multi-Speaker Adaptive Text-to-Speech Generation
keonlee9420/Cross-Speaker-Emotion-Transfer
PyTorch Implementation of ByteDance's Cross-speaker Emotion Transfer Based on Speaker Condition Layer Normalization and Semi-Supervised Training in Text-To-Speech
dhchoi99/NANSY
keonlee9420/STYLER
Official repository of STYLER: Style Factor Modeling with Rapidity and Robustness via Speech Decomposition for Expressive and Controllable Neural Text to Speech, INTERSPEECH 2021
ncsoft/avocodo
Official implementation of "Avocodo: Generative Adversarial Network for Artifact-Free Vocoder" (AAAI 2023)
haoheliu/ssr_eval
Evaluation and Benchmarking of Speech Super-resolution Methods
neonbjb/tts-scores
Scripts for computing the Intelligibility and CLVP scores for evaluating TTS models
SMART-TTS/SMART-G2P
richardbaihe/a3t
Code for paper A3T: Alignment-Aware Acoustic and Text Pretraining for Speech Synthesis and Editing
scarletcho/KoLM
Korean text normalization and language preparation package for LM in Kaldi-based ASR system