5Hyeons

5Hyeons's Stars

yl4579/StyleTTS-ZS
StyleTTS-ZS: Efficient High-Quality Zero-Shot Text-to-Speech Synthesis with Distilled Time-Varying Style Diffusion
962
openai/whisper
Robust Speech Recognition via Large-Scale Weak Supervision
Language: Python · 67.8k · 8k
sh-lee-prml/PeriodWave
The official Implementation of PeriodWave and PeriodWave-Turbo
1117
thuhcsi/mm2022-conversational-tts
Language:Python9
FunAudioLLM/CosyVoice
Multi-lingual large voice generation model, providing inference, training and deployment full-stack ability.
Language: Python · 4.9k · 498
walker-hyf/ECSS
Emotion Rendering for Conversational Speech Synthesis with Heterogeneous Graph-Based Context Modeling (Accepted by AAAI'2024)
Language:Python484
Camb-ai/MARS5-TTS
MARS5 speech model (TTS) from CAMB.AI
Language: Jupyter Notebook · 2.5k · 199
yl4579/StyleTTS2
StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models
Language: Python · 4.7k · 386
hongleizhang/RSPapers
A Curated List of Must-read Papers on Recommender System.
6.1k · 1.3k
p0p4k/vits2_pytorch
unofficial vits2-TTS implementation in pytorch
Language:Python47285
myshell-ai/MeloTTS
High-quality multi-lingual text-to-speech library by MyShell.ai. Support English, Spanish, French, Chinese, Japanese and Korean.
Language: Python · 4.4k · 554
daniilrobnikov/vits2
VITS2: Improving Quality and Efficiency of Single-Stage Text-to-Speech with Adversarial Learning and Architecture Design
Language:Jupyter Notebook46348
shivammehta25/Matcha-TTS
[ICASSP 2024] 🍵 Matcha-TTS: A fast TTS architecture with conditional flow matching
Language:Jupyter Notebook62378
kwonminki/One-sentence_Diffusion_summary
The repo for studying and sharing diffusion models.
38735
microsoft/SpeechT5
Unified-Modal Speech-Text Pre-Training for Spoken Language Processing
Language: Python · 1.2k · 114
yistLin/dvector
Speaker embedding (d-vector) trained with GE2E loss
Language:Python27246
jaywalnut310/vits
VITS: Conditional Variational Autoencoder with Adversarial Learning for End-to-End Text-to-Speech
Language: Python · 6.7k · 1.2k
sony/ai-research-code
Language:Python34765
WegraLee/deep-learning-from-scratch-3
"Deep Learning from Scratch 3" (Hanbit Media, 2020)
Language:Python15593
microsoft/DNS-Challenge
This repo contains the scripts, models, and required files for the Deep Noise Suppression (DNS) Challenge.
Language: Python · 1.1k · 409
OlaWod/FreeVC
FreeVC: Towards High-Quality Text-Free One-Shot Voice Conversion
Language:Python590109
kkoutini/PaSST
Efficient Training of Audio Transformers with Patchout
Language:Python29450
fschmid56/EfficientAT
This repository aims at providing efficient CNNs for Audio Tagging. We provide AudioSet pre-trained models ready for downstream training and extraction of audio embeddings.
Language:Python21941
dhchoi99/NANSY
Language:Python16120
lRomul/argus-freesound
Kaggle | 1st place solution for Freesound Audio Tagging 2019
Language:Python31355
hash2430/pitchtron
TTS for pitch-accented language. Korean dialect DB.
Language:Python15631
KinglittleQ/GST-Tacotron
A PyTorch implementation of Style Tokens: Unsupervised Style Modeling, Control and Transfer in End-to-End Speech Synthesis
Language:Python35772
Kyubyong/g2pK
g2pK: g2p module for Korean
Language:Python23343
boostcampaitech2/object-detection-level2-cv-17
object-detection-level2-cv-17 created by GitHub Classroom
Language:Python25