keonlee9420

Everything towards Conversational AI

KRAFTON Inc.Seoul, Republic of Korea

Pinned Repositories

ditto-tts.github.io
Official Demo Page for DiTTo-TTS: Efficient and Scalable Zero-Shot Text-to-Speech with Diffusion Transformer
Language:HTML281
Comprehensive-Transformer-TTS
A Non-Autoregressive Transformer based Text-to-Speech, supporting a family of SOTA transformers with supervised and unsupervised duration modelings. This project grows with the research community, aiming to achieve the ultimate TTS
Language:Python318 12 2041
DailyTalk
Official repository of DailyTalk: Spoken Dialogue Dataset for Conversational Text-to-Speech, ICASSP 2023
Language:Python196 7 313
DiffGAN-TTS
PyTorch Implementation of DiffGAN-TTS: High-Fidelity and Efficient Text-to-Speech with Denoising Diffusion GANs
Language:Python310 9 2744
DiffSinger
PyTorch implementation of DiffSinger: Singing Voice Synthesis via Shallow Diffusion Mechanism (focused on DiffSpeech)
Language:Python229 4 830
Expressive-FastSpeech2
PyTorch Implementation of Non-autoregressive Expressive (emotional, conversational) TTS based on FastSpeech2, supporting English, Korean, and your own languages.
Language:Python277 4 2047
Parallel-Tacotron2
PyTorch Implementation of Google's Parallel Tacotron 2: A Non-Autoregressive Neural TTS Model with Differentiable Duration Modeling
Language:Python188 13 1944
PortaSpeech
PyTorch Implementation of PortaSpeech: Portable and High-Quality Generative Text-to-Speech
Language:Python328 20 2936
STYLER
Official repository of STYLER: Style Factor Modeling with Rapidity and Robustness via Speech Decomposition for Expressive and Controllable Neural Text to Speech, INTERSPEECH 2021
Language:Python158 6 631
StyleSpeech
PyTorch Implementation of Meta-StyleSpeech : Multi-Speaker Adaptive Text-to-Speech Generation
Language:Python190 6 1723

keonlee9420's Repositories

keonlee9420/PortaSpeech
PyTorch Implementation of PortaSpeech: Portable and High-Quality Generative Text-to-Speech
Language:Python328 20 2936
keonlee9420/Comprehensive-Transformer-TTS
A Non-Autoregressive Transformer based Text-to-Speech, supporting a family of SOTA transformers with supervised and unsupervised duration modelings. This project grows with the research community, aiming to achieve the ultimate TTS
Language:Python318 12 2041
keonlee9420/DiffGAN-TTS
PyTorch Implementation of DiffGAN-TTS: High-Fidelity and Efficient Text-to-Speech with Denoising Diffusion GANs
Language:Python310 9 2744
keonlee9420/Expressive-FastSpeech2
PyTorch Implementation of Non-autoregressive Expressive (emotional, conversational) TTS based on FastSpeech2, supporting English, Korean, and your own languages.
Language:Python277 4 2047
keonlee9420/DiffSinger
PyTorch implementation of DiffSinger: Singing Voice Synthesis via Shallow Diffusion Mechanism (focused on DiffSpeech)
Language:Python229 4 830
keonlee9420/DailyTalk
Official repository of DailyTalk: Spoken Dialogue Dataset for Conversational Text-to-Speech, ICASSP 2023
Language:Python196 7 313
keonlee9420/StyleSpeech
PyTorch Implementation of Meta-StyleSpeech : Multi-Speaker Adaptive Text-to-Speech Generation
Language:Python190 6 1723
keonlee9420/Parallel-Tacotron2
PyTorch Implementation of Google's Parallel Tacotron 2: A Non-Autoregressive Neural TTS Model with Differentiable Duration Modeling
Language:Python188 13 1944
keonlee9420/Cross-Speaker-Emotion-Transfer
PyTorch Implementation of ByteDance's Cross-speaker Emotion Transfer Based on Speaker Condition Layer Normalization and Semi-Supervised Training in Text-To-Speech
Language:Python181 6 1726
keonlee9420/STYLER
Official repository of STYLER: Style Factor Modeling with Rapidity and Robustness via Speech Decomposition for Expressive and Controllable Neural Text to Speech, INTERSPEECH 2021
Language:Python158 6 631
keonlee9420/Comprehensive-E2E-TTS
A Non-Autoregressive End-to-End Text-to-Speech (text-to-wav), supporting a family of SOTA unsupervised duration modelings. This project grows with the research community, aiming to achieve the ultimate E2E-TTS
Language:Python143 11 519
keonlee9420/Soft-DTW-Loss
PyTorch implementation of Soft-DTW: a Differentiable Loss Function for Time-Series in CUDA
Language:Python121 3 39
keonlee9420/VAENAR-TTS
PyTorch Implementation of VAENAR-TTS: Variational Auto-Encoder based Non-AutoRegressive Text-to-Speech Synthesis.
Language:Python72 4 314
keonlee9420/FastPitchFormant
PyTorch Implementation of NCSOFT's FastPitchFormant: Source-filter based Decomposed Modeling for Speech Synthesis
Language:Python71 2 314
keonlee9420/WaveGrad2
PyTorch Implementation of Google Brain's WaveGrad 2: Iterative Refinement for Text-to-Speech Synthesis
Language:Python66 6 516
keonlee9420/Daft-Exprt
PyTorch Implementation of Daft-Exprt: Robust Prosody Transfer Across Speakers for Expressive Speech Synthesis
Language:Python56 3 713
keonlee9420/evaluate-zero-shot-tts
Evaluation Protocol for Large-Scale Zero-Shot TTS Literature
Language:Python47 6 06
keonlee9420/Comprehensive-Tacotron2
PyTorch Implementation of Google's Natural TTS Synthesis by Conditioning WaveNet on Mel Spectrogram Predictions. This implementation supports both single-, multi-speaker TTS and several techniques to enforce the robustness and efficiency of the model.
Language:Python45 4 1015
keonlee9420/Robust_Fine_Grained_Prosody_Control
PyTorch Implementation of Robust and fine-grained prosody control of end-to-end speech synthesis
Language:Python41 3 411
keonlee9420/Stepwise_Monotonic_Multihead_Attention
PyTorch Implementation of Stepwise Monotonic Multihead Attention similar to Enhancing Monotonicity for Robust Autoregressive Transformer TTS
Language:Python31 2 26
keonlee9420/Deep-Learning-TTS-Template
This is a template for the Non-autoregressive Deep Learning-Based TTS model (in PyTorch).
Language:Python15 1 0
keonlee9420/FullSubNet
PyTorch implementation of "FullSubNet: A Full-Band and Sub-Band Fusion Model for Real-Time Single-Channel Speech Enhancement."
Language:Python5 0 0
keonlee9420/tacotron2_MMI
Another PyTorch implementation of Tacotron2 MMI (with waveglow) which supports n_frames_per_step>1 mode(reduction windows) and diagonal guided attention for robust alignments.
Language:Jupyter Notebook5 1 1
keonlee9420/FastSpeech2
An implementation of Microsoft's "FastSpeech 2: Fast and High-Quality End-to-End Text to Speech"
Language:Python4 0 03
keonlee9420/audio-flamingo
PyTorch implementation of Audio Flamingo: A Novel Audio Language Model with Few-Shot Learning and Dialogue Abilities.
Language:Python2
keonlee9420/cs231n
cs231n 2020 Spring assignments implementation
Language:Jupyter Notebook2 1 0
keonlee9420/pintos
KAIST CS330 OS pintos Project
Language:HTML1 1 00
keonlee9420/cs224n
cs224n 2020 Winter assignments implementation
Language:JavaScript1 0
keonlee9420/speech-recognition-app
for mom's speaking practice
Language:JavaScript0 0
keonlee9420/STYLER-Demo
Language:HTML1 0