keonlee9420

Everything towards Conversational AI

KRAFTON Inc., Seoul, Republic of Korea

keonlee9420's Stars

openai/whisper
Robust Speech Recognition via Large-Scale Weak Supervision
Language:Python68.1k 571 08k
suno-ai/bark
🔊 Text-Prompted Generative Audio Model
Language:Jupyter Notebook35.5k 327 4354.2k
chenfei-wu/TaskMatrix
Language:Python34.5k 300 3523.3k
google-research/tuning_playbook
A playbook for systematically maximizing the performance of deep learning models.
26.6k 285 412.2k
huggingface/diffusers
🤗 Diffusers: State-of-the-art diffusion models for image and audio generation in PyTorch and FLAX.
Language:Python25.3k 195 4.1k5.2k
mlfoundations/open_clip
An open source implementation of CLIP.
Language:Python9.9k 77 470955
facebookresearch/ImageBind
ImageBind: One Embedding Space to Bind Them All
Language:Python8.2k 99 89758
lucidrains/vector-quantize-pytorch
Vector (and Scalar) Quantization, in Pytorch
Language:Python2.4k 30 119196
haoheliu/AudioLDM
AudioLDM: Generate speech, sound effects, music and beyond, with text.
Language:Python2.4k 42 105221
archinetai/audio-diffusion-pytorch
Audio generation using diffusion models, in PyTorch.
Language:Python1.9k 39 43167
microsoft/NeuralSpeech
Language:Python1.4k 33 124185
LAION-AI/CLAP
Contrastive Language-Audio Pretraining
Language:Python1.3k 28 88133
lucidrains/soundstorm-pytorch
Implementation of SoundStorm, Efficient Parallel Audio Generation from Google Deepmind, in Pytorch
Language:Python1.3k 50 2180
descriptinc/descript-audio-codec
State-of-the-art audio codec with 90x compression factor. Supports 44.1kHz, 24kHz, and 16kHz mono/stereo audio.
Language:Python1.1k 27 74103
NVIDIA/BigVGAN
Official PyTorch implementation of BigVGAN (ICLR 2023)
Language:Python848 71 096
LAION-AI/audio-dataset
Audio Dataset for training CLAP and other models
Language:Python617 21 5853
facebookresearch/AudioMAE
This repo hosts the code and models of "Masked Autoencoders that Listen".
Language:Python525 32 2844
microsoft/CLAP
Learning audio concepts from natural language supervision
Language:Python460 14 2135
arpitbansal297/Universal-Guided-Diffusion
Language:Jupyter Notebook443 6 1737
NVlabs/DiffiT
[ECCV 2024] Official Repository for DiffiT: Diffusion Vision Transformers for Image Generation
438 55 414
liusongxiang/Large-Audio-Models
Keep track of big models in audio domain, including speech, singing, music etc.
434 43 125
MasayaKawamura/MB-iSTFT-VITS
Lightweight and High-Fidelity End-to-End Text-to-Speech with Multi-Band Generation and Inverse Short-Time Fourier Transform
Language:Python416 17 2664
Rongjiehuang/GenerSpeech
PyTorch Implementation of GenerSpeech (NeurIPS'22): a text-to-speech model towards zero-shot style transfer of OOD custom voice.
Language:Python313 17 2845
NVIDIA/radtts
Provides training, inference, and voice conversion recipes for RADTTS and RADTTS++: flow-based TTS models with Robust Alignment Learning, Diverse Synthesis, and Generative Modeling and Fine-Grained Control over Low Dimensional (F0 and Energy) Speech Attributes.
Language:Roff281 15 3040
yangdongchao/SoundStorm
The reproduced code for Google's SoundStorm
Language:Python242 20 2718
chomeyama/SiFiGAN
Official implementation of the source-filter HiFiGAN vocoder
Language:Python233 9 1234
voidful/Codec-SUPERB
Audio Codec Speech processing Universal PERformance Benchmark
Language:Python203 12 1722
keonlee9420/DailyTalk
Official repository of DailyTalk: Spoken Dialogue Dataset for Conversational Text-to-Speech, ICASSP 2023
Language:Python196 7 313
tetrzim/diffusion-human-feedback
Censored Sampling of Diffusion Models Using 3 Minutes of Human Feedback
Language:Python25 3 03
krafton-ai/mini-batch-cl
Language:Python10 1 00
