SoonbeomChoi's Stars
sammccord/solid-pixi
Create PIXI applications with JSX and Signals
polm/cutlet
Japanese to romaji converter in Python
taishi-i/awesome-japanese-nlp-resources
A curated list of resources dedicated to Python libraries, LLMs, dictionaries, and corpora of NLP for Japanese
keonlee9420/Comprehensive-E2E-TTS
A Non-Autoregressive End-to-End Text-to-Speech (text-to-wav), supporting a family of SOTA unsupervised duration modelings. This project grows with the research community, aiming to achieve the ultimate E2E-TTS
heatz123/naturalspeech
A fully working PyTorch implementation of NaturalSpeech (Tan et al., 2022)
lucidrains/naturalspeech2-pytorch
Implementation of Natural Speech 2, Zero-shot Speech and Singing Synthesizer, in PyTorch
jaywalnut310/vits
VITS: Conditional Variational Autoencoder with Adversarial Learning for End-to-End Text-to-Speech
aparrish/pronouncingjs
A simple JavaScript interface to the CMU Pronouncing Dictionary (for Node and the browser!)
tauri-apps/tauri
Build smaller, faster, and more secure desktop applications with a web frontend.
facebookresearch/audiocraft
Audiocraft is a library for audio processing and generation with deep learning. It features the state-of-the-art EnCodec audio compressor / tokenizer, along with MusicGen, a simple and controllable music generation LM with textual and melodic conditioning.
BradyFU/Awesome-Multimodal-Large-Language-Models
Latest Advances on Multimodal Large Language Models
xinjli/allosaurus
Allosaurus is a pretrained universal phone recognizer for more than 2000 languages
LuChengTHU/dpm-solver
Official code for "DPM-Solver: A Fast ODE Solver for Diffusion Probabilistic Model Sampling in Around 10 Steps" (NeurIPS 2022 Oral)
YatingMusic/ddsp-singing-vocoders
Official implementation of SawSing (ISMIR'22)
acids-ircam/creative_ml
Creative Machine Learning course and notebook tutorials in JAX, PyTorch and NumPy
openai/whisper
Robust Speech Recognition via Large-Scale Weak Supervision
superpoweredSDK/web-audio-javascript-webassembly-SDK-interactive-audio
🌐 Superpowered Web Audio JavaScript and WebAssembly SDK for modern web browsers. Allows developers to implement low-latency interactive audio features into web sites and web apps with a friendly Javascript API. https://superpowered.com
archinetai/audio-diffusion-pytorch
Audio generation using diffusion models, in PyTorch.
cainesap/syllabify
Automatically convert plain text into phonemes (US English pronunciation) and syllabify
repp/big-phoney
Get phonetic spellings and syllable counts for any English word. Works with made-up and non-dictionary words
bentoml/BentoML
The easiest way to serve AI apps and models - Build reliable Inference APIs, LLM apps, Multi-model chains, RAG service, and much more!
teticio/audio-diffusion
Apply diffusion models using the new Hugging Face diffusers package to synthesize music instead of images.
keonlee9420/DiffGAN-TTS
PyTorch Implementation of DiffGAN-TTS: High-Fidelity and Efficient Text-to-Speech with Denoising Diffusion GANs
csteinmetz1/auraloss
Collection of audio-focused loss functions in PyTorch
chomeyama/SiFiGAN
Official implementation of the source-filter HiFiGAN vocoder
chq1155/A-Survey-on-Generative-Diffusion-Model
revsic/torch-nansypp
NANSY++: Unified Voice Synthesis with Neural Analysis and Synthesis
rishikksh20/HiFiplusplus-pytorch
HiFi++: a Unified Framework for Neural Vocoding, Bandwidth Extension and Speech Enhancement
SoonbeomChoi/BEGANSing
Korean Singing Voice Synthesis based on Auto-regressive Boundary Equilibrium GAN
lawrencecchen/solid-konva