DongChanS's Stars

myshell-ai/OpenVoice
Instant voice cloning by MIT and MyShell.
Language: Python · 30.2k 217 2543k
HumanSignal/label-studio
Label Studio is a multi-type data labeling and annotation tool with standardized output format
Language: JavaScript · 19.9k 178 2.3k2.5k
microsoft/nni
An open-source AutoML toolkit for automating the machine learning lifecycle, including feature engineering, neural architecture search, model compression, and hyper-parameter tuning.
Language: Python · 14.1k 284 2.1k1.8k
BradyFU/Awesome-Multimodal-Large-Language-Models
✨✨ Latest Advances on Multimodal Large Language Models
13.3k 257 128839
NVIDIA/NeMo
A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)
Language: Python · 12.5k 210 2.3k2.6k
aishwaryanr/awesome-generative-ai-guide
A one-stop repository for generative AI research updates, interview resources, notebooks, and much more!
9.9k 338 102.1k
espnet/espnet
End-to-End Speech Processing Toolkit
Language: Python · 8.6k 177 2.4k2.2k
Vaibhavs10/insanely-fast-whisper
Language: Jupyter Notebook · 7.9k 68 199553
pyannote/pyannote-audio
Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding
Language: Jupyter Notebook · 6.6k 73 1k799
huggingface/parler-tts
Inference and training library for high-quality TTS models.
Language: Python · 4.8k 54 124494
openai/transformer-debugger
Language: Python · 4k 25 14241
google/lyra
A Very Low-Bitrate Codec for Speech Compression
Language: C++ · 3.8k 113 127356
huggingface/distil-whisper
Distilled variant of Whisper for speech recognition. 6x faster, 50% smaller, within 1% word error rate.
Language: Python · 3.7k 66 104304
descriptinc/descript-audio-codec
State-of-the-art audio codec with 90x compression factor. Supports 44.1kHz, 24kHz, and 16kHz mono/stereo audio.
Language: Python · 1.2k 28 80117
juanmc2005/diart
A Python package to build AI-powered real-time audio applications
Language: Python · 1.1k 22 15290
Vaibhavs10/open-tts-tracker
1.1k 65 1669
lhotse-speech/lhotse
Tools for handling speech data in machine learning projects.
Language: Python · 966 43 428221
hollobit/GenAI_LLM_timeline
ChatGPT, GenerativeAI and LLMs Timeline
946 85 459
ZhangXInFD/SpeechTokenizer
This is the code for the SpeechTokenizer presented in the SpeechTokenizer: Unified Speech Tokenizer for Speech Language Models. Samples are presented on
Language: Python · 507 16 2245
huggingface/dataspeech
Language: Python · 319 13 1651
JeffC0628/awesome-voice-conversion
A curated list of awesome voice conversion, projects and communities.
212 13 213
Takaaki-Saeki/DiscreteSpeechMetrics
Reference-aware automatic speech evaluation toolkit
Language: Python · 135 6 210
Wataru-Nakata/miipher
Unofficial implementation of miipher
Language: Python · 114 4 816
sanyalsunny111/LLM-Inheritune
This is the official repository for Inheritune.
Language: Python · 105 4 29
DavidMChan/Anim400K
Anim-400K: A dataset designed from the ground up for automated dubbing of video
102 7 01
AILab-CVC/M2PT
[CVPR'24] Multimodal Pathway: Improve Transformers with Irrelevant Data from Other Modalities
Language: Python · 97 8 24
IDRnD/VoxTube
The VoxTube dataset official repository
Language: HTML · 62 5 41
huckiyang/awesome-neural-reprogramming-prompting
A curated list of awesome adversarial reprogramming and input prompting methods for neural networks since 2022
Language: Python · 36 5 00
tal-z/SoundsLike
A Python package for finding words that sound like other words. Useful for entity resolution and poetry, among other things.
Language: Python · 14 1 01
actionpower/google_cloud_storage
Deno library to upload files to GCS and obtain signed URLs
Language: TypeScript · 11 2 00