DongChanS's Stars
myshell-ai/OpenVoice
Instant voice cloning by MIT and MyShell.
HumanSignal/label-studio
Label Studio is a multi-type data labeling and annotation tool with a standardized output format.
microsoft/nni
An open-source AutoML toolkit for automating the machine learning lifecycle, including feature engineering, neural architecture search, model compression, and hyper-parameter tuning.
BradyFU/Awesome-Multimodal-Large-Language-Models
Latest Advances on Multimodal Large Language Models
NVIDIA/NeMo
A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)
aishwaryanr/awesome-generative-ai-guide
A one stop repository for generative AI research updates, interview resources, notebooks and much more!
espnet/espnet
End-to-End Speech Processing Toolkit
Vaibhavs10/insanely-fast-whisper
pyannote/pyannote-audio
Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding
huggingface/parler-tts
Inference and training library for high-quality TTS models.
openai/transformer-debugger
google/lyra
A Very Low-Bitrate Codec for Speech Compression
huggingface/distil-whisper
Distilled variant of Whisper for speech recognition. 6x faster, 50% smaller, within 1% word error rate.
descriptinc/descript-audio-codec
State-of-the-art audio codec with 90x compression factor. Supports 44.1kHz, 24kHz, and 16kHz mono/stereo audio.
juanmc2005/diart
A Python package to build AI-powered real-time audio applications
Vaibhavs10/open-tts-tracker
lhotse-speech/lhotse
Tools for handling speech data in machine learning projects.
hollobit/GenAI_LLM_timeline
ChatGPT, GenerativeAI and LLMs Timeline
ZhangXInFD/SpeechTokenizer
This is the code for the SpeechTokenizer presented in the SpeechTokenizer: Unified Speech Tokenizer for Speech Language Models. Samples are presented on
huggingface/dataspeech
JeffC0628/awesome-voice-conversion
A curated list of awesome voice conversion projects and communities.
Takaaki-Saeki/DiscreteSpeechMetrics
Reference-aware automatic speech evaluation toolkit
Wataru-Nakata/miipher
Unofficial implementation of miipher
sanyalsunny111/LLM-Inheritune
This is the official repository for Inheritune.
DavidMChan/Anim400K
Anim-400K: A dataset designed from the ground up for automated dubbing of video
AILab-CVC/M2PT
[CVPR'24] Multimodal Pathway: Improve Transformers with Irrelevant Data from Other Modalities
IDRnD/VoxTube
The VoxTube dataset official repository
huckiyang/awesome-neural-reprogramming-prompting
A curated list of awesome adversarial reprogramming and input prompting methods for neural networks since 2022
tal-z/SoundsLike
A Python package for finding words that sound like other words. Useful for entity resolution and poetry, among other things.
actionpower/google_cloud_storage
Deno library to upload files to GCS and obtain signed URLs