hwRG

Speech AI Engineer

@AITRICS, Seoul, Republic of Korea

hwRG's Stars

facebookresearch/audiocraft
Audiocraft is a library for audio processing and generation with deep learning. It features the state-of-the-art EnCodec audio compressor / tokenizer, along with MusicGen, a simple and controllable music generation LM with textual and melodic conditioning.
Language:Python20.6k 203 3722.1k
BradyFU/Awesome-Multimodal-Large-Language-Models
✨✨ Latest Advances on Multimodal Large Language Models
11.7k 269 108758
SYSTRAN/faster-whisper
Faster Whisper transcription with CTranslate2
Language:Python11.4k 121 681950
mlfoundations/open_clip
An open source implementation of CLIP.
Language:Python9.8k 78 466952
noisetorch/NoiseTorch
Real-time microphone noise suppression on Linux.
Language:Go9.2k 70 317229
pyannote/pyannote-audio
Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding
Language:Jupyter Notebook5.9k 71 988756
microsoft/promptbase
All things prompt engineering
Language:Python5.3k 59 13293
sanchit-gandhi/whisper-jax
JAX implementation of OpenAI's Whisper model for up to 70x speed-up on TPU.
Language:Jupyter Notebook4.4k 43 178369
MahmoudAshraf97/whisper-diarization
Automatic Speech Recognition with Speaker Diarization based on OpenAI Whisper
Language:Jupyter Notebook3.3k 46 164274
camenduru/text-generation-webui-colab
A colab gradio web UI for running Large Language Models
Language:Jupyter Notebook2.1k 32 35367
collabora/WhisperLive
A nearly-live implementation of OpenAI's Whisper.
Language:Python1.8k 28 171244
ming024/FastSpeech2
An implementation of Microsoft's "FastSpeech 2: Fast and High-Quality End-to-End Text to Speech"
Language:Python1.8k 28 212527
LAION-AI/CLAP
Contrastive Language-Audio Pretraining
Language:Python1.3k 28 86129
lucidrains/naturalspeech2-pytorch
Implementation of Natural Speech 2, Zero-shot Speech and Singing Synthesizer, in PyTorch
Language:Python1.3k 53 31100
cuda-mode/resource-stream
CUDA related news and material links
1.1k 37 267
gabrielmittag/NISQA
NISQA - Non-Intrusive Speech Quality and TTS Naturalness Assessment
Language:Python658 25 46117
Zasder3/train-CLIP
A PyTorch Lightning solution to training OpenAI's CLIP from scratch.
Language:Python654 16 3778
lucidrains/voicebox-pytorch
Implementation of Voicebox, new SOTA Text-to-speech network from MetaAI, in PyTorch
Language:Python593 49 2550
mistralai-sf24/hackathon
Language:Python449 11 237
rtzr/Awesome-Korean-Speech-Recognition
A list of Korean speech recognition (STT) APIs, with performance benchmarks for each.
323 6 116
PolyAI-LDN/pheme
Language:Python244 11 1822
Srijith-rkr/Whispering-LLaMA
EMNLP 23 - Integrating Whisper Encoder to LLaMA Decoder for Generative ASR Error Correction
Language:Jupyter Notebook222 5 1115
sony/bigvsan
PyTorch implementation of BigVSAN
Language:Python196 29 616
gmltmd789/UnitSpeech
An official implementation of "UnitSpeech: Speaker-adaptive Speech Synthesis with Untranscribed Data"
Language:Jupyter Notebook131 11 812
ryeoat3/gomin
GOMIN; Gaudio Open Mel-spectrogram Inversion Network
Language:Python109 6 06
Audio-WestlakeU/ATST-SED
This repo includes the official implementations of "Fine-tune the pretrained ATST model for sound event detection".
Language:Jupyter Notebook80 3 1611
luferrer/ConfidenceIntervals
Confidence interval computation for evaluation in machine learning using the bootstrapping approach
Language:Jupyter Notebook64 3 07
Wadaboa/titanet
Speaker identification/verification models for Machine Learning for Computer Vision class at UNIBO
Language:Jupyter Notebook56 2 715
PINTO0309/whisper-onnx-tensorrt
ONNX and TensorRT implementation of Whisper
Language:Python55 4 05
naver-ai/facetts
Language:Python45 2 66
