haloha123

haloha123's Stars

suno-ai/bark
🔊 Text-Prompted Generative Audio Model
Language: Jupyter Notebook · 36.3k stars · 4.3k forks
Arafat-Dipto/coderbyte-codes
Language:Python21
austrian-code-wizard/SuperiEAR
CS229 Final Project
Language: Jupyter Notebook · 1 star
cychomatica/AudioPure
Defending against Adversarial Audio via Diffusion Model (ICLR 2023)
Language:Python261
ga642381/Taiwanese-Whisper
Fine-tune the Whisper model for Taiwanese speech recognition
Language:Python278
kevinkevin556/a2b
Replace arXiv links with their corresponding bibliography entries in Markdown files / Notion databases
Language:Python222
haloha123/faster-whisper
Faster Whisper transcription with CTranslate2
1
Team-Potion/transcribe-audio-files
Transcribe a Collection of Waveform Audio Files using whisper_timestamped
Language: Python · 1 star
linto-ai/whisper-timestamped
Multilingual Automatic Speech Recognition with word-level timestamps and confidence
Language: Python · 2.1k stars · 160 forks
Megumin6626/LLRT_whisper
A not very efficient attempt to create a real-time openai/whisper (audio-to-text transcriber)
Language:Python31
ASR-project/Multilingual-PR
Phoneme recognition using the pre-trained models Wav2vec2, HuBERT and WavLM. This project compares three self-supervised models, Wav2vec (2019, 2020), HuBERT (2021) and WavLM (2022), each pretrained on a corpus of English speech, and uses them in various ways to perform phoneme recognition for different languages with a network trained with the Connectionist Temporal Classification (CTC) algorithm.
Language:Python21018
WassimTenachi/PhySO
Physical Symbolic Optimization
Language: Python · 1.8k stars · 253 forks
iwangjian/Paper-Reading-ConvAI
📖 Paper reading list in conversational AI (constantly updating 🤗).
988164
microsoft/LoRA
Code for loralib, an implementation of "LoRA: Low-Rank Adaptation of Large Language Models"
Language: Python · 10.9k stars · 691 forks
lucidrains/bit-diffusion
Implementation of Bit Diffusion, Hinton's group's attempt at discrete denoising diffusion, in Pytorch
Language:Python33517
jonatasgrosman/wav2vec2-sprint
Language:Jupyter Notebook17732
gymeee0715/ACSSR
Language:Python11
TencentGameMate/chinese_speech_pretrain
Chinese speech pretrained models
Language: Shell · 1k stars · 87 forks
microsoft/unilm
Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities
Language: Python · 20.3k stars · 2.6k forks
RF5/transfusion-asr
Transcribing Speech with Multinomial Diffusion, training code and models.
Language:Python765
judiebig/DR-DiffuSE
Revisiting Denoising Diffusion Probabilistic Models for Speech Enhancement: Condition Collapse, Efficiency and Refinement, Thirty-Seventh AAAI Conference on Artificial Intelligence (AAAI), 2023.
Language:Python353
audiolabs/torch-pesq
PyTorch implementation of the Perceptual Evaluation of Speech Quality for wideband audio
Language:Python15215
archinetai/audio-diffusion-pytorch-trainer
Trainer for audio-diffusion-pytorch
Language:Python12822
hchen605/ast_inst_cls
Language: Python · 4 stars
ankile/Adversarial-Diffusion
Code for a paper exploring using diffusion models to defend neural networks against adversarial attacks
Language:Jupyter Notebook81
KevinMusgrave/pytorch-adapt
Domain adaptation made easy. Fully featured, modular, and customizable.
Language:Python36015