haloha123's Stars
suno-ai/bark
🔊 Text-Prompted Generative Audio Model
Arafat-Dipto/coderbyte-codes
austrian-code-wizard/SuperiEAR
CS229 Final Project
cychomatica/AudioPure
Defending against Adversarial Audio via Diffusion Model (ICLR 2023)
ga642381/Taiwanese-Whisper
Fine-tune Whisper model for Taiwanese speech recognition
kevinkevin556/a2b
Replace arXiv links with their corresponding bibliography entries in Markdown files / Notion databases
haloha123/faster-whisper
Faster Whisper transcription with CTranslate2
Team-Potion/transcribe-audio-files
Transcribe a Collection of Waveform Audio Files using whisper_timestamped
linto-ai/whisper-timestamped
Multilingual Automatic Speech Recognition with word-level timestamps and confidence
Megumin6626/LLRT_whisper
A not very efficient attempt to create a real-time openai/whisper (audio-to-text transcriber)
ASR-project/Multilingual-PR
Phoneme Recognition using pre-trained models Wav2vec2, HuBERT and WavLM. This project compares three self-supervised models, Wav2vec (2019, 2020), HuBERT (2021) and WavLM (2022), each pretrained on a corpus of English speech, and uses them in various ways to perform phoneme recognition for different languages with a network trained with the Connectionist Temporal Classification (CTC) algorithm.
WassimTenachi/PhySO
Physical Symbolic Optimization
iwangjian/Paper-Reading-ConvAI
📖 Paper reading list in conversational AI (constantly updating 🤗).
microsoft/LoRA
Code for loralib, an implementation of "LoRA: Low-Rank Adaptation of Large Language Models"
lucidrains/bit-diffusion
Implementation of Bit Diffusion, Hinton's group's attempt at discrete denoising diffusion, in PyTorch
jonatasgrosman/wav2vec2-sprint
gymeee0715/ACSSR
TencentGameMate/chinese_speech_pretrain
Chinese speech pretrained models
microsoft/unilm
Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities
RF5/transfusion-asr
Transcribing Speech with Multinomial Diffusion, training code and models.
judiebig/DR-DiffuSE
Revisiting Denoising Diffusion Probabilistic Models for Speech Enhancement: Condition Collapse, Efficiency and Refinement, Thirty-Seventh AAAI Conference on Artificial Intelligence (AAAI), 2023.
audiolabs/torch-pesq
PyTorch implementation of the Perceptual Evaluation of Speech Quality for wideband audio
archinetai/audio-diffusion-pytorch-trainer
Trainer for audio-diffusion-pytorch
hchen605/ast_inst_cls
ankile/Adversarial-Diffusion
Code for a paper exploring using diffusion models to defend neural networks against adversarial attacks
KevinMusgrave/pytorch-adapt
Domain adaptation made easy. Fully featured, modular, and customizable.