HIN0209's Stars
spotify/pedalboard
🎛 🔊 A Python library for audio.
facebookresearch/AugLy
A data augmentations library for audio, image, text, and video.
huggingface/parler-tts
Inference and training library for high-quality TTS models.
wenet-e2e/wenet
Production First and Production Ready End-to-End Speech Recognition Toolkit
usefulsensors/moonshine
Fast and accurate automatic speech recognition (ASR) for edge devices
Plachtaa/seed-vc
State-of-the-Art zero-shot voice conversion & singing voice conversion, with real-time support
mbird1258/Audio-Decomposition
Hypotheses-Paradise/Hypo2Trans
Single-blind supplementary materials for NeurIPS 2023 submission
burchim/AVEC
[WACV 2023] Audio-Visual Efficient Conformer (AVEC) for Robust Speech Recognition
kirill-markin/repo-to-text
Convert a repository structure and its contents into a single text file, including the tree output and file contents in markdown code blocks. This can be useful for chatting with an LLM about your code.
pde-rent/repo2txt
Dump an entire GitHub repository into a single descriptive file to be used as LLM input or training data
joannahong/AV-RelScore
Audio-Visual Corruption Modeling from our CVPR 2023 paper "Watch or Listen: Robust Audio-Visual Speech Recognition with Visual Corruption Modeling and Reliability Scoring"
kotoba-tech/kotoba-whisper
mispchallenge/MISP-ICME-AVSR
hongfeixue/StutteringSpeechChallenge
SLT 2024 Mandarin Stuttering Event Detection and Automatic Speech Recognition Challenge
Hypotheses-Paradise/UADF
GeorgeEfstathiadis/LLM-Diarize-ASR-Agnostic
Repository for "LLM-based speaker diarization correction: A generalizable approach" paper
impresso/llm-transcript-postcorrection
A repository for preliminary work on HTR/OCR/ASR post-correction based on GPT models.
greeeenmouth/LRDWWS
itsuki8914/Wav2Vec2-JSUT
Speech recognition using Wav2Vec2 with JSUT
jiangjin1999/TAP_ASR
[ICASSP 2024] Cross Modal Training for ASR Error Correction With Contrastive Learning
usc-sail/SynthAudio
Can Synthetic Audio From Generative Foundation Models Assist Audio Recognition and Speech Modeling?
nlp-waseda/SlideAVSR
This is the repository of our paper "SlideAVSR: A Dataset of Paper Explanation Videos for Audio-Visual Speech Recognition"
rithiksachdev/PostASR-Correction-SLT2024
sungnyun/avsr-temporal-dynamics
(SLT 2024) Learning Video Temporal Dynamics with Cross-Modal Attention for Robust Audio-Visual Speech Recognition
sythello/SpeakQL-DBATI-ICASSP2023
Repository of paper "Database-Aware ASR Error Correction for Speech-to-SQL Parsing", at ICASSP 2023
tatsunidas/RadiomicsJ
Java library to compute radiomics features.
Devashree21/Data-Augmentation-using-AugLy
shreeyeets/ASR-Correction-Agent
An agent that uses a local search approach with phoneme corrections and word insertions to refine ASR transcriptions, based on a cost function implemented using the OpenAI Whisper model.
Santhosh642003/LLMs-for-ASR-Errror-Correction