HIN0209

HIN0209's Stars

spotify/pedalboard
🎛 🔊 A Python library for audio.
Language: C++ · 5.3k 58 192262
facebookresearch/AugLy
A data augmentations library for audio, image, text, and video.
Language: Python · 5k 79 74301
huggingface/parler-tts
Inference and training library for high-quality TTS models.
Language: Python · 4.7k 55 117477
wenet-e2e/wenet
Production First and Production Ready End-to-End Speech Recognition Toolkit
Language: Python · 4.2k 89 1k1.1k
usefulsensors/moonshine
Fast and accurate automatic speech recognition (ASR) for edge devices
Language: Python · 2.3k 34 21103
Plachtaa/seed-vc
State-of-the-Art zero-shot voice conversion & singing voice conversion, with real-time support
Language: Python · 701 19 4081
mbird1258/Audio-Decomposition
Language: Python · 37116
Hypotheses-Paradise/Hypo2Trans
Single-blind supplementary materials for NeurIPS 2023 submission
Language: Python · 95 7 23
burchim/AVEC
[WACV 2023] Audio-Visual Efficient Conformer (AVEC) for Robust Speech Recognition
Language: Jupyter Notebook · 92 2 168
kirill-markin/repo-to-text
Convert a repository structure and its contents into a single text file, including the tree output and file contents in markdown code blocks. It may be useful for chatting with an LLM about your code.
Language: Python · 68 1 312
pde-rent/repo2txt
Dump an entire GitHub repository into a descriptive file to be used as LLM input or training data
Language: Python · 45 2 06
joannahong/AV-RelScore
Audio-Visual Corruption Modeling for our paper "Watch or Listen: Robust Audio-Visual Speech Recognition with Visual Corruption Modeling and Reliability Scoring" (CVPR 2023)
Language: Python · 30 2 11
kotoba-tech/kotoba-whisper
Language: Python · 25 1 14
mispchallenge/MISP-ICME-AVSR
Language: Python · 17 1 01
hongfeixue/StutteringSpeechChallenge
SLT 2024 Mandarin Stuttering Event Detection and Automatic Speech Recognition Challenge
Language: Python · 122
Hypotheses-Paradise/UADF
Language: Python · 110
GeorgeEfstathiadis/LLM-Diarize-ASR-Agnostic
Repository for "LLM-based speaker diarization correction: A generalizable approach" paper
Language: Jupyter Notebook · 10 1 00
impresso/llm-transcript-postcorrection
A repository for preliminary work on HTR/OCR/ASR post-correction based on GPT models.
Language: Jupyter Notebook · 10 6 201
greeeenmouth/LRDWWS
Language: Python · 7 1 00
itsuki8914/Wav2Vec2-JSUT
Speech recognition using Wav2Vec2 with JSUT
Language: Jupyter Notebook · 7 2 10
jiangjin1999/TAP_ASR
[ICASSP 2024] Cross Modal Training for ASR Error Correction With Contrastive Learning
Language: Python · 71
usc-sail/SynthAudio
Can Synthetic Audio From Generative Foundation Models Assist Audio Recognition and Speech Modeling?
Language: Python · 7 3 01
nlp-waseda/SlideAVSR
This is the repository of our paper "SlideAVSR: A Dataset of Paper Explanation Videos for Audio-Visual Speech Recognition"
Language: Python · 6 1 00
rithiksachdev/PostASR-Correction-SLT2024
Language: Python · 6 1 00
sungnyun/avsr-temporal-dynamics
(SLT 2024) Learning Video Temporal Dynamics with Cross-Modal Attention for Robust Audio-Visual Speech Recognition
Language: Python · 50
sythello/SpeakQL-DBATI-ICASSP2023
Repository of paper "Database-Aware ASR Error Correction for Speech-to-SQL Parsing", at ICASSP 2023
Language: Jupyter Notebook · 3 1 10
tatsunidas/RadiomicsJ
Java library to compute radiomics features.
Language: Java · 3 2 00
Devashree21/Data-Augmentation-using-AugLy
Language: Jupyter Notebook · 21
shreeyeets/ASR-Correction-Agent
An agent that uses a local search approach with phoneme corrections and word insertions to refine ASR transcriptions based on a cost function implemented using the OpenAI Whisper model.
Language: Python · 2
Santhosh642003/LLMs-for-ASR-Errror-Correction
Language: Python · 1 1 00