speech-processing
There are 686 repositories under the speech-processing topic.
speechbrain/speechbrain
A PyTorch-based Speech Toolkit
pyannote/pyannote-audio
Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding
snakers4/silero-vad
Silero VAD: pre-trained enterprise-grade Voice Activity Detector
pliang279/awesome-multimodal-ml
Reading list for research topics in multimodal machine learning
microsoft/torchscale
Foundation Architecture for (M)LLMs
linto-ai/whisper-timestamped
Multilingual Automatic Speech Recognition with word-level timestamps and confidence
r9y9/wavenet_vocoder
WaveNet vocoder
r9y9/deepvoice3_pytorch
PyTorch implementation of convolutional neural network-based text-to-speech synthesis models
resemble-ai/resemble-enhance
AI powered speech denoising and enhancement
wq2012/awesome-diarization
A curated list of awesome Speaker Diarization papers, libraries, datasets, and other resources.
DigitalPhonetics/IMS-Toucan
Controllable and fast Text-to-Speech for over 7000 languages!
TEN-framework/ten-vad
Voice Activity Detection (VAD): low-latency, high-performance, and lightweight
coqui-ai/open-speech-corpora
💎 A list of accessible speech corpora for ASR, TTS, and other Speech Technologies
haoheliu/voicefixer
General Speech Restoration
mravanelli/SincNet
SincNet is a neural architecture for efficiently processing raw audio samples.
ictnlp/StreamSpeech
StreamSpeech is an “All in One” seamless model for offline and simultaneous speech recognition, speech translation and speech synthesis.
midas-research/audino
Open source audio annotation tool for humans
X-LANCE/SLAM-LLM
Speech, Language, Audio, Music Processing with Large Language Model
Ryuk17/SpeechAlgorithms
You can find the speech algorithms you want here
nyrahealth/CrisperWhisper
Verbatim Automatic Speech Recognition with improved word-level timestamps and filler detection
nanahou/Awesome-Speech-Enhancement
A tutorial for Speech Enhancement researchers and practitioners. The purpose of this repo is to organize the world’s resources for speech enhancement and make them universally accessible and useful.
drethage/speech-denoising-wavenet
A neural network for end-to-end speech denoising
breizhn/DTLN
TensorFlow 2.x implementation of the DTLN real-time speech denoising model, with TF-Lite, ONNX, and real-time audio processing support.
huawei-noah/Speech-Backbones
This is the main repository of open-sourced speech technology by Huawei Noah's Ark Lab.
Audio-WestlakeU/FullSubNet
PyTorch implementation of "FullSubNet: A Full-Band and Sub-Band Fusion Model for Real-Time Single-Channel Speech Enhancement."
ddlBoJack/Speech-Resources
Speech-related labs, companies, resources, internships, and more; recommendations and self-nominations are welcome.
pliang279/MultiBench
[NeurIPS 2021] Multiscale Benchmarks for Multimodal Representation Learning
SuperKogito/spafe
spafe: Simplified Python Audio Features Extraction
arjo129/uSpeech
Speech recognition toolkit for the Arduino
microsoft/UniSpeech
UniSpeech - Large Scale Self-Supervised Learning for Speech
gemengtju/Tutorial_Separation
This repo summarizes the tutorials, datasets, papers, code, and tools for the speech separation and speaker extraction tasks. Pull requests are welcome.
r9y9/pysptk
A Python wrapper for the Speech Signal Processing Toolkit (SPTK).
santi-pdp/pase
Problem Agnostic Speech Encoder
novoic/surfboard
Novoic's audio feature extraction library
SforAiDl/Neural-Voice-Cloning-With-Few-Samples
This repository contains an implementation of "Neural Voice Cloning With Few Samples"
r9y9/nnmnkwii
Library to build speech synthesis systems designed for easy and fast prototyping.