speech-processing

There are 686 repositories under the speech-processing topic.

speechbrain/speechbrain
A PyTorch-based Speech Toolkit
Language: Python 10.4k 134 1.2k1.6k
pyannote/pyannote-audio
Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding
Language: Jupyter Notebook 8.3k 78 1k938
snakers4/silero-vad
Silero VAD: pre-trained enterprise-grade Voice Activity Detector
Language: Python 6.8k 58 292631
pliang279/awesome-multimodal-ml
Reading list for research topics in multimodal machine learning
6.6k 178 16891
microsoft/torchscale
Foundation Architecture for (M)LLMs
Language: Python 3.1k 44 86219
linto-ai/whisper-timestamped
Multilingual Automatic Speech Recognition with word-level timestamps and confidence
Language: Python 2.6k 34 161197
r9y9/wavenet_vocoder
WaveNet vocoder
Language: Python 2.4k 95 193496
r9y9/deepvoice3_pytorch
PyTorch implementation of convolutional neural networks-based text-to-speech synthesis models
Language: Python 2k 92 195487
resemble-ai/resemble-enhance
AI powered speech denoising and enhancement
Language: Python 2k 22 59232
wq2012/awesome-diarization
A curated list of awesome Speaker Diarization papers, libraries, datasets, and other resources.
1.8k 75 8236
DigitalPhonetics/IMS-Toucan
Controllable and fast Text-to-Speech for over 7000 languages!
Language: Python 1.6k 22 171180
TEN-framework/ten-vad
Voice Activity Detection (VAD): low-latency, high-performance, and lightweight
Language: C 1.4k116
coqui-ai/open-speech-corpora
💎 A list of accessible speech corpora for ASR, TTS, and other Speech Technologies
1.4k 56 199148
haoheliu/voicefixer
General Speech Restoration
Language: Python 1.2k 16 62147
mravanelli/SincNet
SincNet is a neural architecture for efficiently processing raw audio samples.
Language: Python 1.2k 33 106269
ictnlp/StreamSpeech
StreamSpeech is an “All in One” seamless model for offline and simultaneous speech recognition, speech translation and speech synthesis.
Language: Python 1.1k 14 1786
midas-research/audino
Open source audio annotation tool for humans
Language: JavaScript 1.1k 24 62137
X-LANCE/SLAM-LLM
Speech, Language, Audio, Music Processing with Large Language Model
Language: Python 890 23 7377
Ryuk17/SpeechAlgorithms
You can find the speech algorithms you want here
Language: C 830 22 11248
nyrahealth/CrisperWhisper
Verbatim Automatic Speech Recognition with improved word-level timestamps and filler detection
Language: Python 814 16 4145
nanahou/Awesome-Speech-Enhancement
A tutorial for Speech Enhancement researchers and practitioners. The purpose of this repo is to organize the world’s resources for speech enhancement and make them universally accessible and useful.
Language: MATLAB 786 32 5152
drethage/speech-denoising-wavenet
A neural network for end-to-end speech denoising
Language: Python 702 19 42164
breizhn/DTLN
TensorFlow 2.x implementation of the DTLN real-time speech denoising model, with TF-lite, ONNX, and real-time audio processing support.
Language: Python 645 9 83166
huawei-noah/Speech-Backbones
This is the main repository of open-sourced speech technology by Huawei Noah's Ark Lab.
Language: Jupyter Notebook 594 23 32124
Audio-WestlakeU/FullSubNet
PyTorch implementation of "FullSubNet: A Full-Band and Sub-Band Fusion Model for Real-Time Single-Channel Speech Enhancement."
Language: Python 576 8 63157
ddlBoJack/Speech-Resources
Labs, companies, resources, internships, and more in the speech field; recommendations and self-nominations are welcome.
573 20 2268
pliang279/MultiBench
[NeurIPS 2021] Multiscale Benchmarks for Multimodal Representation Learning
Language: HTML 572 15 3785
SuperKogito/spafe
spafe: Simplified Python Audio Features Extraction
Language: Python 477 11 4679
arjo129/uSpeech
Speech recognition toolkit for the Arduino
Language: C++ 475 66 44102
microsoft/UniSpeech
UniSpeech - Large Scale Self-Supervised Learning for Speech
Language: Python 467 18 4574
gemengtju/Tutorial_Separation
This repo summarizes the tutorials, datasets, papers, code, and tools for the speech separation and speaker extraction tasks. Pull requests are welcome.
Language: MATLAB 465 21 294
r9y9/pysptk
A Python wrapper for Speech Signal Processing Toolkit (SPTK).
Language: Python 446 21 6678
santi-pdp/pase
Problem Agnostic Speech Encoder
Language: Python 444 21 4685
novoic/surfboard
Novoic's audio feature extraction library
Language: Python 437 15 2047
SforAiDl/Neural-Voice-Cloning-With-Few-Samples
This repository has an implementation of "Neural Voice Cloning With Few Samples"
Language: Python 436 30 22122
r9y9/nnmnkwii
Library to build speech synthesis systems designed for easy and fast prototyping.
Language: Python 398 22 6572
