speech-processing
There are 566 repositories under the speech-processing topic.
speechbrain/speechbrain
A PyTorch-based Speech Toolkit
pliang279/awesome-multimodal-ml
Reading list for research topics in multimodal machine learning
pyannote/pyannote-audio
Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding
microsoft/torchscale
Foundation Architecture for (M)LLMs
r9y9/wavenet_vocoder
WaveNet vocoder
r9y9/deepvoice3_pytorch
PyTorch implementation of convolutional neural network-based text-to-speech synthesis models
linto-ai/whisper-timestamped
Multilingual Automatic Speech Recognition with word-level timestamps and confidence
wq2012/awesome-diarization
A curated list of awesome Speaker Diarization papers, libraries, datasets, and other resources.
coqui-ai/open-speech-corpora
💎 A list of accessible speech corpora for ASR, TTS, and other Speech Technologies
mravanelli/SincNet
SincNet is a neural architecture for efficiently processing raw audio samples.
midas-research/audino
Open source audio annotation tool for humans
resemble-ai/resemble-enhance
AI powered speech denoising and enhancement
haoheliu/voicefixer
General Speech Restoration
Ryuk17/SpeechAlgorithms
Speech Algorithms
nanahou/Awesome-Speech-Enhancement
A tutorial for Speech Enhancement researchers and practitioners. The purpose of this repo is to organize the world’s resources for speech enhancement and make them universally accessible and useful.
drethage/speech-denoising-wavenet
A neural network for end-to-end speech denoising
breizhn/DTLN
TensorFlow 2.x implementation of the DTLN real-time speech denoising model, with TF-Lite, ONNX, and real-time audio processing support.
huawei-noah/Speech-Backbones
This is the main repository of open-sourced speech technology by Huawei Noah's Ark Lab.
Audio-WestlakeU/FullSubNet
PyTorch implementation of "FullSubNet: A Full-Band and Sub-Band Fusion Model for Real-Time Single-Channel Speech Enhancement."
arjo129/uSpeech
Speech recognition toolkit for the Arduino
DigitalPhonetics/IMS-Toucan
Text-to-Speech Toolkit of the Speech and Language Technologies Group at the University of Stuttgart. Objectives of the development are simplicity, modularity, controllability and multilinguality.
pliang279/MultiBench
[NeurIPS 2021] Multiscale Benchmarks for Multimodal Representation Learning
ddlBoJack/Speech-Resources
Speech-related labs, companies, resources, internships, and more; recommendations and self-nominations are welcome
santi-pdp/pase
Problem Agnostic Speech Encoder
r9y9/pysptk
A Python wrapper for the Speech Signal Processing Toolkit (SPTK).
SuperKogito/spafe
spafe: Simplified Python Audio Features Extraction
novoic/surfboard
Novoic's audio feature extraction library
SforAiDl/Neural-Voice-Cloning-With-Few-Samples
This repository contains an implementation of "Neural Voice Cloning With Few Samples"
gemengtju/Tutorial_Separation
This repo summarizes the tutorials, datasets, papers, code, and tools for the speech separation and speaker extraction tasks. Pull requests are welcome.
microsoft/UniSpeech
UniSpeech - Large Scale Self-Supervised Learning for Speech
r9y9/nnmnkwii
Library to build speech synthesis systems designed for easy and fast prototyping.
speechbrain/speechbrain.github.io
The SpeechBrain project aims to build a novel speech toolkit fully based on PyTorch. With SpeechBrain, users can easily create speech processing systems, ranging from speech recognition (both HMM/DNN and end-to-end) to speaker recognition, speech enhancement, speech separation, multi-microphone speech processing, and many others.
ddlBoJack/SLAM-LLM
Speech, Language, Audio, Music Processing with Large Language Model
rishikksh20/VocGAN
VocGAN: A High-Fidelity Real-time Vocoder with a Hierarchically-nested Adversarial Network
haoxiangsnr/Wave-U-Net-for-Speech-Enhancement
A PyTorch implementation of Wave-U-Net, adapted to speech enhancement.
seanwood/gcc-nmf
Real-time GCC-NMF Blind Speech Separation and Enhancement