diarization
There are 110 repositories under the diarization topic.
Purfview/whisper-standalone-win
Whisper & Faster-Whisper standalone executables for those who don't want to bother with Python.
R3gm/SoniTranslate
Synchronized Translation for Videos. Video dubbing
transcriptionstream/transcriptionstream
Turnkey self-hosted offline transcription and diarization service with LLM summary
microsoft/UniSpeech
UniSpeech - Large Scale Self-Supervised Learning for Speech
revdotcom/reverb
Open source inference code for Rev's model
gong-io/gecko
Gecko - A Tool for Effective Annotation of Human Conversations
thewh1teagle/sherpa-rs
Rust bindings to https://github.com/k2-fsa/sherpa-onnx
SuyashMore/MevonAI-Speech-Emotion-Recognition
Identify the emotion of multiple speakers in an Audio Segment
narcotic-sh/senko
Very fast, accurate speaker diarization
cvqluu/simple_diarizer
Simplified diarization pipeline using some pretrained models - audio file to diarized segments in a few lines of code
taresh18/TTSizer
🎙️ Automatically transcribe audio/video into high-quality, speaker-specific Text-To-Speech datasets ✨
desh2608/dover-lap
Python package for combining diarization system outputs.
thewh1teagle/pyannote-rs
pyannote audio diarization in Rust
bunyaminergen/Callytics
Callytics is an advanced call analytics solution that leverages speech recognition and large language model (LLM) technologies to analyze phone conversations from customer service and call centers.
wq2012/SimpleDER
A lightweight library to compute Diarization Error Rate (DER).
JSchmie/ScrAIbe
Tool for automatic transcription and speaker diarization based on whisper and pyannote.
Picovoice/falcon
On-device speaker diarization powered by deep learning
cvqluu/nn-similarity-diarization
Neural network based similarity scoring for diarization (PyTorch implementation of "LSTM based Similarity Measurement with Spectral Clustering for Speaker Diarization")
jeanjerome/EchoInStone
EchoInStone is an audio processing tool that transcribes, diarizes, and aligns speaker segments from audio files, prioritizing accuracy and reliability.
desh2608/spyder
Simple Python package for fast DER computation
empenoso/offline-audio-transcriber
Local, free speech recognition using OpenAI Whisper. Automate the transcription of lectures and meetings on your PC without cloud services or subscriptions
exemplaryai/ai-engine
Easy to use Multi-Provider ASR/Speech To Text and NLP engine
jakariaemon/WSI
Whisper Speaker Identification (WSI), a cutting-edge model for multilingual speaker identification.
chimechallenge/chime-utils
Scripts for data generation, scoring and data manifest preparation for CHiME-8 DASR task.
harmlessman/PAFTS
PAFTS: Library That Preprocesses Audio For TTS.
pulijon/Sttcast
Transcription from MP3 files to HTML with or without embedded player
shahruk10/kaldi-tflite
Convert Kaldi feature extraction and nnet3 models into TensorFlow Lite models. Currently aimed at converting Kaldi's x-vector models and diarization pipelines to TensorFlow models.
KaddaOK/TASMAS
Free open-source transcriber and summarizer for file-per-speaker recordings, such as Discord calls recorded by the Craig bot
cadia-lvl/kaldi-speaker-diarization
This repository creates speaker diarization recipes to be used within the egs folder of Kaldi.
mmaudet/speaker-splitter
A Python tool to separate audio files by speaker using diarization data.
thewh1teagle/loud.cpp
Whisper.cpp with diarization
ElmiraGhorbani/gpt-speaker-diarization
Conversational Speaker Diarization using OpenAI language models (GPT-4) and OpenAI Whisper.
orianemartin/WhispGrid
A Whisper to TextGrid script that I use to automate corpus annotation in Praat, with speaker diarization.
SEERNET/Multi-Speaker-Diarization
Automated multi-speaker diarization API for meetings, calls, interviews, press conferences, etc.
CrispStrobe/Susurrus
Speech-to-text GUI for different models (mostly Whisper, also Voxtral) and backends, including whisper.cpp, mlx-whisper, faster-whisper, and ctranslate2; applies pyannote for diarization
LianaMikael/SpeechDatasets
Large publicly available speech datasets