diarization
There are 84 repositories under the diarization topic.
Purfview/whisper-standalone-win
Whisper & Faster-Whisper standalone executables for those who don't want to bother with Python.
R3gm/SoniTranslate
Synchronized Translation for Videos. Video dubbing
transcriptionstream/transcriptionstream
Turnkey self-hosted offline transcription and diarization service with LLM summary
microsoft/UniSpeech
UniSpeech - Large Scale Self-Supervised Learning for Speech
revdotcom/reverb
Open source inference code for Rev's model
gong-io/gecko
Gecko - A Tool for Effective Annotation of Human Conversations
thewh1teagle/sherpa-rs
Rust bindings to https://github.com/k2-fsa/sherpa-onnx
SuyashMore/MevonAI-Speech-Emotion-Recognition
Identify the emotion of multiple speakers in an Audio Segment
cvqluu/simple_diarizer
Simplified diarization pipeline using some pretrained models - audio file to diarized segments in a few lines of code
desh2608/dover-lap
Python package for combining diarization system outputs.
thewh1teagle/pyannote-rs
pyannote audio diarization in Rust
bunyaminergen/Callytics
Callytics is an advanced call analytics solution that leverages speech recognition and large language model (LLM) technologies to analyze phone conversations from customer service and call centers.
wq2012/SimpleDER
A lightweight library to compute Diarization Error Rate (DER).
Picovoice/falcon
On-device speaker diarization powered by deep learning
cvqluu/nn-similarity-diarization
Neural network based similarity scoring for diarization (PyTorch implementation of "LSTM based Similarity Measurement with Spectral Clustering for Speaker Diarization")
JSchmie/ScrAIbe
Tool for automatic transcription and speaker diarization based on whisper and pyannote.
desh2608/spyder
Simple Python package for fast DER computation
jeanjerome/EchoInStone
EchoInStone is an audio processing tool that transcribes, diarizes, and aligns speaker segments from audio files, prioritizing accuracy and reliability.
exemplaryai/ai-engine
Easy to use Multi-Provider ASR/Speech To Text and NLP engine
jakariaemon/WSI
Whisper Speaker Identification (WSI), a cutting-edge model for multilingual speaker identification.
chimechallenge/chime-utils
Scripts for data generation, scoring, and data manifest preparation for the CHiME-8 DASR task.
pulijon/Sttcast
Transcription from MP3 files to HTML with or without embedded player
shahruk10/kaldi-tflite
Convert Kaldi feature extraction and nnet3 models into TensorFlow Lite models. Currently aimed at converting Kaldi's x-vector models and diarization pipelines to TensorFlow models.
cadia-lvl/kaldi-speaker-diarization
This repository creates speaker diarization recipes to be used within the egs folder of Kaldi.
harmlessman/PAFTS
PAFTS: a library that preprocesses audio for TTS.
KaddaOK/TASMAS
Free open-source transcriber and summarizer for file-per-speaker recordings, such as Discord calls recorded by the Craig bot
ElmiraGhorbani/gpt-speaker-diarization
Conversational speaker diarization using OpenAI language models (GPT-4) and OpenAI Whisper.
orianemartin/WhispGrid
A Whisper to TextGrid script that I use to automate corpus annotation in Praat, with speaker diarization.
thewh1teagle/loud.cpp
Whisper.cpp with diarization
SEERNET/Multi-Speaker-Diarization
Automated multi-speaker diarization API for meetings, calls, interviews, press conferences, etc.
LianaMikael/SpeechDatasets
Large publicly available speech datasets
bunyaminergen/WavLMMSDD
This repository combines `WavLM`, a powerful speech representation model from Microsoft, with `MSDD` (Multi-Scale Diarization Decoder), a state-of-the-art approach for speaker diarization from NVIDIA.
Rehan-Ahmad/Speech-Music-Segmentation
This repository provides unsupervised segmentation of audio files that consist of music and speech.
theshajha/whisper-realtime-speech-to-text-summary
Transcribe real-world speech with an API call. Based on Whisper (ASR by OpenAI) - https://openai.com/blog/whisper/
mmaudet/speaker-splitter
A Python tool to separate audio files by speaker using diarization data.
DTDwind/RTTManz
A simple Python package for analyzing the necessary data in speaker diarization using oracle RTTM files and audio files.
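Several of the tools listed above consume or produce RTTM files, the plain-text format commonly used to store diarization segments. As a rough illustration only (this is not the code of any repository listed here), the sketch below parses RTTM `SPEAKER` lines and totals the speaking time per speaker. The function names `parse_rttm` and `speaking_time`, the recording name `meeting1`, and the speaker labels are hypothetical and chosen for the example.

```python
from collections import defaultdict

def parse_rttm(text):
    """Parse RTTM text into (recording, start, end, speaker) tuples.

    Assumes the usual 10-field RTTM layout:
    SPEAKER <recording> <channel> <onset> <duration> <NA> <NA> <speaker> <NA> <NA>
    Lines that are not SPEAKER records are skipped.
    """
    segments = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 8 or fields[0] != "SPEAKER":
            continue
        onset = float(fields[3])
        duration = float(fields[4])
        segments.append((fields[1], onset, onset + duration, fields[7]))
    return segments

def speaking_time(segments):
    """Sum segment durations per speaker label (overlaps are counted for each speaker)."""
    totals = defaultdict(float)
    for _, start, end, speaker in segments:
        totals[speaker] += end - start
    return dict(totals)

# Hypothetical three-segment recording used only for illustration.
EXAMPLE_RTTM = """\
SPEAKER meeting1 1 0.00 2.50 <NA> <NA> spk_a <NA> <NA>
SPEAKER meeting1 1 2.50 1.50 <NA> <NA> spk_b <NA> <NA>
SPEAKER meeting1 1 4.00 3.00 <NA> <NA> spk_a <NA> <NA>
"""

segments = parse_rttm(EXAMPLE_RTTM)
totals = speaking_time(segments)
for speaker, seconds in sorted(totals.items()):
    print(f"{speaker}: {seconds:.2f} s")
```

The same segment tuples could serve as a starting point for cutting audio per speaker or for scoring against a reference, though real tools handle details such as overlap, collars, and speaker mapping that this sketch leaves out.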