speaker-diarization

There are 125 repositories under the speaker-diarization topic.

  • speechbrain/speechbrain

    A PyTorch-based Speech Toolkit

    Language: Python
  • espnet/espnet

    End-to-End Speech Processing Toolkit

    Language: Python
  • modelscope/FunASR

    A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing etc.

    Language: Python
  • pyannote/pyannote-audio

    Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding

    Language: Jupyter Notebook
  • MahmoudAshraf97/whisper-diarization

    Automatic Speech Recognition with Speaker Diarization based on OpenAI Whisper

    Language: Jupyter Notebook
  • linto-ai/whisper-timestamped

    Multilingual Automatic Speech Recognition with word-level timestamps and confidence

    Language: Python
  • wq2012/awesome-diarization

    A curated list of awesome Speaker Diarization papers, libraries, datasets, and other resources.

  • google/uis-rnn

    This is the library for the Unbounded Interleaved-State Recurrent Neural Network (UIS-RNN) algorithm, corresponding to the paper Fully Supervised Speaker Diarization.

    Language: Python
  • Purfview/whisper-standalone-win

    Whisper & Faster-Whisper standalone executables for those who don't want to bother with Python.

  • modelscope/3D-Speaker

    A Repository for Single- and Multi-modal Speaker Verification, Speaker Recognition and Speaker Diarization

    Language: Python
  • juanmc2005/diart

    A Python package to build AI-powered real-time audio applications

    Language: Python
  • wenet-e2e/wespeaker

    Research and Production Oriented Speaker Verification, Recognition and Diarization Toolkit

    Language: Python
  • transcriptionstream/transcriptionstream

    Turnkey self-hosted offline transcription and diarization service with LLM summary

    Language: Python
  • wq2012/SpectralCluster

    Python re-implementation of the (constrained) spectral clustering algorithms used in Google's speaker diarization papers.

    Language: Python
  • taylorlu/Speaker-Diarization

    Speaker diarization with uis-rnn and speaker embeddings from vgg-speaker-recognition

    Language: Python
  • nuaazs/VAF_2

    Aims to create a comprehensive voice toolkit for training, testing, and deploying speaker verification systems.

    Language: Python
  • google/speaker-id

    This repository contains audio samples and supplementary materials accompanying publications by the "Speaker, Voice and Language" team at Google.

    Language: Python
  • hitachi-speech/EEND

    End-to-End Neural Diarization

    Language: Python
  • revdotcom/reverb

    Open source inference code for Rev's model

    Language: Python
  • manojpamk/pytorch_xvectors

    Deep speaker embeddings in PyTorch, including x-vectors. Code used in this work: https://arxiv.org/abs/2007.16196

    Language: Python
  • DongKeon/Awesome-Speaker-Diarization

    A collection of comprehensive papers on speaker diarization

  • cvqluu/TDNN

    Time delay neural network (TDNN) implementation in PyTorch using the unfold method

    Language: Python
  • IBM-Cloud/chatbot-watson-android

    An Android ChatBot powered by Watson Services - Assistant, Speech-to-Text and Text-to-Speech on IBM Cloud.

    Language: Java
  • NavodPeiris/speechlib

    speechlib is a library that can do speaker diarization, transcription and speaker recognition on an audio file to create transcripts with actual speaker names

    Language: Python
  • cvqluu/Factorized-TDNN

    PyTorch implementation of the Factorized TDNN (TDNN-F) from "Semi-Orthogonal Low-Rank Matrix Factorization for Deep Neural Networks" and Kaldi

    Language: Python
  • cvqluu/simple_diarizer

    Simplified diarization pipeline using some pretrained models - audio file to diarized segments in a few lines of code

    Language: Python
  • yufan-aslp/AliMeeting

    The project is associated with the ICASSP 2022 Multi-channel Multi-party Meeting Transcription Challenge (M2MeT) and provides participants with baseline systems for speech recognition and speaker diarization in conference scenarios.

    Language: Python
  • Appen/UHV-OTS-Speech

    A data annotation pipeline to generate high-quality, large-scale speech datasets with machine pre-labeling and fully manual auditing.

    Language: Forth
  • Audio-WestlakeU/FS-EEND

    The official PyTorch implementation of "Frame-wise streaming end-to-end speaker diarization with non-autoregressive self-attention-based attractors" (ICASSP 2024) and "LS-EEND: long-form streaming end-to-end neural diarization with online attractor extraction"

    Language: Python
  • yuyq96/D-TDNN

    PyTorch implementation of Densely Connected Time Delay Neural Network

    Language: Python
  • cvqluu/GE2E-Loss

    PyTorch implementation of Generalized End-to-End Loss for speaker verification

    Language: Python
  • nezhar/speech-condenser

    A tool for summarizing dialogues from videos or audio

    Language: Python
  • FlorianKrey/DNC

    Discriminative Neural Clustering for Speaker Diarisation

    Language: Python
  • VidyasagarMSC/WatBot

    An Android ChatBot powered by IBM Watson Services (Assistant V1, Text-to-Speech, and Speech-to-Text with Speaker Recognition) on IBM Cloud.

    Language: Java
  • FrenchKrab/IS2023-powerset-diarization

    Official repository for the "Powerset multi-class cross entropy loss for neural speaker diarization" paper published in Interspeech 2023.

    Language: Jupyter Notebook