speaker-diarization

There are 103 repositories under the speaker-diarization topic.

  • speechbrain/speechbrain

    A PyTorch-based Speech Toolkit

    Language: Python
  • espnet/espnet

    End-to-End Speech Processing Toolkit

    Language: Python
  • pyannote/pyannote-audio

    Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding

    Language: Jupyter Notebook
  • modelscope/FunASR

    A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing etc.

    Language: Python
  • MahmoudAshraf97/whisper-diarization

    Automatic Speech Recognition with Speaker Diarization based on OpenAI Whisper

    Language: Jupyter Notebook
  • linto-ai/whisper-timestamped

    Multilingual Automatic Speech Recognition with word-level timestamps and confidence

    Language: Python
  • google/uis-rnn

    This is the library for the Unbounded Interleaved-State Recurrent Neural Network (UIS-RNN) algorithm, corresponding to the paper Fully Supervised Speaker Diarization.

    Language: Python
  • wq2012/awesome-diarization

    A curated list of awesome Speaker Diarization papers, libraries, datasets, and other resources.

  • juanmc2005/diart

    A Python package to build AI-powered real-time audio applications

    Language: Python
  • modelscope/3D-Speaker

    A Repository for Single- and Multi-modal Speaker Verification, Speaker Recognition and Speaker Diarization

    Language: Python
  • transcriptionstream/transcriptionstream

    Turnkey self-hosted offline transcription and diarization service with LLM summary

    Language: Python
  • wenet-e2e/wespeaker

    Research and Production Oriented Speaker Verification, Recognition and Diarization Toolkit

    Language: Python
  • wq2012/SpectralCluster

    Python re-implementation of the (constrained) spectral clustering algorithms used in Google's speaker diarization papers.

    Language: Python
  • taylorlu/Speaker-Diarization

    Speaker diarization by uis-rnn and speaker embedding by vgg-speaker-recognition

    Language: Python
  • nuaazs/VAF_2

    Aims to create a comprehensive voice toolkit for training, testing, and deploying speaker verification systems.

    Language: Python
  • hitachi-speech/EEND

    End-to-End Neural Diarization

    Language: Python
  • google/speaker-id

    This repository contains audio samples and supplementary materials accompanying publications by the "Speaker, Voice and Language" team at Google.

    Language: Python
  • manojpamk/pytorch_xvectors

    Deep speaker embeddings in PyTorch, including x-vectors. Code used in this work: https://arxiv.org/abs/2007.16196

    Language: Python
  • cvqluu/TDNN

    Time delay neural network (TDNN) implementation in PyTorch using the unfold method

    Language: Python
  • IBM-Cloud/chatbot-watson-android

    An Android ChatBot powered by Watson Services - Assistant, Speech-to-Text and Text-to-Speech on IBM Cloud.

    Language: Java
  • cvqluu/Factorized-TDNN

    PyTorch implementation of the Factorized TDNN (TDNN-F) from "Semi-Orthogonal Low-Rank Matrix Factorization for Deep Neural Networks" and Kaldi

    Language: Python
  • DongKeon/Awesome-Speaker-Diarization

    Some comprehensive papers about speaker diarization

  • cvqluu/simple_diarizer

    Simplified diarization pipeline using some pretrained models - audio file to diarized segments in a few lines of code

    Language: Python
  • yufan-aslp/AliMeeting

    This project is associated with the ICASSP 2022 Multi-channel Multi-party Meeting Transcription Challenge (M2MeT) and provides participants with baseline systems for speech recognition and speaker diarization in conference scenarios.

    Language: Python
  • NavodPeiris/speechlib

    speechlib is a library that performs speaker diarization, transcription, and speaker recognition on an audio file to create transcripts with actual speaker names

    Language: Python
  • Appen/UHV-OTS-Speech

    A data annotation pipeline to generate high-quality, large-scale speech datasets with machine pre-labeling and fully manual auditing.

    Language: Forth
  • yuyq96/D-TDNN

    PyTorch implementation of Densely Connected Time Delay Neural Network

    Language: Python
  • cvqluu/GE2E-Loss

    PyTorch implementation of Generalized End-to-End Loss for speaker verification

    Language: Python
  • FlorianKrey/DNC

    Discriminative Neural Clustering for Speaker Diarisation

    Language: Python
  • nezhar/speech-condenser

    A tool for summarizing dialogues from videos or audio

    Language: Python
  • VidyasagarMSC/WatBot

    An Android ChatBot powered by IBM Watson Services (Assistant V1, Text-to-Speech, and Speech-to-Text with Speaker Recognition) on IBM Cloud.

    Language: Java
  • Audio-WestlakeU/FS-EEND

    The official PyTorch implementation of "Frame-wise streaming end-to-end speaker diarization with non-autoregressive self-attention-based attractors". [ICASSP 2024]

    Language: Python
  • vishalshar/SpeakerDiarization_RNN_CNN_LSTM

    Speaker diarization is the problem of separating speakers in an audio recording. There can be any number of speakers, and the final result should state when each speaker starts and stops talking. In this project, we analyze a given audio file with 2 channels and 2 speakers (on separate channels).

    Language: Python
  • wq2012/SimpleDER

    A lightweight library to compute Diarization Error Rate (DER).

    Language: Python
  • FrenchKrab/IS2023-powerset-diarization

    Official repository for the "Powerset multi-class cross entropy loss for neural speaker diarization" paper published in Interspeech 2023.

    Language: Jupyter Notebook