yahcong's Stars
rabitt/ismir2017-deepsalience
Companion code for ISMIR 2017 paper "Deep Salience Representations for $F_0$ Estimation in Polyphonic Music"
philipperemy/speaker-change-detection
Paper: https://arxiv.org/abs/1702.02285
BornInWater/Overlap-Detection
Overlapped Speech detection in Multi-party Conversations
shvmshukla/Speaker-Change-Detection
Speaker diarization is the first step in many audio processing pipelines and aims to solve the problem of "who spoke when". It therefore relies on efficient use of temporal information from extracted audio features.
yinruiqing/change_detection
Code for Speaker Change Detection in Broadcast TV using Bidirectional Long Short-Term Memory Networks
kaldi-asr/kaldi
kaldi-asr/kaldi is the official location of the Kaldi project.
Jamiroquai88/VBDiarization
Speaker diarization based on Kaldi x-vectors, tuned for 16k microphone data
Janghyun1230/Speaker_Verification
TensorFlow implementation of "Generalized End-to-End Loss for Speaker Verification"
Suhee05/Text-Independent-Speaker-Verification
Text Independent Speaker Verification Using GE2E Loss
HaiFengZeng/GE2E
funcwj/ge2e-speaker-verification
PyTorch implementation of "Generalized End-to-End Loss for Speaker Verification"
astorfi/3D-convolutional-speaker-recognition
Deep Learning & 3D Convolutional Neural Networks for Speaker Verification
wangleiai/dVectorSpeakerRecognition
d-vector-based speaker recognition in Keras
espnet/espnet
End-to-End Speech Processing Toolkit
google/end-to-end
End-To-End is a crypto library to encrypt, decrypt, digitally sign, and verify signed messages (implementing OpenPGP)
keras-team/keras
Deep Learning for humans
AKBoles/Deep-Learning-Speech-Recognition
Project to learn about speech recognition - both Speaker Diarization and other Speech Recognition applications.
pyannote/pyannote-metrics
A toolkit for reproducible evaluation, diagnostic, and error analysis of speaker diarization systems
crystal-method/Looking-to-Listen
google/uis-rnn
This is the library for the Unbounded Interleaved-State Recurrent Neural Network (UIS-RNN) algorithm, corresponding to the paper Fully Supervised Speaker Diarization.
hbredin/TristouNet
TristouNet: Triplet Loss for Speaker Turn Embedding
juanjobosch/SourceFilterContoursMelody
Melody extraction based on source-filter modelling
mozilla/DeepSpeech
DeepSpeech is an open source embedded (offline, on-device) speech-to-text engine which can run in real time on devices ranging from a Raspberry Pi 4 to high power GPU servers.
yinruiqing/diarization_with_neural_approach
Felix-Yan/FastICA
A Python version of fast and robust ICA, based on the paper by Aapo Hyvärinen.
ZhihaoDU/speech_feature_extractor
Useful speech processing features, such as MFCC, gammatone filterbank, GFCC, spectrum (power spectrum and log-power spectrum), Amplitude Modulation Spectrum (AMS), and so on.
justinsalamon/melosynth
Synthesize a continuous pitch sequence
justinsalamon/scaper
A library for soundscape synthesis and augmentation
marl/crepe
CREPE: A Convolutional REpresentation for Pitch Estimation -- pre-trained model (ICASSP 2018)
ankitshah009/Task-4-Large-scale-weakly-supervised-sound-event-detection-for-smart-cars
Task 4 Large-scale weakly supervised sound event detection for smart cars