/speaker-verification-tools

This module contains some tools to make a simple speaker verification

Primary LanguagePythonMIT LicenseMIT

speaker-verification-tools

This module contains some tools to make a simple speaker verification.

You can download it with PyPI:

$ pip3 install speaker-verification-tools

To use in your own projects:

import speaker_verification_tools.tools as svt

Basic usage

find_nearest_voice_data(voice_data_list, voice_sample)

Find the nearest voice data based on this voice sample. Could be used to make the naive Accept/Reject decision.

voice_data_list: a list containing all voices data from the dataset. voice_sample: the voice sample reference.

returns: the index of the element from voice_data_list that represents the nearest voice data.

compute_distance(sample1, sample3)

Compute the distance between sample1 and sample2 using O(n) DTW algorithm

sample1: the mfcc data extracted from the audio signal 1. sample2: the mfcc data extracted from the audio signal 2.

returns: Float number representing the minimum distance between sample1 and sample2.

extract_mfcc(signal_data, samplerate=16000, winlen=0.025, winstep=0.01)

Compute MFCC features from an audio signal

signal: the audio signal from which to compute features. Should be an N*1 array. samplerate: the sample rate of the signal we are working with, in Hz. winlen: the length of the analysis window in seconds. Default is 0.025s (25 milliseconds). winstep: the step between successive windows in seconds. Default is 0.01s (10 milliseconds).

returns: A numpy array of size (NUMFRAMES by numcep) containing features. Each row holds 1 feature vector.

extract_mfcc_from_wav_file(path, samplerate=16000, winlen=0.025, winstep=0.01)

Compute MFCC features from a wav file

path: the wav file path to be open. samplerate: the wanted sample rate, in Hz. Default is 16000. If you want no resampling fill this argument with None. winlen: the length of the analysis window in seconds. Default is 0.025s (25 milliseconds). winstep: the step between successive windows in seconds. Default is 0.01s (10 milliseconds).

returns: A numpy array of size (NUMFRAMES by numcep) containing features. Each row holds 1 feature vector.