Speaker identification on the VoxCeleb dataset using statistical moments, covariance analysis, PCA computed from the covariance matrix, and K-means clustering
sudo apt install ffmpeg
python3 -m venv venv
. venv/bin/activate
pip install -r requirements.txt
python extract_ds.py /path/to/dataset
python mp42wav.py /path/to/dataset
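The internals of mp42wav.py are not shown in this README. As a rough sketch only, a conversion to the WAV format described in this README (16 kHz, mono, pcm_s16le) could be driven from Python by calling ffmpeg through subprocess. The function names `build_ffmpeg_cmd` and `convert_to_wav` are hypothetical and are not taken from the repository.

```python
import subprocess
from pathlib import Path


def build_ffmpeg_cmd(src, dst):
    """Build an ffmpeg command line for 16 kHz mono 16-bit PCM output.

    Hypothetical helper: the real mp42wav.py may use different options.
    """
    return [
        "ffmpeg",
        "-y",                    # overwrite the output file if it exists
        "-i", str(src),          # input file
        "-vn",                   # drop any video stream
        "-ac", "1",              # one audio channel (mono)
        "-ar", "16000",          # 16 kHz sampling rate
        "-acodec", "pcm_s16le",  # signed 16-bit little-endian PCM
        str(dst),
    ]


def convert_to_wav(src):
    """Convert one file to WAV next to the source (assumes ffmpeg is on PATH)."""
    src = Path(src)
    dst = src.with_suffix(".wav")
    subprocess.run(build_ffmpeg_cmd(src, dst), check=True)
    return dst
```

Keeping the command construction separate from the subprocess call makes it possible to inspect or log the exact ffmpeg invocation without running it.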
https://www.robots.ox.ac.uk/~vgg/data/voxceleb/vox1.html
All the audio files in the dataset are stored as *.wav with the following characteristics:
- 16 kHz sampling rate
- mono format
- audio codec = pcm_s16le
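To confirm that a converted file matches the characteristics listed above, something like the following check could be used. It is a minimal sketch based on Python's standard `wave` module; the function name `check_wav` is hypothetical and not part of this repository.

```python
import wave


def check_wav(path):
    """Report whether a WAV file is 16 kHz, mono, 16-bit uncompressed PCM.

    Hypothetical helper for illustration only.
    """
    with wave.open(str(path), "rb") as wf:
        return {
            "sample_rate_16k": wf.getframerate() == 16000,
            "mono": wf.getnchannels() == 1,
            # pcm_s16le means 2 bytes per sample with no compression
            "pcm_16bit": wf.getsampwidth() == 2 and wf.getcomptype() == "NONE",
        }
```

A file passes when every value in the returned dictionary is true, for example `all(check_wav("sample.wav").values())`, where `sample.wav` is a placeholder path.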
Running the entire script took 2.44 s.
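For readers unfamiliar with the statistical moments mentioned in the title, the sketch below computes the first four for a list of samples. It is an illustration under stated assumptions, not the project's feature extractor: the function name `statistical_moments` is hypothetical, and the population (divide by n) definitions and non-excess kurtosis are choices made here that the repository may not share.

```python
def statistical_moments(samples):
    """Return mean, variance, skewness, and kurtosis of a sequence of numbers.

    Assumptions (not taken from the repository): population moments
    (divide by n) and non-excess kurtosis, i.e. m4 / m2**2.
    """
    n = len(samples)
    if n == 0:
        raise ValueError("samples must not be empty")

    mean = sum(samples) / n
    centered = [x - mean for x in samples]

    # Central moments of order 2, 3, and 4
    m2 = sum(d ** 2 for d in centered) / n
    m3 = sum(d ** 3 for d in centered) / n
    m4 = sum(d ** 4 for d in centered) / n

    if m2 == 0:
        # A constant signal has no spread; report zero shape statistics
        skewness = 0.0
        kurtosis = 0.0
    else:
        skewness = m3 / m2 ** 1.5
        kurtosis = m4 / m2 ** 2

    return {
        "mean": mean,
        "variance": m2,
        "skewness": skewness,
        "kurtosis": kurtosis,
    }
```

Applied per audio frame or per feature dimension, such values could form a compact feature vector before covariance analysis, PCA, and K-means.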