VGGVox-PyTorch

A PyTorch implementation of VGGVox for the VoxCeleb1 dataset.

Train

pip install -r requirements.txt
python3 train.py --dir ./Data/

Specify the data directory with --dir.

Notes

81.79% Top-1 and 93.17% Top-5 test-set accuracy, which is pretty satisfactory. See results.txt for details.
Training on a V100 takes 4 minutes per epoch.
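For readers unfamiliar with the Top-1 and Top-5 figures above, here is a minimal sketch of how such accuracies are commonly computed. It is plain Python for clarity and is not taken from this repository's evaluation code; the function name `topk_accuracy` and its arguments are hypothetical.

```python
def topk_accuracy(scores, labels, ks=(1, 5)):
    """Return {k: accuracy in percent} for each k in ks.

    scores: list of per-sample score lists, one score per class (hypothetical input).
    labels: list of integer class ids, one per sample.
    """
    hits = {k: 0 for k in ks}
    for row, label in zip(scores, labels):
        # Class indices ordered from highest to lowest score.
        ranked = sorted(range(len(row)), key=lambda c: row[c], reverse=True)
        for k in ks:
            # A sample counts as correct if its label is among the k best classes.
            if label in ranked[:k]:
                hits[k] += 1
    n = len(labels)
    return {k: 100.0 * hits[k] / n for k in ks}


if __name__ == "__main__":
    demo_scores = [[0.1, 0.7, 0.2], [0.5, 0.3, 0.2]]
    demo_labels = [1, 1]
    print(topk_accuracy(demo_scores, demo_labels, ks=(1, 2)))
```

Top-5 is always at least as high as Top-1, since any sample counted as a Top-1 hit is also a Top-5 hit.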

Model

Run python3 vggm.py to see the model architecture.
The best model weights are uploaded as VGGM300_BEST_140_81.99.pth.

What I've done so far:

All the data is preprocessed exactly as in the author's MATLAB code. Checked and verified online on MATLAB.
Random 3 s cropped segments for training.
Copy all hyperparameters (LR, optimizer params, batch size) from the author's net.
Stabilize PyTorch's BatchNorm and test variants. Improved results by a small percentage.
Try one-sided spectrogram input, as mentioned on the author's GitHub.
Port the author's network from MATLAB and train. The MATLAB model has a 1300-dimensional output; will test it later.
~~Copy weights from the MATLAB network and test.~~
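The random 3 s cropping mentioned in the list above can be sketched as follows. This is an illustration only: it assumes cropping is done on the raw waveform at 16 kHz (the VoxCeleb sample rate), whereas the repository may crop at a different stage. The names `random_crop`, `SAMPLE_RATE`, and `CROP_SECONDS` are hypothetical.

```python
import random

SAMPLE_RATE = 16000  # assumption: 16 kHz audio
CROP_SECONDS = 3     # segment length used for training


def random_crop(samples, sample_rate=SAMPLE_RATE, seconds=CROP_SECONDS, rng=random):
    """Return a contiguous segment of `seconds` length from a random offset."""
    crop_len = sample_rate * seconds
    if len(samples) < crop_len:
        raise ValueError("clip is shorter than the requested crop length")
    # randint is inclusive on both ends, so the last valid start is allowed.
    start = rng.randint(0, len(samples) - crop_len)
    return samples[start:start + crop_len]


if __name__ == "__main__":
    clip = list(range(SAMPLE_RATE * 5))  # stand-in for a 5 s waveform
    segment = random_crop(clip, rng=random.Random(0))
    print(len(segment))
```

Drawing a fresh offset each time a clip is loaded means the network sees a different 3 s window of the same utterance across epochs, which acts as a simple form of augmentation.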

References and Citations:

VGGVox
linhdvu14's vggvox-speaker-identification
jameslyons's python_speech_features

@InProceedings{Nagrani17,
 author       = "Nagrani, A. and Chung, J.~S. and Zisserman, A.",
 title        = "VoxCeleb: a large-scale speaker identification dataset",
 booktitle    = "INTERSPEECH",
 year         = "2017",
}


@InProceedings{Chung18,
 author       = "Chung, J.~S. and Nagrani, A. and Zisserman, A.",
 title        = "VoxCeleb2: Deep Speaker Recognition",
 booktitle    = "INTERSPEECH",
 year         = "2018",
}
