[How to?] Embeddings for each .wav file in dataset folder
Hi! I am using a custom dataset stored as a folder of .wav files, and I am curious how to create an x-vector for each file in it. I tried using `extract.py` and `pytorch_run.sh`, changing only the directory names and similar settings. From what I understand so far, some feature extraction and splitting steps have to run first. So this is a request for help: how can I solve this using your pretrained models? (I use `xvec_preTrained/checkpoint_step309.tar` from the README link.)
If you only need to extract x-vectors, check out `egs/sv_voices.sh`.
The first part of the script performs x-vector extraction (before scoring, line 62). You might need to add an exit statement to stop the script at that point. You will need to supply these variables: `kaldiDir`, `wavDir`, `modelDir` and `transformDir`.
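Before pointing `wavDir` at a custom folder, it can help to check which files will actually be picked up. The sketch below is not part of this repository: it walks a folder and writes a Kaldi-style `wav.scp` (one `<utt_id> <path>` line per recording). The function name `build_wav_scp` and the choice of the file stem as utterance ID are assumptions made for illustration, and whether `egs/sv_voices.sh` needs such a file or builds its own list from `wavDir` is not confirmed here.

```python
from pathlib import Path


def build_wav_scp(wav_dir, out_path):
    """Write one '<utt_id> <absolute path>' line per .wav file found under wav_dir.

    Hypothetical helper: the utterance ID is simply the file name without
    its extension, so two files with the same name in different subfolders
    are rejected rather than silently overwritten.
    """
    entries = {}
    for wav in Path(wav_dir).rglob("*.wav"):
        utt_id = wav.stem
        if utt_id in entries:
            raise ValueError(f"duplicate utterance id: {utt_id}")
        entries[utt_id] = wav.resolve()

    # Kaldi tools generally expect tables sorted by key.
    lines = [f"{utt_id} {path}" for utt_id, path in sorted(entries.items())]
    Path(out_path).write_text("\n".join(lines) + ("\n" if lines else ""))
    return lines
```

For example, `build_wav_scp("my_dataset", "wav.scp")` would list every .wav file under `my_dataset` (a placeholder folder name); an empty result is a quick sign that the path or the file extension is not what the script expects.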
Thanks for the pointer! That helped a lot!