v-iashin/VoxCeleb

about the iden_split.txt

hktxt opened this issue · 2 comments

hktxt commented

Hi~
Your code is really helpful. Could you please tell me more about iden_split.txt?
Is it a text file that contains file paths, one path per row?

Hey!

I am really glad that you found my code to be helpful.

Regarding your question, I think you are correct: one path per row.

iden_split.txt is the file that VGG provides with the dataset: on the VoxCeleb1 page, look for "Dataset split for Identification". Also, you may take a look at the preprocessing.ipynb notebook, which processes the raw downloaded files.

It appears to me that each row of this file consists of a phase (train: 1 and 2; test: 3) and an audio path (in the format id/track/segment). To verify that the first column is a phase, you may count the number of rows for each value and compare the counts with the values given in Table 5 of the paper. It is also a reasonable assumption because, as the paper puts it:

identification is treated as a simple classification task, the output of the last layer is fed into a 1,251-way softmax in order to produce a distribution over the 1,251 different speakers.
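
The row-counting check described above could be sketched roughly as follows. This is only an illustration, not code from the repository: it assumes each row is a phase number and a path separated by whitespace, the helper names `parse_split_line` and `count_phases` are made up here, and the sample rows are invented placeholders rather than real entries from iden_split.txt.

```python
from collections import Counter


def parse_split_line(line):
    """Split one row into (phase, path).

    Assumes the layout '<phase> <id/track/segment>' separated by whitespace.
    """
    phase, path = line.strip().split(maxsplit=1)
    return int(phase), path


def count_phases(lines):
    """Count how many rows belong to each phase value, skipping blank rows."""
    counts = Counter()
    for line in lines:
        if line.strip():
            phase, _ = parse_split_line(line)
            counts[phase] += 1
    return counts


# Invented placeholder rows; the real file has one such row per audio segment.
sample_rows = [
    "1 id00001/trackA/00001.wav",
    "2 id00001/trackA/00002.wav",
    "3 id00001/trackB/00001.wav",
    "1 id00002/trackC/00001.wav",
]

counts = count_phases(sample_rows)
train_rows = counts[1] + counts[2]  # phases 1 and 2 are read as train here
test_rows = counts[3]               # phase 3 is read as test
print(f"train rows: {train_rows}, test rows: {test_rows}")

# For the real file, something like this should work:
# with open("iden_split.txt") as f:
#     counts = count_phases(f)
```

If the totals obtained from the real file match the numbers in Table 5, that supports reading the first column as a phase.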

hktxt commented

@vdyashin
I just realized that it was provided by VGG. Thanks anyway~~~
I will ask for more help if I get stuck~~~ hahaha