Question about custom dataset

LucasRotsen opened this issue 5 years ago · 2 comments

Hi everyone!

Firstly, thank you for the great implementation.

I haven't yet understood how I should prepare my data for training, so I'd appreciate it if someone could clarify that for me. My assumptions are:

If I have data from 10 speakers, I should split it into two files in the "filelist" directory (train and val).
Each of those files should contain a representative sample of all speakers.
The txt file format should be: path_to_audio|transcripts|speaker_id

Are my assumptions correct?
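
To make the assumptions above concrete, here is a minimal sketch of how such a split could be produced. It is not taken from the repository: the function names (`split_filelist`, `format_line`, `write_filelist`), the `val_fraction` parameter, and the output file names are hypothetical and only illustrate a per-speaker split written in the path_to_audio|transcripts|speaker_id format.

```python
import random
from collections import defaultdict


def split_filelist(entries, val_fraction=0.1, seed=0):
    """Split (audio_path, transcript, speaker_id) tuples into train and val.

    The split is done per speaker so that both lists contain utterances
    from every speaker. Hypothetical helper, not part of the repository.
    """
    by_speaker = defaultdict(list)
    for entry in entries:
        by_speaker[entry[2]].append(entry)

    rng = random.Random(seed)  # fixed seed so the split is reproducible
    train, val = [], []
    for speaker_id in sorted(by_speaker):
        items = list(by_speaker[speaker_id])
        rng.shuffle(items)
        # Keep at least one validation utterance per speaker when possible.
        n_val = max(1, int(len(items) * val_fraction)) if len(items) > 1 else 0
        val.extend(items[:n_val])
        train.extend(items[n_val:])
    return train, val


def format_line(audio_path, transcript, speaker_id):
    """Format one filelist line as path_to_audio|transcript|speaker_id."""
    return f"{audio_path}|{transcript}|{speaker_id}"


def write_filelist(path, entries):
    """Write one formatted line per entry to a text file."""
    with open(path, "w", encoding="utf-8") as f:
        for entry in entries:
            f.write(format_line(*entry) + "\n")
```

A possible usage, assuming `entries` has already been collected from the dataset: call `train, val = split_filelist(entries)` and then `write_filelist` once for each list, with file names of your choosing inside the filelist directory.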

Answer 1 · 2020-05-06T17:02:47.000Z

Yes, that's a good start!
Make sure you trim the silences at the beginning and end of each audio file, and that each transcript matches its audio file.
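
As a rough illustration of the trimming step, here is a minimal sketch that removes leading and trailing samples whose absolute amplitude falls below a threshold. The function name `trim_silence` and the default `threshold` are assumptions made for this example, and the input is assumed to be a list of float samples in the range [-1, 1]. In practice a library routine such as `librosa.effects.trim` is probably a better choice, since it works on frame energy rather than single samples.

```python
def trim_silence(samples, threshold=0.01):
    """Drop leading and trailing samples quieter than `threshold`.

    `samples` is assumed to be a sequence of floats in [-1, 1].
    Silence inside the utterance is left untouched.
    """
    start = 0
    end = len(samples)
    # Advance past quiet samples at the beginning.
    while start < end and abs(samples[start]) < threshold:
        start += 1
    # Step back over quiet samples at the end.
    while end > start and abs(samples[end - 1]) < threshold:
        end -= 1
    return samples[start:end]
```

The threshold would need tuning per recording setup; a noisy microphone may need a higher value than a studio recording.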

Answer 2 · 2020-05-06T17:04:35.000Z

Thanks for the quick reply, @rafaelvalle!