mdangschat/ctc-asr

How can I train with my own dataset?

dangvansam opened this issue · 4 comments

I have a dataset: one folder 'wav' containing the .wav files, and one text file with one line per WAV file, in the format name_wav text_of_wav.
How can I train with this data? Thanks so much, I'm a beginner.

p225_001 Please call Stella.
p225_002 Ask her to bring these things with her from the store.
p225_003 Six spoons of fresh snow peas, five thick slabs of blue cheese, and maybe a snack for her brother Bob
.....

Hi @dangvansam98, thanks for your question. It appears that my documentation has been lagging behind since I switched from a TXT file to CSV. I'll try to update it over the weekend.

If I remember correctly, the model expects a data/train.csv file in its data directory, with the following format:

path;label;length
timit/TIMIT/TEST/DR1/FAKS0/SI1573.WAV;his captain was thin and haggard and his beautiful boots were worn and shabby;4.9728125
timit/TIMIT/TEST/DR1/FAKS0/SI2203.WAV;the reasons for this dive seemed foolish now;3.513625
...

Here, path is the WAV file's path relative to the DATA_DIR/corpus/ directory. By default, label is the all-lowercase transcription without punctuation, and length is the audio length in seconds.
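For a dataset laid out like the one in the question (a folder of WAV files plus a text file of "name text" lines), a conversion to this CSV format could look roughly like the sketch below. It is not part of the repository: the function names (`normalize_label`, `wav_length_seconds`, `build_csv`) are hypothetical, it assumes the WAV folder lives somewhere under DATA_DIR/corpus/, that the files are plain PCM WAVs readable by Python's standard `wave` module, and that keeping only the letters a-z, apostrophes and spaces is an acceptable reading of "lower case without punctuation".

```python
import csv
import re
import wave
from pathlib import Path


def normalize_label(text):
    """Lower-case the text and replace anything but a-z, apostrophes and spaces.

    Assumption: the exact character set the model accepts is not stated in
    this thread, so adjust the pattern if needed.
    """
    text = re.sub(r"[^a-z' ]+", " ", text.lower())
    return " ".join(text.split())  # collapse repeated whitespace


def wav_length_seconds(wav_path):
    """Return the duration of a PCM WAV file in seconds."""
    with wave.open(str(wav_path), "rb") as wav_file:
        return wav_file.getnframes() / wav_file.getframerate()


def build_csv(transcript_file, wav_dir, corpus_dir, csv_path):
    """Turn 'name text' lines into path;label;length rows and write them."""
    rows = []
    with open(transcript_file, encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            # The first token is the WAV name, the rest is the transcription.
            name, _, text = line.partition(" ")
            wav_path = Path(wav_dir) / f"{name}.wav"
            rows.append({
                # Path relative to the corpus directory, with forward slashes.
                "path": wav_path.relative_to(corpus_dir).as_posix(),
                "label": normalize_label(text),
                "length": wav_length_seconds(wav_path),
            })

    with open(csv_path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(
            handle, fieldnames=["path", "label", "length"], delimiter=";"
        )
        writer.writeheader()
        writer.writerows(rows)
    return rows
```

It could then be called as, for example, `build_csv("transcripts.txt", "data/corpus/mydata/wav", "data/corpus", "data/train.csv")`, where all four paths are placeholders for your own layout. Lines without a matching WAV file will raise an error rather than being skipped silently.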

Thank you for your answer 👍

Glad to hear it. By the way, in case you are using a free speech corpus, could you link it?