Trained on custom data with the mini_librispeech recipe, but inference gives only 1 speaker for the whole audio file.
saumyaborwankar opened this issue · 1 comment
saumyaborwankar commented
SPEAKER aaak 1 11.40 0.10 <NA> <NA> aaak_4 <NA>
SPEAKER aaak 1 14.00 0.10 <NA> <NA> aaak_4 <NA>
This is the hyp_0.3_1.rttm I got after scoring. For the entire aaak.wav file, only the speaker aaak_4 is detected.
"main/DER": 0.4484034770634306,
"validation/main/DER": 0.5290581162324649,
This is the DER after 200 epochs. Can someone help me understand why inference is detecting just one speaker?
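One quick way to confirm the symptom is to count the distinct speaker labels per recording in the hypothesis RTTM. Here is a minimal sketch; the column layout (field 2 = recording id, field 8 = speaker label) follows the hyp_0.3_1.rttm excerpt above, and the function name is just an illustration:

```python
from collections import defaultdict

def speakers_per_recording(rttm_path):
    """Map each recording id to the sorted set of hypothesized speaker labels."""
    spk = defaultdict(set)
    with open(rttm_path) as f:
        for line in f:
            fields = line.split()
            # RTTM SPEAKER lines: type, rec-id, channel, onset, duration,
            # <NA>, <NA>, speaker-label, <NA>
            if fields and fields[0] == "SPEAKER":
                rec_id, label = fields[1], fields[7]
                spk[rec_id].add(label)
    return {rec: sorted(labels) for rec, labels in spk.items()}
```

Run on the file above, this should return something like `{"aaak": ["aaak_4"]}`, confirming the output really collapsed to a single speaker rather than the scoring step dropping labels.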
aaaa wav_8/aaaa.wav
aaab wav_8/aaab.wav
This is wav.scp (first 2 lines)
aaab-000521-000625 Khanna
aaab-000829-000923 Khanna
This is the utt2spk file
aaab-000521-000625 aaab 5.21 6.25
aaab-000829-000923 aaab 8.29 9.23
This is the segments file
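Note that both utt2spk lines shown above map their utterances to the same speaker, Khanna. If every utterance of a recording maps to one speaker, the reference itself is single-speaker and the model would have nothing to learn about speaker changes. A hedged sanity check, assuming Kaldi-style segments (`utt-id rec-id start end`) and utt2spk (`utt-id spk-id`) files exactly as excerpted; the function name and paths are my own:

```python
from collections import defaultdict

def speakers_per_rec(segments_path, utt2spk_path):
    """Count distinct reference speakers per recording id."""
    utt2spk = {}
    with open(utt2spk_path) as f:
        for line in f:
            utt, spk = line.split()
            utt2spk[utt] = spk
    rec2spk = defaultdict(set)
    with open(segments_path) as f:
        for line in f:
            # segments format: utt-id rec-id start end
            utt, rec = line.split()[:2]
            rec2spk[rec].add(utt2spk[utt])
    return {rec: len(spks) for rec, spks in rec2spk.items()}
```

If this reports 1 for your recordings, the problem is in the data preparation rather than the model; each training recording should contain segments from at least two speakers for diarization to be learnable.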
kli017 commented
Hello, I ran into the same problem while training on the mini_librispeech recipe. I made a 2-speaker, no-overlap dataset, and as the epochs increased the model still detected only 1 speaker. Did you find the reason?