Are there speaker annotations in the unlabeled data?
DongChanS opened this issue · 2 comments
DongChanS commented
Thanks for creating this dataset!
I have one question.
Is the speaker information only in the transcribed data?
If not, is there any unlabeled data that has speaker information?
kahne commented
Thanks for checking with us.
Unfortunately, there is no speaker metadata for the unlabelled data. However, you can use speaker diarization or identification models to label the speakers. The set of speakers is likely small (they are certified interpreters). Also, each speech (by a single speaker) is usually minutes long, and speaker changes are infrequent.
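As a rough illustration of the suggestion above, here is a minimal sketch of the clustering step only. It assumes you already have one speaker embedding per audio segment from a speaker embedding model of your choice (not shown, and not part of this dataset). The function names `assign_speakers` and `smooth`, the similarity threshold of 0.7, and the window size are all hypothetical choices for illustration, not something provided by the dataset or its tooling. Greedy centroid clustering is a simple stand-in for a full diarization system; the smoothing step relies on the observation that speaker changes are infrequent.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def assign_speakers(embeddings, threshold=0.7):
    """Greedy online clustering of per-segment embeddings (hypothetical helper).

    Each segment joins the most similar existing speaker centroid if the
    similarity is at least `threshold`; otherwise it starts a new speaker.
    The threshold is an assumption and needs tuning on real embeddings.
    """
    centroids, counts, labels = [], [], []
    for emb in embeddings:
        sims = [cosine(emb, c) for c in centroids]
        best = max(range(len(sims)), key=sims.__getitem__) if sims else -1
        if best >= 0 and sims[best] >= threshold:
            n = counts[best]
            # Update the running mean of the matched centroid.
            centroids[best] = [
                (c * n + x) / (n + 1) for c, x in zip(centroids[best], emb)
            ]
            counts[best] = n + 1
        else:
            centroids.append(list(emb))
            counts.append(1)
            best = len(centroids) - 1
        labels.append(best)
    return labels


def smooth(labels, window=5):
    """Majority vote over a sliding window (hypothetical helper).

    Removes isolated label flips, which is reasonable when each speaker
    talks for minutes at a time and speaker changes are rare.
    """
    half = window // 2
    out = []
    for i in range(len(labels)):
        neighborhood = labels[max(0, i - half): i + half + 1]
        out.append(Counter(neighborhood).most_common(1)[0][0])
    return out
```

You would call `assign_speakers` on the list of segment embeddings in time order and then pass the result through `smooth`. Because the speaker set is likely small, a sanity check is that the number of distinct labels stays low; if it does not, the threshold is probably too strict.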
kahne commented
I will close this issue for now. Please feel free to reopen it or create a new one if you have more questions.