Are there speaker annotations in the unlabeled data?

Question

DongChanS opened this issue 4 years ago · 2 comments

DongChanS commented 4 years ago

Thanks for creating this data!

I have one question.

Is the speaker information only in the transcribed data?

If not, is there any unlabeled data that has speaker information?

Answer 1 · 2021-03-15T15:56:10.000Z

Thanks for checking with us.

Unfortunately, there is no speaker metadata for the unlabeled data. However, you may leverage speaker diarization/identification models to classify the speakers. The set of speakers is likely small (they are certified interpreters). Also, each speech by a single speaker is usually minutes long, and speaker changes are infrequent.
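
As a rough illustration of the diarization/identification suggestion, here is a minimal sketch of assigning pseudo-speaker IDs by clustering per-segment speaker embeddings. It is not something shipped with the dataset: the embeddings are assumed to come from some pretrained speaker-embedding model (not shown), `assign_speakers` is a hypothetical helper, and the `threshold` value is an arbitrary cosine cutoff that would need tuning on real data.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def assign_speakers(embeddings, threshold=0.8):
    """Greedy online clustering: give each segment a pseudo-speaker ID.

    `embeddings` is a list of per-segment vectors (assumed to be produced
    by an external speaker-embedding model). `threshold` is a hypothetical
    cosine cutoff, not a recommended value.
    """
    centroids = []  # running mean embedding per pseudo-speaker
    counts = []     # number of segments merged into each centroid
    labels = []
    for emb in embeddings:
        sims = [cosine(emb, c) for c in centroids]
        best = max(range(len(sims)), key=sims.__getitem__) if sims else -1
        if best >= 0 and sims[best] >= threshold:
            # Close enough to a known speaker: update that running mean.
            n = counts[best]
            centroids[best] = [(c * n + e) / (n + 1)
                               for c, e in zip(centroids[best], emb)]
            counts[best] = n + 1
        else:
            # Otherwise start a new pseudo-speaker.
            best = len(centroids)
            centroids.append(list(emb))
            counts.append(1)
        labels.append(best)
    return labels


# Toy 2-D vectors standing in for real speaker embeddings.
segments = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9], [1.0, 0.05]]
labels = assign_speakers(segments)
print(labels)  # [0, 0, 1, 1, 0]
```

Because the speaker set is small and each speech runs for minutes, a simple threshold-based scheme like this may be a reasonable starting point, but a dedicated diarization toolkit is likely to be more robust.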

Answer 2 · 2021-03-17T15:52:55.000Z

I will close this issue for now. Please feel free to reopen it or create a new one if you have more questions.