wenet-e2e/WenetSpeech

Is the unlabeled data open source?

rookie0607 opened this issue · 2 comments

Thank you very much for your work, which has been very helpful to me. It seems that the download method you provide can only download the 10,000 hours of labeled data. Is the unlabeled data open source?

Yes, the unlabeled data is also included in the data. However, we only give the labeled segments in the meta file.

You can extract the unlabeled data using the labeled timestamps: the parts of each audio file that fall outside the labeled segments are the unlabeled data.
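As a rough illustration of the suggestion above, the unlabeled regions of one audio file can be computed as the gaps between its labeled segments. This is only a sketch, not code from the WenetSpeech toolkit: the function name `unlabeled_spans`, the `min_len` parameter, and the assumption that you already have each file's total duration and its labeled `(begin, end)` times in seconds are all hypothetical. Check the meta file for the actual field names before adapting it.

```python
def unlabeled_spans(duration, labeled, min_len=0.0):
    """Return (start, end) gaps in [0, duration] not covered by any labeled span.

    duration: total length of the audio in seconds (assumed to be known).
    labeled:  iterable of (begin, end) pairs in seconds for labeled segments.
    min_len:  hypothetical threshold; gaps of this length or shorter are dropped.
    """
    gaps = []
    cursor = 0.0  # end of the region covered so far
    for begin, end in sorted(labeled):
        # Clamp each segment to the audio's bounds.
        begin = min(max(begin, 0.0), duration)
        end = min(max(end, 0.0), duration)
        if begin - cursor > min_len:
            gaps.append((cursor, begin))
        # Overlapping or out-of-order segments never move the cursor backwards.
        cursor = max(cursor, end)
    # Trailing gap after the last labeled segment.
    if duration - cursor > min_len:
        gaps.append((cursor, duration))
    return gaps


if __name__ == "__main__":
    # Made-up timestamps for illustration only.
    segments = [(1.0, 2.5), (2.0, 4.0), (6.0, 7.0)]
    for start, end in unlabeled_spans(10.0, segments):
        print(f"unlabeled: {start:.2f}s to {end:.2f}s")
```

Each returned span can then be cut out of the original audio with whatever tool you normally use for slicing by time.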