What is CEDAC?
Chinese English DAily Conversation (CEDAC) is a Chinese-English bilingual subtitle corpus of nearly one million dialogues, labeled with speaker IDs and scene IDs. It was built by the Center for Speech and Language Technologies of Tsinghua University. At present, only part of the data is open-sourced; we will keep releasing more. You are welcome to follow and star the repository.
The accompanying paper, 《自动构建基于电视剧字幕和剧本的日常会话基础标注库》 (roughly, "Automatically building a basic annotated corpus of daily conversation from TV drama subtitles and scripts"), was published at CCL-2019 and describes the dataset in detail. If the corpus is helpful to your research, please cite the paper.