thu-coai/KdConv

Question about the details of the dataset


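Hello. In the KdConv dataset statistics table, does "Avg. # tokens per utterance" refer to the number of words after word segmentation? Also, for "Avg. # characters per utterance", is the count taken character by character, so that an English word such as "utterance" would be counted as length 9? Thank you!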
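To make the two readings in the question concrete, here is a minimal Python sketch. It is only an illustration: the helper names and sample data are hypothetical, the word segmentation is supplied by hand rather than produced by any particular segmenter, and nothing here confirms how the KdConv statistics were actually computed.

```python
def avg_chars_per_utterance(utterances):
    """Average length when every Unicode character counts as one unit."""
    return sum(len(u) for u in utterances) / len(utterances)


def avg_tokens_per_utterance(segmented_utterances):
    """Average length when every segmented word counts as one unit."""
    return sum(len(tokens) for tokens in segmented_utterances) / len(segmented_utterances)


# Hypothetical sample data (not taken from KdConv).
utterances = ["utterance", "你好"]
# Hand-written segmentation, assumed for illustration only.
segmented = [["utterance"], ["你好"]]

char_avg = avg_chars_per_utterance(utterances)    # (9 + 2) / 2
token_avg = avg_tokens_per_utterance(segmented)   # (1 + 1) / 2
print(char_avg, token_avg)
```

Under the character-level reading, the English word "utterance" contributes 9 to the total, while under the token-level reading it contributes 1.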