modelscope/3D-Speaker

Question about splitting into subsegs

hao-qiang opened this issue · 1 comment

https://github.com/alibaba-damo-academy/3D-Speaker/blob/9a455b3e429519aae91a63f36ae82f9b41423ad5/egs/3dspeaker/speaker-diarization/local/prepare_subseg_json.py#L47

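When splitting a segment into subsegs, the last window at the end of the audio comes out shorter than subseg_dur. Should that window instead be taken backward from the end so that it spans a full subseg_dur, i.e. subseg_st = min(ed-subseg_dur, subseg_st)? With the current code, if the resulting subseg is extremely short, will the embedding model raise an error?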
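
To make the proposal concrete, here is a minimal sketch of the backward-anchored tail window. It is not the code from prepare_subseg_json.py: the function name `split_subsegs`, its signature, and the default values of 1.5 s and 0.75 s are assumptions chosen for illustration. The extra `max(st, ...)` clamp is also my addition, so that the start never moves before the beginning of the segment when the whole segment is shorter than subseg_dur.

```python
def split_subsegs(st, ed, subseg_dur=1.5, subseg_shift=0.75):
    """Split the segment [st, ed] (seconds) into sliding windows.

    Hypothetical helper for illustration only. When the last window
    would be shorter than subseg_dur, its start is moved back so that
    the window ends at ed and covers a full subseg_dur where possible.
    """
    subsegs = []
    subseg_st = st
    while subseg_st < ed:
        subseg_ed = min(subseg_st + subseg_dur, ed)
        if subseg_ed - subseg_st < subseg_dur:
            # Tail window: take subseg_dur backward from the end,
            # but never start before the segment itself.
            subseg_st = max(st, min(ed - subseg_dur, subseg_st))
        subsegs.append((subseg_st, subseg_ed))
        if subseg_ed >= ed:
            break  # reached the end of the segment
        subseg_st += subseg_shift
    return subsegs


if __name__ == "__main__":
    for window in split_subsegs(0.0, 4.0):
        print(window)
```

With a 4.0 s segment the final window becomes (2.5, 4.0) instead of (3.0, 4.0). A segment that is itself shorter than subseg_dur still produces one short window, so this sketch does not settle whether the embedding model accepts very short inputs; such windows would still need to be dropped or padded by the caller.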