What do timings denote
zgerrard opened this issue · 1 comment
zgerrard commented
Hi, do the timings returned by the ctc_segmentation() function denote the time when the corresponding character starts, or the time at the middle of that character (where its probability is highest)?
Thank you.
lumaku commented
In traditional hybrid DNN/HMM ASR, phoneme classes have a duration over multiple time frames. In CTC-based ASR, characters "occur". So, timings denote the most probable time of "occurrence" of a character. This corresponds to a_t in the paper.
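To make the idea of a single "occurrence" time concrete, here is a minimal toy sketch. It is not the library's alignment algorithm: the probability values, the frame duration, and the helper name `occurrence_time` are all hypothetical and chosen only for illustration. It just picks the single most probable frame for one character and converts that frame index to seconds.

```python
# Toy illustration only; this is NOT how ctc_segmentation() computes its output.
# All values and names below are hypothetical.

FRAME_DURATION_S = 0.04  # assumed seconds per CTC output frame

# Hypothetical per-frame probabilities that one given character is emitted.
char_probs = [0.01, 0.05, 0.70, 0.15, 0.02]


def occurrence_time(probs, frame_duration_s):
    """Return the time in seconds of the single most probable frame."""
    best_frame = max(range(len(probs)), key=lambda t: probs[t])
    return best_frame * frame_duration_s


t_occ = occurrence_time(char_probs, FRAME_DURATION_S)
print(f"most probable occurrence at {t_occ:.2f} s")
```

In this toy case the character is reported at frame 2 (0.08 s), a single point in time rather than a start or an end of a duration.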