How to use timesteps?

Question

blankspark opened this issue 3 years ago · 1 comments

I noticed that the output of ctcdecode includes timesteps, which the description says can be used for alignment.
But all I get is a tensor of shape (Batchsize, N_beams, N_timesteps), and I don't know how to use it.

timesteps - Shape: BATCHSIZE x N_BEAMS

The timestep at which the nth output character has peak probability. Can be used as alignment between the audio and the transcript.

Thanks in advance.

Answer 1 · 2023-12-07T11:49:26.000Z

@blankspark, did you ever figure out how to use them? I am looking to get word-level time alignments, but I don't know how to calculate them from the timesteps returned by ctcdecode.
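
For what it's worth, here is one possible way to go from the timesteps to character-level and word-level times. It is only a sketch under several assumptions, not the library's documented method. It assumes that for one beam the first out_len entries of timesteps line up one-to-one with the first out_len decoded token ids, that the label list contains a space character marking word boundaries, and that each output frame of the acoustic model covers a fixed duration (FRAME_MS below is made up; the real value depends on your model's stride and feature hop). The helper names char_times and word_times and all the sample data are hypothetical. Since a timestep is the frame where a character peaks, the resulting word start and end times are approximate.

```python
from typing import List, Tuple


def char_times(tokens: List[int], steps: List[int],
               labels: List[str], frame_ms: int) -> List[Tuple[str, int]]:
    """Pair each decoded character with the time (ms) of its peak frame."""
    return [(labels[tok], step * frame_ms) for tok, step in zip(tokens, steps)]


def word_times(chars: List[Tuple[str, int]],
               space: str = " ") -> List[Tuple[str, int, int]]:
    """Group (char, time) pairs into (word, start_ms, end_ms) at each space."""
    words = []
    current, start, end = "", 0, 0
    for ch, t in chars:
        if ch == space:
            # A space closes the word collected so far, if any.
            if current:
                words.append((current, start, end))
            current = ""
        else:
            if not current:
                start = t  # peak frame of the word's first character
            current += ch
            end = t        # peak frame of the word's last character so far
    if current:
        words.append((current, start, end))
    return words


# Hypothetical data standing in for one beam of the decoder output.
# With real output you would presumably slice the top beam of the first
# utterance by its length, e.g. n = out_lens[0][0], then take
# beam_results[0][0][:n] and timesteps[0][0][:n] and convert them to lists.
labels = ["_", " ", "a", "b", "c"]   # index 0 is the CTC blank (assumption)
tokens = [2, 3, 1, 4, 2, 0, 0]       # "ab ca" followed by padding
steps = [3, 7, 12, 15, 20, 0, 0]     # peak frame of each character
out_len = 5                          # number of valid entries in this beam
FRAME_MS = 20                        # assumed duration of one output frame

chars = char_times(tokens[:out_len], steps[:out_len], labels, FRAME_MS)
words = word_times(chars)
print(words)
```

Slicing by out_len matters because the entries past that length are padding and carry no alignment information. Times are kept in integer milliseconds to avoid floating-point rounding in the output.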