parlance/ctcdecode

beam_scores looks strange when decoding with a 4-gram LM

Nian-Chen opened this issue · 1 comment

The test code is as follows:

import torch
from ctcdecode import CTCBeamDecoder

lm_path = "/tsdata/xsp/w2v2/lm_4gram_fisher.arpa"
# `vocabulary` is the 32-label list given at the end of this issue
pred_decoder = CTCBeamDecoder(
    labels=vocabulary,
    model_path=lm_path,
    alpha=0.3,
    beta=0.0,
    cutoff_top_n=20,
    beam_width=20,
    num_processes=1,
    blank_id=0,
    log_probs_input=False)
# random softmax probabilities with shape (batch, time, vocab) = (1, 10, 32)
output = torch.randn(1, 10, 32).softmax(dim=-1)
beam_results, beam_scores, timesteps, out_lens = pred_decoder.decode(output)
print(beam_scores)

When alpha is set to 0.0 or lm_path is None, there is no problem with beam_scores: the best beam is at index 0 and has the lowest score.

tensor([[21.1820, 21.3059, 21.3574, 21.3628, 21.3686, 21.4190, 21.4813, 21.5382,
         21.5440, 21.5464, 21.5944, 21.6032, 22.4335, 22.5348, 22.5574, 22.6143,
         22.6201, 22.6586, 22.6705, 22.7156]])
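A quick sanity check I use (my own helper, not part of ctcdecode) to confirm each row comes back sorted ascending:

# True when every utterance's beams are in ascending score order
is_sorted = bool((beam_scores[:, 1:] >= beam_scores[:, :-1]).all())
print(is_sorted)  # True for the alpha=0.0 run above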

But when alpha is 0.3, it looks like beam_scores has not been sorted:

tensor([[-274.8671, -274.6832,   24.9853,   25.2312, -273.8381, -273.6203,
           25.8654, -273.5138,   25.6890, -273.3648, -273.3346,   26.1823,
         -273.0129,   25.3955,   26.1171, -272.7036, -272.6336, -272.5328,
           27.2088, -272.3132]])

Do beam_scores only represent the pure prefix CTC beam score?
How can I get sorted beam_scores with LM decoding? (My manual re-sorting workaround is sketched below.)
By the way, the same problem exists when decoding real logits (not random ones): some of the scores are normal, but some are very large.
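For now, my workaround is to re-sort the returned tensors myself. This is only a sketch, assuming lower scores are better (as in the alpha=0.0 case above):

import torch

# Sort each utterance's beams by score, ascending (lower = better)
sorted_scores, order = torch.sort(beam_scores, dim=1)

# Reorder the hypotheses and their lengths to match the sorted scores
sorted_results = torch.gather(
    beam_results, 1, order.unsqueeze(-1).expand_as(beam_results))
sorted_lens = torch.gather(out_lens, 1, order)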
The vocabulary (labels) is as follows:

['_',
 '_',
 '_',
 '_',
 ' ',
 'E',
 'T',
 'A',
 'O',
 'N',
 'I',
 'H',
 'S',
 'R',
 'D',
 'L',
 'U',
 'M',
 'W',
 'C',
 'F',
 'G',
 'Y',
 'P',
 'B',
 'V',
 'K',
 "'",
 'X',
 'J',
 'Q',
 'Z']
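And for completeness, converting the top beam to text with this label list (continuing the re-sorting sketch above):

# Top beam of the first utterance, trimmed to its reported length
best_len = sorted_lens[0][0]
best_tokens = sorted_results[0][0][:best_len]
print("".join(vocabulary[t] for t in best_tokens.tolist()))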

Hope someone can help me~

I'm having the same question. @Nian-Chen, did you solve this?