Integer overflow

Question

rproskuryakov opened this issue 4 years ago · 2 comments

Hello, I've been trying to use ctcdecode and got stuck on the following problem. The outputs from the decoder seem to overflow: they contain values like 159942440 and -288524424. The input has shape (2, 255, 37) and holds log probabilities. The minimum and maximum values in this particular input tensor are -4.0183 and -3.4422. Could you help me? Maybe I am doing something wrong?

labels list:
["_", " ", ...] (length = 37)
Thanks!

Answer 1 · 2020-10-20T19:27:01.000Z

It seems that the overflow depends on the size of the labels list. If the length of the alphabet is lower than 15, everything works fine. If I add one more symbol, everything breaks.

import torch
from ctcdecode import CTCBeamDecoder

T = 255  # number of time steps
N = 2    # batch size
alphabet = list("_ abcdefghijklm")
C = len(alphabet)
decoder = CTCBeamDecoder(
        alphabet,
        beam_width=20,
        num_processes=4,
        blank_id=0,
        log_probs_input=True
)
# Random log probabilities of shape (N, T, C)
batch_outputs = torch.randn(N, T, C).log_softmax(2).detach().cpu()
decoded, *_ = decoder.decode(batch_outputs)
print(decoded.min())
print(decoded.max())

Update: the assumption above is wrong. The overflow is also observed with an alphabet whose length is lower than 15.
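A possible explanation, not confirmed by anyone in this thread: the ctcdecode README states that `decode` returns four values (`beam_results`, `beam_scores`, `timesteps`, `out_lens`) and that only the first `out_lens[i][j]` entries of `beam_results[i][j]` are meaningful. Anything past that length is leftover data, so taking `min()` and `max()` over the whole tensor would pick up values that look like overflow. The sketch below shows the trimming step. The helper name `trim_top_beams` and the sample numbers are made up for illustration, and plain lists stand in for the decoder output so the snippet runs without ctcdecode installed.

```python
def trim_top_beams(beam_results, out_lens):
    """Return the best beam of each batch item, cut to its reported length.

    beam_results: indexable as [batch][beam][time] (nested lists or a tensor)
    out_lens:     indexable as [batch][beam], the valid length of each beam
    """
    trimmed = []
    for item_beams, item_lens in zip(beam_results, out_lens):
        length = int(item_lens[0])  # valid length of the first (best) beam
        # Keep only the valid prefix; the rest is not part of the result.
        trimmed.append([int(label) for label in item_beams[0][:length]])
    return trimmed


# Hypothetical stand-in for decoder output: 2 batch items, 1 beam, 5 time steps.
# The large values mimic the leftover data past each beam's valid length.
fake_results = [
    [[3, 4, 2, 159942440, -288524424]],
    [[5, 2, 7, 7, -288524424]],
]
fake_lens = [[3], [4]]

top_beams = trim_top_beams(fake_results, fake_lens)
print(top_beams)  # [[3, 4, 2], [5, 2, 7, 7]]
```

With the real decoder, the same idea would be to unpack all four return values, for example `beam_results, beam_scores, timesteps, out_lens = decoder.decode(batch_outputs)`, and to inspect only `beam_results[i][j][:out_lens[i][j]]`.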

Answer 2 · 2022-01-11T02:35:27.000Z

I ran into the same problem. Did you manage to solve it?