Integer overflow
rproskuryakov opened this issue · 2 comments
Hello, I've been trying to use ctcdecode and got stuck on the following problem. The outputs from the decoder seem to have overflowed: they contain values like 159942440 and -288524424. The input has shape (2, 255, 37) and holds log probabilities; the minimum and maximum values in this particular input tensor are -4.0183 and -3.4422. Could you help me? Maybe I'm doing something wrong?
labels list:
["_", " ", ...] (length = 37)
Thanks!
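A quick arithmetic check on the numbers reported above, offered as a sketch rather than a diagnosis. For a valid distribution over 37 labels, the log-probabilities cannot all sit above or all sit below the uniform value -log(37), so that value has to fall between the reported minimum and maximum. This is a necessary condition only, and the variable names are made up for illustration.

```python
import math

# Values taken from the report above.
num_classes = 37
reported_min, reported_max = -4.0183, -3.4422

# Log-probability of each label under a uniform distribution.
uniform_log_prob = -math.log(num_classes)

# If every entry were above this value the probabilities would sum to more
# than 1, and if every entry were below it they would sum to less than 1.
looks_valid = reported_min <= uniform_log_prob <= reported_max

print(round(uniform_log_prob, 4), looks_valid)
```

If the check passes, the input range is at least consistent with near-uniform log-probabilities, which would suggest looking at the decoder output rather than at the input.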
It seems that the overflow depends on the size of the label set. If the length of the alphabet is lower than 15, everything works fine. If I add one more symbol, everything breaks.
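import torch
from ctcdecode import CTCBeamDecoder

T = 255  # time steps
N = 2  # batch size
alphabet = list("_ abcdefghijklm")
C = len(alphabet)  # number of labels
decoder = CTCBeamDecoder(
    alphabet,
    beam_width=20,
    num_processes=4,
    blank_id=0,
    log_probs_input=True,
)
# Random log-probabilities of shape (N, T, C)
batch_outputs = torch.randn(N, T, C).log_softmax(2).detach().cpu()
# Pass the tensor defined above (the original snippet passed an undefined name)
decoded, *_ = decoder.decode(batch_outputs)
print(decoded.min())
print(decoded.max())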
Update: the assumption above is wrong. The overflow is observed even with an alphabet shorter than 15 symbols.
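One possible explanation, not confirmed anywhere in this thread: as far as I recall, the ctcdecode README describes `decode` as returning four tensors (`beam_results`, `beam_scores`, `timesteps`, `out_lens`) and notes that only the first `out_lens[b][k]` entries of `beam_results[b][k]` are meaningful, with the rest being leftover data. If that applies here, calling `min()` and `max()` on the whole tensor would pick up those leftovers. The sketch below shows the trimming idea on plain Python lists; `trim_beams` and the `fake_*` values are hypothetical stand-ins, not part of ctcdecode.

```python
def trim_beams(beam_results, out_lens):
    """Keep only the first out_lens[b][k] labels of every beam.

    Hypothetical helper: accepts nested lists, and should also accept
    tensors of shape (batch, beams, time) and (batch, beams).
    """
    return [
        [[int(x) for x in beam[:int(length)]]
         for beam, length in zip(beams, lengths)]
        for beams, lengths in zip(beam_results, out_lens)
    ]


# Made-up stand-in for decoder output: 1 batch item, 2 beams, 5 time steps.
# Entries past each beam's valid length imitate the garbage values reported.
fake_beam_results = [[[3, 7, 2, 159942440, -288524424],
                      [3, 7, 0, 0, -288524424]]]
fake_out_lens = [[3, 2]]

trimmed = trim_beams(fake_beam_results, fake_out_lens)
print(trimmed)
```

With a real decoder, the assumed usage would be to unpack all four return values, for example `beam_results, beam_scores, timesteps, out_lens = decoder.decode(batch_outputs)`, and inspect only the trimmed beams.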
I'm running into the same problem. Did you manage to solve it?