lemon0830/TIM

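When computing r_loss, it seems that padding_idx positions are also taken into account. If the lengths of output and bad_output differ greatly, will this have a large effect on the result?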
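To make the concern concrete, here is a minimal toy sketch of how padded positions could skew a length-averaged ranking loss. It is not taken from the TIM code: `PAD_ID`, `mean_score`, `hinge_rank_loss`, the margin value, and the per-token scores are all hypothetical, and it assumes padded positions contribute a score of 0.0 and that the sequence score is a mean over positions.

```python
PAD_ID = 0  # hypothetical padding index, not taken from the TIM code


def mean_score(token_ids, token_scores, ignore_pad):
    """Average per-token scores, optionally skipping padded positions."""
    if ignore_pad:
        kept = [s for t, s in zip(token_ids, token_scores) if t != PAD_ID]
    else:
        kept = list(token_scores)
    return sum(kept) / len(kept) if kept else 0.0


def hinge_rank_loss(good, bad, margin=1.0):
    """Toy margin ranking loss: zero once `good` beats `bad` by `margin`."""
    return max(0.0, margin - (good - bad))


# output: 2 real tokens followed by 4 pads (assumed to score 0.0)
out_ids = [5, 6, PAD_ID, PAD_ID, PAD_ID, PAD_ID]
out_scores = [2.0, 2.0, 0.0, 0.0, 0.0, 0.0]

# bad_output: 6 real tokens, no padding
bad_ids = [7, 8, 9, 10, 11, 12]
bad_scores = [1.0] * 6

# Padding included: the short output's mean is diluted by the pad positions.
unmasked = hinge_rank_loss(
    mean_score(out_ids, out_scores, ignore_pad=False),
    mean_score(bad_ids, bad_scores, ignore_pad=False),
)

# Padding excluded: each mean is taken over real tokens only.
masked = hinge_rank_loss(
    mean_score(out_ids, out_scores, ignore_pad=True),
    mean_score(bad_ids, bad_scores, ignore_pad=True),
)
```

In this toy setup the unmasked loss is larger than the masked one only because the shorter output is averaged over its padding. Whether the actual r_loss behaves this way depends on how the repository pools per-token scores, which this sketch does not confirm.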

Opened this issue · 0 comments