IDEA-CCNL/Fengshenbang-LM

questions about the padding value

iridescentee opened this issue · 1 comment

pad_token_id: -100
decoder_start_token_id: 0

from torch.nn.utils.rnn import pad_sequence

for k, v in batch.items():
    if k != "labels" and k != "idx":
        # input_ids, attention_mask, etc. are padded with
        # self.pad_token_id, which the config above sets to -100
        batch[k] = pad_sequence(
            v, batch_first=True, padding_value=self.pad_token_id
        )
    elif k == "labels":
        batch[k] = pad_sequence(v, batch_first=True, padding_value=-100)

First of all, thank you for your code. It has helped me a lot.

I have a small question about how you pad the input sequences. In lines 97-98, you set the pad token id to -100. Usually, setting a token's label to -100 means its loss should be ignored, so I do not see why you also set the padding value of input_ids and attention_mask [lines 115-121] to -100. Are these lines wrong, and should I change the padding value to 0?
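
For reference, my understanding of why -100 is special for labels: PyTorch's cross-entropy loss ignores target positions equal to ignore_index, which defaults to -100, while input_ids and attention_mask are consumed by the model directly, so padding them with -100 would produce invalid token ids and mask values. A minimal sketch in plain PyTorch (the tensors and shapes here are illustrative, not taken from this repo):

import torch
import torch.nn.functional as F
from torch.nn.utils.rnn import pad_sequence

# Two label sequences of different lengths, padded with -100.
labels = pad_sequence(
    [torch.tensor([5, 7, 2]), torch.tensor([3])],
    batch_first=True,
    padding_value=-100,
)  # tensor([[5, 7, 2], [3, -100, -100]])

vocab_size = 10
logits = torch.randn(2, 3, vocab_size)  # (batch, seq_len, vocab)

# F.cross_entropy defaults to ignore_index=-100, so the two padded
# label positions contribute nothing to the loss.
loss = F.cross_entropy(logits.view(-1, vocab_size), labels.view(-1))

# By contrast, an input_ids tensor padded with -100 would index outside
# the embedding table, and an attention_mask is expected to contain 0/1.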