Some problems about Bert
tfighting opened this issue · 2 comments
tfighting commented
line 70: index = randint(0, vocab_size - 1) # random index in vocabulary.
I think the replacement index shouldn't include '[CLS]', '[SEP]' or '[MASK]'!
bruce1408 commented
Yes, that's right. So the code should be changed like this:
r = random()  # draw once, so the split is really 80% / 10% / 10%
if r < 0.8:  # 80%: replace with [MASK]
    input_ids[pos] = word_dict['[MASK]']
elif r > 0.9:  # 10%: replace with a random ordinary token
    index = randint(0, vocab_size - 1)
    while index < 4:  # {'[PAD]': 0, '[CLS]': 1, '[SEP]': 2, '[MASK]': 3} are all meaningless as replacements
        index = randint(0, vocab_size - 1)
    input_ids[pos] = index
lukysummer commented
How about just:
index = randint(4, vocab_size - 1)
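For reference, here is a minimal self-contained sketch of how that one-liner could fit into the 80/10/10 masking step. It assumes the special tokens occupy ids 0 to 3 as in the dict quoted above and that vocab_size is greater than 4. The names mask_token, SPECIAL_TOKENS and NUM_SPECIAL are made up for this illustration and are not from the repo.

```python
from random import Random

# Assumed layout, taken from the dict quoted earlier in this thread.
SPECIAL_TOKENS = {'[PAD]': 0, '[CLS]': 1, '[SEP]': 2, '[MASK]': 3}
NUM_SPECIAL = len(SPECIAL_TOKENS)  # ordinary tokens start at this id


def mask_token(token_id, vocab_size, rng):
    """Apply the 80/10/10 rule to one position picked for prediction.

    Hypothetical helper: returns the id that should be written to input_ids[pos].
    """
    r = rng.random()  # single draw, so the three branches are 80% / 10% / 10%
    if r < 0.8:
        return SPECIAL_TOKENS['[MASK]']  # 80%: replace with [MASK]
    if r < 0.9:
        # 10%: random ordinary token; randint includes both endpoints,
        # so starting at NUM_SPECIAL skips [PAD], [CLS], [SEP] and [MASK]
        return rng.randint(NUM_SPECIAL, vocab_size - 1)
    return token_id  # 10%: keep the original token


rng = Random(0)  # seeded only to make the demo repeatable
samples = [mask_token(10, 30, rng) for _ in range(10000)]
```

Starting the range at 4 removes the retry loop, but it only works while the special tokens stay at the lowest ids. If the vocabulary layout changes, the lower bound has to change with it.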