dhlee347/pytorchic-bert

Padding bugs on data preprocess

Closed this issue · 2 comments

On this code line,
the pad index 0 is the same as the first segment index.
So the padded positions cannot be told apart from the first segment, and the segment information may not be represented exactly.

I merged your pull requests.

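The pad index 0 does not affect accuracy, since the padded positions are masked.
Further, such requests cause an out-of-bounds problem when loading the pre-trained models, presumably because the pre-trained segment embedding table only has entries for the existing segment indices, so a separate pad index would fall outside it.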
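
For anyone reading this later, here is a rough sketch of the two points above. It is not the repository's code: `pad_features`, `lookup`, and the toy ids are hypothetical, and the two-row segment table is an assumption based on the released BERT checkpoints using two segment types.

```python
# Hypothetical sketch; names and values are illustrative, not from pytorchic-bert.

def pad_features(input_ids, segment_ids, max_len, pad_id=0):
    """Pad token and segment ids to max_len and build the input mask."""
    n_pad = max_len - len(input_ids)
    input_mask = [1] * len(input_ids) + [0] * n_pad  # 1 = real token, 0 = padding
    return (input_ids + [pad_id] * n_pad,
            segment_ids + [pad_id] * n_pad,
            input_mask)

def lookup(table, ids):
    """Plain embedding lookup; raises IndexError for ids outside the table."""
    return [table[i] for i in ids]

# Assumed: a pre-trained segment table with two rows (segment 0 and segment 1).
segment_table = [[0.1, 0.2], [0.3, 0.4]]

# Toy sentence pair: three tokens in segment 0, two tokens in segment 1.
input_ids, segment_ids, input_mask = pad_features(
    [5, 6, 7, 8, 9], [0, 0, 0, 1, 1], max_len=8)

# Padded positions reuse segment index 0, but the mask marks them as padding,
# so they can still be separated from real segment-0 tokens.
real_segments = [s for s, m in zip(segment_ids, input_mask) if m == 1]

# Using a separate pad index (2 here) does not fit in the two-row table.
try:
    lookup(segment_table, [0, 0, 1, 2])
    out_of_bounds = False
except IndexError:
    out_of_bounds = True
```

As long as the model applies `input_mask` in attention and in the loss, the segment index assigned to padded positions should not matter, which is why index 0 can be reused there.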