How to support padding in the training dataset?
mrhimanshu opened this issue · 2 comments
mrhimanshu commented
How can I support padding in the training dataset?
dustinwloring1988 commented
I would just add it in the fineweb.py script, at the point where the rows are tokenized.
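A rough sketch of what that padding step could look like, assuming each row has already been tokenized into a list of integer ids. The helper `pad_rows`, the `PAD_ID` value, and the `max_len` parameter are hypothetical names for illustration and are not taken from fineweb.py. The mask it returns is there so padded positions can be excluded from the loss.

```python
from typing import List, Tuple

# Hypothetical pad id: choose a value your tokenizer never emits for real text.
PAD_ID = 0


def pad_rows(
    rows: List[List[int]], max_len: int, pad_id: int = PAD_ID
) -> Tuple[List[List[int]], List[List[int]]]:
    """Right-pad (or truncate) each tokenized row to exactly max_len ids.

    Returns the padded rows and a parallel mask with 1 for real tokens
    and 0 for padding.
    """
    padded, mask = [], []
    for ids in rows:
        ids = ids[:max_len]              # truncate rows longer than max_len
        n_pad = max_len - len(ids)       # number of pad slots to append
        padded.append(ids + [pad_id] * n_pad)
        mask.append([1] * len(ids) + [0] * n_pad)
    return padded, mask


if __name__ == "__main__":
    tokens, attn = pad_rows([[5, 6, 7], [8]], max_len=4)
    print(tokens)
    print(attn)
```

If the tokenizer has no dedicated pad token, one option is to reuse an existing special token as the pad id, as long as the mask is used to keep those positions out of the loss.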
dustinwloring1988 commented
@mrhimanshu sorry, I forgot to tag you.