Custom Dataset
appledora opened this issue · 0 comments
appledora commented
I'm trying to use this repo to train ELECTRA from scratch for Bangla. My dataset is a CSV file where each row is a document.
Would the default openwebtext/preprocess.py file help here? What else might I need to modify? Thanks!
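
For concreteness, here is a minimal sketch of flattening a CSV like this into plain text with one document per line. The column name `text` and the function name `csv_to_text` are assumptions made for illustration. Whether openwebtext/preprocess.py accepts this layout as-is is not confirmed here.

```python
import csv


def csv_to_text(csv_path, out_path, column="text"):
    """Write each non-empty document in the CSV as one line of plain text.

    `column` is a hypothetical header name; change it to match the real CSV.
    Returns the number of documents written.
    """
    count = 0
    with open(csv_path, newline="", encoding="utf-8") as src, \
            open(out_path, "w", encoding="utf-8") as dst:
        for row in csv.DictReader(src):
            # Collapse newlines and repeated spaces so one document is one line.
            doc = " ".join((row.get(column) or "").split())
            if doc:  # skip rows whose document cell is empty
                dst.write(doc + "\n")
                count += 1
    return count
```

Reading and writing with `encoding="utf-8"` keeps the Bangla text intact, and opening the CSV with `newline=""` lets the `csv` module handle cells that contain line breaks.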