
How to split the data?

Hi Alexander,

Thanks for publishing the dataset. After downloading the data, how should I split it into train/valid/test sets so that my results are fairly comparable with your experimental results?


Hi Tyson,

Thank you for your interest in our work! We provide the split as lists of IDs in a .json file; the IDs correspond to the example_id in the .jsonl file. For example, data-non-processed/nyt.ids.json contains lists of train, val, and test IDs, which can be matched against the records in data-non-processed/nyt.jsonl.
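
A minimal sketch of that matching step in Python, in case it helps. It assumes the .ids.json file is a JSON object whose keys are the split names and whose values are lists of IDs, and that each line of the .jsonl file is a JSON object with an example_id key. The exact key names and layout are assumptions, so please check them against the actual files. The function name load_splits is just illustrative.

```python
import json


def load_splits(ids_path, jsonl_path):
    """Group JSONL records by split, using the ID lists in ids_path."""
    # Assumed layout: {"train": [...], "val": [...], "test": [...]}
    with open(ids_path, encoding="utf-8") as f:
        split_ids = json.load(f)

    # Map each ID to its split name so each record needs one dict lookup.
    id_to_split = {
        example_id: split
        for split, ids in split_ids.items()
        for example_id in ids
    }

    splits = {split: [] for split in split_ids}
    with open(jsonl_path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            record = json.loads(line)
            # Assumed key name: "example_id"
            split = id_to_split.get(record["example_id"])
            if split is not None:
                splits[split].append(record)
    return splits
```

Calling load_splits("data-non-processed/nyt.ids.json", "data-non-processed/nyt.jsonl") would then return a dict of record lists keyed by split name. Records whose ID appears in none of the lists are skipped.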

Thanks for your clear explanation!