grammatical/pretraining-bea2019

Can you provide the data used in the experiments?

Lavine24 opened this issue · 2 comments

That's awesome work, thanks for sharing this code. Could you please share the data you describe in the README? Thank you.

Do you have any plans to release your synthetic data?

Hi,
The synthetic part of the data is available from: http://data.statmt.org/romang/gec-bea19/synthetic/
A better, newer version of the data (with noise applied before subword splitting) can be found here: http://data.statmt.org/romang/gec-wnut19/data.en.tgz
Due to licensing, I have been sharing the complete parallel data via email, so if you need it, please drop me an email.
Best