Apostrophes removed in preprocessing?
nschneid opened this issue · 2 comments
nschneid commented
Looking through the data, there are a LOT of sentences where clitics are tokenized off but lack an apostrophe. Is that just the genre, or did the apostrophes get lost in preprocessing?
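For illustration, a rough way to count such cases might look like the sketch below. It assumes each sentence is available as a plain list of token strings, and the clitic inventory (`BARE_CLITICS`) and the function name `find_bare_clitics` are hypothetical, not taken from the corpus or its tooling. It flags non-initial tokens such as `nt` or `s` that would normally be written `n't` or `'s`.

```python
# Hypothetical inventory of English clitics with the apostrophe stripped
# (an assumption for illustration, not the corpus's actual tokenization rules).
BARE_CLITICS = {"nt", "ll", "re", "ve", "m", "d", "s"}


def find_bare_clitics(tokens):
    """Return (index, token) pairs for non-initial tokens that look like
    clitics missing their apostrophe."""
    return [
        (i, tok)
        for i, tok in enumerate(tokens)
        if i > 0 and tok.lower() in BARE_CLITICS
    ]


# Example sentence tokenized with clitics split off but apostrophes missing.
print(find_bare_clitics(["I", "do", "nt", "think", "he", "s", "here"]))
```

This is only a heuristic: single-letter forms like `s` or `d` can be legitimate tokens in some genres, so any hits would still need manual review.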
nschneid commented
This is indeed a preprocessing issue. Will try to fix along with some others.
nschneid commented
Corrected wordforms: 7dad014#commitcomment-57416463