mgormley/agiga

Input XML data contains unexpected whitespace around certain tokens.


Issue #1 and pull request #2 demonstrated this problem and addressed it for the specific case of the CoNLL-style printout. However, it should be fixed in the reader itself so that odd newlines don't appear elsewhere.
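
To make the proposal concrete, here is a minimal sketch of trimming at read time rather than at print time. It is not the code in agiga: the class name `TrimmedTokenReader`, the method `readWords`, and the `<word>` element name are assumptions made for illustration, and the standard StAX API is used here only to keep the example self-contained.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper, not part of agiga: shows whitespace being
// stripped once, in the reader, so every consumer sees clean tokens.
public class TrimmedTokenReader {

    /**
     * Returns the text of every <word> element in the given XML,
     * with leading and trailing whitespace (including newlines) removed.
     * The element name "word" is an assumption about the input format.
     */
    public static List<String> readWords(String xml) {
        List<String> words = new ArrayList<>();
        try {
            XMLStreamReader reader = XMLInputFactory.newInstance()
                    .createXMLStreamReader(new StringReader(xml));
            while (reader.hasNext()) {
                if (reader.next() == XMLStreamConstants.START_ELEMENT
                        && "word".equals(reader.getLocalName())) {
                    // Trim here, at the point of reading, instead of
                    // in each individual output routine.
                    words.add(reader.getElementText().trim());
                }
            }
            reader.close();
        } catch (XMLStreamException e) {
            throw new IllegalArgumentException("Malformed XML input", e);
        }
        return words;
    }
}
```

With this approach, the CoNLL-style printer and any other output path receive the same trimmed strings, so no per-format workaround is needed.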

This is fixed in the branch fb/trim_whitespace, but it's not yet tested on the full corpus.