knaw-huc/globalise-tools

Use manually added paragraph data in conll export

Closed · 1 comment

Currently, the sentence division in the CoNLL export is based on the sentences recognized by spaCy.
Make a new CoNLL export that uses the paragraph endings defined in globalise-word-joins-MH.csv and ignores the sentences deduced by spaCy.
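
A minimal sketch of what such an export could look like, assuming the paragraph endings have already been read from the CSV into a set of token indices. The function name `to_conll`, the parameter `paragraph_end_indices`, and the two-column output (position, token) are hypothetical and not taken from the repository. The layout of globalise-word-joins-MH.csv is not described here, so reading it is left out.

```python
from typing import List, Set


def to_conll(tokens: List[str], paragraph_end_indices: Set[int]) -> str:
    """Render tokens as CoNLL-style text, one token per line.

    A blank line is emitted after every token whose index is in
    paragraph_end_indices (hypothetical input), so that blocks follow
    the manually marked paragraphs instead of spaCy sentences.
    """
    lines: List[str] = []
    position = 1  # token counter, restarts in every paragraph block
    for index, token in enumerate(tokens):
        lines.append(f"{position}\t{token}")
        position += 1
        if index in paragraph_end_indices:
            lines.append("")  # blank line closes the paragraph block
            position = 1
    # Drop any trailing blank line so the output ends with one newline.
    while lines and lines[-1] == "":
        lines.pop()
    return "\n".join(lines) + "\n" if lines else ""
```

spaCy sentence boundaries are not consulted here at all; only the supplied paragraph endings decide where a block ends.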

brambg commented

We're not using CoNLL as an import/export format anymore.