nert-nlp/streusle

Data format consistency: non-initial word of strong MWE duplicated in LEXLEMMA field

nschneid opened this issue · 2 comments

LEXLEMMA should be empty for non-initial tokens of a strong MWE, but 2 tokens violate this ("appointment", "in"): each repeats the word itself in the LEXLEMMA field.

There is also 1 token that is not part of a weak MWE yet has a WLEMMA ("all").
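A rough sketch of how both conditions could be checked automatically. This is not the repo's own validation code. It assumes each token has already been loaded into a dict with hypothetical lowercase keys (`form`, `smwe`, `lexlemma`, `wmwe`, `wlemma`), that empty fields are written as `_`, and that the MWE fields look like `group:position` (e.g. `1:2` for the second token of MWE 1).

```python
EMPTY = "_"  # assumed placeholder for an empty field


def find_lemma_inconsistencies(tokens):
    """Return (form, message) pairs for tokens that break either rule.

    `tokens` is a list of dicts with the hypothetical keys
    form, smwe, lexlemma, wmwe, wlemma.
    """
    problems = []
    for tok in tokens:
        # Rule 1: non-initial tokens of a strong MWE must have no LEXLEMMA.
        smwe = tok["smwe"]
        if smwe != EMPTY:
            _, position = smwe.split(":")  # assumed "group:position" format
            if int(position) > 1 and tok["lexlemma"] != EMPTY:
                problems.append(
                    (tok["form"], "LEXLEMMA on non-initial strong MWE token")
                )
        # Rule 2: a token outside any weak MWE must have no WLEMMA.
        if tok["wmwe"] == EMPTY and tok["wlemma"] != EMPTY:
            problems.append((tok["form"], "WLEMMA outside a weak MWE"))
    return problems
```

Running this over every sentence and failing when the returned list is non-empty would flag the 3 tokens above, provided the field layout matches these assumptions.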

Fixed in 93fb01b, but not yet propagated to the splits.

Fully fixed in #47