Can't run demo "Jaccard Similarity between Dependency Trees"
Closed this issue · 2 comments
knit-bee commented
The error seems to occur for sentences containing contractions of a preposition and an article (e.g. 'im', 'zur'). The CoNLL-U data has an extra row for the contracted form in addition to the rows for the underlying separate forms, so the index of such a token becomes, for example, (13, '-', 14) (see traceback).
Maybe you can just skip the row with the contracted form?
ulf1 commented
I see
...
((13, '-', 14), None, '_', 'im'),
(13, 16, 'case', 'in'),
(14, 16, 'det', 'dem'),
...
(13, '-', 14) is actually the ID range 13-14, which is additional information that we don't need at this stage.
https://universaldependencies.org/format.html#words-tokens-and-empty-nodes
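For illustration, skipping those rows could look roughly like the sketch below. It assumes the rows arrive as plain tuples in the shape shown above, (id, head, deprel, form), and that only rows with an integer ID are wanted; the helper name `drop_non_word_rows` is made up for this example and is not part of the demo or of any library.

```python
# Rows in the shape shown above: (id, head, deprel, form).
# The first row is the multiword token "im"; the other two are its parts.
rows = [
    ((13, "-", 14), None, "_", "im"),
    (13, 16, "case", "in"),
    (14, 16, "det", "dem"),
]


def drop_non_word_rows(rows):
    """Keep only rows whose ID is a plain integer.

    Hypothetical helper: range IDs such as (13, '-', 14) are tuples,
    so an isinstance check on the ID is enough to filter them out.
    """
    return [row for row in rows if isinstance(row[0], int)]


words = drop_non_word_rows(rows)
print(words)
```

If empty nodes are also represented with tuple IDs, the same check would drop them as well, which may or may not be desired for the similarity computation.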