amir-zeldes/DepEdit

Fails for conllu ellipsis token with ID 10.1

Closed this issue · 2 comments

Yes, currently DepEdit does not support 'ellipsis' tokens; the assumption is that all token IDs are integers. I'll need to check and see how difficult it is to support this. Maybe just cast to float?

One consequence of this is that token distance definitions for rules like #1.1,5#2 might be compromised (if you say 'within 5 tokens', does that include ellipsis tokens?). But in practice I suppose the main objective is just to make sure it doesn't crash due to these... Thanks for reporting!

It seems changing it to float basically works, though I'm rounding the ellipsis ID for distance-checking purposes. But maybe that's actually desirable: if we have ID 10 and ID 11, we may want to say that they're adjacent, irrespective of any ellipsis tokens between them (just a caveat). I'll commit a fix.
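
For anyone reading along, here is a rough sketch of the idea described above. It is not the actual DepEdit code; the function names `parse_token_id`, `token_distance` and `within_distance` are made up for illustration. It assumes IDs are plain integers or decimals like `10.1`, and it ignores direction and multiword ranges like `1-2`.

```python
def parse_token_id(raw):
    """Parse a CoNLL-U ID field as a float, e.g. '10' -> 10.0, '10.1' -> 10.1.

    Hypothetical helper: multiword token ranges such as '1-2' are not handled
    here and would raise ValueError.
    """
    return float(raw)


def token_distance(id_a, id_b):
    """Distance between two tokens, with IDs rounded to the nearest integer.

    Rounding means an ellipsis token such as 10.1 counts as position 10,
    so it adds nothing to the distance between tokens 10 and 11.
    This sketch ignores direction (which token comes first).
    """
    return abs(round(id_b) - round(id_a))


def within_distance(id_a, id_b, min_dist, max_dist):
    """Check a distance constraint in the spirit of '#1.1,5#2' (1 to 5 tokens)."""
    return min_dist <= token_distance(id_a, id_b) <= max_dist


ids = [parse_token_id(s) for s in ["10", "10.1", "11"]]
print(token_distance(ids[0], ids[2]))         # tokens 10 and 11
print(token_distance(ids[1], ids[2]))         # ellipsis token 10.1 and token 11
print(within_distance(ids[0], ids[2], 1, 5))  # is 11 within 1 to 5 tokens of 10?
```

With this approach, IDs 10 and 11 stay adjacent whether or not 10.1 sits between them, which is the caveat mentioned above. One thing to watch is that Python's built-in `round` rounds exact halves to the nearest even integer, so an ID like `10.5` would also land on 10.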