marty-oehme/pubs-extract

Improve annotation duplication detection in notes

Currently, the plugin does a 1:1 exact match for existing annotations in notes.

Since we already have a Levenshtein distance calculation, we could use it to make the annotation comparison more lenient (e.g. to still match annotations whose spelling mistakes were fixed, or to which words missed by auto-extraction were added).
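
A rough sketch of what such a lenient comparison could look like. The function names, the normalization by the longer string's length, and the `0.8` threshold are all assumptions for illustration, not the plugin's existing code; the pure-Python `levenshtein` below is only a stand-in for whatever distance calculation the plugin already uses.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance (insert, delete, substitute); stand-in for the existing calculation."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution
            ))
        previous = current
    return previous[-1]


def similarity(a: str, b: str) -> float:
    """Map the edit distance to a 0..1 score (1.0 means identical)."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(a, b) / longest


def is_duplicate(extracted: str, existing: list[str], threshold: float = 0.8) -> bool:
    """Hypothetical check: is any existing annotation 'close enough' to the new one?"""
    return any(similarity(extracted, old) >= threshold for old in existing)
```

With this, `is_duplicate("The quick brown fox", ["The quikc brown fox"])` would count as a duplicate, while an unrelated sentence would not. The threshold would need tuning against real notes.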

The hardest part would presumably be finding the range that should be compared to the extracted note: we cannot compare the whole document, so which existing annotations should we compare, and how do we delimit annotations?
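
One possible answer to the delimiting question, purely as a starting point for discussion: treat every blank-line-separated block in the note as one candidate annotation. This assumes annotations are written as separate paragraphs, which may not hold for every note format; `split_annotations` is a hypothetical helper, not something that exists in the plugin.

```python
import re


def split_annotations(note_text: str) -> list[str]:
    """Split a note into candidate annotations.

    Assumption: annotations are separated by one or more blank lines.
    Whitespace inside each block is collapsed so that re-wrapped lines
    do not count as differences in a later comparison.
    """
    blocks = re.split(r"\n\s*\n", note_text)
    return [" ".join(block.split()) for block in blocks if block.strip()]
```

Each returned block could then be compared individually against a newly extracted annotation, instead of comparing against the whole document.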