acoli-repo/olia

Add system:hasLemma and system:hasLemmaMatching for word-specific tags in annotation models


Extension to make sure that word-specific tags can be reproduced when mapping tags via OLiA.

Suggestion:

  • If hasLemma is defined in an annotation model, a particular instance is restricted to words with this exact lemma. hasLemma must be unique; if multiple lemmas are to be matched, use hasLemmaMatching instead.
  • If hasLemmaMatching is defined in an annotation model, a particular instance is restricted to words whose lemma matches this exact regular expression.
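
The intended semantics of the two properties could be sketched roughly as follows. This is only an illustration in Python, not OLiA code: the `TagInstance` class, its field names, and the `admits` method are hypothetical, and the tag `Q` and the example patterns are made up. It also assumes that "matches" means the regular expression must cover the whole lemma, which the suggestion above does not spell out.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TagInstance:
    """Hypothetical stand-in for one individual in an annotation model."""
    tag: str
    has_lemma: Optional[str] = None           # models system:hasLemma (at most one value)
    has_lemma_matching: Optional[str] = None  # models system:hasLemmaMatching (a regex)

    def admits(self, tag: str, lemma: str) -> bool:
        """Return True if a word with this tag and lemma fits the instance."""
        if tag != self.tag:
            return False
        # hasLemma: the lemma must be exactly the given string.
        if self.has_lemma is not None and lemma != self.has_lemma:
            return False
        # hasLemmaMatching: assumed here to require a match of the whole lemma.
        if self.has_lemma_matching is not None:
            if re.fullmatch(self.has_lemma_matching, lemma) is None:
                return False
        return True


# Illustrative values only; the real LOB model may restrict ABN differently.
all_abn = TagInstance(tag="ABN", has_lemma="all")
# "Q" is an invented tag used to show the multi-lemma case.
quantifier = TagInstance(tag="Q", has_lemma_matching="both|half")
```

With these definitions, `all_abn.admits("ABN", "all")` holds while `all_abn.admits("ABN", "mother")` does not, and `quantifier` accepts the lemmas "both" and "half" but rejects "bothers" because of the whole-lemma assumption.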

From a review of an LREC-2020 paper:

Some annotation schemes devise classes so that particular words
will always have the same tag, even though in particular sentences
they have different uses.  In LOB, for example (which I take as an
example because the manual is handy), the occurrences of "all" in
the two sentences "All mothers go there" and "Let all pray for
peace" are both tagged ABN.

Other annotation schemes may be devised which attempt to tag the
attributive and pronominal uses of "all" with different tags.

In the one scheme, the concepts of the tag set relate (at least in
these cases) to word forms like "all", and the intension of the
tag is, roughly "a word form which may sometimes be used as a
determiner and sometimes as a pronoun and ... (further
specification)".  In the other, the concepts of the tag set
relate, in the same cases, not to word forms but to word
occurrences, and the intension of the tags will be "a word token
used as determiner of a noun ..." and "a word token used
pronominally ...".