Wrong POS for "keine": PRON instead of DET
GeorgeS2019 opened this issue · 7 comments
Ich habe keine Übungen gemacht, weil ich keine Lust habe. (English: "I didn't do any exercises because I don't feel like it.")
Stanza tags keine as DET.
CoreNLP 4.5.6 (with the corresponding 4.5.6 German model) tags keine as PRON.
The data used to train the Stanza tagger was ud-treebanks-v2.13/UD_German-GSD/de_gsd-ud-train.conllu, where keine is treated as DET.
The CoreNLP tagger has not been retrained since UD 2.4, where the standard was to treat keine as PRON.
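One way to check how a given treebank release annotates a form is to count the UPOS tags it receives in the CoNLL-U training file. The following is a minimal sketch, not the maintainers' tooling: the class name `UposCounter` and the method `countUpos` are hypothetical, and it relies only on the standard CoNLL-U layout (tab-separated columns, FORM in column 2, UPOS in column 4).

```java
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper: tallies the UPOS tags assigned to one word form.
final class UposCounter {

    // lines: raw CoNLL-U lines; form: the word form to look up (case-insensitive).
    static Map<String, Integer> countUpos(List<String> lines, String form) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String line : lines) {
            // Skip sentence separators and comment lines such as "# text = ...".
            if (line.isEmpty() || line.startsWith("#")) {
                continue;
            }
            String[] cols = line.split("\t");
            if (cols.length < 4) {
                continue;
            }
            // Skip multiword-token ranges ("1-2") and empty nodes ("1.1"),
            // which do not carry a UPOS tag of their own.
            if (cols[0].contains("-") || cols[0].contains(".")) {
                continue;
            }
            if (cols[1].equalsIgnoreCase(form)) {
                counts.merge(cols[3], 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) throws Exception {
        // args[0] is the path to a CoNLL-U file, e.g. de_gsd-ud-train.conllu.
        List<String> lines = Files.readAllLines(Paths.get(args[0]));
        System.out.println(countUpos(lines, "keine"));
    }
}
```

Running this against the same file from two UD releases would show whether the tag distribution for keine changed between them.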
Retraining taggers with updated data is less of a hassle than the general feature additions you've been requesting, so we'll put updated data for some of those models on the list.
I have tried to contact @manning through LinkedIn regarding CoreNLP 4.5.6, specifically the German 4.5.6 model.
I also have an issue with the dependency parsing results. Hopefully, this will go away once the German POS assignment is correct.
@AngledLuffa
I am comparing the CoreNLP German output through code with that of Stanza.
I understand that the online CoreNLP demo is no longer running, so it will take a few extra steps to compare CoreNLP 4.5.6 with the latest Stanza.
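For the comparison itself, a small diff over the two tag sequences may be enough. This is a rough sketch under stated assumptions: the class `TagDiff` and its `diff` method are hypothetical, both taggers are assumed to produce the same tokenization, and the tags are assumed to have already been exported from each tool as plain strings.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: reports positions where two taggers disagree.
final class TagDiff {

    // Returns one tab-separated line per disagreement: index, token, tag A, tag B.
    static List<String> diff(List<String> tokens, List<String> tagsA, List<String> tagsB) {
        if (tokens.size() != tagsA.size() || tokens.size() != tagsB.size()) {
            throw new IllegalArgumentException("token and tag lists must have equal length");
        }
        List<String> mismatches = new ArrayList<>();
        for (int i = 0; i < tokens.size(); i++) {
            if (!tagsA.get(i).equals(tagsB.get(i))) {
                mismatches.add(i + "\t" + tokens.get(i) + "\t" + tagsA.get(i) + "\t" + tagsB.get(i));
            }
        }
        return mismatches;
    }

    public static void main(String[] args) {
        // Only the two "keine" tokens are listed, because their tags are the
        // only ones reported in this thread.
        List<String> tokens = Arrays.asList("keine", "keine");
        List<String> stanza = Arrays.asList("DET", "DET");
        List<String> coreNlp = Arrays.asList("PRON", "PRON");
        diff(tokens, stanza, coreNlp).forEach(System.out::println);
    }
}
```

If the two tools tokenize a sentence differently, the lists would need to be aligned first; this sketch rejects unequal lengths rather than guessing an alignment.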
Does the German parser in CoreNLP support XPOS? I can only find UPOS.
CoreNLP
Properties props = new Properties(); // java.util.Properties
props.setProperty("annotators", "tokenize, ssplit, mwt, pos, lemma, ner, depparse");