clips/pattern

Wrong tagging on lowercase words?

vsraptor opened this issue · 1 comment

"peter" is wrongly tagged as VB:


In [205]: from pattern.en import *

In [206]: incorrect = Sentence(parse('peter owns a house'))

In [207]: pprint(incorrect)

      WORD   TAG    CHUNK   ROLE   ID     PNP    LEMMA   
                                                         
     peter   VB     VP      -      -      -      -       
      owns   VBZ    VP ^    -      -      -      -       
         a   DT     NP      -      -      -      -       
     house   NN     NP ^    -      -      -      -       

In [208]: correct = Sentence(parse('Peter owns a house'))

In [209]: pprint(correct)

      WORD   TAG    CHUNK   ROLE   ID     PNP    LEMMA   
                                                         
     Peter   NNP    NP      -      -      -      -       
      owns   VBZ    VP      -      -      -      -       
         a   DT     NP      -      -      -      -       
     house   NN     NP ^    -      -      -      -       

Google defines the verb 'peter' as 'decrease or fade gradually before coming to an end.'
So when the word is lowercase, and therefore not written as a name, it can be read as a verb. The tagging is thus not incorrect.
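
If lowercase input is unavoidable, one possible workaround is to post-process the tagger output. The snippet below is only a sketch and not part of pattern: `retag_names` and `KNOWN_NAMES` are hypothetical names, and the input is assumed to be a list of (word, tag) pairs like the ones shown in the tables above. It retags the first token as NNP only when it is in a user-supplied name list and is followed by another verb, so a phrase like "peter out" is left alone.

```python
# Hypothetical post-processing helper; not part of pattern.en.
# Assumption: KNOWN_NAMES is a user-supplied set of lowercase first names.
KNOWN_NAMES = {"peter"}


def retag_names(tagged, names=KNOWN_NAMES):
    """Return a copy of `tagged` in which a sentence-initial token is
    retagged as NNP if it is a known name, was tagged as a verb, and is
    immediately followed by another verb."""
    result = list(tagged)
    if len(result) >= 2:
        word, tag = result[0]
        next_tag = result[1][1]
        if (word.lower() in names
                and tag.startswith("VB")
                and next_tag.startswith("VB")):
            result[0] = (word, "NNP")
    return result


# (word, tag) pairs copied from the first table in this issue.
tagged = [("peter", "VB"), ("owns", "VBZ"), ("a", "DT"), ("house", "NN")]
print(retag_names(tagged))
```

The heuristic is deliberately narrow: it does not recompute chunks, and it will miss names that are not in the list.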
