dariusk/pos-js

Fail: 'I' -> NN and not PRP

AmenRa opened this issue · 3 comments

I tried this script:

var words = new pos.Lexer().lex('I love NodeJS');
var tagger = new pos.Tagger();
var taggedWords = tagger.tag(words);
for (i in taggedWords) {
    var taggedWord = taggedWords[i];
    var word = taggedWord[0];
    var tag = taggedWord[1];
    console.log(word + " /" + tag);
}

This is the log:

I /NN
love /NN
NodeJS /NN
undefined /undefined
undefined /undefined

It fails on the pronoun 'I'.
I also don't know why it prints two undefined elements when there are no words remaining.

I tried your script and did not get the undefined/undefined results:
node scriptByElias.js
I /NN
love /NN
NodeJS /NN

Maybe you need to declare variable i first?
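One possible explanation for the two extra lines, not confirmed anywhere in this thread: `for (i in taggedWords)` walks every enumerable property of the array, including any that other code has added to `Array.prototype`. The sketch below is self-contained and does not call the pos library; the `taggedWords` array is a stand-in for the tagger output, and the `first` and `last` prototype methods are hypothetical, added only to reproduce the symptom.

```javascript
// Stand-in for the result of tagger.tag(words) in the script above.
var taggedWords = [['I', 'NN'], ['love', 'NN'], ['NodeJS', 'NN']];

// Hypothetical: some other code in the same process extends Array.prototype.
Array.prototype.first = function () { return this[0]; };
Array.prototype.last = function () { return this[this.length - 1]; };

// for...in visits the indexes "0", "1", "2" and then the inherited
// enumerable keys "first" and "last".
var forInLines = [];
for (var key in taggedWords) {
    var entry = taggedWords[key];   // a function for "first" and "last"
    forInLines.push(entry[0] + ' /' + entry[1]);
}

// An indexed loop with a declared counter only visits the real elements.
var indexedLines = [];
for (var i = 0; i < taggedWords.length; i++) {
    indexedLines.push(taggedWords[i][0] + ' /' + taggedWords[i][1]);
}

// Undo the hypothetical prototype extension.
delete Array.prototype.first;
delete Array.prototype.last;

console.log(forInLines.join('\n'));
console.log('---');
console.log(indexedLines.join('\n'));
```

In this setup the `for...in` loop produces five lines, the last two being `undefined /undefined`, while the indexed loop produces only the three tagged words. Declaring the counter with `var` also keeps `i` from becoming an implicit global.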

The incorrect tag for "I" is probably caused by its entry in the lexicon:
"i": [
"NN",
"FW",
"NNP",
"NNS"
]
The tagger initially assigns "I" the tag "NN". If you change the entry to PRP, it will tag "I" correctly.
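To illustrate why the order of tags in the entry matters, here is a minimal sketch. It assumes the initial tag is simply the first one listed for the lowercased word, with "NN" as the fallback for unknown words. The `initialTag` function and the one-entry `lexicon` object are hypothetical and are not the library's actual code, and replacing the whole entry with `["PRP"]` is just one way to read the suggested change.

```javascript
// Hypothetical miniature lexicon; the "i" entry is copied from the reply above.
var lexicon = {
    i: ['NN', 'FW', 'NNP', 'NNS']
};

// Assumed behaviour: use the first listed tag, fall back to "NN" if the word is unknown.
function initialTag(lexicon, word) {
    var key = word.toLowerCase();
    var known = Object.prototype.hasOwnProperty.call(lexicon, key);
    return known ? lexicon[key][0] : 'NN';
}

var before = initialTag(lexicon, 'I');

// One reading of the suggested change: make PRP the only (and therefore first) tag.
lexicon.i = ['PRP'];
var after = initialTag(lexicon, 'I');

console.log('before: I /' + before);
console.log('after:  I /' + after);
```

Under these assumptions the lookup returns NN for the original entry and PRP once the entry is changed.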

Regards,
Hugo

Thank you, man!

Thanks for this. I just pushed a fix and am publishing it to npm now as version 0.4.1.