dariusk/pos-js

Didn't detect CD and LS

quanghoc opened this issue · 1 comment

var pos = require('pos');
var words = new pos.Lexer().lex('Twenty gardening tips from six experts for a fruitful summer');
var tagger = new pos.Tagger();
var taggedWords = tagger.tag(words);

console.log(taggedWords);

This gave the following results:

[ [ 'Twenty', 'NN' ],
  [ 'gardening', 'VBG' ],
  [ 'tips', 'NNS' ],
  [ 'from', 'IN' ],
  [ 'six', 'NN' ],
  [ 'experts', 'NNS' ],
  [ 'for', 'IN' ],
  [ 'a', 'DT' ],
  [ 'fruitful', 'JJ' ],
  [ 'summer', 'NN' ] ]

"Twenty" and "six" should be tagged CD (cardinal number). Am I wrong?

The words 'six' and 'twenty' are not in the lexicon. Unknown words are assigned 'NN' by default. You can add the number words to lexicon.js to fix this.

Hugo
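
To illustrate the behavior described in the reply, here is a minimal, self-contained sketch of a lexicon lookup that falls back to 'NN' for unknown words. It is not the pos-js implementation: `toyLexicon`, `lookupTags`, and `toyTag` are hypothetical names made up for this example, and the entry format (word mapped to a list of tags) and the lowercase retry are assumptions. The real fix would go into lexicon.js as suggested above.

```javascript
// Toy lexicon-based tagger (hypothetical; NOT the pos-js implementation).
// Assumed entry format: word -> list of possible tags, most likely first.
var toyLexicon = {
  'gardening': ['VBG'],
  'tips': ['NNS'],
  'from': ['IN'],
  'experts': ['NNS']
};

// Return the tag list for a word, or null if the word is not in the lexicon.
function lookupTags(lexicon, word) {
  return Object.prototype.hasOwnProperty.call(lexicon, word) ? lexicon[word] : null;
}

// Tag each word: try the exact form, then the lowercased form,
// and fall back to 'NN' when neither is in the lexicon.
function toyTag(words, lexicon) {
  return words.map(function (word) {
    var tags = lookupTags(lexicon, word) || lookupTags(lexicon, word.toLowerCase());
    return [word, tags ? tags[0] : 'NN'];
  });
}

var words = ['Twenty', 'gardening', 'tips', 'from', 'six', 'experts'];

// Before: 'Twenty' and 'six' are unknown, so both fall back to 'NN'.
var before = toyTag(words, toyLexicon);

// Add the number words with the CD (cardinal number) tag.
['six', 'twenty'].forEach(function (w) {
  toyLexicon[w] = ['CD'];
});

// After: both number words are now found and tagged 'CD'.
var after = toyTag(words, toyLexicon);

console.log(before);
console.log(after);
```

In this sketch, adding only two entries is enough for the example sentence; covering numbers in general would need the remaining number words as well.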