Get n-grams from text

n-gram

Get n-grams in JavaScript.

Installation

npm:

$ npm install n-gram

Component.js:

$ component install wooorm/n-gram

Bower:

$ bower install n-gram

Duo:

var nGram = require('wooorm/n-gram');

UMD (globals/AMD/CommonJS) (uncompressed and compressed):

<script src="path/to/n-gram.js"></script>
<script>
  nGram.bigram('n-gram'); // ['n-', '-g', 'gr', 'ra', 'am']
</script>

Usage

var nGram = require('n-gram');

nGram.bigram('n-gram'); // ['n-', '-g', 'gr', 'ra', 'am']
nGram(2)('n-gram'); // ['n-', '-g', 'gr', 'ra', 'am']

nGram.trigram('n-gram'); // ['n-g', '-gr', 'gra', 'ram']

nGram(6)('n-gram'); // ['n-gram']
nGram(7)('n-gram'); // []

API

nGram(n)

Factory returning a function that converts a given string to n-grams.

Want padding? Use something like the following: nGram(2)(' ' + value + ' ');
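
To make the factory shape and the padding tip concrete, here is a minimal sketch of how such a function could be written. It is an illustrative re-implementation, not this module's source: the names `nGramSketch` and `paddedNGramSketch` are hypothetical, and the argument validation is an assumption.

```javascript
// Hypothetical re-implementation for illustration; not the module's actual source.
function nGramSketch(n) {
  // Assumption: reject anything that is not a finite number of at least 1.
  if (typeof n !== 'number' || !isFinite(n) || n < 1) {
    throw new Error('`' + n + '` is not a valid argument for n-gram');
  }

  return function (value) {
    var grams = [];
    var text;
    var index;

    // Assumption: treat null and undefined as "no input".
    if (value === null || value === undefined) {
      return grams;
    }

    text = String(value);

    // A string of length L has L - n + 1 windows of width n (none when L < n).
    for (index = 0; index + n <= text.length; index++) {
      grams.push(text.slice(index, index + n));
    }

    return grams;
  };
}

// Hypothetical helper: pad both ends with a space before extracting n-grams.
function paddedNGramSketch(n, value) {
  return nGramSketch(n)(' ' + value + ' ');
}

nGramSketch(2)('n-gram'); // ['n-', '-g', 'gr', 'ra', 'am']
nGramSketch(7)('n-gram'); // []
paddedNGramSketch(2, 'ab'); // [' a', 'ab', 'b ']
```

With padding, the first and last characters each appear in one extra gram, which can help when word boundaries matter.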

nGram.bigram(value)

Shortcut for nGram(2)(value).

nGram.trigram(value)

Shortcut for nGram(3)(value).

Benchmark

On a MacBook Air, this module computes bigrams on a sentence at about 583,367 op/s.

               nGram -- this module
  583,367 op/s »   bigrams on a sentence
    4,250 op/s »   bigrams on an article
  566,931 op/s »  trigrams on a sentence
    4,204 op/s »  trigrams on an article
  542,756 op/s » ten-grams on a sentence
    3,597 op/s » ten-grams on an article

               madbence/ngram
  538,421 op/s »   bigrams on a sentence
    9,842 op/s »   bigrams on an article
  525,198 op/s »  trigrams on a sentence
    9,253 op/s »  trigrams on an article
  539,926 op/s » ten-grams on a sentence
    6,403 op/s » ten-grams on an article
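
For a rough idea of how an op/s figure like those above can be measured, here is a minimal timing sketch. It is not the benchmark suite used to produce these numbers: `opsPerSecond` and `bigramsSketch` are hypothetical names, and the stand-in bigram function is only there so the example runs on its own.

```javascript
// Hypothetical stand-in for the function under test; not the module's source.
function bigramsSketch(value) {
  var grams = [];
  var index;

  for (index = 0; index + 2 <= value.length; index++) {
    grams.push(value.slice(index, index + 2));
  }

  return grams;
}

// Hypothetical helper: call `fn` repeatedly for roughly `durationMs`
// milliseconds and report the number of completed calls per second.
function opsPerSecond(fn, durationMs) {
  var start = Date.now();
  var count = 0;
  var elapsed;

  while (Date.now() - start < durationMs) {
    fn();
    count++;
  }

  // Guard against a zero elapsed time when `durationMs` is very small.
  elapsed = Math.max(Date.now() - start, 1);

  return Math.round(count / (elapsed / 1000));
}
```

Calling `opsPerSecond(function () { bigramsSketch('Get n-grams in JavaScript.'); }, 1000)` returns an estimate for the current machine. Results vary between runs and hardware, and a loop like this includes the cost of reading the clock, so treat the output as a ballpark rather than a comparison with the table above.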

License

MIT © Titus Wormer