save yourself from regex-whackamole🤞:
nlp(entireNovel).sentences().if('the #Adjective of times').out()
// "it was the blurst of times??"
move things around:
nlp('she sells seashells by the seashore.').sentences().toFutureTense().out()
// 'she will sell seashells...'
respond to text input:
if (doc.has('^simon says (shoot|fire) #Determiner lazer')) {
  fireLazer()
} else {
  dontFire()
}
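The check above calls `fireLazer` and `dontFire` without defining them. One way to wire it up is sketched below. `makeResponder` and the `handlers` object are made-up names for illustration, and `nlp` is passed in as an argument so the sketch makes no assumption about how compromise was loaded:

```javascript
// Hypothetical wiring for the command check above.
// `nlp` is injected (from a <script> tag or require) rather than assumed.
function makeResponder(nlp, handlers) {
  // match string copied from the example above
  var pattern = '^simon says (shoot|fire) #Determiner lazer'
  return function respond(text) {
    var doc = nlp(text)
    // run exactly one handler per input and hand back its result
    return doc.has(pattern) ? handlers.fireLazer() : handlers.dontFire()
  }
}
```

Each call to `respond(text)` parses the input once and runs exactly one of the two handlers, so the bot's behaviour lives in plain functions that can be swapped or tested on their own.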
<script src> | 🙏 npm install compromise | 86% on the Penn treebank | IE9+ (caniuse, youbetcha)
⚡️ on the Client-side
<script src="https://unpkg.com/compromise@latest/builds/compromise.min.js"></script>
<script>
var doc = nlp('dinosaur')
var str = doc.nouns().toPlural().out('text')
console.log(str)
// 'dinosaurs'
</script>
🌋 Server-side!
var nlp = require('compromise')
var doc = nlp('London is calling')
doc.sentences().toNegative().out('text')
// 'London is not calling'
Get the hang of things:
Input → output | Match & transform | Making a bot
Detailed docs:
Examples:
nouns! verbs! adjectives! | people, places, organizations | seven hundred and fifty == 750 | like a regex for a sentence | all your base are belong | case, whitespace, contractions..
- Plural/singular: grab the noun-phrases, make 'em plural:
doc = nlp('a bottle of beer on the wall.')
doc.nouns(0).toPlural()
doc.out('text')
//'The bottles of beer on the wall.'
- Number parsing: parse written-out numbers, and change their form:
doc = nlp('ninety five thousand and fifty two')
doc.values().toNumber().out()
// '95052'
doc = nlp('the 23rd of December')
doc.values().add(2).toText()
doc.out('text')
// 'the twenty fifth of December'
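To make the number-parsing idea concrete, here is a small dependency-free sketch that turns written-out numbers into integers. It is not how compromise does it internally: `wordsToNumber` and its word tables are invented for this illustration, and they only cover simple cardinal numbers up to the millions.

```javascript
// Illustrative only: a tiny written-number parser, not compromise's own code.
var SMALL = {
  zero: 0, one: 1, two: 2, three: 3, four: 4, five: 5, six: 6, seven: 7,
  eight: 8, nine: 9, ten: 10, eleven: 11, twelve: 12, thirteen: 13,
  fourteen: 14, fifteen: 15, sixteen: 16, seventeen: 17, eighteen: 18,
  nineteen: 19, twenty: 20, thirty: 30, forty: 40, fifty: 50, sixty: 60,
  seventy: 70, eighty: 80, ninety: 90
}
var MULTIPLIERS = { thousand: 1000, million: 1000000 }

function has(obj, key) {
  return Object.prototype.hasOwnProperty.call(obj, key)
}

function wordsToNumber(str) {
  var words = str.toLowerCase().replace(/-/g, ' ').split(/\s+/).filter(Boolean)
  if (words.length === 0) return null
  var total = 0   // groups already closed off by 'thousand' / 'million'
  var current = 0 // the group still being built
  for (var i = 0; i < words.length; i++) {
    var w = words[i]
    if (w === 'and') continue
    if (has(SMALL, w)) {
      current += SMALL[w]
    } else if (w === 'hundred') {
      current = (current || 1) * 100
    } else if (has(MULTIPLIERS, w)) {
      total += (current || 1) * MULTIPLIERS[w]
      current = 0
    } else {
      return null // unknown word: give up rather than guess
    }
  }
  return total + current
}

console.log(wordsToNumber('ninety five thousand and fifty two')) // 95052
console.log(wordsToNumber('seven hundred and fifty')) // 750
```

The parser keeps two running sums: `current` for the group being read, and `total` for groups that a multiplier word has already closed. Ordinals ('23rd', 'twenty fifth'), decimals and fractions are left out on purpose.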
- Normalization: handle looseness & variety of random text:
doc = nlp("the guest-singer's björk at seven thirty.").normalize().out('text')
// 'The guest singer is Bjork at 7:30.'
- Tense: switch to/from conjugations of any verb:
doc = nlp('she sells seashells by the seashore.')
doc.sentences().toFutureTense().out('text')
//'she will sell seashells...'
doc.verbs().conjugate()
// [{ PastTense: 'sold',
// Infinitive: 'sell',
// Gerund: 'selling', ...
// }]
- Contractions: grab, expand and contract:
doc = nlp("we're not gonna take it, no we ain't gonna take it.")
doc.has('going') // true
doc.match('are not').length // == 2
doc.contractions().expand().out()
//'we are not going to take it, no we are not going to take it'
- Named-entities: get the people, places, organizations:
doc = nlp('the opera about richard nixon visiting china')
doc.topics().data()
// [
// { text: 'richard nixon' },
// { text: 'china' }
// ]
- Custom lexicon: make it do what you'd like:
var lexicon = {
  'boston': 'MusicalGroup'
}
doc = nlp('i heard Boston\'s set in Chicago', lexicon)
//alternatively, fix it 'in-post':
doc.match('heard #Possessive set').terms(1).tag('MusicalGroup')
- Handy outputs: get sensible data:
doc = nlp('We like Roy! We like Roy!').sentences().out('array')
// ['We like Roy!', 'We like Roy!']
doc = nlp('Tony Hawk').out('html')
/*
<span>
<span class="nl-Person nl-FirstName">Tony</span>
<span> </span>
<span class="nl-Person nl-LastName">Hawk</span>
</span>
*/
- Plugins: allow adding vocabulary, fixing errors, and setting context quickly:
var plugin = {
  tags: {
    Character: {
      isA: 'Noun'
    }
  },
  words: {
    itchy: 'Character',
    scratchy: 'Character'
  }
}
nlp.plugin(plugin)
nlp(`Couldn't Itchy share his pie with Scratchy?`).debug()
/*
couldn't - #Modal, #Verb
itchy - #Character, #Noun
share - #Infinitive, #Verb
...
*/
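The `isA` field in the plugin above is what lets `itchy` come out as both #Character and #Noun. A rough sketch of that kind of lookup in plain JavaScript follows. `resolveTags` is a hypothetical helper written for this illustration, not compromise's internal code, and the bare `Noun` entry in the tag table is an assumption:

```javascript
// Illustrative only: walk an isA chain to collect inherited tags.
var tags = {
  Noun: {}, // assumed root tag with no parent
  Character: { isA: 'Noun' }
}
var words = { itchy: 'Character', scratchy: 'Character' }

function own(obj, key) {
  return Object.prototype.hasOwnProperty.call(obj, key)
}

function resolveTags(word, words, tags) {
  var key = word.toLowerCase()
  var found = []
  var tag = own(words, key) ? words[key] : undefined
  // the indexOf guard stops the walk if the tag table contains a cycle
  while (tag && found.indexOf(tag) === -1) {
    found.push(tag)
    tag = own(tags, tag) ? tags[tag].isA : undefined
  }
  return found
}

console.log(resolveTags('Itchy', words, tags)) // [ 'Character', 'Noun' ]
console.log(resolveTags('pie', words, tags)) // []
```

Declaring only the most specific tag per word and letting the chain fill in the rest keeps the `words` list short: adding a new character means one new entry, with no need to repeat 'Noun'.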
of course, there's a lot more stuff.
Join in - we're fun, using semver, and moving fast:
Twitter | Slack group | Mailing-list | Projects | Pull-requests
☂️ Isn't javascript too...
💃 Can it run on my arduino-watch?
- Only if it's water-proof!
  Read quickStart for all sorts of funny environments.
🌎 Other Languages?
✨ Partial builds?
- compromise is one function, so it can't really be tree-shaken,
  and the tagging methods are competitive, so it's not recommended to pull things out.
  It's best to load the library fully, given it's smaller than this gif.
  A plug-in scheme is in the works.
Also:
- naturalNode - fancier statistical nlp in javascript
- superScript - clever conversation engine in js
- nodeBox Linguistics - conjugation, inflection in javascript
- reText - very impressive text utilities in javascript
- jsPos - javascript build of the time-tested Brill-tagger
- spaCy - speedy, multilingual tagger in C/python
For the former promise-library, see jnewman/compromise (Thanks Joshua!)