Current version: 3.2.1
processors-server
What is it?
An akka-http server exposing a REST API for text annotation via the processors library.
Requirements
To build and run the server you need a Java runtime and sbt (the server runs on the JVM and is launched via sbt, as shown below).
How is this useful?
This might be useful to people wanting to do NLP in a non-JVM language without a good existing parser. Currently there are services for using processors' CluProcessor, FastNLPProcessor (a wrapper for CoreNLP), and BioNLPProcessor.
Running processors-server
Clone the project:
git clone https://github.com/clu-ling/processors-server.git
Fire up the server. This may take a minute or so to load the large model files.
cd processors-server
sbt "runMain NLPServer"
By default, the server will run on port 8888 and localhost, though you can start the server using a different port and host:
sbt "runMain NLPServer --host <your favorite host here> --port <your favorite port here>"
Building a docker container
sbt docker
This will create a Docker image named parsertongue/processors-server:latest, which you can run with docker-compose up using the included docker-compose.yml file.
You can find all of the official containers published for this project on Docker Hub.
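If you prefer plain docker to docker-compose, something like the following sketch should work; the 8888:8888 port mapping is an assumption based on the server's default port and may need to match what the included docker-compose.yml specifies:
docker run --rm -p 8888:8888 parsertongue/processors-server:latest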
Logging
A server log is written to processors-server.log in the home directory of the user who launched the server.
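For example, to follow the log as the server runs (assuming you launched it as the same user):
tail -f ~/processors-server.log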
Communicating with the server
NOTE: Once the server has started, a summary of the services currently available (including links to demos) can be found at the following url: http://<your host name here>:<your port here>
Annotating text
The following services are available:
- Text annotation (open-domain or biomedical) involving:
  - sentence splitting
  - tokenization
  - lemmatization
  - PoS tagging
  - NER
  - dependency parsing
- Sentiment analysis
- Rule-based IE using Odin
Text can be annotated by sending a POST request containing json with a "text" field to one of the following annotate endpoints (see example). You may also send text already segmented into sentences by posting a SegmentedMessage (see example) to the same annotate endpoint. This is just a json frame with a "sentences" field pointing to an array of strings.
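As a sketch, a SegmentedMessage POST to the default local server (port 8888) might look like this:
curl -H "Content-Type: application/json" -X POST -d '{"sentences": ["My name is Inigo Montoya.", "Prepare to die."]}' http://localhost:8888/api/annotate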
CluProcessor
http://localhost:<your port here>/api/clu/annotate
FastNLPProcessor
http://localhost:<your port here>/api/annotate
http://localhost:<your port here>/api/fastnlp/annotate
BioNLPProcessor
The resources (model files) for this processor are loaded lazily when the first call is made.
Text can be annotated by sending a POST request containing json with a "text" field to the following endpoint (see example):
http://localhost:<your port here>/api/bionlp/annotate
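As with the other annotate endpoints, a minimal sketch of a request (assuming the default local port of 8888; the input text is just an illustrative biomedical sentence):
curl -H "Content-Type: application/json" -X POST -d '{"text": "TGFBR2 phosphorylates the RAS protein."}' http://localhost:8888/api/bionlp/annotate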
CoreNLP
Sentiment analysis with http://localhost:<your port here>/api/corenlp/sentiment/score
You can also send text that has already been segmented into sentences:
- post a SegmentedMessage (see example) to http://localhost:<your port here>/api/corenlp/sentiment/score/segmented
Responses will be SentimentScores (see example).
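A minimal sketch, assuming the sentiment endpoint accepts the same json "text" frame used by the annotate endpoints and that the server is running locally on port 8888:
curl -H "Content-Type: application/json" -X POST -d '{"text": "I absolutely loved this movie."}' http://localhost:8888/api/corenlp/sentiment/score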
Odin
Rule-based IE with http://localhost:<your port here>/odin/extract
- Requires one of the following json POST requests:
  - text with rules (see example)
  - text with rules url (see example)
  - document with rules (see example)
  - document with rules url (see example)
For more info on Odin, see the manual.
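As a non-authoritative sketch of a "text with rules" request, assuming the json fields are named "text" and "rules" and the rules value is an Odin grammar as a string (check the linked examples for the exact format), with the server running locally on port 8888:
curl -H "Content-Type: application/json" -X POST -d '{"text": "Inigo Montoya killed the six-fingered man.", "rules": "- name: person-rule\n  label: Person\n  type: token\n  pattern: |\n    [tag=NNP]+"}' http://localhost:8888/odin/extract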
Responses
A POST to an /api/annotate endpoint will return a Document of the form specified in document.json.
cURL
To see it in action, you can try to POST json using cURL. The text to parse should be given as the value of the json's "text" field:
curl -H "Content-Type: application/json" -X POST -d '{"text": "My name is Inigo Montoya. You killed my father. Prepare to die."}' http://localhost:8888/api/annotate
{
"text": "My name is Inigo Montoya. You killed my father. Prepare to die.",
"sentences": [
{
"words": [
"My",
"name",
"is",
"Inigo",
"Montoya",
"."
],
"startOffsets": [
0,
3,
8,
11,
17,
24
],
"endOffsets": [
2,
7,
10,
16,
24,
25
],
"lemmas": [
"my",
"name",
"be",
"Inigo",
"Montoya",
"."
],
"tags": [
"PRP$",
"NN",
"VBZ",
"NNP",
"NNP",
"."
],
"entities": [
"O",
"O",
"O",
"PERSON",
"PERSON",
"O"
],
"dependencies": {
"edges": [
{
"destination": 0,
"source": 1,
"relation": "poss"
},
{
"destination": 1,
"source": 4,
"relation": "nsubj"
},
{
"destination": 2,
"source": 4,
"relation": "cop"
},
{
"destination": 3,
"source": 4,
"relation": "nn"
},
{
"destination": 5,
"source": 4,
"relation": "punct"
}
],
"roots": [
4
]
}
},
{
"words": [
"You",
"killed",
"my",
"father",
"."
],
"startOffsets": [
26,
30,
37,
40,
46
],
"endOffsets": [
29,
36,
39,
46,
47
],
"lemmas": [
"you",
"kill",
"my",
"father",
"."
],
"tags": [
"PRP",
"VBD",
"PRP$",
"NN",
"."
],
"entities": [
"O",
"O",
"O",
"O",
"O"
],
"dependencies": {
"edges": [
{
"destination": 2,
"source": 3,
"relation": "poss"
},
{
"destination": 3,
"source": 1,
"relation": "dobj"
},
{
"destination": 4,
"source": 1,
"relation": "punct"
},
{
"destination": 0,
"source": 1,
"relation": "nsubj"
}
],
"roots": [
1
]
}
},
{
"words": [
"Prepare",
"to",
"die",
"."
],
"startOffsets": [
48,
56,
59,
62
],
"endOffsets": [
55,
58,
62,
63
],
"lemmas": [
"prepare",
"to",
"die",
"."
],
"tags": [
"VB",
"TO",
"VB",
"."
],
"entities": [
"O",
"O",
"O",
"O"
],
"dependencies": {
"edges": [
{
"destination": 2,
"source": 0,
"relation": "xcomp"
},
{
"destination": 3,
"source": 0,
"relation": "punct"
},
{
"destination": 1,
"source": 2,
"relation": "aux"
}
],
"roots": [
0
]
}
}
]
}
json schema for responses
Response schema can be found at src/main/resources/json/schema
Examples of each can be found at src/main/resources/json/examples
Other Stuff
Shutting down the server
You can shut down the server by posting anything to /shutdown
Checking the server's build
Send a GET to /buildinfo
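For example, assuming the default local port of 8888:
curl -X POST -d 'bye' http://localhost:8888/shutdown
curl http://localhost:8888/buildinfo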
py-processors
If you're a Python user, you may be interested in using py-processors in your NLP project.
Where can I get the latest and greatest fat jar?
Cloning the project and running sbt jarify will build the latest jar. Published jars are available at this URL: http://py-processors.parsertongue.com/v?.?.?/processors-server.jar (substitute your desired version for ?.?.?).
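For example, to build the fat jar from a fresh clone:
git clone https://github.com/clu-ling/processors-server.git
cd processors-server
sbt jarify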