ClearTK/cleartk

Provide an example of how to wrap a ClearTK AE in a simple API that any Java program can call


Original issue 405 created by ClearTK on 2014-06-22T20:02:32.000Z:

We should provide an example of a ClearTK wrapper that lets any Java program call a ClearTK-based analysis engine. NER is a good use case: a Java program simply wants to pass some text to an NER routine and get results back. Here's a sketch of what such an example might look like.

NER myNer = new NER();
List<MyNerPojo> nerResults = myNer.doNer(myText);

Where doNer() has a signature something like this:

public List<MyNerPojo> doNer(String myText)
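
The sketch leaves MyNerPojo undefined. One possible shape, assuming the caller wants the covered text, the character offsets, and an entity type label; the class body and all field names below are hypothetical and not part of ClearTK:

```java
// Hypothetical return type for doNer(): plain data only, so no UIMA or
// ClearTK types leak out to the calling program.
// In a real project this would likely be a public class in its own file.
final class MyNerPojo {
    private final String text;  // covered text of the mention
    private final int begin;    // start offset in the input text (inclusive)
    private final int end;      // end offset in the input text (exclusive)
    private final String type;  // entity type label (the label set is an assumption)

    MyNerPojo(String text, int begin, int end, String type) {
        this.text = text;
        this.begin = begin;
        this.end = end;
        this.type = type;
    }

    String getText() { return text; }
    int getBegin() { return begin; }
    int getEnd() { return end; }
    String getType() { return type; }

    @Override
    public String toString() {
        return type + "[" + begin + "," + end + "): " + text;
    }
}
```

Keeping the return type free of framework classes is what lets the calling program stay unaware of UIMA.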

The implementation of doNer() would look something like this:

1: AnalysisEngine nerAE = builder.createAggregate();
2: JCas myJcas = nerAE.newJCas();
3: myJcas.setDocumentText(myText);
4: nerAE.process(myJcas);
5: for (NamedEntityMention mention : JCasUtil.select(myJcas, NamedEntityMention.class)) {
6:   // collect whatever you want from each mention into your POJO return type
7: }

Line 1 makes use of an AggregateBuilder. See RunNamedEntityChunker.main for a nice example of this.
Line 2 is an expensive call (~100 ms), so you may want to pass the JCas instance in to doNer() or keep it as a member variable. You can clear a JCas instance for reuse with the reset() method. Note that sharing one JCas across calls has synchronization implications: two threads must not use it at the same time, so doNer() should be marked as synchronized.
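
The reuse advice for line 2 can be sketched without any UIMA dependency. In the sketch below, ReusableDoc and Engine are hypothetical stand-ins for the roles that JCas and AnalysisEngine play above, and ReusingNer is an invented name; none of these are ClearTK or UIMA types. It only illustrates the pattern: create the expensive object once, reset it on every call, and synchronize access.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the document container (the role JCas plays above).
interface ReusableDoc {
    void reset();                      // clear the previous text and annotations
    void setDocumentText(String text);
}

// Hypothetical stand-in for the analysis engine.
interface Engine {
    ReusableDoc newDoc();                   // expensive, so call it once
    List<String> process(ReusableDoc doc);  // returns mention strings for brevity
}

final class ReusingNer {
    private final Engine engine;
    private final ReusableDoc doc;  // member variable, created once and reused

    ReusingNer(Engine engine) {
        this.engine = engine;
        this.doc = engine.newDoc();
    }

    // synchronized: the shared doc is mutable, so only one caller at a time
    // may reset, fill, and process it.
    synchronized List<String> doNer(String text) {
        doc.reset();
        doc.setDocumentText(text);
        // Copy the results so nothing handed to the caller depends on the shared doc.
        return new ArrayList<>(engine.process(doc));
    }
}
```

The trade-off is that synchronized serializes all callers on one instance. If throughput matters, an alternative is one wrapper instance per thread, at the cost of paying the creation time once per thread.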

Some of the above text was copied from a recent post on the cleartk-users list.