This is a simple Java-based quick-start guide to NLP: creating an OpenNLP model and using it for entity extraction.

openNLP

#NLP Model creation and use:
The steps below create an OpenNLP model, train it on a simple training set, and then use the trained model for entity extraction.

  • Create a custom OpenNLP model from an annotated training set.
    Example of annotated training text:
    "The highest temperature recorded in <START:location> Delhi <END> before this was 47.4 degree Celsius at Palam on June 16, 1995."
    Here location is the key (the entity type) and Delhi is the annotated sample value for it. When a later input document contains Delhi, the trained model can extract the entity as Delhi : location, meaning that Delhi is a location.
  • Receive the input text for entity extraction.
  • Tokenize the input text using the pre-trained tokenizer model provided by OpenNLP.
  • Pass the token array to the newly created model for entity extraction.
  • Return the extracted key-value pairs.
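
To make the annotation format above concrete, here is a minimal sketch in plain Java that reads a `<START:type> value <END>` line and lists the value/type pairs it contains. It is an illustration only and does not use the OpenNLP API; the class name `AnnotatedSampleDemo` and its methods are hypothetical helpers, not part of this repository or of OpenNLP.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustration only: plain Java, not the OpenNLP API.
// The class and method names are hypothetical.
class AnnotatedSampleDemo {

    // Matches "<START:type> value <END>" and captures the type and the value.
    private static final Pattern ANNOTATION =
            Pattern.compile("<START:([^>]+)>\\s*(.*?)\\s*<END>");

    // Returns each annotated entity as a {value, type} pair, in order of appearance.
    static List<String[]> extractAnnotations(String annotatedLine) {
        List<String[]> entities = new ArrayList<>();
        Matcher m = ANNOTATION.matcher(annotatedLine);
        while (m.find()) {
            entities.add(new String[] { m.group(2), m.group(1) });
        }
        return entities;
    }

    // Removes the markup and keeps only the annotated value in its place.
    static String stripAnnotations(String annotatedLine) {
        return ANNOTATION.matcher(annotatedLine).replaceAll("$2");
    }

    public static void main(String[] args) {
        String line = "The highest temperature recorded in <START:location> Delhi <END> "
                + "before this was 47.4 degree Celsius at Palam on June 16, 1995.";
        for (String[] entity : extractAnnotations(line)) {
            System.out.println(entity[0] + " : " + entity[1]); // prints "Delhi : location"
        }
        System.out.println(stripAnnotations(line));
    }
}
```

The key-value output mirrors the Delhi : location pair described in the first step. A trained model produces the same kind of pair for text it has not seen annotated.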

Go through the code and you will understand everything.

This code can be used to create any kind of simple trained model, but its precision depends on the training data: in general, the more annotated examples you train with, the more accurate the entity extraction becomes.
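
As a rough way to see what precision means here, the sketch below computes it as the share of extracted entities that are actually correct. It is plain Java with made-up sample data; the class `PrecisionDemo` and the entity strings are hypothetical and are not results from this project.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Illustration only: hypothetical helper, not part of OpenNLP or this repository.
class PrecisionDemo {

    // precision = correct extractions / all extractions made by the model
    static double precision(Set<String> predicted, Set<String> expected) {
        if (predicted.isEmpty()) {
            return 0.0; // nothing was extracted, so report 0 by convention
        }
        Set<String> correct = new HashSet<>(predicted);
        correct.retainAll(expected); // keep only predictions that are also expected
        return (double) correct.size() / predicted.size();
    }

    public static void main(String[] args) {
        // Made-up example data: the third prediction is a false positive.
        Set<String> predicted = new HashSet<>(Arrays.asList(
                "Delhi:location", "Palam:location", "Celsius:location"));
        Set<String> expected = new HashSet<>(Arrays.asList(
                "Delhi:location", "Palam:location"));
        // Two of the three predictions are correct.
        System.out.println("precision = " + precision(predicted, expected));
    }
}
```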

#Using the code
To use OpenNLP in your project, add the following Maven dependency for opennlp-tools:

<dependency>
  <groupId>org.apache.opennlp</groupId>
  <artifactId>opennlp-tools</artifactId>
  <version>1.5.3</version>
</dependency>