ltlangpack

Tools for Lithuanian language processing


Lithuanian language processing tools for use in NLP, search, and other applications.

Sentence detection

Folder: sentence-detect

OpenNLP model for Lithuanian sentence detection.

Scripts to help with building the model:

  • add - add new text to the model (see the comment inside the script)
  • train - build the model from the example corpora
  • evaluate - evaluate detection quality
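
To give a feel for what "detection quality" can mean, here is a minimal sketch of how precision, recall, and F1 could be computed for sentence boundaries. These are the measures OpenNLP's sentence detector evaluation typically reports; whether the evaluate script in this repository produces exactly these figures is an assumption. The function name `boundary_scores` and the representation of boundaries as character offsets are hypothetical and chosen only for illustration.

```python
def boundary_scores(gold, predicted):
    """Return (precision, recall, f1) for two collections of boundary offsets.

    `gold` holds the character offsets where sentences really end,
    `predicted` holds the offsets proposed by a detector. Both are
    illustrative inputs, not the format used by the repository scripts.
    """
    gold_set, pred_set = set(gold), set(predicted)
    true_positives = len(gold_set & pred_set)

    # Guard against empty inputs to avoid division by zero.
    precision = true_positives / len(pred_set) if pred_set else 0.0
    recall = true_positives / len(gold_set) if gold_set else 0.0

    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


if __name__ == "__main__":
    # Four true boundaries; the detector found two of them plus one false one.
    p, r, f = boundary_scores([10, 25, 40, 60], [10, 25, 33])
    print(f"precision={p:.3f} recall={r:.3f} f1={f:.3f}")
```

A detector that misses abbreviations or initials tends to lower precision (extra boundaries), while one that merges sentences lowers recall, so looking at both numbers is usually more informative than F1 alone.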

Snowball

The Snowball version of the Porter stemmer for the Lithuanian language was moved to this page.

Language detection

Folder: language-detect

N-grams for Lithuanian language detection. Used in Apache Tika: https://issues.apache.org/jira/browse/TIKA-582
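
As a rough illustration of how n-gram based language detection works in general, the sketch below builds character trigram profiles and compares them with cosine similarity. It does not reproduce the profile format in the language-detect folder or Tika's own scoring. The function names, the `_` word-boundary padding, and the tiny sample texts are all assumptions made for the example.

```python
from collections import Counter


def ngram_profile(text, n=3):
    """Count character n-grams, padding each word with '_' as a boundary mark."""
    counts = Counter()
    for word in text.lower().split():
        padded = f"_{word}_"
        for i in range(len(padded) - n + 1):
            counts[padded[i:i + n]] += 1
    return counts


def cosine_similarity(a, b):
    """Cosine similarity between two n-gram count profiles (0.0 if either is empty)."""
    dot = sum(a[gram] * b[gram] for gram in set(a) & set(b))
    norm_a = sum(v * v for v in a.values()) ** 0.5
    norm_b = sum(v * v for v in b.values()) ** 0.5
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    # Toy reference profiles; a real profile would be built from a large corpus.
    profiles = {
        "lt": ngram_profile("labas rytas kaip sekasi"),
        "en": ngram_profile("good morning how are you"),
    }
    sample = ngram_profile("labas vakaras")
    best = max(profiles, key=lambda lang: cosine_similarity(sample, profiles[lang]))
    print(best)
```

With realistic corpora, the reference profiles would usually be truncated to the most frequent n-grams, which keeps them small while preserving most of the discriminative signal.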

License

Copyright (C) 2011 UAB TokenMill

Distributed under the Eclipse Public License.