LVSegmenter

Domain name segmenter for Latvian.

Primary language: HTML. License: GNU General Public License v3.0 (GPL-3.0).

For a usage sample, see SegmenterUI.java.

As a word list for Latvian, we suggest using the filtered result from https://github.com/PeterisP/morphology/blob/master/src/tools/java/lv/semti/Vardnicas/VarduSaraksts.java

For English, one option is http://www-01.sil.org/linguistics/wordlists/english/wordlist/wordsEn.txt

The word lists are handled using the Patricia Trie from Apache Commons Collections: https://commons.apache.org/proper/commons-collections/apidocs/org/apache/commons/collections4/trie/PatriciaTrie.html
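
To illustrate how a word list can drive domain name segmentation, here is a minimal sketch. It is not LVSegmenter's actual algorithm or API: the class name `SegmenterSketch`, the method `segment`, and the tiny word list are hypothetical. A plain `HashSet` stands in for the Patricia Trie so that the example runs without external dependencies. The sketch uses dynamic programming to find a split into the fewest dictionary words.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.LinkedList;
import java.util.List;
import java.util.Set;

// Hypothetical illustration only; not part of LVSegmenter.
final class SegmenterSketch {

    // Splits text into the fewest words found in the word list.
    // Returns an empty list if no complete segmentation exists.
    static List<String> segment(String text, Set<String> words) {
        int n = text.length();
        int[] best = new int[n + 1]; // best[i]: fewest words covering text[0..i)
        int[] back = new int[n + 1]; // back[i]: start index of the last word ending at i
        Arrays.fill(best, Integer.MAX_VALUE);
        best[0] = 0;

        for (int end = 1; end <= n; end++) {
            for (int start = 0; start < end; start++) {
                if (best[start] == Integer.MAX_VALUE) {
                    continue; // the prefix text[0..start) cannot be segmented
                }
                if (words.contains(text.substring(start, end))
                        && best[start] + 1 < best[end]) {
                    best[end] = best[start] + 1;
                    back[end] = start;
                }
            }
        }

        if (best[n] == Integer.MAX_VALUE) {
            return Collections.emptyList();
        }

        // Walk the back-pointers from the end to rebuild the words in order.
        LinkedList<String> result = new LinkedList<>();
        for (int end = n; end > 0; end = back[end]) {
            result.addFirst(text.substring(back[end], end));
        }
        return result;
    }

    public static void main(String[] args) {
        // Assumed toy word list; a real one would be loaded from the files above.
        Set<String> words = new HashSet<>(Arrays.asList("ziedu", "veikals", "vei", "kals"));
        System.out.println(segment("zieduveikals", words)); // prints [ziedu, veikals]
    }
}
```

With the Apache Commons `PatriciaTrie`, which implements `java.util.Map`, the membership check would presumably become `containsKey` on the trie instead of `contains` on the set. Preferring the fewest words is only one possible ranking of candidate splits.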