mavenized lucene-gosen
- extract java version dictionary converter (some dictionaries are different)
mavenize dictionary creation
- packaging dictionary is needed?
back port unit tests to sen
https://github.com/lucene-gosen/lucene-gosen/
-> backport some code
Installation with Apache Solr 3.6:
- run 'ant'. this will make lucene-gosen-{version}.jar
- create example/solr/lib and put this jar file in it.
- copy stopwords_ja.txt and stoptags_ja.txt into example/solr/conf
- add the "text_ja_gosen" fieldtype: see example/schema.xml.snippet for an example configuration.
refer to example/ for an example Japanese configuration, with comments explaining the various configuration options.
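The copy steps above can be sketched as a small shell function. This is only an illustration: the function name and its three arguments are hypothetical, and where the built jar and the two stop files live depends on your checkout, so adjust the paths to match.

```shell
# Hypothetical helper, not part of lucene-gosen: copy a built jar and the
# Japanese stop files into a Solr 3.6 example directory.
# All three arguments are assumptions about your local layout:
#   $1  path to the lucene-gosen-{version}.jar produced by 'ant'
#   $2  directory containing stopwords_ja.txt and stoptags_ja.txt
#   $3  Solr home, e.g. the example/solr directory of a Solr download
install_gosen_into_solr() {
  jar=$1
  stop_dir=$2
  solr_home=$3

  # Fail early if the jar or the Solr conf directory is missing.
  [ -f "$jar" ] || { echo "jar not found: $jar" >&2; return 1; }
  [ -d "$solr_home/conf" ] || { echo "no conf dir under: $solr_home" >&2; return 1; }

  # Create the lib directory if needed, then copy the jar into it.
  mkdir -p "$solr_home/lib" || return 1
  cp "$jar" "$solr_home/lib/" || return 1

  # Copy both stop files next to the Solr schema.
  cp "$stop_dir/stopwords_ja.txt" "$stop_dir/stoptags_ja.txt" "$solr_home/conf/" || return 1
}
```

After running 'ant', call it with your own paths, for example `install_gosen_into_solr path/to/lucene-gosen-{version}.jar path/to/stop-files example/solr`. The "text_ja_gosen" fieldtype still has to be added to the schema by hand.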
Installation with Apache Lucene 3.6:
- run 'ant'. this will make lucene-gosen-{version}.jar
- add this jar file to your classpath, and use GosenAnalyzer, or make your own analyzer from the various filters. It's recommended that you extend ReusableAnalyzerBase for any custom analyzer.