solr-jdbc-synonyms

A Solr synonym and stopword filter that reads synonyms and stopwords from a JDBC database.

DEPRECATED! As of version 2.1.0, solr-jdbc includes solr-jdbc-synonyms. Please use solr-jdbc instead.

A Solr synonym filter that reads synonyms from a JDBC database. The DataSource to retrieve synonyms from is injected via JNDI.

Installing the synonym filter (Apache Tomcat)

Configuring the synonym filter

The JdbcSynonymFilterFactory behaves exactly like the Solr SynonymFilterFactory, except that it loads the synonyms from a JDBC database instead of a file resource. Configure the filter in your Solr analyzer chain like this:

<filter class="com.s24.search.solr.analysis.jdbc.JdbcSynonymFilterFactory"
   sql="SELECT concat(lhs, '=>', array_to_string(rhs, ',')) as line FROM synonyms;"
   jndiName="jdbc/synonyms" ignoreCase="false" expand="true" />

In addition to the arguments of the SynonymFilterFactory, the filter takes two arguments:

  • jndiName: The JNDI name of your JDBC DataSource as configured in your solr.xml or server.xml. In the example above, this would be jdbc/synonyms.

  • sql: A SQL statement returning valid Solr synonym lines in the first SQL result column.

    • Valid synonym formats include x=>a, x=>a,b,c, x,y=>a,b,c or x,a,b,c.
    • Your database might store the left-hand and right-hand sides of your synonym definitions in separate columns. Use a concat function to create a valid synonym line.
      • In PostgreSQL, you might use SELECT concat(lhs, '=>', rhs) as line FROM synonyms;
      • In PostgreSQL with arrays, you might use SELECT concat(lhs, '=>', array_to_string(rhs, ',')) as line FROM synonyms;
      • In MySQL, you might use SELECT concat(lhs, '=>', rhs) as line FROM synonyms;
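
The line formats listed above can also be illustrated outside of SQL. The following is a minimal Java sketch of the same concatenation, for example to pre-check rows before they are written to the synonyms table. The class SynonymLines and its methods are hypothetical names chosen for this illustration and are not part of solr-jdbc-synonyms.

```java
import java.util.List;

/** Hypothetical helper for illustration; not part of solr-jdbc-synonyms. */
final class SynonymLines {

    private SynonymLines() {
    }

    /** Joins terms with commas, similar to array_to_string(terms, ',') in PostgreSQL. */
    static String join(List<String> terms) {
        StringBuilder joined = new StringBuilder();
        for (int i = 0; i < terms.size(); i++) {
            if (i > 0) {
                joined.append(',');
            }
            joined.append(terms.get(i));
        }
        return joined.toString();
    }

    /**
     * Builds an explicit mapping such as "x,y=>a,b,c",
     * mirroring concat(lhs, '=>', array_to_string(rhs, ',')).
     */
    static String toLine(List<String> lhs, List<String> rhs) {
        if (lhs.isEmpty() || rhs.isEmpty()) {
            throw new IllegalArgumentException("both sides need at least one term");
        }
        return join(lhs) + "=>" + join(rhs);
    }
}
```

With this sketch, SynonymLines.toLine(Arrays.asList("x", "y"), Arrays.asList("a", "b", "c")) yields x,y=>a,b,c, and SynonymLines.join(Arrays.asList("x", "a", "b", "c")) yields the equivalent-terms form x,a,b,c.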

A complete field type might look like this:

<fieldType name="synonym_test" class="solr.TextField">
  <analyzer>
    <tokenizer class="solr.PatternTokenizerFactory" pattern="[\s]+" />
    <filter class="com.s24.search.solr.analysis.jdbc.JdbcSynonymFilterFactory"
       sql="SELECT concat(lhs, '=>', array_to_string(rhs, ',')) as line FROM synonyms;"
       jndiName="jdbc/synonyms" ignoreCase="false" expand="true" />
  </analyzer>
</fieldType>

Configuring the stop word filter

Since version 1.1, a JdbcStopFilterFactory is available that reads stopwords from a JDBC database. It behaves exactly like the Solr StopFilterFactory and is meant to be a drop-in replacement:

<filter class="com.s24.search.solr.analysis.jdbc.JdbcStopFilterFactory"   
   sql="SELECT stopword FROM stopwords" 
   jndiName="jdbc/synonyms"/>

The filter takes the same jndiName and sql parameters as the JdbcSynonymFilterFactory.
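
Both factories are configured with a JNDI name and a SQL statement whose first result column supplies the entries. How these two parameters might work together can be sketched as follows. This is an assumption-laden illustration and not the project's actual implementation: the class JdbcLineReader and its methods are hypothetical, and the java:comp/env/ prefix is an assumption that matches how Tomcat usually exposes JNDI resources.

```java
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.ArrayList;
import java.util.List;

import javax.naming.InitialContext;
import javax.naming.NamingException;
import javax.sql.DataSource;

/** Hypothetical sketch; not the code used by solr-jdbc-synonyms. */
final class JdbcLineReader {

    private JdbcLineReader() {
    }

    /** Looks up the DataSource; the "java:comp/env/" prefix is an assumption. */
    static DataSource lookup(String jndiName) throws NamingException {
        return (DataSource) new InitialContext().lookup("java:comp/env/" + jndiName);
    }

    /** Runs the statement and collects the first column of every row. */
    static List<String> readFirstColumn(DataSource dataSource, String sql) throws SQLException {
        List<String> lines = new ArrayList<>();
        // try-with-resources closes the result set, statement and connection in reverse order
        try (Connection connection = dataSource.getConnection();
             Statement statement = connection.createStatement();
             ResultSet rows = statement.executeQuery(sql)) {
            while (rows.next()) {
                lines.add(rows.getString(1));
            }
        }
        return lines;
    }
}
```

Under these assumptions, JdbcLineReader.readFirstColumn(JdbcLineReader.lookup("jdbc/synonyms"), "SELECT stopword FROM stopwords") would return one list entry per stopword row.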

Building the project

The following commands should install the current version into your local Maven repository:

$ export JAVA_HOME=$(/usr/libexec/java_home -v 1.7)
$ export MAVEN_OPTS="-Dmaven.wagon.http.ssl.insecure=true -Dmaven.wagon.http.ssl.allowall=true -Dmaven.wagon.http.ssl.ignore.validity.dates=true"
$ mvn clean install

Releasing the project to Maven Central

Define the new versions:

$ export NEXT_VERSION=<version>
$ export NEXT_DEVELOPMENT_VERSION=<version>-SNAPSHOT

Then execute the release chain:

$ mvn org.codehaus.mojo:versions-maven-plugin:2.0:set -DgenerateBackupPoms=false -DnewVersion=$NEXT_VERSION
$ git commit -a -m "pushes to release version $NEXT_VERSION"
$ mvn -P release

Wait for the release to be accepted. Then increment to the next development version:

$ git tag -a v$NEXT_VERSION -m "`curl -s http://whatthecommit.com/index.txt`"
$ mvn org.codehaus.mojo:versions-maven-plugin:2.0:set -DgenerateBackupPoms=false -DnewVersion=$NEXT_DEVELOPMENT_VERSION
$ git commit -a -m "pushes to development version $NEXT_DEVELOPMENT_VERSION"
$ git push origin tag v$NEXT_VERSION && git push origin

License

This project is licensed under the Apache License, Version 2.0.