
TREC evaluation demonstration and query expansion module for Lucene, written for a lecture on Information Retrieval. It covers parsing and indexing the TREC 10G dataset, searching, and query expansion using the Rocchio and LDA methods.


#Lucene-QueryExpansion-Modules

This is a query expansion module for Lucene, based on the Rocchio and LDA algorithms. It was written for my IR homework to demonstrate the effectiveness and feasibility of query expansion on the Lucene framework.

The following open source projects were referenced for this:

###Prerequisite

Java 1.7 or higher

Maven 3.0.4 or higher

MacOS/Linux supported

  • Please note: we tested it on Windows, but it failed.
  • The source code is organized according to Maven's project structure. The command files described below let you test the code and adjust parameters quickly.

###About Command files (Execution)

Quick shell scripts are available so that you can easily run and test the code, as follows:

(To execute one, type './build.sh' or 'sh build.sh' on the command line.)

  • build.sh - Downloads all necessary files from the net into local space and installs them (mvn install).

  • IndexTrec.sh - Builds the index from the TREC dataset. The root directory of the dataset must be given as an argument (parameter) when it is executed. (Edit this to point to another directory when you run it in your own environment.)

  • SearchFiles.sh - Searches the index built in the previous step. A basic interpreter is provided for instant testing.

  • QueryExpansion-rocchio.sh - Performs query expansion using the index in the 'index' sub-folder, with a command-line user interface. It uses the Rocchio algorithm to expand the given query.

  • QueryExpansion-LDA.sh - Performs query expansion using the LDA approach.
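
For readers unfamiliar with Rocchio expansion, here is a minimal sketch of the idea. It is not the code in this repository and does not use Lucene at all. It assumes the query and the feedback documents are already available as term-weight maps, and it applies only the positive part of the Rocchio update, q' = alpha * q + beta * centroid(feedback documents). The class name `RocchioSketch`, the method `expand`, and the parameters `alpha`, `beta` and `topK` are hypothetical names chosen for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: not part of this repository and independent of Lucene.
public class RocchioSketch {

    // Adds delta to the weight stored for term (missing terms start at 0).
    private static void add(Map<String, Double> weights, String term, double delta) {
        Double old = weights.get(term);
        weights.put(term, (old == null ? 0.0 : old) + delta);
    }

    // Computes q' = alpha * q + beta * centroid(feedbackDocs) and keeps the
    // topK highest-weighted terms. The negative (non-relevant) term of the
    // full Rocchio formula is left out of this sketch.
    public static Map<String, Double> expand(Map<String, Double> query,
                                             List<Map<String, Double>> feedbackDocs,
                                             double alpha, double beta, int topK) {
        Map<String, Double> weights = new HashMap<>();
        for (Map.Entry<String, Double> e : query.entrySet()) {
            add(weights, e.getKey(), alpha * e.getValue());
        }
        if (!feedbackDocs.isEmpty()) {
            double scale = beta / feedbackDocs.size();
            for (Map<String, Double> doc : feedbackDocs) {
                for (Map.Entry<String, Double> e : doc.entrySet()) {
                    add(weights, e.getKey(), scale * e.getValue());
                }
            }
        }

        // Rank terms by descending weight.
        List<Map.Entry<String, Double>> ranked = new ArrayList<>(weights.entrySet());
        Collections.sort(ranked, new Comparator<Map.Entry<String, Double>>() {
            @Override
            public int compare(Map.Entry<String, Double> a, Map.Entry<String, Double> b) {
                return Double.compare(b.getValue(), a.getValue());
            }
        });

        // LinkedHashMap preserves the ranking order in the result.
        Map<String, Double> expanded = new LinkedHashMap<>();
        for (int i = 0; i < Math.min(topK, ranked.size()); i++) {
            expanded.put(ranked.get(i).getKey(), ranked.get(i).getValue());
        }
        return expanded;
    }

    public static void main(String[] args) {
        Map<String, Double> query = new HashMap<>();
        query.put("lucene", 1.0);

        Map<String, Double> d1 = new HashMap<>();
        d1.put("lucene", 0.5);
        d1.put("index", 1.0);
        Map<String, Double> d2 = new HashMap<>();
        d2.put("index", 1.0);
        d2.put("search", 0.5);

        List<Map<String, Double>> docs = new ArrayList<>();
        docs.add(d1);
        docs.add(d2);

        // prints {lucene=1.1875, index=0.75}
        System.out.println(expand(query, docs, 1.0, 0.75, 2));
    }
}
```

In this toy run the term "index" is added to the query because it is frequent in the feedback documents, while "search" falls below the `topK` cutoff. How the actual scripts weight terms and choose feedback documents may differ.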

###Support and Feedback

No support is currently available. You may use it freely, at your own risk.