/kamikaze

DocId set compression and set operation library

Primary language: Java. License: Apache License 2.0 (Apache-2.0).

What is Kamikaze

Kamikaze is a utility package that wraps set implementations over document ID lists.
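To make "set operations on document lists" concrete, here is a minimal sketch of AND and OR over two sorted, duplicate-free doc id arrays using a linear merge. The class `SortedDocIdOps` and its methods are hypothetical names chosen for illustration; they are not part of the Kamikaze API, which may organize these operations differently.

```java
import java.util.Arrays;

/** Hypothetical helper for illustration only; not part of the Kamikaze API. */
final class SortedDocIdOps {

    /** AND: doc ids present in both inputs (inputs must be sorted and duplicate-free). */
    static int[] intersect(int[] a, int[] b) {
        int[] out = new int[Math.min(a.length, b.length)];
        int i = 0, j = 0, n = 0;
        while (i < a.length && j < b.length) {
            if (a[i] < b[j]) {
                i++;                 // a is behind, advance it
            } else if (a[i] > b[j]) {
                j++;                 // b is behind, advance it
            } else {
                out[n++] = a[i];     // match: emit once, advance both
                i++;
                j++;
            }
        }
        return Arrays.copyOf(out, n);
    }

    /** OR: doc ids present in either input, each emitted once, in sorted order. */
    static int[] union(int[] a, int[] b) {
        int[] out = new int[a.length + b.length];
        int i = 0, j = 0, n = 0;
        while (i < a.length && j < b.length) {
            if (a[i] < b[j]) {
                out[n++] = a[i++];
            } else if (a[i] > b[j]) {
                out[n++] = b[j++];
            } else {
                out[n++] = a[i];
                i++;
                j++;
            }
        }
        while (i < a.length) out[n++] = a[i++];   // drain the rest of a
        while (j < b.length) out[n++] = b[j++];   // drain the rest of b
        return Arrays.copyOf(out, n);
    }
}
```

Both methods run in time linear in the combined input length, which is why keeping doc id lists sorted matters for this kind of library.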

It also implements the PForDelta compression algorithm for sorted integer segments, which enables inverted list compression for search engines such as Lucene (http://lucene.apache.org/core/4_5_1/core/org/apache/lucene/util/PForDeltaDocIdSet.html).

Kamikaze is based on the PForDelta algorithm as described in the following paper: H. Yan, S. Ding, and T. Suel, "Inverted Index Compression and Query Processing with Optimized Document Ordering," The 18th International World Wide Web Conference (WWW'09), Madrid, Spain, April 2009.
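The general idea behind PForDelta can be sketched as follows: store the gaps between consecutive sorted ids, pick a small slot width that fits most gaps, bit-pack every gap into a slot, and record the few gaps that do not fit as exceptions to be patched in on decode. The code below is a simplified illustration under stated assumptions: ids are non-negative and sorted ascending, the slot width is the smallest one covering about 90% of the gaps, and exception overflow bits go into a side array. The class `PForDeltaSketch` and all of its members are hypothetical; this is not Kamikaze's implementation and not necessarily the layout used in the paper.

```java
import java.util.Arrays;

/** Hypothetical, simplified PForDelta-style codec for illustration only. */
final class PForDeltaSketch {

    /** One encoded block: fixed-width slots plus a side list of exceptions. */
    static final class Block {
        final int count;       // number of encoded ids
        final int bits;        // slot width in bits
        final long[] packed;   // low `bits` bits of every gap, packed back to back
        final int[] excPos;    // positions of gaps wider than `bits`
        final int[] excHigh;   // overflow bits of those gaps (gap >>> bits)

        Block(int count, int bits, long[] packed, int[] excPos, int[] excHigh) {
            this.count = count;
            this.bits = bits;
            this.packed = packed;
            this.excPos = excPos;
            this.excHigh = excHigh;
        }
    }

    /** Encodes non-negative ids sorted in ascending order. */
    static Block encode(int[] sortedIds) {
        int n = sortedIds.length;
        int[] gaps = new int[n];
        int[] widths = new int[n];
        int prev = 0;
        for (int i = 0; i < n; i++) {
            gaps[i] = sortedIds[i] - prev;   // delta step
            prev = sortedIds[i];
            widths[i] = 32 - Integer.numberOfLeadingZeros(gaps[i]);
        }
        // Assumption: smallest width covering ceil(0.9 * n) of the gaps.
        int bits = 1;
        if (n > 0) {
            int[] sortedWidths = widths.clone();
            Arrays.sort(sortedWidths);
            bits = Math.max(1, sortedWidths[(9 * n + 9) / 10 - 1]);
        }
        long mask = (1L << bits) - 1;
        long[] packed = new long[(int) (((long) n * bits + 63) / 64)];
        int[] excPos = new int[n];
        int[] excHigh = new int[n];
        int excCount = 0;
        for (int i = 0; i < n; i++) {
            long low = gaps[i] & mask;
            long bitPos = (long) i * bits;
            int word = (int) (bitPos >>> 6);
            int shift = (int) (bitPos & 63);
            packed[word] |= low << shift;
            if (shift + bits > 64) {         // slot straddles two words
                packed[word + 1] |= low >>> (64 - shift);
            }
            if (widths[i] > bits) {          // exception: keep the overflow bits aside
                excPos[excCount] = i;
                excHigh[excCount] = gaps[i] >>> bits;
                excCount++;
            }
        }
        return new Block(n, bits, packed,
                Arrays.copyOf(excPos, excCount), Arrays.copyOf(excHigh, excCount));
    }

    /** Unpacks the slots, patches exceptions, then prefix-sums gaps back into ids. */
    static int[] decode(Block blk) {
        int[] out = new int[blk.count];
        long mask = (1L << blk.bits) - 1;
        for (int i = 0; i < blk.count; i++) {
            long bitPos = (long) i * blk.bits;
            int word = (int) (bitPos >>> 6);
            int shift = (int) (bitPos & 63);
            long v = blk.packed[word] >>> shift;
            if (shift + blk.bits > 64) {
                v |= blk.packed[word + 1] << (64 - shift);
            }
            out[i] = (int) (v & mask);
        }
        for (int k = 0; k < blk.excPos.length; k++) {
            out[blk.excPos[k]] |= blk.excHigh[k] << blk.bits;   // patch step
        }
        int running = 0;
        for (int i = 0; i < blk.count; i++) {
            running += out[i];
            out[i] = running;
        }
        return out;
    }
}
```

Calling `PForDeltaSketch.decode(PForDeltaSketch.encode(ids))` on a sorted array should return the same ids. The benefit comes from lists where most gaps are small: nearly all values then fit in a few bits each, and only the rare large gaps pay for a side entry.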

Kamikaze was open-sourced by LinkedIn Corp.: http://data.linkedin.com/opensource/kamikaze.

The principal committer of Kamikaze is Hao Yan. If you have any questions regarding Kamikaze, please email him at hyan@linkedin.com.


Wiki

Wiki is available HERE

Issues

Issues are tracked HERE