Clusion is an easy-to-use software library for searchable symmetric encryption (SSE). Its goal is to provide modular implementations of various state-of-the-art SSE schemes. Clusion includes constructions that handle single, disjunctive, conjunctive and (arbitrary) boolean keyword search. All the implemented schemes have optimal asymptotic search complexity in the worst case.
Clusion is provided as-is under the Modified BSD License (BSD-3).
Clusion is written in Java and has the following dependencies:
- Bouncy Castle https://www.bouncycastle.org/
- Apache Lucene https://lucene.apache.org/core/
- Apache PDFBox https://pdfbox.apache.org/
- Apache POI https://poi.apache.org/
- Google Guava https://github.com/google/guava
- SizeOF (needed to calculate object size in Java) http://sizeof.sourceforge.net/
- Hadoop-2.7.1 was used for our distributed implementation of the IEX-2Lev setup algorithm. Earlier releases of Hadoop may work as well but were not tested.
Clusion was tested with Java version 1.7.0_75.
Indexing. The indexer takes as input a folder that can contain PDF files, Microsoft files such as .doc and .ppt, media files such as pictures and videos, as well as raw text files such as .html and .txt. The indexing step outputs two lookup tables. The first associates keywords to document filenames while the second associates filenames to keywords. For the indexing, we use Lucene to tokenize the keywords and get rid of noisy words. For this phase, Apache Lucene, PDFBox and POI are required. For our data structures, we use Google Guava.
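As a rough illustration of the two lookup tables described above, here is a minimal sketch in plain Java. It is not Clusion's indexer: the class name `IndexSketch`, the tiny stop-word list and the regex tokenizer are assumptions made for this example, standing in for the Lucene-based tokenization and the Guava data structures that the library actually uses. It also relies on `Map.computeIfAbsent`, so it needs Java 8 or later.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

public class IndexSketch {
    // Hypothetical stop-word list; Clusion delegates noisy-word removal to Lucene.
    private static final Set<String> STOP_WORDS =
            new HashSet<>(Arrays.asList("the", "a", "an", "and", "of", "to", "is"));

    // First lookup table: keyword -> filenames containing it.
    public final Map<String, Set<String>> keywordToFiles = new TreeMap<>();
    // Second lookup table: filename -> keywords it contains.
    public final Map<String, Set<String>> fileToKeywords = new TreeMap<>();

    // Tokenizes the text naively and records each keyword in both tables.
    public void addDocument(String filename, String text) {
        for (String token : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (token.isEmpty() || STOP_WORDS.contains(token)) {
                continue;
            }
            keywordToFiles.computeIfAbsent(token, k -> new TreeSet<>()).add(filename);
            fileToKeywords.computeIfAbsent(filename, k -> new TreeSet<>()).add(token);
        }
    }

    public static void main(String[] args) {
        IndexSketch idx = new IndexSketch();
        idx.addDocument("a.txt", "The cat and the dog");
        idx.addDocument("b.txt", "A dog is a friend");
        System.out.println(idx.keywordToFiles.get("dog"));   // prints [a.txt, b.txt]
        System.out.println(idx.fileToKeywords.get("a.txt")); // prints [cat, dog]
    }
}
```

Sorted maps and sets are used here only so that the printed output is deterministic.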
Cryptographic primitives. All the implementations make use of the Bouncy Castle library. The code is modular and all cryptographic primitives are gathered in the CryptoPrimitives.java file. The file contains AES-CTR, HMAC_SHA256/512, AES-CMAC, key generation based on PBE PKCS1 and random string generation based on SecureRandom. In addition, it also contains an implementation of the HCBC1 online cipher from [BBKN07].
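To give a feel for three of the primitives named above (random byte generation, HMAC-SHA256 and AES-CTR), here is a minimal sketch. It does not reproduce the API of CryptoPrimitives.java: the class name `PrimitivesSketch` and all method names are hypothetical, and the sketch uses the JDK's built-in `javax.crypto` provider rather than Bouncy Castle so that it runs without extra dependencies.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import java.util.Arrays;
import javax.crypto.Cipher;
import javax.crypto.Mac;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;

public class PrimitivesSketch {
    // Returns n random bytes from SecureRandom (used for keys and IVs).
    public static byte[] randomBytes(int n) {
        byte[] out = new byte[n];
        new SecureRandom().nextBytes(out);
        return out;
    }

    // Computes HMAC-SHA256 over msg; the tag is always 32 bytes long.
    public static byte[] hmacSha256(byte[] key, byte[] msg) {
        try {
            Mac mac = Mac.getInstance("HmacSHA256");
            mac.init(new SecretKeySpec(key, "HmacSHA256"));
            return mac.doFinal(msg);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // AES in counter mode; iv is the 16-byte initial counter block.
    private static byte[] aesCtr(int mode, byte[] key, byte[] iv, byte[] data) {
        try {
            Cipher cipher = Cipher.getInstance("AES/CTR/NoPadding");
            cipher.init(mode, new SecretKeySpec(key, "AES"), new IvParameterSpec(iv));
            return cipher.doFinal(data);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    public static byte[] aesCtrEncrypt(byte[] key, byte[] iv, byte[] plaintext) {
        return aesCtr(Cipher.ENCRYPT_MODE, key, iv, plaintext);
    }

    public static byte[] aesCtrDecrypt(byte[] key, byte[] iv, byte[] ciphertext) {
        return aesCtr(Cipher.DECRYPT_MODE, key, iv, ciphertext);
    }

    public static void main(String[] args) {
        byte[] key = randomBytes(16); // 128-bit AES key
        byte[] iv = randomBytes(16);  // must never repeat under the same key
        byte[] msg = "searchable encryption".getBytes(StandardCharsets.UTF_8);

        byte[] ct = aesCtrEncrypt(key, iv, msg);
        byte[] pt = aesCtrDecrypt(key, iv, ct);
        System.out.println(Arrays.equals(msg, pt));      // prints true
        System.out.println(hmacSha256(key, msg).length); // prints 32
    }
}
```

The sketch reuses one key for encryption and for the MAC only to stay short; a real deployment would derive separate keys for each purpose.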
The following SSE schemes are implemented:
- 2Lev: a static and I/O-efficient SSE scheme [CJJJKRS14].
- IEX^B-2Lev: a worst-case optimal boolean SSE scheme [KM16]. This implementation makes use of 2Lev as a building block. The disjunctive-only IEX-2Lev construction from [KM16] is a special case of IEX^B-2Lev where the number of disjunctions is set to 1 in the Token algorithm.
- ZMF: a compact single-keyword SSE scheme (with linear search complexity) [KM16]. The construction is inspired by the Z-IDX construction [Goh03] but handles variable-sized collections of Bloom filters called Matryoshka filters. ZMF also makes a non-standard use of online ciphers. Here, we implemented the HCBC1 construction from [BBKN07] but would like to replace this with the more efficient COPE scheme from [ABLMTY13].
- IEX^B-ZMF: a compact worst-case optimal boolean SSE scheme. Like our IEX^B-2Lev implementation, the purely disjunctive variant IEX-ZMF is a special case with the number of disjunctions set to 1.
- IEX-2Lev-Amazon: a distributed implementation of text indexing based on MapReduce/Hadoop on Amazon AWS.

We also plan to share our client-server implementation for 2Lev, IEX^B-2Lev and IEX^B-ZMF once finalized.
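As background for the ZMF entry above, which builds on Bloom filters, here is a minimal sketch of a plain Bloom filter. It is not the Matryoshka filter construction from [KM16] and it is not secure: the positions are derived from an unkeyed SHA-256 hash purely for illustration, and the class name `BloomSketch` and its parameters are hypothetical. It uses `Math.floorMod`, so it needs Java 8 or later.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.BitSet;

public class BloomSketch {
    private final BitSet bits;
    private final int size;   // number of bits in the filter
    private final int hashes; // number of positions set per keyword

    public BloomSketch(int size, int hashes) {
        this.bits = new BitSet(size);
        this.size = size;
        this.hashes = hashes;
    }

    // Derives the i-th bit position for a keyword from SHA-256(i || keyword).
    private int position(String keyword, int i) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            md.update((byte) i);
            byte[] d = md.digest(keyword.getBytes(StandardCharsets.UTF_8));
            int v = ((d[0] & 0xff) << 24) | ((d[1] & 0xff) << 16)
                    | ((d[2] & 0xff) << 8) | (d[3] & 0xff);
            return Math.floorMod(v, size);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    // Sets all positions associated with the keyword.
    public void add(String keyword) {
        for (int i = 0; i < hashes; i++) {
            bits.set(position(keyword, i));
        }
    }

    // Returns false only if the keyword was definitely never added;
    // a true result may be a false positive.
    public boolean mightContain(String keyword) {
        for (int i = 0; i < hashes; i++) {
            if (!bits.get(position(keyword, i))) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        BloomSketch filter = new BloomSketch(1024, 3);
        filter.add("dog");
        System.out.println(filter.mightContain("dog")); // prints true
    }
}
```

A filter of this kind never produces false negatives, which is why only the positive lookup is shown with an expected output.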
For a quick test, create a folder, make sure that you have all required libraries, store some files in the folder and enjoy!
- to test 2Lev run TestLocal2Lev
- to test IEX-2Lev run TestLocalIEX2Lev
- to test IEX-ZMF run TestLocalIEXZMF
- to test IEX-2Lev on Amazon run IEX2LevAMAZON
Clusion currently does not have any documentation. The best way to learn how to use the library is to read through the source of the test code:
- TestLocal2Lev.java
- TestLocalIEX2Lev.java
- TestLocalIEXZMF.java

- [CJJJKRS14]: Dynamic Searchable Encryption in Very-Large Databases: Data Structures and Implementation by D. Cash, J. Jaeger, S. Jarecki, C. Jutla, H. Krawczyk, M. Rosu and M. Steiner.
- [KM16]: Boolean Searchable Symmetric Encryption with Worst-Case Optimal Complexity by S. Kamara and T. Moataz. Available upon request.
- [Goh03]: Secure Indexes by E. Goh.
- [ABLMTY13]: Parallelizable and Authenticated Online Ciphers by E. Andreeva, A. Bogdanov, A. Luykx, B. Mennink, E. Tischhauser and K. Yasuda.
- [BBKN07]: On-Line Ciphers and the Hash-CBC Constructions by M. Bellare, A. Boldyreva, L. Knudsen and C. Namprempre.