mpc4j

Introduction

Multi-Party Computation for Java (mpc4j) is an efficient and easy-to-use Secure Multi-Party Computation (MPC) and Differential Privacy (DP) library mainly written in Java.

mpc4j aims to provide an academic library for researchers to study and develop MPC/DP in a unified manner. As mpc4j tries to provide state-of-the-art MPC/DP implementations, researchers could leverage the library to have fair and quick comparisons between the new algorithms/protocols they proposed and existing ones.

We note that mpc4j is mainly focused on research and assumes a very strong system model. Specifically, mpc4j assumes never-crash nodes with a fully synchronized network. In practice, crash-recovery nodes with a partially synchronized network would be a more reasonable system model. Aside from the system model, mpc4j tries to integrate tools that are suitable for use in production environments. We emphasize that additional engineering problems need to be solved if you want to develop your own MPC/DP applications. A reasonable approach would be to implement the communication APIs on your own, develop protocols by calling tools in mpc4j, and refer to the protocol implementations in mpc4j as prototypes.

Features

mpc4j has the following features:

  • aarch64 support: mpc4j can run on both x86_64 and aarch64. Researchers can develop and test protocols on a MacBook M1 (aarch64) and then run experiments on Linux (x86_64).
  • SM series support: Developers may want to use SM series algorithms (SM2 for public-key operations, SM3 for hashing, and SM4 for block cipher operations) instead of regular algorithms (like secp256k1 for public-key operations, SHA256 for hashing, and AES for block cipher operations). Also, the SM series algorithms are accepted by ISO/IEC, so it may be necessary to support SM series algorithms under MPC settings. mpc4j leverages Bouncy Castle to support SM series algorithms.

Contact

mpc4j is mainly developed by Weiran Liu. Feel free to contact me at liuweiran900217@gmail.com.

  • The submodules involving Fully Homomorphic Encryption (FHE) are mainly developed by Liqiang Peng and Qixian Zhou.
  • The submodules involving Vector Oblivious Linear Evaluation (VOLE) are mainly developed by Hanwen Feng.
  • The components of TFHE are developed by Zhen Gu of the Computing Technology Lab (CTL) at DAMO Academy, Alibaba. The rest of their TFHE implementation, which extends SEAL, will be released later in their FHE library.
  • The FourQ-related implementations and mobile PSI-friendly OPRF (i.e., single-query OPRF) are developed by Qixian Zhou.
  • The submodules for circuits and operations based on the Boolean/arithmetic circuits are mainly developed by Li Peng.

Who Uses mpc4j

Currently, DataTrust is powered by mpc4j. If your project uses mpc4j and you do not mind it appearing here, don't hesitate to get in touch with me.

Academic Implementations

Some Implementations of our Works

If you want to test and evaluate our protocol implementations, compile and run the corresponding jar file with the config file. For example, to run the PSU implementations in the package mpc4j-s2pc-pso, first find the example config files located in conf/psu in mpc4j-s2pc-pso, and then run java -jar mpc4j-s2pc-pso-X.X.X-jar-with-dependencies.jar conf_file_name.txt separately on two platforms with direct network connections (using the network channel assigned in the config files) or in two terminals on one platform (using the local network 127.0.0.1). Note that **you need to run the server first and then run the client.** The server and the client implicitly synchronize before running the protocol, and the first step is that the client sends something like "hello" to the server. If the server is offline at that time, the program will get stuck.
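The start-up order described above can be illustrated with a minimal sketch built on plain JDK sockets. This is not the networking code used by mpc4j; the class name HelloHandshake, its methods, and the literal "hello" message are assumptions made only for illustration. The point it shows is that the server side must already be listening when the client sends its first message.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration only: NOT the mpc4j communication layer.
final class HelloHandshake {
    private HelloHandshake() {
    }

    // Server side: accept one connection and read the first line sent by the client.
    static String serverReceive(ServerSocket server) throws IOException {
        try (Socket socket = server.accept();
             BufferedReader in = new BufferedReader(
                 new InputStreamReader(socket.getInputStream(), StandardCharsets.UTF_8))) {
            return in.readLine();
        }
    }

    // Client side: connect to the listening server and send a single "hello" line.
    static void clientHello(int port) throws IOException {
        try (Socket socket = new Socket(InetAddress.getLoopbackAddress(), port);
             Writer out = new OutputStreamWriter(socket.getOutputStream(), StandardCharsets.UTF_8)) {
            out.write("hello\n");
            out.flush();
        }
    }

    // Runs both roles in one process on the loopback interface, server first.
    static String run() throws Exception {
        // Port 0 lets the operating system pick a free port; the server listens from here on.
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress())) {
            ExecutorService pool = Executors.newSingleThreadExecutor();
            try {
                Future<String> received = pool.submit(() -> serverReceive(server));
                // The client is started only after the server socket exists.
                clientHello(server.getLocalPort());
                return received.get(5, TimeUnit.SECONDS);
            } finally {
                pool.shutdownNow();
            }
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("server received: " + run());
    }
}
```

With plain sockets, a client that connects before any server is listening typically fails with a connection error; how mpc4j itself behaves in that case is described in the paragraph above.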

Some Implementations of Existing Works

mpc4j contains some implementations of existing works. See PAPERS.md for more details.

References

mpc4j includes some implementation ideas and code from the following open-source libraries.

Included Libraries

Here are some libraries that are included in mpc4j.

  • smile: A fast and comprehensive machine learning, NLP, linear algebra, graph, interpolation, and visualization system in Java and Scala. We understand many details of implementing machine learning tasks from this library. We also introduce some codes into mpc4j for the dataset management and our privacy-preserving federated GBDT implementation. See packages edu.alibaba.mpc4j.common.data in mpc4j-common-data and package edu.alibaba.mpc4j.sml.smile in mpc4j-sml-opboost for details. Note that we introduce source codes that are released only under the GNU Lesser General Public License v3.0 (LGPLv3).
  • Javallier: A Java library for Paillier partially homomorphic encryption based on python-paillier, with modifications to additionally support other schemes and optimizations. See mpc4j-crypto-phe for details.
  • JNA GMP project: A JNA wrapper around the GNU Multiple Precision Arithmetic Library. We modify the code for supporting the aarch64 system. See mpc4j-common-jna-gmp for details.
  • Bouncy Castle: A Java implementation of cryptographic algorithms, developed by the Legion of the Bouncy Castle, a registered Australian Charity. We understand many details of how to efficiently implement cryptographic algorithms using Java. We introduce its X25519 and Ed25519 implementations in mpc4j to support efficient Elliptic Curve Cryptographic (ECC) operations. See package edu.alibaba.mpc4j.common.tool.crypto.ecc.bc in mpc4j-common-tool for details.
  • Rings: An efficient, lightweight library for commutative algebra. We understand how to efficiently do algebra operations from this library. We wrap its polynomial interpolation implementations in mpc4j. See package edu.alibaba.mpc4j.common.tool.polynomial in mpc4j-common-tool for details. We also provide JdkIntegersZp that uses JNA GMP to implement operations in $\mathbb{Z}_p$. See JdkIntegersZp in mpc4j-common-tool for details.
  • blake2: Faster cryptographic hash function implementations. We introduce its original implementations and compare the efficiency with Java counterparts provided by Bouncy Castle and other hash functions (e.g., blake3). See crypto/blake2 in mpc4j-native-tool for details.
  • blake3: Much faster cryptographic hash function implementations. We introduce its original implementations and compare the efficiency with Java counterparts provided by Bouncy Castle and other hash functions (e.g., blake2). See crypto/blake3 in mpc4j-native-tool for details.
  • emp-toolkit: Efficient bit-matrix transpose (See bit_matrix_trans in mpc4j-native-tool), AES-NI implementations (See crypto/aes.h in mpc4j-native-tool), efficient $GF(2^\kappa)$ operations (See gf2k in mpc4j-native-tool).
  • KyberJCE: Kyber is an IND-CCA2-secure key encapsulation mechanism (KEM), whose security is based on the hardness of solving the learning-with-errors (LWE) problem over module lattices. KyberJCE is a pure-Java implementation of Kyber. We introduce its Kyber implementation in mpc4j for supporting post-quantum secure oblivious transfer. See crypto/kyber in mpc4j-native-tool for details.
  • xgboost-predictor: Pure Java implementation of XGBoost predictor for online prediction tasks. This work is released under the Apache License 2.0. We understand the format of the XGBoost model from this library. We also introduce some code into mpc4j for our privacy-preserving federated XGBoost implementation. See packages ai.h2o.algos.tree and biz.k11i.xgboost in mpc4j-sml-opboost for details.
  • curve25519-elisabeth: A pure-Java implementation of group operations on Curve25519. We introduce its Ed25519 and Ristretto implementations in mpc4j. See package crypto/ecc/cafe for details.
  • FourQlib: A library that implements essential elliptic curve and cryptographic functions based on FourQ, a high-security, high-performance elliptic curve that targets the 128-bit security level. We rewrote the makefile so that FourQ can now run on MacBook.

Inspired Libraries

Here are some libraries that inspire our implementations.

  • mobile_psi_cpp: A C++ library implementing several OPRF protocols and using them for Private Set Intersection. We introduce its LowMC parameters and encryption implementations in mpc4j. See edu.alibaba.mpc4j.common.tool.crypto.prp.JdkBytesLowMcPrp and edu.alibaba.mpc4j.common.tool.crypto.prp.JdkLongsLowMcPrp in mpc4j-common-tool for details.
  • emp-toolkit: We follow the implementation of the Silent OT protocol presented in the paper "Ferret: Fast Extension for coRRElated oT with Small Communication," accepted at CCS 2020 (See cot in mpc4j-s2pc-pcg).
  • Kunlun: A C++ wrapper for OpenSSL, making it handy to use without worrying about cumbersome memory management and memorizing complex interfaces. Based on this wrapper, Kunlun builds an efficient and modular crypto library. We introduce its OpenSSL wrapper for Elliptic Curve and the Window Method implementation in mpc4j, see ecc_openssl in mpc4j-native-tool for details.
  • PSI-analytics: The implementation of the protocols presented in the paper "Private Set Operations from Oblivious Switching," accepted at PKC 2021. We introduce its switching network implementations in mpc4j. See package benes_network in mpc4j-native-tool for details.
  • Diffprivlib: A general-purpose library for experimenting with, investigating, and developing applications in differential privacy. We understand how to organize source codes for implementing differential privacy mechanisms. See mpc4j-dp-cdp for details.
  • b2_exponential_mechanism: An exponential mechanism implementation with base-2 differential privacy. We re-implement the base-2 exponential mechanism in mpc4j. See package edu.alibaba.mpc4j.dp.cdp.nomial for details.
  • libOTe: Implementations for many Oblivious Transfer (OT) protocols, especially the Silent OT protocol presented in the paper "Silver: Silent VOLE and Oblivious Transfer from Hardness of Decoding Structured LDPC Codes" accepted at CRYPTO 2021 (See package cot in mpc4j-s2pc-pcg).
  • PSU: The implementation of the paper "Scalable Private Set Union from Symmetric-Key Techniques," published in ASIACRYPT 2019. We introduce its fast polynomial interpolation implementations in mpc4j. See package ntl_poly in mpc4j-native-tool for details. The PSU implementation is in package psu of mpc4j-s2pc-pso.
  • PSU: The implementation of the paper "Shuffle-based Private Set Union: Faster and More Secure," published in USENIX Security 2022. We introduce the idea of how to concurrently run the Oblivious Switching Network (OSN) in mpc4j. See package psu in mpc4j-s2pc-pso for details.
  • SpOT-PSI: The implementation of the paper "SpOT-Light: Lightweight Private Set Intersection from Sparse OT Extension," published in CRYPTO 2019. We introduce many ideas for fast polynomial interpolations in mpc4j. See package polynomial in mpc4j-common-tool for details.
  • OPRF-PSI: The implementation of the paper "Private Set Intersection in the Internet Setting From Lightweight Oblivious PRF," published in CRYPTO 2020. We introduce its OPRF implementations in mpc4j. See oprf in mpc4j-s2pc-pso for details.
  • APSI: The implementation of the paper "Labeled PSI from Homomorphic Encryption with Reduced Computation and Communication," published in CCS 2021. From its source code, we understand how to use the Fully Homomorphic Encryption (FHE) library SEAL. Most of the code for Unbalanced Private Set Intersection (UPSI) is derived from APSI. We also adapt the encoding part of 6857-private-categorization to support arbitrary bit-length elements. See mpc4j-native-fhe and upsi in mpc4j-s2pc-pso for details.
  • MiniPSI: The implementation of the paper "Compact and Malicious Private Set Intersection for Small Sets," published in CCS 2021. We understand how to implement Elligator encoding/decoding functions on Curve25519. See package crypto/ecc/bc/X25519BcByteMulElligatorEcc in mpc4j-common-tool for details.
  • Ed25519: Ed25519 for Go. We understand how to implement Elligator in Ed25519. See package crypto/ecc/bc/X25519BcByteMulElligatorEcc in mpc4j-common-tool for details.
  • dgs: Discrete Gaussians over the Integers. We learn many ways of discrete Gaussian sampling. See package common/sampler/integral/gaussian in mpc4j-common-sampler for details.
  • Pure-DP: A Python package that provides simple implementations of various state-of-the-art LDP algorithms (both Frequency Oracles and Heavy Hitters) with the main goal of providing a single, simple interface to benchmark and experiment with these algorithms. We learn many efficient LDP implementation details.
  • PantheonPIR, SimplePIR, MulPIR, Constant-weight PIR, FastPIR, Onion-PIR, SealPIR, and XPIR: We understand many details for implementing PIR schemes. We re-implement some protocols based on SEAL instead of NFLlib, since we found that we cannot compile NFLlib on a MacBook M1 with aarch64.
  • VOLE-PSI: VOLE-PSI implements the protocols described in "VOLE-PSI: Fast OPRF and Circuit-PSI from Vector-OLE" and "Blazing Fast PSI from Improved OKVS and Subfield VOLE". We understand how to implement "Blazing fast OKVS" and many details of how to refine our implementation.
  • Piano-PIR: A prototype implementation of the Piano private information retrieval (PIR) algorithm that allows a client to access a database without the server learning the queried index. We understand many details of the implementation.

Acknowledgments

  • We thank Prof. Benny Pinkas and Dr. Avishay Yanai for many discussions on implementing Private Set Intersection protocols. They also greatly helped our Java implementations of the Oblivious Key-Value Stores (OKVS) presented in the paper "Oblivious Key-Value Stores and Amplification for Private Set Intersection," accepted at CRYPTO 2021. See package okve/okvs in mpc4j-common-tool for more details.
  • We thank Dr. Stanislav Poslavsky and Prof. Benny Pinkas for many discussions on implementations of fast polynomial interpolations when we try to implement the PSI protocol presented in the paper "SpOT-Light: Lightweight Private Set Intersection from Sparse OT Extension."
  • We thank Prof. Mike Rosulek for the discussions about the implementation of Private Set Union (PSU). Their implementation for the paper "Private Set Operations from Oblivious Switching" helped us greatly in understanding how to implement PSU.
  • We thank Prof. Xiao Wang for discussions about fast bit-matrix transpose. From the discussion, we understand that the basic idea of fast bit-matrix transpose is from the blog The Full SSE2 Bit Matrix Transpose Routine. He also helped me realize that there exists an efficient polynomial operation implementation in $GF(2^\kappa)$ introduced in Intel Carry-Less Multiplication Instruction and its Usage for Computing the GCM Mode. See package galoisfield/gf2k in mpc4j-common-tool for more details.
  • We thank Prof. Peihan Miao for discussions about the implementation of the paper "Private Set Intersection in the Internet Setting From Lightweight Oblivious PRF." From the discussion, we understand there is a special case for the lightweight OPRF when $n = 1$. See package oprf in mpc4j-s2pc-pso for more details.
  • We thank Prof. Yu Chen for many discussions on various MPC protocols. Here we recommend his open-source library Kunlun, a modern crypto library. We thank Minglang Dong for her example codes about implementing the Window Method for fixed-base multiplication in ECC.
  • We thank Dr. Bolin Ding for many discussions on introducing MPC into the database field. Here we recommend the open-source library FederatedScope, an easy-to-use federated learning package, from his team.
  • We thank the anonymous USENIX Security 2023 Artifact Evaluation (AE) reviewers for many suggestions for the mpc4j documentation and for mpc4j-native-tool. These suggestions helped us fix many memory leak problems. Also, the comments helped us remove much duplicate code.
  • We thank Dr. Kevin Yeo and Dr. Joon Young Seo for discussions on how to implement band matrix solvers used in "Near-Optimal Oblivious Key-Value Stores for Efficient PSI, PSU and Volume-Hiding Multi-Maps".

License

This library is licensed under Apache License 2.0.

Specifications

C/C++ Modules

Most of the code is in Java, except for very efficient implementations in C/C++. You need OpenSSL, GMP, NTL, MCL, libsodium, and the FourQ version that we rewrote (in mpc4j-native-fourq) to compile mpc4j-native-tool, and SEAL 4.0.0 to compile mpc4j-native-fhe. Please see README.md in mpc4j-native-fourq, mpc4j-native-tool and mpc4j-native-fhe on how to install the C/C++ dependencies.

After successfully installing the C/C++ library mpc4j-native-fourq and obtaining the compiled C/C++ libraries (named libmpc4j-native-tool and libmpc4j-native-fhe, respectively), you need to specify the native library location using -Djava.library.path when running mpc4j.
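Before running mpc4j, it can help to check whether the two native libraries are actually visible under the configured java.library.path. The following is a minimal sketch using only the JDK; the class NativeLibraryLocator and its locate method are hypothetical helpers and not part of mpc4j. It relies on System.mapLibraryName, which turns a name such as mpc4j-native-tool into the platform-specific file name (for example libmpc4j-native-tool.dylib on macOS).

```java
import java.io.File;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Optional;

// Hypothetical helper, not part of mpc4j.
final class NativeLibraryLocator {
    private NativeLibraryLocator() {
    }

    // Searches each directory of a java.library.path-style string for the
    // platform-specific file name of the given library.
    static Optional<Path> locate(String libraryPath, String libraryName) {
        String fileName = System.mapLibraryName(libraryName);
        for (String entry : libraryPath.split(File.pathSeparator)) {
            if (entry.isEmpty()) {
                continue;
            }
            Path candidate = Paths.get(entry, fileName);
            if (Files.isRegularFile(candidate)) {
                return Optional.of(candidate);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        String libraryPath = System.getProperty("java.library.path", "");
        // Library names as given in this README.
        String[] names = {"mpc4j-native-tool", "mpc4j-native-fhe"};
        for (String name : names) {
            Optional<Path> found = locate(libraryPath, name);
            if (found.isPresent()) {
                System.out.println(name + " found at " + found.get());
            } else {
                System.out.println(name + " NOT found in java.library.path");
            }
        }
    }
}
```

Running this class with the same -Djava.library.path value that you pass to mpc4j reports which of the two libraries the JVM would be able to find by file name. It does not verify that the library was built for the right architecture.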

Tests

mpc4j has been tested on macOS (x86_64 / aarch64), Ubuntu 20.04 (x86_64 / aarch64), and CentOS 8 (x86_64). We welcome developers to do tests on other platforms.

We note that you may need to run the test cases in mpc4j-s2pc-pir separately, especially those in IndexPirTest and KwPirTest. The reason is that PIR and related implementations consume a large amount of main memory, and running all test cases in one go may trigger frequent full GC, which can cause problems.

Performances

We have received a lot of suggestions and some performance reports from users. We thank Dr. Yongha Son for providing performance reports for Private Set Union (PSU) on his development platform (Intel Xeon 3.5GHz) under the Unit Test. He reported that:

Well, I tested other protocols, particularly JSZ22 SFC, GMR21, and KRTW19, from unit tests.

  • JSZ22 takes 4x faster time.

  • KRTW19 and GMR21 take 1.5x slower.

  • ZCL22 takes 2.5-3x slower time.

than the reported numbers in ZCL22.

We had an in-depth discussion about the performance gap and identified the following reasons:

  1. In the Unit Test, we use an optimized way of implementing JSZ22. Roughly speaking, we can use the batched related-key OPRF proposed by Kolesnikov et al. instead of the more general multi-point OPRF proposed by Chase and Miao to speed up the underlying OPRF. The reason is that JSZ22 uses cuckoo hashing to bin the input elements, which suits the related-key OPRF. See our paper "Private Set Operations from Multi-Query Reverse Private Membership Test" for more details.
  2. As far as we know, server-grade CPUs (like Intel Xeon 3.5GHz) provide more efficient instructions than desktop-grade CPUs (like Intel i9900k). Note that NTL and GMP automatically detect the underlying platform to choose the most efficient configuration. We suspect that these instructions help the NTL and GMP libraries run faster. In contrast, such instructions seem to bring little help to ECC operations. As a comparison, Dr. Yongha Son ran EccEfficiencyTest on his platform. The result shows that ECC operations on his platform with asm are much slower (about 5x) than on our MacBook M1 platform without asm.

We have to say that we underestimated the performance gap between different platforms. The performance comparison result also reflects that having fair comparisons for different protocols is very challenging. Aside from that, we still try to provide a unified library for trying to have a relatively fair comparison.

Notes for Running on aarch64

When using or developing mpc4j on aarch64 systems (like MacBook M1), you may get java.lang.UnsatisfiedLinkError with a description like "no mpc4j-native-tool / mpc4j-native-fhe in java.library.path", even if you have correctly compiled the native libraries and configured the native library paths using -Djava.library.path. The reason is that some Java Virtual Machines (JVMs) with versions lower than 17 do not fully support aarch64. The JDK 17 Release Notes state (in JEP 391: macOS / AArch64 Port):

macOS 11.0 now supports the AArch64 architecture. This JEP implements support for the macos-aarch64 platform in the JDK. One of the features added is support for the W^X (write xor execute) memory. It is enabled only for macos-aarch64 and can be extended to other platforms at some point. The JDK can be either cross-compiled on an Intel machine or compiled on an Apple M1-based machine.

We recommend using Java 17 (or a higher version) to run or develop mpc4j on aarch64 systems. If you still want to use a Java version lower than 17, we tested many JVMs and found that Azul Zulu fully supports aarch64.
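The recommendation above can be turned into a quick self-check. The sketch below uses only standard system properties; the class Aarch64JvmCheck and its methods are hypothetical names and not part of mpc4j. It parses java.specification.version (which is "1.8" on Java 8 and "17" on Java 17) and flags the combination of an aarch64 JVM with a version lower than 17. Note that os.arch reports the architecture of the running JVM, which may differ from the hardware if an x86_64 JVM is installed.

```java
// Hypothetical helper, not part of mpc4j.
final class Aarch64JvmCheck {
    private Aarch64JvmCheck() {
    }

    // Converts a java.specification.version value into a feature version:
    // "1.8" -> 8, "11" -> 11, "17" -> 17.
    static int featureVersion(String specVersion) {
        String[] parts = specVersion.split("\\.");
        if (parts.length > 1 && parts[0].equals("1")) {
            return Integer.parseInt(parts[1]);
        }
        return Integer.parseInt(parts[0]);
    }

    // True when the JVM reports aarch64 and is older than Java 17,
    // the situation this section recommends avoiding.
    static boolean needsAttention(String osArch, String specVersion) {
        return "aarch64".equals(osArch) && featureVersion(specVersion) < 17;
    }

    public static void main(String[] args) {
        String osArch = System.getProperty("os.arch");
        String specVersion = System.getProperty("java.specification.version");
        System.out.println("os.arch = " + osArch + ", java version = " + specVersion);
        if (needsAttention(osArch, specVersion)) {
            System.out.println("Consider Java 17+ (or a JVM with full aarch64 support).");
        }
    }
}
```

A check like this only reports the JVM configuration. It does not load the native libraries, so an UnsatisfiedLinkError can still have other causes, such as a wrong library path.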

Notes for Errors on FourQlib

When you run make test for mpc4j-native-fourq, you may encounter test failures. The reason is that the original FourQlib has some unknown bugs when running on some platforms (currently we do not know which platforms trigger them). See Issue #9 in FourQlib and Issue #16 in mpc4j.

You can simply ignore the error, but many test cases in mpc4j would then fail since mpc4j uses the FourQ curve by default. In that case, you need to change the default EC curve from FourQ to ED25519 (also see Issue #16 in mpc4j for more details):

  1. In module mpc4j-common-tool, find ByteEccFactory in package edu.alibaba.mpc4j.common.tool.crypto.ecc.
  2. Find the function public static ByteFullEcc createFullInstance(EnvType envType).
  3. Change return createFullInstance(ByteEccType.FOUR_Q); to return createFullInstance(ByteEccType.ED25519_SODIUM);.

Development

We develop mpc4j using IntelliJ IDEA and CLion. Here are some guidelines.

IntelliJ IDEA Preferences

Please change the following Preferences before actual development:

  1. Editor -> Code Style -> Java: Tab size, Indent, Continuation indent are all 4.
  2. Editor -> Code Style -> Java -> Imports: select "Insert imports for inner classes".
  3. Editor -> Inspections: select Java -> JVM languages, and select "Serializable class without 'serialVersionUID'". We note that all PtoId in PtoDesc instances are generated using serialVersionUID. When creating a new instance of PtoDesc, make it implement Serializable, follow the warning to generate a serialVersionUID, paste that ID as the PtoId, and then delete implements Serializable and the corresponding imports.
  4. Plugins: Install and use "Git Commit Template" to write commit. If necessary, install and use "Alibaba Java Coding Guidelines" for unified code styles.

Linking Native Libraries

After successfully installing mpc4j-native-fourq, compiling mpc4j-native-tool and mpc4j-native-fhe, you need to configure IDEA with the following procedures so that IDEA can link to these native libraries.

  1. Open Run->Edit Configurations...
  2. Open Edit Configuration templates...
  3. Select JUnit.
  4. Add the following command into VM Options. Do not remove -ea, which enables assertions in unit tests. If it is removed, some test cases (related to input verifications) would fail.
-Djava.library.path=/YOUR_MPC4J_ABSOLUTE_PATH/mpc4j-native-tool/cmake-build-release:/YOUR_MPC4J_ABSOLUTE_PATH/mpc4j-native-fhe/cmake-build-release

Demonstration

We thank Qixian Zhou for writing a guideline demonstrating configuring the development environment on macOS (x86_64). We believe this guideline can also be used for other platforms, e.g., macOS (M1), Ubuntu, and CentOS. Here are the steps:

  1. Follow any guidelines to install JDK 8 and IntelliJ IDEA. If you successfully install JDK 8, you will see output similar to the following in the terminal when executing java -version.
java version "1.8.0_301"
Java(TM) SE Runtime Environment (build 1.8.0_301-b09)
Java HotSpot(TM) 64-Bit Server VM (build 25.301-b09, mixed mode)
  2. Clone the mpc4j source code using git clone https://github.com/alibaba-edu/mpc4j.git.
  3. Follow the documentation in https://github.com/alibaba-edu/mpc4j/tree/main/mpc4j-native-tool to compile mpc4j-native-tool. If all steps are correct, you will see:
[100%] Linking CXX shared library libmpc4j-native-tool.dylib
[100%] Built target mpc4j-native-tool
  4. Follow the documentation in https://github.com/alibaba-edu/mpc4j/tree/main/mpc4j-native-fhe to compile mpc4j-native-fhe. If all steps are correct, you will see:
[100%] Linking CXX shared library libmpc4j-native-fhe.dylib
[100%] Built target mpc4j-native-fhe
  5. Use IntelliJ IDEA to open mpc4j.
  6. Open Run->Edit Configurations....
  7. Open Edit Configuration templates....
  8. Select JUnit, and add the following command into VM Options (note that you must replace /YOUR_MPC4J_ABSOLUTE_PATH with your own absolute path for libmpc4j-native-tool.dylib and libmpc4j-native-fhe.dylib):
-Djava.library.path=/YOUR_MPC4J_ABSOLUTE_PATH/mpc4j-native-tool/cmake-build-release:/YOUR_MPC4J_ABSOLUTE_PATH/mpc4j-native-fhe/cmake-build-release
  9. Now, you can run the tests of any submodule by pressing the green arrows shown to the left of the source code in test packages.

TODO List

Possible Missions

  • Provide more documentation.
  • Translate JavaDoc and comments into English.
  • We are still adjusting our implementations of many Private Set Intersection protocols. We will release the source code as soon as it is available.
  • More secure two-party computation (2PC) protocol implementations.
  • More secure three-party computation (3PC) protocol implementations. Specifically, release the source code of our paper "Scape: Scalable Collaborative Analytics System on Private Database with Malicious Security" accepted at ICDE 2022.
  • More differentially private algorithms and protocols, especially for the Shuffle Model implementations of our paper "Privacy Enhancement via Dummy Points in the Shuffle Model."

Impossible Missions, but We Will Try