Security and Privacy Risk Simulator for Machine Learning


AIJack
Let's hijack AI!

❤️ If you like AIJack, please consider becoming a GitHub Sponsor ❤️

What is AIJack?

AIJack allows you to assess the privacy and security risks of machine learning algorithms against attacks such as Model Inversion, Poisoning, and Evasion. AIJack also provides various defense techniques, including Federated Learning, Split Learning, Differential Privacy, Homomorphic Encryption, and other heuristic approaches. We currently implement more than 20 state-of-the-art methods, and we support MPI for some of the distributed algorithms. For more information, see the documentation.


Installation

AIJack requires Boost and pybind11.

apt install -y libboost-all-dev
pip install -U pip
pip install "pybind11[global]"

pip install git+https://github.com/Koukyosyumei/AIJack

Supported Algorithms

Distributed Learning

Method       Example  Paper
FedAVG       example  paper
FedProx      WIP      paper
FedKD        example  paper
FedMD        WIP      paper
FedGEMS      WIP      paper
DSFL         WIP      paper
SplitNN      example  paper
SecureBoost  example  paper

Attack

Attack                    Type                  Example  Paper
MI-FACE                   Model Inversion       example  paper
DLG                       Model Inversion       example  paper
iDLG                      Model Inversion       example  paper
GS                        Model Inversion       example  paper
CPL                       Model Inversion       example  paper
GradInversion             Model Inversion       example  paper
GAN Attack                Model Inversion       example  paper
Shadow Attack             Membership Inference  example  paper
Norm attack               Label Leakage         example  paper
Gradient descent attacks  Evasion Attack        example  paper
SVM Poisoning             Poisoning Attack      example  paper

Defense

Defense   Type                    Example  Paper
DPSGD     Differential Privacy    example  paper
Paillier  Homomorphic Encryption  example  paper
CKKS      Homomorphic Encryption  test     paper
Soteria   Others                  example  paper
MID       Others                  example  paper

Quick Start

We briefly introduce some example usages. You can also find more examples in the example directory.

Federated Learning and Model Inversion Attack

FedAVG is the most representative algorithm of Federated Learning, where multiple clients jointly train a single model without sharing their local datasets.

  • Base

You can write the FedAVG training process much like standard training with PyTorch.

import torch
import torch.optim as optim

from aijack.collaborative import FedAvgClient, FedAvgServer

# local_model_1, local_model_2, global_model, trainloaders, criterion, and lr
# are assumed to be defined beforehand.
clients = [FedAvgClient(local_model_1, user_id=0), FedAvgClient(local_model_2, user_id=1)]
optimizers = [optim.SGD(clients[0].parameters(), lr=lr), optim.SGD(clients[1].parameters(), lr=lr)]
server = FedAvgServer(clients, global_model)

for client, local_trainloader, local_optimizer in zip(clients, trainloaders, optimizers):
    for data in local_trainloader:
        inputs, labels = data
        local_optimizer.zero_grad()
        outputs = client(inputs)
        loss = criterion(outputs, labels.to(torch.int64))
        client.backward(loss)
        local_optimizer.step()
server.action()  # aggregate the local updates and distribute the new global model
  • Attack

You can simulate a gradient-based model inversion attack against FedAVG.

from aijack.attack import GradientInversionAttackManager

# clients, global_model, input_shape, and lr are assumed to be defined beforehand.
dlg_manager = GradientInversionAttackManager(input_shape, distancename="l2")
FedAvgServer_DLG = dlg_manager.attach(FedAvgServer)
server = FedAvgServer_DLG(clients, global_model, lr=lr)

reconstructed_image, reconstructed_label = server.attack()
  • Defense

One possible defense for FedAVG clients is Soteria; you need only two additional lines to apply it.

from aijack.collaborative import FedAvgClient
from aijack.defense import SoteriaManager

# Net, i (the client id), and lr are assumed to be defined beforehand.
manager = SoteriaManager("conv", "lin", target_layer_name="lin.0.weight")
SoteriaFedAvgClient = manager.attach(FedAvgClient)
client = SoteriaFedAvgClient(Net(), user_id=i, lr=lr)

Split Learning and Label Leakage Attack

You can use Split Learning, where only one of the parties holds the ground-truth labels.

  • Base
import torch.optim as optim

from aijack.collaborative import SplitNN, SplitNNClient

# model_1, model_2, dataloader, and criterion are assumed to be defined beforehand.
clients = [SplitNNClient(model_1, user_id=0), SplitNNClient(model_2, user_id=1)]
optimizers = [optim.Adam(model_1.parameters()), optim.Adam(model_2.parameters())]
splitnn = SplitNN(clients, optimizers)

for data in dataloader:
    splitnn.zero_grad()
    inputs, labels = data
    outputs = splitnn(inputs)
    loss = criterion(outputs, labels)
    splitnn.backward(loss)
    splitnn.step()
  • Attack

We support a norm-based label leakage attack against Split Learning.

from aijack.attack import NormAttackManager
from aijack.collaborative import SplitNN

manager = NormAttackManager(criterion, device="cpu")
NormAttackSplitNN = manager.attach(SplitNN)
normattacksplitnn = NormAttackSplitNN(clients, optimizers)
leak_auc = normattacksplitnn.attack(target_dataloader)

DPSGD (SGD with Differential Privacy)

DPSGD is an optimizer based on Differential Privacy that gives your deep learning model a theoretical privacy guarantee. We implement the core differential privacy mechanisms in C++, which is faster than many other libraries implemented purely in Python.

import torch
import torch.optim as optim
from torch.utils.data import TensorDataset

from aijack.defense import GeneralMomentAccountant, PrivacyManager

# net, trainset, criterion, lr, sigma, l2_norm_clip, and iterations
# are assumed to be defined beforehand.
accountant = GeneralMomentAccountant(noise_type="Gaussian", search="greedy", orders=list(range(2, 64)), bound_type="rdp_tight_upperbound")
privacy_manager = PrivacyManager(accountant, optim.SGD, l2_norm_clip=l2_norm_clip, dataset=trainset, iterations=iterations)
dpoptimizer_cls, lot_loader, batch_loader = privacy_manager.privatize(noise_multiplier=sigma)
optimizer = dpoptimizer_cls(net.parameters(), lr=lr)  # instantiate the privatized optimizer

for data in lot_loader(trainset):
    X_lot, y_lot = data
    optimizer.zero_grad()
    for X_batch, y_batch in batch_loader(TensorDataset(X_lot, y_lot)):
        optimizer.zero_grad_keep_accum_grads()
        pred = net(X_batch)
        loss = criterion(pred, y_batch.to(torch.int64))
        loss.backward()
        optimizer.update_accum_grads()
    optimizer.step()
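
After training, the accountant passed to PrivacyManager can report the consumed privacy budget. The following is a minimal sketch; it assumes the accountant exposes a get_epsilon method and was updated by the privatized optimizer at every step, so check the documentation of your version if the interface differs.

# Sketch: report the accumulated (epsilon, delta)-DP guarantee after training.
# `accountant` is the GeneralMomentAccountant created above; get_epsilon is
# assumed to be available in your version of AIJack.
delta = 1e-5
epsilon = accountant.get_epsilon(delta)
print(f"Training satisfies ({epsilon:.2f}, {delta})-differential privacy")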

SecureBoost (XGBoost with Homomorphic Encryption)

SecureBoost is a vertically federated version of XGBoost, where each party encrypts sensitive information with Paillier Encryption. You need an additional compilation step to use SecureBoost, which requires Boost 1.65 or later.

cd src/aijack/collaborative/tree
pip install -e .

Then, in Python:

from aijack_secureboost import SecureBoostParty, SecureBoostClassifier, PaillierKeyGenerator

# x1, x2 (each party's features), y, X, and the hyperparameters
# (subsample_cols, min_child_weight, depth, min_leaf, learning_rate,
# boosting_rounds, lam, gamma, eps) are assumed to be defined beforehand.
keygenerator = PaillierKeyGenerator(512)
pk, sk = keygenerator.generate_keypair()

sclf = SecureBoostClassifier(2, subsample_cols, min_child_weight, depth, min_leaf,
                             learning_rate, boosting_rounds, lam, gamma, eps,
                             0, 0, 1.0, 1, True)

sp1 = SecureBoostParty(x1, 2, [0], 0, min_leaf, subsample_cols, 256, False, 0)
sp2 = SecureBoostParty(x2, 2, [1], 1, min_leaf, subsample_cols, 256, False, 0)

sparties = [sp1, sp2]

# Both parties receive the public key; only the active party holds the secret key.
sparties[0].set_publickey(pk)
sparties[1].set_publickey(pk)
sparties[0].set_secretkey(sk)

sclf.fit(sparties, y)

sclf.predict_proba(X)

Evasion Attack

An Evasion Attack generates adversarial data that the victim model cannot classify correctly.

from aijack.attack import Evasion_attack_sklearn

attacker = Evasion_attack_sklearn(target_model=clf, X_minus_1=attackers_dataset)
result, log = attacker.attack(initial_datapoint)
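
The snippet above assumes a fitted scikit-learn model and an attacker-side dataset. A minimal, hypothetical setup could look like the following; the names clf, attackers_dataset, and initial_datapoint simply mirror the snippet above and are placeholders rather than part of AIJack's API.

# Hypothetical setup for the evasion attack example above: fit an SVM on two
# digit classes and pick a starting point for the attacker to perturb.
import numpy as np
from sklearn import datasets
from sklearn.svm import SVC

digits = datasets.load_digits()
mask = np.isin(digits.target, [3, 7])
X, y = digits.data[mask], (digits.target[mask] == 7).astype(int)

clf = SVC(kernel="linear")
clf.fit(X, y)

attackers_dataset = X[y == 0]     # attacker-side samples (the "-1" class)
initial_datapoint = X[y == 1][0]  # the data point to perturb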

Poisoning Attack

A Poisoning Attack injects malicious data into the training dataset to manipulate the behavior of the trained model.

from aijack.attack import Poison_attack_sklearn

attacker = Poison_attack_sklearn(clf, X_train_, y_train_, t=0.5)
xc_attacked, log = attacker.attack(xc, 1, X_valid, y_valid)
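
As above, the snippet assumes a fitted scikit-learn model and a training split. A minimal, hypothetical setup could look like this; xc is the training point the attacker is allowed to modify, and the names mirror the snippet above rather than AIJack's API.

# Hypothetical setup for the poisoning attack example above.
import numpy as np
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

digits = datasets.load_digits()
mask = np.isin(digits.target, [3, 7])
X, y = digits.data[mask], (digits.target[mask] == 7).astype(int)
X_train_, X_valid, y_train_, y_valid = train_test_split(X, y, test_size=0.3)

clf = SVC(kernel="linear")
clf.fit(X_train_, y_train_)

xc = X_train_[0]  # the training point the attacker will perturb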

Contact

welcome2aijack[@]gmail.com