
Primary language: Python. License: MIT.

Λ-Split: A Privacy-Preserving Split Computing Framework for Cloud-Powered Generative AI

Preprint URL : https://arxiv.org/abs/2310.14651

This repository provides demonstration programs that apply Λ-Split to large language models (LLMs), including Llama 2, and to diffusion models, including Stable Diffusion XL (SDXL).

Videos

Text generation using Llama 2 with GUI

text_generation_demo.mp4

Text generation using Llama 2 with HTTP communication

text_generation_demo_with_HTTP_720p.mp4

Image generation using SDXL with GUI

image_generation_demo_720p.mp4

Usage

Python version : 3.8 or later

python3 -m pip install -r requirements.txt

Text generation using Llama 2

  1. Agree to Meta's license as stated on the model's Hugging Face page.

  2. Run the following commands:

cd text_generation
python3 main.py

Text generation using Llama 2 with HTTP communication

  1. Agree to Meta's license as stated on the model's Hugging Face page.

  2. Prepare two computers: one to act as the cloud server and one to act as the local device.

  3. Run the following commands on the corresponding computer:

Cloud

cd text_generation
python3 cloud_main.py

Local

cd text_generation
python3 edge_main.py
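
To illustrate the kind of exchange the two programs above perform, here is a minimal sketch of a local device posting intermediate hidden states to a cloud server over HTTP and receiving the processed states back. It is not the protocol used by cloud_main.py or edge_main.py: the JSON payload, the "hidden" key, the endpoint path, and the names middle_layers, CloudHandler, start_cloud_server, edge_request, and run_demo are all assumptions made for this example, and the cloud-side computation is a placeholder. Both sides run in one process on 127.0.0.1 so that the sketch is self-contained.

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer
from typing import List


def middle_layers(hidden: List[float]) -> List[float]:
    """Placeholder for the cloud-side sub-model (hypothetical computation)."""
    return [2.0 * h + 1.0 for h in hidden]


class CloudHandler(BaseHTTPRequestHandler):
    """Hypothetical cloud endpoint: receives hidden states, returns processed ones."""

    def do_POST(self) -> None:
        length = int(self.headers.get("Content-Length", "0"))
        payload = json.loads(self.rfile.read(length))
        result = {"hidden": middle_layers(payload["hidden"])}
        body = json.dumps(result).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format: str, *args) -> None:
        pass  # silence per-request logging in this demo


def start_cloud_server() -> HTTPServer:
    """Start the toy cloud server on a free local port in a background thread."""
    server = HTTPServer(("127.0.0.1", 0), CloudHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def edge_request(url: str, hidden: List[float]) -> List[float]:
    """Send hidden states from the local device and return the cloud's reply."""
    data = json.dumps({"hidden": hidden}).encode("utf-8")
    request = urllib.request.Request(
        url, data=data, headers={"Content-Type": "application/json"}
    )
    # Bypass any system proxy settings, since this demo only talks to localhost.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    with opener.open(request, timeout=5) as response:
        return json.loads(response.read())["hidden"]


def run_demo() -> List[float]:
    """Round trip: local device -> cloud -> local device."""
    server = start_cloud_server()
    try:
        host, port = server.server_address
        return edge_request(f"http://{host}:{port}/", [0.5, -1.0])
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(run_demo())
```

Only the intermediate hidden states cross the network in this sketch; the raw prompt and the generated output never leave the local side. A real deployment would likely use a more compact tensor serialization than JSON.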

Image generation using SDXL

cd image_generation
python3 main.py

Directory tree

lambda_split/
│
├─ text_generation/
│  ├─ main.py
│  ├─ cloud_main.py : For HTTP communication
│  ├─ edge_main.py : For HTTP communication
│  └─ src/
│     ├─ base.py
│     ├─ cloud.py
│     ├─ edge.py
│     ├─ split_models.py : Definition of split sub-models.
│     └─ utils.py
│
├─ image_generation/
│  ├─ main.py
│  ├─ evaluation.py
│  └─ src/
│     ├─ quantizers.py : For quantization
│     ├─ split_pipelines.py : Definition of split sub-models.
│     └─ utils.py
│
└─ requirements.txt

Overview of split implementation

  1. Override the forward method of each sub-model so that only the layers assigned to that sub-model run at inference time (implemented by commenting out the unused parts of the forward method of FirstLlamaModel etc. in src/split_models.py)
  2. Replace the unused layers with identity layers to reduce memory usage (implemented by the replace_unused_layers_with_identity method of FirstLlamaModel etc. in src/split_models.py)
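
The two steps above can be sketched with a dependency-free toy model. This is an illustration under stated assumptions, not the repository's code: the layers are plain Python functions instead of transformer blocks, the three-way split (local, cloud, local) and the layer ranges are chosen for the example, and SplitSubModel, build_layers, build_split_models, run_full, and run_split are hypothetical names. Only the method name replace_unused_layers_with_identity is taken from the description above.

```python
from typing import Callable, List, Tuple

Layer = Callable[[float], float]


def identity(x: float) -> float:
    """Cheap placeholder for a layer that a sub-model never executes."""
    return x


def build_layers() -> List[Layer]:
    """Toy stand-in for a stack of model layers (hypothetical)."""
    return [
        lambda x: x + 1.0,
        lambda x: x * 2.0,
        lambda x: x - 3.0,
        lambda x: x * x,
    ]


class SplitSubModel:
    """Holds the full layer list but runs only the layers in [start, end)."""

    def __init__(self, layers: List[Layer], start: int, end: int) -> None:
        self.layers = list(layers)
        self.start = start
        self.end = end

    def replace_unused_layers_with_identity(self) -> None:
        # Step 2: drop references to layers outside this sub-model's range.
        for i in range(len(self.layers)):
            if not (self.start <= i < self.end):
                self.layers[i] = identity

    def forward(self, hidden: float) -> float:
        # Step 1: the overridden forward pass runs only the assigned layers.
        for layer in self.layers[self.start:self.end]:
            hidden = layer(hidden)
        return hidden


def build_split_models() -> Tuple[SplitSubModel, SplitSubModel, SplitSubModel]:
    """Assumed split: first and last parts local, middle part in the cloud."""
    first = SplitSubModel(build_layers(), 0, 1)
    middle = SplitSubModel(build_layers(), 1, 3)
    last = SplitSubModel(build_layers(), 3, 4)
    for sub_model in (first, middle, last):
        sub_model.replace_unused_layers_with_identity()
    return first, middle, last


def run_full(x: float) -> float:
    """Reference: run every layer in a single unsplit model."""
    for layer in build_layers():
        x = layer(x)
    return x


def run_split(x: float) -> float:
    """Chain the three sub-models, passing the hidden value between them."""
    first, middle, last = build_split_models()
    return last.forward(middle.forward(first.forward(x)))


if __name__ == "__main__":
    print(run_full(5.0), run_split(5.0))
```

Chaining the sub-models gives the same result as the unsplit model, while each sub-model keeps only its own layers. With real network layers, the identity replacement is what frees the memory held by the unused weights.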