
Code and reference repository for Intel SGX EPC memory deduplication. (The project did not succeed; I keep it on my homepage as a log of the work.)


EPCDedup

Code & reference repository for EPC memory deduplication

Update Logs

Working Plan

  1. Implement baseline system (Only EPC page deduplication, without any optimizations).
  2. Design experiments, verify problems and insights and find other insights to motivate design.
  3. Improve the design and realize EPCDedup.
  4. Develop several applications based on EPCDedup and verify the effect.

Paper Structure

  • Problem Statement: Practical applications show that frequently swapping EPC pages in and out causes a significant drop in SGX performance.
    • Experimental design: TODO
  • Analysis:
    • Insight I: EPC page deduplication ratio 10%~30% -> opportunities of applying page-content-based deduplication in EPC.
      • Experimental design: TODO
    • Insight II: Classification of EPC pages and dynamic characteristics of different types of pages.
      • Experimental design: TODO
  • Design: based on insights, design EPCDedup
    • Overview: scanning -> deduplication -> address copy-on-write
    • Scanning: For different types of EPC pages, use different scanning cycles.
      • Scanning at enclave creation: TCS, SECS, Code, Data pages.
      • Long-term scanning: Heaps.
      • Short-term scanning: Stacks.
      • Non-scanning/deduplication: SSA, VA
    • Deduplication:
      • Create deduplication info page to record page index and duplicate pages' metadata.
      • Dynamically adjust the scanning intervals based on past read/write patterns and the deduplication ratio.
      • Deal with copy-on-write.
  • Applications: Build secure and efficient applications on top of EPCDedup
    • SQLite3: PUDF dataset.
    • Memcached: Ubuntu IRC logs and Enron email datasets.

Memory pages analysis

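| Application | Total page number | Unique page number | Deduplication ratio |
|-------------|-------------------|--------------------|---------------------|
| frpc        | 2475              | 2440               | 1.41%               |
| Docker host | 3777654           | 3568768            | 5.53%               |
| snap        | 84642             | 18661              | 77.95%              |
| chrome      | 274745            | 199111             | 27.53%              |
| mongodb     | 3606262           | 3489657            | 3.23%               |
| vscode      | 633605            | 493202             | 22.16%              |
| qv2ray      | 79422             | 33257              | 58.13%              |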

New problems & Solutions:

  • All enclave pages are encrypted with AES-GCM, so they need to be deduplicated in the CPU cache.
    • Deduplication inside CPU cache (compute plaintext page fingerprint for each page for deduplication).
    • Modify AES-GCM encrypt counter:
      • Use enclave ID as the counter to support deduplication inside each enclave.
      • Use page hash as the counter to support deduplication between pages with the same content.
  • Pages must keep their permission management after deduplication (pages with the same content but different permissions may appear).
    • Modify sgx_pageinfo data structure in sgx_arch.h, use some additional flags to record different permissions.
  • Capture content of each page in all enclaves?
    • Capture the page content when pages are created, modified, or deleted.
    • In function sgx_ioc_enclave_add_page, the same enclave always creates the same page (not in line with expectations)
  • Link Linux driver with OpenSSL?
  • Installing the modified driver may prevent the system from booting, because some memory addresses must not be written directly from the kernel.
    • Disable the SGX device in the BIOS, then delete the driver.
    • SGX driver version 2.11 may not work correctly on Ubuntu 18.04 LTS; switching to version 2.6.
    • After any kernel update (e.g., from 5.4.0-53 to 5.4.0-54), the driver needs to be recompiled and reinstalled.
  • Low overhead page scan (slides/EPCDedup-12-19).
  • Build paper
    • Expose problem: application to motivate problem, analysis to introduce page types (access, deduplication ratio)
    • Applications: memcached, sqlite3
  • [2021.2.5] Page permission issue: direct access leads to errors. Do we need to change the EPCM, or design a dedicated enclave to manage all pages?
  • [2021.2.19] The virtual address returned by the sgx_get_page function leads to '0xFFFFFF' pages, which do not hold the real content of each page.
  • [2021.2.23] Found the vaddr and data->addp pair: the content is correct when copied from user space, but turns into 0xfffff when extracted with sgx_get_page. This approach may not work.

Base Linux SGX Driver & SDK

Driver Version (Changing to version 2.6) :

  • Release : 2.11
  • commit ID : 75bf89f7d6dd4598b9f8148bd6374a407f37105c

SDK Version :

  • Release : 2.7
  • commit ID : 504b28053f9526b43eedc9a75d858d2c750c3702

Reference :

Build

Prerequisites

  • Download the Linux kernel headers: sudo apt-get install linux-headers-$(uname -r).

Make and Install

  • Use makefile to make the kernel module isgx.
  • Install with :
$ sudo mkdir -p "/lib/modules/"`uname -r`"/kernel/drivers/intel/sgx"    
$ sudo cp isgx.ko "/lib/modules/"`uname -r`"/kernel/drivers/intel/sgx"    
$ sudo sh -c "cat /etc/modules | grep -Fxq isgx || echo isgx >> /etc/modules"    
$ sudo /sbin/depmod
$ sudo /sbin/modprobe isgx
  • Uninstall with:
$ sudo /sbin/modprobe -r isgx
$ sudo rm -rf "/lib/modules/"`uname -r`"/kernel/drivers/intel/sgx"
$ sudo /sbin/depmod
$ sudo /bin/sed -i '/^isgx$/d' /etc/modules
  • Note :
    • The machine must be restarted after each reinstallation before the isgx device can be used normally.
    • [2020.11.12] Trying to build isgx with dkms to avoid rebooting the system for testing.
    • Kernel log output from the driver needs to be rate-limited.
    • The kernel thread needs to sleep when idle; otherwise it can hang the kernel.

Signing SGX Driver (Run with secure boot enabled)

  • Self-sign isgx with sudo /usr/src/linux-headers-$(uname -r)/scripts/sign-file sha256 ./MOK.priv ./MOK.der $(modinfo -n isgx).
  • After signing is done and the MOK has been enrolled successfully, redo the installation.

Analysis

How does the SGX driver work?

When a user process performs an operation such as read/write on the device file, the system call finds the corresponding device driver through the major device number of the device file, reads the matching function pointer from the driver's file_operations structure, and transfers control to that function. This is the basic principle behind Linux device drivers.

static const struct file_operations sgx_fops = {
    .owner			= THIS_MODULE,
    .unlocked_ioctl		= sgx_ioctl, // the ioctl function pointer, to do the device I/O control command
#ifdef CONFIG_COMPAT
    .compat_ioctl		= sgx_compat_ioctl,
#endif
    .mmap			= sgx_mmap, // request the device memory to be mapped to the process address space
    .get_unmapped_area	= sgx_get_unmapped_area,
};
Enclave create workflow:

Workflow for creating a new enclave

Workflow for initializing the new enclave

ioctl functions:
long sgx_ioctl(struct file *filep, unsigned int cmd, unsigned long arg)
{
    char data[256];
    sgx_ioc_t handler = NULL;
    long ret;

    switch (cmd) {
    case SGX_IOC_ENCLAVE_CREATE:
        handler = sgx_ioc_enclave_create;
        break;
    case SGX_IOC_ENCLAVE_ADD_PAGE:
        handler = sgx_ioc_enclave_add_page;
        break;
    case SGX_IOC_ENCLAVE_INIT:
        handler = sgx_ioc_enclave_init;
        break;
    case SGX_IOC_ENCLAVE_EMODPR:
        handler = sgx_ioc_page_modpr;
        break;
    case SGX_IOC_ENCLAVE_MKTCS:
        handler = sgx_ioc_page_to_tcs;
        break;
    case SGX_IOC_ENCLAVE_TRIM:
        handler = sgx_ioc_trim_page;
        break;
    case SGX_IOC_ENCLAVE_NOTIFY_ACCEPT:
        handler = sgx_ioc_page_notify_accept;
        break;
    case SGX_IOC_ENCLAVE_PAGE_REMOVE:
        handler = sgx_ioc_page_remove;
        break;
    default:
        return -ENOIOCTLCMD;
    }

    if (copy_from_user(data, (void __user *)arg, _IOC_SIZE(cmd)))
        return -EFAULT;

    ret = handler(filep, cmd, (unsigned long)((void *)data));
    if (!ret && (cmd & IOC_OUT)) {
        if (copy_to_user((void __user *)arg, data, _IOC_SIZE(cmd)))
            return -EFAULT;
    }

    return ret;
}

Pages

  • Page type:
enum sgx_page_type {
  SGX_PAGE_TYPE_SECS  = 0x00, // Meta data for each enclave
  SGX_PAGE_TYPE_TCS = 0x01, // Meta data for each thread
  SGX_PAGE_TYPE_REG = 0x02, // The general memory allocated by the system
  SGX_PAGE_TYPE_VA  = 0x03, // Version Array of evicted pages
  SGX_PAGE_TYPE_TRIM  = 0x04, // remove a page from the enclave and reclaim the linear address for future use
};
  • EPC layout

EPC layout

Data structures:

  • SGX Enclave Control Structure (SECS):
    • Represents one enclave.
    • Contains, for instance, Hash, ID, size etc.
  • Thread Control Structure (TCS):
    • Each executing thread in the enclave is associated with a Thread Control Structure.
    • Contains, for instance, Entry point, pointer to SSA.
  • State Save Area (SSA):
    • When an AEX occurs while running in an enclave, the architectural state is saved in the thread’s SSA
  • Page Information (PAGEINFO):
    • PAGEINFO is an architectural data structure that is used as a parameter to the EPC-management instructions
      • Linear Address
      • Effective address of the page (aka virtual address)
      • SECINFO
      • SECS
  • Security Information (SECINFO):
    • The SECINFO data structure holds meta-data about an enclave page
      • Read/Write/Execute
      • Page type (SECS, TCS, normal page or VA)
  • Paging Crypto MetaData (PCMD):
    • The PCMD structure is used to keep track of crypto meta-data associated with a paged-out page. Combined with PAGEINFO, it provides enough information for the processor to verify, decrypt, and reload a paged-out EPC page.
    • EWB writes out (the reserved field and) MAC values.
    • ELDB/U reads the fields and checks the MAC.
    • Contains Enclave ID, SECINFO and MAC
  • Version Array (VA):
    • In order to securely store the versions of evicted EPC pages, SGX defines a special EPC page type called a Version Array (VA).
      • Each VA page contains 512 slots, each of which can contain an 8-byte version number for a page evicted from the EPC.
      • When an EPC page is evicted, software chooses an empty slot in a VA page; this slot receives the unique version number of the page being evicted: When the EPC page is reloaded, a VA slot must hold the version of the page. If the page is successfully reloaded, the version in the VA slot is cleared.
      • VA pages can be evicted, just like any other EPC page: When evicting a VA page, a version slot in some other VA page must be used to receive the version for the VA being evicted.

Pages management

Evicting Enclave Pages

Intel SGX paging allows the Operating System (OS) to evict multiple pages out of the EPC under a single synchronization. Flow for evicting a list of pages from the EPC is:

  1. For each page to be evicted from the EPC:
    • Select an empty slot in a Version Array (VA) page. (If no empty VA page slots exist, create a new VA page using the EPA leaf function.)
    • Remove linear-address to physical-address mapping from the enclave context's mapping tables (page table and EPT tables).
    • Execute the EBLOCK leaf function for the target page. This sets the target page state to BLOCKED. At this point, no new mappings of the page will be created, so any access which does not have the mapping cached in the TLB will generate a #PF.
  2. For each enclave containing pages selected in step 1:
    • Execute an ETRACK leaf function pointing to that enclave's SECS. This initiates the tracking process that ensures that all caching of linear-address to physical-address translations for the blocked pages is cleared.
  3. For all logical processors executing in processes (OS) or guests (VMM) that contain the enclaves selected in step 1:
    • Issue an IPI (inter-processor interrupt) to those threads. This causes those logical processors to asynchronously exit any enclaves they might be in, and as a result, flush cached linear-address to physical-address translations that might hold stale translations to blocked pages. There is no need for additional measures, such as performing a "TLB shootdown."
  4. After enclaves exit, allow logical processors to resume normal operation, including enclave re-entry as the tracking logic keeps track of the activity.
  5. For each page to be evicted:
    • Evict the page using the EWB leaf function with parameters including the effective-address pointer to the EPC page, the VA slot, a 4K byte buffer to hold the encrypted page contents, and a 128-byte buffer to hold page metadata. The last three elements are tied together cryptographically and must be used to later reload the page.
  6. At this point, system software has the only copy of each page data encrypted with its page metadata in the main memory.
Loading Enclave Page

To reload a previously evicted page, the system software needs four elements: the VA slot used when the page was evicted, a buffer containing the encrypted page contents, a buffer containing the page metadata, and the parent SECS to associate with this page. If the VA page or the parent SECS are not already in the EPC, they must be reloaded first.

  1. Execute ELDB/ELDU (depending on the desired BLOCKED state for the page), passing as parameters: the EPC page linear address, the VA slot, the encrypted page, and the page metadata.
  2. Create a mapping in the enclave context's mapping tables (page tables and EPT tables) to allow the application to access that page (OS: system page table; VMM: EPT).
  3. The ELDB/ELDU instruction marks the VA slot empty so that the page cannot be replayed at a later date.
Page encryption / decryption
  • The page encryption counter and key could be changed at the assembly level, but the EWB functions use a dedicated register to store the memory encryption key.
  • Data structure that stores the per-enclave information:
struct sgx_encl {
    unsigned int flags;
    uint64_t attributes;
    uint64_t xfrm;
    unsigned int secs_child_cnt;
    struct mutex lock;
    struct mm_struct *mm;
    struct file *backing;
    struct file *pcmd;
    struct list_head load_list;
    struct kref refcount; // kernel reference count of this enclave object (not the AES-GCM counter)
    unsigned long base;
    unsigned long size;
    unsigned long ssaframesize;
    struct list_head va_pages;
    struct radix_tree_root page_tree;
    struct list_head add_page_reqs;
    struct work_struct add_page_work;
    struct sgx_encl_page secs;
    struct sgx_tgid_ctx *tgid_ctx;
    struct list_head encl_list;
    struct mmu_notifier mmu_notifier;
    unsigned int shadow_epoch;
};
  • Decryption:
(* Decrypt and MAC page. AES_GCM_DEC has 2 outputs, {plain text, MAC} *)
(* Parameters for AES_GCM_DEC {Key, Counter, ..} *)
{DS:RCX, TMP_MAC} := AES_GCM_DEC(CR_BASE_PK, TMP_VER << 32, TMP_HEADER, 128, DS:RCX, 4096);
  • Encryption:
(* Encrypt the page, DS:RCX could be encrypted in place. AES-GCM produces 2 values, {ciphertext, MAC}. *)
(* AES-GCM input parameters: key, GCM Counter, MAC_HDR, MAC_HDR_SIZE, SRC, SRC_SIZE)*)
{DS:TMP_SRCPGE, DS:TMP_PCMD.MAC} := AES_GCM_ENC(CR_BASE_PK, (TMP_VER << 32), TMP_HEADER, 128, DS:RCX, 4096);
Function call relationship for adding a new page
  • Start with a new enclave:
    • Use sgx_ioc_enclave_init() in sgx_ioctl.c to create a new enclave with pre-allocated pages.
    • Use the alloc_page() function to allocate one page from the system memory manager.
    • Use the kmap() function to map the physical page into a free virtual address range (kmap uses 4MB of address space in total, i.e. 1024 pages).
    • Use sgx_get_encl() to get the sgx_encl handle with the enclave information needed for encrypting the new page.
    • Use sgx_encl_init() to update the encl handle for the current enclave.
  • Add a new page to an existing enclave:
    • Use sgx_ioc_enclave_add_page() in sgx_ioctl.c to create a new enclave page and enqueue an EADD operation that will be processed by a worker thread later.
    • Use sgx_get_encl() to get the enclave information.
    • Use alloc_page() and kmap() to generate a new page.
    • Use sgx_encl_add_page() to finalize the newly added page by calling __sgx_encl_add_page().
  • Main encls function:
#define __encls_ret(rax, rbx, rcx, rdx)			\
    ({						\
    int ret;					\
    asm volatile(					\
    "1: .byte 0x0f, 0x01, 0xcf;\n\t"		\
    "2:\n"						\
    ".section .fixup,\"ax\"\n"			\
    "3: mov $-14,"XAX"\n"				\
    "   jmp 2b\n"					\
    ".previous\n"					\
    _ASM_EXTABLE(1b, 3b)				\
    : "=a"(ret)					\
    : "a"(rax), "b"(rbx), "c"(rcx), "d"(rdx)	\
    : "memory");					\
    ret;						\
    })

Datasets

  • Muhammad Naveed, Seny Kamara, and Charles V. Wright. 2015. Inference Attacks on Property-preserving Encrypted Databases. In Proceedings of the 22nd ACM Conference on Computer and Communications Security (CCS’15). 644-655.
  • Paul Grubbs, Kevin Sekniqi, Vincent Bindschaedler, Muhammad Naveed, and Thomas Ristenpart. 2017. Leakage-Abuse Attacks against Order-Revealing Encryption. In Proceedings of the IEEE Symposium on Security and Privacy (SP’17). 655-672.
  • David C. Uthus and David W. Aha. 2013. The Ubuntu Chat Corpus for Multiparticipant Chat Analysis. In Proceedings of the AAAI Spring Symposium. 99-102.
  • Bryan Klimt and Yiming Yang. 2004. The Enron Corpus: A New Dataset for Email Classification Research. In Proceedings of the European Conference on Machine Learning. 217-226.

Sqlite based test

  • Trace: Hospital Discharge Data Public Use Data File (PUDF_base1_1q2014, raw content size 661 MB)
  • Method:
    • Change trace to CSV format: Change TXT to CSV
    • Dump memory with only sqlite3 CLI running (baseline)
    • Dump memory for sqlite3 CLI after insert trace content (loaded-charges1q2014)
  • Steps:
    • Take the tab-delimited trace as input and convert it to CSV format.
    • Use sqlite3 to store the trace by importing the CSV file (sqlite3 xxx.db; .mode csv; .import xx.csv tab_xx).
    • Use the dumpMemory.sh script to collect the page information.
  • Result:
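| Test Type            | Total page number | Unique page number | Deduplication ratio |
|----------------------|-------------------|--------------------|---------------------|
| baseline             | 272               | 208                | 23.53%              |
| loaded-charges1q2014 | 6337              | 4268               | 32.65%              |
| loaded-charges2q2014 | 6352              | 4282               | 32.59%              |
| loaded-charges3q2014 | 6317              | 4250               | 32.72%              |
| loaded-charges4q2014 | 6247              | 4180               | 33.09%              |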

Memcached based test

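| Test Type           | Total page number | Unique page number | Deduplication ratio |
|---------------------|-------------------|--------------------|---------------------|
| baseline-128MB      | 21140             | 205                | 99.03%              |
| baseline-2GB        | 21131             | 186                | 99.12%              |
| enron_mail_20150507 | 356976            | 338176             | 5.03%               |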

Related Tools & Method

  • Memory capture: memoryCaptureTools/dumpMemory.sh, which reads the memory of a given application.
  • Memory access pattern analysis: valgrind's memcheck

Reference

Notes for kernel programming

Use kernel modules

# load kernel module
sudo insmod xxx.ko
# Verify output
dmesg
# unload kernel module
sudo rmmod xxx.ko