Our paper "Boosting Data Center Performance via Intelligently Managed Multi-backend Disaggregated Memory" has been accepted at SC'24.
This paper proposes xDM, a multi-backend disaggregated memory system that can manage multiple far memory paths with high performance.
code
: Source code
code/drivers
: Source code of RDMA and DRAM backend drivers.
code/eval
: Source code of different workloads.
code/farmemserver
: Source code of RDMA server.
code/kernel
: Source code of fastswap kernel.
code/log_process
: Source code of log process.
code/scripts
: Source code of scripts used in installation.
document
: Documents describing how to configure our system.
1) xDM RDMA server: a server with at least 64 GB of memory, an MT27800 Family [ConnectX-5] NIC (recommended), and MLNX_OFED_LINUX-5.8-4.1.5.0 installed (available at Linux InfiniBand Drivers (nvidia.com); the driver should match the Linux distribution and RDMA NIC version).
2) xDM client: a server with at least 64 GB of memory, an MT27800 Family [ConnectX-5] NIC (recommended), and MLNX_OFED_LINUX-5.8-4.1.5.0 installed (available at Linux InfiniBand Drivers (nvidia.com); the driver should match the Linux distribution and RDMA NIC version). The client also requires qemu-kvm.
Install qemu-kvm. We recommend installing virt-manager to manage the VMs.
sudo apt install qemu-system qemu-utils virt-manager libvirt-clients libvirt-daemon-system -y
Install a VM with virt-manager. We recommend Ubuntu 16.04. The remaining steps are performed inside the VMs.
2) Compile and install the data swap kernel in each VM on the client node. Only the DRAM and RDMA backends need this step.
We build xDM on the modified kernel and the drivers from clusterfarmem/fastswap. We also use some of the workloads from clusterfarmem/cfm.
Clone the repo:
cd ~
git clone https://github.com/linqinluli/Multi-backend-DM.git
First you need a copy of the source for kernel 4.11 with SHA a351e9b9fc24e982ec2f0e76379a49826036da12. We outline the high level steps here.
cd ~
wget https://github.com/torvalds/linux/archive/a351e9b9fc24e982ec2f0e76379a49826036da12.zip
mv a351e9b9fc24e982ec2f0e76379a49826036da12.zip linux-4.11.zip
unzip linux-4.11.zip
cd linux-4.11
git init .
git add .
git commit -m "first commit"
Now you can apply the provided patch against your copy of linux-4.11 and use the generic Ubuntu config file for kernel 4.11. You can get the config file from the internet, or you can use the one we provide.
git apply ~/Multi-backend-DM/code/kernel/kernel.patch
cp ~/fastswap/kernel/config-4.11.0-041100-generic ~/linux-4.11/.config
Make sure you have necessary prerequisites to compile the kernel, and compile it:
sudo apt-get install git build-essential kernel-package fakeroot libncurses5-dev libssl-dev ccache bison flex
make -j `getconf _NPROCESSORS_ONLN` deb-pkg LOCALVERSION=-fastswap
Once it's done, your deb packages should be one directory above. You can simply install them all:
cd ..
sudo dpkg -i *.deb
The fastswap kernel is now installed in the VM. If you want to use the RDMA or DRAM backend, boot the system with the modified 4.11 fastswap kernel.
Refer to the document on configuring RDMA in a KVM VM. In this step, make sure the OFED driver you install in the VM is version 4.3. If the official version 4.3 driver is not available, we provide a Google Cloud Drive download link.
DRAM backend:
Use the DRAM backend in the xDM client (in the VM):
cd ~/Multi-backend-DM/code/drivers
make BACKEND=DRAM
RDMA backend:
Use the RDMA backend in the xDM client (in the VM):
cd ~/Multi-backend-DM/code/drivers
make BACKEND=RDMA
In the xDM server:
cd ~
git clone https://github.com/linqinluli/Multi-backend-DM.git
cd ~/Multi-backend-DM/code/farmemserver
make
xDM supports three types of swap backend: SSD (or disk), DRAM, and RDMA. After following the steps above, you can configure a backend with the scripts we provide. Before configuring a backend, make sure 32G of swap space is set up.
free -g | grep -i swap
# Swap: 32 0 32
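The swap check above can also be scripted. The following is a minimal sketch, not part of the repo: the function names and the small `slack_gib` tolerance are assumptions made for illustration (a swap file may report slightly less than its nominal size). It parses text in the format of /proc/meminfo.

```python
def swap_total_gib(meminfo_text: str) -> float:
    """Return the SwapTotal value from /proc/meminfo-style text, in GiB."""
    for line in meminfo_text.splitlines():
        if line.startswith("SwapTotal:"):
            kib = int(line.split()[1])  # the kernel reports this value in kB (KiB)
            return kib / (1024 * 1024)
    raise ValueError("SwapTotal not found")


def has_enough_swap(meminfo_text: str,
                    required_gib: float = 32.0,
                    slack_gib: float = 0.01) -> bool:
    """Check for the 32G of swap that the configuration step expects.

    slack_gib is a hypothetical tolerance, not a value taken from xDM.
    """
    return swap_total_gib(meminfo_text) >= required_gib - slack_gib


# Sample input; on a real system, pass the contents of /proc/meminfo instead.
sample = "MemTotal:       65536000 kB\nSwapTotal:      33554432 kB\nSwapFree:       33554432 kB\n"
print(swap_total_gib(sample), has_enough_swap(sample))
```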
SSD backend (works with the stock Linux kernel):
cd ~/Multi-backend-DM/code/scripts/
sudo chmod +x backendswitch.sh
./backendswitch.sh ssd $path_mount_on_ssd
DRAM backend (requires the modified Linux kernel):
cd ~/Multi-backend-DM/code/scripts/
sudo chmod +x backendswitch.sh
./backendswitch.sh dram
RDMA backend (requires the modified Linux kernel):
Start the far memory server (on the xDM RDMA server):
./rmserver $port $far_memory_size $cpu_num_in_rdma_client
Configure the RDMA backend in the xDM client:
cd ~/Multi-backend-DM/code/scripts/
sudo chmod +x backendswitch.sh
./backendswitch.sh rdma $rdma_server_ip $rdma_server_port $rdma_client_ip
When using code/scripts/backendswitch.sh, we strongly suggest using the SSD backend only without the modified kernel, because combining the two may crash the system. We will solve this problem in the next version.
Both configuring a new backend and switching to another backend are done with this script. Just follow the steps in Backend configuration.
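The three invocations of backendswitch.sh listed in Backend configuration take different numbers of arguments. The sketch below is a hypothetical helper (not shipped with the repo) that assembles the argument list and rejects a wrong argument count before the script is run. The argument counts come from the commands shown in Backend configuration; the function name is an assumption.

```python
from typing import List

# Number of extra arguments each backend takes, per the commands in
# Backend configuration: ssd <path>, dram, rdma <server_ip> <port> <client_ip>.
EXPECTED_PARAMS = {"ssd": 1, "dram": 0, "rdma": 3}


def backendswitch_args(backend: str, *params: str) -> List[str]:
    """Build the argv for ./backendswitch.sh (hypothetical helper)."""
    if backend not in EXPECTED_PARAMS:
        raise ValueError(f"unknown backend: {backend!r}")
    if len(params) != EXPECTED_PARAMS[backend]:
        raise ValueError(
            f"{backend} expects {EXPECTED_PARAMS[backend]} argument(s), got {len(params)}"
        )
    return ["./backendswitch.sh", backend, *params]


print(backendswitch_args("dram"))
```

The returned list can be passed to `subprocess.run(args, cwd=...)` with `cwd` set to the code/scripts directory of your checkout.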
turn on THP
sudo sh -c "echo always > /sys/kernel/mm/transparent_hugepage/enabled"
turn off THP
sudo sh -c "echo never> /sys/kernel/mm/transparent_hugepage/enabled"
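To confirm which THP mode is active after running one of the commands above, you can read the same sysfs file back. The kernel marks the active mode with square brackets, for example `always madvise [never]`. The following sketch (the function name is an assumption) extracts the bracketed mode from that text.

```python
def active_thp_mode(enabled_text: str) -> str:
    """Return the bracketed (active) mode from the THP 'enabled' file contents."""
    for token in enabled_text.split():
        if token.startswith("[") and token.endswith("]"):
            return token[1:-1]
    raise ValueError("no active mode found")


# Sample contents of /sys/kernel/mm/transparent_hugepage/enabled
print(active_thp_mode("always madvise [never]\n"))
```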
The number of CPUs can only be configured through KVM. After changing it, shut down the VM and start it again.
# modify VM configuration
sudo virsh edit CacheExp
# query the number of CPUs
cat /proc/cpuinfo| grep "physical id"| sort| uniq| wc -l
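Note that the query above counts distinct `physical id` values, that is, CPU sockets. Whether this equals the number of vCPUs depends on the CPU topology configured for the VM. The sketch below (the function name is an assumption) reports both the number of logical CPUs and the number of sockets from /proc/cpuinfo-style text, so the two can be compared.

```python
from typing import Tuple


def cpu_counts(cpuinfo_text: str) -> Tuple[int, int]:
    """Return (logical CPUs, distinct physical ids) from /proc/cpuinfo-style text."""
    logical = 0
    sockets = set()
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # blank separator lines between processor entries
        key = key.strip()
        if key == "processor":
            logical += 1
        elif key == "physical id":
            sockets.add(value.strip())
    return logical, len(sockets)


# Sample: two vCPUs exposed as two single-core sockets.
sample_cpuinfo = "processor\t: 0\nphysical id\t: 0\n\nprocessor\t: 1\nphysical id\t: 1\n"
print(cpu_counts(sample_cpuinfo))
```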
Here is an example of how to evaluate chatglm with a local memory ratio of 0.5.
cd ~/Multi-backend-DM/code/eval
python3 benchmark chatglm 0.5
Here is an example of how to configure NUMA node assignment.
numactl --cpunodebind=0 --membind=0 ./test
numactl -C 0-1 ./test
list running VMs
sudo virsh list
shutdown VM-CacheExp
sudo virsh shutdown CacheExp
force shutdown VM-CacheExp
sudo virsh destroy CacheExp
start VM-CacheExp
sudo virsh start CacheExp
edit VM-CacheExp's configurations
sudo virsh edit CacheExp
Here are the workloads we support now.
| type | name | state | Notes |
|---|---|---|---|
| C/C++ | quicksort | √ | |
| | linpack | √ | |
| | stream | √ | |
| Spark | PageRank | √ | |
| GridGraph | PreProcess | √ | |
| | BFS | √ | |
| Ligra | BFS | √ | |
| | BC | √ | |
| | CF | √ | |
| | PageRank | √ | |
| Python | kmeans | √ | when the local memory ratio is low, the program will crash |
| tensorflow | inception | √ | |
| | resnet | √ | |
| File operation | file read/write | √ | PreProcess in GridGraph |
| PostgreSQL | TPCH | √ | Small memory usage with huge page cache |
| | TPCDS | √ | Small memory usage with huge page cache |
| | TPCC | √ | Small memory usage with huge page cache |
| | Sysbench | √ | Small memory usage with huge page cache |
| AI | chatglm | √ | |
| | chatglm-int4 | √ | |
| | clip | √ | |
| | text-classify | √ | |
| | bert-uncased | √ | |
For the configuration of some workloads, refer to CFM: quicksort, linpack, stream, pagerank, kmeans, inception, resnet
GridGraph: refer to thu-pacman/GridGraph: Out-of-core graph processing on a single machine
Ligra: refer to jshun/ligra: Ligra: A Lightweight Graph Processing Framework for Shared Memory
chatglm:
model: refer to THUDM/chatglm2-6b · Hugging Face
chatglm-int4:
model: refer to THUDM/chatglm2-6b-int4 · Hugging Face
clip:
model: refer to openai/clip-vit-large-patch14 · Hugging Face
data: refer to CIFAR-10 and CIFAR-100 datasets
text-classify:
model: refer to GitHub - gaussic/text-classification-cnn-rnn: CNN-RNN Chinese text classification, based on TensorFlow
data: refer to http://thuctc.thunlp.org/
bert-uncased:
model: refer to https://huggingface.co/bert-base-uncased
The steps for disabling the cgroup v1 memory controller differ across kernels and LSB versions. We show the steps we used in our system.
- Open /boot/grub/grub.cfg in your editor of choice
- Find the menuentry for the fastswap kernel
- Add cgroup_no_v1=memory to the end of the line beginning with linux /boot/vmlinuz-4.11.0-sswap
- Save and exit the file
- Run: sudo update-grub
- Reboot
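The manual edit in the steps above can be sketched in code. This is an illustration only, not a script from the repo: the function names are assumptions, and the kernel image path is the one quoted in the steps. It appends `cgroup_no_v1=memory` to the matching `linux` line and leaves the line unchanged if the parameter is already present.

```python
def add_kernel_param(linux_line: str, param: str = "cgroup_no_v1=memory") -> str:
    """Append a kernel parameter to a GRUB 'linux' line unless it is already there."""
    stripped = linux_line.rstrip()
    if param in stripped.split():
        return stripped
    return f"{stripped} {param}"


def patch_grub_cfg(cfg_text: str,
                   kernel_image: str = "/boot/vmlinuz-4.11.0-sswap") -> str:
    """Patch every 'linux <kernel_image> ...' line in grub.cfg-style text."""
    out = []
    for line in cfg_text.splitlines():
        tokens = line.split()
        if len(tokens) >= 2 and tokens[0] == "linux" and tokens[1] == kernel_image:
            line = add_kernel_param(line)
        out.append(line)
    return "\n".join(out) + "\n"


# Sample menuentry; the root= value is made up for the example.
sample_cfg = (
    "menuentry 'Ubuntu' {\n"
    "\tlinux\t/boot/vmlinuz-4.11.0-sswap root=/dev/sda1 ro\n"
    "\tinitrd\t/boot/initrd.img-4.11.0-sswap\n"
    "}\n"
)
print(patch_grub_cfg(sample_cfg))
```

Back up /boot/grub/grub.cfg before writing any modified text to it.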
The framework and scripts rely on the cgroup system to be mounted at /cgroup2. Perform the following actions:
- Run sudo mkdir /cgroup2 to create the root mount point
- Execute code/scripts/init_bench_cgroups.sh
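After these steps, you may want to verify that a cgroup v2 hierarchy is actually mounted at /cgroup2. The sketch below assumes that the init script mounts it there (as the requirement above implies) and checks /proc/mounts-style text for a `cgroup2` filesystem at that path. The function name is an assumption.

```python
def cgroup2_mounted_at(mounts_text: str, mount_point: str = "/cgroup2") -> bool:
    """Return True if a cgroup2 filesystem is mounted at mount_point."""
    for line in mounts_text.splitlines():
        fields = line.split()
        # /proc/mounts fields: device, mount point, filesystem type, options, ...
        if len(fields) >= 3 and fields[1] == mount_point and fields[2] == "cgroup2":
            return True
    return False


# Sample input; on a real system, pass the contents of /proc/mounts instead.
sample_mounts = "sysfs /sys sysfs rw,nosuid 0 0\ncgroup2 /cgroup2 cgroup2 rw,relatime 0 0\n"
print(cgroup2_mounted_at(sample_mounts))
```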
Here is an example of how to evaluate chatglm with a 0.5 local memory ratio.
cd ~/Multi-backend-DM/code/eval
python3 benchmark chatglm 0.5
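As a worked example of what a local memory ratio means, the sketch below assumes the ratio is the fraction of a workload's full memory footprint that is kept in local memory, with the remainder served by the swap backend. This interpretation, the function name, and the 4096 MiB footprint are assumptions for illustration; they are not values taken from the benchmark code.

```python
def local_memory_limit_bytes(footprint_mib: int, ratio: float) -> int:
    """Local memory limit in bytes for a given footprint (MiB) and ratio."""
    if not 0.0 < ratio <= 1.0:
        raise ValueError("ratio must be in (0, 1]")
    return int(footprint_mib * ratio) * 1024 * 1024


# A hypothetical 4096 MiB workload at ratio 0.5 keeps 2048 MiB local.
print(local_memory_limit_bytes(4096, 0.5))
```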
Make sure you have installed the workloads in the code/eval path. Here is a script we provide in the repo to quickly install the workloads.
chmod +x ~/Multi-backend-DM/code/scripts/install_workloads.sh
sh ~/Multi-backend-DM/code/scripts/install_workloads.sh
Then evaluate multiple workloads:
chmod +x ~/Multi-backend-DM/code/scripts/eval_workloads.sh
sh ~/Multi-backend-DM/code/scripts/eval_workloads.sh $log_file_name
Use the script in code/log_process to process the log file.
python ~/Multi-backend-DM/code/log_process/log_process.py $log_file_path