Shoal is a new network stack and fabric for disaggregated racks within datacenters. It achieves very high performance while consuming a modest fraction of the rack's power budget, enabling dense, high-performance rack-scale disaggregated clusters. The key insight behind Shoal's design is to build the rack's internal network out of fast circuit switches that can reconfigure within nanoseconds. For more details, please refer to our NSDI'19 paper.
This repository is the same as resource-disaggregation/Shoal
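To get started, clone the repository (assuming it is hosted on GitHub under that name; adjust the URL if your copy lives elsewhere):
$ git clone https://github.com/resource-disaggregation/Shoal.git
$ cd Shoal/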
$ cd simulator/
Format of tracefile
Each line should contain the following 5 fields, comma-separated:
flow-id, src-id, dst-id, flow-size (in bytes), flow-start-time (in seconds)
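For example, a raw tracefile with two flows might look like this (the values are purely illustrative):
0, 1, 5, 1048576, 0.000000
1, 3, 7, 65536, 0.000125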
Processing tracefile
If the tracefile is in raw format (i.e., sizes in bytes and times in seconds), first run the preprocessor.
$ python scripts/tracefile_preprocessor.py -f <path/to/tracefile>
The processed tracefile will be stored in the same directory as the original tracefile with a .processed extension
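For example, for a raw trace stored at workloads/example_trace.csv (a hypothetical path):
$ python scripts/tracefile_preprocessor.py -f workloads/example_trace.csv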
Compiling the simulator code
$ ./compile.sh
Running the simulator code
$ ./run.sh -f <filename> [-e <epochs> -w <0/1> -b <link bandwidth> -t <time slot> -c <cell size> -h <header size> -s <short flow size> -l <long flow size> -n <num of flows> -d <percentage failed nodes> -i <interval>][-a -r -p]
-e: run the experiment for the specified number of epochs
-w: 1 = static workload; 0 = dynamic workload
-b: link bandwidth in Gbps (float)
-t: length of a time slot in ns (float)
-c: packet(cell) size in Bytes
-h: cell header size in Bytes
-s: short flow size in KB
-l: long flow size in KB
-n: stop the experiment after this many flows have finished
-d: percentage failed nodes
-i: interval
-r: to run shoal
-p: to plot graphs
-a: to do both run and plot at once
All the results will be stored in the directory experiments/
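As a concrete example, a dynamic-workload run that both simulates and plots (all flag values below are illustrative, not taken from the paper) might look like:
$ ./run.sh -f path/to/tracefile.processed -w 0 -b 100 -t 50 -c 64 -h 8 -n 10000 -a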
Reproducing Shoal results from NSDI'19 paper
The workloads and scripts to reproduce the results in Figures 15 and 18 of our NSDI'19 paper are provided in workloads.zip. To run the experiments:
$ unzip workloads.zip
For Figure 15,
$ ./workloads/dc_workload/tracefile_preprocessor_batch.sh
$ ./workloads/dc_workload/run_all.sh
Results are stored in experiments/workloads/dc_workload/
For Figure 18,
$ ./workloads/disaggregated_workload/tracefile_preprocessor_batch.sh
$ ./workloads/disaggregated_workload/run_all.sh
Results are stored in experiments/workloads/disaggregated_workload/
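To run both sets of experiments back to back, the same commands can be chained:
$ for w in dc_workload disaggregated_workload; do ./workloads/$w/tracefile_preprocessor_batch.sh && ./workloads/$w/run_all.sh; done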
FPGA implementation of Shoal in Bluespec
Dependencies (clone commands are sketched after this list):
- connectal
- fpgamake
- buildcache [optional]
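A minimal fetch sketch, assuming the cambridgehackers GitHub repositories are the intended sources (URLs and checkout locations are assumptions; adjust to your environment):
$ git clone https://github.com/cambridgehackers/connectal.git
$ git clone https://github.com/cambridgehackers/fpgamake.git
$ git clone https://github.com/cambridgehackers/buildcache.git    # optional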
$ cd prototype/
project_dir is one of: circuit-switch, shoal-NIC
$ cd project_dir/
$ cd bsv/
$ make build.vsim [OR] USE_BUILDCACHE=1 make build.vsim
$ cd vsim/
$ make run <args>
$ cd project_dir/<location of Makefile>
$ make build.de5
$ jtagconfig -- to get the SERIALNO of the board connected to the machine
$ SERIALNO=* make run.de5
[OR]
$ cd de5/
$ quartus_pgm -c SERIALNO -m jtag -o p\;./bin/mkPcieTop.sof
#### On the remote machine to which the board is attached,
- Restart the machine whose board was programmed in the last step
- Copy de5/bin/ubuntu.exe to the machine
- Make sure connectal/ is present
$ cd connectal/drivers/pcieportal/
$ make
$ sudo insmod pcieportal.ko
- Verify the output of ls /dev/portal*
/dev/portal_b0t0p1 /dev/portal_b0t0p3 /dev/portal_b0t1p5 /dev/portal_b1t0p1 /dev/portal_b1t0p3 /dev/portal_b1t1p5
/dev/portal_b0t0p2 /dev/portal_b0t0p4 /dev/portal_b0t1p6 /dev/portal_b1t0p2 /dev/portal_b1t0p4 /dev/portal_b1t1p6
If there are two boards, b0 corresponds to board 1 and b1 to board 2.
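If no portal devices appear, a couple of standard Linux checks (not part of the original instructions) can help confirm that the driver loaded:
$ lsmod | grep pcieportal
$ dmesg | tail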
$ FPGA_NUMBER=* ./ubuntu.exe <args> [FPGA_NUMBER values start at 0]
$ quartus de5/mkPcieTop.qpf (then go to Tools -> Timing)