Project Guide

This project is a lightweight prototype of Swallow.
The design details and principles of Swallow are described in "Swallow: Joint Online Scheduling and Coflow Compression in Datacenter Networks", in Proc. IPDPS, Vancouver, May 2018.
You can also find our implementation in Spark-2.2.0 at: Modified-Spark. Because the full source tree is large, we include only the modified code, mainly in the spark-core and network-common modules.
The root directory Swallow-master contains four subprojects: swallow, swallow-sim, swallow-benchmark and swallow-trace.

swallow

An efficient flow scheduling system for data-intensive clusters, based on Akka.
Development Language: Scala.
Full project: swallow.

swallow-sim

Trace-driven simulator for flow scheduling used in swallow.
Development Language: Scala.
Full project: swallow-sim.

swallow-benchmark

Modified HiBench for swallow evaluation. It is also suitable for evaluating other big data frameworks (e.g., Spark and Hadoop).
Full project: swallow-benchmark.

swallow-trace

Synthesized data from real-world traces of data-intensive clusters.
Full project: swallow-trace.

Prerequisites

JDK 8.
sbt 0.13.13 or higher (download here).

How to Download

Download the zip archive from https://github.com/kimihe/Swallow, or clone the repository with git clone git@github.com:kimihe/Swallow.git.
If you downloaded the zip file, extract it to a convenient location:

On Linux and OSX systems, open a terminal and use the command unzip Swallow-master.zip.
On Windows, use a tool such as File Explorer to extract the project.

How to Compile and Run

Take Swallow-master/swallow as an example:

In a console, change to the subproject directory inside the unzipped project. For example, if you kept the default project name, Swallow-master, and extracted the project to your root directory, enter the following from the root directory: cd Swallow-master/swallow.
In that directory, enter sbt compile to compile the source code, then enter sbt run to execute the program. sbt downloads the project dependencies and builds the project. The output of sbt run looks like this:

Multiple main classes detected, select one to run:

 [1] examples.ExampleClusterApp
 [2] examples.ExampleMaster
 [3] examples.ExampleReceiver
 [4] examples.ExampleSender
 [5] examples.KMActorAggregation
 [6] examples.KMClusterApp
 [7] examples.KMLocalActor
 [8] examples.KMMasterActor
 [9] examples.KMRemoteActor

Enter number:

For example, enter 1 to run [1] examples.ExampleClusterApp. Then start three new terminals to run [2] examples.ExampleMaster, [4] examples.ExampleSender and [3] examples.ExampleReceiver.
These four processes simulate a minimal distributed scheduling system. In this example, all of them run on the same machine; you can modify application.conf in the resources directory to configure the IP address and communication port of each process. Swallow can also be deployed on different machines and run as a real distributed scheduling system.
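
As a rough illustration of the kind of setting involved, the fragment below shows how an address and port are typically declared for classic Akka remoting. It is only a sketch: it assumes the netty.tcp transport of classic Akka remoting, and the address and port values are placeholders. The key layout in this project's own application.conf may differ, so check the file in the resources directory before copying anything.

```hocon
# Sketch only: assumes classic Akka remoting (netty.tcp transport).
# The values below are placeholders, not this project's actual settings.
akka {
  actor {
    # Enables remote actor references in classic Akka remoting.
    provider = "akka.remote.RemoteActorRefProvider"
  }
  remote {
    netty.tcp {
      # IP address this process binds to; use the machine's own address
      # when deploying across several hosts.
      hostname = "127.0.0.1"
      # Communication port; each process on the same machine needs its own.
      port = 2552
    }
  }
}
```

When all four processes run on one machine, the hostname can stay the same and only the port needs to differ between them.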

Working with IntelliJ IDEA

You can also work on this project in IntelliJ IDEA: open the subproject directory Swallow-master/swallow as a project.

Relevant Publications

"Swallow: Joint Online Scheduling and Coflow Compression in Datacenter Networks", in Proc. IPDPS, Vancouver, May 2018.

This project is still in development, and contributions that help make it better are welcome. :)
