wormsimulator: A Python repository from charleswli

========================================================================
Internet Worm Propagation Simulation
Brandon Haynes & Charles Li (2011)
wormsimulator.info
========================================================================

This file documents the use of the simulation software, including network 
creation, propagation, and visualization.

------------------------------------------------------------------------
Source Code
------------------------------------------------------------------------

Source code is available in the project repository, located at 
http://code.google.com/p/wormsimulator.

------------------------------------------------------------------------
Quick Start
------------------------------------------------------------------------

Create a new Network256 network with 10 initially-infected hosts and a 
hit-list size of 1:

> python Create.py network.0 Network256 10 1

Propagate the worm forward in time:

> python Propagate.py --network Network256 --emit-volatile 1 network.0 > network.1
> python Propagate.py --network Network256 --emit-volatile 1 network.1 > network.2
> python Propagate.py --network Network256 --emit-volatile 1 network.2 > network.3

Visualize the results:

> python Visualize.py Network256 network.3

------------------------------------------------------------------------
General Usage
------------------------------------------------------------------------

--- Creating a Network ---

The script Create.py is used to create an initial network, identify a set 
of vulnerable nodes, and mark one (or more) of those nodes as initially 
infected.  This script is invoked as:

> python Create.py output_filename network_class [number infected] 
                   [hit-list size]

Here network_class is one of the set:
    Network256, 
    NetworkGraphable, 
    IPv4, or
    IPv6 

The number infected parameter specifies the number of initially-infected 
hosts (a network with zero infected nodes is not particularly 
interesting); by default one node is marked as infected.  Similarly, 
the hit-list size parameter pre-loads each infected node with one or 
more known-vulnerable hosts to speed propagation.

By way of example, the following command creates a new IPv4 network 
with a single initially-infected machine and saves the result to a 
file named my-network:

> python Create.py my-network IPv4

--- Infection Propagation ---

Two propagation methods are supported: a direct, brute-force method, 
and an alternative that uses a variant of the Schimmy pattern.  The 
differences between these two methods are discussed in more detail 
under Program Design.

- Direct Propagation -

The script Propagate.py accepts a previously-created network and 
simulates infection propagation by one or more iterations.  This script 
is invoked as:

> python Propagate.py --network network_type [--iterations #iterations] 
                      [--propagation-delay delay] [--emit-volatile flag] 
                      [additional flags] input-network

Here network_type is one of the set { Network256, NetworkGraphable, 
IPv4, IPv6 }.  These classes are discussed in more detail under 
Program Design.

Multiple iterations may be chained together during one execution via 
the iterations flag; by default the network is moved forward by one 
cycle.  Note that intermediate network states are not emitted when 
multiple iterations are specified; accordingly, analyses that need 
every intermediate state should instead loop over single iterations 
via a local script.
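
A local driver of this kind might look like the following minimal 
sketch.  It assumes the network files are named prefix.0, prefix.1, and 
so on (as in the Quick Start), and uses only the flags documented 
above; the helper names build_propagate_command and run_iterations are 
hypothetical and are not part of the project.

```python
import subprocess
import sys


def build_propagate_command(network_type, input_path, emit_volatile=None):
    """Build the argument list for one single-iteration Propagate.py run."""
    command = [sys.executable, "Propagate.py", "--network", network_type]
    if emit_volatile is not None:
        command += ["--emit-volatile", str(emit_volatile)]
    command.append(input_path)
    return command


def run_iterations(network_type, prefix, steps, emit_volatile=None,
                   runner=subprocess.run):
    """Advance prefix.0 to prefix.<steps>, keeping every intermediate file.

    Propagate.py writes its result to standard output, so each run is
    redirected into the next numbered file.  The runner argument exists
    only so the loop can be exercised without invoking the real script.
    """
    outputs = []
    for step in range(steps):
        source = f"{prefix}.{step}"
        target = f"{prefix}.{step + 1}"
        with open(target, "w") as handle:
            command = build_propagate_command(network_type, source,
                                              emit_volatile)
            runner(command, stdout=handle, check=True)
        outputs.append(target)
    return outputs
```

Calling run_iterations("Network256", "network", 3, emit_volatile=1) 
from the directory containing Propagate.py would reproduce the three 
Quick Start propagation commands.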

Network propagation may be modeled with a delay factor; this factor is 
applied to both the attacking and newly-infected node, and represents a 
transfer delay of the infection payload.  For example, a propagation 
delay of 10 iterations will lock both sides of a successful attack for 
ten iterations.  Infected nodes that are unsuccessful in locating a 
vulnerable node are not locked (in this case no payload would be 
transferred).
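
The locking rule can be illustrated with a small sketch.  This is not 
the simulator's implementation: the dictionary of per-node lock 
counters and the function names apply_attack and tick are assumptions 
made for illustration, as is the choice to count the lock down once per 
iteration.

```python
def apply_attack(locks, attacker, target, success, delay):
    """Record the locks caused by one infection attempt.

    locks maps a node to its remaining locked iterations (a hypothetical
    structure).  Only a successful attack transfers a payload, so only a
    successful attack locks the attacker and the newly-infected target.
    """
    if success and delay > 0:
        locks[attacker] = delay
        locks[target] = delay
    return locks


def tick(locks):
    """Advance one iteration, dropping nodes whose lock has expired."""
    return {node: remaining - 1
            for node, remaining in locks.items()
            if remaining > 1}


locks = apply_attack({}, "attacker", "victim", success=True, delay=10)
locks = apply_attack(locks, "scanner", "immune", success=False, delay=10)
# Only the two sides of the successful attack are locked at this point.
```

Under these assumptions, both sides of the successful attack stay 
locked for ten calls to tick, while the unsuccessful scanner is free to 
attack again on the next iteration.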

The emit-volatile flag causes some intermediate key-value pairs to be 
output during each iteration.  This is useful both for debugging and 
visualization purposes.  Specifically, this flag emits infection 
attempts (whether successful or not) that took place during the most 
recent iteration.

Since the direct propagation script relies upon MRJob to effectuate 
mapping and reduction, most flags exposed by MRJob are also usable via 
the direct propagation script.  For example, including the switch 
-r emr allows propagation to be performed on the Amazon EMR platform.

To illustrate these options, consider the following command, which 
models propagation for one cycle assuming an IPv4 universe:

> python Propagate.py --network IPv4 my-network

By contrast, the following command executes the same propagation on the 
Amazon EMR platform:

> python Propagate.py -r emr --network IPv4 my-network

By default, propagation results are emitted to standard output.  For 
many purposes, it may be useful to redirect this to a file:

> python Propagate.py -r emr --network IPv4 my-network > my-network-prime

- Schimmy Propagation -

The script SchimmyPropagate.py is largely similar to the Propagate.py 
script discussed above, with the exception that it uses the Schimmy 
pattern to reduce the cost of intermediate shuffling for large data sets.

This script is invoked as:

> python SchimmyPropagate.py --network network_type 
                             --partitions #partitions 
                             [--iterations #iterations] 
                             [--propagation-delay delay] 
                             [--emit-volatile flag] 
                             [additional flags] input-network

Here network_type, iterations, propagation-delay, emit-volatile, and 
input-network function identically to the Propagate.py script 
discussed above.

The partitions flag is a required switch that indicates the number of 
partitions that are used during Schimmy processing.  During 
initialization, the input file is decomposed into this number of 
partitions, and each is associated with a specific reducer.
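
The decomposition step can be sketched as follows.  The partitioner 
actually used by SchimmyPropagate.py is not described in this file, so 
the CRC32-modulo rule below is only a stand-in, and the names 
partition_for and split_records are hypothetical.  What the sketch is 
meant to show is that a given key always maps to the same partition, 
which is what allows each reducer to find the records for its own keys.

```python
import zlib


def partition_for(key, partitions):
    """Map a node key to a partition index using a stable checksum.

    zlib.crc32 is used instead of the built-in hash() because hash() of
    a string varies between Python processes.
    """
    return zlib.crc32(key.encode("utf-8")) % partitions


def split_records(records, partitions):
    """Group (key, value) records into one bucket per partition."""
    buckets = [[] for _ in range(partitions)]
    for key, value in records:
        buckets[partition_for(key, partitions)].append((key, value))
    return buckets


# Hypothetical records: node keys paired with a state label.
records = [(f"node-{index}", "vulnerable") for index in range(100)]
buckets = split_records(records, 8)
```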

Note that the Schimmy pattern requires a specifically-constructed 
partitioner and key split, and as such is only available on the Amazon 
EMR platform.  The proper MRJob switches are automatically specified 
for this purpose.  Additionally, Hadoop version 2.0 is automatically 
selected on the EMR platform.

As above, most other flags that are available via the MRJob system 
may be used with Schimmy propagation.

By way of example, the following command line models propagation for 
one cycle using the Schimmy pattern assuming an IPv4 network and 8 
partitions:

> python SchimmyPropagate.py --network IPv4 --partitions 8 my-network

--- Visualization ---

The script Visualize.py generates a plot of the network as a grid. It 
accepts a previously created network and plots the nodes. If volatile 
edges were emitted during propagation, the script will plot the 
successful infections as directed edges from the attacker to the newly-
infected host. The NetworkX Python graph-plotting library is required.
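
As a rough illustration of a grid layout, the sketch below computes one 
(x, y) position per node.  It assumes nodes are numbered from zero and 
placed row by row on the smallest square grid that holds them (16 by 
16 for 256 nodes); the layout actually used by Visualize.py may differ, 
and grid_positions is a hypothetical helper.  A dictionary of this 
shape is what the NetworkX drawing functions accept as their pos 
argument.

```python
import math


def grid_positions(node_count):
    """Return {node_index: (x, y)} for a row-major square grid."""
    if node_count <= 0:
        return {}
    # Smallest side length whose square holds node_count nodes.
    side = math.isqrt(node_count - 1) + 1
    return {node: (node % side, node // side)
            for node in range(node_count)}


positions = grid_positions(256)
```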

The script is invoked as:

> python Visualize.py network_type input_graph 
                      [output_plot.png 'Title of Plot']

For this script, network_type corresponds to only Network256 and 
NetworkGraphable (IPv4 and IPv6 are too large to visualize).

For Network256, all nodes are plotted (immune, vulnerable, and 
infected). For NetworkGraphable, due to the large size of the entire 
graph, only vulnerable and infected nodes are drawn. Within the script 
itself, two adjustable parameters exist: nodePlotSize adjusts the size 
of each node, and plotSideLength adjusts the length of the sides of the 
generated image.

The following example assumes nodePlotSize = 15 and 
plotSideLength = 12:

> python Visualize.py NetworkGraphable input_graph output_plot.png 'Usage