/pbfs

Parallel Breadth-First Search Algorithm

Primary LanguageC++

Parallel Breadth-First Search

Given a graph G = (V,E) with vertex set V = V(G) and edge set E = E(G), the BFS problem is to compute for each vertex v ∈ V the distance that v lies from a source vertex v0 ∈ V. We measure distance as the minimum number of edges on a path from v0 to v in G. BFS visits the graph in waves (levels).

BFS Levels

A level is all of the vertices equidistant to the source. The number of levels is bounded by the diameter of a graph. The order in which we visit vertices of any given level doesn't matter, so it can be done concurrently. This idea underlies the Parallel BFS (PBFS) algorithm.

PBFS represents levels using an unordered-set data structure, called a bag, which supports efficient parallel traversal over the elements in the set and provides the following operations:

• Creation of a new empty bag

• Insertion of an element into a bag

• Bag unioning

• Bag splitting

A bag is a collection of pennants, no two of which have the same size. A pennant is a tree of 2k nodes, where k is a nonnegative integer. Each node x in this tree contains two pointers (left and right) to its children. The root of the tree has only a left child, which is a complete binary tree of the remaining elements. Two pennants x and y of size 2k can be combined to form a pennant of size 2k+1 in O(1) time. It is also possible to split a pennant of at least two elements into two parts in O(1) time.

Pennant unioning

A bag stores pennants in an array S, called the backbone. Each entry S[k] in the backbone contains either a null pointer or a pointer to a pennant of size 2k.

Backbone

Insertion of an element into a bag employs an algorithm similar to that of incrementing a binary counter. First we package the given element as a pennant of size 1. Then we combine it with all pennants in the backbone starting from the smallest one until a free entry is found. Finally, the pennant is placed into a free entry and all previous entries are set to a null pointer.

Since pennant unioning takes constant time, worst-case time to insert a pennant into a bag of n elements is O(log(n)).

Bag unioning uses an algorithm similar to ripple-carry addition of two binary counters. Given three pennants x, y, and z, where each either has size 2k or is empty, we can merge them to produce a pair of pennants (s, c), where s has size 2k or is empty, and c has size 2k+1 or is empty. With this idea in mind, we can union two bags by merging corresponding pennants of their backbones and the "carry bit" from the previous step. To compute all entries in the backbone of the resulting bag takes O(log(n)) time.

Splitting a bag operates like an arithmetic right shift: we divide each pennant into halves. One half is kept in the original bag and the other one is placed at the same position in the new bag. If there is a remainder (the least significant element), it is inserted into the original bag.

To split a pennant A, we first set the root of a new pennant B to be the root of A. Then we set the root of A to be the only child of B, set the only child of B to be the second child of A, and remove the second child of A. Each of the pennants A and B now contain half the elements. Like pennant unioning, splitting a pennant also operates in O(1) time. Consequently, the asymptotic runtime of bag splitting is O(log(n)).

Results

The following tables compare execution time (in seconds) of a sequential and parallel (OpenMP) versions of the BFS algorithm for various types of graph run on a Core i7-4712HQ CPU @ 2.3 GHz.

  • Graph density = 0.01
#Vertices Sequential OMP Parallel
1000 0.0010 0.0015
10000 0.0221 0.0338
50000 2.8965 0.9770
100000 1.7115 0.6657
  • Graph density = 0.3
#Vertices Sequential OMP Parallel
1000 0.0051 0.0030
10000 0.3915 0.1538
50000 8.3698 2.6996
  • Graph density = 0.5
#Vertices Sequential OMP Parallel
1000 0.0076 0.0171
10000 0.6261 0.2970
50000 13.7973 4.5804
  • Graph density = 0.9
#Vertices Sequential OMP Parallel
1000 0.0114 0.0219
10000 1.1992 0.5215
50000 24.8087 8.1881

Credits

  • The project implements ideas from the paper "A Work-Efficient Parallel Breadth-First Search Algorithm (or How to Cope with the Nondeterminism of Reducers)" by Charles E. Leiserson and Tao B. Schardl.

  • The BFS image was taken from the High Performance Computing course offered at Udacity.