mpi-histo

A program developed using MPI for distributed computation of histograms for large data sets and their performance analysis on multi-core systems. The program is written in C and uses the SPMD model to spawn processes and compute the histogram. There are two implementations:

histo.c     => uses point-to-point communication (MPI_Send, MPI_Recv) to communicate between processes
mpi_histo.c => uses collective communication (MPI_Bcast, MPI_Scatter, and MPI_Gather) to communicate between processes
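
For orientation, the sketch below shows one way the collective version could be structured with MPI_Bcast, MPI_Scatter, and MPI_Gather; the variable names and details are illustrative, not the repository's exact code.

    /* Minimal sketch of the collective approach (illustrative, not the exact code).
       Rank 0 generates the data, parameters are broadcast, the data is scattered
       in equal chunks, and per-process bin counts are gathered back to rank 0. */
    #include <mpi.h>
    #include <stdio.h>
    #include <stdlib.h>

    int main(int argc, char *argv[]) {
        int rank, size;
        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);

        /* These would normally come from the command line arguments. */
        int bin_count = 10, data_count = 1000000;
        double data_min = 1.0, data_max = 100.0;
        MPI_Bcast(&bin_count, 1, MPI_INT, 0, MPI_COMM_WORLD);
        MPI_Bcast(&data_count, 1, MPI_INT, 0, MPI_COMM_WORLD);

        int local_n = data_count / size;   /* assumes size divides data_count evenly */
        double *data = NULL;
        if (rank == 0) {                   /* rank 0 creates the random data */
            data = malloc((size_t)data_count * sizeof(double));
            for (int i = 0; i < data_count; i++)
                data[i] = data_min + (data_max - data_min) * (rand() / (double)RAND_MAX);
        }

        /* Distribute equal-sized chunks of the data to every process. */
        double *local_data = malloc((size_t)local_n * sizeof(double));
        MPI_Scatter(data, local_n, MPI_DOUBLE,
                    local_data, local_n, MPI_DOUBLE, 0, MPI_COMM_WORLD);

        /* Each process counts its own chunk into local bins. */
        int *local_counts = calloc(bin_count, sizeof(int));
        double bin_width = (data_max - data_min) / bin_count;
        for (int i = 0; i < local_n; i++) {
            int b = (int)((local_data[i] - data_min) / bin_width);
            if (b >= bin_count) b = bin_count - 1;   /* clamp data_max into the last bin */
            local_counts[b]++;
        }

        /* Gather every process's counts and sum them per bin on rank 0. */
        int *all_counts = NULL;
        if (rank == 0) all_counts = malloc((size_t)size * bin_count * sizeof(int));
        MPI_Gather(local_counts, bin_count, MPI_INT,
                   all_counts, bin_count, MPI_INT, 0, MPI_COMM_WORLD);

        if (rank == 0) {
            for (int b = 0; b < bin_count; b++) {
                int total = 0;
                for (int p = 0; p < size; p++)
                    total += all_counts[p * bin_count + b];
                printf("bin %d: %d\n", b, total);
            }
        }

        MPI_Finalize();
        return 0;
    }

The point-to-point version would replace the MPI_Scatter and MPI_Gather calls with matching MPI_Send/MPI_Recv pairs between rank 0 and every other process.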

Compilation and Execution

The programs are compiled using mpicc and run using mpiexec. The compilation and execution instructions are shown below:

Compilation: mpicc mpi_histo.c -o <executable_name>

Execution: mpiexec -n <no_processes> <executable_name> <bin_count> <data_min> <data_max> <data_count>

The following are taken from the command line:

no_processes  : Total number of processes to start.
bin_count     : Number of bins in the histogram.
data_min      : Minimum value for the data.
data_max      : Maximum value for the data.
data_count    : Number of random data points to generate.

By default the program creates <data_count> random data points between <data_min> and <data_max> and computes a histogram with <bin_count> bins. The code can be modified for other scenarios, for example to read a large data file and compute the histogram on real data.

Example:
Compilation: mpicc mpi_histo.c -o mpi_histo
Execution: mpiexec -n 2 ./mpi_histo 10 1 100 10
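
In this example, assuming equal-width bins, each of the 10 random values in [1, 100] falls into one of 10 bins of width (100 - 1) / 10 = 9.9, so a value x is counted in bin floor((x - 1) / 9.9), with x = 100 clamped into the last bin.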

Performance Analysis

The histogram computation was performed for 50000000, 100000000 and 200000000 data points using 1 to 64 cores on a 4-node CHPC kingspeak cluster. The results obtained are described here.

1. Timing Performance


Total execution time: The time taken by the program to complete its execution.
Application time: The time taken by a process to create and distribute the data and compute the histogram.
Histogram time: The time taken by one process to compute the histogram on its scattered data.

From the figures, for the serial program (processes = 1), the time taken to compute the histogram increases as the number of data elements increases. As the number of processes increases, the total time to compute the histogram decreases, since the work is divided among the processes. We can also see that the total time and the application time are greater than the time taken to compute the histogram. The reason is that the total execution time and the application time also include the time taken to scatter the data among the processes, whereas the histogram time only accounts for the maximum time taken by any single process to compute the histogram.
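
One plausible way these timings could be collected is sketched below using MPI_Wtime; this is assumed instrumentation for illustration, not necessarily the exact code in the repository.

    /* Assumed instrumentation sketch (illustrative only): timestamp each phase
       with MPI_Wtime and report the slowest rank's histogram time. */
    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char *argv[]) {
        int rank;
        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        double t_total_start = MPI_Wtime();  /* total execution time starts here */
        /* ... argument parsing, allocation ... */

        double t_app_start = MPI_Wtime();    /* application time: create and
                                                distribute data, compute histogram */
        /* ... rank 0 generates the data, data is scattered ... */

        double t_hist_start = MPI_Wtime();   /* histogram time: local counting only */
        /* ... each rank counts its chunk into local bins ... */
        double hist_time = MPI_Wtime() - t_hist_start;

        /* ... counts gathered back to rank 0 ... */
        double app_time   = MPI_Wtime() - t_app_start;
        double total_time = MPI_Wtime() - t_total_start;

        /* The slowest rank determines the reported histogram time. */
        double max_hist_time;
        MPI_Reduce(&hist_time, &max_hist_time, 1, MPI_DOUBLE, MPI_MAX, 0, MPI_COMM_WORLD);

        if (rank == 0)
            printf("total %f  application %f  histogram %f\n",
                   total_time, app_time, max_hist_time);

        MPI_Finalize();
        return 0;
    }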

2. Speedup Analysis


The speedup of a parallel program is given as S(n,p) = Tserial(n) / Tparallel(n, p). From the figures we can see that the speedup for the program's total time is lower than the speedups for the application time and the histogram computation time. As with the timing results, this is because the total execution time includes the time taken by process 0 to populate and scatter the data. For the histogram time, however, the speedup is much higher and is nearly linear.
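
For example, from the table below for 50000000 data points, the histogram-time speedup at 64 processes is S(n,64) = 73.569383 / 1.150515 ≈ 63.9, which is essentially linear, while the total-time speedup is only 74.423788 / 3.436093 ≈ 21.7.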

Note: As the number of processes increases, the communication overhead also grows, which limits the reduction in execution time and hence the achievable speedup.

Timing data from my runs on the 4-node (kingspeak) CHPC cluster:

Processes | Total Time (s) | Application Time (s) | Histogram Time (s) | Data Count
----------|----------------|----------------------|--------------------|-----------
1         | 74.423788      | 73.756131            | 73.569383          | 50000000
4         | 19.308111      | 18.636843            | 18.439549          | 50000000
16        | 5.504118       | 4.754371             | 4.615857           | 50000000
32        | 3.926105       | 3.230799             | 2.302202           | 50000000
64        | 3.436093       | 2.481364             | 1.150515           | 50000000
1         | 149.605372     | 148.257896           | 147.896947         | 100000000
4         | 38.644788      | 37.317932            | 36.934477          | 100000000
16        | 10.867961      | 9.537738             | 9.233756           | 100000000
32        | 7.780161       | 6.458228             | 4.580765           | 100000000
64        | 6.301298       | 4.971914             | 2.294652           | 100000000
1         | 298.528679     | 295.804794           | 294.842392         | 200000000
4         | 77.303529      | 74.629462            | 73.983486          | 200000000
16        | 21.717287      | 19.049401            | 18.434128          | 200000000
32        | 15.902424      | 13.048625            | 9.19137            | 200000000
64        | 12.687857      | 9.89958              | 4.63593            | 200000000