Parallel--Merge-Sort-Using-MPI

Implementing Merge Sort Using MPI

Abstract

While merge sort is ll-understood in parallel algorithms theory, relatively little is known of how to implement parallel merge sort with mainstream parallel programming platforms as MPI, and run it. Hence better understanding of merge sort parallelization can contribute to better understanding of divide-and-conquer parallelization in general. Here, I investigate a serial and parallel merge-sorts with message-passing merge sort that runs on computer clusters with MPI. In my experiments, comparing measured time of serial and parallel version of merge sort, parallel version has achieved best speedup.

1- Merge Sort Algorithm

Merge Sort follows the rule of Divide and Conquer. But it doesn't divides the list into two halves. In merge sort the unsorted list is divided into N sublists, each having one element, because a list of one element is considered sorted. Then, it repeatedly merge these sublists, to produce new sorted sublists, and at lasts one sorted list is produced. Merge Sort is quite fast, and has a time complexity of O(n log n). It is also a stable sort, which means the "equal" elements are ordered in the same order in the sorted list. The merge sort function should recursively sort the subarray array[p..r] i.e. after calling merge sort (array,p,r) the elements from index p to index r of array should be sorted in ascending order. To remind you of the merge sort algorithm: If the subarray has size 0 or 1, then it's already sorted, and so nothing needs to be done. Otherwise, merge sort uses divide-and-conquer to sort the subarray. Use merge (array, p, q, r) to merge sorted sub arrays array[p..q] and array[q+1..r]. Once implemented, uncomment the Program.assertEqual() at the bottom to verify that the test assertion passes. The remaining piece of merge sort is the merge function, which merges two adjacent sorted subarrays, array[p..q] and array[q+1..r] into a single sorted subarray in array[p..r]. I'll see how to construct this function so that it's as efficient as possible. Let's say that the two subarrays have a total of nn elements. I have to examine each of the elements in order to merge them together, and so the best I can hope for would be a merging time of \Theta(n)Θ(n). Indeed, I'll see how to merge a total of nn elements in \Theta(n)Θ(n) time. In order to merge the sorted subarrays array[p..q] and array[q+1..r] and have the result in array[p..r], I first need to make temporary arrays and copy array[p..q] and array[q+1..r] into these temporary arrays. I can't write over the positions in array[p..r] until I have the elements originally in array[p..q] and array[q+1..r] safely copied.

2- Serial Implement of Merge Sort in Message Passing Interface

Merge sort is a divide and conquer paradigm. It divides those issue under a number of sub problems that are littler instances of the same problem; At that point it overcomes the sub problems Toward fathoming them recursively; at last, it combines the answers for those sub problems under the result for those unique issue.

2-1 Sort Function

The sort function shows the implementation of sort algorithm. This function receives an array, its beginning and finishing index as parameters. Convenient it finds the mid index of order and it finds the number of elements. At this period the accepted array is divided to two smaller sub array which I are able to sort them by calling the sort function. Now, in my function, I should divide to two and do it until I achieve an array with one element. So this array is considered as sorted array. Hence, then return statement of this function locates here. In alternate situation, the sort function would be supposed for each sub arrays. In sort function, at whatever point the two subarrays are sorted, I have to call another capacity which will merge these two subarrays and structure a merge sorted array, as I find in the last statement of sort function.

2-2 Merge Function

The accompanying code in shows merge function that is dealing with merge some portion of sorting algorithm. This function gets two array and the related size of them and return a merge sorted array. The measure of the merge sorted exhibit array should be the aggregate size of two got array Msize = Fsize + Ssize. I have characterized three diverse iterators (fi, si, mi) to stroll through every array. Everything I need is to contrast the first element and all element of second array. To do as such, I select the first element of first array and compare it and the first element of second array. I'll put smaller one into the M array which would contains all element of both initial array in a sorted order. I'll keep pushing ahead till one of the array achieves the end. As of now I am certain elements of the array which I have come to the end of it are inserted into array M and all remained elements in the other array are greater than every one of them, on the grounds that on the off chance that they were littler, they would be embedded into M before the first array element be inserted. Next, the remained elements on the other array will be put into the end of array M, which has still a few places left to oblige them. I will put them with the same order it as of now has, in light of the fact that both arrays were sorted before merge together. The last step is to partition the sorted merged array M into two equivalent size subarrays and supplant them with F and S which are the information array to the function.

2-3 Main Function

Here I defined all of my data and data array with size of N. In addition, I used clock function to take time.

3- Parallel Implementation of Merge Sort Algorithm

Single Instruction Multiple Data. I have the same tasks and different data, so it is a good idea to use single instruction multiple parallel implementation.

3-1 Partitioning the Data equally

At first I partition the data equally into processors, but sometimes data cannot be equally dividable by processors. In order to have equal local data in each processor, I add one to local n, the quotient of the number of data (n) divided by the number of processors. And I have some new elements so I should assign them into zero. In order to calculate the number of new elements I subtract the local n from r, the remainder of the dividing the number of data by the number of processors. Afterwards, I generate the elements of data randomly. In order to generate some three-digit numbers I use the remainder of dividing random numbers by 1000.

3-2 Sorting the Data

I use the function MPI-Broadcast to send the size of local n to all processors and also allocate memory for them. And I use the function MPI Scatter to send the needed components to each of the other processors. Afterwards, I call the sort function for all processors.

3-3 Merging the Data

I use the tree structure global sum strategy to merge the sorted data in each processor to produce a merged sorted array. I need to find the pattern the rank of receiver and sender processors. If the rank of a processor is dividable by my step it receives the sorted array of the processor the rank of which is not dividable by the step, and merge it with its sorted array. I assign steps the multiples of 2.

4- Taking time

I calculate the CPU time by using the function Clock to measure the run-time of both serial and parallel implementations.

pouyarz/Parallel--Merge-Sort-Using-MPI