Summary

This document is a set of lecture notes on sorting algorithms. It covers topics like bubble sort, quick sort, bucket sort, odd-even transposition, and the parallel implementation of these algorithms. The focus is on parallel and distributed systems.

Full Transcript

9. Sorting Algorithms Parallel & Distributed Processing- 1502412 Prof. Ali El- 1 Moursy Topic Overview Issues in Sorting on Parallel Computers Bubble Sort and its Variants Quick sort Bucket & Samp...

9. Sorting Algorithms Parallel & Distributed Processing- 1502412 Prof. Ali El- 1 Moursy Topic Overview Issues in Sorting on Parallel Computers Bubble Sort and its Variants Quick sort Bucket & Sample Sort Parallel & Distributed Processing- 1502412 Prof. Ali El- 2 Moursy Sorting: Overview One of the most commonly used and well-studied kernels. The fundamental operation of comparison-based sorting is compare-exchange. The lower bound on any comparison-based sort of n numbers is Θ(nlog n). We focus here on comparison-based sorting algorithms. Parallel & Distributed Processing- 1502412 Prof. Ali El- 3 Moursy Bubble Sort and its Variants The complexity of bubble sort is Θ(n2). Bubble sort is difficult to parallelize since the algorithm has no concurrency. A simple variant, though, uncovers the concurrency. Parallel & Distributed Processing- 1502412 Prof. Ali El- 4 Moursy Sorting: Basics What is a parallel sorted sequence? Where are the input and output lists stored? We assume that the input and output lists are distributed. The sorted list is partitioned with the property that each partitioned list is sorted and each element in processor Pi's list is less than that in Pj's list if i < j. Parallel & Distributed Processing- 1502412 Prof. Ali El- 5 Moursy Sorting: Parallel Compare Exchange Operation A parallel compare-exchange operation. Processes Pi and Pj send their elements to each other. Process Pi keeps min{ai,aj}, and Pj keeps max{ai, aj}. Parallel & Distributed Processing- 1502412 Prof. Ali El- 6 Moursy Sorting: Basics What is the parallel counterpart to a sequential comparator? If each processor has one element, the compare exchange operation stores the smaller element at the processor with smaller id. This can be done in ts + tw time. If we have more than one element per processor, we call this operation a compare split. Assume each of two processors have n/p elements. After the compare-split operation, the smaller n/p elements are at processor Pi and the larger n/p elements at Pj, where i < j. The time for a compare-split operation is (ts+ twn/p), assuming that the two partial lists were initially sorted. Parallel & Distributed Processing- 1502412 Prof. Ali El- 7 Moursy Sorting: Parallel Compare Split Operation A compare-split operation. Each process sends its block of size n/p to the other process. Each process merges the received block with its own block and retains only the appropriate half of the merged block. In this example, process Pi retains the smaller elements and process Pi retains the larger elements. Parallel & Distributed Processing- 1502412 Prof. Ali El- 8 Moursy Odd-Even Transposition Sequential odd-even transposition sort algorithm. Parallel & Distributed Processing- 1502412 Prof. Ali El- 9 Moursy Odd-Even Transposition Sorting n = 8 elements, using the odd-even transposition sort algorithm. During each phase, n = 8 elements are compared. Parallel & Distributed Processing- 1502412 Prof. Ali El- 10 Moursy Odd-Even Transposition After n phases of odd-even exchanges, the sequence is sorted. Each phase of the algorithm (either odd or even) requires Θ(n) comparisons. Serial complexity is Θ(n2). Parallel & Distributed Processing- 1502412 Prof. Ali El- 11 Moursy Parallel Odd-Even Transposition Consider the one item per processor case. There are n iterations, in each iteration, each processor does one compare-exchange. The parallel run time of this formulation is Θ(n). This is cost optimal with respect to the base serial algorithm but not the optimal one. Parallel & Distributed Processing- 1502412 Prof. Ali El- 12 Moursy Parallel Odd-Even Transposition Parallel & Distributed Processing- 1502412 Prof. Ali El- 13 Moursy Parallel Odd-Even Transposition Consider a block of n/p elements per processor. The first step is a local sort. In each subsequent step, the compare exchange operation is replaced by the compare split operation. The parallel run time of the formulation is Parallel & Distributed Processing- 1502412 Prof. Ali El- 14 Moursy Example: MPI Odd-Even Sort 1 #include 2 #include 3 4 main(int argc, char *argv[]) 5{ 6 int n; 7 int npes; 8 int myrank; 9 int nlocal; 10 int *elmnts; 11 int *relmnts; 12 int oddrank; 13 int evenrank; 14 int *wspace; 15 int i; 16 MPI_Status status; 17 18 19 MPI_Init(&argc, &argv); 20 MPI_Comm_size(MPI_COMM_WORLD, &npes); 21 MPI_Comm_rank(MPI_COMM_WORLD, &myrank); Parallel & Distributed Processing- 1502412 Prof. Ali El- 15 Moursy Example: MPI Odd-Even Sort 23 n = atoi(argv); 24 nlocal = n/npes; 25 26 27 elmnts = (int *)malloc(nlocal*sizeof(int)); 28 relmnts = (int *)malloc(nlocal*sizeof(int)); 29 wspace = (int *)malloc(nlocal*sizeof(int)); 30 31 32 srandom(myrank); 33 for (i=0; i

Use Quizgecko on...
Browser
Browser