Podcast
Questions and Answers
What is the lower bound on any comparison-based sorting of $n$ numbers?
What is the lower bound on any comparison-based sorting of $n$ numbers?
In a compare-split operation, what is the time complexity if each processor has $n/p$ elements?
In a compare-split operation, what is the time complexity if each processor has $n/p$ elements?
What does a comparison-based sorting algorithm primarily rely on?
What does a comparison-based sorting algorithm primarily rely on?
What is the role of a comparator in sorting networks?
What is the role of a comparator in sorting networks?
Signup and view all the answers
What is assumed about the input and output lists in parallel sorted sequences?
What is assumed about the input and output lists in parallel sorted sequences?
Signup and view all the answers
Which of the following statements is true regarding a parallel compare-exchange operation?
Which of the following statements is true regarding a parallel compare-exchange operation?
Signup and view all the answers
What is the outcome of a compare-split operation between two processors?
What is the outcome of a compare-split operation between two processors?
Signup and view all the answers
What type of sorting algorithms focuses on comparison-based methods?
What type of sorting algorithms focuses on comparison-based methods?
Signup and view all the answers
What is the time complexity of sorting n elements using a bitonic sorting network?
What is the time complexity of sorting n elements using a bitonic sorting network?
Signup and view all the answers
Which statement about bitonic sequences is correct?
Which statement about bitonic sequences is correct?
Signup and view all the answers
Given a bitonic sequence s = ⟨2, 3, 5, 7, 6, 4, 2, 1⟩, which of the following is a valid subsequence s1?
Given a bitonic sequence s = ⟨2, 3, 5, 7, 6, 4, 2, 1⟩, which of the following is a valid subsequence s1?
Signup and view all the answers
When creating a bitonic sequence from an unsorted sequence, what is the first step?
When creating a bitonic sequence from an unsorted sequence, what is the first step?
Signup and view all the answers
What does the kernel of a bitonic sorting network uniquely accomplish?
What does the kernel of a bitonic sorting network uniquely accomplish?
Signup and view all the answers
What type of comparators are represented by and Ө in sorting networks?
What type of comparators are represented by and Ө in sorting networks?
Signup and view all the answers
If elements of a bitonic sequence are denoted as a0 ≤ a1 ≤ ... ≤ an/2-1 and an/2 ≥ an/2+1 ≥ ... ≥ an-1, what do these notations represent?
If elements of a bitonic sequence are denoted as a0 ≤ a1 ≤ ... ≤ an/2-1 and an/2 ≥ an/2+1 ≥ ... ≥ an-1, what do these notations represent?
Signup and view all the answers
What is the result of recursively applying the bitonic merge procedure on subsequences s1 and s2?
What is the result of recursively applying the bitonic merge procedure on subsequences s1 and s2?
Signup and view all the answers
What is the depth of the bitonic sorting network?
What is the depth of the bitonic sorting network?
Signup and view all the answers
In a bitonic sort, how many comparators are present in each stage of the network for n items?
In a bitonic sort, how many comparators are present in each stage of the network for n items?
Signup and view all the answers
What communication characteristic is noted in the mapping of bitonic sort onto hypercubes?
What communication characteristic is noted in the mapping of bitonic sort onto hypercubes?
Signup and view all the answers
How does the time complexity of a serial implementation of the bitonic sort network compare to that of its parallel counterpart?
How does the time complexity of a serial implementation of the bitonic sort network compare to that of its parallel counterpart?
Signup and view all the answers
What implication does the mapping of wires to processors in a hypercube have for the compare-exchange operation?
What implication does the mapping of wires to processors in a hypercube have for the compare-exchange operation?
Signup and view all the answers
What is the communication overhead expected when mapping bitonic sort to meshes compared to hypercubes?
What is the communication overhead expected when mapping bitonic sort to meshes compared to hypercubes?
Signup and view all the answers
What is the parallel time complexity of bitonic sort in hypercube architecture?
What is the parallel time complexity of bitonic sort in hypercube architecture?
Signup and view all the answers
What advantage does the bitonic sorting algorithm have over its serial counterpart?
What advantage does the bitonic sorting algorithm have over its serial counterpart?
Signup and view all the answers
What is the total communication performed by each process in the row-major shuffled mapping of bitonic sort?
What is the total communication performed by each process in the row-major shuffled mapping of bitonic sort?
Signup and view all the answers
During the last stage of the bitonic sort on a mesh, what operation do process pairs perform?
During the last stage of the bitonic sort on a mesh, what operation do process pairs perform?
Signup and view all the answers
What is the total computation performed by each process in the row-major shuffled mapping?
What is the total computation performed by each process in the row-major shuffled mapping?
Signup and view all the answers
In the block of elements per processor, which sorting technique is first used on the local block?
In the block of elements per processor, which sorting technique is first used on the local block?
Signup and view all the answers
What does the parallel runtime in the block of elements per processor case primarily depend on?
What does the parallel runtime in the block of elements per processor case primarily depend on?
Signup and view all the answers
What is the maximum number of processes the algorithm can efficiently use in the block of elements per processor?
What is the maximum number of processes the algorithm can efficiently use in the block of elements per processor?
Signup and view all the answers
What is the isoefficiency function due to both communication and extra work for p processes?
What is the isoefficiency function due to both communication and extra work for p processes?
Signup and view all the answers
What characterizes the row-major shuffled mapping in bitonic sort?
What characterizes the row-major shuffled mapping in bitonic sort?
Signup and view all the answers
What is the time complexity of the sequential bubble sort algorithm?
What is the time complexity of the sequential bubble sort algorithm?
Signup and view all the answers
Why is bubble sort considered difficult to parallelize?
Why is bubble sort considered difficult to parallelize?
Signup and view all the answers
In odd-even transposition sort, how many comparisons are made during each phase for n elements?
In odd-even transposition sort, how many comparisons are made during each phase for n elements?
Signup and view all the answers
What is the parallel run time of the odd-even transposition sort in the one item per processor case?
What is the parallel run time of the odd-even transposition sort in the one item per processor case?
Signup and view all the answers
What is the isoefficiency function of the parallel formulation for p = O(log n)?
What is the isoefficiency function of the parallel formulation for p = O(log n)?
Signup and view all the answers
During the odd-even transposition sort, after how many phases is the sequence sorted?
During the odd-even transposition sort, after how many phases is the sequence sorted?
Signup and view all the answers
What is the serial complexity of the odd-even transposition sort?
What is the serial complexity of the odd-even transposition sort?
Signup and view all the answers
What operation replaces the compare-exchange operation in the parallel formulation of odd-even transposition sort?
What operation replaces the compare-exchange operation in the parallel formulation of odd-even transposition sort?
Signup and view all the answers
Study Notes
Sorting Algorithms
- One of the most well-studied and common parallel computing algorithms is sorting
- Sorting algorithms can be categorized as internal or external
- Sorting algorithms may also be classified as comparison-based or non-comparison-based
- The fundamental operation in comparison-based sorting is compare-exchange
- Finding the lower bound of any comparison-based sort of n numbers is a function of Θ(nlog n)
- This presentation focuses on comparison-based sorting algorithms
Sorting: Basics
- A parallel sorted sequence is a sequence that is partitioned with the property that each partitioned list is sorted and each element is progressively larger towards the higher processor IDs
- The input and output lists are assumed to be distributed
Sorting: Parallel Compare Exchange Operation
- A parallel compare-exchange operation is where processes Pi and Pj send their elements to each other. Pi keeps min{ai,aj} and Pj keeps max{ai, aj}.
Sorting: Basics Continued
- The parallel counterpart to a sequential comparator is the compare-exchange operation. Each processor has one element, the compare exchange operation stores the smaller element at the processor with a smaller id. This can be done in ts + tw time
- When dealing with more than one element per processor, the compare-exchange operation is called a compare split
- It is assumed that each of the two processors has n/p elements. After the compare split operation, the smaller n/p elements are at processor Pi, and the larger n/p elements are at Pj, where i < j
- The time for a compare-split operation is (ts + twn/p), assuming that the two partial lists were initially sorted
Sorting: Parallel Compare Split Operation
- A compare-split operation is where each process sends its block of size n/p to the other process. Each process merges the received block with its own block and retains only the appropriate half of the merged block
Sorting Networks
- Networks of comparators are specifically designed for sorting
- A comparator has two inputs x and y and two outputs x' and y'
- An increasing comparator x' = min{x,y} and y' = min{x,y}, and vice-versa
- An increasing comparator is denoted by and a decreasing comparator by Ө
- The network's speed is proportional to its depth
Sorting Networks: Comparators
- Schematic representations of comparators include an increasing comparator and a decreasing comparator
Sorting Networks
- Every sorting network is made up of a series of parallel columns, where each column contains a number of comparators connected in parallel
Sorting Networks: Bitonic Sort
- A bitonic sorting network sorts n elements in Θ(log2n) time
- A bitonic sequence has two tones, increasing and decreasing, or vice versa.
- A cyclic rotation of a bitonic sequence is also considered bitonic
- The kernel of the network is the rearrangement of a bitonic sequence into a sorted sequence
Sorting Networks: Bitonic Sort Continued
- Given s = a0,a1,…,an-1 as a bitonic sequence such that a0 ≤ a 1 ≤ ··· ≤ an/2-1 and an/2 ≥ an/2+1 ≥ ··· ≥ an-1
- There are two subsequences of s:
- s1 = min{a0,an/2},min{a1,an/2+1},…,min{an/2-1,an-1}
- s2 = max{a0,an/2},max{a1,an/2+1},…,max{an/2-1,an-1}
- s1 and s2 are both bitonic, and each element of s1 is less than every element in s2
- The procedure can be applied recursively on s1 and s2 to get the sorted sequence
Sorting Networks: Bitonic Sort
- To sort an unsorted sequence using bitonic merge, we must build a single bitonic sequence from the given sequence
- A sequence of length 2 is a bitonic sequence
- A bitonic sequence of length 4 can be built by sorting the first two elements using BM and next two using ӨBM
- This process can be repeated to generate larger bitonic sequences
Bitonic Sort Example
- This example gives an example of how a sequence is sorted using bitonic merges
Sorting Networks: Bitonic Sort Example
- A schematic representation of a network that converts an input sequence into a bitonic sequence is presented, where the last merging network (BM) sorts the input
Sorting Networks: Bitonic Sort Example
- Another schematic representation of a comparator network that transforms an input sequence of 16 unordered numbers into a bitonic sequence is presented
Sorting Networks: Bitonic Sort
- The depth of the network is Θ(log2 n)
- Each stage of the network contains n/2 comparators
- A serial implementation of the network would have a complexity of Θ(nlog2 n)
Mapping Bitonic Sort to Hypercubes
- This maps bitonic sort to a hypercube with each wire mapped to a hypercube process, and each connection represents a compare-exchange between processes
Mapping Bitonic Sort to Hypercubes
- This presents the communication characteristics of bitonic sort on a hypercube, with each stage communicating along the dimensions shown
Mapping Bitonic Sort to Hypercubes
- This presents a Parallel formulation of bitonic sort on a hypercube with n = 2d processes.
Mapping Bitonic Sort to Hypercubes Continued
- During each step of the algorithm, every process performs a compare-exchange operation (single nearest neighbor communication of one word)
- The parallel time is Tp = Θ(log2n)
- This algorithm is cost optimal with respect to its serial counterpart but not with respect to the most optimal sorting algorithm
Mapping Bitonic Sort to Meshes
- Mapping bitonic sort to a mesh has an overhead due to lower mesh connectivity than a hypercube
- Consider the row-major shuffled mapping of wires to processors
Mapping Bitonic Sort to Meshes
- The different ways of mapping input wires of the bitonic sorting network to a mesh are included
Mapping Bitonic Sort to Meshes
- This shows the last stage of the bitonic sort algorithm for n = 16 on a mesh using the row-major shuffled mapping
- Each process pair compares and exchanges their elements
- Arrows show the pairs of processes that perform compare-exchange operations
Mapping Bitonic Sort to Meshes: Details
-
Wires that differ at the ith least significant bit are mapped onto mesh processes that are 2(i-1)/2 communication links away
-
The total amount of communication performed by each process is 2 7 n , or ( n ) log n i i =1 j =1 ( j −1) / 2
-
The total computation performed by each process is Θ(log2n)
-
The parallel run time is Θ(log2 n + 7n)
-
This is not cost-optimal
Block of Elements Per Processor
- Each process is assigned a block of n/p elements.
- The first step is to sort the local block.
- Each subsequent compare-exchange operation is replaced by a compare-split operation.
- The bitonic network can be viewed as having (1 + log p)(log p)/2 steps.
Block of Elements Per Processor: Hypercube
- Initially, the processes sort their n/p elements (using merge sort) in time Θ((n/p)log(n/p)) and then perform Θ(log2p) compare-split steps.
- The parallel run time of this formulation is Θ((n/p)log(n/p) + log2p).
- This algorithm can efficiently use up to p = (2 logn ) processes.
- The isoefficiency function due to communication and extra work is Θ(plog plog2p).
Block of Elements Per Processor: Mesh
- The parallel runtime in this case is given by: Tp = Θ((n/p)log(n/p) + plog2p)
- This formulation can efficiently use up to p = Θ(log2n) processes.
- The isoefficiency function is Θ(p2p)
Performance of Parallel Bitonic Sort
- The chart shows the performance of parallel formulations of bitonic sort for n elements on p processes.
Bubble Sort and its Variants
- The sequential bubble sort algorithm compares and exchanges adjacent elements in the sequence to be sorted.
Bubble Sort and its Variants
- The complexity of bubble sort is Θ(n2)
- Bubble sort is difficult to parallelize since the algorithm has no concurrency
Odd-Even Transposition
- The sequential odd-even transposition sort algorithm is provided
Odd-Even Transposition
- Sorting n = 8 elements is shown, using the odd-even transposition sort algorithm
- Each phase has n = 8 elements compared
Odd-Even Transposition
- After n phases of odd-even exchanges, the sequence is sorted
- Each phase (odd or even) requires Θ(n) comparisons
- Serial complexity is Θ(n2)
Parallel Odd-Even Transposition
- There are n iterations where each processor does one compare-exchange
- The parallel run time of this formulation is Θ(n)
- This formulation is cost-optimal with respect to the base serial algorithm, but not optimal
Parallel Odd-Even Transposition
- The Parallel formulation of odd-even transposition is shown
Parallel Odd-Even Transposition
- Using a block of n/p elements per processor, the first step is a local sort.
- In each subsequent step, the compare exchange operation is replaced by the compare split operation
- The parallel run time of the formulation is Θ(n/p)log(n/p) + n)
Parallel Odd-Even Transposition
- The parallel formulation is cost-optimal for p = O(log n).
- The isoefficiency function of this parallel formulation is Θ(p2p)
Closing
- Thank you.
Studying That Suits You
Use AI to generate personalized quizzes and flashcards to suit your learning preferences.
Related Documents
Description
This quiz explores the principles and techniques behind sorting algorithms, focusing on their classification as internal or external, and comparison-based vs. non-comparison-based types. It also covers the parallel compare-exchange operation, highlighting the mechanics of sorting in a distributed system. Test your understanding of these key concepts in parallel computing!