Sorting Algorithms PDF
Document Details
Uploaded by CelebratoryHouston3497
Tags
Summary
This document presents sorting algorithms, focusing on internal sorting methods like Insertion Sort and Shell Sort. It analyzes their time complexities and concepts like inversions. The document discusses the average and worst-case scenarios to understand the efficiency of these algorithms. It also touches upon the notion of lower bounds in sorting.
Full Transcript
SORTING Sorting is possibly the most frequently executed operation in computing! 1 SORTING Sorting is possibly the most frequently executed operation in computing! – Given an array of N comparable objects a1,a2,a3,….aN –...
SORTING Sorting is possibly the most frequently executed operation in computing! 1 SORTING Sorting is possibly the most frequently executed operation in computing! – Given an array of N comparable objects a1,a2,a3,….aN – Find a permutation of [1,2,…N] -> [i1,i2,i3,…,iN] such that ai1≤ai2 ≤ … ≤ aiN Many algorithms use sorting for implementing many operations faster 2 SORTING Sorting is possibly the most frequently executed operation in computing! – Internal Sorting All data to be sorted fits in the main memory 3 SORTING Sorting is possibly the most frequently executed operation in computing! – Internal Sorting All data to be sorted fits in the main memory – External Sorting Sorting can not be performed in main memory and external memory such as disks have to be used 4 SORTING We will concentrate on internal sorting – The keys of the items to be sorted come from an essentially infinite domain and can be compared with > and and 0 && tmp < a[j – 1]; j ––) a[j] = a[j –1]; a[j] = tmp; } } Insertion Sort Demo 25 INSERTION SORT template void insertionSort (vector & a) { int j for (int p = 1; p < a.size(); p++) { Comparable tmp = a[p]; for (j = p; j > 0 && tmp < a[j – 1]; j ––) a[j] = a[j –1]; a[j] = tmp; } } Loop for the passes 26 INSERTION SORT template void insertionSort (vector & a) { int j for (int p = 1; p < a.size(); p++) { Comparable tmp = a[p]; for (j = p; j > 0 && tmp < a[j – 1]; j ––) a[j] = a[j –1]; a[j] = tmp; } } Remember the element a[p] in the tmp variable 27 INSERTION SORT template void insertionSort (vector & a) { int j for (int p = 1; p < a.size(); p++) { Comparable tmp = a[p]; for (j = p; j > 0 && tmp < a[j – 1]; j ––) a[j] = a[j –1]; a[j] = tmp; } } Keep on moving elements toward right (to open up a hole for a[p]) until either j=0 or tmp is a[j – 1] 28 INSERTION SORT template void insertionSort (vector & a) { int j for (int p = 1; p < a.size(); p++) { Comparable tmp = a[p]; for (j = p; j > 0 && tmp < a[j – 1]; j ––) a[j] = a[j –1]; a[j] = tmp; } } Finally place a[p] in the hole found. 29 INSERTION SORT template void insertionSort (vector & a) { int j for (int p = 1; p < a.size(); p++) { Comparable tmp = a[p]; for (j = p; j > 0 && tmp < a[j – 1]; j ––) a[j] = a[j –1]; a[j] = tmp; } } Analysis: This test is executed at most p times for each value of p. N-1 (N -1)N Total Comparisons = å p = 2 = Q( 2 N ) p=1 30 INSERTION SORT template void insertionSort (vector & a) { int j for (int p = 1; p < a.size(); p++) { Comparable tmp = a[p]; for (j = p; j > 0 && tmp < a[j – 1]; j ––) a[j] = a[j –1]; a[j] = tmp; } } Analysis: If the input is already sorted (or almost sorted) the algorithm is very fast since the test will fail immediately! 31 LOWER-BOUND We can prove a simple lower bound for simple sorting algorithms that operate by interchanging adjacent elements. 32 LOWER-BOUND We can prove a simple lower bound for simple sorting algorithms that operate by interchanging adjacent elements. – An inversion in an array of numbers is any ordered pair, (i, j) having the property that i > j, but indexOf(i) < indexOf(j). 33 LOWER-BOUND We can prove a simple lower bound for simple sorting algorithms that operate by interchanging adjacent elements. – An inversion in an array of numbers is any ordered pair, (i, j) having the property that i > j, but indexOf(i) < indexOf(j). – 34, 8, 64, 51, 32, 21 has 9 inversions 34 LOWER-BOUND We can prove a simple lower bound for simple sorting algorithms that operate by interchanging adjacent elements. – An inversion in an array of numbers is any ordered pair, (i, j) having the property that i > j, but indexOf(i) < indexOf(j). – 34, 8, 64, 51, 32, 21 has 9 inversions – They are: (34,8), (34,32), (34,21) (64,51), (64,32), (64,21), (51,32), (51,21), (32,21) 35 INSERTION SORT 34 8 64 51 32 21 Initial State 0 1 2 3 4 5 After pass 1 8 34 64 51 32 21 8 moved 1 position After pass 2 8 34 64 51 32 21 64 moved 0 positions After pass 3 8 34 51 64 32 21 51 moved 1 position After pass 4 8 32 34 51 64 21 32 moved 3 positions After pass 5 8 21 32 34 51 64 21 moved 4 positions + 9 36 LOWER-BOUND Number of inversions = number of positions moved, Why? 37 LOWER-BOUND Number of inversions = number of positions moved, Why? Each adjacent swap fixes one inversion and a sorted array has no inversions. 38 LOWER-BOUND Number of inversions = number of positions moved, Why? Each adjacent swap fixes one inversion and a sorted array has no inversions. Since there are O(N) passes, total work for insertion sort is O(N+I) where I is the initial number of inversions. 39 LOWER-BOUND Since there are O(N) passes, total work for insertion sort is O(N+I) where I is the initial number of inversions. Thus, if the number of inversions is O(N) (which is nice definition for being almost sorted) then Insertion Sort takes O(N) time. 40 LOWER-BOUND What is the average number of inversions in a permutation of N numbers? 41 LOWER-BOUND What is the average number of inversions in a permutation of N numbers? – No duplicate elements. – Input is a permutation of 1.. N – All permutations are equally likely 42 LOWER-BOUND Theorem – The average number of inversions in an array of N distinct elements is N(N-1)/4 43 LOWER-BOUND Theorem – The average number of inversions in an array of N distinct elements is N(N-1)/4 Proof: – Given a list of elements L consider Lr which is the reverse of L. 44 LOWER-BOUND Theorem – The average number of inversions in an array of N distinct elements is N(N-1)/4 Proof: – Given a list of elements L consider Lr which is the reverse of L. – Consider any pair (x, y) in L with y > x. In exactly one of L or Lr, this ordered pair represents an inversion. 45 LOWER-BOUND Theorem – The average number of inversions in an array of N distinct elements is N(N-1)/4 Proof: – Given a list of elements L consider Lr which is the reverse of L. – Consider any pair (x, y) in L with y > x. In exactly one of L or Lr, this ordered pair represents an inversion. – The total number of such pairs in both L and Lr is N(N-1)/2. Why? 46 LOWER-BOUND Theorem – The average number of inversions in an array of N distinct elements is N(N-1)/4 Proof: – Consider any pair (x, y) in L with y > x. In exactly one of L or Lr, this ordered pair represents an inversion. – The total of such pairs in both L and Lr is N(N-1)/2. Why? – An average list has half this amount which is N(N-1)/4. 47 LOWER-BOUND Let M = N(N-1)/2 possible values of I 0 1 2... M 48 LOWER-BOUND Theorem – Any algorithm that sorts by exchanging adjacent elements, requires (N2) on the average. 49 LOWER-BOUND Theorem – Any algorithm that sorts by exchanging adjacent elements, requires (N2) steps on the average. Proof: – The average number of inversions is (N2) and each swap removes one inversion, so (N2) swaps are needed. 50 LOWER-BOUND The most important implication of this is that any algorithm that runs faster than O(N2) – should make comparisons and exchanges between far apart elements – it must eliminate more than just one inversion per exchange. 51 SHELLSORT One of the first algorithms to break the quadratic time barrier. 52 SHELLSORT One of the first algorithms to break the quadratic time barrier. It works by comparing elements that are distant. 53 SHELLSORT One of the first algorithms to break the quadratic time barrier. It works by comparing elements that are distant. – the distance between comparisons decreases as the algorithm progresses – Shellsort is referred to as the diminishing increment sort 54 SHELLSORT Shellsort uses a sequence of increments h1, h2,..., ht 55 SHELLSORT Shellsort uses a sequence of increments h1, h2,..., ht Any increment sequence will do as long as h1 = 1. 56 SHELLSORT Shellsort uses a sequence of increments h1, h2,..., ht Any increment sequence will do as long as h1 = 1. After a phase using some increment hk, for every i, we have a[i] a[i+hk] (where a[i+hk] exists); all elements spaced hk apart are hk-sorted. 57 SHELLSORT An important property of Shellsort is that an hk-sorted sequence that is then hk-1-sorted remains hk-sorted. 58 SHELLSORT 0 1 2 3 4 5 6 7 8 9 10 11 12 81 94 11 96 12 35 17 95 28 58 41 75 15 After h=5-sort 0 1 2 3 4 5 6 7 8 9 10 11 12 35 17 11 28 12 41 75 15 96 58 81 94 95 Sequence of increments: 1, 3, 5 59 SHELLSORT 0 1 2 3 4 5 6 7 8 9 10 11 12 81 94 11 96 12 35 17 95 28 58 41 75 15 After h=5-sort 0 1 2 3 4 5 6 7 8 9 10 11 12 35 17 11 28 12 41 75 15 96 58 81 94 95 After h=3-sort 0 1 2 3 4 5 6 7 8 9 10 11 12 28 12 11 35 15 41 58 17 94 75 81 96 95 Sequence of increments: 1, 3, 5 60 SHELLSORT 0 1 2 3 4 5 6 7 8 9 10 11 12 81 94 11 96 12 35 17 95 28 58 41 75 15 After h=5-sort 0 1 2 3 4 5 6 7 8 9 10 11 12 35 17 11 28 58 41 75 15 96 12 81 94 95 After h=3-sort 0 1 2 3 4 5 6 7 8 9 10 11 12 28 12 11 35 15 41 58 17 94 75 81 96 95 After h=1-sort 0 1 2 3 4 5 6 7 8 9 10 11 12 11 12 15 17 28 35 41 58 75 81 94 95 96 Sequence of increments: 1, 3, 5 61 SHELL’S INCREMENTS A popular but poor choice for increments: ht = N/2 and hk=hk+1/2 These are called Shell’s increments. 62 SHELLSORT template void shellsort (vector & a) { int j // Loop over increments for (int gap = a.size()/2; gap > 0; gap/=2) // Loop over elements for (int i = gap; i < a.size(); i++) { Comparable tmp = a[i]; // Loop over elements that are gap apart for (j = i; j >= gap && tmp < a[j – gap]; j –= gap) a[j] = a[j –gap]; a[j] = tmp; } } 63 SHELLSORT-ANALYSIS Theorem: – The worst-case running time of Shellsort, using Shell’s increments is (N2). Proof: – Lower-bound (i.e., (N2)) – Upper-bound (i.e., O(N2)) 64 SHELLSORT-ANALYSIS Theorem: – The worst-case running time of Shellsort, using Shell’s increments is (N2). Proof: – Lower-bound – Consider 1,9,2,10,3,11,4,12,5,13,6,14,7,15,8,16 65 SHELLSORT-ANALYSIS Theorem: – The worst-case running time of Shellsort, using Shell’s increments is (N2). Proof: – Lower-bound – Consider 1,9,2,10,3,11,4,12,5,13,6,14,7,15,8,16 After 8 sort: 1,9,2,10,3,11,4,12,5,13,6,14,7,15,8,16 66 SHELLSORT-ANALYSIS Theorem: – The worst-case running time of Shellsort, using Shell’s increments is (N2). Proof: – Lower-bound – Consider 1,9,2,10,3,11,4,12,5,13,6,14,7,15,8,16 After 8 sort: 1,9,2,10,3,11,4,12,5,13,6,14,7,15,8,16 67 SHELLSORT-ANALYSIS Theorem: – The worst-case running time of Shellsort, using Shell’s increments is (N2). Proof: – Lower-bound – Consider 1,9,2,10,3,11,4,12,5,13,6,14,7,15,8,16 After 8 sort: 1,9,2,10,3,11,4,12,5,13,6,14,7,15,8,16 After 4 sort: 1,9,2,10,3,11,4,12,5,13,6,14,7,15,8,16 68 SHELLSORT-ANALYSIS Theorem: – The worst-case running time of Shellsort, using Shell’s increments is (N2). Proof: – Lower-bound – Consider 1,9,2,10,3,11,4,12,5,13,6,14,7,15,8,16 After 8 sort: 1,9,2,10,3,11,4,12,5,13,6,14,7,15,8,16 After 4 sort: 1,9,2,10,3,11,4,12,5,13,6,14,7,15,8,16 69 SHELLSORT-ANALYSIS Theorem: – The worst-case running time of Shellsort, using Shell’s increments is (N2). Proof: – Lower-bound – Consider 1,9,2,10,3,11,4,12,5,13,6,14,7,15,8,16 After 8 sort: 1,9,2,10,3,11,4,12,5,13,6,14,7,15,8,16 After 4 sort: 1,9,2,10,3,11,4,12,5,13,6,14,7,15,8,16 After 2 sort: 1,9,2,10,3,11,4,12,5,13,6,14,7,15,8,16 After 1 sort: 1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16 70 SHELLSORT-ANALYSIS Theorem: – The worst-case running time of Shellsort, using Shell’s increments is (N2). Proof: – Lower-bound – After 2 sort: 1,9,2,10,3,11,4,12,5,13,6,14,7,15,8,16 – The ith smallest (i N/2) is in position at 2i-1 (assume 1 based array). 71 SHELLSORT-ANALYSIS Theorem: – The worst-case running time of Shellsort, using Shell’s increments is (N2). Proof: – Lower-bound – After 2 sort: 1,9,2,10,3,11,4,12,5,13,6,14,7,15,8,16 – The ith smallest (i N/2) is in position at 2i-1 (assume 1 based array). – Restoring the ith element requires i – 1 moves. 72 SHELLSORT-ANALYSIS Theorem: – The worst-case running time of Shellsort, using Shell’s increments is (N2). Proof: – Lower-bound – After 2 sort: 1,9,2,10,3,11,4,12,5,13,6,14,7,15,8,16 – The ith smallest (i N/2) is in position at 2i-1 (assume 1 based array). – Restoring the ith element requires i – 1 moves. N / 2 – Total moves for N/2 smallest elements = i − 1 i =1 73 SHELLSORT-ANALYSIS Theorem: – The worst-case running time of Shellsort, using Shell’s increments is (N2). Proof: – Lower-bound – After 2 sort: 1,9,2,10,3,11,4,12,5,13,6,14,7,15,8,16 – The ith smallest (i N/2) is in position at 2i-1 (assume 1 based array). – Restoring the ith element requires i – 1 moves. N / 2 – Total moves for N/2 smallest elements = i − 1 i =1 – = (N ) 2 74 SHELLSORT-ANALYSIS Theorem: – The worst-case running time of Shellsort, using Shell’s increments is (N2). Proof: – Upper bound A pass with increment hk is hk insertion sort with N/hk elements O(hk (N/hk)2) = O(N2/hk) Summing over all passes gives 2 N 1 O( ) = O( N ) = O( N ) t t 2 2 h h i =1 i =1 i i 75 SHELLSORT-ANALYSIS The problems with Shell’s increments is that pairs of increments are not relatively prime, so smaller increments have little effect. 1 2 3 4 5 6 7 8 9 Increments: [1, 2, 4] h = 4: Starting from 1, comparing 1, 5, 9 h = 2: Starting from 1, comparing 1, 3, 5, 7, 9 h = 1: Comparing 1, 2, 3, 4, 5, 6, 7, 8, 9 76 SHELLSORT-ANALYSIS The problems with Shell’s increments is that pairs of increments are not relatively prime, so smaller increments have little effect. Hibbard’s increments – 1,3,7,..., 2k-1 – Consecutive increments have no common factors 77 SHELLSORT-ANALYSIS The worst-case running time of Shellsort using Hibbard’s increments is (N3/2) The average-case running time of Shellsort, using Hibbard’s increment, is tought to be O(N5/4), based on simulation, but nobody has been able to prove this. Sedgwick has proposed several increment sequences that give an O(N4/3) worst-case running time (conjectured average time is O(N7/6)) 78