Summary

This document describes several sorting algorithms: selection sort, insertion sort, merge sort, and quicksort. It provides Python code examples for each algorithm, explains how they operate, and discusses the properties of stability and memory usage (in-place vs. out-of-place). The document is suitable for undergraduate-level computer science students learning about algorithms and data structures.

Full Transcript

# Sorting

## 6.2 Selectionsort

Selectionsort (or Selectsort) is a very simple sorting algorithm that follows this idea: take all the cards you want to sort in your hand, find the smallest card among them, put it on the table, and repeat with the remaining cards. Technically, given a list of elements, we subdivide the list into a sorted part (front) and the remaining unsorted part. Then we repeat the following step until the list is sorted: find the smallest element in the unsorted part and swap it with the first element of the unsorted part. In the following figure you can see one step of selectsort:

*(Figure: one selectsort step, with the minimum element marked.)*

The following function implements this step: min() returns the index of the smallest element in a given range of a list.

```python
# min(List[U], int): int
# U is an arbitrary type
# Precondition: 0 <= i < len(xs), the elements of xs are from a totally ordered universe.
# Effect: None
# Result: The index of the smallest element in xs[i:] is returned.
'''Tests: Your exercise ;-)'''
def min(xs, i):  # note: this shadows Python's built-in min()
    n = len(xs)
    m = i
    for j in range(i + 1, n):
        if xs[j] < xs[m]:  # if we have a smaller element
            m = j          # update the index
    return m
```

Since we can find the minimum in a given range, we can now present the implementation of selectsort:

```python
# selection_sort(List[U]): NoneType
# U is an arbitrary type
# Precondition: The elements of xs are from a totally ordered universe.
# Effect: The elements of xs are sorted in increasing order.
# Result: None
'''Tests: Your exercise ;-)'''
def selection_sort(xs):
    n = len(xs)
    for i in range(n):
        # find the smallest element in xs[i:] and swap it with
        # the element at position i
        mini = min(xs, i)
        xs[mini], xs[i] = xs[i], xs[mini]
```

## 6.3 Insertionsort

Insertionsort (or Insertsort) is a simple algorithm that is very similar to Selectsort. This time we leave the cards we want to sort on the table and pick them up one by one.
Each time we pick up a new card, we find the right position in the list of already sorted cards and put it there. Technically, we again subdivide the list into a sorted part (front) and an unsorted part. Next, assume that the elements at positions 0 to i-1 are already sorted. We pick the element at position i and swap it to the left until we find its position. In the following figure you can see two steps of this insertion process.

The following function describes one insertion step and is the crucial building block for the insertsort algorithm:

```python
# insert(List[U], int): NoneType
# U is an arbitrary type
# Precondition: 0 <= i < len(xs), the elements at positions 0...i-1 are sorted
# Effect: The elements at positions 0...i are sorted
# Result: None
'''Tests: Your exercise ;-)'''
def insert(xs, i):
    key = xs[i]
    j = i
    while j > 0 and xs[j-1] > key:
        xs[j] = xs[j-1]
        j -= 1
    xs[j] = key
```

Finally, we can implement insertionsort as follows:

```python
# insertion_sort(List[U]): NoneType
# U is an arbitrary type
# Precondition: The elements of xs are from a totally ordered universe.
# Effect: The elements of xs are sorted in increasing order.
# Result: None
'''Tests: Your exercise ;-)'''
def insertion_sort(xs):
    n = len(xs)
    # insert the elements step by step into the sorted front part
    for i in range(1, n):
        insert(xs, i)
```

## 6.4 Mergesort

Mergesort is a recursive algorithm that works by the Divide-and-Conquer paradigm. More specifically, we can say that mergesort uses a stupid divide and an intelligent conquer strategy. We can summarize the algorithm as follows:

* Divide: First, subdivide the sequence into two sequences of equal size. (If the size is odd, one sequence has one element more.)
* Sort the two sequences recursively and independently.
* Conquer: Merge the two sorted sequences into one completely sorted list.
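Before developing the merge step, the insertion routines above can be sanity-checked. The chapter leaves the tests as an exercise; here is a minimal sketch (the functions are repeated so the snippet runs on its own):

```python
# Repeated from the chapter so the sketch is self-contained.
def insert(xs, i):
    key = xs[i]
    j = i
    while j > 0 and xs[j-1] > key:  # shift larger elements to the right
        xs[j] = xs[j-1]
        j -= 1
    xs[j] = key  # place the picked element at its position

def insertion_sort(xs):
    for i in range(1, len(xs)):
        insert(xs, i)

xs = [5, 2, 4, 6, 1, 3]
insertion_sort(xs)
print(xs)  # [1, 2, 3, 4, 5, 6]
```

Edge cases worth checking by hand: the empty list and a one-element list are left unchanged, since the loop body never runs.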
The overall idea is depicted in the following figure. Let us first focus on the last step: how can we merge two already sorted lists into one completely sorted list? Using another metaphor, assume you have two piles of books, both already sorted by title, and you want to sort the two piles into one bookshelf. How would you do this? For this, let us consider the following implementation:

```python
# merge(List[U], List[U], List[U]): NoneType
# U is an arbitrary type
# Precondition: len(left) + len(right) == len(xs), left and right are sorted increasingly.
# Effect: The elements of left and right are in xs, sorted increasingly.
# Result: None
'''Tests: Your exercise ;-)'''
def merge(left, right, xs):
    # l is the index of the first element of left that has not been copied to xs
    # r is the index of the first element of right that has not been copied to xs
    l = 0
    r = 0
    while l + r < len(xs):
        if l == len(left):           # no more elements in left
            xs[l+r] = right[r]
            r += 1
        elif r == len(right):        # no more elements in right
            xs[l+r] = left[l]
            l += 1
        elif left[l] <= right[r]:    # left has the smaller element
            xs[l+r] = left[l]
            l += 1
        else:                        # left[l] > right[r]: right has the smaller element
            xs[l+r] = right[r]
            r += 1
```

In this process we simultaneously scan two increasingly sorted lists from left to right and always copy the smaller of the two current elements to the list xs. Once we have this function, we can easily implement the mergesort algorithm as follows:

```python
# mergesort(List[U]): NoneType
# U is an arbitrary type
# Precondition: The elements of xs are from a totally ordered universe.
# Effect: The elements of xs are sorted in increasing order.
# Result: None
'''Tests: Your exercise ;-)'''
def mergesort(xs):
    n = len(xs)
    if n <= 1:  # recursion anchor
        return
    # take a copy of the left and right half of xs
    # (This corresponds to the lower part of the figure from above.)
    left = xs[:n//2]
    right = xs[n//2:]
    # sort the two copies recursively
    mergesort(left)
    mergesort(right)
    # ... and merge them back into the original list xs
    # (This corresponds to the upper part of the figure from above.)
    merge(left, right, xs)
```

Last but not least, we have to briefly think about the two question marks in the figure from above. Here, the mergesort algorithm recursively sorts the two smaller lists. In the end, the figure can be completed as follows: the first three steps correspond to the splitting part, where the lists are split until they have size one. After that, the merge process brings the lists back together.

## 6.5 Quicksort

Quicksort is another recursive algorithm using the Divide-and-Conquer paradigm. More precisely, we now use an intelligent divide process and a stupid conquer process. The general idea can be described as follows:

- Divide: Choose an element of the list (which we call the pivot element) and subdivide the list into the elements smaller than and the elements larger than the pivot element.
- Sort the two lists recursively.
- Conquer: Concatenate the two sorted lists and the pivot element in the correct order.

Thus, we have to think about the crucial part of quicksort: how can we subdivide the elements into a left and a right part? Mergesort used two copies for the subdivision. If the list is large, this means we need a lot of extra memory. For quicksort we can find a solution that does not use any copies of the list.
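The merge-based code above can likewise be sanity-checked before moving on. A minimal sketch (the functions are repeated from the chapter so the snippet runs on its own; note that the recursion anchor must fire for lists of length one as well):

```python
# Repeated from the chapter so the sketch is self-contained.
def merge(left, right, xs):
    l = r = 0
    while l + r < len(xs):
        if l == len(left):           # left exhausted
            xs[l + r] = right[r]; r += 1
        elif r == len(right):        # right exhausted
            xs[l + r] = left[l]; l += 1
        elif left[l] <= right[r]:    # left has the smaller element
            xs[l + r] = left[l]; l += 1
        else:                        # right has the smaller element
            xs[l + r] = right[r]; r += 1

def mergesort(xs):
    n = len(xs)
    if n <= 1:  # recursion anchor: 0 or 1 elements are already sorted
        return
    left, right = xs[:n//2], xs[n//2:]
    mergesort(left)
    mergesort(right)
    merge(left, right, xs)

xs = [0] * 6
merge([1, 4, 9], [2, 3, 7], xs)
print(xs)  # [1, 2, 3, 4, 7, 9]

ys = [5, 2, 8, 1, 9, 3]
mergesort(ys)
print(ys)  # [1, 2, 3, 5, 8, 9]
```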
Consider the following implementation:

```python
# partition(List[U], int, int): int
# U is an arbitrary type
# Precondition: 0 <= start < end <= len(xs),
#               the elements of xs are from a totally ordered universe
# Effect: If l is the return value, then all elements are still in xs, and
#         for start <= i < l we have xs[i] < xs[l] and
#         for l < i < end we have xs[l] <= xs[i]
# Result: The index of the pivot element is returned
'''Tests: Your exercise ;-)'''
def partition(xs, start, end):
    pivot = xs[start]
    l = start
    for i in range(start+1, end):
        if xs[i] < pivot:
            l += 1
            xs[l], xs[i] = xs[i], xs[l]
    xs[start], xs[l] = xs[l], xs[start]
    return l
```

This function can now be used for the quicksort implementation:

```python
# quicksort(List[U]): NoneType
# U is an arbitrary type
# Precondition: The elements of xs are from a totally ordered universe.
# Effect: The elements of xs are sorted in increasing order.
# Result: None
'''Tests: Your exercise ;-)'''
def quicksort(xs):
    # This help function sorts the elements of xs between
    # start (inclusive) and end (exclusive).
    def quicksort_help(start, end):
        if end - start <= 1:  # recursion anchor
            return
        # The splitting part.
        # This corresponds to the lower part of the figure above.
        split = partition(xs, start, end)
        # split is the index of the pivot element.
        # Sort the two remaining parts recursively.
        quicksort_help(start, split)
        quicksort_help(split+1, end)
    quicksort_help(0, len(xs))
```

Finally, we can also complete the figure of how the quicksort algorithm works by replacing the question marks with the correct process. When comparing the two pictures of quicksort and mergesort, we can see that quicksort is not as symmetric as mergesort. The reason is that in the first step we have chosen a pivot element that is rather small compared to the other elements. This results in a very small and a very large subsequence and therefore in the mentioned asymmetry. Thus, quicksort is not always as quick as its name suggests.
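A single partition step can be traced on a concrete list, which also makes the pivot invariant visible. A minimal sketch (partition and quicksort repeated from above so the snippet runs on its own):

```python
# Repeated from the chapter so the sketch is self-contained.
def partition(xs, start, end):
    pivot = xs[start]
    l = start
    for i in range(start + 1, end):
        if xs[i] < pivot:
            l += 1
            xs[l], xs[i] = xs[i], xs[l]
    xs[start], xs[l] = xs[l], xs[start]
    return l

def quicksort(xs):
    def quicksort_help(start, end):
        if end - start <= 1:
            return
        split = partition(xs, start, end)
        quicksort_help(start, split)
        quicksort_help(split + 1, end)
    quicksort_help(0, len(xs))

xs = [4, 7, 2, 9, 1]
l = partition(xs, 0, len(xs))
# pivot 4 ends up at index 2; smaller elements left of it, larger ones right
print(l, xs)  # 2 [1, 2, 4, 9, 7]

ys = [3, 1, 4, 1, 5, 9, 2, 6]
quicksort(ys)
print(ys)  # [1, 1, 2, 3, 4, 5, 6, 9]
```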
However, we will discuss these issues in a later chapter.

## 6.6 Stability

In this section we want to discuss the stability property of sorting algorithms. A sorting algorithm is stable if elements with the same value are not switched in their order.

- Selectionsort: The selectsort algorithm is not stable, since we swap the minimum to the front, which might change the order of equal elements. Here is an example (subscripts distinguish equal elements): `selectsort([1a, 1b, 0])` yields `[0, 1b, 1a]`.
- Insertionsort: The insertionsort algorithm is stable, because when moving elements to the left, we only shift elements that are strictly larger than the key (and not equal), i.e., `xs[j-1] > key`.
- Mergesort: The mergesort algorithm is stable, because during the merge process we always pick the element from the left half first whenever the elements are equal, i.e., whenever `left[l] <= right[r]`.
- Quicksort: The quicksort algorithm is not stable, because we swap the pivot with some element, which might change the order of equal elements.

## 6.7 Memory usage

Sorting algorithms, and algorithms in general, may use extra memory for their process. Of course, it must be a goal to use as little space as possible, since our machines have limited space. For sorting algorithms we only distinguish between in-place and out-of-place. A sorting algorithm is in-place if it uses only a constant amount of extra memory; that means the number of used memory cells does not depend on the number of elements in the input list. Examples of such memory cells are single numbers, for example indices used for loops or temporary variables that accumulate some value. The algorithms insertsort and selectsort are examples of in-place sorting algorithms. Insertsort uses the two integer variables n and i, and its help function uses the variables key and j, which each store only a single value. Similar arguments hold for selectsort, where we have the variables n, m, j, i and mini.
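Returning briefly to stability: the claims of Section 6.6 can be observed concretely by sorting elements that compare equal but remain distinguishable. A minimal sketch (the Item class is an illustrative addition, not part of the chapter; the two simple sorts are repeated so the snippet runs on its own):

```python
from functools import total_ordering

@total_ordering
class Item:
    """Compares by key only, so items with equal keys are 'equal'
    for the sorts but can still be told apart by their tag."""
    def __init__(self, key, tag):
        self.key, self.tag = key, tag
    def __eq__(self, other):
        return self.key == other.key
    def __lt__(self, other):
        return self.key < other.key

def selection_sort(xs):  # repeated from the chapter
    for i in range(len(xs)):
        m = i
        for j in range(i + 1, len(xs)):
            if xs[j] < xs[m]:
                m = j
        xs[m], xs[i] = xs[i], xs[m]

def insertion_sort(xs):  # repeated from the chapter
    for i in range(1, len(xs)):
        key = xs[i]
        j = i
        while j > 0 and xs[j-1] > key:
            xs[j] = xs[j-1]
            j -= 1
        xs[j] = key

data = [Item(1, 'a'), Item(1, 'b'), Item(0, 'c')]

sel = list(data)
selection_sort(sel)
print([x.tag for x in sel])  # ['c', 'b', 'a'] -- the two equal keys swapped: not stable

ins = list(data)
insertion_sort(ins)
print([x.tag for x in ins])  # ['c', 'a', 'b'] -- order of equal keys preserved: stable
```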
In contrast to that, a sorting algorithm is out-of-place if the number of used memory cells depends on the number of elements of the input list.

### 6.7.1 The hidden space of recursion

When looking at the quicksort algorithm, one might think that it is in-place, because no function uses operations in which a lot of space is needed. However, quicksort is not in-place (i.e., it is out-of-place), because it uses recursion. To understand the correspondence between recursion and the higher memory usage, we have to go back to the von Neumann architecture and take a closer look at the main memory while a program runs.

**Memory image when calling a program.** When we start a program, memory is reserved for this program. The following does not depend on the operating system or the programming language. We can subdivide the memory of the program into four different parts:

* The stack is used for local variables. This part of the memory can be accessed quite fast, but it is limited in size.
* The heap (or free store) is used for dynamic allocation of new objects or data. The heap has much more space than the stack, but access to the objects is slower. Objects are usually located in the heap; the local variable on the stack is simply a kind of signpost to the location in the heap.
* The data segment contains all global variables.
* The program segment contains the machine code of the program, i.e., the algorithms themselves are located here.

**The recursion and the stack.** With recursive functions, the local variables of the calling function are temporarily stored on the stack. Once the function call ends, the memory is freed and can be reused. However, each recursive call needs some memory, and therefore the memory usage is proportional to the recursion depth. The quicksort algorithm has, in the worst case, n recursive calls in a row, i.e., its recursion depth is n. This happens, for example, when the chosen pivot element is always the smallest. Thus, quicksort is out-of-place.
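The worst-case recursion depth of n can be checked empirically. A sketch with an instrumented quicksort (the depth counter is an illustrative addition; partition is repeated from the chapter):

```python
def quicksort_depth(xs):
    """Sort xs in place with first-element-pivot quicksort;
    return the maximum recursion depth reached."""
    max_depth = 0

    def partition(start, end):  # repeated from the chapter
        pivot = xs[start]
        l = start
        for i in range(start + 1, end):
            if xs[i] < pivot:
                l += 1
                xs[l], xs[i] = xs[i], xs[l]
        xs[start], xs[l] = xs[l], xs[start]
        return l

    def helper(start, end, depth):
        nonlocal max_depth
        max_depth = max(max_depth, depth)
        if end - start <= 1:  # recursion anchor
            return
        split = partition(start, end)
        helper(start, split, depth + 1)
        helper(split + 1, end, depth + 1)

    helper(0, len(xs), 1)
    return max_depth

# On an already sorted list the first element is always the smallest,
# so each call shrinks the range by only one element: depth n.
xs = list(range(30))
print(quicksort_depth(xs))  # 30
```

On a typical shuffled input the measured depth is far smaller (around log n on average), which is why this cost is easy to overlook.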
