Data Structures and Algorithms 2 - Priority Queues
The National Higher School of Artificial Intelligence
Ahmed Guessoum
Summary
These lecture notes cover priority queues, focusing on their implementation using binary heaps, d-heaps, leftist heaps, and skew heaps. They provide motivating examples and a detailed explanation of each topic, including operations such as insertion and deletion, and analyze the performance of the different implementation strategies.
Full Transcript
Data Structures and Algorithms 2
Prof. Ahmed Guessoum
The National Higher School of AI

Chapter 6: Priority Queues

Motivating Example
Three jobs have been submitted to a printer; the jobs have sizes 100, 10, and 1 page(s) and are enqueued in this order.
– Average completion time for FIFO service: (100 + 110 + 111)/3 = 107 time units.
– Average completion time for shortest-job-first service: (1 + 11 + 111)/3 = 41 time units.
The same issue arises for jobs to be executed in a multiuser environment, where the Operating System scheduler must decide which of several processes to run:
– short jobs should finish as fast as possible (they should have precedence);
– important jobs should also have precedence.
We need an appropriate data structure for such tasks: a Priority Queue.

Priority Queue
A data structure that allows at least the following two operations:
– insert (the equivalent of enqueue); and
– deleteMin, which finds, returns, and removes the minimum element in the priority queue (the equivalent of dequeue in the Queue ADT).
Priority Queues have many different applications: greedy algorithms and A* (finding the shortest path between two points), load balancing on servers and interrupt handling in Operating Systems, triage systems in emergency departments (medical systems), etc.

Simple Implementations of PQs
1. Using a linked list:
   – insert at the front in O(1);
   – finding the minimum element takes O(N), thus deletion is O(N).
2. Keeping the list always sorted: insertions become expensive (O(N)) and deleteMins cheap (O(1)). The first solution is probably better, given that there are never more deleteMins than insertions.

Simple Implementations of PQs (cont.)
3. Using a Binary Search Tree: both insert and deleteMin take O(log N) average time.
   – Deletions of the minimum (i.e., from the left subtree) would make the right subtree heavier.
   – Since the right subtree is random, in the worst case, where the deleteMins have depleted the left subtree, the right subtree would have at most twice as many elements as it should.
   Using a search tree is overkill, as it supports many other operations that are not required (key-based search, finding the kth smallest element, checking whether two BSTs are identical, etc.).

Binary Heap
The binary heap is a very popular implementation of Priority Queues. (By default, "heap" will refer to the binary heap.)
A binary heap has a structure property and an order property.
Heap structure property: a heap is a binary tree that is completely filled, except possibly for the bottom level, which is filled from left to right (empty positions are to the right of the bottom level). If such a tree has height h, then it has between 2^h and 2^(h+1) − 1 nodes, so h is O(log N).
Heap-order property: any node should be smaller than or equal to all of its descendants.
[Figures: an example tree that is a heap, and one that is not a heap]
Because a complete binary tree is so regular, it can be represented in an array, and no links are necessary.

Array implementation
We start from position 1 in the array.
o The first element contains the root.
o The left child of the element at position j is at position 2j; the right child is at position 2j + 1.
o The parent of the element at position j is at position floor(j/2).
[Figure: a heap on the keys 1, 2, 5, 4, 3, 6 and its array representation]
Operations to traverse the tree are extremely simple and very fast. The only problem with this implementation is that an estimate of the maximum heap size is required in advance, but this is not a real problem, since we can resize the array if needed.
In fact, the array implementation of the heap looks more like the following! (This will become clearer later.)
[Figure: class interface of a Priority Queue]
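To make the array navigation concrete, here is a minimal, self-contained sketch in C++ (the textbook's language) of the 1-indexed parent/child arithmetic; the helper names and the demo in main are illustrative, not taken from the textbook:

    #include <cstdio>
    #include <vector>

    // 1-indexed navigation: position 0 of the array is left unused here
    // (it will later hold a sentinel during insertion).
    int parent(int j)     { return j / 2; }     // integer division = floor(j/2)
    int leftChild(int j)  { return 2 * j; }
    int rightChild(int j) { return 2 * j + 1; }

    int main() {
        // The example heap from the slide, stored from position 1.
        std::vector<int> heap = {0 /*unused*/, 1, 2, 5, 4, 3, 6};
        int j = 2;  // the node holding key 2
        std::printf("key %d: left child %d, right child %d, parent %d\n",
                    heap[j], heap[leftChild(j)], heap[rightChild(j)],
                    heap[parent(j)]);
        return 0;
    }

Because the children of position j sit at 2j and 2j + 1, moving between levels is a multiplication or a division by 2, which compilers implement as a bit shift.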
Inserting an element into a heap
To insert X:
– Create a hole in the next available location.
– If X can be placed in the hole without violating heap order, then do so; done.
– Otherwise, slide the element that is in the hole's parent node into the hole, thus bubbling the hole up toward the root.
– Continue this process until X can be placed in the hole.
This general strategy is known as a percolate up.
[Figure: inserting 14 into the binary heap]
[Figure: implementation of insertion into a binary heap]

Performance issues
We could have used swaps instead of percolation, but each swap takes 3 assignments: if an element is percolated up d levels, the number of assignments performed by the swaps would be 3d, whereas the percolate-up strategy uses only d + 1 assignments.
A copy of X (the item to be inserted) is placed in position 0 in order to make the loop terminate.
The worst-case time to do the insertion is O(log N). On average, the percolation terminates early:
– 2.607 comparisons are required on average to perform an insert;
– the average insert moves an element up 1.607 levels.

DeleteMin
Similar idea to insert, but now we percolate down:
– When the minimum is removed, a hole is created at the root. The heap has one element less, so the last element X in the heap must move somewhere in the heap.
– If X can be placed in the hole, then we are done.
– Else, slide the smaller of the hole's children into the hole, thus pushing the hole down one level.
– Repeat this step until X can be placed in the hole.
Thus, the goal is to place X in its correct spot along a path from the root containing minimum children. The same technique as in insertion is used to avoid swaps.
[Figure: deleting the minimum (13) from the heap]
The worst-case running time for this operation is O(log N). On average, the element that is placed at the root is percolated almost to the bottom of the heap (which is the level it came from), so the average running time is also O(log N).
[Figure: code for deleteMin / percolateDown]
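The two percolations can be sketched as follows: a minimal, self-contained C++ binary min-heap with insert (percolate up, using the position-0 sentinel described above) and deleteMin (percolate down). It follows the strategy on these slides, but the class layout and names are mine, not the textbook's exact code:

    #include <cstdio>
    #include <initializer_list>
    #include <stdexcept>
    #include <vector>

    class BinaryHeap {
    public:
        bool isEmpty() const { return currentSize == 0; }

        // Percolate up: create a hole at the next available location and
        // slide parents down until x can be placed without violating heap order.
        void insert(int x) {
            if ((int)array.size() == currentSize + 1)
                array.resize(array.size() * 2);
            int hole = ++currentSize;
            array[0] = x;                        // sentinel: terminates the loop at the root
            for (; x < array[hole / 2]; hole /= 2)
                array[hole] = array[hole / 2];   // slide the parent down into the hole
            array[hole] = x;
        }

        // Percolate down: put the last element into the root's hole and push
        // the hole down, sliding the smaller child up, until the element fits.
        int deleteMin() {
            if (isEmpty()) throw std::underflow_error("deleteMin on empty heap");
            int minItem = array[1];
            array[1] = array[currentSize--];
            percolateDown(1);
            return minItem;
        }

    private:
        std::vector<int> array{0};  // heap contents live in positions 1..currentSize
        int currentSize = 0;

        void percolateDown(int hole) {
            int tmp = array[hole];
            int child;
            for (; hole * 2 <= currentSize; hole = child) {
                child = hole * 2;
                if (child != currentSize && array[child + 1] < array[child])
                    ++child;                     // pick the smaller of the two children
                if (array[child] < tmp)
                    array[hole] = array[child];  // slide the child up; the hole moves down
                else
                    break;
            }
            array[hole] = tmp;
        }
    };

    int main() {
        BinaryHeap h;
        for (int x : {13, 21, 16, 24, 31, 19, 68, 65, 26, 32, 14})
            h.insert(x);
        while (!h.isEmpty())
            std::printf("%d ", h.deleteMin());  // prints the keys in ascending order
        std::printf("\n");
        return 0;
    }

Note that repeatedly calling deleteMin returns the elements in ascending order, which is the idea underlying heapsort.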
Other Heap Operations
A heap has very little ordering information, so there is no way to find any particular element without a linear scan through the entire heap. If it is important to know where elements are, some other data structure, such as a hash table, must be used in addition to the heap.
If we assume that the position of every element is known by some other method, then several other operations become cheap. The first three, decreaseKey, increaseKey, and remove, all run in logarithmic worst-case time.

DecreaseKey
The decreaseKey(p, Δ) operation lowers the value of the item at position p by a positive amount Δ. Since this might violate the heap order, it must be fixed by a percolate up:
o decrease the value of the element;
o interchange it with its parent if it is less than its parent;
o continue doing this until heap order is restored (at worst, when the root is reached).
Example of application: this could be useful to system administrators, who can make their programs run with highest priority.
[Figures 1-4: decreasing the node 6 by 2.9 (i.e., to 3.1) and percolating it up]

IncreaseKey
The increaseKey(p, Δ) operation increases the value of the item at position p by a positive amount Δ. This is done with a percolate down:
o increase the value of the element;
o interchange it with the smaller of its children if it is greater than either of them;
o continue doing this until heap order is restored (at worst, when a leaf is reached).
Example of application: many schedulers automatically drop the priority of a process that is consuming excessive CPU time.
[Figures 1-4: increasing the root (1) by 9, i.e., to 10, and percolating it down]

Remove & buildHeap
remove(p) removes the node at position p from the heap. This is done by first performing decreaseKey(p, ∞) and then performing deleteMin().
buildHeap: a binary heap is sometimes constructed from an initial collection of items; the constructor takes as input N items and places them into a heap. This could be done with N successive inserts. Each insert takes O(1) average and O(log N) worst-case time, so the total running time of this buildHeap would be O(N) average but O(N log N) worst-case. With care, a linear worst-case time bound can be guaranteed.

buildHeap algorithm
General algorithm:
– place the N items into the tree in any order, maintaining the structure property;
– if percolateDown(i) percolates down from node i, call percolateDown(i) on every non-leaf node, for i = N/2 down to 1. This buildHeap routine can be used by the constructor to create a heap-ordered tree.
[Figures: the buildHeap percolations step by step, and the buildHeap routine]
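A self-contained sketch of this linear-time construction, using the same 1-indexed vector layout as the earlier sketches (free functions with illustrative names; the sample keys are made up):

    #include <cstdio>
    #include <vector>

    // Percolate down from position hole within heap[1..n] (position 0 unused).
    void percolateDown(std::vector<int>& heap, int hole, int n) {
        int tmp = heap[hole];
        int child;
        for (; hole * 2 <= n; hole = child) {
            child = hole * 2;
            if (child != n && heap[child + 1] < heap[child]) ++child;
            if (heap[child] < tmp) heap[hole] = heap[child];
            else break;
        }
        heap[hole] = tmp;
    }

    // buildHeap: the N items already sit in heap[1..n] in arbitrary order,
    // so the structure property holds; percolating down every non-leaf node,
    // bottom-up, establishes the heap order in O(N) total time.
    void buildHeap(std::vector<int>& heap) {
        int n = (int)heap.size() - 1;
        for (int i = n / 2; i > 0; --i)
            percolateDown(heap, i, n);
    }

    int main() {
        std::vector<int> heap = {0, 92, 47, 21, 20, 12, 45, 63, 61,
                                 17, 55, 37, 25, 64, 83, 73};
        buildHeap(heap);
        std::printf("min = %d\n", heap[1]);  // prints 12
        return 0;
    }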
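Similarly, the decreaseKey and increaseKey operations from the preceding slides reduce to one percolation each. A standalone sketch (the percolation helpers are repeated so the fragment compiles on its own; names are illustrative):

    #include <cstdio>
    #include <vector>

    // Heap stored in heap[1..n], position 0 unused.
    static void percolateUp(std::vector<int>& heap, int p) {
        int tmp = heap[p];
        for (; p > 1 && tmp < heap[p / 2]; p /= 2)
            heap[p] = heap[p / 2];   // slide the parent down
        heap[p] = tmp;
    }

    static void percolateDown(std::vector<int>& heap, int p) {
        int n = (int)heap.size() - 1, tmp = heap[p], child;
        for (; 2 * p <= n; p = child) {
            child = 2 * p;
            if (child != n && heap[child + 1] < heap[child]) ++child;
            if (heap[child] < tmp) heap[p] = heap[child];
            else break;
        }
        heap[p] = tmp;
    }

    // decreaseKey(p, delta): lower the item at position p, then percolate up.
    void decreaseKey(std::vector<int>& heap, int p, int delta) {
        heap[p] -= delta;
        percolateUp(heap, p);
    }

    // increaseKey(p, delta): raise the item at position p, then percolate down.
    void increaseKey(std::vector<int>& heap, int p, int delta) {
        heap[p] += delta;
        percolateDown(heap, p);
    }

    int main() {
        std::vector<int> heap = {0, 1, 2, 5, 4, 3, 6};  // the slides' example heap
        increaseKey(heap, 1, 9);   // raise the root 1 to 10, as in the slides
        std::printf("new min = %d\n", heap[1]);  // prints 2
        return 0;
    }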
Applications of PQs
The Selection Problem: given a list of N elements, which can be totally unordered, and an integer k, find the kth largest element.
Two algorithms from Chapter 1:
– Algorithm 1A: read the elements into an array and sort them, returning the appropriate element. Assuming a simple sorting algorithm, the running time is O(N^2).
– Algorithm 1B: read k elements into an array and sort them; the smallest of these is in the kth position. Process the remaining elements one by one: as an element arrives, it is compared with the kth element in the array; if it is larger, then the kth element is removed, and the new element is placed in the correct place among the remaining k − 1 elements. When the algorithm ends, the element in the kth position is the answer. The running time is O(N·k). Prove it!

The Selection Problem (cont.)
If k = ceiling(N/2), then both algorithms 1A and 1B are O(N^2). The next two algorithms run in O(N log N) even in the extreme case of k = ceiling(N/2), which is a great improvement.

Algorithm 6A for the Selection Problem
Suppose we are interested in finding the kth smallest element. The algorithm is simple:
– Read the N elements into an array.
– Apply the buildHeap algorithm to this array.
– Perform k deleteMin operations; the last element extracted from the heap is the answer.
Clearly, by changing the heap-order property, the original problem of finding the kth largest element can be solved.

Complexity of Algorithm 6A
The worst-case time is O(N) to construct the heap, if buildHeap is used, and O(log N) for each deleteMin. Since there are k deleteMin operations, the total running time is O(N + k log N).
If k = O(N / log N), then the running time is dominated by the buildHeap operation and is O(N). For larger values of k, the running time is O(k log N). If k = ceiling(N/2), then the running time is Θ(N log N).

Algorithm 6B for the Selection Problem
Use the same idea as in algorithm 1B, but with a heap. At any point in time, maintain a set S of the k largest elements. After the first k elements are read, when a new element is read, it is compared with the kth largest element, denoted by Sk. (Notice that Sk is the smallest element in S.) If the new element is larger, then it replaces Sk in S. S will then have a new smallest element, which may or may not be the newly added element. At the end of the input, we find the smallest element in S and return it as the answer.
6B differs from 1B in that a heap is used to implement S. The first k elements are placed into the heap in total time O(k) with a call to buildHeap. The time to process each of the remaining elements is
– O(1), to test whether the element goes into S, plus
– O(log k), to delete Sk and insert the new element when this is necessary.
Thus, the total time is O(k + (N − k) log k) = O(N log k). This algorithm also gives a bound of Θ(N log N) for finding the median. (A code sketch of this algorithm appears after the skew-heap discussion below.)

d-Heaps
Binary heaps are so simple that they are almost always used when priority queues are needed. A d-heap is a simple generalization of a binary heap in which all nodes have d children (thus, a binary heap is a 2-heap).
[Figure: a d-heap with d = 3]
A d-heap is much shallower than a binary heap, improving the running time of inserts to O(log_d N). However, for large d, the deleteMin operation is more expensive: even though the tree is shallower, the minimum of d children must be found, which takes d − 1 comparisons using a standard algorithm, raising the time for this operation to O(d log_d N). If d is a constant, both running times are, of course, O(log N).
Arrays can still be used for the implementation, BUT unless d is a power of 2, the running time seriously increases, because we can no longer implement division by a bit shift.

d-Heaps: pros and cons
d-heaps are interesting in theory, because there are many algorithms where the number of insertions is much greater than the number of deleteMins (and thus a theoretical speedup is possible). When a priority queue is too large to fit entirely in main memory, a d-heap can be advantageous in much the same way as B-trees are. There is evidence suggesting that 4-heaps may outperform binary heaps in practice.
Heaps cannot perform finds. Combining two heaps into one (merge) is a hard operation; there exist different ways to implement heaps so that merge runs in O(log N) time. Next topic!

Leftist Heaps
A leftist heap is a binary tree with the same heap-order property. The only difference between leftist heaps and binary heaps is that leftist heaps are not perfectly balanced; actually, they attempt to be very unbalanced.
Definition: the null path length, npl(X), of any node X is the length of the shortest path from X to a node without two children. So the npl of a node with zero or one child is 0, while npl(nullptr) = −1.
npl(X) = min(npl(leftChild(X)), npl(rightChild(X))) + 1.
(This also applies to nodes with 0 or 1 child, since npl(nullptr) = −1.)
Leftist heap property: for every node X in the heap, the null path length of the left child of X is at least as large as that of its right child.
[Figures: a tree that is not a leftist heap, and one that is]
The leftist heap property goes out of its way to ensure that the tree is unbalanced; it biases the tree to get deep toward the left. A tree consisting of a long path of left nodes is possible (and actually preferable, to facilitate merging); hence the name leftist heap.
The npl of a node can be seen as the distance from that node to the nearest empty position in the subtree rooted at that node. Leftist heaps tend to have deep left paths, so the right path ought to be short. The right path (the path from the root to the rightmost leaf) in a leftist heap is as short as any path in the heap.
Theorem 6.2: a leftist tree with r nodes on the right path must have at least 2^r − 1 nodes.

Why the leftist property?
Because it guarantees that the right path is really short compared to the number of nodes in the tree: a leftist tree of N nodes has a right path of at most log2(N + 1) nodes. So we perform all the work (of merging) on the right path.
Note: performing inserts and merges on the right path could destroy the leftist heap property, but it turns out to be extremely easy to restore.

Leftist Heap Operations
The fundamental operation on leftist heaps is merge. (Insertion can be seen as a special case: a merge of a one-node heap with a larger heap.)
A node entry contains: (Data, leftPtr, rightPtr, npl).
Merge(H1, H2) of two leftist heaps:
– If either one is empty, return the other leftist heap.
– Otherwise, compare their roots: first, recursively merge the heap that has the larger root with the right subheap of the heap that has the smaller root.
[Figure: recursively merging H2 with the subheap of H1 rooted at 8, swapping subheaps where needed to restore the leftist heap property]
– Attach the leftist heap of the previous figure as H1's right child. The resulting right subheap is leftist by recursive construction; the left subheap is not changed, so it is leftist too.
– The leftist property may now be violated at the root; restore it by swapping the left and right subheaps, and update the npl of the root (the npl of the new right child, plus one).
[Figures: the result of swapping the children of H1's root; leftist heap type declarations and routines for merging leftist heaps]

Insert / DeleteMin
To insert an item into a heap h1:
– make the item a one-node heap h2, and
– perform a merge between h2 and h1.
DeleteMin:
– destroy the root, creating two heaps;
– merge these two heaps.
Complexities: merging two heaps, inserting an element, and deleting the minimum are all O(log N).
[Figure: implementation of insert and deleteMin]

Skew Heaps
A skew heap is a self-adjusting version of a leftist heap that is very simple to implement. Skew heaps are binary trees with heap order, but there is no structural constraint on these trees: no information is maintained about the null path length of any node, and the right path of a skew heap can be arbitrarily long at any time. As a result, the worst-case running time of a single operation is O(N). However, for any M consecutive operations, the total worst-case running time is O(M log N); that is, skew heaps have O(log N) amortized cost per operation.
The fundamental operation on skew heaps is merging. We perform exactly the same operations as before, with one exception: the swap is unconditional; we always do it, with the one exception that the largest of all the nodes on the right paths does not have its children swapped.
[Figures: recursively merging H2 with the subheap rooted at 8; the result of merging the two skew heaps]
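A compact, self-contained C++ sketch of the recursive merge in both flavors: the conditional swap of the leftist heap and the unconditional swap of the skew heap. It is modeled on the description in these slides, not copied from the textbook; the node layout and names are illustrative:

    #include <algorithm>  // std::swap
    #include <cstdio>
    #include <initializer_list>

    struct Node {
        int element;
        Node* left;
        Node* right;
        int npl;  // null path length; maintained by leftist heaps only
        explicit Node(int e) : element(e), left(nullptr), right(nullptr), npl(0) {}
    };

    int npl(Node* t) { return t ? t->npl : -1; }

    // Leftist merge: recursively merge the heap with the larger root into the
    // right subheap of the one with the smaller root; swap children only when
    // the leftist property is violated, then update the root's npl.
    Node* leftistMerge(Node* h1, Node* h2) {
        if (h1 == nullptr) return h2;
        if (h2 == nullptr) return h1;
        if (h2->element < h1->element) std::swap(h1, h2);  // h1 has the smaller root
        h1->right = leftistMerge(h1->right, h2);
        if (npl(h1->left) < npl(h1->right))
            std::swap(h1->left, h1->right);                // restore the leftist property
        h1->npl = npl(h1->right) + 1;
        return h1;
    }

    // Skew merge: identical, except the swap is unconditional and no npl is kept.
    // (The last node on the merged right path keeps its children, because the
    // recursion returns before any swap once one heap is exhausted.)
    Node* skewMerge(Node* h1, Node* h2) {
        if (h1 == nullptr) return h2;
        if (h2 == nullptr) return h1;
        if (h2->element < h1->element) std::swap(h1, h2);
        h1->right = skewMerge(h1->right, h2);
        std::swap(h1->left, h1->right);
        return h1;
    }

    int main() {
        Node* h = nullptr;
        for (int x : {3, 10, 8, 21, 14, 17, 23, 26})  // insert = merge with a one-node heap
            h = leftistMerge(h, new Node(x));
        std::printf("min = %d\n", h->element);  // prints 3 (nodes deliberately leaked)
        return 0;
    }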
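Finally, the promised sketch of Algorithm 6B for the selection problem, using the standard library's priority_queue configured as a min-heap holding the k largest elements seen so far. The input keys are made up, and for simplicity the first k elements are pushed one by one rather than built with buildHeap:

    #include <cstdio>
    #include <functional>
    #include <queue>
    #include <vector>

    // kth largest of input (k >= 1): keep a min-heap S of the k largest
    // elements seen so far; its minimum is the current kth largest, Sk.
    int kthLargest(const std::vector<int>& input, int k) {
        std::priority_queue<int, std::vector<int>, std::greater<int>> S;
        for (int x : input) {
            if ((int)S.size() < k)
                S.push(x);                 // the first k elements fill S
            else if (x > S.top()) {        // O(1) test against Sk = min of S
                S.pop();                   // O(log k): delete Sk...
                S.push(x);                 // ...and insert the new element
            }
        }
        return S.top();                    // Sk: the kth largest overall
    }

    int main() {
        std::vector<int> input = {24, 65, 26, 19, 31, 32, 16, 13, 68};
        std::printf("3rd largest = %d\n", kthLargest(input, 3));  // prints 32
        return 0;
    }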
Final notes
Because a right path could be long, a recursive implementation could fail for lack of stack space, even though performance would otherwise be acceptable.
Advantages of skew heaps: no extra space is required to maintain path lengths, and no tests are required to determine when to swap children.

Slides based on the textbook: Mark Allen Weiss (2014), Data Structures and Algorithm Analysis in C++, 4th edition, Pearson.
Acknowledgement: these course PowerPoints make substantial (non-exclusive) use of the PPT chapters prepared by Prof. Saswati Sarkar from the University of Pennsylvania, USA, themselves developed on the basis of the course textbook. Other references, if any, will be mentioned wherever applicable.