Document Details

Uploaded by ObtainableHyperbolic

Freie Universität Berlin

Tags

priority queue, data structures, scala, algorithms

Summary

This document details the implementation of a priority queue data structure in Scala. It covers the abstract data type (ADT) definition, different implementation strategies (using linked nodes and arrays), sorting algorithms, running times, etc.

Full Transcript

# 14 Priority Queue

## 14.1 ADT PrioQueue

- In this unit we consider an abstract data type called Priority Queue (short: PrioQueue)
- The PrioQueue works quite similarly to the simple queue
- The difference is that each element has a priority
- **Priority** is a value from a totally ordered universe $K$. The smaller this value, the more important the element
- The PrioQueue has an inserting method and a removing method
- The removing method always removes the smallest element from the queue

### Trait Definition:

```scala
trait MyPrioQueue[K: Ordering]:
  // In a PrioQueue we store elements from a totally
  // ordered universe.

  // Precondition: None
  // Effect: The element key is added to the priority queue.
  // Result: None
  def insert(key: K): Unit

  // Precondition: The queue is non-empty.
  // Effect: A smallest element is removed from the priority queue.
  // Result: A smallest element of the priority queue is returned.
  def extractMin(): K

  // Precondition: None
  // Effect: None
  // Result: true is returned if and only if
  // the priority queue has no elements.
  def isEmpty: Boolean
```

- The elements have to come from a totally ordered universe
- The type class `Ordering` ensures this
- The variant `PairPrioQueue` stores key-value pairs

### Signature change to:

```scala
def insert(key: K, value: V): Unit
def extractMin(): V
```

- `extractMin()` removes a value *v* whose key is minimal among all keys
- We will consider the simpler case where the keys are the values

## 14.1.1 Sorting with priority queues

**Algorithm:**

```scala
def prioQueueSort[K: Ordering](array: Array[K]): Unit =
  val n = array.length
  val pq: MyPrioQueue[K] = PrioQueueImplementation[K]()
  for i <- 0 to n - 1 do pq.insert(array(i))
  for i <- 0 to n - 1 do array(i) = pq.extractMin()
```

- The algorithm takes an array as input
- It creates an empty priority queue (using an arbitrary implementation)
- It inserts all elements of the input array into the priority queue
- It removes the elements step by step from the priority queue and writes them back into the array in increasing order
- This yields a new general sorting algorithm

## 14.1.2 LinkedNodes Implementation

- Implement the priority queue using a linear ordering of linked nodes
- Two different variants:

### Variant 1

- Elements are stored in increasing order
- `extractMin()` is easy and costs $O(1)$ time
- `insert()` costs $O(n)$ in the worst case

### Variant 2

- Elements are stored in arbitrary order
- `insert()` is easy and costs $O(1)$ time
- `extractMin()` costs $O(n)$ in the worst case

### The dummy-node technique

- Used in the variant where `extractMin()` (removing the smallest element) is easy and `insert()` does the work
- Instead of `head` and `tail` there is an `anchor` that stores a reference to the dummy node
- The dummy node stores no element; it only serves to indicate whether the list is empty
- The dummy node is never removed
- This makes the implementation simpler

## 14.2 Array Implementation

- The linked-nodes implementation has the disadvantage that either `insert()` or `extractMin()` is expensive
- Its caching behaviour is also bad

### Array implementations:

- are more complex
- have better running times
- have better caching behaviour

To implement the priority queue with an array, we need the concept of **binary trees**.

#### Node Definition:

```scala
private class Node(parent: Node, elem: A, lchild: Node, rchild: Node)
```

## 14.2.1 Binary Trees

- Non-linear structure that consists of nodes
- Nodes can store elements
- Each node has a parent node and at most two child nodes: *left* and *right*
- A node whose parent is `null` is called the **root**
- A node that has no children is called a **leaf**
- A node that has at least one child is called an **inner node**

## 14.2.2 Binary Heaps

- A binary min-heap $T$ is a binary tree whose elements come from a totally ordered universe
- The elements satisfy the following two properties:
  - **ordering property**: the value of each node is smaller than or equal to the values of its child nodes
  - **completeness property**: let $h$ be the height of $T$; the levels $0, \dots, h-1$ are completely filled and the leaves in level $h$ are left-aligned
- **Lemma 3**: The minimum is stored in the root
- **Lemma 4**: Let $n$ be the size and $h$ the height of a min-heap. Then $h \in O(\log n)$

### Array representation

- A binary heap can be represented as an array
- position $0$: root
- positions $1$-$2$: children of the root
- positions $3$-$6$: children of the children of the root
- ...
- positions $2^i - 1$ to $2^{i+1} - 2$: level $i$, for $i = 0, \dots, h-1$
- Level $h$ is left-aligned, so its elements are stored from position $2^h - 1$ onwards

#### Formulas for calculating positions:

- Position $i$: element in the heap
- `parent(i)`: position of the parent of $i$ in the array
- `lchild(i)`: position of the left child of $i$ in the array
- `rchild(i)`: position of the right child of $i$ in the array

```scala
private def parent(i: Int): Int = (i - 1) / 2
private def lchild(i: Int): Int = 2 * i + 1
private def rchild(i: Int): Int = 2 * i + 2
```

- Although binary heaps can also be implemented using linked nodes, the array representation is used

## 14.2.3 Header

```scala
class BinaryHeap[K: Ordering: ClassTag](capacity: Int) extends MyPrioQueue[K]:
  private val ord = summon[Ordering[K]]
  import ord.mkOrderingOps

  private val array: Array[K] =
    new Array[K](if capacity < 2 then 2 else capacity)
  private var last: Int = -1

  private def parent(i: Int): Int = (i - 1) / 2
  private def lchild(i: Int): Int = 2 * i + 1
  private def rchild(i: Int): Int = 2 * i + 2

  private def swap(i: Int, j: Int): Unit =
    val temp = array(i)
    array(i) = array(j)
    array(j) = temp

  def isEmpty: Boolean = last == -1
```

- `last`: index of the right-most element in level $h$
- `isEmpty`: true if `last == -1`
- `swap(i, j)`: swaps the elements at positions $i$ and $j$ in the array

## 14.2.4 Insertion

- Assumption: the array is not full yet
- Insert the element to the right of the last position
- If level $h$ is completely full, the new element becomes the left-most element of level $h+1$

```scala
def insert(key: K): Unit =
  if last == array.size - 1 then
    throw Exception("The heap is full.")
  last = last + 1
  array(last) = key
  bubbleUp(last)
```

- The completeness property is still satisfied
- The ordering property might be violated
- The new element might even belong in the root
- `bubbleUp()` moves the new element upwards until the ordering property is restored (the parent is smaller or equal)

```scala
private def bubbleUp(i: Int): Unit =
  if array(parent(i)) > array(i) then
    swap(i, parent(i))
    bubbleUp(parent(i))
```

- If the ordering property is violated, we swap the element at position $i$ with its parent and recursively bubble the element up
- If $i == 0$ (the root), the procedure terminates as well (`parent(0) == 0`, so the comparison fails)

### Running time

- Worst case: the element has to be moved from a leaf to the root
- The cost is $O(h)$, where $h$ is the height of the heap
- By Lemma 4, $h \in O(\log n)$
- In conclusion, the running time of insertion is $O(\log n)$

## 14.2.5 Deletion

- Store `array(0)` temporarily
- Remove it from the array
- Move the last element to the root

```scala
def extractMin(): K =
  if isEmpty then
    throw Exception("The heap is empty.")
  val result = array(0)
  array(0) = null.asInstanceOf[K]
  swap(0, last)
  last = last - 1
  bubbleDown(0)
  result
```

- The completeness property is satisfied
- The ordering property might be violated
- A large element might now sit at the root
- `bubbleDown()` moves the large element downwards as long as at least one child is smaller
- This is done by checking both children
- This is more complex than `bubbleUp()`, which only checks the parent

```scala
private def bubbleDown(i: Int): Unit =
  if (lchild(i) <= last && array(lchild(i)) < array(i)) ||
     (rchild(i) <= last && array(rchild(i)) < array(i))
  then
    val child =
      if rchild(i) <= last && array(rchild(i)) < array(lchild(i)) then rchild(i)
      else lchild(i)
    swap(i, child)
    bubbleDown(child)
```

- If the element has no children, the method terminates
- Otherwise it checks whether at least one child is smaller; if so, it swaps the element with the smaller child
- The function recurses on that child

### Running time

- Worst case: the element has to be bubbled down to a leaf
- The cost is $O(h)$
- By Lemma 4, $h \in O(\log n)$
- In conclusion, the running time of `extractMin()` is also $O(\log n)$

## 14.3 Sorting with a PrioQueue

**Recall the sorting algorithm:**

```scala
def prioQueueSort[K: Ordering](array: Array[K]): Unit =
  val n = array.length
  val pq: MyPrioQueue[K] = PrioQueueImplementation[K]()
  for i <- 0 to n - 1 do pq.insert(array(i))
  for i <- 0 to n - 1 do array(i) = pq.extractMin()
```

- The running time of this algorithm depends on the concrete implementation
- Let $T_{in}(n)$ be the running time of the insert method and $T_{out}(n)$ the running time of the remove method, each on a queue with $n$ elements
- The running time $T(n)$ can then be determined:
  - $T(n) = T_{in}(0) + T_{in}(1) + \dots + T_{in}(n-2) + T_{in}(n-1) + T_{out}(n) + T_{out}(n-1) + \dots + T_{out}(2) + T_{out}(1)$
  - $T(n) = \sum_{i=0}^{n-1} T_{in}(i) + \sum_{i=1}^{n} T_{out}(i) = \sum_{i=1}^{n} \big(T_{in}(i-1) + T_{out}(i)\big) \le n \cdot \big(T_{in}(n) + T_{out}(n)\big)$, assuming both running times are non-decreasing
  - Hence $T(n) \le n \cdot (T_{in}(n) + T_{out}(n))$

## 14.3.1 Insertionsort and Selectionsort

- Consider the first variant of the linked-nodes implementation
- $T_{in}(n) \in O(n)$ and $T_{out}(n) \in O(1)$
- Therefore $T(n) \in O(n^2)$
- This algorithm is insertion sort (in some sense)
- Consider the second variant of the linked-nodes implementation
- $T_{in}(n) \in O(1)$ and $T_{out}(n) \in O(n)$
- Therefore $T(n) \in O(n^2)$
- This algorithm is selection sort (in some sense)

## 14.3.2 Heapsort

- Implement the priority queue using a binary heap
- The resulting algorithm is called heapsort
- We know that both insertion and deletion run in $O(\log n)$ time
- **Theorem 2**: The sorting algorithm heapsort has worst-case running time $O(n \log n)$

### Other important properties of sorting algorithms:

- **Memory usage**: this version of heapsort uses a linear amount of extra space (the heap) and is therefore out-of-place
- **Stability**: heapsort is not stable, because the heap structure destroys any pre-existing order among elements of the same value

## 14.3.3 In-place Heapsort

- To avoid copying the elements, we use only the two bubbling procedures and work directly on the input array

### Algorithm:

```scala
def heapsort[K: Ordering](array: Array[K]): Unit =
  // [...]
  val n = array.length
  for last <- 1 to n - 1 do bubbleUp(last)
  for last <- n - 1 to 1 by -1 do
    swap(0, last)
    bubbleDown(0, last - 1)
```

- We interpret a growing prefix of the input array as a heap (`last` increases)
- We restore the ordering property until the entire array is a heap
- We then swap the minimum with the last element
- We restore the ordering property of the remaining heap

### Running time:

- The algorithm runs in $O(n \log n)$ time
- It is an unstable, in-place sorting algorithm
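The sorting scheme of §14.1.1 can be tried out directly. The notes leave `PrioQueueImplementation` abstract, so this sketch plugs in a hypothetical `ListPrioQueue` following Variant 2 of §14.1.2, using a Scala `List` instead of hand-rolled nodes: `insert` in $O(1)$, `extractMin` scans in $O(n)$. `ListPrioQueue` and `sortDemo` are illustrative names, not from the notes.

```scala
trait MyPrioQueue[K: Ordering]:
  def insert(key: K): Unit
  def extractMin(): K
  def isEmpty: Boolean

class ListPrioQueue[K: Ordering] extends MyPrioQueue[K]:
  private var elems: List[K] = Nil                 // arbitrary order (Variant 2)
  def insert(key: K): Unit = elems = key :: elems  // O(1): prepend
  def extractMin(): K =
    val m = elems.min                              // O(n): scan for the minimum
    elems = elems.diff(List(m))                    // remove one occurrence
    m
  def isEmpty: Boolean = elems.isEmpty

def prioQueueSort[K: Ordering](array: Array[K]): Unit =
  val n = array.length
  val pq: MyPrioQueue[K] = ListPrioQueue[K]()
  for i <- 0 to n - 1 do pq.insert(array(i))
  for i <- 0 to n - 1 do array(i) = pq.extractMin()

@main def sortDemo(): Unit =
  val a = Array(3, 1, 4, 1, 5)
  prioQueueSort(a)
  println(a.toList)  // List(1, 1, 3, 4, 5)
```

As §14.3.1 observes, this combination repeatedly selects the minimum of the unsorted elements, which is exactly the behaviour of selection sort, and the overall cost is $O(n^2)$.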
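The fragments of §§14.2.3–14.2.5 can be assembled into one runnable class. The following sketch restates the notes' `BinaryHeap` code with its syntax repaired; the generic nulling of the vacated root slot is omitted for brevity, and `heapDemo` is a hypothetical entry point added for illustration.

```scala
import scala.reflect.ClassTag

trait MyPrioQueue[K: Ordering]:
  def insert(key: K): Unit
  def extractMin(): K
  def isEmpty: Boolean

class BinaryHeap[K: Ordering: ClassTag](capacity: Int) extends MyPrioQueue[K]:
  private val ord = summon[Ordering[K]]
  import ord.mkOrderingOps  // enables <, > on K

  private val array: Array[K] = new Array[K](if capacity < 2 then 2 else capacity)
  private var last: Int = -1  // index of the right-most element in the bottom level

  private def parent(i: Int): Int = (i - 1) / 2
  private def lchild(i: Int): Int = 2 * i + 1
  private def rchild(i: Int): Int = 2 * i + 2

  private def swap(i: Int, j: Int): Unit =
    val temp = array(i)
    array(i) = array(j)
    array(j) = temp

  def isEmpty: Boolean = last == -1

  def insert(key: K): Unit =
    if last == array.length - 1 then throw Exception("The heap is full.")
    last = last + 1
    array(last) = key
    bubbleUp(last)

  private def bubbleUp(i: Int): Unit =
    // parent(0) == 0, so the recursion stops at the root.
    if array(parent(i)) > array(i) then
      swap(i, parent(i))
      bubbleUp(parent(i))

  def extractMin(): K =
    if isEmpty then throw Exception("The heap is empty.")
    val result = array(0)
    swap(0, last)  // move the last element to the root
    last = last - 1
    bubbleDown(0)
    result

  private def bubbleDown(i: Int): Unit =
    if (lchild(i) <= last && array(lchild(i)) < array(i)) ||
       (rchild(i) <= last && array(rchild(i)) < array(i))
    then
      val child =
        if rchild(i) <= last && array(rchild(i)) < array(lchild(i)) then rchild(i)
        else lchild(i)
      swap(i, child)
      bubbleDown(child)

@main def heapDemo(): Unit =
  val pq = BinaryHeap[Int](8)
  List(5, 1, 4, 2, 3).foreach(pq.insert)
  val out = collection.mutable.ListBuffer[Int]()
  while !pq.isEmpty do out += pq.extractMin()
  println(out.toList)  // List(1, 2, 3, 4, 5)
```

The demo extracts the five inserted keys in increasing order, matching Lemma 3 (the minimum sits at the root) and the $O(\log n)$ bounds for both operations.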
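Finally, a self-contained sketch of the in-place heapsort of §14.3.3. Here `bubbleUp` and `bubbleDown` are rewritten as local helpers over the input array, and `bubbleDown` takes the current heap boundary as a second parameter, as the notes' call `bubbleDown(0, last - 1)` suggests. One caveat not stated in the notes: since a min-heap is built and the minimum is repeatedly swapped to the back, the array comes out in decreasing order; reversing the comparisons (a max-heap) would give increasing order. `heapsortDemo` is an illustrative name.

```scala
def heapsort[K: Ordering](array: Array[K]): Unit =
  val ord = summon[Ordering[K]]
  import ord.mkOrderingOps

  def parent(i: Int): Int = (i - 1) / 2
  def lchild(i: Int): Int = 2 * i + 1
  def rchild(i: Int): Int = 2 * i + 2

  def swap(i: Int, j: Int): Unit =
    val temp = array(i)
    array(i) = array(j)
    array(j) = temp

  def bubbleUp(i: Int): Unit =
    if array(parent(i)) > array(i) then
      swap(i, parent(i))
      bubbleUp(parent(i))

  // `last` is the index of the final element still belonging to the heap.
  def bubbleDown(i: Int, last: Int): Unit =
    if (lchild(i) <= last && array(lchild(i)) < array(i)) ||
       (rchild(i) <= last && array(rchild(i)) < array(i))
    then
      val child =
        if rchild(i) <= last && array(rchild(i)) < array(lchild(i)) then rchild(i)
        else lchild(i)
      swap(i, child)
      bubbleDown(child, last)

  val n = array.length
  for last <- 1 to n - 1 do bubbleUp(last)  // grow the heap prefix
  for last <- n - 1 to 1 by -1 do           // shrink it again
    swap(0, last)
    bubbleDown(0, last - 1)

@main def heapsortDemo(): Unit =
  val a = Array(3, 1, 4, 1, 5, 9, 2, 6)
  heapsort(a)
  println(a.toList)  // List(9, 6, 5, 4, 3, 2, 1, 1)
```

No auxiliary heap object is allocated, so the sort is in-place, and both phases cost $O(n \log n)$ in total.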
