Freie Universität Berlin
# Chapter 13: Queues and Stacks

## 13.1 Traits

An abstract class is a class that has at least one unimplemented method. Although abstract classes have constructors just like concrete classes, we are not able to create instances of abstract classes. Instead, abstract classes provide templates for subclasses. If a class inherits from an abstract class, it either must implement the abstract methods or must be declared as abstract as well. The signature of the implementation in the subclass must match the signature in the abstract superclass. For example, the class Person should be implemented as abstract, since the method work() depends on the concrete subclasses: students work differently than lecturers, and both work differently than physicians, etc.

```scala
abstract class Person(val name: String, var age: Int):
  def isGrownUp: Boolean = age >= 18
  def work(): Unit // abstract method

class Studi(name: String, age: Int, val matr: Int) extends Person(name, age):
  def mat_nr = matr
  def work(): Unit = println("Study, study") // "override" not needed anymore
```

Now, by inclusion polymorphism we can still write the following assignment:

```scala
var s: Person = Studi("Maaax", 37, 4164581)
// var p: Person = Person("Maaax", 37) does not compile because Person is abstract
```

However, since Java and Scala do not allow multiple inheritance, we cannot extend a class from more than one (abstract) class. Let us consider the following situation:

- Students and Lecturers are both persons.
- Students are persons that learn.
- Lecturers are persons that learn (by researching) but also work (by giving lectures) - they earn money.
- Not all students earn money by working. Hence, not all of them work.

If we want to model this behavior, we need a new concept in object-oriented programming: the trait. A trait can be seen as an abstract class - it usually has abstract methods - but it is better seen as a property that we *mix into* different classes.
For example, we can mix the properties *working person* and *learning person* into lecturers. Like (abstract) classes, traits define a new type; e.g., Ordering is implemented as a trait¹. In *most of the cases*, traits only have unimplemented methods. Nevertheless, it is technically possible to implement attributes, concrete methods and even constructors within traits. Moreover, traits can inherit from other traits. A class/trait should only inherit multiple traits if the traits do not have concrete methods of the same signature; otherwise, we would not be able to decide which implementation is the one we need.²

1. [https://dotty.epfl.ch/api/scala/math/Ordering.html](https://dotty.epfl.ch/api/scala/math/Ordering.html)
2. Technically, it is possible to inherit from different traits that have implemented methods of the same signature, but then we have to override this method in the class/trait that mixes in the traits. However, this is bad programming style and should be avoided.

Concerning our example from above, we can do as follows:

```scala
trait WorkingPerson:
  def work(): Unit
  def salary: Int

trait LearningPerson:
  def learn(topic: String): Unit
```

These are two different traits: the working person and the learning person. Moreover, we have a class that can be used as a superclass:

```scala
class Person(private val name: String, private var age: Int):
  def isGrownUp: Boolean = age >= 18
  override def toString: String = name
```

Now, a student is a person having the property to be a learning person. Hence, Studi inherits from Person and mixes in the trait LearningPerson.
Syntactically, it looks like this:

```scala
class Studi(name: String, age: Int, mat_nr: Int) extends Person(name, age), LearningPerson:
  def matrikel_nr: Int = mat_nr
  def learn(topic: String): Unit = println("I learn about " + topic)
```

Finally, the class for the lecturer looks like this:

```scala
class Prof(name: String, age: Int, _salary: Int) extends Person(name, age), WorkingPerson, LearningPerson:
  def salary: Int = _salary
  def learn(topic: String): Unit = println("I research about " + topic)
  def work(): Unit = println("Hold monologue.")
```

A small test looks as follows:

```scala
var p: LearningPerson = Studi("Wolfgang", 22, 3426167)
p.learn("Traits")
p = Prof("Lena", 42, 12)
p.learn("Super-Traits")
```

## 13.2 Abstract Data Types

An abstract data type is used to describe the external functionality and the externally visible properties of certain objects; it abstracts from the concrete way in which this functionality is implemented. An implementation of an abstract data type is then called a data structure. An abstract data type is given by its specification:

- **(i)** specification of the stored data
- **(ii)** indication of the operations to be performed on the data and their specification

The great advantage of this approach is the abstraction. The user uses the interface (the specification) without knowing how it works internally. The developer can provide any implementation and can later supply the user with better implementations without the user having to change his own program. In Scala, we can use the concept **trait** to describe abstract data types. The different implementations will then mix in this trait. In our class, we will consider four different abstract data types and discuss the different implementations: Stack, Queue, PriorityQueue, and Dictionary/Map. The four data types can be implemented either with arrays or with so-called linked nodes. We will then analyse and discuss the advantages and disadvantages of these implementations.
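To make this separation between specification and implementation concrete, here is a minimal sketch. The trait `Counter` and both implementing classes are invented for illustration and are not part of the lecture; the point is that the client function is written against the trait only, so any implementation can be substituted without changing the client.

```scala
// Hypothetical abstract data type: a counter, specified as a trait.
trait Counter:
  def increment(): Unit // Effect: the stored value grows by one.
  def value: Int        // Result: the current count is returned.

// One possible implementation (data structure) ...
class IntCounter extends Counter:
  private var n: Int = 0
  def increment(): Unit = n = n + 1
  def value: Int = n

// ... and a second one; through the trait, the user cannot tell them apart.
class ListCounter extends Counter:
  private var ticks: List[Unit] = Nil
  def increment(): Unit = ticks = () :: ticks
  def value: Int = ticks.length

// Client code depends only on the specification, not on the data structure:
def countTo(c: Counter, k: Int): Int =
  for (_ <- 1 to k) do c.increment()
  c.value
```

Both implementations behave identically from the client's point of view, which is exactly the abstraction that the specification provides.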
## 13.3 Stacks

First, we consider the abstract data type Stack. In a stack, we can store elements of arbitrary types. We can insert and remove elements. The stack is a so-called Last-In-First-Out storage (LIFO principle), i.e., we always remove the most recent element of the stack. The specification looks as follows:

```scala
trait MyStack[A]:
  // Objects of an arbitrary but fixed type A are stored in a stack.

  // Precondition: None
  // Result: None
  // Effect: x is now the most recent element in the stack.
  def push(x: A): Unit

  // Precondition: Stack is not empty.
  // Result: The most recent element is returned.
  // Effect: The most recent element is removed from the stack.
  def pop(): A

  // Precondition: Stack is not empty.
  // Result: The most recent element is returned.
  // Effect: None
  def top: A

  // Precondition: None
  // Effect: None
  // Result: true is returned if and only if the stack has no elements.
  def isEmpty: Boolean

  // Precondition: None
  // Effect: None
  // Result: The number of elements in the stack is returned.
  def size: Int
```

We can imagine a deck of cards or a stack of books: we can put elements on the top of the stack or remove the top element, but we are not allowed to remove (or even read) an element somewhere in the middle of the stack. Stacks are used for recursive programs: whenever we call a function, we put this call onto the stack. In primitive recursion, this stack grows until we reach a recursion anchor; then the top function is removed. Another example is the Back button of your internet browser: you go back to the site you have visited before. In graph theory, we use stacks as data structures to implement depth-first search - a very important algorithm³. As an example, we can imagine an empty stack, or a stack with the three elements -5 (top), 17 and 42.

### 13.3.1 LinkedNodes implementation

First of all, we want to consider an implementation of the stack using linked nodes.
Elements are stored in nodes, which also contain a reference/link to the node of the next element. In Scala, these nodes are represented by an inner class called Node - a private class that is only visible within the class LinkedNodesStack.

```scala
private class Node(val item: A, val next: Node)
```

Moreover, in the header, we store the actual size of the stack as well as a reference to the highest (i.e., most recent) element. The header corresponds to the private attributes of LinkedNodesStack.

```scala
class LinkedNodesStack[A] extends MyStack[A]:
  // The inner class:
  private class Node(val item: A, val next: Node)
  // Header:
  private var topNode: Node = null
  private var _size: Int = 0
  // [...]
```

In an empty stack, _size is 0 and topNode is null; in a stack with three elements, _size is 3 and topNode references the chain of nodes -5 → 17 → 42.

3. *Spoiler alert: You learn this in "Discrete Structures" and "Algorithms and Data Structures".*

Hence, we can now easily implement the two methods isEmpty and size as follows:

```scala
class LinkedNodesStack[A] extends MyStack[A]:
  // [...]
  def isEmpty: Boolean = topNode == null
  def size: Int = _size
  // [...]
```

In the push method, we create a new Node object that stores the item and put the new object at the front of the linked list. The node that was previously referenced by topNode now becomes the second node in the linked list. Finally, we increment the size.

```scala
class LinkedNodesStack[A] extends MyStack[A]:
  // [...]
  def push(x: A): Unit =
    topNode = Node(x, topNode)
    _size = _size + 1
  // [...]
```

In the pop and top methods, we first check whether the stack is non-empty. If yes, we temporarily store the element in the top node and return it at the end. In the pop function, we update the variable topNode to the next node in the order and decrement the size.

```scala
class LinkedNodesStack[A] extends MyStack[A]:
  // [...]
  def pop(): A =
    if (!isEmpty) then
      val result: A = topNode.item
      // If the top node was simultaneously the last node
      // in the chain, topNode becomes null.
      topNode = topNode.next
      _size = _size - 1
      result
    else throw Exception("Stack is empty")

  def top: A =
    if (!isEmpty) then topNode.item
    else throw Exception("Stack is empty")
```

### 13.3.2 Array implementation

Next, we consider an implementation of the stack using arrays of static size. The header of the implementation contains two different variables. The elements are stored in an array. The array contains the elements of the stack in order from the oldest element to the most recent element (the oldest element is stored at index 0, the second-oldest element at index 1, and so on). The other variable, called amount, stores the number of elements that are stored in the stack. Hence, if the stack has at least one element, the most recent element is stored at position amount-1. Since we use an array of static size, the stack has an upper bound on the number of elements, which is called the capacity. This parameter has to be passed to the constructor of the ArrayStack.

```scala
class ArrayStack[A: ClassTag](capacity: Int) extends MyStack[A]:
  // Header:
  private val n: Int = if (capacity < 1) then 1 else capacity
  private val array: Array[A] = new Array[A](n)
  private var amount: Int = 0
  // [...]
```

You might notice that we declared the type A to be an instance of ClassTag. The reason for that lies deep in the implementation of Scala, Java and the JVM and is connected with the fact that we created an array with elements of arbitrary type. We do not want to go into detail here. Instead, we note that, whenever we want to create an array of an arbitrary type A, this type has to be an instance of ClassTag and we have to import scala.reflect.ClassTag. If you are curious about this, you might consult your favourite AI or the Scala documentation⁴.
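As a small illustration of the context bound, the following sketch (our own example, not from the lecture) compiles only because of `A: ClassTag`; if the bound is removed, the line `new Array[A](n)` is rejected by the compiler, since the JVM erases type parameters and cannot determine the concrete element class at runtime.

```scala
import scala.reflect.ClassTag

// A generic array can only be allocated if the element type carries
// a ClassTag: the runtime needs the concrete element class, which
// type erasure would otherwise have thrown away.
def filledArray[A: ClassTag](n: Int, x: A): Array[A] =
  val a: Array[A] = new Array[A](n) // legal thanks to A: ClassTag
  for (i <- 0 until n) do a(i) = x
  a
```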
An empty stack has amount 0 and all array slots free; a stack with three elements (in an array of length 5) has amount 3 and the array contents 42, 17, -5 at indices 0 to 2.

In the push method, we first check whether we have not yet reached the maximum capacity. If not, we store the element at the correct position and increase the amount.

```scala
class ArrayStack[A: ClassTag](capacity: Int) extends MyStack[A]:
  // [...]
  def push(x: A): Unit =
    if (amount < array.length) then
      array(amount) = x
      amount = amount + 1
    else throw new Exception("The stack is full")
  // [...]
```

In the pop method, we first check whether the stack is non-empty. If yes, we temporarily store the element at position amount-1 and return it at the end. After that, we write null to the position of the most recent element and decrease the amount afterwards; otherwise, the garbage collection would not remove the object.

4. [https://dotty.epfl.ch/api/scala/reflect/ClassTag.html](https://dotty.epfl.ch/api/scala/reflect/ClassTag.html), [https://users.scala-lang.org/t/classtag-for-parametrized-array/3894](https://users.scala-lang.org/t/classtag-for-parametrized-array/3894)

```scala
class ArrayStack[A: ClassTag](capacity: Int) extends MyStack[A]:
  // [...]
  def pop(): A =
    if (!isEmpty) then
      val result: A = array(amount - 1)
      array(amount - 1) = null.asInstanceOf[A]
      amount = amount - 1
      result
    else throw new Exception("Stack is empty")

  def top: A =
    if (!isEmpty) then array(amount - 1)
    else throw new Exception("Stack is empty")
```

**Running times**. In both implementations, all methods work in constant time in terms of the O-notation. We do not have any loops or recursions or expensive calls of functions. However, if we take a closer look at the technical details, we will observe that arrays have a much better caching performance than linked nodes. The reason is that elements stored in an array are always next to each other and can be accessed faster than linked nodes that are spread across the entire program memory.
In terms of running time, the array implementation is better. (*You will learn about this in higher classes.*)

**Memory**. The memory consumption of the linked nodes implementation only depends on the number of elements in the stack, because for each element in the stack we only store a reference to this element and a reference to the next node. Moreover, the memory size changes dynamically with the number of elements: the more elements in the stack, the higher the memory consumption, and vice versa. The memory consumption of the array implementation does not depend on the number of elements; it depends only on the capacity of the underlying array. This results in two problems: First, the ratio between the array size and the number of elements could be quite large: for example, our stack could store up to 10,000,000 elements, but the application only needs 10 - 15 of them - a lot of wasted space. The second problem is discussed under restrictions.

In terms of memory, the linked nodes implementation is better.

**Complexity of implementations**. In the complexity of implementation, the two implementations do not differ much. Neither uses complex concepts or functions, and both have a more or less comparable number of source code lines.

**Restrictions**. The linked nodes implementation does not have any restrictions. The array implementation, however, does: the stack has a fixed capacity. Hence, the stack is not able to grow/shrink dynamically.

### 13.3.4 Arrays of dynamic size

Although the running time of the array implementation is better due to the better caching performance, it has the big problem that it cannot shrink or grow dynamically - the array has static size. We can get rid of this problem by using an array of dynamic size. For this, we consider the two values amount and array.length of the original implementation.
During the whole lifetime of a stack with arrays of dynamic size, we maintain the following invariant:

```scala
array.length/4 <= amount < array.length
```

Whenever we call push or pop, we check whether the invariant has been violated. If this is the case, we create a new array whose size is either half the size of the old array (if array.length/4 > amount) or double the size of the old array (if amount >= array.length), copy all elements from the old array to the new array and remove the old array. We achieve this by a private helper function resize(), which is then used in the functions push and pop. *(Figure omitted: the resize process, with shaded regions illustrating the part of the array that is actually occupied.)*

```scala
class DynamicArrayStack[A: ClassTag] extends MyStack[A]:
  // Header:
  private var array: Array[A] = new Array[A](1)
  private var amount: Int = 0

  // Precondition: None
  // Result: None
  // Effect: (array.length/4 <= amount < array.length) holds.
  private def resize(): Unit =
    val cap: Int = array.length
    if (cap/4 <= amount && amount < cap) then return // do nothing if INV holds
    // Allocate a new array with half or double the size of the old array:
    val newArray: Array[A] = new Array[A](if (cap/4 > amount) then cap/2 else cap * 2)
    // Copy all elements to the new array:
    for (i <- 0 to amount-1) do
      newArray(i) = array(i)
    array = newArray // the old array is now removed
  // [...]
```

Now, we can implement the other two methods:

```scala
class DynamicArrayStack[A: ClassTag] extends MyStack[A]:
  // [...]
  def push(x: A): Unit =
    array(amount) = x
    amount = amount + 1
    resize() // Array too small? Resize!

  def pop(): A =
    if (!isEmpty) then
      val result: A = array(amount - 1)
      array(amount - 1) = null.asInstanceOf[A]
      amount = amount - 1
      resize() // Array too large? Resize!
      result
    else throw new Exception("Stack is empty")
```

**Discussion**.
Of course, we should now ask what we have gained from this feature and how it compares to the linked nodes implementation. First of all, we removed any restriction on the size of the stack. Hence, our stack is now able to grow and shrink dynamically. On the other hand, the complexity of the implementation increased due to the helper function resize(). Furthermore, since we always check the invariant, we can guarantee that the ratio between the number of elements in the stack and the size of the allocated array is always between 0.25 and 1, which is very good and comparable to the linked nodes implementation. Moreover, since we use arrays, the caching performance is much better than for the linked nodes, of course. However, a potential disadvantage is that whenever the invariant is violated, we have to copy ALL elements from one array to the other. Hence, sometimes the push and pop operations have worst-case running time O(n). BUT: The larger an array is, the more unlikely it is that we have to do an expensive resize. For example, if we made the array larger and copied 1 million elements, then there will be on the order of a million subsequent operations in which we do not have to do an expensive resize. We say that the amortized running time of push and pop is O(1). Thus, concerning the running time in terms of O-notation, the dynamic array implementation is not worse than the linked nodes implementation.

## 13.4 Queues

Next, we consider the abstract data type Queue. In a queue, we can store elements of arbitrary types. We can insert and remove elements. The queue is a so-called First-In-First-Out storage (FIFO principle), i.e., we always remove the oldest element of the queue. The specification looks as follows:

```scala
trait MyQueue[A]:
  // Objects of an arbitrary but fixed type A are stored in a queue.

  // Precondition: None
  // Result: None
  // Effect: x is now the most recent element in the queue.
  def enqueue(x: A): Unit

  // Precondition: Queue is not empty.
  // Result: The oldest element is returned.
  // Effect: The oldest element is removed from the queue.
  def dequeue(): A

  // Precondition: None
  // Effect: None
  // Result: true is returned if and only if the queue has no elements.
  def isEmpty: Boolean

  // Precondition: None
  // Effect: None
  // Result: The number of elements in the queue is returned.
  def size: Int
```

We can imagine a queue in a well-structured supermarket: the customers that enqueue first at the cash desk are served first. Customers arriving later at the cash desk are served later. The operating system makes intensive use of queues, for example, to schedule different processes or for printer spooling. In graph theory, we use queues as data structures to implement breadth-first search - a very important algorithm that is used to compute shortest paths between two vertices. As an example, we can imagine an empty queue, or a queue with the three elements 42 (front), 17 and -5.

### 13.4.1 LinkedNodes implementation

Again, we start with an implementation of the queue using linked nodes. Our class is then called LinkedNodesQueue. As before, we have an inner class called Node that stores an element as well as a reference to the next element in the queue. Moreover, in the header, we store the actual size of the queue as well as two references, the head and the tail of the queue; they are initially `null`.

```scala
class LinkedNodesQueue[A] extends MyQueue[A]:
  // The inner class:
  private class Node(val item: A, var next: Node)
  // Header:
  private var head: Node = null
  private var tail: Node = null
  private var _size: Int = 0
  // [...]
```

In an empty queue, _size is 0 and head and tail are null; in a queue with three elements, _size is 3, head references the node chain 42 → 17 → -5, and tail references the last node. We skip the implementation of the methods *isEmpty* and *size*, since they are straightforward. Let us first focus on the inserting method.
In the *enqueue* method, we must distinguish two cases: either the queue is empty or not. If the queue is empty, we create a new node with the enqueued element and `null` as successor and let *head* and *tail* reference this node. If, otherwise, the queue is non-empty, we append the new node after the tail (which cannot be `null`). After that, we update *tail* to the new node. Finally, we increment the size.

```scala
class LinkedNodesQueue[A] extends MyQueue[A]:
  // [...]
  def enqueue(item: A): Unit =
    if (!isEmpty) then
      tail.next = Node(item, null)
      tail = tail.next
    else
      tail = Node(item, null)
      head = tail
    _size = _size + 1
  // [...]
```

In the *dequeue()* operation, we first check whether the queue is empty or not. If it is empty, we throw an exception as usual. If not, we store the item of the head in a temporary variable which is later returned. After that, we update *head* to *head.next*. If this was the last element of the queue (i.e., if *head* is now `null`), we have to set *tail* to `null` as well. Finally, we decrement the size.

```scala
class LinkedNodesQueue[A] extends MyQueue[A]:
  // [...]
  def dequeue(): A =
    if (!isEmpty) then
      val result: A = head.item
      head = head.next
      if (head == null) then tail = null
      _size = _size - 1
      result
    else throw Exception("Queue is empty")
  // [...]
```

### 13.4.2 Array implementation

Last but not least, we consider an implementation of the queue using arrays of static size. The header of the implementation contains three different variables: we have an array (that stores the elements) and two indices *front* and *back*. The frontmost/first element of the queue is *array(front)*. The elements of the queue are stored in the order of their arrival in the queue, with a possible wrap-around technique. That means the successor of the element in the last position of the array is stored at the first position of the array. Moreover, *back* is the index of the first free slot in the array after the elements of the queue.
Since we use an array of static size, the queue has an upper bound on the number of elements, which is called the capacity. This parameter has to be passed to the constructor of the ArrayQueue.

```scala
class ArrayQueue[A: ClassTag](capacity: Int) extends MyQueue[A]:
  // The header:
  private val n: Int = if (capacity < 1) then 1 else capacity
  private val array: Array[A] = new Array[A](n)
  private var front: Int = 0
  private var back: Int = 0
  // [...]
```

Here, we keep the array length in the variable `n`, since it will make the code a bit clearer. Thus, the array has the positions 0,...,n-1, and to ensure that *front* and *back* are always between 0 and n-1 we compute modulo n. The queue is interpreted as being empty if and only if *front == back*. In order to avoid ambiguities, we interpret the queue as full if and only if (back + 1) % n == front. However, this means that *back* always indicates an empty slot in the array, so there must always exist an empty position in the array. This implies that the capacity of the queue is one less than the number of positions in the array. Such a construction is called a *ring buffer*, because we imagine that the elements are arranged in a cyclic order. Initially, we have *front == back == 0*, so the queue is empty. As an example with n == 5, an empty queue could have front == back == 3, while a queue with the three elements 42, 17, -5 could store 42 at index 3, 17 at index 4 and -5 at index 0 (wrap-around), with front == 3 and back == 1.

Next, we consider the *enqueue* method. First, we check whether the queue is full, i.e., whether (back + 1) % n == front. If not, we store the element in the array at position *back* and update the value *back* to the next position in the cyclic ordering.

```scala
class ArrayQueue[A: ClassTag](capacity: Int) extends MyQueue[A]:
  // [...]
  def enqueue(x: A): Unit =
    // The next position after back
    // in the cyclic ordering:
    val newBack: Int = (back + 1) % n
    if (newBack != front) then // queue is not full
      array(back) = x
      back = newBack
    else throw new Exception("The queue is full")
  // [...]
```

For the *dequeue* method, we must first check whether the queue is non-empty. If yes, we store the first element temporarily and return it later. We remove the object from the array by setting its slot to `null`. Finally, we move the index *front* one step further in the cyclic ordering.

```scala
class ArrayQueue[A: ClassTag](capacity: Int) extends MyQueue[A]:
  // [...]
  def dequeue(): A =
    if (!isEmpty) then
      val result: A = array(front)
      array(front) = null.asInstanceOf[A]
      // one position further in the cyclic ordering
      front = (front + 1) % n
      result
    else throw new Exception("The queue is empty")
  // [...]
```

**Remarks**. Finally, both implementations can be compared according to the known criteria. Furthermore, it is again possible to use an array of dynamic size to implement the queue. Here, one should be careful with the resize() method: we cannot copy it one-to-one from the stack, because the wrap-around technique could cause some trouble here.
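As a sketch of how such a resize could deal with the wrap-around, the following hypothetical `DynamicArrayQueue` (the names, the growth/shrink policy and the extra counter `amount` are our own choices, not from the lecture) copies the elements in queue order, so the queue is "unwrapped" and *front* starts again at index 0. Note that by storing the number of elements explicitly, this sketch does not need the empty-slot convention of the static ring buffer.

```scala
import scala.reflect.ClassTag

// Hypothetical queue on an array of dynamic size (illustration only).
class DynamicArrayQueue[A: ClassTag]:
  private var array: Array[A] = new Array[A](2)
  private var front: Int = 0  // index of the oldest element
  private var back: Int = 0   // index of the first free slot
  private var amount: Int = 0 // number of stored elements

  def size: Int = amount
  def isEmpty: Boolean = amount == 0

  // Copy the elements in queue order (starting at front, with
  // wrap-around) into a fresh array, so the queue is "unwrapped":
  private def resize(newCap: Int): Unit =
    val newArray: Array[A] = new Array[A](newCap)
    for (i <- 0 until amount) do
      newArray(i) = array((front + i) % array.length)
    array = newArray
    front = 0
    back = amount

  def enqueue(x: A): Unit =
    if (amount == array.length) then resize(2 * array.length) // full? grow
    array(back) = x
    back = (back + 1) % array.length
    amount = amount + 1

  def dequeue(): A =
    if (isEmpty) then throw Exception("Queue is empty")
    val result: A = array(front)
    array(front) = null.asInstanceOf[A]
    front = (front + 1) % array.length
    amount = amount - 1
    if (array.length > 2 && amount <= array.length / 4) then
      resize(array.length / 2) // mostly empty? shrink
    result
```

The crucial line is the copy `array((front + i) % array.length)`: it walks through the old array in the cyclic ordering, which is exactly the step a one-to-one copy of the stack's resize() would get wrong.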