CE204 Data Structures and Algorithms - Lecture Notes PDF

Summary

These are lecture notes for the CE204 Data Structures and Algorithms course at the University of Essex, taught by David Richerby in the Autumn term of 2024-25. The notes cover topics such as self-balancing trees and AVL trees. These notes cover topics such as data structures and algorithms.

Full Transcript

CE204 Data Structures and Algorithms David Richerby University of Essex Autumn term 2024–25 David Richerby CE204 Data Structures and Algorithms 1 / 278 Unit 6: Self-Balancing Trees David Richerby CE2...

CE204 Data Structures and Algorithms David Richerby University of Essex Autumn term 2024–25 David Richerby CE204 Data Structures and Algorithms 1 / 278 Unit 6: Self-Balancing Trees David Richerby CE204 Data Structures and Algorithms 225 / 278 Binary search trees – refresher Each node stores a value. Values in left subtree are smaller; values in the right subtree are bigger. 5 2 8 4 6 12 3 Insertion, deletion, membership queries all take time O(height). In a typical binary search tree, height ≈ log n. In the worse case, height = n. David Richerby CE204 Data Structures and Algorithms 226 / 278 Balanced binary trees A binary search tree is balanced if: number of nodes in the left and right subtrees of every node differ by at most 1. 5 5 4 2 8 3 8 2 8 4 6 12 2 4 6 12 3 6 12 3 5 unbalanced balanced unbalanced A balanced binary tree with n nodes has height exactly 1 + blog2 nc. David Richerby CE204 Data Structures and Algorithms 227 / 278 The cost of keeping BSTs balanced Inserting into a balanced BST may unbalance it. 3 3 4 1 5 insert 7 1 5 rebalance 2 6 2 4 6 2 4 6 1 3 5 7 7 This is the only balanced BST storing 1–7. Every vertex has moved and every vertex except 7 now has a different parent! This shows rebalancing takes time Θ(n) – too expensive. David Richerby CE204 Data Structures and Algorithms 228 / 278 AVL-balance A binary search tree is AVL-balanced if: the heights of every node’s left and right subtrees differ by at most 1. Note: the height of a subtree with no vertices is 0. 5 5 4 2 8 3 8 2 8 4 6 12 2 4 6 12 3 6 12 3 5 (unbalanced) (balanced) (unbalanced) AVL-unbalanced AVL-balanced AVL-balanced AVL = Adelson-Velski and Landis, the two inventors. David Richerby CE204 Data Structures and Algorithms 229 / 278 Balance factors To track whether a BST is AVL-balanced, define each node’s balance factor as BF(x) = height(x.right) − height(x.left). For AVL-balanced trees, BF(x) ∈ {−1, 0, +1} for every node x. If BF(x) < 0, we say x is left-heavy – its left subtree is taller. If BF(x) > 0, we say x is right-heavy – its right subtree is taller. David Richerby CE204 Data Structures and Algorithms 230 / 278 Updating balance factors Balance factors can be stored in tree nodes and updated after inserting. In this diagram, the numbers are balance factors, not node values. (+1) (+1) (−1) (0) (−1) (0) (0) David Richerby CE204 Data Structures and Algorithms 231 / 278 Updating balance factors Balance factors can be stored in tree nodes and updated after inserting. In this diagram, the numbers are balance factors, not node values. (+1) (+1) (−1) (0) (0) (0) (0) (0) Updates only needed on the path up to the root. Updates stop when we set a balance factor to zero. David Richerby CE204 Data Structures and Algorithms 231 / 278 Updating balance factors Balance factors can be stored in tree nodes and updated after inserting. In this diagram, the numbers are balance factors, not node values. (+1) (+1) (0) (0) (0) (+1) (0) (0) (0) Updates only needed on the path up to the root. Updates stop when we set a balance factor to zero. David Richerby CE204 Data Structures and Algorithms 231 / 278 When the tree stops being AVL-balanced The insertions we just saw kept every node’s balance factor in {−1, 0, +1}. How can an insertion produce a balance factor outside {−1, 0, +1}? Either the left subtree of a left-heavy node became taller; or the right subtree of a right-heavy node became taller. (+1) h h+1 David Richerby CE204 Data Structures and Algorithms 232 / 278 When the tree stops being AVL-balanced The insertions we just saw kept every node’s balance factor in {−1, 0, +1}. How can an insertion produce a balance factor outside {−1, 0, +1}? Either the left subtree of a left-heavy node became taller; or the right subtree of a right-heavy node became taller. (+2) h h+2 David Richerby CE204 Data Structures and Algorithms 232 / 278 Rebalancing unbalanced trees (1) Suppose we’re updating balance factors and x is the first node where we set BF = +2. x (+2) y h h+2 David Richerby CE204 Data Structures and Algorithms 233 / 278 Rebalancing unbalanced trees (1) Suppose we’re updating balance factors and x is the first node where we set BF = +2. x (+2) y h h+1 Look more closely at y ’s subtrees. Suppose we inserted into the right subtree: it now has height h + 1. David Richerby CE204 Data Structures and Algorithms 233 / 278 Rebalancing unbalanced trees (1) Suppose we’re updating balance factors and x is the first node where we set BF = +2. x (+2) y h h h+1 y ’s left subtree must have height h. If not, we would have set BF(y ) = 0 and stopped updating. David Richerby CE204 Data Structures and Algorithms 233 / 278 Rebalancing unbalanced trees (1) Suppose we’re updating balance factors and x is the first node where we set BF = +2. x (+2) y h T1 h h+1 T2 T3 Nodes in T1 have value v with v < x Nodes in T2 have value v with x < v < y. Nodes in T3 have value v with v > y David Richerby CE204 Data Structures and Algorithms 233 / 278 Rebalancing unbalanced trees (1) Suppose we’re updating balance factors and x is the first node where we set BF = +2. Left rotation: y x h+1 h h T1 T2 T3 Nodes in T1 have value v with v < x Nodes in T2 have value v with x < v < y. Nodes in T3 have value v with v > y David Richerby CE204 Data Structures and Algorithms 233 / 278 Rebalancing unbalanced trees (1) Suppose we’re updating balance factors and x is the first node where we set BF = +2. Left rotation: y (0) x (0) h+1 h h T1 T2 T3 We have a valid BST and BF(y ) = 0 so the tree is AVL-balanced again. If x had had BF −2, we would do a right rotation – it’s a mirror image. David Richerby CE204 Data Structures and Algorithms 233 / 278 Rebalancing unbalanced trees (2) But what if it was y ’s left subtree that had height h + 1? x (+2) y h T1 h h+1 T3 T2 David Richerby CE204 Data Structures and Algorithms 234 / 278 Rebalancing unbalanced trees (2) But what if it was y ’s left subtree that had height h + 1? Left rotation: y x h h T3 h+1 T1 T2 Let’s try left rotation, again. David Richerby CE204 Data Structures and Algorithms 234 / 278 Rebalancing unbalanced trees (2) But what if it was y ’s left subtree that had height h + 1? Left rotation: y (−2) x (+1) h h T3 h+1 T1 T2 Left rotation has failed: now y ’s balance factor is bad instead of x’s. David Richerby CE204 Data Structures and Algorithms 234 / 278 Rebalancing unbalanced trees (2) But what if it was y ’s left subtree that had height h + 1? x (+2) y h T1 h h+1 T3 T2 We must look at y ’s left subtree T2 more closely. David Richerby CE204 Data Structures and Algorithms 234 / 278 Rebalancing unbalanced trees (2) But what if it was y ’s left subtree that had height h + 1? x (+2) y z h T1 h T3 T4 T5 One of T4 and T5 has height h; the other has height h − 1. (They both used to have height h − 1 but we just inserted into one of them to cause the imbalance.) David Richerby CE204 Data Structures and Algorithms 234 / 278 Rebalancing unbalanced trees (2) But what if it was y ’s left subtree that had height h + 1? x (+2) z y h T1 T4 h T5 T3 Perform a right-rotation at y Now, x’s right child is right-heavy, as before. We can make a left rotation. David Richerby CE204 Data Structures and Algorithms 234 / 278 Rebalancing unbalanced trees (2) But what if it was y ’s left subtree that had height h + 1? z x y h T1 T4 T5 T3 This combined operation is a right-left rotation. David Richerby CE204 Data Structures and Algorithms 234 / 278 Rebalancing unbalanced trees (2) But what if it was y ’s left subtree that had height h + 1? z (0) (0 or −1) x y (+1 or 0) h T1 T4 T5 T3 One of T4 and T5 has height h − 1. Either BF(x) = 0 and BF(y ) = +1 or BF(x) = −1 and BF(y ) = 0. In both cases, BF(z) = 0 so the tree is AVL-balanced again. David Richerby CE204 Data Structures and Algorithms 234 / 278 AVL rebalancing The ±2 node is: left-heavy right-heavy (−2) (+2) Its left child is: Its right child is: left-heavy right-heavy left-heavy right-heavy (−1) (+1) (−1) (+1) Right Left-right Right-left Left rotation rotation rotation rotation Right rotation and left-right rotation are the mirror image of the previous slides. David Richerby CE204 Data Structures and Algorithms 235 / 278 AVL tree insertion Insert as in an ordinary BST. Recalculate balance factors up path from the insertion point towards the root. Stop if a node’s new balance factor is 0. If a balance factor becomes ±2, rotate and stop. Running time: Insert takes time O(height). Recalculating balance factors takes time O(height). Single rotation (left or right) requires adjusting at most four nodes’ references: takes time O(1). Left-right rotation is a left rotation followed by right. David Richerby CE204 Data Structures and Algorithms 236 / 278 Height of AVL trees It can be shown that an AVL-balanced tree has height Θ(log n). The height Θ(n) worst-case of ordinary BSTs is impossible in an AVL tree. Therefore, insert runs in time O(log n). Search is also O(height) = O(log n). Deletion can also be done in time O(log n), using rotations appropriately. David Richerby CE204 Data Structures and Algorithms 237 / 278 AVL trees – example Start with an empty AVL tree and insert 2, 23, 8, 33, 45, 11, 17. 2 David Richerby CE204 Data Structures and Algorithms 238 / 278 AVL trees – example Start with an empty AVL tree and insert 2, 23, 8, 33, 45, 11, 17. 2 23 David Richerby CE204 Data Structures and Algorithms 238 / 278 AVL trees – example Start with an empty AVL tree and insert 2, 23, 8, 33, 45, 11, 17. 2 23 8 Right-left rotation David Richerby CE204 Data Structures and Algorithms 238 / 278 AVL trees – example Start with an empty AVL tree and insert 2, 23, 8, 33, 45, 11, 17. 2 8 23 Right-left rotation David Richerby CE204 Data Structures and Algorithms 238 / 278 AVL trees – example Start with an empty AVL tree and insert 2, 23, 8, 33, 45, 11, 17. 8 2 23 Right-left rotation David Richerby CE204 Data Structures and Algorithms 238 / 278 AVL trees – example Start with an empty AVL tree and insert 2, 23, 8, 33, 45, 11, 17. 8 2 23 33 David Richerby CE204 Data Structures and Algorithms 238 / 278 AVL trees – example Start with an empty AVL tree and insert 2, 23, 8, 33, 45, 11, 17. 8 2 23 33 45 Left rotation David Richerby CE204 Data Structures and Algorithms 238 / 278 AVL trees – example Start with an empty AVL tree and insert 2, 23, 8, 33, 45, 11, 17. 8 2 33 23 45 Left rotation (at 23, not 8!) David Richerby CE204 Data Structures and Algorithms 238 / 278 AVL trees – example Start with an empty AVL tree and insert 2, 23, 8, 33, 45, 11, 17. 8 2 33 23 45 11 Right-left rotation David Richerby CE204 Data Structures and Algorithms 238 / 278 AVL trees – example Start with an empty AVL tree and insert 2, 23, 8, 33, 45, 11, 17. 8 2 23 11 33 45 Right-left rotation David Richerby CE204 Data Structures and Algorithms 238 / 278 AVL trees – example Start with an empty AVL tree and insert 2, 23, 8, 33, 45, 11, 17. 23 8 33 2 11 45 Right-left rotation David Richerby CE204 Data Structures and Algorithms 238 / 278 AVL trees – example Start with an empty AVL tree and insert 2, 23, 8, 33, 45, 11, 17. 23 8 33 2 11 45 17 David Richerby CE204 Data Structures and Algorithms 238 / 278 AVL trees – summary Balance factor = (height of right) − (height of left). AVL trees are BSTs where BF(x) ∈ {−1, 0, +1} for all nodes x. Balance factor is stored in nodes and updated on inserts. Unbalanced nodes require rotations. Insert, search, delete in time O(log n). Avoids the Θ(n) worst-case of BSTs. David Richerby CE204 Data Structures and Algorithms 239 / 278