Unit 7 - Trees PDF
Document Details
Uploaded by Deleted User
Tags
Summary
This document provides an introduction to tree data structures, focusing on the characteristics of trees and binary trees. It also includes details on their representations, various traversal methods, and other relevant concepts.
Full Transcript
Unit 7 - Trees Presentation Instruction: 1. Each member will choose a topic to be discuss using multimedia presentation 2. One multimedia one member presentation 3. No group multimedia presentation 4. First screen picture of presenter 5. Second screen start of topic discus...
Unit 7 - Trees Presentation Instruction: 1. Each member will choose a topic to be discuss using multimedia presentation 2. One multimedia one member presentation 3. No group multimedia presentation 4. First screen picture of presenter 5. Second screen start of topic discussion 6. Last make a best presentation of your topic with good example of each presentation Overview: We have already considered two data structures, the stack and the queue. Both are linear in nature and are represented in the memory either by a sequentially allocated vector (array) or by a linked linear list. In this unit we now focus on two important non-linear structures- the tree and the binary tree. Module Objectives: After successful completion of this unit, you should be able to: Discuss what a tree and its characteristics Discuss the different representation of a tree Discuss what a binary tree and its characteristics Differentiate between a tree and a binary tree Demonstrate the conversion between a tree and a binary tree Discuss the concept of searching using a binary tree Implement your own tree and binary tree application Course Materials: A tree is a nonlinear data structure and is a special form of directed graph. It consist of a finite set of elements called nodes (or vertices) and a finite set of directed arcs called branches (or edge) that connect pairs of nodes with the following properties: a) there is a specially designated node called the root b) the remaining nodes are partitioned into n >= 0 disjoint sets T1, …, Tn where each of these sets is a tree. T1, …, Tn are called the subtrees of the root. Vertices of a tree which have successors are called ‘nonterminal vertices,’ while vertices that have no successors are called ‘terminal vertices’ or ‘leaves.’ Figure 38 Illustrates a tree with root vertex a; two nonterminal vertices a, c; and four terminal vertices b, d, e, f. Trees in which each nonterminal vertex has at most n successors are called ‘n-ary’ trees. Trees in which nonterminal vertex has at most two successors are called ‘binary’ trees. The tree in Figure 38 is an example of an n-ary tree or a General Tree. a b c d e f Figure 38 A tree showing terminal and nonterminal vertices Tree Representation Tree structures may be indicated by parentheses, nesting, or indentation as illustrated in Figure 39 which shows alternative representations of the tree of Figure 38. Figure 39 Alternative representations of tree Tree Vocabulary Node a component of the tree that holds data Edge/branch the line or connection between a parent and child (there may be a weight value associated with the edge) Root top element of the hierarchical structure; the starting element. There is only one root per tree Child a node's immediate successor; a node may have zero or more successors Parent a node's predecessor: if node C is a child of node P, then P is a parent of C; a node has exactly one parent Leaf /Terminal a node that has no children Descendant an extension of the child relationship: If C is a child of P, then C is a descendant of P, or if C is a child of a descendant D then C is a descendant of D Ancestor an extension of the parent relationship: If D is a descendant of A, then A is an ancestor of D. Sibling child node that has the same parent as another node Internal node/ a node that has at least one child non-terminal Path the sequence of edges between a node and one of its descendants Path length the number of edges in a path Depth or level the length of the path connecting a node to the root. The root's depth or level is 0. A level of a tree consists of all nodes at the same depth. The depth of the tree is equal to the height of a tree. Height the longest path length in the tree; the number of edges from the node to the deepest leaf; the height of a tree is the height of the root Subtree the tree made of all descendants using some non-root node Degree The number of subtrees of a node; the number of children of a node Degree of a tree The maximum degree of the nodes in the tree. Forest A set of n >=0 disjoint trees. If you remove the root of a tree you get a forest. If you add a root to a forest, you get a tree. N-ary Tree Traversals Many algorithms pertaining to tree structures usually involve a process in which each node of the tree is “visited”, or processed, exactly once. Such a process is called a traversal. A complete traversal of a tree yields a linear arrangement of the nodes of the tree. Such a linear ordering can be very useful in processing the information stored in the nodes of the tree. There are various ways to visit (traverse or search) a general tree but the two most common ones will be discussed here. The simplest two search techniques are known as Depth-First Search (DFS) and Breadth-First Search (BFS). These two searches are described by looking at how the search tree (representing all the possible paths from the start) will be traversed. Depth-First Search with a Stack In depth-first search we go down a path until we get to a dead end; then we backtrack or back up (by popping a stack) to get an alternative path. Create a stack Create a new choice point Push the choice point onto the stack while (not found and stack is not empty) Pop the stack Find all possible choices after the last one tried Push these choices onto the stack Return Breadth-First Search with a Queue In breadth-first search we explore all the nearest possibilities by finding all possible successors and enqueue them to a queue. Create a queue Create a new choice point Enqueue the choice point onto the queue while (not found and queue is not empty) Dequeue the queue Find all possible choices after the last one tried Enqueue these choices onto the queue Return Binary Tree The binary tree is by far the most important non-linear data structure in computer algorithms. A binary tree is a finite set of nodes which is either empty or consists of a root node and maximum of two disjoint binary trees called left and right subtrees of the root. Note that while a tree may tree must a have at of least one at node, a binary be empty. of Likewise, tree may have any number subtrees; a binary tree can have most two subtrees. Following are examples binary trees. Figure 40 Different binary trees Tree Vocabulary Left subtree The descendants to the left of a node Right subtree The descendants to the right of a node Similar Two binary trees are said to be similar if they have the same structure Equivalent Two binary trees are said to be equivalent if they are similar and contain the same information. Figure 41 Tree1, Tree2, Tree3 are all similar; Tree1, Tree2 are equivalent n-Ary Tree to Binary Tree Conversion One issue with n-ary traversal is that it is quite difficult to search or traverse as there can be uneven number of subtrees in the tree. A way to resolve this is to convert the n-ary tree into a binary tree. With a binary tree we can be assured of the search or traversal as there is only two possible paths to take. Following are the steps in converting a n-ary tree to a binary tree. Step 1 − The left child of a node in the n-ary tree becomes the left child of the node in the binary tree. Step 2 − The next sibling of each node in the n-ary tree becomes the right child of the node in the binary tree. Figure 42 Convert an N-ary Tree to a Binary Tree Types of Binary (based on Structure) Rooted binary tree: It has a root node and every node has at most two children. Full binary tree: It is a tree in which every node in the tree has either 0 or 2 children. The number of nodes, n, in a full binary tree is at least n = 2h – 1, and at most n = 2h+1 – 1, where h is the height of the tree. The number of leaf nodes l, in a full binary tree is number, L of internal nodes + 1, i.e, l = L+1. Figure 43 Full Binary tree Perfect binary tree: It is a binary tree in which all interior nodes have two children and all leaves have the same depth or same level. A perfect binary tree with l leaves has n = 2l-1 nodes. In perfect full binary tree, l = 2h and n = 2h+1 - 1 where, n is number of nodes, h is height of tree and l is number of leaf nodes Figure 44 A perfect binary tree Complete binary tree: It is a binary tree in which every level, except possibly the last, is completely filled, and all nodes are as far left as possible. Figure 45 An example of a complete binary tree The number of internal nodes in a complete binary tree of n nodes is floor(n/2). Balanced binary tree: A binary tree is height balanced if it satisfies the following constraints: a) The left and right subtrees' heights differ by at most one, AND b) The left subtree is balanced, AND c) The right subtree is balanced An empty tree is height balanced. Figure 4 Balanced binary 6 tree The height of a balanced binary tree is O(Log n) where n is number of nodes. Further discussion about balanced trees is presented in the next unit. Degenarate tree: It is a tree where each parent node has only one child node. It behaves like a linked list. This type of tree is considered a skewed tree. A skewed tree is one with more levels of nodes in one subtree of a node than in the other subtree. Figure 47 A Degenerate tree (skewed to the right) Binary Tree Traversals Recall that tree traversal is a process to visit all the nodes of a tree and may print their values too. Because, all nodes are connected via edges (links) we always start from the root (head) node. That is, we cannot randomly access a node in a tree. Generally, we traverse a tree to search or locate a given item or key in the tree or to print all the values it contains. In a binary tree, there are three different entities involved: a root, its left subtree, and its right subtree. The sequence in which we process these entities defines a particular traversal method. We will consider three such traversal methods, namely: preorder traversal, in order traversal, and post order traversal. These three methods are defined recursively. When a binary tree is empty, it is traversed by doing nothing. 1. Preorder traversal In this traversal method, the root node is visited first, then the left subtree and finally the right subtree. Figure 48 Pre Order Traversal We start from A, and following pre-order traversal, we first visit A itself and then move to its left subtree B. B is also traversed pre-order. The process goes on until all the nodes are visited. The output of pre-order traversal of this tree will be − A→B→D→E→C→F→G Algorithm Until all nodes are traversed − Step 1 − Visit root node. Step 2 − Recursively traverse left subtree. Step 3 − Recursively traverse right subtree. 2. Inorder (or symmetric order) traversal In this traversal method, the left subtree is visited first, then the root and later the right sub- tree. We should always remember that every node may represent a subtree itself. If a binary tree is traversed in-order, the output will produce sorted key values in an ascending order. Figure 49 In Order Traversal We start from A, and following in-order traversal, we move to its left subtree B. B is also traversed in-order. The process goes on until all the nodes are visited. The output of inorder traversal of this tree will be − D→B→E→A→F→C→G Algorithm Until all nodes are traversed − Step 1 − Recursively traverse left subtree. Step 2 − Visit root node. Step 3 − Recursively traverse right subtree. 3. Postorder traversal In this traversal method, the root node is visited last, hence the name. First, we traverse the left subtree, then the right subtree and finally the root node. Figure 50 Postorder Traversal We start from A, and following Post-order traversal, we first visit the left subtree B. B is also traversed post-order. The process goes on until all the nodes are visited. The output of post- order traversal of this tree will be − D→E→B→F→G→C→A Algorithm Until all nodes are traversed − Step 1 − Recursively traverse left subtree. Step 2 − Recursively traverse right subtree. Step 3 − Visit root node. Construction of a Tree Using Traversal Information We have seen in previous discussions how a tree can be presented in various forms. We also have learned of the different traversal methods that one can use in passing through the different nodes of the tree. By using the traversal information, we can re-construct a tree to its original form. To do this, one will need two pieces of information: the in-order traversal, and the pre-order or post- order traversal. With the pre-order or post-order traversal information, one can easily identify the root. The root is the first value or node in a pre-order traversal, and the last value or node for a post- order traversal. On the other hand, with the in-order traversal, we can easily identify which nodes falls on the left-side or on the right-side of a tree, given of course the root node (from the pre-order or post-order traversal). To present clearly, let us consider the binary tree below with its corresponding pre-order, post-order, and in-order traversal information. The tree will be our basis if what we have reconstructed is correct or not. Sample 1: Using the pre-order and in-order traversal information in Figure 51, reconstruct the binary tree. Pre-order: CBAGFDEH In-order: ABCDEFGH First, we begin by analyzing the pre-order traversal information and identify the root among the values. Since in a pre-order traversal, it is the root node that is first visited, we identify the value C to be the root node. We then draw the tree with the information we have as seen below. C Figure 52 Root node Using this information, we proceed with the second process, that is we look into the in- order traversal information and identify the nodes on the left of node C and the nodes on the right of node C. From here, we see that nodes A and B are found on the left of node C and thus form the left-subtree, while nodes D, E, F, G, and H are found on the right of node C forming the right subtree. Let’s take the left subtree of root C first. Here we see that nodes A and B are part of the left subtree. We need to identify which node is the root of the left subtree. For this, we need to go back to the pre-order traversal and identify among nodes A and B the root node. Again, in the pre-order traversal, the first node evaluated is always the root. Thus, we can say that node B is the root of the left subtree. From, the in-order traversal, we see that node A is at the left of node B, so we draw the tree as seen below. C B A Figure 53 Forming the left subtree for root C We check now the right subtree. From amongst nodes D, E, F, G, and H, we need to see using the pre-order traversal information the node that will serve as the root. Among said nodes, we find that G is at the front of the list, thus G serves as the root of the right subtree. We now have a tree with information on the roots of both its left and right subtree as seen in Figure 54. C B G A Figure 54 Roots for the left-subtree and right-subtree of node C Going back to the in-order traversal, we check the remaining nodes with respect to node G. Here we can ascertain that nodes D, E, and F are found on the left side of G, whereas node H is found at the right of G. We then update our tree, adding node H to the picture. C B G A H Figure 55 Updated tree with node H at the right of node G We are now left with nodes D, E, and F. Going back to the pre-order traversal information, we identified node F to be the root of the left subtree of node G. With that, we check back C B G on the inorder traversal and find nodes D and E at A F H the left of node F and thus forming the nodes of the leftsubtree of node F. The updated tree with the node F added can be seen below. Figure 56 Updated tree with node F inserted. Looking back at the pre-order traversal, we notice that node D comes first before node E and thus mean that node D is the root of the subtree of node F. At the in-order traversal, we can discern that node E is at the right of node D. The final tree is shown in Figure 57. C B G A F H D E Figure 57 Reconstructed tree from pre-order and in-order traversal information. Expression Trees An expression tree is a representation of expressions arranged in a tree-like data structure. In other words, it is a tree with leaves as operands of the expression and nodes contain the operators. Similar to other data structures, data interaction is also possible in an expression tree. Expression trees are mainly used for analyzing, evaluating and modifying expressions, especially complex expressions. Expression tree represents an expression as a binary tree. Operators are parent nodes to operands (leaf nodes) for example expression tree for 3 + ((5+9)*2) would be: Figure 58 Expression tree for 3 + ( (5 + 9) * 2) The specified order of tree traversal gives method for evaluating expression. Inorder traversal of expression tree produces infix version of given postfix expression (same with preorder traversal it gives prefix expression) Evaluating the expression represented by expression tree: Let t be the expression tree If t is not null then If t.value is operand then Return t.value A = solve(t.left) B = solve(t.right) // calculate applies operator 't.value' // on A and B, and returns value Return calculate(A, B, t.value) Construction of Expression Tree: In constructing an expression tree we use a stack. We loop through the input expression and do the following for every character. a) If character is operand push that into stack b) If character is operator pop two values from stack make them its child and push current node again. At the end only element of stack will be root of expression tree. Binary Search Tree A "binary search tree" (BST) or "ordered binary tree" is a type of binary tree where the nodes are arranged in order: for each node, all elements in its left subtree are less-or-equal to the node (). Recursively, each of the subtrees must also obey the binary search tree constraint: in the (1, 3, 4) subtree (see Figure 59), the 3 is the root, the 1 3. Figure 59 Example of a Binary Search Tree (BST) The above properties of Binary Search Tree provide an ordering among keys so that the operations like search, minimum and maximum can be done fast. If there is no ordering, then we may have to compare every key to search a given key. Tree Vocabulary Immediate The largest value in the left subtree. The immediate predecessor is located predecessor by choosing the left pointer and chasing right pointers until a node is reached with a nil right pointer. Immediate The smallest value in the right subtree. Located by choosing the right successor pointer of a node, then chase left pointers until a node is reached with a nil left pointer. Searching in BST Search operation in binary search tree will be very similar to that of binary search. Let’s say we want to search for number, what we’ll do is we’ll start at root and then we will compare the value to be searched with value of root if it’s equal we are done with the search if it’s lesser we know that we need to go to the left subtree because in a binary search tree all the elements in the left subtree are lesser and all the elements in right subtree are greater. Searching an element in the binary search tree is basically this traversal in which at each step we will go either towards left or right and hence in at each step we discard one of the subtrees. If the tree is balanced, we call a tree balanced if for all nodes the difference between the heights of left and right subtrees is not greater than one, we will start with a search space of ‘n’ nodes and when we will discard one of the sub-trees we will discard ‘n/2’ nodes so our search space will be reduced to ‘n/2’ and then in the next step we will reduce the search space to ‘n/4’ and we will go on reducing like this till we find the element or till our search space is reduced to only one node. The search here is also a binary search and that’s why the name binary search tree. Algorithm Until an element is found or not − Step 1 – Start from the root Step 2 − Compare the inserting element with root, if less than root, then recurse for left, else recurse for right. Step 3 − If element to search is found anywhere, return true, else return false. Insertion and Deletion of Nodes in a BST A new key is always inserted at leaf. We start searching a key from root till we hit a leaf node. Once a leaf node is found, the new node is added as a child of the leaf node. The worst-case time complexity of search and insert operations is O(h) where h is height of Binary Search Tree. In worst case, we may have to travel from root to the deepest leaf node. The height of a skewed tree may become n and the time complexity of search and insert operation may become O(n). Algorithm Inserting key as child of leaf node − Step 1 – Start from the root Step 2 − Compare the inserting element with root, if less than root, then recurse for left, else recurse for right. Step 3 − After reaching end, just insert that node at left(if less than current) else right. When we delete a node, three possibilities arise. 1. Node to be deleted is leaf: Simply remove from the tree. Figure 60 BST - node deleted as a leaf 2. Node to be deleted has only one child: Copy the child to the node and delete the child. Figure 61 BST - node deleted has one child 3. Node to be deleted has two children: Find inorder successor of the node. Copy contents of the inorder successor to the node and delete the inorder successor. Note that inorder predecessor can also be used. Figure 62 BST - node deleted has two children The important thing to note is, inorder successor is needed only when right child is not empty. In this particular case, inorder successor can be obtained by finding the minimum value in right child of the node. Heap Tree Heaps are based on the notion of a complete tree, for which we gave an informal definition earlier. A heap is a binary tree with two special properties: 1. Ordering property The value in a node is smaller than all the values in the node's subtrees (MinHeap property). This is very different than a binary search tree! Note that this property makes no distinction between ‘left’ and ‘right’. All the following trees have this property but only one is a heap: Figure 63 Ordering property - node is smaller than all the values in the node's subtree The smallest value must always be in the root, but the largest value could be in any leaf. Similar values are not collected together in any convenient location. Incidentally, heaps can also be defined with the opposite ordering property: the value in each node is larger than all the values in its subtrees (MaxHeap property). 2. Structural property All the levels are full, except perhaps the bottom level, and all the nodes in the bottom level are as far to the ‘left’ as possible. A level is full if there are no ‘missing nodes’ at the level - there are 2**L nodes in level L. This structural property could be rephrased: the only nodes that can have an empty subtree are ones in the bottom level (by definition these have 2 empty subtrees) and the rightmost nodes in the next-to-bottom level. A tree that has this property is called a complete tree. There are two motives for restricting ourselves to complete trees: a. In a complete tree Height = O(logN). As we'll see, our Heap operations are O(Height), so by restricting ourselves to complete trees, we get O(logN) operations. b. complete trees have certain advantages over general binary trees in some implementations. Although all 3 of the Figure 63 trees have the ordering property, only one of them satisfies the structural requirement – that is the middle one. Operations on Heaps The common operation involved using heaps are: Heapify → Process to rearrange the heap in order to maintain heap-property. Find-max (or Find-min) → find a maximum item of a max-heap, or a minimum item of a min-heap, respectively. Insertion → Add a new item in the heap. Deletion → Delete an item from the heap. Extract Min-Max → Returning and deleting the maximum or minimum element in maxheap and min-heap respectively. Heapify It is a process to rearrange the elements of the heap in order to maintain the heap property. It is done when a certain node causes an imbalance in the heap due to some operation on that node. The heapify can be done in two methodologies: up_heapify() → It follows the bottom-up approach. In this, we check if the nodes are following heap property by going in the direction of rootNode and if nodes are not following the heap property, we do certain operations to let the tree follows the heap property. down_heapify() → It follows the top-down approach. In this, we check if the nodes are following heap property by going in the direction of the leaf nodes and if nodes are not following the heap property, we do certain operations to let the tree follows the heap property. void down_heapify(int heap[], int parent, int size) { largest = parent leftChild = 2*parent + 1 rightChild = 2*parent + 2 if(leftChild < size && heap[leftChild] > heap[largest]) largest = leftChild if(rightChild < size && heap[rightChild] > heap[largest]) largest = rightChild if(parent != largest) { swap(heap[parent], heap[largest]) down_heapify(heap,largest,size) } } void up_heapify(int heap[], int child) { parent = (child-1)/2; if(heap[parent] < heap[child]) { swap(heap[parent], heap[child]) up_heapify(heap,parent) } } Insertion The insertion in the heap follows the following steps Insert the new element at the end of the heap. Since the newly inserted element can distort the properties of the Heap. So, we need to perform up_heapify() operation, in order to keep the properties of the heap in a bottom-up approach. Figure 64 Insertion process in (c) (a) (b) Heaps As an example, consider heap tree (a) in Figure 64 which follows max-heap property. New element to be inserted is 12. Performing Step 1: Insert the new element at the end will result in (b). Second step is to heapify the new element following bottom-up approach. Final heap is (c). void up_heapify(int heap[], int child) { parent = (child- 1)/2; if(heap[parent] < heap[child]) { swap(heap[parent], heap[child]) up_heapify(heap,parent) } } void insert(int heap[], int size, int key) { heap.append(key) up_heapify(heap,size+1,size) } Deletion The deletion operations follow the following step: Replace the element to be deleted by the last element in the heap. Delete the last item from the heap. Now, the last element is placed at some position in heap, it may not follow the property of the heap, so we need to perform down_heapify() operation in order to maintain heap structure. The down_heapify() operation does the heapify in the top-bottom approach. The standard deletion on Heap is to delete the element present at the root node of the heap. (a) (b) (c) Figure 65 Deletion process in Heaps For example, consider the heap tree (a) in Figure 65 following a max-heap property. Element to be deleted is 12. Replacing the root with the last element ‘4’ and deleting it will result to (b). Since the resulting tree does not follow the property of the heap we perform down_heapify() to the new root resulting to the final heap tree (c). void down_heapify(int heap[], int parent, int size) { largest = parent leftChild = 2*parent + 1 rightChild = 2*parent + 2 if(leftChild < size && heap[leftChild] > heap[largest]) largest = leftChild if(rightChild < size && heap[rightChild] > heap[largest]) largest = rightChild if(parent != largest) { swap(heap[parent], heap[largest]) down_heapify(heap,largest,size) } } void deleteRoot(int heap[], int size) { swap(heap, heap[size-1]) //swap first and last element heap.pop_back() //delete the last element down_heapify(heap,0,size-1) } Find-max (or Find-min) The maximum element and the minimum element in the max-heap and min-heap is found at the root node of the heap. int findMax(int heap[]) { return heap } Extract Min-Max This operation returns and deletes the maximum or minimum element in max- heap and min-heap respectively. The maximum element is found at the root node. int extractMax(int heap[], int size) { ans = heap deleteRoot(heap, size) return ans } Watch: Introduction to Trees By CS Dojo (https://youtu.be/1-l_UOFi1Xw) Introduction to Trees By mycodeschool (https://youtu.be/qH6yxkw0u78) Binary Search Tree By mycodeschool (https://youtu.be/pYT9F8_LFTM) Deleting a node in a BST By mycodeschool (https://youtu.be/gcULXE7ViZw) Binary tree traversal - breadth-first and depth-first strategies By mycodeschool (https://youtu.be/9RHO6jU--GU) Binary tree traversal – Preorder, Inorder, Postorder By mycodeschool (https://youtu.be/gm8DUJJhmY4) Binary Expression Trees By Miran Fattah (https://youtu.be/_LxbhLNRZkI) Introduction to Binary Heaps By Algorithms with Attitude (https://youtu.be/WCm3TqScBM8) Heaps (MinHeaps) By HackerRank (https://youtu.be/t0Cq6tVNRBA) Read: Chapter 8 Trees Goodrich, Michael T. (2014). Data Structures and Algorithms in Java 6th Edition Chapter 4 – Trees (Sec. 4.1 – 4.3, pp. 121-144) Weiss, Mark Allen (2014). Data Structures And Algorithm Analysis In C++ 4th Edition Section. 4.6 Tree Traversals (pp. 166-168) Weiss, Mark Allen (2014). Data Structures And Algorithm Analysis In C++ 4th Edition Section 9.3 Heaps (pp. 370-384) Goodrich, Michael T. (2014). Data Structures and Algorithms in Java 6th Edition Review: 1. What is a tree? 2. What is a binary tree? How is it different from a tree? 3. What is an advantage of a binary search tree over a binary tree? 4. How to check if a given Binary Tree is BST or not? 5. What is a heap? 6. Describe the difference between a Max Heap over a Min Heap.