Trees PPT PDF
Document Details
Uploaded by Deleted User
Ashwitha Thomas
Tags
Summary
This presentation covers the fundamental concepts of trees in data structures. It defines key terms like root, child, sibling, ancestor, and descendant, and discusses properties such as degree, leaf nodes, and internal nodes. It also introduces the concept of a binary tree.
Full Transcript
NONLINEAR DATA STRUCTURES - TREE DATA STRUCTURES Ashwitha Thomas What is a Tree? A tree is a finite set of one or more nodes that shows parent-child relation such that: i) there is a specially designated node called the root; ii) The remaining nodes are partitioned into disjoint subsets T1...
NONLINEAR DATA STRUCTURES - TREE DATA STRUCTURES Ashwitha Thomas What is a Tree? A tree is a finite set of one or more nodes that shows parent-child relation such that: i) there is a specially designated node called the root; ii) The remaining nodes are partitioned into disjoint subsets T1, T2, T3 ……Tn, n ≥ 0where T1, T2, T3 …… Tn which are all children of root node are themselves trees called subtrees Terminology Root node: A first node written at the top is root node. The root node does not have the parent. The node 100 is the root node in above tree. Child: The node obtained from the parent node is called child node. A parent node can have zero or more child nodes. 50 and 60 are children of 100 80 and 40 are children of 60 70 is child of 50 35 and 30 are children of 80 Siblings: Two or more nodes having the same parent are called siblings. For Example, 50 and 60 are siblings since they have same parent 100 80 and 40 are siblings since they have same parent 60 35 and 30 are siblings since they have same parent 80 Ancestors: The nodes obtained in the path from the specified node x while moving upwards towards the root node are called ancestors. For example, 100 is the ancestor of 50 and 60 60 is the ancestor of 35, 30, 80 and 40 50 and 100 are the ancestors of 70 60 and 100 are the ancestors of 80, 40, 35, 30 80, 60 and 100 are the ancestors of 35 and 30 Descendants: The nodes in the path below the parent are called descendants. In other words, the nodes that are all reachable from a node x while moving downwards are all called descendants of x. For example, All the nodes below 100 are descendants of 100 All the nodes below 50 are descendants of 50 and so on. Left descendants: The nodes that lie towards left subtree of node x are called left descendants. For example, 50 and 70 are left descendants of 100 80, 35 and 30 are the left descendants of 60 35 and 80 are left descendants of 60 Right descendants: The nodes that lie towards right subtree of node x are called right descendants. For example The right descendants of 100 are 60, 80, 40, 35 and 30 The right descendent of 80 is 30 The right descendent of 60 is 40 Leftsubtree: All the nodes that are all left descendants of a node x form the left subtree of x. For example, The left subtree of 100 are 50 and 70 The left subtree of 60 are 80, 35 and 3 Right subtree: All the nodes that are all right descendants of a node x form the right subtree of x. For example, The are right descendants of 100 are 60, 80, 40, 35 and 30 The right descendent of 80 is 30 The right descendent of 60 is 40 Parent: A node having left subtree or right subtree or both is said to be a parent node for the left subtree and/or right subtree. The word father is also used in place of parent. For example, The parent for 50 and 60 is 100 The parent for 70 is 50 The parent for 80 and 40 is 60 The parent for 35 and 30 is 80 Degree: The number of subtrees of a node is called its degree. For example, The node 100 has two subtrees. So, degree of node 100 is 2 The node 50 has one subtree. So, degree of node 50 is 1 The node 70 has no subtree. So, degree of node 70 is 0 Leaf: A node in a tree that has a degree of zero is called a leaf node. In other words, a node with an empty left child and an empty right child is called a leaf node. It is also called a terminal node. For example, 70, 35, 30 and 40 are the leaf nodes Internal nodes: The nodes except leaf nodes in a tree are called internal nodes. For example, External nodes:The NULL link of any node in a tree is an external node. For example, rlink of 50, rlink and llink of nodes 70, 35, 30 and 40 are all external nodes Level: The distance of a node from the root is called level of the node. The distance from root to itself is 0. So, level of root node is 0. The node 50 is at a distance of 1 from root node. So, its level is 1. The node 70 is at a distance of two nodes from root node. So, its level is 2. Height The node (Depth): 35 and 30 The height are of the tree at a distance is nodes of 3 defined as the from the root maximum levellevels node. So, their of anyare leaf 3.in the tree. For example, height of above tree is 4 Binary trees Definition: A binary tree is a tree which has finite set of nodes that is either empty or consist of a root and two subtrees called left subtree and right subtree. A binary tree can be partitioned into three subgroups namely root, left subtree and right subtree. Root – If tree is not empty, the first node in the tree is called root node. left subtree – It is a tree which is connected to the left of root. Since this tree comes towards left of root, it is called left subtree. right subtree – It is a tree which is connected to the right of root. Since this tree comes towards right of root, it is called right subtree. A tree in which each node has either zero, one or two subtrees is called a binary tree. The pictorial representation of a typical node in a binary tree is shown below: Note that a node in a tree consists of three fields namely llink, info and rlink: llink –contains address of left subtree info – This field is used to store the actual data or information to be manipulated rlink –contains address of right subtree. The pictorial representation of above node can also be written as shown below: The following figure shows some of the binary trees: Are they binary trees? The tree shown in Fig (a) is not binary tree. This is because, it has more than two subtrees. The tree shown in Fig (b) is not a binary tree. There is a cycle from 100 to 200, 200 to 300 and 300 back to 100. The tree should not have cycles. The node 100 has subtree 10 and 10 has the subtree 100. If 10 is subtree of 100, then 100 cannot be the subtree of 10. So, it is not a binary tree. Properties of binary trees Types of binary trees Strictly binary tree In strictly binary tree, every node should have exactly two children or none. That means every internal node must have exactly two children. A binary tree in which every node has either two or zero number of children is called Strictly Binary Tree. Strictly binary tree is also called as Full Binary Tree Proper Binary Tree or 2-Tree A binary tree having 2^i nodes in any given level i is called strictly binary tree. Here, every node other than the leaf node has two children. A strictly binary tree is also called full binary tree or proper binary tree. For example, in full binary tree shown, The number of nodes at level 0 = 2^0 = 1 The number of nodes at level 1 = 2^1 = 2 The number of nodes at level 2 = 2^2 = 4 Complete Binary Tree In complete binary tree every level except possibly the last level is completely filled. If the nodes in the last level are not completely filled, then all the nodes in that level should be filled only from left to right. Skewed tree A skewed tree is a tree consisting of only left subtree or only right subtree. Expression Tree The expression tree is a binary tree in which each internal node corresponds to the operator and each leaf node corresponds to the operand so for example expression tree for 3 + ((5+9)*2) would be: Binary Search Tree A Binary Search Tree (or BST) is a data structure used in computer science for organizing and storing data in a sorted manner. Each node in a Binary Search Tree has at most two children, a left child and a right child, with the left child containing values less than the parent node and the right child containing values greater than the parent node. This hierarchical structure allows for efficient searching, insertion, and deletion operations on the data stored in the tree. Binary tree representation Array representation (uses static allocation technique) Linked representation (uses dynamic allocation technique) Linked representation In a linked representation, a node in a tree has three fields: info – which contains the actual information llink NODE– which contains address of the left subtree root = NULL; rlink or – contains address of the right subtree. struct node * root = NULL; Array representation A tree can also be represented using an array, which is called sequential representation. A tree can be represented using arrays in two different methods as shown below: Method 1: In the first representation shown below, some of the locations may be used and some may not be used. For this, a flag field namely, used is used just to indicate whether a memory location is used to represent a node or not. If flag field used is 0, the corresponding memory location is not used and indicates the absence of node at that position. So, each node contains two fields: An array a of type NODE can be used to store different items and the declaration for this is shown below: NODE a [MAX_SIZE]; Method 2: An alternate representation is that, instead of using a separate flag field used to check the presence of a node, one can initialize each location of the array to 0 indicating the node is not used. Non-zero value in the location indicates the presence of the node. Binary tree Traversals Preorder traversal Definition: The preorder traversal of a binary tree can be recursively defined as follows: 1. Process the root Node [N] 2. Traverse the Left subtree in preorder [L] 3. Traverse the Right subtree in preorder [R] Postorder traversal Definition: The postorder traversal of a binary tree can be recursively defined as follows: 1. Traverse the Left subtree in postorder [L] 2. Traverse the Right subtree in postorder [R] 3. Process the root Node [N] Inorder traversal Definition: The inorder traversal of a binary tree can be recursively defined as follows: 1. Traverse the Left subtree in inorder [L] 2. Process the root Node [N] 3. Traverse the Right subtree in inorder [R] Threaded binary trees Disadvantages of binary trees: In a binary tree, more than 50% of link fields have \0 (null) values and more memory space is wasted by storing \0 (null) values Traversing a tree with binary tree is time consuming. This is because, the traversal of a tree either uses implicit stack (in case of recursive programs) or explicit stack (in case of iterative programs). Whatever it is stack is must. Most of the time is spent in pushing and popping activities during traversing. Computations of predecessor and successor of given nodes is time consuming In binary trees, only downward movements are possible All these disadvantages can be overcome using threaded binary tree: A threaded binary tree is a binary tree which contains threads (i.e., addresses of some nodes) which facilitate upward movement in the tree. Each node in a threaded binary tree either contains a link to its child node or thread to other nodes in the tree. Types of Threaded Binary Tree One-way threaded Binary Tree/Single Threaded Binary Tree. Two-way threaded Binary Tree/Double Threaded Binary Tree One-way threaded Binary trees: In one-way threaded binary trees, a thread will appear either in the right or left link field of a node. If it appears in the right link field of a node then it will point to the next node that will appear on performing in order traversal. Such trees are called Right threaded binary trees. If thread appears in the left field of a node then it will point to the nodes inorder predecessor. Such trees are called Left threaded binary trees. Two-way threaded Binary Trees: Advantages: In a binary tree, more than 50% of link fields have \0 (null) values and more memory space is wasted by storing \0 (null) values. This wastage of memory space is avoided by storing addresses of some nodes. Traversing of a threaded binary tree is very fast. This is because, it does not use implicit or explicit stack. Computations of predecessor and successor of given nodes is very easy and efficient Any node can be accessed from any other node. Using threads, upward movement is possible and using links downward movement is possible. Thus, in a threaded binary tree, we can move in either directions. This is not possible in un-threaded binary trees. Even though insertion into a threaded binary tree and deletion from a threaded binary are time consuming operations, they are very easy to implement. Disadvantages: Here extra fields are required to check whether a link is a thread or not and hence occupy more memory when compared with un-threaded binary trees. Insertion and deletion of a node consumes more time than its counter part because many fields have to be modified. Binary Search tree A binary search tree (BST) is a data structure that stores sorted data in a tree-like structure. It's a rooted binary tree where each internal node has two subtrees, left and right, and the keys in each node follow the binary search property: The key in each node is greater than or equal to any key in its left subtree The key in each node is less than or equal to any key in its right subtree Advantages of Binary search tree Searching an element in the Binary search tree is easy as we always have a hint that which subtree has the desired element. As compared to array and linked lists, insertion and deletion operations are faster in BST. Creating a binary search tree Suppose the data elements are - 45, 15, 79, 90, 10, 55, 12, 20, 50 Step 1 - Insert 45. Step 2 - Insert 15. Step 3 - Insert 79. Step 4 - Insert 90. 90 is greater than 45 and 79, so it will be inserted as the right subtree of 79. Step 5 - Insert 10. 10 is smaller than 45 and 15, so it will be inserted as a left subtree of 15. Step 6 - Insert 55. Step 7 - Insert 12. Step 8 - Insert 20. Step 9 - Insert 50. Searching in Binary search tree The steps of searching a node in Binary Search tree are listed as follows : First, compare the element to be searched with the root element of the tree. If root is matched with the target element, then return the node's location. If it is not matched, then check whether the item is less than the root element, if it is smaller than the root element, then move to the left subtree. If it is larger than the root element, then move to the right subtree. Repeat the above procedure recursively until the match is found. Now, let's understand the searching in binary tree using an example. We are taking the binary search tree formed above. Suppose we have to find node 20 from the below tree. Function to search for an item in binary search tree using iteration Function to search for an item in BST using recursion Deletion in Binary Search tree To delete a node from BST, there are three possible situations occur : The node to be deleted is the leaf node, or, The node to be deleted has only one child, and, The node to be deleted has two children When the node to be deleted is the leaf node Here, we have to replace the leaf node with NULL and simply free the allocated space. In below image, suppose we have to delete node 90, as the node to be deleted is a leaf node, so it will be replaced with NULL, and the allocated space will free. When the node to be deleted has only one child We can see the process of deleting a node with one child from BST in the below image. In the below image, suppose we have to delete the node 79, as the node to be deleted has only one child, so it will be replaced with its child 55. So, the replaced node 79 will now be a leaf node that can be easily deleted. When the node to be deleted has two children This case of deleting a node in BST is a bit complex among other two cases. In such a case, the steps to be followed are listed as follows : First, find the inorder successor of the node to be deleted. After that, replace that node with the inorder successor until the target node is placed at the leaf of tree. And at last, replace the node with NULL and free up the allocated space. We can see the process of deleting a node with two children from BST in the below image. In the below image, suppose we have to delete node 45 that is the root node, as the node to be deleted has two children, so it will be replaced with its inorder successor. Now, node 45 will be at the leaf of the tree so that it can be deleted easily. Traverse the given BST Expression Tree The expression tree is a binary tree in which each internal node corresponds to the operator and each leaf node corresponds to the operand Example : 3 + ((5+9)*2) Properties of an Expression tree In this tree, the internal node always denotes the operators. The leaf nodes always denote the operands. The operations are always performed on these operands. The operator present in the depth of the tree is always at the highest priority. The operator, which is not much at the depth in the tree, is always at the lowest priority compared to the operators lying at the depth. The operand will always present at a depth of the tree; hence it is considered the highest priority among all the operators. Construction of Expression Tree Let us consider a postfix expression is given as an input for constructing an expression tree. Following are the step to construct an expression tree: Read one symbol at a time from the postfix expression. Check if the symbol is an operand or operator. If the symbol is an operand, create a one node tree and push a pointer onto a stack If the symbol is an operator, pop two pointers from the stack namely T1 & T2 and form a new tree with root as the operator, T1 & T2 as a left and right child A pointer to this new tree is pushed onto the stack Example - Postfix Expression Construction The input is: a b + c * The first two symbols are operands, we create one- node tree and push a pointer to them onto the stack. Next, read a '+' symbol, so two pointers to tree are popped, a new tree is formed and push a pointer to it onto the stack. Next, 'c' is read, we create one node tree and push a pointer to it onto the stack. Finally, the last symbol is read ' * ', we pop two tree pointers and form a new tree with a, ' * ' as root, and a pointer to the final tree remains on the stack. Evaluation of an Expression Tree To evaluate an expression tree: Recursively evaluate the left subtree. Recursively evaluate the right subtree. Apply the operator at the current node to the results of the left and right subtrees. Example Expression: 3 + ((5 * 4) - 2) Final Result: 21 Expression tree traversal In-Order Traversal (Left → Root → Right) 3 + (5 * 4) - 2 Pre-Order Traversal (Root → Left → Right) +3-*542 Post-Order Traversal (Left → Right → Root) 354*2-+ Graphs Definition: Formally, a graph G is defined as a pair of two sets V and E denoted by G = (V, E) where V is set of vertices and E is set of edges. For example, consider the graph shown below: Terminologies Directed graph A graph G = (V, E) in which every edge is directed is called a directed graph. The directed graph is also called digraph. Undirected graph Self-loop A loop is an edge which starts and ends on the same vertex. A loop is represented by an ordered pair (i, i). This indicates that the edge originates and ends in the same vertex. A loop is also called self-edge or self-loop. In the given graph shown below, there are two self-loops namely, and. Multigraph A graph with multiple occurrence of the same edge between any two vertices is called multigraph. Here, there are two edges between the nodes 1 and 4 and there are three edges between the nodes 4 and 3. Complete graph A graph G = (V, E) is said to be a complete graph, if there exists an edge between every pair of vertices Path Walk A walk is a sequence of vertices and edges of a graph i.e. if we traverse a graph then we get a walk. Simple path A simple path is a path in which all vertices except possibly the first and last are distinct. Length of the path The length of the path is the number of edges in the path. Cycle (circuit) A cycle is a path in which the first and last vertices are same Eg: The path shown in above directed graph is a cycle, since the first node and last node are same. It can also be represented as , ,. Note: A graph with at least one cycle is called a cyclic graph and a graph with no cycles is called acyclic graph. Connected Graph A graph is said to be connected if there is a path between every pair of vertex. From every vertex to any other vertex, there should be some path to traverse. A graph that is not connected is said to be disconnected graph. It is easy to see that a disconnected graph consists of two or more connected graphs. Each of these connected sub graphs is called a component. Representation of graph Adjacency Matrix Definition: Let G = (V, E) be a graph where V is set of vertices and E is set of edges. Let N be the number of vertices in graph G. The adjacency matrix A of a graph G is formally defined as shown below: Adjacency List Euler Path An Euler path is a path that uses every edge of a graph exactly once. An Euler circuit is a circuit that uses every edge of a graph exactly once. An Euler path starts and ends at different vertices. An Euler circuit starts and ends at the same vertex. A connected graph has an Euler circuit if and only if each of its vertices is of even degree.At every vertex, need one edge to get in and one edge to get out (or one to get out and one to get back in) A connected graph has an Euler path but not an Euler circuit if and only if it has exactly two vertices of odd degree. The first and last vertices are distinct. Remember that an Euler circuit is also an Euler path Hamiltonian Circuits A Hamiltonian circuit is a circuit that visits every vertex once with no repeats. Being a circuit, it must start and end at the same vertex. A Hamiltonian path also visits every vertex once with no repeats, but does not have to start and end at the same vertex. Practice: Algorithms & Functions based on Trees/graphs operations Hashing Hash key generation: Division Method (Modulo Operation): The formula to generate a hash key depends on the hashing function you choose. A hashing function is designed to transform a key into an index in a hash table. The general formula is: Hash Key=ℎ(key)Hash Key=h(key) Overflow Overflow happens when the hash table becomes full, or when the data structure used to handle collisions (like a chain) exceeds its capacity. Example: In chaining, if the linked list at a particular index grows too long due to repeated collisions, it might lead to an overflow if the memory allocated is insufficient. How to handle: Resize the table: Increase the size of the hash table. Use a more efficient hash function: To distribute keys more evenly Self Study: Learn more on overflow handling Learn different types of hashing functions