DSA Unit 7 - Trees PDF
Document Details
Uploaded by WealthyOrphism
Tags
Summary
This document provides an overview of tree data structures, including their characteristics, representations, types (like binary, complete, balanced), and operations. Key concepts such as traversals (preorder, inorder, postorder) and heap-based operations (heapify, insertion, deletion) are highlighted, along with examples and figures. The content is suitable for undergraduate computer science students.
Full Transcript
Unit 7 - Trees ============== Overview: --------- Module Objectives: ------------------ - **Discuss what a tree and its characteristics** - **Discuss the different representation of a tree** - **Discuss what a binary tree and its characteristics** - **Differentiate between a tree and a...
Unit 7 - Trees ============== Overview: --------- Module Objectives: ------------------ - **Discuss what a tree and its characteristics** - **Discuss the different representation of a tree** - **Discuss what a binary tree and its characteristics** - **Differentiate between a tree and a binary tree** - **Demonstrate the conversion between a tree and a binary tree** - **Discuss the concept of searching using a binary tree** - **Implement your own tree and binary tree application** Course Materials: ----------------- a. there is a specially designated node called the root b. the remaining nodes are partitioned into n \>= 0 disjoint sets T1,..., Tn where each of these sets is a tree. T1,..., Tn are called the subtrees of the root. Indentation -- -- -- -- ***Figure 40 Different binary trees*** -- -- -- -- ![](media/image2.jpeg) n. Ary Tree to Binary Tree Conversion *n*-ary tree - **Rooted binary tree:** It has a root node and every node has at most two children. - ![](media/image14.png)**Full binary tree**: It is a tree in which every node in the tree has either 0 or 2 children. - **Perfect binary tree:** It is a binary tree in which all interior nodes have two children and all leaves have the same depth or same level. - **Complete binary tree:** It is a binary tree in which every level, except possibly the last, is completely filled, and all nodes are as far left as possible. ![](media/image16.png) - **Balanced binary tree:** A binary tree is height balanced if it satisfies the following constraints: a. The left and right subtrees\' heights differ by at most one, AND b. The left subtree is balanced, AND c. The right subtree is balanced ***Figure 46 Balanced binary tree*** - **Degenarate tree:** It is a tree where each parent node has only one child node. It behaves like a linked list. This type of tree is considered a skewed tree. A skewed tree is one with more levels of nodes in one subtree of a node than in the other subtree. ![](media/image18.png) ***Figure 47 A Degenerate tree (skewed to the right)*** 1. **Preorder traversal** 2. **Inorder (or symmetric order) traversal** ![](media/image20.jpeg) 3. **Postorder traversal** Algorithm Expression Trees ![](media/image22.png) a. If character is operand push that into stack b. If character is operator pop two values from stack make them its child and push current node again. ***Figure 52 Example of a Binary Search Tree (BST)*** -- -- -- -- 1. Node to be deleted is leaf: Simply remove from the tree. ![](media/image24.jpeg) 2. Node to be deleted has only one child: Copy the child to the node and delete the child. 3. Node to be deleted has two children: Find inorder successor of the node. Copy contents of the inorder successor to the node and delete the inorder successor. Note that inorder predecessor can also be used. ![](media/image26.jpeg) 1. Ordering property 2. Structural property a. In a complete tree Height = O(logN). As we\'ll see, our Heap operations are O(Height), so by restricting ourselves to complete trees, we get O(logN) operations. b. complete trees have certain advantages over general binary trees in some implementations. - **Heapify** → Process to rearrange the heap in order to maintain heap-property. - **Find-max (or Find-min)** → find a maximum item of a max-heap, or a minimum item of a min-heap, respectively. - **Insertion** → Add a new item in the heap. - **Deletion** → Delete an item from the heap. - **Extract Min-Max** → Returning and deleting the maximum or minimum element in max- heap and min-heap respectively. - **up\_heapify()** → It follows the **bottom-up approach**. In this, we check if the nodes are following heap property by going in the direction of rootNode and if nodes are not following the heap property, we do certain operations to let the tree follows the heap property. - **down\_heapify()** → It follows the **top-down approach**. In this, we check if the nodes are following heap property by going in the direction of the leaf nodes and if nodes are not following the heap property, we do certain operations to let the tree follows the heap property. - Insert the new element at the end of the heap. - Since the newly inserted element can distort the properties of the Heap. So, we need to perform up\_heapify() operation, in order to keep the properties of the heap in a bottom-up approach. ![](media/image28.jpeg)![](media/image30.jpeg) a. \(b) (c) - Replace the element to be deleted by the last element in the heap. - Delete the last item from the heap. - Now, the last element is placed at some position in heap, it may not follow the property of the heap, so we need to perform down\_heapify() operation in order to maintain heap structure. The down\_heapify() operation does the heapify in the top-bottom approach. ![](media/image32.jpeg) - Introduction to Trees By CS Dojo ([https://youtu.be/1-l\_UOFi1Xw]) - Introduction to Trees By mycodeschool ([https://youtu.be/qH6yxkw0u78]) - Binary Search Tree By mycodeschool ([https://youtu.be/pYT9F8\_LFTM]) - Deleting a node in a BST By mycodeschool ([https://youtu.be/gcULXE7ViZw]) - Binary tree traversal - breadth-first and depth-first strategies By mycodeschool ([https://youtu.be/9RHO6jU\--GU]) - Binary tree traversal -- Preorder, Inorder, Postorder By mycodeschool ([https://youtu.be/gm8DUJJhmY4]) - Binary Expression Trees By Miran Fattah ([https://youtu.be/\_LxbhLNRZkI]) - Introduction to Binary Heaps By Algorithms with Attitude ([https://youtu.be/WCm3TqScBM8]) - Heaps (MinHeaps) By HackerRank ([https://youtu.be/t0Cq6tVNRBA]) - Chapter 8 Trees - ![](media/image34.png)Chapter 4 -- Trees (Sec. 4.1 -- 4.3, pp. 121-144) - Section. 4.6 Tree Traversals (pp. 166-168) - Section 9.3 Heaps (pp. 370-384) 1. What is a tree? 2. What is a binary tree? How is it different from a tree? 3. What is an advantage of a binary search tree over a binary tree? 4. How to check if a given Binary Tree is BST or not? 5. What is a heap? 6. Describe the difference between a Max Heap over a Min Heap. Activities/Assessments: ----------------------- 1. The following questions refer to the tree below. a. Which node is the root? b. What are the internal nodes? c. How many descendants does node cs016/ have? d. How many ancestors does node cs016/ have? e. What are the siblings of node homeworks/? f. Which nodes are in the subtree rooted at node projects/? g. What is the depth of node papers/? h. What is the height of the tree? 2. Draw a sketch of the logical representation for the following tree. 3. Draw a sketch of the logical representation for the following tree. ![](media/image36.png) 4. Draw a sketch of the logical representation for the following tree. 5. Draw a sketch of the logical representation for the following tree. ![](media/image38.png) 6. Draw the binary tree representation of the following arithmetic expression: 7. Draw the structure of a binary search tree i. after these values have been inserted: **19, 34, 23, 16, 54, 89, 24, 29, 15, 61, 27.** j. after two delete operations (deleting the root) 8. Draw the structure of a binary heap (MinHeap): k. after these priorities have been inserted: **19, 34, 23, 16, 54, 89, 24, 29, 15, 61, 27.** l. after two deleteMin operations (using the tree in (a)). 1. Write a program that implements BINARY TREE TRAVERSALS using preorder traversal, inorder traversal, and postorder traversal to produce a linear arrangement of the nodes. Program input consists of the root nodes and its left and right subtrees. 1. the left and right subtrees of each node (show null if none) 2. the root of the tree 3. preorder traversal 4. inorder traversal 5. postorder traversal Example -- -- -- -- -- -- Unit 8 -- Balanced Trees ======================== Overview: --------- Module Objectives: ------------------ - Describe what a balanced tree is - Discuss AVL tree and its characteristics - Discuss B-tree and its characteristics Course Materials: ----------------- 1. A binary tree is **height-balanced** if for each node the heights of its subtrees differ by at most 1. (Remember, the height of an empty subtree is -1). 2. A binary tree is **weight-balanced** if for each node the numbers of inner nodes in its left subtree and right subtree differ by at most 1. ![](media/image41.png) ***Figure 62 Height balanced tree*** ![](media/image43.png)![](media/image45.png) a. C and D are leaves. Their subtrees are all height 0 so C and D are both perfectly balanced. Having finished D we can compute the heights of B\'s subtrees. b. B is not perfectly balanced, but the heights of its subtrees differ only by 1, so B is regarded as height balanced. Now we can compute the balance of A. c. Like B, A\'s two subtrees differ by 1 in height. We have now looked at every node; every node is height-balanced, so the tree as a whole is considered to be height-balanced. 1. Left Rotation 2. Right Rotation - b becomes the new root. - a takes ownership of b\'s left child as its right child, or in this case, null. - b takes ownership of a as its left child. - b becomes the new root. - c takes ownership of b\'s right child, as its left child. In this case, that value is null. - b takes ownership of c, as it\'s right child. ![](media/image58.png) ![](media/image63.png)![](media/image66.png) ![](media/image69.png) 1. If X \< V1, recursively search for X in V1\'s left subtree. 2. If X \> Vk, recursively search for X in Vk\'s right subtree. 3. If X=Vi, for some i, then we are done (X has been found). 4. the only remaining possibility is that, for some i, Vi \< X \< V(i+1). In this case recursively search for X in the subtree that is in between Vi and V(i+1). 1. It is perfectly balanced: every leaf node is at the same depth. 2. Every node, except perhaps the root, is at least half-full, i.e. contains M/2 or more values (of course, it cannot contain more than M-1 values). The root may have any number of values (1 to M-1). ![](media/image70.png) 1. Using the SEARCH procedure for M-way trees find the leaf node to which X should be added. 2. Add X to this node in the appropriate place among the values already there. Being a leaf node there are no subtrees to worry about. 3. If there are M-1 or fewer values in the node after adding X, then we are finished. ![](media/image72.png) - Left = \[ 2 3 \] - Middle = 5 - Right = \[ 6 7 \] ![](media/image74.png) - left = \[ 17 21 \] - Middle = 22 - Right = \[ 44 45 \] - Left = \[ 55 66 \] - Middle = 67 - Right = \[ 68 70 \] - Left = \[ 5 10 \] (along with their children) - Middle = 22 - Right = \[ 50 67 \] (along with their children) ![](media/image77.png) 1. Remove X from the current node. Being a leaf node there are no subtrees to worry about. 2. Removing X might cause the node containing it to have too few values. - The parent node contributes 1 value. - The node that underflowed contributes exactly (M-1)/2-1 values. - The neighboring node contributes somewhere between (M-1)/2 and (M-1) values. - Left = \[ 7 10 \] - Middle = 17 - Right = \[ 22 44 45 \] ![](media/image78.png) ![](media/image80.png) - AVL Trees & Rotations (Self-Balancing Binary Search Trees) By Back To Back SWE ([https://youtu.be/vRwi\_UcZGjU]) - Introduction to B-Trees By David Taylog ([https://youtu.be/I22wEC1tTGo]) - File Indexes with B-Trees By Jacob Schrum ([https://youtu.be/2AD13EPQAms]) - Section 11.2 Balanced Search Trees (pp. 472-478) - Section 11.3 AVL Trees (pp. 479-487) - Section 4.4 AVL Trees (pp. 144-157) - Section 11.5 (2,4) Trees (pp. 500-509) - Section 4.7 B-Trees (pp. 168-173) 1. What is an AVL tree? How is it different from a BST? 2. Why is it important to keep the tree height balanced? 3. What is a B-Tree? How is it different from an AVL tree? Activities/Assessments: ----------------------- 1. Show the result of inserting 3, 1, 4, 6, 9, 2, 5, 7 into an initially empty binary search tree. 2. Given the tree in (1), show the result of deleting the root. 3. Draw the AVL Tree of the following. NOTE: Show the tree before and after rotations. C O M P U T I N G 4. Using the AVL in (3), add S E A R 5. Draw the AVL Tree of the following. NOTE: Show the tree before and after rotations. G O L D S T A R 6. Using the AVL in (5), delete S L O T 7. What will be the resulting B- Trees (2-3 trees) for the following items? a. Insert 112 b. Insert 1 c. Delete 73 d. Delete 31 Unit 9 - Graphs =============== Overview: --------- Module Objectives: ------------------ - **Discuss and describe a graph** - **Discuss the different representations of graphs** - **Discuss searching with graphs** Course Materials: ----------------- ![](media/image82.png) - **Nodes:** These are the most important components in any graph. Nodes are entities whose relationships are expressed using edges. If a graph comprises 2 nodes A and B and an undirected edge between them, then it expresses a bi-directional relationship between the nodes and edge. - **Edges:** Edges are the components that are used to represent the relationships between various nodes in a graph. An edge between two nodes expresses a one-way or two-way relationship between the nodes. - **Undirected:** An undirected graph is a graph in which all the edges are bi-directional - **Directed:** A directed graph is a graph in which all the edges are uni-directional i.e. the edges point in a single direction. - **Weighted:** In a weighted graph, each edge is assigned a weight or cost. ***Figure 86 Directed Graph (unweighted at left; weighted at right)*** - **Cyclic:** A graph is cyclic if the graph comprises a path that starts from a vertex and ends at the same vertex. That path is called a cycle. An acyclic graph is a graph that has no cycle. - (v0, e1, v1, e2, v2,..,vn-1, en, vn) \-- alternating vertices and edges - (v0, v1, v2,..,vn-1, vn) \-- vertices only ![](media/image84.png) - **Adjacency Matrix** - **Adjacency List** ![](media/image86.png) - First move horizontally and visit all the nodes of the current layer - Move to the next layer - Pick a starting node and push all its adjacent nodes into a stack. - Pop a node from stack to select the next node to visit and push all its adjacent nodes into a stack. - Introduction to Graphs By mycodeschool ([https://youtu.be/gXgEDyodOJU]) - Graph Data Structure (Representations) By freeCodeCamp.org ([https://youtu.be/DBRW8nwZV-g]) - Graph Search (Depth First Search and Breadth First Search) By HackerRank (https://youtu.be/zaBhtODEL0w) - Graph Algorithms (pp. 379-382) - Graph Algorithms (Sec. 4.1 & 4.2, pp. 612-629) - Section 14.3 Graph Traversals (pp. 630-642) 1. What is a graph? 2. How are graphs similar to trees? How are they different? 3. If two graph nodes are connected by an edge, what are they called? 4. How can you tell if two graph nodes are adjacent? Activities/Assessments: ----------------------- 1. Explain how an adjacency matrix is used to represent a graph. 2. Use the following description of an undirected graph and draw the graph: 3. Described the graph pictured below using the formal graph notation. 4. Draw an adjacency matrix representation of the undirected graph shown in (3). 5. Draw an adjacency list representation of the undirected graph shown in (3). 6. Draw the graph for the following adjacency matrix. 7. Let G be an undirected graph whose vertices are the integers 1 through 8, and let the adjacent vertices of each vertex be given by the table below: -- -- -- -- a. Draw G. b. Give the sequence of vertices of G visited using a DFS traversal starting at vertex 1. c. Give the sequence of vertices visited using a BFS traversal starting at vertex 1. Unit 10 -- Hashing and Collision ================================ Overview: --------- Module Objectives: ------------------ - **Discuss and differentiate the different types of file organization** - **Discuss how the concept of mapping is made** - **Discuss the different addressing techniques** - **Demonstrate the concept of address calculation techniques in** Course Materials: ----------------- 1. File Activity - specifies percent of actual records which proceed in a single run. 2. File Volatility - addresses the properties of record changes. It helps to increase the efficiency of disk design than tape. 1. Sequential access file organization 2. Direct access file organization 3. Indexed sequential access file organization 1. It is simple to program and easy to design. 2. Sequential file is best use if storage space. 1. Sequential file is time consuming process. 2. It has high data redundancy. 3. Random searching is not possible. 1. Direct access file helps in online transaction processing system (OLTP) like online railway reservation system. 2. In direct access file, sorting of the records are not required. 3. It accesses the desired records immediately. 4. It updates several files quickly. 5. It has better control over record allocation. 1. Direct access file does not provide backup facility. 2. It is expensive. 3. It has less storage space as compared to sequential file. 1. In indexed sequential access file, sequential file and random file access is possible. 2. It accesses the records very fast if the index table is properly organized. 3. The records can be inserted in the middle of the file. 4. It provides quick access for sequential and direct processing. 5. It reduces the degree of the sequential search. 1. Indexed sequential access file requires unique keys and periodic reorganization. 2. Indexed sequential access file takes longer time to search the index for the data access or retrieval. 3. It requires more storage space. 4. It is expensive because it requires special software. 5. It is less efficient in the use of storage space as compared to other file organizations. ![](media/image90.jpeg) ***Figure 91 Basic relative file organization*** 1. Direct Mapping a. absolute addressing b. relative addressing 2. Directory Look up Techniques 3. Address Calculation Techniques 1. [cylinder number], [surface number], and [record number], if cylinder addressing is used. 2. [sector number] and [record number], if sector addressing is used. ![](media/image92.jpeg) There are two advantages to this absolute addressing scheme: 1. Mapping function R is very simple; and 2. Given a record's key value, no processing time is required to determine the record's location on secondary storage. 1. Logical and physical considerations are not independent with absolute addressing. This means that the user must know how the records are stored physically. 2. Users generally do not supply the appropriate types of key values. 3. Absolute addresses are device dependent. As the device upon which the file resides is updated or change, more likely that key values would also need to be changed. 4. Absolute addresses are address-space-dependent. As the relative file is reorganize for purposes of enlarging the address space, it is likely that key values would need to be changed. ![](media/image94.jpeg) 1. A record's location can be determined with essentially no processing, once the key value is found in the directory. 2. Keys can be represented in intuitively understandable forms, since they are translated "internally" to addresses. 3. Logical-physical independence is achieved, because the key values are address- space-independent. If the relative file is reorganized, address entries in the directory must be changed, but the key values will not be affected. 1. Prime-Number Division Remainder 2. Digit Extraction 3. Folding 4. Radix Conversion 5. Mid-Square ![](media/image97.jpeg) 𝑙𝑜𝑎𝑑 𝑓𝑎𝑐𝑡𝑜𝑟 = 𝑛𝑢𝑚𝑏𝑒𝑟 𝑜𝑓 𝑘𝑒𝑦 𝑣𝑎𝑙𝑢𝑒𝑠 - Prime-Number Division Remainder - Digit Extraction ![](media/image100.jpeg) - Folding - Radix Conversion - Mid-Square Other Hashing Schemes 1. Quotient Reduction 2. Remainder Reduction 3. Associated Value Hashing 4. Reciprocal Hashing Approaches to the Problem of Collisions 1. Open addressing a. Linear probing b. Two pass file creation with hashing c. Separate overflow area d. Double hashing 2. Chaining e. Synonym chaining f. Bucket addressing ![](media/image103.jpeg) - Understanding File Structure By Papercoinagevideo ([https://youtu.be/anCLeogSTE0]) - What is Hashing? By Lisk ([https://youtu.be/2BldESGZKB8]) - Introduction to Hashing By Kindson The TechPro ([https://youtu.be/2bAeF2NhO-g]) - Hash Table and Hash Functions By Computer Science ([https://youtu.be/KyUTuwz\_b7Q]) - Chapter 5 -- Hashing - Section 10.2 Hash Tables (pp. 410-416) - Section 3.4 Hash Tables (pp.458- 477) - Section 10.2.2 Collision Handling Schemes (pp. 417-421) 1. What are the different types of accessing a file? 2. What is hashing? How can it help in solving the problem with file access? 3. What are the different ways to implement hashing? 4. What is a collision? How do we handle the problem of collision? Activities/Assessments: ----------------------- 1. Complete the table below -- -- -- -- -- -- -- -- -- -- -- -- 2. From the table above, how many synonyms are there? 3. Given a hash table with **m=11** entries and the following hash function **h1** and step function ### h1(key) = key mod m a. Chaining with h1 ⟹ ***h(k) = h1(k)*** b. Linear-Probing with h1 ⟹ ***h(k,i) = (h1(k)+i) mod m*** c. Double-Hashing with h1 as the hash function and h2 as the step function ### ⟹ h(k,i) = (h1(k) + ih2(k)) mod m -- -- -- -- -- -- -- -- ![](media/image107.jpeg)