Data Structures PDF
Document Details
Uploaded by Deleted User
Tags
Related
- Data Structures & Algorithms Lecture Notes 1 PDF
- Data Structures and Algorithms Study Guide (Sorsogon State University)
- Introduction to Java Programming and Data Structures (2019) by Y. Daniel Liang - PDF
- II-Sem Computer Science Syllabus PDF
- Data Structures and Algorithms with Python and C++ PDF
- Data Structures and Algorithms(All Sessions) PDF
Summary
These notes provide an overview of data structures and their types, including linear structures (arrays, stacks, queues, linked lists) and non-linear structures (trees, graphs). The document also summarizes various operations and properties of each type.
Full Transcript
# 05208 ## Data Structure Data is a set of values. In case of data structure, it can be a group of items. * **Field:** A single elementary unit of information that represents an attribute of an entity, e.g. name, age, etc. * **File:** A collection of records for an entity. * **Record:** A collection...
# 05208 ## Data Structure Data is a set of values. In case of data structure, it can be a group of items. * **Field:** A single elementary unit of information that represents an attribute of an entity, e.g. name, age, etc. * **File:** A collection of records for an entity. * **Record:** A collection of all files. **Complete data structure types:** * **Logical or mathematical structure:** * **Quantitative Structure Analysis:** * **Implementation** ## Array An array is a variable that can store multiple values. * **S/N:** is an array. ## Types of Data Structures 1. **Linear array:** A one-dimensional array: linear, list, formula or 1-dimensional array. * A[1], A[2], A[3], ... A[n] 2. **Two-dimensional::** A collection of similar multiple values - name. * A[1,2], A[2,2], A[3,2], ... A[m,n] **Data item** refers to a single set of values: * divided into sub-items are called **group items** * those that are not divided are called **elementary items** For example: * **name** of a student. * **entity:** _student_ * **field:** _DC_ * **record:** _2001_ * **file:** _student_ * **reg no** serves as the **primary key** * **Name, Age, Address:** attributes **Primary Key** must be different for every unique group item or name. * **Group Item:** is a name. * **Link List:** one type of data structure. ## 2nd Approach | Customer | Link | Sales p | Pointer | |---|---|---|---| | Adams | 5 | John | 3 | | Brown | 4 | Day | 2 | | Clark | 6 | Smith | 1 | | Drew | 7 | | | | Evans | 8 | | | | Farmer | 0 | | | | Geller | 9 | | | | Haill | 0 | | | | Infaeld | 0 | | | | Sales per | Pointer | |---|---| | John | 3, 7, 6 | | Day | 2, 4, 7, 9 | | Smith | 8, 5, 7, 8 | ## Stack A stack is also called **LIFO Sys**. It is a linear list where insertion and deletion can take place only at one end called the **top**. ## Queue A queue is also called **FIFO Sys**. It is a linear list where deletion can take place only at one end of the list. ## Tree A tree is a single entity. - Employee - Success No - Name - First - Last - 3rd - Age - Salary - Address - Street - Area - Zip - State **Data item** refers to a single unit of value divided into sub-items are called **group items**. Those that are not called are **elementary items**. **Collections of data are organized into File records** and fields. ## Entity An entity is something that has certain attributes or properties which may be assigned values. It may be numeric or non-numeric. **Records** can be classified according to their length: * **Fixed-length records:** all records contain the same data items with the same amount of space assigned to each data item. * **Variable-length records:** may contain different lengths. For example student records since a student could take a number of courses. ## One-dimensional array A one-dimensional array is a list of finite numbers of similar data elements referenced respectively by a set of consecutive numbers usually 1,2, 3, ... n. * Example using parenthesis notation: A(1), A(2), A(3), ... A(n) * Example using bracket notation: A[1], A[2], A[3], ... A[n] **Regardless of the notation, the number in the A[] is called a subscript**. Such arrays are called matrices in mathematics and tables in business applications. ## Review of Data Structures * **Static-Linear Data Structure:** * **Array** * **Linked List** * **Dynamic:** * **Stack** * **Queue** * **Non-Linear Data Structures:** * **Trees:** * **Binary tree** * **Binary search tree** * **AVL tree** * **Adelson-Velskii and Landis (AVL) tree** * **Red-black tree** * **B-tree** * **Graphs:** * **Hashing, Heaps** ## Overview of Data Structures Data structures are one of the most powerful tools available to any business organizations, not only to survive but raise to the top in today's competitive and challenging world. However, this data brings some demand including a need to keep the information organized and easily accessible. But the data in the world won't help a business. It can't reach the data and turn it into actionable assets. This situation leads to what is termed as "data structures". ## Definition of Data Structures Data structures are a specific way of organizing data in a specific specialized format on a computer so that information can be organized, processed, stored, and retrieved quickly and efficiently. ## X-tris of Data Structures X-tris of data structures are the systematic ways used to organize the data. * **Linear:** This arranges data in sequential or non-sequential order - arrays, stacks, trees, graphs, and queues. * **Static** and **dynamic** have fixed formats and sizes along with memory locations. * While **dynamic** have no fixed formats and sizes. * **Time Complexity:** The time factor should be very functional. The running time or execution time of a program should be limited. The less the running time, the execution time, the more accurate devices. * **Space Complexity:** The space in a device should be managed carefully. The memory usage should be used properly. The space should be less occupied which indicates the properties of the device have finally. * **Correctness:** Each data must definitively have an interface. Interface depicts the set of data structure data structure should be implemented accurately in the data structure. ## Types of Data Structures ### Linear #### Array An Array is a static linear data structure for the collection/formation of data items that are of the same data types, stored together in adjoining memory locations. Each data item is known as an element while the reference to each data item is known as "index": ``` 0 1 2 3 4 5 7 8 9 10 11 12 -> Index -> Syntax //elements to locate to num (2,3,7)=10 ``` #### **nums[1] + nums[4] = 8+11=19** ``` 0 1 2 3 4 5 "A" 1 "S" "H" "A" "T" nums[0] + nums[1] + nums[2] + nums[3] + nums[4] + nums[5] = A + 1 + S + H + A + T = AISHAT num[0 to 5] = ``` **How to declare an element in data space of nums[7]** When you declare numeric in your memory you cannot put string. * **Inserting element/data in memory location:** * **How to create memory location using array:** * **How to insert data into memory location created:** * **Deleting element in an array:** * **Modification of element in an array:** **i nums:** * for creating memory location. * for inserting elements in the memory. * for deleting elements from the memory. * for modification. **Application of Array/When do we apply arrays:** 1. **For sorting info in a linear fashion,** e.g., student reg no. 2. Suitable for applications that require frequent searching e.g., browser. ## Dynamic Linear Data Structures ### Stack A stack is a linear data structure that accepts different data types. The basic operation of a stack: * **push:** adding an element or inputting it into the stack. * **pop:** for removing an element from the stack. * **peek:** for viewing an element in the stack. **Applications of stack:** An example of stack applications: - Undo is also called an editor. - Back tracking (parsing): ``` A / \ B C / \ / \ D E G H / \ F I DFS (Depth First Search) ``` **Recursion/Start from (LIFO)** e.g. [{()}] - Invalid ### Queue A queue is another dynamic linear data structure that deals with different data types which follow a specific order while performing various operations. A queue is defined as a linear data structure that is open at both ends, and operations are performed in FIFO order. We define a queue to be a list in which all additions to the list are made at one end, and all deletions are made at the other end. e.g. ``` front/had / 2/7/9/18/20 \ 3 Dequeue front Back/tail 11 enqueu ``` **Example of queue:** queueing at messages, WhatsApp, SMS. Each cell is **4 bytes** in memory, 4 x8 bits, each cell **Operations in queue:** * **enque( )**: is used to store the elements in the queue. * **deque**: is used to remove/delete elements in the queue. * **peek**: for viewing elements in the queue. * **Is Full()**: is used to check if the queue is full. * **IsEmpty()**: is used to check if the queue is empty. **Applications of Queue:** 1. **In operating systems** e.g., Linux, Windows etc., when you use a keyboard to type each alphabet is stored in queue, and it appears on the screen after a little delay. 2. **Printers:** To print documents, there would be on queue to maintain orders of pages which solves the problem of deadlock. 3. **In software application:** when we place an order, it would be in a queue of first come first serve. ## Linked List A linked list is a dynamic linear data structure whose size increases as you add elements and decreases as you remove elements from the list. The elements are stored in containers. The list holds the link to the first container. The linked list class is a collection which can contain many objects of the same data type, just like arrays, but both have interfaces from where you can add, change, remove, or clear the list in the same way. Linked lists are a link that takes you from one object to another object in a list. **Singly Linked List:** ``` head | v 5 ->1 -> 9 next() ``` We use "next()" to navigate the elements. You can add another element in a linked list but not in a stack or queue because elements in a linked list follow certain orders. **Doubly Linked List:** ``` ^ | Node | v 5 <-> 1 <-> 9 / \ 17 ``` **Applications Area:** * When filling a form (filling a form). * Navigating between objects. ## Tree A tree is a non-linear data structure that consists of nodes and connections between them. Nodes are connected by edges. Trees are used to represent hierarchical relationships and are an essential element for efficient data management: * **Node:** A node represents an object. * **Edge:** An edge is a connection between two nodes. * **Root Node:** The topmost node of a tree. * **Child Node:** A node directly connected to another node below it. * **Left Node:** A child node to the left of a parent node. ``` A | |---Edge B | | C | | D | | E | 1 2 3 4 5 6 7 Sequential = Linear root node / \ / \ / \ child node child node left node / / / | | | ``` ### Advantages of Tree Data Structures * The data structure is non-linear, making it easier to quickly access data elements. ### Tree Terminologies * **Node:** An entity that contains keys and pointers to the child nodes. * **Root node:** The node at the top of the tree. * **Parent node:** A node that has a child node. * **Child node:** A node that is directly connected to another node below it. * **Leaf node:** A node that has no child nodes. * **Edges:** The lines that connect nodes in a tree. * **Internal node:** A node that has at least one child node. * **Node degree:** The total number of children of a node. #### Example: ``` A / \ B C / \ / \ D E F G / \ H I Node degree of: A=2, B=2, C=2, E=1, D=0, F=0, G=0 ``` ### Terminologies of TDS (Tree data structure): * **Node level:** The root node is said to be at level 0. The children of the root node are at level 1 and the children of the second level are at level 2, and so on. * **Internal node:** A, B, and C are internal nodes. * **Node height:** The total number of edges from a leaf node to a particular node to the longest path is called the height of that node. * **Length of a path:** The length of a path is the number of edges that occur or compose the path. * **Node depth:** The total number of edges from the root node to a particular node is called the depth of that node. ### Types of Trees: * **Binary tree:** A tree data structure in which each parent node at most has two (2) children * **Binary Search Tree (BST):** A tree data structure in which each node has a maximum of two (2) children. All nodes of the left subtree are less than the root node. All nodes of the right subtree are more than the root node. * **Input K=75** * **Bf=1-1=0** * **Left >> root node = 50 >> 100 (Not possible)** * **Right >> root node = 150 >> 100** * We cannot insert 75 in the given BST. ``` 50 (100) / \ / \ 50 60 / \ / \ 23 20 ``` * **AVL Tree:** This got its name after inventors Adelson-Velskii and Landis. An AVL tree is a self-balancing tree in which each node maintains a balance factor whose value is either -1, 0, or 1. * **Balance Factor:** Height of left subtree – height of right subtree. (and vice versa). - **ht-3** - **ht-2** * **Whether it is Balanced or not:** - **B = hOL - hOR** - **B = 3 - 2 = 1** - **If B is greater than 1 or less than -1** , the tree is treated as **unbalanced**, and then we need to perform a **balancing operation** (rotations) to restore the balance of the tree. #### AVL Tree Example: ``` ht=1 (60) / / 0 (BF=1-0=1) ht=0 hr=0 B=0-0=0 (Un balanced ) / / 50 60 / ht=0 ht=0 BF=2-0=2 (Tree satisfied) 25 ``` ``` 1-2=-1 (Not balanced) ht=1 ht= 0 BF=1-3=2 (Un balanced) / \ / / / / 50 60 / \ / / \ / \ 25 80 / \ 20 20 / / 60 ``` * **B-tree:** This is a special kind of self-balancing search tree in which each node can contain more than one key, and can have more than two (2) children. A B tree is also known as a **height-balanced m-way tree**. ``` 3 / \ 1 6 / \ / \ 2 4 5 7 ``` * **Red-black tree:** This is another type of self-balancing binary search tree with additional properties to ensure that a tree remains balanced. It uses color-coding to maintain balance, allowing for efficient operations in logarithmic time complexity. #### Properties of a Red Black Tree * **The root node must be black.** * **The children of a red-colored node must be black. (i.e. there should not be two consecutive red nodes)** * **In all the paths of the tree, there should be the same number of black colored nodes.** * **Every new node must be inserted with red color.** * **Every leaf (e, NULL node) must be colored black** #### Example of a Red Black Tree ``` 3 (black) / \ 2 4 (red) | / \ 1 5 6 / / \ null null 7 (red) / null ``` #### The red-black tree satisfies all of the properties: - All red-black trees are binary search trees, but not all binary search trees are red-black trees - The possible result of adding a new entry e to a two-node red-black tree is as follows: - **Case 1** **The tree is balanced** **Before adding e to the red-black tree - equivalent 2-4 tree** ``` x / y ``` **After adding e to the red-black tree** ``` x / \ y e ``` **After adding e to the 2-4 tree** ``` x / \ y e ``` **The action is none because the new entry is red.** **Balancing Red-Black Tree Examples:** ``` Red-black tree equivalent 2-4 tree ---------------------------------------------------------------- - x (red) x y (16 Dec 24) / \ null y (black) / \ null null Action after adding element: None * x (red) x y (None) / \ null null Action after adding element: None * x (red) x y e / \ / \ y null (black) y e / \ / \ null null null null Action after adding element: Single flip to the left hand side * x (red) x y e / \ / \ null y (black) y e / \ / \ null null null null Action after adding element: double flip to the left and right (unbalanced, balanced) This step is called Color flip! NOTE - Adding an entry in to a Red Black tree results in a new Red leaf. The color of this leaf can change later when other entries are added or removed. ``` ## Tree traversal Tree traversal is a process of moving from one (1) node to another node to search for a particular element in a specific order. For mobile agents, Tree Traversal helps to visit the required node in the tree to perform a specific operation. Tree traversal can be performed in four ways: 1. **Pre-order traversal:** * Visit the root node. * Visit all nodes from the left side. * Visit the nodes from the right side. ``` A / \ B C / \ / \ D E F G / \ H I ``` 2. **In-order traversal:** * Visit all nodes from the left side. * Visit the root node. * Visit all nodes from the right side. ``` A / \ B C / \ / \ D E F G / \ H I ``` 3. **Post-order traversal:** * Visit all nodes from the left side. * Visit all nodes from the right side. * Visit the root node. ``` A / \ B C / \ / \ D E F G / \ H I ``` 4. **Level-order traversal:** * Visit level 0 node (root node). * Visit level 1 nodes. * Visit level 2 nodes, and so on. ``` A / \ B C / \ / \ D E F G / \ H I LI ---------- A ---------- LI ---------- B C ---------- L3 ---------- D E F G ---------- H I ---------- ``` ## Theorem Let T be a non-empty full binary tree. Then: * I = Number of Internal Nodes. * a. T has internal nodes, I+1. * b. T has total of N nodes, N = 2I+1. * c. T has internal nodes, N = (N-1)/2. * d. A binary tree allows for a max of two child nodes per parent node. * e. The max number of nodes in a binary tree, Nmax: 2^t. * f. The total number of children of a given node is called the degree of that node and denoted by nd(n). * g. To calculate the number of nodes with a given degree(nd), nd + N = N^2 +N+4 /2.