8239_ECAP770_ADVANCED_DATA_STRUCTURES.pdf
Document Details
Uploaded by IdyllicAcer
Lovely Professional University
Tags
Related
- Data Structures & Algorithms Lecture Notes 1 PDF
- Data Structures and Algorithms Study Guide (Sorsogon State University)
- DATA STRUCTURES AND ALGORITHMS-2015 Edition.pdf
- 1 Introduction to Data Structures and Algorithms.pdf
- DSA Notes - Data Structures and Algorithms PDF
- Data Structures and Algorithms with JavaScript PDF
Full Transcript
Advanced Data Structures ECAP770 Edited by Balraj Kumar Advanced Data Structures Edited By: Balraj Kumar CONTENT Unit 1: Introduction 1 Ashwani Kumar, Lovely Professio...
Advanced Data Structures ECAP770 Edited by Balraj Kumar Advanced Data Structures Edited By: Balraj Kumar CONTENT Unit 1: Introduction 1 Ashwani Kumar, Lovely Professional University Unit 2: Arrays vs Linked Lists 18 Ashwani Kumar, Lovely Professional University Unit 3: Stacks 43 Ashwani Kumar, Lovely Professional University Unit 4: Queues 57 Ashwani Kumar, Lovely Professional University Unit 5: Search Trees 73 Ashwani Kumar, Lovely Professional University Unit 6: Tree Data Structure 1 87 Ashwani Kumar, Lovely Professional University Unit 7: Tree Data Structure 2 102 Ashwani Kumar, Lovely Professional University Unit 8: Heaps 125 Ashwani Kumar, Lovely Professional University Unit 9: More on Heaps 139 Ashwani Kumar, Lovely Professional University Unit 10: Graphs 163 Ashwani Kumar, Lovely Professional University Unit 11: More on Graphs 180 Ashwani Kumar, Lovely Professional University Unit 12: Hashing Techniques 202 Ashwani Kumar, Lovely Professional University Unit 13: Collision Resolution 215 Ashwani Kumar, Lovely Professional University Unit 14: More on Hashing 229 Ashwani Kumar, Lovely Professional University Notes Ashwani Kumar, Lovely Professional University Unit 01: Introduction Unit 01: Introduction CONTENTS Objectives Introduction 1.1 Data Structure 1.2 Data Structure Operations 1.3 Abstract Data Type 1.4 Algorithm 1.5 Characteristics of an Algorithm 1.6 Types of Algorithms 1.7 Algorithm Complexity 1.8 Asymptotic Notations Summary Keywords Self Assessment Answers for Self Assessment Review Question Further Readings Objectives After studying this unit, you will be able to: Describe basic concepts of data structure Learn Algorithm and its complexity Know Abstract data type Data structure types Introduction The static representation of a linear ordered list using an array wastes resources and, in some situations, causes overflows. We no longer want to pre-allocate memory to any linear list; instead, we want to allocate memory to elements as they are added to the list. This necessitates memory allocation that is dynamic. Semantically data can exist in either of the two forms – atomic or structured. In most of the programming problems data to be read, processed and written are often related to each other. Data items are related in a variety of different ways. Whereas the basic data types such as integers, characters etc. can be directly created and manipulated in a programming language, the responsibility of creating the structured type data items remains with the programmers themselves. Accordingly, programming languages provide mechanism to create and manipulate structured data items. A data structure is a type of storage that is used to organize and store data. It is a method of organizing data on a computer so that it may be easily accessible and modified. Lovely Professional University 1 Notes Advanced Data Structures It's critical to choose the correct data format for your project based on your requirements and project. If you wish to store data sequentially in memory, for example, you can use the Array data structure. 1.1 Data Structure A data structure is a set of data values along with the relationship between the data values. Since, the operations that can be performed on the data values depend on what kind of relationships exists among them, we can specify the relationship amongst the data values by specifying the operations permitted on the data values. Therefore, we can say that a data structure is a set of values along with the set of operations permitted on them. It is also required to specify the semantics of the operations permitted on the data values, and this is done by using a set of axioms, which describes how these operations work, and therefore a data structure is made of: 1. A set of data values. 2. A set of functions specifying the operations permitted on the data values. 3. A set of axioms describing how these operations work. Hence, we conclude that a data structure is a triple (D,F,A), where 1. D is a set of data values 2. F is a set of functions 3. A is a set of axioms A triple (D, F, A) is referred to as an abstract data structure because it does not tell anything about its actual implementation. It does not tell anything about how these values will be physically represented in the computer memory and these functions will be actually implemented. Therefore, every abstract data structure is required to be implemented, and the implementation of an abstract data structure requires mapping of the abstract data structure to be implemented into the data structure supported by the computer. For example, if the abstract data structure to be implemented is integer, then it can be implemented by mapping into bits which is a data structure supported by hardware. This requires that every integer data value is to be represented using suitable bit patterns and expressing the operations on integer data values in terms of operations for manipulating bits. Data Structure mainly two types: 1. Linear type data structure 2. Non-linear type data structure Linear data structure: A linear data structure traverses the data elements sequentially, in which onlyone data element can directly be reached. Ex: Arrays, Linked Lists Arrays: An array is a collection of similar type of data items and each data item is called an element of the array. The data type of the element may be any valid data type like char, int, float or double. — The individual elements of the array age are: — age, age, age, age, age, age. Linked List: Linked list is a linear data structure which is used to maintain a list in the memory. It can be seen as the collection of nodes stored at non-contiguous memory locations. Each node of the list contains a pointer to its adjacent node Stack: Stack is a linear list in which insertion and deletions are allowed only at one end, called top. A stack is an abstract data type, can be implemented in most of the programming languages. It is named as stack because it behaves like a real-world stack, for example: - piles of plates or deck of cards etc. Queue is a linear list in which element can be inserted only at one end called rear and deleted only at other end called front. It is abstract data structure, similar to stack. It is open at both end therefore if follows first-in-first- out (FIFO) technique for storing the data items. 2 Lovely Professional University Notes Unit 01: Introduction Non-linear data structure: Every data item is attached to several other data items in a way that is specific for reflecting relationships. The data items are not arranged in a sequential structure. Ex: Trees, Graphs. Trees: Trees are multilevel data structures with a hierarchical relationship among its elements known as nodes. Graphs: Graphs can be defined as the pictorial representation of the set of elements (represented by vertices) connected by the links known as edges Basic Terminology Data: Data can be defined as an elementary value or the collection of values, for example, student's name and its id are the data about the student. Group Items: Data items which have subordinate data items are called Group item, for example, name of a student can have first name and the last name. Record: Record can be defined as the collection of various data items, for example, if we talk about the student entity, then its name, address, course and marks can be grouped together to form the record for the student. Field: A File is a collection of various records of one type of entity, for example, if there are 60 students in the class, then there will be 20 records in the related file where each record contains the data about each student. Need of Data Structures As applications are getting complex and amount of data is increasing day by day, there may arises many problems: Processor speed: To handle very large amount of data, high speed processing is required, but as the data is growing day by day to the billions of files per entity, processor may fail to deal with that much amount of data. Data Search: Consider an inventory size of 100 items in a store, If our application needs to search for a particular item, it needs to traverse 100 items every time, results in slowing down the search process. Multiple requests: If thousands of users are Searching the data simultaneously on a web server, then there are the chances that a very large server can be failed during that process To solve these problems data structures are used. Lovely Professional University 3 Notes Advanced Data Structures Basic Concept of Data The memory (also called storage or core) of a computer is simply a group of bits (switches). At any instant of the computer’s operation any particular bit in memory is either 0 or 1 (off or on). The setting or state of a bit is called its value and that is the smallest unit of information. A set of bit values form data. Some logical properties can be imposed on the data. According to the logical properties data can be segregated into different categories. Each category having unique set of logical properties is known as data type. Data type are of two types: 1. Simple data type or elementary item like integer, character. 2. Composite data type or group item like array, structure, union. Data structures are of two types: 1. Primitive Data Structures: Data can be structured at the most primitive level, wherethey are directly operated upon by machine-level instructions. At this level, data may becharacter or numeric, and numeric data may consist of integers or real numbers. 2. Non-primitive Data Structures: Non-primitive data structures can be classified as arrays,lists, and fi les. An array is an ordered set which contains a fixed number of objects. No deletions or insertionsare performed on arrays i.e. the size of the array cannot be changed. At best, elements may bechanged. A list, by contrast, is an ordered set consisting of a variable number of elements to whichinsertions and deletions can be made, and on which other operations can be performed. When alist displays the relationship of adjacency between elements, it is said to be linear; otherwise, it issaid to be non- linear. A file is typically a large list that is stored in the external memory of a computer. Additionally, afile may be used as a repository for list items (records) that are accessed infrequently. From a real world perspective, very often we have to deal with structured data items whichare related to each other. For instance, let us consider the address of an employee. We can takeaddress to be one variable of character type or structured into various fields, as shown below: As shown above Address1 is unstructured address data. In this form you cannot access individual items from it. You can at best refer to the entire address at one time. While in the second from, i.e., Address2, you can access and manipulate individual fi elds of the address – House No., Street, PIN etc. Given hereunder are two instances of the address1 and address2 variables. 1.2 Data Structure Operations The data appearing in our data structure is processed by means of certain operations. The particular data structure that one chooses for a given situation depends largely on the frequency with which specific operations are performed. The following four operations play a major role: 1. Traversing: Accessing each record exactly once so that certain items in the record may be processed. (This accessing or processing is sometimes called ‘visiting” the records.) 2. Searching: Finding the location of the record with a given key value, or finding the locations of all records, which satisfy one or more conditions. 4 Lovely Professional University Notes Unit 01: Introduction 3. Inserting: Adding new records to the structure. 4. Deleting: Removing a record from the structure. Sometimes two or more data structure of operations may be used in a given situation; e.g., we may want to delete the record with a given key, which may mean we first need to search for the location of the record. 1.3 Abstract Data Type Before we move to abstract data type let us understand what data type is. Most of the languages support basic data types viz. integer, real, character etc. At machine level, the data is stored as strings containing 1’s and 0’s. Every data type interprets the string of bits in different ways and gives different results. In short, data type is a method of interpreting bit patterns. Every data type has a fixed type and range of values it can operate on. For example, an integer variable can hold values between the min and max values allowed and carry out operations like addition, subtraction etc. For character data type, the valid values are defined in the character set and the operations performed are like comparison, conversion from one case to another etc. There are fixed operations, which can be carried out on them. We can formally defi ne data types as a formal description of the set of values and operations that a variable of a given type may take. That was about the inbuilt data types. One can also create user defined data types, decide the range of values as well as operations to be performed on them. The first step towards creating a user defined data type or a data structure is to defi ne the logical properties. A tool to specify the logical properties of a data type is Abstract Data Type. Data abstraction can be defined as separation of the logical properties of the organization of programs’ data from its implementation. This means that it states what the data should be like. It does not consider the implementation details. ADT is the logical picture of a data type; in addition, the specifications of the operations required to create and manipulate objects of this data type. While defining an ADT, we are not concerned with time and space efficiency or any other implementation details of the data structure. ADT is just a useful guideline to use and implement the data type. An ADT has two parts: 1. Value definition 2. Operation definition. Value definition is again divided into two parts: 1. Definition clause 2. Condition clause As the name suggests the definition clause states the contents of the data type and condition clause defines any condition that applies to the data type. Definition clause is mandatory while condition clause is optional. In operation definition, there are three parts: 1. Function 2. Precondition 3. Postcondition The function clause defines the role of the operation. If we consider the addition operation inintegers the function clause will state that two integers can be added using this function. Ingeneral, precondition specifies any restrictions that must be satisfied before the operation canbe applied. This clause is optional. If we consider the division operation on integers then theprecondition will state that the divisor should not be zero. So any call for divide operation, whichdoes not satisfy this condition, will not give the desired output. Lovely Professional University 5 Notes Advanced Data Structures Precondition specifies any condition that may apply as a pre-requisite for the operation definition.There are certain operations that can be carried out if certain conditions are satisfied. For example,in case of division operation the divisor should never be equal to zero. Only if this condition issatisfied the division operation is carried out. Hence, this becomes a precondition. In that case &(ampersand) should be mentioned in the operation definition. Postcondition specifies what the operation does. One can say that it specifies the state after theoperation is performed. In the addition operation, the post condition will give the addition of thetwo integers. Component of ADT As an example, let us consider the representation of integer data type as an ADT. We will consider only two operations addition and division. Value Definition 1. Definition clause: The values must be in between the minimum and maximum values specified for the particular computer. 2. Condition clause: Values should not include decimal point. Operations 1. add (a, b) Function: add the two integers a and b. Precondition: no precondition. Postcondition: output = a + b 2. Div (a, b) Function: Divide a by b. Precondition: b != 0 Postcondition: output = a/b. There are two ways of implementing a data structure viz. static and dynamic. In static implementation, the memory is allocated at the compile time. If there are more elements than the specified memory then the program crashes. In dynamic implementation, the memory is allocated as and when required during run time. Any type of data structure will have certain basic operations to be performed on its data like insert, delete, modify, sort, search etc. depending on the requirement. These are the entities that decide the representation of data and distinguish data structures from each other. Let us see why user defined data structures are essential. Consider a problem where we need to create a list of elements. Any new element added to the list must be added at the end of the list and whenever an element is retrieved, it should be the last element of the list. One can compare this to a pile of plates kept on a table. Whenever one needs a plate, the last one on the pile is taken and if a plate is to be added on the pile, it will be kept on the top. The description wants us to implement a stack. Let us try to solve this problem using arrays. We will have to keep track of the index of the last element entered in the list. Initially, it will be set to –1. Whenever we insert an element into the list, we will increment the index and insert the value into the new index position. To remove an element, the value of current index will be the output 6 Lovely Professional University Notes Unit 01: Introduction and the index will be decremented by one. In the above representation, we have satisfied the insertion and deletion conditions. Using arrays we could handle our data properly, but arrays do allow access to other values in addition to the top most one. We can insert an element at the end of the list but there is no way to ensure that insertion will be done only at the end. This is because array as a data structure allows access to any of its values. At this point we can think of another representation, a list of elements where one can add at the end, remove from the end and elements other than the top one are not accessible. As already discussed, this data structure is called as STACK. The insertion operation is known as push and removal as pop. You can try to write an ADT for stacks. Another situation where we would like to create a data structure is while working with complex numbers. The operations add, subtract division and multiplication will have to be created as per the rules of complex numbers. The ADT for complex numbers is given below. Only addition and multiplication operations are considered here, you can try to write the remaining operations. Abstract Data Type (ADT) 1. A framework for an object interface 2. What kind of stuff it’d be made of (no details)? 3. What kind of messages it would receive and kind of action it’ll perform when properly triggered? From this we figure out 1. Object make-up (in terms of data) 2. Object interface (what sort of messages it would handle?) 3. How and when it should act when triggered from outside (public trigger) and by another object friendly to it? These concerns lead to an ADT – a definition for the object. An Abstract Data Type (ADT) is a set of data items and the methods that work on them. An implementation of an ADT is a translation into statements of a programming language, of the declaration that defines a variable to be of that ADT, plus a procedure in that language for each operation of the ADT. An implementation chooses a data structure to represent the ADT; each data structure is built up from the basic data types of the underlying programming language. Thus, if we wish to change the implementation of an ADT, only the procedures implementing the operations would change. This change would not affect the users of the ADT. Although the terms ‘data type’, ‘data structure’ and ‘abstract data type’ sound alike, they have different meanings. In a programming language, the data type of a variable is the set of values that the variable may assume. For example, a variable of type Boolean can assume either the value true or the value false, but no other value. An abstract data type is a mathematical model, together with various operations defined on the model. As we have indicated, we shall design algorithms in terms of ADTs, but to implement an algorithm in a given programming language. we must find some way of representing the ADTs in terms of the data types and operators supported by the programming language itself. To represent the mathematical model underlying an ADT, we use data structures, which are a collection of variables, possibly of several data types, connected in various ways. Lovely Professional University 7 Notes Advanced Data Structures The cell is the basic building block of data structures. We can picture a cell as a box that is capable of holding a value drawn from some basic or composite data type. Data structures are created by giving names to aggregates of cells and (optionally) interpreting the values of some cells as representing relationships or connections (e.g., pointers) among cells. 1.4 Algorithm Algorithm is set of rules/ instructions that step-by-step define how a work is to be executed upon in order to get the expected results. systematic procedure that produces in a finite number of steps the answer to a question or the solution of a problem. Computer algorithms work via input and output. They take the input and apply each step of the algorithm to that information to generate an output. E.g. a search engine is an algorithm that takes a search query as an input and searches its database for items relevant to the words in the query. It then outputs the results. Financial companies use algorithms in areas such as loan pricing, stock trading, asset-liability management, and many automated functions. For example, algorithmic trading, known as algo trading, is used for deciding the timing, pricing, and quantity of stock orders. Also referred to as automated trading or black-box trading, algo trading uses computer programs to buy or sell securities at a pace not possible for humans. Computer algorithms make life easier by trimming the time it takes to manually do things. In the world of automation, algorithms allow workers to be more proficient and focused. Algorithms make slow processes more proficient. In many cases, especially in automation, algos can save companies money. 1.5 Characteristics of an Algorithm Well defined Input and output Clear and Unambiguous Finite-ness Feasible Language Independent Input and output should be defined precisely. Each step in the algorithm should be clear and unambiguous. Algorithms should be most effective among many different ways to solve a problem. An algorithm shouldn't include computer code. Instead, the algorithm should be written in such a way that it can be used in different programming languages. The algorithm must be finite, i.e. it should not end up in an infinite loops or similar. The algorithm must be simple, generic and practical, such that it can be executed upon will the available resources. It must not contain some future technology, or anything. The Algorithm designed must be language-independent, i.e. it must be just plain instructions that can be implemented in any language, and yet the output will be same, as expected. 1.6 Types of Algorithms Algorithms are categorized based on the concepts that they use to accomplish a task. Divide and conquer algorithms Brute force algorithms Greedy algorithms Backtracking algorithms Randomized algorithms 8 Lovely Professional University Notes Unit 01: Introduction Example: Step 1: Start Step 2: Declare variables num1, num2 and sum. Step 3: Read values num1 and num2. Step 4: Add num1 and num2 and assign the result to sum. Sum = num1+num2 Step 5: Display sum Step 6: Stop 1.7 Algorithm Complexity Space Complexity Time Complexity Space Complexity: Space complexity of an algorithm refers to the amount of memory that this algorithm requires to execute and get the result. This can be for inputs, temporary operations, or outputs. Fixed Part: This refers to the space that is definitely required by the algorithm. For example, input variables, output variables, program size, etc. Variable Part: This refers to the space that can be different based on the implementation of the algorithm. For example, temporary variables, dynamic memory allocation, recursion stack space, etc. Time Complexity: Time complexity of an algorithm refers to the amount of time that this algorithm requires to execute and get the result. This can be for normal operations, conditional if-else statements, loop statements, etc. Constant time part: Any instruction that is executed just once comes in this part. For example, input, output, if-else, switch, etc. Variable Time Part: Any instruction that is executed more than once, say n times, comes in this part. For example, loops, recursion, etc. 1.8 Asymptotic Notations To measure the efficiency of an algorithm asymptotic analysis is used. The efficiency of an algorithm depends on the amount of time, storage and other resources required to execute the algorithm. Performance of algorithm is change with different type of inputs. The study of change in performance of the algorithm with the change in the order of the input size is defined as asymptotic analysis. Asymptotic notations are the mathematical notations used to describe the running time of an algorithm when the input tends towards a particular value or a limiting value. Types of asymptotic notations There are three major asymptotic notations Big-O notation Omega notation Theta notation Lovely Professional University 9 Notes Advanced Data Structures Big-O notation represents the upper bound of the running time of an algorithm. It gives the worst- case complexity of an algorithm. O(n) is useful when we only have an upper bound on the time complexity of an algorithm. It is widely used to analyses an algorithm as we are always interested in the worst-case scenario. O(g(n)) = { f(n): there exist positive constants c and n0 such that 0 ≤ f(n) ≤ cg(n) for all n ≥ n0 } Omega notation represents the lower bound of the running time of an algorithm. It provides the best-case complexity of an algorithm. Omega Notation can be useful when we have lower bound on time complexity of an algorithm.Omega notation is the least used notation among all three. Ω (g(n)) = {f(n): there exist positive constants c and n0 such that 0 A[PTR+1], then: Interchange A[PTR] and A[PTR +1] [End of If structure] Set PTR := PTR+1 [End of inner loop] [End of Step 2 outer loop] 5. Exit In the algorithm, there is an inner loop, which is controlled by the variable PTR, and an index K controls the outer loop. K is used as a counter and PTR is used as an index. The below program illustrates the concept of sorting an array using bubble sort. 22 Lovely Professional University Notes Unit 02: Arrays vs Linked Lists Example: #include #include int A = {55, 22, 2, 43, 12, 8, 32, 15}; //Declaring the array with 8 elements int N = 8; //Size of the array void BUBBLE (void); //BUBBLE Function declaration void main() { int i; //Variable declaration clrscr(); printf("\n\nValues present in array A ="); for (i=0; ilink; Lovely Professional University 29 Notes [Subject] temp-> data = n; temp-> link = p; } return (p); } void printlist( struct node *p ) { struct node *temp; temp = p; printf(“The data values in the list are\n”); if(p!= NULL) { do { printf(“%d\t”,temp->data); temp=temp->link; } while (temp!= p); } else printf(“The list is empty\n”); } void main() { int n; int x; struct node *start = NULL ; printf(“Enter the nodes to be created \n”); scanf(“%d”,&n); while ( n -- > 0 ) { printf( “Enter the data values to be placed in a node\n”); scanf(“%d”,&x); start = insert ( start, x ); } printf(“The created list is\n”); printlist( start ); } Deleting the Specified Node in Singly Linked List To delete a node, first we determine the node number to be deleted (this is based on the assumption that the nodes of the list are numbered serially from 1 to n). The list is then traversed to get a pointer to the node whose number is given, as well as a pointer to a node that appears before the 30 Lovely Professional University Notes Unit 02: Arrays vs Linked Lists node to be deleted. Then the link fi eld of the node that appears before the node to be deleted is made to point to the node that appears after the node to be deleted, and the node to be deleted is freed. Lovely Professional University 31 Notes [Subject] Lab Exercise: # include # include struct node *delet( struct node *, int ); int length ( struct node * ); struct node { int data; struct node *link; }; struct node *insert(struct node *p, int n) { struct node *temp; if(p==NULL) { p=(struct node *)malloc(sizeof(struct node)); if(p==NULL) { printf(“Error\n”); exit(0); } p-> data = n; p-> link = NULL; } else { temp = p; while (temp->link != NULL) temp = temp->link; temp-> link = (struct node *)malloc(sizeof(struct node)); if(temp -> link == NULL) { printf(“Error\n”); exit(0); } temp = temp->link; temp-> data = n; temp-> link = NULL; } return (p); 32 Lovely Professional University Notes Unit 02: Arrays vs Linked Lists } void printlist( struct node *p ) { printf(“The data values in the list are\n”); while (p!= NULL) { printf(“%d\t”,p-> data); p = p->link; } } void main() { int n; int x; struct node *start = NULL; printf(“Enter the nodes to be created \n”); scanf(“%d”,&n); while ( n- > 0 ) { printf( “Enter the data values to be placed in a node\n”); scanf(“%d”,&x); start = insert ( start, x ); } printf(“ The list before deletion id\n”); printlist( start ); printf(“% \n Enter the node no \n”); scanf( “ %d”,&n); start = delet (start , n ); printf(“ The list after deletion is\n”); printlist( start ); } struct node *delet( struct node *p, int node_no ) { struct node *prev, *curr ; int i; if (p == NULL ) { printf(“There is no node to be deleted \n”); } Lovely Professional University 33 Notes [Subject] else { if ( node_no> length (p)) { printf(“Error\n”); } else { prev = NULL; curr = p; i=1; while ( ilink; i = i+1; } if ( prev == NULL ) { p = curr ->link; free ( curr ); } else { prev -> link = curr ->link ; free ( curr ); } } } return(p); } int length ( struct node *p ) { int count = 0 ; while ( p != NULL ) { count++; p = p->link; } 34 Lovely Professional University Notes Unit 02: Arrays vs Linked Lists return ( count ) ; } Inserting a Node after the Specified Node in a Singly Linked List To insert a new node after the specified node, first we get the number of the node in an existinglist after which the new node is to be inserted. This is based on the assumption that the nodes ofthe list are numbered serially from 1 to n. The list is then traversed to get a pointer to the node,whose number is given. If this pointer is x, then the link field of the new node is made to point tothe node pointed to by x, and the link field of the node pointed to by x is made to point to the newnode. Figures 2.3 and 2.4 show the list before and after the insertion of the node, respectively. Insertion in Linked List can happen at following places: At the beginning of the linked list. At the end of the linked list. At a given position in the linked list. Algorithm: Insertion at beginning Step 1: IF PTR = NULL Write OVERFLOW Go to Step 7 [END OF IF] Step 2: SET NEW_NODE = PTR Step 3: SET PTR = PTR → NEXT Step 4: SET NEW_NODE → DATA = VAL Step 5: SET NEW_NODE → NEXT = HEAD Step 6: SET HEAD = NEW_NODE Step 7: EXIT Deletion from a Linked List Delete from beginning Delete from end Delete from middle/ given position Find the previous node of the node to be deleted. Change the next pointer of the previous node Free the memory of the deleted node. In case of first node deletion, we need to update the head of the linked list. Algorithm: Deletion at beginning Step 1: IF HEAD = NULL Write UNDERFLOW Go to Step 5 [END OF IF] Step 2: SET PTR = HEAD Step 3: SET HEAD = HEAD -> NEXT Step 4: FREE PTR Step 5: EXIT Lovely Professional University 35 Notes [Subject] Searching in linked list Searching is performed to find the location of a particular element in the list. Traversing is performed in the list and make the comparison of every element of the list with the specified element. If the element is matched with any of the list element then the location of the element is returned from the function. Algorithm: Searching in linked list Step 1: SET PTR = HEAD Step 2: Set I = 0 STEP 3: IF PTR = NULL WRITE "EMPTY LIST" GOTO STEP 8 END OF IF STEP 4: REPEAT STEP 5 TO 7 UNTIL PTR != NULL STEP 5: if ptr → data = item write i+1 End of IF STEP 6: I = I + 1 STEP 7: PTR = PTR → NEXT [END OF LOOP] STEP 8: EXIT Linked List Common Errors Here is summary of common errors of linked lists. Read these carefully, and read them againwhen you have problem that you need to solve. 1. Allocating a new node to step through the linked list; only a pointer variable is needed. 2. Confusing the and the -> operators. 3. Not setting the pointer from the last node to 0 (null). 4. Not considering special cases of inserting/removing at the beginning or the end of thelinked list. 5. Applying the delete operator to a node (calling the operator on a pointer to the node)before it is removed. Delete should be done after all pointer manipulations are completed. 6. Pointer manipulations that are out of order. These can ruin the structure of the linked list. Sorting and Reversing a Linked List To sort a linked list, fi rst we traverse the list searching for the node with a minimum data value.Then we remove that node and append it to another list which is initially empty. We repeat thisprocess with the remaining list until the list becomes empty, and at the end, we return a pointerto the beginning of the list to which all the nodes are moved. Sorting of linked list 36 Lovely Professional University Notes Unit 02: Arrays vs Linked Lists To reverse a list, we maintain a pointer each to the previous and the next node, then we make thelink field of the current node point to the previous, make the previous equal to the current, and thecurrent equal to the next. Arrays vs. Linked list Array Linked list New elements can be stored anywhere and a Data elements are stored in contiguous locations reference is created for the new element using in memory. pointers. Insertion and Deletion operations are costlier Insertion and Deletion operations are fast and since the memory locations are consecutive and easy in a linked list. fixed. Memory is allocated during the compile time Memory is allocated during the run-time (Static memory allocation). (Dynamic memory allocation). Size of the array must be specified at the time of Size of a Linked list grows/shrinks as and when array declaration/initialization. new elements are inserted/deleted. Summary An array is a set of same data elements grouped together. Arrays can be one-dimensional ormultidimensional. A linear or one-dimensional array is a structured collection of elements (often called as array elements) that are accessed individually by specifying the position of each element with a single index value. Multidimensional arrays are nothing but "arrays of arrays". Two subscripts are used to refer to the elements. The operations that are performed on an array are adding, sorting, searching, and traversing. Traversing an array refers to moving in inward and outward direction to access each element in an array. Linked list is a technique of dynamically implementing a list using pointers. A linked list contains two fields namely, data field and link field. A singly-linked list consists of only one pointer to point to another node and the last node always points to NULL to indicate the end of the list. A doubly-linked list consists of two pointers, one to point to the next node and the other to point to the previous node. In a circular singly-linked list, the last node always points to the first node to indicate the circular nature of the list. A circular doubly-linked list consists of two pointers for forward and backward traversal and the last node points to the first node. Lovely Professional University 37 Notes [Subject] Searching operation involves searching for a specific element in the list using an associated key. Insertion operation involves inserting a node at the beginning or end of a list. Deletion operation involves deleting a node at the beginning or following a given node or at the end of a list. Keywords Non-linear Data Structure: Every data item is attached to several other data items in a way thatis specific for reflecting relationships. The data items are not arranged in a sequential structure. Searching: Finding the location of the record with a given key value, or fi nding the locations ofall records, which satisfy one or more conditions. Traversing: Accessing each record exactly once so that certain items in the record may beprocessed. Circular Linked List: A linear linked list in which the last element points to the fi rst element, thus,forming a circle. Doubly Linked List: A linear linked list in which each element is connected to the two nearestelements through pointers. Self-Assessment 1. Which one is correct statement? A. Search in array is delete an element from array B. Search in array is find an element from array C. Search in array is insert an element in array D. None of above 2. Which is not correct syntax? A. abcname[ ]; B. abcname; C. abcname = {20,25,35}; D. abcname = {15;20;28}; 3. Searching is a process in which we find element in array A. True B. False 4. Operations can be performed on array A. Sorting B. Merging C. Traversing D. All of above 5. To merge two arrays, how many variables are required (minimum)? 38 Lovely Professional University Notes Unit 02: Arrays vs Linked Lists A. 1 B. 2 C. 5 D. 3 6. Elements of arrays are stored in ……….. memory locations A. Random B. Sequential C. Both random and sequential D. None of above 7. Which is correct statement? A. Insertion at the given index of an array B. Insertion after the given index of an array C. Insertion before the given index of an array D. All of above 8. Traversal is process of visit each element of an array A. True B. False 9. Which statement is incorrect? 1. int arr[MAX]= {10,12,15,20},i,val; 2. printf("the array element are \n’); 3. for(i=0;i (*) > (/) > (+) > (-) Brackets have the highest priority. To evaluate an infix expression, We need to perform 2 steps: Convert infix to postfix Evaluate postfix Sorting A Sorting process is used to rearrange a given array or elements based upon selected algorithm/ sort function. Quick Sort is used for sorting a list of data elements.Quicksort is a sorting algorithm based on the divide and conquer approach.An array is divided into subarrays by selecting a pivot element.During array dividing, the pivot element should be positioned in such a way that elements less than pivot are kept on the left side and elements greater than pivot are on the right side of the pivot.The left and right subarrays are also divided using the same approach. This process continues until each subarray contains a single element There are many different versions of quick Sort that pick pivot in different ways. Always pick first element as pivot. Always pick last element as pivot Pick a random element as pivot. Pick median as pivot. Lovely Professional University 49 Notes Advanced Data Structures Algorithm quickSort(arr, beg, end) if (beg < end) pivotIndex = partition(arr,beg, end) quickSort(arr, beg, pivotIndex) quickSort(arr, pivotIndex + 1, end) partition(arr, beg, end) set end as pivotIndex pIndex = beg - 1 for i = beg to end-1 if arr[i] < pivot swap arr[i] and arr[pIndex] pIndex++ swap pivot and arr[pIndex+1] return pIndex + 1 Tower of Hanoi The Tower of Hanoi, is a mathematical problem which consists of three rods and multiple disks.Initially, all the disks are placed on one rod, one over the other in ascending order of size similar to a cone-shaped tower. The objective of this problem is to move the stack of disks from the source to destination, following these rules: 1. Only one disk can be moved at a time. 2. Only the top disk can be removed. 3. No large disk can sit over a small disk. 50 Lovely Professional University Notes Unit 03: Stacks Lovely Professional University 51 Notes Advanced Data Structures Iterative Algorithm 1. At First Calculate the number of moves required i.e. "pow(2,n) - 1" where "n" is number of discs. 2. If the number of discs i.e n is even then swap Destination Rod and Auxiliary Rod. 3. for i = 1 upto number of moves: Check if "i mod 3" == 1: Perform Movement of top disc between Source Rod and Destination Rod. Check if "i mod 3" == 2: Perform Movement of top disc between Source Rod and Auxiliary Rod. Check if "i mod 3" == 0: Perform Movement of top disc between Auxiliary Rod and Destination Rod. Simulating Recursive Function using Stack A recursive solution to a problem is often more expensive than a non-recursive solution, both in terms of time and space. Frequently, this expense is a small price to pay for the logical simplicity and self-documentation of the recursive solution. However, in a production program (such as a compiler, for example) that may be run thousands of times, the recurrent expense is a heavy burden on the system’s limited resources. Thus, a program may be designed to incorporate a recursive solution in order to reduce the expense of design and certifi cation, and then carefully converted to a non-recursive version to be put into actual day-to-day use. As we shall see, in performing such as conversion it is often possible to identify parts of the implementation of recursion that are superfluous in a particular application and thereby significantly reduce the amount of work that the program must perform. Suppose that we have the statement rout (x); where route is defi ned as a function by the header rout(a); x is referred to as an argument (of the calling function), and a is referred to as a parameter(of the called function). What happens when a function is called? The action of calling a function may be divided intothree parts: 1. Passing Arguments 2. Allocating and initializing local variables 3. Transferring control to the function. 1. Passing arguments: For a parameter in C, a copy of the argument is made locally within the function, and any changes to the parameter are made to that local copy. The effect to this scheme is that the original input argument cannot be altered. In this method, storage for the argument is allocated within the data area of the function. 2. Allocating and initializing local variables: After arguments have been passed, the local variables of the function are allocated. These local variables include all those declared directly in the function and any temporaries that must be created during the course of execution. 3. Transferring control to the function: At this point control may still not be passed to the function because provision has not yet been made for saving the return address. If a function is 52 Lovely Professional University Notes Unit 03: Stacks given control, it must eventually restore control to the calling routine by means of a branch. However, it cannot execute that branch unless it knows the location to which it must return. Since this location is within the calling routine and not within the function, the only way that the function can know this address is to have it passed as an argument. This is exactly what happens. Aside from the explicit arguments specified by the programmer, there is also a set of implicit arguments that contain information necessary for the function to execute and return correctly. Chief among these implicit arguments is the return address. The function stores this address within its own data area. When it is ready to return control to the calling program, the function retrieves the return address and branches to that location. Once the arguments and the return address have been passed, control may be transferred to the function, since everything required has been done to ensure that the function can operate on the appropriate data and then return to the calling routine safely. Summary A stack is a linear data structure in which allocation and deallocation are made in a last-in- first-out (LIFO) method. The basic operations of stack are inserting an element on the stack (push operation) and deleting an element from the stack (pop operation). Stacks are represented in main memory by using one-dimensional array or a singly linked list. To implement a stack structure, an array can be used as its storage structure. Each element of the stack occupies one array element. Static implementation of stack can be achieved using arrays. Stack is used to store return information in the case of function/procedure/subroutine calls. Hence, one would fi nd a stack in architecture of any Central Processing Unit (CPU). In infix notation operators come in between the operands. An expression can be evaluated using stack data structure. Keywords LIFO: (Last In First Out) the property of a list such as stack in which the element which can be retrieved is the last element to enter it. Pop: Stack operation retrieves a value form the stack. Infix: Notation of an arithmetic expression in which operators come in between their operands. Postfix: Notation of an arithmetic expression in which operators come after their operands. Prefix: Notation of an arithmetic expression in which operators come before their operands. Push: Stack operation which puts a value on the stack. Stack: A linear data structure where insertion and deletion of elements can take place only at one end. SelfAssessment 1. Which method is followed by Stack? A. FILO B. LIFO C. Both FILO and LIFO D. None of above Lovely Professional University 53 Notes Advanced Data Structures 2. Which is not stack operation? A. count() B. peek() C. getche() D. display() 3. Which is not part of stack? A. Overflow B. Enqueue C. Underflow D. None of above 4. To returns the element at the given position which function is used? A. isEmpty() B. isFull() C. peek() D. display() 5. Stack implementation is done using…… A. Array B. Linked list C. Both using array and linked list D. None of above 6. Parenthesis checker is used for A. Balanced Brackets B. Unbalanced Brackets C. Both balanced and unbalanced D. None of above 7. Which is incorrect statement? A. (a+b*(c/d)) B. [10+20*(6+7)}] C. (x+y)/(c-d) D. None of above 8. Which is not Sorting type A. Bubble B. Merge C. Insertion D. Linear 54 Lovely Professional University Notes Unit 03: Stacks 9. Tower of Hanoi is associated with…. A. Queue B. Stack C. Array D. None of above 10. Arithmetic expressions can be represented as A. Infix notation B. Postfix notation C. Prefix notation D. All of above 11. Prefix notation can be represented as A. operator operand1 operand2 B. operand1 operand2 operator C. operand1 operator operand1 D. None of above 12. Postfix notation can be represented as A. operator operand1 operand2 B. operand1 operator operand1 C. operand1 operand2 operator D. None of above 13. Stack implementation is done using______ A. Statically B. Dynamically C. Both Statically and Dynamically D. None of above 14. Which is not stack operation? A. peek() B. pop() C. display() D. printf() 15. Underflow condition occur when stack is_____ A. Full B. Empty C. Half D. None of above Lovely Professional University 55 Notes Advanced Data Structures Answers for Self Assessment 1. C 2. C 3. B 4. C 5. C 6. C 7. B 8. D 9. B 10. D 11. A 12. C 13. C 14. D 15. B Review Questions 1 Explain how function calls may be implemented using stacks for return values. 2 What are the advantages of implementing a stack using dynamic memory allocation method? 3 Explain concept of tower of Hanoi. 4 what are the different methods for implementing stacks? 5 Give an example of push and pop operation using stack. 6 Write an algorithm to reverse an input string of characters using a stack. Further Readings Data Structures and Efficient Algorithms, Burkhard Monien, Thomas Ottmann, Springer. Kruse Data Structure & Program Design, Prentice Hall of India, New Delhi Mark Allen Weles: Data Structure & Algorithm Analysis in C Second Adition. Addison- Wesley publishing RG Dromey, How to Solve it by Computer, Cambridge University Press. Shi-kuo Chang, Data Structures and Algorithms, World Scientifi c Sorenson and Tremblay: An Introduction to Data Structure with Algorithms. Thomas H. Cormen, Charles E, Leiserson& Ronald L. Rivest: Introduction to Algorithms. Prentice-Hall of India Pvt. Limited, New Delhi Timothy A. Budd, Classic Data Structures in C++, Addison Wesley. Web Links www.en.wikipedia.org www.webopedia.com https://www.programiz.com/ https://www.javatpoint.com/data-structure-stack https://www.tutorialspoint.com/data_structures_algorithms/stack_algorithm.htm 56 Lovely Professional University Notes Ashwani Kumar, Lovely Professional University Unit 04: Queues Unit 04: Queues CONTENTS Objectives Introduction 4.1 Fundamentals of Queues 4.2 Basic Operations of Queue 4.3 Types of Queue 4.4 Implementation of Queues 4.5 Applications of Queues Summary Keywords Self Assessment Answers for Self Assessment Review Questions Further Readings Objectives After studying this unit, you will be able to: Learn implementation of queues Explain priority queue Discuss applications of queues Introduction A queue is a linear list of elements that consists of two ends known as front and rear. We can delete elements from the front end and insert elements at the rear end of a queue. A queue in an application is used to maintain a list of items that are ordered not by their values but by their sequential value. The queue abstract data type is also a widely used one with applications very common in real life. An example comes from the operating system software where the scheduler picks up the next process to be executed on the system from a queue data structure. In this unit, we would study the various properties of queues, their operations and implementation strategies. 4.1 Fundamentals of Queues A queue is an ordered collection of items in which deletion takes place at one end, which is called the front of the queue, and insertion at the other end, which is called the rear of the queue. The queue is a ‘First In First Out’ system (FIFO). In a time-sharing system, there can be many tasks waiting in the queue, for access to disk storage or for using the CPU. The queues in a bank, or railway station counter are examples of queue. The first person in the queue is the first to be attended. The two main operations in the queue are insertion and deletion of items. The queue has two pointers, the front pointer points to the first element of the queue and the rear pointer points to the last element of the queue. Lovely Professional University 57 Notes Advanced Data Structures 4.2 Basic Operations of Queue The basic operations of queue are insertion and deletion of items which are referred as enqueue and dequeue respectively. In enqueue operation, an item is added to the rear end of the queue. In dequeue operation, the item is deleted from the front end of the queue. Insert at Rear End To insert an item into the queue, first it should be verified whether the queue is full or not. If the queue is full, a new item cannot be inserted into the queue. The condition FRONT=NULL indicates that the queue is empty. If the queue is not full, items are inserted at the rear end. When an item is added to the queue, the value of rear is incremented by 1. In queue we need to maintain two data pointers, front and rear. Operations on queue are comparatively difficult to implement than that of stacks Step 1 − Check if the queue is full. Step 2 − If the queue is full, produce overflow error and exit. Step 3 − If the queue is not full, in crement rear pointer to point the next empty space. Step 4 − Add data element to the queue location, where the rear is pointing. Step 5 − return success. Algorithm: Enqueue operation procedure enqueue(data) if queue is full return overflow endif rear ← rear + 1 queue[rear] ← data return true end procedure Delete from the Front End To delete an item from the stack, first it should be verified that the queue is not empty. If the queue is not empty, the items are deleted at the front end of the queue. When an item is deleted from the queue, Dequeue operation include two tasks: access the data where front is pointing and remove the data after access. Step 1 − Check if the queue is empty. Step 2 − If the queue is empty, produce underflow error and exit. Step 3 − If the queue is not empty, access the data where front is pointing. Step 4 − Increment front pointer to point to the next available data element. Step 5 − Return success. the value of the front is incremented by 1. Algorithm: Dequeue operation procedure dequeue if queue is empty 58 Lovely Professional University Notes Unit 04: Queues return underflow end if data = queue[front] front ← front + 1 return true end procedure Example: # include # define MAX 50 int queue_arr[MAX]; int rear = -1; int front = -1; void ins_delete(); void insert(); void display(); void main() { int choice; while(1) { printf("1.Insert\n"); printf("2.Delete\n"); printf("3.Display\n"); printf("4.Quit\n"); printf("Enter your choice : \n"); scanf("%d",&choice); switch(choice) { case 1 : insert(); break; case 2 :ins_delete(); break; case 3: ins_display(); break; case 4: exit(1); default: Lovely Professional University 59 Notes Advanced Data Structures printf("Wrong choice\n"); } } } void insert() { int added_item; if (rear==MAX-1) printf("Queue overflow\n"); else { if (front==-1) front=0; printf("Enter an element to add in the queue : "); scanf("%d", &added_item); rear=rear+1; queue_arr[rear] = added_item ; } } void ins_delete() { if (front == -1 || front > rear) { printf("Queue underflow\n"); return ; } else { printf("Element deleted from queue is : %d\n", queue_arr[front]); front=front+1; } } void display() { int i; if (front == -1) printf("Queue is empty\n"); else { printf("Elements in the queue:\n"); for(i=front;i DATA = VAL Step 3: IF FRONT = NULL SET FRONT = REAR = PTR SET FRONT -> NEXT = REAR -> NEXT = NULL ELSE SET REAR -> NEXT = PTR SET REAR = PTR SET REAR -> NEXT = NULL [END OF IF] Step 4: END Delete operation Deletion operation removes the element that is first inserted among all the queue elements. The condition front == NULL becomes true if the list is empty. Otherwise, we will delete the element that is pointed by the pointer front. Algorithm Step 1: IF FRONT = NULL Write " Underflow " Go to Step 5 Lovely Professional University 67 Notes Advanced Data Structures [END OF IF] Step 2: SET PTR = FRONT Step 3: SET FRONT = FRONT -> NEXT Step 4: FREE PTR Step 5: END 4.5 Applications of Queues One major application of the queue data structure is in the computer simulation of a real-world situation. Queues are also used in many ways by the operating system, the program that schedules and allocates the resources of a computer system. One of these resources is the CPU (CentralProcessing Unit) itself. If you are working on a multi-user system and you tell the computerto run a particular program, the operating system adds your request to its “job queue”. Whenyour request gets to the front of the queue, the program you requested is executed. Similarly, thevarious users for the system must share the I/O devices (printers, disks etc.). Each device has itsown queue of requests to print, read or write to these devices. The following subsection discussesone application of the queues – the priority queue. It is used in time-sharing multi-user systemswhere programs of high priority are processed first arid programs with the same priority forma standard queue. In Operating systems: a) Semaphores b) FCFS ( first come first serve) scheduling, c) Spooling in printers d) Buffer for devices like keyboard In Networks: a) Queues in routers/ switches b) Mail Queues Queues are used in operating systems for handling interrupts. Queues are used as buffers in most of the applications like MP3 media player, CD player, etc When a resource is shared among multiple consumers. CPU scheduling, Disk Scheduling. Summary A queue is an ordered collection of items in which deletion takes place at the front and insertion at the rear of the queue. In a memory, a queue can be represented in two ways; by representing the way in which the elements are stored in the memory, and by naming the address to which the front and rear pointers point to. The different types of queues are double ended queue, circular queue, and priority queue. The basic operations performed on a queue include inserting an element at the rear end and deleting an element at the front end. A priority queue is a collection of elements such that each element has been assigned a priority. An element of higher priority is processed before any element of lower priority. Two elements with the same priority are processed according to the order in which they were inserted into the queue. 68 Lovely Professional University Notes Unit 04: Queues Keywords FIFO: (First In First Out) The property of a linear data structure which ensures that the element retrieved from it is the first element that went into it. Front: The end of a queue from where elements are retrieved. Queue: A linear data structure in which the element is inserted at one end while retrieved from another end. Rear: The end of a queue where new elements are inserted. Dequeue: Process of deleting elements from the queue. Enqueue: Process of inserting elements into queue. SelfAssessment 1. Which technique is followed by queue? A. FIFO B. LIFO C. Both LIFO and FIFO D. None of above 2. Enqueue () operation is used to perform A. Deletion B. Insertion C. Display D. All of above 3. Which operation is part of queue A. Peek B. isFull C. isEmpty D. All of above 4. Queue implementation is done using A. Array B. Stack C. Linked List D. All of above 5. Which is not types of Queue A. Circular B. Simple C. Complex D. Priority Lovely Professional University 69 Notes Advanced Data Structures 6. Queue is a____________ data structure. A. Static B. Linear C. Dynamic D. None of above 7. Front pointer and rear pointer are used in… A. Implementation of Queue using Linked List B. Implementation of Queue using array C. Implementation of Queue using stack D. All of above 8. Underflow condition represent. A. It checks if the queue is full before enqueueing any element. B. It checks if there exists any item before popping from the queue. C. It checks whether all variables are declared D. All of above 9. Overflow condition represent. A. It checks if there exists any item before popping from the queue. B. It checks whether all variables are initialized. C. It checks if the queue is full before enqueueing any element. D. All of above 10. Front pointer contains A. Address of the starting element of the queue B. Address of the last element of the queue. C. Link for next element D. All of above 11. Priority queue is a____________ data structure. A. Static B. Linear C. Abstract D. All of above 12. Priority queue types are A. Ascending order B. Descending order C. Both ascending and descending order D. None of above 70 Lovely Professional University Notes Unit 04: Queues 13. Priority Queue Operations are A. Deleting an Element from the Priority Queue B. Peeking from the Priority Queue (Find max/min) C. Extract-Max/Min from the Priority Queue D. All of above 14. Priority Queue Implementation are performed using. A. Linked list B. Heap data structure C. Binary search tree D. All of above 15. Binary Heap can be divided into A. Max heap B. Min-heap. C. Both max and min heap D. None of above Answers for Self Assessment 1. A 2. B 3. D 4. D 5. C 6. B 7. A 8. B 9. C 10. A 11. B 12. C 13. D 14. D 15. C Review Questions 1 “Using double ended queues is more advantageous than using circular queues. “Discuss 2 “Stacks are different from queues.” Justify. 3 “Using priority queues is advantageous in job scheduling algorithms. “Analyze 4 Can a basic queue be implemented to function as a dynamic queue? Discuss 5 Describe the application of queue. 6 How will you insert and delete an element in queue? 7 Explain dynamic memory allocation advantages. Further Readings Data Structures and Algorithms; Shi-Kuo Chang; World Scientifi c. Data Structures and Efficient Algorithms, Burkhard Monien, Thomas Ottmann, Springer. Kruse Data Structure & Program Design, Prentice Hall of India, New Delhi Mark Allen Weles: Data Structure & Algorithm Analysis in C Second Adition. Lovely Professional University 71 Notes Advanced Data Structures Addison-Wesley publishing RG Dromey, How to Solve it by Computer, Cambridge University Press. Shi-kuo Chang, Data Structures and Algorithms, World Scientifi c Sorenson and Tremblay: An Introduction to Data Structure with Algorithms. Thomas H. Cormen, Charles E, Leiserson& Ronald L. Rivest: Introduction to Algorithms. Prentice-Hall of India Pvt. Limited, New Delhi Timothy A. Budd, Classic Data Structures in C++, Addison Wesley. Web Links www.en.wikipedia.org www.web-source.net www.webopedia.com https://www.geeksforgeeks.org/ https://www.javatpoint.com/data-structure-queue https://www.tutorialspoint.com/data_structures_algorithms/dsa_queue.htm 72 Lovely Professional University Notes Ashwani Kumar, Lovely Professional University Unit 05: Search Trees Unit 05: Search Trees CONTENTS Objectives Introduction 5.1 Concept of Tree 5.2 Binary Tree 5.3 Binary Search Tree 5.4 Binary Search Tree Operations Summary Keywords Self Assessment Review Questions Answers for self Assessment Further Readings Objectives After studying this unit, you will be able to: Discuss the basics of tree Learn concept of binary tree Know binary tree traversal Explain the representation of tree in memory Introduction We know that data structure is a set of data elements grouped together under one name. A data structure can be considered as a set of rules that hold the data together. Almost all computer programs use data structures. Data structures are an essential part of algorithms. We can use it to manage huge amount of data in large databases. Some modern programming languages emphasize more on data structures than algorithms. There are many data structures that help us to manipulate the data stored in the memory, which we have discussed in the previous units. These include array, stack, queue, and linked- list. Choosing the best data structure for a program is a challenging task. Similar tasks may require different data structures. We derive new data structures for complex tasks using the already existing ones. We need to compare the characteristics of the data structures before choosing the right data structure. A tree is a hierarchical data structure suitable for representing hierarchical information. The tree data structure has the characteristics of quick search, quick inserts, and quick deletes. 5.1 Concept of Tree A tree is a set of one or more nodes T such that: 1. There is a specially designated node called root, and 2. Remaining nodes are partitioned into n >= 0 disjoint set of nodes T1, T2,…,Tn each of which is a tree. Lovely Professional University 73 Notes Advanced data structures This is a tree because it is a set of nodes {A, B, C, D, E, F, G, H, I}, with node A as a root node, and the remaining nodes are partitioned into three disjoint sets: {B, G, H, I}, {C, E, F} AND {D} respectively. Each of these sets is a tree individually because each of these sets satisfies the above properties. This is not a tree because it is a set of nodes {A, B, C, D, E, F, G, H, I}, with node A as a root node, but the remaining nodes cannot be partitioned into disjoint sets, because the node I is shared. Given below are some of the important definitions, which are used in connection with trees. Degree of Node of a Tree: The degree of a node of a tree is the number of sub-trees havingthis node as a root, or it is a number of decedents of a node. If degree is zero then it is calledterminal node or leaf node of a tree. Degree of a Tree: It is defined as the maximum of degree of the nodes of the tree, i.e. degreeof tree = max (degree (node i) for i = 1 to n). Level of a Node: We defi ne the level of the node by taking the level of the root node to be 1,and incrementing it by 1 as we move from the root towards the sub-trees i.e. the level of allthe descendents of the root nodes will be 2. The level of their descendents will be 3 and soon. We then define depth of the tree to be the maximum value of level for node of a tree. Root Node: The root of a tree is called a root node. A root node occurs only once in the whole tree. Parent Node: The parent of a node is the immediate predecessor of that node. Child Node: Child nodes are the immediate successors of a node. Leaf Node: A node which does not have any child nodes is known as a leaf node. Link: The pointer to a node in the tree is known as a link. A node can have more than one link. Path: Every node in the tree is reachable from the root node through a unique sequence of links. This sequence of links is known as a path. The number of links in a path is considered to be the length of the path. Levels: The level of a node in the tree is considered to be its hierarchical rank in the tree. Height: The height of a non-empty tree is the maximum level of a node in the tree. The height of an empty tree (no node in a tree) is 0. The height of a tree containing a single node is 1. The longest path in the tree has to be considered to measure the height of the tree. Height of a tree (h) = Imax+ 1, where Imax is the maximum level of a tree. Siblings: The nodes which have the same parent node are known as siblings. 74 Lovely Professional University Notes Unit 05: Search Trees Graphs consist of a set of nodes and edges, just like trees. But for graphs, there are no rules for the connections between nodes. In graphs, there is no concept of a root node, nor a concept of parents and children. Rather, a graph is just a collection of interconnected nodes. All trees are graphs. A tree is a special case of a graph, in which the nodes are all reachable from some starting node. Representation of Tree in Graphs A graph G consists of a set of objects V = {v1, v2, v3 …} called vertices (points or nodes) and a set of objects E = {e1, e2, e3 ….} called edges (lines or arcs). The set V (G) is called the vertex set of G and E (G) is the edge set. The graph is denoted as G = (V, E) Let G be a graph and {u, v} an edge of G. Since {u, v} is 2-element set, we write {v, u} instead of {u, v}. This edge can be represented as uv or vu. If e = uv is an edge of a graph G, then u and v are adjacent in G and e joins u and v. Consider given graph This graph G is defined by the sets: V (G) = {u, v, w, x, y, z} and E(G) = { uv, uw, wx, xy, xz} Every graph has a diagram associated with it. The vertex u and an edge e are incident with each other as are v and e. If two distinct edges e and f are incident with a common vertex, then they are adjacent edges. 5.2 Binary Tree A binary tree is a special tree where each non-leaf node can have atmost two child nodes. Most important types of trees which are used to model yes/no, on/off, higher/lower, i.e., binary decisions are binary trees. A tree data structure in which every node has a maximum of two child nodes is known as a binary tree. It is the most commonly used non-linear data structure. A binary tree could either have only a root node or two disjoint binary trees called the left sub-tree or the right sub-tree. An empty tree could also be a binary tree. Recursive Definition: “A binary tree is either empty or a node that has left and right sub-trees that are binary trees. Empty trees are represented as boxes (but we will almost always omit the boxes)”. Binary Tree Structure Lovely Professional University 75 Notes Advanced data structures In a formal way, we can define a binary tree as a finite set of nodes which is either empty or partitioned in to sets of T0, Tl, Tr, where T0 is the root and Tl and Tr are left and right binary trees, respectively. So, for a binary tree we find that: 1. The maximum number of nodes at level i will be 2i−1 2. If k is the depth of the tree then the maximum number of nodes that the tree can have is 2k − 1 = 2k−1 + 2k−2 + … + 20 Types of Binary Tree There are two main binary tree and these are: 1. Full binary tree 2. Complete binary tree Full Binary Tree A full binary tree is a binary of depth k having 2k − 1 nodes. If it has < 2k − 1, it is not a full binary tree. For k = 3, the number of nodes = 2k − 1 = 23 − 1 = 8 − 1 = 7. A full binary tree with depth k = 3 is shown in figure. A Full Binary Tree We use numbers from 1 to 2k − 1 as labels of the nodes of the tree. If a binary tree is full, then we can number its nodes sequentially fom 1 to 2k−1, starting from the root node, and at every level numbering the nodes from left to right. Complete Binary Tree A complete binary tree of depth k is a tree with n nodes in which these n nodes can be numbered sequentially from 1 to n, as if it would have been the fi rst n nodes in a full binary tree of depth k. A complete binary tree with depth k = 3 is shown in Figure A Complete Binary Tree Properties of a Binary Tree Main properties of binary tree are: 1. If a binary tree contains n nodes, then it contains exactly n – 1 edges; 2. A Binary tree of height h has 2h – 1 nodes or less. 76 Lovely Professional University Notes Unit 05: Search Trees 3. If we have a binary tree containing n nodes, then the height of the tree is at most n and at least ceiling log2 (n + 1). 4. If a binary tree has n nodes at a level l then, it has at most 2n nodes at a level l + 1 5. The total number of nodes in a binary tree with depth k (root has depth zero) is N = 20 + 21 + 22 + …….+ 2k = 2k+1 – 1. 5.3 Binary Search Tree A binary search tree is a binary tree which may be empty, and every node contains an identifier and 1. Identifier of any node in the left sub-tree is less than the identifier of the root 2. Identifier of any node in the right sub-tree is greater than the identifier of the root and the left sub-tree as well as right sub-tree both are binary search trees. Binary Search Tree A binary search tree is basically a binary tree, and therefore it can be traversed is inorder, preorder, and postorder. If we traverse a binary search tree in inorder and print the identifiers contained in the nodes of the tree, we get a sorted list of identifiers in the ascending order. A binary search tree is an important search structure. For example, consider the problem of searching a list. If a list is an ordered then searching becomes faster, if we use a contiguous list and binary search, but if we need to make changes in the list like inserting new entries and deleting old entries. Then it is much slower to use a contiguous list because insertion and deletion in a contiguous list requires moving many of the entries every time. So we may think of using a linked list because it permits insertions and deletions to be carried out by adjusting only few pointers, but in a linked list there is no way to move through the list other than one node at a time hence permitting only sequential access. Binary trees provide an excellent solution to this problem. By making the entries of an ordered list into the nodes of a binary search tree, we find that we can search for a key in O(n log n) steps. Creating a Binary Search Tree We assume that every node a binary search tree is capable of holding an integer data item and the links which can be made pointing to the root of the left and the right sub-tree respectively. Therefore the structure of the node can be defi ned using the following declaration: struct tnode { int data; tnode *lchid; tnode *rchild; } To create a binary search tree we use a procedure named insert which creates a new node with the data value supplied as a parameter to it, and inserts into an already existing tree whose root Lovely Professional University 77 Notes Advanced data structures pointer is also passed as a parameter. The procedure accomplishes this by checking whether the tree whose root pointer is passed as a parameter is empty. If it is empty then the newly created node is inserted as a root node. If it is not empty then it copies the root pointer into a variable temp1, it then stores value of temp1 in another variable temp2, compares the data value of the node pointed to by temp1 with the data value supplied as a parameter, if the data value supplied as a parameter is smaller than the data value of the node pointed to by temp1 then it copies the left link of the node pointed by temp1 into temp1 (goes to the left), otherwise it copies the right link of the node pointed by temp1 into temp1(goes to the right). It repeats this process till temp1 becomes nil. When temp1 becomes nil, the new node is inserted as a left child of the node pointed to by temp2 if data value of the node pointed to by temp2 is greater than data value supplied as parameter. Otherwise the new node is inserted as a right child of node pointed to by temp2. Therefore the insert procedure is void insert(tnode *p, int val) { tnode *temp1, *temp2; if (p == NULL) { p = new(tnode); p->data = val; p->1child = NULL; p->rchild = NULL; } else { temp1 = p; while(temp1 != NULL) { temp2 = temp1; if(temp1->data > val) temp1 = temp1->1eft; else temp1 = temp1->right; } if(temp2->data > val) { temp2->left = new(tnode); temp2 = temp2->left; temp2->data = val; temp2->left = NULL; temp2->right= NULL; } else { temp2->right = new(tnode); 78 Lovely Professional University Notes Unit 05: Search Trees temp2 = temp2->right; temp2->data = val; temp2->left = NULL; temp2->right = NULL; } } } 5.4 Binary Search Tree Operations The four main operations that we perform on binary trees are: 1. Searching 2. Insertion 3. Deletion 4. Traversal Searching in Binary Search Trees In searching, the node being searched is called as key node. We first match the key node with the root node. If the value of the key node is greater than the current node, then we search for it in the right subtree, else we search in the left sub-tree. We continue this process until we find the node or until no nodes are left. The pseudo code for searching a binary search tree is as follows: Pseudocode for a Binary Search Tree find(X, node){ if(node = NULL) return NULL if(X = node:data) return node else if(Xnode:data) return find(X,node:rightChild) } Inserting in Binary Search Trees To insert a new element in an existing binary search tree, first we compare the value of the new nodewith the current node value. If the value of the new node is lesser than the current node value, we insertit as a left sub-node. If the value of the new node is greater than the current node value, then we insert itas a right sub-node. If the root node of the tree does not have any value, we can insert the new node asthe root node. Algorithm for Inserting a Value in a Binary Search Tree 1. Read the value for the node that needs to be created and store it in a node called NEW. 2. At first, if (root! =NULL) then root = NEW. 3. If (NEW->value < root->value) then attach NEW node as a left child node of root, else attach NEWnode as a right child node of root. 4. Repeat steps 3 and 4 for creating the desired binary search tree completely. Lovely Professional University 79 Notes Advanced data structures When inserting any node in a binary search tree, it is necessary to look for its proper position in thebinary search tree. The new node is compared with every node of the tree. If the value of the nodewhich is to be inserted is more than the value of the current node, then the right sub-tree is considered,else the left sub-tree is considered. Once the proper position is identified, the new node is attached asthe left or right child node. Let us now discuss the pseudo code for inserting a new element in a binarysearch tree. Pseudocode for Inserting a Value in a Binary Search Tree //Purpose: Insert data object X into the Tree //Inputs: Data object X (to be inserted), binary-search-tree node //Effect: Do nothing if tree already contains X; // otherwise, update binary search tree by adding a new node containing data object X insert(X, node){ if(node = NULL){ node = new binaryNode(X,NULL,NULL) return } if(X = node:data) return else if(Xnode:data insert(X, node:rightChild) } Deleting in Binary Search Trees If the node to be deleted has no children, we can just delete it. If the node to be deleted has one child, then the node is deleted and the child is connected directly to the parent node. There are mainly three cases possible for deletion of any node from a binary search tree. They are: 1. Deletion of the leaf node 2. Deletion of a node that has one child 3. Deletion of a node that has two children We can delete an existing element from a binary search tree using the following pseudocode: Pseudocode for Deleting a Value from a Binary Search Tree //Purpose: Delete data object X from the Tree //Inputs: Data object X (to be deleted), binary-search-tree node //Effect: Do nothing if tree does not contain X; // else, update binary search tree by deleting the node containing data object X delete(X, node){ if(node = NULL) //nothing to do return if(Xnode:data) delete(X, node:rightChild) else { // found the node to be deleted. Take action based on number of node children if(node:leftChild = NULL and node:rightChild = NULL){ delete node node = NULL return } else if(node:leftChild = NULL){ tempNode = node node = node:rightChild delete tempNode} else if(node:rightChild = NULL){ tempNode = node node = node:leftChild delete tempNode } else { //replace node:data with minimum data from right sub-tree tempNode = findMin(node.rightChild) node:data = tempNode:data delete(node:data,node:rightChild) } } Pseudocode for Finding Minimum Value from a Binary Search Tree //Purpose: return least data object X in the T