Data Structures Full Notes PDF
Document Details
Uploaded by Deleted User
Tags
Related
- Data Structures & Algorithms Lecture Notes 1 PDF
- Data Structures and Algorithms Study Guide (Sorsogon State University)
- Introduction to Java Programming and Data Structures (2019) by Y. Daniel Liang - PDF
- II-Sem Computer Science Syllabus PDF
- Data Structures and Algorithms with Python and C++ PDF
- Data Structures and Algorithms(All Sessions) PDF
Summary
These notes provide an introduction to fundamental data structures, including arrays, linked lists, stacks, queues, trees, and graphs. It explains the different types of data structures, their characteristics, and their applications. The document also defines data types and structures and their differences.
Full Transcript
DATA STRUCTURES UNIT I Data structure is a particular way of organizing data in a computer so that it can be used effectively. Data Structures are widely used in almost every aspect of Computer Science i.e. Operating System, Compiler Design, Artificial intelli...
DATA STRUCTURES UNIT I Data structure is a particular way of organizing data in a computer so that it can be used effectively. Data Structures are widely used in almost every aspect of Computer Science i.e. Operating System, Compiler Design, Artificial intelligence, Graphics and many more. Data Structures are the main part of many computer science algorithms as they enable the programmers to handle the data in an efficient way. It plays a vital role in enhancing the performance of a software or a program as the main function of the software is to store and retrieve the user's data as fast as possible Example of data structures 1. Arrays An array is a structure of fixed-size, which can hold items of the same data type. It can be an array of integers, an array of floating-point numbers, an array of strings or even an array of arrays (such as 2- dimensional arrays). Arrays are indexed, meaning that random access is possible. 2. Linked Lists A linked list is a sequential structure that consists of a sequence of items in linear order which are linked to each other. Hence, you have to access data sequentially and random access is not possible. 3. Stacks A stack is a LIFO (Last In First Out — the element placed at last can be accessed at first) structure which can be commonly found in many programming languages. This structure is named as ―stack‖ because it resembles a real-world stack — a stack of plates. 4. Queues A queue is a FIFO (First In First Out — the element placed at first can be accessed at first) structure which can be commonly found in many programming languages. This structure is named as ―queue‖ because it resembles a real-world queue — people waiting in a queue. 5. Trees A tree is a hierarchical structure where data is organized hierarchically and are linked together. This structure is different than a linked list whereas, in a linked list, items are linked in a linear order. 6. Graphs A graph consists of a finite set of vertices or nodes and a set of edges connecting these vertices. Elementary Data Organization Data: Data can be defined as an elementary value, for example, student's name and his roll number are the data about the student. Group Items: Data items which have subordinate data items are called Group item, for example, name of a student can have first name and the last name. Record: Record can be defined as the collection of various data items, for example, if we talk about the student entity, then his name, address, course and marks can be grouped together to form the record for the student. File: A File is a collection of various records of one type of entity, for example, if there are 60 employees in a Bank, then there will be 60 records in the related file where each record contains the data about each employee. Entity: An entity is a person, place, thing, event or concept about which information recorded. What is the difference between a data structure and a data type? Data type Data structure A data type describes pieces of data A data structure is a way of describing that all share a common property. a certain way to organize pieces of data so that operations and algorithms can be more easily applied. For example an integer data type For example tree type data structures describes every integer that the often allow for efficient searching computer can handle. algorithms. Values can directly be assigned to the The data is assigned to the data data type variables. structure object using some set of algorithms and operations like push, pop and so on. No problem of time complexity Time complexity comes into play when working with data structures. Examples: int, float, double Examples: stacks, queues, tree Categories of data structure Data structures can be classified as follows Primitive Data Structures These are the structures which are supported at the machine level, they can be used to make non-primitive data structures. These are integral and are pure in form. They have predefined behavior and specifications. Examples: Integer, float, character etc Non-primitive Data Structures The non-primitive data structures cannot be performed without the primitive data structures. Although, they too are provided by the system itself yet they are derived data structures and cannot be formed without using the primitive data structures. The Non-primitive data structures are further divided into the following categories: 1. Linear data structure Data structure where data elements are arranged sequentially or linearly where the elements are attached to its previous and next adjacent in what is called a linear data structure. In linear data structure, single level is involved. Therefore, we can traverse all the elements in single run only. Linear data structures are easy to implement because computer memory is arranged in a linear way. Its examples are array, stack, queue, linked list 2. Non Linear data structure Data structures where data elements are not arranged sequentially or linearly are called non-linear data structures. In a non-linear data structure, single level is not involved. Therefore, we can’t traverse all the elements in single run only. Non-linear data structures are not easy to implement in comparison to linear data structure. It utilizes computer memory efficiently in comparison to a linear data structure. Its examples are trees and graphs. Data structure operations 1. Traversing- It is used to access each data item exactly once so that it can be processed. 2. Searching- It is used to find out the location of the data item if it exists in the given collection of data items. 3. Inserting- It is used to add a new data item in the given collection of data items. 4. Deleting- It is used to delete an existing data item from the given collection of data items. 5. Sorting- It is used to arrange the data items in some order i.e. in ascending or descending order in case of numerical data and in dictionary order in case of alphanumeric data. 6. Merging- It is used to combine the data items of two sorted files into single file in the sorted form. Applications of data structures To search and find a specified target value, such as maximum, minimum, mean, median, average, count from a group of data objects the arrays are very efficient. A stack can be used as an "undo" mechanism in text editors; this operation has been accomplished by keeping all text changes in a stack. Queues are used for CPU job scheduling and in disk scheduling Linked lists are used in Symbol Tables for balancing parenthesis and in representing Sparse Matrix. An operating system maintains a disk's file system as a tree, where file folders act as tree nodes. The tree structure is useful because it easily accommodates the creation and deletion of folders and files. Shortest Path algorithm is the commonly used algorithm for determining the shortest path between any two points. This is practically used in applications such as network stations, computer games, maps, satellite navigation system etc. Algorithms Algorithm is a step-by-step procedure, which defines a set of instructions to be executed in a certain order to get the desired output. Algorithms are generally created independent of underlying languages, i.e. an algorithm can be implemented in more than one programming language. From the data structure point of view, following are some important categories of algorithms − Search − Algorithm to search an item in a data structure. Sort − Algorithm to sort items in a certain order. Insert − Algorithm to insert item in a data structure. Update − Algorithm to update an existing item in a data structure. Delete − Algorithm to delete an existing item from a data structure. Characteristics of an Algorithm Not all procedures can be called an algorithm. An algorithm should have the following characteristics − Unambiguous − Algorithm should be clear and unambiguous. Each of its steps (or phases), and their inputs/outputs should be clear and must lead to only one meaning. Input − An algorithm should have 0 or more well-defined inputs. Output − An algorithm should have 1 or more well-defined outputs, and should match the desired output. Finiteness − Algorithms must terminate after a finite number of steps. Feasibility − Should be feasible with the available resources. Independent − An algorithm should have step-by-step directions, which should be independent of any programming code. How to Write an Algorithm? There are no well-defined standards for writing algorithms. Rather, it is problem and resource dependent. Algorithms are never written to support a particular programming code. As we know that all programming languages share basic code constructs like loops (do, for, while), flow-control (if-else), etc. These common constructs can be used to write an algorithm. Example Let's try to learn algorithm-writing by using an example. Problem − Design an algorithm to add two numbers and display the result. Step 1 − START Step 2 − declare three integers a, b & c Step 3 − define values of a & b Step 4 − add values of a & b Step 5 − store output of step 4 to c Step 6 − print c Step 7 − STOP Algorithms tell the programmers how to code the program. Alternatively, the algorithm can be written as − Step 1 − START ADD Step 2 − get values of a & b Step 3 − c ← a + b Step 4 − display c Step 5 − STOP In design and analysis of algorithms, usually the second method is used to describe an algorithm. It makes it easy for the analyst to analyze the algorithm ignoring all unwanted definitions. He can observe what operations are being used and how the process is flowing. Writing step numbers, is optional. We design an algorithm to get a solution of a given problem. A problem can be solved in more than one ways. Algorithm Complexity Suppose X is an algorithm and n is the size of input data, the time and space used by the algorithm X are the two main factors, which decide the efficiency of X. Time Factor − Time is measured by counting the number of key operations such as comparisons in the sorting algorithm. Space Factor − Space is measured by counting the maximum memory space required by the algorithm. The complexity of an algorithm f(n) gives the running time and/or the storage space required by the algorithm in terms of n as the size of input data. Space Complexity Space complexity of an algorithm represents the amount of memory space required by the algorithm in its life cycle. The space required by an algorithm is equal to the sum of the following two components − A fixed part that is a space required to store certain data and variables, that are independent of the size of the problem. For example, simple variables and constants used, program size, etc. A variable part is a space required by variables, whose size depends on the size of the problem. For example, dynamic memory allocation, recursion stack space, etc. Space complexity S(P) of any algorithm P is S(P) = C + SP(I), where C is the fixed part and S(I) is the variable part of the algorithm, which depends on instance characteristic I. Following is a simple example that tries to explain the concept − Algorithm: SUM(A, B) Step 1 - START Step 2 - C ← A + B + 10 Step 3 - Stop Here we have three variables A, B, and C and one constant. Hence S(P) = 1 + 3. Now, space depends on data types of given variables and constant types and it will be multiplied accordingly. Time Complexity Time complexity of an algorithm represents the amount of time required by the algorithm to run to completion. Time requirements can be defined as a numerical function T(n), where T(n) can be measured as the number of steps, provided each step consumes constant time. Time-Memory Tradeoff In computer science, a space-time or time-memory tradeoff is a way of solving a problem or calculation in less time by using more storage space (or memory), or by solving a problem in very little space by spending a long time. Most computers have a large amount of space, but not infinite space. Also, most people are willing to wait a little while for a big calculation, but not forever. So if your problem is taking a long time but not much memory, a space-time tradeoff would let you use more memory and solve the problem more quickly. Or, if it could be solved very quickly but requires more memory than you have, you can try to spend more time solving the problem in the limited memory. Big O Notation Big O notation is the language we use for talking about how long an algorithm takes to run. It's how we compare the efficiency of different approaches to a problem. It's like math except it's an awesome, not-boring kind of math. With big O notation we express the runtime in terms of—brace yourself—how quickly it grows relative to the input, as the input gets arbitrarily large. Big O notation is used to classify algorithms according to how their run time or space requirements grow as the input size grows. Strings A string in C is an array of characters. The length of a string is determined by a terminating null character: '\0'. So, a string with the contents, say, "abc" has four characters: 'a', 'b', 'c', and the terminating null ( '\0') character. The terminating null character has the value zero. Rithun String operations Function Work of Function strlen() computes string's length strcpy() copies a string to another strcat() concatenates(joins) two strings strcmp() compares two strings strlwr() converts string to lowercase strupr() converts string to uppercase Strings handling functions are defined under "string.h" header file. #include UNIT II Array Array is a container which can hold a fixed number of items and these items should be of the same type. Most of the data structures make use of arrays to implement their algorithms. Following are the important terms to understand the concept of Array. Element − Each item stored in an array is called an element. Index − Each location of an element in an array has a numerical index, which is used to identify the element. Representation of linear array in memory Arrays can be declared in various ways in different languages. For illustration, let's take C array declaration. As per the above illustration, following are the important points to be considered. Index starts with 0. Array length is 10 which means it can store 10 elements. Each element can be accessed via its index. For example, we can fetch an element at index 6 as 27. Basic Operations Following are the basic operations supported by an array. Traverse − print all the array elements one by one. Insertion − Adds an element at the given index. Deletion − Deletes an element at the given index. Search − Searches an element using the given index or by the value. Traverse Operation This operation is to traverse through the elements of an array. Example Following program traverses and prints the elements of an array: #include int main() { int age={20,21,19,18,25}; int i; for(i=0;i next; } Algorithm o STEP 1: SET PTR = HEAD o STEP 2: IF PTR = NULL WRITE "EMPTY LIST" GOTO STEP 7 END OF IF o STEP 4: REPEAT STEP 5 AND 6 UNTIL PTR != NULL o STEP 5: PRINT PTR→ DATA o STEP 6: PTR = PTR → NEXT [END OF LOOP] o STEP 7: EXIT Insertion Operation Adding a new node in linked list is a more than one step activity. We shall learn this with diagrams here. First, create a node using the same structure and find the location where it has to be inserted. Imagine that we are inserting a node B (NewNode), between A (LeftNode) and C (RightNode). Then point B.next to C − NewNode.next −> RightNode; It should look like this − Now, the next node at the left should point to the new node. LeftNode.next −> NewNode; This will put the new node in the middle of the two. The new list should look like this − Similar steps should be taken if the node is being inserted at the beginning of the list. While inserting it at the end, the second last node of the list should point to the new node and the new node will point to NULL. Deletion Operation Deletion is also a more than one step process. We shall learn with pictorial representation. First, locate the target node to be removed, by using searching algorithms. The left (previous) node of the target node now should point to the next node of the target node − LeftNode.next −> TargetNode.next; This will remove the link that was pointing to the target node. Now, using the following code, we will remove what the target node is pointing at. TargetNode.next −> NULL; We need to use the deleted node. We can keep that in memory otherwise we can simply deallocate memory and wipe off the target node completely. Searching in Linked List Sequential search is the most common search used on linked list structures. Algorithm Step-1: Initialise the Current pointer with the beginning of the List. Step-2: Compare the KEY value with the Current node value; if they match then quit there else go to step-3. Step-3: Move the Current pointer to point to the next node in the list and go to step-2, till the list is not over or else quit Header Linked List A Header linked list is one more variant of linked list. In Header linked list, we have a special node present at the beginning of the linked list. This special node is used to store number of nodes present in the linked list. In other linked list variant, if we want to know the size of the linked list we use traversal method. But in Header linked list, the size of the linked list is stored in its header itself. Two Way Linked List (Doubly Linked List) Doubly Linked List is a variation of Linked list in which navigation is possible in both ways, either forward and backward easily as compared to Single Linked List. Following are the important terms to understand the concept of doubly linked list. Link − Each link of a linked list can store a data called an element. Next − Each link of a linked list contains a link to the next link called Next. Prev − Each link of a linked list contains a link to the previous link called Prev. LinkedList − A Linked List contains the connection link to the first link called First and to the last link called Last. Doubly Linked List Representation As per the above illustration, following are the important points to be considered. Doubly Linked List contains a link element called first and last. Each link carries a data field(s) and two link fields called next and prev. Each link is linked with its next link using its next link. Each link is linked with its previous link using its previous link. The last link carries a link as null to mark the end of the list. Basic Operations Following are the basic operations supported by a list. Insertion − Adds an element at the beginning of the list. Deletion − Deletes an element at the beginning of the list. Insert Last − Adds an element at the end of the list. Delete Last − Deletes an element from the end of the list. Insert After − Adds an element after an item of the list. Delete − Deletes an element from the list using the key. Display forward − Displays the complete list in a forward manner. Display backward − Displays the complete list in a backward manner. Circular Linked List Circular Linked List is a variation of Linked list in which the first element points to the last element and the last element points to the first element. Both Singly Linked List and Doubly Linked List can be made into a circular linked list. Singly Linked List as Circular In singly linked list, the next pointer of the last node points to the first node. Doubly Linked List as Circular In doubly linked list, the next pointer of the last node points to the first node and the previous pointer of the first node points to the last node making the circular in both directions. As per the above illustration, following are the important points to be considered. The last link's next points to the first link of the list in both cases of singly as well as doubly linked list. The first link's previous points to the last of the list in case of doubly linked list. Basic Operations Following are the important operations supported by a circular list. insert − Inserts an element at the start of the list. delete − Deletes an element from the start of the list. display − Displays the list. Applications of linked list 1. Implementation of stacks and queues 2. Implementation of graphs : Adjacency list representation of graphs is most popular which is uses linked list to store adjacent vertices. 3. Dynamic memory allocation : We use linked list of free blocks. 4. Maintaining directory of names 5. Performing arithmetic operations on long integers 6. Manipulation of polynomials by storing constants in the node of linked list 7. representing sparse matrices 8. Image viewer – Previous and next images are linked, hence can be accessed by next and previous button. 9. Previous and next page in web browser – We can access previous and next url searched in web browser by pressing back and next button since, they are linked as linked list. 10. Music Player – Songs in music player are linked to previous and next song. you can play songs either from starting or ending of the list. Algorithm of insertion and deletion in Singly Linked List (SLL) Insertion In a single linked list, the insertion operation can be performed in three ways. They are as follows... 1. Inserting At Beginning of the list 2. Inserting At End of the list 3. Inserting At Specific location in the list Inserting At Beginning of the list We can use the following steps to insert a new node at beginning of the single linked list... Step 1 - Create a newNode with given value. Step 2 - Check whether list is Empty (head == NULL) Step 3 - If it is Empty then, set newNode→next = NULL and head = newNode. Step 4 - If it is Not Empty then, set newNode→next = head and head = newNode. Inserting At End of the list We can use the following steps to insert a new node at end of the single linked list... Step 1 - Create a newNode with given value and newNode → next as NULL. Step 2 - Check whether list is Empty (head == NULL). Step 3 - If it is Empty then, set head = newNode. Step 4 - If it is Not Empty then, define a node pointer temp and initialize with head. Step 5 - Keep moving the temp to its next node until it reaches to the last node in the list (until temp → next is equal to NULL). Step 6 - Set temp → next = newNode. Inserting At Specific location in the list (After a Node) We can use the following steps to insert a new node after a node in the single linked list... Step 1 - Create a newNode with given value. Step 2 - Check whether list is Empty (head == NULL) Step 3 - If it is Empty then, set newNode → next = NULL and head = newNode. Step 4 - If it is Not Empty then, define a node pointer temp and initialize with head. Step 5 - Keep moving the temp to its next node until it reaches to the node after which we want to insert the newNode (until temp1 → data is equal to location, here location is the node value after which we want to insert the newNode). Step 6 - Every time check whether temp is reached to last node or not. If it is reached to last node then display 'Given node is not found in the list!!! Insertion not possible!!!' and terminate the function. Otherwise move the temp to next node. Step 7 - Finally, Set 'newNode → next = temp → next' and 'temp → next = newNode' Deletion In a single linked list, the deletion operation can be performed in three ways. They are as follows... 1. Deleting from Beginning of the list 2. Deleting from End of the list 3. Deleting a Specific Node Deleting from Beginning of the list We can use the following steps to delete a node from beginning of the single linked list... Step 1 - Check whether list is Empty (head == NULL) Step 2 - If it is Empty then, display 'List is Empty!!! Deletion is not possible' and terminate the function. Step 3 - If it is Not Empty then, define a Node pointer 'temp' and initialize with head. Step 4 - Check whether list is having only one node (temp → next == NULL) Step 5 - If it is TRUE then set head = NULL and delete temp (Setting Empty list conditions) Step 6 - If it is FALSE then set head = temp → next, and delete temp. Deleting from End of the list We can use the following steps to delete a node from end of the single linked list... Step 1 - Check whether list is Empty (head == NULL) Step 2 - If it is Empty then, display 'List is Empty!!! Deletion is not possible' and terminate the function. Step 3 - If it is Not Empty then, define two Node pointers 'temp1' and 'temp2' and initialize 'temp1' with head. Step 4 - Check whether list has only one Node (temp1 → next == NULL) Step 5 - If it is TRUE. Then, set head = NULL and delete temp1. And terminate the function. (Setting Empty list condition) Step 6 - If it is FALSE. Then, set 'temp2 = temp1 ' and move temp1 to its next node. Repeat the same until it reaches to the last node in the list. (until temp1 → next == NULL) Step 7 - Finally, Set temp2 → next = NULL and delete temp1. Deleting a Specific Node from the list We can use the following steps to delete a specific node from the single linked list... Step 1 - Check whether list is Empty (head == NULL) Step 2 - If it is Empty then, display 'List is Empty!!! Deletion is not possible' and terminate the function. Step 3 - If it is Not Empty then, define two Node pointers 'temp1' and 'temp2' and initialize 'temp1' with head. Step 4 - Keep moving the temp1 until it reaches to the exact node to be deleted or to the last node. And every time set 'temp2 = temp1' before moving the 'temp1' to its next node. Step 5 - If it is reached to the last node then display 'Given node not found in the list! Deletion not possible!!!'. And terminate the function. Step 6 - If it is reached to the exact node which we want to delete, then check whether list is having only one node or not Step 7 - If list has only one node and that is the node to be deleted, then set head = NULL and delete temp1 (free(temp1)). Step 8 - If list contains multiple nodes, then check whether temp1 is the first node in the list (temp1 == head). Step 9 - If temp1 is the first node then move the head to the next node (head = head → next) and delete temp1. Step 10 - If temp1 is not first node then check whether it is last node in the list (temp1 → next == NULL). Step 11 - If temp1 is last node then set temp2 → next = NULL and delete temp1 (free(temp1)). Step 12 - If temp1 is not first node and not last node then set temp2 → next = temp1 → next and delete temp1 (free(temp1)). UNIT III Stack Data Structure Stack is a linear data structure which follows a particular order in which the operations are performed. The order may be LIFO(Last In First Out) or FILO(First In Last Out). There are many real-life examples of a stack. Consider an example of plates stacked over one another in the canteen. The plate which is at the top is the first one to be removed, i.e. the plate which has been placed at the bottommost position remains in the stack for the longest period of time. So, it can be simply seen to follow LIFO(Last In First Out)/FILO(First In Last Out) order. This feature makes it LIFO data structure. LIFO stands for Last-in-first-out. Here, the element which is placed (inserted or added) last, is accessed first. In stack terminology, insertion operation is called PUSH operation and removal operation is called POP operation. The following diagram depicts a stack and its operations − A stack can be implemented by means of Array, Structure, Pointer, and Linked List. Stack can either be a fixed size one or it may have a sense of dynamic resizing. Here, we are going to implement stack using arrays, which makes it a fixed size stack implementation. Primitive operations on stack Stack operations may involve initializing the stack, using it and then de-initializing it. Apart from these basic stuffs, a stack is used for the following two primary operations − push() − Pushing (storing) an element on the stack. pop() − Removing (accessing) an element from the stack. When data is PUSHed onto stack. To use a stack efficiently, we need to check the status of stack as well. For the same purpose, the following functionality is added to stacks − peek() − get the top data element of the stack, without removing it. isFull() − check if stack is full. isEmpty() − check if stack is empty. At all times, we maintain a pointer to the last PUSHed data on the stack. As this pointer always represents the top of the stack, hence named top. The top pointer provides top value of the stack without actually removing it. Stack Push & Pop When you add an item (often called an element) to the top of a stack, that's called a push. Push is a good word for this, because the new item pushes all the old ones down. If you've ever worked in a cafeteria or restaurant kitchen, spring-loaded shafts for plates are super common. The weight of the plates on top push the lower plates down. When you remove an item from a stack, that's called a pop. In a cafeteria, when you remove the topmost plate, the ones beneath it pop up because of the spring beneath them. Well, a computer stack is kind of like that. Stack Underflow & Overflow Now that we have a basic picture in mind of what a stack conceptually looks like, we can define what underflow and overflow are. Stack underflow happens when we try to pop (remove) an item from the stack, when nothing is actually there to remove. This will raise an alarm of sorts in the computer because we told it to do something that cannot be done. Stack overflow happens when we try to push one more item onto our stack than it can actually hold. You see, the stack usually can hold only so much stuff. Typically, we allocate (set aside) where the stack is going to be in memory and how big it can get. So, when we stick too much stuff there or try to remove nothing, we will generate a stack overflow condition or stack underflow condition, respectively. Algorithms for push and pop Push Operation The process of putting a new data element onto stack is known as a Push Operation. Push operation involves a series of steps − Step 1 − Checks if the stack is full. Step 2 − If the stack is full, produces an error and exit. Step 3 − If the stack is not full, increments top to point next empty space. Step 4 − Adds data element to the stack location, where top is pointing. Step 5 − Returns success. If the linked list is used to implement the stack, then in step 3, we need to allocate space dynamically. Algorithm for PUSH Operation A simple algorithm for Push operation can be derived as follows − begin procedure push: stack, data if stack is full return null endif top ← top + 1 stack[top] ← data end procedure Pop Operation Accessing the content while removing it from the stack, is known as a Pop Operation. In an array implementation of pop() operation, the data element is not actually removed, instead top is decremented to a lower position in the stack to point to the next value. But in linked-list implementation, pop() actually removes data element and deallocates memory space. A Pop operation may involve the following steps − Step 1 − Checks if the stack is empty. Step 2 − If the stack is empty, produces an error and exit. Step 3 − If the stack is not empty, accesses the data element at which top is pointing. Step 4 − Decreases the value of top by 1. Step 5 − Returns success. Algorithm for Pop Operation A simple algorithm for Pop operation can be derived as follows − begin procedure pop: stack if stack is empty return null endif data ← stack[top] top ← top - 1 return data end procedure Array implementation of Stack In array implementation, the stack is formed by using the array. All the operations regarding the stack are performed using arrays. Lets see how each operation can be implemented on the stack using array data structure. Adding an element onto the stack (push operation) Adding an element into the top of the stack is referred to as push operation. Push operation involves following two steps. 1. Increment the variable Top so that it can now refere to the next memory location. 2. Add element at the position of incremented top. This is referred to as adding new element at the top of the stack. Stack is overflown when we try to insert an element into a completely filled stack therefore, our main function must always avoid stack overflow condition. Algorithm: begin if top = n then stack full top = top + 1 stack (top) : = item; end Time Complexity : o(1) implementation of push algorithm in C language void push (int val,int n) //n is size of the stack { if (top == n ) printf("\n Overflow"); else { top = top +1; stack[top] = val; } } Deletion of an element from a stack (Pop operation) Deletion of an element from the top of the stack is called pop operation. The value of the variable top will be incremented by 1 whenever an item is deleted from the stack. The top most element of the stack is stored in an another variable and then the top is decremented by 1. the operation returns the deleted value that was stored in another variable as the result. The underflow condition occurs when we try to delete an element from an already empty stack. Algorithm : begin if top = 0 then stack empty; item := stack(top); top = top - 1; end; Time Complexity : o(1) Implementation of POP algorithm using C language int pop () { if(top == -1) { printf("Underflow"); return 0; } else { return stack[top - - ]; } } Linked List implementation of Stack Linked List-based implementation provides the best (from the efficiency point of view) dynamic stack implementation. Stack applications 1. Stacks can be used for expression evaluation. 2. Stacks can be used to check parenthesis matching in an expression. 3. Stacks can be used for Conversion from one form of expression to another. 4. Stacks can be used for Memory Management. 5. Stack data structures are used in backtracking problems. Polish Notation Polish notation is a notation form for expressing arithmetic, logic and algebraic equations. Its most basic distinguishing feature is that operators are placed on the left of their operands. If the operator has a defined fixed number of operands, the syntax does not require brackets or parenthesis to lessen ambiguity. Polish notation is also known as prefix notation, prefix Polish notation, normal Polish notation, Warsaw notation and Lukasiewicz notation. Polish notation was invented in 1924 by Jan Lukasiewicz, a Polish logician and philosopher, in order to simplify sentential logic. The idea is simply to have a parenthesis-free notation that makes each equation shorter and easier to parse in terms of defining the evaluation priority of the operators. Example: Infix notation with parenthesis: (3 + 2) * (5 – 1) Polish notation: * + 3 2 – 5 1 Recursion "Recursion" is technique of solving any problem by calling same function again and again until some breaking (base) condition where recursion stops and it starts calculating the solution from there on. For eg. calculating factorial of a given number Thus in recursion last function called needs to be completed first. Now Stack is a LIFO data structure i.e. ( Last In First Out) and hence it is used to implement recursion. The High level Programming languages, such as Pascal , C etc. that provides support for recursion use stack for book keeping. Queue A Queue is a linear structure which follows a particular order in which the operations are performed. The order is First In First Out (FIFO). A good example of a queue is any queue of consumers for a resource where the consumer that came first is served first. The difference between stacks and queues is in removing. In a stack we remove the item the most recently added; in a queue, we remove the item the least recently added. Queue Representation As we now understand that in queue, we access both ends for different reasons. The following diagram given below tries to explain queue representation as data structure − As in stacks, a queue can also be implemented using Arrays, Linked-lists, Pointers and Structures. For the sake of simplicity, we shall implement queues using one- dimensional array. Basic Operations Queue operations may involve initializing or defining the queue, utilizing it, and then completely erasing it from the memory. Here we shall try to understand the basic operations associated with queues − enqueue() − add (store) an item to the queue. dequeue() − remove (access) an item from the queue. Few more functions are required to make the above-mentioned queue operation efficient. These are − peek() − Gets the element at the front of the queue without removing it. isfull() − Checks if the queue is full. isempty() − Checks if the queue is empty. In queue, we always dequeue (or access) data, pointed by front pointer and while enqueing (or storing) data in the queue we take help of rear pointer. Circular Queue Circular Queue is a linear data structure in which the operations are performed based on FIFO (First In First Out) principle and the last position is connected back to the first position to make a circle. It is also called ‘Ring Buffer’. In a normal Queue, we can insert elements until queue becomes full. But once queue becomes full, we can not insert the next element even if there is a space in front of queue. Operations on Circular Queue: Front: Get the front item from queue. Rear: Get the last item from queue. enQueue(value) This function is used to insert an element into the circular queue. In a circular queue, the new element is always inserted at Rear position. Priority Queue Priority Queue is an extension of queue with following properties. 1. Every item has a priority associated with it. 2. An element with high priority is dequeued before an element with low priority. 3. If two elements have the same priority, they are served according to their order in the queue. Memory Representation of Queues Like Stacks, Queues can also be represented in memory in two ways. Using the contiguous memory like an array Using the non-contiguous memory like a linked list Using the Contiguous Memory like an Array In this representation the Queue is implemented using the array. Variables used in this case are QUEUE- the name of the array storing queue elements. FRONT- the index where the first element is stored in the array representing the queue. REAR- the index where the last element is stored in array representing the queue. MAX- defining that how many elements (maximum count) can be stored in the array representing the queue. Using the Non-Contiguous Memory like a Linked List In this representation the queue is implemented using the dynamic data structure Linked List. Using linked list for creating a queue makes it flexible in terms of size and storage. You don’t have to define the maximum number of elements in the queue. Pointers (links) to store addresses of nodes for defining a queue are. FRONT- address of the first element of the Linked list storing the Queue. REAR- address of the last element of the Linked list storing the Queue. Queue Applications Queues are used in various applications in Computer Science- Job scheduling tasks of CPU. Printer’s Buffer to store printing commands initiated by a user. Input commands sent to CPU by devices like Keyboard and Mouse. Document downloading from internet. User Requests for call center services. Order Queue for Online Food Delivery Chains. Online Cab Booking applications. Algorithm on insertion and deletion in simple queue Enqueue Operation Queues maintain two data pointers, front and rear. Therefore, its operations are comparatively difficult to implement than that of stacks. The following steps should be taken to enqueue (insert) data into a queue − Step 1 − Check if the queue is full. Step 2 − If the queue is full, produce overflow error and exit. Step 3 − If the queue is not full, increment rear pointer to point the next empty space. Step 4 − Add data element to the queue location, where the rear is pointing. Step 5 − return success. Sometimes, we also check to see if a queue is initialized or not, to handle any unforeseen situations. Algorithm for enqueue operation procedure enqueue(data) if queue is full return overflow endif rear ← rear + 1 queue[rear] ← data return true end procedure Implementation of enqueue() in C programming language − Example int enqueue(int data) if(isfull()) return 0; rear = rear + 1; queue[rear] = data; return 1; end procedure Dequeue Operation Accessing data from the queue is a process of two tasks − access the data where front is pointing and remove the data after access. The following steps are taken to perform dequeue operation − Step 1 − Check if the queue is empty. Step 2 − If the queue is empty, produce underflow error and exit. Step 3 − If the queue is not empty, access the data where front is pointing. Step 4 − Increment front pointer to point to the next available data element. Step 5 − Return success. Algorithm for dequeue operation procedure dequeue if queue is empty return underflow end if data = queue[front] front ← front + 1 return true end procedure Implementation of dequeue() in C programming language − Example int dequeue() { if(isempty()) return 0; int data = queue[front]; front = front + 1; return data; } Algorithm to insert an element in circular queue o Step 1: IF (REAR+1)%MAX = FRONT Write " OVERFLOW " Goto step 4 [End OF IF] o Step 2: IF FRONT = -1 and REAR = -1 SET FRONT = REAR = 0 ELSE IF REAR = MAX - 1 and FRONT ! = 0 SET REAR = 0 ELSE SET REAR = (REAR + 1) % MAX [END OF IF] o Step 3: SET QUEUE[REAR] = VAL o Step 4: EXIT Algorithm to delete an element from a circular queue To delete an element from the circular queue, we must check for the three following conditions. 1. If front = -1, then there are no elements in the queue and therefore this will be the case of an underflow condition. 2. If there is only one element in the queue, in this case, the condition rear = front holds and therefore, both are set to -1 and the queue is deleted completely. 3. If front = max -1 then, the value is deleted from the front end the value of front is set to 0. 4. Otherwise, the value of front is incremented by 1 and then delete the element at the front end. Algorithm o Step 1: IF FRONT = -1 Write " UNDERFLOW " Goto Step 4 [END of IF] o Step 2: SET VAL = QUEUE[FRONT] o Step 3: IF FRONT = REAR SET FRONT = REAR = -1 ELSE IF FRONT = MAX -1 SET FRONT = 0 ELSE SET FRONT = FRONT + 1 [END of IF] [END OF IF] o Step 4: EXIT UNIT IV Tree What are trees? Tree is a hierarchical data structure which stores the information naturally in the form of hierarchy style. Tree is one of the most powerful and advanced data structures. It is a non-linear data structure compared to arrays, linked lists, stack and queue. It represents the nodes connected by edges. The above figure represents structure of a tree. Tree has 2 subtrees. A is a parent of B and C. B is called a child of A and also parent of D, E, F. Tree is a collection of elements called Nodes, where each node can have arbitrary number of children. Field Description Root Root is a special node in a tree. The entire tree is referenced through it. It does not have a parent. Parent Node Parent node is an immediate predecessor of a node. Child Node All immediate successors of a node are its children. Siblings Nodes with the same parent are called Siblings. Path Path is a number of successive edges from source node to destination node. Height of Height of a node represents the number of edges on the longest path between that node Node and a leaf. Height of Height of tree represents the height of its root node. Tree Depth of Depth of a node represents the number of edges from the tree's root node to the node. Node Degree of Degree of a node represents a number of children of a node. Node Edge Edge is a connection between one node to another. It is a line between two nodes or a node and a leaf. In the above figure, D, F, H, G are leaves. B and C are siblings. Each node excluding a root is connected by a direct edge from exactly one other node parent → children. Levels of a node Levels of a node represents the number of connections between the node and the root. It represents generation of a node. If the root node is at level 0, its next node is at level 1, its grand child is at level 2 and so on. Levels of a node can be shown as follows: - If node has no children, it is called Leaves or External Nodes. - Nodes which are not leaves, are called Internal Nodes. Internal nodes have at least one child. - A tree can be empty with no nodes or a tree consists of one node called the Root. Height of a Node As we studied, height of a node is a number of edges on the longest path between that node and a leaf. Each node has height. In the above figure, A, B, C, D can have height. Leaf cannot have height as there will be no path starting from a leaf. Node A's height is the number of edges of the path to K not to D. And its height is 3. Note: - Height of a node defines the longest path from the node to a leaf. - Path can only be downward. Depth of a Node While talking about the height, it locates a node at bottom where for depth, it is located at top which is root level and therefore we call it depth of a node. In the above figure, Node G's depth is 2. In depth of a node, we just count how many edges between the targeting node & the root and ignoring the directions. Note: Depth of the root is 0. Advantages of Tree Tree reflects structural relationships in the data. It is used to represent hierarchies. It provides an efficient insertion and searching operations. Trees are flexible. It allows to move subtrees around with minimum effort. Tree Representation Left-Child Right-Sibling Representation of Tree It is a different representation of an n-ary tree where instead of holding a reference to each and every child node, a node holds just two references, first a reference to it’s first child, and the other to it’s immediate next sibling. This new transformation not only removes the need of advance knowledge of the number of children a node has, but also limits the number of references to a maximum of two, thereby making it so much easier to code. Binary trees A tree whose elements have at most 2 children is called a binary tree. Since each element in a binary tree can have only 2 children, we typically name them the left and right child. Operations on binary trees Searching: For searching element 2, we have to traverse all elements (assuming we do breadth first traversal). Therefore, searching in binary tree has worst case complexity of O(n). Insertion: For inserting element as left child of 2, we have to traverse all elements. Therefore, insertion in binary tree has worst case complexity of O(n). Deletion: For deletion of element 2, we have to traverse all elements to find 2 (assuming we do breadth first traversal). Therefore, deletion in binary tree has worst case complexity of O(n). Traversal of binary trees Tree traversal is the process of visiting each node in the tree exactly once. Visiting each node in a graph should be done in a systematic manner. If search result in a visit to all the vertices, it is called a traversal. There are basically three traversal techniques for a binary tree that are 1. Preorder traversal 2. Inorder traversal 3. Postorder traversal In-order Traversal In this traversal method, the left subtree is visited first, then the root and later the right sub-tree. We should always remember that every node may represent a subtree itself. If a binary tree is traversed in-order, the output will produce sorted key values in an ascending order. We start from A, and following in-order traversal, we move to its left subtree B. B is also traversed in-order. The process goes on until all the nodes are visited. The output of inorder traversal of this tree will be − D→B→E→A→F→C→G Algorithm Until all nodes are traversed − Step 1 − Recursively traverse left subtree. Step 2 − Visit root node. Step 3 − Recursively traverse right subtree. Pre-order Traversal In this traversal method, the root node is visited first, then the left subtree and finally the right subtree. We start from A, and following pre-order traversal, we first visit A itself and then move to its left subtree B. B is also traversed pre-order. The process goes on until all the nodes are visited. The output of pre-order traversal of this tree will be − A→B→D→E→C→F→G Algorithm Until all nodes are traversed − Step 1 − Visit root node. Step 2 − Recursively traverse left subtree. Step 3 − Recursively traverse right subtree. Post-order Traversal In this traversal method, the root node is visited last, hence the name. First we traverse the left subtree, then the right subtree and finally the root node. We start from A, and following Post-order traversal, we first visit the left subtree B. B is also traversed post-order. The process goes on until all the nodes are visited. The output of post-order traversal of this tree will be − D→E→B→F→G→C→A Algorithm Until all nodes are traversed − Step 1 − Recursively traverse left subtree. Step 2 − Recursively traverse right subtree. Step 3 − Visit root node. Linked List Representation of Binary Tree We use a double linked list to represent a binary tree. In a double linked list, every node consists of three fields. First field for storing left child address, second for storing actual data and third for storing right child address. In this linked list representation, a node has the following structure. The above example of the binary tree represented using Linked list representation is shown as follows... Binary Search Tree (BST) Binary Search Tree is a node-based binary tree data structure which has the following properties: The left subtree of a node contains only nodes with keys lesser than the node’s key. The right subtree of a node contains only nodes with keys greater than the node’s key. The left and right subtree each must also be a binary search tree. Operations on binary search tree Following are the basic operations − Search − Searches an element in a tree. Insert − Inserts an element in a tree. Pre-order Traversal − Traverses a tree in a pre-order manner. In-order Traversal − Traverses a tree in an in-order manner. Post-order Traversal − Traverses a tree in a post-order manner. Expression Tree The expression tree is a binary tree in which each internal node corresponds to the operator and each leaf node corresponds to the operand so for example expression tree for 3 + ((5+9)*2) would be: Inorder traversal of expression tree produces infix version of given postfix expression (same with preorder traversal it gives prefix expression) Construction of Expression Tree: Now For constructing an expression tree we use a stack. We loop through input expression and do the following for every character. 1. If a character is an operand push that into the stack 2. If a character is an operator pop two values from the stack make them its child and push the current node again. In the end, the only element of the stack will be the root of an expression tree. Forest A collection of disjoint trees is called a forest. Creating forest from a tree You can create a forest by cutting the root of a tree. Applications of trees Binary Search Trees(BSTs) are used to quickly check whether an element is present in a set or not. Heap is a kind of tree that is used for heap sort. A modified version of a tree called Tries is used in modern routers to store routing information. Most popular databases use B-Trees and T-Trees, which are variants of the tree structure we learned above to store their data Compilers use a syntax tree to validate the syntax of every program you write. UNIT V Graph A Graph is a non-linear data structure consisting of nodes and edges. The nodes are sometimes also referred to as vertices and the edges are lines or arcs that connect any two nodes in the graph. More formally a Graph can be defined as, A Graph consists of a finite set of vertices(or nodes) and set of Edges which connect a pair of nodes. In the above Graph, the set of vertices V = {0,1,2,3,4} and the set of edges E = {01, 12, 23, 34, 04, 14, 13}. Graphs are used to solve many real-life problems. Graphs are used to represent networks. The networks may include paths in a city or telephone network or circuit network. Graphs are also used in social networks like Instagram, Facebook. For example, in Facebook, each person is represented with a vertex(or node). Each node is a structure and contains information like person id, name, gender, locale etc. All of facebook is then a collection of these nodes and edges. This is because facebook uses a graph data structure to store its data. More precisely, a graph is a data structure (V, E) that consists of A collection of vertices V A collection of edges E, represented as ordered pairs of vertices (u,v) In the graph, V = {0, 1, 2, 3} E = {(0,1), (0,2), (0,3), (1,2)} G = {V, E} Graph Terminology Adjacency: A vertex is said to be adjacent to another vertex if there is an edge connecting them. Vertices 2 and 3 are not adjacent because there is no edge between them. Path: A sequence of edges that allows you to go from vertex A to vertex B is called a path. 0-1, 1-2 and 0-2 are paths from vertex 0 to vertex 2. Directed Graph: A graph in which an edge (u,v) doesn't necessarily mean that there is an edge (v, u) as well. The edges in such a graph are represented by arrows to show the direction of the edge. Graph Representation Graphs are commonly represented in two ways: 1. Adjacency Matrix An adjacency matrix is a 2D array of V x V vertices. Each row and column represent a vertex. If the value of any element a[i][j] is 1, it represents that there is an edge connecting vertex i and vertex j. The adjacency matrix for the graph we created above is Since it is an undirected graph, for edge (0,2), we also need to mark edge (2,0); making the adjacency matrix symmetric about the diagonal. Edge lookup(checking if an edge exists between vertex A and vertex B) is extremely fast in adjacency matrix representation but we have to reserve space for every possible link between all vertices(V x V), so it requires more space. 2. Adjacency List An adjacency list represents a graph as an array of linked lists. The index of the array represents a vertex and each element in its linked list represents the other vertices that form an edge with the vertex. The adjacency list for the graph we made in the first example is as follows: An adjacency list is efficient in terms of storage because we only need to store the values for the edges. For a graph with millions of vertices, this can mean a lot of saved space. Graph Operations The most common graph operations are: Check if the element is present in the graph Graph Traversal Add elements(vertex, edges) to graph Finding the path from one vertex to another Directed Graphs vs. Undirected Graphs Directed Graphs A directed graph is a set of vertices (nodes) connected by edges, with each node having a direction associated with it. Edges are usually represented by arrows pointing in the direction the graph can be traversed. In the example on the right, the graph can be traversed from vertex A to B, but not from vertex B to A. Undirected Graphs In an undirected graph the edges are bidirectional, with no direction associated with them. Hence, the graph can be traversed in either direction. The absence of an arrow tells us that the graph is undirected. In the example on the left, the graph can be traversed from node A to B as well as from node B to A. Some more complex directed and undirected graphs might look like the following: Weighted graph: Weighted graph = a graph whose edges have weights Example: The weight of an edge can represent: Cost or distance = the amount of effort needed to travel from one place to another Capacity = the maximim amount of flow that can be transported from one place to another Representing weighted graphs using an adjacency list Representing a weighted graph using an adjacency list:: Each node in the adjacency graph will contain: A neighbor node ID (this field was already discussed previously) A cost field (this field is new) Example: Graph: Representation: Explanation: Row 0 contains the linked list with the following 3 elements: (NodeId = 1, link cost = 3): this represent the link (0,1) in the figure above. (NodeId = 3, link cost = 2): this represent the link (0,3) in the figure above. (NodeId = 8, link cost = 4): this represent the link (0,8) in the figure above. And so on.... Graph Traversal Graph traversal is a technique used for a searching vertex in a graph. The graph traversal is also used to decide the order of vertices is visited in the search process. A graph traversal finds the edges to be used in the search process without creating loops. That means using graph traversal we visit all the vertices of the graph without getting into looping path. There are two graph traversal techniques and they are as follows... 1. DFS (Depth First Search) 2. BFS (Breadth First Search) DFS (Depth First Search) DFS traversal of a graph produces a spanning tree as final result. Spanning Tree is a graph without loops. We use Stack data structure with maximum size of total number of vertices in the graph to implement DFS traversal. We use the following steps to implement DFS traversal... Step 1 - Define a Stack of size total number of vertices in the graph. Step 2 - Select any vertex as starting point for traversal. Visit that vertex and push it on to the Stack. Step 3 - Visit any one of the non-visited adjacent vertices of a vertex which is at the top of stack and push it on to the stack. Step 4 - Repeat step 3 until there is no new vertex to be visited from the vertex which is at the top of the stack. Step 5 - When there is no new vertex to visit then use back tracking and pop one vertex from the stack. Step 6 - Repeat steps 3, 4 and 5 until stack becomes Empty. Step 7 - When stack becomes Empty, then produce final spanning tree by removing unused edges from the graph Back tracking is coming back to the vertex from which we reached the current vertex. BFS (Breadth First Search) BFS traversal of a graph produces a spanning tree as final result. Spanning Tree is a graph without loops. We use Queue data structure with maximum size of total number of vertices in the graph to implement BFS traversal. We use the following steps to implement BFS traversal... Step 1 - Define a Queue of size total number of vertices in the graph. Step 2 - Select any vertex as starting point for traversal. Visit that vertex and insert it into the Queue. Step 3 - Visit all the non-visited adjacent vertices of the vertex which is at front of the Queue and insert them into the Queue. Step 4 - When there is no new vertex to be visited from the vertex which is at front of the Queue then delete that vertex. Step 5 - Repeat steps 3 and 4 until queue becomes empty. Step 6 - When queue becomes empty, then produce final spanning tree by removing unused edges from the graph Applications of Graph Data Structure In Computer science graphs are used to represent the flow of computation. Google maps uses graphs for building transportation systems, where intersection of two(or more) roads are considered to be a vertex and the road connecting two vertices is considered to be an edge, thus their navigation system is based on the algorithm to calculate the shortest path between two vertices. In Facebook, users are considered to be the vertices and if they are friends then there is an edge running between them. Facebook’s Friend suggestion algorithm uses graph theory. Facebook is an example of undirected graph. In World Wide Web, web pages are considered to be the vertices. There is an edge from a page u to other page v if there is a link of page v on page u. This is an example of Directed graph. It was the basic idea behind Google Page Ranking Algorithm. In Operating System, we come across the Resource Allocation Graph where each process and resources are considered to be vertices. Edges are drawn from resources to the allocated process, or from requesting process to the requested resource. If this leads to any formation of a cycle then a deadlock will occur. Hashing Hashing is a technique that is used to uniquely identify a specific object from a group of similar objects. Some examples of how hashing is used in our lives include: In universities, each student is assigned a unique roll number that can be used to retrieve information about them. In libraries, each book is assigned a unique number that can be used to determine information about the book, such as its exact position in the library or the users it has been issued to etc. In both these examples the students and books were hashed to a unique number. Assume that you have an object and you want to assign a key to it to make searching easy. To store the key/value pair, you can use a simple array like a data structure where keys (integers) can be used directly as an index to store values. However, in cases where the keys are large and cannot be used directly as an index, you should use hashing. In hashing, large keys are converted into small keys by using hash functions. The values are then stored in a data structure called hash table. The idea of hashing is to distribute entries (key/value pairs) uniformly across an array. Each element is assigned a key (converted key). By using that key you can access the element in O(1) time. Using the key, the algorithm (hash function) computes an index that suggests where an entry can be found or inserted. Hash Table Hash Table is a data structure which stores data in an associative manner. In a hash table, data is stored in an array format, where each data value has its own unique index value. Access of data becomes very fast if we know the index of the desired data. Thus, it becomes a data structure in which insertion and search operations are very fast irrespective of the size of the data. Hash Table uses an array as a storage medium and uses hash technique to generate an index where an element is to be inserted or is to be located from. Hashing Hashing is a technique to convert a range of key values into a range of indexes of an array. We're going to use modulo operator to get a range of key values. Consider an example of hash table of size 20, and the following items are to be stored. Item are in the (key,value) format. (1,20) (2,70) (42,80) (4,25) (12,44) (14,32) (17,11) (13,78) (37,98) Sr.No. Key Hash Array Index 1 1 1 % 20 = 1 1 2 2 2 % 20 = 2 2 3 42 42 % 20 = 2 2 4 4 4 % 20 = 4 4 5 12 12 % 20 = 12 12 6 14 14 % 20 = 14 14 7 17 17 % 20 = 17 17 8 13 13 % 20 = 13 13 9 37 37 % 20 = 17 17 Infix, Postfix and Prefix Infix, Postfix and Prefix notations are three different but equivalent ways of writing expressions. It is easiest to demonstrate the differences by looking at examples of operators that take two operands. Infix notation: X + Y Operators are written in-between their operands. This is the usual way we write expressions. An expression such as A * ( B + C ) / D is usually taken to mean something like: "First add B and C together, then multiply the result by A, then divide by D to give the final answer." Infix notation needs extra information to make the order of evaluation of the operators clear: rules built into the language about operator precedence and associativity, and brackets ( ) to allow users to override these rules. For example, the usual rules for associativity say that we perform operations from left to right, so the multiplication by A is assumed to come before the division by D. Similarly, the usual rules for precedence say that we perform multiplication and division before we perform addition and subtraction. (see CS2121 lecture). Postfix notation (also known as "Reverse Polish notation"): X Y + Operators are written after their operands. The infix expression given above is equivalent to A B C + * D / The order of evaluation of operators is always left-to-right, and brackets cannot be used to change this order. Because the "+" is to the left of the "*" in the example above, the addition must be performed before the multiplication. Operators act on values immediately to the left of them. For example, the "+" above uses the "B" and "C". We can add (totally unnecessary) brackets to make this explicit: ( (A (B C +) *) D /) Thus, the "*" uses the two values immediately preceding: "A", and the result of the addition. Similarly, the "/" uses the result of the multiplication and the "D". Prefix notation (also known as "Polish notation"): + X Y Operators are written before their operands. The expressions given above are equivalent to / * A + B C D As for Postfix, operators are evaluated left-to-right and brackets are superfluous. Operators act on the two nearest values on the right. I have again added (totally unnecessary) brackets to make this clear: (/ (* A (+ B C) ) D) Although Prefix "operators are evaluated left-to-right", they use values to their right, and if these values themselves involve computations then this changes the order that the operators have to be evaluated in. In the example above, although the division is the first operator on the left, it acts on the result of the multiplication, and so the multiplication has to happen before the division (and similarly the addition has to happen before the multiplication). Because Postfix operators use values to their left, any values involving computations will already have been calculated as we go left-to-right, and so the order of evaluation of the operators is not disrupted in the same way as in Prefix expressions. In all three versions, the operands occur in the same order, and just the operators have to be moved to keep the meaning correct. (This is particularly important for asymmetric operators like subtraction and division: A - B does not mean the same as B - A; the former is equivalent to A B - or - A B, the latter to B A - or - B A). Examples: Infix Postfix Prefix Notes multiply A and B, A * B + C / D A B * C D / + + * A B / C D divide C by D, add the results add B and C, A * (B + C) / D A B C + * D / / * A + B C D multiply by A, divide by D divide C by D, A * (B + C / D) A B C D / + * * A + B / C D add B, multiply by A Converting between these notations The most straightforward method is to start by inserting all the implicit brackets that show the order of evaluation e.g.: Infix Postfix Prefix ( (A * B) + (C / D) ) ( (A B *) (C D /) +) (+ (* A B) (/ C D) ) ((A * (B + C) ) / D) ( (A (B C +) *) D /) (/ (* A (+ B C) ) D) (A * (B + (C / D) ) ) (A (B (C D /) +) *) (* A (+ B (/ C D) ) ) You can convert directly between these bracketed forms simply by moving the operator within the brackets e.g. (X + Y) or (X Y +) or (+ X Y). Repeat this for all the operators in an expression, and finally remove any superfluous brackets. You can use a similar trick to convert to and from parse trees - each bracketed triplet of an operator and its two operands (or sub-expressions) corresponds to a node of the tree. The corresponding parse trees are: / * + / \ / \ / \ * D A + / \ / \ / \ * / A + B / / \ / \ / \ / \ A B C D B C C D ((A*B)+(C/D)) ((A*(B+C))/D) (A*(B+(C/D))) Linear search (Sequential Search) Linear search is a very simple search algorithm. In this type of search, a sequential search is made over all items one by one. Every item is checked and if a match is found then that particular item is returned, otherwise the search continues till the end of the data collection. Binary Search Binary search is a fast search algorithm with run-time complexity of Ο(log n). This search algorithm works on the principle of divide and conquer. For this algorithm to work properly, the data collection should be in the sorted form. Binary search looks for a particular item by comparing the middle most item of the collection. If a match occurs, then the index of item is returned. If the middle item is greater than the item, then the item is searched in the sub-array to the left of the middle item. Otherwise, the item is searched for in the sub- array to the right of the middle item. This process continues on the sub-array as well until the size of the subarray reduces to zero. How Binary Search Works? For a binary search to work, it is mandatory for the target array to be sorted. We shall learn the process of binary search with a pictorial example. The following is our sorted array and let us assume that we need to search the location of value 31 using binary search. First, we shall determine half of the array by using this formula − mid = low + (high - low) / 2 Here it is, 0 + (9 - 0 ) / 2 = 4 (integer value of 4.5). So, 4 is the mid of the array. Now we compare the value stored at location 4, with the value being searched, i.e. 31. We find that the value at location 4 is 27, which is not a match. As the value is greater than 27 and we have a sorted array, so we also know that the target value must be in the upper portion of the array. We change our low to mid + 1 and find the new mid value again. low = mid + 1 mid = low + (high - low) / 2 Our new mid is 7 now. We compare the value stored at location 7 with our target value 31. The value stored at location 7 is not a match, rather it is more than what we are looking for. So, the value must be in the lower part from this location. Hence, we calculate the mid again. This time it is 5. We compare the value stored at location 5 with our target value. We find that it is a match. We conclude that the target value 31 is stored at location 5. Binary search halves the searchable items and thus reduces the count of comparisons to be made to very less numbers. Pseudocode The pseudocode of binary search algorithms should look like this − Procedure binary_search A ← sorted array n ← size of array x ← value to be searched Set lowerBound = 1 Set upperBound = n while x not found if upperBound < lowerBound EXIT: x does not exists. set midPoint = lowerBound + ( upperBound - lowerBound ) / 2 if A[midPoint] < x set lowerBound = midPoint + 1 if A[midPoint] > x set upperBound = midPoint - 1 if A[midPoint] = x EXIT: x found at location midPoint end while end procedure Sorting Sorting refers to arranging data in a particular format. Sorting algorithm specifies the way to arrange data in a particular order. Most common orders are in numerical or lexicographical order. The importance of sorting lies in the fact that data searching can be optimized to a very high level, if data is stored in a sorted manner. Sorting is also used to represent data in more readable formats. Following are some of the examples of sorting in real-life scenarios − Telephone Directory − The telephone directory stores the telephone numbers of people sorted by their names, so that the names can be searched easily. Dictionary − The dictionary stores words in an alphabetical order so that searching of any word becomes easy. Bubble Sort Algorithm Bubble sort is a simple sorting algorithm. This sorting algorithm is comparison-based algorithm in which each pair of adjacent elements is compared and the elements are swapped if they are not in order. This algorithm is not suitable for large data sets as its average and worst case complexity are of Ο(n2) where n is the number of items. How Bubble Sort Works? We take an unsorted array for our example. Bubble sort takes Ο(n2) time so we're keeping it short and precise. Bubble sort starts with very first two elements, comparing them to check which one is greater. In this case, value 33 is greater than 14, so it is already in sorted locations. Next, we compare 33 with 27. We find that 27 is smaller than 33 and these two values must be swapped. The new array should look like this − Next we compare 33 and 35. We find that both are in already sorted positions. Then we move to the next two values, 35 and 10. We know then that 10 is smaller 35. Hence they are not sorted. We swap these values. We find that we have reached the end of the array. After one iteration, the array should look like this − To be precise, we are now showing how an array should look like after each iteration. After the second iteration, it should look like this − Notice that after each iteration, at least one value moves at the end. And when there's no swap required, bubble sorts learns that an array is completely sorted. Now we should look into some practical aspects of bubble sort. Algorithm We assume list is an array of n elements. We further assume that swap function swaps the values of the given array elements. begin BubbleSort(list) for all elements of list if list[i] > list[i+1] swap(list[i], list[i+1]) end if end for return list end BubbleSort Insertion Sort This is an in-place comparison-based sorting algorithm. Here, a sub-list is maintained which is always sorted. For example, the lower part of an array is maintained to be sorted. An element which is to be 'insert'ed in this sorted sub- list, has to find its appropriate place and then it has to be inserted there. Hence the name, insertion sort. The array is searched sequentially and unsorted items are moved and inserted into the sorted sub-list (in the same array). This algorithm is not suitable for large data sets as its average and worst case complexity are of Ο(n2), where n is the number of items. How Insertion Sort Works? We take an unsorted array for our example. Insertion sort compares the first two elements. It finds that both 14 and 33 are already in ascending order. For now, 14 is in sorted sub-list. Insertion sort moves ahead and compares 33 with 27. And finds that 33 is not in the correct position. It swaps 33 with 27. It also checks with all the elements of sorted sub-list. Here we see that the sorted sub-list has only one element 14, and 27 is greater than 14. Hence, the sorted sub-list remains sorted after swapping. By now we have 14 and 27 in the sorted sub-list. Next, it compares 33 with 10. These values are not in a sorted order. So we swap them. However, swapping makes 27 and 10 unsorted. Hence, we swap them too. Again we find 14 and 10 in an unsorted order. We swap them again. By the end of third iteration, we have a sorted sub-list of 4 items. This process goes on until all the unsorted values are covered in a sorted sub- list. Now we shall see some programming aspects of insertion sort. Algorithm Now we have a bigger picture of how this sorting technique works, so we can derive simple steps by which we can achieve insertion sort. Step 1 − If it is the first element, it is already sorted. return 1; Step 2 − Pick next element Step 3 − Compare with all elements in the sorted sub-list Step 4 − Shift all the elements in the sorted sub-list that is greater than the value to be sorted Step 5 − Insert the value Step 6 − Repeat until list is sorted Selection Sort Selection sort is a simple sorting algorithm. This sorting algorithm is an in- place comparison-based algorithm in which the list is divided into two parts, the sorted part at the left end and the unsorted part at the right end. Initially, the sorted part is empty and the unsorted part is the entire list. The smallest element is selected from the unsorted array and swapped with the leftmost element, and that element becomes a part of the sorted array. This process continues moving unsorted array boundary by one element to the right. This algorithm is not suitable for large data sets as its average and worst case complexities are of Ο(n2), where n is the number of items. How Selection Sort Works? Consider the following depicted array as an example. For the first position in the sorted list, the whole list is scanned sequentially. The first position where 14 is stored presently, we search the whole list and find that 10 is the lowest value. So we replace 14 with 10. After one iteration 10, which happens to be the minimum value in the list, appears in the first position of the sorted list. For the second position, where 33 is residing, we start scanning the rest of the list in a linear manner. We find that 14 is the second lowest value in the list and it should appear at the second place. We swap these values. After two iterations, two least values are positioned at the beginning in a sorted manner. The same process is applied to the rest of the items in the array. Following is a pictorial depiction of the entire sorting process − Now, let us learn some programming aspects of selection sort. Algorithm Step 1 − Set MIN to location 0 Step 2 − Search the minimum element in the list Step 3 − Swap with value at location MIN Step 4 − Increment MIN to point to next element Step 5 − Repeat until list is sorted