DSA Notes III Sem -1 PDF
Summary
This document provides notes on data structures, focusing on stacks and arrays. It explains linear and non-linear data structures, including the characteristics of stacks (LIFO) and arrays. The notes also cover operations like push, pop, and top, in addition to time complexity.
UNIT 1 “STACK”

Data Structure: A data structure is a way of storing and organizing data on a computer so that it can be accessed and updated efficiently.

Linear data structure: A data structure in which data elements are arranged sequentially, with each element attached to its previous and next adjacent elements. Examples of linear data structures are array, stack, queue, linked list, etc.
○ Static data structure: A static data structure has a fixed memory size, and it is easier to access its elements. An example of this data structure is an array.
○ Dynamic data structure: In a dynamic data structure the size is not fixed; it can be updated during runtime, which may be more efficient with respect to the memory (space) complexity of the code. Examples of this data structure are queue, stack, etc.

Non-linear data structure: A data structure whose elements are not placed sequentially. In a non-linear data structure we cannot traverse all the elements in a single pass. Examples of non-linear data structures are trees and graphs.

Introduction to Stack: A stack is a linear data structure in which the insertion of a new element and the removal of an existing element take place at the same end, called the top of the stack. To implement a stack we maintain a pointer to the top of the stack, which is the last element inserted, because only the element at the top of the stack is accessible.

LIFO (Last In, First Out): This strategy states that the element inserted last comes out first. A pile of plates stacked on top of each other is a real-life example: the plate placed last is on top, and since we always remove the plate at the top, the plate that was put in last comes out first.
There are some key differences between arrays and stacks:
Stacks are a limited-access data structure. An array can be iterated from the first index to the last, or from the last back to the first, so arrays do not enforce the LIFO principle. Stacks deliberately restrict these behaviours, and that restriction is what makes them useful: we know we can only read or remove the last item placed on the stack.
Even if a stack is internally represented by an array (it can also be represented by a linked list), it is a class distinct from arrays, with unique behaviours of its own. Stacks give us an opportunity to extend or limit the behaviours typically associated with arrays through the methods we define. Creating a stack therefore allows for abstraction and encapsulation of behaviour, both of which are core principles of Object-Oriented Programming.

A Note on Time Complexity and Efficiency: Each basic operation on a stack is O(1). Stacks are quite efficient at adding and removing data; however, if a problem requires access to data throughout the entire stack, they are not the best choice of data structure. Printing each element of a stack is O(n), where n is the total number of elements in the stack: time complexity grows linearly with the number of elements in the structure. If we pop elements off a stack, we may want to recreate that same stack afterwards. One option is to push each popped element onto a temporary/holding stack, print each element as it is popped off the temporary stack, and push it back onto the original. Storing data in a temporary stack requires more resources; this operation does roughly twice the work of simply printing the elements in reverse order, O(2*n), which is still O(n).

Types of Stacks:
Fixed Size Stack: As the name suggests, a fixed size stack has a fixed size and cannot grow or shrink dynamically.
If the stack is full and an attempt is made to add an element, an overflow error occurs; if the stack is empty and an attempt is made to remove an element, an underflow error occurs.
Dynamic Size Stack: A dynamic size stack can grow or shrink dynamically. When the stack is full, it increases its size to accommodate the new element, and when elements are removed it can decrease its size again. This type of stack is implemented using a linked list, as that allows for easy resizing.

Basic Operations on Stack: In order to manipulate a stack, the following operations are provided:
push() inserts an element into the stack.
pop() removes an element from the stack.
top() returns the top element of the stack.
isEmpty() returns true if the stack is empty, else false.
size() returns the size of the stack.

Algorithm for Push Operation: Before pushing the element, check whether the stack is full. If the stack is full (top == capacity-1), the stack overflows and the element cannot be inserted. Otherwise, increment top (top = top + 1) and store the new value at the top position. Elements can be pushed until the capacity of the stack is reached.
Algorithm for Pop Operation: Before popping, check whether the stack is empty. If the stack is empty (top == -1), the stack underflows and no element can be removed. Otherwise, save the value at top, decrement top (top = top - 1), and return the saved value.
Algorithm for Top Operation: Before returning the top element, check whether the stack is empty. If the stack is empty (top == -1), simply print “Stack is empty”. Otherwise, return the element stored at index top.
Algorithm for isEmpty Operation: Check the value of top. If (top == -1), the stack is empty, so return true. Otherwise, the stack is not empty, so return false.
Algorithm for isFull Operation: Check the value of top. If (top == capacity-1), the stack is full, so return true. Otherwise, the stack is not full, so return false.

Push: Adds an item to the stack. If the stack is full, it is said to be an Overflow condition.
Algorithm for push:
  begin
    if stack is full
      return
    else
      increment top
      stack[top] assign value
    end else
  end procedure

Pop: Removes an item from the stack. The items are popped in the reverse of the order in which they were pushed. If the stack is empty, it is said to be an Underflow condition.
Algorithm for pop:
  begin
    if stack is empty
      return
    else
      store value of stack[top]
      decrement top
      return value
    end else
  end procedure

Top: Returns the top element of the stack.
Algorithm for Top:
  begin
    return stack[top]
  end procedure

isEmpty: Returns true if the stack is empty, else false.
Algorithm for isEmpty:
  begin
    if top == -1
      return true
    else
      return false
  end procedure

“Array”
An array is a collection of items of the same data type stored at contiguous memory locations. The position of each element is calculated by simply adding an offset to a base value, i.e. the memory location of the first element of the array (generally denoted by the name of the array). The base value corresponds to index 0, and the difference between two indexes is the offset. In the C language an array has a fixed size: once the size is given, it cannot be changed, i.e. you can neither shrink it nor expand it. Expansion is not possible because we cannot be sure (it is not possible every time) that the next memory locations are free for us to use. Shrinking is not possible because the array, when declared, gets its memory statically allocated, and only the compiler can destroy it.

Types of Arrays: There are basically two types of arrays:
Static Array: In this type of array, memory is allocated at compile time and the size is fixed.
We cannot alter or update the size of this array.
Dynamic Array: In this type of array, memory is allocated at run time and the size is not fixed. If a user wants to declare an array of some arbitrary size, a static array cannot be used; instead a dynamic array is used, whose size is specified during the run time of the program.

Example of Static Array:
  int a[] = {1, 2, 3, 4, 5};
This declares a static integer array named a and initializes it with the values 1, 2, 3, 4 and 5. The size of this array is determined automatically from the number of values in the initialization list. The array a is allocated on the stack, and its size cannot be changed once defined.

Example of Dynamic Array (C++):
  int *a = new int[5];
This declares a pointer to int named a and dynamically allocates memory for an integer array of size 5. The new operator performs the dynamic memory allocation (new is a C++ operator; in C, malloc() is used instead), and [5] specifies the size of the dynamic array. new returns the address of the dynamically allocated memory, which is stored in the pointer a. This array is allocated on the heap, and the memory can be released and a differently sized block allocated later if needed.

Static Array vs Dynamic Array:
1. Static: memory allocation occurs during compile time. Dynamic: memory allocation occurs during run time.
2. Static: the array size is fixed and cannot be changed. Dynamic: the array size is not fixed and can be changed.
3. Static: the location is in stack memory space. Dynamic: the location is in heap memory space.
4. Static: the array elements are set to 0 or to empty strings. Dynamic: the array elements can be destroyed with an erase (delete) statement, and the memory is then released.
5. Static: this array can be initialized but not erased. Dynamic: this array cannot be read or written after destruction.

Implement Stack using Array: To implement a stack using an array, initialize an array and treat its end as the stack’s top.
Implement push (add to end), pop (remove from end), and peek (check end) operations, handling the cases of an empty or full stack.
Step-by-step approach:
1. Initialize an array to represent the stack.
2. Use the end of the array to represent the top of the stack.
3. Implement push (add to end), pop (remove from end), and peek (check end) operations, ensuring that empty and full stack conditions are handled.
Advantages of Array Implementation: Easy to implement. Memory is saved, as pointers are not involved.
Disadvantages of Array Implementation: It is not dynamic, i.e. it doesn’t grow and shrink depending on needs at runtime. [But with dynamically sized arrays, such as vector in C++, list in Python or ArrayList in Java, stacks can grow and shrink with an array implementation as well.] The total size of the stack must be defined beforehand.

“Multiple stack implementation using a single array”
Method 1 (Divide the array into slots of size n/k): A simple way to implement k stacks is to divide the array into k slots of size n/k each and fix the slots for the different stacks, i.e. use arr[0] to arr[n/k-1] for the first stack and arr[n/k] to arr[2n/k-1] for the second stack, where arr[] is the array used to implement the stacks and n is its size. The problem with this method is inefficient use of array space: a push operation may cause stack overflow even though space is still available in arr[]. For example, say k is 2 and the array size n is 6, and we push 3 elements onto the first stack and nothing onto the second. When we push a 4th element onto the first stack, there is an overflow even though we have space for 3 more elements in the array.
Method 2 (A space-efficient implementation): The idea is to use two extra arrays for an efficient implementation of k stacks in a single array. This may not make much sense for integer stacks, but stack items can be large, for example stacks of employees, students, etc., where every item is hundreds of bytes.
For such large stacks the extra space used is comparatively very small, since the extra space consists of just two integer arrays. The two extra arrays are:
1) top[]: of size k; stores the indexes of the top elements of all k stacks.
2) next[]: of size n; stores the index of the next item for each slot of arr[]. Here arr[] is the actual array that stores the k stacks.
Together with the k stacks, a stack of free slots in arr[] is also maintained; the top of this free stack is stored in a variable ‘free’. All entries in top[] are initialized to -1 to indicate that all stacks are empty. All entries next[i] are initialized to i+1, because initially all slots are free and each points to the next slot. The top of the free stack, ‘free’, is initialized to 0.
In a Push operation, the next[] array maintains the chain of elements within each stack and keeps track of the next available free slot.
In a Pop operation, the next[] array helps update the top of the stack and adds the freed slot back to the list of available free slots.

Setup: Assume n = 10 (total size of arr[]) and k = 3 (number of stacks). Initial state:
○ arr[] = [ , , , , , , , , , ] (empty)
○ top[] = [-1, -1, -1] (all stacks are empty)
○ next[] = [1, 2, 3, 4, 5, 6, 7, 8, 9, -1] (free-list links; initially all slots are free)
○ free = 0 (first free slot is at index 0)

Step 1: Push 10 into Stack 0
1. Find free slot: free = 0, so arr[0] will store the new element.
2. Store element: arr[0] = 10.
3. Update next[]: next[0] was 1 (pointing to the next free slot), so next[0] is first used to update free. Then next[0] is set to top[0], which was -1 (indicating stack 0 was empty): next[0] = -1.
4. Update top[]: top[0] = 0 (index of the new top element of stack 0).
5. Update free: free = 1 (the next free slot is now at index 1).
State after Step 1:
arr[] = [10, , , , , , , , , ]
top[] = [0, -1, -1]
next[] = [-1, 2, 3, 4, 5, 6, 7, 8, 9, -1]
free = 1

Step 2: Push 20 into Stack 1
1. Find free slot: free = 1, so arr[1] will store the new element.
2. Store element: arr[1] = 20.
3. Update next[]: next[1] was 2 (pointing to the next free slot), so next[1] is first used to update free. Then next[1] is set to top[1], which was -1 (indicating stack 1 was empty): next[1] = -1.
4. Update top[]: top[1] = 1 (index of the new top element of stack 1).
5. Update free: free = 2 (the next free slot is now at index 2).
State after Step 2:
arr[] = [10, 20, , , , , , , , ]
top[] = [0, 1, -1]
next[] = [-1, -1, 3, 4, 5, 6, 7, 8, 9, -1]
free = 2

Step 3: Push 30 into Stack 0
1. Find free slot: free = 2, so arr[2] will store the new element.
2. Store element: arr[2] = 30.
3. Update next[]: next[2] was 3, so next[2] is first used to update free. Then next[2] is set to top[0], which was 0 (the previous top of stack 0): next[2] = 0.
4. Update top[]: top[0] = 2 (index of the new top element of stack 0).
5. Update free: free = 3 (the next free slot is now at index 3).
State after Step 3:
arr[] = [10, 20, 30, , , , , , , ]
top[] = [2, 1, -1]
next[] = [-1, -1, 0, 4, 5, 6, 7, 8, 9, -1]
free = 3

Step 4: Pop from Stack 0
1. Get the top index: top[0] = 2 (index of the current top element of stack 0).
2. Retrieve element: the element at arr[2] = 30 is popped.
3. Update top[]: top[0] is updated to next[2], which is 0 (the previous top element of stack 0): top[0] = 0.
4. Update next[]: the freed slot 2 is added back to the free list: next[2] = free (current free is 3), so next[2] = 3.
5. Update free: free = 2 (index 2 is now the first free slot).
State after Step 4:
arr[] = [10, 20, 30, , , , , , , ]
top[] = [0, 1, -1]
next[] = [-1, -1, 3, 4, 5, 6, 7, 8, 9, -1]
free = 2

Step 5: Pop from Stack 1
1. Get the top index: top[1] = 1 (index of the current top element of stack 1).
2. Retrieve element: the element at arr[1] = 20 is popped.
3. Update top[]: top[1] is updated to next[1], which is -1 (indicating stack 1 is now empty): top[1] = -1.
4. Update next[]: the freed slot 1 is added back to the free list:
next[1] = free (current free is 2), so next[1] = 2.
5. Update free: free = 1 (index 1 is now the first free slot).
Final state after Step 5:
arr[] = [10, 20, 30, , , , , , , ]
top[] = [0, -1, -1]
next[] = [-1, 2, 3, 4, 5, 6, 7, 8, 9, -1]
free = 1

Algorithm:
1. Initialize an array arr[] of size n, where n is the total size of the array that will hold the k stacks.
2. Initialize an array next[] of size n. Set next[i] = i+1 for all 0 ≤ i < n-1, and next[n-1] = -1. This array keeps track of the next element in each stack and chains the free slots; initialize the head of the free list as free = 0.
3. Initialize an array top[] of size k to store the index of the top element of each stack. Set top[i] = -1 for all 0 ≤ i < k.
4. To push an element onto the i-th stack: check whether the array is full by checking whether free is -1; if it is, report that the stack is full. Otherwise, take the slot j = free, set free = next[j], set next[j] = top[i], set top[i] = j, and store the new element in arr[j].
5. To pop an element from the i-th stack: check whether the stack is empty by checking whether top[i] is -1; if it is, report that the stack is empty. Otherwise, take j = top[i], set top[i] = next[j], return slot j to the free list by setting next[j] = free and free = j, and return the element arr[j].
The time complexity of the push(), pop() and top() operations is O(1) each. The best part of this implementation is that as long as any slot is free, an item can be pushed onto any of the stacks, i.e. there is no wastage of space. Initializing the arrays takes O(n) time, and the auxiliary space used by top[] and next[] is O(n + k).
Application of Stack Data Structure:
Function calls and recursion: When a function is called, the current state of the program is pushed onto the stack. When the function returns, that state is popped from the stack to resume the previous function’s execution.
Undo/Redo operations: The undo-redo feature in various applications uses stacks to keep track of previous actions. Each time an action is performed, it is pushed onto the stack. To undo the action, the top element of the stack is popped and the reverse operation is performed.
Expression evaluation: Stacks are used to evaluate expressions in infix, postfix and prefix notations. Operators and operands are pushed onto the stack, and operations are performed based on the stack’s top elements.
Browser history: Web browsers use stacks to keep track of the web pages you visit. Each time you visit a new page, its URL is pushed onto the stack, and when you hit the back button, the previous URL is popped from the stack.
Balanced parentheses: A stack is used to check whether parentheses are balanced. An opening parenthesis is pushed onto the stack, and a closing parenthesis pops one off. If the stack is empty at the end of the expression, the parentheses are balanced.
Backtracking algorithms: Backtracking uses stacks to keep track of the states of the problem-solving process. The current state is pushed onto the stack, and when the algorithm backtracks, the previous state is popped.

1. Reversing a List: A linked list is a data structure that consists of a sequence of nodes, where each node contains data and a reference (or link) to the next node in the sequence. The last node in the list points to NULL, indicating the end of the list. Reversing a linked list means changing the order of the nodes so that the last node becomes the first node, the second-to-last node becomes the second node, and so on, resulting in a completely reversed sequence.
One approach to reversing a linked list is to use a stack. A stack is a Last-In-First-Out (LIFO) data structure, meaning the last element added is the first one removed.
If the linked list is empty or has only one node, reversing it makes no change, so in that case we do not reverse it. If the linked list contains a cycle, it cannot be reversed this way: the method needs a clear starting point and end point, but a cyclic list has no end.
Algorithm to Reverse a Linked List using a Stack:
Check whether the linked list is empty or has a single node. If so, there is no need to reverse it; simply return from the function. Otherwise, follow the steps below.
Declare an empty stack.
Iterate through the linked list and push the value of each node onto the stack. After this iteration, the values of the linked list are stored in reverse order in the stack.
Now pop elements from the stack while iterating through the linked list again, replacing each node’s value with the value popped from the stack.
Once the stack is empty, the linked list holds the required reversed sequence.
Time complexity: O(N).

2. Factorial Calculation: The factorial of a non-negative integer n is the product of all positive integers less than or equal to n. For example, the factorial of 6 is 6*5*4*3*2*1, which is 720.
Approach:
First, create a stack using an array.
Take user input for the number whose factorial is to be calculated (say, num).
Make a variable top that marks the top of the stack; its initial value is -1.
While pushing values onto the stack, run a loop that ends when the number becomes 0; in the loop:
○ Increase top by 1.
○ Store num at the top of the stack.
○ Decrement num by 1.
A similar procedure is followed to calculate the factorial. Here we run a loop that ends when top becomes -1 (the initial value of top was also -1); in the loop:
○ Multiply the variable factorial by the current top of the stack (factorial is a variable with initial value 1).
○ Decrement top by 1.
The end result of this loop is the required answer, i.e. the factorial of the entered number.
(Note that the C program below computes the factorial by recursion, which uses the call stack implicitly, rather than the explicit array stack described above.)

// C program to find the factorial of a given number
#include <stdio.h>

// function to find the factorial of a given number
unsigned int factorial(unsigned int n)
{
    if (n == 0)
        return 1;
    return n * factorial(n - 1);
}

int main()
{
    int num = 5;
    printf("Factorial of %d is %d", num, factorial(num));
    return 0;
}

Output: Factorial of 5 is 120
Time Complexity: O(n)

3. Infix to Postfix Transformation
Infix expression: An expression of the form “a operator b” (a + b), i.e. an operator is placed between every pair of operands.
Postfix expression: An expression of the form “a b operator” (ab+), i.e. every pair of operands is followed by its operator.
The compiler scans the expression either from left to right or from right to left. Consider the expression a + b * c + d. The compiler first scans the expression to evaluate b * c, then scans it again to add a to the result; the result is then added to d after another scan. This repeated scanning is very inefficient. Infix expressions are easily readable and solvable by humans, but the computer cannot differentiate the operators and parentheses easily, so it is better to convert the expression to postfix (or prefix) form before evaluation. The corresponding expression in postfix form is abc*+d+. Postfix expressions can be evaluated easily using a stack.
Below are the steps to implement the above idea:
1. Scan the infix expression from left to right.
2. If the scanned character is an operand, put it in the postfix expression.
3.
Otherwise (the scanned character is an operator), do the following:
If the precedence of the scanned operator is greater than the precedence of the operator on top of the stack (or the stack is empty, or the stack top is a ‘(’), push it onto the stack. [‘^’ is right-associative; operators like ‘+’, ‘-’, ‘*’ and ‘/’ are left-associative.]
Check especially for the condition where the operator at the top of the stack and the scanned operator are both ‘^’. In this condition the scanned operator is treated as having higher precedence due to its right-associativity, so it is pushed onto the operator stack. In all other cases, when the top of the operator stack has the same precedence as the scanned operator, pop the operator from the stack, because left-associativity gives the scanned operator effectively lower precedence.
Otherwise, pop from the stack all operators with precedence greater than or equal to that of the scanned operator, and then push the scanned operator. (If you encounter a ‘(’ while popping, stop there and push the scanned operator onto the stack.)
4. If the scanned character is a ‘(’, push it onto the stack.
5. If the scanned character is a ‘)’, pop from the stack and output until a ‘(’ is encountered, and discard both parentheses.
6. Repeat steps 2-5 until the infix expression is fully scanned.
7. Once the scanning is over, pop the stack and append the operators to the postfix expression until the stack is empty.
8. Finally, print the postfix expression.
Questions:
1. a+b*c+(d*e) = abc*+de*+
2. a/b^c-d = abc^/d-
3. a*(b+c)/d = abc+*d/
4. a+(b*c*(d/e^f)*g)*h = abc*def^/*g*h*+
5. a+b*(c^d-e)^(f+g*h)-i = abcd^e-fgh*+^*+i-
6. a+b*(c^d-e)^(f+g*h)-i = abcd^e-fgh*+^*+i-
7.
Infix Expression: ( ( AX + ( B * CY ) ) / ( D - E ) )
Postfix Expression: AX B CY * + D E - /
Prefix Expression: / + AX * B CY - D E
8. Infix Expression: ( ( A + B ) * ( C + E ) )
Postfix Expression: A B + C E + *
Prefix Expression: * + A B + C E
9. Infix Expression: ( AX * ( BX * ( ( ( CY + AY ) + BY ) * CX ) ) )
Postfix Expression: AX BX CY AY + BY + CX * * *
Prefix Expression: * AX * BX * + + CY AY BY CX
10. Infix Expression: ( ( H * ( ( ( ( A + ( ( B + C ) * D ) ) * F ) * G ) * E ) ) + J )
Postfix Expression: H A B C + D * + F * G * E * * J +
Prefix Expression: + * H * * * + A * + B C D F G E J

Tower of Hanoi
The Tower of Hanoi is a mathematical puzzle which consists of three towers (pegs) and more than one ring. These rings are of different sizes and are stacked in ascending order, i.e. the smaller one sits over the larger one. There are other variations of the puzzle where the number of disks increases, but the tower count remains the same. The Tower of Hanoi is a classic problem in computer science and mathematics that involves moving a set of disks from one peg to another, following a specific set of rules. The puzzle with n disks can be solved in a minimum of 2^n - 1 steps; a puzzle with 3 disks therefore takes 2^3 - 1 = 7 steps.
Problem Description: You have three pegs:
Source Peg (A)
Destination Peg (B)
Auxiliary Peg (C)
You also have a number of disks of different sizes (e.g. n disks), which are initially stacked in ascending order of size on the source peg, with the smallest disk on top.
Rules:
1. You can move only one disk at a time.
2. A move consists of taking the top disk from one of the stacks and placing it on top of another stack or on an empty peg.
3. No disk may be placed on top of a smaller disk.
Goal: Move all the disks from the source peg to the destination peg, using the auxiliary peg for temporary storage.
Algorithm: The algorithm for solving the Tower of Hanoi problem is recursive and can be described as follows:
1.
Base Case:
○ If there is only one disk (i.e. n = 1), simply move it from the source peg to the destination peg.
2. Recursive Case:
○ Move n-1 disks from the source peg to the auxiliary peg, using the destination peg as a temporary holding area.
○ Move the nth disk (the largest disk) from the source peg to the destination peg.
○ Move the n-1 disks from the auxiliary peg to the destination peg, using the source peg as a temporary holding area.
Pseudocode:
function TowerOfHanoi(n, source, destination, auxiliary):
    if n == 1:
        print "Move disk 1 from", source, "to", destination
        return
    TowerOfHanoi(n - 1, source, auxiliary, destination)
    print "Move disk", n, "from", source, "to", destination
    TowerOfHanoi(n - 1, auxiliary, destination, source)
Explanation:
Base Case: If there is only one disk, just move it directly from the source peg to the destination peg.
Recursive Case: To solve for n disks, first move the top n-1 disks to the auxiliary peg, then move the nth disk to the destination peg, and finally move the n-1 disks from the auxiliary peg to the destination peg.
Understanding the Recursive Algorithm: The function TowerOfHanoi(n, source, destination, auxiliary) is a recursive solution to the Tower of Hanoi problem. Its purpose is to move n disks from the source peg to the destination peg, using the auxiliary peg as a temporary holding area. Here is a step-by-step explanation:
1. Base Case (n = 1): This is the simplest case, where only one disk needs to be moved directly from the source peg to the destination peg. No further recursive steps are required; just move the disk and print the move operation.
2. Recursive Case (n > 1):
Step 1: Move n-1 disks from the source peg to the auxiliary peg. To do this, you need to use the destination peg as a temporary holding area.
This involves the recursive call TowerOfHanoi(n - 1, source, auxiliary, destination). In this call, a smaller instance of the problem (with n-1 disks) is solved, moving these disks from the source to the auxiliary peg.
Step 2: Move the nth disk from the source peg to the destination peg. This is the largest disk, and it is moved directly to its final position; you simply print the move operation: print "Move disk", n, "from", source, "to", destination.
Step 3: Move the n-1 disks from the auxiliary peg to the destination peg, the disks that were temporarily stored on the auxiliary peg. To do this, you use the source peg as a temporary holding area. This involves another recursive call: TowerOfHanoi(n - 1, auxiliary, destination, source).
Summary:
TowerOfHanoi(n - 1, source, auxiliary, destination): moves the top n-1 disks from the source peg to the auxiliary peg, using the destination peg as temporary storage.
print "Move disk", n, "from", source, "to", destination: moves the nth (largest) disk directly from the source peg to the destination peg.
TowerOfHanoi(n - 1, auxiliary, destination, source): moves the n-1 disks from the auxiliary peg to the destination peg, using the source peg as temporary storage.
Implementation of Tower of Hanoi in C:

#include <stdio.h>

void toh(int, char, char, char);

int main() {
    char source = 'A', destination = 'B', auxiliary = 'C';
    int n;
    printf("Enter value of n\t");
    scanf("%d", &n);
    toh(n, source, destination, auxiliary);
    return 0;
}

void toh(int n, char source, char dest, char aux) {
    if (n == 1) {
        printf("%c -> %c \n", source, dest);
        return;
    }
    toh(n - 1, source, aux, dest);
    printf("%c -> %c \n", source, dest);
    toh(n - 1, aux, dest, source);
}

Tower of Hanoi for 3 disks (S = source, D = destination, A = auxiliary):
Move disk 1 from rod S to rod D
Move disk 2 from rod S to rod A
Move disk 1 from rod D to rod A
Move disk 3 from rod S to rod D
Move disk 1 from rod A to rod S
Move disk 2 from rod A to rod D
Move disk 1 from rod S to rod D

UNIT II TREES
A tree data structure is a hierarchical structure that is used to represent and organize data in a way that is easy to navigate and search. It is a collection of nodes connected by edges, with a hierarchical relationship between the nodes. The topmost node of the tree is called the root, and the nodes below it are called child nodes. Each node can have multiple child nodes, and these child nodes can have their own child nodes, forming a recursive structure.
Technical Definition: A tree is a connected acyclic undirected graph. There is a unique path between every pair of vertices. A tree with N vertices contains N-1 edges. When the tree is rooted, the root is the vertex with no parent (in-degree 0 when edges are directed away from the root), a vertex with no children is a leaf node, and an internal node has degree at least 2.
Parent Node: The node which is the predecessor of a node is called the parent node of that node. {B} is the parent node of {D, E}.
Child Node: The node which is the immediate successor of a node is called the child node of that node. {D, E} are the child nodes of {B}.
Root Node: The topmost node of a tree, i.e. the node which does not have any parent node, is called the root node. {A} is the root node of the tree.
A non-empty tree must contain exactly one root node and exactly one path from the root to all other nodes of the tree. Leaf Node or External Node: The nodes which do not have any child nodes are called leaf nodes. {K, L, M, N, O, P} are the leaf nodes of the tree. Ancestor of a Node: Any predecessor node on the path from the root to that node is called an ancestor of that node. {A, B} are the ancestor nodes of the node {E}. Descendant: Any successor node on a path from that node down to a leaf node. {E, I} are the descendants of the node {B}. Sibling: Children of the same parent node are called siblings. {D, E} are called siblings. Level of a node: The count of edges on the path from the root node to that node. The root node has level 0. Internal node: A node with at least one child is called an internal node. Neighbor of a Node: The parent or child nodes of that node are called neighbors of that node. Subtree: Any node of the tree along with its descendants. Types of Tree data structures: Binary tree: In a binary tree, each node can have a maximum of two children linked to it. Some common types of binary trees include full binary trees, complete binary trees, balanced binary trees, and degenerate or pathological binary trees. Ternary Tree: A Ternary Tree is a tree data structure in which each node has at most three child nodes, usually distinguished as “left”, “mid” and “right”. N-ary Tree or Generic Tree: Generic trees are a collection of nodes where each node is a data structure that consists of records and a list of references to its children (duplicate references are not allowed). Unlike the linked list, each node stores the addresses of multiple nodes. Properties of Tree Data Structure: Number of edges: An edge can be defined as the connection between two nodes. If a tree has N nodes then it will have (N-1) edges. There is only one path from each node to any other node of the tree. 
Depth of a node: The depth of a node is defined as the length of the path from the root to that node. Each edge adds 1 unit of length to the path. So, it can also be defined as the number of edges in the path from the root of the tree to the node. Height of a node: The height of a node can be defined as the length of the longest path from the node to a leaf node of the tree. Height of the Tree: The height of a tree is the length of the longest path from the root of the tree to a leaf node of the tree. Degree of a Node: The total count of subtrees attached to that node is called the degree of the node. The degree of a leaf node must be 0. The degree of a tree is the maximum degree of a node among all the nodes in the tree. Application of Tree Data Structure: File System: This allows for efficient navigation and organization of files. Data Compression: Huffman coding is a popular technique for data compression that involves constructing a binary tree where the leaves represent characters and their frequency of occurrence. The resulting tree is used to encode the data in a way that minimizes the amount of storage required. Compiler Design: In compiler design, a syntax tree is used to represent the structure of a program. Database Indexing: B-trees and other tree structures are used in database indexing to efficiently search for and retrieve data. Advantages of Tree Data Structure: Trees offer efficient searching, depending on the type of tree, with average search times of O(log n) for balanced trees like AVL trees. Trees provide a hierarchical representation of data, making it easy to organize and navigate large amounts of information. The recursive nature of trees makes them easy to traverse and manipulate using recursive algorithms. Disadvantages of Tree Data Structure: Trees can become unbalanced, meaning that the height of the tree is skewed towards one side, which can lead to inefficient search times. 
Trees demand more memory space than some other data structures like arrays and linked lists, especially if the tree is very large. The implementation and manipulation of trees can be complex and require a good understanding of the algorithms. Binary Tree Binary Tree is defined as a tree data structure where each node has at most 2 children. All nodes have either zero, one, or at most two children nodes. These two children are generally referred to as left and right children respectively. Each node in a binary tree has these three elements – 1. Data item that is the data stored in the node 2. Address of the left child 3. Address of the right child Depth of a Binary Tree – The number of edges from a node in the tree to the root node. Height of a Binary Tree – The number of edges from the deepest node in the tree to the root node. Properties of Binary Tree in Data Structure 1. If there are n nodes in a perfect binary tree (we would read more about them), its height is of the order of log₂ n (more precisely, log₂(n+1) − 1). This is because, for a given node in a binary tree, there can be at most 2 child nodes. Consequently, the number of nodes at each level is at most double the number of nodes at the previous level; equivalently, each level holds roughly half the nodes of the next level. 2. The minimum number of nodes possible at the height h of a binary tree is given by h+1 (one node per level). 3. If a binary tree has L leaf nodes, it has at least ⌈log₂ L⌉ + 1 levels. 4. At each level i of a binary tree, the maximum number of nodes is given by 2^i. Types of Binary Tree in Data Structure Binary Trees are of many types, and each of these trees has its own properties and characteristics. Let’s discuss some of them in detail – 1. 
Full Binary Tree A full binary tree, also known as a proper binary tree, is a tree in which every node has either zero or two children nodes. In other words, if in a binary tree a node contains only one child node, it is not a full binary tree. The following diagram shows a full binary tree – Notice how each node has either zero or two children nodes. In a full binary tree, if there are n number of total nodes – The number of internal nodes is given by (n-1)/2 The number of leaf nodes is given by (n+1)/2 2. Complete Binary Tree A complete binary tree is a binary tree in which all the elements are arranged without missing any sequence. In a complete binary tree – All the levels are completely filled except the last level, which may or may not be completely filled. Elements are filled from left to right. The following trees are complete binary trees since they have no empty spaces in them. 3. Perfect Binary Trees If in a tree all the internal nodes have exactly two children nodes, it is known as a perfect binary tree. In a perfect binary tree, all the leaf nodes are on the same level. The following diagram represents a perfect binary tree – For a perfect binary tree with h levels, the total number of nodes is given by 2^h − 1. 4. Degenerate Binary Trees If in a binary tree each node contains only one child node, either on the left side or the right side, it is known as a degenerate binary tree. Degenerate binary trees behave like linked lists in terms of performance. The following tree shows a degenerate binary tree – Degenerate binary trees can also be classified into two types – a. Left-skewed – A degenerate binary tree in which all the nodes lean towards the left side of the tree. The following diagram shows a left-skewed degenerate binary tree – b. Right-skewed – A degenerate binary tree in which all the nodes lean towards the right side of the tree. 
The following diagram shows a right-skewed degenerate binary tree – 5. Balanced Binary Trees A binary tree is said to be balanced if the heights of the left and the right subtrees differ by 0 or 1. In a balanced binary tree, both the left and right subtrees are also balanced. The following diagram shows a balanced binary tree – Notice that in the first image, the right subtree is at a height of 3, and the left subtree is at 2. The difference between the heights of the left and the right subtree is therefore 1. Therefore, the given tree is a balanced binary tree. In the second image, the right subtree is at a height of 3, whereas the left subtree is at 1. The difference between the heights of the left and the right subtrees is 2. It violates the condition of a balanced binary tree. Hence, the given tree is not a balanced binary tree. Benefits of Binary Tree in Data Structure 1. Binary trees are used in binary search trees. This helps in searching for elements in a faster and more efficient way. 2. Binary trees are also used in heaps, which are special kinds of binary trees. Heaps are used in heap sort, which is an efficient sorting algorithm. Heaps are also used in building priority queues, in which elements are arranged in the order of their priorities. 3. Binary trees are used to build expression trees, from which prefix and postfix forms of an expression can be generated. 4. Binary trees are also used in graph algorithms such as Dijkstra’s algorithm, via heaps and priority queues. 5. Some real-life applications of binary trees include virtual memory management and 3D graphics, where faster rendering of objects is required. Binary tree Time Complexity Searching: Worst case complexity of O(n). Insertion: Worst case complexity of O(n). Deletion: Worst case complexity of O(n). Binary tree Space Complexity Searching: O(n). Insertion: O(n). Deletion: O(n). Representation of Binary trees using arrays A binary tree of depth n can be represented using an array of size 2^(n+1) − 1. 
If the parent element is at index p, then the left child will be stored at index 2p + 1, and the right child will be stored at index 2p + 2. Parent of a node at index i: The parent of a node is located at index (i - 1) / 2. The array representation of the above binary tree is : As in the above binary tree, A is the root node, so it will be stored at the 0th index. The left child of A will be stored at index 2(0) + 1 = 1. So, B is stored at index 1. Similarly, the right child of A will be stored at index 2(0) + 2 = 2. For every node, the left and right child will be stored accordingly. Algorithm for Representing a Binary Tree Using Arrays 1. Declare an array: ○ Create an array large enough to hold all the nodes of the tree. 2. Insert a node: ○ Insert the node at the first available position in the array, i.e., starting with the root at index 0. ○ Calculate the positions of the left and right children using 2*p + 1 and 2*p + 2. 3. Accessing nodes: ○ To access a node's value, use the index of the array. ○ To access the left or right child of a node at index p, use indices 2*p + 1 and 2*p + 2, respectively. Representation of Binary trees using Linked List For the linked list representation, each node has two pointers, so that we can point to the left and right children of a binary tree node (structurally similar to a doubly linked list node). NULL is given to the pointer as the address when no child is connected. The linked list representation of the above binary tree is : Algorithm for Representing a Binary Tree Using Linked List Creating a New Node: Allocate memory for the node. Assign the data to the node. Set the left and right child pointers to NULL. Inserting a Node: If the tree is empty, the new node becomes the root. Otherwise, traverse the tree to find the correct position for the new node and insert it there. Operations on a Binary Tree Insertion : Nodes can be inserted into binary trees in between two other nodes or added after a leaf node. 
In binary trees, a node that is inserted is specified as to which child it is. External nodes : Say that the external node being added onto is node A. To add a new node after node A, A assigns the new node as one of its children and the new node assigns node A as its parent. Internal nodes : Insertion on internal nodes is slightly more complex than on external nodes. Say that the internal node is node A and that node B is the child of A. (If the insertion is to insert a right child, then B is the right child of A, and similarly with a left child insertion.) A assigns its child to the new node and the new node assigns its parent to A. Then the new node assigns its child to B and B assigns its parent as the new node. Deletion : Deletion is the process whereby a node is removed from the tree. Only certain nodes in a binary tree can be removed unambiguously. Node with zero or one children : Say that the node to delete is node A. If A has no children (external node), deletion is accomplished by setting the child pointer of A's parent to null. If A has one child, set the parent of A's child to A's parent and set the child of A's parent to A's child. Node with two children : In a binary tree, a node with two children cannot be deleted unambiguously. However, in certain binary trees (including binary search trees) these nodes can be deleted, though with a rearrangement of the tree structure. Traversal : Pre-order, in-order, and post-order traversal visit each node in a tree by recursively visiting each node in the left and right subtrees of the root. Binary Tree Traversals Traversal of a tree means visiting and outputting the value of each node in a particular order. Trees are organized through relationships or hierarchies. This allows us to traverse them in multiple ways. Traversals are classified by the order in which the nodes are visited. 
Traversal of binary trees (also known as tree search) refers to the process of visiting (checking or updating) each node in a tree data structure, exactly once. Types of Binary Tree Traversal Tree Traversal Algorithms can be classified broadly into the following two categories by the order in which the nodes are visited: 1. Depth First Search (DFS) Algorithm: Depth-First Search is an important algorithm for traversal of binary trees. It starts at the root node and first visits all nodes of one branch, going as deep as possible, before backtracking and visiting the other branches in a similar fashion. Key Characteristics: Depth Exploration: DFS explores as far as possible along each branch before backtracking, meaning it dives deep into the tree before exploring other branches. Uses a Stack: DFS uses a stack data structure (either explicitly or via recursion) to keep track of the nodes to be explored. Nodes are processed in a last-in, first-out (LIFO) manner. Path Exploration: DFS is useful for exploring all possible paths from a node, detecting cycles, and for problems where you need to explore all potential solutions (like backtracking problems). To traverse binary trees with depth-first search, the following operations are performed at each node: If the current node is empty, then simply return. Execute the following three operations in a particular order: ○ N: Visit the current node. ○ L: Recursively traverse the current node's left subtree. ○ R: Recursively traverse the current node's right subtree. There are 3 tree traversals that are mostly based on DFS. They are given below: Preorder Traversal Inorder Traversal Postorder Traversal 1. Inorder Tree Traversal The inorder tree traversal of a binary tree is also known as the LNR traversal, because in this traversal, we first visit the left node (abbreviation L), followed by the root node (abbreviation N), and finally the right node (abbreviation R) of the tree. 
Inorder traversal of binary trees is one of the most widely used variants of the DFS (Depth First Search) traversal. In the inorder traversal, we first start from the root node of the tree and go deeper and deeper into the left subtree in a recursive manner. When we reach the left-most node of the tree with the above steps, we visit that current node and then go to the left-most node of its right subtree (if it exists). Following the same steps recursively helps us in traversing through the tree as a whole. The algorithm for the inorder traversal can be stated as: Recursively traverse the current node's left subtree. Visit the current node. Recursively traverse the current node's right subtree. Simply put: Go to the left subtree Visit the Current Node Go to the right subtree Time Complexity : If a tree has n nodes, then each node is visited only once in inorder traversal and hence the complexity of the inorder traversal of the binary tree is O(n). Space Complexity : The space complexity of the recursive inorder traversal is O(h) for a tree of height h, due to the recursion stack. The height of the tree is the length of the path from the root to the deepest node. Applications of Inorder Traversal: In-order traversal is very commonly used on binary search trees because it returns the values of the tree in sorted order; more specifically, in ascending (non-decreasing) order. 2. Preorder Tree Traversal Just like the previously discussed inorder traversal, preorder traversal of a binary tree is also a variant of DFS. It is performed in a similar way, but the order of visiting the nodes is different from that of the inorder traversal. In the case of the preorder traversal of a binary tree, we visit the current node first. After that, we visit the left subtree. Once we reach a leaf node (or have covered all the nodes of the left subtree), we move towards the right subtree. 
In the right subtree, we recursively call our function to do the traversal in a similar manner. It follows the NLR structure. It means first visit the current node, followed by recursively visiting the left subtree and then the right subtree. Below given is an image for the same. The algorithm for the preorder traversal of binary tree can be stated as: Visit the current node Recursively traverse the current node's left subtree. Recursively traverse the current node's right subtree. In the image, the dotted lines indicate the order in which we will visit each node. The blue lines indicate that we are going back (backtracking) after visiting a particular node. From the above image it must be very clear that only after processing a particular node, the left subtree is processed, followed by the right subtree. These operations are defined recursively for each of the nodes. And hence, this recursive implementation is referred to as a Depth-first search (DFS), as the search tree is deepened as much as possible on each child before going to the next sibling. Time complexity Analysis: If a tree has n nodes, then each node is visited only once in the preorder traversal of the binary tree, hence the complexity of the preorder traversal of the binary tree is O(n). Space Complexity Analysis: The space complexity of the recursive preorder traversal of a binary tree is O(h) for a tree of height h. Applications of Preorder Traversal: Pre-order traversal of binary trees can be used to make a prefix expression (Polish notation) from the expression trees. Preorder traversal of binary trees is also used to create a copy of the tree. 3. Postorder Traversal In the Postorder traversal of binary trees, we visit the left subtree and the right subtree before visiting the current node in recursion. Postorder follows the L->R->N order of traversing the binary tree. 
Postorder traversal of a binary tree visits the nodes in the following order: Recursively traverse the current node's left subtree. Recursively traverse the current node's right subtree. Visit the current node. In the above image, we can see that, before processing any node, the left subtree is processed first, followed by the right subtree, and the node is processed last. These operations can be defined recursively for each node. Time Complexity Analysis: If a tree has n nodes, then each node is visited only once in the postorder traversal of the binary tree, hence the complexity of the postorder traversal of the binary tree is O(n). Space Complexity Analysis: The space complexity of the recursive postorder traversal of a binary tree is O(h) for a tree of height h. Applications of Postorder Traversal: Post-order traversal of a binary tree can generate a postfix representation (Reverse Polish notation) of a binary tree. Postorder traversal of binary trees is also used to delete the tree, because in post-order traversal each node is freed only after its children have been freed. 2. Breadth First Search (BFS) Algorithm: Breadth First Search (BFS) is also an important algorithm for traversal of binary trees. It starts from the root node and visits all nodes of the current depth before moving to the next depth in the tree. In simple words, the BFS algorithm traverses the tree level by level. BFS generally uses the queue data structure. Key Characteristics: Level Order Exploration: BFS explores the tree level by level, visiting all neighbors of a node before moving on to the next level. Uses a Queue: BFS uses a queue data structure to keep track of the nodes to be explored next. Nodes are added to the queue in the order they are discovered and are processed in a first-in, first-out (FIFO) manner. 
Shortest Path: BFS is often used to find the shortest path in an unweighted graph since it explores all possible paths of equal length before moving on to longer paths. Binary Search Tree Binary Search Tree is a node-based binary tree data structure which has the following properties: The left subtree of a node contains only nodes with keys lesser than the node’s key. The right subtree of a node contains only nodes with keys greater than the node’s key. The left and right subtree each must also be a binary search tree. This means everything to the left of the root is less than the value of the root and everything to the right of the root is greater than the value of the root. Because of this ordering property, searching in a binary search tree is very efficient. Exercise – Create a Binary Search Tree for the following keys : 1. 7, 5, 1, 8, 3, 6, 0, 9, 4, 2 2. 30, 20, 10, 15, 25, 23, 39, 35, 42 3. 30, 40, 24, 58, 48, 26, 11, 13 4. 47, 12, 75, 88, 90, 73, 57, 1, 85, 50, 62 Basic Operations on Binary Search Trees Following are the basic operations of a tree − Search Operation Whenever an element is to be searched, start searching from the root node. Then if the data is less than the key value, search for the element in the left subtree. Otherwise, search for the element in the right subtree. Follow the same algorithm for each node. Algorithm 1. START 2. Check whether the tree is empty or not 3. If the tree is empty, search is not possible 4. Otherwise, first search the root of the tree. 5. If the key does not match with the value in the root, search its subtrees. 6. If the value of the key is less than the root value, search the left subtree 7. If the value of the key is greater than the root value, search the right subtree. 8. If the key is not found in the tree, return unsuccessful search. 9. END Insert Operation Whenever an element is to be inserted, first locate its proper location. 
Start searching from the root node, then if the data is less than the key value, search for the empty location in the left subtree and insert the data. Otherwise, search for the empty location in the right subtree and insert the data. Algorithm 1 – START 2 – If the tree is empty, insert the first element as the root node of the tree. The following elements are added as leaf nodes. 3 – If an element is less than the root value, it is added into the left subtree as a leaf node. 4 – If an element is greater than the root value, it is added into the right subtree as a leaf node. 5 – The final leaf nodes of the tree point to NULL values as their child nodes. 6 – END Inorder Traversal The inorder traversal operation in a Binary Search Tree visits all its nodes in the following order − Firstly, we traverse the left child of the root node/current node, if any. Next, we visit the current node. Lastly, we traverse the right child of the current node, if any. Algorithm 1. START 2. Traverse the left subtree, recursively 3. Then, traverse the root node 4. Traverse the right subtree, recursively. 5. END Time Complexity: O(N), where N is the number of nodes of the BST Auxiliary Space: O(H) for the recursion stack, where H is the height of the BST Preorder Traversal The preorder traversal operation in a Binary Search Tree also visits all its nodes. However, the root node is printed first, followed by its left subtree and then its right subtree. Algorithm 1. START 2. Traverse the root node first. 3. Then traverse the left subtree, recursively 4. Later, traverse the right subtree, recursively. 5. END Time Complexity: O(N), where N is the number of nodes of the BST Auxiliary Space: O(H) for the recursion stack, where H is the height of the BST Postorder Traversal Like the other traversals, postorder traversal also visits all the nodes in a Binary Search Tree and displays them. However, the left subtree is printed first, followed by the right subtree and lastly, the root node. Algorithm 1. START 2. Traverse the left subtree, recursively 3. Traverse the right subtree, recursively. 4. 
Then, traverse the root node 5. END Preorder Traversal: 55 -> 20 -> 15 -> 50 -> 35 -> 90 -> 65 Inorder Traversal: 15 -> 20 -> 35 -> 50 -> 55 -> 65 -> 90 Postorder Traversal: 15 -> 35 -> 50 -> 20 -> 65 -> 90 -> 55 Time Complexity: O(N), where N is the number of nodes of the BST Auxiliary Space: O(H) for the recursion stack, where H is the height of the BST Level order traversal Level order traversal of a BST is breadth-first traversal of the tree. It visits all nodes at a particular level first before moving to the next level. Time Complexity: O(N), where N is the number of nodes of the BST Auxiliary Space: O(N) in the worst case, for the queue that holds the nodes of a level Deletion Operation: Deletion in a BST involves three cases:- First, search the key to be deleted using the searching algorithm and find the node. Then, find the number of children of the node to be deleted. Case 1 – If the node to be deleted is a leaf node: delete it directly. Case 2 – If the node to be deleted has one child: delete the node and place its child at the position of the deleted node. Case 3 – If the node to be deleted has two children: find the inorder successor or inorder predecessor of the node (the value closest to the node to be deleted). Delete the inorder successor or predecessor using the above cases, and replace the node with the inorder successor or predecessor. Applications of Binary Search tree: BSTs are used for indexing. It is also used to implement various searching algorithms. It can be used to implement various data structures. BSTs can be used in decision support systems to store and quickly retrieve data. BSTs can be used to store and quickly retrieve data in computer simulations. BSTs can be used to implement fast autocomplete systems. BSTs can be used to implement decision trees, which are used in machine learning and artificial intelligence to model decisions and predict outcomes. 
Decision trees are used in many applications, including medical diagnosis, financial analysis, and marketing research. Tree structures also appear in data compression: for example, Huffman coding uses a binary tree to give frequently occurring values shorter codes and less frequently occurring values longer codes. This technique is used in many applications, including image and audio compression, data transmission, and file compression. Real-time Application of Binary Search tree: BSTs are used for indexing in databases. They are used to implement searching algorithms. Binary trees are used to implement Huffman coding algorithms. They are also used to implement dictionaries. Used for data caching. Used in priority queues. Used in spell checkers. Advantages of Binary Search Tree: BSTs are fast in insertion and deletion when balanced, with a time complexity of O(log n). BSTs also offer fast searching, with a time complexity of O(log n) for most operations when balanced. BSTs are memory-efficient: apart from the left and right child pointers, no additional bookkeeping structures are required. We can also do range queries – find all keys between N and M (N ≤ M). Dijkstra’s Algorithm Dijkstra’s algorithm finds the shortest path from a source node to every other node in a weighted graph. The basic steps are: 1. Mark the source node with a current distance of 0 and all other nodes with infinity. 2. Set the unvisited node with the smallest current distance as the current node. 3. For each neighbour N of the current node: add the current distance of the current node to the weight of the edge connecting it to N. If this sum is smaller than the current distance of N, set it as the new current distance of N. 4. Mark the current node as visited. 5. Go to step 2 if there are any nodes that are unvisited. Dijkstra’s Algorithm will generate the shortest path from Node 0 to all other Nodes in the graph. Consider the below graph: The algorithm will generate the shortest path from node 0 to all the other nodes in the graph. For this graph, we will assume that the weight of the edges represents the distance between two nodes. As we can see, we have the shortest path from Node 0 to Node 1, from Node 0 to Node 2, from Node 0 to Node 3, from Node 0 to Node 4, and from Node 0 to Node 6. 
Initially we have a set of resources given below : The distance from the source node to itself is 0. In this example the source node is 0. The distance from the source node to all other nodes is unknown, so we mark all of them as infinity. Example: 0 -> 0, 1 -> ∞, 2 -> ∞, 3 -> ∞, 4 -> ∞, 5 -> ∞, 6 -> ∞. We'll also have an array of unvisited elements that will keep track of unvisited or unmarked nodes. The algorithm completes when all the nodes have been marked as visited and the distances between them have been added to the path. Unvisited Nodes: 0 1 2 3 4 5 6. Step 1: Start from Node 0 and mark it as visited (in the image below, the visited node is marked red). Step 2: Check for adjacent nodes. Now we have two choices (either choose Node 1 with distance 2 or choose Node 2 with distance 6); choose the node with the minimum distance. In this step Node 1 is the minimum-distance adjacent node, so mark it as visited and add up the distance. Distance: Node 0 -> Node 1 = 2 Step 3: Then move forward and check for the adjacent node, which is Node 3, so mark it as visited and add up the distance. Now the distance will be: Distance: Node 0 -> Node 1 -> Node 3 = 2 + 5 = 7 Step 4: Again we have two choices for adjacent nodes (either Node 4 with distance 10 or Node 5 with distance 15), so choose the node with the minimum distance. In this step Node 4 is the minimum-distance adjacent node, so mark it as visited and add up the distance. 
Distance: Node 0 -> Node 1 -> Node 3 -> Node 4 = 2 + 5 + 10 = 17 Step 5: Again, move forward and check for the adjacent node, which is Node 6, so mark it as visited and add up the distance. Now the distance will be: Distance: Node 0 -> Node 1 -> Node 3 -> Node 4 -> Node 6 = 2 + 5 + 10 + 2 = 19 So, the shortest distance from the source vertex to Node 6 is 19, which is the optimal one. Complexity Analysis of Dijkstra’s Algorithm: The time complexity of Dijkstra’s algorithm depends on the data structure used for the priority queue. Here is a breakdown of the time complexity based on different implementations: Using an unsorted list as the priority queue: O(V^2), where V is the number of vertices in the graph. In each iteration, the algorithm searches for the vertex with the smallest distance among all unvisited vertices, which takes O(V) time. This operation is performed V times, resulting in a time complexity of O(V^2). Using a binary heap as the priority queue: O((V + E) log V), where E is the number of edges in the graph. In each iteration, the algorithm extracts the vertex with the smallest distance from the priority queue, which takes O(log V) time, and each of the E edge relaxations may trigger an O(log V) heap update, giving O((V + E) log V) in total. With a Fibonacci heap, this improves to O(E + V log V). Kruskal’s Minimum Spanning Tree (MST) Algorithm In Kruskal’s algorithm, sort all edges of the given graph in increasing order. Then it keeps on adding new edges and nodes in the MST if the newly added edge does not form a cycle. It picks the minimum weighted edge at first and the maximum weighted edge at last. Thus we can say that it makes a locally optimal choice in each step in order to find the optimal solution. Steps for finding MST using Kruskal's Algorithm: 1. Arrange the edges of G in order of increasing weight. 2. 
Starting only with the vertices of G and proceeding sequentially add each edge which does not result in a cycle, until (n - 1) edges are used. 3. EXIT. Kruskal’s algorithm to find the minimum cost spanning tree uses the greedy approach. The Greedy Choice is to pick the smallest weight edge that does not cause a cycle in the MST constructed so far. Let us understand it with an example:Illustration: Below is the illustration of the above approach: Input Graph: The graph contains 9 vertices and 14 edges. So, the minimum spanning tree formed will have (9 – 1) = 8 edges. After sorting: Weight Source Destination 1 7 6 2 8 2 2 6 5 4 0 1 4 2 5 6 8 6 7 2 3 7 7 8 8 0 7 8 1 2 9 3 4 10 5 4 11 1 7 14 3 5 Now pick all edges one by one from the sorted list of edges Step 1: Pick edge 7-6. No cycle is formed, including it. Add edge 7-6 in the MST Step 2: Pick edge 8-2. No cycle is formed, including it. Add edge 8-2 in the MST Step 3: Pick edge 6-5. No cycle is formed, including it. Add edge 6-5 in the MST Step 4: Pick edge 0-1. No cycle is formed, including it. Add edge 0-1 in the MST Step 5: Pick edge 2-5. No cycle is formed, including it. Add edge 2-5 in the MST Step 6: Pick edge 8-6. Since including this edge results in the cycle, discard it. Pick edge 2-3: No cycle is formed, include it. Add edge 2-3 in the MST Step 7: Pick edge 7-8. Since including this edge results in the cycle, discard it. Pick edge 0-7. No cycle is formed, including it. Add edge 0-7 in MST Step 8: Pick edge 1-2. Since including this edge results in the cycle, discard it. Pick edge 3-4. No cycle is formed, including it. Add edge 3-4 in the MST Note: Since the number of edges included in the MST equals to (V – 1), so the algorithm stops here Questions: 1. Question Solution 2. 3. Hashing Hashing is the process of generating a value from a text or a list of numbers using a mathematical function known as a hash function. 
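Returning to Kruskal's algorithm described above, the procedure (sort the edges, then add each edge that does not close a cycle) can be sketched in Python. This is an illustrative sketch, not code from the notes; the union-find helper used for the cycle check is our own addition.

```python
# Kruskal's MST sketch: sort edges by weight, add those that do not form a cycle.
# Cycle detection uses a simple union-find (disjoint-set) structure.

def kruskal(num_vertices, edges):
    # edges: list of (weight, u, v) tuples
    parent = list(range(num_vertices))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    mst, total = [], 0
    for w, u, v in sorted(edges):
        ru, rv = find(u), find(v)
        if ru != rv:               # different components: no cycle, include edge
            parent[ru] = rv
            mst.append((u, v, w))
            total += w
        # else: the edge would close a cycle, so it is discarded
    return mst, total

# Edges from the worked example above, as (weight, source, destination)
edges = [(1, 7, 6), (2, 8, 2), (2, 6, 5), (4, 0, 1), (4, 2, 5), (6, 8, 6),
         (7, 2, 3), (7, 7, 8), (8, 0, 7), (8, 1, 2), (9, 3, 4), (10, 5, 4),
         (11, 1, 7), (14, 3, 5)]
mst, total = kruskal(9, edges)
print(len(mst), total)   # → 8 37
```

As the note above predicts, the loop effectively stops contributing edges once (V - 1) = 8 edges are in the MST; every remaining edge joins two vertices already in the same component.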
A hash function is a function that converts a given numeric or alphanumeric key to a small practical integer value. The mapped integer value is used as an index in the hash table. In simple terms, a hash function maps a large number or string to a small integer that can be used as an index into the hash table.
The pair is of the form (key, value), where for a given key one can find the value using a "function" that maps keys to values. The hash key for a given object is calculated using the hash function. For example, given an array A, if i is the key, then we can find the value by simply looking up A[i].
A hash function can also be used to verify a file that has been imported from another source. A hash key for an item can be used to accelerate retrieval: it increases the efficiency of lookup and optimizes the search. Searching an entire list to locate a specific number takes a significant amount of time, and this manual scanning is not only time-consuming but inefficient. With hashing, you can narrow down the search and find the number within seconds.
Hashing is a two-step process:
1. The hash function converts the item into a small integer or hash value. This integer is used as an index to store the original data.
2. The data is stored in a hash table. You can use a hash key to locate the data quickly.
Examples of Hashing in Data Structure
The following are real-life examples of hashing:
In schools, the teacher assigns a unique roll number to each student. Later, the teacher uses that roll number to retrieve information about that student.
A library has a very large number of books. The librarian assigns a unique number to each book; this unique number helps in identifying the position of the books on the bookshelf.
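The two-step hashing process described above (hash the key, then store the item at that index) can be sketched in Python. This is a minimal illustrative sketch that ignores collisions; the table size of 10 and the roll-number keys are assumptions made for the example.

```python
# Minimal sketch of the two-step hashing process:
# 1) the hash function maps the key to a small integer,
# 2) that integer indexes a fixed-size table.

TABLE_SIZE = 10

def hash_func(key):
    return key % TABLE_SIZE        # division method: h(k) = k mod m

table = [None] * TABLE_SIZE

def insert(key, value):
    table[hash_func(key)] = (key, value)   # ignores collisions for simplicity

def search(key):
    entry = table[hash_func(key)]
    if entry is not None and entry[0] == key:
        return entry[1]
    return None

# Roll-number example from the notes: key = roll number, value = student name
insert(23, "Asha")
insert(47, "Ravi")
print(search(23))   # → Asha
print(search(99))   # → None
```

The lookup touches exactly one slot instead of scanning a whole list, which is the speed-up the notes describe.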
Hash Function
The hash function in a data structure maps data of arbitrary size to fixed-size data. It returns a small integer value (also known as a hash value), hash code, or hash sum. A typical hashing scheme computes:
hash = hash_func(key)
index = hash % array_size
The hash function must satisfy the following requirements:
A good hash function is easy to compute.
A good hash function distributes keys evenly across the hash table and avoids clustering.
A good hash function minimizes collisions, which occur when two elements or items are assigned the same hash value.
One use of hash functions is data integrity: with a good hash function, even a single change in a message will produce a different hash.
The three characteristics of a cryptographic hash function are:
1. Collision free
2. Hiding property
3. Puzzle friendly
Hash Table
Hashing uses hash tables to store the key-value pairs. The hash table uses the hash function to generate an index, and hashing uses this index to perform insert, update, and search operations. A hash table can be seen as a set of buckets in which data are stored in an array format, each entry with its own index value. If the index value is known, accessing the data is quick.
Types of Hash functions
There are many hash functions that use numeric or alphanumeric keys. The most commonly used is the division method, h(k) = k mod m, where k is the key and m is the size of the hash table.
1. Division Method.
2. Mid Square Method.
3. Folding Method.
4. Multiplication Method.
Collision Resolution
The purpose of collision resolution during insertion is to locate an open location in the hash table when the record’s home position is already taken. Any collision resolution technique may be thought of as generating a sequence of hash table slots that may or may not contain the record.
The first position in the probe sequence is the key's home position. If the home position is already occupied, the collision resolution policy moves to the next location in the sequence; if that slot is also taken, another slot must be sought, and so on. The probe sequence is the series of slots produced by a probe function, which we will refer to as p. This is how insertion operates.
Collision in Hashing
The hash function is used to find the index in the array: the hash value is used to create an index for the key in the hash table. The hash function may return the same hash value for two or more keys. When two or more keys have the same hash value, a collision happens. To handle collisions, we use collision resolution techniques.
Collision Resolution Techniques
There are two types of collision resolution techniques:
Separate chaining (open hashing)
Open addressing (closed hashing)
Open Hashing: The first collision resolution technique, "open hashing", is popularly known as separate chaining. In this technique each array slot holds a linked list, known as a chain, and it is one of the techniques most used by programmers to handle collisions. A linked list data structure is used to implement separate chaining: when several elements hash to the index of a single slot, they are inserted into a singly linked list. This singly linked list is what we refer to as a chain in open hashing.
To search, we use a key K and traverse the chain linearly. If the key K equals the key stored in some entry of the singly linked list, we have found our entry. If we traverse the singly linked list and reach the end without finding it, the entry we are looking for does not exist.
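The chain-based insert and search just described can be sketched in Python. This is an illustrative sketch: a Python list stands in for each singly linked chain, and the table size of 5 is chosen so that the example keys collide.

```python
# Separate chaining: each table slot holds a chain (here a Python list
# standing in for the singly linked list) of (key, value) pairs.

TABLE_SIZE = 5
table = [[] for _ in range(TABLE_SIZE)]

def insert(key, value):
    chain = table[key % TABLE_SIZE]
    for i, (k, _) in enumerate(chain):
        if k == key:               # key already present: update in place
            chain[i] = (key, value)
            return
    chain.append((key, value))     # otherwise append to the end of the chain

def search(key):
    # traverse the chain linearly, comparing each entry's key with K
    for k, v in table[key % TABLE_SIZE]:
        if k == key:
            return v
    return None                    # reached the end: entry does not exist

insert(7, "A")
insert(12, "B")    # 12 % 5 == 7 % 5 == 2, so 12 collides with 7
print(table[2])    # → [(7, 'A'), (12, 'B')]
print(search(12))  # → B
```

Both colliding keys end up in the same slot's chain, which illustrates the worst case noted below: if every key landed in one chain, search would degrade to O(n).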
So, when two keys compete for the same position, that position is allotted to both keys or records: the position is extended with a linked list after the first key is placed. In the end, the hash table contains a chain wherever a collision has happened. That is the main reason for calling this technique the "chaining technique".
Advantages of Open Hashing:
1. The separate chaining method is simple to implement and understand.
2. The hash table never fills up, so we can always add new elements.
3. Open hashing is less sensitive to the load factor or the hash function.
4. It can be used when we do not know how frequently keys will be inserted or deleted.
Disadvantages of Open Hashing:
1. The cache performance of separate chaining is poor because the keys are stored in a singly linked list.
2. A lot of storage space is wasted, as some parts of the hash table are never used.
3. In the worst case, the search time can become O(n). This happens when a chain becomes too long.
4. Extra storage is used for storing the links of the chain.
Open addressing: To resolve collisions in the hash table, open addressing is employed as a collision-resolution technique. No key is kept anywhere other than the hash table itself; as a result, the number of keys can never exceed the size of the hash table. Open addressing is also known as closed hashing. The following techniques are used in open addressing:
Linear probing
Quadratic probing
Double hashing
UNIT IV SEARCHING AND SORTING TECHNIQUES
Searching is the process of finding the occurrence of a given element in a list. Search algorithms are designed to check for or retrieve an element from any data structure where it is stored. If the element being searched for is found in the list, the search is considered successful; otherwise, it is considered unsuccessful. There are two types of searching algorithms:
1. Sequential / Linear Search
2.
Binary Search
Linear Search and Binary Search
A linear search, sometimes referred to as a sequential search, simply scans each element one at a time. In this search, an array is traversed sequentially from the first element until the element is found or the end of the array is reached. This method is only suitable for searching a small array or an unsorted array.
Algorithm
1. Assume we need to find an element that is in an array in random order.
2. Start with the first position and compare it to the target element.
3. If the current element matches the target element, return the position of the current element.
4. If not, move on to the next one until we reach the very end of the array.
5. If still unable to find the target, return -1.
The binary search method is only suitable for searching a sorted array. In this method, the element to be searched is compared with the array's middle element, and the search is successful only if it matches the target. The binary search algorithm uses the divide-and-conquer approach: it does not scan every element in the list, it searches only half of the remaining list at each step instead of going through each element. It is therefore faster to execute than linear search.
Algorithm
1. Start with the middle element of the array as the current key.
2. If the middle element's value matches the target element, return the index of the middle element.
3. Else, if the target element is less than the middle element, continue the search in the left half.
4. If the target element is greater than the middle element, continue the search in the right half.
5. Repeat from step 1 on the chosen half until the value is found or the subarray is empty.
6. If still unable to find the target, return -1.
Difference Between Linear Search and Binary Search
Linear Search | Binary Search
Commonly known as sequential search. | Commonly known as half-interval search.
Elements are searched sequentially (one by one). | Elements are searched using the divide-and-conquer approach.
The elements in the array can be in random order. | Elements in the array need to be in sorted order.
Less complex to implement. | More complex to implement.
Linear search is a slow process. | Binary search is comparatively faster.
Single and multidimensional arrays can be used. | Only single-dimensional arrays can be used.
Not efficient for larger arrays. | Efficient for larger arrays.
The worst-case time complexity is O(n). | The worst-case time complexity is O(log n).
Linear Search and Binary Search Time Complexity Analysis
Linear Search
The Best Case: when the array's first element is the target element. In this case there is only one comparison, hence the time complexity is O(1).
The Average Case: the target element is somewhere in the middle of the array. In this case the number of comparisons is about N/2, hence the time complexity is O(N).
The Worst Case: when the array's last element is the target element, or the target is absent. In this case the whole array is traversed, hence the time complexity is O(N).
Binary Search
The Best Case: when the array's middle element is the target element. In this case there is only one comparison, hence the time complexity is O(1).
The Average Case: the target element can be anywhere in the array. In this case the time complexity is O(log N).
The Worst Case: when the array does not contain the target element, or the target is far from the middle element. In this case the time complexity is O(log N).
Sorting Techniques
Sorting refers to arranging data in a particular format. A sorting algorithm specifies the way to arrange data in a particular order. The most common orders are numerical and lexicographical. The importance of sorting lies in the fact that data searching can be optimized to a very high level if data is stored in a sorted manner.
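The two search procedures described above can be sketched in Python. This is an illustrative sketch; the sample array is our own.

```python
# Linear search: scan elements one at a time; O(n) worst case.
def linear_search(arr, target):
    for i, value in enumerate(arr):
        if value == target:
            return i
    return -1

# Binary search: repeatedly halve a SORTED array; O(log n) worst case.
def binary_search(arr, target):
    lo, hi = 0, len(arr) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        if arr[mid] == target:
            return mid
        elif target < arr[mid]:
            hi = mid - 1           # continue the search in the left half
        else:
            lo = mid + 1           # continue the search in the right half
    return -1                      # subarray is empty: target not present

data = [2, 5, 8, 12, 16, 23, 38]
print(linear_search(data, 23))   # → 5
print(binary_search(data, 23))   # → 5
print(binary_search(data, 7))    # → -1
```

Note that binary_search is only correct because `data` is sorted, which matches the precondition stated in the comparison table.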
Sorting is also used to represent data in more readable formats. Following are some examples of sorting in real-life scenarios:
Telephone Directory − The telephone directory stores the telephone numbers of people sorted by their names, so that a name can be searched easily.
Dictionary − The dictionary stores words in alphabetical order so that searching for any word becomes easy.
Categories of Sorting
The techniques of sorting can be divided into two categories:
Comparison Based Sorting
Non-comparison Based Sorting
Bubble Sort
Bubble sort is the simplest sorting algorithm. It works by repeatedly swapping adjacent elements if they are in the wrong order: traverse from the left, compare adjacent elements, and place the higher one on the right. In this way the largest element is moved to the rightmost end first. The process is then continued to find the second largest element and place it, and so on, until the data is sorted.
Consider a list with the elements 5, 3, 4, 2 in it. You are asked to arrange these elements in both ascending and descending order.
Bubble sort, also known as sinking sort, works on the idea of repeatedly comparing adjacent elements, from left to right, and swapping them if they are out of order. Two elements are said to be out of order if they do not follow the desired order. In the list with elements 5, 3, 4, 2, the pair 5 and 3 is out of order if we want to arrange the list in ascending order; the same pair is in order if we want to sort the list in descending order.
In bubble sort, the repetition continues until we get the sorted list. Bubble sort compares all the elements in a list successively and orders them based on their values. Since it compares the elements one by one, bubble sort is a slow algorithm and performs inefficiently in real-world scenarios.
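The compare-and-swap passes described above can be sketched in Python. This is an illustrative sketch of the algorithm as described, applied to the example list 5, 3, 4, 2.

```python
def bubble_sort(arr):
    n = len(arr)
    # n - 1 passes; after each pass the largest unsorted element
    # has "bubbled" to its final position on the right.
    for i in range(n - 1):
        for j in range(n - 1 - i):
            if arr[j] > arr[j + 1]:        # adjacent pair is out of order
                arr[j], arr[j + 1] = arr[j + 1], arr[j]
    return arr

print(bubble_sort([5, 3, 4, 2]))   # → [2, 3, 4, 5]
```

The inner range shrinks by one each pass because the last i elements are already in place, which mirrors the observation below that one element gets sorted per iteration.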
It is generally used for educational purposes or to check whether a given list is already sorted.
Algorithm for Bubble Sort:
begin BubbleSort(arr)
   for all array elements
      if arr[i] > arr[i+1]
         swap(arr[i], arr[i+1])
      end if
   end for
   return arr
end BubbleSort
Working of Bubble Sort
Consider that we want to sort a list in ascending order. Here are the steps the algorithm would follow:
1. Start with the first element.
2. Compare the current element with the next element.
3. If the current element is greater than the next element, swap the two elements. If not, move to the next element.
4. Repeat steps 1 – 3 until we get the sorted list.
To better understand bubble sort, recall the list that contains the elements 5, 3, 4, 2 initially. From the worked steps we can infer the following conclusions about the bubble sort algorithm:
1. In bubble sort, to sort a list of size n, we need to perform n – 1 iterations. This is because each time we iterate over the list, one element gets sorted and is excluded from the next iteration of the loop. By the (n – 1)th iteration we are left with only one number, and since a minimum of two numbers is required for a comparison, that number cannot be compared further.
2. In the first pass, the highest element is bubbled out to the right side of the list. Similarly, after each iteration, the largest among the unsorted elements is placed at its final position. This is the reason why this sorting algorithm is known as bubble sort.
Advantages of Bubble Sort:
Bubble sort is easy to understand and implement.
It does not require any additional memory space.
It is a stable sorting algorithm, meaning that elements with the same key value maintain their relative order in the sorted output.
Disadvantages of Bubble Sort:
Bubble sort has a time complexity of O(N^2), which makes it very slow for large data sets.
Bubble sort is a comparison-based sorting algorithm, which means that it requires a comparison operator to determine the relative order of elements in the input data set. This can limit the efficiency of the algorithm in certain cases.
Insertion Sort
Insertion sort is a simple sorting algorithm that works similarly to the way you sort playing cards in your hands. The array is virtually split into a sorted part and an unsorted part. Values from the unsorted part are picked and placed at the correct position in the sorted part. To sort an array of size N in ascending order, iterate over the array and compare the current element (key) to its predecessor; if the key element is smaller than its predecessor, compare it to the elements before that. Move the greater elements one position up to make space for the inserted element. The elements here shift, they don’t swap.
Working of Insertion Sort algorithm
Consider an example: arr[]: {7, 12, 9, 11, 3}
Step 1: We start with an unsorted array. [7, 12, 9, 11, 3]
Step 2: We can consider the first value as the initial sorted part of the array. If it is just one value, it must be sorted, right? [7, 12, 9, 11, 3]
Step 3: The next value, 12, should now be moved into the correct position in the sorted part of the array. But 12 is higher than 7, so it is already in the correct position. We compare the leftmost unsorted element (suppose “a”, here a = 12) with the elements in the sorted part of the list, starting from the leftmost element in it (suppose “b”, here b = 7). When the unsorted element is smaller than the sorted element (i.e. a < b), the greater sorted elements are shifted one position up and a is inserted in the gap.
3. Insertion at the end in circular linked list
To insert a new node at the end of a circular linked list, we create the new node and make it point to the head (last->next), then update the current tail’s next pointer to point to the new node. Finally, we update the tail pointer to the new node. This will ensure that the new node is now the last node in the list while maintaining the circular linkage.
4. Insertion at specific position in circular linked list
To insert a new node at a specific position in a circular linked list, we first check if the list is empty.
If it is and the position is not 1, we print an error message, because the position doesn’t exist in the list. If the position is 1, we create the new node and make it point to itself. If the list is not empty, we create the new node and traverse the list to find the correct insertion point. If the position is 1, we insert the new node at the beginning by adjusting the pointers accordingly. For other positions, we traverse the list until we reach the desired position and insert the new node by updating the pointers. If the new node is inserted at the end, we also update the last pointer to reference the new node, maintaining the circular structure of the list.
Deletion from a Circular Linked List
Deletion involves removing a node from the linked list. The main difference from an ordinary linked list is that we need to ensure the list remains circular after the deletion. We can delete a node in a circular linked list in three ways:
1. Delete the first node in circular linked list
To delete the first node of a circular linked list, we first check if the list is empty. If it is, we print a message and return NULL. If the list contains only one node (the head is the same as the last), we delete that node and set the last pointer to NULL. If there are multiple nodes, we update the last->next pointer to skip the head node, effectively removing it from the list. We then delete the head node to free the allocated memory. Finally, we return the updated last pointer, which still points to the last node in the list.
2. Delete a specific node in circular linked list
To delete a specific node from a circular linked list, we first check if the list is empty. If it is, we print a message and return nullptr. If the list contains only one node and it matches the key, we delete that node and set last to nullptr. If the node to be deleted is the first node, we update the next pointer of the last node to skip the head node, and delete the head.
For other nodes, we traverse the list using two pointers: curr (to find the node) and prev (to keep track of the previous node). If we find the node with the matching key, we update the next pointer of prev to skip the curr node, and delete curr. If the node found is the last node, we update the last pointer accordingly. If the node is not found, we do nothing and leave the last pointer as it is. Finally, we return the updated last pointer.
3. Deletion at the end of Circular linked list
To delete the last node in a circular linked list, we first check if the list is empty. If it is, we print a message and return nullptr. If the list contains only one node (where the head is the same as the last), we delete that node and set last to nullptr. For lists with multiple nodes, we need to traverse the list to find the second-to-last node. We do this by starting from the head and moving through the list until we reach the node whose next pointer points to last. Once we find the second-to-last node, we update its next pointer to point back to the head, effectively removing the last node from the list. We then delete the last node to free up memory and return the updated last pointer, which now points to the new last node.
Searching in Circular Linked list
Searching in a circular linked list is similar to searching in a regular linked list: we start at a given node and traverse the list until we either find the target value or return to the starting node. Since the list is circular, make sure to keep track of where you started to avoid an infinite loop. To search for a specific value in a circular linked list, we first check if the list is empty. If it is, we return false. If the list contains nodes, we start from the head node (which is last->next) and traverse the list. We use a pointer curr to iterate through the nodes until we reach back to the head.
During traversal, if we find a node whose data matches the given key, we return true, indicating that the value was found. After the loop, we also check the last node to ensure we don’t miss it. If the key is not found after traversing the entire list, we return false.
Advantages of Circular Linked Lists
In a circular linked list, the last node points to the first node. There are no null references, which makes traversal easier and reduces the chance of encountering null pointer exceptions.
We can traverse the list from any node and return to it without needing to restart from the head, which is useful in applications requiring circular iteration.
Circular linked lists can easily implement circular queues, where the last element connects back to the first, allowing for efficient resource management.
In a circular linked list, each node has a reference to the next node in the sequence. Although it doesn’t have a direct reference to the previous node like a doubly linked list, we can still find the previous node by traversing the list.
Disadvantages of Circular Linked Lists
Circular linked lists are more complex to implement than singly linked lists.
Traversing a circular linked list without a clear stopping condition can lead to infinite loops if not handled carefully.
Debugging can be more challenging due to the circular nature, as traditional methods of traversing linked lists may not apply.
Applications of Circular Linked Lists
It is used for time-sharing among different users, typically through a Round-Robin scheduling mechanism.
In multiplayer games, a circular linked list can