DAA Units 1, 2, 3 Notes (1) PDF - Design and Analysis of Algorithms
Summary
This document provides an introduction to the design and analysis of algorithms, with explanations, pseudocode examples, and problem-solving approaches for computer science students. It covers topics such as algorithms, performance analysis, and asymptotic notation.
Full Transcript
## UNIT-1. Introduction

**Definition**

* **Algorithm:** An algorithm is a finite set of instructions that, if followed, accomplishes a particular task.
* An algorithm must satisfy the following criteria/characteristics:
  1. **Input:** Zero or more quantities are externally supplied.
  2. **Output:** At least one quantity is produced.
  3. **Definiteness:** Each instruction is clear and unambiguous.
  4. **Finiteness:** The algorithm terminates after a finite number of steps.
  5. **Effectiveness:** Every instruction must be basic enough to be carried out (executable).

**Areas of study**

1. **How to devise an algorithm?** Devising an algorithm is an art that may never be fully automated.
2. **How to validate an algorithm?** Once an algorithm is devised, it is necessary to show that it computes the correct answer for all possible legal inputs.
3. **How to analyze an algorithm (performance analysis)?** This refers to the task of determining how much computing time and storage an algorithm requires.
4. **How to test a program?**
   * **Debugging:** The process of executing programs on sample data sets to determine whether faulty results occur and, if so, to correct them.
   * **Profiling:** The process of executing a correct program on data sets and measuring the time and space it takes to compute the results.

**Pseudocode conventions**

1. Comments begin with // and continue until the end of the line.
2. Blocks are indicated with matching braces { }.
3. An identifier begins with a letter. Data types are not explicitly declared.
4. Assignment of values to variables is written as x = 1.
5. There are two Boolean values, true and false.
6. Elements of multi-dimensional arrays are accessed as A[i][j].
7. The following loop statements are available:
   * **for:** for variable = value1 to valuen do { }
   * **while:** while <condition> do { }
   * **repeat–until:** repeat <stmt 1> ... <stmt n> until <condition>
8. Conditional statements have the following forms:
   * **if:** if <condition> then <statement>
   * **if–else:** if <condition> then <statement 1> else <statement 2>
   * **case:** case { <condition 1> : <statement 1> ... <condition n> : <statement n> }
9. Input and output are done using the instructions read and write.
10. There is only one type of procedure: Algorithm. It has the form Algorithm Name(parameter list) { body }.

## Sum of elements in an array

```
A:   [1]  [2]  [3]  [4]  [5]
      10   15    9    8    …
```

**Algorithm Sum(A, n)**
```
S = 0;
for i = 1 to n do
    S = S + A[i];
return S;
```

## Algorithm to find the maximum of an array 'A' of size n

```
A:   [1]  [2]  [3]  [4]  [5]
      30   45   60   18   12
```

**Algorithm Max(A, n)**
```
result = A[1];
for i = 2 to n do
    if (A[i] > result) then result = A[i];
return result;
```

## Recursive Algorithms

* **Direct recursion:** Algorithm 'A' is said to be directly recursive if it calls itself.
* **Indirect recursion:** Algorithm 'A' is said to be indirectly recursive if it calls another algorithm 'B' which in turn calls 'A'.

## Algorithm to solve the Towers of Hanoi problem recursively

Move n disks from tower A to tower B, using tower C as auxiliary.

**Algorithm TOH(n, A, B, C)**
```
if (n >= 1) then
{
    TOH(n - 1, A, C, B);
    write("Move top disk from tower A to tower B");
    TOH(n - 1, C, B, A);
}
```

## Algorithm to find the factorial of a given number

**Algorithm FACT(n)**
```
if (n = 1) then return 1;
else return (n * FACT(n - 1));
```
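The two recursive algorithms above can be cross-checked with a minimal Python sketch; the function names `toh` and `fact` are illustrative choices, not from the notes.

```python
def toh(n, source, target, auxiliary):
    """Recursively move n disks from `source` to `target` using `auxiliary`."""
    if n >= 1:
        toh(n - 1, source, auxiliary, target)   # move n-1 disks out of the way
        print(f"Move top disk from tower {source} to tower {target}")
        toh(n - 1, auxiliary, target, source)   # move them onto the target

def fact(n):
    """Return n! using direct recursion."""
    if n <= 1:
        return 1
    return n * fact(n - 1)

toh(3, 'A', 'B', 'C')   # prints the 7 moves needed for 3 disks
print(fact(5))          # 120
```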
## Performance Analysis

* An algorithm is evaluated by the computing time and storage it requires.
* Performance evaluation can be of two types:
  * **A priori estimates (performance analysis):** analysis done before execution.
  * **A posteriori testing (performance measurement):** measurement done after execution.
* Performance analysis is carried out in terms of:
  * **Space Complexity:** The amount of memory an algorithm needs to run to completion. Space complexity has two parts:
    * **Fixed part:** Independent of the instance characteristics; includes space for the code, simple variables and constants.
    * **Variable part:** Dependent on the instance characteristics; includes space for variables whose size depends on the particular problem instance (and the recursion stack).
  * **Time Complexity:** The amount of computer time the algorithm needs to run to completion (discussed below).

The space complexity of an algorithm P is denoted by
```
S(P) = C + Sp
```
where C is a constant (the fixed part) and Sp denotes the variable part, which depends on the instance characteristics.

## Algorithm Sum(A, n)

```
S = 0;
for i = 1 to n do
    S = S + A[i];
return S;
```

* The variable **n** uses 1 word.
* The array **A** uses n words.
* The variable **S** uses 1 word.
* The loop variable **i** uses 1 word.

**S(P) ≥ (n + 3) words.**

## Time Complexity

* The time complexity of an algorithm is the amount of computer time it needs to run to completion.
* Time complexity has two components:
  * **Compile time:** Independent of the instance characteristics. Once a program is compiled, it can be run several times without recompilation.
  * **Run time:** Depends on the instance characteristics.
* The time complexity of an algorithm P is T(P) = c + tp, where c is the compile time and tp is the run time.

## Asymptotic Notation

There are three main asymptotic notations:

* **Big-O (O):** f(n) = O(g(n)) if and only if there exist positive constants C and n0 such that f(n) ≤ C · g(n) for all n ≥ n0.
* **Omega (Ω):** f(n) = Ω(g(n)) if and only if there exist positive constants C and n0 such that f(n) ≥ C · g(n) for all n ≥ n0.
* **Theta (Θ):** f(n) = Θ(g(n)) if and only if there exist positive constants C1, C2 and n0 such that C1 · g(n) ≤ f(n) ≤ C2 · g(n) for all n ≥ n0.

## Disjoint Sets

n = 10 elements; S1, S2, S3 are disjoint sets.

```
S1 = {1, 7, 8, 9}
S2 = {2, 5, 10}
S3 = {3, 4, 6}
```

* Disjoint sets are represented as trees; the two operations are **Union** and **Find**.
* **Union:**
  * S1 ∪ S2 = {1, 7, 8, 9} ∪ {2, 5, 10} = {1, 2, 5, 7, 8, 9, 10}
  * S1 ∪ S3 = {1, 7, 8, 9} ∪ {3, 4, 6} = {1, 3, 4, 6, 7, 8, 9}
* **Find(i):** returns the set containing element i; for example, Find(4) returns S3 because element 4 is present in S3.
* **Data representation:** each set Si is accessed through a pointer to the root of its tree. In the parent array, a root node is marked with -1 (for example, the parent of root node 3 is -1).

**Algorithm SimpleUnion(i, j)**   // i and j are roots of two different trees
```
{
    P[i] = j;
}
```

**Algorithm Find(i)**
```
{
    while (P[i] > 0) do i = P[i];
    return i;
}
```
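A minimal Python sketch of the parent-array representation and the SimpleUnion/Find operations above; the function and variable names are illustrative.

```python
# Parent array for n = 10 elements (1-indexed; index 0 unused).
# Roots are marked with -1, matching the convention in the notes.
parent = [0] * 11

def make_sets(elements):
    for e in elements:
        parent[e] = -1          # every element starts as the root of its own tree

def find(i):
    """Return the root of the tree containing element i."""
    while parent[i] > 0:
        i = parent[i]
    return i

def simple_union(i, j):
    """Union of two trees whose roots are i and j: make j the parent of i."""
    parent[i] = j

make_sets(range(1, 11))
# Build S1 = {1, 7, 8, 9} with 1 as the root and S3 = {3, 4, 6} with 3 as the root.
for e in (7, 8, 9):
    simple_union(e, 1)
for e in (4, 6):
    simple_union(e, 3)
print(find(4))   # 3, i.e. element 4 belongs to the set rooted at node 3
```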
## Divide and Conquer

**General Method:**

**Algorithm DAndC(P)**
```
if Small(P) then return S(P);
else
{
    divide P into smaller instances P1, P2, ..., Pk, k ≥ 1;
    apply DAndC to each of these sub-problems;
    return Combine(DAndC(P1), DAndC(P2), ..., DAndC(Pk));
}
```

* The computing time of DAndC is described by the recurrence relation
```
T(n) = { g(n),                                if n is small
       { T(n1) + T(n2) + ... + T(nk) + f(n),  otherwise
```
* The complexity of many divide-and-conquer algorithms is given by recurrences of the form
```
T(n) = { T(1),            n = 1
       { a·T(n/b) + f(n), n > 1
```
where a and b are known constants, T(1) is known, and n is a power of b (n = b^k).

## Substitution Method

One of the methods for solving such recurrence relations is the substitution method. For example, consider the case in which a = 2, b = 2, T(1) = 2 and f(n) = n:

```
T(n)   = 2T(n/2) + n
T(n/2) = 2T(n/4) + n/2
T(n)   = 2[2T(n/4) + n/2] + n = 2²T(n/4) + 2n
T(n/4) = 2T(n/8) + n/4
T(n)   = 2²[2T(n/8) + n/4] + 2n = 2³T(n/8) + 3n
...
T(n)   = 2^k T(n/2^k) + kn
```
For n = 2^k (k = log₂ n): T(n) = n·T(1) + n·log₂ n = 2n + n·log₂ n = O(n log n).

## Binary Search

### Recursive

**Algorithm BinSearch(a, i, j, x)**
```
if (i = j) then
{
    if (x = a[i]) then return i;
    else return 0;
}
else
{
    mid = i + (j - i)/2;
    if (x = a[mid]) then return mid;
    else if (x < a[mid]) then return BinSearch(a, i, mid - 1, x);
    else return BinSearch(a, mid + 1, j, x);
}
```

### Non-recursive

**Algorithm BinSearch(a, n, x)**
```
low = 1; high = n;
while (low ≤ high) do
{
    mid = (low + high) / 2;
    if (x < a[mid]) then high = mid - 1;
    else if (x > a[mid]) then low = mid + 1;
    else return mid;
}
return 0;
```

## Recurrence Relation for Binary Search

```
T(n) = { 1,           n = 1
       { T(n/2) + 1,  n = 2^k
```
Solving this recurrence gives T(n) = O(log₂ n).

## Quick Sort

**Algorithm QuickSort(a, lb, ub)**
```
if (lb < ub) then
{
    loc = Partition(a, lb, ub);
    QuickSort(a, lb, loc - 1);
    QuickSort(a, loc + 1, ub);
}
```

**Algorithm Partition(a, lb, ub)**   // returns the final position of the pivot
```
pivot = a[lb]; start = lb; end = ub;
while (start < end) do
{
    while (a[start] <= pivot and start < end) do start = start + 1;
    while (a[end] > pivot) do end = end - 1;
    if (start < end) then swap(a[start], a[end]);
}
a[lb] = a[end]; a[end] = pivot;
return end;
```

## Recurrence Relation for Quick Sort in the Best Case

In the best case the pivot splits the array into two halves, so T(n) = a·T(n/b) + f(n) with a = 2, b = 2 and f(n) = n:

```
T(n)   = 2T(n/2) + n
T(n/2) = 2T(n/4) + n/2
T(n)   = 2[2T(n/4) + n/2] + n = 2²T(n/4) + 2n
T(n/4) = 2T(n/8) + n/4
T(n)   = 2³T(n/8) + 3n
...
T(n)   = 2^k T(n/2^k) + kn = n·T(1) + n·log₂ n = O(n log n)
```

## Merge Sort

**Algorithm MergeSort(low, high)**
```
if (low < high) then
{
    mid = (low + high) / 2;
    MergeSort(low, mid);
    MergeSort(mid + 1, high);
    Merge(low, mid, high);
}
```

**Algorithm Merge(low, mid, high)**
```
h = low; i = low; j = mid + 1;
while ((i ≤ mid) and (j ≤ high)) do
{
    if (a[i] ≤ a[j]) then { b[h] = a[i]; i = i + 1; }
    else { b[h] = a[j]; j = j + 1; }
    h = h + 1;
}
if (i > mid) then
    for k = j to high do { b[h] = a[k]; h = h + 1; }
else
    for k = i to mid do { b[h] = a[k]; h = h + 1; }
for k = low to high do a[k] = b[k];
```

## UNIT-2. Graphs

* **Breadth First Search:** (a Python sketch of both traversals is given at the end of this section)

**Algorithm BFS(v)**
```
u = v; visited[v] = 1;
repeat
{
    for all vertices w adjacent from u do
    {
        if (visited[w] = 0) then
        {
            add w to q;          // q is a queue
            visited[w] = 1;
        }
    }
    if q is empty then return;
    delete u from q;             // next vertex to explore
} until (false);
```

* **Depth First Search:**

**Algorithm DFS(v)**
```
visited[v] = 1;
for each vertex w adjacent from v do
    if (visited[w] = 0) then DFS(w);
```

* **Connected Components:**
  * If G is a connected undirected graph, then a single call to BFS (or DFS) visits all vertices of G.
  * If G is not connected, then at least two calls to BFS are needed.
  * BFS can also be used to obtain a spanning tree of an undirected graph: a graph G has a spanning tree if and only if G is connected.
  * In the example graph, T denotes the set of tree edges; the edges (5,6), (6,8) and (4,8) are not included in the spanning tree.
  * The spanning tree obtained using BFS is called a breadth-first spanning tree.
* **Biconnected Components and DFS:**
  * A vertex v in a connected graph G is an articulation point if and only if the deletion of vertex v, together with all edges incident to v, disconnects the graph into two or more non-empty components.
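A minimal Python sketch of the BFS and DFS traversals described in this unit, using an adjacency-list dictionary; the example graph `g` is invented for illustration and is not the graph from the notes.

```python
from collections import deque

def bfs(g, v):
    """Breadth-first search from vertex v; returns vertices in visit order."""
    visited = {v}
    order = [v]
    q = deque([v])
    while q:
        u = q.popleft()
        for w in g[u]:
            if w not in visited:
                visited.add(w)
                order.append(w)
                q.append(w)
    return order

def dfs(g, v, visited=None):
    """Recursive depth-first search from vertex v; returns vertices in visit order."""
    if visited is None:
        visited = set()
    visited.add(v)
    order = [v]
    for w in g[v]:
        if w not in visited:
            order.extend(dfs(g, w, visited))
    return order

# Small undirected example graph (adjacency lists), chosen only for illustration.
g = {1: [2, 3], 2: [1, 4], 3: [1, 4], 4: [2, 3]}
print(bfs(g, 1))   # [1, 2, 3, 4]
print(dfs(g, 1))   # [1, 2, 4, 3]
```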
## Greedy Method

**Algorithm Greedy(a, n)**
```
sol = {};                 // empty solution set
for i = 1 to n do
{
    x = Select(a);
    if Feasible(sol, x) then sol = Union(sol, x);
}
return sol;
```

## Knapsack Problem (Greedy Method)

n = 3, m = 20, (P1, P2, P3) = (25, 24, 15), (W1, W2, W3) = (18, 15, 10)

* **m**: capacity of the bag
* **xi**: fraction of object i placed in the bag (0 ≤ xi ≤ 1)

**i) Maximize profit (choose objects in order of decreasing profit)**

* x1 = 1 (remaining capacity 20 - 18 = 2)
* x2 = 2/15 (only 2 units of capacity are left)
* x3 = 0 (no capacity left)
* ΣPiXi = P1X1 + P2X2 + P3X3 = 25·1 + 24·(2/15) + 15·0 = 25 + 3.2 = **28.2**
* ΣWiXi = W1X1 + W2X2 + W3X3 = 18·1 + 15·(2/15) + 10·0 = 18 + 2 = **20**

**ii) Minimize weight (choose objects in order of increasing weight)**

* x3 = 1 (remaining capacity 20 - 10 = 10)
* x2 = 10/15 = 2/3 (only 10 units of capacity are left)
* x1 = 0 (no capacity left)
* ΣPiXi = 25·0 + 24·(2/3) + 15·1 = 16 + 15 = **31**
* ΣWiXi = 18·0 + 15·(2/3) + 10·1 = 10 + 10 = **20**

**iii) Maximize the P/W ratio**

* P1/W1 = 25/18 ≈ 1.39, P2/W2 = 24/15 = 1.6, P3/W3 = 15/10 = 1.5
* x2 = 1 (remaining capacity 20 - 15 = 5)
* x3 = 5/10 = 1/2 (only 5 units of capacity are left)
* x1 = 0 (no capacity left)
* ΣPiXi = 25·0 + 24·1 + 15·(1/2) = 24 + 7.5 = **31.5**
* ΣWiXi = 18·0 + 15·1 + 10·(1/2) = 15 + 5 = **20**

**Conclusion:** The profit is maximum when objects are chosen in order of maximum P/W ratio. The solution is (0, 1, 1/2), which gives the maximum profit 31.5.

## Greedy Algorithm for the Knapsack Problem

**Algorithm GreedyKnapsack(m, n)**   // objects are assumed sorted so that p[i]/w[i] ≥ p[i+1]/w[i+1]
```
for i = 1 to n do x[i] = 0;
u = m;
for i = 1 to n do
{
    if (w[i] > u) then break;
    x[i] = 1; u = u - w[i];
}
if (i ≤ n) then x[i] = u / w[i];
```

n = 7, m = 15, (P1, ..., P7) = (10, 5, 15, 7, 6, 18, 3), (W1, ..., W7) = (2, 3, 5, 7, 1, 4, 1)

**i) Maximize profit**

* x6 = 1 (15 - 4 = 11)
* x3 = 1 (11 - 5 = 6)
* x1 = 1 (6 - 2 = 4)
* x4 = 4/7, x2 = 0, x5 = 0, x7 = 0
* ΣPiXi = 10·1 + 5·0 + 15·1 + 7·(4/7) + 6·0 + 18·1 + 3·0 = 10 + 15 + 4 + 18 = **47**

**ii) Minimize weight**

* x5 = 1 (15 - 1 = 14)
* x7 = 1 (14 - 1 = 13)
* x1 = 1 (13 - 2 = 11)
* x2 = 1 (11 - 3 = 8)
* x6 = 1 (8 - 4 = 4)
* x3 = 4/5, x4 = 0
* ΣPiXi = 10·1 + 5·1 + 15·(4/5) + 7·0 + 6·1 + 18·1 + 3·1 = 10 + 5 + 12 + 6 + 18 + 3 = **54**

**iii) Maximize the P/W ratio**

* P/W ratios: 10/2 = 5, 5/3 ≈ 1.67, 15/5 = 3, 7/7 = 1, 6/1 = 6, 18/4 = 4.5, 3/1 = 3
* Order of selection: x5 = 1, x1 = 1, x6 = 1, x3 = 1, x7 = 1 (remaining capacity 15 - 1 - 2 - 4 - 5 - 1 = 2), then x2 = 2/3 and x4 = 0
* ΣPiXi = 6 + 10 + 18 + 15 + 3 + 5·(2/3) ≈ **55.3**
* ΣWiXi = 1 + 2 + 4 + 5 + 1 + 3·(2/3) = **15**

## Job Sequencing with Deadlines

* n = 7
* **Jobs:** J1, J2, J3, J4, J5, J6, J7
* **Profits:** 35, 30, 25, 20, 15, 12, 5
* **Deadlines:** 3, 4, 4, 2, 1, 3, 2

**Algorithm JS(d, J, n)**   // jobs are ordered so that p[1] ≥ p[2] ≥ ... ≥ p[n]; d[i] ≥ 1 are the deadlines
```
d[0] = J[0] = 0;
J[1] = 1; k = 1;
for i = 2 to n do
{
    r = k;
    while ((d[J[r]] > d[i]) and (d[J[r]] ≠ r)) do r = r - 1;
    if ((d[J[r]] ≤ d[i]) and (d[i] > r)) then
    {
        for q = k to (r + 1) step -1 do J[q + 1] = J[q];
        J[r + 1] = i; k = k + 1;
    }
}
return k;
```

## Minimum Cost Spanning Tree

**Spanning Tree (for the example graph with 6 vertices)**

* Number of vertices = 6
* Number of edges in a spanning tree = 6 - 1 = 5
* Number of spanning trees of the example graph = ⁶C₅
* **Prim's algorithm:** Total cost = 10 + 25 + 22 + 12 + 16 + 14 = 99
* **Kruskal's algorithm:** Total cost = 10 + 12 + 14 + 16 + 22 + 25 = 99

## Single Source Shortest Path

**Dijkstra's algorithm**

(Figure: schematic representation of the example graph.)
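The notes only name Dijkstra's algorithm here, so the following is a minimal Python sketch of the standard algorithm, assuming non-negative edge weights; the example graph `g` is invented for illustration and is not the graph from the figure.

```python
import heapq

def dijkstra(g, source):
    """Single-source shortest paths on a weighted graph with non-negative edges.
    g maps each vertex to a list of (neighbour, edge_weight) pairs."""
    dist = {v: float('inf') for v in g}
    dist[source] = 0
    heap = [(0, source)]                 # (distance, vertex) priority queue
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue                     # stale queue entry
        for v, w in g[u]:
            if d + w < dist[v]:
                dist[v] = d + w
                heapq.heappush(heap, (dist[v], v))
    return dist

# Illustrative graph only (not the one from the notes).
g = {
    'a': [('b', 4), ('c', 1)],
    'b': [('d', 1)],
    'c': [('b', 2), ('d', 5)],
    'd': [],
}
print(dijkstra(g, 'a'))   # {'a': 0, 'b': 3, 'c': 1, 'd': 4}
```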
## UNIT-3. Dynamic Programming

* **Principle of Optimality:** A problem is solved by taking a sequence of decisions. The principle of optimality states that an optimal sequence of decisions has the property that, whatever the initial state and decision are, the remaining decisions must constitute an optimal decision sequence with respect to the state resulting from the first decision.

## All-Pairs Shortest Path

A0 (cost adjacency matrix)
```
     1   2   3   4
1    0   3   ∞   7
2    8   0   2   ∞
3    5   ∞   0   1
4    2   ∞   ∞   0
```

A1 (intermediate vertices from {1})
```
     1   2   3   4
1    0   3   ∞   7
2    8   0   2   15
3    5   8   0   1
4    2   5   ∞   0
```

A2 (intermediate vertices from {1, 2})
```
     1   2   3   4
1    0   3   5   7
2    8   0   2   15
3    5   8   0   1
4    2   5   7   0
```

A3 (intermediate vertices from {1, 2, 3})
```
     1   2   3   4
1    0   3   5   6
2    7   0   2   3
3    5   8   0   1
4    2   5   7   0
```

A4 (intermediate vertices from {1, 2, 3, 4}) — the all-pairs shortest distances
```
     1   2   3   4
1    0   3   5   6
2    5   0   2   3
3    3   6   0   1
4    2   5   7   0
```

## Travelling Salesperson Problem

Cost matrix C:
```
     1   2   3   4
1    0  10  15  20
2    5   0   9  10
3    6  13   0  12
4    8   8   9   0
```

Let g(i, S) be the length of a shortest path starting at vertex i, going through all vertices in S, and ending at vertex 1, so that g(i, S) = min over j in S of { Cij + g(j, S − {j}) } with g(i, ∅) = Ci1.

Base cases (S = ∅):
* g(1, ∅) = C11 = 0
* g(2, ∅) = C21 = 5
* g(3, ∅) = C31 = 6
* g(4, ∅) = C41 = 8

|S| = 1:
* g(2, {3}) = C23 + g(3, ∅) = 9 + 6 = **15**
* g(2, {4}) = C24 + g(4, ∅) = 10 + 8 = **18**
* g(3, {2}) = C32 + g(2, ∅) = 13 + 5 = **18**
* g(3, {4}) = C34 + g(4, ∅) = 12 + 8 = **20**
* g(4, {2}) = C42 + g(2, ∅) = 8 + 5 = **13**
* g(4, {3}) = C43 + g(3, ∅) = 9 + 6 = **15**

|S| = 2:
* g(2, {3, 4}) = min{ C23 + g(3, {4}), C24 + g(4, {3}) } = min{ 9 + 20, 10 + 15 } = **25**
* g(3, {2, 4}) = min{ C32 + g(2, {4}), C34 + g(4, {2}) } = min{ 13 + 18, 12 + 13 } = **25**
* g(4, {2, 3}) = min{ C42 + g(2, {3}), C43 + g(3, {2}) } = min{ 8 + 15, 9 + 18 } = **23**

|S| = 3:
* g(1, {2, 3, 4}) = min{ C12 + g(2, {3, 4}), C13 + g(3, {2, 4}), C14 + g(4, {2, 3}) } = min{ 10 + 25, 15 + 25, 20 + 23 } = **35**

The length of the optimal tour is 35.
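The g(i, S) computation above can be cross-checked with a compact Python sketch of the same dynamic-programming recurrence (Held–Karp); the function name and the use of frozensets are illustrative choices, not from the notes.

```python
from itertools import combinations

def tsp_held_karp(cost):
    """Dynamic-programming TSP. cost[i][j] is the 0-indexed edge cost;
    the tour starts and ends at city 0. Returns the minimum tour length."""
    n = len(cost)
    # g[(i, S)] = min cost of starting at city i, visiting every city in the
    # frozenset S exactly once, and ending at city 0.
    g = {}
    for i in range(1, n):
        g[(i, frozenset())] = cost[i][0]          # base case: g(i, {}) = C[i][0]
    for size in range(1, n - 1):
        for S in combinations(range(1, n), size):
            S = frozenset(S)
            for i in range(1, n):
                if i in S:
                    continue
                g[(i, S)] = min(cost[i][j] + g[(j, S - {j})] for j in S)
    full = frozenset(range(1, n))
    return min(cost[0][j] + g[(j, full - {j})] for j in full)

# Cost matrix from the worked example above (0-indexed).
C = [[0, 10, 15, 20],
     [5,  0,  9, 10],
     [6, 13,  0, 12],
     [8,  8,  9,  0]]
print(tsp_held_karp(C))   # 35, matching g(1, {2, 3, 4}) above
```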
## 0/1 Knapsack Problem Using Dynamic Programming

Consider the knapsack instance with n = 3 and m = 6, profits (P1, P2, P3) = (1, 2, 5) and weights (W1, W2, W3) = (2, 3, 4); m is the capacity of the bag.

Each S<sup>i</sup> is a set of (profit, weight) pairs. S<sup>i</sup><sub>1</sub> is obtained by adding (P<sub>i+1</sub>, W<sub>i+1</sub>) to every pair of S<sup>i</sup>, and S<sup>i+1</sup> is the merge of S<sup>i</sup> and S<sup>i</sup><sub>1</sub>, discarding pairs that exceed the capacity and purging dominated pairs.

**Solution:**

* S<sup>0</sup> = { (0, 0) }
* S<sup>0</sup><sub>1</sub> = { (1, 2) }
* S<sup>1</sup> = { (0, 0), (1, 2) }
* S<sup>1</sup><sub>1</sub> = { (2, 3), (3, 5) }
* S<sup>2</sup> = { (0, 0), (1, 2), (2, 3), (3, 5) }
* S<sup>2</sup><sub>1</sub> = { (5, 4), (6, 6), (7, 7), (8, 9) }
* S<sup>3</sup> = { (0, 0), (1, 2), (2, 3), (3, 5), (5, 4), (6, 6), (7, 7), (8, 9) }
* (7, 7) and (8, 9) are discarded because their weights exceed the capacity m = 6.
* By the purging rule, (3, 5) is eliminated because (5, 4) gives more profit with less weight.
* S<sup>3</sup> = { (0, 0), (1, 2), (2, 3), (5, 4), (6, 6) }

**Continue for n = 4, m = 30**, with (P1, ..., P4) = (2, 5, 8, 1) and (W1, ..., W4) = (10, 15, 6, 9):

* S<sup>0</sup> = { (0, 0) }
* S<sup>0</sup><sub>1</sub> = { (2, 10) }
* S<sup>1</sup> = { (0, 0), (2, 10) }
* S<sup>1</sup><sub>1</sub> = { (5, 15), (7, 25) }
* S<sup>2</sup> = { (0, 0), (2, 10), (5, 15), (7, 25) }
* (P3, W3) = (8, 6): S<sup>2</sup><sub>1</sub> = { (8, 6), (10, 16), (13, 21), (15, 31) }
* S<sup>3</sup> = { (0, 0), (2, 10), (5, 15), (7, 25), (8, 6), (10, 16), (13, 21), (15, 31) }
* (15, 31) is removed because it exceeds the capacity m = 30.
* By the purging rule, (2, 10), (5, 15) and (7, 25) are removed: each is dominated by (8, 6).
* S<sup>3</sup> = { (0, 0), (8, 6), (10, 16), (13, 21) }
* (P4, W4) = (1, 9): S<sup>3</sup><sub>1</sub> = { (1, 9), (9, 15), (11, 25), (14, 30) }
* S<sup>4</sup> = { (0, 0), (8, 6), (1, 9), (9, 15), (10, 16), (13, 21), (11, 25), (14, 30) }
* By the purging rule, (1, 9) is removed (dominated by (8, 6)) and (11, 25) is removed (dominated by (13, 21)).
* S<sup>4</sup> = { (0, 0), (8, 6), (9, 15), (10, 16), (13, 21), (14, 30) }
* (14, 30) has the highest profit.

**Tracing back the solution:**

* (14, 30) ∈ S<sup>4</sup> but (14, 30) ∉ S<sup>3</sup>, so x<sub>4</sub> = 1; subtract (P4, W4) = (1, 9): (14 - 1, 30 - 9) = (13, 21).
* (13, 21) ∈ S<sup>3</sup> but (13, 21) ∉ S<sup>2</sup>, so x<sub>3</sub> = 1; subtract (P3, W3) = (8, 6): (13 - 8, 21 - 6) = (5, 15).
* (5, 15) ∈ S<sup>2</sup> but (5, 15) ∉ S<sup>1</sup>, so x<sub>2</sub> = 1; subtract (P2, W2) = (5, 15): (0, 0).
* (0, 0) ∈ S<sup>1</sup> and (0, 0) ∈ S<sup>0</sup>, so x<sub>1</sub> = 0.
* Solution: (x1, x2, x3, x4) = (0, 1, 1, 1)

**Profit = 5 + 8 + 1 = 14**
**Weight = 15 + 6 + 9 = 30**

## Matrix Chain Multiplication

* Let A<sub>i</sub> have dimensions d<sub>i-1</sub> × d<sub>i</sub>.
* Let M<sub>i,j</sub> denote the cost of multiplying A<sub>i</sub> ... A<sub>j</sub>, where the cost is measured in the number of scalar multiplications.
* Here M<sub>i,i</sub> = 0, and for i < j, M<sub>i,j</sub> = min<sub>i ≤ k < j</sub> { M<sub>i,k</sub> + M<sub>k+1,j</sub> + d<sub>i-1</sub> · d<sub>k</sub> · d<sub>j</sub> }.
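A minimal Python sketch of this recurrence, using hypothetical matrix dimensions d = [10, 20, 5, 15] (that is, A1 is 10×20, A2 is 20×5, A3 is 5×15) chosen only for illustration.

```python
def matrix_chain_order(d):
    """Minimum number of scalar multiplications to compute A1...An,
    where Ai has dimensions d[i-1] x d[i]."""
    n = len(d) - 1                                 # number of matrices
    M = [[0] * (n + 1) for _ in range(n + 1)]      # M[i][j], 1-indexed; M[i][i] = 0
    for length in range(2, n + 1):                 # chain length
        for i in range(1, n - length + 2):
            j = i + length - 1
            M[i][j] = min(M[i][k] + M[k + 1][j] + d[i - 1] * d[k] * d[j]
                          for k in range(i, j))
    return M[1][n]

# Hypothetical dimensions: best order is (A1 x A2) x A3,
# costing 10*20*5 + 10*5*15 = 1750 scalar multiplications.
print(matrix_chain_order([10, 20, 5, 15]))   # 1750
```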