CS353 Midterm Slides PDF

Summary

These slides cover the fundamental concepts of algorithms and their use in solving problems. They discuss algorithm characteristics, problem types, and different methods of specifying algorithms. The slides also provide examples and illustrate the process of designing and implementing efficient algorithms.

Full Transcript

‫ر‬ ‫الجامعة السعودية االلكتونية‬ ‫ر‬ ‫االلكتونية‬ ‫الجامعة السعودية‬ ‫‪26/12/2021‬‬ College of Computing and Informatics Design and Analysis of Algorithms Design and Analysis of Algorithms Module 1 Fundamental of Algorithms Week Learning Outcome...

‫ر‬ ‫الجامعة السعودية االلكتونية‬ ‫ر‬ ‫االلكتونية‬ ‫الجامعة السعودية‬ ‫‪26/12/2021‬‬ College of Computing and Informatics Design and Analysis of Algorithms Design and Analysis of Algorithms Module 1 Fundamental of Algorithms Week Learning Outcomes Explain algorithms’ meaning and their role in solving problems Classify the types of common problems and various types of data structures 4 Topic Outline The definition of algorithm Fundamentals of Algorithmic Problem Solving Important Problem Types 5 The definition of algorithm What is an algorithm An algorithm is a sequence of clear instructions for solving a problem for acquiring a required Problem output for any legitimate input in a finite amount of time. Algorithm Input Computer output 7 The Basic blocks of an algorithm ▪ The clear requirement for each step of an algorithm cannot be compromised. ▪ The careful specification for the range of inputs Problem for which an algorithm works for. ▪ Several different ways can be used for represented the same algorithm. ▪ Several algorithms can be found for solving the Algorithm same problem. What do distinguish these algorithms ? ✓ using different ideas , Input Computer output ✓ different speeds. 8 Characteristics of an algorithm ▪ Input: Zero / more quantities are externally supplied. ▪ Output: At least one quantity is produced. ▪ Definiteness: Each instruction is clear and unambiguous. ▪ Finiteness: If the instructions of an algorithm is traced then for all cases the algorithm must terminates after a finite number of steps. ▪ Efficiency: Every instruction must be very basic and runs in short time. 9 Steps for writing an algorithm 1. An algorithm is a procedure. It has two parts; the first part is head and the second part is body. 2. The Head section consists of keyword Algorithm and Name of the algorithm with parameter list. E.g. Algorithm name1(p1, p2,…,p3) The head section also has the following: //Problem Description: //Input: //Output: 3. In the body of an algorithm various programming constructs like if, for, while and some statements like assignments are used and uses different data structures likes Arrays, Trees , Lists and Sets. 10 Steps for writing an algorithm 4. The compound statements may be enclosed with { and } brackets. if, for, while can be closed by endif, endfor, endwhile respectively. Proper indention is must for block. 5. Comments are written using // at the beginning. 6. The identifier should begin by a letter and not by digit. It contains alpha numeric letters after first letter. No need to mention data types. 7. The left arrow “←” used as assignment operator. E.g. v←10 8. Boolean operators (TRUE, FALSE), Logical operators (AND, OR, NOT) and Relational operators (=,=, ≠, ) are also used. 9. Input and Output can be done using read and write. 11 Example Consider Euclid’s algorithm for computing gcd(m, n) in simple steps Step 1 If n = 0, return the value of m as the answer and stop; otherwise, proceed to Step 2. Step 2 Divide m by n and assign the value of the remainder to r. Step 3 Assign the value of n to m and the value of r to n. Go to Step 1. Euclid’s algorithm for computing gcd(m, n) expressed in pseudocode ALGORITHM Euclid_gcd(m, n) //Computes gcd(m, n) by Euclid’s algorithm //Input: Two nonnegative, not-both-zero integers m and n //Output: Greatest common divisor of m and n while n ≠ 0 do r ←m mod n m←n n←r 12 return m Fundamentals of Algorithmic Problem Solving Steps of Designing and Analysis of Algorithms 14 Understanding The problem This is the first step in designing of algorithm. In this step : Read the problem’s description carefully to understand the problem statement completely. Ask questions for clarifying the doubts about the problem. Identify the problem types and use existing algorithm to find solution. Input (instance) to the problem and range of the input get fixed. Remember that a correct algorithm is not one that works most of the time, but one that works correctly for all legitimate inputs. Skipping the first step will run the risk of unnecessary rework. 15 Decision making The Decision making is done on the following: (a) Ascertaining the Capabilities of the Computational Device In random-access machine (RAM), instructions are executed one after another (The central assumption is that one operation at a time) Algorithms designed to be executed on such machines are called sequential algorithms. In some newer computers, operations are executed concurrently. Algorithms that take advantage of this capability are called parallel algorithms. Choice of computational devices like Processor and memory is mainly based on space and time efficiency 16 Decision making The Decision making is done on the following: (b) Choosing between Exact and Approximate Problem Solving An algorithm used to solve the problem exactly and produce correct result is called an exact algorithm. If the problem is so complex and not able to get exact solution, then we have to choose an algorithm called an approximation algorithm. Examples: extracting square roots, solving nonlinear equations, and evaluating definite integrals. 17 Decision making The Decision making is done on the following: ( c ) Algorithm Design Techniques An algorithm design technique is a general approach to solving problems algorithmically that is applicable to a variety of problems from different areas of computing. Algorithms + Data Structures = Programs The choice of proper data structure is required before designing the algorithm. Implementation of algorithm is possible only with the help of Algorithms and Data Structures Some of Algorithmic technique Brute Force, Divide and Conquer, Dynamic Programming, Greedy Technique and so on. 18 Methods of Specifying an Algorithm There are three ways to specify an algorithm. Algorithm Spesifications ( a) ( b) ( c) Natural Language Pseudocode Flowchart Pseudocode is the option that are most widely used nowadays for specifying algorithms. 19 Methods of Specifying an Algorithm a. Natural Language It is very simple and easy to specify an algorithm Drawbacks: many times provides a brief specification. Example: An algorithm to perform addition of two numbers. Step 1: Read the first number, say a. Step 2: Read the second number, say b. Step 3: Add the above two numbers and store the result in c. Step 4:( a) Display the result from c. ThinkNatural of implementing Language this ? Many programmers prefer to have algorithm specification in Pseudocode. 20 Methods of Specifying an Algorithm b. Pseudocode Pseudocode is a mixture of a natural language and programming language constructs. Pseudocode is usually more precise than natural language. For Assignment operation left arrow “←”, for comments two slashes “//”,if condition, for, while loops are used. Example: ALGORITHM Sum(a,b) //Problem Description: This algorithm performs addition of two numbers //Input: Two integers a and b //Output:( a) Addition of two integers c←a+b Natural Language return c This specification is more useful for implementation of any language. 21 Methods of Specifying an Algorithm c. Flowchart Flowchart is a graphical representation of an algorithm by using a collection of connected geometric shapes containing descriptions of the algorithm’s steps. It is the dominant method in the earlier days of computing, This representation technique is inconvenient. Flowchart Symbols Start Start state Input & Output ( a) Transition / Assignment Condition / Decision Natural Language Processing/ read input Flow conectivity Stop Stop state 22 Methods of Specifying an Algorithm c. Flowchart Example: Addition two integers a & b Note: flowcharts can be found only in old algorithm books. Pseudocode is dominate method 23 Proving an Algorithm’s Correctness Algorithm's correctness comes after the specification of an algorithm. An algorithm must give a required result for every legitimate input in a finite amount of time. A mathematical induction is a common technique for proving correctness of an algorithm. It is used because an algorithm’s iterations provide a natural sequence of steps The notion of correctness for approximation algorithms is less straightforward than it is for exact algorithms. The error produced by the algorithm should not exceed a predefined limit. 24 Analyzing an Algorithm There are two kinds of algorithm efficiency. They are: Time efficiency, indicating how fast the algorithm runs, and Space efficiency, indicating how much extra memory it uses. The efficiency of an algorithm is determined by measuring both time efficiency and space efficiency. So factors to analyze an algorithm are: Time efficiency of an algorithm Space efficiency of an algorithm Simplicity of an algorithm Generality of an algorithm 25 Coding an Algorithm The coding / implementation of an algorithm is done by a suitable programming language like C, C++, JAVA. It is very essential to write an optimized code (efficient code) to reduce the burden of compiler. 26 Important Problem Types Important Problem Types The most important problems types are : Sorting Searching String processing Graph problems Combinatorial problems Geometric problems Numerical problems 28 Important Problem Types (i) Sorting The sorting problem is to rearrange the items of a given list in nondecreasing (ascending) order. Sorting can be done on numbers, characters, strings or records. To sort student records in alphabetical order of names or by student number or by student grade-point average. Such a specially chosen piece of information is called a key. An algorithm is said to be in-place if it does not require extra memory, E.g., Quick sort. A sorting algorithm is called stable if it preserves the relative order of any two equal elements in its input. 29 Important Problem Types (ii) Searching The searching problem deals with finding a given value, called a search key, in a given set. E.g., Ordinary Linear search and fast binary search. (iii) String processing A string is a sequence of characters from an alphabet. Strings comprise letters, numbers, and special characters; bit strings, which comprise zeros and ones; and gene sequences, which can be modeled by strings of characters from the four character alphabet {A, C, G, T}. It is very useful in bioinformatics. Searching for a given word in a text is called string matching 30 Important Problem Types (iv) Graph problems A graph is a collection of points called vertices, some of which are connected by line segments called edges. Some of the graph problems are graph traversal, shortest path algorithm, topological sort, traveling salesman problem and the graph-coloring problem and so on. (v) Combinatorial problems These are problems that ask, explicitly or implicitly, to find a combinatorial object such as a permutation, a combination, or a subset that satisfies certain constraints. A desired combinatorial object may also be required to have some additional property such as a maximum value or a minimum cost. In practical, the combinatorial problems are the most difficult problems in computing. The traveling salesman problem and the graph coloring problem are examples of combinatorial problems. 31 Important Problem Types (vi) Geometric problems Geometric algorithms deal with geometric objects such as points, lines, and polygons. Geometric algorithms are used in computer graphics, robotics, and tomography. The closest-pair problem and the convex-hull problem are comes under this category. (vii) Numerical problems Numerical problems are problems that involve mathematical equations, systems of equations, computing definite integrals, evaluating functions, and so on. The majority of such mathematical problems can be solved only approximately. 32 Activity !!! In groups , state the basic types of data structure and define their features ? 33 Summary Algorithms for the same problem can be based on very different ideas and can solve the problem with dramatically different speeds. Input, output, definiteness, finiteness and efficiency are the main characteristics of any algorithms We introduced the major steps for writing an algorithms and also the steps for designing and analysis an algorithms in details. Sorting problems , Searching problems , String matching problems Graph problems , Combinatorial problems , Geometric problems and Numerical problems Algorithm design techniques (or “strategies” or “paradigms”) are general approaches to solving problems algorithmically, applicable to a variety of problems from different areas of computing. 34 List of Readings Required Reading : Chapter 1 from Anany Levitin (2012). Introduction to the Design and Analysis of Algorithms. Pearson, 2012, 3rd edition Recommending Reading : Chapter 1 from Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest, Clifford Stein ( 2022 ). Introduction to Algorithms. MIT Press..Fourth edition Chapter 1 from Michael T.Goodrich, Roberto Tamassia (2014). Algorithm Design and Applications. First Edition Wiley. 35 Thank You 36 ‫ر‬ ‫الجامعة السعودية االلكتونية‬ ‫ر‬ ‫االلكتونية‬ ‫الجامعة السعودية‬ ‫‪26/12/2021‬‬ College of Computing and Informatics Design and Analysis of Algorithms Design and Analysis of Algorithms Module 2 Algorithms Analysis Methods Week Learning Outcomes Express of analysis of algorithms framework and analysis framework Explain the growth function Asymptotic notation, functions, and running times Apply analysis of algorithm framework for recursive and non-recursive functions 40 Topic Outline The algorithm analysis framework Asymptotic notations and basic efficiency classes Application of mathematical analysis for non- recursive and recursive algorithms Empirical Analysis of Algorithm 41 The Analysis Framework Analysis of algorithms Issues: correctness time efficiency space efficiency optimality Approaches: empirical analysis – less useful theoretical analysis – most important 43 The Analysis Framework There are two kinds of efficiency: Time efficiency also can be called time complexity indicating how fast the algorithm runs. Space efficiency, also called space complexity indicating the amount of memory units required by the algorithm as well as the space needed for its input and output. The algorithm analysis framework consists of the following: Measuring an Input’s Size Units for Measuring Running Time Orders of Growth Worst-Case, Best-Case, and Average-Case Efficiencies 44 Theoretical analysis of time efficiency Time efficiency is analyzed by determining the number of repetitions of the basic operation as a function of input size Basic operation: the operation that contributes most towards the running time of the algorithm input size T(n) ≈ copC(n) Number of times running time execution time basic operation is for basic operation executed 45 Measuring an Input’s Size An algorithm’s efficiency is defined as a function of some parameter n indicating the algorithm’s input size.. For the problem of evaluating a polynomial 𝑝 𝑥 = 𝑎𝑛 𝑥 𝑛 + …. +𝑎0 of degree n, the size of the parameter will be the polynomial’s degree or the number of its coefficients, which is larger by 1 than its degree. In computing the product of two n ✕ n matrices, the choice of a parameter indicating an input size does matter. Consider a spell-checking algorithm. If the algorithm examines individual characters of its input, then the size is measured by the number of characters. In measuring input size for algorithms solving problems such as checking primality of a positive integer n. The input is just one number. The input size by the number b of bits in the n’s binary representation is 𝑏 = ( log 2 𝑛) + 1. 46 Units for Measuring Running Time Some standard unit of time measurement such as a second, or millisecond, and so on can be used to measure the running time of a program after implementing the algorithm. Drawbacks 1. Dependence on the speed of a particular computer. 2. Dependence on the quality of a program implementing the algorithm. 3. The compiler used in generating the machine code. 4. The difficulty of clocking the actual running time of the program. One possible approach is to count the number of times each of the algorithm’s operations is executed. This approach is excessively difficult. The most important operation (+, -, *, /) of the algorithm, called the basic operation. Computing the number of times the basic operation is executed is easy. The total running time is determined by basic operations count. 47 Input size and basic operation examples Problem Input size measure Basic operation Searching for key in a list of n Number of list’s items, i.e. n Key comparison items Matrix dimensions or total number Multiplication of two Multiplication of two matrices of elements numbers Checking primality of a given n’size = number of digits (in binary Division integer n representation) Visiting a vertex or Typical graph problem #vertices and/or edges traversing an edge 48 Orders of Growth A difference in running times on small inputs is not what really distinguishes efficient algorithms from inefficient ones. For example, the greatest common divisor of two small numbers, it is not immediately clear how much more efficient Euclid’s algorithm is compared to the other algorithms, the difference in algorithm efficiencies becomes clear for larger numbers only. 49 Orders of Growth For large values of n, it is the function’s order of growth that counts As shown in table below, which contains values of a few functions particularly important for analysis of algorithms 50 Worst-case, Best-case, and Average-case Efficiencies Consider Sequential Search algorithm some search key K ALGORITHM SequentialSearch(A[0..n - 1], K) //Searches for a given value in a given array by sequential search //Input: An array A[0..n - 1] and a search key K //Output: The index of the first element in A that matches K or -1 if there are no // matching elements i ←0 while i < n and A[i] ≠ K do Clearly, the running time of this algorithm can be quite i ←i + 1 different for the same list size n. if i < n return i else return -1 In the worst case, there is no matching of elements or the first matching element can found at last on the list. In the best case, there is matching of elements at first on the list. 51 Worst-case Efficiency The worst-case efficiency of an algorithm is its efficiency for the worst case input of size 𝑛. The algorithm runs the longest among all possible inputs of that size. For the input of size 𝑛, the running time is 𝐶𝑤𝑜𝑟𝑠𝑡 𝑛 = 𝑛. The worst-case analysis provides very important information about an algorithm’s efficiency by bounding its running time from above it guarantees that for any instance of size n, the running time will not exceed 𝐶𝑤𝑜𝑟𝑠𝑡 𝑛 , its running time on the worst-case inputs Think about Worst-Case efficiency in Sequential Search !!! 52 Best-case Efficiency The best-case efficiency of an algorithm is its efficiency for the best case input of size 𝑛. The algorithm runs the fastest among all possible inputs of that size 𝑛. In sequential search, If we search a first element in list of size 𝑛. (i.e. first element equal to a search key), then the running time is 𝐶𝑏𝑒𝑠𝑡 𝑛 = 1 53 Average-case Efficiency The Average case efficiency lies between best case and worst case. To analyze the algorithm’s average case efficiency, we must make some assumptions about possible inputs of size 𝑛. The standard assumptions are that o The probability of a successful search is equal to 𝑝 0 ≤ 𝑝 ≤ 1 and o The probability of the first match occurring in the 𝑖𝑡ℎ position of the list is the same for every 𝑖. 54 Average case efficiency In the case of a successful search, the probability of the first match occurring in the 𝑖𝑡ℎ position of the list is 𝑝/𝑛 for every 𝑖, and the number of comparisons made by the algorithm in such a situation is obviously 𝑖. In the case of an unsuccessful search, the number of comparisons will be 𝑛 with the probability of such a search being (1 − 𝑝). Therefore, 55 Amortized Efficiency Amortized efficiency applies not to a single run of an algorithm but rather to a sequence of operations performed on the same data structure. In some situations ,a single operation can be expensive, but the total time for an entire sequence of 𝑛 such operations is always significantly better than the worst- case efficiency of that single operation multiplied by 𝑛. “Amortize” the high cost of such a worst-case occurrence over the entire sequence in a manner similar to the way a business would amortize the cost of an expensive item over the years of the item’s productive life. This sophisticated approach was discovered by Robert Tarjan, who used it, among other applications, in developing an interesting variation of the classic binary search tree 56 Key Points of the Analysis Framework Both time and space efficiencies are measured as functions of the algorithm’s input size. Time efficiency is measured by counting the number of times the algorithm’s basic operation is executed. Space efficiency is measured by counting the number of extra memory units consumed by the algorithm. The efficiencies of some algorithms may differ significantly for inputs of the same size. For such algorithms, we need to distinguish between the worst-case, average-case, and best-case efficiencies. The framework’s primary interest lies in the order of growth of the algorithm’s running time (extra memory units consumed) as its input size goes to infinity. 57 Asymptotic Notations and Basic Efficiency Classes Asymptotic notation Asymptotic notation is a notation, which is used to take meaningful statement about the efficiency of a program. The efficiency analysis framework concentrates on the order of growth of an algorithm’s basic operation count as the principal indicator of the algorithm’s efficiency. To compare and rank such orders of growth, computer scientists use three notations, they are: O - Big oh notation Ω - Big omega notation Θ - Big theta notation 59 Asymptotic notation Let 𝑡(𝑛) and 𝑔(𝑛) can be any nonnegative functions defined on the set of natural numbers. The algorithm’s running time 𝑡(𝑛) usually indicated by its basic operation count 𝐶(𝑛), and 𝑔(𝑛), some simple function to compare with the count. Example 1: 𝑶(𝒈(𝒏)) is the set of all functions with a lower or same order of growth as 𝑔(𝑛) (to within a constant multiple, as 𝑛 goes to infinity). Thus, to give a few examples, the following assertions are all true: n ∈ O(n2), 100n + 5 ∈ O(n2), 60 Asymptotic notation Example 1: The second notation, Ω (𝒈(𝒏)), stands for the set of all functions with a higher or same order of growth as 𝒈(𝒏) (to within a constant multiple, as 𝑛 goes to infinity). For example, The third notian ⍬(𝑔(𝑛)) is the set of all functions that have the same order of growth as 𝑔(𝑛) (to within a constant multiple, as n goes to infinity). Thus, every quadratic function is in ⊝ (𝑛2 ) 61 O-notation A function 𝑡(𝑛) is said to be in 𝑂(𝑔(𝑛)), denoted 𝑡 (𝑛) ∈ 𝑂(𝑔(𝑛)), if 𝑡(𝑛) is bounded above by some constant multiple of 𝑔(𝑛) for all large 𝑛, i.e., if there exist some positive constant 𝑐 and some nonnegative integer 𝑛0 such that t(n) ≤ cg(n) for all n ≥ n0. Where 𝑡(𝑛) and 𝑔(𝑛) are nonnegative functions defined on the set of natural numbers. O = Asymptotic upper bound = Useful for worst case analysis = Loose bound 62 O-notation Example : formally prove that 100 𝑛 + 5 ∈ 𝑂(𝑛2 ) 100 𝑛 + 5 ≤ 100 𝑛 + 𝑛 ∀ 𝑛 ≥ 5 = 101 𝑛 ≤ 101𝑛2 as values of the constants c and n0 required by the definition, we can take 101 and 5, respectively the definition gives us a lot of freedom in choosing specific values for constants c and n0. For example, we could also reason that 100𝑛 + 5 ≤ 100𝑛 + 5𝑛 𝑓𝑜𝑟 𝑎𝑙𝑙 𝑛 ≥ 1 = 105𝑛 i.e., 100𝑛 + 5 ≤ 105𝑛 i.e., 𝑡(𝑛) ≤ 𝑐𝑔(𝑛) ∴ 100 𝑛 + 5 ∈ 𝑂(𝑛) with c=105 and n0=1 63 Ω-notation A function 𝑡 (𝑛) is said to be in Ω (𝑔(𝑛)), denoted 𝑡 𝑛 ∈ Ω 𝑔 𝑛 , if 𝑡 (𝑛) is bounded below by some positive constant multiple of 𝑔(𝑛) for all large 𝑛, i.e., if there exist some positive constant 𝑐 and some nonnegative integer 𝑛0 such that 𝑡 (𝑛) ≥ 𝑐𝑔(𝑛) 𝑓𝑜𝑟 𝑎𝑙𝑙 𝑛 ≥ 𝑛0. Ω = Asymptotic lower bound = Useful for best case analysis = Loose bound 64 Ω-notation Example Prove the assertions 𝑛3 + 10 𝑛2 + 4 𝑛 + 2 ∈ Ω (𝑛2 ). Proof: 𝑛3 + 10 𝑛2 + 4 𝑛 + 2 ≥ 𝑛2 (for all n ≥ 0) i.e., by definition 𝑡(𝑛) ≥ 𝑐𝑔(𝑛), where 𝑐 = 1 and 𝑛0 =0 65 ⍬-notation A function 𝑡(𝑛) is said to be in ⍬(𝑔(𝑛)), denoted 𝑡(𝑛) ∈ ⍬(𝑔(𝑛)), if 𝑡(𝑛) is bounded both above and below by some positive constant multiples of g(n) for all large n, i.e., if there exist some positive constants 𝑐1 c1 and 𝑐2 and some nonnegative integer 𝑛0 such that 𝑐2 g(n) ≤ t(n) ≤ 𝑐1 g(n) for all 𝑛 ≥ 𝑛0 Where 𝑡(𝑛) and 𝑔(𝑛) are nonnegative functions defined on the set of natural numbers. Θ = Asymptotic tight bound = Useful for average case analysis 66 ⍬-notation Example 67 MATHEMATICAL ANALYSIS FOR RECURSIVE ALGORITHMS Mathematical Analysis of Recursive Algorithms General Plan for Analyzing the Time Efficiency of Recursive Algorithms 1. Decide on a parameter (or parameters) indicating an input’s size. 2. Identify the algorithm’s basic operation. 3. Check whether the number of times the basic operation is executed can vary on different inputs of the same size; if it can, the worst-case, average-case, and best-case efficiencies must be investigated separately. 4. Set up a recurrence relation, with an appropriate initial condition, for the number of times the basic operation is executed. 5. Solve the recurrence or, at least, ascertain the order of growth of its solution. 69 Mathematical Analysis of Recursive Algorithms EXAMPLE 1: Compute the factorial function 𝐹(𝑛) = 𝑛! for an arbitrary nonnegative integer n. Since 𝑛! = 1.... (𝑛 − 1) 𝑛 = (𝑛 − 1)! 𝑛, 𝑓𝑜𝑟 𝑛 ≥ 1 𝑎𝑛𝑑 0! = 1 by definition, we can compute 𝐹(𝑛) = 𝐹(𝑛 − 1) 𝑛 with the following recursive algorithm. ALGORITHM 𝐹(𝑛) //Computes n! recursively //Input: A nonnegative integer n //Output: The value of n! if n = 0 return 1 else return F(n − 1) * n 70 Mathematical Analysis of Recursive Algorithms Algorithm analysis For simplicity, we consider n itself as an indicator of this algorithm’s input size. i.e. 1. The basic operation of the algorithm is multiplication, whose number of executions we denote M(n). Since the function F(n) is computed according to the formula F(n) = F(n −1) n for n > 0. The number of multiplications M(n) needed to compute it must satisfy the equality M(n − 1) multiplications are spent to compute F(n − 1), and one more multiplication is needed to multiply the result by n. 71 Mathematical Analysis of Recursive Algorithms Algorithm analysis - Recurrence relations To solve the recurrence relation 𝑀(𝑛) = 𝑀(𝑛 − 1) + 1 , i.e., to find an explicit formula for 𝑀(𝑛) in terms of n only. To determine a solution uniquely, we need an initial condition that tells us the value with which the sequence starts. We can obtain this value by inspecting the condition that makes the algorithm stop its recursive calls: if n = 0 return 1. 1- Since the calls stop when n = 0, the smallest value of n for which this algorithm is executed and hence M(n) defined is 0. 2- By inspecting the pseudocode’s exiting line, we can see that when n = 0, the algorithm performs no multiplications. 72 Mathematical Analysis of Recursive Algorithms Algorithm analysis - Recurrence relations Thus, the recurrence relation and initial condition for the algorithm’s number of multiplications 𝑀(𝑛): 𝑀(𝑛) = 𝑀(𝑛 − 1) + 1 𝑓𝑜𝑟 𝑛 > 0, 𝑀(0) = 0 𝑓𝑜𝑟 𝑛 = 0. 73 Mathematical Analysis of Recursive Algorithms Algorithm analysis - Recurrence relations Method of backward substitutions 74 Mathematical Analysis of Recursive Algorithms Algorithm analysis - Recurrence relations Method of backward substitutions 75 Activity !!! Using method of backward substitutions Apply similar analysis for Recursive solution to the Tower of Hanoi puzzle 76 https://www.codeproject.com/Articles/393158/Towers-of-Hanoi Analyzing the Time Efficiency of Non-recursive Algorithms General Plan for Analyzing the Time Efficiency of Non- recursive Algorithms 1. Decide on a parameter (or parameters) indicating an input’s size. 2. Identify the algorithm’s basic operation (in the innermost loop). 3. Check whether the number of times the basic operation is executed depends only on the size of an input. If it also depends on some additional property, the worst-case, average-case, and, if necessary, best-case efficiencies have to be investigated separately. 4. Set up a sum expressing the number of times the algorithm’s basic operation is executed. 5. Using standard formulas and rules of sum manipulation either find a closed form formula for the count or at the least, establish its order of growth. 77 Analyzing the Time Efficiency of Non-recursive Algorithms EXAMPLE 1: Consider the problem of finding the value of the largest element in a list of n numbers. Assume that the list is implemented as an array for simplicity ALGORITHM 𝑀𝑎𝑥𝐸𝑙𝑒𝑚𝑒𝑛𝑡(𝐴[0.. 𝑛 − 1]) //Determines the value of the largest element in a given array //Input: An array A[0..n − 1] of real numbers //Output: The value of the largest element in A maxval ←A for i ←1 to n − 1 do if A[i]>maxval maxval←A[i] return maxval 78 Analyzing the Time Efficiency of Non-recursive Algorithms Algorithm analysis The measure of an input’s size here is the number of elements in the array, i.e., n. There are two operations in the for loop’s body: o The comparison A[i]> maxval and o The assignment maxval←A[i]. The comparison operation is considered as the algorithm’s basic operation, because the comparison is executed on each repetition of the loop and not the assignment. The number of comparisons will be the same for all arrays of size n; therefore, there is no need to distinguish among the worst, average, and best cases here. 79 Analyzing the Time Efficiency of Non-recursive Algorithms Algorithm analysis Let C(n) denotes the number of times this comparison is executed. The algorithm makes one comparison on each execution of the loop, which is repeated for each value of the loop’s variable i within the bounds 1 and n − 1, inclusive. Therefore, the sum for C(n) is calculated as follows: 80 Analyzing the Time Efficiency of Non-recursive Algorithms EXAMPLE 2: Consider the element uniqueness problem: check whether all the Elements in a given array of n elements are distinct. ALGORITHM UniqueElements(A[0..n − 1]) //Determines whether all the elements in a given array are distinct //Input: An array A[0..n − 1] //Output: Returns “true” if all the elements in A are distinct and “false” otherwise for i ←0 to n − 2 do for j ←i + 1 to n − 1 do if A[i]= A[j ] return false return true 81 Analyzing the Time Efficiency of Non-recursive Algorithms Algorithm analysis The natural measure of the input’s size here is again n (the number of elements in the array). Since the innermost loop contains a single operation (the comparison of two elements), we should consider it as the algorithm’s basic operation. The number of element comparisons depends not only on n but also on whether there are equal elements in the array and, if there are, which array positions they occupy. We will limit our investigation to the worst case only. One comparison is made for each repetition of the innermost loop, i.e., for each value of the loop variable j between its limits i + 1 and n − 1; this is repeated for each value of the outer loop, i.e., for each value of the loop variable i between its limits 0 and n − 2. 82 Analyzing the Time Efficiency of Non-recursive Algorithms Algorithm analysis 83 Empirical Analysis of Algorithm The principal alternative to the mathematical analysis of an algorithm’s efficiency is its empirical analysis. It is useful when the mathematical analysis is difficult General Plan for the Empirical Analysis of Algorithm Time Efficiency 1.Understand the experiment’s purpose. 2.Decide on the efficiency metric M to be measured and the measurement unit (an operation count vs. a time unit). 3.Decide on characteristics of the input sample (its range, size, and so on). 4.Prepare a program implementing the algorithm (or algorithms) for the experimentation. 5.Generate a sample of inputs. 6.Run the algorithm (or algorithms) on the sample’s inputs and record the data observed. 7.Analyze the data obtained. 84 Summary Time efficiency and Space efficacy are two major elements for analysis any algorithms We introduced the analysis framework and show how can be implemented and explained Asymptotic notation. For some algorithms, the running time can differ considerably for inputs of the same size, leading to worst-case efficiency, average-case efficiency, and best-case efficiency. The established framework for analyzing time efficiency is primarily grounded in the order of growth of the algorithm’s running time as its input size goes to infinity. We showed the mathematical analysis for recursive and non-recursive algorithms with examples Empirical analysis of an algorithm is performed by running a program implementing the algorithm on a sample of inputs and analyzing the data observed (the basic operation’s count or physical running time). This often involves generating pseudorandom numbers. 85 List of Readings Required Reading : Chapter 2 from Anany Levitin (2012). Introduction to the Design and Analysis of Algorithms. Pearson, 2012, 3rd edition Recommending Reading : Chapter 1 from Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest, Clifford Stein ( 2022 ). Introduction to Algorithms. MIT Press..Fourth edition Chapter 1 from Michael T.Goodrich, Roberto Tamassia (2014). Algorithm Design and Applications. First Edition Wiley. 86 Thank You 87 ‫ر‬ ‫الجامعة السعودية االلكتونية‬ ‫ر‬ ‫االلكتونية‬ ‫الجامعة السعودية‬ ‫‪26/12/2021‬‬ College of Computing and Informatics Design and Analysis of Algorithms Design and Analysis of Algorithms Module 3 Brute Force and Exhaustive Search Week Learning Outcomes Describe Brute force and exhaustive search Recognize two sorting problems solved by Brute force Demonstrate sequential search and brute-force in string matching problem Demonstrate brute-force in solving closest-Pair Distinguish between the Depth-First Search and Breadth-First Search Topic Outline Selection Sort and Bubble Sort Sequential Search and Brute-Force String Matching Solving Closest-Pair Problem by Brute Force Exhaustive Search Depth-First Search and Breadth-First Search Introduction Brute force is a straightforward approach to solving a problem, It is based on the problem statement and definitions of the concepts involved. The “force” implied by the strategy’s definition → ‘ Just do it ‘ The brute-force strategy is one of is easiest strategy Generally , it is inefficient but can be useful for solving small-size instances of a problem Selection Sort and Bubble Sort Selection Sort Selection sort started by scanning the entire given list to find its smallest element and exchange it with the first element, putting the smallest element in its final position in the sorted list. Then we scan the list, starting with the second element, to find the smallest among the last n − 1 elements and exchange it with the second element, putting the second smallest element in its final position Generally, on the ith pass through the list, which we number from 0 to n − 2, the algorithm searches for the smallest item among the last n − i elements and swaps it with Ai: Selection Sort Selection Sort Example Using Selection sort algorithm sort this list 89, 45, 68, 90, 29, 34, 17 Itration 1 Itration 2 Itration 3 Itration 4 Itration 5 Itration 6 Itration 7 Selection Sort Algorithm Analysis The input size is given by the number of elements n; the basic operation is the key comparison A[j ] < A[min]. The number of times it is executed depends only on the array size and is given by the following sum: or Selection sort is a 0(n 2) algorithm on all inputs. Bubble Sort Compare adjacent elements of the list and exchange them if they are out of order. By doing it repeatedly, we end up “bubbling up” the largest element to the last position on the list. The next pass bubbles up the second largest element, and so on, until after n − 1 passes the list is sorted. Pass i (0 ≤ i ≤ n − 2) of bubble sort can be represented by the following diagram: Bubble Sort Bubble Sort Example Using Bubble algorithm sort this list 89, 45, 68, 90, 29, 34, 17 and so on Bubble Sort Algorithm Analysis The number of key comparisons for the bubble-sort version is the same for all arrays of size n; it is obtained by a sum that is almost identical to the sum for selection sort : In the worst case of decreasing arrays, it is the same as the number of key comparisons Sequential Search and Brute-Force String Matching Sequential Search The sequential algorithm compares successive elements of a given list with a given search key until either a match is encountered (successful search) or the list is exhausted without finding a match (unsuccessful search). If we append the search key to the end of the list, the search for the key will have to be successful, and So, we can eliminate the end of list check altogether. Another enhancement in the algorithm is soring the list. This allows searching in a list be stopped as soon as an element greater than or equal to the search key is encountered. Sequential Search Brute-Force String Matching String Matching problem: For a given a string of n characters called the text & a string of m characters (m ≤ n) called the pattern, find a substring of the text that matches the pattern. We want to find 𝑖 − 𝑡ℎ𝑒 𝑖𝑛𝑑𝑒𝑥 of the leftmost character of the first matching substring in the text—such that 𝑡𝑖 = 𝑝 0 , … , 𝑡𝑖 + 𝑗 = 𝑝 𝑗 , … , 𝑡𝑖 + 𝑚 − 1 = 𝑝𝑚 − 1: text T pattern P Brute-Force String Matching Brute-Force String Matching Algorithm Analysis The algorithm shifts the pattern almost always after a single character comparison. The worst case : the algorithm may have to make all m comparisons before shifting the pattern, which it can happen for each of n − m + 1 tries. Worst -case : m(n − m + 1) character comparisons ∈ O(nm) class Average-case efficiency should be considerably better than the worst-case efficiency. For searching in random texts, it has been shown to be linear, i.e., 0(n) Closest-Pair Problem Closest-Pair Problem The problem is for finding the two closest points in a set of n points. It is the simplest problems in computational geometry dealing with proximity of points in the plane or higher-dimensional spaces. Points represents such physical objects such as airplanes or post offices as well as database records, statistical samples, DNA sequences … etc. Cluster analysis in statistics is one of the important applications of the closest-pair problem. Hierarchical cluster analysis seeks to organize them in a hierarchy of clusters based on some similarity metric. Usually similarity the Euclidean distance for numerical analysis, the Hamming distance for text and other nonnumerical data, metrics such as Closest-Pair Problem The brute-force approach for solving Closest-Pair Problem Compute the distance between each pair of distinct points and find a pair with the smallest distance. Considering only the pairs of points (pi, p j ) for which i < j , to avoid computing distance for same pair twice. Closest-Pair Problem Closest-Pair Problem Algorithm Analysis The basic operation of the algorithm is computing the square root which The number of times will be executed can be computed as follows: Speeding up the innermost loop of the algorithm could only decrease the algorithm’s running time by a constant factor Exhaustive Search Exhaustive Search Exhaustive search is simply a brute-force approach to combinatorial problems. It suggests generating each and every element of the problem domain, selecting those of them that satisfy all the constraints, and then finding a desired element e.g., the one that optimizes some objective function Its implementation typically requires an algorithm for generating certain combinatorial objects Illustrating exhaustive search into three important problems: the traveling salesman problem, the knapsack problem, and the assignment problem Traveling Salesman Problem (TSP) TSP has been intriguing researchers for the last 150 years. The problem asks to find the shortest tour through a given set of n cities that visits each city exactly once before returning to the city where it started. It can be modeled by a weighted graph, with the graph’s vertices representing the cities and the edge weights specifying the distances. Then the problem can be stated as the problem of finding the shortest Hamiltonian circuit of the graph. Hamiltonian circuit can also be defined as a sequence of n+1 adjacent vertices vi , vi ,..., vi , vi , where the first vertex of the sequence is the same as the last 0 1 n−1 0 one and all the other n − 1 vertices are distinct. All the tours obtained by generating all the permutations of n − 1 intermediate cities, computing the tour lengths and finding the shortest among them Traveling Salesman Problem (TSP) Illustrating a sample of Hamiltonian circuit The total number of permutations needed is still 1/2 (n − 1)! Exhaustive-search approach impractical for all but very small values of n Knapsack Problem Given n items of known weights w1, w 2 ,..., wn and values v1, v 2 ,..., vn and a knapsack of capacity W , find the most valuable subset of the items that fit into the knapsack Example : a transport plane that has to deliver the most valuable set of items to a remote location without exceeding the plane’s capacity. Knapsack Problem The exhaustive-search approach to knapsack problem is: Generating all the subsets of the set of n items given, computing the total weight of each subset in order to identify feasible subsets Algorithm Analysis of exhaustive search for knapsack search: The number of subsets of an n-element set is 2n, the exhaustive search leads to a Q(2n) algorithm, even efficiently individual subsets are generated. → Not best solution Both TSM and Knapsack problems are the best-known examples for NP-hard problems – will be discussed in week 11 Knapsack Problem & The exhaustive-search Knapsack Problem exhaustive-search Assignment Problem Assignment Problem There are n people who need to be assigned to execute n jobs, one person per job. (That is, each person is assigned to exactly one job and each job is assigned to exactly one person.) The cost that would accrue if the ith person is assigned to the j th job is a known quantity C[i, j ] for each pair i, j = 1, 2,..., n. The problem is to find an assignment with the minimum total cost. Assignment Problem Feasible solutions to the assignment problem as n-tuples (j 1 ,..., jn) in which the ith component, i = 1,..., n, indicates the column of the element selected in the ith row (i.e., the job number assigned to the ith person). For example, for the cost matrix above, (2, 3, 4, 1) indicates the assignment of Person 1 to Job 2, Person 2 to Job 3, Person 3 to Job 4, and Person 4 to Job 1. Assignment Problem Algorithm Analysis of exhaustive search for assignment problem: The requirements implies that there is a one-to-one correspondence between feasible assignments and permutations of the first n integers. As a result, the exhaustive-search approach to the assignment problem would require generating all the permutations of integers 1, 2,..., n, computing the total cost of each assignment by summing up the corresponding elements of the cost matrix, and finally selecting the one with the smallest sum. These are few iterations of exhaustive-search for assignment problem Depth-First Search and Breadth-First Search Depth-First Search ( DFS) Examples for applications Depth- First Search a) Graph b) Traversal’s stack c) DFS forest with the tree and back edges Depth-First Search ( DFS) DFS in Graph : Starts a graph’s traversal at an arbitrary vertex by marking it as visited On each iteration, the algorithm proceeds to an unvisited vertex that is adjacent to the one it is currently in. If there are several such vertices, it can be resolved arbitrarily. This process continues until a dead end—a vertex with no adjacent unvisited vertices— is encountered. At a dead end, the algorithm backs up one edge to the vertex it came from and tries to continue visiting unvisited vertices from there. The algorithm eventually halts after backing up to the starting vertex, with the latter being a dead end. By then, all the vertices in the same connected component as the starting vertex have been visited. If unvisited vertices still remain, the depth-first search must be restarted at any one of them. Depth-First Search ( DFS) Depth-First Search (DFS) How efficient is depth-first search? It takes just the time proportional to the size of the data structure used for representing the graph in question. For the adjacency matrix representation , the traversal time is in 0(|V |2) For the adjacency list representation , it is in 0(|V |+ |E|) where |V | and |E| are the numbers graph’s vertices and edges, respectively Breath-First Search ( BFS) Examples for applications Breath- First Search a) Graph b) Traversal’s stack c) BFS forest Breadth-First Search (BFS) It proceeds in a concentric manner by visiting first all the vertices that are adjacent to a starting vertex, then all unvisited vertices two edges apart from it, and so on, until all the vertices in the same connected component as the starting vertex are visited. If there still remain unvisited vertices, the algorithm has to be restarted at an arbitrary vertex of another connected component of the graph. It is convenient to use a queue. The queue is initialized with the traversal’s starting vertex, which is marked as visited. On each iteration, the algorithm identifies all unvisited vertices that are adjacent to the front vertex, marks them as visited, and adds them to the queue; after that, the front vertex is removed from the queue. Breadth-First Search (BFS) Breadth-First Search (BFS) Analysis of BFS: Breadth-first search has the same efficiency as depth-first search : For the adjacency matrix representation , it is in 0(|V |2) and For the adjacency list representation , it is 0(|V |+ |E|). In addition , the queue is a FIFO (first-in first-out) structure adds a single ordering of vertices. Summary Brute force is a straightforward approach to solving a problem, usually directly based on the problem statement and definitions of the concepts involved. A first application of the brute-force approach often results in an algorithm that can be improved with a modest amount of effort. The following noted algorithms can be considered as examples of the brute- force approach: definition-based algorithm for matrix multiplication selection sort , sequential search and straightforward string-matching algorithm Exhaustive search is a brute-force approach to combinatorial problems. It suggests generating each and every combinatorial object of the problem, selecting those of them that satisfy all the constraints, and then finding a desired object. Summary The traveling salesman problem, the knapsack problem, and the assignment problem are typical examples of problems that can be solved, at least theoretically, by exhaustive-search algorithms. Depth-first search (DFS) and breadth-first search (BFS) are two principal graph- traversal algorithms. By representing a graph in a form of a depth-first or breadth- first search forest, they help in the investigation of many important properties of the graph. Both algorithms have the same time efficiency: 0(|V |2) for the adjacency matrix representation and 0(|V |+ |E|) for the adjacency list representation. List of Readings Required Reading : Chapter 3 from Anany Levitin (2012). Introduction to the Design and Analysis of Algorithms. Pearson, 2012, 3rd edition Thank You ‫ر‬ ‫الجامعة السعودية االلكتونية‬ ‫ر‬ ‫االلكتونية‬ ‫الجامعة السعودية‬ ‫‪26/12/2021‬‬ College of Computing and Informatics Design and Analysis of Algorithms Design and Analysis of Algorithms Module 4 Decrease-and-Conquer Week Learning Outcomes Describe Decrease-and-Conquer and their various types Describe the insert sort , topological Sorting and analysis their complexity Recognize Constant-Factor Algorithms and variable size algorithms Topic Outline Insertion Sort Topological Sorting Decrease-by-a-Constant-Factor Algorithms Variable-Size-Decrease Algorithms Introduction The decrease-and-conquer technique is Problem (𝑥) based on exploiting the relationship Step 1: Decrease between a solution to a given instance of a problem and a solution to its Subproblem ( 𝑥 ′ ) smaller instance. Step 2: Conquer Sub-Solution Once such a relationship is established, Step 3: Combine it can be exploited either top down or bottom up. Solution Illustration of Decrease-(by one)-and-conquer technique https://www3.cs.stonybrook.edu/~pramod.ganapathi/doc/algorithms/Algo-DecreaseConquer.pdf Introduction The top down variations is usually implemented using recursive approach. The bottom-up variation is usually implemented iteratively, starting with a solution to the smallest instance of the problem; it is called sometimes the incremental approach. There are three major variations of decrease-and-conquer: 1- Decrease by a constant 2- Decrease by a constant factor 3- Variable size decrease Teplogical Sotring (1) https://www.scaler.com/topics/data-structures/topological-sort-algorithm/m/ (1) Decrease by a constant decrease-by-a-constant variation, the size of an instance is reduced by the same constant on each iteration of the algorithm. This constant is equal to one , although other constant size reductions do happen occasionally. The exponentiation problem of computing an where a /= 0 and n is a nonnegative integer. The relationship between a solution to an instance of size n and an instance of size n − 1 is obtained by the obvious formula an = an−1 f (n) = an Decrease by a constant factor The problem instance in decrease-by-a- constant-factor technique is reduced by the same constant factor on each iteration of the algorithm. In most applications, this constant factor is equal to two. Give an example of such an algorithm? Decrease by a constant factor The techinque analysis: The algorithm is in 0(log n) if we compute an recursively according to this formula and measure the algorithm’s efficiency by the number of multiplications. The reason for that is reduction on the size by about a half on each iteration as well as the expense of one or two multiplications. Variable size decrease The size reduction pattern varies at each iteration A good example of such approach is Euclid’s algorithm for computing the greatest common divisor. This algorithm is based on the formula gcd(m, n) = gcd(n, m mod n ) The value of the second argument is always smaller on the right-hand side than on the left-hand side. Euclid’s algorithm example (1) 1. https://scienceland.info/en/algebra8/euclid-algorithm Insertion Sort Insertion Sort Lets have a sorted array A[0..n − 1] , a new element x need to be inserted in the array It is required to find an appropriate position for A[n − 1] among the sorted elements and insert new element x there. By scanning the sorted subarray from right to left until the first element smaller than or equal to A[n − 1] is encountered to insert A[n − 1] right after that element. The resulting algorithm is called straight insertion sort or simply insertion sort. Insertion sort is clearly based on a recursive idea and more Bottom top implementation is more efficient https://programmercave0.github.io/blog/2017/08/20/C++-Insertion-Sort-using-STL-%28Sorting%29 Insertion Sort https://programmercave0.github.io/blog/2017/08/20/C++-Insertion-Sort-using-STL-%28Sorting%29 Insertion Sort The algorithm analysis: The basic operation of the algorithm is the key comparison A[j ] > v. The number of comparison depends on input nature. In the worst case, A[j ] >v is executed the largest number of times, i.e., for every j = i − 1,... , 0. Since v = A[i], it happens if and only if A[j ] > A[i] for j = i − 1,... , 0. Thus, for the worst-case input, we get A > A (for i = 1), A > A (for i = 2),..., A[n − 2] > A[n − 1] (for i = n − 1) Insertion Sort The algorithm analysis: The worst-case input is an array of decreasing values In the best case, the comparison A[j ] >v is executed only once on every iteration of the outer loop. It happens if and only if A[i − 1] ≤ A[i] for every i = 1,...,n − 1, i.e., if the input array is already sorted in non-decreasing order. Insertion Sort The algorithm analysis: Average-case efficiency is based on investigating the number of element pairs that are out of order It happens if and only if A[i − 1] ≤ A[i] for every i = 1,...,n − 1, It shows that on randomly ordered arrays, insertion sort makes on average half as many comparisons as on decreasing arrays, i.e., This twice-as-fast average-case performance coupled with an excellent efficiency on almost-sorted arrays Insertion sort is better than selection sorting and bubble sort Topological Sorting Topological Sorting What is a directed graph? A directed graph, or digraph for short, is a graph with directions specified for all its edges. The adjacency matrix and adjacency lists are still two principal means of representing a digraph. a) An example of directed graph The differences between undirected and directed graphs (1) the adjacency matrix of a directed graph does not have to be symmetric; (2) an edge in a directed graph has just one (not two) corresponding nodes in the digraph’s adjacency lists. b) DFS for forest of the digraph for the DFS traversal started at a Topological Sorting What is a directed graph? Traversal algorithms for traversing digraphs are Depth- first search and breadth-first search all four types of edges possible in a DFS forest of a directed graph: a) An example of directed graph o tree edges (ab, bc, de), o back edges (ba) from vertices to their ancestors o forward edges (ac) from vertices to their descendants in the tree other than their children o cross edges (dc), which are none of the aforementioned types. b) DFS for forest of the digraph for the DFS traversal started at a Topological Sorting What is a directed graph? A directed cycle in a digraph is a sequence of three or more of its vertices that starts and ends with the same vertex and in which every vertex is connected to its immediate predecessor by an edge directed from the predecessor to the successor. a) An example of directed graph For example, a, b, a is a directed cycle in the digraph in the figure a. If a DFS forest of a digraph has no back edges, the digraph is a dag, an acronym for directed acyclic graph b) DFS for forest of the digraph for the DFS traversal started at a Topological Sorting Motivation Example Consider a set of five required courses {C1, C2, C3, C4, C5} a part-time student has to take in some degree program. The courses can be taken in any order as long as the following course prerequisites are met: C1 and C2 have no prerequisites C3 requires C1 and C2 C4 requires C3 C5 requires C3 and C4. The student can take only one course per term. In which order should the student take the courses? Topological Sorting Motivation Example Modelling the problem into a directed graph representing the prerequisite structure of five courses This is a topological sorting problem finding the order a student needs to follow Topological sorting of vertices of a directed acyclic graph is an ordering of the vertices𝑣1 , 𝑣2 ,... , 𝑣𝑛 in such a way that there is an edge directed towards vertex 𝑣𝑗 from vertex𝑣𝑖 , then𝑣𝑖. comes before 𝑣𝑗 Note : if a digraph has no directed cycles, the topological sorting problem for it has a solution Topological Sorting The source-removal algorithm Repeatedly, identify in a remaining digraph a source, which is a vertex with no incoming edges, and delete it along with all the edges outgoing from it. If there are several sources, break the tie arbitrarily The order in which the vertices are deleted yields a solution to the topological sorting problem. Topological Sorting The application for the source-removal algorithm in solving the motivated example Topological Sorting The pseudocode for source-removal algorithm Algorithm Decrease-by-a-Constant-Factor Algorithms Binary Search Binary search is a remarkably efficient algorithm for searching in a sorted array. It works by comparing a search key K with the array’s middle element A[m]. If they match, the algorithm stops; otherwise, the same operation is repeated recursively for the first half of the array if K < A[m], and for the second half if K > A[m]: Binary Search Given the following array find the Key= 70 Index 0 1 2 3 4 5 6 7 8 9 10 11 12 Value 3 14 27 31 39 42 55 70 74 81 85 93 98 r Iteration 1 l m Iteration 2 l m r Iteration 3 l, m r Binary Search Binary search is clearly based on a recursive idea, but it can be implemented as a non recursive algorithm. Here is pseudocode of this non recursive version. Binary Search The algorithm analysis: We need to count the number of times the search key is compared with an element of the array. This assumes that after one comparison of K with A[m], the algorithm can determine whether K is smaller, equal to, or larger than A[m]. In the worst case Cworst(n). The worst-case inputs include all arrays that do not contain a given search key, as well as some successful searches. Since after one comparison the algorithm faces the same situation but for an array half the size, we get the following recurrence relation for Cworst(n): Binary Search The algorithm analysis: For Average-case analysis , A sophisticated analysis shows that the average number of key comparisons made by binary search is only slightly smaller than that in the worst case: Binary search is an optimal searching algorithm if we restrict our op- erations only to comparisons between keys Variable-Size-Decrease Algorithms Computing a Median and the Selection Problem The slection problem: Finding the kth smallest element (or kth order statistic) in a given array A[0..n − 1]. The easiest case is to extract the Minimum (k=1) and Max ( k= n) values from an array The hardest case is to extract the Median value (k= n/2) To find the kth smallest element in a list : Sort the list first and then select the kth element in the output of a sorting algorithm. Note : The time of such an algorithm is determined by the efficiency of the sorting algorithm used. → So, use the efficient sorting algorithm The Selection Problem The Randomized partitioning : Another effective strategy in selection is randomized partitioning a given list around some value p this is a rearrangement of the list’s elements so that the left part contains all the elements smaller than or equal to p, followed by the pivot p itself, followed by all the elements greater than or equal to p p value Rearrangement around p value The Selection Problem The Randomized partitioning : p value Rearrangement around p value https://www3.cs.stonybrook.edu/~pramod.ganapathi/doc/algorithms/Algo-DecreaseConquer.pdf The Lomuto partitioning Any subarray A[l..r] (0 ≤ l ≤ r ≤ n − 1) as composed of three contiguous segments. Listed in the order they follow pivot p a segment with elements known to be smaller than p, the segment of elements known to be greater than or equal to p, and the segment of elements yet to be compared to p Note that the segments can be empty; The Lomuto partitioning The Lomuto partitioning Applying Lomuto partitioning algorithm with an example https://www3.cs.stonybrook.edu/~pramod.ganapathi/doc/algorithms/Algo-DecreaseConquer.pdf Quick Select To find the kth smallest element in array A[0..n − 1] let s be the partition’s split position, i.e., the index of the array’s element occupied by the pivot after partitioning. If s = k − 1, pivot p itself is obviously the kth smallest element If s > k − 1, the kth smallest element in the entire array can be found as the kth smallest element in the left part of the partitioned array. if s < k − 1, it can be found as the (k − s)th smallest element in its right part. if we do not solve the problem, we reduce its instance to a smaller one, which can be solved by the same approach, i.e., recursively. Quick Select Quick Select Quickselect Algorithm Analysis ? Partitioning an n-element array always requires n − 1 key comparisons. If it produces the split that solves the selection problem without requiring more iterations, then for this best case is Cbest(n) = n − 1 ∈ 0(n). the algorithm can produce an extremely unbalanced partition of a given array, with one part being empty and the other containing n − 1 elements. In the worst case, this can happen on each of the n − 1 iterations. This implies that: Cbest(n) = n − 1 ∈ 0(n). Summary Decrease-and-conquer is based on exploiting a relationship between a solution to a given instance of a problem and a solution to a smaller instance of the same problem. After establishing relationship , it can be exploited either top down (usually recursively) or bottom up. Three major types of Decrease-and-conquer are decrease-by-a-constant, most often by one (e.g., insertion sort), decrease-by-a-constant-factor, most often by the factor of two (e.g., binary search) and variable-size-decrease (e.g., Euclid’s algorithm) Insertion sort is a 0(n 2) algorithm both in the worst and average cases, but it is about twice as fast on average than in the worst case. The algorithm’s notable advantage is a good performance on almost-sorted arrays. Summary The topological sorting problem asks to list vertices of a digraph in an order such that for every edge of the digraph, the vertex it starts at is listed before the vertex it points to. Binary search is a very efficient algorithm for searching in a sorted array. It is a principal example of a decrease-by-a-constant-factor algorithm. the size reduction varies from one iteration of the algorithm to another. Examples of such variable-size- decrease algorithms include Euclid’s algorithm, the partition- based algorithm for the selection problem List of Readings Required Reading : Chapter 4 from Anany Levitin (2012). Introduction to the Design and Analysis of Algorithms. Pearson, 2012, 3rd edition Thank You ‫ر‬ ‫الجامعة السعودية االلكتونية‬ ‫ر‬ ‫االلكتونية‬ ‫الجامعة السعودية‬ ‫‪26/12/2021‬‬ College of Computing and Informatics Design and Analysis of Algorithms Design and Analysis of Algorithms Module 5 Divide-and-Conquer Week Learning Outcomes Describe Divide-and-Conquer and their various types Describe the Mergesort, Quicksort and analysis their complexity Recognize Binary Tree Traversals and Related Properties The apply multiplication of large Integers and Strassen’s Matrix Multiplication Demonstrate the Closest-Pair Problem by Divide-and-Conquer and its analysis Topic Outline Divide-and-Conquer Explanation Merge sort & Quick Sort Binary Tree Traversals and Related Properties Multiplication of Large Integers and Strassen’s Matrix Multiplication The Closest-Pair Problem by Divide-and-Conquer Divide-and-Conquer Explanation Introduction Divide-and-conquer is the best-known general algorithm design technique. Divide-and-conquer plan is : 1. Divide : a problem is divided into several subproblems of the same type, ideally of about equal size. 2. Conquer: the subproblems are solved recursively. 3. Combine : the solutions to the subproblems are combined to get a solution to the original problem. A divide-and-conquer algorithm is necessarily more efficient than even a brute-force solution. The divide-and-conquer technique can suited for parallel Divide-and-conquer technique computations used for solving complex problems Divide-and-conquer https://www.enjoyalgorithms.com/blog/merge-sort-algorithm Introduction Divide-and-conquer Analysis: For an instance of size n can be divided into b instances of size n/b, with a of them needing to be solved. where a and b are constants; a ≥ 1 and b> 1 Assuming that size n is a power of b The recurrence is for the running time : T (n) = aT (n/b) + f (n) → divide-and-conquer recurrence f (n) is a function that accounts for the time spent on dividing an instance of size n into instances of size n/b and combining their solutions The order of growth of its solution T (n) depends on the values of the constants a and b and the order of growth of the function f (n) Divide-and-conquer Analysis Master Theorem T (n) = aT (n/b) + f (n) then Analogous results hold for the O and Q notations. Example: in sum-computation algorithm ( the recurrence for the number of additions A(n) on inputs of size n = 2k is A(n) = 2A(n/2) + 1. where a = 2,b = 2, and d = 0; hence, since a > bd, Thus, A(n) ∈ 0(nlogb a) = 0(nlog2 2) = 0(n). Mergesort Mergesort Mergesort is a perfect example of a successful application of the divide-and- conquer technique. It sorts a given array A[0..n − 1] by dividing it into two halves A[0..Ln/2J− 1] and A[Ln/2J..n − 1], Sorting each of them recursively, and Then merging the two smaller sorted arrays into a single sorted one. Mergesort Merging process of two sorted arrays can be done as follows: Two pointers (array indices) are initialized to point to the first elements of the arrays being merged. The elements pointed to are compared, and the smaller of them is added to a new array being constructed; after that, the index of the smaller element is incremented to point to its immediate successor in the array it was copied from. This operation is repeated until one of the two given arrays is exhausted, and then the remaining elements of the other array are copied to the end of the new array. Mergesort Here is an illustrated example https://www.enjoyalgorithms.com/blog/merge-sort-algorithm Mergesort Pseudocode Mergesort Analysis How efficient is mergesort? For simplicity, assume that n is a power of 2, the recurrence relation for the number of key comparisons C(n) is Let us analyze Cmerge(n): At each step, exactly one comparison is made, after which the total number of elements in the two arrays still needing to be processed is reduced by 1. Mergesort Analysis How efficient is mergesort? In the worst case, neither of the two arrays becomes empty before the other one contains just one element (e.g., smaller elements may come from the alternating arrays). or the worst case, Cmerge(n) = n − 1, and the recurrence Hence, according to the Master Theorem, Cworst(n) ∈ 0(n log n) For n = 2k , the exact solution to the worst-case recurrence is Quicksort Quicksort Quicksort divides its input elements according to their position in the array. The division is done according to their value. A partition is an arrangement of the array’s elements so that all the elements to the left of some element A[s] are less than or equal to A[s], and all the elements to the right of A[s] are greater than or equal to it After a partition is achieved, A[s] will be in its final position in the sorted array, and we can continue sorting the two subarrays to the left and to the right of A[s] independently Quicksort & Hoare Partitioning Pseudocode Quicksort Here is an illustrated example https://www.java67.com/2014/07/quicksort-algorithm-in-java-in-place-example.html Hoare Partitioning – in Quick sort Here is an other example explaining Hoare Partitioning https://www3.cs.stonybrook.edu/~pramod.ganapathi/doc/algorithms/Algo-DecreaseConquer.pdf Quick sort Analysis The number of key comparisons made before a partition is achieved is n + 1 if the scanning indices cross over and n if they coincide If all the splits happen in the middle of corresponding subarrays, we will have the best case. The number of key comparisons in the best case satisfies the recurrence According to the Master Theorem, Cbest(n) ∈ 0(n log2 n); solving it exactly for n = 2k yields Cbest(n) = n log2 n. Quick sort Analysis In the worst case, if A[0..n − 1] is a strictly increasing array and we use A as the pivot, the left-to- right scan will stop on A while the right-to-left scan will go all the way to reach A, indicating the split at position 0. After making n + 1 comparisons to get to this partition and exchanging the pivot A with itself, the algorithm will be left with the strictly increasing array A[1..n − 1] to sort. This sorting of strictly increasing arrays of diminishing sizes will continue until the last one A[n − 2..n − 1] has been processed. The total number of key comparisons made will be equal to Quick sort Analysis Let Cavg(n) be the average number of key comparisons made by quicksort on a randomly ordered array of size n. A partition can happen in any position s (0 ≤ s ≤ n − 1) after n + 1 comparisons are made to achieve the partition. After the partition, the left and right subarrays will have s and n − 1 − s elements, respectively. Assuming that the partition split can happen in each position s with the same probability 1/n, we get the following recurrence relation: on the average, quicksort makes only 39% more comparisons than in the best case Activity !!! In groups, Compare between QuickSort implemented with Hoare Partitioning and QuickSort implemented with Lomuto partition ? Binary Tree Traversals Binary Tree A binary tree T is defined as a finite set of nodes that is either empty or consists of a root and two disjoint binary trees TL and TR called, respectively, the left and right subtree of the root. A binary tree as a special case of an ordered tree Many problems about binary trees can be solved by applying the divide-and-conquer technique https://www.javatpoint.com/binary-tree-java Binary Tree The most important divide-and-conquer algorithms for binary trees are the three classic traversals: preorder, inorder, and postorder. All three traversals visit nodes of a binary tree recursively, i.e., by visiting the tree’s root and its left and right subtrees. They differ only by the timing of the root’s visit: In the preorder traversal, the root is visited before the left and right subtrees are visited (in that order). In the inorder traversal, the root is visited after visiting its left subtree but before visiting the right subtree. In the postorder traversal, the root is visited after visiting the left and right subtrees (in that order). Binary Tree An example of Binary tree and its traversals. Multiplication of Large Integers & Strassen’s Matrix Multiplication Multiplication of Large Integers Some applications like cryptography, require manipulation of integers that are over 100 decimal digits long. This practical need supports investigations of algorithms for efficient manipulation of large integers. Example: Consider the problem of multiplying two (large) n-digit integers represented by arrays of their digits such as: A = 12345678901357986429 B = 87654321284820912836 Multiplication of Large Integers The previous example is solved by Solved by grade-school algorithm Similar the hand-and pencil method we use for performing multiplication numbers Efficiency: 𝑛2. one-digit multiplications Multiplication of Large Integers – First Cut Multiplication of Large Integers – Second Cut Strassen’s Matrix Multiplication The principal insight of the algorithm lies in the discovery that we can find the product C of two 2 × 2 matrices A and B with just seven multiplications Where, Strassen’s Matrix Multiplication Algorithm Analysis: If M(n) is the number of multiplications made by Strassen’s algorithm in multiplying two n × n matrices which is smaller than n3 required by the brute-force algorithm Strassen’s Matrix Multiplication Algorithm Analysis: The number of additions A(n) made by Strassen’s algorithm. 18 additions/subtractions of matrices of size n/2; when n = 1, no additions are made since two numbers are simply multiplied. recurrence relation: According to the Master Theorem, A(n) ∈ 0(nlog2 7) Therefore, the number of additions has the same order of growth as the number of multiplications This puts Strassen’s algorithm in 0(nlog2 7) The Closest-Pair Problem by Divide-and-Conquer The Closest-Pair Problem One of the efficient way to solve closest-pair problem by divide-and-using brute- conquer The steps for the algorithm: Step 1: Divide the points given into two subsets 𝑆1 and 𝑆2 by a vertical line 𝑥 = 𝑐 so that half the points lie to the left or on the line and half the points lie to the right or on the line. The Closest-Pair Problem The steps for the algorithm: Step 2: Find recursively the closest pairs for the left and right subsets. Step 3: Set d= min{d1, d2} We can limit our attention to the points in the symmetric vertical strip of width 2d as possible closest pair. Let 𝐶1 and 𝐶2 be the subsets of points in the left subset 𝑆1 and of the right subset 𝐶1 , respectively, that lie in this vertical strip. The points in 𝐶1 and 𝐶2 are stored in increasing order of their y coordinates, which is maintained by merging during the execution of the next step. Step 4: For every point 𝑃(𝑥,𝑦)in 𝐶1 , we inspect points in 𝐶2 that may be closer to 𝑃 than d. There can be no more than 6 such points (because 𝑑 ≤ 𝑑2 ) The efficiency of Closest-Pair Problem Algorithm Analysis: Summary Divide-and-conquer is a general algorithm design technique that solves a problem by dividing it into several smaller subproblems of the same type (ideally, of about equal size), solving each of them recursively, and then combining their solutions to get a solution to the original problem. Running time T (n) of many divide-and-conquer algorithms satisfies the recurrence T (n) = aT (n/b) + f (n). The Master Theorem establishes the order of growth of its solutions. Mergesort is works by dividing an input array into two halves, sorting them recursively, and then merging the two sorted halves to get the original array sorted. The algorithm’s time efficiency is in 0(n log n) in all cases, with the number of key comparisons being very close to the theoretical minimum. Its principal drawback is a significant extra storage requirement. Summary Quicksort works by partitioning its input elements according to their value relative to some preselected element. Quicksort is noted for its superior efficiency among n log n algorithms for sorting randomly ordered arrays but also for the quadratic worst-case efficiency. The classic traversals of a binary tree—preorder, inorder, and postorder— and similar algorithms that require recursive processing of both left and right subtrees can be considered. Strassen’s algorithm needs only seven multiplications to multiply two 2 × 2 matrices. By exploiting the divide-and-conquer technique, this algorithm can multiply two n × n matrices with about n2.807 multiplications. The divide-and-conquer technique can be successfully applied the closest-pair problem. List of Readings Required Reading : Chapter 5 from Anany Levitin (2012). Introduction to the Design and Analysis of Algorithms. Pearson, 2012, 3rd edition Recommending Reading : Chapter 4 from Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest, Clifford Stein ( 2022 ). Introduction to Algorithms. MIT Press..Fourth edition Chapter 8 & Chapter 11from Michael T.Goodrich, Roberto Tamassia (2014). Algorithm Design and Applications. First Edition Wiley. Thank You ‫ر‬ ‫الجامعة السعودية االلكتونية‬ ‫ر‬ ‫االلكتونية‬ ‫الجامعة السعودية‬ ‫‪26/12/2021‬‬ College of Computing and Informatics Design and Analysis of Algorithms Design and Analysis of Algorithms Module 6 Transform-and-Conquer Week Learning Outcomes Describe Transform-and-Conquer and their various types Express the presorting and Gaussian Elimination Recognize Balanced Search Trees ( AVL) tree & its properties 233 Topic Outline Transform-and-Conquer Explanation Presorting Gaussian Elimination Balanced Search Trees (AVL trees and its properties ) 234 Transform-and-Conquer Explanation Transform-and-Conquer Explanation A group of techniques based on the idea of transformation to a problem that is easier to solve. Work as two-stage procedures: First Stage Transformation : the problem’s instance is modified to be, for one reason or another, more amenable to solution. Second Stage Conquering : where the problem is solved. 236 Transform-and-Conquer Explanation Transformation variations can be either : Instance Simplification : a simpler/more convenient instance [e.g. sorted] of the same problem Representation Change : a different representation of the same instance [e.g. different data structure] Problem Reduction: a different problem for which an algorithm is already available 237 Instance Simplification - Presorting Instance Simplification - Presorting Solve a problem’s instance by transforming it into another simpler/easier instance of the same problem Presorting Many problems involving lists are easier when list is sorted, e.g. searching computing the median (selection problem) checking if all elements are distinct (element uniqueness) Also: Topological sorting helps solving some problems for dags. Presorting is used in many geometric algorithms [eg close pr] 239 Instance Simplification - Presorting How can we fast the sorting process ? Efficiency of algorithms involving sorting depends on efficiency of sorting Theorem: log2 n!  n log2 n comparisons are necessary in the worst case to sort a list of size n by any comparison-based algorithm. Note: About n log2 n comparisons are also sufficient to sort array of size n (by mergesort and also quick sort 240 Searching with Presorting Problem: Search for a given K in A[0..n-1] Presorting-based algorithm: Stage 1 Sort the array by an efficient sorting algorithm Stage 2 Apply binary search Efficiency: Θ(nlog n) + O(log n) = Θ(nlog n) Is this Good or bad? Why do we have our dictionaries, telephone directories, etc. sorted? 241 Element Uniqueness with Presorting Presorting-based algorithm Stage 1: sort by efficient sorting algorithm (e.g. mergesort) Stage 2: scan array to check pairs of adjacent elements Efficiency: Θ(nlog n) + O(n) = Θ(nlog n) One solution : Brute force algorithm Compare all pairs of elements Efficiency: O(n2) Can you think of another algorithms? 242 Element Uniqueness with Presorting Presorting-based algorithm Stage 1: sort by efficient sorting algorithm (e.g. mergesort) Stage 2: scan array to check pairs of adjacent elements Running time of this algorithm is the sum of the time spent on sorting and the time spent on checking consecutive elements Efficiency: Θ(nlog n) + O(n) = Θ(n log n) Do the detailed analysis to see how we obtained this formula ? 243 Element Uniqueness with Presorting Presorting-based algorithm Another consideration to solve this problem is Brute force algorithm Compare all pairs of elements But Efficiency: O(n2) not the best of suggested method Can you think of another algorithms? … 244 Element Uniqueness with Presorting Presorting-based algorithm Another consideration to solve this problem is Brute force algorithm Compare all pairs of elements But Efficiency: O(n2) not the best of suggested method Can you think of another algorithms? … Hashing – frequently useful Closest pair 245 Activity !!! In groups Search how hashing is used in finding uniqness elements in an array ? 246 Instance simplification – Gaussian Elimination Systems of two linear equations in two unknowns would be formulated as This is easily be solved But for given: A system of n linear equations in n unknowns with an arbitrary coefficient matrix. How can be is easily solved ? 247 Instance simplification – Gaussian Elimination Transform to: An equivalent system of n linear equations in n unknowns with an upper triangular coefficient matrix. Solve the latter by substitutions starting with the last equation and moving up to the first one 248 Instance simplification – Gaussian Elimination More Explanation: The transformation is accomplished by a sequence of elementary operations on the system’s coefficient matrix (which don’t change the system’s solution): for i ←1 to n-1 do replace each of the subsequent rows (i.e., rows i+1, …, n) by a difference between that row and an appropriate multiple of the i-th row to make the new coefficient in the i-th column of that row 0 249 Example of Gaussian Elimination Solve : 2x1 - 4x2 + x3 = 6 3x1 - x2 + x3 = 11 x1 + x2 - x3 = -3 Gaussian elimination 2 -4 1 6 2 -4 1 6 3 -1 1 11 row2 – (3/2)*row1 0 5 -1/2 2 1 1 -1 -3 row3 – (1/2)*row1 0 3 -3/2 -6 row3–(3/5)*row2 2 -4 1 6 0 5 -1/2 2 0 0 -6/5 -36/5 Backward substitution x3 = (-36/5) / (-6/5) = 6 x2 = (2+(1/2)*6) / 5 = 1 x1 = (6 – 6 + 4*1)/2 = 2 250 Pseudocode and Efficiency of Gaussian Elim. Stage 1: Reduction to an upper-triangular matrix for i ← 1 to n-1 do for j ← i+1 to n do for k ← i to n+1 do A[j, k] ← A[j, k] - A[i, k] * A[j, i] / A[i, i] //improve! Stage 2: Back substitutions for j ← n down to 1 do t←0 for k ← j +1 to n do t ← t + A[j, k] * x[k] x[j] ← (A[j, n+1] - t) / A[j, j] Efficiency: Θ(n3) + Θ(n2) = Θ(n3) 251 Searching Problem Searching Problem Problem: Given a (multi)set S of keys and a search key K, find an occurrence of K in S, if any Searching must be considered in the context of [context determines best data structure]: o file size (internal vs. external) o dynamics of data (static vs. dynamic) Dictionary operations (dynamic data) [data structure determines performance]: o find (search) o insert o delete 253 Taxonomy of Searching Algorithms List searching o sequential search o binary search o interpolation search Tree searching o binary search tree o balanced binary trees: - AVL trees - red-black trees o multiway trees [balanced]: - 2-3 trees -2-3-4 trees, B trees Hashing o open hashing (separ

Use Quizgecko on...
Browser
Browser