CS/IT 341 Algorithms Analysis and Design Lecture 2 PDF
Document Details
Uploaded by PromisedTangent
EELU - The Egyptian E-Learning University
Tags
Summary
This document provides lecture notes for CS/IT 341, Algorithms Analysis and Design, focusing on the theory of algorithms and their efficiency analysis. The summary reviews algorithms and computational complexity from a historical perspective.
Full Transcript
Faculty of Computers and Information Technology CS/IT 341 ALGORITHMS ANALYSIS AND DESIGN Lecture 2 Recap: Algorithm Problem Statement Relationship between input and output Algorithm Procedure to achieve the relationship Definition A sequence of s...
Faculty of Computers and Information Technology CS/IT 341 ALGORITHMS ANALYSIS AND DESIGN Lecture 2 Recap: Algorithm Problem Statement Relationship between input and output Algorithm Procedure to achieve the relationship Definition A sequence of steps that transform the input to output Instance The input needed to compute solution Correct Algorithm 2 for every input it halts with correct output Brief History The study of algorithms began with mathematicians and was a significant area of work in the early years. The goal of those early studies was to find a single, general algorithm that could solve all problems of a single type. Named after 9th century Persian Muslim mathematician Abu Abdullah Muhammad ibn Musa al-Khwarizmi who lived in Baghdad and worked at the Dar al-Hikma Dar al-Hikma acquired & translated books on science & philosophy, particularly those in Greek, as well as publishing original research. The word algorism originally referred only to the rules of 3 performing arithmetic using Hindu-Arabic numerals, but later evolved to include all definite procedures for solving problems. Al-Khwarizmi’s Golden Principle (1) All complex problems can be and must be solved using the following simple steps: 1. Break down the problem into small, simple sub-problems 2. Arrange the sub-problems in such an order that each of them can be solved without effecting any other 3. Solve them separately, in the correct order 4. Combine the solutions of the sub-problems to form the solution of the original problem 4 Al-Khwarizmi’s Golden Principle (2) 5 Al-Khwarizmi’s Golden Principle (3) 6 Problem Solving Process Problem Algorithm Input Output Steps Analysis of Algorithm Correctness (Input-Output Correlation) Complexity (Resources Consumption) Efficiency (Resources Optimality) Implementation Verification 7 Complexity of Algorithms The complexity of an algorithm is the amount of work the algorithm performs to complete its task. It is the level of difficulty in solving mathematically posed problems as usually measured by required: Time (computational complexity): No. of steps Space (memory complexity): Storage capacity Bandwidth (network complexity): Minimum delay Complexity is a function T(n) which yields the time, space, bandwidth required to execute the algorithm of a problem of size ‘n’. 8 But Computers Are So Fast These Days?? Do we need to bother with algorithmics and complexity any more? CPUs of more speeds, compared to even 10 years ago... Memories of more capacities, compared to even 10 years ago.. Networks of more throughputs, compared to even 10 years ago.. Many problems are so computationally demanding that no growth in the speed of processors, space of memories and throughput of networks will help very much Efficiency of an algorithm is still important 9 Efficiency of Algorithms The efficiency of an algorithm is a measure of the amount of resources consumed in solving a problem of size n Running Time (number of primitive steps that are executed) Memory Space (capacity of storage that is consumed) Network Bandwidth (time of delay that is measured) Analysis in the context of algorithms is concerned with predicting the required resources There are always tradeoffs between these two efficiencies Allow one to decrease the running time of an algorithm solution by increasing space to store and vice-versa 10 Time is the resource of the most interest Analysis of Algorithms (AofA) Many criteria affect the efficiency of an algorithm, including speed of CPU, bus and peripheral hardware design time, programming time and debugging time capacities of available memory and required memory language used and coding professionalism of the programmer quality of input (good, bad or average) But To analyze the efficiencies of programs derived from algorithms for solving the same problem, they should be: Machine independent Language independent 11 Amenable to mathematical study Realistic Importance of Analyzing Algorithms (1) Need to recognize limitations of various algorithms for solving a problem Need to understand relationship between problem size and running time When is a running program not good enough? Need to learn how to analyze an algorithm's running time without coding it Need to learn techniques for writing more efficient code Need to recognize bottlenecks in code as well as which parts of code 12 are easiest to optimize Importance of Analyzing Algorithms (2) When selecting the implementation of an Abstract Data Type (ADT)- Stack, Queue, Tree, Graph, Table, Dictionary, List, …- we have to consider how frequently particular ADT operations occur in a given application. An array-based list retrieve operation takes at most “1” operation, a linked-list based list retrieve operation at most “n” operations. But insert and delete operations are much easier on a linked-list based list implementation. For small problem size, we can ignore the algorithm’s efficiency We weigh the trade-offs between an algorithm’s time requirement and memory requirements 13 What do we Analyze about Algorithms? Algorithms are analyzed in terms of: Correctness Does the input/output relation match algorithm requirement? Amount of Work Done Basic operations to do task Amount of Space Used Memory used Simplicity & Clarity Verification and implementation. Optimality Is it impossible to do better? To understand their behavior to improve them if possible 14 Computation Model for Analysis Analysis of the algorithm is performed with respect to a computational model called RAM (Random Access Machine) A RAM is an idealized unary processor machine with an infinite large random-access memory Instruction are executed one by one All memory equally expensive to access No concurrent operations Constant word size All reasonable instructions (basic operations) take unit time 15 Algorithm Analysis (1) Two essential approaches to measuring algorithm efficiency: Empirical analysis: Program the algorithm and measure its running time on example instances Theoretical analysis Employ mathematical techniques to derive a function which relates the running time to the size of instance 16 Algorithm Analysis (2) The following three cases are investigated in algorithm analysis: A) Worst case: The worst outcome for any possible input We often concentrate on this for our analysis as it provides a clear upper bound of resources an absolute guarantee B) Average case: Measures performance over the entire set of possible instances Very useful, but treat with care: what is “average”? Random (equally likely) inputs vs. real-life inputs C) Best Case: The best outcome for any possible input provides lower bound of resources 17 Algorithm Analysis (3) An algorithm may perform very differently on different example instances. e.g: bubble sort algorithm might be presented with data: already in order in random order in the exact reverse order of what is required Average case analysis can be difficult in practice to do a realistic analysis we need to know the likely distribution of instances However, it is often very useful and more relevant than worst case; for example quicksort has a catastrophic (extremly harmful) worst case, but in practice it is one of the best sorting algorithms known The average case uses the following concept in probability theory. Suppose the numbers n1, n2 , …, nk occur with respective probabilities p1, p2,…..pk. Then the expectation or average value E 18 is given by E = n1p1 + n2p2 +...+ nk.рk Empirical Analysis (1) Write a program implementing the algorithm Run the program with inputs of varying size and compositions Use timing routines to get an accurate measure of the actual running time e.g. System.currentTimeMillis() Plot the results 19 Empirical Analysis (2) Most algorithms transform input objects into output objects The running time of an algorithm typically grows with the input size Average case time is often difficult to determine We focus on the worst case running time Easier to analyze 20 Crucial to applications such as games, finance and robotics Empirical Analysis (3) 21 Empirical Analysis (4) 22 Empirical Analysis (5) 23 Empirical Analysis (6) 24 Empirical Analysis (7) 25 Empirical Analysis (8) 26 Empirical Analysis (9) 27 Empirical Analysis (10) 28 Empirical Analysis (11) 29 Empirical Analysis (12) 30 Empirical Analysis (13) 31 Limitations of Empirical Analysis Implementation dependent Execution time differ for different implementations of same program Platform dependent Execution time differ on different architectures Data dependent Execution time is sensitive to amount and type of data minipulated. Language dependent Execution time differ for same code, coded in different 32 languages ∴ absolute measure for an algorithm is not appropriate Theorerical Analysis Data independent Takes into account all possible inputs Platform independent Language independent Implementatiton independent not dependent on skill of programmer can save time of programming an inefficient solution Characterizes running time as a function of input size, n. Easy to extrapolate without risk 33 Pseudocode (1) High-level description of an algorithm More structured than English prose but Less detailed than a program Preferred notation for describing algorithms Hides program design issues ArrayMax(A, n) Input: Array A of n integers Output: maximum element of A 1. currentMax A; 2. for i = 1 to n-1 do 3. if A[i] > currentMax then 34 4. currentMax A[i] 5. return currentMax; Pseudocode (2) Indentation indicates block structure. e.g body of loop Looping Constructs while, for and the conditional if-then-else The symbol // indicates that the reminder of the line is a comment. Arithmetic & logical expressions: (+, -,*,/, ) (and, or and not) Assignment & swap statements: a b , ab c, a b Return/Exit/End: termination of an algorithm or block ArrayMax(A, n) Input: Array A of n integers Output: maximum element of A 1. currentMax A; 2. for i = 1 to n-1 do 3. if A[i] > currentMax then 35 4. currentMax A[i] 5. return currentMax; Pseudocode (3) Local variables mostly used unless global variable explicitly defined If A is a structure then |A| is size of structure. If A is an Array then n =length[A], upper bound of array. All Array elements are accessed by name followed by index in square brackets A[i]. Parameters are passed to a procedure by values Semicolons used for multiple short statement written on one line ArrayMax(A, n) Input: Array A of n integers Output: maximum element of A 1. currentMax A 2. for i = 1 to n-1 do 3. if A[i] > currentMax then 36 4. currentMax A[i] 5. return currentMax Elementary Operations An elementary operation is an operation which takes constant time regardless of problem size. The running time of an algorithm on a particular input is determined by the number of “Elementary Operations” executed. Theoretical analysis on paper from a description of an algorithm Defining elementary operations is a little trickier than it appears We want elementary operations to be relatively machine and language independent, but still as informative and easy to define as possible Example of elementary operations include variable assignment arithmetic operations (+, -, x, /) on integers comparison operations (a < b) boolean operations accessing an element of an array 37 We will measure number of steps taken in term of size of input Components of Algorithm Variables and values Instructions Selections Repetitions 38 Instruction A linear sequence of elementary operations is also performed in constant time. More generally, given two program fragments P1 and P2 which run sequentially in times t1 and t2 use the maximum rule which states that the larger time dominates complexity will be max(t1,t2) e.g. Assignment Statements x=a.....................1 x= a+b*c/h-u......1 T(n) = 1+1+1 a>b......................1 T(n) = 3 39 Selection If then P1 else P2 structures are a little harder; conditional loops. The maximum rule can be applied here too: max(t1, t2), assuming t1, t2 are times of P1, P2 However, the maximum rule may prove too conservative if is always true the time is t1 if is always false the time is t2 e.g. if (a>b) then...........1 a=2......................1 b=b+3*t.............1 else x=x+2*3................1 Block #1 t1 Block #2 t2 Max(t1,t2) 40 T(n)= 1+max(2,1) = 3 Repetition (Loops) - 1 Analyzing loops: Any loop has two parts: How many iterations are performed? How many steps per iteration? for i = 1 to n do P(i); Assume that P(i) takes time t, where t is independent of i Running time of a for-loop is at most the running time of the statements inside the for-loop times number of iterations T(n) = nt 41 This approach is reasonable, provided n is positive If n is zero or negative the relationship T(n) = nt is not valid Repetition (Loops) - 2 Analysing Nested Loops for i = 0 to n do for j = 0 to m do P(j); Assume that P(j) takes time t, where t is independent of i and j Start with outer loop: How many iterations? n How much time per iteration? Need to evaluate inner loop Analyze inside-out. Total running time is running time of the statement multiplied by product of the sizes of all the for-loops 42 T(n) = nmt Repetition (Loops) - 3 Analysing Nested Loops for i = 0 to n do for j = 0 to i do P(j); Assume that P(j) takes time t, where t is independent of i and j How do we analyze the running time of an algorithm that has complex nested loop? The answer is we write out the loops as summations and then solve the summations. To convert loops into summations, we work from inside-out. i 0 ri + t n n T(n) = n + r i 0 i 43 = n + n(n+1)/2 + tn(n+1)/2 Analysis Example Algorithm: Number of times executed 1. n = read input from user 1 2. sum = 0 1 3. i = 0 1 4. while i < n n n or i 0 1 n 1 5. number = read input from user n or i 0 1 n 1 6. sum = sum + number 7. i=i+1 8. mean = sum / n n or n 11 1 i 0 The computing time for this algorithm in terms on input size n is: T(n) = 1 + 1 + 1 + n + n + n + n + 1 44 T(n) = 4n + 4 Another Analysis Example i=1...............................1 while (i < n)................n-1 a=2+g...............n-1 i=i+1................n-1 if (i