
Ch01 - Introduction CS133 (1).pdf


Full Transcript


Chapter 1: Introduction

Algorithm
- An algorithm is a computational method for solving a problem: a sequence of steps that takes us from the input to the output.
- An algorithm must be:
  - Correct: it should provide a correct solution according to the specifications.
  - Finite: it should terminate.
  - General: it should work for every instance of the problem.
  - Efficient: it should use few resources (such as time or memory).

Pseudocode
- An English-like representation of the code required for an algorithm.
- The most common tool used to define algorithms.

History of Programming Concepts
1. Nonstructured, linear programs (known as "spaghetti code"): the logic flow wound through the program.
2. Modular (structured/procedural) programming: programs were organized into functions, each of which still used a linear coding technique.
3. Object-oriented programming: functions are developed around an object. One part of the OO concept is encapsulation, in which all processing for an object is bundled together in a library and hidden from the user.

Statement Constructs
1. Sequence: a series of statements that do not alter the execution path within an algorithm.
2. Selection: evaluates one or more alternatives.
3. Loop: iterates a block of code.

Data Structure
- A way of organizing data that considers not only the items stored but also the relationships between them.
- An aggregation of atomic and composite data types into a set with defined relationships.
- A combination of elements, each of which is either a data type or another data structure.
- A group of data elements grouped together under one name.
- A set of associations or relationships (structure) involving the combined elements.
- Atomic data are data that we choose to consider as a single, nondecomposable entity (e.g., int, floating point, char).
- Composite data can be broken into subfields that have meaning (e.g., the telephone number 0 63 2 2475000).
- A data type is a set of data together with the operations that can be performed on that data.
- Structure means a set of rules that hold the data together.

Some Types of Data Structures
- Set
- Array
- List
- Queues
- Stacks

Figure 1-1: Some data structures that support a list.

Abstract Data Type
- A mathematical model for data types: a class of objects whose logical behavior is defined by a set of values and a set of operations.
- A data declaration packaged together with the operations that are meaningful for the data type.
- With an ADT, users are not concerned with how a task is done but rather with what it can do.
- The ADT consists of a set of definitions that allow programmers to use the functions while hiding the implementation (known as abstraction).

Figure 1-2: An ADT model.
Figure 1-3: A pointer to a hidden structure.

List ADT
A list contains elements of the same type arranged in sequential order. The following operations can be performed on a list:
- get() – return the element at a given position in the list.
- insert() – insert an element at any position in the list.
- remove() – remove the first occurrence of an element from a non-empty list.
- removeAt() – remove the element at a specified position from a non-empty list.
- replace() – replace the element at a given position with another element.
- size() – return the number of elements in the list.
- isEmpty() – return true if the list is empty, otherwise return false.
- isFull() – return true if the list is full, otherwise return false.

Stack ADT
A stack contains elements of the same type arranged in sequential order. All operations take place at a single end, the top of the stack:
- push() – insert an element at one end of the stack, called the top.
- pop() – remove and return the element at the top of the stack, if it is not empty.
- peek() – return the element at the top of the stack without removing it, if the stack is not empty.
- size() – return the number of elements in the stack.
- isEmpty() – return true if the stack is empty, otherwise return false.
- isFull() – return true if the stack is full, otherwise return false.

Introduction
- Dynamic data structures grow and shrink during execution.
- Linked lists: insertions and removals can be made anywhere.
- Stacks: insertions and removals are made only at the top of the stack.
- Queues: insertions are made at the back and removals are made from the front.
- Binary trees: high-speed searching and sorting of data, and efficient elimination of duplicate data items.

Algorithm Efficiency
- A loop that does a fixed number of operations n times is O(n).
- Worst case: the maximum number of operations for inputs of a given size.
- Average case: the average number of operations for inputs of a given size.
- Best case: the fewest number of operations for inputs of a given size.

Computational Complexity
- The same problem can be solved using many different algorithms:
  - Factorials can be calculated iteratively or recursively.
  - Sorting can be done using Shellsort, heapsort, quicksort, etc.
- So how do we know which algorithm is best? And why do we need to know?

Why Do We Need to Know?
- If we are searching for a value among 10 values, as in many beginner programming exercises, the efficiency of the program matters less than simply getting the job done.
- However, if we are looking for a value among several trillion values, as only one step in a longer algorithm, choosing the most efficient searching algorithm is very significant.

How Do We Find the Most Efficient Algorithm?
- To compare the efficiency of algorithms, computational complexity can be used.
- Computational complexity is a measure of how much effort is needed to apply an algorithm, or how much it costs.
- An algorithm's cost can be considered in different ways; for our purposes, time and space are critical, with time being the most significant.
Computational Complexity Considerations I
- Raw running time is both platform/system and language dependent:
  - An algorithm will run faster on my PC at home than on the PCs in the lab.
  - A compiled program written in C++ is likely to be much faster than the same program written in Basic.
- Therefore, to compare algorithms by running time, all should be run on the same machine.

Computational Complexity Considerations II
- When comparing algorithm efficiencies, real-time units such as microseconds or nanoseconds need not be used.
- Instead, we use logical units expressing the relationship between n, the size of the input, and t, the time taken to process the data.

Time/Size Relationships
- Linear relationship: if t = cn, then an increase in the size of the data increases the execution time by the same factor.
- Logarithmic relationship: a logarithm is the power to which a base, such as 10, must be raised to produce a given number. If t = log2(n), then doubling the size n increases t by only one time unit.

Asymptotic Complexity
- Functions relating n and t are normally much more complex, and calculating such a function precisely is only important when considering large bodies of data (large n).
- Therefore, any terms that do not significantly affect the outcome can be eliminated, producing a function that approximates the original function's efficiency. This is called asymptotic complexity.
- Complexity theory is the part of the theory of computation dealing with the resources required to solve a given problem. The most common resources are time (how many steps it takes to solve a problem) and space (how much memory it takes). Other resources can also be considered, such as how many parallel processors are needed to solve a problem in parallel.

Example I
- Mowing grass has linear complexity because it takes double the time to mow double the area.
- However, looking up a word in a dictionary has only logarithmic complexity: for a dictionary of double the size, you have to open it only one more time (e.g., exactly in the middle, after which the problem is reduced to half).

Example II
- Consider f(n) = n^2 + 100n + log10(n) + 1000.
- For small values of n, the final term is the most significant.
- As n grows, however, the first term becomes the most significant. Hence for large n it is not worth considering the final term; how about the penultimate term?

Some Big-O functions that appear in algorithm analysis

Big-O
- Big-O gives an asymptotic upper bound for a function, i.e., an approximation of the upper bound of a function that is difficult to formulate exactly.
- Just as there is an upper bound, there is a lower bound (Big-Ω); we will come to that shortly.
- But first, some useful properties of Big-O.

Upper Bound
- The number of steps an algorithm requires to solve a specific problem is called the running time of the algorithm.
- The notion of a step refers to an underlying machine model; the machine must be able to execute a single step in constant time.
- In general, the running time depends on the size of the problem and on the respective input. Example: the running time of a sorting algorithm grows as the length of the input sequence grows; if the input sequence is presorted, fewer steps may suffice than for an unsorted sequence.

Big-O and Logarithms
- First, let us state that if the complexity of an algorithm is on the order of a logarithmic function, it is very good!
- Second, despite that, there are infinitely many better functions, but very few of them are useful, e.g. O(log log n) or O(1).
- Therefore, it is important to understand Big-O when it comes to logarithms.

Lower Bound
- Once an algorithm for solving a specific problem is found, the question arises whether it is possible to design a faster algorithm. How can we know, unless we have found such an algorithm?
- In most cases a lower bound for the problem can be given, i.e., a certain number of steps that every algorithm must execute at least in order to solve the problem. As with upper bounds, the notion of a step refers to an underlying machine model.
- Example: in order to sort n numbers, every algorithm has to take at least one look at every number. A sequential machine that can look at one number per step therefore needs at least n steps. Thus f(n) = n is a lower bound for sorting, with respect to the sequential machine model.

O, Ω, and Θ
- For the function f(n) = 2n^2 + 3n + 1:
  - Options for Big-O include g(n) = n^2, g(n) = n^3, g(n) = n^4, etc.
  - Options for Big-Ω include g(n) = n^2, g(n) = n, g(n) = n^(1/2).
  - Options for Big-Θ include g(n) = n^2, g(n) = 2n^2, g(n) = 3n^2.
- Therefore, while there are still infinitely many functions to choose from, it is obvious which should be chosen. O(n) represents an upper bound, Ω(n) represents a lower bound, and Θ(n) means a tight bound.

Why Complexity Analysis?
- Today's computers can perform millions of operations per second at relatively low cost, so why complexity analysis?
- With a PC that performs 1 million operations per second and 1 million items to be processed:
  - A quadratic algorithm, O(n^2), would take 11.6 days.
  - A cubic algorithm, O(n^3), would take 31,709 years.
  - An exponential algorithm, O(2^n), is not worth thinking about.
- Even with a 1,000-fold improvement in processing power (consider Moore's Law):
  - The cubic algorithm would still take over 31 years.
  - The quadratic algorithm would still take over 16 minutes.
- To make scalable programs, algorithm complexity does need to be analysed.

Typical Complexities

Complexity  | Notation                | Description
constant    | O(1)                    | Number of operations does not depend on the input data size; e.g. n = 1 000 000 → 1-2 operations
logarithmic | O(log n)                | Number of operations proportional to log2(n), where n is the input data size; e.g. n = 1 000 000 000 → 30 operations
linear      | O(n)                    | Number of operations proportional to the input data size; e.g. n = 10 000 → 5 000 operations
quadratic   | O(n^2)                  | Number of operations proportional to the square of the input data size; e.g. n = 500 → 250 000 operations
cubic       | O(n^3)                  | Number of operations proportional to the cube of the input data size; e.g. n = 200 → 8 000 000 operations
exponential | O(2^n), O(k^n), O(n!)   | Exponential number of operations, fast growing; e.g. n = 20 → 1 048 576 operations

Complexity Classes
Assuming 1 operation per μsec (microsecond), for n operations to be completed:

n     | O(1)   | O(lg n) | O(n)     | O(n lg n) | O(n^2)    | O(n^3)       | O(2^n)
10    | 1 μsec | 3 μsec  | 10 μsec  | 33.2 μsec | 100 μsec  | 1 msec       | 10 msec
10^2  | 1 μsec | 7 μsec  | 100 μsec | 664 μsec  | 10 msec   | 1 sec        | 3.17*10^17 yrs
10^3  | 1 μsec | 10 μsec | 1 msec   | 10 msec   | 1 sec     | 16.7 min     | ……
10^4  | 1 μsec | 13 μsec | 10 msec  | 133 msec  | 1.7 min   | 11.6 days    |
10^5  | 1 μsec | 17 μsec | 0.1 sec  | 1.6 sec   | 16.7 min  | 31.7 years   |
10^6  | 1 μsec | 20 μsec | 1 sec    | 20 sec    | 11.6 days | 31,709 years |

Asymptotic Complexity Example
Consider this simple code:

    for (i = 0, sum = 0; i < n; i++)
        sum += a[i];

- First, two variables are initialised.
- The loop executes n times, with two assignments each iteration (one updates sum and one updates i).
- Thus there are 2 + 2n assignments in this code, and so an asymptotic complexity of O(n).

Asymptotic Complexity Example 2
Consider this code:

    for (i = 0; i < n; i++) {
        for (j = 1, sum = a; j
