3 Growth of Functions The order of growth of the running time of an algorithm, defined in Chapter 2, gives a simple characterization of the algorithm’s efficiency and also allows us to compare the relative performance of alternative algorithms. Once the input size n becomes large enough, merge sort, with its ‚.n lg n/ worst-case running time, beats insertion sort, whose worst-case running time is ‚.n2 /. Although we can sometimes determine the exact running time of an algorithm, as we did for insertion sort in Chapter 2, the extra precision is not usually worth the effort of computing it. For large enough inputs, the multiplicative constants and lower-order terms of an exact running time are dominated by the effects of the input size itself. When we look at input sizes large enough to make only the order of growth of the running time relevant, we are studying the asymptotic efficiency of algorithms. That is, we are concerned with how the running time of an algorithm increases with the size of the input in the limit, as the size of the input increases without bound. Usually, an algorithm that is asymptotically more efficient will be the best choice for all but very small inputs. This chapter gives several standard methods for simplifying the asymptotic anal- ysis of algorithms. The next section begins by defining several types of “asymp- totic notation,” of which we have already seen an example in ‚-notation. We then present several notational conventions used throughout this book, and finally we review the behavior of functions that commonly arise in the analysis of algorithms. 3.1 Asymptotic notation The notations we use to describe the asymptotic running time of an algorithm are defined in terms of functions whose domains are the set of natural numbers N D f0; 1; 2; : : :g. Such notations are convenient for describing the worst-case running-time function T.n/, which usually is defined only on integer input sizes. We sometimes find it convenient, however, to abuse asymptotic notation in a va- 44 Chapter 3 Growth of Functions riety of ways. For example, we might extend the notation to the domain of real numbers or, alternatively, restrict it to a subset of the natural numbers. We should make sure, however, to understand the precise meaning of the notation so that when we abuse, we do not misuse it. This section defines the basic asymptotic notations and also introduces some common abuses. Asymptotic notation, functions, and running times We will use asymptotic notation primarily to describe the running times of algo- rithms, as when we wrote that insertion sort’s worst-case running time is ‚.n2 /. Asymptotic notation actually applies to functions, however. Recall that we charac- terized insertion sort’s worst-case running time as an2 CbnCc, for some constants a, b, and c. By writing that insertion sort’s running time is ‚.n2 /, we abstracted away some details of this function. Because asymptotic notation applies to func- tions, what we were writing as ‚.n2 / was the function an2 C bn C c, which in that case happened to characterize the worst-case running time of insertion sort. In this book, the functions to which we apply asymptotic notation will usually characterize the running times of algorithms. But asymptotic notation can apply to functions that characterize some other aspect of algorithms (the amount of space they use, for example), or even to functions that have nothing whatsoever to do with algorithms. 
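To make the crossover between a Theta(n lg n) algorithm and a Theta(n^2) algorithm concrete, here is a small numerical sketch added for illustration (it is not part of the text). The constants 8 and 0.5 are arbitrary stand-ins for the hidden constant factors, deliberately chosen to favor the quadratic algorithm on small inputs:

```python
import math

# Hypothetical cost models: c1 * n * lg n for merge sort and c2 * n^2 for
# insertion sort.  The constants are made up; the point is only that the
# quadratic term dominates once n is large enough, whatever the constants are.
def merge_sort_cost(n, c1=8.0):
    return c1 * n * math.log2(n)

def insertion_sort_cost(n, c2=0.5):
    return c2 * n * n

for n in (2**k for k in range(2, 17, 2)):
    faster = "merge" if merge_sort_cost(n) < insertion_sort_cost(n) else "insertion"
    print(f"n = {n:6d}: the {faster} sort model is cheaper")
```

With these particular constants the quadratic model wins up to n = 64 and loses from n = 256 onward; changing the constants only moves the crossover point, never its existence.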
Even when we use asymptotic notation to apply to the running time of an al- gorithm, we need to understand which running time we mean. Sometimes we are interested in the worst-case running time. Often, however, we wish to characterize the running time no matter what the input. In other words, we often wish to make a blanket statement that covers all inputs, not just the worst case. We shall see asymptotic notations that are well suited to characterizing running times no matter what the input. ‚-notation In Chapter 2, we found that the worst-case running time of insertion sort is T.n/ D ‚.n2 /. Let us define what this notation means. For a given function g.n/, we denote by ‚.g.n// the set of functions ‚.g.n// D ff.n/ W there exist positive constants c1 , c2 , and n0 such that 0 c1 g.n/ f.n/ c2 g.n/ for all n n0 g :1 1 Within set notation, a colon means “such that.” 3.1 Asymptotic notation 45 c2 g.n/ cg.n/ f.n/ f.n/ f.n/ cg.n/ c1 g.n/ n n n n0 n0 n0 f.n/ D ‚.g.n// f.n/ D O.g.n// f.n/ D .g.n// (a) (b) (c) Figure 3.1 Graphic examples of the ‚, O, and notations. In each part, the value of n0 shown is the minimum possible value; any greater value would also work. (a) ‚-notation bounds a func- tion to within constant factors. We write f.n/ D ‚.g.n// if there exist positive constants n0 , c1 , and c2 such that at and to the right of n0 , the value of f.n/ always lies between c1 g.n/ and c2 g.n/ inclusive. (b) O-notation gives an upper bound for a function to within a constant factor. We write f.n/ D O.g.n// if there are positive constants n0 and c such that at and to the right of n0 , the value of f.n/ always lies on or below cg.n/. (c) -notation gives a lower bound for a function to within a constant factor. We write f.n/ D .g.n// if there are positive constants n0 and c such that at and to the right of n0 , the value of f.n/ always lies on or above cg.n/. A function f.n/ belongs to the set ‚.g.n// if there exist positive constants c1 and c2 such that it can be “sandwiched” between c1 g.n/ and c2 g.n/, for suffi- ciently large n. Because ‚.g.n// is a set, we could write “f.n/ 2 ‚.g.n//” to indicate that f.n/ is a member of ‚.g.n//. Instead, we will usually write “f.n/ D ‚.g.n//” to express the same notion. You might be confused because we abuse equality in this way, but we shall see later in this section that doing so has its advantages. Figure 3.1(a) gives an intuitive picture of functions f.n/ and g.n/, where f.n/ D ‚.g.n//. For all values of n at and to the right of n0 , the value of f.n/ lies at or above c1 g.n/ and at or below c2 g.n/. In other words, for all n n0 , the function f.n/ is equal to g.n/ to within a constant factor. We say that g.n/ is an asymptotically tight bound for f.n/. The definition of ‚.g.n// requires that every member f.n/ 2 ‚.g.n// be asymptotically nonnegative, that is, that f.n/ be nonnegative whenever n is suf- ficiently large. (An asymptotically positive function is one that is positive for all sufficiently large n.) Consequently, the function g.n/ itself must be asymptotically nonnegative, or else the set ‚.g.n// is empty. We shall therefore assume that every function used within ‚-notation is asymptotically nonnegative. This assumption holds for the other asymptotic notations defined in this chapter as well. 46 Chapter 3 Growth of Functions In Chapter 2, we introduced an informal notion of ‚-notation that amounted to throwing away lower-order terms and ignoring the leading coefficient of the highest-order term. 
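This informal recipe can also be seen numerically. The quick sketch below is added for illustration (it is not part of the text): the ratio of an arbitrary quadratic to n^2 settles down to its leading coefficient, so the lower-order terms and the particular constants stop mattering for large n. The coefficients 3, -100, and 5000 are made up for the example:

```python
a, b, c = 3.0, -100.0, 5000.0   # arbitrary coefficients with a > 0

def f(n):
    # An arbitrary quadratic a*n^2 + b*n + c.
    return a * n * n + b * n + c

for n in (10, 100, 1000, 10**4, 10**5, 10**6):
    print(f"n = {n:>7}: f(n)/n^2 = {f(n) / (n * n):.6f}")
```

The printed ratios approach the leading coefficient 3 as n grows.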
Let us briefly justify this intuition by using the formal defi- nition to show that 21 n2 3n D ‚.n2 /. To do so, we must determine positive constants c1 , c2 , and n0 such that 1 c1 n2 n2 3n c2 n2 2 for all n n0. Dividing by n2 yields 1 3 c1 c2 : 2 n We can make the right-hand inequality hold for any value of n 1 by choosing any constant c2 1=2. Likewise, we can make the left-hand inequality hold for any value of n 7 by choosing any constant c1 1=14. Thus, by choosing c1 D 1=14, c2 D 1=2, and n0 D 7, we can verify that 21 n2 3n D ‚.n2 /. Certainly, other choices for the constants exist, but the important thing is that some choice exists. Note that these constants depend on the function 21 n2 3n; a different function belonging to ‚.n2 / would usually require different constants. We can also use the formal definition to verify that 6n3 ¤ ‚.n2 /. Suppose for the purpose of contradiction that c2 and n0 exist such that 6n3 c2 n2 for all n n0. But then dividing by n2 yields n c2 =6, which cannot possibly hold for arbitrarily large n, since c2 is constant. Intuitively, the lower-order terms of an asymptotically positive function can be ignored in determining asymptotically tight bounds because they are insignificant for large n. When n is large, even a tiny fraction of the highest-order term suf- fices to dominate the lower-order terms. Thus, setting c1 to a value that is slightly smaller than the coefficient of the highest-order term and setting c2 to a value that is slightly larger permits the inequalities in the definition of ‚-notation to be sat- isfied. The coefficient of the highest-order term can likewise be ignored, since it only changes c1 and c2 by a constant factor equal to the coefficient. As an example, consider any quadratic function f.n/ D an2 C bn C c, where a, b, and c are constants and a > 0. Throwing away the lower-order terms and ignoring the constant yields f.n/ D ‚.n2 /. Formally, to show the samep thing, we take the constants c1 D a=4, c2 D 7a=4, and n0 D 2 max.jbj =a; jcj =a/. You may verify that 0 c1 n2 an2 C bn C c c2 n2 for all n n0. In general, Pd for any polynomial p.n/ D i D0 ai ni , where the ai are constants and ad > 0, we have p.n/ D ‚.nd / (see Problem 3-1). Since any constant is a degree-0 polynomial, we can express any constant func- tion as ‚.n0 /, or ‚.1/. This latter notation is a minor abuse, however, because the 3.1 Asymptotic notation 47 expression does not indicate what variable is tending to infinity.2 We shall often use the notation ‚.1/ to mean either a constant or a constant function with respect to some variable. O-notation The ‚-notation asymptotically bounds a function from above and below. When we have only an asymptotic upper bound, we use O-notation. For a given func- tion g.n/, we denote by O.g.n// (pronounced “big-oh of g of n” or sometimes just “oh of g of n”) the set of functions O.g.n// D ff.n/ W there exist positive constants c and n0 such that 0 f.n/ cg.n/ for all n n0 g : We use O-notation to give an upper bound on a function, to within a constant factor. Figure 3.1(b) shows the intuition behind O-notation. For all values n at and to the right of n0 , the value of the function f.n/ is on or below cg.n/. We write f.n/ D O.g.n// to indicate that a function f.n/ is a member of the set O.g.n//. Note that f.n/ D ‚.g.n// implies f.n/ D O.g.n//, since ‚- notation is a stronger notion than O-notation. Written set-theoretically, we have ‚.g.n// O.g.n//. 
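As a sanity check on the constants just derived (c1 = 1/14, c2 = 1/2, and n0 = 7 for the function n^2/2 - 3n), here is a small sketch added for illustration (not part of the text). It only tests the sandwich inequality over a finite range of n, so it is evidence rather than a proof, but it catches bad witness choices quickly:

```python
def check_theta_witnesses(f, g, c1, c2, n0, n_max=100000):
    """Check 0 <= c1*g(n) <= f(n) <= c2*g(n) for every integer n in [n0, n_max].
    A finite check cannot prove an asymptotic claim, but a failure disproves
    the chosen witnesses."""
    return all(0 <= c1 * g(n) <= f(n) <= c2 * g(n) for n in range(n0, n_max + 1))

f = lambda n: n * n / 2 - 3 * n
g = lambda n: n * n

print(check_theta_witnesses(f, g, 1/14, 1/2, 7))   # True: the witnesses from the text
print(check_theta_witnesses(f, g, 1/14, 1/2, 1))   # False: f(1) = -2.5 is negative
```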
Thus, our proof that any quadratic function an2 C bn C c, where a > 0, is in ‚.n2 / also shows that any such quadratic function is in O.n2 /. What may be more surprising is that when a > 0, any linear function an C b is in O.n2 /, which is easily verified by taking c D a C jbj and n0 D max.1; b=a/. If you have seen O-notation before, you might find it strange that we should write, for example, n D O.n2 /. In the literature, we sometimes find O-notation informally describing asymptotically tight bounds, that is, what we have defined using ‚-notation. In this book, however, when we write f.n/ D O.g.n//, we are merely claiming that some constant multiple of g.n/ is an asymptotic upper bound on f.n/, with no claim about how tight an upper bound it is. Distinguish- ing asymptotic upper bounds from asymptotically tight bounds is standard in the algorithms literature. Using O-notation, we can often describe the running time of an algorithm merely by inspecting the algorithm’s overall structure. For example, the doubly nested loop structure of the insertion sort algorithm from Chapter 2 immediately yields an O.n2 / upper bound on the worst-case running time: the cost of each it- eration of the inner loop is bounded from above by O.1/ (constant), the indices i 2 The real problem is that our ordinary notation for functions does not distinguish functions from values. In -calculus, the parameters to a function are clearly specified: the function n2 could be written as n:n2 , or even r:r 2. Adopting a more rigorous notation, however, would complicate algebraic manipulations, and so we choose to tolerate the abuse. 48 Chapter 3 Growth of Functions and j are both at most n, and the inner loop is executed at most once for each of the n2 pairs of values for i and j. Since O-notation describes an upper bound, when we use it to bound the worst- case running time of an algorithm, we have a bound on the running time of the algo- rithm on every input—the blanket statement we discussed earlier. Thus, the O.n2 / bound on worst-case running time of insertion sort also applies to its running time on every input. The ‚.n2 / bound on the worst-case running time of insertion sort, however, does not imply a ‚.n2 / bound on the running time of insertion sort on every input. For example, we saw in Chapter 2 that when the input is already sorted, insertion sort runs in ‚.n/ time. Technically, it is an abuse to say that the running time of insertion sort is O.n2 /, since for a given n, the actual running time varies, depending on the particular input of size n. When we say “the running time is O.n2 /,” we mean that there is a function f.n/ that is O.n2 / such that for any value of n, no matter what particular input of size n is chosen, the running time on that input is bounded from above by the value f.n/. Equivalently, we mean that the worst-case running time is O.n2 /. -notation Just as O-notation provides an asymptotic upper bound on a function, -notation provides an asymptotic lower bound. For a given function g.n/, we denote by .g.n// (pronounced “big-omega of g of n” or sometimes just “omega of g of n”) the set of functions .g.n// D ff.n/ W there exist positive constants c and n0 such that 0 cg.n/ f.n/ for all n n0 g : Figure 3.1(c) shows the intuition behind -notation. For all values n at or to the right of n0 , the value of f.n/ is on or above cg.n/. From the definitions of the asymptotic notations we have seen thus far, it is easy to prove the following important theorem (see Exercise 3.1-5). 
Theorem 3.1 For any two functions f.n/ and g.n/, we have f.n/ D ‚.g.n// if and only if f.n/ D O.g.n// and f.n/ D .g.n//. As an example of the application of this theorem, our proof that an2 C bn C c D ‚.n2 / for any constants a, b, and c, where a > 0, immediately implies that an2 C bn C c D .n2 / and an2 C bn C c D O.n2 /. In practice, rather than using Theorem 3.1 to obtain asymptotic upper and lower bounds from asymptotically tight bounds, as we did for this example, we usually use it to prove asymptotically tight bounds from asymptotic upper and lower bounds. 3.1 Asymptotic notation 49 When we say that the running time (no modifier) of an algorithm is .g.n//, we mean that no matter what particular input of size n is chosen for each value of n, the running time on that input is at least a constant times g.n/, for sufficiently large n. Equivalently, we are giving a lower bound on the best-case running time of an algorithm. For example, the best-case running time of insertion sort is .n/, which implies that the running time of insertion sort is .n/. The running time of insertion sort therefore belongs to both .n/ and O.n2 /, since it falls anywhere between a linear function of n and a quadratic function of n. Moreover, these bounds are asymptotically as tight as possible: for instance, the running time of insertion sort is not .n2 /, since there exists an input for which insertion sort runs in ‚.n/ time (e.g., when the input is already sorted). It is not contradictory, however, to say that the worst-case running time of insertion sort is .n2 /, since there exists an input that causes the algorithm to take .n2 / time. Asymptotic notation in equations and inequalities We have already seen how asymptotic notation can be used within mathematical formulas. For example, in introducing O-notation, we wrote “n D O.n2 /.” We might also write 2n2 C 3n C 1 D 2n2 C ‚.n/. How do we interpret such formulas? When the asymptotic notation stands alone (that is, not within a larger formula) on the right-hand side of an equation (or inequality), as in n D O.n2 /, we have already defined the equal sign to mean set membership: n 2 O.n2 /. In general, however, when asymptotic notation appears in a formula, we interpret it as stand- ing for some anonymous function that we do not care to name. For example, the formula 2n2 C 3n C 1 D 2n2 C ‚.n/ means that 2n2 C 3n C 1 D 2n2 C f.n/, where f.n/ is some function in the set ‚.n/. In this case, we let f.n/ D 3n C 1, which indeed is in ‚.n/. Using asymptotic notation in this manner can help eliminate inessential detail and clutter in an equation. For example, in Chapter 2 we expressed the worst-case running time of merge sort as the recurrence T.n/ D 2T.n=2/ C ‚.n/ : If we are interested only in the asymptotic behavior of T.n/, there is no point in specifying all the lower-order terms exactly; they are all understood to be included in the anonymous function denoted by the term ‚.n/. The number of anonymous functions in an expression is understood to be equal to the number of times the asymptotic notation appears. For example, in the ex- pression n X O.i/ ; i D1 50 Chapter 3 Growth of Functions there is only a single anonymous function (a function of i). This expression is thus not the same as O.1/ C O.2/ C C O.n/, which doesn’t really have a clean interpretation. 
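To see what the single-anonymous-function reading buys us, here is a short bound added for illustration (it does not appear in the text). Write the expression as the sum of f(i) for a single function f(i) in the set O(i), so that 0 <= f(i) <= c*i holds for all i >= i_0. Then, in LaTeX notation,

\sum_{i=1}^{n} f(i) = \sum_{i=1}^{i_0 - 1} f(i) + \sum_{i=i_0}^{n} f(i)
                   \le C + c \sum_{i=1}^{n} i = C + c \, \frac{n(n+1)}{2} = O(n^2),

where C is the constant contributed by the finitely many terms with i < i_0. The whole summation is therefore O(n^2), matching the intuition of roughly n terms, each at most linear in n.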
In some cases, asymptotic notation appears on the left-hand side of an equation, as in 2n2 C ‚.n/ D ‚.n2 / : We interpret such equations using the following rule: No matter how the anony- mous functions are chosen on the left of the equal sign, there is a way to choose the anonymous functions on the right of the equal sign to make the equation valid. Thus, our example means that for any function f.n/ 2 ‚.n/, there is some func- tion g.n/ 2 ‚.n2 / such that 2n2 C f.n/ D g.n/ for all n. In other words, the right-hand side of an equation provides a coarser level of detail than the left-hand side. We can chain together a number of such relationships, as in 2n2 C 3n C 1 D 2n2 C ‚.n/ D ‚.n2 / : We can interpret each equation separately by the rules above. The first equa- tion says that there is some function f.n/ 2 ‚.n/ such that 2n2 C 3n C 1 D 2n2 C f.n/ for all n. The second equation says that for any function g.n/ 2 ‚.n/ (such as the f.n/ just mentioned), there is some function h.n/ 2 ‚.n2 / such that 2n2 C g.n/ D h.n/ for all n. Note that this interpretation implies that 2n2 C 3n C 1 D ‚.n2 /, which is what the chaining of equations intuitively gives us. o-notation The asymptotic upper bound provided by O-notation may or may not be asymp- totically tight. The bound 2n2 D O.n2 / is asymptotically tight, but the bound 2n D O.n2 / is not. We use o-notation to denote an upper bound that is not asymp- totically tight. We formally define o.g.n// (“little-oh of g of n”) as the set o.g.n// D ff.n/ W for any positive constant c > 0, there exists a constant n0 > 0 such that 0 f.n/ < cg.n/ for all n n0 g : For example, 2n D o.n2 /, but 2n2 ¤ o.n2 /. The definitions of O-notation and o-notation are similar. The main difference is that in f.n/ D O.g.n//, the bound 0 f.n/ cg.n/ holds for some con- stant c > 0, but in f.n/ D o.g.n//, the bound 0 f.n/ < cg.n/ holds for all constants c > 0. Intuitively, in o-notation, the function f.n/ becomes insignificant relative to g.n/ as n approaches infinity; that is, 3.1 Asymptotic notation 51 f.n/ lim D0: (3.1) n!1 g.n/ Some authors use this limit as a definition of the o-notation; the definition in this book also restricts the anonymous functions to be asymptotically nonnegative. !-notation By analogy, !-notation is to -notation as o-notation is to O-notation. We use !-notation to denote a lower bound that is not asymptotically tight. One way to define it is by f.n/ 2 !.g.n// if and only if g.n/ 2 o.f.n// : Formally, however, we define !.g.n// (“little-omega of g of n”) as the set !.g.n// D ff.n/ W for any positive constant c > 0, there exists a constant n0 > 0 such that 0 cg.n/ < f.n/ for all n n0 g : For example, n2 =2 D !.n/, but n2 =2 ¤ !.n2 /. The relation f.n/ D !.g.n// implies that f.n/ lim D1; n!1 g.n/ if the limit exists. That is, f.n/ becomes arbitrarily large relative to g.n/ as n approaches infinity. Comparing functions Many of the relational properties of real numbers apply to asymptotic comparisons as well. For the following, assume that f.n/ and g.n/ are asymptotically positive. 
Transitivity: f.n/ D ‚.g.n// and g.n/ D ‚.h.n// imply f.n/ D ‚.h.n// ; f.n/ D O.g.n// and g.n/ D O.h.n// imply f.n/ D O.h.n// ; f.n/ D .g.n// and g.n/ D .h.n// imply f.n/ D .h.n// ; f.n/ D o.g.n// and g.n/ D o.h.n// imply f.n/ D o.h.n// ; f.n/ D !.g.n// and g.n/ D !.h.n// imply f.n/ D !.h.n// : Reflexivity: f.n/ D ‚.f.n// ; f.n/ D O.f.n// ; f.n/ D .f.n// : 52 Chapter 3 Growth of Functions Symmetry: f.n/ D ‚.g.n// if and only if g.n/ D ‚.f.n// : Transpose symmetry: f.n/ D O.g.n// if and only if g.n/ D .f.n// ; f.n/ D o.g.n// if and only if g.n/ D !.f.n// : Because these properties hold for asymptotic notations, we can draw an analogy between the asymptotic comparison of two functions f and g and the comparison of two real numbers a and b: f.n/ D O.g.n// is like ab; f.n/ D .g.n// is like ab; f.n/ D ‚.g.n// is like aDb; f.n/ D o.g.n// is like ab: We say that f.n/ is asymptotically smaller than g.n/ if f.n/ D o.g.n//, and f.n/ is asymptotically larger than g.n/ if f.n/ D !.g.n//. One property of real numbers, however, does not carry over to asymptotic nota- tion: Trichotomy: For any two real numbers a and b, exactly one of the following must hold: a < b, a D b, or a > b. Although any two real numbers can be compared, not all functions are asymptot- ically comparable. That is, for two functions f.n/ and g.n/, it may be the case that neither f.n/ D O.g.n// nor f.n/ D .g.n// holds. For example, we cannot compare the functions n and n1Csin n using asymptotic notation, since the value of the exponent in n1Csin n oscillates between 0 and 2, taking on all values in between. Exercises 3.1-1 Let f.n/ and g.n/ be asymptotically nonnegative functions. Using the basic defi- nition of ‚-notation, prove that max.f.n/; g.n// D ‚.f.n/ C g.n//. 3.1-2 Show that for any real constants a and b, where b > 0,.n C a/b D ‚.nb / : (3.2) 3.2 Standard notations and common functions 53 3.1-3 Explain why the statement, “The running time of algorithm A is at least O.n2 /,” is meaningless. 3.1-4 Is 2nC1 D O.2n /? Is 22n D O.2n /? 3.1-5 Prove Theorem 3.1. 3.1-6 Prove that the running time of an algorithm is ‚.g.n// if and only if its worst-case running time is O.g.n// and its best-case running time is .g.n//. 3.1-7 Prove that o.g.n// \ !.g.n// is the empty set. 3.1-8 We can extend our notation to the case of two parameters n and m that can go to infinity independently at different rates. For a given function g.n; m/, we denote by O.g.n; m// the set of functions O.g.n; m// D ff.n; m/ W there exist positive constants c, n0 , and m0 such that 0 f.n; m/ cg.n; m/ for all n n0 or m m0 g : Give corresponding definitions for .g.n; m// and ‚.g.n; m//. 3.2 Standard notations and common functions This section reviews some standard mathematical functions and notations and ex- plores the relationships among them. It also illustrates the use of the asymptotic notations. Monotonicity A function f.n/ is monotonically increasing if m n implies f.m/ f.n/. Similarly, it is monotonically decreasing if m n implies f.m/ f.n/. A function f.n/ is strictly increasing if m < n implies f.m/ < f.n/ and strictly decreasing if m < n implies f.m/ > f.n/. 54 Chapter 3 Growth of Functions Floors and ceilings For any real number x, we denote the greatest integer less than or equal to x by bxc (read “the floor of x”) and the least integer greater than or equal to x by dxe (read “the ceiling of x”). 
For all real x, x 1 < bxc x dxe < x C 1 : (3.3) For any integer n, dn=2e C bn=2c D n ; and for any real number x 0 and integers a; b > 0, lx m dx=ae D ; (3.4) b ab jx k bx=ac D ; (3.5) b ab la m a C.b 1/ ; (3.6) b b ja k a .b 1/ : (3.7) b b The floor function f.x/ D bxc is monotonically increasing, as is the ceiling func- tion f.x/ D dxe. Modular arithmetic For any integer a and any positive integer n, the value a mod n is the remainder (or residue) of the quotient a=n: a mod n D a n ba=nc : (3.8) It follows that 0 a mod n < n : (3.9) Given a well-defined notion of the remainder of one integer when divided by an- other, it is convenient to provide special notation to indicate equality of remainders. If.a mod n/ D.b mod n/, we write a b.mod n/ and say that a is equivalent to b, modulo n. In other words, a b.mod n/ if a and b have the same remain- der when divided by n. Equivalently, a b.mod n/ if and only if n is a divisor of b a. We write a 6 b.mod n/ if a is not equivalent to b, modulo n. 3.2 Standard notations and common functions 55 Polynomials Given a nonnegative integer d , a polynomial in n of degree d is a function p.n/ of the form d X p.n/ D a i ni ; i D0 where the constants a0 ; a1 ; : : : ; ad are the coefficients of the polynomial and ad ¤ 0. A polynomial is asymptotically positive if and only if ad > 0. For an asymptotically positive polynomial p.n/ of degree d , we have p.n/ D ‚.nd /. For any real constant a 0, the function na is monotonically increasing, and for any real constant a 0, the function na is monotonically decreasing. We say that a function f.n/ is polynomially bounded if f.n/ D O.nk / for some constant k. Exponentials For all real a > 0, m, and n, we have the following identities: a0 D 1; a1 D a; a1 D 1=a ;.am /n D amn ;.am /n D.an /m ; am an D amCn : For all n and a 1, the function an is monotonically increasing in n. When convenient, we shall assume 00 D 1. We can relate the rates of growth of polynomials and exponentials by the fol- lowing fact. For all real constants a and b such that a > 1, nb lim D0; (3.10) n!1 a n from which we can conclude that nb D o.an / : Thus, any exponential function with a base strictly greater than 1 grows faster than any polynomial function. Using e to denote 2:71828 : : :, the base of the natural logarithm function, we have for all real x, X 1 x x2 x3 xi e D1CxC C C D ; (3.11) 2Š 3Š i D0 iŠ 56 Chapter 3 Growth of Functions where “Š” denotes the factorial function defined later in this section. For all real x, we have the inequality ex 1 C x ; (3.12) where equality holds only when x D 0. When jxj 1, we have the approximation 1 C x ex 1 C x C x 2 : (3.13) x When x ! 0, the approximation of e by 1 C x is quite good: e x D 1 C x C ‚.x 2 / : (In this equation, the asymptotic notation is used to describe the limiting behavior as x ! 0 rather than as x ! 1.) We have for all x, x n lim 1 C D ex : (3.14) n!1 n Logarithms We shall use the following notations: lg n D log2 n (binary logarithm) , ln n D loge n (natural logarithm) , lgk n D.lg n/k (exponentiation) , lg lg n D lg.lg n/ (composition). An important notational convention we shall adopt is that logarithm functions will apply only to the next term in the formula, so that lg n C k will mean.lg n/ C k and not lg.n C k/. If we hold b > 1 constant, then for n > 0, the function logb n is strictly increasing. 
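Identities such as (3.4), (3.5), and (3.8) above are easy to misremember, so here is a small spot-check added for illustration (not part of the text). It uses exact integer arithmetic to avoid floating-point issues, relying on the fact that Python's // operator is floor division and its % operator returns the residue in [0, n):

```python
def ceil_div(x, d):
    """Ceiling of x/d for integers with d > 0, using exact integer arithmetic."""
    return -((-x) // d)

for x in range(2000):
    for a in (1, 2, 3, 7, 10):
        for b in (1, 2, 5, 9):
            assert ceil_div(ceil_div(x, a), b) == ceil_div(x, a * b)   # equation (3.4)
            assert (x // a) // b == x // (a * b)                       # equation (3.5)

for a in range(-500, 500):
    for n in (1, 2, 3, 7, 60):
        assert a % n == a - n * (a // n)                               # equation (3.8)

print("identities (3.4), (3.5), and (3.8) held on all sampled values")
```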
For all real a > 0, b > 0, c > 0, and n, a D b logb a ; logc.ab/ D logc a C logc b ; logb an D n logb a ; logc a logb a D ; (3.15) logc b logb.1=a/ D logb a ; 1 logb a D ; loga b alogb c D c logb a ; (3.16) where, in each equation above, logarithm bases are not 1. 3.2 Standard notations and common functions 57 By equation (3.15), changing the base of a logarithm from one constant to an- other changes the value of the logarithm by only a constant factor, and so we shall often use the notation “lg n” when we don’t care about constant factors, such as in O-notation. Computer scientists find 2 to be the most natural base for logarithms because so many algorithms and data structures involve splitting a problem into two parts. There is a simple series expansion for ln.1 C x/ when jxj < 1: x2 x3 x4 x5 ln.1 C x/ D x C C : 2 3 4 5 We also have the following inequalities for x > 1: x ln.1 C x/ x ; (3.17) 1Cx where equality holds only for x D 0. We say that a function f.n/ is polylogarithmically bounded if f.n/ D O.lgk n/ for some constant k. We can relate the growth of polynomials and polylogarithms by substituting lg n for n and 2a for a in equation (3.10), yielding lgb n lgb n lim D lim D0: n!1.2a /lg n n!1 na From this limit, we can conclude that lgb n D o.na / for any constant a > 0. Thus, any positive polynomial function grows faster than any polylogarithmic function. Factorials The notation nŠ (read “n factorial”) is defined for integers n 0 as ( 1 if n D 0 ; nŠ D n .n 1/Š if n > 0 : Thus, nŠ D 1 2 3 n. A weak upper bound on the factorial function is nŠ nn , since each of the n terms in the factorial product is at most n. Stirling’s approximation, p n n 1 nŠ D 2 n 1C‚ ; (3.18) e n 58 Chapter 3 Growth of Functions where e is the base of the natural logarithm, gives us a tighter upper bound, and a lower bound as well. As Exercise 3.2-3 asks you to prove, nŠ D o.nn / ; nŠ D !.2n / ; lg.nŠ/ D ‚.n lg n/ ; (3.19) where Stirling’s approximation is helpful in proving equation (3.19). The following equation also holds for all n 1: p n n nŠ D 2 n e ˛n (3.20) e where 1 1 < ˛n < : (3.21) 12n C 1 12n Functional iteration We use the notation f.i /.n/ to denote the function f.n/ iteratively applied i times to an initial value of n. Formally, let f.n/ be a function over the reals. For non- negative integers i, we recursively define ( n if i D 0 ; f.i /.n/ D.i 1/ f.f.n// if i > 0 : For example, if f.n/ D 2n, then f.i /.n/ D 2i n. The iterated logarithm function We use the notation lg n (read “log star of n”) to denote the iterated logarithm, de- fined as follows. Let lg.i / n be as defined above, with f.n/ D lg n. Because the log- arithm of a nonpositive number is undefined, lg.i / n is defined only if lg.i 1/ n > 0. Be sure to distinguish lg.i / n (the logarithm function applied i times in succession, starting with argument n) from lgi n (the logarithm of n raised to the ith power). Then we define the iterated logarithm function as ˚ lg n D min i 0 W lg.i / n 1 : The iterated logarithm is a very slowly growing function: lg 2 D 1; lg 4 D 2; lg 16 D 3; lg 65536 D 4; lg.265536 / D 5: 3.2 Standard notations and common functions 59 Since the number of atoms in the observable universe is estimated to be about 1080 , which is much less than 265536 , we rarely encounter an input size n such that lg n > 5. 
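The definition of lg* translates directly into code. The sketch below is added for illustration (it is not part of the text); it simply counts how many times lg must be applied before the value drops to 1 or below, and it reproduces the values listed above:

```python
import math

def lg_star(n):
    """Iterated logarithm: the number of applications of lg needed to bring n down to <= 1."""
    count = 0
    x = n
    while x > 1:
        x = math.log2(x)   # math.log2 also accepts very large Python integers
        count += 1
    return count

for label, n in (("2", 2), ("4", 4), ("16", 16), ("65536", 65536), ("2^65536", 2**65536)):
    print(f"lg*({label}) = {lg_star(n)}")   # prints 1, 2, 3, 4, 5
```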
Fibonacci numbers We define the Fibonacci numbers by the following recurrence: F0 D 0 ; F1 D 1 ; (3.22) Fi D Fi 1 C Fi 2 for i 2 : Thus, each Fibonacci number is the sum of the two previous ones, yielding the sequence 0; 1; 1; 2; 3; 5; 8; 13; 21; 34; 55; : : : : y which Fibonacci numbers are related to the golden ratio and to its conjugate , are the two roots of the equation x2 D x C 1 (3.23) and are given by the following formulas (see Exercise 3.2-6): p 1C 5 D (3.24) 2 D 1:61803 : : : ; p 1 5 y D 2 D :61803 : : : : Specifically, we have i yi Fi D p ; 5 ˇ ˇ which we can prove by induction (Exercise 3.2-7). Since ˇyˇ < 1, we have ˇ iˇ ˇy ˇ 1 p < p 5 5 1 < ; 2 which implies that 60 Chapter 3 Growth of Functions i 1 Fi D p C ; (3.25) 5 2 p which is to say that the ith Fibonacci number Fi is equal to i = 5 rounded to the nearest integer. Thus, Fibonacci numbers grow exponentially. Exercises 3.2-1 Show that if f.n/ and g.n/ are monotonically increasing functions, then so are the functions f.n/ C g.n/ and f.g.n//, and if f.n/ and g.n/ are in addition nonnegative, then f.n/ g.n/ is monotonically increasing. 3.2-2 Prove equation (3.16). 3.2-3 Prove equation (3.19). Also prove that nŠ D !.2n / and nŠ D o.nn /. 3.2-4 ? Is the function dlg neŠ polynomially bounded? Is the function dlg lg neŠ polynomi- ally bounded? 3.2-5 ? Which is asymptotically larger: lg.lg n/ or lg.lg n/? 3.2-6 Show that the golden ratio and its conjugate y both satisfy the equation x 2 D x C 1. 3.2-7 Prove by induction that the ith Fibonacci number satisfies the equality i yi Fi D p ; 5 where is the golden ratio and y is its conjugate. 3.2-8 Show that k ln k D ‚.n/ implies k D ‚.n= ln n/. Problems for Chapter 3 61 Problems 3-1 Asymptotic behavior of polynomials Let d X p.n/ D a i ni ; i D0 where ad > 0, be a degree-d polynomial in n, and let k be a constant. Use the definitions of the asymptotic notations to prove the following properties. a. If k d , then p.n/ D O.nk /. b. If k d , then p.n/ D .nk /. c. If k D d , then p.n/ D ‚.nk /. d. If k > d , then p.n/ D o.nk /. e. If k < d , then p.n/ D !.nk /. 3-2 Relative asymptotic growths Indicate, for each pair of expressions.A; B/ in the table below, whether A is O, o, , !, or ‚ of B. Assume that k 1, > 0, and c > 1 are constants. Your answer should be in the form of the table with “yes” or “no” written in each box. A B O o ! ‚ a. lgk n n b. nk cn p c. n nsin n d. 2n 2n=2 e. nlg c c lg n f. lg.nŠ/ lg.nn / 3-3 Ordering by asymptotic growth rates a. Rank the following functions by order of growth; that is, find an arrangement g1 ; g2 ; : : : ; g30 of the functions satisfying g1 D .g2 /, g2 D .g3 /,... , g29 D .g30 /. Partition your list into equivalence classes such that functions f.n/ and g.n/ are in the same class if and only if f.n/ D ‚.g.n//. 62 Chapter 3 Growth of Functions n p lg.lg n/ 2lg. 2/lg n n2 nŠ.lg n/Š n. 23 /n n3 lg2 n lg.nŠ/ 22 n1= lg n ln ln n lg n n 2n nlg lg n ln n 1 p 2lg n.lg n/lg n en 4lg n.n C 1/Š lg n p nC1 lg.lg n/ 2 2 lg n n 2n n lg n 22 b. Give an example of a single nonnegative function f.n/ such that for all func- tions gi.n/ in part (a), f.n/ is neither O.gi.n// nor .gi.n//. 3-4 Asymptotic notation properties Let f.n/ and g.n/ be asymptotically positive functions. Prove or disprove each of the following conjectures. a. f.n/ D O.g.n// implies g.n/ D O.f.n//. b. f.n/ C g.n/ D ‚.min.f.n/; g.n///. c. f.n/ D O.g.n// implies lg.f.n// D O.lg.g.n///, where lg.g.n// 1 and f.n/ 1 for all sufficiently large n. d. f.n/ D O.g.n// implies 2f.n/ D O 2g.n/. e. 
f.n/ D O..f.n//2 /. f. f.n/ D O.g.n// implies g.n/ D .f.n//. g. f.n/ D ‚.f.n=2//. h. f.n/ C o.f.n// D ‚.f.n//. 3-5 Variations on O and ˝ 1 Some authors define in a slightly different way than we do; let’s use (read 1 “omega infinity”) for this alternative definition. We say that f.n/ D .g.n// if there exists a positive constant c such that f.n/ cg.n/ 0 for infinitely many integers n. a. Show that for any two functions f.n/ and g.n/ that are asymptotically nonneg- 1 ative, either f.n/ D O.g.n// or f.n/ D .g.n// or both, whereas this is not 1 true if we use in place of . Problems for Chapter 3 63 1 b. Describe the potential advantages and disadvantages of using instead of to characterize the running times of programs. Some authors also define O in a slightly different manner; let’s use O 0 for the alternative definition. We say that f.n/ D O 0.g.n// if and only if jf.n/j D O.g.n//. c. What happens to each direction of the “if and only if” in Theorem 3.1 if we substitute O 0 for O but still use ? e (read “soft-oh”) to mean O with logarithmic factors ig- Some authors define O nored: e O.g.n// D ff.n/ W there exist positive constants c, k, and n0 such that 0 f.n/ cg.n/ lgk.n/ for all n n0 g : e and ‚ d. Define e in a similar manner. Prove the corresponding analog to Theo- rem 3.1. 3-6 Iterated functions We can apply the iteration operator used in the lg function to any monotonically increasing function f.n/ over the reals. For a given constant c 2 R, we define the iterated function fc by ˚ fc.n/ D min i 0 W f.i /.n/ c ; which need not be well defined in all cases. In other words, the quantity fc.n/ is the number of iterated applications of the function f required to reduce its argu- ment down to c or less. For each of the following functions f.n/ and constants c, give as tight a bound as possible on fc.n/. f.n/ c fc.n/ a. n1 0 b. lg n 1 c. n=2 1 d. n=2 2 p e. n 2 p f. n 1 g. n1=3 2 h. n= lg n 2 64 Chapter 3 Growth of Functions Chapter notes Knuth traces the origin of the O-notation to a number-theory text by P. Bach- mann in 1892. The o-notation was invented by E. Landau in 1909 for his discussion of the distribution of prime numbers. The and ‚ notations were advocated by Knuth to correct the popular, but technically sloppy, practice in the literature of using O-notation for both upper and lower bounds. Many people continue to use the O-notation where the ‚-notation is more technically precise. Further dis- cussion of the history and development of asymptotic notations appears in works by Knuth [209, 213] and Brassard and Bratley. Not all authors define the asymptotic notations in the same way, although the various definitions agree in most common situations. Some of the alternative def- initions encompass functions that are not asymptotically nonnegative, as long as their absolute values are appropriately bounded. Equation (3.20) is due to Robbins. Other properties of elementary math- ematical functions can be found in any good mathematical reference, such as Abramowitz and Stegun or Zwillinger , or in a calculus book, such as Apostol or Thomas et al.. Knuth and Graham, Knuth, and Patash- nik contain a wealth of material on discrete mathematics as used in computer science. 4 Divide-and-Conquer In Section 2.3.1, we saw how merge sort serves as an example of the divide-and- conquer paradigm. 
Recall that in divide-and-conquer, we solve a problem recur- sively, applying three steps at each level of the recursion: Divide the problem into a number of subproblems that are smaller instances of the same problem. Conquer the subproblems by solving them recursively. If the subproblem sizes are small enough, however, just solve the subproblems in a straightforward manner. Combine the solutions to the subproblems into the solution for the original prob- lem. When the subproblems are large enough to solve recursively, we call that the recur- sive case. Once the subproblems become small enough that we no longer recurse, we say that the recursion “bottoms out” and that we have gotten down to the base case. Sometimes, in addition to subproblems that are smaller instances of the same problem, we have to solve subproblems that are not quite the same as the original problem. We consider solving such subproblems as part of the combine step. In this chapter, we shall see more algorithms based on divide-and-conquer. The first one solves the maximum-subarray problem: it takes as input an array of num- bers, and it determines the contiguous subarray whose values have the greatest sum. Then we shall see two divide-and-conquer algorithms for multiplying n n matri- ces. One runs in ‚.n3 / time, which is no better than the straightforward method of multiplying square matrices. But the other, Strassen’s algorithm, runs in O.n2:81 / time, which beats the straightforward method asymptotically. Recurrences Recurrences go hand in hand with the divide-and-conquer paradigm, because they give us a natural way to characterize the running times of divide-and-conquer algo- rithms. A recurrence is an equation or inequality that describes a function in terms 66 Chapter 4 Divide-and-Conquer of its value on smaller inputs. For example, in Section 2.3.2 we described the worst-case running time T.n/ of the M ERGE -S ORT procedure by the recurrence ( ‚.1/ if n D 1 ; T.n/ D (4.1) 2T.n=2/ C ‚.n/ if n > 1 ; whose solution we claimed to be T.n/ D ‚.n lg n/. Recurrences can take many forms. For example, a recursive algorithm might divide subproblems into unequal sizes, such as a 2=3-to-1=3 split. If the divide and combine steps take linear time, such an algorithm would give rise to the recurrence T.n/ D T.2n=3/ C T.n=3/ C ‚.n/. Subproblems are not necessarily constrained to being a constant fraction of the original problem size. For example, a recursive version of linear search (see Exercise 2.1-3) would create just one subproblem containing only one el- ement fewer than the original problem. Each recursive call would take con- stant time plus the time for the recursive calls it makes, yielding the recurrence T.n/ D T.n 1/ C ‚.1/. This chapter offers three methods for solving recurrences—that is, for obtaining asymptotic “‚” or “O” bounds on the solution: In the substitution method, we guess a bound and then use mathematical in- duction to prove our guess correct. The recursion-tree method converts the recurrence into a tree whose nodes represent the costs incurred at various levels of the recursion. We use techniques for bounding summations to solve the recurrence. The master method provides bounds for recurrences of the form T.n/ D aT.n=b/ C f.n/ ; (4.2) where a 1, b > 1, and f.n/ is a given function. Such recurrences arise frequently. 
A recurrence of the form in equation (4.2) characterizes a divide- and-conquer algorithm that creates a subproblems, each of which is 1=b the size of the original problem, and in which the divide and combine steps together take f.n/ time. To use the master method, you will need to memorize three cases, but once you do that, you will easily be able to determine asymptotic bounds for many simple recurrences. We will use the master method to determine the running times of the divide-and-conquer algorithms for the maximum-subarray problem and for matrix multiplication, as well as for other algorithms based on divide- and-conquer elsewhere in this book. Chapter 4 Divide-and-Conquer 67 Occasionally, we shall see recurrences that are not equalities but rather inequal- ities, such as T.n/ 2T.n=2/ C ‚.n/. Because such a recurrence states only an upper bound on T.n/, we will couch its solution using O-notation rather than ‚-notation. Similarly, if the inequality were reversed to T.n/ 2T.n=2/ C ‚.n/, then because the recurrence gives only a lower bound on T.n/, we would use -notation in its solution. Technicalities in recurrences In practice, we neglect certain technical details when we state and solve recur- rences. For example, if we call M ERGE -S ORT on n elements when n is odd, we end up with subproblems of size bn=2c and dn=2e. Neither size is actually n=2, because n=2 is not an integer when n is odd. Technically, the recurrence describing the worst-case running time of M ERGE -S ORT is really ( ‚.1/ if n D 1 ; T.n/ D (4.3) T.dn=2e/ C T.bn=2c/ C ‚.n/ if n > 1 : Boundary conditions represent another class of details that we typically ignore. Since the running time of an algorithm on a constant-sized input is a constant, the recurrences that arise from the running times of algorithms generally have T.n/ D ‚.1/ for sufficiently small n. Consequently, for convenience, we shall generally omit statements of the boundary conditions of recurrences and assume that T.n/ is constant for small n. For example, we normally state recurrence (4.1) as T.n/ D 2T.n=2/ C ‚.n/ ; (4.4) without explicitly giving values for small n. The reason is that although changing the value of T.1/ changes the exact solution to the recurrence, the solution typi- cally doesn’t change by more than a constant factor, and so the order of growth is unchanged. When we state and solve recurrences, we often omit floors, ceilings, and bound- ary conditions. We forge ahead without these details and later determine whether or not they matter. They usually do not, but you should know when they do. Ex- perience helps, and so do some theorems stating that these details do not affect the asymptotic bounds of many recurrences characterizing divide-and-conquer algo- rithms (see Theorem 4.1). In this chapter, however, we shall address some of these details and illustrate the fine points of recurrence solution methods. 68 Chapter 4 Divide-and-Conquer 4.1 The maximum-subarray problem Suppose that you been offered the opportunity to invest in the Volatile Chemical Corporation. Like the chemicals the company produces, the stock price of the Volatile Chemical Corporation is rather volatile. You are allowed to buy one unit of stock only one time and then sell it at a later date, buying and selling after the close of trading for the day. To compensate for this restriction, you are allowed to learn what the price of the stock will be in the future. Your goal is to maximize your profit. Figure 4.1 shows the price of the stock over a 17-day period. 
You may buy the stock at any one time, starting after day 0, when the price is $100 per share. Of course, you would want to "buy low, sell high": buy at the lowest possible price and later on sell at the highest possible price, to maximize your profit. Unfortunately, you might not be able to buy at the lowest price and then sell at the highest price within a given period. In Figure 4.1, the lowest price occurs after day 7, which occurs after the highest price, after day 1. You might think that you can always maximize profit by either buying at the lowest price or selling at the highest price. For example, in Figure 4.1, we would maximize profit by buying at the lowest price, after day 7. If this strategy always worked, then it would be easy to determine how to maximize profit: find the highest and lowest prices, and then work left from the highest price to find the lowest prior price, work right from the lowest price to find the highest later price, and take the pair with the greater difference.

Figure 4.1 Information about the price of stock in the Volatile Chemical Corporation after the close of trading over a period of 17 days. The horizontal axis of the chart indicates the day, and the vertical axis shows the price. The bottom row of the table gives the change in price from the previous day.

Day      0    1    2    3    4    5    6    7    8    9   10   11   12   13   14   15   16
Price  100  113  110   85  105  102   86   63   81  101   94  106  101   79   94   90   97
Change       13   -3  -25   20   -3  -16  -23   18   20   -7   12   -5  -22   15   -4    7

Figure 4.2 An example showing that the maximum profit does not always start at the lowest price or end at the highest price. Again, the horizontal axis indicates the day, and the vertical axis shows the price. Here, the maximum profit of $3 per share would be earned by buying after day 2 and selling after day 3. The price of $7 after day 2 is not the lowest price overall, and the price of $10 after day 3 is not the highest price overall.

Day      0    1    2    3    4
Price   10   11    7   10    6
Change        1   -4    3   -4

Figure 4.2 shows a simple counterexample,
Thus, you would want to buy the stock just before day 8 (that is, after day 7) and sell it after day 11, earning a profit of $43 per share. At first glance, this transformation does not help. We still need to check n1 2 D ‚.n2 / subarrays for a period of n days. Exercise 4.1-2 asks you to show 70 Chapter 4 Divide-and-Conquer 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 A 13 –3 –25 20 –3 –16 –23 18 20 –7 12 –5 –22 15 –4 7 maximum subarray Figure 4.3 The change in stock prices as a maximum-subarray problem. Here, the subar- ray AŒ8 : : 11, with sum 43, has the greatest sum of any contiguous subarray of array A. that although computing the cost of one subarray might take time proportional to the length of the subarray, when computing all ‚.n2 / subarray sums, we can orga- nize the computation so that each subarray sum takes O.1/ time, given the values of previously computed subarray sums, so that the brute-force solution takes ‚.n2 / time. So let us seek a more efficient solution to the maximum-subarray problem. When doing so, we will usually speak of “a” maximum subarray rather than “the” maximum subarray, since there could be more than one subarray that achieves the maximum sum. The maximum-subarray problem is interesting only when the array contains some negative numbers. If all the array entries were nonnegative, then the maximum-subarray problem would present no challenge, since the entire array would give the greatest sum. A solution using divide-and-conquer Let’s think about how we might solve the maximum-subarray problem using the divide-and-conquer technique. Suppose we want to find a maximum subar- ray of the subarray AŒlow : : high. Divide-and-conquer suggests that we divide the subarray into two subarrays of as equal size as possible. That is, we find the midpoint, say mid, of the subarray, and consider the subarrays AŒlow : : mid and AŒmid C 1 : : high. As Figure 4.4(a) shows, any contiguous subarray AŒi : : j of AŒlow : : high must lie in exactly one of the following places: entirely in the subarray AŒlow : : mid, so that low i j mid, entirely in the subarray AŒmid C 1 : : high, so that mid < i j high, or crossing the midpoint, so that low i mid < j high. Therefore, a maximum subarray of AŒlow : : high must lie in exactly one of these places. In fact, a maximum subarray of AŒlow : : high must have the greatest sum over all subarrays entirely in AŒlow : : mid, entirely in AŒmid C 1 : : high, or crossing the midpoint. We can find maximum subarrays of AŒlow : : mid and AŒmidC1 : : high recursively, because these two subproblems are smaller instances of the problem of finding a maximum subarray. Thus, all that is left to do is find a 4.1 The maximum-subarray problem 71 crosses the midpoint AŒmid C 1 : : j low mid high low i mid high mid C 1 mid C 1 j entirely in AŒlow : : mid entirely in AŒmid C 1 : : high AŒi : : mid (a) (b) Figure 4.4 (a) Possible locations of subarrays of AŒlow : : high: entirely in AŒlow : : mid, entirely in AŒmid C 1 : : high, or crossing the midpoint mid. (b) Any subarray of AŒlow : : high crossing the midpoint comprises two subarrays AŒi : : mid and AŒmid C 1 : : j , where low i mid and mid < j high. maximum subarray that crosses the midpoint, and take a subarray with the largest sum of the three. We can easily find a maximum subarray crossing the midpoint in time linear in the size of the subarray AŒlow : : high. 
This problem is not a smaller instance of our original problem, because it has the added restriction that the subarray it chooses must cross the midpoint. As Figure 4.4(b) shows, any subarray crossing the midpoint is itself made of two subarrays AŒi : : mid and AŒmid C 1 : : j , where low i mid and mid < j high. Therefore, we just need to find maximum subarrays of the form AŒi : : mid and AŒmid C 1 : : j and then combine them. The procedure F IND -M AX -C ROSSING -S UBARRAY takes as input the array A and the indices low, mid, and high, and it returns a tuple containing the indices demarcating a maximum subarray that crosses the midpoint, along with the sum of the values in a maximum subarray. F IND -M AX -C ROSSING -S UBARRAY.A; low; mid; high/ 1 left-sum D 1 2 sum D 0 3 for i D mid downto low 4 sum D sum C AŒi 5 if sum > left-sum 6 left-sum D sum 7 max-left D i 8 right-sum D 1 9 sum D 0 10 for j D mid C 1 to high 11 sum D sum C AŒj 12 if sum > right-sum 13 right-sum D sum 14 max-right D j 15 return.max-left; max-right; left-sum C right-sum/ 72 Chapter 4 Divide-and-Conquer This procedure works as follows. Lines 1–7 find a maximum subarray of the left half, AŒlow : : mid. Since this subarray must contain AŒmid, the for loop of lines 3–7 starts the index i at mid and works down to low, so that every subarray it considers is of the form AŒi : : mid. Lines 1–2 initialize the variables left-sum, which holds the greatest sum found so far, and sum, holding the sum of the entries in AŒi : : mid. Whenever we find, in line 5, a subarray AŒi : : mid with a sum of values greater than left-sum, we update left-sum to this subarray’s sum in line 6, and in line 7 we update the variable max-left to record this index i. Lines 8–14 work analogously for the right half, AŒmid C 1 : : high. Here, the for loop of lines 10–14 starts the index j at midC1 and works up to high, so that every subarray it considers is of the form AŒmid C 1 : : j . Finally, line 15 returns the indices max-left and max-right that demarcate a maximum subarray crossing the midpoint, along with the sum left-sum Cright-sum of the values in the subarray AŒmax-left : : max-right. If the subarray AŒlow : : high contains n entries (so that n D high low C 1), we claim that the call F IND -M AX -C ROSSING -S UBARRAY.A; low; mid; high/ takes ‚.n/ time. Since each iteration of each of the two for loops takes ‚.1/ time, we just need to count up how many iterations there are altogether. 
The for loop of lines 3–7 makes mid low C 1 iterations, and the for loop of lines 10–14 makes high mid iterations, and so the total number of iterations is.mid low C 1/ C.high mid/ D high low C 1 D n: With a linear-time F IND -M AX -C ROSSING -S UBARRAY procedure in hand, we can write pseudocode for a divide-and-conquer algorithm to solve the maximum- subarray problem: F IND -M AXIMUM -S UBARRAY.A; low; high/ 1 if high == low 2 return.low; high; AŒlow/ // base case: only one element 3 else mid D b.low C high/=2c 4.left-low; left-high; left-sum/ D F IND -M AXIMUM -S UBARRAY.A; low; mid/ 5.right-low; right-high; right-sum/ D F IND -M AXIMUM -S UBARRAY.A; mid C 1; high/ 6.cross-low; cross-high; cross-sum/ D F IND -M AX -C ROSSING -S UBARRAY.A; low; mid; high/ 7 if left-sum right-sum and left-sum cross-sum 8 return.left-low; left-high; left-sum/ 9 elseif right-sum left-sum and right-sum cross-sum 10 return.right-low; right-high; right-sum/ 11 else return.cross-low; cross-high; cross-sum/ 4.1 The maximum-subarray problem 73 The initial call F IND -M AXIMUM -S UBARRAY.A; 1; A:length/ will find a maxi- mum subarray of AŒ1 : : n. Similar to F IND -M AX -C ROSSING -S UBARRAY, the recursive procedure F IND - M AXIMUM -S UBARRAY returns a tuple containing the indices that demarcate a maximum subarray, along with the sum of the values in a maximum subarray. Line 1 tests for the base case, where the subarray has just one element. A subar- ray with just one element has only one subarray—itself—and so line 2 returns a tuple with the starting and ending indices of just the one element, along with its value. Lines 3–11 handle the recursive case. Line 3 does the divide part, comput- ing the index mid of the midpoint. Let’s refer to the subarray AŒlow : : mid as the left subarray and to AŒmid C 1 : : high as the right subarray. Because we know that the subarray AŒlow : : high contains at least two elements, each of the left and right subarrays must have at least one element. Lines 4 and 5 conquer by recur- sively finding maximum subarrays within the left and right subarrays, respectively. Lines 6–11 form the combine part. Line 6 finds a maximum subarray that crosses the midpoint. (Recall that because line 6 solves a subproblem that is not a smaller instance of the original problem, we consider it to be in the combine part.) Line 7 tests whether the left subarray contains a subarray with the maximum sum, and line 8 returns that maximum subarray. Otherwise, line 9 tests whether the right subarray contains a subarray with the maximum sum, and line 10 returns that max- imum subarray. If neither the left nor right subarrays contain a subarray achieving the maximum sum, then a maximum subarray must cross the midpoint, and line 11 returns it. Analyzing the divide-and-conquer algorithm Next we set up a recurrence that describes the running time of the recursive F IND - M AXIMUM -S UBARRAY procedure. As we did when we analyzed merge sort in Section 2.3.2, we make the simplifying assumption that the original problem size is a power of 2, so that all subproblem sizes are integers. We denote by T.n/ the running time of F IND -M AXIMUM -S UBARRAY on a subarray of n elements. For starters, line 1 takes constant time. The base case, when n D 1, is easy: line 2 takes constant time, and so T.1/ D ‚.1/ : (4.5) The recursive case occurs when n > 1. Lines 1 and 3 take constant time. 
Analyzing the divide-and-conquer algorithm

Next we set up a recurrence that describes the running time of the recursive FIND-MAXIMUM-SUBARRAY procedure. As we did when we analyzed merge sort in Section 2.3.2, we make the simplifying assumption that the original problem size is a power of 2, so that all subproblem sizes are integers. We denote by T(n) the running time of FIND-MAXIMUM-SUBARRAY on a subarray of n elements. For starters, line 1 takes constant time. The base case, when n = 1, is easy: line 2 takes constant time, and so

    T(1) = Θ(1) .                                                   (4.5)

The recursive case occurs when n > 1. Lines 1 and 3 take constant time. Each of the subproblems solved in lines 4 and 5 is on a subarray of n/2 elements (our assumption that the original problem size is a power of 2 ensures that n/2 is an integer), and so we spend T(n/2) time solving each of them. Because we have to solve two subproblems, one for the left subarray and one for the right subarray, the contribution to the running time from lines 4 and 5 comes to 2T(n/2). As we have already seen, the call to FIND-MAX-CROSSING-SUBARRAY in line 6 takes Θ(n) time. Lines 7–11 take only Θ(1) time. For the recursive case, therefore, we have

    T(n) = Θ(1) + 2T(n/2) + Θ(n) + Θ(1)
         = 2T(n/2) + Θ(n) .                                         (4.6)

Combining equations (4.5) and (4.6) gives us a recurrence for the running time T(n) of FIND-MAXIMUM-SUBARRAY:

    T(n) = Θ(1)               if n = 1 ,
    T(n) = 2T(n/2) + Θ(n)     if n > 1 .                            (4.7)

This recurrence is the same as recurrence (4.1) for merge sort. As we shall see from the master method in Section 4.5, this recurrence has the solution T(n) = Θ(n lg n). You might also revisit the recursion tree in Figure 2.5 to understand why the solution should be T(n) = Θ(n lg n).
Thus, we see that the divide-and-conquer method yields an algorithm that is asymptotically faster than the brute-force method. With merge sort and now the maximum-subarray problem, we begin to get an idea of how powerful the divide-and-conquer method can be. Sometimes it will yield the asymptotically fastest algorithm for a problem, and other times we can do even better. As Exercise 4.1-5 shows, there is in fact a linear-time algorithm for the maximum-subarray problem, and it does not use divide-and-conquer.

Exercises

4.1-1
What does FIND-MAXIMUM-SUBARRAY return when all elements of A are negative?

4.1-2
Write pseudocode for the brute-force method of solving the maximum-subarray problem. Your procedure should run in Θ(n²) time.

4.1-3
Implement both the brute-force and recursive algorithms for the maximum-subarray problem on your own computer. What problem size n₀ gives the crossover point at which the recursive algorithm beats the brute-force algorithm? Then, change the base case of the recursive algorithm to use the brute-force algorithm whenever the problem size is less than n₀. Does that change the crossover point?

4.1-4
Suppose we change the definition of the maximum-subarray problem to allow the result to be an empty subarray, where the sum of the values of an empty subarray is 0. How would you change any of the algorithms that do not allow empty subarrays to permit an empty subarray to be the result?

4.1-5
Use the following ideas to develop a nonrecursive, linear-time algorithm for the maximum-subarray problem. Start at the left end of the array, and progress toward the right, keeping track of the maximum subarray seen so far. Knowing a maximum subarray of A[1..j], extend the answer to find a maximum subarray ending at index j + 1 by using the following observation: a maximum subarray of A[1..j+1] is either a maximum subarray of A[1..j] or a subarray A[i..j+1], for some 1 ≤ i ≤ j + 1. Determine a maximum subarray of the form A[i..j+1] in constant time based on knowing a maximum subarray ending at index j. (One possible realization of this idea is sketched after these exercises.)
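For comparison only, the following is one way the idea in Exercise 4.1-5 can be realized in Python; the function name and 0-based indexing are choices made for this sketch, and it is one possible reading of the hint rather than the book's reference solution.

def find_maximum_subarray_linear(A):
    """Linear-time scan based on the observation in Exercise 4.1-5.
    A is a non-empty 0-based Python list; returns (low, high, sum)."""
    best_low, best_high, best_sum = 0, 0, A[0]   # best subarray of A[0..j] so far
    cur_low, cur_sum = 0, A[0]                   # best subarray ending exactly at j
    for j in range(1, len(A)):
        # Either extend the best subarray ending at j-1, or start fresh at j.
        if cur_sum > 0:
            cur_sum += A[j]
        else:
            cur_low, cur_sum = j, A[j]
        if cur_sum > best_sum:
            best_low, best_high, best_sum = cur_low, j, cur_sum
    return (best_low, best_high, best_sum)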
4.2 Strassen's algorithm for matrix multiplication

If you have seen matrices before, then you probably know how to multiply them. (Otherwise, you should read Section D.1 in Appendix D.) If A = (a_ij) and B = (b_ij) are square n × n matrices, then in the product C = A · B, we define the entry c_ij, for i, j = 1, 2, …, n, by

    c_ij = Σ_{k=1}^{n} a_ik · b_kj .                                (4.8)

We must compute n² matrix entries, and each is the sum of n values. The following procedure takes n × n matrices A and B and multiplies them, returning their n × n product C. We assume that each matrix has an attribute rows, giving the number of rows in the matrix.

SQUARE-MATRIX-MULTIPLY(A, B)
1  n = A.rows
2  let C be a new n × n matrix
3  for i = 1 to n
4      for j = 1 to n
5          c_ij = 0
6          for k = 1 to n
7              c_ij = c_ij + a_ik · b_kj
8  return C

The SQUARE-MATRIX-MULTIPLY procedure works as follows. The for loop of lines 3–7 computes the entries of each row i, and within a given row i, the for loop of lines 4–7 computes each of the entries c_ij, for each column j. Line 5 initializes c_ij to 0 as we start computing the sum given in equation (4.8), and each iteration of the for loop of lines 6–7 adds in one more term of equation (4.8).
Because each of the triply nested for loops runs exactly n iterations, and each execution of line 7 takes constant time, the SQUARE-MATRIX-MULTIPLY procedure takes Θ(n³) time.
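A direct Python transcription of SQUARE-MATRIX-MULTIPLY follows, with matrices represented as lists of lists and 0-based indices; the function name is chosen for this sketch.

def square_matrix_multiply(A, B):
    """Theta(n^3) matrix multiplication following equation (4.8), 0-based indices."""
    n = len(A)                           # plays the role of A.rows
    C = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            for k in range(n):
                C[i][j] += A[i][k] * B[k][j]
    return C

The three nested loops make the Θ(n³) behavior explicit: the constant-time update in the innermost loop executes exactly n³ times.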
You might at first think that any matrix multiplication algorithm must take Ω(n³) time, since the natural definition of matrix multiplication requires that many multiplications. You would be incorrect, however: we have a way to multiply matrices in o(n³) time. In this section, we shall see Strassen's remarkable recursive algorithm for multiplying n × n matrices. It runs in Θ(n^{lg 7}) time, which we shall show in Section 4.5. Since lg 7 lies between 2.80 and 2.81, Strassen's algorithm runs in O(n^{2.81}) time, which is asymptotically better than the simple SQUARE-MATRIX-MULTIPLY procedure.

A simple divide-and-conquer algorithm

To keep things simple, when we use a divide-and-conquer algorithm to compute the matrix product C = A · B, we assume that n is an exact power of 2 in each of the n × n matrices. We make this assumption because in each divide step, we will divide n × n matrices into four n/2 × n/2 matrices, and by assuming that n is an exact power of 2, we are guaranteed that as long as n ≥ 2, the dimension n/2 is an integer.
Suppose that we partition each of A, B, and C into four n/2 × n/2 matrices

    A = ( A11  A12 ) ,    B = ( B11  B12 ) ,    C = ( C11  C12 ) ,  (4.9)
        ( A21  A22 )          ( B21  B22 )          ( C21  C22 )

so that we rewrite the equation C = A · B as

    ( C11  C12 )   ( A11  A12 )   ( B11  B12 )
    ( C21  C22 ) = ( A21  A22 ) · ( B21  B22 ) .                    (4.10)

Equation (4.10) corresponds to the four equations

    C11 = A11 · B11 + A12 · B21 ,                                   (4.11)
    C12 = A11 · B12 + A12 · B22 ,                                   (4.12)
    C21 = A21 · B11 + A22 · B21 ,                                   (4.13)
    C22 = A21 · B12 + A22 · B22 .                                   (4.14)

Each of these four equations specifies two multiplications of n/2 × n/2 matrices and the addition of their n/2 × n/2 products. We can use these equations to create a straightforward, recursive, divide-and-conquer algorithm:

SQUARE-MATRIX-MULTIPLY-RECURSIVE(A, B)
 1  n = A.rows
 2  let C be a new n × n matrix
 3  if n == 1
 4      c_11 = a_11 · b_11
 5  else partition A, B, and C as in equations (4.9)
 6      C11 = SQUARE-MATRIX-MULTIPLY-RECURSIVE(A11, B11)
              + SQUARE-MATRIX-MULTIPLY-RECURSIVE(A12, B21)
 7      C12 = SQUARE-MATRIX-MULTIPLY-RECURSIVE(A11, B12)
              + SQUARE-MATRIX-MULTIPLY-RECURSIVE(A12, B22)
 8      C21 = SQUARE-MATRIX-MULTIPLY-RECURSIVE(A21, B11)
              + SQUARE-MATRIX-MULTIPLY-RECURSIVE(A22, B21)
 9      C22 = SQUARE-MATRIX-MULTIPLY-RECURSIVE(A21, B12)
              + SQUARE-MATRIX-MULTIPLY-RECURSIVE(A22, B22)
10  return C

This pseudocode glosses over one subtle but important implementation detail. How do we partition the matrices in line 5? If we were to create 12 new n/2 × n/2 matrices, we would spend Θ(n²) time copying entries. In fact, we can partition the matrices without copying entries. The trick is to use index calculations. We identify a submatrix by a range of row indices and a range of column indices of the original matrix. We end up representing a submatrix a little differently from how we represent the original matrix, which is the subtlety we are glossing over. The advantage is that, since we can specify submatrices by index calculations, executing line 5 takes only Θ(1) time (although we shall see that it makes no difference asymptotically to the overall running time whether we copy or partition in place).
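As an illustration of partitioning by index calculation, the sketch below uses NumPy, where basic slicing returns views into the original storage rather than copies; NumPy itself is an assumption of this example, not something the text requires.

import numpy as np

def square_matrix_multiply_recursive(A, B):
    """Recursive block multiplication of n x n NumPy arrays, n an exact power of 2.
    Slices such as A[:h, :h] are views, so each partition step costs only
    constant-time index arithmetic, matching the discussion above."""
    n = A.shape[0]
    C = np.zeros((n, n), dtype=A.dtype)
    if n == 1:
        C[0, 0] = A[0, 0] * B[0, 0]
        return C
    h = n // 2
    A11, A12, A21, A22 = A[:h, :h], A[:h, h:], A[h:, :h], A[h:, h:]
    B11, B12, B21, B22 = B[:h, :h], B[:h, h:], B[h:, :h], B[h:, h:]
    C[:h, :h] = (square_matrix_multiply_recursive(A11, B11)
                 + square_matrix_multiply_recursive(A12, B21))
    C[:h, h:] = (square_matrix_multiply_recursive(A11, B12)
                 + square_matrix_multiply_recursive(A12, B22))
    C[h:, :h] = (square_matrix_multiply_recursive(A21, B11)
                 + square_matrix_multiply_recursive(A22, B21))
    C[h:, h:] = (square_matrix_multiply_recursive(A21, B12)
                 + square_matrix_multiply_recursive(A22, B22))
    return C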
Now, we derive a recurrence to characterize the running time of SQUARE-MATRIX-MULTIPLY-RECURSIVE. Let T(n) be the time to multiply two n × n matrices using this procedure. In the base case, when n = 1, we perform just the one scalar multiplication in line 4, and so

    T(1) = Θ(1) .                                                   (4.15)

The recursive case occurs when n > 1. As discussed, partitioning the matrices in line 5 takes Θ(1) time, using index calculations. In lines 6–9, we recursively call SQUARE-MATRIX-MULTIPLY-RECURSIVE a total of eight times. Because each recursive call multiplies two n/2 × n/2 matrices, thereby contributing T(n/2) to the overall running time, the time taken by all eight recursive calls is 8T(n/2). We also must account for the four matrix additions in lines 6–9. Each of these matrices contains n²/4 entries, and so each of the four matrix additions takes Θ(n²) time. Since the number of matrix additions is a constant, the total time spent adding matrices in lines 6–9 is Θ(n²). (Again, we use index calculations to place the results of the matrix additions into the correct positions of matrix C, with an overhead of Θ(1) time per entry.) The total time for the recursive case, therefore, is the sum of the partitioning time, the time for all the recursive calls, and the time to add the matrices resulting from the recursive calls:

    T(n) = Θ(1) + 8T(n/2) + Θ(n²)
         = 8T(n/2) + Θ(n²) .                                        (4.16)

Notice that if we implemented partitioning by copying matrices, which would cost Θ(n²) time, the recurrence would not change, and hence the overall running time would increase by only a constant factor.
Combining equations (4.15) and (4.16) gives us the recurrence for the running time of SQUARE-MATRIX-MULTIPLY-RECURSIVE:

    T(n) = Θ(1)                if n = 1 ,
    T(n) = 8T(n/2) + Θ(n²)     if n > 1 .                           (4.17)

As we shall see from the master method in Section 4.5, recurrence (4.17) has the solution T(n) = Θ(n³). Thus, this simple divide-and-conquer approach is no faster than the straightforward SQUARE-MATRIX-MULTIPLY procedure.
Before we continue on to examining Strassen's algorithm, let us review where the components of equation (4.16) came from. Partitioning each n × n matrix by index calculation takes Θ(1) time, but we have two matrices to partition. Although you could say that partitioning the two matrices takes Θ(2) time, the constant of 2 is subsumed by the Θ-notation. Adding two matrices, each with, say, k entries, takes Θ(k) time. Since the matrices we add each have n²/4 entries, you could say that adding each pair takes Θ(n²/4) time. Again, however, the Θ-notation subsumes the constant factor of 1/4, and we say that adding two matrices with n²/4 entries each takes Θ(n²) time. We have four such matrix additions, and once again, instead of saying that they take Θ(4n²) time, we say that they take Θ(n²) time. (Of course, you might observe that we could say that the four matrix additions take Θ(4n²/4) time, and that 4n²/4 = n², but the point here is that Θ-notation subsumes constant factors, whatever they are.) Thus, we end up with two terms of Θ(n²), which we can combine into one.
When we account for the eight recursive calls, however, we cannot just subsume the constant factor of 8. In other words, we must say that together they take 8T(n/2) time, rather than just T(n/2) time. You can get a feel for why by looking back at the recursion tree in Figure 2.5, for recurrence (2.1) (which is identical to recurrence (4.7)), with the recursive case T(n) = 2T(n/2) + Θ(n). The factor of 2 determined how many children each tree node had, which in turn determined how many terms contributed to the sum at each level of the tree. If we were to ignore the factor of 8 in equation (4.16) or the factor of 2 in recurrence (4.1), the recursion tree would just be linear, rather than "bushy," and each level would contribute only one term to the sum.
Bear in mind, therefore, that although asymptotic notation subsumes constant multiplicative factors, recursive notation such as T(n/2) does not.

Strassen's method

The key to Strassen's method is to make the recursion tree slightly less bushy. That is, instead of performing eight recursive multiplications of n/2 × n/2 matrices, it performs only seven. The cost of eliminating one matrix multiplication will be several new additions of n/2 × n/2 matrices, but still only a constant number of additions. As before, the constant number of matrix additions will be subsumed by Θ-notation when we set up the recurrence equation to characterize the running time.
Strassen's method is not at all obvious. (This might be the biggest understatement in this book.) It has four steps:
1. Divide the input matrices A and B and output matrix C into n/2 × n/2 submatrices, as in equation (4.9). This step takes Θ(1) time by index calculation, just as in SQUARE-MATRIX-MULTIPLY-RECURSIVE.
2. Create 10 matrices S1, S2, …, S10, each of which is n/2 × n/2 and is the sum or difference of two matrices created in step 1. We can create all 10 matrices in Θ(n²) time.
3. Using the submatrices created in step 1 and the 10 matrices created in step 2, recursively compute seven matrix products P1, P2, …, P7. Each matrix Pi is n/2 × n/2.
4. Compute the desired submatrices C11, C12, C21, C22 of the result matrix C by adding and subtracting various combinations of the Pi matrices. We can compute all four submatrices in Θ(n²) time.
We shall see the details of steps 2–4 in a moment, but we already have enough information to set up a recurrence for the running time of Strassen's method. Let us assume that once the matrix size n gets down to 1, we perform a simple scalar multiplication, just as in line 4 of SQUARE-MATRIX-MULTIPLY-RECURSIVE. When n > 1, steps 1, 2, and 4 take a total of Θ(n²) time, and step 3 requires us to perform seven multiplications of n/2 × n/2 matrices. Hence, we obtain the following recurrence for the running time T(n) of Strassen's algorithm:

    T(n) = Θ(1)                if n = 1 ,
    T(n) = 7T(n/2) + Θ(n²)     if n > 1 .                           (4.18)

We have traded off one matrix multiplication for a constant number of matrix additions. Once we understand recurrences and their solutions, we shall see that this tradeoff actually leads to a lower asymptotic running time. By the master method in Section 4.5, recurrence (4.18) has the solution T(n) = Θ(n^{lg 7}).
We now proceed to describe the details. In step 2, we create the following 10 matrices:

    S1  = B12 − B22 ,
    S2  = A11 + A12 ,
    S3  = A21 + A22 ,
    S4  = B21 − B11 ,
    S5  = A11 + A22 ,
    S6  = B11 + B22 ,
    S7  = A12 − A22 ,
    S8  = B21 + B22 ,
    S9  = A11 − A21 ,
    S10 = B11 + B12 .

Since we must add or subtract n/2 × n/2 matrices 10 times, this step does indeed take Θ(n²) time.
In step 3, we recursively multiply n/2 × n/2 matrices seven times to compute the following n/2 × n/2 matrices, each of which is the sum or difference of products of A and B submatrices:

    P1 = A11 · S1   = A11 · B12 − A11 · B22 ,
    P2 = S2 · B22   = A11 · B22 + A12 · B22 ,
    P3 = S3 · B11   = A21 · B11 + A22 · B11 ,
    P4 = A22 · S4   = A22 · B21 − A22 · B11 ,
    P5 = S5 · S6    = A11 · B11 + A11 · B22 + A22 · B11 + A22 · B22 ,
    P6 = S7 · S8    = A12 · B21 + A12 · B22 − A22 · B21 − A22 · B22 ,
    P7 = S9 · S10   = A11 · B11 + A11 · B12 − A21 · B11 − A21 · B12 .

Note that the only multiplications we need to perform are those in the middle column of the above equations. The right-hand column just shows what these products equal in terms of the original submatrices created in step 1.
Step 4 adds and subtracts the Pi matrices created in step 3 to construct the four n/2 × n/2 submatrices of the product C. We start with

    C11 = P5 + P4 − P2 + P6 .

Expanding out the right-hand side, with the expansion of each Pi on its own line (so that terms that cancel appear on nearby lines), we see that C11 equals

      A11 · B11 + A11 · B22 + A22 · B11 + A22 · B22
    − A22 · B11 + A22 · B21
    − A11 · B22 − A12 · B22
    − A22 · B22 − A22 · B21 + A12 · B22 + A12 · B21
    = A11 · B11 + A12 · B21 ,

which corresponds to equation (4.11). Similarly, we set

    C12 = P1 + P2 ,

and so C12 equals

      A11 · B12 − A11 · B22
    + A11 · B22 + A12 · B22
    = A11 · B12 + A12 · B22 ,

corresponding to equation (4.12). Setting

    C21 = P3 + P4

makes C21 equal

      A21 · B11 + A22 · B11
    − A22 · B11 + A22 · B21
    = A21 · B11 + A22 · B21 ,

corresponding to equation (4.13). Finally, we set

    C22 = P5 + P1 − P3 − P7 ,

so that C22 equals

      A11 · B11 + A11 · B22 + A22 · B11 + A22 · B22
    − A11 · B22 + A11 · B12
    − A22 · B11 − A21 · B11
    − A11 · B11 − A11 · B12 + A21 · B11 + A21 · B12
    = A22 · B22 + A21 · B12 ,

which corresponds to equation (4.14). Altogether, we add or subtract n/2 × n/2 matrices eight times in step 4, and so this step indeed takes Θ(n²) time.
Thus, we see that Strassen's algorithm, comprising steps 1–4, produces the correct matrix product and that recurrence (4.18) characterizes its running time. Since we shall see in Section 4.5 that this recurrence has the solution T(n) = Θ(n^{lg 7}), Strassen's method is asymptotically faster than the straightforward SQUARE-MATRIX-MULTIPLY procedure.
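Putting steps 1–4 together, here is a compact Python sketch of Strassen's method, again assuming NumPy arrays whose side length is an exact power of 2; the function name is a choice made for this example.

import numpy as np

def strassen(A, B):
    """Strassen's method for n x n NumPy arrays, n an exact power of 2.
    Follows steps 1-4 above: 10 sums S1..S10, 7 recursive products P1..P7,
    then 8 additions/subtractions to assemble C."""
    n = A.shape[0]
    if n == 1:
        return A * B                       # scalar base case
    h = n // 2
    A11, A12, A21, A22 = A[:h, :h], A[:h, h:], A[h:, :h], A[h:, h:]
    B11, B12, B21, B22 = B[:h, :h], B[:h, h:], B[h:, :h], B[h:, h:]
    S1, S2 = B12 - B22, A11 + A12
    S3, S4 = A21 + A22, B21 - B11
    S5, S6 = A11 + A22, B11 + B22
    S7, S8 = A12 - A22, B21 + B22
    S9, S10 = A11 - A21, B11 + B12
    P1, P2, P3 = strassen(A11, S1), strassen(S2, B22), strassen(S3, B11)
    P4, P5 = strassen(A22, S4), strassen(S5, S6)
    P6, P7 = strassen(S7, S8), strassen(S9, S10)
    C = np.empty_like(A)
    C[:h, :h] = P5 + P4 - P2 + P6
    C[:h, h:] = P1 + P2
    C[h:, :h] = P3 + P4
    C[h:, h:] = P5 + P1 - P3 - P7
    return C

A quick way to check such a sketch is to compare its output against A @ B on small random matrices whose side length is a power of 2.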
The notes at the end of this chapter discuss some of the practical aspects of Strassen's algorithm.

Exercises

Note: Although Exercises 4.2-3, 4.2-4, and 4.2-5 are about variants on Strassen's algorithm, you should read Section 4.5 before trying to solve them.

4.2-1
Use Strassen's algorithm to compute the matrix product

    ( 1  3 )   ( 6  8 )
    ( 7  5 ) · ( 4  2 ) .

Show your work.

4.2-2
Write pseudocode for Strassen's algorithm.

4.2-3
How would you modify Strassen's algorithm to multiply n × n matrices in which n is not an exact power of 2? Show that the resulting algorithm runs in time Θ(n^{lg 7}).

4.2-4
What is the largest k such that if you can multiply 3 × 3 matrices using k multiplications (not assuming commutativity of multiplication), then you can multiply n × n matrices in time o(n^{lg 7})? What would the running time of this algorithm be?

4.2-5
V. Pan has discovered a way of multiplying 68 × 68 matrices using 132,464 multiplications, a way of multiplying 70 × 70 matrices using 143,640 multiplications, and a way of multiplying 72 × 72 matrices using 155,424 multiplications. Which method yields the best asymptotic running time when used in a divide-and-conquer matrix-multiplication algorithm? How does it compare to Strassen's algorithm?

4.2-6
How quickly can you multiply a kn × n matrix by an n × kn matrix, using Strassen's algorithm as a subroutine? Answer the same question with the order of the input matrices reversed.

4.2-7
Show how to multiply the complex numbers a + bi and c + di using only three multiplications of real numbers. The algorithm should take a, b, c, and d as input and produce the real component ac − bd and the imaginary component ad + bc separately.

4.3 The substitution method for solving recurrences

Now that we have seen how recurrences characterize the running times of divide-and-conquer algorithms, we will learn how to solve recurrences. We start in this section with the "substitution" method.
The substitution method for solving recurrences comprises two steps:
1. Guess the form of the solution.
2. Use mathematical induction to find the constants and show that the solution works.
We substitute the guessed solution for the function when applying the inductive hypothesis to smaller values; hence the name "substitution method." This method is powerful, but we must be able to guess the form of the answer in order to apply it.
We can use the substitution method to establish either upper or lower bounds on a recurrence. As an example, let us determine an upper bound on the recurrence

    T(n) = 2T(⌊n/2⌋) + n ,                                          (4.19)

which is similar to recurrences (4.3) and (4.4). We guess that the solution is T(n) = O(n lg n). The substitution method requires us to prove that T(n) ≤ cn lg n for an appropriate choice of the constant c > 0. We start by assuming that this bound holds for all positive m < n, in particular for m = ⌊n/2⌋, yielding T(⌊n/2⌋) ≤ c⌊n/2⌋ lg(⌊n/2⌋). Substituting into the recurrence yields

    T(n) ≤ 2(c⌊n/2⌋ lg(⌊n/2⌋)) + n
         ≤ cn lg(n/2) + n
         = cn lg n − cn lg 2 + n
         = cn lg n − cn + n
         ≤ cn lg n ,

where the last step holds as long as c ≥ 1.
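The bound can also be sanity-checked numerically. The short script below is illustrative only: it fixes the arbitrary boundary condition T(1) = 1 and the particular constant c = 2, then tabulates the recurrence against c · n · lg n.

from functools import lru_cache
from math import log2

@lru_cache(maxsize=None)
def T(n):
    # Recurrence (4.19) with the arbitrary boundary condition T(1) = 1.
    return 1 if n == 1 else 2 * T(n // 2) + n

for n in (2, 4, 16, 256, 4096):
    print(n, T(n), 2 * n * log2(n))   # here T(n) <= 2 n lg n for every n >= 2

Such an experiment only suggests a bound; the induction in the text is what actually proves it.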
Mathematical induction now requires us to show that our solution holds for the boundary conditions. Typically, we do so by showing that the boundary conditions are suitable as base cases for the inductive proof. For the recurrence (4.19), we must show that we can choose the constant c large enough so that the bound T(n) ≤ cn lg n works for the boundary conditions as well. This requirement can sometimes lead to problems. Let us assume, for the sake of argument, that T(1) = 1 is the sole boundary condition of the recurrence. Then for n = 1, the bound T(n) ≤ cn lg n yields T(1) ≤ c · 1 · lg 1 = 0, which is at odds with T(1) = 1. Consequently, the base case of our inductive proof fails to hold.
We can overcome this obstacle in proving an inductive hypothesis for a specific boundary condition with only a little more effort. In the recurrence (4.19), for example, we take advantage of asymptotic notation requiring us only to prove T(n) ≤ cn lg n for n ≥ n₀, where n₀ is a constant that we get to choose. We keep the troublesome boundary condition T(1) = 1, but remove it from consideration in the inductive proof. We do so by first observing that for n > 3, the recurrence does not depend directly on T(1). Thus, we can replace T(1) by T(2) and T(3) as the base cases in the inductive proof, letting n₀ = 2. Note that we make a distinction between the base case of the recurrence (n = 1) and the base cases of the inductive proof (n = 2 and n = 3). With T(1) = 1, we derive from the recurrence that T(2) = 4 and T(3) = 5. Now we can complete the inductive proof that T(n) ≤ cn lg n for some constant c ≥ 1 by choosing c large enough so that T(2) ≤ c · 2 · lg 2 and T(3) ≤ c · 3 · lg 3. As it turns out, any choice of c ≥ 2 suffices for the base cases of n = 2 and n = 3 to hold. For most of the recurrences we shall examine, it is straightforward to extend boundary conditions to make the inductive assumption work for small n, and we shall not always explicitly work out the details.

Making a good guess

Unfortunately, there is no general way to guess the correct solutions to recurrences. Guessing a solution takes experience and, occasionally, creativity. Fortunately, though, you can use some heuristics to help you become a good guesser. You can also use recursion trees, which we shall see in Section 4.4, to generate good guesses.
If a recurrence is similar to one you have seen before, then guessing a similar solution is reasonable. As an example, consider the recurrence

    T(n) = 2T(⌊n/2⌋ + 17) + n ,

which looks difficult because of the added "17" in the argument to T on the right-hand side. Intuitively, however, this additional term cannot substantially affect the solution to the recurrence. When n is large, the difference between ⌊n/2⌋ and ⌊n/2⌋ + 17 is not that large: both cut n nearly evenly in half. Consequently, we make the guess that T(n) = O(n lg n), which you can verify as correct by using the substitution method (see Exercise 4.3-6).
Another way to make a good guess is to prove loose upper and lower bounds on the recurrence and then reduce the range of uncertainty. For example, we might start with a lower bound of T(n) = Ω(n) for the recurrence (4.19), since we have the term n in the recurrence, and we can prove an initial upper bound of T(n) = O(n²). Then, we can gradually lower the upper bound and raise the lower bound until we converge on the correct, asymptotically tight solution of T(n) = Θ(n lg n).

Subtleties

Sometimes you might correctly guess an asymptotic bound on the solution of a recurrence, but somehow the math fails to work out in the induction. The problem frequently turns out to be that the inductive assumption is not strong enough to prove the detailed bound. If you revise the guess by subtracting a lower-order term when you hit such a snag, the math often goes through.
Consider the recurrence

    T(n) = T(⌊n/2⌋) + T(⌈n/2⌉) + 1 .

We guess that the solution is T(n) = O(n), and we try to show that T(n) ≤ cn for an appropriate choice of the constant c. Substituting our guess in the recurrence, we obtain

    T(n) ≤ c⌊n/2⌋ + c⌈n/2⌉ + 1
         = cn + 1 ,

which does not imply T(n) ≤ cn for any choice of c. We might be tempted to try a larger guess, say T(n) = O(n²). Although we can make this larger guess work, our original guess of T(n) = O(n) is correct. In order to show that it is correct, however, we must make a stronger inductive hypothesis.
Intuitively, our guess is nearly right: we are off only by the constant 1, a lower-order term. Nevertheless, mathematical induction does not work unless we prove the exact form of the inductive hypothesis. We overcome our difficulty by subtracting a lower-order term from our previous guess. Our new guess is T(n) ≤ cn − d, where d ≥ 0 is a constant. We now have

    T(n) ≤ (c⌊n/2⌋ − d) + (c⌈n/2⌉ − d) + 1
         = cn − 2d + 1
         ≤ cn − d ,

which holds as long as d ≥ 1.