Lecture Notes: Sets and Functions PDF
Document Details
Uploaded by SleekSweetPea
Summary
These lecture notes provide an introduction to sets and functions in mathematics, focusing on number systems, particularly the set of real numbers. The notes cover properties of sets and functions, including examples, definitions, and graphs. The document also introduces polynomial functions.
Full Transcript
CHAPTER 1
Sets and functions

1.1. Sets

Sets are the building blocks of modern mathematics. We recall them briefly, focussing on number systems, particularly on the set of real numbers.

1.1.1. Sets. A set consists of elements. Let us begin with a couple of examples of sets.
  A = set of dogs in iitb campus,
  B = set of students in MA 105.
You can write down many similar examples.

1.1.2. Number systems. Now let us look at some standard sets related to number systems.
(1) $\mathbb{N} = \{0, 1, 2, 3, \dots\}$ = set of natural numbers
(2) $\mathbb{N}_+ = \{1, 2, 3, \dots\}$ = set of positive natural numbers
(3) $\mathbb{Z} = \{\dots, -3, -2, -1, 0, 1, 2, 3, \dots\}$ = set of integers
(4) $\mathbb{Q} = \{m/n : m, n \in \mathbb{Z},\ n \neq 0\}$ = set of rational numbers
(5) $\mathbb{R}$ = set of real numbers
(6) $\mathbb{R} \setminus \mathbb{Q}$ = set of irrational numbers

Lemma 1.1. There is no rational number whose square is 2.

Proof. Suppose $(p/q)^2 = 2$, that is, $p^2 = 2q^2$ for some integers $p, q$ such that $q \neq 0$, and $p$ and $q$ have no common factor. Now 2 divides $p^2$, and hence also divides $p$. So $p = 2r$ for an integer $r$. Then $2q^2 = p^2 = (2r)^2 = 4r^2$, and so $q^2 = 2r^2$. Now 2 divides $q^2$, and hence also divides $q$. Thus 2 is a common factor of $p$ and $q$, which is a contradiction. □

The above result motivates the consideration of number systems which are larger than $\mathbb{Q}$ such as $\mathbb{R}$.

Remark 1.2. There is no formal definition of a number system. However, the above considerations led to abstract concepts such as monoids, groups, rings, fields (in the later part of the nineteenth and early part of the twentieth century). For example, $\mathbb{Z}$ is an example of a ring, while $\mathbb{Q}$ and $\mathbb{R}$ are examples of fields. For more details, see Artin, Dummit-Foote.

1.1.3. Set of real numbers. It is customary to represent the set of real numbers $\mathbb{R}$ as a line as follows.

[Figure: the real line with the points 0, 1, $\sqrt{2}$, 2 marked.]

Elements of $\mathbb{R}$ are points on the line. Have we filled all the “holes” in the line? The set of rational numbers does not achieve this goal, but we believe that the set of real numbers does.
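As a hedged numerical companion to Lemma 1.1 and to the question of “holes” in the rational line, the following Python sketch searches over fractions p/q with a bounded denominator and reports the one whose square is closest to 2. The function name `closest_square_to_two` and the bound `max_q` are illustrative assumptions, not part of the notes; the search is a finite check and does not replace the proof.

```python
import math
from fractions import Fraction


def closest_square_to_two(max_q):
    """Return (p/q, |(p/q)^2 - 2|) minimising the error over 1 <= q <= max_q."""
    best_frac, best_err = None, None
    for q in range(1, max_q + 1):
        # The integers nearest to q*sqrt(2) are isqrt(2*q*q) and the next one.
        p0 = math.isqrt(2 * q * q)
        for p in (p0, p0 + 1):
            cand = Fraction(p, q)
            err = abs(cand * cand - 2)  # exact rational arithmetic
            if best_err is None or err < best_err:
                best_frac, best_err = cand, err
    return best_frac, best_err


frac, err = closest_square_to_two(100)
print(frac, float(err))
```

With `max_q = 100`, the returned error is smaller than 1/1000 but strictly positive: rationals come arbitrarily close to a square root of 2 without ever reaching it, in line with Lemma 1.1.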
There are two standard ways to pass from $\mathbb{Q}$ to $\mathbb{R}$, namely, Dedekind cuts and Cauchy sequences. These two constructions were made in the nineteenth century around 1870. Note: $\sqrt{2} \in \mathbb{R}$.

1.1.4. Properties of R. We mention that the set of real numbers satisfies the following properties.
• algebraic properties (related to addition and multiplication).
• order properties (related to greater than and less than).
• completeness property.
• archimedean property (implied by completeness property).

The archimedean property says that for any $x \in \mathbb{R}$, there is a natural number $n \in \mathbb{N}$ such that $n > x$. Let us use this property to prove that between any two distinct real numbers, there is a rational number and an irrational number:

Lemma 1.3. Let $a, b \in \mathbb{R}$ with $a < b$. Then there are $r \in \mathbb{Q}$ and $s \in \mathbb{R} \setminus \mathbb{Q}$ such that $a < r < b$ and $a < s < b$.

Proof. Let us do this in two steps.
(i) Let $[x]$ denote the integer part of $x$, that is, $x - 1 < [x] \le x$. Using the archimedean property, pick $n \in \mathbb{N}_+$ with $n > \frac{1}{b-a}$, and put $m = [na] + 1$. Then $na < m \le na + 1 < nb$, so $a < \frac{m}{n} < b$. Now take $r := m/n$.
(ii) Using item (i), find $r \in \mathbb{Q}$ such that $a + \sqrt{2} < r < b + \sqrt{2}$. Then $a < r - \sqrt{2} < b$. Now take $s := r - \sqrt{2}$. □

1.1.5. Intervals. We say $I \subseteq \mathbb{R}$ is an interval if, whenever $a, b \in I$ and $a < x < b$, we have $x \in I$. Some standard examples of intervals are given below.

For $a \le b \in \mathbb{R}$, define
\[ (a, b) := \{x \in \mathbb{R} : a < x < b\} \quad\text{and}\quad [a, b] := \{x \in \mathbb{R} : a \le x \le b\}. \]
These are the open interval and closed interval, respectively, from $a$ to $b$. See illustrations below. Similarly, define
\[ (a, \infty) := \{x \in \mathbb{R} : a < x\} \quad\text{and}\quad [a, \infty) := \{x \in \mathbb{R} : a \le x\}, \]
and
\[ (-\infty, b) := \{x \in \mathbb{R} : x < b\} \quad\text{and}\quad (-\infty, b] := \{x \in \mathbb{R} : x \le b\}. \]
Note: The empty set $\emptyset$ and $\mathbb{R}$ are also intervals.

Observe:
\[ \bigcap_{n=1}^{\infty} \Big(a, b + \frac{1}{n}\Big) = (a, b] \quad\text{and}\quad \bigcup_{n=1}^{\infty} \Big[a, b - \frac{1}{n}\Big] = [a, b). \]

Puzzle 1.4. A man has no money, but fortunately he has a silver bar which is 31 inches long. So he enters into the following agreement with his landlord for paying his March rent. He will pay one inch of his silver bar for each of the 31 days of March.
The question is: What is the minimum number of pieces he can cut his silver bar into in order to fulfil this requirement? The silliest thing would be to cut the bar into 31 pieces and pay one piece each day. A better way to start would be to have 2 one-inch pieces and a 3-inch piece, so that he can pay the first two days with the one-inch pieces, and on the third day he can give the 3-inch piece and take back the 2 one-inch pieces. He can use these to pay off the fourth and fifth days as well.

Puzzle 1.5. A shopkeeper has a single weight of 40 kilos. One day, his son mistakenly drops it on the floor, and it breaks into 4 pieces. The shopkeeper is very angry but his clever son shows him that with these 4 pieces, he can weigh on his balance any item whose weight is an integer between 1 and 40 (both inclusive). What are these 4 weights?

1.2. Functions

When we talk of sets, we also need to talk of ways to relate them. This is the notion of a function. We focus mainly on real-valued functions of a real variable. We discuss bounded, monotone, convex functions. We also informally recall many familiar examples; some of them are formalized later in Section 5.3. For functions of more than one real variable, see Section 6.2.

1.2.1. Functions between sets. We specify a function as $f : A \to B$. Here $A$ and $B$ are sets. We say $A$ is the domain of $f$, and $B$ is the codomain of $f$. To every element $a \in A$, we have $f(a) = b \in B$.

[Figure: $f$ sends a point $a$ of $A$ (the domain of $f$) to the point $f(a)$ of $B$ (the codomain of $f$).]

We write $f(A)$ for the range of $f$. It is the set of values taken by $f$. It is a subset of $B$.

For $f : A \to B$ and $g : B \to C$, define the composite function $g \circ f : A \to C$ by
\[ (g \circ f)(a) := g(f(a)) \quad\text{for } a \in A. \]

[Figure: $a \in A$ is sent by $f$ to $f(a) \in B$ and then by $g$ to $g(f(a)) \in C$; here $A$ is the domain of $f$, $B$ is the codomain of $f$ and the domain of $g$, and $C$ is the codomain of $g$.]

1.2.2. Graph of a function. The graph of $f : A \to B$ is the subset of $A \times B$ defined by
\[ \{(a, f(a)) : a \in A\}. \]
A schematic illustration is shown below.

[Figure: the point $(a, f(a))$ in $A \times B$, with $a$ on the $A$-axis and $f(a)$ on the $B$-axis.]

1.2.3. Functions between real numbers.
If the codomain of $f$ is $\mathbb{R}$, that is, $f : A \to \mathbb{R}$, then we say $f$ is real-valued. For example,
  for A = set of dogs in iitb campus, consider $f(a)$ = weight of dog $a$,
  for B = set of students in MA 105, consider $f(b)$ = IQ of student $b$.

We will mainly deal with functions $f$ whose domain is $A \subseteq \mathbb{R}$. For functions on intervals, consider
\[ f : [0, 1] \to \mathbb{R},\ f(x) = x^2 + 5, \qquad g : [0, 1] \to (3, 10),\ g(x) = x^2 + 5. \]
Note very carefully: $f$ and $g$ are different functions because their codomains are different!

1.2.4. Absolute value function. An important real-valued function on $\mathbb{R}$ is the absolute value function. It is defined by
\[ f : \mathbb{R} \to \mathbb{R}, \quad f(x) = |x|, \]
the absolute value of $x$. Its graph is shown below.

[Figure: graph of $|x|$.]

The absolute value function satisfies the following properties.
(i) $|x| \ge 0$ with equality iff $x = 0$. Thus, the range of $f$ is $[0, \infty)$.
(ii) $|x| = |-x|$.
(iii) $|xy| = |x||y|$.
(iv) $-|x| \le x \le |x|$.
(v) $|x + y| \le |x| + |y|$. This is known as the triangle inequality.

1.2.5. Sine and cosine functions. The graphs of the functions $f(x) = \sin x$ and $f(x) = \cos x$ are shown below.

[Figure: graphs of $\sin x$ and $\cos x$.]

The graph of the function $f(x) = \sin(1/x)$, for $x > 0$, is shown below.

[Figure: graph of $\sin(1/x)$ for $x > 0$.]

(For clarity, we have stretched the x-axis.) The graph oscillates rapidly as it approaches the y-axis.

The graph of the function $f(x) = x \sin(1/x)$, for $x > 0$, is shown below.

[Figure: graph of $x \sin(1/x)$ for $x > 0$.]

The graph oscillates exactly as before, but now the amplitude of the oscillations goes to zero as it approaches the y-axis.

1.2.6. Exponential and logarithm functions. The graphs of the functions $f(x) = e^x$ and $f(x) = \log x$ are shown below.

[Figure: graphs of $e^x$ and $\log x$.]

1.2.7. Integer part function. The integer part $[x]$ of a real number $x$ is the greatest integer which is less than or equal to $x$. For example, $[.5] = 0$, $[2] = 2$, $[2.1] = 2$. The graph of the integer part function $f(x) = [x]$ is shown below.

[Figure: graph of $[x]$.]

1.2.8. Polynomial functions. Polynomials in one variable are functions which are finite linear combinations of $1, x, x^2$ and so on.
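To make "finite linear combination of 1, x, x^2, ..." concrete, here is a minimal Python sketch that stores a polynomial as its list of coefficients and evaluates it with Horner's rule. The representation and the name `poly_eval` are assumptions chosen for illustration; they do not come from the notes.

```python
def poly_eval(coeffs, x):
    """Evaluate c0 + c1*x + c2*x**2 + ... where coeffs = [c0, c1, c2, ...]."""
    result = 0
    # Horner's rule: start from the highest power and fold inwards.
    for c in reversed(coeffs):
        result = result * x + c
    return result


# p(x) = 1 - 3x + 2x^3, stored as [1, -3, 0, 2]
print(poly_eval([1, -3, 0, 2], 2))  # 1 - 6 + 16 = 11
```

The coefficient list is finite by construction, which mirrors the requirement that only finitely many powers of x appear.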
Each polynomial has a degree. Polynomials of degree zero are constants $p(x) = c$, degree one are linear functions $p(x) = ax + b$ with $a \neq 0$, degree two are quadratic functions $p(x) = ax^2 + bx + c$ with $a \neq 0$, and so on.

The graph of a degree one polynomial (linear) looks as follows.
The graph of a degree two polynomial (quadratic) looks as follows.
The graph of a degree three polynomial (cubic) looks as follows.
The graph of a degree four polynomial (quartic) looks as follows.

For each degree, we have drawn two graphs depending on the sign of the leading coefficient. Also, the above pictures show the generic case. They may degenerate in specific cases. For example, compare the graph of $f(x) = x^3$ with the left picture shown above for a cubic.

1.2.9. Bounded and monotone functions. There are properties which a given function may or may not have. For example, for a function $f$, we can ask whether $f$ is injective (one-to-one) or surjective (onto) or bijective (one-to-one and onto). Some other important properties are listed below.

Definition 1.6. A function $f : A \to \mathbb{R}$ is
(i) bounded above if there is a real number $M$ (upper bound) such that $f(x) \le M$ for all $x \in A$,
(ii) bounded below if there is a real number $M$ (lower bound) such that $M \le f(x)$ for all $x \in A$,
(iii) bounded if it is bounded above and bounded below, that is, if there are real numbers $M_1, M_2$ such that $M_1 \le f(x) \le M_2$ for all $x \in A$.

A bounded function can be visualized as follows.

[Figure: graph of a function lying between the horizontal lines $y = M_1$ and $y = M_2$.]

For a bounded function $f$, the global maximum of $f$ is its least upper bound, and the global minimum of $f$ is its greatest lower bound. By completeness property of $\mathbb{R}$, these necessarily exist, but they may not be attained at any point of $A$.

Definition 1.7. Let $I$ be an interval, and let $f : I \to \mathbb{R}$. We say $f$ is
(i) (monotonically) increasing on $I$ if for $x_1, x_2 \in I$, $x_1 < x_2 \implies f(x_1) \le f(x_2)$.
(ii) (monotonically) decreasing on $I$ if for $x_1, x_2 \in I$, $x_1 < x_2 \implies f(x_1) \ge f(x_2)$.
(iii) monotonic on $I$ if it is increasing on $I$, or it is decreasing on $I$.
We use the terms strictly increasing and strictly decreasing if the inequalities $\le$ and $\ge$ above can be replaced by $<$ and $>$.

Note: The constant function $f(x) = 3$ is both increasing and decreasing, but it is not strictly increasing or strictly decreasing.

1.2.10. Convex functions.

Definition 1.8. For $I$ an interval, let $f : I \to \mathbb{R}$ be a function.
(i) $f$ is convex if for $p < q$ in $I$ and $t \in (0, 1)$,
\[ f(tp + (1 - t)q) \le t f(p) + (1 - t) f(q). \]
We use the term strictly convex if the above inequality is strict.
(ii) $f$ is (strictly) concave if $-f$ is (strictly) convex. This can also be defined directly by reversing the inequality.

The different points involved in the definition of a convex function are illustrated below.

[Figure: the points $p$, $tp + (1 - t)q$, $q$ on the x-axis and the values $f(p)$, $tf(p) + (1 - t)f(q)$, $f(q)$ on the y-axis.]

In geometric terms, a function $f$ is convex if the chord joining any pair of points $(p, f(p))$ and $(q, f(q))$ on the graph of $f$ lies on or above the graph of $f$. This is illustrated below.

[Figure: a chord lying above the graph of a convex function.]

Exercise 1.9. Show: The convexity condition can be equivalently written as: For any $p < x < q$ in $I$,
\[ f(x) \le f(p) + \frac{f(q) - f(p)}{q - p}\,(x - p). \]
For strict convexity, we replace $\le$ by $<$ above. For (strictly) concave, we use $\ge$ and $>$.

The graph of a typical convex function on $\mathbb{R}$ is shown below on the left. The graph of a function on $\mathbb{R}$ which is convex but not strictly convex is shown below on the right. A concrete example is the absolute value function.

[Figure: left, a strictly convex function; right, a convex function that is not strictly convex.]

Mention convex sets, and the fact that a convex set in $\mathbb{R}$ is the same as an interval.

CHAPTER 2
Sequences

2.1. Sequences

We introduce sequences of real numbers, and define the notion of convergence of such a sequence. We connect convergence to the property of being monotone and bounded. This is related to completeness property of $\mathbb{R}$.

2.1.1. Sequences.

Definition 2.1.
A sequence of real numbers is a function $f : \mathbb{N}_+ \to \mathbb{R}$ from the set of positive natural numbers to the set of real numbers.

Put $f(n) = a_n$. Thus specifying the function $f$ is the same as specifying $a_1, a_2, a_3, \dots$. We shall use the notation $\{a_n\}$ for short. We call $a_n$ the n-th term of the sequence.

Example 2.2. Here are a few examples of sequences.
(1) $a_n = 1/n$: $1, 1/2, 1/3, 1/4, \dots$
(2) $a_n = n$: $1, 2, 3, 4, \dots$
(3) $a_n = (-1)^n$: $-1, 1, -1, 1, \dots$
(4) $a_n = n^2$: $1, 4, 9, 16, \dots$
(5) $a_n = \sqrt{2}$: $\sqrt{2}, \sqrt{2}, \sqrt{2}, \dots$ This is a constant sequence.
(6) $a_n = 2^n$: $2, 4, 8, 16, \dots$
(7) $a_1 = 1$, $a_2 = 1$ and $a_n = a_{n-1} + a_{n-2}$ for $n \ge 3$: $1, 1, 2, 3, 5, 8, 13, 21, 34, \dots$ This is the Fibonacci sequence.

2.1.2. Visualizing a sequence. A sequence may be visualized on the real line as follows by marking its terms $a_1, a_2, a_3, \dots$.

[Figure: the terms $a_1, \dots, a_6$ marked on the real line.]

It may also be visualized as the graph of the function $\mathbb{N}_+ \to \mathbb{R}$. In the picture below, we have marked the first 5 terms of the sequence.

[Figure: the points $(n, a_n)$ for $n = 1, \dots, 5$.]

Remark 2.3. We make some remarks related to the notion of a sequence.
(1) A sequence is always infinite. For example, $a_1, a_2, a_3, a_4$, which is a tuple of four real numbers, is not a sequence.
(2) A sequence need not be given by an algebraic formula. For example, we can define a sequence using the digits in the decimal expansion of $\sqrt{2}$. We can also do something like $-1, 3, 4, 5, 2, 2, 2, 2, \dots$, that is, the sequence is constant barring the first few terms.
(3) $\infty$ is not a real number. Thus, $-1, 2, \infty, 1/5, \dots$ is not a sequence. Similarly, $\{\frac{1}{n-1}\}$ does not define a sequence since it is not defined at $n = 1$.
(4) The formula $a_n = \frac{1}{n-5}$ does not define a sequence (since it is not defined at $n = 5$).
(5) The following is not a sequence: $\dots, a_{-3}, a_{-2}, a_{-1}, a_0, a_1, a_2, a_3, \dots$. It arises from a function $f : \mathbb{Z} \to \mathbb{R}$.
(6) An example of a sequence which contains each integer exactly once is $0, 1, -1, 2, -2, 3, -3, \dots$.
(7) If $\{a_n\}$ and $\{b_n\}$ are two sequences, then interleaving gives a third sequence $a_1, b_1, a_2, b_2, a_3, b_3, a_4, b_4, \dots$. For example, the sequence $a_n = (-1)^n$ arises by interleaving the constant $-1$ sequence and constant $1$ sequence.

Exercise 2.4. Construct a sequence which contains all rational numbers. (One way is to use Cantor's famous diagonal enumeration argument.)

2.1.3. Bounded and monotone sequences. We now define some properties which a given sequence may or may not have.

Definition 2.5. A sequence $\{a_n\}$ of real numbers is
(i) bounded above if there is a real number $M$ such that $a_n \le M$ for $n \ge 1$,
(ii) bounded below if there is a real number $M$ such that $M \le a_n$ for $n \ge 1$,
(iii) bounded if it is bounded above and bounded below, that is, if there are real numbers $M_1, M_2$ such that $M_1 \le a_n \le M_2$ for $n \ge 1$.

A bounded sequence can be visualized on the real line as follows.

[Figure: the terms $a_1, \dots, a_6$ on the real line, all lying between $M_1$ and $M_2$.]

Definition 2.5 is the special case $A := \mathbb{N}_+$ of Definition 1.6.

Definition 2.6. A sequence $\{a_n\}$ of real numbers is
(i) (monotonically) increasing if $a_1 \le a_2 \le a_3 \le \dots$,
(ii) (monotonically) decreasing if $a_1 \ge a_2 \ge a_3 \ge \dots$,
(iii) monotonic if it is either (monotonically) increasing or decreasing.

Exercise 2.7. For sequences in Example 2.2, which of the bounded and monotone properties hold?

2.1.4. Convergence of sequences. Where is a sequence heading?

Definition 2.8 ($\epsilon$–$n_0$). Let $\{a_n\}$ be a sequence of real numbers. We say $\{a_n\}$ is convergent if there is $a \in \mathbb{R}$ such that the following condition holds.
  For every $\epsilon > 0$, there is $n_0 \in \mathbb{N}_+$ such that $|a_n - a| < \epsilon$ for $n \ge n_0$.
In this case, we say $\{a_n\}$ converges to $a$, or $a$ is the limit of $\{a_n\}$, and write
\[ \lim_{n \to \infty} a_n = a \quad\text{or}\quad a_n \to a \ (\text{as } n \to \infty). \]
If a sequence does not converge, we say the sequence diverges or is divergent.

Example 2.9. Let us look at convergence in some of our examples.
(1) The sequence $a_n = 1/n$ converges to 0, or equivalently, $\lim_{n \to \infty} \frac{1}{n} = 0$. Why?
Let $\epsilon > 0$. By archimedean property, there is $n_0 \in \mathbb{N}_+$ such that $\frac{1}{n_0} < \epsilon$. Therefore, $|a_n - a| = |\frac{1}{n}| \le \frac{1}{n_0} < \epsilon$ for $n \ge n_0$.

[...] 11.

Note very carefully: The definition of convergence only requires us to find one $n_0$, not necessarily the smallest one. However, it is a good practice to specify the smallest $n_0$ for a given $\epsilon$ whenever possible.
(5) Many times, we will be dealing with two convergent sequences $a_n \to a$ and $b_n \to b$ at the same time. In such cases: For $\epsilon > 0$, the sequence $\{a_n\}$ will have its $n_0$, and $\{b_n\}$ will have its $n_0$. By taking the larger of the two, we will have an $n_0$ which works for both.

2.1.5. Uniqueness of a limit. Let $\{a_n\}$ be any sequence of real numbers. Parvati says that $\{a_n\}$ converges to 10, while Shankar says that $\{a_n\}$ converges to 20. Can both of them be right?

[Figure: the real line with the points 10 and 20 and the disjoint open intervals (6, 14) and (16, 24) around them.]

No. Give $\epsilon = 4$ to both of them, and ask them to provide $n_0$. Both cannot succeed since the open intervals $(6, 14)$ and $(16, 24)$ are disjoint as shown in the picture. This argument generalizes to yield the following.

Lemma 2.11. Limit of a sequence of real numbers is unique whenever it exists.

Proof. Let $\{a_n\}$ be such a sequence. Suppose $a_n \to a$ and $a_n \to b$ with $a \neq b$. Take $\epsilon = |a - b|/2 > 0$. Let $n_0 \in \mathbb{N}_+$ be such that
\[ |a_n - a| < \epsilon \quad\text{and}\quad |a_n - b| < \epsilon \quad\text{for } n \ge n_0. \]
Then
\[ |a - b| \le |a - a_{n_0}| + |a_{n_0} - b| < \epsilon + \epsilon = |a - b|, \]
which is a contradiction. Hence $a = b$. □

2.1.6. Convergent implies bounded. We now relate convergence of a sequence to its property of being bounded.

Proposition 2.12. Let $\{a_n\}$ be a sequence of real numbers. If $\{a_n\}$ converges, then it is bounded. Equivalently, if $\{a_n\}$ is not bounded, then it does not converge.

[Figure: the terms of the sequence on the real line, with an interval (drawn in blue) around the limit $a$.]

Proof idea. A finite set of real numbers is always bounded. The problem is that a sequence contains infinitely many real numbers. But if $\{a_n\}$ converges, then some tail of this sequence lies in a finite neighbourhood of the limit $a$. In the above picture, only finitely many terms of the sequence will be outside the blue interval.
□

For example: The sequences $\{n\}$, $\{n^2\}$, $\{2^n\}$ are not bounded, and hence are divergent.

The converse of Proposition 2.12 is false. For example, take $a_n = (-1)^n$. This sequence is bounded but it does not converge.

2.1.7. Algebra of sequences. One can add two sequences, multiply two sequences, scalar multiply a sequence (by a real number). These operations are compatible with the notion of convergence in the following sense.

Lemma 2.13 (Limit theorems). Suppose $a_n \to a$ and $b_n \to b$ are two convergent sequences of real numbers. Then
(i) $a_n + b_n \to a + b$,
(ii) $r a_n \to r a$ for $r \in \mathbb{R}$,
(iii) $a_n b_n \to ab$,
(iv) $1/a_n \to 1/a$ if $a \neq 0$.

Proof. For item (i): Let $\epsilon > 0$. Since $a_n \to a$ and $b_n \to b$, there is $n_0 \in \mathbb{N}_+$ such that
\[ |a_n - a| < \epsilon/2 \quad\text{and}\quad |b_n - b| < \epsilon/2 \quad\text{for } n \ge n_0. \]
Now using triangle inequality,
\[ |(a_n + b_n) - (a + b)| \le |a_n - a| + |b_n - b| < \epsilon/2 + \epsilon/2 = \epsilon \quad\text{for } n \ge n_0. \]
Thus, $a_n + b_n \to a + b$. Proofs of items (ii), (iii), (iv) use similar ideas. □

Remark 2.14. For item (iv), strictly speaking, we must require $a_n \neq 0$ for $1/a_n$ to make sense. However, since $a_n \to a$ and $a \neq 0$, from some point on, the $a_n$ are indeed nonzero (and convergence of a sequence is not affected if we change finitely many of its terms).

Lemma 2.15 (Sandwich lemma). If $a_n \le b_n \le c_n$, and $a_n \to a$ and $c_n \to a$, then $b_n \to a$.

Proof. Let $\epsilon > 0$. Since $a_n \to a$ and $c_n \to a$, there is $n_0 \in \mathbb{N}_+$ such that
\[ a - \epsilon < a_n < a + \epsilon \quad\text{and}\quad a - \epsilon < c_n < a + \epsilon \quad\text{for } n \ge n_0. \]
Since $a_n \le b_n \le c_n$,
\[ a - \epsilon < b_n < a + \epsilon \quad\text{for } n \ge n_0. \]
□

Example 2.16. Let us illustrate the sandwich lemma.
(1) Let $a_n = \frac{n^3 + 3n^2 + 2}{n^4 + 7n^2 + 5}$. Then $a_n \to 0$ since
\[ 0 \le a_n \le \frac{1}{n} + \frac{3}{n^2} + \frac{2}{n^4} \to 0. \]
(2) Let $a_n = \frac{1}{n} \sin(\frac{1}{n})$. Then $a_n \to 0$ since $-\frac{1}{n} \le a_n \le \frac{1}{n}$ and $\frac{1}{n} \to 0$.

2.1.8. Completeness property. We now give two sufficient conditions for a sequence to converge. This is a partial converse to Proposition 2.12.

Proposition 2.17. Let $\{a_n\}$ be a sequence of real numbers.
Then:
(i) If $\{a_n\}$ is increasing and bounded above, then $\{a_n\}$ is convergent.
(ii) If $\{a_n\}$ is decreasing and bounded below, then $\{a_n\}$ is convergent.

This result can be deduced using completeness property of $\mathbb{R}$. Since we have not discussed the latter, we take the above result for granted. Note: Items (i) and (ii) imply each other by replacing a sequence by its negative.

Example 2.18. Let us illustrate the completeness property.
(1) The sequence $a_n = 1/n$ is decreasing and bounded below by 0, hence it converges.
(2) Let $a_1 = 1$ and $a_n = \frac{3a_{n-1} + 2}{6} = \frac{1}{2} a_{n-1} + \frac{1}{3}$ for $n \ge 2$. This sequence is bounded below by 0. Is it decreasing? The first few values are $a_1 = 1$, $a_2 = 5/6$, $a_3 = 3/4$. Now
\[ a_n \le a_{n-1} \iff \tfrac{1}{2} a_{n-1} + \tfrac{1}{3} \le a_{n-1} \iff \tfrac{2}{3} \le a_{n-1} \quad\text{for } n \ge 2. \]
Note: $a_1 \ge 2/3$. If $a_{n-1} \ge 2/3$ for some $n \ge 2$, then $a_n \ge \frac{1}{2}(\frac{2}{3}) + \frac{1}{3} = \frac{2}{3}$. So by induction, $a_n \ge 2/3$ for $n \ge 1$. Hence $\{a_n\}$ is decreasing. By completeness property, $\{a_n\}$ converges (say to $a$).
To compute $a$, we may proceed as follows. In $a_n = \frac{1}{2} a_{n-1} + \frac{1}{3}$, the left-hand side goes to $a$ and the right-hand side goes to $\frac{1}{2} a + \frac{1}{3}$. So $a = \frac{1}{2} a + \frac{1}{3}$, and hence $a = 2/3$.

Exercise 2.19. Give an example of a sequence $\{a_n\}$ of real numbers which is strictly decreasing in absolute value, that is, $|a_n| > |a_{n+1}|$ for $n \ge 1$, but which does not converge.

Remark 2.20. Proposition 2.17 fails for $\mathbb{Q}$. For example, we can take the sequence of rational numbers $1, 1.4, 1.41, 1.414, \dots$ arising from the decimal expansion of $\sqrt{2}$. This sequence is increasing and bounded above by say the rational number 1.5. But it does not converge in $\mathbb{Q}$. What we are seeing here is the fact that the set of rational numbers $\mathbb{Q}$ is not complete.

2.1.9. Important limits. We mention a couple of important limits.

Lemma 2.21. Let $a \in \mathbb{R}$. Then:
(i) If $|a| < 1$, then $\lim_{n \to \infty} a^n = 0$.
(ii) If $a > 0$, then $\lim_{n \to \infty} a^{1/n} = 1$.

Proof. For item (i): The result is clear if $a = 0$. Let $0 < |a| < 1$. Then $\frac{1}{|a|} > 1$. Write $\frac{1}{|a|} = 1 + h$ for $h > 0$. Then
\[ \frac{1}{|a|^n} = (1 + h)^n = 1 + nh + \dots + h^n \ge 1 + nh \ge nh. \]
Therefore,
\[ 0 \le |a|^n \le \frac{1}{nh} \to 0. \]
Result follows by sandwich lemma.
For item (ii): The result is clear if $a = 1$. Let $a > 1$. Then $a^{1/n} > 1$. Write $a^{1/n} = 1 + h_n$ for $h_n > 0$. Now $a = (1 + h_n)^n \ge n h_n$. Therefore, $0 \le h_n \le \frac{a}{n}$. So $h_n \to 0$, and $a^{1/n} \to 1$. Finally, let $0 < a < 1$. Then $\frac{1}{a} > 1$. So by previous case, $(\frac{1}{a})^{1/n} \to 1$. Therefore, $a^{1/n} \to 1$. □

2.1.10. Convergence to infinity. Suppose a sequence $\{a_n\}$ diverges. Then it makes sense to ask whether $\{a_n\}$ is converging to $\infty$ or $-\infty$ as explained below. We emphasize again that $\pm\infty$ are not real numbers.

Definition 2.22. Let $\{a_n\}$ be a sequence of real numbers.
(i) We say $\{a_n\}$ converges to $\infty$ or $\lim_{n \to \infty} a_n = \infty$ or $a_n \to \infty$ if the following condition holds.
  For every $\alpha \in \mathbb{R}$, there is $n_0 \in \mathbb{N}_+$ such that $a_n > \alpha$ for $n \ge n_0$.
(ii) We say $\{a_n\}$ converges to $-\infty$ or $\lim_{n \to \infty} a_n = -\infty$ or $a_n \to -\infty$ if the following condition holds.
  For every $\beta \in \mathbb{R}$, there is $n_0 \in \mathbb{N}_+$ such that $a_n < \beta$ for $n \ge n_0$.

For example: The sequence $a_n = n^2 \to \infty$ and $a_n = -n^3 \to -\infty$. The sequence $a_n = (-1)^n n$ is unbounded but does not converge either to $\infty$ or to $-\infty$.

Remark 2.23 (Metric spaces). We have focussed on sequences of real numbers. More generally, a sequence can take values in any set $A$. However, to define convergence, one needs a notion of distance in $A$. Such a set $A$ is called a metric space. For $A = \mathbb{R}$, the distance is defined by $\mathrm{dist}(x, y) := |x - y|$, and convergence as in Definition 2.8. This example generalizes to $A = \mathbb{R}^m$. The case $m = 2$ is explained in Definition 6.5.

CHAPTER 3
Continuity

3.1. Continuity

The intuitive idea of a continuous function $f$ is that the graph of $f$ has no “breaks”. We now formalize this notion.

3.1.1. Continuous functions.

Definition 3.1 ($\epsilon$–$\delta$). Let $f : A \to \mathbb{R}$. We say $f$ is continuous at $c \in A$ if the following condition holds.
  For every $\epsilon > 0$, there is $\delta > 0$ such that $|x - c| < \delta \implies |f(x) - f(c)| < \epsilon$.
We say $f$ is continuous on $A$ if $f$ is continuous at each point of $A$.

Example 3.2.
Let us illustrate the notion of continuity.
(1) Let $f(x) = x$. Then $f$ is continuous at all $c \in \mathbb{R}$. Take $\delta = \epsilon$.
(2) Let $f(x) = 3x - 5$. Then $f$ is continuous at all $c \in \mathbb{R}$. Take $\delta = \epsilon/3$. Then $|x - c| < \delta$ implies
\[ |(3x - 5) - (3c - 5)| = 3|x - c| < \epsilon. \]
(3) Let $f(x) = [x]$. Then $f$ is continuous at non-integer points and discontinuous at integer points.
  • $c$ is a non-integer point. Pick $\delta > 0$ which avoids the adjacent integer points.
  • $c$ is an integer point. Give $\epsilon = 1/2$. No choice of $\delta$ works.
(4) Consider the Dirichlet function
\[ f : [0, 1] \to \mathbb{R}, \qquad f(x) = \begin{cases} 1 & \text{if } x \in \mathbb{Q}, \\ 0 & \text{if } x \in \mathbb{R} \setminus \mathbb{Q}. \end{cases} \]
It is discontinuous at all points. Give $\epsilon = 1/2$. No choice of $\delta$ works because in any open interval there is always a rational and an irrational by Lemma 1.3.

Exercise 3.3. Let $f : A \to \mathbb{R}$ be continuous at $c \in A$, and $f(c) > 0$. Then there is an open interval $I$ containing $c$ such that $f(x) > 0$ for all $x \in I \cap A$.

3.1.2. Algebra of continuous functions. One can add two functions, multiply two functions, scalar multiply a function (by a real number). These operations are compatible with the notion of continuity in the following sense.

Lemma 3.4. Suppose $f, g : A \to \mathbb{R}$ are continuous at $c \in A$. Then so are
(i) $f + g$,
(ii) $rf$ for $r \in \mathbb{R}$,
(iii) $fg$,
(iv) $1/f$ if $f(c) \neq 0$.

Proof. For item (i): Let $\epsilon > 0$. Since $f$ and $g$ are continuous at $c$, there is $\delta > 0$ such that
\[ |x - c| < \delta \implies |f(x) - f(c)| < \epsilon/2 \ \text{ and } \ |g(x) - g(c)| < \epsilon/2. \]
Now using triangle inequality,
\[ |(f + g)(x) - (f + g)(c)| \le |f(x) - f(c)| + |g(x) - g(c)| < \epsilon/2 + \epsilon/2 = \epsilon. \]
Proofs of items (ii), (iii), (iv) use similar ideas. For item (iv): It suffices to prove that the function $1/x$ is continuous, and use Lemma 3.5 below. □

Lemma 3.5. Let $f : A \to B$ and $g : B \to \mathbb{R}$. If $f$ is continuous at $c \in A$ and $g$ is continuous at $f(c) \in B$, then the composite $g \circ f$ is continuous at $c \in A$.

Proof idea. Given $\epsilon > 0$, pick $\delta' > 0$ using continuity of $g$ at $f(c)$.
Now taking $\delta' > 0$ as the $\epsilon$, pick the required $\delta > 0$ using continuity of $f$ at $c$. □

As a consequence:
• polynomials in $x$ such as $p(x) = x^2$, $p(x) = 2x^3 - 3x + 1$ are continuous.
• a rational function in $x$, that is, $r(x) = p(x)/q(x)$ where $p$ and $q$ are polynomials, is continuous at $c \in \mathbb{R}$ if $q(c) \neq 0$.
• a function such as $x^3 \sin|x| + \cos x^2$ is continuous.

Example 3.6. Define $f : \mathbb{R} \to \mathbb{R}$ by
\[ f(x) = \begin{cases} x \sin(1/x) & \text{if } x \neq 0, \\ 0 & \text{if } x = 0. \end{cases} \]
Then $f$ is continuous at $c \neq 0$ since it is formed out of continuous functions. Let us see what happens at $c = 0$. Given $\epsilon > 0$, let $\delta = \epsilon$. Then
\[ |x - 0| < \delta \implies |f(x) - f(0)| \le |x| < \delta = \epsilon. \]
Hence $f$ is continuous at 0.

Exercise 3.7. Define $f$ as above but with $x \sin(1/x)$ replaced by $\sin(1/x)$. Show: $f$ is not continuous at 0.

3.1.3. Characterization using sequences. We now characterize continuity of a function using sequences. This forges a connection between Definition 3.1 and Definition 2.8.

Proposition 3.8. Let $f : A \to \mathbb{R}$. Then $f$ is continuous at $c \in A$ iff the following condition holds.
  For any sequence $\{x_n\}$ in $A$ with $x_n \to c$, we have $f(x_n) \to f(c)$.

Proof. Suppose $f$ is continuous at $c \in A$, and $x_n \to c$. We want to show $f(x_n) \to f(c)$. Let $\epsilon > 0$. Continuity of $f$ at $c$ yields a $\delta$. Using this $\delta$, we find an $n_0$ for $x_n \to c$. Thus for $n \ge n_0$, we have $|x_n - c| < \delta$, and hence $|f(x_n) - f(c)| < \epsilon$ as required.
Conversely, suppose the condition holds. We prove $f$ is continuous at $c$ by contradiction. So suppose $f$ is not continuous at $c$. Then there is $\epsilon > 0$ for which no $\delta$ works. In particular, $\delta = 1/n$ does not work, so for each $n \ge 1$ there is $x_n \in A$ with $|x_n - c| < 1/n$ and $|f(x_n) - f(c)| \ge \epsilon$. This gives a sequence $x_n \to c$ for which $f(x_n) \not\to f(c)$. This is a contradiction. □

Example 3.9. Let us use Proposition 3.8 to show that certain functions are not continuous at a point.
(1) Consider the integer part function $f(x) = [x]$. At $c = 5$, $f(c) = 5$. Let $x_n = 5 - \frac{1}{n}$. Then $x_n \to 5$, but $[x_n] = 4$ and so $[x_n] \not\to 5$. Thus, $f$ is not continuous at $c = 5$.
(2) Define
\[ f(x) = \begin{cases} \sin(1/x) & \text{if } x \neq 0, \\ r & \text{if } x = 0. \end{cases} \]
Then $f$ is continuous at $c \neq 0$ since it is formed out of continuous functions. Let us see what happens at $c = 0$. Let $x_n = \frac{2}{(2n+1)\pi}$. Then $x_n \to 0$, but $f(x_n) = \sin(\frac{(2n+1)\pi}{2}) = (-1)^n$ does not converge. So $f$ is not continuous at $c = 0$, no matter what $r$ is.

3.1.4. Further properties of continuous functions.

Theorem 3.10 (Intermediate value property). Let $I$ be an interval, and $f : I \to \mathbb{R}$ be a continuous function. Let $r \in \mathbb{R}$ be such that $f(x_1) < r < f(x_2)$ for some $x_1 < x_2$ in $I$. Then there is $x \in (x_1, x_2)$ such that $f(x) = r$.

The proof uses completeness property of $\mathbb{R}$, and is omitted.

Example 3.11. Let us show that the function $f(x) = x^4 + 2x^3 - 2$ has a root in $(0, 1)$. Its graph is shown below. The red point is $x = 1$.

[Figure: graph of $f(x) = x^4 + 2x^3 - 2$ with the point $x = 1$ marked in red.]

Since $f$ is a polynomial, it is continuous. Now $f(0) = -2$ and $f(1) = 1$. So by IVP, $f$ attains every value between $-2$ and $1$ in the interval $(0, 1)$, and in particular, the value 0.

Corollary 3.12. Let $f : A \to \mathbb{R}$ be a continuous function, and $I \subseteq A$ be an interval. Then $f(I)$ is an interval.

Exercise 3.13. Is there a continuous function from $[0, 1]$ onto $[2, 3]$? onto $[2, 3] \cup [4, 5]$? onto $(0, \infty)$? onto $[-1, 1]$?

Corollary 3.14. Let $f : I \to \mathbb{R}$ be continuous and injective. Then $f$ is either increasing or decreasing. Also, $f^{-1} : f(I) \to \mathbb{R}$ is continuous.

Proof. Exercise. □

Let us use the above result to deduce the existence of the square root function
\[ g : [0, \infty) \to [0, \infty), \quad g(x) = \sqrt{x}. \]
Take $f : [0, \infty) \to [0, \infty)$ with $f(x) = x^2$. This function is continuous and injective. Also $f([0, \infty)) = [0, \infty)$. Put $g = f^{-1}$. The graph of $f$ on $(0, 2)$ and of $g$ on $(0, 4)$ are shown below.

[Figure: graphs of $f(x) = x^2$ on $(0, 2)$ and $g(x) = \sqrt{x}$ on $(0, 4)$.]

Theorem 3.15. Let $f : [a, b] \to \mathbb{R}$ be continuous. Then $f$ is bounded on $[a, b]$ and attains its global maximum and global minimum on $[a, b]$. Further, $f([a, b])$ is a closed and bounded interval.

The proof is omitted.

Example 3.16. Let us see what can go wrong if the domain is an interval but not a closed interval.
(1) Take $f : (0, 1) \to \mathbb{R}$ with $f(x) = \frac{1}{x}$. Then $f$ is continuous but not bounded.
(2) Take $f : [0, \infty) \to \mathbb{R}$ with $f(x) = x$. Then $f$ is continuous but not bounded.
(3) Take $f : (0, 1) \to \mathbb{R}$ with $f(x) = x$. Then $f$ is continuous and bounded, but does not attain its global maximum or global minimum.

Exercise 3.17. Construct a continuous function $f : \mathbb{R} \to \mathbb{R}$ such that $f$ takes every value exactly three times.

Exercise 3.18. Define the function $f : \mathbb{R} \to \mathbb{R}$ by
\[ f(x) = \begin{cases} 0 & \text{if } x \text{ is irrational}, \\ 1/q & \text{if } x = p/q \text{ in lowest terms}. \end{cases} \]
Show: $f$ is continuous at all irrational points, but discontinuous at all rational points.

Puzzle 3.19. A pilgrim wants to go to a temple on the top of a mountain. He starts from the bottom at 8 in the morning, and reaches the top at 12. He stays there for a week. While coming down, he again starts at 8 in the morning, and reaches the bottom at 11. Show that there is a time between 8 and 11 when the pilgrim was at the same point on the mountain while ascending and descending.

3.2. Limit of a function

3.2.1. Limit of a function. Let $f : A \to \mathbb{R}$ and $c \in \mathbb{R}$ be such that there is $r > 0$ with $(c - r, c) \cup (c, c + r) \subseteq A$. In other words, $A$ contains all points within distance $r$ of $c$, except perhaps the point $c$.

Definition 3.20. We say $\lim_{x \to c} f(x)$ exists if there is $\ell \in \mathbb{R}$ such that for every sequence $\{x_n\}$ in $A$ with $x_n \neq c$ and $x_n \to c$, we have $f(x_n) \to \ell$. In this case, we write
\[ \ell = \lim_{x \to c} f(x), \]
and say $f$ has a limit at $c$.

Example 3.21. Let us illustrate the notion of limit.
(1) Define $f : \mathbb{R} \to \mathbb{R}$ by
\[ f(x) = \begin{cases} 3x + 5 & \text{if } x \neq 0, \\ 1 & \text{if } x = 0. \end{cases} \]
Let $x_n \to 0$, $x_n \neq 0$ for $n \ge 1$. Then $f(x_n) = 3x_n + 5 \to 5$. Hence $\lim_{x \to 0} f(x) = 5$.
(2) Let $f(x) = [x]$.
  Let $x_n = 5 + (1/n)$, so $x_n \to 5$. Also $f(x_n) = 5$, so $f(x_n) \to 5$.
  Let $x_n = 5 - (1/n)$, so $x_n \to 5$. Also $f(x_n) = 4$, so $f(x_n) \to 4$.
Thus $\lim_{x \to 5} f(x)$ does not exist.
(3) Let $f(x) = \sin(1/x)$ for $x \in \mathbb{R} \setminus \{0\}$.
Let $x_n = \frac{2}{(2n+1)\pi}$, so $x_n \to 0$, but $f(x_n) = \sin(\frac{(2n+1)\pi}{2}) = (-1)^n$ does not converge. Thus $\lim_{x \to 0} f(x)$ does not exist.

Remark 3.22 ($\epsilon$–$\delta$). Equivalently, similar to Definition 3.1 for continuity, we say
\[ \lim_{x \to c} f(x) = \ell \]
if the following condition holds.
  For every $\epsilon > 0$, there is $\delta > 0$ such that $0 < |x - c| < \delta \implies |f(x) - \ell| < \epsilon$.
It is possible to take this as a definition, and deduce Definition 3.20 as a consequence.

3.2.2. Algebra of limits of functions. The operations of addition, multiplication, scalar multiplication on functions are compatible with the notion of taking limits in the following sense.

Lemma 3.23 (Limit theorems). Suppose $\lim_{x \to c} f(x)$ and $\lim_{x \to c} g(x)$ exist. Then
(i) $\lim_{x \to c} (f + g)(x) = \lim_{x \to c} f(x) + \lim_{x \to c} g(x)$,
(ii) $\lim_{x \to c} r f(x) = r \lim_{x \to c} f(x)$ for $r \in \mathbb{R}$,
(iii) $\lim_{x \to c} (fg)(x) = (\lim_{x \to c} f(x))(\lim_{x \to c} g(x))$,
(iv) $\lim_{x \to c} \frac{1}{f}(x) = \frac{1}{\lim_{x \to c} f(x)}$ (if denominator $\neq 0$).

Proof. Follows from Lemma 2.13 for sequences. □

Lemma 3.24 (Sandwich lemma). If $f(x) \le g(x) \le h(x)$, and $\lim_{x \to c} f(x) = \ell$ and $\lim_{x \to c} h(x) = \ell$, then $\lim_{x \to c} g(x) = \ell$.

Proof. Follows from sandwich Lemma 2.15 for sequences. □

3.2.3. Continuity and limit. We say $c \in \mathbb{R}$ is an interior point of $A \subseteq \mathbb{R}$ if there is $r > 0$ such that $(c - r, c + r) \subseteq A$.

Proposition 3.25. Let $f : A \to \mathbb{R}$, and $c$ be an interior point of $A$. Then $f$ is continuous at $c$ iff $\lim_{x \to c} f(x)$ exists and is equal to $f(c)$.

Proof idea. We use characterization of continuity given by Proposition 3.8. Forward implication is straightforward. For backward implication: Let $x_n \to c$. Break $\{x_n\}$ into two subsequences: One contains terms not equal to $c$, and other contains terms equal to $c$. Both subsequences, after applying $f$, converge to $f(c)$. Hence, $f(x_n) \to f(c)$, as required. (Ignore either of the two subsequences if it is finite.) □

3.2.4. Left and right limits.

Definition 3.26. We build on Definition 3.20.
(i) We say lim_{x→c−} f(x) exists if there is ℓ ∈ R such that for every sequence {x_n} in A with x_n < c and x_n → c, we have f(x_n) → ℓ. In this case, we say f has a left limit at c.
(ii) We say lim_{x→c+} f(x) exists if there is ℓ ∈ R such that for every sequence {x_n} in A with x_n > c and x_n → c, we have f(x_n) → ℓ. In this case, we say f has a right limit at c.

Proposition 3.27. We have: f has a limit at c iff f has a left limit and a right limit at c, and they are equal.

3.2.5. Types of discontinuities. Suppose f : A → R is discontinuous at an interior point c ∈ A. Then one of the following happens.
• lim_{x→c} f(x) does not exist.
  – Either the left limit or the right limit of f(x) at c does not exist (essential discontinuity).
  – The left and right limits of f(x) at c exist, but are not equal (jump discontinuity).
• lim_{x→c} f(x) exists, but is not equal to f(c) (removable discontinuity).

3.2.6. Convergence to and at infinity of a function. We mention that it is possible to make sense of the limits
lim_{x→∞} f(x) = ℓ, lim_{x→−∞} f(x) = ℓ,
and also of
lim_{x→c} f(x) = ∞, lim_{x→c} f(x) = −∞.
The latter two can also be applied to left and right limits. For example,
lim_{x→∞} 1/x = 0, lim_{x→−∞} 1/x = 0, lim_{x→0+} 1/x = ∞, lim_{x→0−} 1/x = −∞.

Remark 3.28 (Metric spaces). We build on Remark 2.23. Let X and Y be metric spaces. It makes sense to define a continuous function f : X → Y as in Definition 3.1, with |x − c| replaced by dist(x, c) (distance in X), and |f(x) − f(c)| replaced by dist(f(x), f(c)) (distance in Y). For the example of f : R^2 → R, see Definition 6.7.
An even more general context for continuous functions is that of topological spaces (in which there is a qualitative rather than quantitative notion of what it means for two points to be close to each other). For more details, see Munkres [12, Chapter 2].

CHAPTER 4 Differentiability

4.1.
Differentiability

The intuitive idea of a differentiable function f is that the graph of f has tangents which are not vertical (that is, of finite slope). See illustration below. We now formalize this notion.

[Figure: graph of f(x), with axes x and y.]

4.1.1. Differentiable functions. Let A ⊆ R, and c be an interior point of A.

Definition 4.1. A function f : A → R is differentiable at c if the limit
lim_{h→0} (f(c + h) − f(c))/h
exists. We denote it by f′(c), and call it the derivative of f at c.
Equivalently, a function f : A → R is differentiable at c if there is a real number α such that
(4.1) lim_{h→0} (f(c + h) − f(c) − αh)/h = 0.
In this case, we say that α is the derivative of f at c.
Note: One may also replace h by |h| in the denominator in (4.1).

Example 4.2. Let us illustrate the notion of differentiability.
(1) Let f : R → R be a constant function. Then f is differentiable and f′(c) = 0 for all c ∈ R.
(2) Let f_1, f_2, f_3 : R → R be
f_1(x) = x, f_2(x) = x^2, f_3(x) = x^(2/3).
Their graphs are shown below.
[Figure: graphs of f_1, f_2 and f_3.]
We have:
f_1 is differentiable and f_1′(c) = 1 for all c ∈ R.
f_2 is differentiable and f_2′(c) = 2c for all c ∈ R.
f_3 is differentiable at c ≠ 0, but it is not differentiable at 0 since
(f_3(0 + h) − f_3(0))/h = 1/h^(1/3),
whose limit does not exist as h → 0.
(3) Let f(0) = 0 and f(x) = x sin(1/x) for x ∈ R \ {0}. Then f is not differentiable at 0 since
(f(0 + h) − f(0))/h = sin(1/h),
whose limit does not exist as h → 0.

4.1.2. Left and right derivatives. Let f : A → R.
(i) Suppose c ∈ A is such that [c, c + r) ⊆ A for some r > 0. If the limit
lim_{h→0+} (f(c + h) − f(c))/h
exists, then we call it the right derivative of f at c, and denote it by f′_+(c).
(ii) Suppose c ∈ A is such that (c − r, c] ⊆ A for some r > 0. If the limit
lim_{h→0−} (f(c + h) − f(c))/h
exists, then we call it the left derivative of f at c, and denote it by f′_−(c).

Lemma 4.3.
If c is an interior point of A, then f : A → R is differentiable at c iff f′_+(c) and f′_−(c) both exist and are equal.

Example 4.4. Let f(x) = |x|. Then f′_−(0) = −1 and f′_+(0) = 1. Hence f is not differentiable at 0.

4.1.3. Derivative function. Let us now focus on the case when the domain of f is an interval I.

We say f : (a, b) → R is differentiable on (a, b) if f is differentiable at every c ∈ (a, b). In this case, define
f′ : (a, b) → R, c ↦ f′(c).
We call f′ the derivative of f. We make a similar definition when the domain of f is (a, ∞), (−∞, b), R.

We say f : [a, b] → R is differentiable on [a, b] if f is differentiable on (a, b), and f′_+(a) and f′_−(b) exist. In this case, define
f′ : [a, b] → R, a ↦ f′_+(a), c ↦ f′(c), b ↦ f′_−(b)
for c ∈ (a, b). We make a similar definition when the domain of f is [a, b), (a, b], [a, ∞), (−∞, b].

4.1.4. Increment function.

Lemma 4.5 (Caratheodory lemma). A function f : A → R is differentiable at an interior point c of A iff there is a function f_1 : A → R which is continuous at c such that
f(x) − f(c) = (x − c) f_1(x)
for x ∈ A. Moreover, f′(c) = f_1(c).

We call f_1 : A → R the increment function. Note very carefully: f_1 depends on the point c.

Proof. We make use of Proposition 6.18.
Forward implication. Let f be differentiable at c. Define
f_1(x) := (f(x) − f(c))/(x − c) if x ∈ A \ {c},
f_1(x) := f′(c) if x = c.
Then f_1 is continuous at c since lim_{x→c} f_1(x) = f′(c) = f_1(c).
Backward implication. Let f_1 be as stated. Then
lim_{h→0} (f(c + h) − f(c))/h = lim_{h→0} f_1(c + h) = lim_{x→c} f_1(x) = f_1(c)
since f_1 is continuous at c. Hence f is differentiable at c. □

In other words, the increment function f_1 keeps track of slopes of all secants drawn from (c, f(c)). More precisely, f_1(x) is the slope of the line segment joining (c, f(c)) to (x, f(x)) for x ≠ c, and f_1(c) is the slope of the tangent line at (c, f(c)).

[Figure: graph with axes x and y; the point c is marked on the x-axis.]

Corollary 4.6.
If f is differentiable at c, then f is continuous at c.

Proof. Let f be differentiable at c. Using Caratheodory Lemma 4.5, write
f(x) = f(c) + (x − c) f_1(x).
Since f_1 is continuous at c, so is f by Lemma 3.4. Alternatively,
lim_{x→c} f(x) = lim_{x→c} [f(c) + (x − c) f_1(x)] = f(c)
by Lemma 3.23. Now use Proposition 6.18. □

Remark 4.7. If f is not continuous at c, then it is not differentiable at c. For example: The function f(x) = [x] is not continuous at 5, hence it is not differentiable at 5.
The converse of Corollary 4.6 is false. For example: The function f(x) = |x| is continuous at 0, but it is not differentiable at 0.

Remark 4.8. Here is an alternative way to phrase the Caratheodory lemma. A function f : A → R is differentiable at an interior point c of A iff there is a real number α such that
f(c + h) = f(c) + α h + ϵ(h) h,
where ϵ(h) is defined for small h, and ϵ(h) → 0 as h → 0. Moreover, f′(c) = α.

4.1.5. Algebra of differentiable functions. The operations of addition, multiplication, scalar multiplication on functions are compatible with the notion of differentiability in the following sense.

Lemma 4.9. Suppose f, g : A → R are differentiable at c ∈ A. Then
(i) f + g is differentiable at c, and (f + g)′(c) = f′(c) + g′(c),
(ii) rf is differentiable at c, and (rf)′(c) = r f′(c) for r ∈ R,
(iii) fg is differentiable at c, and (fg)′(c) = f′(c) g(c) + f(c) g′(c),
(iv) 1/f is differentiable at c, and (1/f)′(c) = −f′(c)/f(c)^2 if f(c) ≠ 0.

Proof. For item (i): Write
f(x) = f(c) + (x − c) f_1(x) and g(x) = g(c) + (x − c) g_1(x).
Then
f(x) + g(x) = f(c) + g(c) + (x − c)[f_1(x) + g_1(x)].
Thus,
(f + g)(x) = (f + g)(c) + (x − c)(f_1 + g_1)(x).
Since f_1 and g_1 are both continuous at c, so is f_1 + g_1. It serves as the increment function for f + g at the point c. Thus by Caratheodory Lemma 4.5, f + g is differentiable at c. Moreover,
(f + g)′(c) = (f + g)_1(c) = (f_1 + g_1)(c) = f_1(c) + g_1(c) = f′(c) + g′(c).
Proofs of items (ii), (iii), (iv) use similar ideas. □

Lemma 4.10 (Chain rule). Let f : A → B and g : B → R. Let c be an interior point of A, and f(c) be an interior point of B. If f is differentiable at c, and g is differentiable at f(c), then the composite g ◦ f : A → R is differentiable at c, and
(g ◦ f)′(c) = g′(f(c)) f′(c).

Proof. Exercise. □

Example 4.11. Let φ(x) = (4x^3 + 3)^7 + 2. Define f(x) = 4x^3 + 3 and g(y) = y^7 + 2. Then φ = g ◦ f. Hence,
φ′(c) = g′(f(c)) f′(c) = 7(4c^3 + 3)^6 (12c^2).

Lemma 4.12. Let f : (a, b) → (p, q) be continuous, and a bijection. Let f^(−1) : (p, q) → (a, b) be the inverse function. Let f be differentiable at c ∈ (a, b), and f′(c) ≠ 0. Then f^(−1) is differentiable at f(c) ∈ (p, q), and
(f^(−1))′(f(c)) = 1/f′(c).

Proof. Put g = f^(−1). Then φ = g ◦ f is the identity function on (a, b). By the chain rule,
1 = φ′(c) = g′(f(c)) f′(c).
Therefore, g′(f(c)) = 1/f′(c). □

Draw a picture.

Example 4.13. Let us illustrate Lemma 4.12.
(1) Let
f : (−π/2, π/2) → (−1, 1), f(x) = sin(x).
Then f is continuous and a bijection. Its inverse function is denoted sin^(−1). Put f(c) = d. Thus,
(sin^(−1))′(d) = (f^(−1))′(d) = 1/f′(c) = 1/cos(c) = 1/√(1 − sin^2 c) = 1/√(1 − d^2).
(2) Fix a natural number n ≥ 1. Let
f : (0, ∞) → (0, ∞), f(x) = x^n.
Then f is continuous and a bijection. Put f(c) = d. Thus,
(f^(−1))′(d) = 1/f′(c) = 1/(n c^(n−1)) = 1/(n d^((n−1)/n)) = (1/n) d^((1/n)−1).

Remark 4.14. The derivative of a trigonometric function is again a trigonometric function. However, the derivative of an inverse trigonometric function is algebraic, involving rational functions and square roots. This is because the relations among different trigonometric functions are algebraic, and usually quadratic. For instance, in the above calculation of the derivative of sin^(−1), we used the quadratic relation sin^2 θ + cos^2 θ = 1.

4.2.
Maxima and minima

The derivative provides an effective tool to solve maxima and minima (optimization) problems. Conversely, one can use these ideas to prove results about the derivative such as the mean value theorem. This establishes a clear connection between the sign of the first derivative and increasing/decreasing functions. Going one step further, there is a connection between the sign of the second derivative and convex/concave functions.

4.2.1. Global and local maxima/minima. Let f : A → R be a function.

Definition 4.15. We say:
(i) f has a global maximum at c if f(x) ≤ f(c) for all x ∈ A. In this case, f(c) is the least upper bound of f, and it is attained at c.
(ii) f has a global minimum at c if f(x) ≥ f(c) for all x ∈ A. In this case, f(c) is the greatest lower bound of f, and it is attained at c.

Definition 4.16. We say:
(i) f has a local maximum at c if there is δ > 0 such that, for x ∈ A, |x − c| < δ implies f(x) ≤ f(c).
(ii) f has a local minimum at c if there is δ > 0 such that, for x ∈ A, |x − c| < δ implies f(x) ≥ f(c).

Note: Global maximum (minimum) implies local maximum (minimum), but the converse is false.
Note: A constant function has both a global maximum and a global minimum at all points.

We say f has a global (local) extremum at c if it has either a global (local) maximum at c, or a global (local) minimum at c.

4.2.2. Local maxima/minima: necessary condition.

Lemma 4.17. Let c be an interior point of A. If f : A → R is differentiable at c, and has either a local maximum or a local minimum at c, then f′(c) = 0.

See illustrations below.

Proof. Suppose f has a local minimum at c. Thus, for small h, f(c + h) − f(c) ≥ 0.
h > 0: (f(c + h) − f(c))/h ≥ 0. Hence, f′_+(c) ≥ 0.
h < 0: (f(c + h) − f(c))/h ≤ 0. Hence, f′_−(c) ≤ 0.
Since f is differentiable at c, f′(c) = f′_+(c) = f′_−(c), and so f′(c) = 0.
Alternatively, let f_1 be the increment function of f at c (Lemma 4.5). Then f_1(c + h) ≥ 0 for h > 0, and f_1(c + h) ≤ 0 for h < 0. And f_1 is continuous at c, so f_1(c) = 0.
We can make a similar argument when f has a local maximum at c. □

Remark 4.18. We make some remarks related to the above result.
(1) Let f : [−1, 1] → R with f(x) = x^2. Then f has a local minimum at the interior point 0, and indeed f′(0) = 0 as claimed by Lemma 4.17.
(2) Let f : [0, 1] → R with f(x) = x. Then f has a local minimum at 0 and a local maximum at 1. But f′_+(0) ≠ 0 and f′_−(1) ≠ 0. This does not contradict Lemma 4.17 since 0 and 1 are not interior points.
(3) Let f : [−1, 1] → R with f(x) = x^3. Then f′(0) = 0, but f does not have a local maximum or a local minimum at 0. Thus, the converse to Lemma 4.17 is false.

4.2.3. Rolle's theorem and mean value theorem. We now discuss Rolle's theorem and the mean value theorem. The former is a special case of the latter. The latter is attributed to Lagrange.

Theorem 4.19 (Rolle's theorem). Let f : [a, b] → R be such that
(i) f is continuous on [a, b],
(ii) f is differentiable on (a, b),
(iii) f(a) = f(b).
Then there is c ∈ (a, b) such that f′(c) = 0.

See illustration below.

[Figure: graph of f on [a, b] with f(a) = f(b).]

Proof. We consider two cases.
• f is constant. Then f′(c) = 0 for all c ∈ (a, b).
• f is not constant. Then the global minimum of f is strictly smaller than the global maximum of f. Since f is continuous, by Theorem 3.15, both are attained on [a, b]. They cannot both be attained at the endpoints a and b, since f(a) = f(b). Hence, there is c ∈ (a, b) such that f has either a global maximum or a global minimum at c. Global maximum/minimum implies local maximum/minimum, so by Lemma 4.17, f′(c) = 0. □

Example 4.20. Let us return to Example 3.11. We saw by IVP that the function f(x) = x^4 + 2x^3 − 2 has a root in (0, 1). Now let us show that f(x) = x^4 + 2x^3 − 2 has exactly one root in (0, 1). Suppose there are two roots in (0, 1). Say f(a) = 0 = f(b) for 0 < a < b < 1. Then by Rolle's theorem, f′(c) = 0 for some c ∈ (a, b). Now
f′(x) = 4x^3 + 6x^2 = 2x^2 (2x + 3) ≠ 0
for x ∈ (0, 1). This is a contradiction.

[Figure: graphs of f and f′. The red point is x = 1.]
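The argument of Example 4.20 can be checked numerically. The following is a minimal sketch, not part of the original notes: the helper name `bisect_root`, the tolerance, and the sampling grid are illustrative assumptions. It locates the root by bisection (which relies on the sign change given by IVP) and samples f′ on (0, 1) to confirm that it stays positive there, which is what rules out a second root via Rolle's theorem.

```python
def f(x):
    # The quartic of Example 4.20
    return x**4 + 2 * x**3 - 2

def f_prime(x):
    # Its derivative, 4x^3 + 6x^2 = 2x^2 (2x + 3)
    return 4 * x**3 + 6 * x**2

def bisect_root(func, lo, hi, tol=1e-12):
    """Hypothetical helper: bisection on [lo, hi].
    Assumes func is continuous and changes sign between lo and hi."""
    f_lo = func(lo)
    if f_lo * func(hi) > 0:
        raise ValueError("no sign change on [lo, hi]")
    while hi - lo > tol:
        mid = (lo + hi) / 2
        f_mid = func(mid)
        if f_lo * f_mid <= 0:
            hi = mid                 # a root lies in [lo, mid]
        else:
            lo, f_lo = mid, f_mid    # a root lies in [mid, hi]
    return (lo + hi) / 2

root = bisect_root(f, 0.0, 1.0)

# Sample f' at interior grid points of (0, 1); a finite sample is only
# evidence, the proof is the factorization 2x^2 (2x + 3) > 0 for x > 0.
derivative_positive = all(f_prime(k / 1000) > 0 for k in range(1, 1000))

print(root, derivative_positive)
```

The grid check illustrates the claim but does not replace the algebraic argument in the example.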
Theorem 4.21 (Mean value theorem). Let f : [a, b] → R be such that
(i) f is continuous on [a, b],
(ii) f is differentiable on (a, b).
Then there is c ∈ (a, b) such that
f′(c) = (f(b) − f(a))/(b − a).

See illustration below.

[Figure: graph of f on [a, b] with f(a), f(b) marked on the y-axis and a, c, b marked on the x-axis.]

Proof. For x ∈ [a, b], define
F(x) := f(x) − ((f(b) − f(a))/(b − a)) (x − a).
Then F is continuous on [a, b], differentiable on (a, b) and F(a) = f(a) = F(b). By Rolle's theorem, there is c ∈ (a, b) such that F′(c) = 0, that is, f′(c) = (f(b) − f(a))/(b − a). □

Remark 4.22 (Physical interpretation). Let f(t) denote the displacement of a particle at time t for a ≤ t ≤ b. Then the average velocity is (f(b) − f(a))/(b − a), and the velocity at time c is f′(c). Thus, MVT says that there is a time c such that the velocity at time c equals the average velocity.

Remark 4.23. Note very carefully:
Rolle's theorem and the mean value theorem are results about the derivative, and make no direct reference to the notions of minima and maxima. Then why are they in this section, and not in Section 4.1? The reason is that the proof of Rolle's theorem uses a result about minima and maxima.
Rolle's theorem is a corollary of the mean value theorem obtained by imposing the additional hypothesis f(a) = f(b). Then why is it stated earlier rather than later? The reason is that Rolle's theorem is used in the proof of the mean value theorem.

4.2.4. Mean value inequality.

Lemma 4.24. Let f : [a, b] → R be such that f is continuous on [a, b], and differentiable on (a, b). If m ≤ f′(x) ≤ M for all x ∈ (a, b), then
m(b − a) ≤ f(b) − f(a) ≤ M(b − a).
This is the mean value inequality.

Proof. This follows from Theorem 4.21 (MVT). □

Example 4.25. Fix a positive natural number n. Define f : [n, n + 1] → R by f(x) = √x. Then f′(x) = 1/(2√x). Moreover,
1/(2√(n + 1)) ≤ f′(x) ≤ 1/(2√n).
Therefore, by the mean value inequality,
(1/(2√(n + 1))) (n + 1 − n) ≤ √(n + 1) − √n ≤ (1/(2√n)) (n + 1 − n).
For n = 1, we get 1/(2√2) ≤ √2 − 1 ≤ 1/2.
Therefore, √2 < 3/2. To get a lower bound, we use 1/√2 > 2/3. So (1/2)(2/3) < √2 − 1, which yields 4/3 < √2. Thus,
4/3 < √2 < 3/2.

4.2.5. Increasing and decreasing functions.

Lemma 4.26. Let f : [a, b] → R be such that f is continuous on [a, b], and differentiable on (a, b).
(1) If f′(x) = 0 for x ∈ (a, b), then f is constant on [a, b]. (Converse true.)
(2) (i) If f′(x) ≥ 0 for x ∈ (a, b), then f is increasing on [a, b]. (Converse true.)
    (ii) If f′(x) ≤ 0 for x ∈ (a, b), then f is decreasing on [a, b]. (Converse true.)
    (iii) If f′(x) > 0 for x ∈ (a, b), then f is strictly increasing on [a, b]. (Converse false.)
    (iv) If f′(x) < 0 for x ∈ (a, b), then f is strictly decreasing on [a, b]. (Converse false.)

Proof. These can be deduced from Theorem 4.21 (MVT). □

Example 4.27. Define f : R → R by f(x) = x(1 − x). Its graph is shown below.

[Figure: graph of f(x) = x(1 − x).]

Then f′(x) = 1 − 2x. Thus, f′(x) > 0 if x < 1/2, and f′(x) < 0 if x > 1/2. So, f is strictly increasing on (−∞, 1/2), and strictly decreasing on (1/2, ∞).

4.2.6. Convex functions. Recall convex functions from Section 1.2.10. We now relate them to differentiability.

Lemma 4.28. Let I be an interval and f : I → R be differentiable. Then
(i) f′ is increasing on I iff f is convex on I.
(ii) f′ is decreasing on I iff f is concave on I.
(iii) f′ is strictly increasing on I iff f is strictly convex on I.
(iv) f′ is strictly decreasing on I iff f is strictly concave on I.

Proof. See [8, Proposition 4.31]. Note: Items (i) and (ii) imply each other, while items (iii) and (iv) imply each other. □

An illustration of item (i) is shown below.

[Figure: graph of a convex function with tangent lines.]

Note how the slopes of the tangents increase as we move from left to right.

Corollary 4.29. Let I be an interval and f : I → R be twice differentiable. Then
(i) f′′ ≥ 0 on I iff f is convex on I.
(ii) f′′ ≤ 0 on I iff f is concave on I.
(iii) If f′′ > 0 on I, then f is strictly convex on I.
(iv) If f′′ < 0 on I, then f is strictly concave on I.

Example 4.30. This result gives a test for convexity as illustrated below.
(1) The function f(x) = x^2 is strictly convex since f′′(x) = 2 > 0 at all points. Its graph is shown below on the left.
The function f(x) = x^4 is convex since f′′(x) = 12x^2 ≥ 0. Its graph is shown below on the right. In fact, it is strictly convex, even though the second derivative is not strictly positive at all points.
[Figure: graphs of x^2 (left) and x^4 (right).]
(2) The exponential function f(x) = e^x is strictly convex since f′′(x) = e^x > 0. Its graph is shown below on the left.
The logarithm function f(x) = log x is strictly concave since f′′(x) = −1/x^2 < 0. Its graph is shown below on the right.
[Figure: graphs of e^x (left) and log x (right).]

4.2.7. Critical points and global maxima/minima. Let f : A → R. An interior point c of A is a critical point of f if either f is not differentiable at c, or if f is differentiable at c and f′(c) = 0.

Lemma 4.31. Let f : [a, b] → R be continuous. Then the global minimum and global maximum of f are attained at points which are either critical points of f or endpoints of [a, b].

Proof. Suppose f(c) is a global maximum. We consider two cases.
• c = a or c = b. Then c is an endpoint of [a, b].
• c ∈ (a, b). We consider two subcases.
  – f is not differentiable at c. Then c is a critical point of f.
  – f is differentiable at c. Since f has a global maximum at c, it has a local maximum at c. Hence f′(c) = 0 by Lemma 4.17, and c is a critical point of f.
The argument for a global minimum is similar. □

Thus, to find the global maximum and global minimum of f, we first find the critical points of f. Then we evaluate f at these critical points, and at the endpoints of [a, b]. Among these values, the largest is the global maximum, and the smallest is the global minimum.

Example 4.32. Let f : [−1, 2] → R be defined by
f(x) = −x if −1 ≤ x ≤ 0,
f(x) = 2x^3 − 4x^2 + 2x if 0 ≤ x ≤ 2.
It is continuous everywhere. Its graph is shown below.
[Figure: graph of f on [−1, 2], with −1, 1/3, 1, 2 marked on the x-axis and 1 on the y-axis.]

Observe
f′(x) = −1 if −1 ≤ x < 0,
f′(x) = 2(3x − 1)(x − 1) if 0 < x ≤ 2.
Note: f is not differentiable at x = 0. Also, f′(x) = 0 iff x = 1/3 or x = 1. So x = 0, 1/3, 1 are the critical points. By computing f at the critical points, and at the endpoints, we deduce:
f has global maximum 4 attained at x = 2,
f has global minimum 0 attained at x = 0 and x = 1.

4.2.8. Local maxima/minima: sufficient conditions. Let f : A → R. Let c be an interior point of A with (c − δ, c + δ) ⊆ A.

Lemma 4.33 (First derivative test). Let f be differentiable on (c − δ, c) and (c, c + δ). Then:
(i) If f′ ≥ 0 on (c − δ, c) and f′ ≤ 0 on (c, c + δ), and f is continuous at c, then f has a local maximum at c.
(ii) If f′ ≤ 0 on (c − δ, c) and f′ ≥ 0 on (c, c + δ), and f is continuous at c, then f has a local minimum at c.

Proof. For item (i): Since f′ ≥ 0 on (c − δ, c), f is increasing on (c − δ, c). Similarly, since f′ ≤ 0 on (c, c + δ), f is decreasing on (c, c + δ). Finally, since f is continuous at c, we deduce that f(c) ≥ f(x) for x ∈ (c − δ, c + δ). The argument for item (ii) is similar. □
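As an illustration, the first derivative test can be applied to the critical points 0, 1/3, 1 of Example 4.32. The sketch below is not from the notes: the function `classify`, the choice δ = 1/10, and the use of one sample point on each side of c are assumptions made for illustration, valid here because f′ does not change sign on the sampled intervals. Exact rational arithmetic is used to avoid rounding issues.

```python
from fractions import Fraction

def f(x):
    # Piecewise function of Example 4.32 on [-1, 2]
    return -x if x <= 0 else 2 * x**3 - 4 * x**2 + 2 * x

def f_prime(x):
    # Derivative away from x = 0, where f is not differentiable
    if x == 0:
        raise ValueError("f is not differentiable at 0")
    return Fraction(-1) if x < 0 else 2 * (3 * x - 1) * (x - 1)

def classify(c, delta=Fraction(1, 10)):
    """Hypothetical helper: first derivative test with one sample per side.
    Assumes f' keeps a constant sign on (c - delta, c) and on (c, c + delta)."""
    left = f_prime(c - delta / 2)
    right = f_prime(c + delta / 2)
    if left >= 0 and right <= 0:
        return "local max"
    if left <= 0 and right >= 0:
        return "local min"
    return "inconclusive"

critical_points = [Fraction(0), Fraction(1, 3), Fraction(1)]
labels = {c: classify(c) for c in critical_points}

# Global extrema as in Lemma 4.31: compare critical points and endpoints.
candidates = [Fraction(-1)] + critical_points + [Fraction(2)]
global_max = max(f(x) for x in candidates)
global_min = min(f(x) for x in candidates)

print(labels, global_max, global_min)
```

Under these assumptions the sketch reports a local minimum at 0 and at 1, a local maximum at 1/3, and the global values 4 and 0 found in Example 4.32.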