Calculus Lecture Notes PDF
Swapneel Mahajan
Summary
These lecture notes cover calculus, focusing on functions of one and several real variables. They include chapters on sequences, limits, continuity, differentiability, and integration. The notes are likely intended for university-level mathematics study.
Calculus
Swapneel Mahajan

Contents

References
Remarks on the conditional
Pattern of mathematical writing
Pattern of mathematical learning

Part I. Functions of one real variable

Chapter 1. Sets and functions
1.1. Sets
1.1.1. Sets
1.1.2. Number systems
1.1.3. Set of real numbers
1.1.4. Properties of $\mathbb{R}$
1.1.5. Intervals
1.2. Functions
1.2.1. Functions between sets
1.2.2. Graph of a function
1.2.3. Functions between real numbers
1.2.4. Absolute value function
1.2.5. Sine and cosine functions
1.2.6. Exponential and logarithm functions
1.2.7. Integer part function
1.2.8. Polynomial functions
1.2.9. Bounded and monotone functions
1.2.10. Convex functions

Chapter 2. Sequences
2.1. Sequences
2.1.1. Sequences
2.1.2. Visualizing a sequence
2.1.3. Bounded and monotone sequences
2.1.4. Convergence of sequences
2.1.5. Uniqueness of a limit
2.1.6. Convergent implies bounded
2.1.7. Algebra of sequences
2.1.8. Completeness property
2.1.9. Important limits
2.1.10. Convergence to infinity

Chapter 3. Continuity
3.1. Continuity
3.1.1. Continuous functions
3.1.2. Algebra of continuous functions
3.1.3. Characterization using sequences
3.1.4. Further properties of continuous functions
3.2. Limit of a function
3.2.1. Limit of a function
3.2.2. Algebra of limits of functions
3.2.3. Continuity and limit
3.2.4. Left and right limits
3.2.5. Types of discontinuities
3.2.6. Convergence to and at infinity of a function

Chapter 4. Differentiability
4.1. Differentiability
4.1.1. Differentiable functions
4.1.2. Left and right derivatives
4.1.3. Derivative function
4.1.4. Increment function
4.1.5. Algebra of differentiable functions
4.2. Maxima and minima
4.2.1. Global and local maxima/minima
4.2.2. Local maxima/minima: necessary condition
4.2.3. Rolle’s theorem and mean value theorem
4.2.4. Mean value inequality
4.2.5. Increasing and decreasing functions
4.2.6. Convex functions
4.2.7. Critical points and global maxima/minima
4.2.8. Local maxima/minima: sufficient conditions
4.2.9. Points of inflection
4.2.10. Asymptotes

Chapter 5. Integration
5.1. Riemann integral
5.1.1. Riemann integrable functions
5.1.2. Riemann integral
5.1.3. Riemann sums
5.1.4. Domain additivity
5.1.5. Monotone functions
5.1.6. Continuous functions
5.1.7. Algebra of Riemann integrable functions
5.1.8. Further properties of the Riemann integral
5.1.9. Application: computing limits
5.2. Fundamental theorem of calculus
5.2.1. FTC. Part I
5.2.2. FTC. Part II
5.2.3. Integration by parts
5.2.4. Integration by substitution
5.3. Defining functions using the Riemann integral
5.3.1. Logarithmic function
5.3.2. Exponential function
5.3.3. Real powers of positive real numbers
5.3.4. Inverse trigonometric and trigonometric functions
5.4. Lengths, areas, volumes
5.4.1. Areas between curves
5.4.2. Volumes of solids
5.4.3. Arc length of a parametrized curve
5.4.4. Area of surface of revolution

Part II. Functions of several real variables

Chapter 6. Continuity
6.1. Real vector space
6.1.1. Real vector space
6.1.2. Dot product
6.1.3. Norm
6.1.4. Ball around a point
6.2. Functions of two real variables
6.2.1. Natural domain
6.2.2. Interior and boundary points
6.2.3. Bounded region
6.2.4. Graph of a function
6.2.5. Level curves and contour lines
6.3. Sequences
6.3.1. Sequences in $\mathbb{R}^2$
6.3.2. Bounded sequences
6.3.3. Convergence of sequences
6.4. Continuity
6.4.1. Continuous functions
6.4.2. Algebra of continuous functions
6.4.3. Characterization using sequences
6.4.4. Further properties of continuous functions
6.5. Limit of a function
6.5.1. Limit of a function
6.5.2. Algebra of limits of functions
6.5.3. Continuity and limit

Chapter 7. Differentiability
7.1. Differentiability
7.1.1. Partial derivatives
7.1.2. Directional derivatives
7.1.3. Differentiability
7.1.4. Pair of increment functions
7.1.5. Algebra of differentiable functions
7.1.6. Geometric interpretation of the gradient
7.1.7. Higher partial derivatives
7.2. Tangent plane to a surface
7.2.1. Tangent line to a curve
7.2.2. Tangent plane to a surface
7.3. Maxima and minima
7.3.1. Global and local maxima/minima
7.3.2. Saddle points
7.3.3. Local maxima/minima: necessary condition
7.3.4. Local maxima/minima, saddle points: sufficient condition
7.3.5. Critical points and global maxima/minima
7.3.6. Constrained extrema

Chapter 8. Integration
8.1. Riemann integral on a rectangle
8.1.1. Riemann integrable functions
8.1.2. Riemann integral on a rectangle
8.1.3. Riemann sums
8.1.4. Monotone functions
8.1.5. Continuous functions
8.1.6. Fubini’s theorem
8.1.7. Application: computing limits
8.2. Riemann integral in the plane
8.2.1. Riemann integral on a general region
8.2.2. Algebra of Riemann integrable functions
8.2.3. Elementary regions
8.2.4. Area of a general region
8.3. Change of variables
8.3.1. Jacobian matrix
8.3.2. Change of variables formula
8.3.3. Polar coordinates
8.4. Riemann integral in space
8.4.1. Riemann integral on a cuboid
8.4.2. Riemann integral on a general region
8.4.3. Change of variables
8.4.4. Cylindrical coordinates
8.4.5. Spherical coordinates

Chapter 9. Differential forms
9.1. Scalar and vector fields
9.1.1. Scalar fields
9.1.2. Vector fields
9.2. Gradient, curl, divergence
9.2.1. Gradient in three dimensions
9.2.2. Curl
9.2.3. Divergence in three dimensions
9.2.4. Gradient, curl, divergence
9.2.5. When is a vector field a gradient field?
9.3. Line integrals and FTC
9.3.1. Parametrized curve
9.3.2. Length of a parametrized curve
9.3.3. Line integral of a scalar field
9.3.4. Differential notation
9.3.5. Invariance under reparametrization
9.3.6. Arc length parametrization
9.3.7. Line integral of a vector field
9.3.8. Differential notation
9.3.9. Relating ds and |ds|
9.3.10. Line integral of a gradient field
9.3.11. Path independence of line integrals
9.3.12. Invariance under reparametrization up to sign
9.3.13. Geometric curve
9.4. Green’s theorem
9.4.1. Orienting the boundary curve
9.4.2. Green’s theorem
9.4.3. Principle of deformation
9.4.4. Area calculation
9.5. Surface integrals
9.5.1. Parametrized surface
9.5.2. Fundamental vector product
9.5.3. Area of a parametrized surface
9.5.4. Surface integral of a scalar field
9.5.5. Invariance under reparametrization
9.5.6. Surface integral of a vector field
9.5.7. Differential notation
9.5.8. Invariance under reparametrization up to sign
9.5.9. Relating dS and |dS|
9.5.10. Geometric surface
9.6. Gauss’s divergence theorem
9.6.1. Orienting the boundary surface
9.6.2. Gauss’s divergence theorem
9.6.3. Principle of deformation
9.6.4. Volume calculation
9.7. Stokes theorem
9.7.1. Orienting the boundary curve
9.7.2. Stokes theorem
9.7.3. Principle of deformation
9.7.4. Curl probe
9.8. Differential forms
9.8.1. Orientations
9.8.2. Differential forms
9.8.3. Exterior derivative
9.8.4. Stokes theorem

Bibliography

References. Here is a list of useful general references, which is by no means exhaustive. For set theory: Halmos, Munkres [20, Chapter 1]. For category theory: Mac Lane. For calculus: Apostol [1, 2], Ghorpade-Limaye, Marsden, Tromba, Weinstein. For analysis: Rudin, Pugh, Browder, Munkres. Also useful are Apostol, Simmons, Tao [28, 29]. For geometry on surfaces: Pressley, Thorpe, do Carmo.
For manifolds and differential forms: do Carmo, Boothby, Morita, Lee, Spivak. Wikipedia is a good online source for getting a bird's-eye view of many concepts discussed in these notes. Blogs are also useful. Pick a book that suits you. To understand the subject matter, it is not necessary to understand each and every sentence written in a particular book.

Remarks on the conditional. Consider the statements.
(1) If A, then B.
(2) If not B, then not A.
(3) If B, then A.
(4) If not A, then not B.
Statements (1) and (2) imply each other. Similarly, statements (3) and (4) imply each other. Statements (1) and (3) are converses of each other. It is possible that one is true, while the other is false. Similarly, statements (2) and (4) are converses of each other.

Avoid/minimize usage of the symbol ⟹. Note very carefully:
The statement “A ⟹ B.” means “If A, then B.”.
The statement “A. ⟹ B.” means “A. Hence B.”.
The two are different. If we write using ⟹, then the two statements differ only in a full stop, which can be easily missed. So it is better not to use it. Now appreciate the difference in the statements.
A ⟹ B. ⟹ C. Better to say: A implies B. Hence C.
A. ⟹ B. ⟹ C. Better to say: A. Hence B. Hence C.
A. ⟹ B ⟹ C. Better to say: A. Hence B implies C.

The terms ‘necessary condition’ and ‘sufficient condition’ also appear often in mathematical writing. Their precise relation to a conditional is as follows. Let us go back to the statement ‘If A, then B.’ Here B is a necessary condition for A, while A is a sufficient condition for B.

Pattern of mathematical writing. While writing mathematics, one makes use of some technical constructs. They are as follows, and usually appear in the order given below.
Definitions, Lemmas, Propositions, Theorems, Corollaries, Examples.
Many times, in mathematical discovery, it is the right definition that one is searching for to explain a bunch of phenomena that are known/believed to be true.
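Returning to the remarks on the conditional above: the claims that statements (1) and (2) imply each other, while (1) and its converse (3) need not agree, can be checked by brute force over all truth values of A and B. The following is a minimal sketch; the helper name `implies` is an illustrative choice and not part of these notes.

```python
from itertools import product


def implies(a: bool, b: bool) -> bool:
    """Truth value of the conditional 'If a, then b'."""
    return (not a) or b


# All four combinations of truth values for (A, B).
rows = list(product([False, True], repeat=2))

# (1) 'If A, then B' and (2) 'If not B, then not A' agree on every row.
contrapositive_equivalent = all(
    implies(a, b) == implies(not b, not a) for a, b in rows
)

# Rows on which (1) 'If A, then B' and (3) 'If B, then A' disagree.
converse_differs = [(a, b) for a, b in rows if implies(a, b) != implies(b, a)]

print(contrapositive_equivalent)
print(converse_differs)
```

The list `converse_differs` is nonempty exactly because a statement and its converse can have different truth values.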
Pattern of mathematical learning. Many times, it is hard to immediately comprehend a definition. So one goes ahead, and reads the subsequent lemmas, theorems, examples. Then one again goes back to the definition followed by the lemmas and so on. This time round, things make more sense. Then we repeat this process again, and again. Eventually everything makes sense. This process is called “rote learning”, which is steeped in the Indian tradition of learning.

Part I
Functions of one real variable

CHAPTER 1
Sets and functions

1.1. Sets

Sets are the building blocks of modern mathematics. We recall them briefly, focussing on number systems, particularly on the set of real numbers.

1.1.1. Sets. A set consists of elements. Let us begin with a couple of examples of sets.
A = set of dogs in IITB campus,
B = set of students in MA 105.
You can write down many similar examples.

1.1.2. Number systems. Now let us look at some standard sets related to number systems.
(1) $\mathbb{N} = \{0, 1, 2, 3, \dots\}$ = set of natural numbers
(2) $\mathbb{N}_+ = \{1, 2, 3, \dots\}$ = set of positive natural numbers
(3) $\mathbb{Z} = \{\dots, -3, -2, -1, 0, 1, 2, 3, \dots\}$ = set of integers
(4) $\mathbb{Q} = \{m/n : m, n \in \mathbb{Z},\ n \neq 0\}$ = set of rational numbers
(5) $\mathbb{R}$ = set of real numbers
(6) $\mathbb{R} \setminus \mathbb{Q}$ = set of irrational numbers

Lemma 1.1. There is no rational number whose square is 2.

Proof. Suppose $(p/q)^2 = 2$, that is, $p^2 = 2q^2$ for some integers $p, q$ such that $q \neq 0$, and $p$ and $q$ have no common factor. Now 2 divides $p^2$, and hence also divides $p$. So $p = 2r$ for an integer $r$. Then $2q^2 = p^2 = (2r)^2 = 4r^2$, and so $q^2 = 2r^2$. Now 2 divides $q^2$, and hence also divides $q$. Thus 2 is a common factor of $p$ and $q$, which is a contradiction. □

The above result motivates the consideration of number systems which are larger than $\mathbb{Q}$ such as $\mathbb{R}$.

Remark 1.2 (Algebraic structures). There is no formal definition of a number system.
However, the above considerations led to abstract concepts such as monoids, groups, rings, fields (in the later part of the nineteenth and early part of the twentieth century). For example, $\mathbb{Z}$ is an example of a ring, while $\mathbb{Q}$ and $\mathbb{R}$ are examples of fields. For more details, see Artin, Dummit-Foote.

1.1.3. Set of real numbers. It is customary to represent the set of real numbers $\mathbb{R}$ as a line as follows.

[Figure: the real line with the points 0, 1, $\sqrt{2}$, 2 marked.]

Elements of $\mathbb{R}$ are points on the line. Have we filled all the “holes” in the line? The set of rational numbers does not achieve this goal, but we believe that the set of real numbers does. There are two standard ways to pass from $\mathbb{Q}$ to $\mathbb{R}$, namely,
• Dedekind cuts,
• Cauchy sequences.
These two constructions were made in the nineteenth century around 1870. Note: $\sqrt{2} \in \mathbb{R}$.

1.1.4. Properties of $\mathbb{R}$. We mention that the set of real numbers satisfies the following properties.
• algebraic properties (related to addition and multiplication).
• order properties (related to greater than and less than).
• completeness property.
• archimedean property (implied by completeness property).
The archimedean property says that for any $x \in \mathbb{R}$, there is a natural number $n \in \mathbb{N}$ such that $n > x$. Let us use this property to prove that between any two distinct real numbers, there is a rational number and an irrational number:

Lemma 1.3. Let $a, b \in \mathbb{R}$ with $a < b$. Then there are $r \in \mathbb{Q}$ and $s \in \mathbb{R} \setminus \mathbb{Q}$ such that $a < r < b$ and $a < s < b$.

Proof. Let us do this in two steps.
(i) Let $[x]$ denote the integer part of $x$, that is, $x - 1 < [x] \le x$. Pick $n > \frac{1}{b-a}$, and put $m = [na] + 1$. Then $na < m \le na + 1 < nb$, so $a < \frac{m}{n} < b$. Now take $r := m/n$.
(ii) Using item (i), find $r \in \mathbb{Q}$ such that $a + \sqrt{2} < r < b + \sqrt{2}$. Then $a < r - \sqrt{2} < b$. Now take $s := r - \sqrt{2}$. □

1.1.5. Intervals. We say $I \subseteq \mathbb{R}$ is an interval if, whenever $a, b \in I$ and $a < x < b$, we have $x \in I$. Some standard examples of intervals are given below. For $a \le b$ in $\mathbb{R}$, define $(a, b) := \{x \in \mathbb{R} : a < x < b\}$ and $[a, b] := \{x \in \mathbb{R} : a \le x \le b\}$.
These are the open interval and closed interval, respectively, from $a$ to $b$. See illustrations below. Similarly, define $(a, \infty) := \{x \in \mathbb{R} : a < x\}$ and $[a, \infty) := \{x \in \mathbb{R} : a \le x\}$, and $(-\infty, b) := \{x \in \mathbb{R} : x < b\}$ and $(-\infty, b] := \{x \in \mathbb{R} : x \le b\}$. Note: The empty set $\emptyset$ and $\mathbb{R}$ are also intervals. Observe:
$\bigcap_{n=1}^{\infty} \left(a, b + \frac{1}{n}\right) = (a, b]$ and $\bigcup_{n=1}^{\infty} \left[a, b - \frac{1}{n}\right] = [a, b)$.

Puzzle 1.4. A man has no money, but fortunately he has a silver bar which is 31 inches long. So he enters into the following agreement with his landlord for paying his March rent. He will pay one inch of his silver bar for each of the 31 days of March. The question is: What is the minimum number of pieces he can cut his silver bar into in order to fulfil this requirement? The silliest thing would be to cut the bar into 31 pieces and pay one piece each day. A better way to start would be to have 2 one inch pieces and a 3 inch piece, so that he can pay the first two days with the one inch pieces, and on the third day he can give the 3 inch piece and take back the 2 one inch pieces. He can use these to pay off the fourth and fifth days as well.

Puzzle 1.5. A shopkeeper has a single weight of 40 kilos. One day, his son mistakenly drops it on the floor, and it breaks into 4 pieces. The shopkeeper is very angry but his clever son shows him that with these 4 pieces, he can weigh on his balance any item whose weight is an integer between 1 and 40 (both inclusive). What are these 4 weights?

1.2. Functions

When we talk of sets, we also need to talk of ways to relate them. This is the notion of a function. We focus mainly on real-valued functions of a real variable. We discuss bounded, monotone, convex functions. We also informally recall many familiar examples; some of them are formalized later in Section 5.3. For functions of more than one real variable, see Section 6.2.

1.2.1. Functions between sets. We specify a function as $f : A \to B$. Here $A$ and $B$ are sets.
We say $A$ is the domain of $f$, and $B$ is the codomain of $f$. To every element $a \in A$, we have $f(a) = b \in B$.

[Figure: $f$ sends an element $a$ of the domain $A$ to $f(a)$ in the codomain $B$.]

We write $f(A)$ for the range of $f$. It is the set of values taken by $f$. It is a subset of $B$. For $f : A \to B$ and $g : B \to C$, define the composite function $g \circ f : A \to C$ by $(g \circ f)(a) := g(f(a))$ for $a \in A$.

[Figure: $a \mapsto f(a) \mapsto g(f(a))$, where $A$ is the domain of $f$, $B$ is the codomain of $f$ and the domain of $g$, and $C$ is the codomain of $g$.]

1.2.2. Graph of a function. The graph of $f : A \to B$ is the subset of $A \times B$ defined by $\{(a, f(a)) : a \in A\}$. A schematic illustration is shown below.

[Figure: the point $(a, f(a))$ in $A \times B$, with $a$ on the horizontal axis $A$ and $f(a)$ on the vertical axis $B$.]

1.2.3. Functions between real numbers. If the codomain of $f$ is $\mathbb{R}$, that is, $f : A \to \mathbb{R}$, then we say $f$ is real-valued. For example,
for A = set of dogs in IITB campus, consider $f(a)$ = weight of dog $a$,
for B = set of students in MA 105, consider $f(b)$ = IQ of student $b$.
We will mainly deal with functions $f$ whose domain is $A \subseteq \mathbb{R}$. For functions on intervals, consider
$f : [0, 1] \to \mathbb{R}$, $f(x) = x^2 + 5$,
$g : [0, 1] \to (3, 10)$, $g(x) = x^2 + 5$.
Note very carefully: $f$ and $g$ are different functions because their codomains are different!

1.2.4. Absolute value function. An important real-valued function on $\mathbb{R}$ is the absolute value function. It is defined by $f : \mathbb{R} \to \mathbb{R}$, $f(x) = |x|$, the absolute value of $x$. Its graph is shown below.

[Figure: graph of $y = |x|$.]

The absolute value function satisfies the following properties.
(i) $|x| \ge 0$ with equality iff $x = 0$. Thus, the range of $f$ is $[0, \infty)$.
(ii) $|x| = |-x|$.
(iii) $|xy| = |x||y|$.
(iv) $-|x| \le x \le |x|$.
(v) $|x + y| \le |x| + |y|$. This is known as the triangle inequality.

1.2.5. Sine and cosine functions. The graphs of the functions $f(x) = \sin x$ and $f(x) = \cos x$ are shown below.

[Figure: graphs of $y = \sin x$ and $y = \cos x$.]

The graph of the function $f(x) = \sin(1/x)$, for $x > 0$, is shown below.

[Figure: graph of $y = \sin(1/x)$ for $x > 0$.]

(For clarity, we have stretched the x-axis.) The graph oscillates rapidly as it approaches the y-axis. The graph of the function $f(x) = x \sin(1/x)$, for $x > 0$, is shown below.
[Figure: graph of $y = x \sin(1/x)$ for $x > 0$.]

The graph oscillates exactly as before, but now the amplitude of the oscillations goes to zero as it approaches the y-axis.

1.2.6. Exponential and logarithm functions. The graphs of the functions $f(x) = e^x$ and $f(x) = \log x$ are shown below.

[Figure: graphs of $y = e^x$ and $y = \log x$.]

1.2.7. Integer part function. The integer part $[x]$ of a real number $x$ is the greatest integer which is less than or equal to $x$. For example, $[0.5] = 0$, $[2] = 2$, $[2.1] = 2$. The graph of the integer part function $f(x) = [x]$ is shown below.

[Figure: graph of $y = [x]$.]

1.2.8. Polynomial functions. Polynomials in one variable are functions which are finite linear combinations of $1, x, x^2$ and so on. Each polynomial has a degree. Polynomials of degree zero are constants $p(x) = c$, degree one are linear functions $p(x) = ax + b$ with $a \neq 0$, degree two are quadratic functions $p(x) = ax^2 + bx + c$ with $a \neq 0$, and so on.

The graph of a degree one polynomial (linear) looks as follows. [Figure.]
The graph of a degree two polynomial (quadratic) looks as follows. [Figure.]
The graph of a degree three polynomial (cubic) looks as follows. [Figure.]
The graph of a degree four polynomial (quartic) looks as follows. [Figure.]

For each degree, we have drawn two graphs depending on the sign of the leading coefficient. Also, the above pictures show the generic case. They may degenerate in specific cases. For example, compare the graph of $f(x) = x^3$ with the left picture shown above for a cubic.

1.2.9. Bounded and monotone functions. There are properties which a given function may or may not have. For example, for a function $f$, we can ask whether $f$ is injective (into) or surjective (onto) or bijective (into and onto). Some other important properties are listed below.

Definition 1.6.
A function $f : A \to \mathbb{R}$ is
(i) bounded above if there is a real number $M$ (upper bound) such that $f(x) \le M$ for $x \in A$,
(ii) bounded below if there is a real number $M$ (lower bound) such that $M \le f(x)$ for $x \in A$,
(iii) bounded if it is bounded above and bounded below, that is, if there are real numbers $M_1, M_2$ such that $M_1 \le f(x) \le M_2$ for $x \in A$.

A bounded function can be visualized as follows.

[Figure: graph of a bounded function lying between the horizontal lines $y = M_1$ and $y = M_2$.]

The global maximum of $f$ is its least upper bound, and the global minimum of $f$ is its greatest lower bound. By the completeness property of $\mathbb{R}$, these necessarily exist, but they may not be attained at any point of $A$.

Definition 1.7. Let $I$ be an interval, and let $f : I \to \mathbb{R}$. We say $f$ is
(i) (monotonically) increasing on $I$ if for $x_1, x_2 \in I$, $x_1 < x_2 \implies f(x_1) \le f(x_2)$.
(ii) (monotonically) decreasing on $I$ if for $x_1, x_2 \in I$, $x_1 < x_2 \implies f(x_1) \ge f(x_2)$.
(iii) monotonic on $I$ if it is increasing on $I$, or it is decreasing on $I$.
We use the terms strictly increasing and strictly decreasing if the inequalities $\le$ and $\ge$ above can be replaced by $<$ and $>$. Note: The constant function $f(x) = 3$ is both increasing and decreasing, but it is not strictly increasing or strictly decreasing.

1.2.10. Convex functions.

Definition 1.8. For $I$ an interval, let $f : I \to \mathbb{R}$ be a function.
(i) $f$ is convex if for $p < q$ in $I$ and $t \in (0, 1)$, $f(tp + (1-t)q) \le t f(p) + (1-t) f(q)$. We use the term strictly convex if the above inequality is strict.
(ii) $f$ is (strictly) concave if $-f$ is (strictly) convex. This can also be defined directly by reversing the inequality.

The different points involved in the definition of a convex function are illustrated below.

[Figure: the points $p$, $tp + (1-t)q$, $q$ on the horizontal axis and the values $f(p)$, $t f(p) + (1-t) f(q)$, $f(q)$ on the vertical axis.]

In geometric terms, a function $f$ is convex if the chord joining any pair of points $(p, f(p))$ and $(q, f(q))$ on the graph of $f$ lies on or above the graph of $f$. This is illustrated below.

[Figure: a chord lying above the graph of a convex function.]

Exercise 1.9.
Show: The convexity condition can be equivalently written as: For any $p < x < q$ in $I$,
$f(x) \le f(p) + \frac{f(q) - f(p)}{q - p}\,(x - p)$.
For strict convexity, we replace $\le$ by $<$ above. For (strictly) concave, we use $\ge$ and $>$.

The graph of a typical convex function on $\mathbb{R}$ is shown below on the left. The graph of a function on $\mathbb{R}$ which is convex but not strictly convex is shown below on the right. A concrete example is the absolute value function.

[Figure: left, graph of a typical convex function; right, graph of a function which is convex but not strictly convex.]

Note: One can also define convex sets; a convex set in $\mathbb{R}$ is the same as an interval.

CHAPTER 2
Sequences

2.1. Sequences

We introduce sequences of real numbers, and define the notion of convergence of such a sequence. We connect convergence to the property of being monotone and bounded. This is related to the completeness property of $\mathbb{R}$.

2.1.1. Sequences.

Definition 2.1. A sequence of real numbers is a function $f : \mathbb{N}_+ \to \mathbb{R}$ from the set of positive integers to the set of real numbers. Put $f(n) = a_n$. Thus specifying the function $f$ is the same as specifying $a_1, a_2, a_3, \dots$. We shall use the notation $\{a_n\}$ for short. We call $a_n$ the n-th term of the sequence.

Example 2.2. Here are a few examples of sequences.
(1) $a_n = 1/n$. $1, 1/2, 1/3, 1/4, \dots$
(2) $a_n = n$. $1, 2, 3, 4, \dots$
(3) $a_n = (-1)^n$. $-1, 1, -1, 1, \dots$
(4) $a_n = n^2$. $1, 4, 9, 16, \dots$
(5) $a_n = \sqrt{2}$. $\sqrt{2}, \sqrt{2}, \sqrt{2}, \dots$ This is a constant sequence.
(6) $a_n = 2^n$. $2, 4, 8, 16, \dots$
(7) $a_1 = 1$, $a_2 = 1$ and $a_n = a_{n-1} + a_{n-2}$ for $n \ge 3$. $1, 1, 2, 3, 5, 8, 13, 21, 34, \dots$ This is the Fibonacci sequence.

2.1.2. Visualizing a sequence. A sequence may be visualized on the real line as follows by marking its terms $a_1, a_2, a_3, \dots$.

[Figure: the terms $a_1, \dots, a_6$ marked on the real line.]

It may also be visualized as the graph of the function $\mathbb{N}_+ \to \mathbb{R}$. In the picture below, we have marked the first 5 terms of the sequence.

[Figure: the points $(n, a_n)$ for $n = 1, \dots, 5$.]

Remark 2.3. We make some remarks related to the notion of a sequence.
(1) A sequence is always infinite.
For example, $a_1, a_2, a_3, a_4$, which is a tuple of four real numbers, is not a sequence.
(2) A sequence need not be given by an algebraic formula. For example, we can define a sequence using the digits in the decimal expansion of $\sqrt{2}$. We can also do something like $-1, 3, 4, 5, 2, 2, 2, 2, \dots$, that is, the sequence is constant barring the first few terms.
(3) $\infty$ is not a real number. Thus, $-1, 2, \infty, 1/5, \dots$ is not a sequence. Similarly, $\{\frac{1}{n-1}\}$ does not define a sequence since it is not defined at $n = 1$.
(4) The formula $a_n = \frac{1}{n-5}$ does not define a sequence (since it is not defined at $n = 5$).
(5) The following is not a sequence: $\dots, a_{-3}, a_{-2}, a_{-1}, a_0, a_1, a_2, a_3, \dots$. It arises from a function $f : \mathbb{Z} \to \mathbb{R}$.
(6) An example of a sequence which contains each integer exactly once is $0, 1, -1, 2, -2, 3, -3, \dots$.
(7) If $\{a_n\}$ and $\{b_n\}$ are two sequences, then interleaving gives a third sequence $a_1, b_1, a_2, b_2, a_3, b_3, a_4, b_4, \dots$ For example, the sequence $a_n = (-1)^n$ arises by interleaving the constant $-1$ sequence and the constant $1$ sequence.

Exercise 2.4. Construct a sequence which contains all rational numbers. (One way is to use Cantor’s famous diagonalization argument.)

2.1.3. Bounded and monotone sequences. We now define some properties which a given sequence may or may not have.

Definition 2.5. A sequence $\{a_n\}$ of real numbers is
(i) bounded above if there is a real number $M$ such that $a_n \le M$ for $n \ge 1$,
(ii) bounded below if there is a real number $M$ such that $M \le a_n$ for $n \ge 1$,
(iii) bounded if it is bounded above and bounded below, that is, if there are real numbers $M_1, M_2$ such that $M_1 \le a_n \le M_2$ for $n \ge 1$.

A bounded sequence can be visualized on the real line as follows.

[Figure: the terms $a_1, \dots, a_6$ on the real line, all lying between $M_1$ and $M_2$.]

Definition 2.5 is the special case $A := \mathbb{N}_+$ of Definition 1.6.

Definition 2.6. A sequence $\{a_n\}$ of real numbers is
(i) (monotonically) increasing if $a_1 \le a_2 \le a_3 \le \dots$,
(ii) (monotonically) decreasing if $a_1 \ge a_2 \ge a_3 \ge \dots$,
(iii) monotonic if it is either (monotonically) increasing or decreasing.

Exercise 2.7. For sequences in Example 2.2, which of the bounded and monotone properties hold?

2.1.4. Convergence of sequences. Where is a sequence heading?

Definition 2.8 ($\epsilon$–$n_0$). Let $\{a_n\}$ be a sequence of real numbers. We say $\{a_n\}$ is convergent if there is $a \in \mathbb{R}$ such that the following condition holds. For every $\epsilon > 0$, there is $n_0 \in \mathbb{N}_+$ such that $|a_n - a| < \epsilon$ for $n \ge n_0$. In this case, we say $\{a_n\}$ converges to $a$, or $a$ is the limit of $\{a_n\}$, and write $\lim_{n \to \infty} a_n = a$ or $a_n \to a$ (as $n \to \infty$). If a sequence does not converge, we say the sequence diverges or is divergent.

Example 2.9. Let us look at convergence in some of our examples.
(1) The sequence $a_n = 1/n$ converges to 0, or equivalently, $\lim_{n \to \infty} \frac{1}{n} = 0$. Why? Let $\epsilon > 0$. By the archimedean property, there is $n_0 \in \mathbb{N}_+$ such that $\frac{1}{n_0} < \epsilon$. Therefore, $|a_n - a| = |\frac{1}{n}| \le \frac{1}{n_0} < \epsilon$ for $n \ge n_0$.
[…] 11. Note very carefully: The definition of convergence only requires us to find one $n_0$, not necessarily the smallest one. However, it is a good practice to specify the smallest $n_0$ for a given $\epsilon$ whenever possible.
(5) Many times, we will be dealing with two convergent sequences $a_n \to a$ and $b_n \to b$ at the same time. In such cases: For $\epsilon > 0$, the sequence $\{a_n\}$ will have its $n_0$, and $\{b_n\}$ will have its $n_0$. By taking the larger of the two, we will have an $n_0$ which works for both.

2.1.5. Uniqueness of a limit. Let $\{a_n\}$ be any sequence of real numbers. Parvati says that $\{a_n\}$ converges to 10, while Shankar says that $\{a_n\}$ converges to 20. Can both of them be right?

[Figure: the points 10 and 20 on the real line, surrounded by the disjoint open intervals (6, 14) and (16, 24).]

No. Give $\epsilon = 4$ to both of them, and ask them to provide $n_0$. Both cannot succeed since the open intervals $(6, 14)$ and $(16, 24)$ are disjoint as shown in the picture. This argument generalizes to yield the following.

Lemma 2.11. The limit of a sequence of real numbers is unique whenever it exists.

Proof. Let $\{a_n\}$ be such a sequence. Suppose $a_n \to a$ and $a_n \to b$ with $a \neq b$.
Take $\epsilon = |a - b|/2 > 0$. Let $n_0 \in \mathbb{N}_+$ be such that $|a_n - a| < \epsilon$ and $|a_n - b| < \epsilon$ for $n \ge n_0$. Then $|a - b| \le |a - a_{n_0}| + |a_{n_0} - b| < \epsilon + \epsilon = |a - b|$, which is a contradiction. Hence $a = b$. □

2.1.6. Convergent implies bounded. We now relate convergence of a sequence to its property of being bounded.

Proposition 2.12. Let $\{a_n\}$ be a sequence of real numbers. If $\{a_n\}$ converges, then it is bounded. Equivalently, if $\{a_n\}$ is not bounded, then it does not converge.

[Figure: a blue interval around the limit $a$ on the real line.]

Proof idea. A finite set of real numbers is always bounded. The problem is that a sequence contains infinitely many real numbers. But if $\{a_n\}$ converges, then some tail of this sequence lies in a bounded interval around the limit $a$. In the above picture, only finitely many terms of the sequence will be outside the blue interval. □

For example: The sequences $\{n\}$, $\{n^2\}$, $\{2^n\}$ are not bounded, and hence are divergent. The converse of Proposition 2.12 is false. For example, take $a_n = (-1)^n$. This sequence is bounded but it does not converge.

2.1.7. Algebra of sequences. One can add two sequences, multiply two sequences, scalar multiply a sequence (by a real number). These operations are compatible with the notion of convergence in the following sense.

Lemma 2.13 (Limit theorems). Suppose $a_n \to a$ and $b_n \to b$ are two convergent sequences of real numbers. Then
(i) $a_n + b_n \to a + b$,
(ii) $r a_n \to r a$ for $r \in \mathbb{R}$,
(iii) $a_n b_n \to ab$,
(iv) $1/a_n \to 1/a$ if $a \neq 0$.

Proof. For item (i): Let $\epsilon > 0$. Since $a_n \to a$ and $b_n \to b$, there is $n_0 \in \mathbb{N}_+$ such that $|a_n - a| < \epsilon/2$ and $|b_n - b| < \epsilon/2$ for $n \ge n_0$. Now using the triangle inequality, $|(a_n + b_n) - (a + b)| \le |a_n - a| + |b_n - b| < \epsilon/2 + \epsilon/2 = \epsilon$ for $n \ge n_0$. Thus, $a_n + b_n \to a + b$. Proofs of items (ii), (iii), (iv) use similar ideas. □

Remark 2.14. For item (iv), strictly speaking, we must require $a_n \neq 0$ for $1/a_n$ to make sense.
However, since $a_n \to a$ and $a \neq 0$, from some point on, the $a_n$ are indeed nonzero (and convergence of a sequence is not affected if we change finitely many of its terms).

Lemma 2.15 (Sandwich lemma). If $a_n \le b_n \le c_n$, and $a_n \to a$ and $c_n \to a$, then $b_n \to a$.

Proof. Let $\epsilon > 0$. Since $a_n \to a$ and $c_n \to a$, there is $n_0 \in \mathbb{N}_+$ such that $a - \epsilon < a_n < a + \epsilon$ and $a - \epsilon < c_n < a + \epsilon$ for $n \ge n_0$. Since $a_n \le b_n \le c_n$, $a - \epsilon < b_n < a + \epsilon$ for $n \ge n_0$. □

Example 2.16. Let us illustrate the sandwich lemma.
(1) Let $a_n = \frac{n^3 + 3n^2 + 2}{n^4 + 7n^2 + 5}$. Then $a_n \to 0$ since $0 \le a_n \le \frac{1}{n} + \frac{3}{n^2} + \frac{2}{n^4} \to 0$.
(2) Let $a_n = \frac{1}{n} \sin(\frac{1}{n})$. Then $a_n \to 0$ since $-\frac{1}{n} \le a_n \le \frac{1}{n}$ and $\frac{1}{n} \to 0$.

2.1.8. Completeness property. We now give two sufficient conditions for a sequence to converge. This is a partial converse to Proposition 2.12.

Proposition 2.17. Let $\{a_n\}$ be a sequence of real numbers. Then:
(i) If $\{a_n\}$ is increasing and bounded above, then $\{a_n\}$ is convergent.
(ii) If $\{a_n\}$ is decreasing and bounded below, then $\{a_n\}$ is convergent.

This result can be deduced using the completeness property of $\mathbb{R}$. Since we have not discussed the latter, we take the above result for granted. Note: Items (i) and (ii) imply each other by replacing a sequence by its negative.

Example 2.18. Let us illustrate the completeness property.
(1) The sequence $a_n = 1/n$ is decreasing and bounded below by 0, hence it converges.
(2) Let $a_1 = 1$ and $a_n = \frac{3a_{n-1} + 2}{6} = \frac{1}{2} a_{n-1} + \frac{1}{3}$ for $n \ge 2$. This sequence is bounded below by 0. Is it decreasing? The first few values are $a_1 = 1$, $a_2 = 5/6$, $a_3 = 3/4$. Now
$a_n \le a_{n-1} \iff \frac{1}{2} a_{n-1} + \frac{1}{3} \le a_{n-1} \iff \frac{2}{3} \le a_{n-1}$
for $n \ge 2$. Note: $a_1 \ge \frac{2}{3}$. If $a_{n-1} \ge \frac{2}{3}$ for some $n \ge 2$, then $a_n \ge \frac{1}{2}(\frac{2}{3}) + \frac{1}{3} = \frac{2}{3}$. So by induction, $a_n \ge \frac{2}{3}$ for $n \ge 1$. Hence $\{a_n\}$ is decreasing. By the completeness property, $\{a_n\}$ converges (say to $a$). To compute $a$, we may proceed as follows. In $a_n = \frac{1}{2} a_{n-1} + \frac{1}{3}$, the lhs goes to $a$ and the rhs goes to $\frac{1}{2} a + \frac{1}{3}$. So $a = \frac{1}{2} a + \frac{1}{3}$, and hence $a = \frac{2}{3}$.

Exercise 2.19.
Give an example of a sequence $\{a_n\}$ of real numbers which is strictly decreasing in absolute value, that is, $|a_n| > |a_{n+1}|$ for $n \ge 1$, but which does not converge.

Remark 2.20. Proposition 2.17 fails for $\mathbb{Q}$. For example, we can take the sequence of rational numbers $1, 1.4, 1.41, 1.414, \dots$ arising from the decimal expansion of $\sqrt{2}$. This sequence is increasing and bounded above by say the rational number 1.5. But it does not converge in $\mathbb{Q}$. What we are seeing here is the fact that the set of rational numbers $\mathbb{Q}$ is not complete.

2.1.9. Important limits. We mention a couple of important limits.

Lemma 2.21. Let $a \in \mathbb{R}$. Then:
(i) If $|a| < 1$, then $\lim_{n \to \infty} a^n = 0$.
(ii) If $a > 0$, then $\lim_{n \to \infty} a^{1/n} = 1$.

Proof. For item (i): The result is clear if $a = 0$. Let $0 < |a| < 1$. Then $\frac{1}{|a|} > 1$. Write $\frac{1}{|a|} = 1 + h$ for $h > 0$. Then
$\frac{1}{|a|^n} = (1 + h)^n = 1 + nh + \dots + h^n \ge 1 + nh \ge nh$.
Therefore, $0 \le |a|^n \le \frac{1}{nh} \to 0$. The result follows by the sandwich lemma.
For item (ii): The result is clear if $a = 1$. Let $a > 1$. Then $a^{1/n} > 1$. Write $a^{1/n} = 1 + h_n$ for $h_n > 0$. Now $a = (1 + h_n)^n \ge n h_n$. Therefore, $0 \le h_n \le \frac{a}{n}$. So $h_n \to 0$, and $a^{1/n} \to 1$. Finally, let $0 < a < 1$. Then $\frac{1}{a} > 1$. So by the previous case, $(\frac{1}{a})^{1/n} \to 1$. Therefore, $a^{1/n} \to 1$. □

2.1.10. Convergence to infinity. Suppose a sequence $\{a_n\}$ diverges. Then it makes sense to ask whether $\{a_n\}$ is converging to $\infty$ or $-\infty$ as explained below. We emphasize again that $\pm\infty$ are not real numbers.

Definition 2.22. Let $\{a_n\}$ be a sequence of real numbers.
(i) We say $\{a_n\}$ converges to $\infty$ or $\lim_{n \to \infty} a_n = \infty$ or $a_n \to \infty$ if the following condition holds. For every $\alpha \in \mathbb{R}$, there is $n_0 \in \mathbb{N}_+$ such that $a_n > \alpha$ for $n \ge n_0$.
(ii) We say $\{a_n\}$ converges to $-\infty$ or $\lim_{n \to \infty} a_n = -\infty$ or $a_n \to -\infty$ if the following condition holds. For every $\beta \in \mathbb{R}$, there is $n_0 \in \mathbb{N}_+$ such that $a_n < \beta$ for $n \ge n_0$.

For example: The sequence $a_n = n^2 \to \infty$ and $a_n = -n^3 \to -\infty$. The sequence $a_n = (-1)^n n$ is unbounded but does not converge either to $\infty$ or to $-\infty$.

Remark 2.23 (Metric spaces).
We have focussed on sequences of real numbers. More generally, a sequence can take values in any set $A$. However, to define convergence, one needs a notion of distance in $A$. Such a set $A$ is called a metric space. For $A = \mathbb{R}$, the distance is defined by $\mathrm{dist}(x, y) := |x - y|$, and convergence as in Definition 2.8. This example generalizes to $A = \mathbb{R}^m$. The case $m = 2$ is explained in Definition 6.10.

CHAPTER 3
Continuity

3.1. Continuity

The intuitive idea of a continuous function $f$ is that the graph of $f$ has no “breaks”. We now formalize this notion.

3.1.1. Continuous functions.

Definition 3.1 (ϵ–δ). Let $f : A \to \mathbb{R}$. We say $f$ is continuous at $c \in A$ if the following condition holds. For every $\epsilon > 0$, there is $\delta > 0$ such that
$$|x - c| < \delta \implies |f(x) - f(c)| < \epsilon.$$
We say $f$ is continuous on $A$ if $f$ is continuous at each point of $A$.

Example 3.2. Let us illustrate the notion of continuity.
(1) Let $f(x) = x$. Then $f$ is continuous at all $c \in \mathbb{R}$. Take $\delta = \epsilon$.
(2) Let $f(x) = 3x - 5$. Then $f$ is continuous at all $c \in \mathbb{R}$. Take $\delta = \epsilon/3$. Then $|x - c| < \delta$ implies $|(3x - 5) - (3c - 5)| = 3|x - c| < \epsilon$.
(3) Let $f(x) = [x]$. Then $f$ is continuous at non-integer points and discontinuous at integer points.
– $c$ is a non-integer point. Pick $\delta > 0$ which avoids the adjacent integer points.
– $c$ is an integer point. Take $\epsilon = 1/2$. No choice of $\delta$ works.
(4) Consider the Dirichlet function
$$f : [0, 1] \to \mathbb{R}, \qquad f(x) = \begin{cases} 1 & \text{if } x \in \mathbb{Q}, \\ 0 & \text{if } x \in \mathbb{R} \setminus \mathbb{Q}. \end{cases}$$
It is discontinuous at all points. Take $\epsilon = 1/2$. No choice of $\delta$ works because in any open interval there is always a rational and an irrational by Lemma 1.3.

Exercise 3.3. Let $f : A \to \mathbb{R}$ be continuous at $c \in A$, and $f(c) > 0$. Then there is an open interval $I$ containing $c$ such that $f(x) > 0$ for all $x \in I \cap A$.

3.1.2. Algebra of continuous functions. One can add two functions, multiply two functions, scalar multiply a function (by a real number). These operations are compatible with the notion of continuity in the following sense.
Lemma 3.4. Suppose $f, g : A \to \mathbb{R}$ are continuous at $c \in A$. Then so are
(i) $f + g$,
(ii) $rf$ for $r \in \mathbb{R}$,
(iii) $fg$,
(iv) $1/f$ if $f(c) \neq 0$.

Proof. For item (i): Let $\epsilon > 0$. Since $f$ and $g$ are continuous at $c$, there is $\delta > 0$ such that
$$|x - c| < \delta \implies |f(x) - f(c)| < \epsilon/2 \text{ and } |g(x) - g(c)| < \epsilon/2.$$
Now using triangle inequality,
$$|(f + g)(x) - (f + g)(c)| \le |f(x) - f(c)| + |g(x) - g(c)| < \epsilon/2 + \epsilon/2 = \epsilon.$$
Proofs of items (ii), (iii), (iv) use similar ideas. For item (iv): It suffices to prove that the function $1/x$ is continuous, and use Lemma 3.5 below. □

Lemma 3.5. Let $f : A \to B$ and $g : B \to \mathbb{R}$. If $f$ is continuous at $c \in A$ and $g$ is continuous at $f(c) \in B$, then the composite $g \circ f$ is continuous at $c \in A$.

Proof idea. Given $\epsilon > 0$, pick $\delta' > 0$ using continuity of $g$ at $f(c)$. Now taking $\delta' > 0$ as the $\epsilon$, pick the required $\delta > 0$ using continuity of $f$ at $c$. □

As a consequence:
– polynomials in $x$ such as $p(x) = x^2$ and $p(x) = 2x^3 - 3x + 1$ are continuous,
– a rational function in $x$, that is $r(x) = p(x)/q(x)$, where $p$ and $q$ are polynomials, is continuous at $c$ if $q(c) \neq 0$,
– a function such as $f(x) = x^3 \sin|x| + \cos x^2$ is continuous.

Example 3.6. Define $f : \mathbb{R} \to \mathbb{R}$ by
$$f(x) = \begin{cases} x \sin(1/x) & \text{if } x \neq 0, \\ 0 & \text{if } x = 0. \end{cases}$$
Then $f$ is continuous at $c \neq 0$ since it is formed out of continuous functions. Let us see what happens at $c = 0$. Given $\epsilon > 0$, let $\delta = \epsilon$. Then
$$|x - 0| < \delta \implies |f(x) - f(0)| \le |x| < \delta = \epsilon.$$
Hence $f$ is continuous at 0.

Exercise 3.7. Define $f$ as above but with $x \sin(1/x)$ replaced by $\sin(1/x)$. Show: $f$ is not continuous at 0.

3.1.3. Characterization using sequences. We now characterize continuity of a function using sequences. This forges a connection between Definition 3.1 and Definition 2.8.

Proposition 3.8. Let $f : A \to \mathbb{R}$. Then $f$ is continuous at $c \in A$ iff the following condition holds. For any sequence $\{x_n\}$ in $A$ with $x_n \to c$, we have $f(x_n) \to f(c)$.

Proof. Suppose $f$ is continuous at $c \in A$, and $x_n \to c$. We want to show $f(x_n) \to f(c)$.
Let $\epsilon > 0$. Continuity of $f$ at $c$ yields a $\delta$. Using this $\delta$, we find an $n_0$ from $x_n \to c$. Thus for $n \ge n_0$, we have $|x_n - c| < \delta$, and hence $|f(x_n) - f(c)| < \epsilon$ as required.
Conversely, suppose the condition holds. We prove $f$ is continuous at $c$ by contradiction. So suppose $f$ is not continuous at $c$. Then there is $\epsilon > 0$ for which no $\delta$ works. In particular, for each $n \ge 1$, $\delta = 1/n$ does not work, so there is $x_n \in A$ with $|x_n - c| < 1/n$ and $|f(x_n) - f(c)| \ge \epsilon$. This gives a sequence $x_n \to c$ for which $|f(x_n) - f(c)| \ge \epsilon$ for $n \ge 1$, so $f(x_n) \not\to f(c)$. This is a contradiction. □

Example 3.9. Let us use Proposition 3.8 to show that certain functions are not continuous at a point.
(1) Consider the integer part function $f(x) = [x]$. At $c = 5$, $f(c) = 5$. Let $x_n = 5 - \frac{1}{n}$. Then $x_n \to 5$, but $[x_n] = 4$ and so $[x_n] \not\to 5$. Thus, $f$ is not continuous at $c = 5$.
(2) Define
$$f(x) = \begin{cases} \sin(1/x) & \text{if } x \neq 0, \\ r & \text{if } x = 0. \end{cases}$$
Then $f$ is continuous at $c \neq 0$ since it is formed out of continuous functions. Let us see what happens at $c = 0$. Let $x_n = \frac{2}{(2n+1)\pi}$. Then $x_n \to 0$, but $f(x_n) = \sin\left(\frac{(2n+1)\pi}{2}\right) = (-1)^n$ does not converge. So $f$ is not continuous at $c = 0$, no matter what $r$ is.

3.1.4. Further properties of continuous functions.

Theorem 3.10 (Intermediate value property). Let $I$ be an interval, and $f : I \to \mathbb{R}$ be a continuous function. Let $r \in \mathbb{R}$ be such that $f(x_1) < r < f(x_2)$ for some $x_1 < x_2$ in $I$. Then there is $x \in (x_1, x_2)$ such that $f(x) = r$.

The proof uses the completeness property of $\mathbb{R}$, and is omitted.

Example 3.11. Let us show that the function $f(x) = x^4 + 2x^3 - 2$ has a root in $(0, 1)$. Its graph is shown below. The red point is $x = 1$.

[Figure: graph of $f$.]

Since $f$ is a polynomial, it is continuous. Now $f(0) = -2$ and $f(1) = 1$. So by IVP, $f$ attains every value between $-2$ and $1$ in the interval $(0, 1)$, and in particular, the value 0.

Corollary 3.12. Let $f : A \to \mathbb{R}$ be a continuous function, and $I \subseteq A$ be an interval. Then $f(I)$ is an interval.

Exercise 3.13. Is there a continuous function from $[0, 1]$ onto $[2, 3]$? onto $[2, 3] \cup [4, 5]$? onto $(0, \infty)$? onto $[-1, 1]$?

Corollary 3.14.
Let $f : I \to \mathbb{R}$ be continuous and injective. Then $f$ is either increasing or decreasing. Also, $f^{-1} : f(I) \to \mathbb{R}$ is continuous.

Proof. Exercise. □

Let us use the above result to deduce the existence of the square root function
$$g : [0, \infty) \to [0, \infty), \qquad g(x) = \sqrt{x}.$$
Take $f : [0, \infty) \to [0, \infty)$ with $f(x) = x^2$. This function is continuous and injective. Also $f([0, \infty)) = [0, \infty)$. Put $g = f^{-1}$. The graph of $f$ on $(0, 2)$ and of $g$ on $(0, 4)$ are shown below.

[Figure: graphs of $f$ on $(0, 2)$ and of $g$ on $(0, 4)$.]

Theorem 3.15. Let $f : [a, b] \to \mathbb{R}$ be continuous. Then $f$ is bounded on $[a, b]$ and attains its global maximum and global minimum on $[a, b]$. Further, $f([a, b])$ is a closed and bounded interval.

The proof is omitted.

Example 3.16. Let us see what can go wrong if the domain is an interval but not a closed and bounded interval.
(1) Take $f : (0, 1) \to \mathbb{R}$ with $f(x) = \frac{1}{x}$. Then $f$ is continuous but not bounded.
(2) Take $f : [0, \infty) \to \mathbb{R}$ with $f(x) = x$. Then $f$ is continuous but not bounded.
(3) Take $f : (0, 1) \to \mathbb{R}$ with $f(x) = x$. Then $f$ is continuous and bounded, but does not attain its global maximum or global minimum.

Exercise 3.17. Construct a continuous function $f : \mathbb{R} \to \mathbb{R}$ such that $f$ takes every value exactly three times.

Exercise 3.18. Define the function $f : \mathbb{R} \to \mathbb{R}$ by
$$(3.1) \qquad f(x) = \begin{cases} 0 & \text{if } x \text{ is irrational}, \\ 1/q & \text{if } x = p/q \text{ in lowest terms}. \end{cases}$$
Show: $f$ is continuous at all irrational points, but discontinuous at all rational points.

Puzzle 3.19. A pilgrim wants to go to a temple on the top of a mountain. He starts from the bottom at 8 in the morning, and reaches the top at 12. He stays there for a week. While coming down, he again starts at 8 in the morning, and reaches the bottom at 11. Show that there is a time between 8 and 11 when the pilgrim was at the same point on the mountain while ascending and descending.

3.2. Limit of a function

3.2.1. Limit of a function. Let $f : A \to \mathbb{R}$ and $c \in \mathbb{R}$ be such that there is $r > 0$ with $(c - r, c) \cup (c, c + r) \subseteq A$.
In other words, $A$ contains all points within distance $r$ of $c$, except perhaps the point $c$.

Definition 3.20. We say $\lim_{x\to c} f(x)$ exists if there is $\ell \in \mathbb{R}$ such that for every sequence $\{x_n\}$ in $A$ with $x_n \neq c$ and $x_n \to c$, we have $f(x_n) \to \ell$. In this case, we write
$$\ell = \lim_{x\to c} f(x),$$
and say $f$ has a limit at $c$.

Example 3.21. Let us illustrate the notion of limit.
(1) Define $f : \mathbb{R} \to \mathbb{R}$ by
$$f(x) = \begin{cases} 3x + 5 & \text{if } x \neq 0, \\ 1 & \text{if } x = 0. \end{cases}$$
Let $x_n \to 0$, $x_n \neq 0$ for $n \ge 1$. Then $f(x_n) = 3x_n + 5 \to 5$. Hence $\lim_{x\to 0} f(x) = 5$.
(2) Let $f(x) = [x]$.
– Let $x_n = 5 + (1/n)$, so $x_n \to 5$. Also $f(x_n) = 5$, so $f(x_n) \to 5$.
– Let $x_n = 5 - (1/n)$, so $x_n \to 5$. Also $f(x_n) = 4$, so $f(x_n) \to 4$.
Thus $\lim_{x\to 5} f(x)$ does not exist.
(3) Let $f(x) = \sin(1/x)$ for $x \in \mathbb{R} \setminus \{0\}$. Let $x_n = \frac{2}{(2n+1)\pi}$, so $x_n \to 0$, but $f(x_n) = \sin\left(\frac{(2n+1)\pi}{2}\right) = (-1)^n$ does not converge. Thus $\lim_{x\to 0} f(x)$ does not exist.

Remark 3.22 (ϵ–δ). Equivalently, similar to Definition 3.1 for continuity, we say:
$$\lim_{x\to c} f(x) = \ell$$
if the following condition holds. For every $\epsilon > 0$, there is $\delta > 0$ such that
$$0 < |x - c| < \delta \implies |f(x) - \ell| < \epsilon.$$
It is possible to take this as a definition, and deduce Definition 3.20 as a consequence.

3.2.2. Algebra of limits of functions. The operations of addition, multiplication, scalar multiplication on functions are compatible with the notion of taking limits in the following sense.

Lemma 3.23 (Limit theorems). Suppose $\lim_{x\to c} f(x)$ and $\lim_{x\to c} g(x)$ exist. Then
(i) $\lim_{x\to c} (f + g)(x) = \lim_{x\to c} f(x) + \lim_{x\to c} g(x)$,
(ii) $\lim_{x\to c} rf(x) = r \lim_{x\to c} f(x)$ for $r \in \mathbb{R}$,
(iii) $\lim_{x\to c} (fg)(x) = \left(\lim_{x\to c} f(x)\right)\left(\lim_{x\to c} g(x)\right)$,
(iv) $\lim_{x\to c} \frac{1}{f}(x) = \frac{1}{\lim_{x\to c} f(x)}$ (if denominator $\neq 0$).

Proof. Follows from Lemma 2.13 for sequences. □

Lemma 3.24 (Sandwich lemma). If $f(x) \le g(x) \le h(x)$, and $\lim_{x\to c} f(x) = \ell$ and $\lim_{x\to c} h(x) = \ell$, then $\lim_{x\to c} g(x) = \ell$.

Proof. Follows from sandwich Lemma 2.15 for sequences. □

3.2.3. Continuity and limit.
We say $c \in \mathbb{R}$ is an interior point of $A \subseteq \mathbb{R}$ if there is $r > 0$ such that $(c - r, c + r) \subseteq A$.

Proposition 3.25. Let $f : A \to \mathbb{R}$, and $c$ be an interior point of $A$. Then $f$ is continuous at $c$ iff $\lim_{x\to c} f(x)$ exists and is equal to $f(c)$.

Proof idea. We use characterization of continuity given by Proposition 3.8. Forward implication is straightforward. For backward implication: Let $x_n \to c$. Break $\{x_n\}$ into two subsequences: One contains terms not equal to $c$, and other contains terms equal to $c$. Both subsequences, after applying $f$, converge to $f(c)$. Hence, $f(x_n) \to f(c)$, as required. (Ignore either of the two subsequences if it is finite.) □

3.2.4. Left and right limits.

Definition 3.26. We build on Definition 3.20.
(i) We say $\lim_{x\to c^-} f(x)$ exists if there is $\ell \in \mathbb{R}$ such that for every sequence $\{x_n\}$ in $A$ with $x_n < c$ and $x_n \to c$, we have $f(x_n) \to \ell$. In this case, we say $f$ has a left limit at $c$.
(ii) We say $\lim_{x\to c^+} f(x)$ exists if there is $\ell \in \mathbb{R}$ such that for every sequence $\{x_n\}$ in $A$ with $x_n > c$ and $x_n \to c$, we have $f(x_n) \to \ell$. In this case, we say $f$ has a right limit at $c$.

Proposition 3.27. We have: $f$ has a limit at $c$ iff $f$ has a left limit and right limit at $c$, and they are equal.

3.2.5. Types of discontinuities. Suppose $f : A \to \mathbb{R}$ is discontinuous at an interior point $c \in A$. Then one of the following happens.
• $\lim_{x\to c} f(x)$ does not exist.
  – Either left limit or right limit of $f(x)$ at $c$ does not exist (essential discontinuity).
  – Left and right limits of $f(x)$ at $c$ exist, but are not equal (jump discontinuity).
• $\lim_{x\to c} f(x)$ exists, but is not equal to $f(c)$ (removable discontinuity).

3.2.6. Convergence to and at infinity of a function. We mention that it is possible to make sense of the limits
$$\lim_{x\to\infty} f(x) = \ell, \qquad \lim_{x\to-\infty} f(x) = \ell,$$
and also of
$$\lim_{x\to c} f(x) = \infty, \qquad \lim_{x\to c} f(x) = -\infty.$$
The latter two can also be applied to left and right limits. For example,
$$\lim_{x\to\infty} \frac{1}{x} = 0, \qquad \lim_{x\to-\infty} \frac{1}{x} = 0, \qquad \lim_{x\to 0^+} \frac{1}{x} = \infty, \qquad \lim_{x\to 0^-} \frac{1}{x} = -\infty.$$

Remark 3.28 (Metric spaces). We build on Remark 2.23. Let $X$ and $Y$ be metric spaces. It makes sense to define a continuous function $f : X \to Y$ as in Definition 3.1, with $|x - c|$ replaced by $\mathrm{dist}(x, c)$ (distance in $X$), and $|f(x) - f(c)|$ replaced by $\mathrm{dist}(f(x), f(c))$ (distance in $Y$). For the example of $f : \mathbb{R}^2 \to \mathbb{R}$, see Definition 6.12.
An even more general context for continuous functions is that of topological spaces (in which there is a qualitative rather than quantitative notion of what it means for two points to be close to each other). For more details, see Munkres [20, Chapter 2].

CHAPTER 4
Differentiability

4.1. Differentiability

The intuitive idea of a differentiable function $f$ is that the graph of $f$ has tangents which are not vertical (that is, of finite slope). See illustration below. We now formalize this notion.

[Figure: graph of $f(x)$ with a tangent line.]

4.1.1. Differentiable functions. Let $A \subseteq \mathbb{R}$, and $c$ be an interior point of $A$.

Definition 4.1. A function $f : A \to \mathbb{R}$ is differentiable at $c$ if the limit
$$\lim_{h\to 0} \frac{f(c + h) - f(c)}{h}$$
exists. We denote it by $f'(c)$, and call it the derivative of $f$ at $c$.
Equivalently, a function $f : A \to \mathbb{R}$ is differentiable at $c$ if there is a real number $\alpha$ such that
$$(4.1) \qquad \lim_{h\to 0} \frac{f(c + h) - f(c) - \alpha h}{h} = 0.$$
In this case, we say $\alpha$ is the derivative of $f$ at $c$. Note: One may also replace $h$ by $|h|$ in the denominator in (4.1).

Example 4.2. Let us illustrate the notion of differentiability.
(1) Let $f : \mathbb{R} \to \mathbb{R}$ be a constant function. Then $f$ is differentiable and $f'(c) = 0$ for all $c \in \mathbb{R}$.
(2) Let $f_1, f_2, f_3 : \mathbb{R} \to \mathbb{R}$ be
$$f_1(x) = x, \qquad f_2(x) = x^2, \qquad f_3(x) = x^{2/3}.$$
Their graphs are shown below.

[Figure: graphs of $f_1$, $f_2$, $f_3$.]

We have:
– $f_1$ is differentiable and $f_1'(c) = 1$ for all $c \in \mathbb{R}$.
– $f_2$ is differentiable and $f_2'(c) = 2c$ for all $c \in \mathbb{R}$.
– $f_3$ is differentiable at $c \neq 0$, but it is not differentiable at 0 since
$$\frac{f_3(0 + h) - f_3(0)}{h} = \frac{1}{h^{1/3}}$$
whose limit does not exist as $h \to 0$.
(3) Let $f(0) = 0$ and $f(x) = x \sin(1/x)$ for $x \in \mathbb{R} \setminus \{0\}$. Then $f$ is not differentiable at 0 since
$$\frac{f(0 + h) - f(0)}{h} = \sin\frac{1}{h}$$
whose limit does not exist as $h \to 0$.

4.1.2. Left and right derivatives. Let $f : A \to \mathbb{R}$.
(i) Suppose $c \in A$ is such that $[c, c + r) \subseteq A$ for some $r > 0$. If the limit
$$\lim_{h\to 0^+} \frac{f(c + h) - f(c)}{h}$$
exists, then we call it the right derivative of $f$ at $c$, and denote it by $f'_+(c)$.
(ii) Suppose $c \in A$ is such that $(c - r, c] \subseteq A$ for some $r > 0$. If the limit
$$\lim_{h\to 0^-} \frac{f(c + h) - f(c)}{h}$$
exists, then we call it the left derivative of $f$ at $c$, and denote it by $f'_-(c)$.

Lemma 4.3. If $c$ is an interior point of $A$, then $f : A \to \mathbb{R}$ is differentiable at $c$ iff $f'_+(c)$ and $f'_-(c)$ both exist and are equal.

Example 4.4. Let $f(x) = |x|$. Then $f'_-(0) = -1$ and $f'_+(0) = 1$. Hence $f$ is not differentiable at 0.

4.1.3. Derivative function. Let us now focus on the case when the domain of $f$ is an interval $I$.
– We say $f : (a, b) \to \mathbb{R}$ is differentiable on $(a, b)$ if $f$ is differentiable at every $c \in (a, b)$. In this case, define
$$f' : (a, b) \to \mathbb{R}, \qquad c \mapsto f'(c).$$
We call $f'$ the derivative of $f$. We make a similar definition when the domain of $f$ is $(a, \infty)$, $(-\infty, b)$, $\mathbb{R}$.
– We say $f : [a, b] \to \mathbb{R}$ is differentiable on $[a, b]$ if $f$ is differentiable on $(a, b)$, and $f'_+(a)$ and $f'_-(b)$ exist. In this case, define
$$f' : [a, b] \to \mathbb{R}, \qquad a \mapsto f'_+(a), \quad c \mapsto f'(c), \quad b \mapsto f'_-(b)$$
for $c \in (a, b)$. We make a similar definition when the domain of $f$ is $[a, b)$, $(a, b]$, $[a, \infty)$, $(-\infty, b]$.

4.1.4. Increment function.

Lemma 4.5 (Caratheodory lemma). A function $f : A \to \mathbb{R}$ is differentiable at an interior point $c$ of $A$ iff there is a function $f_1 : A \to \mathbb{R}$ which is continuous at $c$ such that
$$f(x) - f(c) = (x - c) f_1(x)$$
for $x \in A$. Moreover, $f'(c) = f_1(c)$.

We call $f_1 : A \to \mathbb{R}$ the increment function. Note very carefully: $f_1$ depends on the point $c$.

Proof. We make use of Proposition 3.25.
Forward implication. Let $f$ be differentiable at $c$.
Define
$$f_1(x) := \begin{cases} \dfrac{f(x) - f(c)}{x - c} & \text{if } x \in A \setminus \{c\}, \\ f'(c) & \text{if } x = c. \end{cases}$$
Then $f_1$ is continuous at $c$ since $\lim_{x\to c} f_1(x) = f'(c) = f_1(c)$.
Backward implication. Let $f_1$ be as stated. Then
$$\lim_{h\to 0} \frac{f(c + h) - f(c)}{h} = \lim_{h\to 0} f_1(c + h) = \lim_{x\to c} f_1(x) = f_1(c)$$
since $f_1$ is continuous at $c$. Hence $f$ is differentiable at $c$. □

In other words, the increment function $f_1$ keeps track of slopes of all secants drawn from $(c, f(c))$. More precisely, $f_1(x)$ is the slope of the line segment joining $(c, f(c))$ to $(x, f(x))$ for $x \neq c$, and $f_1(c)$ is the slope of the tangent line at $(c, f(c))$.

[Figure: secants through $(c, f(c))$ and the tangent line at $c$.]

Corollary 4.6. If $f$ is differentiable at $c$, then $f$ is continuous at $c$.

Proof. Let $f$ be differentiable at $c$. Using Caratheodory Lemma 4.5, write
$$f(x) = f(c) + (x - c) f_1(x).$$
Since $f_1$ is continuous, so is $f$ by Lemma 3.4. Alternatively,
$$\lim_{x\to c} f(x) = \lim_{x\to c} \left[ f(c) + (x - c) f_1(x) \right] = f(c)$$
by Lemma 3.23. Now use Proposition 3.25. □

Remark 4.7.
– If $f$ is not continuous at $c$, then it is not differentiable at $c$. For example: The function $f(x) = [x]$ is not continuous at 5, hence it is not differentiable at 5.
– The converse of Corollary 4.6 is false. For example: The function $f(x) = |x|$ is continuous at 0, but it is not differentiable at 0.

Remark 4.8. Here is an alternative way to phrase Caratheodory lemma. A function $f : A \to \mathbb{R}$ is differentiable at an interior point $c$ of $A$ iff there is a real number $\alpha$ such that
$$f(c + h) = f(c) + \alpha h + \epsilon(h)\, h$$
where $\epsilon(h)$ is defined for small $h$, and $\epsilon(h) \to 0$ as $h \to 0$. Moreover, $f'(c) = \alpha$.

4.1.5. Algebra of differentiable functions. The operations of addition, multiplication, scalar multiplication on functions are compatible with the notion of differentiability in the following sense.

Lemma 4.9. Suppose $f, g : A \to \mathbb{R}$ are differentiable at $c \in A$.
Then
(i) $f + g$ is differentiable at $c$, and $(f + g)'(c) = f'(c) + g'(c)$,
(ii) $rf$ is differentiable at $c$, and $(rf)'(c) = r f'(c)$ for $r \in \mathbb{R}$,
(iii) $fg$ is differentiable at $c$, and $(fg)'(c) = f'(c) g(c) + f(c) g'(c)$,
(iv) $1/f$ is differentiable at $c$, and
$$(1/f)'(c) = \frac{-f'(c)}{f(c)^2}$$
if $f(c) \neq 0$.

Proof. For item (i): Write
$$f(x) = f(c) + (x - c) f_1(x) \quad\text{and}\quad g(x) = g(c) + (x - c) g_1(x).$$
Then
$$f(x) + g(x) = f(c) + g(c) + (x - c)[f_1(x) + g_1(x)].$$
Thus,
$$(f + g)(x) = (f + g)(c) + (x - c)(f_1 + g_1)(x).$$
Since $f_1$ and $g_1$ are both continuous at $c$, so is $f_1 + g_1$. It serves as the increment function for $f + g$ at the point $c$. Thus by Caratheodory Lemma 4.5, $f + g$ is differentiable at $c$. Moreover,
$$(f + g)'(c) = (f + g)_1(c) = (f_1 + g_1)(c) = f_1(c) + g_1(c) = f'(c) + g'(c).$$
Proofs of items (ii), (iii), (iv) use similar ideas. □

Lemma 4.10 (Chain rule). Let $f : A \to B$ and $g : B \to \mathbb{R}$. Let $c$ be an interior point of $A$, and $f(c)$ be an interior point of $B$. If $f$ is differentiable at $c$, and $g$ is differentiable at $f(c)$, then the composite $g \circ f : A \to \mathbb{R}$ is differentiable at $c$, and
$$(4.2) \qquad (g \circ f)'(c) = g'(f(c))\, f'(c).$$

Proof. Exercise. □

Example 4.11. Let $\varphi(x) = (4x^3 + 3)^7 + 2$. Define $f(x) = 4x^3 + 3$ and $g(y) = y^7 + 2$. Then $\varphi = g \circ f$. Hence,
$$\varphi'(c) = g'(f(c))\, f'(c) = 7(4c^3 + 3)^6 (12c^2).$$

Lemma 4.12. Let $f : (a, b) \to (p, q)$ be continuous, and a bijection. Let $f^{-1} : (p, q) \to (a, b)$ be the inverse function. Let $f$ be differentiable at $c \in (a, b)$, and $f'(c) \neq 0$. Then $f^{-1}$ is differentiable at $f(c) \in (p, q)$, and
$$(f^{-1})'(f(c)) = \frac{1}{f'(c)}.$$

Proof. Put $g = f^{-1}$. Then $\varphi = g \circ f$ is the identity function on $(a, b)$. By the chain rule, $1 = \varphi'(c) = g'(f(c))\, f'(c)$. Therefore, $g'(f(c)) = 1/f'(c)$. □

Draw a picture.

Example 4.13. Let us illustrate Lemma 4.12.
(1) Let
$$f : \left(-\frac{\pi}{2}, \frac{\pi}{2}\right) \to (-1, 1), \qquad f(x) = \sin(x).$$
Then $f$ is continuous and a bijection. Its inverse function is denoted $\sin^{-1}$. Put $f(c) = d$.
Thus,
$$(\sin^{-1})'(d) = (f^{-1})'(d) = \frac{1}{f'(c)} = \frac{1}{\cos(c)} = \frac{1}{\sqrt{1 - \sin^2 c}} = \frac{1}{\sqrt{1 - d^2}}.$$
(2) Fix a positive integer $n \ge 1$. Let
$$f : (0, \infty) \to (0, \infty), \qquad f(x) = x^n.$$
Then $f$ is continuous and a bijection. Put $f(c) = d$. Thus,
$$(f^{-1})'(d) = \frac{1}{f'(c)} = \frac{1}{n c^{n-1}} = \frac{1}{n d^{(n-1)/n}} = \frac{1}{n} d^{(1/n) - 1}.$$

Remark 4.14. The derivative of a trigonometric function is again a trigonometric function. However, the derivative of an inverse trigonometric function is algebraic involving rational functions and square roots. This is because the relations among different trigonometric functions are algebraic, and usually quadratic. For instance, in the above calculation of the derivative of $\sin^{-1}$, we used the quadratic relation $\sin^2\theta + \cos^2\theta = 1$.

4.2. Maxima and minima

The derivative provides an effective tool to solve maxima and minima (optimization) problems. Conversely, one can use these ideas to prove results about the derivative such as the mean value theorem. This establishes a clear connection between sign of the first derivative and increasing/decreasing functions. Going one step further, there is a connection between sign of the second derivative and convex/concave functions.

4.2.1. Global and local maxima/minima. Let $f : A \to \mathbb{R}$ be a function.

Definition 4.15. We say:
(i) $f$ has a global maximum at $c$ if $f(x) \le f(c)$ for $x \in A$. In this case, $f(c)$ is the least upper bound of $f$, and it is attained at $c$.
(ii) $f$ has a global minimum at $c$ if $f(x) \ge f(c)$ for $x \in A$. In this case, $f(c)$ is the greatest lower bound of $f$, and it is attained at $c$.

Definition 4.16. We say:
(i) $f$ has a local maximum at $c$ if there is $\delta > 0$ such that $|x - c| < \delta$ implies $f(x) \le f(c)$.
(ii) $f$ has a local minimum at $c$ if there is $\delta > 0$ such that $|x - c| < \delta$ implies $f(x) \ge f(c)$.

Note: Global maximum (minimum) implies local maximum (minimum), but the converse is false. Note: A constant function has both a global maximum and a global minimum at all points.
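The difference between Definitions 4.15 and 4.16 can be checked numerically. The following is a minimal sketch, not a proof: it samples a function on a finite grid and tests the two conditions. The function $f(x) = x^3 - 3x$ on $A = [-3, 3]$, the helper names `grid`, `is_global_max`, `is_local_max`, and the grid resolution are all illustrative assumptions, not taken from the notes.

```python
def f(x):
    # Illustrative function (an assumption, not from the notes).
    return x**3 - 3*x

def grid(a, b, steps=6000):
    # Evenly spaced sample points of [a, b], endpoints included.
    return [a + (b - a) * k / steps for k in range(steps + 1)]

def is_global_max(func, c, a, b):
    # Sampled version of Definition 4.15(i): f(x) <= f(c) for x in A = [a, b].
    return all(func(x) <= func(c) for x in grid(a, b))

def is_local_max(func, c, a, b, delta):
    # Sampled version of Definition 4.16(i):
    # |x - c| < delta implies f(x) <= f(c), for x in A = [a, b].
    return all(func(x) <= func(c) for x in grid(a, b) if abs(x - c) < delta)

a, b = -3.0, 3.0
print(is_local_max(f, -1.0, a, b, delta=0.5))  # True: local maximum at c = -1
print(is_global_max(f, -1.0, a, b))            # False: f(3) = 18 > f(-1) = 2
print(is_global_max(f, 3.0, a, b))             # True: global maximum at the endpoint 3
```

Here $f(x) - f(-1) = (x + 1)^2 (x - 2) \le 0$ for $x < 2$, which is why $c = -1$ passes the local test for any $\delta \le 3$ but fails the global one on $[-3, 3]$.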
We say $f$ has a global (local) extremum at $c$ if it has either a global (local) maximum at $c$, or a global (local) minimum at $c$.

4.2.2. Local maxima/minima: necessary condition.

Lemma 4.17. Let $c$ be an interior point of $A$. If $f : A \to \mathbb{R}$ is differentiable at $c$, and has either a local maximum or a local minimum at $c$, then $f'(c) = 0$.

See illustrations below.

Proof. Suppose $f$ has a local minimum at $c$. Thus, for small $h$, $f(c + h) - f(c) \ge 0$.
– $h > 0$: $\dfrac{f(c + h) - f(c)}{h} \ge 0$. Hence, $f'_+(c) \ge 0$.
– $h < 0$: $\dfrac{f(c + h) - f(c)}{h} \le 0$. Hence, $f'_-(c) \le 0$.
Since $f$ is differentiable at $c$, we have $f'(c) = f'_+(c) = f'_-(c)$, and so $f'(c) = 0$. Alternatively, in terms of the increment function $f_1$ of Lemma 4.5: $f_1(c + h) \ge 0$ for $h > 0$, and $f_1(c + h) \le 0$ for $h < 0$. And $f_1(x)$ is continuous at $c$, so $f_1(c) = 0$.
We can make a similar argument when $f$ has a local maximum at $c$. □

Remark 4.18. We make some remarks related to the above result.
(1) Let $f : [-1, 1] \to \mathbb{R}$ with $f(x) = x^2$. Then $f$ has a local minimum at the interior point 0, and indeed $f'(0) = 0$ as claimed by Lemma 4.17.
(2) Let $f : [0, 1] \to \mathbb{R}$ with $f(x) = x$. Then $f$ has a local minimum at 0 and local maximum at 1. But $f'_+(0) \neq 0$ and $f'_-(1) \neq 0$. This does not contradict Lemma 4.17 since 0 and 1 are not interior points.
(3) Let $f : [-1, 1] \to \mathbb{R}$ with $f(x) = x^3$. Then $f'(0) = 0$, but $f$ does not have a local maximum or a local minimum at 0. Thus, the converse to Lemma 4.17 is false.

4.2.3. Rolle’s theorem and mean value theorem. We now discuss Rolle’s theorem and the mean value theorem. The former is a special case of the latter. The latter is attributed to Lagrange.

Theorem 4.19 (Rolle’s theorem). Let $f : [a, b] \to \mathbb{R}$ be such that
(i) $f$ is continuous on $[a, b]$,
(ii) $f$ is differentiable on $(a, b)$,
(iii) $f(a) = f(b)$.
Then there is $c \in (a, b)$ such that $f'(c) = 0$.

See illustration below.

[Figure: graph of $f$ on $[a, b]$ with $f(a) = f(b)$.]

Proof. We consider two cases.
– $f$ is constant. Then $f'(c) = 0$ for all $c \in (a, b)$.
– $f$ is not a constant. Then the global minimum of $f$ is strictly smaller than the global maximum of $f$. Since $f$ is continuous, by Theorem 3.15, both are attained on $[a, b]$.
They cannot both be attained at the endpoints $a$ and $b$, since $f(a) = f(b)$. Hence, there is $c \in (a, b)$ such that $f$ has either a global maximum or a global minimum at $c$. Global maximum/minimum implies local maximum/minimum, so by Lemma 4.17, $f'(c) = 0$. □

Example 4.20. Let us return to Example 3.11. We saw by IVP that the function $f(x) = x^4 + 2x^3 - 2$ has a root in $(0, 1)$. Now let us show that $f(x) = x^4 + 2x^3 - 2$ has exactly one root in $(0, 1)$. Suppose there are two roots in $(0, 1)$. Say $f(a) = 0 = f(b)$ for $0 < a < b < 1$. Then by Rolle’s theorem, $f'(c) = 0$ for some $c \in (a, b)$. Now
$$f'(x) = 4x^3 + 6x^2 = 2x^2(2x + 3) \neq 0$$
for $x \in (0, 1)$. This is a contradiction.

[Figure: graphs of $f$ and $f'$.]

The graphs of $f$ and $f'$ are shown above. The red point is $x = 1$.

Theorem 4.21 (Mean value theorem). Let $f : [a, b] \to \mathbb{R}$ be such that
(i) $f$ is continuous on $[a, b]$,
(ii) $f$ is differentiable on $(a, b)$.
Then there is $c \in (a, b)$ such that
$$f'(c) = \frac{f(b) - f(a)}{b - a}.$$

See illustration below.

[Figure: graph of $f$ on $[a, b]$ with the secant from $(a, f(a))$ to $(b, f(b))$ and a point $c$.]

Proof. For $x \in [a, b]$, define
$$F(x) := f(x) - \frac{f(b) - f(a)}{b - a}(x - a).$$
Then $F$ is continuous on $[a, b]$, differentiable on $(a, b)$ and $F(a) = f(a) = F(b)$. By Rolle’s theorem, there is $c \in (a, b)$ such that $F'(c) = 0$, that is, $f'(c) = \frac{f(b) - f(a)}{b - a}$. □

Remark 4.22 (Physical interpretation). Let $f(t)$ denote the displacement of a particle at time $t$ for $a \le t \le b$. Then the average speed is $\frac{f(b) - f(a)}{b - a}$, and speed at time $c$ is $f'(c)$. Thus, MVT says that there is a time $c$ such that the speed at time $c$ equals the average speed.

Remark 4.23. Note very carefully:
– Rolle’s theorem and the mean value theorem are results about the derivative, and make no direct reference to the notions of minima and maxima. Then why are they in this section, and not in Section 4.1? The reason is that the proof of Rolle’s theorem uses a result about minima and maxima.
– Rolle’s theorem is a corollary of the mean value theorem obtained by imposing the additional hypothesis $f(a) = f(b)$.
Then why is it stated earlier rather than later? The reason is that Rolle’s theorem is used in the proof of the mean value theorem.

4.2.4. Mean value inequality.

Lemma 4.24. Let $f : [a, b] \to \mathbb{R}$ be such that $f$ is continuous on $[a, b]$, and differentiable on $(a, b)$. If $m \le f'(x) \le M$ for all $x \in (a, b)$, then
$$m(b - a) \le f(b) - f(a) \le M(b - a).$$
This is the mean value inequality.

Proof. This follows from Theorem 4.21 (MVT). □

Example 4.25. Fix $n$. Define $f : [n, n + 1] \to \mathbb{R}$ by $f(x) = \sqrt{x}$. Then $f'(x) = \frac{1}{2\sqrt{x}}$. Moreover,
$$\frac{1}{2\sqrt{n + 1}} \le f'(x) \le \frac{1}{2\sqrt{n}}.$$
Therefore, by the mean value inequality,
$$\frac{1}{2\sqrt{n + 1}}(n + 1 - n) \le \sqrt{n + 1} - \sqrt{n} \le \frac{1}{2\sqrt{n}}(n + 1 - n).$$
For $n = 1$, we get $\frac{1}{2\sqrt{2}} \le \sqrt{2} - 1 \le \frac{1}{2}$. Therefore, $\sqrt{2} < \frac{3}{2}$. To get a lower bound, we use $\frac{1}{\sqrt{2}} > \frac{2}{3}$. So $\frac{1}{2} \cdot \frac{2}{3} < \sqrt{2} - 1$ which yields $\frac{4}{3} < \sqrt{2}$. Thus,
$$\frac{4}{3} < \sqrt{2} < \frac{3}{2}.$$

4.2.5. Increasing and decreasing functions.

Lemma 4.26. Let $f : [a, b] \to \mathbb{R}$ be such that $f$ is continuous on $[a, b]$, and differentiable on $(a, b)$.
(1) If $f'(x) = 0$ for $x \in (a, b)$, then $f$ is constant on $[a, b]$. (Converse true).
(2) (i) If $f'(x) \ge 0$ for $x \in (a, b)$, then $f$ is increasing on $[a, b]$. (Converse true).
(ii) If $f'(x) \le 0$ for $x \in (a, b)$, then $f$ is decreasing on $[a, b]$. (Converse true).
(iii) If $f'(x) > 0$ for $x \in (a, b)$, then $f$ is strictly increasing on $[a, b]$. (Converse false).
(iv) If $f'(x) < 0$ for $x \in (a, b)$, then $f$ is strictly decreasing on $[a, b]$. (Converse false).

Proof. These can be deduced from Theorem 4.21 (MVT). □

Example 4.27. Define $f : \mathbb{R} \to \mathbb{R}$ by $f(x) = x(1 - x)$. Its graph is shown below.

[Figure: graph of $f$.]

Then $f'(x) = 1 - 2x$. Thus, $f'(x) > 0$ if $x < \frac{1}{2}$, and $f'(x) < 0$ if $x > \frac{1}{2}$. So, $f$ is strictly increasing on $(-\infty, \frac{1}{2})$, and strictly decreasing on $(\frac{1}{2}, \infty)$.

4.2.6. Convex functions. Recall convex functions from Section 1.2.10. We now relate them to differentiability.

Lemma 4.28. Let $I$ be an interval and $f : I \to \mathbb{R}$ be differentiable. Then
(i) $f'$ is increasing on $I$ iff $f$ is convex on $I$.
(ii) $f'$ is decreasing on $I$ iff $f$ is concave on $I$.
(iii) $f'$ is strictly increasing on $I$ iff $f$ is strictly convex on $I$.
(iv) $f'$ is strictly decreasing on $I$ iff $f$ is strictly concave on $I$.

Proof. See [12, Proposition 4.31]. Note: Items (i) and (ii) imply each other, while items (iii) and (iv) i