All the Math You Missed for Graduate School - 2nd Edition PDF

Summary

This book, "All the Math You Missed (But Need to Know for Graduate School)", is a resource for graduate students in mathematical and related fields that need to learn some serious math quickly. It covers key undergraduate math topics like linear algebra and analysis, with an emphasis on intuition and examples, to fill in knowledge gaps. The second edition includes updated chapters.

Full Transcript

All the Math You Missed Beginning graduate students in mathematical sciences and related areas in physical and computer sciences and engineering are expected to be familiar with a daunting breadth of mathematics, but few have such a background. This bestselling book helps students fill in the gaps...

All the Math You Missed Beginning graduate students in mathematical sciences and related areas in physical and computer sciences and engineering are expected to be familiar with a daunting breadth of mathematics, but few have such a background. This bestselling book helps students fill in the gaps in their knowledge. Thomas A. Garrity explains the basic points and a few key results of all the most important undergraduate topics in mathematics, emphasizing the intuitions behind the subject. The explanations are accompanied by numerous examples, exercises and suggestions for further reading that allow the reader to test and develop their understanding of these core topics. Featuring four new chapters and many other improvements, this second edition of All the Math You Missed is an essential resource for advanced undergraduates and beginning graduate students who need to learn some serious mathematics quickly. Thomas A. Garrity is the Webster Atwell Class of 1921 Professor of Mathematics at Williams College, Massachusetts, where he was the director of the Williams College Project for Effective Teaching for many years. Among his awards are Rice University’s Nicolas Salgo Outstanding Teaching award and the Haimo award of the MAA. His other books include Algebraic Geometry: A Problem Solving Approach (2013, co-authored) and Electricity and Magnetism for Mathematicians (2015). All the Math You Missed But Need to Know for Graduate School second edition Thomas A. Garrity Williams College, Williamstown, MA Figures by Lori Pedersen University Printing House, Cambridge CB2 8BS, United Kingdom One Liberty Plaza, 20th Floor, New York, NY 10006, USA 477 Williamstown Road, Port Melbourne, VIC 3207, Australia 314–321, 3rd Floor, Plot 3, Splendor Forum, Jasola District Centre, New Delhi – 110025, India 79 Anson Road, #06–04/06, Singapore 079906 Cambridge University Press is part of the University of Cambridge. It furthers the University’s mission by disseminating knowledge in the pursuit of education, learning, and research at the highest international levels of excellence. www.cambridge.org Information on this title: www.cambridge.org/9781316518403 DOI: 10.1017/9781108992879 © Thomas A. Garrity 2001, 2021 This publication is in copyright. Subject to statutory exception and to the provisions of relevant collective licensing agreements, no reproduction of any part may take place without the written permission of Cambridge University Press. First published 2001 Second Edition 2021 Printed in the United Kingdom by TJ Books Limited, Padstow Cornwall A catalogue record for this publication is available from the British Library. ISBN 978-1-316-51840-3 Hardback ISBN 978-1-009-00919-5 Paperback Cambridge University Press has no responsibility for the persistence or accuracy of URLs for external or third-party internet websites referred to in this publication and does not guarantee that any content on such websites is, or will remain, accurate or appropriate. Dedicated to the Memory of Robert Mizner Contents Preface page xiii On the Structure of Mathematics xix Brief Summaries of Topics xxiii 0.1 Linear Algebra xxiii 0.2 Real Analysis xxiii 0.3 Differentiating Vector-Valued Functions xxiii 0.4 Point Set Topology xxiii 0.5 Classical Stokes’ Theorems xxiv 0.6 Differential Forms and Stokes’ Theorem xxiv 0.7 Curvature for Curves and Surfaces xxiv 0.8 Geometry xxiv 0.9 Countability and the Axiom of Choice xxv 0.10 Elementary Number Theory xxv 0.11 Algebra xxv 0.12 Algebraic Number Theory xxv 0.13 Complex Analysis xxv 0.14 Analytic Number Theory xxvi 0.15 Lebesgue Integration xxvii 0.16 Fourier Analysis xxvii 0.17 Differential Equations xxvii 0.18 Combinatorics and Probability Theory xxvii 0.19 Algorithms xxviii 0.20 Category Theory xxviii 1 Linear Algebra 1 1.1 Introduction 1 1.2 The Basic Vector Space Rn 1 1.3 Vector Spaces and Linear Transformations 4 1.4 Bases, Dimension, and Linear Transformations as Matrices 6 1.5 The Determinant 9 1.6 The Key Theorem of Linear Algebra 12 1.7 Similar Matrices 13 1.8 Eigenvalues and Eigenvectors 15 1.9 Dual Vector Spaces 19 1.10 Books 20 Exercises 21 viii Contents 2  and δ Real Analysis 23 2.1 Limits 23 2.2 Continuity 25 2.3 Differentiation 26 2.4 Integration 28 2.5 The Fundamental Theorem of Calculus 31 2.6 Pointwise Convergence of Functions 35 2.7 Uniform Convergence 36 2.8 The Weierstrass M-Test 39 2.9 Weierstrass’ Example 40 2.10 Books 43 Exercises 44 3 Calculus for Vector-Valued Functions 46 3.1 Vector-Valued Functions 46 3.2 Limits and Continuity of Vector-Valued Functions 47 3.3 Differentiation and Jacobians 48 3.4 The Inverse Function Theorem 52 3.5 The Implicit Function Theorem 54 3.6 Books 58 Exercises 59 4 Point Set Topology 61 4.1 Basic Definitions 61 4.2 The Standard Topology on Rn 63 4.3 Metric Spaces 70 4.4 Bases for Topologies 71 4.5 Zariski Topology of Commutative Rings 72 4.6 Books 75 Exercises 76 5 Classical Stokes’ Theorems 78 5.1 Preliminaries about Vector Calculus 79 5.1.1 Vector Fields 79 5.1.2 Manifolds and Boundaries 80 5.1.3 Path Integrals 84 5.1.4 Surface Integrals 88 5.1.5 The Gradient 90 5.1.6 The Divergence 90 5.1.7 The Curl 91 5.1.8 Orientability 92 5.2 The Divergence Theorem and Stokes’ Theorem 92 5.3 A Physical Interpretation of the Divergence Theorem 94 5.4 A Physical Interpretation of Stokes’ Theorem 96 Contents ix 5.5 Sketch of a Proof of the Divergence Theorem 97 5.6 Sketch of a Proof of Stokes’ Theorem 102 5.7 Books 105 Exercises 105 6 Differential Forms and Stokes’ Theorem 107 6.1 Volumes of Parallelepipeds 107 6.2 Differential Forms and the Exterior Derivative 111 6.2.1 Elementary k-Forms 111 6.2.2 The Vector Space of k-Forms 114 6.2.3 Rules for Manipulating k-Forms 115 6.2.4 Differential k-Forms and the Exterior Derivative 118 6.3 Differential Forms and Vector Fields 120 6.4 Manifolds 122 6.5 Tangent Spaces and Orientations 128 6.5.1 Tangent Spaces for Implicit and Parametric Manifolds 128 6.5.2 Tangent Spaces for Abstract Manifolds 129 6.5.3 Orientation of a Vector Space 130 6.5.4 Orientation of a Manifold and Its Boundary 131 6.6 Integration on Manifolds 133 6.7 Stokes’ Theorem 135 6.8 Books 138 Exercises 138 7 Curvature for Curves and Surfaces 141 7.1 Plane Curves 141 7.2 Space Curves 144 7.3 Surfaces 148 7.4 The Gauss–Bonnet Theorem 153 7.5 Books 154 Exercises 154 8 Geometry 156 8.1 Euclidean Geometry 156 8.2 Hyperbolic Geometry 158 8.3 Elliptic Geometry 161 8.4 Curvature 162 8.5 Books 163 Exercises 163 9 Countability and the Axiom of Choice 165 9.1 Countability 165 9.2 Naive Set Theory and Paradoxes 169 9.3 The Axiom of Choice 171 9.4 Non-measurable Sets 172 x Contents 9.5 Gödel and Independence Proofs 173 9.6 Books 174 Exercises 175 10 Elementary Number Theory 176 10.1 Types of Numbers 177 10.2 Prime Numbers 179 10.3 The Division Algorithm and the Euclidean Algorithm 181 10.4 Modular Arithmetic 183 10.5 Diophantine Equations 183 10.6 Pythagorean Triples 185 10.7 Continued Fractions 187 10.8 Books 190 Exercises 191 11 Algebra 192 11.1 Groups 192 11.2 Representation Theory 198 11.3 Rings 200 11.4 Fields and Galois Theory 202 11.5 Books 207 Exercises 208 12 Algebraic Number Theory 210 12.1 Algebraic Number Fields 210 12.2 Algebraic Integers 211 12.3 Units 213 12.4 Primes and Problems with Unique Factorization 214 12.5 Books 215 Exercises 215 13 Complex Analysis 216 13.1 Analyticity as a Limit 217 13.2 Cauchy–Riemann Equations 219 13.3 Integral Representations of Functions 224 13.4 Analytic Functions as Power Series 232 13.5 Conformal Maps 235 13.6 The Riemann Mapping Theorem 238 13.7 Several Complex Variables: Hartog’s Theorem 240 13.8 Books 241 Exercises 242 14 Analytic Number Theory 245 14.1 The Riemann Zeta Function 245 14.2 Riemann’s Insight 247 Contents xi 14.3 The Gamma Function 248 14.4 The Functional Equation: A Hidden Symmetry 249 14.5 Linking π(x) with the Zeros of ζ (s) 250 14.6 Books 253 Exercises 254 15 Lebesgue Integration 255 15.1 Lebesgue Measure 255 15.2 The Cantor Set 258 15.3 Lebesgue Integration 260 15.4 Convergence Theorems 263 15.5 Books 264 Exercises 265 16 Fourier Analysis 266 16.1 Waves, Periodic Functions and Trigonometry 266 16.2 Fourier Series 267 16.3 Convergence Issues 273 16.4 Fourier Integrals and Transforms 275 16.5 Solving Differential Equations 278 16.6 Books 281 Exercises 281 17 Differential Equations 282 17.1 Basics 282 17.2 Ordinary Differential Equations 283 17.3 The Laplacian 287 17.3.1 Mean Value Principle 287 17.3.2 Separation of Variables 288 17.3.3 Applications to Complex Analysis 291 17.4 The Heat Equation 291 17.5 The Wave Equation 294 17.5.1 Derivation 294 17.5.2 Change of Variables 298 17.6 The Failure of Solutions: Integrability Conditions 300 17.7 Lewy’s Example 302 17.8 Books 303 Exercises 303 18 Combinatorics and Probability Theory 304 18.1 Counting 304 18.2 Basic Probability Theory 306 18.3 Independence 308 18.4 Expected Values and Variance 309 18.5 Central Limit Theorem 312 xii Contents 18.6 Stirling’s Approximation for n! 319 18.7 Books 324 Exercises 325 19 Algorithms 327 19.1 Algorithms and Complexity 327 19.2 Graphs: Euler and Hamiltonian Circuits 328 19.3 Sorting and Trees 332 19.4 P=NP? 336 19.5 Numerical Analysis: Newton’s Method 337 19.6 Books 343 Exercises 343 20 Category Theory 345 20.1 The Basic Definitions 345 20.2 Examples 347 20.3 Functors 347 20.3.1 Link with Equivalence Problems 347 20.3.2 Definition of Functor 348 20.3.3 Examples of Functors 349 20.4 Natural Transformations 350 20.5 Adjoints 352 20.6 “There Exists” and “For All” as Adjoints 356 20.7 Yoneda Lemma 357 20.8 Arrow, Arrows, Arrows Everywhere 362 20.9 Books 363 Exercises 364 Appendix Equivalence Relations 365 Bibliography 367 Index 375 Preface Math is Exciting. We are living in the greatest age of mathematics ever seen. In the 1930s, there were some people who feared that the rising abstractions of the early twentieth century would lead either to mathematicians working on sterile, silly intellectual exercises or to mathematics splitting into sharply distinct subdisciplines, similar to the way natural philosophy split into physics, chemistry, biology and geology. But the very opposite has happened. Since World War II, it has become increasingly clear that mathematics is one unified discipline. What were separate areas now feed off each other. Learning and creating mathematics is indeed a worthwhile way to spend one’s life. Math is Hard. Unfortunately, people are just not that good at mathematics. While intensely enjoyable, it also requires hard work and self-discipline. I know of no serious mathematician who finds math easy. In fact, most, after a few beers, will confess how stupid and slow they are. This is one of the personal hurdles that a beginning graduate student must face, namely how to deal with the profundity of mathematics in stark comparison to our own shallow understandings of mathematics. This is in part why the attrition rate in graduate school is so high. At the best schools, with the most successful retention rates, usually only about half of the people who start eventually get their Ph.D. Even schools that are in the top twenty have at times had eighty percent of their incoming graduate students not finish. This is in spite of the fact that most beginning graduate students are, in comparison to the general population, amazingly good at mathematics. Most have found that math is one area in which they could shine. Suddenly, in graduate school, they are surrounded by people who are just as good (and who seem even better). To make matters worse, mathematics is a meritocracy. The faculty will not go out of their way to make beginning students feel good (this is not the faculty’s job; their job is to discover new mathematics). The fact is that there are easier (though, for a mathematician, less satisfying) ways to make a living. There is truth in the statement that you must be driven to become a mathematician. Mathematics is exciting, though. The frustrations should more than be compen- sated for by the thrills of learning and eventually creating (or discovering) new mathematics. That is, after all, the main goal for attending graduate school, to become a research mathematician. As with all creative endeavors, there will be emotional highs and lows. Only jobs that are routine and boring will not have these peaks and valleys. Part of the difficulty of graduate school is learning how to deal with the low times. xiv Preface Goal of the Book. The goal of this book is to give people at least a rough idea of the many topics that beginning graduate students at the best graduate schools are assumed to know. Since there is unfortunately far more that is needed to be known for graduate school and for research than it is possible to learn in a mere four years of college, few beginning students know all of these topics, but hopefully all will know at least some. Different people will know different topics. This strongly suggests the advantage of working with others. There is another goal. Many non-mathematicians suddenly find that they need to know some serious math. The prospect of struggling with a text will legitimately seem to be daunting. Each chapter of this book will provide for these folks a place where they can get a rough idea and outline of the topic they are interested in. As for general hints for helping sort out some mathematical field, certainly one should always, when faced with a new definition, try to find a simple example and a simple non-example. A non-example, by the way, is an example that almost, but not quite, satisfies the definition. But beyond finding these examples, one should examine the reason why the basic definitions were given. This leads to a split into two streams of thought for how to do mathematics. One can start with reasonable, if not naive, definitions and then prove theorems about these definitions. Frequently the statements of the theorems are complicated, with many different cases and conditions, and the proofs are quite convoluted, full of special tricks. The other, more mid-twentieth-century approach, is to spend quite a bit of time on the basic definitions, with the goal of having the resulting theorems clearly stated and having straightforward proofs. Under this philosophy, any time there is a trick in a proof, it means more work needs to be done on the definitions. It also means that the definitions themselves take work to understand, even at the level of figuring out why anyone would care. But now the theorems can be cleanly stated and proved. In this approach the role of examples becomes key. Usually there are basic examples whose properties are already known. These examples will shape the abstract definitions and theorems. The definitions in fact are made in order for the resulting theorems to give, for the examples, the answers we expect. Only then can the theorems be applied to new examples and cases whose properties are unknown. For example, the correct notion of a derivative and thus of the slope of a tangent line is somewhat complicated. But whatever definition is chosen, the slope of a horizontal line (and hence the derivative of a constant function) must be zero. If the definition of a derivative does not yield that a horizontal line has zero slope, it is the definition that must be viewed as wrong, not the intuition behind the example. For another example, consider the definition of the curvature of a plane curve, which is given in Chapter 7. The formulas are somewhat ungainly. But whatever the definitions, they must yield that a straight line has zero curvature, that at every point of a circle the curvature is the same and that the curvature of a circle with small radius must be greater than the curvature of a circle with a larger radius (reflecting the fact that it is easier to balance on the Earth than on a basketball). If a definition of curvature does not do this, we would reject the definitions, not the examples. Preface xv Thus it pays to know the key examples. When trying to undo the technical maze of a new subject, knowing these examples will not only help explain why the theorems and definitions are what they are but will even help in predicting what the theorems must be. Of course this is vague and ignores the fact that first proofs are almost always ugly and full of tricks, with the true insight usually hidden. But in learning the basic material, look for the key idea, the key theorem, and then see how these shape the definitions. Caveats for Critics. This book is far from a rigorous treatment of any topic. There is a deliberate looseness in style and rigor. I am trying to get the point across and to write in the way that most mathematicians talk to each other. The level of rigor in this book would be totally inappropriate in a research paper. Consider that there are three tasks for any intellectual discipline: 1. Coming up with new ideas. 2. Verifying new ideas. 3. Communicating new ideas. How people come up with new ideas in mathematics (or in any other field) is overall a mystery. There are at best a few heuristics in mathematics, such as asking if something is unique or if it is canonical. It is in verifying new ideas that mathematicians are supreme. Our standard is that there must be a rigorous proof. Nothing else will do. This is why the mathematical literature is so trustworthy (not that mistakes never creep in, but they are usually not major errors). In fact, I would go as far as to say that if any discipline has as its standard of verification rigorous proof, then that discipline must be a part of mathematics. Certainly the main goal for a math major in the first few years of college is to learn what a rigorous proof is. Unfortunately, we do a poor job of communicating mathematics. Every year there are millions of people who take math courses. A large number of people who you meet on the street or on an airplane have taken college-level mathematics. How many enjoyed it? How many saw no real point to it? While this book is not addressed to that random airplane person, it is addressed to beginning graduate students, people who already enjoy mathematics but who all too frequently get blown out of the mathematical water by mathematics presented in an unmotivated, but rigorous, manner. There is no problem with being non-rigorous, as long as you know and clearly label when you are being non-rigorous. Comments on the Bibliography. There are many topics in this book. While I would love to be able to say that I thoroughly know the literature on each of these topics, that would be a lie. The bibliography has been cobbled together from recommendations from colleagues, from books that I have taught from and books that I have used. I am confident that there are excellent texts that I do not know about. If you have a favorite, please let me know at [email protected]. xvi Preface While the first edition of this book was being written, Paulo Ney De Souza and Jorge-Nuno Silva wrote Berkeley Problems in Mathematics , which is an excellent collection of problems that have appeared over the years on qualifying exams (usually taken in the first or second year of graduate school) in the math department at Berkeley. In many ways, their book is the complement of this one, as their work is the place to go to when you want to test your computational skills, while this book concentrates on underlying intuitions. For example, say you want to learn about complex analysis. You should first read Chapter 13 of this book to get an overview of the basics about complex analysis. Then choose a good complex analysis book and work most of its exercises. Then use the problems in De Souza and Silva as a final test of your knowledge. The book Mathematics, Form and Function by Mac Lane is excellent. It provides an overview of much of mathematics. I am listing it here because there was no other place where it could be naturally referenced. Second- and third-year graduate students should seriously consider reading this book. After the first edition of this book appeared, the amazing The Princeton Companion to Mathematics was published. Timothy Gowers, along with June Barrow-Green and Imre Leader, got many of the world’s best mathematicians to write seemingly about all of modern mathematics. This volume is historically important. It is also a great read. Every mathematician should own a copy. A few years later came The Princeton Companion to Applied Mathematics , edited by Nigel Higham, along with Mark Dennis, Paul Glendinning, Paul Martin, Fadil Santosa and Jared Tanner. This is another great resource. Acknowledgments First, I would like to thank Lori Pedersen for a wonderful job of creating the illustrations and diagrams for this book. Many people have given feedback and ideas over the years. Nero Budar, Chris French and Richard Haynes were student readers of one of the early versions of this manuscript. Ed Dunne gave much needed advice and help. In the spring semester of 2000 at Williams, Tegan Cheslack-Postava, Ben Cooper and Ken Dennison went over the book line-by-line. Others who have given ideas have included Bill Lenhart, Frank Morgan, Cesar Silva, Colin Adams, Ed Burger, David Barrett, Sergey Fomin, Peter Hinman, Smadar Karni, Dick Canary, Jacek Miekisz, David James and Eric Schippers. During the final rush to finish this book, Trevor Arnold, Yann Bernard, Bill Correll, Jr., Bart Kastermans, Christopher Kennedy, Elizabeth Klodginski, Alex Köronya, Scott Kravitz, Steve Root and Craig Westerland have provided amazing help. Marissa Barschdorff texed a very early version of this manuscript. The Williams College Department of Mathematics and Statistics has been a wonderful place to write the bulk of this book; I thank all of my Williams colleagues. The final revisions for the first edition were done while I was on sabbatical at the University of Michigan, another great place to do mathematics. Preface xvii I would like to thank my editor at Cambridge, Lauren Cowles, and also Caitlin Doggart at Cambridge. For the second edition, I would like to thank Tom Harris and David Tranah from Cambridge. Gary Knapp has throughout provided moral support and gave a close, detailed reading to an early version of the manuscript. My wife, Lori, has also given much needed encouragement and has spent many hours catching many of my mistakes. To all I owe thanks. A few years after the first edition appeared, I was talking to David Dorman, when I suddenly remembered a conversation that David and I had had in graduate school. We were talking about the many and seemingly heavy expectations that the faculty assumed on the knowledge of incoming graduate students. I seem to recall that David and I called it the “Miracle Summer,” that three months before graduate school started, when the mathematical gods were supposed to have bestowed upon us all of this mathematical knowledge. Now David and I were clearly just making jokes, that long ago day in the coffee room of Brown University. Still, I suspect that the kernel of the idea for this book started then. For that, I thank David. Finally, near the completion of the first edition of this work, Bob Mizner passed away at an early age. I still miss him. It is in his memory that I dedicate this book (though no doubt he would have disagreed with most of my presentations and choices of topics; he definitely would have made fun of the lack of rigor). On the Structure of Mathematics If you look at articles in current journals, the range of topics seems immense. How could anyone even begin to make sense out of all of these topics? And indeed there is a glimmer of truth in this. People cannot effortlessly switch from one research field to another. But not all is chaos. There are at least two ways of placing some type of structure on all of mathematics. Equivalence Problems Mathematicians want to know when things are the same, or, when they are equivalent. What is meant by the same is what distinguishes one branch of mathematics from another. For example, a topologist will consider two geometric objects (technically, two topological spaces) to be the same if one can be twisted and bent, but not ripped, into the other. Thus for a topologist, we have To a differential topologist, two geometric objects are the same if one can be smoothly bent and twisted into the other. By smooth we mean that no sharp edges can be introduced. Then The four sharp corners of the square are what prevent it from being equivalent to the circle. For a differential geometer, the notion of equivalence is even more restrictive. Here two objects are the same not only if one can be smoothly bent and twisted into the other but also if the curvatures agree. Thus for the differential geometer, the circle is no longer equivalent to the ellipse: xx On the Structure of Mathematics As a first pass to placing structure on mathematics, we can view an area of mathematics as consisting of certain Objects, coupled with the notion of Equivalence between these objects. We can explain equivalence by looking at the allowed Maps, or functions, between the objects. At the beginning of most chapters, we will list the Objects and the Maps between the objects that are key for that subject. The Equivalence Problem is of course the problem of determining when two objects are the same, using the allowable maps. If the equivalence problem is easy to solve for some class of objects, then the corresponding branch of mathematics will no longer be active. If the equivalence problem is too hard to solve, with no known ways of attacking the problem, then the corresponding branch of mathematics will again not be active, though of course for opposite reasons. The hot areas of mathematics are precisely those for which there are rich partial but not complete answers to the equivalence problem. But what could we mean by a partial answer? Here enters the notion of invariance. Start with an example. Certainly the circle, as a topological space, is different from two circles, since a circle has only one connected component and two circles have two connected components. We map each topological space to a positive integer, namely the number of connected components of the topological space. Thus we have: Topological Spaces → Positive Integers. The key is that the number of connected components for a space cannot change under the notion of topological equivalence (under bendings and twistings). We say that the number of connected components is an invariant of a topological space. Thus if the spaces map to different numbers, meaning that they have different numbers of connected components, then the two spaces cannot be topologically equivalent. Of course, two spaces can have the same number of connected components and still be different. For example, both the circle and the sphere have only one connected component, but they are different. (These can be distinguished by looking at the dimension of each space, which is another topological invariant.) The goal of topology is to find enough invariants to be On the Structure of Mathematics xxi able to always determine when two spaces are different or the same. This has not come close to being done. Much of algebraic topology maps each space not to invariant numbers but to other types of algebraic objects, such as groups and rings. Similar techniques show up throughout mathematics. This provides for tremendous interplay between different branches of mathematics. The Study of Functions The mantra that we should all chant each night before bed is: Functions Describe the World. To a large extent what makes mathematics so useful to the world is that seemingly disparate real-world situations can be described by the same type of function. For example, think of how many different problems can be recast as finding the maximum or minimum of a function. Different areas of mathematics study different types of functions. Calculus studies differentiable functions from the real numbers to the real numbers, algebra studies polynomials of degree one and two (in high school) and permutations (in college), linear algebra studies linear functions, or matrix multiplication. Thus in learning a new area of mathematics, you should always “find the function” of interest. Hence at the beginning of most chapters we will state the type of function that will be studied. Equivalence Problems in Physics Physics is an experimental science. Hence any question in physics must eventually be answered by performing an experiment. But experiments come down to making observations, which usually are described by certain computable numbers, such as velocity, mass or charge. Thus the experiments in physics are described by numbers that are read off in the lab. More succinctly, physics is ultimately: Numbers in Boxes where the boxes are various pieces of lab machinery used to make measurements. But different boxes (different lab set-ups) can yield different numbers, even if the underlying physics is the same. This happens even at the trivial level of choice of units. More deeply, suppose you are modeling the physical state of a system as the solution of a differential equation. To write down the differential equation, a coordinate system must be chosen. The allowed changes of coordinates are determined by the physics. For example, Newtonian physics can be distinguished from Special Relativity in that each has different allowable changes of coordinates. xxii On the Structure of Mathematics Thus while physics is “Numbers in Boxes,” the true questions come down to when different numbers represent the same physics. But this is an equivalence problem; mathematics comes to the fore. (This explains in part the heavy need for advanced mathematics in physics.) Physicists want to find physics invariants. Usually, though, physicists call their invariants “Conservation Laws.” For example, in classical physics the conservation of energy can be recast as the statement that the function that represents energy is an invariant function. Brief Summaries of Topics 0.1 Linear Algebra.............................................................................. Linear algebra studies linear transformations and vector spaces, or in another language, matrix multiplication and the vector space Rn. You should know how to translate between the language of abstract vector spaces and the language of matrices. In particular, given a basis for a vector space, you should know how to represent any linear transformation as a matrix. Further, given two matrices, you should know how to determine if these matrices actually represent the same linear transformation, but under different choices of bases. The key theorem of linear algebra is a statement that gives many equivalent descriptions for when a matrix is invertible. These equivalences should be known cold. You should also know why eigenvectors and eigenvalues occur naturally in linear algebra. 0.2 Real Analysis.............................................................................. The basic definitions of a limit, continuity, differentiation and integration should be known and understood in terms of s and δs. Using this  and δ language, you should be comfortable with the idea of uniform convergence of functions. 0.3 Differentiating Vector-Valued Functions.............................................................................. The goal of the Inverse Function Theorem is to show that a differentiable function f : Rn → Rn is locally invertible if and only if the determinant of its derivative (the Jacobian) is non-zero. You should be comfortable with what it means for a vector- valued function to be differentiable, why its derivative must be a linear map (and hence representable as a matrix, the Jacobian) and how to compute the Jacobian. Further, you should know the statement of the Implicit Function Theorem and see why it is closely related to the Inverse Function Theorem. 0.4 Point Set Topology.............................................................................. You should understand how to define a topology in terms of open sets and how to express the idea of continuous functions in terms of open sets. The standard xxiv Brief Summaries of Topics topology on Rn must be well understood, at least to the level of the Heine–Borel Theorem. Finally, you should know what a metric space is and how a metric can be used to define open sets and hence a topology. 0.5 Classical Stokes’ Theorems.............................................................................. You should know about the calculus of vector fields. In particular, you should know how to compute, and know the geometric interpretations behind, the curl and the divergence of a vector field, the gradient of a function and the path integral along a curve. Then you should know the classical extensions of the Fundamental Theorem of Calculus, namely the Divergence Theorem and Stokes’ Theorem. You should especially understand why these are indeed generalizations of the Fundamental Theorem of Calculus. 0.6 Differential Forms and Stokes’ Theorem.............................................................................. Manifolds are naturally occurring geometric objects. Differential k-forms are the tools for doing calculus on manifolds. You should know the various ways of defining a manifold, how to define and to think about differential k-forms, and how to take the exterior derivative of a k-form. You should also be able to translate from the language of k-forms and exterior derivatives to the language from Chapter 5 on vector fields, gradients, curls and divergences. Finally, you should know the statement of Stokes’ Theorem, understand why it is a sharp quantitative statement about the equality of the integral of a k-form on the boundary of a (k + 1)-dimensional manifold with the integral of the exterior derivative of the k-form on the manifold, and how this Stokes Theorem has as special cases the Divergence Theorem and the Stokes Theorem from Chapter 5. 0.7 Curvature for Curves and Surfaces.............................................................................. Curvature, in all of its manifestations, attempts to measure the rate of change of the directions of tangent spaces of geometric objects. You should know how to compute the curvature of a plane curve, the curvature and the torsion of a space curve and the two principal curvatures, in terms of the Hessian, of a surface in space. 0.8 Geometry.............................................................................. Different geometries are built out of different axiomatic systems. Given a line l and a point p not on l, Euclidean geometry assumes that there is exactly one line containing p parallel to l, hyperbolic geometry assumes that there is more than one line containing p parallel to l, and elliptic geometries assume that there is Brief Summaries of Topics xxv no line parallel to l. You should know models for hyperbolic geometry, single elliptic geometry and double elliptic geometry. Finally, you should understand why the existence of such models implies that all of these geometries are mutually consistent. 0.9 Countability and the Axiom of Choice.............................................................................. You should know what it means for a set to be countably infinite. In particular, you should know that the integers and rationals are countably infinite while the real numbers are uncountably infinite. The statement of the Axiom of Choice and the fact that it has many seemingly bizarre equivalences should also be known. 0.10 Elementary Number Theory.............................................................................. You should know the basics of modular arithmetic. Further, you should know why there are infinitely many primes, what is a Diophantine equation, what is the Euclidean algorithm and how the Euclidean algorithm is linked to continued fractions. 0.11 Algebra.............................................................................. Groups, the basic objects of study in abstract algebra, are the algebraic interpreta- tions of geometric symmetries. One should know the basics about groups (at least to the level of the Sylow Theorem, which is a key tool for understanding finite groups), rings and fields. You should also know Galois Theory, which provides the link between finite groups and the finding of the roots of a polynomial and hence shows the connections between high-school and abstract algebra. Finally, you should know the basics behind representation theory, which is how one relates abstract groups to groups of matrices. 0.12 Algebraic Number Theory.............................................................................. You should know what an algebraic number field is and a few examples. Further, you should know that each algebraic number field contains an analog of the integers in it, but that these “integers” need not have unique factorization. 0.13 Complex Analysis.............................................................................. The main point is to recognize and understand the many equivalent ways for describing when a function can be analytic. Here we are concerned with functions xxvi Brief Summaries of Topics f : U → C, where U is an open set in the complex numbers C. You should know that such a function f (z) is said to be analytic if it satisfies any of the following equivalent conditions. (a) For all z0 ∈ U , f (z) − f (z0 ) lim z→z0 z − z0 exists. (b) The real and imaginary parts of the function f satisfy the Cauchy–Riemann equations: ∂Ref ∂Imf = ∂x ∂y and ∂Ref ∂Imf =−. ∂y ∂x (c) If γ is any counterclockwise simple loop in C = R2 and if z0 is any complex number in the interior of γ , then  1 f (z) f (z0 ) = dz. 2π i γ z − z0 This is the Cauchy Integral formula. (d) For any complex number z0 , there is an open neighborhood in C = R2 of z0 on which ∞  f (z) = ak (z − z0 )k, k=0 is a uniformly converging series.  Further, if f : U → C is analytic and if f (z0 ) = 0, then at z0 , the function f is conformal (i.e., angle-preserving), viewed as a map from R2 to R2. 0.14 Analytic Number Theory.............................................................................. You should know what the Riemann zeta function is and how it is related to prime numbers. You should also know the statement of the Riemann Hypothesis. Brief Summaries of Topics xxvii 0.15 Lebesgue Integration.............................................................................. You should know the basic ideas behind Lebesgue measure and integration, at least to the level of the Lebesgue Dominating Convergence Theorem, and the concept of sets of measure zero. 0.16 Fourier Analysis.............................................................................. You should know how to find the Fourier series of a periodic function, the Fourier integral of a function, the Fourier transform, and how Fourier series relate to Hilbert spaces. Further, you should see how Fourier transforms can be used to simplify differential equations. 0.17 Differential Equations.............................................................................. Much of physics, economics, mathematics and other sciences comes down to trying to find solutions to differential equations. One should know that the goal in differential equations is to find an unknown function satisfying an equation involving derivatives. Subject to mild restrictions, there are always solutions to ordinary differential equations. This is most definitely not the case for partial differential equations, where even the existence of solutions is frequently unknown. You should also be familiar with the three traditional classes of partial differential equations: the heat equation, the wave equation and the Laplacian. 0.18 Combinatorics and Probability Theory.............................................................................. Both elementary combinatorics and basic probability theory reduce to problems in counting. You should know that   n n! = k k! (n − k)! isnthe number of ways of choosing k elements from n elements. The relation of k to the binomial theorem for polynomials is useful to have handy for many computations. Basic probability theory should be understood. In particular one should understand the terms: sample space, random variable (both its intuitions and its definition as a function), expected value and variance. One should definitely understand why counting arguments are critical for calculating probabilities of finite sample spaces. The link between probability and integral calculus can be seen in the various versions of the Central Limit Theorem, the ideas of which should be known. xxviii Brief Summaries of Topics 0.19 Algorithms.............................................................................. You should understand what is meant by the complexity of an algorithm, at least to the level of understanding the question P=NP. Basic graph theory should be known; for example, you should see why a tree is a natural structure for understanding many algorithms. Numerical analysis is the study of algorithms for approximating the answer to computations in mathematics. As an example, you should understand Newton’s method for approximating the roots of a polynomial. 0.20 Category Theory.............................................................................. You should know that category theory is a method for thinking about mathematics. In particular, each area of mathematics can be put into the language of categories, which means that it can be described by the relevant mathematical objects and the morphism (crudely the functions) between the objects. Further, the results can be described by diagrams of arrows. 1 Linear Algebra Basic Object: Vector Spaces Basic Map: Linear Transformations Basic Goal: Equivalences for the Invertibility of Matrices 1.1 Introduction.............................................................................. Though a bit of an exaggeration, it can be said that a mathematical problem can be solved only if it can be reduced to a calculation in linear algebra. And a calculation in linear algebra will reduce ultimately to the solving of a system of linear equations, which in turn comes down to the manipulation of matrices. Throughout this text and, more importantly, throughout mathematics, linear algebra is a key tool (or more accurately, a collection of intertwining tools) that is critical for doing calculations. The power of linear algebra lies not only in our ability to manipulate matrices in order to solve systems of linear equations. The abstraction of these concrete objects to the ideas of vector spaces and linear transformations allows us to see the common conceptual links between many seemingly disparate subjects. (Of course, this is the advantage of any good abstraction.) For example, the study of solutions to linear differential equations has, in part, the same feel as trying to model the hood of a car with cubic polynomials, since both the space of solutions to a linear differential equation and the space of cubic polynomials that model a car hood form vector spaces. The key theorem of linear algebra, discussed in Section 1.6, gives many equivalent ways of telling when a system of n linear equations in n unknowns has a solution. Each of the equivalent conditions is important. What is remarkable and what gives linear algebra its oomph is that they are all the same. 1.2 The Basic Vector Space Rn.............................................................................. The quintessential vector space is Rn , the set of all n-tuples of real numbers {(x1,... ,xn ) : xi ∈ Rn }. 2 Linear Algebra As we will see in the next section, what makes this a vector space is that we can add together two n-tuples to get another n-tuple (x1,... ,xn ) + (y1,... ,yn ) = (x1 + y1,... ,xn + yn ) and that we can multiply each n-tuple by a real number λ λ(x1,... ,xn ) = (λx1,... ,λxn ) to get another n-tuple. Of course each n-tuple is usually called a vector and the real numbers λ are called scalars. When n = 2 and when n = 3 all of this reduces to the vectors in the plane and in space that most of us learned in high school. The natural map from some Rn to an Rm is given by matrix multiplication. Write a vector x ∈ Rn as a column vector: ⎛ ⎞ x1 ⎜.. ⎟ x = ⎝. ⎠. xn Similarly, we can write a vector in Rm as a column vector with m entries. Let A be an m × n matrix ⎛ ⎞ a11 a12... a1n ⎜.. ⎟. A = ⎝.......... ⎠ am1...... amn Then Ax is the m-tuple: ⎛ ⎞⎛ ⎞ ⎛ ⎞ a11 a12... a1n x1 a11 x1 + · · · + a1n xn ⎜.. ⎟ ⎜.. ⎟ = ⎜ ⎟ Ax = ⎝.......... ⎠⎝. ⎠ ⎝... ⎠. am1...... amn xn am1 x1 + · · · + amn xn For any two vectors x and y in Rn and any two scalars λ and μ, we have A(λx + μy) = λAx + μAy. In the next section we will use the linearity of matrix multiplication to motivate the definition for a linear transformation between vector spaces. Now to relate all of this to the solving of a system of linear equations. Suppose we are given numbers b1,... ,bm and numbers a11,... ,amn. Our goal is to find n numbers x1,... ,xn that solve the following system of linear equations: a11 x1 + · · · + a1n xn = b1... am1 x1 + · · · + amn xn = bm. 1.2 The Basic Vector Space Rn 3 Calculations in linear algebra will frequently reduce to solving a system of linear equations. When there are only a few equations, we can find the solutions by hand, but as the number of equations increases, the calculations quickly turn from enjoyable algebraic manipulations into nightmares of notation. These nightmarish complications arise not from any single theoretical difficulty but instead stem solely from trying to keep track of the many individual minor details. In other words, it is a problem in bookkeeping. Write ⎛ ⎞ ⎛ ⎞ b1 a11 a12... a1n ⎜ ⎟ ⎜.. ⎟ b = ⎝... ⎠ , A = ⎝.......... ⎠ bm am1...... amn and our unknowns as ⎛ ⎞ x1 ⎜.. ⎟ x = ⎝. ⎠. xn Then we can rewrite our system of linear equations in the more visually appealing form of Ax = b. When m > n (when there are more equations than unknowns), we expect there to be, in general, no solutions. For example, when m = 3 and n = 2, this corresponds geometrically to the fact that three lines in a plane will usually have no common point of intersection. When m < n (when there are more unknowns than equations), we expect there to be, in general, many solutions. In the case when m = 2 and n = 3, this corresponds geometrically to the fact that two planes in space will usually intersect in an entire line. Much of the machinery of linear algebra deals with the remaining case when m = n. Thus we want to find the n × 1 column vector x that solves Ax = b, where A is a given n × n matrix and b is a given n × 1 column vector. Suppose that the square matrix A has an inverse matrix A−1 (which means that A−1 is also n × n and more importantly that A−1 A = I , with I the identity matrix). Then our solution will be x = A−1 b since Ax = A(A−1 b) = I b = b. Thus solving our system of linear equations comes down to understanding when the n × n matrix A has an inverse. (If an inverse matrix exists, then there are algorithms for its calculation.) 4 Linear Algebra The key theorem of linear algebra, stated in Section 1.6, is in essence a list of many equivalences for when an n × n matrix has an inverse. It is thus essential to understanding when a system of linear equations can be solved. 1.3 Vector Spaces and Linear Transformations.............................................................................. The abstract approach to studying systems of linear equations starts with the notion of a vector space. Definition 1.3.1 A set V is a vector space over the real numbers1 R if there are maps: 1. R × V → V , denoted by a · v or av for all real numbers a and elements v in V , 2. V × V → V , denoted by v + w for all elements v and w in the vector space V , with the following properties. (a) There is an element 0, in V such that 0 + v = v for all v ∈ V. (b) For each v ∈ V , there is an element (−v) ∈ V with v + (−v) = 0. (c) For all v,w ∈ V , v + w = w + v. (d) For all a ∈ R and for all v,w ∈ V , we have that a(v + w) = av + aw. (e) For all a,b ∈ R and all v ∈ V , a(bv) = (a · b)v. (f) For all a,b ∈ R and all v ∈ V , (a + b)v = av + bv. (g) For all v ∈ V , 1 · v = v. As a matter of notation, and to agree with common usage, the elements of a vector space are called vectors and the elements of R (or whatever field is being used) are called scalars. Note that the space Rn given in the last section certainly satisfies these conditions. The natural map between vector spaces is that of a linear transformation. Definition 1.3.2 A linear transformation T : V → W is a function from a vector space V to a vector space W such that for any real numbers a1 and a2 and any vectors v1 and v2 in V , we have T (a1 v1 + a2 v2 ) = a1 T (v1 ) + a2 T (v2 ). Matrix multiplication from an Rn to an Rm gives an example of a linear transformation. Definition 1.3.3 A subset U of a vector space V is a subspace of V if U is itself a vector space............................................................................... 1 The real numbers can be replaced by the complex numbers and in fact by any field (which will be defined in Chapter 11 on algebra). 1.3 Vector Spaces and Linear Transformations 5 In practice, it is usually easy to see if a subset of a vector space is in fact a subspace, by the following proposition, whose proof is left to the reader. Proposition 1.3.4 A subset U of a vector space V is a subspace of V if U is closed under addition and scalar multiplication. Given a linear transformation T : V → W , there are naturally occurring subspaces of both V and W. Definition 1.3.5 If T : V → W is a linear transformation, then the kernel of T is: ker(T ) = {v ∈ V : T (v) = 0} and the image of T is Im(T ) = {w ∈ W : there exists a v ∈ V with T (v) = w}. The kernel is a subspace of V , since if v1 and v2 are two vectors in the kernel and if a and b are any two real numbers, then T (av1 + bv2 ) = aT (v1 ) + bT (v2 ) =a·0+b·0 = 0. In a similar way we can show that the image of T is a subspace of W , which we leave for one of the exercises. If the only vector spaces that ever occurred were column vectors in Rn , then even this mild level of abstraction would be silly. This is not the case. Here we look at only one example. Let C k [0,1] be the set of all real-valued functions with domain the unit interval [0,1]: f : [0,1] → R such that the kth derivative of f exists and is continuous. Since the sum of any two such functions and a multiple of any such function by a scalar will still be in C k [0,1], we have a vector space. Though we will officially define dimension in the next section, C k [0,1] will be infinite dimensional (and thus definitely not some Rn ). We can view the derivative as a linear transformation from C k [0,1] to those functions with one less derivative, C k−1 [0,1]: d : C k [0,1] → C k−1 [0,1]. dx d The kernel of dx consists of those functions with df dx = 0, namely constant functions. 6 Linear Algebra Now consider the differential equation d2 f df 2 +3 + 2f = 0. dx dx Let T be the linear transformation: d2 d T = + 3 + 2I : C 2 [0,1] → C 0 [0,1]. dx 2 dx The problem of finding a solution f (x) to the original differential equation can now be translated to finding an element of the kernel of T. This suggests the possibility (which indeed is true) that the language of linear algebra can be used to understand solutions to (linear) differential equations. 1.4 Bases, Dimension, and Linear Transformations as Matrices.............................................................................. Our next goal is to define the dimension of a vector space. Definition 1.4.1 A set of vectors (v1,... ,vn ) form a basis for the vector space V if given any vector v in V , there are unique scalars a1,... ,an ∈ R with v = a1 v1 + · · · + an vn. Definition 1.4.2 The dimension of a vector space V , denoted by dim(V ), is the number of elements in a basis. As it is far from obvious that the number of elements in a basis will always be the same, no matter which basis is chosen, in order to make the definition of the dimension of a vector space well defined we need the following theorem (which we will not prove). Theorem 1.4.3 All bases of a vector space V have the same number of elements. For Rn , the usual basis is   (1,0,... ,0),(0,1,0,... ,0),... ,(0,... ,0,1). Thus Rn is n-dimensional. Of course if this were not true, the above definition of dimension would be wrong and we would need another. This is an example of the principle mentioned in the introduction. We have a good intuitive understanding of what dimension should mean for certain specific examples: a line needs to be one dimensional, a plane two dimensional and space three dimensional. We then come up with a sharp definition. If this definition gives the “correct” answer for our three already understood examples, we are somewhat confident that the definition has 1.4 Bases, Dimension, and Linear Transformations 7 indeed captured what is meant by, in this case, dimension. Then we can apply the definition to examples where our intuitions fail. Linked to the idea of a basis is the idea of linear independence. Definition 1.4.4 Vectors (v1,... ,vn ) in a vector space V are linearly independent if whenever a1 v1 + · · · + an vn = 0, it must be the case that the scalars a1,... ,an must all be zero. Intuitively, a collection of vectors are linearly independent if they all point in different directions. A basis consists then in a collection of linearly independent vectors that span the vector space, where by span we mean the following. Definition 1.4.5 A set of vectors (v1,... ,vn ) span the vector space V if given any vector v in V , there are scalars a1,... ,an ∈ R with v = a1 v1 + · · · + an vn. Our goal now is to show how all linear transformations T : V → W between finite-dimensional spaces can be represented as matrix multiplication, provided we fix bases for the vector spaces V and W. First fix a basis {v1,... ,vn } for V and a basis {w1,... ,wm } for W. Before looking at the linear transformation T , we need to show how each element of the n-dimensional space V can be represented as a column vector in Rn and how each element of the m-dimensional space W can be represented as a column vector of Rn. Given any vector v in V , by the definition of basis, there are unique real numbers a1,... ,an with v = a 1 v1 + · · · + a n vn. We thus represent the vector v with the column vector: ⎛ ⎞ a1 ⎜.. ⎟ ⎝. ⎠. an Similarly, for any vector w in W , there are unique real numbers b1,... ,bm with w = b 1 w1 + · · · + b m wm. Here we represent w as the column vector ⎛ ⎞ b1 ⎜.. ⎟ ⎝. ⎠. bm 8 Linear Algebra Note that we have established a correspondence between vectors in V and W and column vectors Rn and Rm , respectively. More technically, we can show that V is isomorphic to Rn (meaning that there is a one-to-one, onto linear transformation from V to Rn and that W is isomorphic to Rm , though it must be emphasized that the actual correspondence only exists after a basis has been chosen (which means that while the isomorphism exists, it is not canonical; this is actually a big deal, as in practice it is unfortunately often the case that no basis is given to us). We now want to represent a linear transformation T : V → W as an m×n matrix A. For each basis vector vi in the vector space V , T (vi ) will be a vector in W. Thus there will exist real numbers a1i ,... ,ami such that T (vi ) = a1i w1 + · · · + ami wm. We want to see that the linear transformation T will correspond to the m × n matrix ⎛ ⎞ a11 a12... a1n ⎜.. A=⎝....... ⎟.... ⎠ am1...... amn Given any vector v in V , with v = a1 v1 + · · · + an vn , we have T (v) = T (a1 v1 + · · · + an vn ) = a1 T (v1 ) + · · · + an T (vn ) = a1 (a11 w1 + · · · + am1 wm ) + · · · + an (a1n w1 + · · · + amn wm ). But under the correspondences of the vector spaces with the various column spaces, this can be seen to correspond to the matrix multiplication of A times the column vector corresponding to the vector v: ⎛ ⎞⎛ ⎞ ⎛ ⎞ a11 a12... a1n a1 b1 ⎜........ ⎟ ⎜.. ⎟ = ⎜.. ⎟. ⎝.... ⎠⎝. ⎠ ⎝. ⎠ am1...... amn an bm Note that if T : V → V is a linear transformation from a vector space to itself, then the corresponding matrix will be n × n, a square matrix. Given different bases for the vector spaces V and W , the matrix associated to the linear transformation T will change. A natural problem is to determine when two matrices actually represent the same linear transformation, but under different bases. This will be the goal of Section 1.7. 1.5 The Determinant 9 1.5 The Determinant.............................................................................. Our next task is to give a definition for the determinant of a matrix. In fact, we will give three alternative descriptions of the determinant. All three are equivalent; each has its own advantages. Our first method is to define the determinant of a 1 × 1 matrix and then to define recursively the determinant of an n × n matrix. Since 1×1 matrices are just numbers, the following should not at all be surprising. Definition 1.5.1 The determinant of a 1 × 1 matrix (a) is the real-valued function det(a) = a. This should not yet seem significant. Before giving the definition of the determinant for a general n × n matrix, we need a little notation. For an n × n matrix ⎛ ⎞ a11 a12... a1n ⎜.. ⎟ , A = ⎝.......... ⎠ an1...... ann denote by Aij the (n − 1) × (n − 1) matrixobtained from A by deleting the ith row a a and the j th column. For example, if A = 11 12 , then A12 = (a21 ). Similarly a21 a22 ⎛ ⎞ 2 3 5   ⎝ ⎠ 6 9 if A = 6 4 9 , then A12 =. 7 8 7 1 8 Since we have a definition for the determinant for 1 × 1 matrices, we will now assume by induction that we know the determinant of any (n − 1) × (n − 1) matrix and use this to find the determinant of an n × n matrix. Definition 1.5.2 Let A be an n × n matrix. Then the determinant of A is  n det(A) = (−1)k+1 a1k det(A1k ). k=1   a11 a12 Thus for A = , we have a21 a22 det(A) = a11 det(A11 ) − a12 det(A12 ) = a11 a22 − a12 a21, 10 Linear Algebra which is what most of us think of as the determinant. The determinant of our above 3 × 3 matrix is: ⎛ ⎞ 2 3 5       ⎝ ⎠ 4 9 6 9 6 4 det 6 4 9 = 2 det − 3 det + 5 det. 1 8 7 8 7 1 7 1 8 While this definition is indeed an efficient means to describe the determinant, it obscures most of the determinant’s uses and intuitions. The second way we can describe the determinant has built into it the key algebraic properties of the determinant. It highlights function-theoretic properties of the determinant. Denote the n × n matrix A as A = (A1,... ,An ), where Ai denotes the ith column: ⎛ ⎞ a1i ⎜a2i ⎟ ⎜ ⎟ Ai = ⎜.. ⎟. ⎝. ⎠ ani Definition 1.5.3 The determinant of A is defined as the unique real-valued function det : Matrices → R satisfying: (a) det(A1,... ,λAk,... ,An ) = λ det(A1,... ,Ak ). (b) det(A1,... ,Ak + λAi ,... ,An ) = det(A1,... ,An ) for k = i. (c) det(Identity matrix) = 1. Thus, treating each column vector of a matrix as a vector in Rn , the determinant can be viewed as a special type of function from Rn × · · · × Rn to the real numbers. In order to be able to use this definition, we would have to prove that such a function on the space of matrices, satisfying conditions (a) through (c), even exists and then that it is unique. Existence can be shown by checking that our first (inductive) definition for the determinant satisfies these conditions, though it is a painful calculation. The proof of uniqueness can be found in almost any linear algebra text, and comes down to using either elementary column operations or elementary row operations. The third definition for the determinant is the most geometric but is also the most vague. We must think of an n×n matrix A as a linear transformation from Rn to Rn. Then A will map the unit cube in Rn to some different object (a parallelepiped). The unit cube has, by definition, a volume of one. 1.5 The Determinant 11 Definition 1.5.4 The determinant of the matrix A is the signed volume of the image of the unit cube. This is not well defined, as the very method of defining the volume of the image has not been described. In fact, most would define the signed volume of the image to be the number given by the determinant using one of the two earlier definitions. But this can all be made rigorous, though at the price of losing much of the geometric insight.   2 0 Let us look at some examples: the matrix A = takes the unit square to 0 1   2 0 0 1 1 1 1 1 2 Since the area is doubled, we must have det(A) = 2. Signed volume means that if the orientations of the edges of the unit cube are changed, then we must have  a negative  sign in front of the volume. For example, −2 0 consider the matrix A =. Here the image is 0 1   −2 0 0 1 1 1 1 −2 −1 12 Linear Algebra Note that the orientations of the sides are flipped. Since the area is still doubled, the definition will force det(A) = −2. To rigorously define orientation is somewhat tricky (we do it in Chapter 6), but its meaning is straightforward. The determinant has many algebraic properties. Lemma 1.5.5 If A and B are n × n matrices, then det(AB) = det(A) det(B). This can be proven either by a long calculation or by concentrating on the definition of the determinant as the change of volume of a unit cube. 1.6 The Key Theorem of Linear Algebra.............................................................................. Here is the key theorem of linear algebra. (Note: we have yet to define eigenvalues and eigenvectors, but we will in Section 1.8.) Theorem 1.6.1 (Key Theorem) Let A be an n × n matrix. Then the following are equivalent. 1. A is invertible. 2. det(A) = 0. 3. ker(A) = 0. 4. If b is a column vector in Rn , there is a unique column vector x in Rn satisfying Ax = b. 5. The columns of A are linearly independent n × 1 column vectors. 6. The rows of A are linearly independent 1 × n row vectors. 7. The transpose At of A is invertible. (Here, if A = (aij ), then At = (aj i )). 8. All of the eigenvalues of A are non-zero. We can restate this theorem in terms of linear transformations. Theorem 1.6.2 (Key Theorem) Let T : V → V be a linear transformation. Then the following are equivalent. 1. T is invertible. 2. det(T ) = 0, where the determinant is defined by a choice of basis on V. 3. ker(T ) = 0. 4. If b is a vector in V , there is a unique vector v in V satisfying T (v) = b. 1.7 Similar Matrices 13 5. For any basis v1,... ,vn of V , the image vectors T (v1 ),... ,T (vn ) are linearly independent. 6. For any basis v1,... ,vn of V , if S denotes the transpose linear transformation of T , then the image vectors S(v1 ),... ,S(vn ) are linearly independent. 7. The transpose of T is invertible. (Here the transpose is defined by a choice of basis on V.) 8. All of the eigenvalues of T are non-zero. In order to make the correspondence between the two theorems clear, we must worry about the fact that we only have definitions of the determinant and the transpose for matrices, not for linear transformations. While we do not show it, both notions can be extended to linear transformations, provided a basis is chosen. But note that while the actual value det(T ) will depend on a fixed basis, the condition that det(T ) = 0 does not. Similar statements hold for conditions (6) and (7). A proof is the goal of exercise 8, where you are asked to find any linear algebra book and then fill in the proof. It is unlikely that the linear algebra book will have this result as it is stated here. The act of translating is in fact part of the purpose of making this an exercise. Each of the equivalences is important. Each can be studied on its own merits. It is remarkable that they are the same. 1.7 Similar Matrices.............................................................................. Recall that given a basis for an n-dimensional vector space V , we can represent a linear transformation T:V →V as an n×n matrix A. Unfortunately, if you choose a different basis for V , the matrix representing the linear transformation T will be quite different from the original matrix A. The goal of this section is to find a clean criterion for when two matrices actually represent the same linear transformation but under a different choice of bases. Definition 1.7.1 Two n × n matrices A and B are similar if there is an invertible matrix C such that A = C −1 BC. We want to see that two matrices are similar precisely when they represent the same linear transformation. Choose two bases for the vector space V , say {v1,... ,vn } (the v basis) and {w1,... ,wn } (the w basis). Let A be the matrix representing the linear transformation T for the v basis and let B be the matrix 14 Linear Algebra representing the linear transformation for the w basis. We want to construct the matrix C so that A = C −1 BC. Recall that given the v basis, we can write each vector z ∈ V as an n × 1 column vector as follows: we know that there are unique scalars a1,... ,an with z = a 1 v1 + · · · + a n vn. We then write z, with respect to the v basis, as the column vector: ⎛ ⎞ a1 ⎜.. ⎟ ⎝. ⎠. an Similarly, there are unique scalars b1,... ,bn so that z = b1 w1 + · · · + bn wn, meaning that with respect to the w basis, the vector z is the column vector: ⎛ ⎞ b1 ⎜.. ⎟ ⎝. ⎠. bn The desired matrix C will be the matrix such that ⎛ ⎞ ⎛ ⎞ a1 b1 ⎜.. ⎟ ⎜.. ⎟ C ⎝. ⎠ = ⎝. ⎠. an bn If C = (cij ), then the entries cij are precisely the numbers which yield: wi = ci1 v1 + · · · + cin vn. Then, for A and B to represent the same linear transformation, we need the diagram: A Rn → Rn C ↓ ↓ C → Rn B Rn to commute, meaning that CA = BC or A = C −1 BC, as desired. 1.8 Eigenvalues and Eigenvectors 15 Determining when two matrices are similar is a type of result that shows up throughout math and physics. Regularly you must choose some coordinate system (some basis) in order to write down anything at all, but the underlying math or physics that you are interested in is independent of the initial choice. The key question becomes: what is preserved when the coordinate system is changed? Similar matrices allow us to start to understand these questions. 1.8 Eigenvalues and Eigenvectors.............................................................................. In the last section we saw that two matrices represent the same linear transformation, under different choices of bases, precisely when they are similar. This does not tell us, though, how to choose a basis for a vector space so that a linear transformation has a particularly decent matrix representation. For example, the diagonal matrix ⎛ ⎞ 1 0 0 A = ⎝ 0 2 0⎠ 0 0 3 is similar to the matrix ⎛ ⎞ −1 −2 −2 B = ⎝ 12 7 4 ⎠, −9 −3 0 but all recognize the simplicity of A as compared to B. (By the way, it is not obvious that A and B are similar; I started with A, chose a non-singular matrix C and then computed C −1 AC to get B. I did not just suddenly “see” that A and B are similar. No, I rigged it to be so.) One of the purposes behind the following definitions for eigenvalues and eigenvectors is to give us tools for picking out good bases. There are, though, many other reasons to understand eigenvalues and eigenvectors. Definition 1.8.1 Let T : V → V be a linear transformation. Then a non-zero vector v ∈ V will be an eigenvector of T with eigenvalue λ, a scalar, if T (v) = λv. For an n × n matrix A, a non-zero column vector x ∈ Rn will be an eigenvector with eigenvalue λ, a scalar, if Ax = λx. Geometrically, a vector v is an eigenvector of the linear transformation T with eigenvalue λ if T stretches v by a factor of λ. 16 Linear Algebra For example,      −2 −2 1 1 =2 , 6 5 −2 −2   1 and thus 2 is an eigenvalue and an eigenvector for the linear transformation −2   −2 −2 represented by the 2 × 2 matrix. 6 5 Luckily there is an easy way to describe the eigenvalues of a square matrix, which will allow us to see that the eigenvalues of a matrix are preserved under a similarity transformation. Proposition 1.8.2 A number λ will be an eigenvalue of a square matrix A if and only if λ is a root of the polynomial P (t) = det(tI − A). The polynomial P (t) = det(tI − A) is called the characteristic polynomial of the matrix A. Proof: Suppose that λ is an eigenvalue of A, with eigenvector v. Then Av = λv, or λv − Av = 0, where the zero on the right-hand side is the zero column vector. Then, putting in the identity matrix I , we have 0 = λv − Av = (λI − A)v. Thus the matrix λI − A has a non-trivial kernel, v. By the key theorem of linear algebra, this happens precisely when det(λI − A) = 0, which means that λ is a root of the characteristic polynomial P (t) = det(tI − A). Since all of these directions can be reversed, we have our theorem.  Theorem 1.8.3 Let A and B be similar matrices. Then the characteristic polynomial of A is equal to the characteristic polynomial of B. 1.8 Eigenvalues and Eigenvectors 17 Proof: For A and B to be similar, there must be an invertible matrix C with A = C −1 BC. Then det(tI − A) = det(tI − C −1 BC) = det(tC −1 C − C −1 BC) = det(C −1 ) det(tI − B) det(C) = det(tI − B) using that 1 = det(C −1 C) = det(C −1 ) det(C).  Since the characteristic polynomials for similar matrices are the same, this means that the eigenvalues must be the same. Corollary 1.8.4 The eigenvalues for similar matrices are equal. Thus to see if two matrices are similar, one can compute to see if the eigenvalues are equal. If they are not, the matrices are not similar. Unfortunately, in general, having equal eigenvalues does not force matrices to be similar. For example, the matrices   2 −7 A= 0 2 and   2 0 B= 0 2 both have eigenvalue 2 with multiplicity two, but they are not similar. (This can be shown by assuming that there is an invertible 2 × 2 matrix C with C −1 AC = B and then showing that det(C) = 0, contradicting the invertibility of C.) Since the characteristic polynomial P (t) does not change under a similarity transformation, the coefficients of P (t) will also not change under a similarity transformation. But since the coefficients of P (t) will themselves be (complicated) polynomials of the entries of the matrix A, we now have certain special polynomials of the entries of A that are invariant under a similarity transformation. One of these coefficients we have already seen in another guise, namely the determinant of A, as the following theorem shows. This theorem will more importantly link the eigenvalues of A to the determinant of A. Theorem 1.8.5 Let λ1,... ,λn be the eigenvalues, counted with multiplicity, of a matrix A. Then det(A) = λ1 · · · λn. 18 Linear Algebra Before proving this theorem, we need to discuss the idea of counting eigenvalues “with multiplicity.” The difficulty is that a polynomial can have a root that must be counted more than once (e.g., the polynomial (x − 2)2 has the single root 2 which we want to count twice). This can happen in particular to the characteristic polynomial. For example, consider the matrix ⎛ ⎞ 5 0 0 ⎝0 5 0⎠ 0 0 4 which has as its characteristic polynomial the cubic (t − 5)(t − 5)(t − 4). For the above theorem, we would list the eigenvalues as 4, 5, and 5, hence counting the eigenvalue 5 twice. Proof: Since the eigenvalues λ1,... ,λn are the (complex) roots of the character- istic polynomial det(tI − A), we have (t − λ1 ) · · · (t − λn ) = det(tI − A). Setting t = 0, we have (−1)n λ1 · · · λn = det(−A). In the matrix (−A), each column of A is multiplied by (−1). Using the second definition of a determinant, we can factor out each of these (−1), to get (−1)n λ1 · · · λn = (−1)n det(A) and our result.  Now finally to turn back to determining a “good” basis for representing a linear transformation. The measure of “goodness” is how close the matrix is to being a diagonal matrix. We will restrict ourselves to a special, but quite prevalent, class: symmetric matrices. By symmetric, we mean that if A = (aij ), then we require that the entry at the ith row and j th column (aij ) must equal the entry at the j th row and the ith column (aj i ). Thus ⎛ ⎞ 5 3 4 ⎝3 5 2⎠ 4 2 4 is symmetric but ⎛ ⎞ 5 2 3 ⎝6 5 3 ⎠ 2 18 4 is not. 1.9 Dual Vector Spaces 19 Theorem 1.8.6 If A is a symmetric matrix, then there is a matrix B similar to A which is not only diagonal but has the entries along the diagonal being precisely the eigenvalues of A. Proof: The proof basically rests on showing that the eigenvectors for A form a basis in which A becomes our desired diagonal matrix. We will assume that the eigenvalues for A are distinct, as technical difficulties occur when there are eigenvalues with multiplicity. Let v1,v2,... ,vn be the eigenvectors for the matrix A, with corresponding eigenvalues λ1,λ2,... ,λn. Form the matrix C = (v1,v2,... ,vn ), where the ith column of C is the column vector vi. We will show that the matrix C −1 AC will satisfy our theorem. Thus we want to show that C −1 AC equals the diagonal matrix ⎛ ⎞ λ1 0 · · · 0 ⎜.. ⎟. B = ⎝..........⎠ 0 0 · · · λn Denote ⎛ ⎞ ⎛ ⎞ ⎛ ⎞ 1 0 0 ⎜0⎟ ⎜1⎟ ⎜0⎟ ⎜ ⎟ ⎜ ⎟ ⎜ ⎟ e1 = ⎜.. ⎟ , e2 = ⎜.. ⎟ ,... ,en = ⎜.. ⎟. ⎝.⎠ ⎝.⎠ ⎝.⎠ 0 0 1 Then the above diagonal matrix B is the unique matrix with Bei = λi ei , for all i. Our choice for the matrix C now becomes clear as we observe that for all i, Cei = vi. Then we have C −1 ACei = C −1 Avi = C −1 (λi vi ) = λi C −1 vi = λi ei , giving us the theorem.  This is of course not the end of the story. For non-symmetric matrices, there are other canonical ways of finding “good” similar matrices, such as the Jordan canonical form, the upper triangular form and rational canonical form. 1.9 Dual Vector Spaces.............................................................................. It pays to study functions. In fact, functions appear at times to be more basic than their domains. In the context of linear algebra, the natural class of functions is 20 Linear Algebra linear transformations, or linear maps from one vector space to another. Among all real vector spaces, there is one that seems simplest, namely the one-dimensional vector space of the real numbers R. This leads us to examine a special type of linear transformation on a vector space, those that map the vector space to the real numbers, the set of which we will call the dual space. Dual spaces regularly show up in mathematics. Let V be a vector space. The dual vector space, or dual space, is: V ∗ = {linear maps from V to the real numbers R} = {v ∗ : V → R | v ∗ is linear}. One of the exercises is to show that the dual space V ∗ is itself a vector space. Let T : V → W be a linear transformation. Then we can define a natural linear transformation T ∗: W∗ → V ∗ from the dual of W to the dual of V as follows. Let w ∗ ∈ W ∗. Then given any vector w in the vector space W , we know that w ∗ (w) will be a real number. We need to define T ∗ so that T ∗ (w ∗ ) ∈ V ∗. Thus given any vector v ∈ V , we need T ∗ (w ∗ )(v) to be a real number. Simply define T ∗ (w ∗ )(v) = w ∗ (T (v)). By the way, note that the direction of the linear transformation T : V → W is indeed reversed to T ∗ : W ∗ → V ∗. Also by “natural” we do not mean that the map T ∗ is “obvious” but instead that it can be uniquely associated to the original linear transformation T. Such a dual map shows up in many different contexts. For example, if X and Y are topological spaces with a continuous map F : X → Y and if C(X) and C(Y ) denote the sets of continuous real-valued functions on X and Y , then here the dual map F ∗ : C(Y ) → C(X) is defined by F ∗ (g)(x) = g(F (x)), where g is a continuous map on Y. Attempts to abstractly characterize all such dual maps were a major theme of mid-twentieth-century mathematics and can be viewed as one of the beginnings of category theory. 1.10 Books.............................................................................. Mathematicians have been using linear algebra since they have been doing mathematics, but the styles, methods and terminologies have shifted. For example, if you look in a college course catalog in 1900, or probably even 1950, there will Exercises 21 be no undergraduate course called linear algebra. Instead th

Use Quizgecko on...
Browser
Browser