Lecture Notes: Linear Algebra
Jolijn Cottaar, Marc Vorstermans
September 2023
Summary
These lecture notes cover linear algebra topics, including linear equations, systems of linear equations, and matrix algebra. They provide definitions, examples, and exercises related to these key concepts and algorithms. The lecture notes are part of the course 'Programming and Linear Algebra'.
Contents

1 Linear equations and systems of linear equations (part 1)
  1.1 Linear Algebra
  1.2 Systems of Linear Equations
  1.3 Solving Linear Systems
  1.4 Step-by-step algorithm to solve linear systems
  1.5 Exercises
2 Linear equations and systems of linear equations (part 2)
  2.1 Solution Sets of Linear Systems
  2.2 Linear Independence
  2.3 Linear Mappings
  2.4 Exercises
3 Matrix Algebra (part 1)
  3.1 Matrix Operations
    3.1.1 Addition and Scalar Multiplication
  3.2 Matrix Multiplication
  3.3 The Transpose and Inverse of a Matrix
    3.3.1 The Transpose of a Matrix
    3.3.2 The Inverse of a Matrix
    3.3.3 Invertible Linear Transformations
  3.4 Exercises
4 Matrix Algebra (part 2)
  4.1 LU-Decomposition
  4.2 Subspaces of Rn
  4.3 Dimension and Rank of a Matrix
  4.4 Exercises
5 Determinants
  5.1 Introduction to Determinants
  5.2 Properties of Determinants
  5.3 Applications of Determinants
    5.3.1 Cramer's rule
    5.3.2 Areas and Volumes
  5.4 Exercises
6 Eigenvalues and Eigenvectors (part 1)
  6.1 Introduction to Eigenvalues and Eigenvectors
  6.2 Characteristic equation
  6.3 Diagonalization
  6.4 Powers of matrices
  6.5 Exercises
7 Eigenvalues and Eigenvectors (part 2)
  7.1 Complex Eigenvalues
  7.2 Applications to Differential Equations
  7.3 Exercises
8 Solutions
  8.1 Lecture 1
  8.2 Lecture 2
  8.3 Lecture 3

Preface

These are the lecture notes for the Linear Algebra part of the course 6BBR06 Programming and Linear Algebra. These lecture notes contain what has been explained in the lectures, sometimes with an extra example or explanation. The paragraphs, theorems and exercises mentioned in these notes are all from the following book: David Lay, Steven Lay and Judi McDonald, 'Linear Algebra and its Applications, 6th edition', Pearson New International Edition, ISBN 9781292351216. If you notice any mistakes in these lecture notes, please let us know.

Chapter 1: Linear equations and systems of linear equations (part 1)

Paragraphs in the book: 1.1 to 1.4.

In this lecture we introduce the topic of linear equations and linear systems.
We look at several ways to write down such a system and at ways to solve it. We also consider the existence and uniqueness of solutions.

1.1 Linear Algebra

"Linear": A linear equation in the variables x1, ..., xn is an equation that can be written in the form

    a1 x1 + a2 x2 + ... + an xn = b,

where the coefficients a1, ..., an and b are numbers (either real or complex). The subscript n counts the number of variables (and coefficients) and thus is an integer. It can be seen as the number of dimensions we are looking at.

Example (Linear equations).

    6x1 + 2x2 = 4
    x + y = 3

"Algebra": using letters and other symbols to represent numbers and quantities in formulas and equations. So exactly what we did while talking about linearity.

1.2 Systems of Linear Equations

We look at the methane combustion reaction: CH4 + O2 → CO2 + H2O. We can look at the balancing equation for this:

    x1 CH4 + x2 O2 = x3 CO2 + x4 H2O.

Let us look at the vector of components (#C, #H, #O). We can rewrite the balancing equation as a vector equation:

       [1]      [0]      [1]      [0]
    x1 [4] + x2 [0] = x3 [0] + x4 [2]
       [0]      [2]      [2]      [1]

Now we would prefer to have only a constant vector on the right-hand side, so let us move everything to the left-hand side:

       [1]      [0]      [-1]      [ 0]   [0]
    x1 [4] + x2 [0] + x3 [ 0] + x4 [-2] = [0]
       [0]      [2]      [-2]      [-1]   [0]

We can also look at this as a system of linear equations:

    x1        -  x3       = 0
    4x1             - 2x4 = 0
         2x2 - 2x3  -  x4 = 0

Or as an augmented matrix, where we put the coefficients of the linear system in a matrix and augment it with the vector containing the right-hand sides:

    [1 0 -1  0 | 0]
    [4 0  0 -2 | 0]
    [0 2 -2 -1 | 0]

Now we can split this augmented matrix up in the shape of Ax = b, where A is a matrix and x, b are vectors. In our case the size of A is 3 × 4 (3 rows, 4 columns), x is a vector of size 4 and b a vector of size 3:

    [1 0 -1  0]   [x1]   [0]
    [4 0  0 -2] · [x2] = [0]
    [0 2 -2 -1]   [x3]   [0]
                  [x4]

We shall see later that solving this system gives infinitely many solutions, but of course in practice we would only be interested in positive integer solutions.

1.3 Solving Linear Systems

We have now seen how to write linear systems in a multitude of ways. But how do we solve these kinds of systems? The first question we should ask ourselves is whether there even is a solution. Let us look at some examples of linear systems in R2:

    (a) x1 + x2 = 3      (b) x1 + x2 = 3      (c) x1 + x2 = 3
        -x1 + 2x2 = 2        x1 + x2 = 5          2x1 + 2x2 = 6

    [Figure 1.1: The three options for solving a linear system.]

So we can see here that there is either:

1. one, unique solution, i.e., consistent and unique,
2. no solution, i.e., not consistent, or
3. an infinite number of solutions, i.e., consistent and not unique.

Given a system, we usually start out by asking the questions: Is the system consistent, i.e., is it possible to have a solution at all? And if there is a solution, is it unique?

So how do we go about solving such a system? Looking at a simple example,

    x1 + x2 = 5
    (1/2) x2 = 3/2

we can first multiply the second row by 2, without changing the solution space of this linear system:

    x1 + x2 = 5
    x2 = 3

We can then easily find that x2 = 3 and thus x1 = 2, by subtracting the second equation from the first. These are operations we can do on a linear system without changing its solutions. We can also switch rows:

    x2 = 3
    x1 + x2 = 5

is the same linear system as above. This also works when looking at the augmented matrix form of this system:

1. Scaling: multiply all entries of a row by a nonzero constant:

    [0 1/2 | 3/2] ~ [0 1 | 3]
    [1  1  |  5 ]   [1 1 | 5]

2. Interchange: interchange two rows:

    [0 1 | 3] ~ [1 1 | 5]
    [1 1 | 5]   [0 1 | 3]

3. Replacement: add a multiple of a row to another row:

    [1 1 | 5] ~ [1 0 | 2]
    [0 1 | 3]   [0 1 | 3]

We call these three operations the 'elementary row operations'. We can use them freely while solving a system, since none of these operations change the solution space.
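These row operations are easy to mirror on a computer. The sketch below applies the same three operations to the small augmented matrix above; it assumes NumPy is available (the course tooling is not specified in these notes):

```python
import numpy as np

# Augmented matrix [A | b] for the system x1 + x2 = 5, (1/2)x2 = 3/2
M = np.array([[0.0, 0.5, 1.5],
              [1.0, 1.0, 5.0]])

M[0] *= 2                 # scaling: multiply row 0 by a nonzero constant
M[[0, 1]] = M[[1, 0]]     # interchange: swap rows 0 and 1
M[0] -= M[1]              # replacement: subtract row 1 from row 0

print(M)                  # [[1. 0. 2.], [0. 1. 3.]]  ->  x1 = 2, x2 = 3
```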
In principle these three operations are enough to solve any linear system, but for a system of, say, 10,000 variables an unstructured approach leads to issues. So we use these operations systematically to clean up our matrices and find the solutions of our systems.

Given a linear system, for example

    2x1 + 6x2 + 6x3 = 8
          2x2 + 9x3 = 22
    4x1 + 13x2 + 15x3 = 21,

the first question we want to answer is whether this system is consistent, i.e., whether it has at least one solution. To reach this goal, we put our matrices in something called 'row echelon form' or simply 'echelon form' (it is also called 'triangular form' on occasion). This is a matrix where all rows consisting of only zeros are at the bottom, and the leading entry of every nonzero row is to the right of the leading entries of all rows above it.

Example (Augmented matrices in echelon form).

    [3 2 1 | 5]   [1 1 | 5]   [1 0 3 | 2]
    [0 1 4 | 3] , [0 1 | 3] , [0 0 3 | 7]
    [0 0 0 | 2]               [0 0 0 | 0]

This already looks a lot nicer and can sometimes already lead to a solution. An example of how we can use the elementary row operations to get to this echelon form, starting from the example linear system:

    [2  6  6 | 8 ]  3rd - 2·1st    [2 6 6 | 8 ]
    [0  2  9 | 22]      ~          [0 2 9 | 22]
    [4 13 15 | 21]                 [0 1 3 | 5 ]

                    switch 2nd,3rd [2 6 6 | 8 ]
                        ~          [0 1 3 | 5 ]
                                   [0 2 9 | 22]

                    3rd - 2·2nd    [2 6 6 | 8 ]
                        ~          [0 1 3 | 5 ]
                                   [0 0 3 | 12]

Note the symbol ~ between our matrices. It indicates that the matrices before and after are equivalent, i.e., the systems belonging to them have the same solutions.

From the row echelon form we can already conclude whether a system has any solutions at all, in other words whether it is consistent. A system in row echelon form is not consistent if it has a row of zeros before the vertical line and a nonzero entry after the vertical line. An example is

    [3 2 1 | 5]
    [0 1 4 | 3]
    [0 0 0 | 2]

as we have seen before. The bottom row implies that 0·x1 + 0·x2 + 0·x3 = 2, which of course is impossible. This system thus has no solutions. Our running example has no such issues, so it is consistent.

Our second question is: does there exist one unique solution, or infinitely many? The 'reduced row echelon form' takes the row echelon form and continues doing row operations until:
- the leading entry of each nonzero row is 1, and
- each column containing a leading 1 has zeros in all its other entries.

Let us continue with our example:

    [2 6 6 | 8 ]  (1/2)·1st    [1 3 3 | 4 ]
    [0 1 3 | 5 ]      ~        [0 1 3 | 5 ]
    [0 0 3 | 12]               [0 0 3 | 12]

                  (1/3)·3rd    [1 3 3 | 4]
                      ~        [0 1 3 | 5]
                               [0 0 1 | 4]

                  1st - 3·2nd  [1 0 -6 | -11]
                      ~        [0 1  3 |  5 ]
                               [0 0  1 |  4 ]

                  1st + 6·3rd  [1 0 0 | 13]
                      ~        [0 1 3 | 5 ]
                               [0 0 1 | 4 ]

                  2nd - 3·3rd  [1 0 0 | 13]
                      ~        [0 1 0 | -7]
                               [0 0 1 | 4 ]

Now we can easily find the solution: putting this back into a linear system, we read off

    x1 = 13,  x2 = -7,  x3 = 4.

This system thus has one unique solution, namely the one given here. In Lecture 3, we discuss the relation between the augmented matrix and the uniqueness of a solution.
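Row reductions like this can be double-checked by machine. As a sketch (assuming the SymPy library, which these notes do not prescribe), the rref method reproduces the reduced row echelon form of the augmented matrix above:

```python
from sympy import Matrix

# Augmented matrix of the running example
M = Matrix([[2, 6, 6, 8],
            [0, 2, 9, 22],
            [4, 13, 15, 21]])

R, pivots = M.rref()   # reduced row echelon form and pivot column indices
print(R)               # Matrix([[1, 0, 0, 13], [0, 1, 0, -7], [0, 0, 1, 4]])
print(pivots)          # (0, 1, 2): the columns of the basic variables
```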
In other cases we find a reduced row echelon form as in the following example, where we have 4 variables but only 3 equations in our linear system, giving a 3 × 4 coefficient matrix:

    [1 0 3 0 | 5 ]
    [0 1 7 0 | 11]
    [0 0 0 1 | -2]

The columns that contain a leading 1 and otherwise only zeros are called the pivot columns, or just pivots. They show which variables are 'basic': these are uniquely determined by the variables that are not basic. So in this case x1, x2 and x4 are basic variables. The variables that are not basic are called free variables; here only x3 is free, since it is free to choose.

We can also easily see this if we write the matrix back into a system of linear equations:

    x1 + 3x3 = 5
    x2 + 7x3 = 11
    x4 = -2

We can rewrite this so that all basic variables depend only on constants and the free variables:

    x1 = 5 - 3x3
    x2 = 11 - 7x3
    x4 = -2

Now if we want to write down this solution, we use a parametric description:

        [x1]   [5 - 3x3 ]   [ 5]      [-3]
    x = [x2] = [11 - 7x3] = [11] + x3 [-7]
        [x3]   [   x3   ]   [ 0]      [ 1]
        [x4]   [   -2   ]   [-2]      [ 0]

for any x3 ∈ R. This means there are infinitely many solutions, since there are infinitely many choices for x3.

1.4 Step-by-step algorithm to solve linear systems

Given a linear system, follow these steps to solve it:

1. Write down the augmented matrix belonging to the system.
2. Use row reduction to reach the echelon form, then check whether the system is consistent. If not, you are done; otherwise continue with the next step.
3. Continue row reduction until you reach the reduced echelon form.
4. Write the matrix back as a linear system.
5. Either write down the unique solution (if it exists), or write down the parametric solution based on the free variables.

1.5 Exercises

The exercises of this lecture concern Sections §1.1, §1.2, §1.3, §1.4 of the book.

1. Solve the following system and verify your answer by substituting the values you found into the original system:

    x2 + 4x3 = -4
    x1 + 3x2 + 3x3 = -2
    3x1 + 7x2 + 5x3 = 6

2. Determine the value of h such that the given matrix is the augmented matrix of a consistent linear system:

    [ 1 3 | -2]
    [-4 h |  8]

3. Row reduce the following matrix to reduced row echelon form. Note the pivot positions in both the original and the final matrix.

    [1 2 3 4]
    [4 5 6 7]
    [6 7 8 9]

4. Find the general solution of the system:

    x1 - 7x2 + 6x4 = 5
    x3 - 2x4 = -3
    -x1 + 7x2 - 4x3 + 2x4 = 7

5. A system of linear equations with fewer equations than unknowns is often called an underdetermined system. A system with more equations than unknowns is called overdetermined.
   (a) Suppose that an underdetermined system happens to be consistent. What can you conclude about the number of solutions, and why?
   (b) Can an overdetermined system be consistent? Illustrate your answer with an example of a system with two unknowns and three equations.

6. Write the system of equations and the vector notation corresponding to the given augmented matrix:

    [ 2 -1  3 | 0]
    [-1 -2 -9 | 0]
    [ 2  0 -4 | 0]

7. Determine if b is a linear combination of the columns of A:

        [ 1 -4  2]       [ 3]
    A = [ 0  3  5],  b = [-7]
        [-2  8 -4]       [-3]

8. Write the following vector equation in the form Ax = b:

       [ 3]      [ 6]      [-1]   [ 0]
    x1 [ 2] + x2 [ 1] + x3 [ 0] = [ 4]
       [-1]      [-7]      [ 4]   [-5]
       [ 0]      [ 3]      [ 2]   [ 1]

9. Given

        [ 1  2  4]       [-2]
    A = [ 0  1  5],  b = [ 2].
        [-2 -4 -3]       [ 9]

   (a) Write down the augmented matrix corresponding to the matrix equation Ax = b.
   (b) Solve the system and write the solution in vector notation.

10. Let A be a 3 × 2 matrix.
   (a) Explain why the equation Ax = b cannot be consistent for all b in R3.
   (b) Generalize your argument in (a) to the case of an arbitrary matrix with more rows than columns.

For some additional exercises, check Sowiso or use the provided book. Recommended exercises from the book for this paragraph are: §1.1: 3, 9, 39; §1.2: 1a, 1c, 11, 19; §1.3: 5, 9; §1.4: 3, 7, 17, 21, 41, 46. Having trouble with the exercises? Try the Practice Problems at the start of each paragraph!
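If you want to verify your hand-computed answers to these exercises, the whole pipeline of Section 1.4 can be run in one call. A sketch assuming SymPy; linsolve accepts an augmented matrix and returns the solution set (exercise 1 is used as input here, so running it reveals the answer):

```python
from sympy import Matrix, linsolve, symbols

x1, x2, x3 = symbols('x1 x2 x3')

# Augmented matrix [A | b] of exercise 1
M = Matrix([[0, 1, 4, -4],
            [1, 3, 3, -2],
            [3, 7, 5, 6]])

# Returns a set: empty (inconsistent), a single point, or a parametric family
print(linsolve(M, x1, x2, x3))
```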
Chapter 2: Linear equations and systems of linear equations (part 2)

Paragraphs in the book: 1.5, 1.7 to 1.9.

In this lecture we continue looking at how to solve linear systems, and we take a look at linear independence and linear maps.

2.1 Solution Sets of Linear Systems

In the last lecture we saw how to check whether a system is consistent, i.e., whether there exists at least one solution, and whether that solution is unique. This lecture we take a closer look at this, in the form of homogeneous and inhomogeneous systems of equations. Let us start with an example of a linear system.

Example (Solving a linear system).

    x1 + x2 + 2x3 + 4x4 = 5
    2x1 + 3x2 + 7x3 + 8x4 = 5
    x2 + 3x3 + 3x4 = 1

We write this in augmented matrix form and rewrite it to echelon form to check consistency:

    [1 1 2 4 | 5]  2nd - 2·1st  [1 1 2 4 | 5 ]
    [2 3 7 8 | 5]      ~        [0 1 3 0 | -5]
    [0 1 3 3 | 1]               [0 1 3 3 | 1 ]

                   3rd - 2nd    [1 1 2 4 | 5 ]
                       ~        [0 1 3 0 | -5]
                                [0 0 0 3 | 6 ]

Now we see that this system is consistent, since it is in echelon form without any rows of the form [0 0 0 ... | a] for some nonzero a. We can now continue to find the solution:

    [1 1 2 4 | 5 ]  1st - 2nd    [1 0 -1 4 | 10]
    [0 1 3 0 | -5]      ~        [0 1  3 0 | -5]
    [0 0 0 3 | 6 ]               [0 0  0 3 | 6 ]

                    (1/3)·3rd    [1 0 -1 4 | 10]
                        ~        [0 1  3 0 | -5]
                                 [0 0  0 1 | 2 ]

                    1st - 4·3rd  [1 0 -1 0 | 2 ]
                        ~        [0 1  3 0 | -5]
                                 [0 0  0 1 | 2 ]

We have now reached the reduced echelon form, with three basic variables x1, x2, x4 and one free variable x3. So we can write our solution again as a linear system and rewrite it in vector notation:

    x1 - x3 = 2        x1 = 2 + x3             [ 2]      [ 1]
    x2 + 3x3 = -5  →   x2 = -5 - 3x3   →   x = [-5] + x3 [-3]
    x4 = 2             x4 = 2                  [ 0]      [ 1]
                                               [ 2]      [ 0]

The system we just looked at is called an inhomogeneous system. A homogeneous system, on the other hand, always has zeros on the right-hand side of every equation. An example of a homogeneous system is the following:

    x1 + x2 + 2x3 + 4x4 = 0
    2x1 + 3x2 + 7x3 + 8x4 = 0
    x2 + 3x3 + 3x4 = 0

Note that each right-hand side equals 0. Systems of this type always have a trivial solution, namely x = 0. A homogeneous system is thus always consistent; the question is whether nontrivial solutions exist. If there are any free variables after row reduction, then there are infinitely many nontrivial solutions. If there are no free variables, there is only the trivial solution x = 0.

Example (Solving a homogeneous system). We take the above example of a homogeneous system; since we do not have to check consistency, we can go straight to the reduced row echelon form (check this as an exercise):

    [1 0 -1 0 | 0]
    [0 1  3 0 | 0]
    [0 0  0 1 | 0]

(Note that you could easily leave out the part after the vertical line, since it will always contain only zeros.) This reduced echelon form leads to the following solution space in vector notation:

           [ 1]
    x = x3 [-3]
           [ 1]
           [ 0]

Note the similarity between this solution and the one we found in our first example: apart from the first (constant) vector, the solutions are the same. The homogeneous and inhomogeneous systems are thus very much related. On this we have the following theorem:

Theorem (6). Suppose the equation Ax = b is consistent for some given b, and let p be a solution of this equation. Then the solution set of Ax = b consists of all vectors of the form w = p + vh, where vh is any solution of the homogeneous equation Ax = 0.

So why is this actually true? To understand it, we look at the geometrical interpretation of the theorem in R2 and R3 in Figure 2.1. In these figures we see the solution space of Ax = 0; we then lift it to the solution space of Ax = b by translating the whole space by p.
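Theorem 6 can be made concrete for the example above. A sketch assuming SymPy: the nullspace method returns the homogeneous solutions, and adding the particular solution p = (2, -5, 0, 2) found above generates the full solution set:

```python
from sympy import Matrix

A = Matrix([[1, 1, 2, 4],
            [2, 3, 7, 8],
            [0, 1, 3, 3]])
b = Matrix([5, 5, 1])
p = Matrix([2, -5, 0, 2])          # particular solution found above

print(A * p == b)                  # True: p solves Ax = b
print(A.nullspace())               # [Matrix([[1], [-3], [1], [0]])]

# Any w = p + t * vh also solves Ax = b, e.g. with t = 7:
w = p + 7 * A.nullspace()[0]
print(A * w == b)                  # True
```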
    [Figure 2.1: Geometric interpretation of Theorem 6. (a) Mapping solutions in R2. (b) Mapping solutions in R3.]

2.2 Linear Independence

Let us stay in geometry a bit longer and look at what linear independence looks like for a set of vectors, as seen in Figure 2.2. Let v1, v2 be two vectors that are not multiples of each other. These vectors span a space, i.e., the space can be defined as the span of these two vectors:

Definition (Span). The span of the vectors v1, ..., vp is the set of all linear combinations of these vectors. We denote the span as ⟨v1, ..., vp⟩ or Span(v1, ..., vp).

For this we need the definition of a linear combination:

Definition (Linear combination). A linear combination y of vectors v1, v2, ..., vp and scalars c1, c2, ..., cp is defined as

    y = c1 v1 + c2 v2 + ... + cp vp,

with weights c1, ..., cp.

In Figure 2.2 the span of v1, v2 is the blue plane. Now let us look at v3, which is a linear combination of v1 and v2, so it is linearly dependent on them. The vector v4, on the other hand, is not in the span of v1 and v2 and thus is linearly independent from them. In other words, the set of vectors {v1, v2, v3} is linearly dependent and {v1, v2, v4} is linearly independent.

    [Figure 2.2: Example of linear (in)dependence: a plane spanned by v1, v2, with v3 in the plane and v4 outside it.]

The formal definition of linear independence is the following:

Definition (Linear independence). A set of vectors v1, v2, ..., vp ∈ Rn is said to be linearly independent if the vector equation

    x1 v1 + x2 v2 + ... + xp vp = 0

only has the trivial solution, i.e., (x1, ..., xp) = (0, ..., 0).

This might look like a complicated definition, but the idea is pretty simple. Assume the above is not true; then there is a nontrivial solution (x1, ..., xp) for which the equation equals 0, and you can immediately express one of the vectors as a linear combination of the others, which would mean linear dependence. If we take another look at the equation

    x1 v1 + x2 v2 + ... + xp vp = 0,

we actually have a homogeneous system of linear equations, with the entries of the vi as coefficients and x1, ..., xp as variables. So if we are given a number of vectors and want to know whether they are linearly dependent, we can just test the uniqueness of the solution of the underlying homogeneous system.

Example (Linear independence). Given three vectors v1 = (1, 3, 1), v2 = (2, 4, 6) and v3 = (1, -1, 2), the question is whether these are linearly independent. To answer it, we need to see whether there is a nontrivial solution of the equation

       [1]      [2]      [ 1]   [0]
    x1 [3] + x2 [4] + x3 [-1] = [0]
       [1]      [6]      [ 2]   [0]

We can write this as a system of linear equations (but this is not necessary):

    x1 + 2x2 + x3 = 0
    3x1 + 4x2 - x3 = 0
    x1 + 6x2 + 2x3 = 0

From the vector equation, or from the system of linear equations, we can set up the augmented matrix. The column of zeros at the end is unnecessary and will be left out after the first step, since it will never change (being all zeros):

    [1 2  1 | 0]  2nd - 3·1st, 3rd - 1st   [1  2  1]
    [3 4 -1 | 0]           ~               [0 -2 -4]
    [1 6  2 | 0]                           [0  4  1]

                  3rd + 2·2nd, 1st + 2nd   [1  0 -3]
                           ~               [0 -2 -4]
                                           [0  0 -7]

                  -(1/2)·2nd, -(1/7)·3rd   [1 0 -3]
                           ~               [0 1  2]
                                           [0 0  1]

                  1st + 3·3rd, 2nd - 2·3rd [1 0 0]
                           ~               [0 1 0]
                                           [0 0 1]

From the reduced row echelon form we can conclude that the only solution of this system is x = 0, so v1, v2, v3 are linearly independent.
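An equivalent machine check, as a sketch assuming NumPy: stack the vectors as columns and compare the rank of the matrix to the number of vectors; full rank means the homogeneous system has only the trivial solution.

```python
import numpy as np

# The vectors v1, v2, v3 from the example, as columns of one matrix
V = np.column_stack(([1, 3, 1], [2, 4, 6], [1, -1, 2]))

# rank == number of columns  <=>  the columns are linearly independent
print(np.linalg.matrix_rank(V) == V.shape[1])   # True
```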
We have already seen the span of the vectors v1 and v2. Given a set of vectors that span a space, we can use a similar method to find a minimal spanning set of this space. To do this we, just as when showing linear (in)dependence, create a linear system and solve it. The vectors belonging to the basic variables then form a minimal spanning set, since the vectors belonging to the free variables can be expressed in terms of them. Note that the vectors you find this way do not give a unique minimal spanning set; the number of vectors, however, is minimal.

2.3 Linear Mappings

We have looked at systems of linear equations like

    x1 + 3x2 + 5x3 + x4 = b1
    x2 + 5x3 - x4 = b2,

which we can rewrite as

                 [x1]
    [1 3 5  1] · [x2] = [b1]
    [0 1 5 -1]   [x3]   [b2]
                 [x4]

or also as Ax = b with A = [1 3 5 1; 0 1 5 -1]. This is just a convenient notation, though. We can also look at this as a map which takes an element of R4 (namely x = (x1, x2, x3, x4)) and maps it under the "action" of A to an element of R2 (namely (b1, b2)). Using a matrix A of size m × n like this, we create a linear map or function from Rn to Rm.

Definition (Mapping). A mapping (or function, or transformation) T from Rn to Rm assigns to each vector x ∈ Rn a vector T(x) ∈ Rm. We denote this as T : Rn → Rm. Then:
- The set Rn is called the domain of T.
- The set Rm is called the codomain of T.
- For an x ∈ Rn, the vector T(x) ∈ Rm is the image of x.
- The set of all images T(x) is called the range of T.

Example (Mappings). Let us look at the map

    (x1, x2, x3, x4)  --T-->  (b1, b2),

where T is induced by multiplication on the left by

    A = [1 3 2 -1]
        [5 2 5  1]

Then we have the following:
- The domain of T is R4.
- The codomain of T is R2.
- The image of, for example, x = (1, 2, 1, 0) is (9, 14).
- The range of T is the span of the columns of A, so Span((1,5), (3,2), (2,5), (-1,1)).

Now let us look at some properties of these maps. All maps induced by a matrix are linear maps, defined as follows:

Definition (Linear mapping). A mapping T is linear if:
(i) T(u + v) = T(u) + T(v) for all u, v in the domain of T;
(ii) T(cu) = cT(u) for all scalars c and all u in the domain of T.

In other words, it does not matter whether you first add two elements in the domain and then map by T, or first map by T and then add the two images in the codomain. Similarly, multiplication by a scalar can be done in either the domain or the codomain.

Sometimes these maps are also onto, or surjective:

Definition (Onto/surjective). A mapping T : Rn → Rm is said to be onto Rm, or surjective, if each b ∈ Rm is the image of at least one x ∈ Rn.

A mapping is thus onto (surjective) if for every element of the codomain there is at least one element of the domain that maps to it. Note that multiple elements of the domain may map to the same element of the codomain (if n > m, there are "more" elements in the domain than in the codomain). Figure 2.3 shows an example of a mapping that is surjective and one that is not. In Figure 2.3a all elements of the codomain (the right bubble) have at least one incoming arrow, while in Figure 2.3b there is an element of the codomain that nothing maps to, hence that mapping is not surjective.

Another possible property is being one-to-one, or injective:

Definition (One-to-one/injective). A mapping T : Rn → Rm is said to be one-to-one, or injective, if each b ∈ Rm is the image of at most one x ∈ Rn.

So now every element of the codomain has at most one element of the domain mapping to it. There may be elements of the codomain that are not mapped to at all, but never will two elements of the domain map to the same element of the codomain.
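Before looking at more examples: computing images under a matrix-induced map is plain matrix-vector multiplication. A minimal sketch assuming NumPy, reproducing the image from the mapping example above:

```python
import numpy as np

A = np.array([[1, 3, 2, -1],
              [5, 2, 5,  1]])   # induces T : R^4 -> R^2

x = np.array([1, 2, 1, 0])      # an element of the domain R^4
print(A @ x)                    # [ 9 14], the image T(x) in the codomain R^2
```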
Figure 2.4 shows an example of an injective function and one that is not injective. In Figure 2.4a each element of the codomain has one or zero incoming arrows, so the mapping is injective. In Figure 2.4b, however, there is an element with two incoming arrows: two elements are mapped to the same element of the codomain, so the mapping is not injective.

    [Figure 2.3: Examples and counterexamples of surjectivity. (a) An onto (surjective) mapping. (b) A not-onto (non-surjective) mapping.]
    [Figure 2.4: Examples and counterexamples of injectivity. (a) A one-to-one (injective) mapping. (b) A not-one-to-one (non-injective) mapping.]

If a function is both onto and one-to-one (surjective and injective, respectively), it is called bijective:

Definition (Bijective). A mapping T : Rn → Rm is said to be bijective if it is both onto/surjective and one-to-one/injective.

This means that each element of the codomain has exactly one element of the domain that maps to it. Figure 2.5 shows an example of a bijective function, where each element of the domain is mapped to one unique element of the codomain. The examples in Figure 2.3 and Figure 2.4 are all examples of functions that are not bijective.

    [Figure 2.5: An example of a bijective function.]

Showing that a linear map T, induced by the action of a matrix A, is surjective or injective is pretty straightforward. For surjectivity we need the reduced row echelon form of the matrix A: the map T is surjective if and only if the reduced row echelon form of A does not have a row of zeros, or, in other words, if and only if Ax = b is consistent for every b. This means there is no element of the codomain that cannot be reached by multiplication with A. The map T is injective if and only if the homogeneous system T(x) = 0 has only the trivial solution x = 0, or, equivalently (as we have seen), if and only if the columns of A are linearly independent. This means that distinct elements of the domain cannot map to the same element of the codomain.

Example (Onto/surjective and one-to-one/injective). Let us look at T : R3 → R2, induced by the matrix

    A = [3 2 1]
        [4 1 0]

The reduced row echelon form of this matrix is the following (check this):

    [1 0 -1/5]
    [0 1  4/5]

This map is surjective since there is no row of zeros, but, on the other hand, we have a free variable (namely x3), so it is not injective.

Given geometric transformations, like reflections, contractions, etc., there are standard matrices that induce these mappings. For all of these, see Tables 1, 2, 3 and 4 of the book on pages 102-104.

2.4 Exercises

The exercises of this lecture concern Sections §1.5, §1.7, §1.8, §1.9 of the book.

1. Determine if the following systems have a nontrivial solution. Try to do this as efficiently as possible. Give the solution set.
   (a)
       2x1 - 5x2 + 8x3 = 0
       -2x1 - 7x2 + x3 = 0
       4x1 + 2x2 + 7x3 = 0
   (b)
       x1 + 3x2 + x3 = 0
       -4x1 - 9x2 + 2x3 = 0
       -3x2 - 6x3 = 0

2. A is a 3 × 2 matrix with two pivot positions.
   (a) Does the equation Ax = 0 have a nontrivial solution?
   (b) Does the equation Ax = b have at least one solution for every possible b?

3. Determine if the following vectors are linearly independent:
   (a) (5, 1, 0), (7, 2, -6), (-2, -1, 6)
   (b) (6, 3, 1), (4, -2, 3), (11, -14, 24)

4. Determine if the columns of the following matrix are linearly independent:

    [-4 -3 0]
    [ 0 -1 4]
    [ 1  0 3]
    [ 5  4 6]

5. Determine for which value of h the given vectors are linearly dependent: (1, -1, 4), (3, -5, 7), (-1, 5, h).
6. Suppose A is an m × n matrix with the property that for all b ∈ Rm the equation Ax = b has at most one solution. Use the definition of linear independence to explain why the columns of A must be linearly independent.

7. The map T is defined by the matrix

        [1 -3  2]
    A = [0  1 -4]
        [3 -5 -9]

   Find a vector x whose image under T is b = (6, -7, -9).

8. The map T is defined by the matrix

        [ 1 3 9  2]
    A = [ 1 0 3 -4]
        [ 0 1 2  3]
        [-2 3 0  5]

   Find all x ∈ R4 that T maps to 0.

9. Let T : Rn → Rm be a linear transformation and let S = {v1, v2, v3} be a linearly dependent set in Rn. Explain why the images of the vectors in S are also linearly dependent.

10. Given the following matrix:

        [ 3 4 -1]
    A = [ 9 2  0]
        [-1 2 -1]

    Is the map induced by this matrix injective, surjective and/or bijective?

For some additional exercises, check Sowiso or use the provided book. Recommended exercises from the book for this paragraph are: §1.5: 11, 19; §1.7: 5, 11, 17, 37; §1.8: 3, 9, 15; §1.9: 1, 3, 9, 17, 37, 38, 43. Having trouble with the exercises? Try the Practice Problems at the start of each paragraph!

Chapter 3: Matrix Algebra (part 1)

Paragraphs in the book: 2.1 to 2.3.

In the previous lectures we have seen how matrices can be used to solve systems of linear equations. This lecture focuses on operations that can be performed with matrices, such as addition and multiplication. Additionally, we discuss the transpose and the inverse of a matrix.

3.1 Matrix Operations

A matrix A with m rows and n columns is called an m by n matrix (or, in mathematical notation, an m × n matrix). The entry in the i-th row and j-th column is denoted by ai,j and is called the (i, j)-entry of A. When a matrix has the same number of rows and columns, it is called a square matrix. We often denote matrices by capital letters. A general 2 × 3 matrix looks as follows:

    [a11 a12 a13]
    [a21 a22 a23]

Matrices do not necessarily correspond to a system of linear equations; they can be objects in their own right, like the real numbers. Before discussing the most common matrix operations, we briefly mention two special matrices: the identity matrix, denoted by I or In (to indicate that it has size n × n), and the zero matrix, denoted by 0 or 0m,n (if it has m rows and n columns). The identity matrix is a square matrix with 1's on the diagonal and 0's everywhere else. As an example, I3 (the identity matrix of size 3) is given below:

         [1 0 0]
    I3 = [0 1 0]
         [0 0 1]

The zero matrix is a matrix containing only 0's. It can have any size. For example,

           [0 0]
    03,2 = [0 0]
           [0 0]

For both the identity and the zero matrix, it is fine to denote them by I and 0 respectively, as long as their sizes can be deduced from the context.
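In code these special matrices have direct constructors. A minimal sketch assuming NumPy, also showing that I acts as an identity for matrix multiplication:

```python
import numpy as np

I3 = np.eye(3)          # 3 x 3 identity matrix
Z = np.zeros((3, 2))    # 3 x 2 zero matrix

A = np.array([[2.0, 0.0],
              [1.0, -2.0],
              [0.0, 3.0]])

print(np.array_equal(I3 @ A, A))   # True: I3 · A = A
```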
3.1.1 Addition and Scalar Multiplication

Now let's have a look at some common matrix operations. Say we have two matrices A and B. If the two matrices have exactly the same size, we can add or subtract them: when adding (or subtracting) two matrices, we add (or subtract) the entries ai,j and bi,j entry by entry.

Example (Matrix addition). The matrices A and B below have the same size, and therefore we can add them:

    A = [2  0 3],  B = [5 -3  1]
        [1 -2 1]       [2  0 -1]

Adding the two matrices gives

    A + B = [7 -3 4]
            [3 -2 0]

If two matrices do not have the same size, we cannot add or subtract them from one another.

Example (Matrix addition, distinctly sized matrices). The following matrices A, B cannot be added, because A is a 2 × 3 matrix and B a 2 × 2 matrix:

    A = [2  0 3]  and  B = [5 -3]
        [1 -2 1]           [2  0]

Matrices can be multiplied by scalars. Let λ ∈ R be a scalar and A a matrix with entries ai,j. The product λ · A has entries λ · ai,j.

Example (Scalar multiplication). Let λ be a scalar and A a matrix, defined as

    λ = 3,  A = [2  0 3]
                [1 -2 1]

Then multiplying λ with A gives

    λ · A = [6  0 9]
            [3 -6 3]

3.2 Matrix Multiplication

Before we look at what happens when we multiply two matrices, we first look at multiplying a matrix by a vector. An m × n matrix can only be multiplied by a vector of length n. The idea is to multiply the first column of A by the first entry of v, the second column of A by the second entry of v, and so on; these results are then added to give the final answer. In general, multiplying an m × n matrix by a vector of length n gives a vector of length m. An example is given below; there m = 2 and n = 3, and we see that this is indeed true.

Example (Matrix-vector multiplication). We take

    A = [2  0 3],  v = (-2, 1, 3).
        [1 -2 1]

When determining A · v, we get

    A · v = -2 · [2] + 1 · [ 0] + 3 · [3] = [-2·2 + 1·0 + 3·3   ] = [ 5]
                 [1]       [-2]       [1]   [-2·1 + 1·(-2) + 3·1]   [-1]

One can see that, when we take the first row of A and the vector v, entry-wise multiply the entries and add them, we obtain the first entry of the resulting vector (being 5). Taking the second row of A and performing the same computation gives the second entry (being -1).

Now let's have a look at multiplying two matrices. Let A be an m × n matrix and B an n × k matrix. We want to determine C = A · B. The procedure is very similar to multiplying a matrix by a vector; the difference is that we perform the step above for each column of B, a total of k times, and in the end the resulting columns are placed next to each other. The resulting matrix C is an m × k matrix. The number of columns of A must equal the number of rows of B; otherwise the matrix multiplication is not defined. The following example illustrates the procedure; A is a 2 × 3 matrix and B a 3 × 4 matrix, so the resulting matrix C will be a 2 × 4 matrix.

Example (Matrix multiplication, following the book). We take

    A = [2  0 3],  B = [0  3  1 4]
        [1 -2 1]       [1 -2  1 1]
                       [0  1 -4 0]

We want to determine A · B. We split B into four columns:

    b1 = (0, 1, 0),  b2 = (3, -2, 1),  b3 = (1, 1, -4),  b4 = (4, 1, 0).

Now we separately determine A·b1, A·b2, A·b3, A·b4, in the same way as we have done for matrix-vector multiplication. This gives

    A · b1 = (0, -2),  A · b2 = (9, 8),  A · b3 = (-10, -5),  A · b4 = (8, 2).

Putting all columns next to each other, we obtain

    A · B = [ 0 9 -10 8]
            [-2 8  -5 2]

We see that the result is indeed a 2 × 4 matrix.
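The same product in code, as a check (a sketch assuming NumPy; the @ operator performs matrix multiplication):

```python
import numpy as np

A = np.array([[2, 0, 3],
              [1, -2, 1]])
B = np.array([[0, 3, 1, 4],
              [1, -2, 1, 1],
              [0, 1, -4, 0]])

print(A @ B)
# [[  0   9 -10   8]
#  [ -2   8  -5   2]]
```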
Now let's have a look at an alternative way of multiplying two matrices.

Example (Matrix multiplication per entry). Alternatively, one can perform a matrix multiplication by computing the entries of the result one by one. We look at the same matrices again and want to determine C = A · B:

    A = [2  0 3],  B = [0  3  1 4]
        [1 -2 1]       [1 -2  1 1]
                       [0  1 -4 0]

Suppose we are interested in entry c1,3, i.e., the entry in row 1 and column 3 of C. To compute this entry, we take the first row of A and the third column of B and multiply them with each other. This gives

    (2, 0, 3) · (1, 1, -4) = 2·1 + 0·1 + 3·(-4) = -10.

The procedure is the same as a matrix-vector multiplication where the matrix has size 1 × 3 and the vector size 3; we then expect a vector of size 1 as the answer, which is indeed the case. In general, for matrices A of size m × n and B of size n × k and C = A · B, one can determine ci,j by

    ci,j = ai,1 b1,j + ai,2 b2,j + ... + ai,n bn,j = Σ (l = 1 to n) ai,l bl,j.

Let A be an m × n matrix and let B and C have sizes for which the indicated operations are defined. Below, an overview is given of the properties of matrix multiplication.

    A · (B · C) = (A · B) · C                  associative law of multiplication
    A · (B + C) = A · B + A · C                left-distributive law
    (B + C) · A = B · A + C · A                right-distributive law
    r · (A · B) = (r · A) · B = A · (r · B)    for any scalar r
    Im · A = A = A · In                        identity for matrix multiplication

    Table 3.1: Properties of matrix multiplication

One property that is not in the table is commutativity, i.e., whether A · B = B · A for all matrices A, B. The example on matrix multiplication already shows that matrix multiplication is not a commutative operation: interchanging the order to B · A in that example gives a multiplication of a 3 × 4 with a 2 × 3 matrix, and we have seen that these sizes do not match, so the multiplication is not even defined in this case. Now let's take another example. Let A be a 2 × 4 matrix and B a 4 × 2 matrix. Determining A · B results in a 2 × 2 matrix, while determining B · A results in a 4 × 4 matrix. In both cases we can perform the matrix multiplication, but it is obvious that A · B ≠ B · A. Lastly, we take the following example.

Example (Counterexample to commutativity). We take

    A = [1 2]  and  B = [0 1]
        [0 3]           [0 0]

Then

    A · B = [1 2] · [0 1] = [0 1]   and   B · A = [0 1] · [1 2] = [0 3]
            [0 3]   [0 0]   [0 0]             [0 0]   [0 3]   [0 0]

With this example we see that even if the sizes of A · B and B · A are the same, the products do not have to be equal.

The examples on commutativity show that we cannot simply apply the known computation rules for real numbers to matrices. Next, we discuss some properties you should be careful with while performing computations. Let A, B, C be matrices such that AB = AC. For real numbers a, b, c with a ≠ 0, one can say that ab = ac ⟹ b = c. However, this does not necessarily hold for matrices. Take a look at the following example.

Example (Counterexample to AB = AC ⟹ B = C). We give three matrices A, B, C such that AB = AC but B ≠ C:

    A = [ 2 -3],  B = [8 4],  C = [5 -2]
        [-4  6]       [5 5]       [3  1]

Then

    A · B = [ 1 -7]   and   A · C = [ 1 -7]
            [-2 14]                 [-2 14]

In a similar way, for matrices A, B, the equation A · B = 0 does NOT imply that A = 0 or B = 0 (here 0 is the zero matrix). Can you come up with an example such that A · B = 0 but A and B are nonzero matrices?

Lastly, we discuss the powers of a matrix. Let A be a square matrix of size n, i.e., an n × n matrix, and let k be a positive integer. Then Ak = A · A · ... · A (a total of k times). We define for any square matrix A that A0 = In (the identity matrix of size n). Note that, in order to take powers of a matrix A, the matrix A must be square.
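Matrix powers are also available as a library routine. A sketch assuming NumPy; it reproduces the hand computation in the example that follows:

```python
import numpy as np

A = np.array([[3, 0],
              [1, -2]])

print(np.linalg.matrix_power(A, 3))
# [[27  0]
#  [ 7 -8]]

print(np.linalg.matrix_power(A, 0))   # A^0 is the 2 x 2 identity matrix
```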
Example (Powers of matrices). Let A be the following matrix:

    A = [3  0]
        [1 -2]

We compute A3 as follows:

    A3 = [3  0] · [3  0] · [3  0] = [9 0] · [3  0] = [27  0]
         [1 -2]   [1 -2]   [1 -2]   [1 4]   [1 -2]   [ 7 -8]

3.3 The Transpose and Inverse of a Matrix

3.3.1 The Transpose of a Matrix

Let A be an m × n matrix. The transpose of A, denoted by AT, is the n × m matrix such that row i of A is column i of AT.

Example (Transpose of a matrix).

    A = [1 3 -2]   and   AT = [ 1 5]
        [5 0  1]              [ 3 0]
                              [-2 1]

If the matrix A is square, the size of A is the same as the size of AT. It may then even be the case that A = AT; we then call A a symmetric matrix. An example of a symmetric matrix is

    A = [1 2 3]
        [2 3 4]
        [3 4 5]

Some properties of the transpose of a matrix are the following. For any matrices A, B of appropriate sizes (such that the operations are well-defined), we have:

1. (AT)T = A. Taking the transpose twice gives back the original matrix.
2. (A + B)T = AT + BT. First adding two matrices and then transposing gives the same result as first transposing the individual matrices and then adding them.
3. (rA)T = rAT for any scalar r. First multiplying by a scalar r and then transposing gives the same result as first transposing and then multiplying by r.
4. (AB)T = BT AT. This last one is a little trickier: the transpose of a product of matrices is the product of the individual transposes, BUT in reverse order of multiplication.

Exercise: take two 2 × 2 matrices and check these computation rules yourself!

3.3.2 The Inverse of a Matrix

Now we have a look at taking the inverse of a matrix. Inverting is a very common operation on real numbers. For example, when solving the equation 7x = 21, we "invert" 7:

    7x = 21
    (1/7) · 7x = (1/7) · 21
    x = 3

We will focus on how to perform such an inverse operation for matrices. Only square matrices can be invertible; if a matrix is not square, an inverse cannot exist. This leads to the following definition.

Definition (Invertible matrix). An n × n matrix A is said to be invertible if there exists an n × n matrix A−1 such that A · A−1 = I and A−1 · A = I. Here I = In, the identity matrix of size n. We call A−1 the inverse of A.

If there only exists a matrix A−1 such that A · A−1 = I, then A is called right-invertible. Similarly, if there only exists an A−1 such that A−1 · A = I, then A is left-invertible. In this way, if a matrix is both right- and left-invertible and these two matrices are the same, then the matrix is invertible. A matrix that is NOT invertible is also called a singular matrix; a matrix that is invertible is also called nonsingular. The inverse of a matrix is always unique.

Example (Inverse of a matrix). We give an example of a square matrix A and its inverse A−1, and verify that the inverse is indeed correct:

    A = [2 3],   A−1 = [-7  3]
        [5 7]          [ 5 -2]

We verify that A−1 is indeed the inverse of A:

    A · A−1 = [2 3] · [-7  3] = [1 0]   and   A−1 · A = [-7  3] · [2 3] = [1 0]
              [5 7]   [ 5 -2]   [0 1]                   [ 5 -2]   [5 7]   [0 1]
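The transpose rules and this inverse can both be checked numerically. A sketch assuming NumPy (allclose is used because floating-point results are only equal up to rounding):

```python
import numpy as np

A = np.array([[2.0, 3.0],
              [5.0, 7.0]])
B = np.array([[1.0, -1.0],
              [0.0, 2.0]])

# Rule 4 for the transpose: (AB)^T == B^T A^T
print(np.allclose((A @ B).T, B.T @ A.T))        # True

# The inverse from the example, recomputed and verified
Ainv = np.linalg.inv(A)
print(np.round(Ainv))                           # [[-7.  3.], [ 5. -2.]]
print(np.allclose(A @ Ainv, np.eye(2)))         # True
```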
Now the question remains how we find such an inverse. The basic idea is that we want to solve multiple systems of linear equations at the same time. We take a look at the matrix

    A = [4 3]
        [3 2]

What we are trying to solve is the following:

    A · A−1 = I  ⟹  [4 3] · [a11 a12] = [1 0]  ⟹   4a11 + 3a21 = 1    4a12 + 3a22 = 0
                    [3 2]   [a21 a22]   [0 1]       3a11 + 2a21 = 0    3a12 + 2a22 = 1

We will show how we can solve these systems simultaneously. We combine the augmented matrices of both systems, which results in the matrix A next to the identity matrix, denoted by (A|I):

    (A|I) = [4 3 | 1 0]
            [3 2 | 0 1]

The next step is to bring the matrix A to its reduced row echelon form. We perform the following steps:

    [4 3 | 1 0]  1st - 2nd    [1 1 | 1 -1]
    [3 2 | 0 1]      ~        [3 2 | 0  1]

                 2nd - 3·1st  [1  1 |  1 -1]
                     ~        [0 -1 | -3  4]

                 -1·2nd       [1 1 | 1 -1]
                     ~        [0 1 | 3 -4]

                 1st - 2nd    [1 0 | -2  3]
                     ~        [0 1 |  3 -4]

When we have reduced A to its reduced row echelon form, we can read off the inverse, because in the final step we obtain (I|A−1). Therefore the inverse of A is

    A−1 = [-2  3]
          [ 3 -4]

Checking this by computing both A · A−1 = I and A−1 · A = I is left as an exercise to the reader.

When a matrix is invertible, a system of linear equations has a unique solution. This leads to the following theorem.

Theorem (5). Let A be an invertible n × n matrix. Then for each b ∈ Rn, the equation Ax = b has the unique solution x = A−1 b.

Note that computing the inverse is not always the most efficient approach in practice to solve a system of linear equations.

Recall that for matrices A, B, C (of appropriate sizes) we cannot in general say that AB = AC ⟹ B = C. Now let's see what happens if A is invertible. Let A, B, C be matrices of appropriate sizes such that the matrix multiplications are well-defined and AB = AC, and assume A is invertible, so there exists an A−1 such that A−1 A = I. Then

    AB = AC  ⟹  A−1(AB) = A−1(AC)  ⟹  (A−1 A)B = (A−1 A)C  ⟹  IB = IC  ⟹  B = C.

This shows that the property AB = AC ⟹ B = C does hold if A is invertible.

Let A, B be invertible matrices. Then the following results hold:

1. A−1 is invertible and (A−1)−1 = A. The inverse of an invertible matrix is invertible, and taking the inverse of the inverse gives back the original matrix.
2. AB is invertible and (AB)−1 = B−1 A−1. The product of two invertible matrices is invertible, and its inverse is the product of the inverses of the individual matrices in reverse order.
3. AT is invertible and (AT)−1 = (A−1)T. The transpose of an invertible matrix is invertible, and taking the transpose first and then inverting gives the same result as inverting first and then transposing.

Exercise: take two 2 × 2 matrices and check these computation rules yourself!

A square matrix A is invertible if and only if A is row equivalent to I. This means that if we can reduce a matrix A to the identity matrix using row operations, then an inverse exists; if this is not possible, then A is not invertible. In Lecture 5, the relation between invertibility and the determinant is discussed.

3.3.3 Invertible Linear Transformations

Lastly, we discuss how the inverse of a linear transformation can be found. Let T : Rn → Rn be a linear transformation. We can apply T to elements x ∈ Rn, so that x is mapped to T(x). We want to find a linear transformation S : Rn → Rn such that T(x) is mapped back to x. Such an S exists if T is invertible, and it satisfies

    S(T(x)) = (ST)(x) = x   and   T(S(x)) = (TS)(x) = x.

Now we translate this to matrices. As we have seen, a linear transformation T can be represented by a matrix A; such a matrix is called the standard matrix. Therefore, to check whether a linear transformation is invertible, we simply check whether its standard matrix is invertible. If T, with standard matrix A, is invertible and its inverse is S, then the standard matrix of S is A−1.
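The (A|I) procedure is easy to reproduce by machine. A sketch assuming SymPy: row-reducing the combined matrix returns (I|A−1), and the same check works for the 3 × 3 example that follows.

```python
from sympy import Matrix, eye

A = Matrix([[4, 3],
            [3, 2]])

M = A.row_join(eye(2))     # build the combined matrix (A | I)
R, _ = M.rref()            # row reduce to (I | A^-1)

Ainv = R[:, 2:]            # read off the right half
print(Ainv)                # Matrix([[-2, 3], [3, -4]])
print(A * Ainv == eye(2))  # True
```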
Example (Inverse of a linear transformation). Let T be a linear mapping such that

    T(1,0,0) = (1,-2,0),  T(0,1,0) = (2,-3,4),  T(0,0,1) = (0,-1,-3).

Then T has the following standard matrix:

    AT = [ 1  2  0]
         [-2 -3 -1]
         [ 0  4 -3]

The standard matrix of T−1, the inverse mapping of T, can be found by inverting the matrix AT. We obtain

    [ 1  2  0 | 1 0 0]   [1 2  0 | 1 0 0]
    [-2 -3 -1 | 0 1 0] ~ [0 1 -1 | 2 1 0]
    [ 0  4 -3 | 0 0 1]   [0 4 -3 | 0 0 1]

                       ~ [1 0  2 | -3 -2 0]
                         [0 1 -1 |  2  1 0]
                         [0 0  1 | -8 -4 1]

                       ~ [1 0 0 | 13  6 -2]
                         [0 1 0 | -6 -3  1]
                         [0 0 1 | -8 -4  1]

So the standard matrix of T−1 is given by

    AT−1 = [13  6 -2]
           [-6 -3  1]
           [-8 -4  1]

3.4 Exercises

The exercises of this lecture concern Sections §2.1, §2.2, §2.3 of the book.

1. Given are the matrices

    A = [1 -1  3],  B = [ 1 0 -3],  C = [ 7 -2 3]
        [2 -1 -1]       [-5 0  0]       [ 2  0 2]
                                        [-3  2 0]

   Determine B - 2A, AC, CA, B2 and C2. If an expression is undefined, explain why.

2. Let A be a 3 × 7 matrix and B be a 7 × 5 matrix. What is the size of AB?

3. How many rows and columns does a matrix C have if CA is a 2 × 5 matrix?

4. Let

    X = [3 -2],  XY = [11  -5]
        [6  2]        [-2 -22]

   Determine the matrix Y.

5. For each of the following "properties", provide an example showing why it does not hold. Also reflect on the "properties": what makes them invalid?
   a. A2 = O ⟹ A = O.
   b. A2 = B2 ⟹ A = B or A = -B.
   c. AB = C ⟹ A = B−1 C.

6. Determine the inverse of A = [5 4; 9 7]. Verify that the inverse is indeed correct!

7. Use the inverse you found in exercise 6 to solve the following system of equations:

    5x1 + 4x2 = -1
    9x1 + 7x2 = 3

8. Suppose that AB = AC for some matrices A, B, C. What assumptions on the matrix A are necessary so that we may conclude B = C?

9. Suppose that (B - C)D = 0 for some matrices B, C and some invertible matrix D. Show that B = C.

10. Find the inverse of

    A = [ 1 -2  1]
        [ 4 -7  3]
        [-2  6 -4]

11. Is the following matrix invertible? Explain!

    A = [1 0 0 1]
        [0 1 1 0]
        [0 1 1 0]
        [1 0 0 1]

12. Is the following matrix invertible? Explain!

    A = [1 0 1]
        [1 1 0]
        [0 1 1]

13. Let Ax = b be a system of equations that has more than one solution, where A is an n × n matrix. Can the columns of A span Rn? Explain why or why not.

14. Let A be a 10 × 10 matrix with linearly independent columns. How many solutions does the system Ax = b have? Explain!

15. Let T(x, y) = (5x + 7y, -3x - 6y) be a linear transformation. Show that T is invertible and find a formula for T−1.

For some additional exercises, check Sowiso or use the provided book. Recommended exercises from the book for this paragraph are: the practice exercises of §2.1, §2.2, and §2.3; §2.1: 9, 35; §2.2: 2, 9, 41; §2.3: 41, 44.

Chapter 4: Matrix Algebra (part 2)

Paragraphs in the book: 2.5, 2.8, 2.9.

In this lecture, we discuss how matrices can be factorized using the LU-decomposition. After that, we make a connection between matrices and subspaces. We discuss concepts like the basis and dimension of a subspace. Two spaces related to matrices are discussed in more detail: the column space and the null space of a matrix.

4.1 LU-Decomposition

We know that numbers can be factorized; for example, 24 = 6 · 4. In this section, we discuss how we can factor a matrix. We call this the LU-decomposition. Given a square matrix A, we aim to find two matrices L, U such that A = L · U. The matrix L has to be a lower triangular matrix with 1's on the diagonal, and U is an echelon form of A. For a 4 × 4 matrix A, L and U have the following form:

    L = [1 0 0 0],  U = [* * * *]
        [* 1 0 0]       [0 * * *]
        [* * 1 0]       [0 0 * *]
        [* * * 1]       [0 0 0 *]

Here, all * symbols represent arbitrary values. These values may be zero, but that depends on the matrix A. Note that not all square matrices have an LU-decomposition.
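Library routines compute such factorizations directly. A sketch assuming SciPy; note that scipy.linalg.lu applies row exchanges (partial pivoting) for numerical stability, so it returns a permutation matrix P with A = P · L · U, and its L and U may differ from an unpivoted hand computation like the one worked out below:

```python
import numpy as np
from scipy.linalg import lu

A = np.array([[1.0, 2.0, 3.0],
              [3.0, 4.0, 8.0],
              [0.0, 2.0, 3.0]])

P, L, U = lu(A)   # L is unit lower triangular, U is upper triangular
print(np.allclose(P @ L @ U, A))   # True
```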
Now let's see how we can determine the LU-decomposition of a matrix A. The matrix U can be constructed by row-reducing the matrix A to an echelon form. We keep track of each operation that we perform in order to do this: the entries of L below the diagonal record which row operations were performed. For example, if l2,1 = 4, it means that we subtracted row 1 four times from row 2. We illustrate the procedure with an example. Note that the LU-decomposition is not unique.

Example (LU-decomposition). Let A be the following matrix, for which we want to determine the LU-decomposition:

    A = [1 2 3]
        [3 4 8]
        [0 2 3]

First, we row-reduce A to an echelon form to determine U. While doing this, we keep track of all the row operations that are performed; this information will be used to determine L. We show step by step how L is constructed. We start with

    L = [1 0 0]
        [* 1 0]
        [* * 1]

The first row reduction we perform is R2 - 3·R1, which gives

    [1 2 3]   [1  2  3]         [1 0 0]
    [3 4 8] ~ [0 -2 -1],    L = [3 1 0]
    [0 2 3]   [0  2  3]         [* * 1]

We see that a3,1 = 0 already, so no row operation is required there. Therefore l3,1 = 0, so

    L = [1 0 0]
        [3 1 0]
        [0 * 1]

Continuing this process with R3 - (-1)·R2 gives

    [1  2  3]   [1  2  3]       [1  0 0]
    [0 -2 -1] ~ [0 -2 -1],  L = [3  1 0]
    [0  2  3]   [0  0  2]       [0 -1 1]

We have row-reduced A to echelon form, and it is an upper triangular matrix. This finishes the procedure, and we conclude

    L = [1  0 0],  U = [1  2  3]
        [3  1 0]       [0 -2 -1]
        [0 -1 1]       [0  0  2]

We verify that this is indeed correct by computing L · U = A. Indeed,

    [1  0 0]   [1  2  3]   [1 2 3]
    [3  1 0] · [0 -2 -1] = [3 4 8]
    [0 -1 1]   [0  0  2]   [0 2 3]

Now we look at a practical application of the LU-decomposition. Let Ax = b be a system of linear equations, and let A = L · U be the LU-decomposition of A. We can solve Ax = b by solving the following pair of systems of linear equations:

    L y = b
    U x = y

Because L and U are triangular matrices, it is relatively easy to solve these systems. However, note that determining the LU-decomposition first is not guaranteed to be the most efficient way of solving a system of linear equations.
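The two triangular solves can be written out with a library routine. A sketch assuming SciPy and the L, U from the example above; b = (6, 15, 5) is a made-up right-hand side, chosen so that the exact solution is x = (1, 1, 1):

```python
import numpy as np
from scipy.linalg import solve_triangular

L = np.array([[1.0,  0.0, 0.0],
              [3.0,  1.0, 0.0],
              [0.0, -1.0, 1.0]])
U = np.array([[1.0,  2.0,  3.0],
              [0.0, -2.0, -1.0],
              [0.0,  0.0,  2.0]])
b = np.array([6.0, 15.0, 5.0])

y = solve_triangular(L, b, lower=True)   # forward substitution:  L y = b
x = solve_triangular(U, y)               # back substitution:     U x = y
print(x)                                 # [1. 1. 1.]
```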
4.2 Subspaces of Rn

In this section, we discuss some properties of matrices that help us better understand how to interpret solutions of systems of linear equations. We look at the following matrix:

    A = [ 1 -2  2 3 1]
        [ 2 -4  5 8 4]
        [-3  6 -1 1 7]

Suppose that A is the coefficient matrix of a system of linear equations. We can see the columns of A as vectors, giving the column vectors

    { (1, 2, -3), (-2, -4, 6), (2, 5, -1), (3, 8, 1), (1, 4, 7) }.

We see that all these vectors are elements of R3. Similarly, we can look at the rows, which gives the row vectors

    { (1, -2, 2, 3, 1), (2, -4, 5, 8, 4), (-3, 6, -1, 1, 7) }.

These vectors are elements of R5. Lastly, we can look at the vectors x that are solutions of the homogeneous system Ax = 0; these are also elements of R5. By this, we see that a matrix has certain vectors related to it. These vectors can be used to construct subspaces of Rn.

Definition (Subspace of Rn). A subspace of Rn is a set H of vectors that satisfies the following properties:
1. The zero vector 0 ∈ H.
2. If u, v ∈ H, then u + v ∈ H. This means that the sum of two vectors in the set is also in the set.
3. If v ∈ H and c ∈ R, then c · v ∈ H. This means that for any vector v in the set, all scalar multiples of v are also in the set.

Let's look at an example of a subspace of Rn.

Example (Subspace of Rn). We define the following set of vectors and show that it is a subspace of R3:

    H = { (x, 0, z) : x, z ∈ R }.

All vectors in this set have their second component equal to 0. Now we check whether all three properties of a subspace are satisfied. Firstly, the zero vector is in our set, by taking the first and third coordinates equal to 0. For the second property, we take two arbitrary vectors in H, say (x1, 0, z1) and (x2, 0, z2). Adding both vectors gives

    (x1, 0, z1) + (x2, 0, z2) = (x1 + x2, 0, z1 + z2).

As the second coordinate is zero, we can conclude that the sum is also in our set. For the third property, we take an arbitrary scalar c ∈ R and a vector v = (x, 0, z) ∈ H and multiply them:

    c · (x, 0, z) = (cx, 0, cz).

For any value of c, this vector is also in our set. This shows that H is a subspace of R3.

We also discuss an example of a set that is not a subspace of Rn.

Example (Not a subspace of Rn). We take the following set of vectors:

    H = { (x, x + 1) : x ∈ R }.

It is sufficient to give an example of vectors so that one of the three properties is violated. In this case, it is easiest to show that the zero vector is not in H, as we would need both coordinates to be zero, which is not possible.

We now take a look at two special subspaces of Rn which can be related to matrices. The first one we discuss is the column space.

Definition (Column space). Let A be a matrix. The column space of A, denoted by Col(A), is the set of vectors containing all linear combinations of the columns of A.

Example (Column space). We take the following matrix