Mathematics I Part 1: Linear Algebra Week 1 PDF
Indian Institute of Information Technology, Guwahati
The basic motivation for studying linear algebra is the characterization of solutions of a system of linear equations.

1. A Linear Equation

A linear equation in n variables x1, x2, ..., xn is given by

a1 x1 + a2 x2 + · · · + an xn = b, (1)

where ai, b ∈ R for 1 ≤ i ≤ n. A solution of the linear equation (1) is an n-tuple of values of the variables, or a vector, (s1, s2, ..., sn) ∈ Rn such that a1 s1 + a2 s2 + · · · + an sn = b.

(1) The general linear equation in one variable is ax = b with a, b ∈ R. If a ≠ 0, then the equation has a unique solution given by x = b/a. If a = 0, then the equation has no solution for b ≠ 0 and has all real numbers as solutions for b = 0.

(2) Consider the linear equation in two variables ax + by = c with a, b, c ∈ R. If a = b = c = 0, then any (s1, s2) ∈ R2 is a solution of the equation. If a = b = 0 but c ≠ 0, then the equation has no solution. If at least one of a and b is nonzero, then the set of solutions of the equation, given by {(s1, s2) ∈ R2 : a s1 + b s2 = c}, represents a straight line in the two-dimensional plane R2, implying that the equation has infinitely many solutions.

(3) In a similar way, the set of solutions of the linear equation in three variables ax + by + cz = d with a, b, c, d ∈ R also has three possibilities. For a = b = c = 0, d ≠ 0, the set of solutions is empty. If a = b = c = d = 0, then all points in R3 are solutions. If at least one of a, b, c is nonzero, then the points satisfying the equation represent a plane in the three-dimensional space R3.

In general, the set of solutions of (1) is given by

S := {(s1, s2, ..., sn) ∈ Rn : a1 s1 + a2 s2 + · · · + an sn = b}.

If ai = 0 for all i but b ≠ 0, then S is empty. If ai = 0 for all i and b = 0, then S = Rn. If at least one of the ai's is nonzero, say ak, then

S = {(s1, ..., sk−1, (b − a1 s1 − · · · − ak−1 sk−1 − ak+1 sk+1 − · · · − an sn)/ak, sk+1, ..., sn) : si ∈ R for i ≠ k}

is an infinite set.

Exercise: Find the set of solutions of the following equations, and describe them geometrically.
(1) 3x = 2
(2) 2x + 3y = 1
(3) 7x − 3y = 0
(4) x − 3y + 2z = 3
(5) 2x + 7y − z = 0

2. System of linear equations

We begin with a system of two linear equations in two variables given by

a1 x + b1 y = c1 and a2 x + b2 y = c2 (2)

with a1, b1, a2, b2, c1, c2 ∈ R. A solution of this system of linear equations is a vector (or an ordered pair, or a point) (s1, s2) ∈ R2 that is simultaneously a solution of each equation in the system. The set of solutions of the system of linear equations (2) is given by

S := {(s1, s2) ∈ R2 : a1 s1 + b1 s2 = c1 and a2 s1 + b2 s2 = c2}.

The solution of a system of two linear equations in two variables, if it exists, can be obtained using the following methods (we assume that at least one of a1, b1 and at least one of a2, b2 is nonzero):

(1) Geometrical method: Note that the points satisfying each of the equations of (2) represent a straight line, and hence (2) is a collection of two straight lines. The two straight lines may intersect at a point (2x − 3y = 1, x + 2y = 3), or overlap completely (3x + y = 3, 6x + 2y = 6), or be distinct parallel lines (x + 3y = 2, 2x + 6y = 3), implying that the system of equations has a unique solution, infinitely many solutions, or no solution, respectively.

(2) Cramer's rule: For the system of equations (2), Cramer's rule gives

x/∆1 = y/∆2 = 1/∆,

where ∆1 = |c1 b1; c2 b2| = c1 b2 − c2 b1, ∆2 = |a1 c1; a2 c2| = a1 c2 − a2 c1, and ∆ = |a1 b1; a2 b2| = a1 b2 − a2 b1, with |p q; r s| denoting the determinant of the 2 × 2 matrix with rows (p, q) and (r, s). It is easy to see that (2) has a unique solution if ∆ ≠ 0, given by x = ∆1/∆, y = ∆2/∆, infinitely many solutions if ∆ = ∆1 = ∆2 = 0, and no solution if ∆ = 0 but at least one of ∆1 and ∆2 is nonzero.

(3) Variable elimination and back substitution: If any one of the constants a1, a2, b1, b2 is zero, then one can directly obtain the value of one variable.
Otherwise (all of a1, a2, b1, b2 are nonzero), we multiply the equations by suitable numbers and then add/subtract them to eliminate one of the variables. For example, a2 × (first equation of (2)) − a1 × (second equation of (2)) gives

(b1 a2 − b2 a1) y = c1 a2 − c2 a1. (3)

Thus if b1 a2 − b2 a1 ≠ 0, then (3) determines y uniquely, and the system (2) has a unique solution. If b1 a2 − b2 a1 = 0 but c1 a2 − c2 a1 ≠ 0, then (3) does not have a solution. Again, if b1 a2 − b2 a1 = c1 a2 − c2 a1 = 0, then (3) has infinitely many solutions. In the case b1 a2 − b2 a1 ≠ 0, we have y = (c1 a2 − c2 a1)/(b1 a2 − b2 a1), and then back substituting this in either of the equations of (2), we obtain the value x = (c2 b1 − c1 b2)/(b1 a2 − b2 a1).

These three methods can also be used for a system of two or more equations in three variables. However, the visualization of the planes represented by equations in three variables may be difficult. In addition, the graphical method is impossible to use for a system of equations involving more than three variables. Cramer's rule for more variables has high complexity (takes a very long time to execute) due to the need to find determinants of matrices. For a system of 10 equations in 10 variables, one needs to find 11 determinants of order 10 × 10. Expanding along a row, each determinant of order 10 × 10 requires 10 determinants of order 9 × 9, each 9 × 9 determinant in turn requires 9 determinants of order 8 × 8, and so on. Moreover, Cramer's rule can only be used if the system has an equal number of variables and equations, due to the need for a square matrix to evaluate determinants. Finally, the variable elimination and back substitution method is also very tricky when the number of variables and equations is large, as it needs a suitable combination of equations to eliminate a variable.
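The three-case analysis of Cramer's rule above can be sketched in code. This is a minimal illustration, not part of the notes; the function name `cramer_2x2` and the return convention are our own, and exact rationals are used to avoid floating-point issues.

```python
from fractions import Fraction

# Cramer's rule for a1*x + b1*y = c1, a2*x + b2*y = c2,
# mirroring the three determinant cases described in the notes.
def cramer_2x2(a1, b1, c1, a2, b2, c2):
    delta  = a1 * b2 - a2 * b1    # delta  = |a1 b1; a2 b2|
    delta1 = c1 * b2 - c2 * b1    # delta1 = |c1 b1; c2 b2|
    delta2 = a1 * c2 - a2 * c1    # delta2 = |a1 c1; a2 c2|
    if delta != 0:
        # unique solution x = delta1/delta, y = delta2/delta
        return ("unique", (Fraction(delta1, delta), Fraction(delta2, delta)))
    if delta1 == 0 and delta2 == 0:
        return ("infinitely many", None)
    return ("no solution", None)

# The three examples from the geometrical method:
print(cramer_2x2(2, -3, 1, 1, 2, 3))   # intersecting lines: unique solution
print(cramer_2x2(3, 1, 3, 6, 2, 6))    # the same line twice: infinitely many
print(cramer_2x2(1, 3, 2, 2, 6, 3))    # distinct parallel lines: no solution
```

For the first system the code returns x = 11/7, y = 5/7, which indeed satisfies both 2x − 3y = 1 and x + 2y = 3.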
However, one can note from this method that the systems of equations x + y = 1, 2x − 5y = 3; or x + y = 1, 7y = −1; or x + y = 1, 7x = 8; or 2x − 5y = 3, 7x = 8; and so on, all have the same set of solutions. Clearly, finding solutions of the latter systems is easier compared to the first system. Motivated by this, Gauss utilized the idea of equivalent systems of linear equations (systems of linear equations having identical sets of solutions) and introduced a method called the Gauss elimination method (almost identical to variable elimination and back substitution) using matrices and the algebra of matrices. Thus we now begin with the representation of a system of linear equations in the form of matrices, and then we concentrate on the algebra of matrices in order to solve systems of linear equations using the Gauss elimination method.

A general system of m linear equations in n variables is given by

a11 x1 + a12 x2 + · · · + a1n xn = b1
a21 x1 + a22 x2 + · · · + a2n xn = b2
...
am1 x1 + am2 x2 + · · · + amn xn = bm, (4)

where aij, bi ∈ R for 1 ≤ i ≤ m and 1 ≤ j ≤ n. Using matrices, one can write the system of equations (4) in the form

Ax = b, (5)

where (with semicolons separating rows)

A = [a11 a12 · · · a1n; a21 a22 · · · a2n; ...; am1 am2 · · · amn] of order m × n, x = [x1; x2; ...; xn] of order n × 1, and b = [b1; b2; ...; bm] of order m × 1.

Here A is called the coefficient matrix, x is the matrix of variables, and b is the matrix of constants. A solution of (5) is a column matrix y = [y1; y2; ...; yn] of order n × 1 such that Ay = b. In this connection, there is another matrix

(A|b) := [a11 a12 · · · a1n b1; a21 a22 · · · a2n b2; ...; am1 am2 · · · amn bm] of order m × (n + 1),

called the augmented matrix. Based on the discussion of the two-variable case, one can easily note (we shall prove this later) that the system of equations (4) can have no solution, a unique solution, or infinitely many solutions. If a system of linear equations has at least one solution, then we call the system of equations consistent.
Otherwise, the system of linear equations is called inconsistent. With respect to (5), there is an associated equation given by

Ax = 0, (6)

which is called a homogeneous equation. Note that (5) is a non-homogeneous equation if b ≠ 0. Clearly, the zero vector (or zero matrix) x = 0 := [0; 0; ...; 0] of order n × 1 is a solution of (6), and it is called the trivial solution of (6). Any solution other than the zero vector of a homogeneous system of linear equations is called a nontrivial solution.

Exercise: Solve the following systems of linear equations using the graphical method, Cramer's rule, and variable elimination and back substitution:
(1) 2x + 3y = 7, x − 2y = 3
(2) x − 3y + 5z = 3, y − 3z = 1, 2x + y − z = 7
(3) 2x + y − z = 3, 3x − 5y + 2z = 1, 5x − 13y + 7z = −3

Exercise: If X1 is a nontrivial solution of (6), then for any c ∈ R, show that cX1 is also a solution of (6). Hence deduce that a homogeneous system of linear equations has either only the trivial solution or infinitely many solutions.

Exercise: If X1 and X2 are two distinct solutions of (5) (or of the system of equations (4)), then for any c ∈ R, show that X1 + c(X1 − X2) is also a solution of (5) (or of (4)). Hence deduce that a system of linear equations cannot have a finite number of solutions greater than one: if it has two distinct solutions, it has infinitely many.

3. Matrices

A matrix is a rectangular array of numbers, called the entries or elements of the matrix. The order of a matrix is said to be m × n if the rectangular arrangement has m rows and n columns. We generally denote an m × n matrix by A := [aij]m×n, that is (with semicolons separating rows),

A := [a11 a12 · · · a1n; a21 a22 · · · a2n; ...; am1 am2 · · · amn],

where aij ∈ R for all 1 ≤ i ≤ m, 1 ≤ j ≤ n. For example,

M = [1 0 5 7; −3 2 −1 2; 1 4 5 −2] is a 3 × 4 matrix,
N = [−1 3; 7 5] is a 2 × 2 matrix,
C = [1; −3; 5] is a 3 × 1 matrix,
R = [1 0 5 −1 3 2] is a 1 × 6 matrix,
D = [3 0 0; 0 1 0; 0 0 −5] is a 3 × 3 matrix,
S = [5 0 0 0; 0 5 0 0; 0 0 5 0; 0 0 0 5] is a 4 × 4 matrix.

Remark:
(1) For any m ∈ N, an m × 1 matrix is called a column matrix.
For example, C is a column matrix.
(2) For any n ∈ N, a 1 × n matrix is called a row matrix. For example, R is a row matrix.
(3) For any n ∈ N, an n × n matrix is called a square matrix. For example, N, D, S are square matrices.
(4) For a square matrix A = [aij]n×n, the entries aii for 1 ≤ i ≤ n are called diagonal entries or diagonal elements. For N, the elements −1, 5 are diagonal, and for D, the elements 3, 1, −5 are diagonal.
(5) If all non-diagonal elements of a square matrix A = [aij]n×n are zero (i.e. aij = 0 for all i ≠ j), then the matrix is called a diagonal matrix. For example, D, S are diagonal matrices. Moreover, if the diagonal entries of a diagonal matrix are equal, then the matrix is called a scalar matrix. For example, S is a scalar matrix.
(6) A square matrix A = [aij]n×n is called an upper triangular matrix if aij = 0 for all i > j, i.e. every entry below the main diagonal is zero. For example, A = [3 1 −1; 0 −5 2; 0 0 0] and A = [7 3; 0 −3] are upper triangular matrices.
(7) A square matrix A = [aij]n×n is called a lower triangular matrix if aij = 0 for all i < j, i.e. every entry above the main diagonal is zero. For example, A = [3 0 0; −2 −1 0; 0 5 1] and A = [7 0; −1 −3] are lower triangular matrices.

One can consider matrices as a generalization of n-tuples or vectors, due to the fact that a matrix can be written in terms of column matrices (a column matrix can be viewed as a vector) or row matrices (a row matrix can be viewed as a vector). For example,

A = (C1, C2, ..., Cn) = [R1; R2; ...; Rm],

where Ci = [a1i; a2i; ...; ami] is the ith column of A and Ri = (ai1, ai2, ..., ain) is the ith row of A.

Equality of matrices: Two matrices A = [aij]m×n and B = [bij]ℓ×k are said to be equal, written as A = B, if and only if m = ℓ, n = k, and aij = bij for all 1 ≤ i ≤ m and 1 ≤ j ≤ n. The values of x, y, z for which the two matrices A = [x x+y; −2 z−2] and B = [−1 0; x−y 3] are equal are given by x = −1, y = 1, z = 5. However, the two matrices A = [x x+y; −2 z−2] and B = [−1 1; x−y z+x] are never equal for any values of x, y, z ∈ R.

3.1. Operations on matrices. Let A = [aij]m×n, B = [bij]ℓ×k be two matrices with real entries, and let α ∈ R.
(1) Matrix addition: The matrix addition of A and B is possible only if m = ℓ and n = k. In this case, the addition of A and B, written as A + B, is given by A + B := [aij + bij]m×n.
(2) Scalar multiplication: The scalar multiplication of the matrix A by a scalar α ∈ R, denoted by αA, is defined as αA := [α aij]m×n.
(3) Matrix multiplication: The matrix multiplication of A and B is possible only if n = ℓ. In this case, the multiplication of A and B, written as A · B or AB, is given by AB := [cij]m×k, where cij = Σ_{r=1}^{n} air brj.

Theorem 3.1. Let Mm×n(R) := {A = [aij]m×n : aij ∈ R for all 1 ≤ i ≤ m, 1 ≤ j ≤ n} denote the set of all m × n real matrices.
(1) For all A, B ∈ Mm×n(R) and α ∈ R, we have A + B, αA ∈ Mm×n(R).
(2) For all A, B, C ∈ Mm×n(R), we have A + (B + C) = (A + B) + C.
(3) For all A, B ∈ Mm×n(R), we have A + B = B + A.
(4) There exists O := [0]m×n ∈ Mm×n(R) such that for every A ∈ Mm×n(R), A + O = A = O + A. Here O is the zero matrix.
(5) For any given A = [aij]m×n ∈ Mm×n(R), there exists −A = [−aij]m×n ∈ Mm×n(R) such that A + (−A) = O = (−A) + A.
(6) For all A, B ∈ Mm×n(R) and α ∈ R, we have α(A + B) = αA + αB.
(7) For all A ∈ Mm×n(R) and α, β ∈ R, we have (α + β)A = αA + βA.
(8) For all A ∈ Mm×n(R) and α, β ∈ R, we have α(βA) = (αβ)A.
(9) For all A ∈ Mm×n(R), we have 1A = A.

Proof. The proof of (1) follows from the definition. For (2), let A = [aij]m×n, B = [bij]m×n, and C = [cij]m×n. Then

A + (B + C) = [aij]m×n + ([bij]m×n + [cij]m×n) = [aij]m×n + [bij + cij]m×n = [aij + (bij + cij)]m×n

and

(A + B) + C = ([aij]m×n + [bij]m×n) + [cij]m×n = [aij + bij]m×n + [cij]m×n = [(aij + bij) + cij]m×n.
Since aij, bij, cij are real numbers, we have aij + (bij + cij) = (aij + bij) + cij, and hence the result follows. The proof of (3) follows along the same lines as the proof of (2). We now give a proof of (4); the proof of (5) follows in a similar way. For a given A = [aij]m×n ∈ Mm×n(R), let P = [pij]m×n be such that A + P = A = P + A. Then [aij + pij]m×n = [aij]m×n = [pij + aij]m×n, and hence, using the equality of matrices, we have aij + pij = aij = pij + aij for all 1 ≤ i ≤ m, 1 ≤ j ≤ n. This implies that pij = 0 for all 1 ≤ i ≤ m, 1 ≤ j ≤ n, due to the fact that aij, pij ∈ R. Hence P = O, the m × n zero matrix. Finally, we give a proof of (7); the proofs of (6), (8), and (9) follow along the same lines. Let A = [aij]m×n and α, β ∈ R. Then

(α + β)A = (α + β)[aij]m×n = [(α + β)aij]m×n = [α aij + β aij]m×n = [α aij]m×n + [β aij]m×n = αA + βA. □

Theorem 3.2. Let Mn×n(R) := {A = [aij]n×n : aij ∈ R for all 1 ≤ i, j ≤ n} denote the set of all square real matrices of order n.
(1) For all A, B ∈ Mn×n(R), we have AB ∈ Mn×n(R).
(2) For all A, B, C ∈ Mn×n(R), we have A(BC) = (AB)C.
(3) There exists I := [dij]n×n with dii = 1 and dij = 0 for all i ≠ j such that for every A ∈ Mn×n(R), AI = A = IA. Here I is called the identity matrix.
(4) For any A, B, C ∈ Mn×n(R), we have A(B + C) = AB + AC and (A + B)C = AC + BC.

Proof. The proof of (1) follows from the definition. For (2), let A = [aij]n×n, B = [bij]n×n, and C = [cij]n×n. Suppose D = BC and P = AB; then D = [dij]n×n and P = [pij]n×n, where for all 1 ≤ i, j ≤ n,

dij = bi1 c1j + bi2 c2j + · · · + bin cnj and pij = ai1 b1j + ai2 b2j + · · · + ain bnj.
Now

(AB)C = PC = [pi1 c1j + pi2 c2j + · · · + pin cnj]n×n
= [(ai1 b11 + ai2 b21 + · · · + ain bn1)c1j + (ai1 b12 + ai2 b22 + · · · + ain bn2)c2j + · · · + (ai1 b1n + ai2 b2n + · · · + ain bnn)cnj]n×n
= [ai1(b11 c1j + b12 c2j + · · · + b1n cnj) + ai2(b21 c1j + b22 c2j + · · · + b2n cnj) + · · · + ain(bn1 c1j + bn2 c2j + · · · + bnn cnj)]n×n
= [ai1 d1j + ai2 d2j + · · · + ain dnj]n×n
= AD = A(BC).

We now give a proof of (3), and leave (4) as an exercise. To prove (3), consider A = [aij]n×n and Q = [qij]n×n such that AQ = A = QA. As a result, it is easy to see that

[ai1 q1j + ai2 q2j + · · · + ain qnj]n×n = [aij]n×n = [qi1 a1j + qi2 a2j + · · · + qin anj]n×n.

Thus ai1 q1j + ai2 q2j + · · · + ain qnj = aij = qi1 a1j + qi2 a2j + · · · + qin anj for all i, j. Note that if qjj = 1 and qij = 0 for i ≠ j, then the above equality holds. Thus Q = I = [qij]n×n with qjj = 1 and qij = 0 for i ≠ j satisfies AI = A = IA. □

Exercise:
(1) Find two matrices A and B such that (a) AB exists but BA does not exist, (b) both AB and BA exist but AB ≠ BA, (c) AB = BA.
(2) Find A and B such that AB = 0 but A ≠ 0 and B ≠ 0.
(3) Prove or disprove: AB = AC implies B = C.

3.2. Certain important terminologies and their properties.
(1) Transpose of a matrix: The transpose of a matrix A is the matrix AT obtained by exchanging the rows and columns of A. If A = [aij]m×n, then AT := [aji]n×m. For example, if A = [2 1 −1 0; −3 5 4 2; 12 −7 0 9] (a 3 × 4 matrix), then AT = [2 −3 12; 1 5 −7; −1 4 0; 0 2 9] (a 4 × 3 matrix).
(2) Symmetric matrix: If a matrix A and its transpose are equal, i.e. A = AT, then the matrix A is called a symmetric matrix. Only a square matrix can be symmetric. A matrix A = [aij]n×n is symmetric if and only if aij = aji for all 1 ≤ i, j ≤ n. The matrices A = [2 1 −1; 1 5 −7; −1 −7 0], O = [0]n×n, and I = [aij] with aii = 1 and aij = 0 for i ≠ j are symmetric matrices.
(3) Anti-symmetric or skew-symmetric matrix: If a matrix A is equal to the negative of its transpose, i.e.
A = −AT, then the matrix A is said to be anti-symmetric. Only a square matrix can be anti-symmetric. A matrix A = [aij]n×n is anti-symmetric if and only if aij = −aji for all 1 ≤ i, j ≤ n. The matrices A = [0 3 −1; −3 0 −5; 1 5 0] and O = [0]n×n are anti-symmetric, but I = [aij] with aii = 1 and aij = 0 for i ≠ j is not an anti-symmetric matrix.
(4) Inverse of a matrix: A square matrix A is said to be invertible (or nonsingular) if there exists a square matrix B of the same size such that AB = BA = I. In this case, B is said to be an inverse of the matrix A. If C is another inverse of A, then B = BI = B(AC) = (BA)C = IC = C, and hence the inverse of a square matrix A, if it exists, is unique. Thus we denote B = A−1.

Theorem 3.3. Let A, B be two real square matrices of the same order and α ∈ R. Then
(1) (AT)T = A
(2) (A + B)T = AT + BT
(3) (αA)T = αAT
(4) (AB)T = BT AT

Proof. Let A = [aij]n×n and B = [bij]n×n.
(1) Clearly AT = [aji]n×n, and hence (AT)T = [aij]n×n = A.
(2) Note that A + B = [aij + bij]n×n. Letting cij = aij + bij, we have (A + B)T = [cji]n×n = [aji + bji]n×n = [aji]n×n + [bji]n×n = AT + BT.
(3) is left as an exercise.
(4) Note that AB = [cij]n×n with cij = ai1 b1j + ai2 b2j + · · · + ain bnj. Then cji = aj1 b1i + aj2 b2i + · · · + ajn bni. As a result, we have (AB)T = [cji]n×n. It is easy to see that AT = [aji]n×n and BT = [bji]n×n, and hence BT AT = [dji]n×n with dji = aj1 b1i + aj2 b2i + · · · + ajn bni. Since cji = dji for all i, j, the result follows. □

Exercise:
(1) If A is a square matrix, then show that A + AT is a symmetric matrix.
(2) For any matrix A, show that AAT and AT A are symmetric matrices.
(3) Find the values of the diagonal entries of an anti-symmetric matrix.

Theorem 3.4. Let A and B be two invertible matrices of the same order. Then
(1) A−1 is also invertible, and (A−1)−1 = A.
(2) (AB)−1 = B−1 A−1.

Proof. The proof of (1) follows from the definition. We now give a proof of (2).
Using the associativity of matrix multiplication, we have

(AB)(B−1 A−1) = A(BB−1)A−1 = AIA−1 = AA−1 = I

and

(B−1 A−1)(AB) = B−1(A−1 A)B = B−1 IB = B−1 B = I.

As a result, (AB)−1 = B−1 A−1. □

4. Elementary row operations and row echelon form

Two systems of linear equations given by Ax = b and Px = q are said to be equivalent if both systems have the same set of solutions. In the Gauss elimination method, we use elementary row operations (defined below) on the augmented matrix [A|b] of a system of equations Ax = b to reduce it to a row echelon form (defined below), resulting in an equivalent system of linear equations which only needs back substitution to find the solutions.

Definition 1 (Elementary row operations). Let A = [aij]m×n be a matrix. Then there are three elementary row operations, given by
(1) Ri ←→ Rj: exchanging the ith row and the jth row of A.
(2) Ri −→ cRi: replacement of the ith row of A by c times the ith row of A, where c is a nonzero constant.
(3) Ri −→ cRj + Ri: replacement of the ith row of A by c times the jth row of A plus the ith row of A.

For example (with semicolons separating rows),

[−2 1 −1; 1 0 2; 5 7 3] −(R2 ←→ R3)→ [−2 1 −1; 5 7 3; 1 0 2] −(R1 −→ 2R1)→ [−4 2 −2; 5 7 3; 1 0 2] −(R2 −→ 5R3 − R2)→ [−4 2 −2; 0 −7 7; 1 0 2].

Using the elementary row operations suitably on a matrix, one can obtain a row echelon form of the matrix.

Remark: It is easy to see that the effect of each elementary row operation on a matrix can be undone by another elementary row operation of the same kind. The reverse elementary operations of Ri ←→ Rj, Ri → cRi (c ≠ 0), and Ri → Ri + cRj are Rj ←→ Ri, Ri → (1/c)Ri, and Ri → Ri − cRj, respectively.

Definition 2 (Row echelon form). A matrix A = [aij]m×n is said to be in row echelon form if
(1) all elements below each leading entry or pivot element (the first nonzero element of a row) in its column are zero;
(2) the leading entry of each row lies to the right of the leading entry of the previous row.
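The two conditions of Definition 2 can be checked mechanically. The following sketch (our own helper names, not from the notes) stores a matrix as a list of rows; note that if each leading entry lies strictly to the right of the previous one, the zeros below every pivot in condition (1) follow automatically.

```python
def leading_index(row):
    """Column index of the first nonzero entry of a row, or None for a zero row."""
    for j, v in enumerate(row):
        if v != 0:
            return j
    return None

def is_row_echelon(A):
    prev = -1               # column of the previous row's leading entry
    seen_zero_row = False
    for row in A:
        lead = leading_index(row)
        if lead is None:
            seen_zero_row = True     # zero rows must sit at the bottom (usual convention)
            continue
        if seen_zero_row or lead <= prev:
            return False             # leading entry failed to move strictly right
        prev = lead
    return True

print(is_row_echelon([[-4, 2, -2], [0, -7, 7], [0, 0, 2]]))   # True
print(is_row_echelon([[-4, 2, -2], [0, -7, 7], [1, 0, 2]]))   # False
```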
For example, the matrices [−4 2 −2; 0 −7 7; 0 0 2], [0 7 −2; 0 0 −1; 0 0 0], and [3 0 0 0; 0 −2 7 0; 0 0 2 −1] are in row echelon form. However, [−4 2 −2; 0 −7 7; 1 0 2] and [0 0 −2; 0 −7 7; 1 −1 2] are not in row echelon form.

Remark: The row echelon form of a matrix is not unique.

We use elementary row operations on the matrix [5 3 14 4; 0 1 2 1; 1 −1 2 0] to reduce it to a row echelon form. Note that

[5 3 14 4; 0 1 2 1; 1 −1 2 0] −(R1 −→ −5R3 + R1)→ [0 8 4 4; 0 1 2 1; 1 −1 2 0] −(R1 −→ −8R2 + R1)→ [0 0 −12 −4; 0 1 2 1; 1 −1 2 0]

and

[0 0 −12 −4; 0 1 2 1; 1 −1 2 0] −(R1 ←→ R3)→ [1 −1 2 0; 0 1 2 1; 0 0 −12 −4].

Exercise: Reduce the following matrices to a row echelon form:
(1) [1 2 1; −1 0 2; 2 1 −3]
(2) [1 −1 2 −3; 4 1 0 2; 0 3 0 4; 0 1 0 2]

Definition 3 (Rank). The number of nonzero rows in a row echelon form of a matrix is called the rank of the matrix.

Example: [2 3 0; 0 3 2; 0 0 0] and [1 0; 0 1; 0 0] have rank 2.

5. Gauss elimination method

The Gauss elimination method is a modified version of the variable elimination and back substitution method in which we use the algebra of matrices to solve a system of linear equations. We follow these steps:
(1) Express the system of linear equations in matrix form.
(2) Use elementary row operations on the augmented matrix to reduce it to a row echelon form.
(3) Convert back to an equivalent system of linear equations and solve it using back substitution.

We solve the system of linear equations x − y − z = 2, 3x − 3y + 2z = 16, 2x − y + z = 9 using the Gauss elimination method. The system of linear equations can be written in the form

[1 −1 −1; 3 −3 2; 2 −1 1] [x; y; z] = [2; 16; 9].

Note that the augmented matrix reduces as

[1 −1 −1 2; 3 −3 2 16; 2 −1 1 9] −(R3 −→ −2R1 + R3)→ [1 −1 −1 2; 3 −3 2 16; 0 1 3 5] −(R2 −→ −3R1 + R2)→ [1 −1 −1 2; 0 0 5 10; 0 1 3 5]

and

[1 −1 −1 2; 0 0 5 10; 0 1 3 5] −(R2 ←→ R3)→ [1 −1 −1 2; 0 1 3 5; 0 0 5 10].

Thus an equivalent system of linear equations x − y − z = 2, y + 3z = 5, 5z = 10 is obtained, which gives z = 2, y = −1, x = 3 on back substitution.
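The three steps above can be sketched as a short program. This is a minimal Gauss-elimination sketch over exact rationals (the helper name `gauss_solve` is ours); it assumes the system has a unique solution, which is enough to reproduce the worked example.

```python
from fractions import Fraction

def gauss_solve(A, b):
    """Solve Ax = b for a square system with a unique solution."""
    n = len(A)
    # build the augmented matrix (A|b) with exact rational entries
    M = [[Fraction(x) for x in row] + [Fraction(bi)]
         for row, bi in zip(A, b)]
    for k in range(n):
        # partial step: swap up a row with a nonzero entry in column k (Ri <-> Rj)
        p = next(i for i in range(k, n) if M[i][k] != 0)
        M[k], M[p] = M[p], M[k]
        for i in range(k + 1, n):            # Ri -> Ri - m*Rk clears column k below the pivot
            m = M[i][k] / M[k][k]
            M[i] = [x - m * y for x, y in zip(M[i], M[k])]
    # back substitution on the row echelon form
    x = [Fraction(0)] * n
    for i in range(n - 1, -1, -1):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x

# The worked example: x - y - z = 2, 3x - 3y + 2z = 16, 2x - y + z = 9.
A = [[1, -1, -1], [3, -3, 2], [2, -1, 1]]
b = [2, 16, 9]
print(gauss_solve(A, b))   # x = 3, y = -1, z = 2
```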
Exercise: Solve the following systems of linear equations using the Gauss elimination method:
(1) x + y + 2z = 3, 7x − 3y + 9z = 15, −x + 2y − z = 1
(2) 2x − y + z = 2, x − 3y + 7z = 5, 7x + 4y − 16z = 3
(3) 2x + 3y − z + 5w = 12, x + 2z − w = 7, 5x − 7y + 2z − 2w = 9, −6y + 3z − w = 13

6. Reduced row echelon form and Gauss-Jordan elimination

The Gauss-Jordan elimination is an extended version of the Gauss elimination method which simplifies the back substitution phase in solving a system of linear equations. In this method, the row echelon form (REF) is further reduced to a reduced row echelon form (RREF).

Definition 4 (Reduced row echelon form). A matrix A is said to be in reduced row echelon form (RREF) if
(1) A is in row echelon form;
(2) each leading entry of a row is 1;
(3) all other entries of a column containing a leading entry 1 are zero.

For example, [−3 0 1 2; 0 0 5 2; 0 0 0 0] and [0 1 −1 0; 0 0 1 2; 0 0 0 0] are in row echelon form but not in reduced row echelon form, while [1 −2 0 0; 0 0 1 0; 0 0 0 1] is in reduced row echelon form. The identity matrix is always in reduced row echelon form.

Remark 6.1. A reduced row echelon form is always a row echelon form. However, the converse is not true. The reduced row echelon form of a matrix is always unique, unlike the row echelon form.

We reduce the matrix A = [5 3 14 4; 0 1 2 1; 1 −1 2 0] to a reduced row echelon form. Using R1 ←→ R3, we have

A ≡ [1 −1 2 0; 0 1 2 1; 5 3 14 4] −(R3 → R3 − 5R1)→ [1 −1 2 0; 0 1 2 1; 0 8 4 4] −(R3 → R3 − 8R2)→ [1 −1 2 0; 0 1 2 1; 0 0 −12 −4],

which is already in row echelon form, but not in reduced row echelon form. Thus

[1 −1 2 0; 0 1 2 1; 0 0 −12 −4] −(R1 → R1 + R2)→ [1 0 4 1; 0 1 2 1; 0 0 −12 −4] −(R2 → R2 + (1/6)R3)→ [1 0 4 1; 0 1 0 1/3; 0 0 −12 −4]

and

[1 0 4 1; 0 1 0 1/3; 0 0 −12 −4] −(R1 → R1 + (1/3)R3)→ [1 0 0 −1/3; 0 1 0 1/3; 0 0 −12 −4] −(R3 → −(1/12)R3)→ [1 0 0 −1/3; 0 1 0 1/3; 0 0 1 1/3].

6.1. Gauss-Jordan elimination.
In this method, we follow these steps to solve a system of linear equations:
(1) Express the system of linear equations in matrix form.
(2) Use elementary row operations on the augmented matrix to reduce it to a reduced row echelon form.
(3) Convert back to an equivalent system of linear equations, which directly gives the solutions, if the system is consistent.

We solve the system of linear equations x + y + z = 6, x + 2y + 3z = 14, x + 4y + 7z = 30 using the Gauss-Jordan elimination. The augmented matrix corresponding to the system of linear equations is

[A|b] = [1 1 1 6; 1 2 3 14; 1 4 7 30].

Using the elementary row operations R2 → R2 − R1, R3 → R3 − R1, R3 → R3 − 3R2, and R1 → R1 − R2 in sequence, we reduce the augmented matrix to the reduced row echelon form

[A|b] ≡ [1 0 −1 −2; 0 1 2 8; 0 0 0 0].

This provides an equivalent system of linear equations x − z = −2, y + 2z = 8. Taking z = t for any t ∈ R, we obtain the solution x = t − 2, y = 8 − 2t, z = t. Thus the system of equations has infinitely many solutions.

Remark 6.2. The system of linear equations Ax = b is consistent if and only if rank[A|b] = rank(A). In addition, if rank[A|b] = rank(A) = number of variables, then the system has a unique solution. However, the system of linear equations has infinitely many solutions in the case rank[A|b] = rank(A) < number of variables. In this case, the variables corresponding to the leading entries in the RREF of [A|b], called leading variables, can be expressed in terms of the other (number of variables − rank[A|b]) free variables.

Exercise: Find the reduced row echelon form of the following matrices:
(a) [3 −1; 0 3; 2 3; 2 5]
(b) [3 4 −2 −1 −3 −1; 4 1 1 2 3 −1; 7 5 1 0 1 1; 11 6 0 1 1 −1]

Exercise: Solve the following systems of linear equations using the Gauss-Jordan elimination method:
(a) x + y + z = 9, 2x + 5y + 7z = 52, 2x + y − z = 0.
(b) x + y + z = −3, 3x + y − 2z = −2, 2x + 4y + 7z = 7.
(c) x + y + z = 1, x + 2y + 4z = 1, x + 4y + 10z = 1.
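The RREF reduction and the rank test of Remark 6.2 can be sketched together in code. The helpers below (`rref`, `classify` — our own names, not from the notes) compute an exact RREF over rationals and then classify a system by comparing rank(A) with rank[A|b].

```python
from fractions import Fraction

def rref(M):
    """Return (RREF of M, rank of M) using exact rational arithmetic."""
    M = [[Fraction(x) for x in row] for row in M]
    rows, cols = len(M), len(M[0])
    r = 0                                    # next pivot row
    for c in range(cols):
        p = next((i for i in range(r, rows) if M[i][c] != 0), None)
        if p is None:
            continue                         # no pivot in this column
        M[r], M[p] = M[p], M[r]              # Ri <-> Rj
        M[r] = [x / M[r][c] for x in M[r]]   # scale the leading entry to 1
        for i in range(rows):                # clear the rest of column c
            if i != r and M[i][c] != 0:
                M[i] = [x - M[i][c] * y for x, y in zip(M[i], M[r])]
        r += 1
    return M, r

def classify(A, b):
    """Apply Remark 6.2: compare rank(A), rank[A|b], and the number of variables."""
    _, rank_A = rref(A)
    _, rank_Ab = rref([row + [bi] for row, bi in zip(A, b)])
    if rank_Ab != rank_A:
        return "no solution"
    return "unique solution" if rank_A == len(A[0]) else "infinitely many solutions"

# The Gauss-Jordan example: x + y + z = 6, x + 2y + 3z = 14, x + 4y + 7z = 30.
A = [[1, 1, 1], [1, 2, 3], [1, 4, 7]]
print(classify(A, [6, 14, 30]))   # rank 2 < 3 variables: infinitely many solutions
```

On the example from Section 6, `rref` reproduces the reduced augmented matrix [1 0 −1 −2; 0 1 2 8; 0 0 0 0], with rank 2 against 3 variables, hence one free variable.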
Exercise: Find the value(s) of a, b such that the system of linear equations x + y + z = 1, x + 2y + 3z = 10, 5x + 7y + az = b has (i) a unique solution, (ii) no solution, (iii) infinitely many solutions.