Chapter 6 - Orthogonality
Summary
This document details the concept of orthogonality in linear algebra, covering dot products, vector modulus, and orthogonal sets. It also introduces the Gram-Schmidt process for creating orthogonal bases. The content is useful for a university-level linear algebra course.
Full Transcript
Chapter 6 - ORTHOGONALITY

6.1. Dot Product and Modulus

The dot product, or scalar product, ⟨u, v⟩ of two vectors u and v of Kⁿ is the number u*v ∈ K (u* is the conjugate transpose of u):

⟨u, v⟩ = u*v = [ u1* u2* ... un* ] [ v1 ; v2 ; ... ; vn ] = u1* v1 + · · · + un* vn

EXAMPLE: ⟨ (2, 1−i), (2+i, i) ⟩ = [ 2 1+i ] [ 2+i ; i ] = 2(2+i) + (1+i)i = 3 + 3i

Properties. Let u, v and w be vectors of Kⁿ and α, β ∈ K:
1. ⟨u, u⟩ ∈ R
2. ⟨u, u⟩ ≥ 0
3. ⟨u, u⟩ = 0 ⇔ u = 0
4. ⟨u, v⟩ = ⟨v, u⟩* (the * on a scalar denotes complex conjugation)
5. ⟨u, αv + βw⟩ = α⟨u, v⟩ + β⟨u, w⟩
6. ⟨αu + βv, w⟩ = α*⟨u, w⟩ + β*⟨v, w⟩

The modulus (or length, or norm) of a vector v, denoted by |v| or ||v||, is the non-negative real number

|v| = √⟨v, v⟩

EXAMPLE: u = (2+i, 1, i) ⇒ |u|² = [ 2−i 1 −i ] [ 2+i ; 1 ; i ] = (2−i)(2+i) + 1 + (−i)(i) = 5 + 1 + 1 = 7, so |u| = √7.

Properties. Let u and v be vectors of Kⁿ and α ∈ K:
1. |u| ≥ 0
2. |u| = 0 ⇔ u = 0
3. |αu| = |α| |u|
4. |u + v| ≤ |u| + |v| (Triangle Inequality)

A unit vector is a vector whose modulus is 1. If we divide a nonzero vector v by its modulus, we obtain a unit vector u in the same direction:

u = v / |v|

EXAMPLE: v = (2, 1, −1) ⇒ |v| = √6 ⇒ u = (1/√6)(2, 1, −1)

This process is referred to as normalizing v, and u is said to be normalized.

The distance from the point x to the point y is defined as

d(x, y) = |x − y|

It holds that d(x, y) = 0 ⇔ x = y.

6.2. Orthogonal Sets

Two vectors u and v are orthogonal (or perpendicular) to each other if ⟨u, v⟩ = 0. This is denoted by u ⊥ v.

EXAMPLE: (1, i) ⊥ (i, 1) since ⟨ (1, i), (i, 1) ⟩ = [ 1 −i ] [ i ; 1 ] = i − i = 0

EXAMPLE: The vector 0 is orthogonal to every vector of Kⁿ.

A set of vectors v1, ..., vp of Kⁿ is an orthogonal set if any two of them are orthogonal to each other:

⟨vi, vj⟩ = 0 ∀ i ≠ j

EXAMPLE: The set { (3, 1, 1), (−1, 2, 1), (−1, −4, 7) } is an orthogonal set, since
⟨v1, v2⟩ = −3 + 2 + 1 = 0, ⟨v1, v3⟩ = −3 − 4 + 7 = 0, ⟨v2, v3⟩ = 1 − 8 + 7 = 0.

Theorem 6.1. Any orthogonal set of nonzero vectors of Kⁿ is linearly independent.

Proof: Let S = {v1, ..., vp} be one of these sets. Suppose it were linearly dependent. Then there would exist α1, ..., αp, not all zero, such that α1 v1 + ... + αp vp = 0. Dot multiplying v1 by this expression would lead us to

α1 ⟨v1, v1⟩ + ... + αp ⟨v1, vp⟩ = ⟨v1, 0⟩ = 0

but, taking into account that ⟨vi, vj⟩ = 0 (∀ i ≠ j), this is α1 ⟨v1, v1⟩ = 0 ⇒ α1 = 0, because v1 ≠ 0. Likewise, we could also conclude that α2 = ... = αp = 0, in contradiction with the fact that not all the coefficients were supposed to be zero.

A set of vectors of a linear subspace W of Kⁿ is an orthogonal basis of W if:
a) They are a basis of W.
b) They are an orthogonal set.

Theorem 6.2. Let B = {b1, b2, ..., bp} be an orthogonal basis of a linear subspace W of Kⁿ. The coordinates of an arbitrary vector y ∈ W in the basis B are given by

[y]_B = (α1, α2, ..., αp), where αi = ⟨bi, y⟩ / ⟨bi, bi⟩ = ⟨bi, y⟩ / |bi|² ∀ i.

These coefficients are called Fourier Coefficients.

Proof: [y]_B = (α1, ..., αp) ⇔ y = α1 b1 + ... + αp bp. By dot multiplying bi by this expression, we obtain

⟨bi, y⟩ = α1 ⟨bi, b1⟩ + ... + αi ⟨bi, bi⟩ + ... + αp ⟨bi, bp⟩

but, as B is an orthogonal set, all the terms with index different from i vanish and we find that

⟨bi, y⟩ = αi ⟨bi, bi⟩ ⇒ αi = ⟨bi, y⟩ / ⟨bi, bi⟩.

Watch out: in general ⟨bi, y⟩ ≠ ⟨y, bi⟩ (they are complex conjugates), so the order of the arguments matters.

EXAMPLE: Consider the vectors

b1 = (i, 1, 0), b2 = (1, i, 0), b3 = (0, 0, i), x = (1, 0, 1).

1.- Show that B = {b1, b2, b3} is a basis of C³.
2.- Show that B is an orthogonal basis.
3.- Find [x]_B, the coordinates of x in the basis B.

1.- and 2.- ⟨b1, b2⟩ = −i·1 + 1·i = 0, ⟨b1, b3⟩ = 0, ⟨b2, b3⟩ = 0, so B is an orthogonal set of nonzero vectors. By Theorem 6.1 it is linearly independent, and three linearly independent vectors of C³ are a basis.

3.- ⟨b1, x⟩ = −i, ⟨b2, x⟩ = 1, ⟨b3, x⟩ = −i, and ⟨b1, b1⟩ = 2, ⟨b2, b2⟩ = 2, ⟨b3, b3⟩ = 1.

Thus, we can write x = (−i/2) b1 + (1/2) b2 + (−i) b3 ⇔ [x]_B = (−i/2, 1/2, −i).
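As a quick numerical check (not part of the original notes), the following Python sketch recomputes the Fourier coefficients of the example above; it assumes NumPy is available and uses np.vdot, which conjugates its first argument and therefore matches the convention ⟨u, v⟩ = u*v.

```python
import numpy as np

# Dot product of the notes: <u, v> = u* v  (np.vdot conjugates its first argument)
def dot(u, v):
    return np.vdot(u, v)

# Orthogonal basis of C^3 and the vector x from the example above
b1 = np.array([1j, 1, 0])
b2 = np.array([1, 1j, 0])
b3 = np.array([0, 0, 1j])
x  = np.array([1, 0, 1], dtype=complex)

B = [b1, b2, b3]

# Check orthogonality: <bi, bj> = 0 for i != j
for i in range(3):
    for j in range(i + 1, 3):
        assert np.isclose(dot(B[i], B[j]), 0)

# Fourier coefficients: alpha_i = <b_i, x> / <b_i, b_i>
alphas = np.array([dot(b, x) / dot(b, b) for b in B])
print(alphas)          # approximately [-0.5j, 0.5, -1j] = coordinates of x in B

# Reconstruction check: x = sum_i alpha_i * b_i
assert np.allclose(sum(a * b for a, b in zip(alphas, B)), x)
```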
An orthonormal set of vectors u1, ..., up of Kⁿ is an orthogonal set of unit vectors. The vectors of the set verify

⟨ui, uj⟩ = 0 (∀ i ≠ j), ⟨ui, ui⟩ = 1 (∀ i)

NOTE: Given an orthogonal set of nonzero vectors, we can always obtain an orthonormal set by normalizing each of its vectors.

A set of vectors of a linear subspace W of Kⁿ is an orthonormal basis of W if:
a) They are a basis of W.
b) They are an orthonormal set.

6.3. Unitary Matrices

Theorem 6.3. The entries of the matrix A*A are the scalar products of the columns of A: if A = [ a1 ... an ] ∈ K^(m×n), then (A*A)ij = ⟨ai, aj⟩.

Proof: The (i, j) entry of A*A is the product of the i-th row of A* (that is, ai*) by the j-th column of A, namely ai* aj = ⟨ai, aj⟩. Hence

A*A = [ ⟨a1, a1⟩ ⟨a1, a2⟩ ... ⟨a1, an⟩ ; ⟨a2, a1⟩ ⟨a2, a2⟩ ... ⟨a2, an⟩ ; ... ; ⟨an, a1⟩ ⟨an, a2⟩ ... ⟨an, an⟩ ]

Theorem 6.4. An (m × n) matrix Q has orthonormal columns if and only if Q*Q = I.

EXAMPLE: Any matrix whose columns are distinct vectors of the canonical basis has orthonormal columns, and therefore verifies Q*Q = I.

EXAMPLE:

Q = (1/2) [ 1+i 0 ; i √2 i ; 1 −√2 ]

Q*Q = (1/4) [ 1−i −i 1 ; 0 −√2 i −√2 ] [ 1+i 0 ; i √2 i ; 1 −√2 ] = [ 1 0 ; 0 1 ] = I

Notice that, in contrast, QQ* is a (3 × 3) matrix which is not the identity: a non-square Q with orthonormal columns verifies Q*Q = I but not QQ* = I.

Theorem 6.5. Let Q be an (m × n) matrix with orthonormal columns and x and y vectors of Kⁿ. Then:
a) |Qx| = |x|
b) ⟨Qx, Qy⟩ = ⟨x, y⟩
c) ⟨Qx, Qy⟩ = 0 ⇔ ⟨x, y⟩ = 0

Proof: ⟨Qx, Qy⟩ = (Qx)*(Qy) = x*Q*Q y = x*y = ⟨x, y⟩, which proves b). Taking y = x gives a), and c) follows directly from b).

Theorem 6.6. Let Q be an (m × n) matrix with orthonormal columns. The transformation T : Kⁿ → Kᵐ, x ↦ Qx
a) preserves the moduli of the vectors,
b) preserves the dot products,
c) preserves the relations of orthogonality.

A square matrix whose columns are orthonormal is called a unitary matrix. These matrices verify

U unitary ⇔ U square and U*U = I ⇔ UU* = I ⇔ U⁻¹ = U*

Theorem 6.7. A unitary (n × n) matrix U verifies:
a) It is invertible and its inverse is its conjugate transpose.
b) Its columns are an orthonormal basis of Kⁿ.
c) Its rows are an orthonormal basis of Kⁿ.
d) |det U| = 1
e) If V is unitary, UV is also unitary.
f) If {b1, ..., bn} is an orthonormal basis of Kⁿ, then {Ub1, ..., Ubn} is an orthonormal basis of Kⁿ.

Proof:
b) U invertible ⇒ its columns are a basis of Kⁿ. Since they are also an orthonormal set, they are an orthonormal basis.
c) Let us define M = Uᵀ, whose columns are the rows of U. Then M*M = (Uᵀ)*Uᵀ, which is the entrywise conjugate of UU* = I, hence M*M = I and the columns of M (the rows of U) are orthonormal.
d) U*U = I ⇒ det(U*U) = 1. But det(U*U) = det U* · det U = (det U)* det U = |det U|², so |det U| = 1.
e) V unitary ⇒ V*V = I. The square matrix M = UV verifies M*M = (UV)*(UV) = V*U*U V = V*V = I, so UV is unitary.
f) By Theorem 6.5, ⟨Ubi, Ubj⟩ = ⟨bi, bj⟩, which equals 1 if i = j and 0 if i ≠ j. Hence {Ub1, ..., Ubn} is an orthonormal set of n vectors of Kⁿ, that is, an orthonormal basis.

6.4. Orthogonal Complement

Let W be a linear subspace of Kⁿ. The orthogonal complement of W, denoted by W⊥, is the set of all vectors of Kⁿ that are orthogonal to every vector of W. In mathematical notation:

W⊥ = { x ∈ Kⁿ : ⟨x, w⟩ = 0 ∀ w ∈ W }

Theorem 6.8. If a vector is orthogonal to a set of vectors, it is also orthogonal to any linear combination of them.

Proof: If y = α1 v1 + · · · + αp vp, then ⟨x, y⟩ = α1 ⟨x, v1⟩ + · · · + αp ⟨x, vp⟩ = 0 + · · · + 0 = 0.

In mathematical notation: ⟨x, v1⟩ = 0, ..., ⟨x, vp⟩ = 0 ⇔ x ∈ (Span{v1, ..., vp})⊥.

Theorem 6.9. The orthogonal complement of the column space of a matrix A is the null space of A*:

W = Col A ⇔ W⊥ = Nul A*

Proof: If A = [ a1 a2 ··· an ], then

A*x = [ a1* ; a2* ; ... ; an* ] x = ( ⟨a1, x⟩, ⟨a2, x⟩, ..., ⟨an, x⟩ )

and, therefore, A*x = 0 ⇔ ⟨ai, x⟩ = 0 ∀ i ⇔ x is orthogonal to every column of A ⇔ (by Theorem 6.8) x is orthogonal to every vector of Col A, i.e. x ∈ (Col A)⊥.

EXAMPLE: Determine W⊥ for W = Span{ (1, 1, −1) }.

W = Col [ 1 ; 1 ; −1 ] ⇒ W⊥ = Nul [ 1 1 −1 ] (Theorem 6.9).

We solve A*x = 0: [ 1 1 −1 ] (x, y, z) = 0 ⇒ x + y − z = 0 ⇒ x = μ (−1, 1, 0) + ν (1, 0, 1).

That is,

W⊥ = { (x, y, z) ∈ K³ | x + y − z = 0 } = Span{ (−1, 1, 0), (1, 0, 1) } = Col [ −1 1 ; 1 0 ; 0 1 ].

Theorem 6.10. If W is a linear subspace of Kⁿ, then dim W + dim W⊥ = n.

Proof: Consider A (n × p) such that W = Col A ⊂ Kⁿ.
◦ dim W = dim Col A = rank A
◦ dim W⊥ = dim Nul A* = n − rank A* = n − rank A
Adding both equalities, dim W + dim W⊥ = n.
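The example above and Theorem 6.10 can be verified numerically. This is a minimal sketch (not part of the original notes, assuming NumPy): it obtains a basis of W⊥ = Nul A* from the SVD of A* and checks that dim W + dim W⊥ = n.

```python
import numpy as np

# W = Span{(1, 1, -1)} = Col A ; by Theorem 6.9, W_perp = Nul A*
A = np.array([[1.0],
              [1.0],
              [-1.0]])

# Null space of A* from its SVD: the right singular vectors whose
# singular value is (numerically) zero span Nul A*.
A_star = A.conj().T
U, s, Vh = np.linalg.svd(A_star)
rank = np.sum(s > 1e-12)
W_perp_basis = Vh[rank:].conj().T       # columns form a basis of W_perp

print(W_perp_basis.shape)               # (3, 2): dim W + dim W_perp = 1 + 2 = 3 (Theorem 6.10)

# Every basis vector of W_perp is orthogonal to the generator of W
print(np.allclose(A_star @ W_perp_basis, 0))    # True
```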
Theorem 6.11. (W⊥)⊥ = W.

Proof:
◦ If x ∈ W, then x ⊥ v (∀ v ∈ W⊥), so x ∈ (W⊥)⊥. Hence W ⊂ (W⊥)⊥.
◦ By Theorem 6.10, dim W⊥ + dim (W⊥)⊥ = n and dim W + dim W⊥ = n, so dim (W⊥)⊥ = dim W. A subspace contained in another of the same dimension must equal it, so W = (W⊥)⊥.

Theorem 6.12. W = Nul A ⇔ W⊥ = Col A*.

6.5. Orthogonal Projections

Theorem 6.13. Let W be a linear subspace of Kⁿ (W ≠ {0}). Any vector x ∈ Kⁿ can be uniquely decomposed as the sum of two orthogonal vectors

x = xW + xW⊥

where xW ∈ W is the orthogonal projection of x onto W and xW⊥ ∈ W⊥ is the component of x orthogonal to W. In fact, if {b1, ..., bp} is an orthogonal basis of W,

xW = (⟨b1, x⟩ / ⟨b1, b1⟩) b1 + ... + (⟨bp, x⟩ / ⟨bp, bp⟩) bp,  xW⊥ = x − xW.

Proof:
◦ From the expression for xW, we see that xW is a linear combination of b1, ..., bp, so xW ∈ W.
◦ Consider xW⊥ and compute

⟨b1, xW⊥⟩ = ⟨b1, x − xW⟩ = ⟨b1, x⟩ − ⟨b1, xW⟩
= ⟨b1, x⟩ − ⟨ b1, (⟨b1, x⟩/⟨b1, b1⟩) b1 + (⟨b2, x⟩/⟨b2, b2⟩) b2 + ... + (⟨bp, x⟩/⟨bp, bp⟩) bp ⟩
= ⟨b1, x⟩ − ⟨b1, x⟩ − (⟨b2, x⟩/⟨b2, b2⟩) ⟨b1, b2⟩ − ... − (⟨bp, x⟩/⟨bp, bp⟩) ⟨b1, bp⟩
= ⟨b1, x⟩ − ⟨b1, x⟩ − 0 − ... − 0 = 0.

Likewise, ⟨bi, xW⊥⟩ = 0 (∀ i) and, therefore, by Theorem 6.8, xW⊥ is orthogonal to every vector of W, i.e. xW⊥ ∈ W⊥.
◦ If the decomposition weren't unique, we could write x = xW + xW⊥ = yW + yW⊥ ⇒ xW − yW = yW⊥ − xW⊥. Both members are equal and orthogonal to each other, hence orthogonal to themselves. Therefore

xW − yW = yW⊥ − xW⊥ = 0 ⇒ xW = yW and xW⊥ = yW⊥.

EXAMPLE: W = Col [ 1 ; 2 ], x = (0, 1). With b1 = (1, 2):

xW = (⟨b1, x⟩ / ⟨b1, b1⟩) b1 = ( [1 2](0, 1) / [1 2](1, 2) ) (1, 2) = (2/5) (1, 2) = (2/5, 4/5)

xW⊥ = x − xW = (0, 1) − (2/5, 4/5) = (−2/5, 1/5)

Let {u1, ..., up} be an orthonormal basis of a linear subspace W. We define the orthogonal projection matrix onto W, denoted by PW, as the matrix

PW = QQ*, where Q = [ u1 u2 ... up ].

Theorem 6.14. The orthogonal projection of a vector x onto a linear subspace W verifies xW = PW x.

Proof:

PW x = QQ*x = [ u1 ··· up ] [ u1*x ; ... ; up*x ] = [ u1 ··· up ] [ ⟨u1, x⟩ ; ... ; ⟨up, x⟩ ]

That is, PW x = ⟨u1, x⟩ u1 + ... + ⟨up, x⟩ up = xW, by Theorem 6.13 (here ⟨ui, ui⟩ = 1).

Properties:
◦ PW* = PW
◦ PW² = PW
◦ PW + PW⊥ = I
◦ PW PW⊥ = 0

EXAMPLE: Find PW for W = Span{ (0, 1, 0), (1, 0, 1) }.

The two vectors are orthogonal; normalizing them we obtain the orthonormal basis { (0, 1, 0), (1/√2)(1, 0, 1) }, so

Q = [ 0 1/√2 ; 1 0 ; 0 1/√2 ] ⇒ PW = QQ* = [ 1/2 0 1/2 ; 0 1 0 ; 1/2 0 1/2 ].

Geometrical Interpretation of the Orthogonal Projection

Theorem 6.15. Consider a linear subspace W of Kⁿ and the orthogonal decomposition of an arbitrary vector x ∈ Kⁿ, x = xW + xW⊥, where xW ∈ W and xW⊥ ∈ W⊥. Then xW is the closest point in W to x, in the sense that any other point w ∈ W verifies

d(x, w) > d(x, xW).

Then d(x, xW) = |xW⊥| is the minimum distance of x to W.

Proof: Let w be a vector of W distinct from xW: w = xW + vW with vW ∈ W but vW ≠ 0.

d(x, w)² = |x − w|² = |x − xW − vW|² = |xW⊥ − vW|²
= ⟨xW⊥ − vW, xW⊥ − vW⟩ = ⟨xW⊥ − vW, xW⊥⟩ − ⟨xW⊥ − vW, vW⟩
= |xW⊥|² + |vW|²

since the cross terms vanish (xW⊥ ⊥ vW). But since vW ≠ 0, then |vW| > 0, and therefore

d(x, w)² = |xW⊥|² + |vW|² > |xW⊥|² = d(x, xW)².

EXAMPLE: Find the distance from the point v to the plane H.

v = (1, 2, 3),  H = Span{ (2, 5, −1), (−2, 1, 1) }.

◦ ⟨b1, b2⟩ = [ 2 5 −1 ] (−2, 1, 1) = −4 + 5 − 1 = 0 ⇒ {b1, b2} is an orthogonal basis of H.
◦ Then, we can decompose v = vH + vH⊥, where

vH = (⟨b1, v⟩ / ⟨b1, b1⟩) b1 + (⟨b2, v⟩ / ⟨b2, b2⟩) b2 = (9/30) (2, 5, −1) + (3/6) (−2, 1, 1) = (−2/5, 2, 1/5)

is the closest point in H to v.
◦ Minimum distance:

d(v, vH) = |v − vH| = | (1, 2, 3) − (−2/5, 2, 1/5) | = | (7/5, 0, 14/5) | = 7√5/5 ≈ 3.13.
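The distance-to-a-plane example lends itself to a short numerical check. The sketch below (an addition to the notes, assuming NumPy) builds the projection matrix PH = QQ* of Theorem 6.14 from an orthonormal basis of H and recovers the closest point and the minimum distance.

```python
import numpy as np

# Distance from v to the plane H = Span{b1, b2} (example above).
# {b1, b2} is already an orthogonal basis, so we only need to normalize it.
b1 = np.array([2.0, 5.0, -1.0])
b2 = np.array([-2.0, 1.0, 1.0])
v  = np.array([1.0, 2.0, 3.0])

# Orthonormal basis of H and projection matrix P_H = Q Q*
Q = np.column_stack([b1 / np.linalg.norm(b1), b2 / np.linalg.norm(b2)])
P_H = Q @ Q.conj().T

v_H = P_H @ v                     # orthogonal projection: closest point in H to v
print(v_H)                        # [-0.4  2.   0.2]  i.e. (-2/5, 2, 1/5)

# Minimum distance d(v, H) = |v - v_H|
print(np.linalg.norm(v - v_H))    # 3.1305...  = 7*sqrt(5)/5

# Sanity checks: P_H is Hermitian and idempotent
assert np.allclose(P_H, P_H.conj().T)
assert np.allclose(P_H @ P_H, P_H)
```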
6.6. The Gram-Schmidt Process

Theorem 6.16. If {x1, x2, ..., xp} is a basis of a linear subspace W of Kⁿ, then the set of vectors

v1 = x1
v2 = x2 − (⟨v1, x2⟩ / ⟨v1, v1⟩) v1
v3 = x3 − (⟨v1, x3⟩ / ⟨v1, v1⟩) v1 − (⟨v2, x3⟩ / ⟨v2, v2⟩) v2
...
vp = xp − (⟨v1, xp⟩ / ⟨v1, v1⟩) v1 − · · · − (⟨vp−1, xp⟩ / ⟨vp−1, vp−1⟩) vp−1

is an orthogonal basis of W. In addition,

Span{x1, ..., xk} = Span{v1, ..., vk} ∀ 1 ≤ k ≤ p.

Proof: (By induction) Define Wj = Span{x1, ..., xj} ∀ 1 ≤ j ≤ p.
◦ The theorem is trivially true if we only have one vector. Also, since x1 = v1, then Span{x1} = Span{v1}.
◦ Suppose it is true for k vectors (1 ≤ k < p). That is, suppose we have found the vectors {v1, ..., vk} that are an orthogonal basis of Wk. Consider now the vector

vk+1 = xk+1 − (⟨v1, xk+1⟩ / ⟨v1, v1⟩) v1 − · · · − (⟨vk, xk+1⟩ / ⟨vk, vk⟩) vk.

This vector is vk+1 = xk+1 − (xk+1)Wk = (xk+1)Wk⊥.

Note that:
1.- vk+1 ≠ 0 because xk+1 ∉ Wk.
2.- vk+1 ∈ Wk⊥. That is, vk+1 is orthogonal to {v1, ..., vk}.
3.- The set {v1, ..., vk, vk+1} is an (orthogonal) basis of Span{v1, ..., vk, vk+1}.
4.- {v1, ..., vk, vk+1} is a set of k + 1 linearly independent vectors of Wk+1.
5.- The set {v1, ..., vk, vk+1} is an orthogonal basis of Wk+1 = Span{x1, ..., xk, xk+1}.

ATTENTION: If we define

ṽ2 = x2 − (⟨x1, x2⟩ / ⟨x1, x1⟩) x1

then ṽ2 = v2, because v1 = x1. But if we define

ṽ3 = x3 − (⟨x1, x3⟩ / ⟨x1, x1⟩) x1 − (⟨x2, x3⟩ / ⟨x2, x2⟩) x2

then, in general, ⟨ṽ3, ṽ2⟩ ≠ 0, ⟨ṽ3, x1⟩ ≠ 0 and ⟨ṽ3, x2⟩ ≠ 0: the projections must be subtracted along the already orthogonalized vectors vi, not along the original xi.

EXAMPLE: Find an orthogonal basis of the hyperplane W: x1 + x2 − i x3 − x4 = 0.

◦ Basis of W. Solving [ 1 1 −i −1 ] x = 0:

x1 = −μ1 + i μ2 + μ3, x2 = μ1, x3 = μ2, x4 = μ3

so a basis is given by the vectors x1 = (−1, 1, 0, 0), x2 = (i, 0, 1, 0), x3 = (1, 0, 0, 1).

◦ Orthogonalization:

v1 = x1 = (−1, 1, 0, 0)

v2 = x2 − (⟨v1, x2⟩ / ⟨v1, v1⟩) v1 = (i, 0, 1, 0) − (−i/2)(−1, 1, 0, 0) = (i/2, i/2, 1, 0)

v3 = x3 − (⟨v1, x3⟩ / ⟨v1, v1⟩) v1 − (⟨v2, x3⟩ / ⟨v2, v2⟩) v2
  = (1, 0, 0, 1) − (−1/2)(−1, 1, 0, 0) − (−i/3)(i/2, i/2, 1, 0) = (1/3, 1/3, i/3, 1)

Therefore, { (−1, 1, 0, 0), (i, i, 2, 0), (1, 1, i, 3) } (the vectors v1, 2 v2 and 3 v3) is an orthogonal basis of W.

6.7. The QR Factorization

Theorem 6.17. Any matrix A with linearly independent columns can be factorized as

A = Q R,  with A (m×n), Q (m×n), R (n×n),

where the columns of Q are an orthonormal basis of Col A (thus, Q*Q = I) and R is upper triangular with positive diagonal elements (hence, invertible).

Proof: We apply Gram-Schmidt to the columns of A = [ a1 ··· an ]:

v1 = a1
v2 = a2 − α12 v1
v3 = a3 − α13 v1 − α23 v2
...

Normalizing (ui = vi / |vi|, βi = |vi|):

β1 u1 = a1
β2 u2 = a2 − α12 β1 u1
β3 u3 = a3 − α13 β1 u1 − α23 β2 u2
...

Inverting these equations:

a1 = β1 u1
a2 = β2 u2 + β1 α12 u1
a3 = β3 u3 + β1 α13 u1 + β2 α23 u2
...

which can be written in matrix form:

[ a1 ... an ] = [ u1 ... un ] [ β1 β1α12 β1α13 ... β1α1n ; 0 β2 β2α23 ... β2α2n ; 0 0 β3 ... β3α3n ; ... ; 0 0 0 ... βn ]

EXAMPLE: Find a QR factorization of A = [ 1 0 0 ; 1 1 0 ; 1 1 1 ; 1 1 1 ].

Gram-Schmidt process and normalization:

a1 = (1, 1, 1, 1) ⇒ v1 = (1, 1, 1, 1) ⇒ u1 = (1/2)(1, 1, 1, 1)
a2 = (0, 1, 1, 1) ⇒ v2 = (1/4)(−3, 1, 1, 1) ⇒ u2 = (1/(2√3))(−3, 1, 1, 1)
a3 = (0, 0, 1, 1) ⇒ v3 = (1/3)(0, −2, 1, 1) ⇒ u3 = (1/√6)(0, −2, 1, 1)

◦ Matrix Q:

Q = [ u1 u2 u3 ] = (1/(2√3)) [ √3 −3 0 ; √3 1 −2√2 ; √3 1 √2 ; √3 1 √2 ]

◦ Matrix R: QR = A ⇒ Q*QR = Q*A ⇒ R = Q*A:

R = [ 2 3/2 1 ; 0 √3/2 √3/3 ; 0 0 √6/3 ]
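The Gram-Schmidt recursion of Theorem 6.16 translates almost literally into code. The following sketch (an addition to the notes, assuming NumPy; the helper name gram_schmidt_qr is ours) implements it as a QR factorization in the sense of Theorem 6.17 and checks it on the example matrix A.

```python
import numpy as np

def gram_schmidt_qr(A):
    """QR via the Gram-Schmidt recursion of Theorems 6.16 / 6.17.

    Assumes the columns of A are linearly independent. Returns Q with
    orthonormal columns and an upper-triangular R with positive diagonal
    such that A = Q R.
    """
    A = np.asarray(A, dtype=complex)
    m, n = A.shape
    Q = np.zeros((m, n), dtype=complex)
    R = np.zeros((n, n), dtype=complex)
    for k in range(n):
        v = A[:, k].copy()
        for j in range(k):
            R[j, k] = np.vdot(Q[:, j], A[:, k])   # <u_j, a_k>
            v -= R[j, k] * Q[:, j]                # subtract the projection onto u_j
        R[k, k] = np.linalg.norm(v)               # beta_k = |v_k|
        Q[:, k] = v / R[k, k]                     # normalize
    return Q, R

# Example matrix from the QR factorization example above
A = np.array([[1, 0, 0],
              [1, 1, 0],
              [1, 1, 1],
              [1, 1, 1]], dtype=float)

Q, R = gram_schmidt_qr(A)
print(np.round(R.real, 4))      # [[2, 1.5, 1], [0, 0.8660, 0.5774], [0, 0, 0.8165]]
assert np.allclose(Q @ R, A)                    # A = Q R
assert np.allclose(Q.conj().T @ Q, np.eye(3))   # orthonormal columns
```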
6.8. Least-Squares Problems

Let A be an (m × n) matrix and b a vector of Kᵐ. The vector x̂ is a least-squares solution of the equation Ax = b if it verifies that

|Ax̂ − b| ≤ |Ax − b| for every x ∈ Kⁿ.

The number |Ax̂ − b| is called the least-squares error.

Note: If the system Ax = b is consistent, the solutions of this equation are also the least-squares solutions.

Theorem 6.18. The least-squares solutions of the equation Ax = b verify

A x̂ = bColA

where bColA is the orthogonal projection of b onto Col A. This equation is equivalent to the normal equation

A*A x̂ = A*b.

Proof:
◦ We look for an x̂ such that Ax̂ is as close as possible to the vector b. As we know (Theorem 6.15), the closest vector to b in Col A is the orthogonal projection of b onto Col A, namely bColA. Therefore, the least-squares solutions are exactly the solutions of A x̂ = bColA (a consistent system, since bColA ∈ Col A).
◦ Consider the orthogonal decomposition b = bColA + b(ColA)⊥. As (Col A)⊥ = Nul A*, it follows that A*b(ColA)⊥ = 0. If x̂ verifies A x̂ = bColA, it must also verify A*A x̂ = A*bColA, but since A*bColA = A*(b − b(ColA)⊥) = A*b, then A*A x̂ = A*b.
◦ Conversely, if x̂ verifies A*A x̂ = A*b, it follows that A*(A x̂ − b) = 0. In other words, the vector v = A x̂ − b ∈ Nul A*. But as Nul A* = (Col A)⊥, then v ∈ (Col A)⊥. Therefore, we have that b = A x̂ − v where A x̂ ∈ Col A and −v ∈ (Col A)⊥. As the orthogonal decomposition is unique, we identify A x̂ = bColA (and −v = b(ColA)⊥), as claimed.

EXAMPLE: Solve the system Ax = b. If inconsistent, find the least-squares solution.

A = [ 4 0 ; 0 2 ; 1 1 ], b = (2, 0, 11).

◦ Solving:

[ A b ] = [ 4 0 2 ; 0 2 0 ; 1 1 11 ] ∼ · · · ∼ [ 1 0 0 ; 0 1 0 ; 0 0 1 ]

There is a pivot in the last column, so the system is inconsistent.

◦ The least-squares solution verifies A*A x̂ = A*b:

[ 4 0 1 ; 0 2 1 ] [ 4 0 ; 0 2 ; 1 1 ] (x̂1, x̂2) = [ 4 0 1 ; 0 2 1 ] (2, 0, 11)

that is,

[ 17 1 ; 1 5 ] (x̂1, x̂2) = (19, 11) ⇒ x̂1 = 1, x̂2 = 2.

The least-squares solution is x̂ = (1, 2).

6.9. Linear Regressions

◦ We aim to find the straight line y = β0 + β1 x that best fits a set of given points (x1, y1), (x2, y2), ..., (xn, yn). The coefficients β0 and β1 are called regression coefficients.

◦ If a straight line could go through all the points, we could find the coefficients β0 and β1 from the system:

β0 + β1 x1 = y1
β0 + β1 x2 = y2
...
β0 + β1 xn = yn

⇔ [ 1 x1 ; 1 x2 ; ... ; 1 xn ] (β0, β1) = (y1, y2, ..., yn), i.e. M β = y.

However, in general the line doesn't pass through all the points and the system M β = y is inconsistent.

◦ The vertical distance between the line and a point is called the residual,

ϵi = yi − (β0 + β1 xi),

which, of course, depends on β0 and β1. We try to find the values of β0 and β1 that somehow minimize the residuals.

◦ A candidate to be minimized is

E(β0, β1) = ϵ1² + ϵ2² + · · · + ϵn² = |M β − y|²,

which is the modulus of the vector M β − y squared.

◦ We know that the value of β that minimizes |M β − y| is the least-squares solution of the equation M β̂ = yColM, which is equivalent to M*M β̂ = M*y. Therefore, the regression coefficients β̂0 and β̂1 that minimize E(β0, β1) can be obtained from the equation

M*M (β̂0, β̂1) = M*y.

EXAMPLE: Fit the curve y = a + b sin(x) + c (x/π)² to the points (−π, 0), (−π/2, −1/2), (0, 3), (π/2, 4), (π, 0).

If the curve could pass through every point, the coefficients would verify the equations:

(−π, 0): a + b sin(−π) + c (−π/π)² = a + c = 0
(−π/2, −1/2): a + b sin(−π/2) + c (−π/(2π))² = a − b + c/4 = −1/2
(0, 3): a + b sin(0) + c (0/π)² = a = 3
(π/2, 4): a + b sin(π/2) + c (π/(2π))² = a + b + c/4 = 4
(π, 0): a + b sin(π) + c (π/π)² = a + c = 0

These equations can be written:

[ 1 0 1 ; 1 −1 1/4 ; 1 0 0 ; 1 1 1/4 ; 1 0 1 ] (a, b, c) = (0, −1/2, 3, 4, 0)

The system is inconsistent, but the least-squares solution verifies M*M β̂ = M*y:

[ 5 0 5/2 ; 0 2 0 ; 5/2 0 17/8 ] (a, b, c) = (13/2, 9/2, 7/8)

Solving:

[ 5 0 5/2 | 13/2 ; 0 2 0 | 9/2 ; 5/2 0 17/8 | 7/8 ] ∼ · · · ∼ [ 1 0 0 | 93/35 ; 0 1 0 | 9/4 ; 0 0 1 | −19/7 ]

The sought-after curve is

y = 93/35 + (9/4) sin(x) − (19/7) (x/π)² ≈ 2.66 + 2.25 sin(x) − 2.71 (x/π)².
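The curve-fitting example can be reproduced with a few lines of NumPy. This sketch (not part of the original notes) builds the design matrix M, solves the least-squares problem, and checks that the normal equations of Theorem 6.18 give the same coefficients.

```python
import numpy as np

# Data points from the example: fit y = a + b*sin(x) + c*(x/pi)**2
xs = np.array([-np.pi, -np.pi / 2, 0.0, np.pi / 2, np.pi])
ys = np.array([0.0, -0.5, 3.0, 4.0, 0.0])

# Design matrix M: one row [1, sin(x_i), (x_i/pi)^2] per data point
M = np.column_stack([np.ones_like(xs), np.sin(xs), (xs / np.pi) ** 2])

# Least-squares solution of M beta = y
beta, *_ = np.linalg.lstsq(M, ys, rcond=None)
a, b, c = beta
print(a, b, c)     # approx. 2.657, 2.25, -2.714  (= 93/35, 9/4, -19/7)

# Same result from the normal equations M* M beta = M* y (Theorem 6.18)
beta_normal = np.linalg.solve(M.T @ M, M.T @ ys)
assert np.allclose(beta, beta_normal)
```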
6.10. Multiple Regression

Multivariable functions can also be fit, as long as they depend linearly on the parameters. Given the data

(x1, y1, f1), (x2, y2, f2), ..., (xn, yn, fn)

suppose we choose

f(x, y; β0, β1, β2, β3) = β0 + β1 y + β2 xy + β3 x²

which depends linearly on the parameters β0, β1, β2, β3.

If this function passed through all the points, the following system would be verified:

β0 + β1 y1 + β2 x1y1 + β3 x1² = f1
β0 + β1 y2 + β2 x2y2 + β3 x2² = f2
...
β0 + β1 yn + β2 xnyn + β3 xn² = fn

⇔ [ 1 y1 x1y1 x1² ; 1 y2 x2y2 x2² ; ... ; 1 yn xnyn xn² ] (β0, β1, β2, β3) = (f1, f2, ..., fn)

This system, M β = f, is in general inconsistent, but its least-squares solution determines the parameters which provide the best fit:

MᵀM β̂ = Mᵀf.
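Since the notes give no numerical data for this section, the sketch below (an addition) fits the model f = β0 + β1 y + β2 xy + β3 x² to synthetic, made-up data with NumPy, just to illustrate the normal equations MᵀM β̂ = Mᵀf.

```python
import numpy as np

# Hypothetical data (the notes give no numbers): samples (x_i, y_i, f_i)
rng = np.random.default_rng(0)
x = rng.uniform(-2, 2, size=50)
y = rng.uniform(-2, 2, size=50)
f = 1.0 + 0.5 * y - 2.0 * x * y + 3.0 * x**2 + rng.normal(0, 0.1, size=50)

# Design matrix M: one row [1, y_i, x_i*y_i, x_i^2] per sample
M = np.column_stack([np.ones_like(x), y, x * y, x**2])

# Least-squares solution of M beta = f via the normal equations M^T M beta = M^T f
beta = np.linalg.solve(M.T @ M, M.T @ f)
print(beta)     # close to the generating coefficients [1.0, 0.5, -2.0, 3.0]
```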