
ELEN 30073 – Numerical Methods: Instructional Material



Table of Contents

1. Approximation and Rounding Errors
2. Roots and Optimization
   a. Bracketing Methods
      i. Bisection Method
      ii. False Position Method
   b. Open Methods
      i. Fixed-Point Iteration
      ii. Newton-Raphson Method
      iii. Secant Method
   c. Optimization
      i. Golden-Section Method
      ii. Newton's Method
      iii. Multimodal Functions
3. Linear Systems
   a. Introduction to Matrices
   b. Non-iterative Methods
      i. Graphical Method
      ii. Determinants and Cramer's Rule
      iii. Naive Gauss Elimination
      iv. Pivoting
      v. Tridiagonal Systems
      vi. Gauss-Jordan Method
      vii. LU Decomposition
      viii. Inverse of a Matrix
      ix. Cholesky Factorization/Decomposition
   c. Iterative Methods
      i. Gauss-Seidel Method
      ii. Jacobi Iteration
4. Curve Fitting
   a. Introduction to Statistics
   b. Linear Least-Squares Regression
   c. Polynomial Regression
   d. Polynomial Interpolation
      i. Newton Interpolating Polynomial
      ii. Lagrange Interpolating Polynomial
Activities

Module 1: APPROXIMATION AND ROUNDING ERRORS

This module covers the approximation and rounding errors that arise in the numerical analysis of equations and formulas.

Overview:
Numerical methods must be sufficiently exact and precise to satisfy the requirements of electrical engineering problems, whether theoretical (Ohm's law, Kirchhoff's voltage and current laws, fault analysis) or practical (sizing and length-of-wire estimation, selecting the optimal brand of device based on cost). Thus, it is important to differentiate the relevant terminology and its application before going into the numerical analysis itself.

Module Objectives:
After successful completion of this module, you should be able to:
- Differentiate between accuracy, precision and bias
- Define significant figures
- Determine different error types
- Conduct an error analysis for solutions and answers

Course Materials:
The terms accuracy, precision and bias are used to describe the veracity or authenticity of a statement or quantity. For example, one must determine the accuracy of a piece of news by verifying it against a trusted source, establishing its degree of precision with respect to the real issue without bias toward either side. In the same way, these terms are used to judge the integrity of a calculation, that is, how far it deviates from the real or actual numbers. This is crucial since engineering, especially electrical engineering, deals with such meticulous calculations. These terms are often used interchangeably because their everyday meanings are very similar.
To illustrate the difference, consider the four shooting boards in Figure 1.1.

Figure 1.1: Shooting boards showing Accuracy, Precision and Bias

From the figure, the following observations can be made (summarized in Table 1):
a. Case A is successful in terms of both accuracy (the holes are close to the target) and precision (the holes are close to each other).
b. In Case B, the holes agree with each other (consistency, or precision) but deviate considerably from where the shooter was aiming; the result is precise but inaccurate.
c. Case C lacks both correctness (accuracy) and consistency (precision).
d. Case D lacks consistency (precision).
e. The shooters of targets C and D were imprecise, since in both cases the holes are inconsistent with each other.

Table 1: Summary of Shooting Boards
Target | Bias            | Precision | Accuracy
A      | None (unbiased) | High      | High
B      | High            | High      | Low
C      | None (unbiased) | Low       | Low
D      | Moderate        | Low       | Low

From this example, accuracy can be defined as the degree to which measurements deviate from the true value (here, the closeness of the holes to the target); precision is the ability to give multiple estimates that are near to each other (how well the holes agree with each other); and bias is the difference between the center of the holes and the center of the target, i.e., a systematic deviation of values from the true value.

Significant figures are used to express how accurately or precisely a quantity is known. For example, if 46.23 is exact to the four digits shown, it has four significant digits, which means the last digit is imprecise and the error can be estimated to be no more than 0.005. The digits 1 to 9 are always significant; zero is significant except where it is used only to set the position of the decimal point. For example, 2410, 2.41 and 0.00241 each have three significant digits, since the 0 in 2410 is used only to set the decimal place. Rounding off should therefore be done at the end of a computation, not at intermediate calculations, to avoid degrading the accuracy of the solution. Scientific notation can be used to avoid ambiguity in significant figures: 2.41 x 10^3 has three significant digits, while 2.410 x 10^3 has four. A mathematical operation that uses an imprecise digit is itself imprecise, since it affects the accuracy of the final answer.

In general, errors can be classified by their sources into non-numerical and numerical errors.

Non-numerical errors can be classified as follows:
a. Modelling errors, generated by assumptions and limitations of the model, such as using an approximate value of pi.
b. Blunders and mistakes, attributed to human error, such as misreading a weighing scale.
c. Uncertainty in the information and data gathered.

Numerical errors can be classified as follows:
a. Round-off errors, due to the limited number of significant digits that can be carried.
b. Truncation errors, due to shortening the terms of a calculation, such as truncating an infinite Taylor series.
c. Propagation errors, due to a sequence of operations; these can be reduced with a good computation order. For example, when summing several values, one can rank the values in ascending order before performing the summation.
d. Mathematical approximation errors, due to approximating one model with another, for example using a linear model to represent a nonlinear expression.
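The difference between round-off and truncation error can be made concrete with a short numerical experiment. The sketch below is illustrative only and is not part of the module; it uses Python rather than MATLAB, and the helper name exp_truncated is an arbitrary choice.

```python
# Minimal sketch (not from the module): round-off vs. truncation error.
import math

# Round-off error: 0.1 has no exact binary representation, so repeated
# addition accumulates a small error.
total = sum(0.1 for _ in range(10))
print(total)                # 0.9999999999999999, not exactly 1.0

# Truncation error: approximating e^x with a Taylor series cut off after n terms.
def exp_truncated(x, n_terms):
    return sum(x**k / math.factorial(k) for k in range(n_terms))

true_value = math.exp(1.0)
for n in (2, 4, 6, 8):
    approx = exp_truncated(1.0, n)
    print(n, approx, abs(true_value - approx))   # the error shrinks as terms are added
```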
Error is defined as the difference between the computed and true values of a number x, as given by the equations below:

e = x_c − x_t
e_r = (x_c − x_t) / x_t = e / x_t

where:
e = error
x_c = computed value
x_t = true value
e_r = relative error

Error can also be expressed in the following terms:
a. True value = approximation + absolute error
b. Absolute error = true value − approximation
c. Relative error = absolute error / true value

These can be summarized by the following formulas:

ε_t = (absolute error / true value) × 100
ε_a = (approximate error / approximate value) × 100 = ((current approximation − previous approximation) / current approximation) × 100

where:
ε_t = true percent relative error
ε_a = approximate percent relative error

To learn more about approximation and rounding errors, consult the following resources:

Watch:
Floating Point Representation and Rounding Error | Chad Higdon-Topaz (https://www.youtube.com/watch?v=wbxSTxhTmrs)

Read:
Applied Numerical Methods with MATLAB for Engineers and Scientists by S.C. Chapra, Chapter 4 – Roundoff and Truncation Errors

Module 2: ROOTS AND OPTIMIZATION

This module covers the root-finding and optimization methods used to determine the solutions of a particular objective function.

Overview:
Finding the roots of an equation means determining the approximate values of x at which the function goes to zero, i.e., f(x) = 0. One conventional way to obtain an approximate solution is to plot the function and determine where it crosses the x axis, but this approach may be inappropriate for some functions because of its lack of consistency with the true answer. Alternatively, one may try the tedious trial-and-error method: guess a value of x and repeatedly evaluate f(x) to determine whether the new value provides a better estimate of the root. Such methods are inefficient for the requirements of electrical engineering. Numerical methods likewise provide approximations, but they employ systematic strategies to determine a feasible solution to f(x) = 0.

Besides finding roots, another point of interest is finding the optimal (maximum and minimum) values of a function. In calculus, such solutions are obtained analytically by determining the value at which the function is flat, that is, where its derivative is zero. From a numerical-analysis standpoint, this is very similar to root location: both involve guessing and searching for a location on a function. The fundamental difference between the two is summarized in the figure below.

Figure 2.1: Difference between roots and optima of a single-variable function

Module Objectives:
After successful completion of this module, you should be able to:
- Utilize bracketing methods such as false position and bisection to determine the root of an equation.
- Utilize open methods such as Newton-Raphson, secant and fixed-point iteration to determine the root of an equation.
- Compare bracketing and open methods in terms of their accuracy, precision and convergence to the true root.
- Utilize optimization methods such as the golden-section search to determine the optima of an equation.
- Utilize computer software such as Microsoft Excel for both root location and optimization.

Course Materials:
A simple approach to estimating the root of an equation f(x) = 0 is to plot the function and observe where it crosses the x axis; this point provides a rough approximation of the root. Graphical techniques, however, have limited practical value because they are imprecise, especially in obtaining accurate values of the root.
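A crude computational analogue of reading a plot is to tabulate the function over a grid and look for sign changes; each sign change brackets a root. The sketch below is illustrative only (Python rather than MATLAB), and the cubic and the helper name find_brackets are arbitrary choices, not taken from the module.

```python
# Illustrative sketch: scanning for sign changes to get rough root brackets.
def f(x):
    return x**3 - 6*x**2 + 11*x - 6     # arbitrary example with roots at x = 1, 2, 3

def find_brackets(f, a, b, n=100):
    """Return (x_left, x_right) intervals over [a, b] where f changes sign."""
    brackets = []
    step = (b - a) / n
    x_prev, f_prev = a, f(a)
    for i in range(1, n + 1):
        x = a + i * step
        fx = f(x)
        if f_prev * fx < 0 or fx == 0.0:   # sign change => a root lies in between
            brackets.append((x_prev, x))
        x_prev, f_prev = x, fx
    return brackets

print(find_brackets(f, 0.0, 4.0))          # small intervals, one around each root
```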
Such rough estimates can nevertheless be used as starting points. This is where methods based on initial guesses come in: one repeatedly refines the guesses until the function is sufficiently close to zero. The two major classes of methods, distinguished by the type of initial guess, are:

a. Bracketing methods, which are based on two initial guesses that "bracket" (border) the root.
b. Open methods, which involve one or more initial guesses that need not bracket the root.

Bracketing methods start with guesses that bracket, or contain, the root and then systematically reduce the width of the bracket. The two most common variations are the bisection method and the false position method.

The bisection method is a variation of the incremental-search method in which the interval is always divided in half. If a function changes sign over an interval, the function value at the midpoint is evaluated (Figure 2.2). The root is then taken to lie within the subinterval where the sign change occurs, and that subinterval becomes the interval for the next iteration. The process is repeated until the root is known to the required precision. The midpoint is given by equation 1:

x_r = (x_l + x_u) / 2    (1)

where:
x_l = lower limit
x_r = midpoint
x_u = upper limit

The following conditions are used to select the interval for the succeeding iteration (a short code sketch of this loop is given at the end of this subsection):
- If f(x_l) f(x_r) > 0, no sign change occurs between the lower bound and the midpoint, so the root must lie between x_r and x_u; x_r becomes the new x_l.
- If f(x_l) f(x_r) < 0, the root lies between x_l and x_r; x_r becomes the new x_u.

Figure 2.2: Graphical depiction of the bisection method over successive iterations

False position (also called the linear interpolation method) is another well-known bracketing method. It is very similar to bisection, except that it uses a different strategy to come up with the new root estimate: rather than bisecting the interval, it locates the root by joining f(x_l) and f(x_u) with a straight line. The intersection of this line with the x axis represents an improved estimate of the root, so the shape of the function influences the new estimate. Using similar triangles (Figure 2.3), the intersection of the straight line with the x axis can be estimated as in equation 2.

Figure 2.3: Graphical representation of the false position method

By similar triangles,

(x_u − x_r) / f(x_u) = (x_l − x_u) / (f(x_l) − f(x_u))

thus,

x_r = x_u − f(x_u)(x_l − x_u) / (f(x_l) − f(x_u))    (2)
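To make the bracketing logic concrete, here is a short sketch of the bisection loop described above. It is illustrative only (Python rather than MATLAB); the stopping test uses the approximate percent relative error from Module 1, and the test function is an arbitrary example.

```python
# Minimal sketch (not from the module): bisection method, equation (1).
def bisection(f, xl, xu, tol_percent=0.01, max_iter=100):
    if f(xl) * f(xu) > 0:
        raise ValueError("f(xl) and f(xu) must have opposite signs")
    xr_old = xl
    for _ in range(max_iter):
        xr = (xl + xu) / 2.0                      # equation (1)
        if f(xr) == 0.0:                          # landed exactly on the root
            return xr
        ea = abs((xr - xr_old) / xr) * 100 if xr != 0 else float("inf")
        if f(xl) * f(xr) < 0:                     # root lies in [xl, xr]
            xu = xr
        else:                                     # root lies in [xr, xu]
            xl = xr
        if ea < tol_percent:
            return xr
        xr_old = xr
    return xr

# Example: the same arbitrary cubic as above, bracketed between 1.5 and 2.5.
print(bisection(lambda x: x**3 - 6*x**2 + 11*x - 6, 1.5, 2.5))   # ~ 2.0
```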
Open methods also start with guesses but, unlike bracketing methods, they require only a single starting value, or two starting values that do not necessarily bracket the root. As such, they sometimes diverge, i.e., move away from the true root as the computation progresses. The three most common variations are fixed-point iteration, the Newton-Raphson method and the secant method.

Fixed-point iteration (also called one-point iteration or successive substitution) is done by rearranging the function f(x) = 0 so that x is isolated on the left-hand side, x = g(x). This transformation can be accomplished either by algebraic manipulation or by simply adding x to both sides of the original equation, giving the iteration x_{i+1} = g(x_i). Figure 2.4 illustrates fixed-point iteration graphically.

Figure 2.4: Graphical illustration of fixed-point iteration

The Newton-Raphson method is perhaps the most widely used of all root-locating formulas. If the initial guess at the root is x_i, a tangent can be extended from the point [x_i, f(x_i)] as shown in Figure 2.5. The point where this tangent crosses the x axis usually represents an improved estimate of the root. The method can be derived from the geometric interpretation of similar triangles: the first derivative at x_i is the slope of this tangent, as shown in equation 3:

f'(x_i) = (f(x_i) − 0) / (x_i − x_{i+1})    hence    x_{i+1} = x_i − f(x_i) / f'(x_i)    (3)

Figure 2.5: Graphical representation of the Newton-Raphson method

The secant method addresses a practical problem in implementing the Newton-Raphson method: the evaluation of the derivative. Although this is not inconvenient for polynomials and many other functions, there are functions whose derivatives are difficult or inconvenient to evaluate. For these cases the derivative can be approximated by a backward finite divided difference, as shown in equation 4:

f'(x_i) ≈ (f(x_{i−1}) − f(x_i)) / (x_{i−1} − x_i)    (4)

Substituting this approximation into the Newton-Raphson formula yields the iterative formula of equation 5, which is the secant method:

x_{i+1} = x_i − f(x_i)(x_{i−1} − x_i) / (f(x_{i−1}) − f(x_i))    (5)

An alternative approach involves a fractional perturbation of the independent variable to estimate f'(x):

f'(x_i) ≈ (f(x_i + δx_i) − f(x_i)) / (δx_i)

where δ is a small perturbation fraction. Substituting this into the Newton-Raphson formula yields

x_{i+1} = x_i − δx_i f(x_i) / (f(x_i + δx_i) − f(x_i))

which is the modified secant method.
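As an illustration of the open-method update formulas, here is a sketch of the Newton-Raphson iteration of equation 3. It is illustrative only (Python rather than MATLAB); the example function and the name newton_raphson are arbitrary choices, and the same loop structure can be adapted to the secant or modified secant updates.

```python
# Minimal sketch (not from the module): Newton-Raphson iteration, equation (3).
import math

def newton_raphson(f, dfdx, x0, tol_percent=0.01, max_iter=50):
    x = x0
    for _ in range(max_iter):
        x_new = x - f(x) / dfdx(x)            # tangent crosses the x axis here
        ea = abs((x_new - x) / x_new) * 100    # approximate relative error, %
        x = x_new
        if ea < tol_percent:
            break
    return x

# Arbitrary example: f(x) = e^(-x) - x, with f'(x) = -e^(-x) - 1.
root = newton_raphson(lambda x: math.exp(-x) - x,
                      lambda x: -math.exp(-x) - 1,
                      x0=0.0)
print(root)    # ~ 0.5671
```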
Optimization (mathematical programming) and root finding, whether by open or bracketing methods, both involve guessing and searching for a point on a function f(x). The fundamental difference is that root finding searches for the zeros of a function, while optimization searches for the minimum or the maximum of a function, possibly of several variables. An optimization problem can be stated as:

Find x, which minimizes or maximizes f(x), subject to

d_i(x) ≤ a_i,   i = 1, 2, …, m*
e_i(x) = b_i,   i = 1, 2, …, p*

where x is an n-dimensional design vector, f(x) is the objective function, the d_i(x) are inequality constraints and the e_i(x) are equality constraints.

Optimization problems can be classified on the basis of the form of f(x):
- If f(x) and the constraints are linear, we have linear programming.
- If f(x) is quadratic and the constraints are linear, we have quadratic programming.
- If f(x) is neither linear nor quadratic and/or the constraints are nonlinear, we have nonlinear programming.
When the constraint equations (*) are included, we have a constrained optimization problem; otherwise, it is an unconstrained optimization problem.

A unimodal function has a single maximum or minimum in a given interval, as shown in Figure 2.6. For a unimodal function:
- First pick two points [x_l, x_u] that bracket the extremum.
- Pick a third point within this interval to determine whether a maximum has occurred.
- Then pick a fourth point to determine whether the maximum has occurred within the first three or the last three points.
- The key to making this approach efficient is choosing the intermediate points wisely, minimizing the number of function evaluations by replacing old values with new ones.

Figure 2.6: Graphical representation of a unimodal function

The golden-section search chooses the interior points according to two conditions:

l_0 = l_1 + l_2    and    l_1/l_0 = l_2/l_1

The first condition specifies that the sum of the two sub-lengths l_1 and l_2 must equal the original interval length. The second says that the two length ratios must be equal. Substituting the first condition into the second gives equation 3:

l_1/(l_1 + l_2) = l_2/l_1,    with R = l_2/l_1    (3)

1/(1 + R) = R,    thus    R² + R − 1 = 0

R = (−1 + √(1 − 4(−1)))/2 = (√5 − 1)/2 = 0.61803…

The method starts with two initial guesses, x_l and x_u, that bracket one local extremum of f(x):
- Two interior points x_1 and x_2 are chosen according to the golden ratio, as shown in equation 4.
- The function is then evaluated at these two interior points, as shown in Figure 2.7.

d = ((√5 − 1)/2)(x_u − x_l)    (4)
x_1 = x_l + d
x_2 = x_u − d

Figure 2.7: Graphical representation of optimization using two initial guesses (golden-section search)

From this, two results can occur:
- If f(x_1) > f(x_2), the domain of x to the left of x_2, from x_l to x_2, can be eliminated because it does not contain the maximum. Then x_2 becomes the new x_l for the next round.
- If f(x_2) > f(x_1), the domain of x to the right of x_1, from x_1 to x_u, is eliminated. In this case, x_1 becomes the new x_u for the next round.
A new interior point, x_1 = x_l + ((√5 − 1)/2)(x_u − x_l), can then be determined as before from equation 4. The real benefit of using the golden ratio is that, because the original x_1 and x_2 were chosen with it, we do not need to recalculate all the function values for the next iteration; one of the old interior points can be reused (a short code sketch of this loop appears at the end of this module's course materials, after the discussion of multimodal functions).

Newton's method is an approach similar to the Newton-Raphson root-finding method; it can be used to find an optimum of f(x) by defining a new function g(x) = f'(x). Because the same optimal value x* satisfies both

f'(x*) = g(x*) = 0

we can use the following iteration to locate the extremum of f(x), as given by equation 5:

x_{i+1} = x_i − f'(x_i) / f''(x_i)    (5)

Newton's method can be summarized by the following steps:
Initialization. Determine a reasonably good estimate of the maximum or minimum of the function f(x).
Step 1. Determine f'(x) and f''(x).
Step 2. Substitute x_i (the initial estimate x_0 for the first iteration) into f'(x) and f''(x) and evaluate
x_{i+1} = x_i − f'(x_i) / f''(x_i)
to determine x_{i+1} and the function value at iteration i.
Step 3. If the value of the first derivative is zero, you have reached the optimum (maximum or minimum). Otherwise, repeat Step 2 with the new value of x_i.

In multimodal functions, both local and global optima can occur. In almost all cases, we are interested in finding the absolute highest or lowest value of the function, as shown by the Cartesian and contour plots in Figure 2.8.

Figure 2.8: Finding global maxima and minima using a surface plot (a) and a contour plot (b)

A global optimum can be distinguished from a local optimum by:
- Graphing the function to gain insight into its behavior (Figure 2.9).
- Using randomly generated starting guesses and picking the largest (or smallest) of the resulting optima as the global one.
- Perturbing the starting point to see whether the routine returns a better point or the same local optimum.

Figure 2.9: Graphical representation of global and local maxima
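Here is the golden-section loop referred to above, sketched for a maximum. It is illustrative only (Python rather than MATLAB); the function name golden_section_max and the example function are arbitrary choices.

```python
# Minimal sketch (not from the module): golden-section search for a maximum.
import math

def golden_section_max(f, xl, xu, tol=1e-6, max_iter=200):
    R = (math.sqrt(5) - 1) / 2                  # 0.61803..., the golden ratio
    d = R * (xu - xl)
    x1, x2 = xl + d, xu - d
    f1, f2 = f(x1), f(x2)
    for _ in range(max_iter):
        if f1 > f2:            # maximum lies in [x2, xu]; reuse x1 as the new x2
            xl, x2, f2 = x2, x1, f1
            x1 = xl + R * (xu - xl)
            f1 = f(x1)
        else:                  # maximum lies in [xl, x1]; reuse x2 as the new x1
            xu, x1, f1 = x1, x2, f2
            x2 = xu - R * (xu - xl)
            f2 = f(x2)
        if abs(xu - xl) < tol:
            break
    return (xl + xu) / 2

# Arbitrary example: f(x) = 2 sin(x) - x^2/10 has a maximum near x ~ 1.4276.
print(golden_section_max(lambda x: 2*math.sin(x) - x**2/10, 0.0, 4.0))
```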
To learn more about roots and optimization, consult the following resources:

Watch:
MATLAB Introduction: Root Solving | Jonathan Curie (https://www.youtube.com/watch?v=4n4eXQLHGlg)
How to locate a root | ExamSolutions (https://www.youtube.com/watch?v=OzFuihxtbtA)

Read:
Applied Numerical Methods with MATLAB for Engineers and Scientists, Chapters 5, 6 and 7, by S.C. Chapra
Applied Numerical Methods using MATLAB, Chapter 4, by Won Young Yang, Wenwu Cao, Tae-Sang Chung and John Morris

Module 3: LINEAR SYSTEMS

This module covers the solution of systems of simultaneous linear algebraic equations.

Overview:
Many of the fundamental equations of electrical engineering are based on conservation laws; familiar quantities that conform to such laws are mass, energy and momentum. These laws lead to balance or continuity equations that relate the system's behavior — represented by the levels or response of the quantity being modeled — to the properties or characteristics of the system and to the external stimuli or forcing functions acting on it. When these dependencies are expressed mathematically, the resulting equations are often of linear algebraic form, where the x's are usually measures of the magnitudes of the responses of the individual components. Aside from physical systems, simultaneous linear algebraic equations also arise in a variety of purely mathematical problem contexts. These result when mathematical functions are required to satisfy several conditions simultaneously; each condition results in an equation that contains known coefficients and unknown variables.

Module Objectives:
After successful completion of this module, you should be able to:
- Define a matrix and its use in linear algebraic equations.
- Discuss the importance of Gauss elimination and its application to solving simultaneous linear equations.
- Formulate and discuss LU factorization from Gauss elimination.
- Describe how LU factorization can be employed to efficiently calculate the matrix inverse.
- Describe different iterative techniques for solving linear algebraic problems.

Course Materials:
Linear algebraic equations have the general form

a_11 x_1 + a_12 x_2 + … + a_1n x_n = b_1
a_21 x_1 + a_22 x_2 + … + a_2n x_n = b_2
⋮
a_n1 x_1 + a_n2 x_2 + … + a_nn x_n = b_n

where the a's are constant coefficients, the b's are constants, the x's are the unknowns and n is the number of equations. The system can be written compactly in matrix form.

A matrix consists of a rectangular array of elements represented by a single symbol. As depicted in Figure 3.1, [A] is the shorthand notation for the matrix and a_ij designates an individual element of the matrix. A horizontal set of elements is called a row and a vertical set is called a column. The first subscript i always designates the row in which the element lies, and the second subscript j designates the column; for example, element a_23 is in row 2 and column 3. A matrix with m rows and n columns is said to have a dimension of m by n (m × n) and is referred to as an m by n matrix.

Figure 3.1: Example of a matrix A

Matrices where m = n are called square matrices. The diagonal consisting of the elements a_11, a_22, …, a_nn is termed the principal or main diagonal of the matrix (for a 3 × 3 matrix, the elements a_11, a_22 and a_33). A symmetric matrix is one where the rows equal the columns, that is, a_ij = a_ji for all i and j. An identity matrix is a diagonal matrix in which all elements on the main diagonal are equal to 1. The identity matrix has properties similar to unity: [A][I] = [I][A] = [A].
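These matrix notions translate directly into array operations. The sketch below is illustrative only (Python/NumPy rather than MATLAB) and uses an arbitrary 3 × 3 matrix.

```python
# Minimal sketch (not from the module): basic matrix notions with NumPy.
import numpy as np

A = np.array([[1., 2., 3.],
              [4., 5., 6.],
              [7., 8., 10.]])

print(A.shape)        # (3, 3): a square 3-by-3 matrix
print(A[1, 2])        # element a_23 = 6.0 (row 2, column 3; NumPy indexes from 0)
print(np.diag(A))     # main diagonal [1., 5., 10.]

I = np.eye(3)         # 3-by-3 identity matrix
print(np.allclose(A @ I, A) and np.allclose(I @ A, A))   # [A][I] = [I][A] = [A] -> True
```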
An upper triangular matrix is one where all the elements below the main diagonal are zero; a lower triangular matrix is one where all elements above the main diagonal are zero.

Figure 3.2 shows a visual depiction of how the rows and columns line up in matrix multiplication.

Figure 3.2: Rows and columns in matrix multiplication

Matrix multiplication can be performed only if the inner dimensions are equal. The product of two matrices is represented as [C] = [A][B], where the elements of [C] are defined by equation 1:

c_ij = Σ_{k=1}^{n} a_ik b_kj    (1)

Although multiplication is defined, matrix division is not a defined operation. However, if a matrix [A] is square and nonsingular, there is another matrix [A]^-1, called the inverse of [A], for which

[A][A]^-1 = [A]^-1[A] = [I]

Thus, multiplication of a matrix by its inverse is analogous to division, in the sense that a number divided by itself equals 1: multiplying a matrix by its inverse yields the identity matrix.

1. Graphical Method
A graphical solution is obtainable for two linear equations by plotting them on Cartesian coordinates, with one axis corresponding to x_1 and the other to x_2. Because the equations are linear, each plots as a straight line, as shown in Figure 3.3.

Figure 3.3: Graphical depiction of singular and ill-conditioned systems: (a) no solution, (b) infinite solutions, (c) an ill-conditioned system in which the slopes are so close that the point of intersection is difficult to detect visually

2. Determinants and Cramer's Rule
The determinant can be illustrated for a set of three equations [A]{x} = {b} using the determinant D of the coefficient matrix [A]. Cramer's rule states that each unknown in a system of linear algebraic equations may be expressed as a fraction of two determinants: the denominator is D, and the numerator is obtained from D by replacing the column of coefficients of the unknown in question with the constants b_1, b_2, …, b_n. For example, for three equations, x_1 would be computed as x_1 = D_1/D, where D_1 is the determinant of the matrix formed by replacing the first column of [A] with {b}.

3. Naive Gauss Elimination
The elimination of unknowns can be used to solve a set of simultaneous equations. The procedure consists of two steps:
1. Forward elimination: the equations are manipulated to eliminate unknowns, reducing the entire system to upper triangular form. The result of this elimination step is that the last equation contains only one unknown.
2. Backward substitution: that equation is solved directly, and the result is back-substituted, starting from the last variable, into the remaining equations to solve for the other unknowns.

For a 3 × 3 system, forward elimination transforms the augmented system into upper triangular form:

[ a11 a12 a13 | b1 ]        [ a11 a12  a13  | b1  ]
[ a21 a22 a23 | b2 ]   →    [ 0   a22' a23' | b2' ]
[ a31 a32 a33 | b3 ]        [ 0   0    a33' | b3' ]

Forward elimination proceeds as follows.

To eliminate x_1 (for 2 ≤ i ≤ n):
a_ij ← a_ij − (a_i1/a_11) a_1j,   1 ≤ j ≤ n
b_i  ← b_i  − (a_i1/a_11) b_1

To eliminate x_2 (for 3 ≤ i ≤ n):
a_ij ← a_ij − (a_i2/a_22) a_2j,   2 ≤ j ≤ n
b_i  ← b_i  − (a_i2/a_22) b_2

In general, to eliminate x_k (for k+1 ≤ i ≤ n):
a_ij ← a_ij − (a_ik/a_kk) a_kj,   k ≤ j ≤ n
b_i  ← b_i  − (a_ik/a_kk) b_k

Continue until x_{n−1} has been eliminated.

Backward substitution then proceeds as:

x_n = b_n / a_nn
x_{n−1} = (b_{n−1} − a_{n−1,n} x_n) / a_{n−1,n−1}
x_{n−2} = (b_{n−2} − a_{n−2,n} x_n − a_{n−2,n−1} x_{n−1}) / a_{n−2,n−2}

and, in general,

x_i = (b_i − Σ_{j=i+1}^{n} a_ij x_j) / a_ii
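The elimination and substitution formulas above translate directly into a short routine. The sketch below is illustrative only (Python/NumPy rather than MATLAB); the function name naive_gauss and the 3 × 3 system used are arbitrary choices.

```python
# Minimal sketch (not from the module): naive Gauss elimination with back substitution.
import numpy as np

def naive_gauss(A, b):
    A = np.array(A, dtype=float)
    b = np.array(b, dtype=float)
    n = len(b)
    # Forward elimination: zero out the entries below each pivot a_kk.
    for k in range(n - 1):
        for i in range(k + 1, n):
            factor = A[i, k] / A[k, k]        # fails if a_kk == 0 (hence "naive")
            A[i, k:] -= factor * A[k, k:]
            b[i] -= factor * b[k]
    # Backward substitution.
    x = np.zeros(n)
    for i in range(n - 1, -1, -1):
        x[i] = (b[i] - A[i, i + 1:] @ x[i + 1:]) / A[i, i]
    return x

# Arbitrary 3-by-3 example.
A = [[3, -0.1, -0.2], [0.1, 7, -0.3], [0.3, -0.2, 10]]
b = [7.85, -19.3, 71.4]
print(naive_gauss(A, b))          # approximately [3, -2.5, 7]
```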
4. Pivoting
The primary reason the foregoing technique is called "naive" is that during both the elimination and the back-substitution phases it is possible for a division by zero to occur: normalizing the first row involves division by a_11, which may be zero. Problems also arise when the pivot element is close to, rather than exactly equal to, zero, because if the magnitude of the pivot element is small compared to the other elements, round-off errors can be introduced. Therefore, before each row is normalized, it is advantageous to determine the coefficient with the largest absolute value in the column below the pivot element. The rows can then be switched so that the largest element becomes the pivot element. This is called partial pivoting.

One of the problems of naive Gaussian elimination is that it may fail for very simple cases in which the pivot element is zero, such as

[ 0 1 ] {x1}   {1}
[ 1 1 ] {x2} = {2}

Another problem is that a very small pivot element may cause serious computational errors, as in

[ 10^-10  1 ] {x1}   {1}
[ 1       1 ] {x2} = {2}

Index vectors are used because it is much easier to exchange a single index element than to exchange the values of a complete row; in practical problems with very large n, exchanging the contents of rows may not be practical. A computed solution of Ax = B can be checked by noting that x is a solution if Ax − B = 0. The residual vector is computed by equation 2:

R = Ax − B    (2)

However, due to rounding error, R may not be exactly zero. The solution is therefore accepted if max |r_i| ≤ ε.

5. Tridiagonal Systems
In a tridiagonal system the only non-zero elements lie on the main diagonal, the superdiagonal and the subdiagonal, that is,

a_ij = 0  if  |i − j| > 1

Tridiagonal systems occur in many applications. Selection of pivoting rows is unnecessary (under some conditions), and such systems are efficiently solved by Gaussian elimination. Algorithms to solve tridiagonal systems are based on naive Gaussian elimination and, as in the previous algorithms, consist of a forward elimination step and a backward substitution step. Writing the subdiagonal elements as a_i, the main-diagonal elements as d_i and the superdiagonal elements as c_i, forward elimination updates only the d's and the right-hand side b: the superdiagonal c's are not affected, and all the subdiagonal a's become zero, leaving an upper bidiagonal system with updated d_i' and b_i'.

A matrix A is diagonally dominant if

|a_ii| > Σ_{j=1, j≠i}^{n} |a_ij|    for 1 ≤ i ≤ n

that is, the magnitude of each diagonal element is larger than the sum of the magnitudes of the other elements in the corresponding row.

Examples:

[ 3  0  1 ]                          [ -3  0  1 ]
[ 1  6  1 ]   Diagonally dominant    [  2  3  2 ]   Not diagonally dominant
[ 1  2 -5 ]                          [  1  2  1 ]

A tridiagonal system is diagonally dominant if

|d_i| > |c_i| + |a_{i−1}|    for 1 ≤ i ≤ n

and forward elimination preserves diagonal dominance.

Forward elimination (for 2 ≤ i ≤ n):
d_i ← d_i − (a_{i−1}/d_{i−1}) c_{i−1}
b_i ← b_i − (a_{i−1}/d_{i−1}) b_{i−1}

Backward substitution:
x_n = b_n / d_n
x_i = (b_i − c_i x_{i+1}) / d_i    for i = n−1, n−2, …, 1
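The forward-elimination and back-substitution recurrences above form the classic tridiagonal (Thomas) algorithm. The sketch below is illustrative only (Python rather than MATLAB); the function name thomas and the diagonally dominant example system are arbitrary choices.

```python
# Minimal sketch (not from the module): the tridiagonal (Thomas) algorithm,
# with a = subdiagonal, d = main diagonal, c = superdiagonal, b = right-hand side.
def thomas(a, d, c, b):
    n = len(d)
    d = list(d); b = list(b)
    # Forward elimination: only the d's and b's are updated, the c's are untouched.
    for i in range(1, n):
        factor = a[i - 1] / d[i - 1]
        d[i] -= factor * c[i - 1]
        b[i] -= factor * b[i - 1]
    # Backward substitution.
    x = [0.0] * n
    x[-1] = b[-1] / d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = (b[i] - c[i] * x[i + 1]) / d[i]
    return x

# Arbitrary diagonally dominant example.
print(thomas(a=[-1, -1, -1],            # subdiagonal   (length n-1)
             d=[4, 4, 4, 4],            # main diagonal (length n)
             c=[-1, -1, -1],            # superdiagonal (length n-1)
             b=[5, 5, 5, 5]))
```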
6. Gauss-Jordan Method
The Gauss-Jordan method reduces the general system of equations [A]{X} = {B} to [I]{X} = {B'}, where [I] is the identity matrix. Only forward elimination is performed, and no backward substitution is needed. It has the same problems as naive Gaussian elimination and can likewise be modified to do partial (scaled) pivoting, but it takes about 50% more time than the naive Gaussian method.

7. LU Decomposition
Another way of solving a system of equations is a factorization technique for matrices called LU decomposition. The factorization involves two matrices: a lower triangular matrix [L] and an upper triangular matrix [U] such that

[A] = [L][U]

where [L] has 1 for each entry on its main diagonal. LU factorization methods separate the time-consuming elimination of the matrix [A] from the manipulations of the right-hand side {b}. Once [A] has been factored (decomposed), multiple right-hand-side vectors can be evaluated in an efficient manner.

LU factorization involves two steps, as shown in Figure 3.4:
- Factorization, to decompose the [A] matrix into the product of a lower triangular matrix [L] and an upper triangular matrix [U].
- Substitution, to solve for {x}.
Gauss elimination itself can be implemented using LU factorization.

Figure 3.4: LU factorization steps

Given [A][X] = [C]:
1. Decompose [A] into [L] (by forward elimination) and [U].
2. Solve [L][Z] = [C] for [Z].
3. Solve [U][X] = [Z] for [X].

8. Inverse of a Matrix
The inverse [B] of a square matrix [A] is defined by

[A][B] = [I] = [B][A]

LU decomposition can be used to find the inverse of a matrix. Taking the first column of [B] to be [b11 b21 … bn1]^T and using the definition of matrix multiplication,

[A]{b11, b21, …, bn1}^T = {1, 0, …, 0}^T    (first column of [B])
[A]{b12, b22, …, bn2}^T = {0, 1, …, 0}^T    (second column of [B])

The remaining columns of [B] can be found in the same manner, and the result can be verified by the operation [A][A]^-1 = [I] = [A]^-1[A].

9. Cholesky Factorization/Decomposition
If [A] is a real, symmetric and positive definite matrix, then there exists a unique lower triangular matrix L with positive diagonal elements such that [A] = LL^T = [U]^T[U]. Symmetric systems occur commonly in both mathematical and engineering/science problem contexts, and there are special solution techniques available for them. The Cholesky factorization is one of the most popular of these techniques; it is based on the fact that a symmetric matrix can be decomposed as [A] = [U]^T[U], where T stands for transpose. The rest of the process is similar to LU decomposition and Gauss elimination, except that only one matrix, [U], needs to be stored. Multiplying out the terms and setting them equal to each other shows that the factorization can be generated efficiently by the recurrence relations in equation 3 for the ith row:

u_ii = sqrt( a_ii − Σ_{k=1}^{i−1} u_ki² )    (3a)

u_ij = ( a_ij − Σ_{k=1}^{i−1} u_ki u_kj ) / u_ii    for j = i+1, …, n    (3b)

After obtaining the factorization, it can be used to determine a solution for a right-hand-side vector {b} in a manner similar to LU factorization. First, an intermediate vector {d} is created by solving

[U]^T{d} = {b}

Then the final solution is obtained by solving

[U]{x} = {d}
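The practical payoff of factoring once and substituting many times can be seen with standard library routines. The sketch below is illustrative only (Python with NumPy/SciPy rather than MATLAB), and the symmetric positive definite matrix used is an arbitrary example.

```python
# Minimal sketch (not from the module): reusing one LU factorization for several
# right-hand sides, plus a Cholesky factorization of a symmetric positive definite matrix.
import numpy as np
from scipy.linalg import lu_factor, lu_solve

A = np.array([[4., -1., 0.],
              [-1., 4., -1.],
              [0., -1., 4.]])            # symmetric and positive definite

lu, piv = lu_factor(A)                    # factor [A] once (with partial pivoting)
for b in ([1., 0., 0.], [0., 1., 0.]):    # ...then solve cheaply for several {b}'s
    x = lu_solve((lu, piv), b)
    print(x, np.allclose(A @ x, b))       # each solve only performs the substitutions

L = np.linalg.cholesky(A)                 # lower triangular L with A = L L^T
print(np.allclose(L @ L.T, A))            # True
```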
Gauss-Seidel Method
The Gauss-Seidel method is used to solve sets of linear equations by iteration (successive substitution). Its basic procedure is as follows:
- Algebraically solve each linear equation for x_i.
- Assume an initial guess for the solution array.
- Solve for each x_i and repeat.
- Use the absolute relative approximate error after each iteration to check whether the error is within a pre-specified tolerance.

Consider a set of n equations in n unknowns:

a_11 x_1 + a_12 x_2 + a_13 x_3 + … + a_1n x_n = c_1
a_21 x_1 + a_22 x_2 + a_23 x_3 + … + a_2n x_n = c_2
⋮
a_n1 x_1 + a_n2 x_2 + a_n3 x_3 + … + a_nn x_n = c_n

If the diagonal elements are non-zero, each equation is rewritten to solve for the corresponding unknown; for example, the first equation is solved for x_1, the second for x_2, and so on:

x_1 = (c_1 − a_12 x_2 − a_13 x_3 − … − a_1n x_n) / a_11
x_2 = (c_2 − a_21 x_1 − a_23 x_3 − … − a_2n x_n) / a_22
⋮
x_{n−1} = (c_{n−1} − a_{n−1,1} x_1 − a_{n−1,2} x_2 − … − a_{n−1,n−2} x_{n−2} − a_{n−1,n} x_n) / a_{n−1,n−1}
x_n = (c_n − a_n1 x_1 − a_n2 x_2 − … − a_{n,n−1} x_{n−1}) / a_nn

In general, for any row i,

x_i = ( c_i − Σ_{j=1, j≠i}^{n} a_ij x_j ) / a_ii

Solving for the unknowns involves assuming an initial guess for the solution vector [x_1 x_2 … x_n]^T and then using the rewritten equations to solve for each value of x_i. It is important to use the most recent value of each x_i, which means applying values calculated within the current iteration to the calculations remaining in that same iteration.

The absolute relative approximate error for each unknown is computed by equation 4:

|ε_a,i| = | (x_i^new − x_i^old) / x_i^new | × 100    (4)

The iterations are stopped when the absolute relative approximate error is less than a pre-specified tolerance for all unknowns.

A pitfall of the Gauss-Seidel method is that not all systems of equations converge. One class of systems always converges: those with a diagonally dominant coefficient matrix. [A] in [A][X] = [C] is diagonally dominant if the following conditions are met:

|a_ii| ≥ Σ_{j=1, j≠i}^{n} |a_ij|  for all i    and    |a_ii| > Σ_{j=1, j≠i}^{n} |a_ij|  for at least one i

Jacobi Iteration
Jacobi iteration is a somewhat different approach to solving linear equations iteratively. Rather than using the latest available x's, this technique computes a complete set of new x's per iteration on the basis of the set of old x's. Thus, as new values are generated, they are not immediately used but are retained for the next iteration. The difference between the Gauss-Seidel method and Jacobi iteration is summarized in Figure 3.5.

Figure 3.5: Fundamental difference between (a) the Gauss-Seidel method and (b) Jacobi iteration
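The Gauss-Seidel procedure just listed is sketched below. It is illustrative only (Python/NumPy rather than MATLAB); the function name gauss_seidel and the diagonally dominant example system are arbitrary choices. Computing the new x's from a saved copy of the old x's instead of the in-place values would turn this into Jacobi iteration.

```python
# Minimal sketch (not from the module): Gauss-Seidel iteration with the
# absolute relative approximate error of equation (4) as the stopping test.
import numpy as np

def gauss_seidel(A, c, tol_percent=1.0, max_iter=100):
    A = np.array(A, dtype=float)
    c = np.array(c, dtype=float)
    n = len(c)
    x = np.zeros(n)                        # initial guess of all zeros
    for _ in range(max_iter):
        x_old = x.copy()
        for i in range(n):
            # The most recent values of x are used as soon as they are available.
            s = A[i, :i] @ x[:i] + A[i, i + 1:] @ x[i + 1:]
            x[i] = (c[i] - s) / A[i, i]
        ea = np.max(np.abs((x - x_old) / x)) * 100
        if ea < tol_percent:
            break
    return x

# Arbitrary diagonally dominant example (exact solution [1, 3, 4]).
A = [[12, 3, -5], [1, 5, 3], [3, 7, 13]]
c = [1, 28, 76]
print(gauss_seidel(A, c))
```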
To learn more about linear systems, consult the following resources:

Watch:
Solving linear systems by substitution | Algebra Basics | Khan Academy (https://www.youtube.com/watch?v=V7H1oUHXPkg)
Gaussian elimination | Lecture 10 | Matrix Algebra for Engineers | Jeffrey Chasnov (https://www.youtube.com/watch?v=RgnWMBpQPXk)
Numerical Solutions of Linear Systems - Cholesky Decomposition/Factorisation | The Math Guy (https://www.youtube.com/watch?v=CzuWrFVZelY)
Newton's method for solving nonlinear systems of Algebraic equations | The Math Guy (https://www.youtube.com/watch?v=zPDp_ewoyhM)

Read:
Applied Numerical Methods with MATLAB for Engineers and Scientists, Chapters 8, 9, 10, 11 and 12, by S.C. Chapra
Applied Numerical Methods using MATLAB, Chapter 2, by Won Young Yang, Wenwu Cao, Tae-Sang Chung and John Morris

Module 4: CURVE FITTING

This module covers the curve-fitting methods used to describe discrete values along a continuum and their application in engineering and science.

Overview:
When the data are known to be precise, the basic approach is to fit a curve or a series of curves that pass directly through each of the points. Curve fitting is the process of constructing a curve, or mathematical function, that has the closest proximity to a series of data. Although many widely used engineering and scientific properties have been tabulated, a great many more are not available in tables. Special cases and new problem contexts often require that you measure your own data and develop your own predictive relationships. Attempts to find the best-fit curve through a set of data are shown in Figure 4.1.

Figure 4.1: Three attempts to fit a curve through the same data: (a) least-squares regression, (b) linear interpolation, (c) curvilinear interpolation

The types of application generally encountered when fitting a set of experimental data are:
a. Trend analysis — using the pattern of the data to make predictions, such as weather and load forecasting.
b. Hypothesis testing — comparing an existing mathematical model with measured data.

Module Objectives:
After successful completion of this module, you should be able to:
- Compute the slope and intercept of a best-fit straight line with linear regression.
- Implement polynomial and multiple linear regression.
- Formulate the general linear least-squares model.
- Evaluate polynomial coefficients with simultaneous equations, which is an ill-conditioned problem.
- Perform an interpolation with Newton's and Lagrange polynomials.
- Solve an inverse interpolation problem.

Course Materials:
Before going into regression analysis, we first review basic concepts of its mathematical background, most especially in statistics.

The arithmetic mean is the sum of the individual data points (y_i) divided by the number of points (n) and is a measure of central tendency, as shown in equation 1:

ȳ = Σ y_i / n,    i = 1, …, n    (1)

The standard deviation is the most common measure of spread for a sample, as shown in equation 2:

S_y = sqrt( S_t / (n − 1) ),    where S_t = Σ (y_i − ȳ)²    (2)

The variance represents the spread as the square of the standard deviation, as shown in equation 3:

S_y² = Σ (y_i − ȳ)² / (n − 1) = ( Σ y_i² − (Σ y_i)²/n ) / (n − 1)    (3)

The coefficient of variation quantifies the spread of the data relative to its mean, as shown in equation 4:

c.v. = (S_y / ȳ) × 100    (4)

Linear Least-Squares Regression
One best-fit curve-fitting strategy is to derive an approximating function that fits the shape or general trend of the data without necessarily matching the individual points. A naive way to do this is to visually inspect the plotted data and then sketch a "best" line through the points. Unless the points define a perfect straight line, however, different analysts will draw different lines, and to remove this subjectivity a criterion is needed for quantifying the discrepancy between the data points and the curve. The simplest case is fitting a straight line to a set of paired observations (x_1, y_1), (x_2, y_2), …, (x_n, y_n).
The mathematical expression for the straight line is given by equation 5:

y = a_0 + a_1 x + e    (5)

where a_0 and a_1 are the coefficients representing the intercept and the slope, respectively, and e is the error or residual. Equation 5 can be rearranged to express the residual, as in equation 6:

e = y − a_0 − a_1 x    (6)

Thus, to find the "best" line through a set of data, one might try to minimize the sum of the residual errors for all available data, as in equation 7:

Σ_{i=1}^{n} e_i = Σ_{i=1}^{n} (y_i − a_0 − a_1 x_i)    (7)

where n is the total number of points. To minimize the sum of the absolute values of the discrepancies instead, equation 7 is modified to equation 8:

Σ_{i=1}^{n} |e_i| = Σ_{i=1}^{n} |y_i − a_0 − a_1 x_i|    (8)

Figure 4.2 shows why these criteria for a "best fit" are inadequate for regression.

Figure 4.2: Criteria for "best fit" that are inadequate for regression: (a) minimizing the sum of the residuals, (b) minimizing the sum of the absolute values of the residuals, (c) minimizing the maximum error of any individual point

The minimax criterion — choosing the line that minimizes the maximum distance that any individual point falls from the line — is likewise ill-suited for regression because it gives undue influence to an outlier (as shown in Figure 4.2(c)), i.e., a single point in the data set with a large error.

A strategy that overcomes these shortcomings is to minimize the sum of the squares of the residuals, S_r = Σ(y_i − a_0 − a_1 x_i)². Setting the partial derivatives of S_r with respect to a_0 and a_1 equal to zero yields a pair of equations, called the normal equations, which can be solved simultaneously for the slope, as shown in equation 9:

a_1 = ( n Σ x_i y_i − Σ x_i Σ y_i ) / ( n Σ x_i² − (Σ x_i)² )    (9)

and then for the intercept, as shown in equation 10:

a_0 = ȳ − a_1 x̄    (10)

where ȳ and x̄ are the means of y and x, respectively.

Recall the formula used in the standard deviation and compare it with the least-squares sum, which has a striking similarity:

S_t = Σ_{i=1}^{n} (y_i − ȳ)²        S_r = Σ (y_i − a_0 − a_1 x_i)²

To quantify the error of linear regression, a number of additional properties of the fit can be elucidated by examining more closely how the residuals are computed. For the standard deviation, the square of the residual represents the square of the discrepancy between the data and a single estimate of central tendency, the mean; in least squares, the square of the residual represents the square of the vertical distance between the data and another measure of central tendency, the straight line, as shown in Figure 4.3.

Figure 4.3: Graphical representation of the residuals in linear regression

This analogy can be extended further for cases where:
1. The spread of the points around the line is of similar magnitude along the entire range of the data.
2. The distribution of these points about the line is normal.
If these criteria are met, least-squares regression provides the best (most likely) estimates of a_0 and a_1, a property called the maximum likelihood principle (Draper and Smith, 1981). The "standard deviation" for the regression line, called the standard error of the estimate, is given by equation 11:

s_{y/x} = sqrt( S_r / (n − 2) )    (11)

where the subscript notation y/x designates that the error is for a predicted value of y corresponding to a particular value of x, and n must be greater than 2, because with only two points the line passes exactly through the data and there is no spread to estimate.
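The slope, intercept and standard-error formulas above can be evaluated directly from the sums over the data. The sketch below is illustrative only (Python rather than MATLAB), and the data values used are arbitrary.

```python
# Minimal sketch (not from the module): straight-line least squares using
# equations (9), (10) and (11).
import math

x = [1, 2, 3, 4, 5, 6, 7]
y = [0.5, 2.5, 2.0, 4.0, 3.5, 6.0, 5.5]
n = len(x)

sx, sy = sum(x), sum(y)
sxy = sum(xi * yi for xi, yi in zip(x, y))
sxx = sum(xi * xi for xi in x)

a1 = (n * sxy - sx * sy) / (n * sxx - sx ** 2)   # slope, equation (9)
a0 = sy / n - a1 * (sx / n)                      # intercept, equation (10)

Sr = sum((yi - a0 - a1 * xi) ** 2 for xi, yi in zip(x, y))
syx = math.sqrt(Sr / (n - 2))                    # standard error, equation (11)

print(a0, a1, syx)
```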
Figure 4.4 compares the residual errors of two linear regressions; the fit in (a) has smaller residual errors (a better fit) than the fit in (b).

Figure 4.4: Graphical representation of linear regression with (a) small residual errors and (b) large residual errors

The sum of the squares of the residuals, S_r, quantifies the error that remains after the regression (the unexplained sum of squares). The difference S_t − S_r therefore quantifies the improvement, or error reduction, obtained by describing the data with a straight line rather than with the mean alone; normalizing it by S_t gives the coefficient of determination of equation 12:

r² = (S_t − S_r) / S_t    (12)

where r² is the coefficient of determination and r is the correlation coefficient. An alternative formula for r, written directly in terms of the sums used for the standard deviation and the least-squares fit, is equation 13:

r = ( n Σ x_i y_i − (Σ x_i)(Σ y_i) ) / ( sqrt( n Σ x_i² − (Σ x_i)² ) · sqrt( n Σ y_i² − (Σ y_i)² ) )    (13)

Polynomial Regression
The least-squares procedure can also be used to fit the data with a higher-order polynomial, for example a quadratic:

y = a_0 + a_1 x + a_2 x² + e

In this case, the sum of the squares of the residuals is given by equation 14:

S_r = Σ_{i=1}^{n} ( y_i − a_0 − a_1 x_i − a_2 x_i² )²    (14)

The two-dimensional case can easily be extended to an mth-order polynomial, whose general form is equation 15:

y = a_0 + a_1 x + a_2 x² + … + a_m x^m + e    (15)

Recognizing that determining the coefficients of an mth-order polynomial is equivalent to solving a system of m + 1 simultaneous linear equations, the standard error is given by equation 16:

s_{y/x} = sqrt( S_r / (n − (m + 1)) )    (16)

where n is the number of points and m is the order of the polynomial.

Setting the partial derivatives of S_r equal to zero and rearranging yields a set of simultaneous linear equations in the three unknowns a_0, a_1 and a_2. For the case of two independent variables x_1 and x_2 (multiple linear regression, y = a_0 + a_1 x_1 + a_2 x_2 + e), these normal equations are:

(n) a_0 + (Σ x_{1,i}) a_1 + (Σ x_{2,i}) a_2 = Σ y_i
(Σ x_{1,i}) a_0 + (Σ x_{1,i}²) a_1 + (Σ x_{1,i} x_{2,i}) a_2 = Σ x_{1,i} y_i
(Σ x_{2,i}) a_0 + (Σ x_{1,i} x_{2,i}) a_1 + (Σ x_{2,i}²) a_2 = Σ x_{2,i} y_i

From the previous discussion, simple linear, polynomial and multiple linear regression all belong to the general linear least-squares model shown in equation 18:

y = a_0 z_0 + a_1 z_1 + a_2 z_2 + … + a_m z_m + e    (18)

where z_0, z_1, …, z_m are m + 1 basis functions. The model is "linear" only in the sense of its dependence on its parameters, the a's; the basis functions themselves may be highly nonlinear. Equation 18 can be expressed in matrix notation as equation 19:

{y} = [Z]{a} + {e}    (19)

where [Z] is the matrix of the calculated values of the basis functions at the measured values of the independent variables, m is the number of basis functions (variables) and n is the number of data points. The column vector {y} contains the observed values of the dependent variable,

{y}^T = [y_1  y_2  …  y_n]

the column vector {a} contains the unknown coefficients,

{a}^T = [a_0  a_1  …  a_m]

and the column vector {e} contains the residuals,

{e}^T = [e_1  e_2  …  e_n]

The sum of the squares of the residuals is defined in equation 20:

S_r = Σ_{i=1}^{n} ( y_i − Σ_{j=0}^{m} a_j z_{ij} )²    (20)

Minimizing this quantity leads to the normal equations, which can be generalized in matrix form as equation 21:

[[Z]^T [Z]] {a} = {[Z]^T {y}}    (21)

The coefficient of determination and the standard error can also be formulated in terms of matrix algebra. From equation 12,

r² = (S_t − S_r) / S_t = 1 − S_r/S_t

Substituting the definitions of S_r and S_t yields

r² = 1 − Σ(y_i − ŷ_i)² / Σ(y_i − ȳ)²

where ŷ_i is the prediction of the least-squares fit. The residuals can thus be expressed in vector form as

{e} = {y} − [Z]{a}
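The matrix form of the normal equations makes higher-order fits mechanical: build [Z] from the basis functions and solve equation 21. The sketch below is illustrative only (Python/NumPy rather than MATLAB) and fits a quadratic to arbitrary data.

```python
# Minimal sketch (not from the module): quadratic regression via the general
# linear least-squares normal equations [Z^T Z]{a} = [Z^T]{y}, equation (21).
import numpy as np

x = np.array([0., 1., 2., 3., 4., 5.])
y = np.array([2.1, 7.7, 13.6, 27.2, 40.9, 61.1])

# Basis functions z0 = 1, z1 = x, z2 = x^2 (one column of [Z] per basis function).
Z = np.column_stack([np.ones_like(x), x, x**2])

a = np.linalg.solve(Z.T @ Z, Z.T @ y)      # coefficients a0, a1, a2
y_hat = Z @ a

Sr = np.sum((y - y_hat)**2)
St = np.sum((y - y.mean())**2)
r2 = 1 - Sr / St                           # coefficient of determination
syx = np.sqrt(Sr / (len(x) - 3))           # standard error, n - (m + 1) with m = 2

print(a, r2, syx)
```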
Polynomial Interpolation
Polynomial interpolation is a technique to estimate intermediate values between precise data points. The general formula for an (n−1)th-order polynomial can be written as equation 23:

f(x) = a_1 + a_2 x + a_3 x² + … + a_n x^{n−1}    (23)

For n data points, there is one and only one polynomial of order (n−1) that passes through all the points. This polynomial then provides a formula for computing intermediate values, as shown in Figure 4.5.

Figure 4.5: Examples of interpolating polynomials: (a) first order (linear), (b) second order (quadratic), (c) third order

Polynomial interpolation consists of determining the unique (n−1)th-order polynomial that fits the n data points. To be consistent in the ordering of the powers of x, equation 24 writes the same polynomial with decreasing powers:

f(x) = p_1 x^{n−1} + p_2 x^{n−2} + … + p_{n−1} x + p_n    (24)

Although this approach — solving a linear system for the coefficients — provides an easy way to perform interpolation, it has a serious deficiency: the coefficient matrix, whose last column is all ones, is a Vandermonde matrix, which is ill-conditioned and prone to round-off errors, and therefore inaccurate when applied to larger systems of simultaneous equations. Newton and Lagrange interpolating polynomials mitigate this shortcoming because they construct the interpolating polynomial directly, without solving such a system.

Newton Interpolating Polynomial
The simplest form of interpolation is to connect two data points with a straight line, as shown in Figure 4.6; this is called linear interpolation.

Figure 4.6: Graphical representation of linear interpolation

From Figure 4.6, using similar triangles,

( f_1(x) − f(x_1) ) / (x − x_1) = ( f(x_2) − f(x_1) ) / (x_2 − x_1)

Rearranging yields equation 25:

f_1(x) = f(x_1) + ( ( f(x_2) − f(x_1) ) / (x_2 − x_1) ) (x − x_1)    (25)

Equation 25 is the Newton linear-interpolation formula. The notation f_1(x) indicates a first-order interpolating polynomial, and the term (f(x_2) − f(x_1)) / (x_2 − x_1) is a finite-divided-difference approximation of the first derivative. The smaller the interval between the data points, the better the approximation.

If three data points are available, interpolation can be accomplished with a second-order polynomial (also called a quadratic polynomial or a parabola). A particularly convenient form for this purpose is:

f_2(x) = b_1 + b_2 (x − x_1) + b_3 (x − x_1)(x − x_2)

To determine the coefficients, substitute x = x_1, x = x_2 and x = x_3 in succession into this equation, which yields:
b_1 = f(x_1)

b_2 = ( f(x_2) − f(x_1) ) / (x_2 − x_1)

b_3 = ( ( f(x_3) − f(x_2) ) / (x_3 − x_2) − ( f(x_2) − f(x_1) ) / (x_2 − x_1) ) / (x_3 − x_1)

These equations begin to manifest a structure similar to the Taylor series expansion, in which terms are added sequentially to capture increasingly higher-order curvature. The foregoing analysis can be generalized to an (n−1)th-order polynomial, as given in equation 27:

f_{n−1}(x) = b_1 + b_2 (x − x_1) + … + b_n (x − x_1)(x − x_2) … (x − x_{n−1})    (27)

The data points are used to determine the coefficients of this general form, as given by equation 28:

b_1 = f(x_1)
b_2 = f[x_2, x_1]
b_3 = f[x_3, x_2, x_1]
⋮
b_n = f[x_n, x_{n−1}, …, x_2, x_1]    (28)

The bracketed function evaluations are finite divided differences (represented graphically in Figure 4.7), which can be generalized as in equations 29, 30 and 31 for the first, second and nth divided differences, respectively:

f[x_i, x_j] = ( f(x_i) − f(x_j) ) / (x_i − x_j)    (29)

f[x_i, x_j, x_k] = ( f[x_i, x_j] − f[x_j, x_k] ) / (x_i − x_k)    (30)

f[x_n, x_{n−1}, …, x_2, x_1] = ( f[x_n, x_{n−1}, …, x_2] − f[x_{n−1}, x_{n−2}, …, x_1] ) / (x_n − x_1)    (31)

Figure 4.7: Graphical representation of finite divided differences, showing their recursive nature

Using the general coefficient formulas and the bracketed divided differences (equations 29 to 31) in the general (n−1)th-order polynomial (equation 27) yields the general form of Newton's interpolating polynomial, shown in equation 32:

f_{n−1}(x) = f(x_1) + (x − x_1) f[x_2, x_1] + (x − x_1)(x − x_2) f[x_3, x_2, x_1] + … + (x − x_1)(x − x_2) … (x − x_{n−1}) f[x_n, x_{n−1}, …, x_2, x_1]    (32)

Lagrange Interpolating Polynomial
A linear interpolating polynomial can be written as the weighted average of the two values that are connected by a straight line, as shown in equation 33:

f_1(x) = L_1 f(x_1) + L_2 f(x_2)    (33)

where the L's are weighting coefficients. The first weighting coefficient is the straight line that is equal to 1 at x_1 and 0 at x_2,

L_1 = (x − x_2) / (x_1 − x_2)

and the second is the straight line that is equal to 1 at x_2 and 0 at x_1,

L_2 = (x − x_1) / (x_2 − x_1)

Substituting these coefficients into equation 33 yields equation 34:

f_1(x) = ( (x − x_2) / (x_1 − x_2) ) f(x_1) + ( (x − x_1) / (x_2 − x_1) ) f(x_2)    (34)

Equation 34 is referred to as the linear (first-order) Lagrange interpolating polynomial, as represented graphically in Figure 4.8. The underlying strategy is the same as that of Newton's interpolating polynomial. For example, three parabolas can be fit through three points, each passing through one of the points and equaling zero at the other two; their sum then represents the unique parabola that connects the three points, so a second-order Lagrange interpolating polynomial can be written as:

f_2(x) = ( (x − x_2)(x − x_3) / ((x_1 − x_2)(x_1 − x_3)) ) f(x_1) + ( (x − x_1)(x − x_3) / ((x_2 − x_1)(x_2 − x_3)) ) f(x_2) + ( (x − x_1)(x − x_2) / ((x_3 − x_1)(x_3 − x_2)) ) f(x_3)

Higher-order Lagrange polynomials can be written in the general form of equation 35:

f_{n−1}(x) = Σ_{i=1}^{n} L_i(x) f(x_i)    (35)

where

L_i(x) = ∏_{j=1, j≠i}^{n} (x − x_j) / (x_i − x_j)

and n is the number of data points.

Figure 4.8: Graphical representation of the first-order Lagrange interpolating polynomial
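Equation 35 can be evaluated directly without ever forming the polynomial's coefficients. The sketch below is illustrative only (Python rather than MATLAB); the function name lagrange_interp and the logarithm example are arbitrary choices.

```python
# Minimal sketch (not from the module): Lagrange interpolating polynomial, equation (35).
import math

def lagrange_interp(xs, ys, x):
    """Evaluate the (n-1)th-order Lagrange polynomial through (xs, ys) at x."""
    n = len(xs)
    total = 0.0
    for i in range(n):
        L = 1.0
        for j in range(n):
            if j != i:
                L *= (x - xs[j]) / (xs[i] - xs[j])   # weighting coefficient L_i(x)
        total += L * ys[i]
    return total

# Arbitrary example: estimate ln(2) from the known values ln(1), ln(4) and ln(6).
xs = [1.0, 4.0, 6.0]
ys = [math.log(v) for v in xs]
print(lagrange_interp(xs, ys, 2.0), math.log(2.0))   # interpolated vs. true value
```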
To learn more about curve fitting, consult the following resources:

Watch:
Part 5: Numerical Methods: Curve Fitting | Jacob Bishop (https://www.youtube.com/watch?v=yQsDxOdn1hk&list=PLYdroRCLMg5NTT00m-7ituVGdtY1X680M)
Regression: Crash Course Statistics #32 | CrashCourse (https://www.youtube.com/watch?v=WWqE7YHR4Jc)
Statistics 101: Linear Regression, The Very Basics | Brandon Foltz (https://www.youtube.com/watch?v=ZkjP5RJLQF4)

Read:
Applied Numerical Methods with MATLAB for Engineers and Scientists, Chapters 14, 15 and 17, by S.C. Chapra
Applied Numerical Methods using MATLAB, Chapter 3, by Won Young Yang, Wenwu Cao, Tae-Sang Chung and John Morris

Activities:

Instruction: Please accomplish the following problems, with your corresponding solutions and final answers, on yellow pad paper with your Name, Section and Subject for each module/topic, and send it through the respective courier (Air21/LBC) who sent you this module. If you have problems or concerns regarding this activity, contact me at my cellphone number 09085212086. Please write legibly.

Approximation and Rounding Errors:

A. Determine how many significant digits each figure has:
a) 31.20002
b) 2014.2300
c) 2003 x 10^4
d) 201.003 x 10^3
e) 20.003 x 10^5

B. Given a circle with an area of 96 square inches, determine its radius using an initial estimate of 4.5 inches.

C. Essay: Based on your learning from your previous subjects, how can you apply the concepts of accuracy, bias and precision in the field of electrical engineering?

Roots and Optimization:

A. Problem Solving (Bracketing Methods): Solve the following problems using the bisection method and the false position method. Use the format below to tabulate your answers:

N | Xl | Xu | Xr | Ea | Et | f(xl)f(xr)

a. Determine the roots of 12 + 21x = 18x² − 2.75x³ with initial guesses of −1 and 0 and a stopping criterion of 1%.
b. Locate the first nontrivial root of sin(x) = x², where x is in radians, with the initial interval from 0.5 to 1, and perform the computation until the error is less than 0.02%.
c. Solve the roots of ln(x²) + ln(x³) + ln(x⁴) = 1 using initial values of xl = 0.2 and xu = 4 at an error criterion of 0.005%.

B. Problem Solving (Open Methods): Solve the following problems using the Newton-Raphson method and the modified secant method. Use the formats below to tabulate your answers for Newton-Raphson and modified secant, respectively:

n | Xi | f(xi) | f'(xi) | Ea | Et | Xi+1
n | Xi | f(xi) | Xi+δXi | f(xi+δxi) | Ea | Et | Xi+1

a. Employ the Newton-Raphson method to determine a real root of f(x) = −2 + 6x − 4x² + 0.5x³, using an initial guess of (a) 4.5 and (b) 4.43.
b. A parallel circuit consisting of R, L and C has an impedance given by

1/Z = sqrt( (1/R)² + (wC − 1/(wL))² )
The following system of equations is designed to determine concentrations (the c.s in g/m3), c1, c2 and c3, in a series of coupled reactors as a function of the amount of mass input to each reactor (the right-hand sides in g/day): 15c1 − 3c2 − c3 = 3800 −3c1 + 18c2 − 6c3 = 1200 −4c1 − c2 + 12c3 = 2350 56 a. Determine the value of the concentrations using Gauss Seidel method and using 0 as initial values until the absolute relative error reaches less than 1% or it reaches 10th iteration which comes first. b. Determine the value of the concentrations using Jacobi iteration and using 0 as initial values until the absolute relative error reaches less than 1% or it reaches 10th iteration whichever comes first. Curve fitting: 1. Given the data Determine (a) mean (b) median (c) mode (d) range (e) standard deviation (f) variance (g) coefficient of variation 2. Use the least squares regression to fit a straight line to x 0 2 4 6 9 11 12 15 17 19 y 5 6 7 6 9 8 8 10 12 11 Along with the slope and intercept, compute the standard error of the estimate and the correlation coefficient. Plot the data and the regression line. 3. The following data were gathered to determine the relationship between pressure and temperature of a fixed volume of 1 kg nitrogen. The volume is 10 m3. T, C0 -40 0 40 80 120 160 P, N/m2 6900 8100 9350 10,500 11,700 12,800 57 Employ the ideal gas law pV=nRT to determine R on the basis of data. Note that for the law, T must be expressed in kelvins. 4. Fit a cubic polynomial to the following data: x 3 4 5 7 8 9 11 12 y 1.6 3.6 4.4 3.4 2.2 2.8 3.8 4.6 Determine the coefficients, r2 and sy/x. 5. Three disease-carrying organisms decay exponentially in seawater according to the following model: 𝑝(𝑡) = 𝐴𝑒 −1.5𝑡 + 𝐵𝑒 −0.3𝑡 + 𝐶𝑒 −0.05𝑡 Estimate the initial concentration (at t=0) of each organism (A,B and C) using multiple linear regression given the following measurements (Hint: Linearize the said equation first before proceeding using the transformation at Figure 4.5) t 0.5 1 2 3 4 5 6 7 9 p(t) 6 4.4 3.2 2.7 2 1.9 1.7 1.4 1.1 58
