Least-Squares Regression


CHAPTER 17
Least-Squares Regression

Where substantial error is associated with data, polynomial interpolation is inappropriate and may yield unsatisfactory results when used to predict intermediate values. Experimental data are often of this type. For example, Fig. 17.1a shows seven experimentally derived data points exhibiting significant variability. Visual inspection of these data suggests a positive relationship between y and x. That is, the overall trend indicates that higher values of y are associated with higher values of x. Now, if a sixth-order interpolating polynomial is fitted to these data (Fig. 17.1b), it will pass exactly through all of the points. However, because of the variability in these data, the curve oscillates widely in the interval between the points. In particular, the interpolated values at x = 1.5 and x = 6.5 appear to be well beyond the range suggested by these data.

A more appropriate strategy for such cases is to derive an approximating function that fits the shape or general trend of the data without necessarily matching the individual points. Figure 17.1c illustrates how a straight line can be used to generally characterize the trend of these data without passing through any particular point.

One way to determine the line in Fig. 17.1c is to visually inspect the plotted data and then sketch a "best" line through the points. Although such "eyeball" approaches have commonsense appeal and are valid for "back-of-the-envelope" calculations, they are deficient because they are arbitrary. That is, unless the points define a perfect straight line (in which case, interpolation would be appropriate), different analysts would draw different lines.

To remove this subjectivity, some criterion must be devised to establish a basis for the fit. One way to do this is to derive a curve that minimizes the discrepancy between the data points and the curve. A technique for accomplishing this objective, called least-squares regression, will be discussed in the present chapter.

17.1 LINEAR REGRESSION

The simplest example of a least-squares approximation is fitting a straight line to a set of paired observations: (x_1, y_1), (x_2, y_2), . . . , (x_n, y_n). The mathematical expression for the straight line is

    y = a_0 + a_1 x + e    (17.1)

where a_0 and a_1 are coefficients representing the intercept and the slope, respectively, and e is the error, or residual, between the model and the observations, which can be represented by rearranging Eq. (17.1) as

    e = y - a_0 - a_1 x

Thus, the error, or residual, is the discrepancy between the true value of y and the approximate value, a_0 + a_1 x, predicted by the linear equation.

FIGURE 17.1  (a) Data exhibiting significant error. (b) Polynomial fit oscillating beyond the range of the data. (c) More satisfactory result using the least-squares fit.

17.1.1 Criteria for a "Best" Fit

One strategy for fitting a "best" line through the data would be to minimize the sum of the residual errors for all the available data, as in

    \sum_{i=1}^{n} e_i = \sum_{i=1}^{n} (y_i - a_0 - a_1 x_i)    (17.2)

where n = total number of points. However, this is an inadequate criterion, as illustrated by Fig. 17.2a, which depicts the fit of a straight line to two points.

FIGURE 17.2  Examples of some criteria for "best fit" that are inadequate for regression: (a) minimizes the sum of the residuals, (b) minimizes the sum of the absolute values of the residuals, and (c) minimizes the maximum error of any individual point.

Obviously, the best fit is the line connecting the points. However, any straight line passing through the midpoint of the connecting line (except a perfectly vertical line) results in a minimum value of Eq. (17.2) equal to zero because the errors cancel.

Therefore, another logical criterion might be to minimize the sum of the absolute values of the discrepancies, as in

    \sum_{i=1}^{n} |e_i| = \sum_{i=1}^{n} |y_i - a_0 - a_1 x_i|

Figure 17.2b demonstrates why this criterion is also inadequate. For the four points shown, any straight line falling within the dashed lines will minimize the sum of the absolute values. Thus, this criterion also does not yield a unique best fit.

A third strategy for fitting a best line is the minimax criterion. In this technique, the line is chosen that minimizes the maximum distance that an individual point falls from the line. As depicted in Fig. 17.2c, this strategy is ill-suited for regression because it gives undue influence to an outlier, that is, a single point with a large error. It should be noted that the minimax principle is sometimes well-suited for fitting a simple function to a complicated function (Carnahan, Luther, and Wilkes, 1969).

A strategy that overcomes the shortcomings of the aforementioned approaches is to minimize the sum of the squares of the residuals between the measured y and the y calculated with the linear model

    S_r = \sum_{i=1}^{n} e_i^2 = \sum_{i=1}^{n} (y_{i,measured} - y_{i,model})^2 = \sum_{i=1}^{n} (y_i - a_0 - a_1 x_i)^2    (17.3)

This criterion has a number of advantages, including the fact that it yields a unique line for a given set of data. Before discussing these properties, we will present a technique for determining the values of a_0 and a_1 that minimize Eq. (17.3).

17.1.2 Least-Squares Fit of a Straight Line

To determine values for a_0 and a_1, Eq. (17.3) is differentiated with respect to each coefficient:

    \partial S_r / \partial a_0 = -2 \sum (y_i - a_0 - a_1 x_i)
    \partial S_r / \partial a_1 = -2 \sum [(y_i - a_0 - a_1 x_i) x_i]

Note that we have simplified the summation symbols; unless otherwise indicated, all summations are from i = 1 to n. Setting these derivatives equal to zero will result in a minimum S_r. If this is done, the equations can be expressed as

    0 = \sum y_i - \sum a_0 - \sum a_1 x_i
    0 = \sum y_i x_i - \sum a_0 x_i - \sum a_1 x_i^2

Now, realizing that \sum a_0 = n a_0, we can express the equations as a set of two simultaneous linear equations with two unknowns (a_0 and a_1):

    n a_0 + (\sum x_i) a_1 = \sum y_i    (17.4)
    (\sum x_i) a_0 + (\sum x_i^2) a_1 = \sum x_i y_i    (17.5)

These are called the normal equations. They can be solved simultaneously

    a_1 = \frac{n \sum x_i y_i - \sum x_i \sum y_i}{n \sum x_i^2 - (\sum x_i)^2}    (17.6)

This result can then be used in conjunction with Eq. (17.4) to solve for

    a_0 = \bar{y} - a_1 \bar{x}    (17.7)

where \bar{y} and \bar{x} are the means of y and x, respectively.

EXAMPLE 17.1  Linear Regression

Problem Statement. Fit a straight line to the x and y values in the first two columns of Table 17.1.

Solution. The following quantities can be computed:

    n = 7        \sum x_i y_i = 119.5        \sum x_i^2 = 140
    \sum x_i = 28        \bar{x} = 28/7 = 4
    \sum y_i = 24        \bar{y} = 24/7 = 3.428571

Using Eqs. (17.6) and (17.7),

    a_1 = \frac{7(119.5) - 28(24)}{7(140) - (28)^2} = 0.8392857
    a_0 = 3.428571 - 0.8392857(4) = 0.07142857

Therefore, the least-squares fit is

    y = 0.07142857 + 0.8392857 x

The line, along with the data, is shown in Fig. 17.1c.

TABLE 17.1 Computations for an error analysis of the linear fit.

    x_i    y_i    (y_i - \bar{y})^2    (y_i - a_0 - a_1 x_i)^2
    1      0.5         8.5765                0.1687
    2      2.5         0.8622                0.5625
    3      2.0         2.0408                0.3473
    4      4.0         0.3265                0.3265
    5      3.5         0.0051                0.5896
    6      6.0         6.6122                0.7972
    7      5.5         4.2908                0.1993
    Sums   28   24    22.7143                2.9911
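The arithmetic in Example 17.1 is easy to script. The short sketch below is in Python, which this chapter does not itself use, so read it as an illustrative translation of Eqs. (17.6) and (17.7) rather than the book's own program; the function name linear_regression is ours. It reproduces the slope and intercept from the first two columns of Table 17.1.

# Straight-line least-squares fit via the normal-equation solution,
# Eqs. (17.6) and (17.7); data are the first two columns of Table 17.1.

def linear_regression(x, y):
    """Return the intercept a0 and slope a1 of the least-squares line."""
    n = len(x)
    sum_x = sum(x)
    sum_y = sum(y)
    sum_xy = sum(xi * yi for xi, yi in zip(x, y))
    sum_x2 = sum(xi ** 2 for xi in x)
    a1 = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)   # Eq. (17.6)
    a0 = sum_y / n - a1 * sum_x / n                                 # Eq. (17.7)
    return a0, a1

x = [1, 2, 3, 4, 5, 6, 7]
y = [0.5, 2.5, 2.0, 4.0, 3.5, 6.0, 5.5]
a0, a1 = linear_regression(x, y)
print(a0, a1)   # about 0.07142857 and 0.8392857, as in Example 17.1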

17.1.3 Quantification of Error of Linear Regression

Any line other than the one computed in Example 17.1 results in a larger sum of the squares of the residuals. Thus, the line is unique and, in terms of our chosen criterion, is a "best" line through the points. A number of additional properties of this fit can be elucidated by examining more closely the way in which residuals were computed. Recall that the sum of the squares is defined as [Eq. (17.3)]

    S_r = \sum_{i=1}^{n} e_i^2 = \sum_{i=1}^{n} (y_i - a_0 - a_1 x_i)^2    (17.8)

Notice the similarity between Eqs. (PT5.3) and (17.8). In the former case, the square of the residual represented the square of the discrepancy between the data and a single estimate of the measure of central tendency, the mean. In Eq. (17.8), the square of the residual represents the square of the vertical distance between the data and another measure of central tendency, the straight line (Fig. 17.3).

FIGURE 17.3  The residual in linear regression represents the vertical distance between a data point and the straight line.

The analogy can be extended further for cases where (1) the spread of the points around the line is of similar magnitude along the entire range of the data and (2) the distribution of these points about the line is normal. It can be demonstrated that if these criteria are met, least-squares regression will provide the best (that is, the most likely) estimates of a_0 and a_1 (Draper and Smith, 1981). This is called the maximum likelihood principle in statistics.

In addition, if these criteria are met, a "standard deviation" for the regression line can be determined as [compare with Eq. (PT5.2)]

    s_{y/x} = \sqrt{\frac{S_r}{n - 2}}    (17.9)

where s_{y/x} is called the standard error of the estimate. The subscript notation "y/x" designates that the error is for a predicted value of y corresponding to a particular value of x. Also, notice that we now divide by n - 2 because two data-derived estimates (a_0 and a_1) were used to compute S_r; thus, we have lost two degrees of freedom. As with our discussion of the standard deviation in PT5.2.1, another justification for dividing by n - 2 is that there is no such thing as the "spread of data" around a straight line connecting two points. Thus, for the case where n = 2, Eq. (17.9) yields a meaningless result of infinity.

Just as was the case with the standard deviation, the standard error of the estimate quantifies the spread of the data. However, s_{y/x} quantifies the spread around the regression line as shown in Fig. 17.4b, in contrast to the original standard deviation s_y that quantified the spread around the mean (Fig. 17.4a).

The above concepts can be used to quantify the "goodness" of our fit. This is particularly useful for comparison of several regressions (Fig. 17.5). To do this, we return to the original data and determine the total sum of the squares around the mean for the dependent variable (in our case, y). As was the case for Eq. (PT5.3), this quantity is designated S_t. This is the magnitude of the residual error associated with the dependent variable prior to regression. After performing the regression, we can compute S_r, the sum of the squares of the residuals around the regression line. This characterizes the residual error that remains after the regression. It is, therefore, sometimes called the unexplained sum of the squares.

FIGURE 17.4  Regression data showing (a) the spread of the data around the mean of the dependent variable and (b) the spread of the data around the best-fit line. The reduction in the spread in going from (a) to (b), as indicated by the bell-shaped curves at the right, represents the improvement due to linear regression.

FIGURE 17.5  Examples of linear regression with (a) small and (b) large residual errors.

EXAMPLE 17.2  Estimation of Errors for the Linear Least-Squares Fit

Problem Statement. Compute the total standard deviation, the standard error of the estimate, and the correlation coefficient for the data in Example 17.1.

Solution. The summations are performed and presented in Table 17.1. The standard deviation is [Eq. (PT5.2)]

    s_y = \sqrt{\frac{22.7143}{7 - 1}} = 1.9457

and the standard error of the estimate is [Eq. (17.9)]

    s_{y/x} = \sqrt{\frac{2.9911}{7 - 2}} = 0.7735
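A minimal Python sketch of the same error analysis, again not from the book: it recomputes S_t and S_r directly from the Table 17.1 data and then evaluates Eq. (17.9). The correlation coefficient requested in the problem statement is obtained here from the usual definition r^2 = (S_t - S_r)/S_t; that formula is not reproduced in this excerpt, so treat it as an assumption about the intended computation.

import math

# Error analysis of the fit from Example 17.1 (data of Table 17.1).
x = [1, 2, 3, 4, 5, 6, 7]
y = [0.5, 2.5, 2.0, 4.0, 3.5, 6.0, 5.5]
a0, a1 = 0.07142857, 0.8392857

n = len(x)
y_mean = sum(y) / n

# Total sum of squares around the mean (St) and sum of squares of the
# residuals around the regression line (Sr, Eq. (17.8)).
St = sum((yi - y_mean) ** 2 for yi in y)                     # about 22.7143
Sr = sum((yi - a0 - a1 * xi) ** 2 for xi, yi in zip(x, y))   # about 2.9911

sy = math.sqrt(St / (n - 1))    # total standard deviation, about 1.9457
syx = math.sqrt(Sr / (n - 2))   # standard error of the estimate, Eq. (17.9), about 0.7735

# Assumed standard definition of the coefficient of determination;
# the corresponding equation is not shown in this excerpt.
r2 = (St - Sr) / St
r = math.sqrt(r2)               # about 0.932

print(sy, syx, r)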

17.1.5 Linearization of Nonlinear Relationships

Linear regression provides a powerful technique for fitting a best line to data. However, it is predicated on the fact that the relationship between the dependent and independent variables is linear. This is not always the case, and the first step in any regression analysis should be to plot and visually inspect the data to ascertain whether a linear model applies. For example, Fig. 17.8 shows some data that are obviously curvilinear. In some cases, techniques such as polynomial regression, which is described in Sec. 17.2, are appropriate. For others, transformations can be used to express the data in a form that is compatible with linear regression.

FIGURE 17.8  (a) Data that are ill-suited for linear least-squares regression. (b) Indication that a parabola is preferable.

One example is the exponential model

    y = a_1 e^{b_1 x}    (17.12)

where a_1 and b_1 are constants. This model is used in many fields of engineering to characterize quantities that increase (positive b_1) or decrease (negative b_1) at a rate that is directly proportional to their own magnitude. For example, population growth or radioactive decay can exhibit such behavior. As depicted in Fig. 17.9a, the equation represents a nonlinear relationship (for b_1 ≠ 0) between y and x.

FIGURE 17.9  (a) The exponential equation, (b) the power equation, and (c) the saturation-growth-rate equation. Parts (d), (e), and (f) are linearized versions of these equations that result from simple transformations.

Another example of a nonlinear model is the simple power equation

    y = a_2 x^{b_2}    (17.13)

where a_2 and b_2 are constant coefficients. This model has wide applicability in all fields of engineering. As depicted in Fig. 17.9b, the equation (for b_2 ≠ 0 or 1) is nonlinear.

A third example of a nonlinear model is the saturation-growth-rate equation [recall Eq. (E17.3.1)]

    y = a_3 \frac{x}{b_3 + x}    (17.14)

where a_3 and b_3 are constant coefficients. This model, which is particularly well-suited for characterizing population growth rate under limiting conditions, also represents a nonlinear relationship between y and x (Fig. 17.9c) that levels off, or "saturates," as x increases.

Nonlinear regression techniques are available to fit these equations to experimental data directly. (Note that we will discuss nonlinear regression in Sec. 17.5.) However, a simpler alternative is to use mathematical manipulations to transform the equations into a linear form. Then, simple linear regression can be employed to fit the equations to data.

For example, Eq. (17.12) can be linearized by taking its natural logarithm to yield

    ln y = ln a_1 + b_1 x ln e

But because ln e = 1,

    ln y = ln a_1 + b_1 x    (17.15)

Thus, a plot of ln y versus x will yield a straight line with a slope of b_1 and an intercept of ln a_1 (Fig. 17.9d).

Equation (17.13) is linearized by taking its base-10 logarithm to give

    log y = b_2 log x + log a_2    (17.16)

Thus, a plot of log y versus log x will yield a straight line with a slope of b_2 and an intercept of log a_2 (Fig. 17.9e).

Equation (17.14) is linearized by inverting it to give

    \frac{1}{y} = \frac{b_3}{a_3} \frac{1}{x} + \frac{1}{a_3}    (17.17)

Thus, a plot of 1/y versus 1/x will be linear, with a slope of b_3/a_3 and an intercept of 1/a_3 (Fig. 17.9f).

In their transformed forms, these models can be fit with simple linear regression to evaluate the constant coefficients. They can then be transformed back to their original state and used for predictive purposes. Example 17.4 illustrates this procedure for Eq. (17.13). In addition, Sec. 20.1 provides an engineering example of the same sort of computation.
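To make the reciprocal transformation of Eq. (17.17) concrete, here is a small Python sketch that fits the saturation-growth-rate model by regressing 1/y on 1/x and then back-solving for a_3 and b_3 from the intercept and slope. The x-y values are invented purely for illustration; they do not come from the text.

# Saturation-growth-rate fit via the reciprocal transformation, Eq. (17.17):
# 1/y = (b3/a3)(1/x) + 1/a3, so regress 1/y on 1/x.
# The x-y values below are invented for illustration only.

x = [1.0, 2.0, 4.0, 6.0, 10.0]
y = [1.1, 1.8, 2.4, 2.65, 2.9]

X = [1.0 / xi for xi in x]   # transformed abscissa, 1/x
Y = [1.0 / yi for yi in y]   # transformed ordinate, 1/y

n = len(X)
sum_X, sum_Y = sum(X), sum(Y)
sum_XY = sum(Xi * Yi for Xi, Yi in zip(X, Y))
sum_X2 = sum(Xi ** 2 for Xi in X)

slope = (n * sum_XY - sum_X * sum_Y) / (n * sum_X2 - sum_X ** 2)   # b3 / a3
intercept = sum_Y / n - slope * sum_X / n                          # 1 / a3

a3 = 1.0 / intercept
b3 = slope * a3
print(a3, b3)   # coefficients of y = a3 * x / (b3 + x)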

EXAMPLE 17.4  Linearization of a Power Equation

Problem Statement. Fit Eq. (17.13) to the data in Table 17.3 using a logarithmic transformation of the data.

TABLE 17.3 Data to be fit to the power equation (columns: x, y, log x, log y).

Solution. Figure 17.10a is a plot of the original data in its untransformed state. Figure 17.10b shows the plot of the transformed data. A linear regression of the log-transformed data yields the result

    log y = 1.75 log x - 0.300

FIGURE 17.10  (a) Plot of untransformed data with the power equation that fits these data. (b) Plot of transformed data used to determine the coefficients of the power equation.
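The following Python sketch mirrors the procedure of Example 17.4. Because the numeric entries of Table 17.3 are not reproduced in this copy, the points are generated to lie exactly on the quoted fitted line log y = 1.75 log x - 0.300 (equivalently y is about 0.5 x^1.75); with the book's measured data the recovered coefficients would be approximate rather than exact.

import math

# Power-equation fit via the base-10 log transformation, Eq. (17.16).
# The points are generated from the quoted result log y = 1.75 log x - 0.300,
# standing in for the Table 17.3 data.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [10 ** (-0.300) * xi ** 1.75 for xi in x]

X = [math.log10(xi) for xi in x]   # log x
Y = [math.log10(yi) for yi in y]   # log y

n = len(X)
sum_X, sum_Y = sum(X), sum(Y)
sum_XY = sum(Xi * Yi for Xi, Yi in zip(X, Y))
sum_X2 = sum(Xi ** 2 for Xi in X)

b2 = (n * sum_XY - sum_X * sum_Y) / (n * sum_X2 - sum_X ** 2)   # slope = b2
log_a2 = sum_Y / n - b2 * sum_X / n                             # intercept = log a2
a2 = 10 ** log_a2

print(b2, a2)   # 1.75 and about 0.501; back-transform gives y = a2 * x**b2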
