Chapter 10 Notes, Regression and Correlation


Regression analysis allows us to estimate the relationship of a response variable to a set of predictor variables.

Let $x_1, x_2, \cdots, x_n$ be settings of $x$ chosen by the investigator and $y_1, y_2, \cdots, y_n$ be the corresponding values of the response. Assume $y_i$ is an observation of the random variable $Y_i$ (which depends on $x_i$, where $x_i$ is not random). We model each $Y_i$ by

$$Y_i = \beta_0 + \beta_1 x_i + \epsilon_i$$

where the $\epsilon_i$ are iid noise with $E(\epsilon_i) = 0$ and $\mathrm{Var}(\epsilon_i) = \sigma^2$. We usually assume that $\epsilon_i$ is distributed as $N(0, \sigma^2)$, so $Y_i$ is distributed as $N(\beta_0 + \beta_1 x_i, \sigma^2)$.

Note: it is not true for all experiments that $Y$ is related to $X$ this way, of course! Always scatterplot to check for a straight line.
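As a concrete illustration (not part of the original notes), here is a minimal Python sketch that simulates data from this model and makes the recommended scatterplot. The values of $\beta_0$, $\beta_1$, $\sigma$, and $n$ are arbitrary choices for the example, and numpy and matplotlib are assumed to be available.

    import numpy as np
    import matplotlib.pyplot as plt

    rng = np.random.default_rng(0)
    n = 30
    beta0, beta1, sigma = 2.0, 0.5, 1.0      # arbitrary "true" parameters for the simulation

    x = np.linspace(0, 10, n)                # settings of x chosen by the investigator
    eps = rng.normal(0.0, sigma, size=n)     # iid N(0, sigma^2) noise
    y = beta0 + beta1 * x + eps              # observed responses Y_i = beta0 + beta1*x_i + eps_i

    plt.scatter(x, y)                        # always scatterplot to check for a straight line
    plt.xlabel("x"); plt.ylabel("y")
    plt.show()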

For a good fit, choose $\beta_0, \beta_1$ to minimize the sum of squared errors. Minimize

$$Q = \sum_{i=1}^{n} \left( y_i - f(x_i) \right)^2 = \sum_{i=1}^{n} \left( y_i - (\beta_0 + \beta_1 x_i) \right)^2 \qquad \text{("least squares")}$$

To minimize $Q$, set the derivatives to 0 and solve for the $\beta$'s. Call the solutions $\hat\beta_0$ and $\hat\beta_1$:

$$\frac{\partial Q}{\partial \beta_0} = 0 = -2 \sum_{i=1}^{n} \left[ y_i - (\hat\beta_0 + \hat\beta_1 x_i) \right] \qquad (1)$$

$$\frac{\partial Q}{\partial \beta_1} = 0 = -2 \sum_{i=1}^{n} x_i \left[ y_i - (\hat\beta_0 + \hat\beta_1 x_i) \right].$$

Rewrite equation (1):

$$\sum_{i=1}^{n} y_i - \sum_{i=1}^{n} \hat\beta_0 - \hat\beta_1 \sum_{i=1}^{n} x_i = 0$$

$$\sum_{i=1}^{n} y_i - n\hat\beta_0 - \hat\beta_1 \sum_{i=1}^{n} x_i = 0 \qquad \text{(pull the $\beta$'s out of the sums)}$$

$$\frac{1}{n}\sum_{i=1}^{n} y_i - \hat\beta_0 - \hat\beta_1 \frac{1}{n}\sum_{i=1}^{n} x_i = 0 \qquad \text{(divide by $n$)}$$

$$\bar{y} - \hat\beta_0 - \hat\beta_1 \bar{x} = 0$$

$$\hat\beta_0 = \bar{y} - \hat\beta_1 \bar{x}. \qquad (2)$$
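As a sanity check on the calculus (a sketch, not from the notes), one can also minimize $Q$ numerically and compare with a library fit; the data below are made up for illustration and scipy is assumed to be available.

    import numpy as np
    from scipy.optimize import minimize

    x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])        # made-up data for illustration
    y = np.array([2.1, 2.9, 3.6, 4.4, 5.2])

    def Q(beta):                                    # sum of squared errors Q(beta0, beta1)
        b0, b1 = beta
        return np.sum((y - (b0 + b1 * x)) ** 2)

    res = minimize(Q, x0=[0.0, 0.0])                # numerical minimizer
    print(res.x)                                    # should match the least-squares solution
    print(np.polyfit(x, y, 1)[::-1])                # (beta0_hat, beta1_hat) from a library fit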

What does this mean about the least-squares line? Solve for $\hat\beta_1$ by substituting equation (2) into the second normal equation:

$$\sum_{i=1}^{n} x_i y_i - \hat\beta_0 \sum_{i=1}^{n} x_i - \hat\beta_1 \sum_{i=1}^{n} x_i^2 = 0$$

$$\sum_{i=1}^{n} x_i y_i - (\bar{y} - \hat\beta_1 \bar{x}) \sum_{i=1}^{n} x_i - \hat\beta_1 \sum_{i=1}^{n} x_i^2 = 0 \qquad \text{(using the previous page)}$$

$$\sum_{i=1}^{n} x_i y_i - \bar{y} \sum_{i=1}^{n} x_i + \hat\beta_1 \frac{1}{n} \left( \sum_{i=1}^{n} x_i \right)^2 - \hat\beta_1 \sum_{i=1}^{n} x_i^2 = 0 \qquad \text{(using the definition of $\bar{x}$)}$$

$$\hat\beta_1 = \frac{\sum_{i=1}^{n} x_i y_i - \frac{1}{n} \left( \sum_{i=1}^{n} x_i \right)\left( \sum_{i=1}^{n} y_i \right)}{\sum_{i=1}^{n} x_i^2 - \frac{1}{n} \left( \sum_{i=1}^{n} x_i \right)^2} \qquad \text{(using the definition of $\bar{y}$)}$$

Consider the expressions (which we'll substitute in later):

$$\tilde{s}_{xy} = \sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y}) = \sum_{i=1}^{n} x_i y_i - \frac{1}{n} \sum_{i=1}^{n} x_i \sum_{i=1}^{n} y_i \qquad \text{(skipping some steps)}$$

$$\tilde{s}_{xx} = \sum_{i=1}^{n} (x_i - \bar{x})^2 = \sum_{i=1}^{n} x_i^2 - \frac{1}{n} \left( \sum_{i=1}^{n} x_i \right)^2 \qquad \text{(just sub in $x$ for $y$ in the previous eqn)}$$

where $\tilde{s}_{xy}$ is the sample covariance from Chapter 4 times $n - 1$. Look what happened:

$$\hat\beta_1 = \frac{\tilde{s}_{xy}}{\tilde{s}_{xx}}.$$

Put it together with the previous result and we get these two little (but important) equations:

$$\hat\beta_1 = \frac{\tilde{s}_{xy}}{\tilde{s}_{xx}}, \qquad \hat\beta_0 = \bar{y} - \hat\beta_1 \bar{x}.$$

Now there is an easy way to find the LS line.
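A quick numerical check of the shortcut formulas (again a sketch with made-up data, not from the notes): both ways of writing $\tilde{s}_{xy}$ and $\tilde{s}_{xx}$ agree, and $\hat\beta_1 = \tilde{s}_{xy}/\tilde{s}_{xx}$.

    import numpy as np

    x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])        # made-up data for illustration
    y = np.array([2.1, 2.9, 3.6, 4.4, 5.2])
    n = len(x)

    sxy = np.sum((x - x.mean()) * (y - y.mean()))  # s~_xy, definitional form
    sxy_shortcut = np.sum(x * y) - np.sum(x) * np.sum(y) / n
    sxx = np.sum((x - x.mean()) ** 2)              # s~_xx, definitional form
    sxx_shortcut = np.sum(x ** 2) - np.sum(x) ** 2 / n

    print(np.isclose(sxy, sxy_shortcut), np.isclose(sxx, sxx_shortcut))  # True True
    print("beta1_hat =", sxy / sxx)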

********** Procedure for finding LS line ************

Given $x_1, \cdots, x_n$ and $y_1, \cdots, y_n$, we compute $\bar{x}$, $\bar{y}$, $\tilde{s}_{xy}$, $\tilde{s}_{xx}$. Then compute

$$\hat\beta_1 = \frac{\tilde{s}_{xy}}{\tilde{s}_{xx}}, \qquad \hat\beta_0 = \bar{y} - \hat\beta_1 \bar{x}.$$

And the answer is:

$$y = \hat\beta_1 x + \hat\beta_0.$$

Then if you want to make predictions you can use this formula - just plug in the $x$ you want to make a prediction for.

Let's examine the goodness of fit. We will define SSE, SST, and SSR. Consider:

$$\text{SSE} = \text{sum of squares error} = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2$$

where $\hat{y}_i = \hat\beta_1 x_i + \hat\beta_0$; these are your model's predictions. Recall $\hat\beta_0$ and $\hat\beta_1$ were chosen to minimize the sum of squares error (SSE).

The total sum of squares (SST) measures the variation of the $y$'s around their mean:

$$\text{SST} = \text{sum of squares total} = \sum_{i=1}^{n} (y_i - \bar{y})^2 = \tilde{s}_{yy}.$$

It turns out:

$$\text{SST} = \sum_{i=1}^{n} (y_i - \bar{y})^2 = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2 + \sum_{i=1}^{n} (\hat{y}_i - \bar{y})^2 = \text{SSE} + \text{SSR}$$

where SSR is called the "regression sum of squares." This is the model's variation around the sample mean.
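The whole procedure fits in a few lines of Python. This is a minimal sketch with made-up data; the helper name fit_ls_line is invented here for illustration, not from the notes.

    import numpy as np

    def fit_ls_line(x, y):
        """Return (beta0_hat, beta1_hat) for the least-squares line y = beta1*x + beta0."""
        sxy = np.sum((x - x.mean()) * (y - y.mean()))   # s~_xy
        sxx = np.sum((x - x.mean()) ** 2)               # s~_xx
        beta1 = sxy / sxx
        beta0 = y.mean() - beta1 * x.mean()
        return beta0, beta1

    x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])             # made-up data for illustration
    y = np.array([2.1, 2.9, 3.6, 4.4, 5.2])

    b0, b1 = fit_ls_line(x, y)
    y_hat = b0 + b1 * x                                  # model predictions
    print("prediction at x = 6:", b0 + b1 * 6.0)         # plug in any x you like

    SSE = np.sum((y - y_hat) ** 2)                       # error sum of squares
    SST = np.sum((y - y.mean()) ** 2)                    # total sum of squares
    SSR = np.sum((y_hat - y.mean()) ** 2)                # regression sum of squares
    print(np.isclose(SST, SSE + SSR))                    # the decomposition holds: True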

Consider

$$r^2 = \frac{\text{SSR}}{\text{SST}} = \frac{\text{model's variation}}{\text{total variation}} \qquad \text{("coefficient of determination")}$$

It turns out that $r^2$ is the square of the sample correlation coefficient $r = \frac{s_{xy}}{\sqrt{s_{xx} s_{yy}}}$. Let's show that. First simplify SSR:

$$\text{SSR} = \sum_{i=1}^{n} (\hat{y}_i - \bar{y})^2 = \sum_{i=1}^{n} \left[ \hat\beta_0 + \hat\beta_1 x_i - (\hat\beta_0 + \hat\beta_1 \bar{x}) \right]^2 \qquad \text{(note that the $\hat\beta_0$'s cancel out)}$$

$$= \hat\beta_1^2 \sum_{i=1}^{n} (x_i - \bar{x})^2 = \hat\beta_1^2 \tilde{s}_{xx}. \qquad (3)$$

And plugging this in,

$$r^2 = \frac{\text{SSR}}{\text{SST}} = \frac{\hat\beta_1^2 \tilde{s}_{xx}}{\tilde{s}_{yy}} = \frac{\tilde{s}_{xy}^2}{\tilde{s}_{xx}^2} \cdot \frac{\tilde{s}_{xx}}{\tilde{s}_{yy}} = \frac{\tilde{s}_{xy}^2}{\tilde{s}_{xx} \tilde{s}_{yy}} = \frac{s_{xy}^2}{s_{xx} s_{yy}},$$

where we just cancelled a normalizing factor in that last step. So after we take the square root, that shows $r^2$ really is the square of the sample correlation coefficient.

Back to $\text{SST} = \text{SSR} + \text{SSE}$ and $r^2 = \frac{\text{SSR}}{\text{SST}}$. If $r^2 = 0.953$, most of the total variation is accounted for by the regression, so the least-squares fit is a good fit. That is, $r^2$ tells you how much better a regression line is compared to fitting with a flat line at the sample mean $\bar{y}$.

Note: Compute $r$ using the formula $r = \frac{s_{xy}}{\sqrt{s_{xx} s_{yy}}}$, so you do not get the sign wrong from taking the square root $r = \pm\sqrt{\text{SSR}/\text{SST}}$.
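A short sketch (made-up data, not from the notes) checking that $r^2 = \text{SSR}/\text{SST}$ matches the squared sample correlation coefficient:

    import numpy as np

    x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])   # made-up data for illustration
    y = np.array([2.1, 2.9, 3.6, 4.4, 5.2])

    sxy = np.sum((x - x.mean()) * (y - y.mean()))
    sxx = np.sum((x - x.mean()) ** 2)
    syy = np.sum((y - y.mean()) ** 2)

    b1 = sxy / sxx
    SSR = b1 ** 2 * sxx                        # from equation (3)
    SST = syy
    r2 = SSR / SST                             # coefficient of determination
    r = sxy / np.sqrt(sxx * syy)               # sample correlation (carries the sign)

    print(np.isclose(r2, r ** 2))                    # True: r^2 is the squared correlation
    print(np.isclose(r, np.corrcoef(x, y)[0, 1]))    # matches numpy's correlation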

To summarize,

- We derived an expression for the LS line $y = \hat\beta_1 x + \hat\beta_0$, where $\hat\beta_1 = \frac{\tilde{s}_{xy}}{\tilde{s}_{xx}}$ and $\hat\beta_0 = \bar{y} - \hat\beta_1 \bar{x}$.
- We showed that $r^2 = \frac{\text{SSR}}{\text{SST}}$. Its value indicates how much of the total variation is explained by the regression.

One more definition before we do inference. The variance $\sigma^2$ measures dispersion of the $y_i$'s around their means $\mu_i = \beta_0 + \beta_1 x_i$. An unbiased estimator of $\sigma^2$ turns out to be

$$s^2 = \frac{\text{SSE}}{n-2} = \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{n-2}.$$

We lose two degrees of freedom from estimating $\beta_0$ and $\beta_1$; that is why we divide by $n - 2$.

Chapter 10.3 Statistical Inference

We want to make inferences on the values of $\beta_0$ and $\beta_1$. Assume again that we have

$$Y_i = \beta_0 + \beta_1 x_i + \epsilon_i$$

where the $\epsilon_i$ are iid noise distributed as $N(0, \sigma^2)$. Then it turns out that $\hat\beta_0$ and $\hat\beta_1$ are normally distributed with

$$E(\hat\beta_0) = \beta_0, \qquad \text{SD}(\hat\beta_0) = \sigma \sqrt{\frac{\sum_{i=1}^{n} x_i^2}{n \tilde{s}_{xx}}}$$

$$E(\hat\beta_1) = \beta_1, \qquad \text{SD}(\hat\beta_1) = \frac{\sigma}{\sqrt{\tilde{s}_{xx}}}$$

It also turns out that $S^2$, the random variable for $s^2 = \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{n-2}$, obeys

$$\frac{(n-2) S^2}{\sigma^2} \sim \chi^2_{n-2}.$$
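A small Monte Carlo sketch can make these facts concrete (not from the notes; the "true" $\beta_0$, $\beta_1$, $\sigma$ below are arbitrary choices for the simulation). It checks that $E(s^2) \approx \sigma^2$ and that $\text{SD}(\hat\beta_1) \approx \sigma/\sqrt{\tilde{s}_{xx}}$.

    import numpy as np

    rng = np.random.default_rng(0)
    n = 20
    x = np.linspace(0, 10, n)
    beta0, beta1, sigma = 1.0, 2.0, 1.5        # arbitrary "true" values for the simulation
    sxx = np.sum((x - x.mean()) ** 2)

    s2_vals, b1_vals = [], []
    for _ in range(5000):                      # repeat the experiment many times
        y = beta0 + beta1 * x + rng.normal(0, sigma, n)
        b1 = np.sum((x - x.mean()) * (y - y.mean())) / sxx
        b0 = y.mean() - b1 * x.mean()
        SSE = np.sum((y - (b0 + b1 * x)) ** 2)
        s2_vals.append(SSE / (n - 2))          # s^2 = SSE/(n-2)
        b1_vals.append(b1)

    print(np.mean(s2_vals), sigma ** 2)            # E(s^2) ~= sigma^2 (unbiased)
    print(np.std(b1_vals), sigma / np.sqrt(sxx))   # SD(beta1_hat) ~= sigma / sqrt(s~_xx)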

We can do hypothesis tests on $\beta_0$ and $\beta_1$, using $\hat\beta_0$ and $\hat\beta_1$ as estimators of $\beta_0$ and $\beta_1$. We can use

$$\text{SE}(\hat\beta_0) = s \sqrt{\frac{\sum_{i=1}^{n} x_i^2}{n \tilde{s}_{xx}}}, \qquad \text{SE}(\hat\beta_1) = \frac{s}{\sqrt{\tilde{s}_{xx}}} \qquad (4)$$

as estimators of the SD's. So we can ask for a $100(1-\alpha)\%$ CI for $\beta_0$ and $\beta_1$:

$$\beta_0 \in \left[ \hat\beta_0 - t_{n-2,\alpha/2}\, \text{SE}(\hat\beta_0),\ \hat\beta_0 + t_{n-2,\alpha/2}\, \text{SE}(\hat\beta_0) \right]$$

$$\beta_1 \in \left[ \hat\beta_1 - t_{n-2,\alpha/2}\, \text{SE}(\hat\beta_1),\ \hat\beta_1 + t_{n-2,\alpha/2}\, \text{SE}(\hat\beta_1) \right]$$

Hypothesis tests (usually we do not test hypotheses on $\beta_0$, just $\beta_1$):

$$H_0: \beta_1 = \beta_1^0 \qquad H_1: \beta_1 \neq \beta_1^0.$$

Reject $H_0$ at level $\alpha$ if

$$|t| = \frac{|\hat\beta_1 - \beta_1^0|}{\text{SE}(\hat\beta_1)} > t_{n-2,\alpha/2}.$$

***Important: If you choose $\beta_1^0 = 0$, you are testing whether there is a linear relationship between $x$ and $y$. If you reject $\beta_1^0 = 0$, it means $y$ depends on $x$. Note that when $\beta_1^0 = 0$, $t = \frac{\hat\beta_1}{\text{SE}(\hat\beta_1)}$.

Analysis of Variance (ANOVA)

We're going to do this same test another way. ANOVA is useful for decomposing variability in the $y_i$'s, so you know where the variability is coming from. Recall: $\text{SST} = \text{SSR} + \text{SSE}$. SST is the total variability (df $= n-1$, from the constraint $\sum_{i=1}^{n} (y_i - \bar{y}) = 0$), SSR is the variability accounted for by the regression, and SSE is the error variability (df $= n-2$). This leaves one df for SSR.

A sum of squares divided by its df is called a "mean square."
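A sketch of the interval and test in Python (made-up data, not from the notes; scipy is assumed to be available): compute $\text{SE}(\hat\beta_1)$, a 95% CI, and the $t$ statistic for $H_0: \beta_1 = 0$.

    import numpy as np
    from scipy import stats

    x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])    # made-up data for illustration
    y = np.array([1.8, 3.1, 3.9, 5.2, 5.8, 7.1])
    n = len(x)

    sxx = np.sum((x - x.mean()) ** 2)
    b1 = np.sum((x - x.mean()) * (y - y.mean())) / sxx
    b0 = y.mean() - b1 * x.mean()
    SSE = np.sum((y - (b0 + b1 * x)) ** 2)
    s = np.sqrt(SSE / (n - 2))                       # estimate of sigma

    SE_b1 = s / np.sqrt(sxx)                         # equation (4)
    SE_b0 = s * np.sqrt(np.sum(x ** 2) / (n * sxx))

    alpha = 0.05
    tcrit = stats.t.ppf(1 - alpha / 2, df=n - 2)     # t_{n-2, alpha/2}
    print("95% CI for beta1:", (b1 - tcrit * SE_b1, b1 + tcrit * SE_b1))

    t_stat = b1 / SE_b1                              # test of H0: beta1 = 0
    p_value = 2 * stats.t.sf(abs(t_stat), df=n - 2)
    print("t =", t_stat, "p =", p_value, "reject H0:", abs(t_stat) > tcrit)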

$$\text{MSR} = \frac{\text{SSR}}{1} \qquad \text{("mean square regression")}$$

$$\text{MSE} = \frac{\text{SSE}}{n-2} = s^2 = \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{n-2} \qquad \text{("mean square error")}$$

Consider the ratio

$$F = \frac{\text{MSR}}{\text{MSE}} = \frac{\text{SSR}}{s^2} = \frac{\hat\beta_1^2 \tilde{s}_{xx}}{s^2} \;\; \text{(from (3))} \;\; = \left( \frac{\hat\beta_1}{s/\sqrt{\tilde{s}_{xx}}} \right)^2 = \left( \frac{\hat\beta_1}{\text{SE}(\hat\beta_1)} \right)^2 = t^2 \;\; \text{(from (4))}.$$

Hey look, the square of a $T_v$ r.v. is an $F_{1,v}$ r.v. Actually that's always true. Consider:

$$Z = \frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}}, \qquad T = \frac{\bar{X} - \mu_0}{S/\sqrt{n}} = \frac{Z}{\sqrt{S^2/\sigma^2}}$$

$$T^2 = \frac{Z^2/1}{S^2/\sigma^2} \sim F_{1,\nu}$$

since $Z^2 \sim \chi^2_1$ and $\frac{S^2}{\sigma^2} \sim \frac{\chi^2_\nu}{\nu}$.

Therefore we have $t^2_{n-2,\alpha/2} = f_{1,n-2,\alpha}$. How come $\alpha/2$ turned into $\alpha$?

Back to testing:

$$H_0: \beta_1 = 0 \qquad H_1: \beta_1 \neq 0$$

We'll reject $H_0$ when $F = \frac{\text{MSR}}{\text{MSE}} > f_{1,n-2,\alpha}$.

Note: This is just the square of the previous test. We also do it this way because it is a good introduction to multiple regression in Chapter 11.
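A numerical check of both identities (a sketch with made-up data, assuming scipy is available): $F = \text{MSR}/\text{MSE}$ equals $t^2$, and $t^2_{n-2,\alpha/2} = f_{1,n-2,\alpha}$.

    import numpy as np
    from scipy import stats

    x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])    # made-up data for illustration
    y = np.array([1.8, 3.1, 3.9, 5.2, 5.8, 7.1])
    n = len(x)

    sxx = np.sum((x - x.mean()) ** 2)
    b1 = np.sum((x - x.mean()) * (y - y.mean())) / sxx
    b0 = y.mean() - b1 * x.mean()
    y_hat = b0 + b1 * x

    SSR = np.sum((y_hat - y.mean()) ** 2)
    SSE = np.sum((y - y_hat) ** 2)
    MSR, MSE = SSR / 1, SSE / (n - 2)

    F = MSR / MSE
    t = b1 / (np.sqrt(MSE) / np.sqrt(sxx))           # t statistic for H0: beta1 = 0
    print(np.isclose(F, t ** 2))                     # True: F is the square of t

    alpha = 0.05
    print(np.isclose(stats.t.ppf(1 - alpha / 2, n - 2) ** 2,
                     stats.f.ppf(1 - alpha, 1, n - 2)))   # True: t^2_{n-2,a/2} = f_{1,n-2,a}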

ANOVA (Analysis of Variance)

ANOVA table - a nice display of the calculations we did.

Source of variation    SS     d.f.    MS                  F              p
Regression             SSR    1       MSR = SSR/1         F = MSR/MSE    p-value for test
Error                  SSE    n-2     MSE = SSE/(n-2)
Total                  SST    n-1

The p-value is for the F-test of $H_0: \beta_1 = 0$ vs. $H_1: \beta_1 \neq 0$.
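A sketch that assembles the ANOVA table from data (not from the notes; the helper name anova_table is invented here, and the p-value comes from the F distribution via scipy).

    import numpy as np
    from scipy import stats

    def anova_table(x, y):
        """Print the simple-regression ANOVA table (illustrative sketch)."""
        n = len(x)
        sxx = np.sum((x - x.mean()) ** 2)
        b1 = np.sum((x - x.mean()) * (y - y.mean())) / sxx
        b0 = y.mean() - b1 * x.mean()
        y_hat = b0 + b1 * x
        SSR = np.sum((y_hat - y.mean()) ** 2)
        SSE = np.sum((y - y_hat) ** 2)
        SST = SSR + SSE
        MSR, MSE = SSR / 1, SSE / (n - 2)
        F = MSR / MSE
        p = stats.f.sf(F, 1, n - 2)                   # p-value for H0: beta1 = 0
        print(f"Regression  SS={SSR:.3f}  df=1     MS={MSR:.3f}  F={F:.3f}  p={p:.4g}")
        print(f"Error       SS={SSE:.3f}  df={n-2}  MS={MSE:.3f}")
        print(f"Total       SS={SST:.3f}  df={n-1}")

    x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])      # made-up data for illustration
    y = np.array([1.8, 3.1, 3.9, 5.2, 5.8, 7.1])
    anova_table(x, y)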

MIT OpenCourseWare
http://ocw.mit.edu

15.075J / ESD.07J Statistical Thinking and Data Analysis
Fall 2011

For information about citing these materials or our Terms of Use, visit: http://ocw.mit.edu/terms.
