
Uploaded by Fanny Sylvia C.


7, page 1

Chapter 7 The Simple Linear Regression Model

A common model for the relationship between two quantitative variables is the linear

regression model. Don’t be fooled by the “linear” part: as we’ll see, linear regression models can

often be used to model relationships which aren’t linear.

Although we looked at the linear regression model last semester, we only looked at one part of it

– the part that models the mean response Y as a linear function of X. We’ll extend the model to

model the scatter of the individual data points around the line. The way we extend it makes the

linear regression model exactly like the ANOVA model, except that the explanatory variable is

quantitative instead of categorical.

We assume that at each X, the distribution of Y values is normal with mean $\beta_0 + \beta_1 X$ and standard deviation $\sigma$:

$$\mu(Y \mid X) = \beta_0 + \beta_1 X, \qquad \sigma(Y \mid X) = \sigma$$

Least squares estimates of $\beta_0$ and $\beta_1$ are denoted by $\hat\beta_0$ and $\hat\beta_1$. The predicted or fitted value of Y for a particular X is:

$$\hat\mu(Y \mid X) = \hat\beta_0 + \hat\beta_1 X.$$

By modeling the distribution of data points around the line, we can make inferences from the

sample data about the regression parameters.


Case Study 7.2: Meat Processing and pH

ANOVA (b)

Model 1       Sum of Squares   df   Mean Square   F         Sig.
Regression    3.00647           1   3.00647       444.306   .000 (a)
Residual       .05413           8    .00677
Total         3.06060           9

a. Predictors: (Constant), Log(hours)
b. Dependent Variable: pH

Coefficients (a)

              Unstandardized Coefficients   Standardized Coefficients
Model 1       B         Std. Error          Beta                        t          Sig.
(Constant)    6.9836    .0485                                           143.897    .000
Log(hours)    -.7257    .0344               -.991                       -21.079    .000

a. Dependent Variable: pH

Hours   pH     Log(hours)   Fitted   Residual
1       7.02   0            6.9836    0.0364
1       6.93   0            6.9836   -0.0536
2       6.42   0.69         6.4806   -0.0606
2       6.51   0.69         6.4806    0.0294
4       6.07   1.39         5.9777    0.0923
4       5.99   1.39         5.9777    0.0123
6       5.59   1.79         5.6834   -0.0934
6       5.80   1.79         5.6834    0.1166
8       5.51   2.08         5.4747    0.0353
8       5.36   2.08         5.4747   -0.1147


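$$Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i$$

$$\hat\beta_1 = \frac{\sum_{i=1}^{n} (X_i - \bar X)(Y_i - \bar Y)}{\sum_{i=1}^{n} (X_i - \bar X)^2}, \qquad \hat\beta_0 = \bar Y - \hat\beta_1 \bar X$$

Estimate of $\sigma$ is

$$\hat\sigma = \sqrt{\frac{\text{sum of squared residuals}}{\text{degrees of freedom}}} = \sqrt{\frac{\sum_{i=1}^{n} \mathrm{res}_i^2}{n-2}}.$$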

Degrees of freedom = n − (number of parameters in the model for the means) = n − 2 for simple linear regression.

The ANOVA table gives the sum of squared residuals and the mean square residual, which is $\hat\sigma^2 = 0.00677$, so $\hat\sigma = 0.0823$.
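The estimates above can be reproduced by hand from the ten data points listed for the case study. The following is a minimal sketch using only the Python standard library; the variable names (`hours`, `ph`, `b0`, `b1`, `sigma_hat`) are illustrative choices, not part of the notes. Its results should agree with the SPSS output up to rounding.

```python
import math

# Steer carcass data from the case study: hours since slaughter and pH.
hours = [1, 1, 2, 2, 4, 4, 6, 6, 8, 8]
ph = [7.02, 6.93, 6.42, 6.51, 6.07, 5.99, 5.59, 5.80, 5.51, 5.36]

# The explanatory variable is the natural log of hours.
x = [math.log(h) for h in hours]
n = len(x)
x_bar = sum(x) / n
y_bar = sum(ph) / n

# Least squares slope and intercept from the closed-form formulas.
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, ph))
sxx = sum((xi - x_bar) ** 2 for xi in x)
b1 = sxy / sxx
b0 = y_bar - b1 * x_bar

# Residuals, their sum of squares, and sigma-hat on n - 2 degrees of freedom.
residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, ph)]
sse = sum(r ** 2 for r in residuals)
sigma_hat = math.sqrt(sse / (n - 2))

print(f"b0 = {b0:.4f}, b1 = {b1:.4f}")
print(f"SSE = {sse:.5f}, sigma_hat = {sigma_hat:.4f}")
```

The residual sum of squares computed here corresponds to the "Residual" row of the ANOVA table, and dividing it by n − 2 = 8 gives the mean square residual.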

The standard errors of $\hat\beta_0$ and $\hat\beta_1$ represent the estimated standard deviations of the sampling distributions of $\hat\beta_0$ and $\hat\beta_1$. The sampling distributions describe how the least squares estimates would vary from sample to sample. We view the $X_i$'s as fixed: they remain the same from sample to sample, while the $Y_i$'s are random.

$$\mathrm{SE}(\hat\beta_1) = \hat\sigma \sqrt{\frac{1}{(n-1)s_X^2}}, \qquad \mathrm{SE}(\hat\beta_0) = \hat\sigma \sqrt{\frac{1}{n} + \frac{\bar X^2}{(n-1)s_X^2}}$$
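As a rough check of these two formulas, the sketch below plugs in summary values quoted elsewhere in these notes for the steer data (n = 10, mean of log(Hours) 1.19013, sample variance of log(Hours) .63438, and σ̂ = 0.0823). The variable names are illustrative assumptions.

```python
import math

# Summary values quoted in the notes for the steer data.
n = 10
x_bar = 1.19013      # mean of log(hours)
s2_x = 0.63438       # sample variance of log(hours)
sigma_hat = 0.0823   # estimate of sigma from the ANOVA table

# (n - 1) * s_X^2 is the sum of squared deviations of X about its mean.
sxx = (n - 1) * s2_x

se_b1 = sigma_hat * math.sqrt(1 / sxx)
se_b0 = sigma_hat * math.sqrt(1 / n + x_bar ** 2 / sxx)

print(f"SE(b1) = {se_b1:.4f}")
print(f"SE(b0) = {se_b0:.4f}")
```

Up to rounding, these should match the "Std. Error" column of the SPSS Coefficients table (.0344 for the slope and .0485 for the intercept).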


Example: Steer carcass data

Mean pH is estimated to decrease by .7257 for every one-unit increase in Log(Hours). A one-unit increase in Log(Hours) is an increase in Hours by a factor of e ≈ 2.72. If we had used Log10(Hours) instead, the interpretation would be easier: the slope would represent the change in predicted pH for every 10-fold increase in time since slaughter.

A 95% confidence interval for $\beta_1$ is $-.7257 \pm t_8(.975)(.0344) = -.7257 \pm 2.306(.0344) = -.7257 \pm .0793$, or $-.805$ to $-.646$. So we are 95% confident that the decrease in mean pH is between .646 and .805 for every 2.72-fold increase in time since slaughter.
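The interval arithmetic, and the base-10 reinterpretation mentioned earlier, can be sketched as follows. The critical value 2.306 is taken from the notes rather than computed. The rescaling uses the identity ln(Hours) = ln(10) · log10(Hours), so the slope on the log10 scale is the natural-log slope multiplied by ln(10). Variable names are illustrative.

```python
import math

b1 = -0.7257      # estimated slope on the natural-log scale
se_b1 = 0.0344    # its standard error
t_crit = 2.306    # t_8(.975), as quoted in the notes

# 95% confidence interval for the slope.
half_width = t_crit * se_b1
lower, upper = b1 - half_width, b1 + half_width
print(f"95% CI for slope: {lower:.3f} to {upper:.3f}")

# Re-express the slope and interval per 10-fold increase in hours.
k = math.log(10)
b1_log10 = b1 * k
lower10, upper10 = lower * k, upper * k
print(f"Change in mean pH per 10-fold increase: {b1_log10:.3f}")
print(f"95% CI per 10-fold increase: {lower10:.3f} to {upper10:.3f}")
```

Because the rescaling is a multiplication by a positive constant, the interval endpoints scale the same way and the t statistic is unchanged.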

The confidence interval can also be obtained from SPSS by choosing Options in the

Analyze…Regression…Linear window.

Coefficients (a)

              Unstandardized Coefficients   Standardized Coefficients                          95% Confidence Interval for B
Model 1       B         Std. Error          Beta                        t          Sig.     Lower Bound   Upper Bound
(Constant)    6.984     .049                                            143.897    .000     6.872         7.096
Log(hours)    -.726     .034                -.991                       -21.079    .000     -.805         -.646

a. Dependent Variable: pH

The intercept β 0 represents the mean value of Y when X = 0. Usually, this is not particularly

meaningful. It is usually more meaningful to estimate the mean value of Y at particular values of

X which are meaningful and interesting, which is covered next.

Inferences about the slope of the regression line tell us about how big the change is in the mean response (Y) for a 1-unit increase in X. Sometimes, we are interested in a confidence interval for the mean response at a particular X, say $X_0$. According to the model, the true mean of Y at $X_0$ is $\mu(Y \mid X_0) = \beta_0 + \beta_1 X_0$. The estimate of this is $\hat\mu(Y \mid X_0) = \hat\beta_0 + \hat\beta_1 X_0$. The standard error of $\hat\mu(Y \mid X_0)$ is

$$\mathrm{SE}[\hat\mu(Y \mid X_0)] = \hat\sigma \sqrt{\frac{1}{n} + \frac{(X_0 - \bar X)^2}{(n-1)s_X^2}}$$

Note that the standard error is bigger for values of $X_0$ further from $\bar X$ and is smallest at $\bar X$.


Steer data: What is the estimated mean pH for carcasses 3 hours old? Give a 95% confidence

interval for the mean pH after 3 hours.

First, remember that the X variable in the regression model is log(Hours), so $X_0 = \log(3) = 1.0986$ (natural logarithm). Therefore, $\hat\mu(Y \mid X_0 = 1.0986) = 6.9836 - .7257(1.0986) = 6.186$.

To calculate the standard error, we need to compute $\bar X$, the mean of the log(Hours) for the 10 data points, and $s_X^2$, the sample variance of log(Hours). From SPSS,

Descriptive Statistics

                     N    Mean      Std. Deviation   Variance
LogTime              10   1.19013   .796480          .63438
Valid N (listwise)   10

Therefore,

$$\mathrm{SE}[\hat\mu(Y \mid X_0 = 1.0986)] = 0.0823 \sqrt{\frac{1}{10} + \frac{(1.0986 - 1.1901)^2}{5.709}} = 0.0262$$

and a 95% confidence interval for the mean pH among all steers after 3 hours is $6.186 \pm t_8(.975)(0.0262) = 6.186 \pm 2.306(0.0262) = 6.186 \pm 0.0604$, or 6.126 to 6.246.
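The calculation for the mean response at 3 hours can be wrapped in a small function so that other values of X can be tried. This is a sketch under the assumption that the summary values quoted in the notes are used as inputs; the function name `mean_response_ci` and its parameters are hypothetical, and the critical value 2.306 is taken from the notes rather than computed.

```python
import math

def mean_response_ci(x0, b0, b1, sigma_hat, n, x_bar, s2_x, t_crit):
    """Return the fitted mean at x0, its standard error, and the CI limits."""
    fit = b0 + b1 * x0
    se = sigma_hat * math.sqrt(1 / n + (x0 - x_bar) ** 2 / ((n - 1) * s2_x))
    return fit, se, fit - t_crit * se, fit + t_crit * se

# Steer data: X is the natural log of hours, so 3 hours means x0 = ln(3).
fit, se, lower, upper = mean_response_ci(
    x0=math.log(3),
    b0=6.9836, b1=-0.7257,
    sigma_hat=0.0823,
    n=10, x_bar=1.19013, s2_x=0.63438,
    t_crit=2.306,          # t_8(.975)
)
print(f"fitted mean pH = {fit:.3f}, SE = {se:.4f}")
print(f"95% CI for the mean: {lower:.3f} to {upper:.3f}")

# The standard error is smallest at x0 = x_bar and grows away from it.
_, se_at_mean, _, _ = mean_response_ci(1.19013, 6.9836, -0.7257, 0.0823,
                                       10, 1.19013, 0.63438, 2.306)
print(f"SE at x_bar = {se_at_mean:.4f}")
```

Evaluating the function at the mean of log(Hours) illustrates the remark that the standard error is smallest there: the second term under the square root vanishes, leaving σ̂ divided by the square root of n.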

If we want simultaneous confidence intervals at several different values of X, we can use

Bonferroni if the number of values is small. We can compute simultaneous confidence intervals

at every possible value of X using a Scheffe procedure. The result is a set of confidence bands

for the regression line. We are 95% confident (or whatever confidence level was chosen) that the

regression line lies entirely within the bands. Thus, we are 95% confident that the true means at

all possible values of X are all within the confidence band limits. The formula for the

simultaneous confidence bands is

$$\hat\beta_0 + \hat\beta_1 X \pm \sqrt{2F_{2,n-2}(1-\alpha)}\;\mathrm{SE}[\hat\mu(Y \mid X)]$$

This is referred to as the Working-Hotelling procedure. In practice, you compute these limits at

a large number of X values, then join the limits to make a smooth curve on the scatterplot. Some

programs will do this automatically, but SPSS will not. It will, however, plot the individual

confidence intervals for all X’s using the t coefficient rather than the Scheffe coefficient.

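Steer data: for simultaneous 95% confidence intervals, $F_{2,n-2}(1-\alpha) = F_{2,8}(.95) = 4.46$. The confidence interval for the mean pH after 3 hours is therefore (using the fitted value 6.186 and standard error 0.0262 from above): $6.186 \pm \sqrt{2(4.46)}\,(0.0262) = 6.186 \pm 2.987(0.0262) = 6.186 \pm 0.0783$, or 6.108 to 6.264.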
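To see how the simultaneous bands compare with the individual t-based intervals, the limits can be computed on a grid of X values, as the text suggests. The sketch below does this for a handful of hours; the grid, the function names `se_mean` and `band`, and the variable names are illustrative assumptions, and both critical values (2.306 and 4.46) are taken from the notes rather than computed.

```python
import math

# Summary values quoted in the notes for the steer data.
b0, b1 = 6.9836, -0.7257
sigma_hat = 0.0823
n, x_bar, s2_x = 10, 1.19013, 0.63438
t_crit = 2.306   # t_8(.975), for individual intervals
f_crit = 4.46    # F_{2,8}(.95), for the Working-Hotelling bands

# Working-Hotelling multiplier: sqrt(2 * F).
wh_mult = math.sqrt(2 * f_crit)

def se_mean(x0):
    """Standard error of the estimated mean response at x0."""
    return sigma_hat * math.sqrt(1 / n + (x0 - x_bar) ** 2 / ((n - 1) * s2_x))

def band(x0, multiplier):
    """Lower and upper limits: fitted value plus or minus multiplier * SE."""
    fit = b0 + b1 * x0
    half = multiplier * se_mean(x0)
    return fit - half, fit + half

rows = []
for h in [1, 2, 3, 4, 6, 8]:
    x0 = math.log(h)          # X is the natural log of hours
    rows.append((h, band(x0, t_crit), band(x0, wh_mult)))

print(f"Working-Hotelling multiplier = {wh_mult:.3f} (t multiplier = {t_crit})")
for h, (t_lo, t_hi), (wh_lo, wh_hi) in rows:
    print(f"hours={h}: t interval {t_lo:.3f} to {t_hi:.3f}, "
          f"WH band {wh_lo:.3f} to {wh_hi:.3f}")
```

Because the Working-Hotelling multiplier is larger than the t multiplier, the simultaneous band is wider than the individual interval at every X, which is the price of covering the whole line at once.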


The confidence intervals above are for the mean pH of all steers 3 hours after slaughter. A 95%

prediction interval for the pH of an individual steer 3 hours after slaughter is an interval in which

you are 95% confident that the pH of a particular steer will lie 3 hours after slaughter. A

confidence interval is for a mean; a prediction interval is for an individual.

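$$\mathrm{Pred}(Y \mid X_0) = \hat\mu(Y \mid X_0) = \hat\beta_0 + \hat\beta_1 X_0$$

$$\mathrm{SE}[\mathrm{Pred}(Y \mid X_0)] = \sqrt{\hat\sigma^2 + \mathrm{SE}[\hat\mu(Y \mid X_0)]^2} = \hat\sigma \sqrt{1 + \frac{1}{n} + \frac{(X_0 - \bar X)^2}{(n-1)s_X^2}}$$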

The standard error of prediction has two parts: the uncertainty due to estimating the mean response at $X_0$ and the uncertainty due to the fact that individual observations vary around that mean with standard deviation $\sigma$. Note that while the standard error of the mean response at $X_0$ goes to 0 as n increases, the standard error of prediction never goes to 0, because it is always at least $\hat\sigma$. An individual 100(1−α)% prediction interval for the response of an individual at $X_0$ is

$$\mathrm{Pred}(Y \mid X_0) \pm t_{n-2}(1-\alpha/2)\,\mathrm{SE}[\mathrm{Pred}(Y \mid X_0)]$$

For the steer data, a 95% prediction interval for the pH of a particular steer 3 hours after slaughter is:

$$6.186 \pm 2.306\,(.0823)\sqrt{1 + \frac{1}{10} + \frac{(1.0986 - 1.1901)^2}{5.709}} = 6.186 \pm 2.306(.08637) = 6.186 \pm .1992,$$

or 5.99 to 6.39.
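The prediction interval, and the contrast between the two standard errors as n grows, can be sketched as follows. The inputs are the summary values quoted in the notes. The function names are hypothetical, and the final loop is a purely illustrative thought experiment that assumes σ̂, the mean of X, and the variance of X stay fixed while n increases.

```python
import math

def se_mean(x0, sigma_hat, n, x_bar, s2_x):
    """Standard error of the estimated mean response at x0."""
    return sigma_hat * math.sqrt(1 / n + (x0 - x_bar) ** 2 / ((n - 1) * s2_x))

def se_pred(x0, sigma_hat, n, x_bar, s2_x):
    """Standard error of prediction: combines sigma-hat and SE of the mean."""
    return math.sqrt(sigma_hat ** 2 + se_mean(x0, sigma_hat, n, x_bar, s2_x) ** 2)

# Steer data, 3 hours after slaughter (X is the natural log of hours).
b0, b1 = 6.9836, -0.7257
sigma_hat, n, x_bar, s2_x = 0.0823, 10, 1.19013, 0.63438
t_crit = 2.306                      # t_8(.975), as quoted in the notes
x0 = math.log(3)

pred = b0 + b1 * x0
sp = se_pred(x0, sigma_hat, n, x_bar, s2_x)
lower, upper = pred - t_crit * sp, pred + t_crit * sp
print(f"prediction = {pred:.3f}, SE of prediction = {sp:.5f}")
print(f"95% prediction interval: {lower:.2f} to {upper:.2f}")

# Illustration only: hold sigma_hat, x_bar and s2_x fixed and let n grow.
# The SE of the mean shrinks toward 0; the SE of prediction levels off
# near sigma_hat.
for n_big in (10, 100, 10_000):
    sm = se_mean(x0, sigma_hat, n_big, x_bar, s2_x)
    spb = se_pred(x0, sigma_hat, n_big, x_bar, s2_x)
    print(f"n={n_big}: SE(mean) = {sm:.5f}, SE(pred) = {spb:.5f}")
```

In practice the critical value would also change with n, so the loop is only meant to show the behaviour of the two standard errors, not to give usable intervals.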

Simultaneous prediction intervals can be computed for several different X values using

Bonferroni, but there is no analog to the Working-Hotelling Scheffe-based procedure for

simultaneous prediction intervals at all possible values of X.


SPSS commands

Analyze…Regression…Linear

Under the Statistics button, you can choose to get confidence intervals for $\beta_0$ and $\beta_1$.

Under the Save button, you can choose to save:

• Unstandardized Predicted Values

• Unstandardized Residuals

• Prediction Intervals: Mean: this isn’t a prediction interval, it’s an individual confidence

interval for the mean response at each X. SPSS does not compute the Working-Hotelling

simultaneous confidence intervals

• Prediction Intervals: Individual: this is a prediction interval for an individual response at

each X

To obtain predicted values, confidence intervals and prediction intervals for a value of X not in

the data set, add a case to the data with the desired X value, but leave the value of Y blank (it

should display a period which indicates a missing value).

SPSS can plot the individual confidence intervals for mean response and the prediction

intervals for an individual response. Create a scatterplot and double-click the plot to get into

Chart Editor. Select one of the data points and click on the “Add fit line” icon. Under the “Fit

line” tab you can select “Mean” or “Individual” confidence intervals. The first gives individual

(not simultaneous) confidence intervals for the mean response at each X and the second gives

prediction intervals.


Figure: 95% individual confidence intervals for the mean, 95% Working-Hotelling simultaneous confidence bands for the mean, and 95% individual prediction intervals for a single response (this graph is from S-Plus; SPSS will only do the first and last of the three). Plot title: "0.95 bands"; horizontal axis x, vertical axis y (ticks from 5.5 to 7.0).
