Learning Objectives
1. Describe the Linear Regression Model
2. State the Regression Modeling Steps
3. Explain Ordinary Least Squares
4. Compute Regression Coefficients
5. Understand and Check Model Assumptions
6. Predict Response Variable
7. Comment on Python Output
Models
What is a Model?
Non-Math/Stats Model
What is a Math/Stats Model?
1. Often Describes the Relationship between Variables
2. Types
- Deterministic Models (no randomness)
- Probabilistic Models (with randomness)
Deterministic Models
1. Hypothesize Exact Relationships
2. Suitable When Prediction Error is Negligible
3. Example: Body mass index (BMI) is a measure of body fat based on height and weight
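A deterministic model can be written as a plain function with no random term. The sketch below assumes the usual BMI definition (weight in kilograms divided by the square of height in metres), which the slide itself does not spell out; the function name `bmi` and the sample inputs are illustrative only.

```python
def bmi(weight_kg: float, height_m: float) -> float:
    """Deterministic model: the same inputs always give the same output."""
    if height_m <= 0:
        raise ValueError("height_m must be positive")
    # Assumed definition: BMI = weight (kg) / height (m) squared.
    return weight_kg / height_m ** 2


# Hypothetical person, chosen only to exercise the function.
print(round(bmi(70.0, 1.75), 2))
```

Calling the function twice with the same arguments returns the same value, which is what "no randomness" means here.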
Types of Probabilistic Models
Regression Models
• Relationship between one dependent variable and
explanatory variable(s)
• Use equation to set up relationship
• Numerical Dependent (Response) Variable
• 1 or More Numerical or Categorical Independent (Explanatory)
Variables
• Used Mainly for Prediction & Estimation
Regression Modeling Steps
1. Hypothesize Deterministic Component
   - Estimate Unknown Parameters
2. Specify Probability Distribution of Random Error Term
   - Estimate Standard Deviation of Error
3. Evaluate the Fitted Model
4. Use Model for Prediction & Estimation
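The four steps above can be sketched end to end in a few lines of Python. This is a minimal illustration, not the course's own code: the data are made up, the error standard deviation is estimated with the common choice s = sqrt(SSE / (n - 2)), and the fit is evaluated with R^2, both of which are assumptions not stated on the slide.

```python
import math

# Hypothetical data, made up for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 11.9]
n = len(x)

# Step 1: hypothesize E(Y) = b0 + b1 * X and estimate b0, b1 by least squares.
x_bar = sum(x) / n
y_bar = sum(y) / n
ss_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
ss_xx = sum((xi - x_bar) ** 2 for xi in x)
b1 = ss_xy / ss_xx
b0 = y_bar - b1 * x_bar

# Step 2: treat the errors as random with mean 0 and a constant standard
# deviation, estimated here (an assumed choice) by s = sqrt(SSE / (n - 2)).
residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
sse = sum(e ** 2 for e in residuals)
s = math.sqrt(sse / (n - 2))

# Step 3: evaluate the fitted model, here with R^2 = 1 - SSE / SS_yy.
ss_yy = sum((yi - y_bar) ** 2 for yi in y)
r_squared = 1 - sse / ss_yy

# Step 4: use the model for prediction at a new, hypothetical X value.
x_new = 7.0
y_pred = b0 + b1 * x_new

print(f"b0={b0:.3f} b1={b1:.3f} s={s:.3f} R^2={r_squared:.4f} pred={y_pred:.3f}")
```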
Model Specification
Specifying the deterministic
component
• 1. Define the dependent variable and
independent variable
[Figure: types of regression models. Simple and Multiple, each either Linear or Non-Linear]
$$Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i$$
- $Y_i$: Dependent (Response) Variable (e.g., CD+ c.)
- $X_i$: Independent (Explanatory) Variable (e.g., Years s. serocon.)
Population & Sample Regression
Models
EPI 809/Spring 2008 30
Population (Unknown Relationship):
$$Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i$$
Random Sample:
$$Y_i = \hat{\beta}_0 + \hat{\beta}_1 X_i + \hat{\varepsilon}_i$$
Population Linear Regression Model
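$$Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i$$
$$E(Y \mid X) = \beta_0 + \beta_1 X_i$$
[Figure: Y against X. Each observed value $Y_i$ lies a random error $\varepsilon_i$ away from the population line $E(Y \mid X)$]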
Sample Linear Regression Model
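$$Y_i = \hat{\beta}_0 + \hat{\beta}_1 X_i + \hat{\varepsilon}_i$$
$$\hat{Y}_i = \hat{\beta}_0 + \hat{\beta}_1 X_i$$
[Figure: Y against X. Each observed value lies a random error $\hat{\varepsilon}_i$ away from the fitted line $\hat{Y}_i$; an unsampled observation is also marked]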
Estimating Parameters:
Least Squares Method
[Figure: scatter plot of Y against X, both axes 0 to 60]
Thinking Challenge
How would you draw a line through the points?
How do you determine which line ‘fits best’?
[Figure: three scatter plots of Y against X (both axes 0 to 60), each comparing candidate lines: (1) slope changed, intercept unchanged; (2) slope unchanged, intercept changed; (3) slope changed, intercept changed]
Least Squares
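• 1. ‘Best Fit’ Means the Differences Between Actual Y Values & Predicted Y Values Are a Minimum. But Positive Differences Offset Negative Ones, so the Differences Are Squared:
$$\sum_{i=1}^{n} \left(Y_i - \hat{Y}_i\right)^2 = \sum_{i=1}^{n} \hat{\varepsilon}_i^2$$
• 2. LS Minimizes the Sum of the Squared Differences (Errors) (SSE)
[Figure: least squares graphically. Four observed points with residuals $\hat{\varepsilon}_1, \ldots, \hat{\varepsilon}_4$ about the fitted line $\hat{Y}_i = \hat{\beta}_0 + \hat{\beta}_1 X_i$; for example, $Y_2 = \hat{\beta}_0 + \hat{\beta}_1 X_2 + \hat{\varepsilon}_2$]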
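To see why the differences are squared before summing, the sketch below compares two candidate lines on a small made-up data set. The helper names `sum_of_errors` and `sse` and the data are hypothetical. Both lines have raw differences that sum to zero, yet their sums of squared errors differ widely.

```python
def sum_of_errors(x, y, b0, b1):
    """Sum of raw differences Y_i - (b0 + b1 * X_i); signs can cancel."""
    return sum(yi - (b0 + b1 * xi) for xi, yi in zip(x, y))


def sse(x, y, b0, b1):
    """Sum of squared differences; every miss counts as a positive amount."""
    return sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))


# Hypothetical points, for illustration only.
x = [10.0, 20.0, 30.0, 40.0]
y = [15.0, 25.0, 30.0, 50.0]

flat_line = (30.0, 0.0)   # horizontal line through the mean of Y
sloped_line = (2.5, 1.1)  # the least squares line for these four points

for name, (b0, b1) in [("flat", flat_line), ("sloped", sloped_line)]:
    print(name, round(sum_of_errors(x, y, b0, b1), 6), round(sse(x, y, b0, b1), 6))
```

The raw sums cannot tell the two lines apart, while SSE clearly prefers the sloped line.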
Coefficient Equations
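• Prediction equation
$$\hat{y}_i = \hat{\beta}_0 + \hat{\beta}_1 x_i$$
• Sample slope
$$\hat{\beta}_1 = \frac{SS_{xy}}{SS_{xx}} = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sum (x_i - \bar{x})^2}$$
• Sample Y-intercept
$$\hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x}$$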
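The coefficient equations translate directly into code. The following is a minimal sketch in plain Python; the function name `fit_simple_linear` and the test data are assumptions made for illustration.

```python
def fit_simple_linear(x, y):
    """Return (b0, b1) using b1 = SS_xy / SS_xx and b0 = y_bar - b1 * x_bar."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length, at least 2")
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    ss_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    ss_xx = sum((xi - x_bar) ** 2 for xi in x)
    if ss_xx == 0:
        raise ValueError("all x values are identical; the slope is undefined")
    b1 = ss_xy / ss_xx
    b0 = y_bar - b1 * x_bar
    return b0, b1


# Hypothetical points lying exactly on y = 1 + 2x, so the fit should recover it.
b0, b1 = fit_simple_linear([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0])
print(b0, b1)
```

Using points that lie exactly on a known line is a quick sanity check: the estimates should match the line's intercept and slope.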
Derivation of Parameters (1)
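• Least Squares (L-S):
Minimize squared error
$$\sum_{i=1}^{n} \varepsilon_i^2 = \sum_{i=1}^{n} \left(y_i - \beta_0 - \beta_1 x_i\right)^2$$
$$0 = \frac{\partial \sum \varepsilon_i^2}{\partial \beta_0} = \frac{\partial \sum \left(y_i - \beta_0 - \beta_1 x_i\right)^2}{\partial \beta_0}$$
$$= -2\left(n\bar{y} - n\beta_0 - n\beta_1 \bar{x}\right)$$
$$\hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x}$$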
Derivation of Parameters (1)
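• Least Squares (L-S):
Minimize squared error
$$0 = \frac{\partial \sum \varepsilon_i^2}{\partial \beta_1} = \frac{\partial \sum \left(y_i - \beta_0 - \beta_1 x_i\right)^2}{\partial \beta_1}$$
$$= -2 \sum x_i \left(y_i - \beta_0 - \beta_1 x_i\right)$$
$$= -2 \sum x_i \left(y_i - \bar{y} + \beta_1 \bar{x} - \beta_1 x_i\right)$$
$$\beta_1 \sum x_i (x_i - \bar{x}) = \sum x_i (y_i - \bar{y})$$
$$\beta_1 \sum (x_i - \bar{x})(x_i - \bar{x}) = \sum (x_i - \bar{x})(y_i - \bar{y})$$
$$\hat{\beta}_1 = \frac{SS_{xy}}{SS_{xx}}$$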
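The derivation sets both partial derivatives of the squared error to zero. A numerical check of that condition is sketched below: at the least squares estimates both derivatives should vanish (up to rounding), and away from them they should not. The helper names and the data are hypothetical.

```python
def fit_line(x, y):
    """Least squares estimates via b1 = SS_xy / SS_xx, b0 = y_bar - b1 * x_bar."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    ss_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    ss_xx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = ss_xy / ss_xx
    return y_bar - b1 * x_bar, b1


def sse_gradient(x, y, b0, b1):
    """Partial derivatives of sum (y_i - b0 - b1 * x_i)^2 w.r.t. b0 and b1."""
    d_b0 = -2 * sum(yi - b0 - b1 * xi for xi, yi in zip(x, y))
    d_b1 = -2 * sum(xi * (yi - b0 - b1 * xi) for xi, yi in zip(x, y))
    return d_b0, d_b1


# Hypothetical data, for illustration only.
x = [1.0, 2.0, 4.0, 7.0]
y = [2.0, 3.0, 7.0, 9.0]

b0_hat, b1_hat = fit_line(x, y)
grad_at_fit = sse_gradient(x, y, b0_hat, b1_hat)
grad_elsewhere = sse_gradient(x, y, b0_hat + 1.0, b1_hat)

print("gradient at the LS estimates:", grad_at_fit)
print("gradient after shifting b0 by 1:", grad_elsewhere)
```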
Computation Table
Xi     Yi     Xi^2     Yi^2     XiYi
X1     Y1     X1^2     Y1^2     X1Y1
X2     Y2     X2^2     Y2^2     X2Y2
:      :      :        :        :
Xn     Yn     Xn^2     Yn^2     XnYn
ΣXi    ΣYi    ΣXi^2    ΣYi^2    ΣXiYi
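The computation table can be built programmatically: one row per observation, then a totals row. The sketch below uses made-up data and a hypothetical helper `computation_table`. It then feeds the totals into the shortcut slope formula, (ΣXiYi - ΣXi·ΣYi/n) / (ΣXi^2 - (ΣXi)^2/n).

```python
def computation_table(x, y):
    """Return (rows, totals) with columns Xi, Yi, Xi^2, Yi^2, XiYi."""
    rows = [(xi, yi, xi ** 2, yi ** 2, xi * yi) for xi, yi in zip(x, y)]
    totals = tuple(sum(col) for col in zip(*rows))
    return rows, totals


# Hypothetical data, for illustration only.
x = [1.0, 2.0, 3.0]
y = [2.0, 4.0, 5.0]
rows, totals = computation_table(x, y)

header = ("Xi", "Yi", "Xi^2", "Yi^2", "XiYi")
print("".join(f"{h:>8}" for h in header))
for row in rows + [totals]:  # the last printed row holds the column totals
    print("".join(f"{v:>8.2f}" for v in row))

# Slope from the totals row alone.
sum_x, sum_y, sum_x2, sum_y2, sum_xy = totals
n = len(x)
b1 = (sum_xy - sum_x * sum_y / n) / (sum_x2 - sum_x ** 2 / n)
print("slope from totals:", b1)
```

The Yi^2 column is not needed for the slope or intercept; it is kept because the table includes it.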
[Figure: scatter plot of Birthweight (0 to 4) against Estriol level (0 to 6)]
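$$\hat{\beta}_1 = \frac{\sum_{i=1}^{n} X_i Y_i - \dfrac{\left(\sum_{i=1}^{n} X_i\right)\left(\sum_{i=1}^{n} Y_i\right)}{n}}{\sum_{i=1}^{n} X_i^2 - \dfrac{\left(\sum_{i=1}^{n} X_i\right)^2}{n}} = \frac{37 - \dfrac{(15)(10)}{5}}{55 - \dfrac{(15)^2}{5}} = 0.70$$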
Parameter Estimates
Variable   DF   Parameter Estimate   Standard Error   t Value   Pr > |t|
(estimates shown: $\hat{\beta}_0$ and $\hat{\beta}_1$)
[Figure: scatter plot with M. Yield (lb.) on the Y axis (0 to 10) and X from 0 to 15]
Xi     Yi      Xi^2    Yi^2      XiYi
4      3.0     16      9.00      12
6      5.5     36      30.25     33
10     6.5     100     42.25     65
12     9.0     144     81.00     108
32     24.0    296     162.50    218    (totals)
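$$\hat{\beta}_1 = \frac{\sum_{i=1}^{n} X_i Y_i - \dfrac{\left(\sum_{i=1}^{n} X_i\right)\left(\sum_{i=1}^{n} Y_i\right)}{n}}{\sum_{i=1}^{n} X_i^2 - \dfrac{\left(\sum_{i=1}^{n} X_i\right)^2}{n}} = \frac{218 - \dfrac{(32)(24)}{4}}{296 - \dfrac{(32)^2}{4}} = 0.65$$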
• 2. Y-Intercept ($\hat{\beta}_0$)
• Average Milk Yield (Y) Is Expected to Be 0.8 lb. When Food Intake (X) Is 0
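The milk yield example can be reproduced from the four (X, Y) pairs in the computation table. The sketch below uses the shortcut sums for the slope and then the intercept formula. The variable names and the prediction at X = 8 are illustrative additions, not part of the slides.

```python
# Data from the computation table: food intake (X) and milk yield in lb. (Y).
x = [4.0, 6.0, 10.0, 12.0]
y = [3.0, 5.5, 6.5, 9.0]
n = len(x)

sum_x = sum(x)                                   # 32
sum_y = sum(y)                                   # 24
sum_x2 = sum(xi ** 2 for xi in x)                # 296
sum_xy = sum(xi * yi for xi, yi in zip(x, y))    # 218

# Slope from the shortcut formula, then intercept from the two means.
b1 = (sum_xy - sum_x * sum_y / n) / (sum_x2 - sum_x ** 2 / n)
b0 = sum_y / n - b1 * sum_x / n

print("slope:", round(b1, 2))
print("intercept:", round(b0, 2))

# Hypothetical use of the fitted line: predicted yield at a food intake of 8.
x_new = 8.0
print("predicted yield at X = 8:", round(b0 + b1 * x_new, 2))
```

The rounded estimates agree with the slide values: a slope of 0.65 and a Y-intercept of 0.8 lb.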