Simple linear regression of income on time (x = years since 1995).
Fitted line: f(x) = 1882.25x + 52847.25

Year   x   Income (y)   Fitted f(x)   Residual e = y - f(x)
1995   0   53807        52847.0         960.0
1996   1   55217        54729.3         487.7
1997   2   55209        56611.6       -1402.6
1998   3   55415        58493.9       -3078.9
1999   4   63100        60376.2        2723.8
2000   5   63206        62258.5         947.5
2001   6   63761        64140.8        -379.8
2002   7   65766        66023.1        -257.1

Note: the tabulated fitted values follow the rounded coefficients (slope 1882.3, intercept 52847). The years 2003-2014 (x = 8 to 19) have no observed income and are used for forecasting; for example, the forecast for 2010 (x = 15) is 81081.5.
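The fitted coefficients can be reproduced from the raw data with the usual least-squares formulas. A minimal sketch in Python (variable names are mine):

```python
# Least-squares fit of income on x = years since 1995 (data from the table).
years_x = list(range(8))  # x = 0..7 for 1995..2002
income = [53807, 55217, 55209, 55415, 63100, 63206, 63761, 65766]

n = len(years_x)
x_mean = sum(years_x) / n   # 3.5
y_mean = sum(income) / n    # 59435.125

# slope = sum((x - x_mean)(y - y_mean)) / sum((x - x_mean)^2)
sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(years_x, income))
sxx = sum((x - x_mean) ** 2 for x in years_x)
b1 = sxy / sxx              # 1882.25
b0 = y_mean - b1 * x_mean   # 52847.25

print(f"f(x) = {b1}x + {b0}")
```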
Deviations and squared errors (x̄ = 3.5, ȳ = 59435.125):

x − x̄   y − ȳ       (x − x̄)²   (y − ȳ)²    e²
-3.5    -5628.13    12.25      31675791     921600
-2.5    -4218.13     6.25      17792579     237851.29
-1.5    -4226.13     2.25      17860133    1967286.76
-0.5    -4020.13     0.25      16161405    9479625.21
 0.5     3664.875    0.25      13431309    7419086.44
 1.5     3770.875    2.25      14219498     897756.25
 2.5     4325.875    6.25      18713195     144248.04
 3.5     6330.875   12.25      40079978      66100.41

Σ(x − x̄)² = 42
Σ(y − ȳ)² = TSS ≈ 169933888
Σe² = RSS (residual sum of squares) = 21133554
RSE (standard deviation of e) = √(RSS / (n − 2)) = 1876.7683
Percentage error = RSE / ȳ = 0.0315768 = 3.158%
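The RSS, RSE, and percentage error follow directly from the residual column. A short Python sketch (ȳ is the mean income computed from the table):

```python
import math

# Residuals e = y - f(x) from the income table.
residuals = [960.0, 487.7, -1402.6, -3078.9, 2723.8, 947.5, -379.8, -257.1]
n = len(residuals)

rss = sum(e ** 2 for e in residuals)  # residual sum of squares (~21133554)
rse = math.sqrt(rss / (n - 2))        # residual standard error, n - 2 = 6 d.o.f.

y_mean = 59435.125                    # mean income over 1995-2002
pct_error = rse / y_mean              # RSE as a fraction of the mean response

print(rss, rse, pct_error)
```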
[Chart: income vs. x with linear trend line, f(x) = 1882.25x + 52847.25, R² = 0.87563661]
[Chart: income vs. years 1995-2002 with linear trend line]
Variance: in general, σ² is not known, but it can be estimated from the data. This estimate is known as the residual standard error, RSE = √(RSS/(n − 2)); for the income data above, RSE = 1876.768.

If SE(β̂1) is small, then even relatively small values of β̂1 may provide strong evidence that β1 ≠ 0, and hence that there is a relationship between X and Y. In contrast, if SE(β̂1) is large, then β̂1 must be large in absolute value in order for us to reject the null hypothesis.

t-statistics: to test the null hypothesis H0: β1 = 0, we compute t = β̂1 / SE(β̂1). The t-distribution has a bell shape, and for values of n greater than approximately 30 it is quite similar to the normal distribution. It is a simple matter to compute the probability of observing any value equal to |t| or larger, assuming β1 = 0; we call this probability the p-value. If we see a small p-value, then we can infer that there is an association between the predictor and the response: we reject the null hypothesis, that is, we declare a relationship to exist between X and Y, if the p-value is small enough.

The RSE is an estimate of the standard deviation of ε. Roughly speaking, it is the average amount that the response will deviate from the true regression line. If the predictions obtained using the model are very close to the true outcome values, that is, if ŷi ≈ yi for i = 1, ..., n, then the RSE will be small, and we can conclude that the model fits the data well. On the other hand, if ŷi is very far from yi for one or more observations, then the RSE may be quite large, indicating that the model doesn't fit the data well.

The RSE provides an absolute measure of lack of fit of the model to the data. But since it is measured in the units of Y, it is not always clear what constitutes a good RSE. The R² statistic fills this role.

TSS measures the total variance in the response Y, and can be thought of as the amount of variability inherent in the response before the regression is performed. In contrast, RSS measures the amount of variability that is left unexplained after performing the regression. Hence, TSS − RSS measures the amount of variability in the response that is explained (or removed) by the regression, and R² = (TSS − RSS)/TSS.

We might be able to use r = Cor(X, Y) instead of R² in order to assess the fit of the linear model. In fact, it can be shown that in the simple linear regression setting, R² = r². Correlation is denoted by 'r' and is also called the Pearson correlation coefficient.

In the next section we will discuss the multiple linear regression problem. The concept of correlation between the predictors and the response does not extend automatically to that setting, since correlation quantifies the association between a single pair of variables rather than between a larger number of variables.

The average of the β̂0 and β̂1 estimates over many data sets will be very close to the true β0 and β1, but a single estimate may substantially underestimate or overestimate them. How far off will that single estimate of β0 and β1 be?
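The SE and t-statistic formulas can be checked against the income regression. A minimal sketch in Python; the resulting SE(β̂1) ≈ 289.6 and t ≈ 6.5 are my own arithmetic from the worksheet's RSE and Σ(x − x̄)², not values printed in the sheet:

```python
import math

# Worksheet values for the income regression (n = 8).
rse = 1876.7683  # residual standard error
sxx = 42.0       # sum of (x - x_mean)^2
b1 = 1882.25     # estimated slope

# SE(beta1_hat) = RSE / sqrt(sum((x - x_mean)^2))
se_b1 = rse / math.sqrt(sxx)

# t-statistic for H0: beta1 = 0
t_stat = b1 / se_b1

print(se_b1, t_stat)
```

With n − 2 = 6 degrees of freedom this t value lies far in the tail of the t-distribution, so the corresponding p-value is very small and the null hypothesis is rejected.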
Percentage error: RSE / ȳ = 0.0315767528 = 3.158%
Correlation(x, y) = 0.9357545672; its square, 0.87563661, equals R².
Slope from the deviation products:
(x − x̄)(y − ȳ): 19698.4375, 10545.3125, 6339.1875, 2010.0625, 1832.4375, 5656.3125, 10814.6875, 22158.0625
Σ(x − x̄)(y − ȳ) = 79054.5
β̂1 = 79054.5 / 42 = 1882.25
Sample covariance = 79054.5 / 7 = 11293.5
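Dividing the sums by n − 1 = 7 gives the sample covariance and variances, from which the Pearson correlation follows. A sketch in Python; the Σ(y − ȳ)² total is my own sum of that column:

```python
import math

# Worksheet sums for the income regression (n = 8).
sxy = 79054.5      # sum of (x - x_mean)(y - y_mean)
sxx = 42.0         # sum of (x - x_mean)^2
syy = 169933888.0  # sum of (y - y_mean)^2 (my total of that column)

n = 8
cov_xy = sxy / (n - 1)           # 11293.5, the sample covariance
sd_x = math.sqrt(sxx / (n - 1))  # sample standard deviation of x
sd_y = math.sqrt(syy / (n - 1))  # sample standard deviation of y

r = cov_xy / (sd_x * sd_y)       # Pearson correlation coefficient
print(r, r ** 2)                 # r^2 matches the chart's R^2
```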
[Chart: income vs. years 1995-2002 with linear trend line, f(x) = 1882.25x + 52847.25, R² = 0.87563661]
Second example: repair time regressed on type of repair (a 0/1 dummy variable, "Coded").

Type of Repair: Electrical, Mechanical, Electrical, Mechanical, Electrical, Electrical, Mechanical, Mechanical, Electrical, Electrical
SUMMARY OUTPUT

Regression Statistics
Multiple R         0.7816138423
R Square           0.6109201985
Adjusted R Square  0.5622852234
Standard Error     0.713792687
Observations       10

ANOVA
            df   SS       MS       F             Significance F
Regression   1   6.4      6.4      12.561334642  0.0075733033
Residual     8   4.076    0.5095
Total        9   10.476

            Coefficients  Standard Error  t Stat         P-value       Lower 95%      Upper 95%
Intercept   4.62          0.319217794     14.4728774115  5.08347E-007  3.883882447    5.356117553
Coded       -1.6          0.4514421336    -3.5441973198  0.0075733033  -2.6410274269  -0.5589725731

RESIDUAL OUTPUT
Observation  Residuals  Standard Residuals
 1   -0.12   -0.178313988
 2   -0.02   -0.029718998
 3    0.18    0.267470982
 4   -1.22   -1.8128588779
 5   -0.12   -0.178313988
 6    0.28    0.416065972
 7   -0.42   -0.624098958
 8    0.18    0.267470982
 9   -0.22   -0.326908978
10    1.48    2.1992058518
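The Excel output can be reproduced with a 0/1 dummy regression. The sheet does not print the Coded column, but it can be recovered from the residual output (fitted value = time − residual, which is 3.02 where coded = 1 and 4.62 where coded = 0). A sketch under that reconstruction:

```python
# Repair time regressed on a 0/1 dummy ("Coded").
# The coded values are reconstructed from the residual output:
# fitted = time - residual is 3.02 (coded 1) or 4.62 (coded 0).
times = [2.9, 3.0, 4.8, 1.8, 2.9, 4.9, 4.2, 4.8, 4.4, 4.5]
coded = [1, 1, 0, 1, 1, 0, 0, 0, 0, 1]

n = len(times)
x_mean = sum(coded) / n  # 0.5
y_mean = sum(times) / n  # 3.82

sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(coded, times))
sxx = sum((x - x_mean) ** 2 for x in coded)  # 2.5
b1 = sxy / sxx                               # -1.6, the "Coded" coefficient
b0 = y_mean - b1 * x_mean                    # 4.62, the intercept

residuals = [y - (b0 + b1 * x) for x, y in zip(coded, times)]
rss = sum(e ** 2 for e in residuals)         # 4.076, the ANOVA residual SS
print(b0, b1, rss)
```

This also reproduces R² = 1 − 4.076/10.476 ≈ 0.611, as in the summary.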
Repair time data (n = 10, ȳ = 3.82 hours):

Repair time (hours): 2.9, 3.0, 4.8, 1.8, 2.9, 4.9, 4.2, 4.8, 4.4, 4.5

Deviation computations for the dummy predictor x and the response y:
(x − x̄)² = 0.25 for every observation (0/1 dummy with x̄ = 0.5); Σ(x − x̄)² = 2.5
(y − ȳ): -0.92, -0.82, 0.98, -2.02, -0.92, 1.08, 0.38, 0.98, 0.58, 0.68
Σ(y − ȳ) = -2.66454E-015 (zero up to floating-point rounding)
β̂1 = Σ(x − x̄)(y − ȳ) / Σ(x − x̄)² = -1.6, matching the Coded coefficient in the summary output.
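The "Standard Residuals" column in the Excel output appears to divide each residual by √(SSresid/(n − 1)); that scaling reproduces the printed values. A sketch (the divisor is inferred from the numbers, not from Excel documentation):

```python
import math

# Residuals from the RESIDUAL OUTPUT table above.
residuals = [-0.12, -0.02, 0.18, -1.22, -0.12, 0.28, -0.42, 0.18, -0.22, 1.48]
n = len(residuals)

ss_resid = sum(e ** 2 for e in residuals)  # 4.076, the residual sum of squares
scale = math.sqrt(ss_resid / (n - 1))      # Excel's apparent scaling factor
std_resid = [e / scale for e in residuals]

print(std_resid[0], std_resid[9])          # compare with the printed column
```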