Chapter 21 Business Statistic

QBM101 Module 4 - CHAPTER 21(19 Pages)
Chapter 21: Simple Linear Regression and Correlation

Regression analysis is used to predict the value of one variable (the dependent variable) on the
basis of other variables (the independent variables).
This technique may be the most commonly used statistical procedure because almost all
companies and government institutions forecast variables. Examples include weather
forecasting, stock market analysis, sales prediction, crop prediction, sports
prediction and oil price prediction.
Predictions are made in all areas.
Some predictions are more accurate than others because of the strength of the
relationship. That is, the stronger the relationship between the variables, the more
accurate the prediction. For example, predicting temperature in degrees Fahrenheit (F) from
degrees Celsius (C) using the equation
\[ F = 32 + \frac{9}{5} C \]
is 100% accurate because this is an area of pure science, where the relationship is exact.
Regression analysis provides a best-fit mathematical equation for the values of
the two variables using the method of least squares.
Correlation analysis measures the strength of the relationship between the variables.

21.1 Dependent and independent variables
The dependent variable (Y) is the variable being predicted or estimated. It is also
referred to as the response variable.
The independent variable (X) provides the basis for estimation. It is also
referred to as the predictor variable.
21.2 Regression Analysis model of the Population
The regression equation is
\[ Y = \beta_0 + \beta_1 X + \varepsilon \]
where:
o Y is the dependent or response variable. The line \( \beta_0 + \beta_1 X \) gives the average
predicted or estimated value of Y for any given value of X.
o X is the independent or predictor variable. It provides the basis for estimation.
o \( \beta_0 \) is the Y-intercept, or the estimated Y value when X = 0.
o \( \beta_1 \) is the slope of the line, or the average change in Y for each change of one
unit in X.
o \( \varepsilon \) (epsilon) is the random error term.
Regression equation of the sample:
\[ \hat{y} = b_0 + b_1 x \]
where
\[ b_1 = \frac{SS_{xy}}{SS_x} \quad \text{and} \quad b_0 = \bar{y} - b_1 \bar{x} \]

21.3 Assumptions underlying Simple Linear Regression
The four assumptions of regression (known by the acronym LINE) are as follows:
Linearity
Independence of Errors
Normality of Errors
Equal Variance (also called homoscedasticity)
21.3.1 Linearity
The relationship between the variables is linear.
21.3.2 Independence of Errors
The error variables are independent of one another.
21.3.3 Normality of Errors
The error variable \( \varepsilon \) is normally distributed at each value of x.
21.3.4 Equal Variance (Homoscedasticity)
The variance of the error variable is constant (the condition of homoscedasticity) for all
values of X.
Example 21.1
A consultant was employed to study the relationship between annual sales and annual
advertising expenditure of business firms in order to build a model to predict annual
sales based on annual advertising expenditures.
A simple regression analysis of the relationship between the annual sales($ million) and
annual advertising expenditure($ thousand) of a random sample of 30 firms is shown
below.
Raw Data:
No.   Annual Advertising Expenditures ($000)   Annual Sales ($ million)
1     22     4.5
2     28     5.1
3     31     5.3
4     31     5.4
5     35     5.9
6     43     6
7     43     6.5
8     48     6.6
9     43     6.6
10    49     6.8
11    56     6.9
12    52     7
13    57     7
14    58     7.5
15    61     8.7
16    60     8.9
17    62     9.2
18    66     9.5
19    64     9.6
20    69     10
21    67     10.2
22    72     10.4
23    75     10.5
24    78     10.8
25    77     11
26    82     11.2
27    81     11.5
28    83     11.7
29    85     12
30    89     12.4
a. Determine the least squares regression equation to predict annual sales based
on annual advertising expenditures.
b. Calculate the standard error of estimate.
c. Calculate the coefficient of determination.
To answer parts a, b and c, first work out the table of values below:
x       y       x^2       y^2       xy
22      4.5     484       20.25     99.00
28      5.1     784       26.01     142.80
31      5.3     961       28.09     164.30
31      5.4     961       29.16     167.40
35      5.9     1225      34.81     206.50
43      6       1849      36.00     258.00
43      6.5     1849      42.25     279.50
48      6.6     2304      43.56     316.80
43      6.6     1849      43.56     283.80
49      6.8     2401      46.24     333.20
56      6.9     3136      47.61     386.40
52      7       2704      49.00     364.00
57      7       3249      49.00     399.00
58      7.5     3364      56.25     435.00
61      8.7     3721      75.69     530.70
60      8.9     3600      79.21     534.00
62      9.2     3844      84.64     570.40
66      9.5     4356      90.25     627.00
64      9.6     4096      92.16     614.40
69      10      4761      100.00    690.00
67      10.2    4489      104.04    683.40
72      10.4    5184      108.16    748.80
75      10.5    5625      110.25    787.50
78      10.8    6084      116.64    842.40
77      11      5929      121.00    847.00
82      11.2    6724      125.44    918.40
81      11.5    6561      132.25    931.50
83      11.7    6889      136.89    971.10
85      12      7225      144.00    1020.00
89      12.4    7921      153.76    1103.60
Total:
1767    254.7   114129    2326.17   16255.90
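\[ \sum x = 1767, \quad \sum y = 254.7, \quad \sum xy = 16255.90, \quad \sum x^2 = 114129, \quad \sum y^2 = 2326.17 \]
From the above, the following are calculated:
\[ SS_x = \sum x^2 - \frac{(\sum x)^2}{n} = 114129 - \frac{1767^2}{30} = 10052.70 \]
\[ SS_y = \sum y^2 - \frac{(\sum y)^2}{n} = 2326.17 - \frac{254.7^2}{30} = 163.767 \]
\[ SS_{xy} = \sum xy - \frac{\sum x \sum y}{n} = 16255.9 - \frac{(1767)(254.7)}{30} = 1254.07 \]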
a. The required regression equation is
\[ \hat{y} = b_0 + b_1 x \]
where
\[ b_1 = \frac{SS_{xy}}{SS_x} = \frac{1254.07}{10052.7} = 0.124749569 \approx 0.12475 \]
and
\[ b_0 = \bar{y} - b_1 \bar{x} = \frac{254.7}{30} - 0.124749569 \left( \frac{1767}{30} \right) = 8.49 - 7.347749614 = 1.142250386 \]
Therefore the least squares regression equation is
\[ \hat{y} = 1.14225 + 0.12475x \]
where x is annual advertising expenditure in $000
and \( \hat{y} \) is predicted annual sales in $ million.
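The least squares calculation in part a can be checked with a short script. This is a minimal sketch in plain Python; the function name `least_squares` is an illustrative choice, not part of the course material. It applies the same formulas, b1 = SS_xy / SS_x and b0 = y-bar - b1 * x-bar, to the 30 data pairs of Example 21.1, and its result agrees with the intercept (1.142250341) and slope (0.12474957) reported in the Excel summary output of Section 21.4.

```python
# Annual advertising expenditure ($000) and annual sales ($ million), Example 21.1.
x = [22, 28, 31, 31, 35, 43, 43, 48, 43, 49, 56, 52, 57, 58, 61,
     60, 62, 66, 64, 69, 67, 72, 75, 78, 77, 82, 81, 83, 85, 89]
y = [4.5, 5.1, 5.3, 5.4, 5.9, 6.0, 6.5, 6.6, 6.6, 6.8, 6.9, 7.0, 7.0, 7.5, 8.7,
     8.9, 9.2, 9.5, 9.6, 10.0, 10.2, 10.4, 10.5, 10.8, 11.0, 11.2, 11.5, 11.7,
     12.0, 12.4]


def least_squares(x, y):
    """Return (b0, b1) for the fitted line y-hat = b0 + b1 * x."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    # Sums of squares and cross-products, as defined in the text.
    ss_x = sum(xi * xi for xi in x) - sum_x ** 2 / n
    ss_xy = sum(xi * yi for xi, yi in zip(x, y)) - sum_x * sum_y / n
    b1 = ss_xy / ss_x
    b0 = sum_y / n - b1 * sum_x / n
    return b0, b1


b0, b1 = least_squares(x, y)
print(f"b0 = {b0:.5f}, b1 = {b1:.5f}")
```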
b. The standard error of estimate measures the variation of the actual values of
y about the regression line and is written as
\[ S_e = \sqrt{\frac{SSE}{n-2}} \]
where
\[ SSE = SS_y - \frac{SS_{xy}^2}{SS_x} = 163.767 - \frac{1254.07^2}{10052.7} = 163.767 - 156.445 = 7.322 \]
Therefore
\[ S_e = \sqrt{\frac{SSE}{n-2}} = \sqrt{\frac{7.322}{30-2}} = 0.5114 \ \text{(\$ million)} \]
c. The coefficient of determination is
\[ R^2 = \frac{SS_{xy}^2}{SS_x \, SS_y} = \frac{1254.07^2}{(10052.70)(163.767)} = \frac{1572691.565}{1646300.521} = 0.9553 \]
which is the same as \( 1 - SSE / SS_y \).
Unlike the standard error of estimate, which is measured in the units of
measurement of y, the coefficient of determination has no units of
measurement. It measures the proportion of the variation in y that is
explained by the variation in x.
The value of the coefficient ranges from 0 to 1.
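Parts b and c can be reproduced from the three sums of squares alone. The following is a small sketch, assuming the values SS_x = 10052.70, SS_y = 163.767 and SS_xy = 1254.07 computed earlier in this example; the variable names are illustrative.

```python
import math

# Summary quantities taken from the worked example (n = 30 firms).
n = 30
ss_x, ss_y, ss_xy = 10052.70, 163.767, 1254.07

sse = ss_y - ss_xy ** 2 / ss_x            # unexplained (residual) sum of squares
s_e = math.sqrt(sse / (n - 2))            # standard error of estimate, in $ million
r_squared = ss_xy ** 2 / (ss_x * ss_y)    # coefficient of determination, no units

print(f"SSE = {sse:.3f}, Se = {s_e:.4f}, R^2 = {r_squared:.4f}")
```

Note that the square root is what returns the standard error to the units of y; SSE / (n - 2) on its own is the mean square error.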
21.4 Interpretation of computer output
The calculations above are tedious to do by hand. The Excel Data Analysis
program can generate the same outputs almost instantly. Students are therefore
encouraged to learn to use the Data Analysis program, although they will
not be tested on using it.
The emphasis is on testing students' interpretation of the outputs generated by
Excel.
To illustrate, the following outputs are generated by Excel:
1. Scatter Plot
2. Summary Output
3. ANOVA Table
4. Residual Plot
5. Histogram of the residuals
6. Residual output
Example 21.2
A consultant was employed to study the relationship between annual sales and annual
advertising expenditure of business firms in order to build a model to predict annual sales
based on annual advertising expenditure. A simple linear regression analysis of the
relationship between the sales ($ millions) and advertisement expenditure ($ thousands)
of a random sample of 30 firms was performed using EXCEL. The summary output and
charts for this analysis follow.
[Figure: Scatter plot showing annual sales ($ million) vs advertising expenditures ($'000). Vertical axis: annual sales ($ million), 0 to 14. Horizontal axis: annual advertising expenditures ($'000), 20 to 100.]
SUMMARY OUTPUT

Regression Statistics
Multiple R           0.977388491
R Square             0.955288263
Adjusted R Square    0.953691415
Standard Error       0.511381429
Observations         30

ANOVA
              df    SS             MS          F          Significance F
Regression    1     156.444693     156.4447    598.2338   1.94824E-20
Residual      28    7.322307042    0.261511
Total         29    163.767

                                         Coefficients   Standard Error   t Stat     P-value    Lower 95%     Upper 95%
Intercept                                1.142250341    0.314587143      3.63095    0.00112    0.497847798   1.786652883
Annual advertising expenditure ($'000)   0.12474957     0.005100392      24.45882   1.95E-20   0.11430189    0.13519725
[Figure: Annual advertising expenditure ($'000) residual plot. Vertical axis: residuals, -1.5 to 1. Horizontal axis: annual advertising expenditure ($'000), 20 to 100.]
[Figure: Histogram of residuals. Vertical axis: frequency, 0 to 15. Horizontal axis: residuals, with bins labelled -1.15, -0.85, -0.55, -0.25, 0.05, 0.35, 0.65.]
RESIDUAL OUTPUT
a. Interpret the scatter plot, identifying the independent and dependent variables.
In the scatter plot, annual advertising expenditure in $000 is the independent variable
and annual sales in $ million is the dependent variable.
There is a positive linear relationship between advertising expenditure and sales,
meaning that when advertising expenditure increases, sales are expected to increase, and
vice versa.
b. Write down the regression equation and interpret the slope coefficient.
The regression equation is
\[ \hat{y} = 1.14225 + 0.12475x \]
where x is the annual advertising expenditure in $000
and \( \hat{y} \) is the predicted annual sales in $ million.
1.14225 is the y-intercept and
0.12475 is the slope coefficient, which means that when annual advertising expenditure
increases by $1,000, annual sales are expected to increase by $0.12475 million (that is, $124,750).
c. Estimate the annual sales for the following annual advertising expenditures and comment
on the reliability of the estimates:
i. annual advertising expenditure of $60,000
ii. annual advertising expenditure of $100,000

i. \[ \hat{y} = 1.14225 + 0.12475(60) = 8.62725 \ \text{(\$ million)} \]
The estimate is quite reliable because $60,000 is within the range of the data.
ii. \[ \hat{y} = 1.14225 + 0.12475(100) = 13.61725 \ \text{(\$ million)} \]
The estimate is not reliable because $100,000 is outside the range of the data.
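The reliability comment in part c comes down to whether the x value lies inside the sampled range. A minimal sketch of that check follows; the function name `predict_sales` is hypothetical, and the range limits 22 and 89 ($000) are the smallest and largest advertising expenditures in the Example 21.1 data.

```python
def predict_sales(adv_000, b0=1.14225, b1=0.12475, x_min=22, x_max=89):
    """Return (predicted sales in $ million, True if x is inside the sampled range)."""
    y_hat = b0 + b1 * adv_000
    within_range = x_min <= adv_000 <= x_max
    return y_hat, within_range


for adv in (60, 100):
    y_hat, within = predict_sales(adv)
    note = "within the data range" if within else "outside the data range (extrapolation)"
    print(f"x = {adv}: y-hat = {y_hat:.5f} $ million, {note}")
```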
d. What is the value of the coefficient of determination? Interpret its meaning.
The coefficient of determination is the R Square value. If it is not given, it can be calculated
either as
\[ R^2 = (\text{Multiple } R)^2 = 0.977388491^2 = 0.955288262 \]
or as
\[ R^2 = \frac{SS_{Regression}}{SS_{Total}} = \frac{156.444693}{163.767} = 0.955288262 \approx 0.9553 \]
It means that approximately 95.53% of the variation in annual sales is
explained by the variation in annual advertising expenditure. The remaining 4.47% of the
variation in annual sales is not explained by the variation in annual advertising
expenditure. Therefore the regression model above is useful for predicting sales from
advertising expenditure.
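Both routes to R Square can be confirmed numerically from the figures in the summary output. This is a brief sketch with illustrative variable names.

```python
# Values read from the Excel summary output of Example 21.2.
multiple_r = 0.977388491
ss_regression, ss_total = 156.444693, 163.767

r2_from_multiple_r = multiple_r ** 2
r2_from_anova = ss_regression / ss_total
unexplained = 1 - r2_from_anova   # share of variation in sales left unexplained

print(f"{r2_from_multiple_r:.6f} {r2_from_anova:.6f} {unexplained:.4f}")
```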
e. What is the value of the standard error of estimate? Interpret its meaning.
The standard error of estimate, \( S_e \), is 0.511381429 ($ million).
It measures the fluctuation of the actual values of y about the regression line.
f. What is the value of the coefficient of correlation? Interpret its meaning.
The coefficient of correlation ranges from -1 to +1. In the above question the coefficient
of correlation is 0.977388491.
It means that there is a high degree of positive linear correlation between annual
advertising expenditure and annual sales.
Note: Multiple R gives only the size of the correlation coefficient, because it is always
reported as a positive figure. We need to decide whether the correlation is positive or
negative from the sign of the slope coefficient. In the above example, since the slope
coefficient is positive, the correlation coefficient is also positive.

g. Test whether there is a significant linear relationship between annual advertising
expenditure and annual sales at the 5% level of significance.
\( H_0: \beta_1 = 0 \) (There is no significant linear relationship between advertising expenditure and sales)
\( H_1: \beta_1 \neq 0 \) (There is a significant linear relationship between advertising expenditure and sales)
Test statistic:
\[ t = \frac{b_1 - \beta_1}{S_{b_1}} \]
p-value = 1.95E-20 = \( 1.95 \times 10^{-20} \)
Since the p-value (approximately 0) < 0.05, reject \( H_0 \).
There is a significant linear relationship between annual advertising expenditure and
annual sales at the 5% significance level.
h. Set up the 95% confidence interval estimate of the slope coefficient between annual
advertising expenditure and annual sales.
\[ CI(\beta_1) = b_1 \pm t_{0.025,\,28} \, S_{b_1} = 0.12475 \pm 2.048(0.0051) = 0.12475 \pm 0.01044 \]
which gives the limits 0.114302 and 0.135197 shown in the Excel output.
The 95% CI for the population slope is \( 0.114302 < \beta_1 < 0.135197 \).
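The t statistic and the confidence limits for the slope can be recomputed from the coefficient and its standard error. The sketch below assumes the critical value 2.048 used in the text (the t value with 28 degrees of freedom for a 95% two-sided interval) rather than computing it, so the limits agree with the Excel output only to about four decimal places.

```python
# Slope and its standard error from the Excel summary output.
b1, se_b1 = 0.12474957, 0.005100392
t_crit = 2.048   # assumed from t tables: 28 degrees of freedom, 95% two-sided

t_stat = (b1 - 0) / se_b1          # test statistic under H0: beta1 = 0
margin = t_crit * se_b1
lower, upper = b1 - margin, b1 + margin
reject_h0 = abs(t_stat) > t_crit   # critical-value form of the same test

print(f"t = {t_stat:.3f}, CI = ({lower:.4f}, {upper:.4f}), reject H0: {reject_h0}")
```

Comparing |t| with the critical value leads to the same decision as comparing the p-value with 0.05.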
i. Which diagrams can be used to check the assumptions of normality of the error
variable and constant variance of the error variable?
To determine whether the error variable is normally distributed, we examine the
histogram of the residuals.
The histogram given is not bell-shaped, and therefore the assumption of
normality of the error variable has been violated.
To evaluate the condition of constant variance of the error variable, we examine
the residual plot. The residual plot given shows that, as the value of x increases, the
residuals follow a pattern of increasing and decreasing values.
Therefore the assumption of constant variance of the error variable (homoscedasticity) has
been violated. In other words, there is a condition of heteroscedasticity.
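The residual plot and histogram are both built from the residuals e = y - y-hat. As a rough sketch, the residuals for the 30 firms can be computed from the fitted coefficients and grouped into bins; the bin width of 0.3 is an assumption made here for illustration and need not match the bins Excel used.

```python
import math

# Data of Example 21.1 and the coefficients from the Excel summary output.
x = [22, 28, 31, 31, 35, 43, 43, 48, 43, 49, 56, 52, 57, 58, 61,
     60, 62, 66, 64, 69, 67, 72, 75, 78, 77, 82, 81, 83, 85, 89]
y = [4.5, 5.1, 5.3, 5.4, 5.9, 6.0, 6.5, 6.6, 6.6, 6.8, 6.9, 7.0, 7.0, 7.5, 8.7,
     8.9, 9.2, 9.5, 9.6, 10.0, 10.2, 10.4, 10.5, 10.8, 11.0, 11.2, 11.5, 11.7,
     12.0, 12.4]
b0, b1 = 1.142250341, 0.12474957

# Residual = actual value minus predicted value.
residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]

# Crude text histogram: count residuals in bins of an assumed width of 0.3.
width = 0.3
counts = {}
for e in residuals:
    k = math.floor(e / width)
    counts[k] = counts.get(k, 0) + 1
for k in sorted(counts):
    print(f"[{k * width:+.1f}, {(k + 1) * width:+.1f}): {'*' * counts[k]}")
```

Listing the residuals in order of x (they are already stored that way) gives the information shown in the residual plot.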
Example 21.3
A study was conducted to examine the relationship between the marks scored in the statistics final
examination and the marks scored in the accounting final examination. Data were collected from a
random sample of 20 students, with the following results.
Final examination marks scored in statistics and accounting
Observation   Statistics   Accounting
1     10    13
2     15    12
3     18    22
4     24    25
5     35    30
6     38    36
7     45    48
8     48    44
9     50    54
10    55    50
11    65    62
12    68    66
13    71    69
14    76    74
15    82    85
16    85    87
17    88    86
18    89    92
19    92    90
20    94    96
a. i. Calculate the regression coefficients and hence write down the regression equation to predict
accounting marks.
ii. Calculate the standard error of estimate, indicating the units of measurement.
iii. Calculate the coefficient of correlation.
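Part a can be worked with the same sums-of-squares formulas used in Example 21.1. The following is a sketch under that assumption, with illustrative variable names; its results can be compared with the Excel output given in part b (intercept -0.2935, slope 0.9990, standard error 3.1845).

```python
import math

# Statistics (x) and accounting (y) marks of the 20 students.
x = [10, 15, 18, 24, 35, 38, 45, 48, 50, 55,
     65, 68, 71, 76, 82, 85, 88, 89, 92, 94]
y = [13, 12, 22, 25, 30, 36, 48, 44, 54, 50,
     62, 66, 69, 74, 85, 87, 86, 92, 90, 96]

n = len(x)
ss_x = sum(v * v for v in x) - sum(x) ** 2 / n
ss_y = sum(v * v for v in y) - sum(y) ** 2 / n
ss_xy = sum(a * b for a, b in zip(x, y)) - sum(x) * sum(y) / n

b1 = ss_xy / ss_x                        # slope
b0 = sum(y) / n - b1 * sum(x) / n        # intercept
sse = ss_y - ss_xy ** 2 / ss_x
s_e = math.sqrt(sse / (n - 2))           # standard error of estimate, in marks
r = ss_xy / math.sqrt(ss_x * ss_y)       # coefficient of correlation

print(f"y-hat = {b0:.4f} + {b1:.4f}x, Se = {s_e:.4f} marks, r = {r:.4f}")
```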
b. MS Excel was used to generate the following linear regression outputs and appropriate charts.
SUMMARY OUTPUT
Multiple R           A
R Square             0.9874
Adjusted R Square    0.9868
Standard Error       3.1845
Observations         20

ANOVA
              df    SS             MS             F          Significance F
Regression    1     14404.41397    14404.41397    1420.429   1.40343E-18
Residual      B     182.53603      10.14089
Total         19    14586.95

              Coefficients   Standard Error   t Stat    P-value
Intercept     -0.2935        1.6799           -0.1747   0.8632
Statistics    0.9990         0.0265           C         1.40343E-18

RESIDUAL OUTPUT
Observation   Predicted Accounting   Residuals
1     9.69664     3.30336
2     14.69172    -2.69172
3     17.68876    4.31124
4     23.68286    1.31714
5     34.67204    -4.67204
6     D           -1.66909
7     44.66220    3.33780
8     47.65925    -3.65925
9     49.65728    4.34272
10    54.65236    E
11    64.64252    -2.64252
12    67.63957    -1.63957
13    70.63662    -1.63662
14    75.63170    -1.63170
15    81.62580    3.37420
16    84.62285    2.37715
17    87.61989    -1.61989
18    88.61891    3.38109
19    91.61596    -1.61596
20    93.61399    2.38601
[Figure: Scatter diagram of marks of accounting and statistics. Vertical axis: accounting marks, 0 to 120. Horizontal axis: statistics marks, 0 to 100.]
i. Interpret the scatter diagram, indicating the dependent and the independent variable.
The statistics mark is the independent variable and the accounting mark is the dependent variable.
There is a positive linear relationship between statistics and accounting marks, meaning that when
the statistics mark increases, the accounting mark is also expected to increase.
ii. Use the output provided to write down the regression equation.
\[ \hat{y} = -0.2935 + 0.999x \]
where x is the statistics mark and \( \hat{y} \) is the predicted accounting mark.
iii. Interpret the slope coefficient.
0.999 is the slope coefficient, which means that when the statistics mark increases by 1, the accounting
mark is expected to increase by 0.999 (approximately 1).
iv. Find the missing values of A, B, C, D and E in the given computer outputs.
\[ A = \sqrt{0.9874} = 0.9937 \]
\[ B = 19 - 1 = 18 \]
\[ C = \frac{0.999}{0.0265} = 37.70 \]
\[ D = \hat{y} = -0.2935 + 0.999(38) = 37.6685 \]
or
\[ D = \hat{y} = y - \text{residual} = 36 - (-1.66909) = 36 + 1.66909 = 37.66909 \]
\[ E = \text{residual} = y - \hat{y} = 50 - 54.65236 = -4.65236 \]
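Each missing entry follows from a relationship between figures that are printed in the output. A minimal sketch of those relationships, using the rounded values shown in the output and illustrative variable names:

```python
import math

# Figures read from the Excel output of Example 21.3.
r_square = 0.9874
df_total = 19
b0, b1, se_b1 = -0.2935, 0.9990, 0.0265

A = math.sqrt(r_square)    # Multiple R is the square root of R Square
B = df_total - 1           # residual df = total df - regression df (1)
C = b1 / se_b1             # t Stat = coefficient / its standard error
D = b0 + b1 * 38           # predicted accounting mark for observation 6 (x = 38)
E = 50 - 54.65236          # residual for observation 10 = actual - predicted

print(f"A = {A:.4f}, B = {B}, C = {C:.2f}, D = {D:.4f}, E = {E:.5f}")
```

D computed from the rounded coefficients (37.6685) differs slightly from the value obtained through the residual (37.66909) because the printed coefficients are rounded to four decimal places.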
v. What is the value of the coefficient of determination? Interpret its meaning.
Coefficient of determination, R Square = 0.9874.
It means that approximately 98.74% of the variation in accounting marks is explained by the
variation in statistics marks.
Therefore 1.26% of the variation in accounting marks is not explained by the variation in
statistics marks.
vi. Can we check the assumptions of constant variance of the error variable and normality of the
distribution of the error variable based on the outputs given above? If so, check whether the
assumption(s) is (are) satisfied.
The assumption of normality of the error variable can be checked by looking at the histogram of
the residuals. Since this question does not provide the histogram, the assumption of normality
of the error variable cannot be checked.
The other assumption, constant variance of the error variable, can be checked by looking at the
residual plot. In the residual plot, as the statistics mark increases, there is no pattern in the movement
of the residuals. Therefore the assumption of constant variance of the error variable has not been
violated.
vii. At the 1% level of significance, is there evidence of a linear relationship between the final
examination marks of statistics and accounting?
\( H_0: \beta_1 = 0 \) (There is no significant linear relationship between statistics and accounting marks)
\( H_1: \beta_1 \neq 0 \) (There is a significant linear relationship between statistics and accounting marks)
Test statistic:
\[ t = \frac{b_1 - \beta_1}{S_{b_1}} \]
p-value = 1.40343E-18 = \( 1.40343 \times 10^{-18} \), which is approximately 0 and less than 0.01, so reject \( H_0 \).
Therefore there is a significant linear relationship between statistics and accounting marks.
Exercise 1
In a small fishing town the daily catches are sold locally. Recently, the fishermen have
complained about price fluctuations and reduced catches, and have requested the
government to introduce a minimum fish price. It was suspected that fluctuations in fish
prices were related to fish catches. A statistician was asked to study the relationship
between daily prices and daily catches in the fishing town. A random sample of 30 weeks
was selected, and the prices of fish ($) and the daily catches in kilograms were
recorded.
The prices range from a low of $3.00 to a high of $17.50 per kg.
The daily catches range from a low of 300 kg to a high of 1,000 kg.
The sample data were analyzed using Excel, and the summary output and appropriate
charts are provided below. However, because of a printer malfunction,
some of the data values are missing. They are indicated as A, B, C and D.
SUMMARY OUTPUT
Multiple R           A
R Square             0.9646
Adjusted R Square    0.9634
Standard Error       0.8426
Observations         30

              df          SS          MS          F           Significance F
Regression    [missing]   541.9942    541.9942    [missing]   7.32626E-22
Residual      28          19.8808     0.7100
Total         29          561.875

                             Coefficients   Standard Error   t Stat      P-value
Intercept                    24.5698        0.5406           45.4453     8.8279E-28
Average Daily Catch (kg.)    -0.0222        0.0008           [missing]   7.326E-22
[Figure: Scatterplot showing relationship between average daily catch (kg.) and price ($). Vertical axis: price ($), 0.00 to 20.00. Horizontal axis: average daily catch (kg.), 0 to 1200.]
[Figure: Average daily catch residual plot. Vertical axis: residuals, -2 to 2. Horizontal axis: average daily catch (kg.), 0 to 1500.]
[Figure: Histogram of residuals. Vertical axis: frequency, 0 to 8. Horizontal axis: residuals, with bins labelled -1.9, -1.5, -1.1, -0.7, -0.3, 0.1, 0.5, 0.9, 1.3.]
Use the Excel output provided to answer the following questions.
(a) Find the missing values of A, B, C, and D in the given computer output.
(b) Interpret the scatter plot and identify the dependent and the independent variables.
The independent variable is average daily catch and the dependent variable is price.
The scatter plot shows that there is a negative linear relationship between daily
catch and price. It means that when daily catch increases, price is expected to decrease.
(c) Write down the regression equation.
\[ \hat{y} = 24.5698 - 0.0222x \]
where x is the daily catch (kg) and \( \hat{y} \) is the predicted price ($).
(d) Interpret the slope coefficient.
(e) What is the value of the coefficient of determination? Interpret its meaning.
The coefficient of determination is 0.9646.
It means that 96.46% of the variation in price is explained by the variation in daily
catch. Therefore 3.54% of the variation is still unexplained.
(f) Predict the price for a given day with a daily catch of 850 kg. Is your estimate
reliable?
(g) Is there a linear relationship between daily catch and price at the 5%
significance level?
(h) Which graph is used to check the assumption of constant variance of the error
variable? Is there any evidence that this assumption has been violated?
(i) Which graph is used to check the assumption that the error variable must be
normally distributed? Comment on whether this assumption has been violated.
Exercise 2
A real estate company in a city would like to establish a model to predict the monthly
rent (RM) based on the size of the apartments measured in square feet(sq. ft.) in a
selected city.
A random sample of 15 apartments in the selected city was selected and the information
relating to monthly rent in RM and size in square feet were recorded.
MS EXCEL was used to produce the following charts and diagrams with some missing
figures labeled a to e.
Observation   Monthly Rent (RM)   Size (square feet)
1     1200    850
2     1700    1450
3     1200    1085
4     1500    1232
5     850     718
6     1700    1485
7     1500    1136
8     900     500
9     650     300
10    1150    956
11    1400    1100
12    1500    1285
13    2200    1985
14    1800    1800
15    1400    1400
SUMMARY OUTPUT
Multiple R           0.9656
R Square             a
Adjusted R Square    0.9271
Standard Error       108.4298
Observations         15

ANOVA
              df    SS          MS          F         Significance F
Regression    1     2106492     2106492     179.169   5.57758E-09
Residual      b     152841.2    11757.01
Total         14    2259333

                      Coefficients   Standard Error   t Stat   P-value    Lower 95%   Upper 95%
Intercept             390.56         78.81            4.96     0.0003     220.2969    560.8177
Size (square feet)    0.8559         0.0639           c        5.58E-09   0.7178      0.9940

RESIDUAL OUTPUT
Observation   Predicted Monthly Rent (RM)   Residuals
1     1118.071    81.92885
2     1631.61     68.38965
3     1319.207    -119.207
4     1445.024    54.97556
5     1005.093    -155.093
6     1661.567    d
7     1362.858    137.1418
8     818.5066    81.49339
9     647.3269    2.67312
10    e           -58.7964
11    1332.046    67.95418
12    1490.387    9.61293
13    2089.516    110.4839
14    1931.175    -131.175
15    1588.815    -188.815
i. Write down the regression equation and interpret the slope coefficient.
ii. What is the value of R Square (4 decimal places), marked a in the regression
statistics? Explain what it means.
iii. What are the other missing values, marked b to e (2 decimal places)?
iv. What is the value of the coefficient of correlation? Explain what it
measures.
v. At the 5% level of significance, is there a significant linear relationship between
the size of the apartments and rent?
vi. Which chart or diagram can be used to check whether the assumption of
homoscedasticity has been violated, and what is your conclusion?
vii. Estimate the monthly rent for apartments with a size of
a) 2,000 sq ft and b) 2,500 sq ft, and comment on the reliability of your
estimates.
