Bankole Oni

Project Part C

Keller Graduate School of Management


MATH 533: Managerial Statistics

Project Part C: Regression and Correlation Analysis


1. Generate a scatterplot for income ($1,000) versus credit balance ($), including the graph
of the best fit line. Interpret.
[Figure: Scatterplot of Credit Balance ($) vs Income ($1,000), with best fit line. Vertical axis: Credit Balance ($), 2000 to 6000; horizontal axis: Income ($1,000), 20 to 80.]

The plot shows that credit balance tends to rise as income rises, so there is a
positive linear relationship between income and credit balance. The slope of the
best fit line is positive.
2. Determine the equation of the best fit line, which describes the relationship between
income and credit balance.
Regression Analysis: Income ($1,000) versus Credit Balance($)
The regression equation is
Income ($1,000) = - 3.52 + 0.0119 Credit Balance($)
Predictor            Coef       SE Coef    T      P
Constant             -3.516     5.483      -0.64  0.524
Credit Balance($)    0.011926   0.001289   9.25   0.000
S = 8.40667 R-Sq = 64.1% R-Sq(adj) = 63.3%
Analysis of Variance
Source          DF  SS      MS      F      P
Regression      1   6052.7  6052.7  85.65  0.000
Residual Error  48  3392.3  70.7
Total           49  9445.0

Unusual Observations
Obs  Credit Balance($)  Income ($1,000)  Fit    SE Fit  Residual  St Resid
2    2047               25.00            20.90  2.96    4.10      0.52 X
4    3913               26.00            43.15  1.23    -17.15    -2.06R
R denotes an observation with a large standardized residual.
X denotes an observation whose X value gives it large leverage.
Based on the results of the regression, the best fit line relating the two variables is
Income ($1,000) = -3.516 + 0.011926 * Credit Balance($)
3. Determine the coefficient of correlation. Interpret.
The coefficient of correlation is r = 0.801, the square root of R-Sq = 64.1%
with the positive sign of the slope. This indicates a strong positive linear
relationship between the two variables: as one increases, the other tends to
increase as well.
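The value of r can be recovered from the Minitab output, since in simple linear regression r is the square root of R-Sq and carries the sign of the slope. A minimal sketch, using only the figures reported in question 2 (the variable names are illustrative):

```python
import math

r_squared = 0.641   # R-Sq from the Minitab output in question 2
slope = 0.011926    # coefficient of Credit Balance($)

# In simple linear regression, r has the same sign as the fitted slope.
r = math.copysign(math.sqrt(r_squared), slope)

print(round(r, 3))
```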
4. Determine the coefficient of determination. Interpret.
The coefficient of determination is 64.1%, which means that 64.1% of the
variability in income is accounted for by the regression on credit balance.
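The same percentage follows from the Analysis of Variance table in question 2, as the regression sum of squares divided by the total sum of squares. A short sketch with those two reported values (variable names are illustrative):

```python
ss_regression = 6052.7   # SS for Regression, from the ANOVA table
ss_total = 9445.0        # SS for Total, from the ANOVA table

# Share of the total variability in income explained by the fitted line.
r_squared = ss_regression / ss_total

print(f"R-Sq = {r_squared:.1%}")
```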
5. Test the utility of this regression model (use a two tail test with alpha = .05). Interpret your
results, including the p-value.
H0: rho = 0 (there is no significant correlation)
H1: rho is not 0 (two-tailed)
Significance level: alpha = 0.05
Rule: Reject H0 if the p-value < 0.05 (the significance level, alpha)

Since the p-value reported for Credit Balance($) in the regression output (0.000)
is less than 0.05, we reject the null hypothesis of no significant correlation.
This means the regression model is statistically useful.
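The test statistics behind this decision can be reproduced from the question 2 output. The sketch below recomputes the t statistic for the slope and the F statistic from the ANOVA table; in simple linear regression the two tests are equivalent, so t squared should be close to F (small differences come from rounding in the printed output). Variable names are illustrative.

```python
coef = 0.011926       # slope for Credit Balance($)
se_coef = 0.001289    # its standard error

ss_regression, df_regression = 6052.7, 1
ss_error, df_error = 3392.3, 48

# t statistic for H0: slope = 0
t_stat = coef / se_coef

# F statistic = mean square regression / mean square error
f_stat = (ss_regression / df_regression) / (ss_error / df_error)

print(f"t = {t_stat:.2f}, t^2 = {t_stat ** 2:.2f}, F = {f_stat:.2f}")
```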

6. Based on your findings in 1-5, what is your opinion about using credit balance to predict
income? Explain.
Based on the findings, the relationship is statistically significant: the independent
variable (credit balance) is significant in predicting the dependent variable (income).
We can therefore use credit balance to predict income, although with R-Sq = 64.1%
about a third of the variability in income remains unexplained.
7. Compute the 95% confidence interval for beta-1 (the population slope). Interpret this
interval.

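The 95% confidence interval for beta-1 is approximately (0.0093, 0.0145), computed
from the Minitab output as 0.011926 +/- 2.011 * 0.001289 with 48 degrees of freedom.
We are 95% confident that each additional dollar of credit balance is associated with
an increase in mean income of between about $9.30 and $14.50. Because the interval
does not contain 0, the slope is significantly different from zero.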
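As a cross-check, the interval can be recomputed from the slope estimate and standard error in the question 2 output. The sketch below does this; the critical value 2.0106 is an assumption taken from a t table for 48 degrees of freedom, not a number from the Minitab output. On the scale of that output (income in $1,000 per dollar of credit balance) the result is roughly (0.0093, 0.0145).

```python
coef = 0.011926      # slope for Credit Balance($), from the Minitab output
se_coef = 0.001289   # standard error of the slope, from the Minitab output

# Assumption: two-sided 95% critical value of t with 48 df, read from a t table.
T_CRIT_48 = 2.0106

margin = T_CRIT_48 * se_coef
lower, upper = coef - margin, coef + margin

print(f"95% CI for beta-1: ({lower:.4f}, {upper:.4f})")
```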
8. Using an interval, estimate the average income for customers that have credit balance of
$4,000. Interpret this interval.
The 95% confidence interval is (41.77, 46.61); we are 95% confident that the mean
income of customers who have a credit balance of $4,000 is between $41,770 and $46,610.
9. Using an interval, predict the income for a customer that has a credit balance of $4,000.
Interpret this interval.
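The 95% prediction interval is (27.11, 61.27); we are 95% confident that the income
of an individual customer who has a credit balance of $4,000 is between $27,110 and
$61,270. This interval is wider than the one in question 8 because it covers a single
customer rather than the mean of all such customers.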
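The intervals in questions 8 and 9 can be checked against each other. Both should be centred on the fitted value at $4,000, and the prediction interval should equal the fitted value plus or minus t * sqrt(S^2 + SE_fit^2). The sketch below backs SE_fit out of the reported confidence interval; the critical value 2.0106 (t table, 48 degrees of freedom) is an assumption, and the variable names are illustrative.

```python
import math

intercept, slope = -3.516, 0.011926   # coefficients from question 2
s = 8.40667                           # S from the Minitab output
ci_low, ci_high = 41.77, 46.61        # interval reported in question 8

# Assumption: two-sided 95% critical value of t with 48 df, from a t table.
T_CRIT_48 = 2.0106

fit = intercept + slope * 4000                 # fitted income in $1,000
se_fit = (ci_high - ci_low) / 2 / T_CRIT_48    # implied SE of the fit

# Prediction interval adds the residual variance S^2 to the variance of the fit.
pi_margin = T_CRIT_48 * math.sqrt(s ** 2 + se_fit ** 2)
pi_low, pi_high = fit - pi_margin, fit + pi_margin

print(f"fit = {fit:.2f}, PI = ({pi_low:.2f}, {pi_high:.2f})")
```

The recomputed prediction interval agrees with the reported (27.11, 61.27) to within rounding.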
10. What can we say about the income for a customer that has a credit balance of $10,000?
Explain your answer.
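Substituting $10,000 into the regression equation gives
Income ($1,000) = -3.516 + 0.011926 (10,000) = 115.74
So the expected income is about $115,700. However, the credit balances in the data
range only from about $2,000 to $6,000, so $10,000 lies well outside the observed
range and this extrapolated estimate is not reliable.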
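One caveat worth checking is whether $10,000 lies within the range of balances used to fit the line. The sketch below evaluates the fitted equation and flags values outside that range. The range of $2,000 to $6,000 is an assumption read off the scatterplot axis in question 1, and the function names are hypothetical.

```python
INTERCEPT = -3.516    # from the Minitab output in question 2
SLOPE = 0.011926      # from the Minitab output in question 2

# Assumption: observed credit balances, read from the scatterplot axis.
OBSERVED_RANGE = (2000, 6000)


def predict_income(balance):
    """Fitted income in $1,000 for a credit balance in dollars."""
    return INTERCEPT + SLOPE * balance


def is_extrapolation(balance, observed_range=OBSERVED_RANGE):
    """True when the balance lies outside the data used to fit the line."""
    low, high = observed_range
    return not (low <= balance <= high)


for balance in (4000, 10000):
    print(balance, round(predict_income(balance), 2), is_extrapolation(balance))
```

The fitted line gives about 115.7 ($1,000) at a balance of $10,000, but the flag shows this value is an extrapolation.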
11. In an attempt to improve the model, we attempt to do a multiple regression model
predicting income based on credit balance, years, and size.
Regression Analysis: Income ($1,0 versus Credit Balan, Years, Size
The regression equation is
Income ($1,000) = - 13.2 + 0.0108 Credit Balance($) + 1.21 Years + 0.615 Size
Predictor            Coef        SE Coef     T      P
Constant             -13.186     3.608       -3.65  0.001
Credit Balance($)    0.0107922   0.0008184   13.19  0.000
Years                1.2097      0.2322      5.21   0.000
Size                 0.6151      0.4178      1.47   0.148
S = 5.26121 R-Sq = 86.5% R-Sq(adj) = 85.6%
Analysis of Variance
Source          DF  SS      MS      F      P
Regression      3   8171.7  2723.9  98.41  0.000
Residual Error  46  1273.3  27.7
Total           49  9445.0

Source             DF  Seq SS
Credit Balance($)  1   6052.7
Years              1   2059.0
Size               1   60.0

Unusual Observations
Obs  Credit Balance($)  Income ($1,000)  Fit     SE Fit  Residual  St Resid
2    2047               25.000           13.786  2.415   11.214    2.40R
R denotes an observation with a large standardized residual.
Income ($1,000) = - 13.2 + 0.0108 * Credit Balance($) + 1.21 * Years + 0.615 * Size
12. Using MINITAB, run the multiple regression analysis using the variables credit balance,
years, and size to predict income. State the equation for this multiple regression model.
Income ($1,000) = - 13.2 + 0.0108 * Credit Balance($) + 1.21 * Years + 0.615 * Size
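To show how this equation is used, the sketch below evaluates it with the full-precision coefficients from the Minitab output in question 11. The function name and the example customer (balance $4,000, 10 years, size 3) are hypothetical and are not taken from the data set.

```python
# Coefficients from the Minitab output in question 11.
COEFFICIENTS = {
    "constant": -13.186,
    "credit_balance": 0.0107922,
    "years": 1.2097,
    "size": 0.6151,
}


def predict_income(credit_balance, years, size):
    """Fitted income in $1,000 from the multiple regression equation."""
    return (
        COEFFICIENTS["constant"]
        + COEFFICIENTS["credit_balance"] * credit_balance
        + COEFFICIENTS["years"] * years
        + COEFFICIENTS["size"] * size
    )


# Hypothetical customer, for illustration only.
example = predict_income(credit_balance=4000, years=10, size=3)
print(round(example, 2))
```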
13. Perform the global test for utility (F-test). Explain your conclusion.
The F statistic is 98.41 with a p-value of 0.000. Since the p-value is less than 0.05,
the null hypothesis that all of the slope coefficients are zero is rejected, so the
model as a whole is significant in predicting the dependent variable, income.
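The F statistic can be reproduced from the Analysis of Variance table in question 11 as the regression mean square divided by the error mean square. A short sketch with the reported sums of squares and degrees of freedom (variable names are illustrative):

```python
ss_regression, df_regression = 8171.7, 3   # Regression row of the ANOVA table
ss_error, df_error = 1273.3, 46            # Residual Error row

ms_regression = ss_regression / df_regression
ms_error = ss_error / df_error

# Global F statistic for H0: all slope coefficients are zero.
f_stat = ms_regression / ms_error

print(f"MSR = {ms_regression:.1f}, MSE = {ms_error:.1f}, F = {f_stat:.2f}")
```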
14. Perform the t-test on each independent variable. Explain your conclusions and clearly
state how you should proceed. In particular, state which independent variables should we
keep and which should be discarded.
Credit balance: t = 13.19, p-value = 0.000
Years: t = 5.21, p-value = 0.000
Size: t = 1.47, p-value = 0.148
For credit balance and years, the p-value is less than 0.05, so they are
significant in predicting income and should be kept in the model.
For size, the p-value is greater than 0.05, so it is not significant in predicting
income and should be removed from the model. The regression should then be rerun
with credit balance and years only.
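Each t statistic is the coefficient divided by its standard error, and the keep or drop decision compares its absolute value with the critical value. The sketch below applies that rule to the question 11 output; the critical value 2.0129 is an assumption taken from a t table for 46 degrees of freedom at alpha = 0.05 (two-tailed), and the variable names are illustrative.

```python
# (coefficient, standard error) pairs from the Minitab output in question 11.
predictors = {
    "Credit Balance($)": (0.0107922, 0.0008184),
    "Years": (1.2097, 0.2322),
    "Size": (0.6151, 0.4178),
}

# Assumption: two-sided 5% critical value of t with 46 df, from a t table.
T_CRIT_46 = 2.0129

t_stats = {name: coef / se for name, (coef, se) in predictors.items()}
keep = [name for name, t in t_stats.items() if abs(t) > T_CRIT_46]
drop = [name for name, t in t_stats.items() if abs(t) <= T_CRIT_46]

for name, t in t_stats.items():
    print(f"{name}: t = {t:.2f}")
print("keep:", keep)
print("drop:", drop)
```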
15. Is this multiple regression model better than the linear model that we generated in parts
1-10? Explain.
The coefficient of determination is 86.5% for the multiple regression and 64.1% for
the simple linear model, so the multiple regression explains a larger share of the
variance in income. The adjusted R-Sq (85.6% versus 63.3%) and the standard error S
(5.26 versus 8.41) point the same way, so the multiple regression model is better
than the simple linear regression.
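Because R-Sq never decreases when predictors are added, the adjusted R-Sq is the fairer basis for this comparison. The sketch below recomputes it for both models from the ANOVA tables as 1 - (SSE / df_error) / (SST / df_total); the function name is hypothetical.

```python
SS_TOTAL, DF_TOTAL = 9445.0, 49   # Total row, identical in both ANOVA tables


def adjusted_r_squared(ss_error, df_error):
    """Adjusted R-Sq: 1 - (error mean square) / (total mean square)."""
    return 1 - (ss_error / df_error) / (SS_TOTAL / DF_TOTAL)


simple_adj = adjusted_r_squared(ss_error=3392.3, df_error=48)     # question 2
multiple_adj = adjusted_r_squared(ss_error=1273.3, df_error=46)   # question 11

print(f"simple: {simple_adj:.1%}, multiple: {multiple_adj:.1%}")
```

Both values agree with the R-Sq(adj) figures printed by Minitab, and the multiple regression remains clearly ahead after the adjustment.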
Summary

Based on the analysis, income increases with credit balance. These relationships are
important for the company's growth. The analysis also shows that credit balance and
years are significant predictors of income, while size is not.
