SECTION-A
Isha Walia - 025
Ishu Bhardwaj - 026
Karan Kakkar - 030
Vinit Durshetti - 061
Gaurav Kushwaha - 371
Checking distribution of response variable (total cost to hospital)
Exploratory Analysis of Log (TOTAL.COST.TO.HOSPITAL) with respect to predictors AGE, GENDER and
MARITAL STATUS
Log (TOTAL.COST.TO.HOSPITAL) appears to increase with AGE, and it is higher for male patients and for
married patients.
Fitting Linear Regression Model
1. Develop a suitable simple linear regression model to check if there is any relationship
between Total Cost to Hospital and AGE. For the fitted model, interpret the regression
coefficient corresponding to AGE.
A1. Fitting a linear regression model for log (TOTAL.COST.TO.HOSPITAL) with AGE as the predictor gives:
Residuals:
Coefficients:
The regression coefficient suggests that each additional year of age increases the cost to the hospital
by about 1.99% of the previous cost. Because the response is modelled on the log scale, the increase in
cost is geometric (multiplicative) rather than additive.
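The geometric growth described above can be illustrated with a short calculation. The sketch below is in Python rather than the R used for the analysis, and it takes the 1.99% per-year figure from the text as given; the names `annual_growth`, `implied_slope` and `cumulative_increase` are illustrative only.

```python
import math

# Per-year growth in cost reported in the text (1.99% of the previous cost).
annual_growth = 0.0199

# Log-scale slope implied by that growth rate: exp(slope) - 1 = annual_growth.
implied_slope = math.log(1 + annual_growth)

def cumulative_increase(years: int) -> float:
    """Fractional increase in cost after `years` extra years of age.

    On the log scale the effect is additive (slope * years), so on the
    original scale it compounds: exp(slope * years) - 1.
    """
    return math.exp(implied_slope * years) - 1

for years in (1, 5, 10):
    print(f"{years:>2} years older: cost up by {cumulative_increase(years):.1%}")
```

Ten extra years compound to roughly a 22% increase, more than the 19.9% that a simple additive reading of 1.99% per year would give.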
2. At the time of admission, suppose a patient's age is 50 years. Based on the fitted model in (1),
what will be the minimum cost of treatment for this patient at 95% confidence level?
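A2. Predicting at AGE = 50 at the 95% confidence level gives, on the log scale:
       fit      lwr      upr
1 12.24298 11.34373 13.14223
Back-transforming the lower bound gives exp(11.34373) ≈ 84434.41.
So the minimum cost to the hospital for a patient of age 50, at the 95% confidence level, is about
INR 84,434.41.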
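The back transformation can be reproduced from the three log-scale numbers alone. This is a minimal sketch in Python (not the authors' R code); it assumes the response was transformed with the natural logarithm, which is consistent with the 84434.41 figure reported here.

```python
import math

# Log-scale prediction at AGE = 50, copied from the output above.
fit, lwr, upr = 12.24298, 11.34373, 13.14223

# exp() undoes the natural-log transform of TOTAL.COST.TO.HOSPITAL.
cost_fit, cost_lwr, cost_upr = (math.exp(v) for v in (fit, lwr, upr))

print(f"point estimate: {cost_fit:,.2f}")
print(f"lower bound:    {cost_lwr:,.2f}")
print(f"upper bound:    {cost_upr:,.2f}")
```

Note that exp(fit) estimates the median cost rather than the mean, because the model's errors are symmetric on the log scale and not on the original scale.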
3. Suppose Mission Hospital is planning to introduce a package price for the treatment and has
decided to charge INR 250,000 for patients of age 50 years. What is the probability that the
treatment cost will exceed the package price? Do you think that the Mission Hospital should revise
the package price?
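A3. Since log(250,000) ≈ 12.43 lies above the fitted value of 12.24 from (2), the 65% figure from the
fitted model is the probability that the cost stays at or below INR 250,000; the probability that the
cost exceeds the package price is about 35%.
Since the probability of exceeding the package price is below 50%, the hospital is more likely than not
to recover its cost at this price. A package price at which the cost is equally likely to fall above or
below it corresponds to the median predicted cost, exp(12.24298) ≈ INR 207,500, so INR 250,000 already
sits above that benchmark and does not need to be raised on these grounds.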
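A rough cross-check of the exceedance probability can be made from the numbers reported in (2) alone. The sketch below (Python, not the authors' R code) rests on two assumptions that the source does not confirm: that the interval in (2) is a prediction interval, and that its half-width is about 1.96 standard deviations. Because log(250,000) ≈ 12.43 is above the fitted value 12.24298, the exceedance probability under these assumptions comes out below one half.

```python
import math
from statistics import NormalDist

# Log-scale prediction at AGE = 50, copied from the output in (2).
fit, upr = 12.24298, 13.14223
package_price = 250_000

# ASSUMPTION: the reported interval is a prediction interval and its
# half-width is about 1.96 standard deviations (normal approximation).
sd_log_cost = (upr - fit) / 1.96

# Standardised distance of log(package price) from the fitted value.
z = (math.log(package_price) - fit) / sd_log_cost

p_exceed = 1 - NormalDist().cdf(z)
median_cost = math.exp(fit)  # price at which exceedance would be 50%

print(f"P(cost > {package_price:,}) is about {p_exceed:.2f}")
print(f"50/50 price is about {median_cost:,.0f}")
```

An exact calculation would use the residual standard error and the t distribution from the fitted model, so the result here is only indicative.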
4. Build a simple linear regression model between Total Cost to Hospital and GENDER.
Interpret the results.
A4.
Residuals:
     Min       1Q   Median       3Q      Max
-1.31142 -0.28273 -0.08258  0.26109  1.57082
Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept) 11.93436    0.05503 216.865  < 2e-16 ***
GENDERM      0.19082    0.06726   2.837  0.00493 **
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
The results of the regression model suggest that the cost to the hospital is about 21% higher for a
male than for a female (exp(0.19082) ≈ 1.21), and the difference is significant at the 1% level
(p = 0.00493).
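Because the response is on the log scale, the GENDERM coefficient translates into a percent difference through exp(coefficient) - 1. The Python sketch below applies that conversion to the estimate above and adds an approximate 95% range; the use of z = 1.96 instead of the exact t quantile is an assumption, and `percent_effect` is an illustrative helper name.

```python
import math

# GENDERM estimate and standard error, copied from the output above.
estimate, std_error = 0.19082, 0.06726

def percent_effect(beta: float) -> float:
    """Percent change in cost implied by a log-scale coefficient."""
    return (math.exp(beta) - 1) * 100

# ASSUMPTION: a normal approximation (z = 1.96) for the 95% interval.
z = 1.96
low = percent_effect(estimate - z * std_error)
high = percent_effect(estimate + z * std_error)

print(f"male vs female: {percent_effect(estimate):.1f}% "
      f"(approx. 95% interval {low:.1f}% to {high:.1f}%)")
```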
5. Build a simple linear regression model between Total Cost to Hospital and MARITAL
STATUS. Interpret the results.
A5.
Residuals:
    Min      1Q  Median      3Q     Max
-1.3608 -0.2360 -0.0334  0.2396  1.4042
Coefficients:
                        Estimate Std. Error t value Pr(>|t|)
(Intercept)             12.29182    0.04466 275.229   <2e-16 ***
MARITAL.STATUSUNMARRIED -0.40697    0.05944  -6.847    6e-11 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
From the model we can infer that married patients cost the hospital about 1.50 times as much as
unmarried patients (exp(0.40697) ≈ 1.50), and the difference is highly significant.
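The married-to-unmarried ratio follows directly from the two coefficients above. This Python sketch (not the authors' R code) back-transforms the fitted log cost for each group; it assumes a natural-log response, and the resulting values are median-type estimates on the original scale.

```python
import math

# Coefficients copied from the output above (log scale).
intercept = 12.29182    # married patients (reference level)
unmarried = -0.40697    # shift for unmarried patients

cost_married = math.exp(intercept)
cost_unmarried = math.exp(intercept + unmarried)
ratio = cost_married / cost_unmarried  # equals exp(-unmarried)

print(f"married:   {cost_married:,.0f}")
print(f"unmarried: {cost_unmarried:,.0f}")
print(f"ratio:     {ratio:.2f}")
```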
6. Build a multiple linear regression model with Total Cost to Hospital as dependent variable,
and AGE, GENDER and MARITAL STATUS as predictors. Compare the results with that of (4)
and (5).
A6.
Residuals:
    Min      1Q  Median      3Q     Max
-1.5285 -0.2603 -0.0104  0.2470  1.3529
Coefficients:
                         Estimate Std. Error t value Pr(>|t|)
(Intercept)             11.790187   0.151136  78.011  < 2e-16 ***
AGE                      0.007637   0.002555   2.989  0.00308 **
GENDERM                  0.104211   0.062490   1.668  0.09667 .
MARITAL.STATUSUNMARRIED -0.032630   0.132570  -0.246  0.80578
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
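On running the model with age, gender and marital status together, the coefficient of AGE remains
positive and significant, although at 0.0076 (about 0.77% per year) it is smaller than the 1.99% per
year found in (1). The coefficients of gender and marital status change substantially: GENDERM falls
from 0.191 in (4) to 0.104 and is no longer significant at the 5% level (p = 0.097), and
MARITAL.STATUSUNMARRIED shrinks from -0.407 in (5) to -0.033 (p = 0.806).
The above table suggests that married males cost the hospital more than expected, which points to a
possible interaction between gender and marital status. The model contains no interaction term,
however, so the change in coefficients is more directly explained by correlation among the predictors:
the standard error of MARITAL STATUS rises from 0.059 to 0.133 once AGE is included, which is what
one would expect if marital status and age are strongly related.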
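The comparison with (4) and (5) can be made concrete by putting the reported estimates side by side. The Python sketch below uses only numbers copied from the outputs in (4), (5) and (6); the dictionary layout and the `compare` helper are illustrative, not part of the original analysis.

```python
# (estimate, std. error) copied from the single-predictor models (4), (5)
# and from the three-predictor model in (6).
simple = {
    "GENDERM": (0.19082, 0.06726),
    "MARITAL.STATUSUNMARRIED": (-0.40697, 0.05944),
}
multiple = {
    "GENDERM": (0.104211, 0.062490),
    "MARITAL.STATUSUNMARRIED": (-0.032630, 0.132570),
}

def compare(term: str) -> dict:
    """Summarise how one term changes between the two fits."""
    b_s, se_s = simple[term]
    b_m, se_m = multiple[term]
    return {
        "estimate_ratio": b_m / b_s,   # share of the simple-model effect left
        "se_ratio": se_m / se_s,       # >1 means a less precise estimate
        "t_multiple": b_m / se_m,      # t value = estimate / std. error
    }

for term in simple:
    summary = compare(term)
    print(term, {k: round(v, 3) for k, v in summary.items()})
```

A standard error that more than doubles for marital status, together with a much smaller estimate, is the usual signature of a predictor that overlaps heavily with another one in the model.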
7. Build a multiple linear regression model with appropriate set of predictors. Identify the
statistically significant predictors that the Mission Hospital can use in predicting Total Cost to
Hospital. Comment on the performance of the fitted model. How does the fitted model help
Mission Hospital to take managerial decisions?
A7.
Residuals:
     Min       1Q   Median       3Q      Max
-0.96533 -0.18093 -0.01659  0.19462  1.19165
P-value: 5.174e-13
The significant variables are age, complaint codes, the patient's pulse at the time of admission and
the patient's creatinine level.
The R square of the fitted model is only 53%, while the adjusted R square is 42.8%. The predictors thus
explain only about half of the variation in log cost, which suggests that the model is not a very good
predictor of the cost to the hospital.
Using the above model, the managers could identify the factors that lead to higher costs and hence
build packages for which the actual cost is unlikely to exceed the package price.
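The gap between R square and adjusted R square reflects the penalty for the number of predictors. The Python sketch below states the standard adjustment formula and derives the penalty factor implied by the two reported figures; the sample size and predictor count in the final line are hypothetical, since the source does not state them for this model.

```python
def adjusted_r2(r2: float, n: int, p: int) -> float:
    """Adjusted R-squared for n observations and p predictors."""
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)

# Values reported in the text.
r2, adj_r2 = 0.53, 0.428

# The penalty factor (n - 1) / (n - p - 1) implied by the two figures.
penalty = (1 - adj_r2) / (1 - r2)
print(f"implied (n - 1) / (n - p - 1): {penalty:.3f}")

# HYPOTHETICAL n and p, chosen only to show the formula in use;
# the source does not state either value for this model.
print(f"example: {adjusted_r2(0.53, n=100, p=18):.3f}")
```

A penalty factor well above 1 indicates that the model uses many parameters relative to the number of observations, which is consistent with categorical predictors such as complaint codes contributing several dummy variables.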
APPENDIX: R CODE