1. A researcher has collected data for 125 patients discharged from the Emergency Department (ED) of
Hospital XYZ in December 2015 (Excel file: 2300assignment5.xls). The data includes the patient
unique identifier, age (years), gender (female = 1; male = 2), arrival date/time, discharge date/time
and resource intensity weights (RIW) at three different time points.
a. Input the data in SPSS
b. Is age skewed (i.e., non-normal)? Copy and paste the generated SPSS output table and show your
calculations to determine skewness.
Analyze, Descriptive Statistics, Frequencies
Skewness calculation:
IF |skewness| > 2(Std. Error of Skewness)
THEN variable is skewed
Since |-.022| = .022 < 2(.217) = .434, age is not skewed.
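The rule of thumb above can be checked in a few lines. This is a minimal sketch, assuming the skewness (-.022) and its standard error (.217) are read off the SPSS Frequencies output; the function name `is_skewed` is hypothetical.

```python
def is_skewed(skewness: float, std_error: float) -> bool:
    """Rule of thumb from the text: skewed if |skewness| > 2 * SE of skewness."""
    return abs(skewness) > 2 * std_error


# Values taken from the SPSS output quoted above
print(is_skewed(-0.022, 0.217))
```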
c. Create z-scores for age. Generate a case summary report of the variables age and ZAge (z-scores of the variable age) for the first 10 patients. Copy and paste the generated SPSS case
summary report.
Analyze, Descriptive Statistics, Descriptives
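For readers without SPSS, the z-score transformation can be sketched as follows. The ages below are hypothetical, not the assignment data, and the sketch assumes the standardization uses the sample standard deviation (n − 1 in the denominator).

```python
from statistics import mean, stdev


def z_scores(values):
    """Standardize values: subtract the mean, divide by the sample SD."""
    m = mean(values)
    s = stdev(values)  # sample standard deviation (n - 1 denominator)
    return [(v - m) / s for v in values]


# Hypothetical ages, for illustration only
ages = [34, 51, 67, 45, 78]
for age, z in zip(ages, z_scores(ages)):
    print(age, round(z, 5))
```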
d. Using the z-score for Patient 1, what percentage of patients were younger than Patient 1? Show
your work.
z-score for Patient 1 = 1.22505
To determine what percent of patients were younger than Patient 1, we need to find:
P(z < 1.22505)
Since Table 1 lists z-scores only to two decimal places, we round 1.22505 to 1.23:
P(z < 1.22505)
≈ P(z < 1.23)
= .8907
= .89 (rounded to 2 decimal places), or 89.07% of patients
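The table lookup can be cross-checked with the standard normal CDF. This is a minimal sketch using Python's standard library; it assumes, as the calculation above does, that ages are approximately normally distributed.

```python
from statistics import NormalDist

z_patient1 = 1.22505

# Table 1 lists z to two decimals, so mimic the lookup at z = 1.23
z_table = round(z_patient1, 2)
p_table = NormalDist().cdf(z_table)

# Using the unrounded z-score gives a slightly smaller probability
p_exact = NormalDist().cdf(z_patient1)

print(f"P(z < {z_table}) = {p_table:.4f}")
print(f"P(z < {z_patient1}) = {p_exact:.4f}")
```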
e. Create a new variable called meanRIW using the MEAN function in SPSS that returns the
average of the three RIWs for patients that satisfy the following condition: age 90. Copy and
paste the screenshot of the Compute Variable window used to create this new variable.
Transform, Compute Variable
f. Create a new variable called LOShr_retainfrac that is the length of stay in hours with the
fractional part retained (LOShr_retainfrac = discharge date/time − arrival date/time). Generate
a descriptive table for LOShr_retainfrac that includes the mean, standard deviation, median,
quartiles and skewness. Copy and paste the generated SPSS output table.
Transform, Date and Time Wizard
g. Create a new variable called ED_arrival_hour that extracts the hour that the patient arrived to
the ED. Copy and paste the screenshots of the Date and Time Wizard windows used to create
this new variable.
Transform, Date and Time Wizard
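Parts f and g both come down to date/time arithmetic, which can be sketched outside SPSS as below. The timestamps are hypothetical, not taken from the assignment file, and `los_hours` is an illustrative helper name.

```python
from datetime import datetime


def los_hours(arrival: datetime, discharge: datetime) -> float:
    """Length of stay in hours, keeping the fractional part."""
    return (discharge - arrival).total_seconds() / 3600


# Hypothetical patient: arrives 13:45, discharged 19:15 the same day
arrival = datetime(2015, 12, 1, 13, 45)
discharge = datetime(2015, 12, 1, 19, 15)

print(los_hours(arrival, discharge))  # length of stay in hours
print(arrival.hour)                   # hour of ED arrival (0-23)
```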
h. Create a new variable called absZAge that returns the absolute value of ZAge for patients that
satisfy the following condition: age 95 and ED_arrival_hour 13. Copy and paste the
screenshot of the Compute Variable window used to create this new variable.
Transform, Compute Variable
2. In a test of significance, the p value of the test statistic is .063. For each of the below scenarios,
determine whether the statement is true or false (answer true or false):
a. The data are statistically significant at both the α = .05 and α = .01 levels.
FALSE (to be statistically significant at both the α = .05 and α = .01 levels, the p value must be < .01.
Since p = .063 > .01, this statement is false)
b. The data are statistically significant at the α = .05 level but not at the α = .01 level.
FALSE (to be statistically significant at the α = .05 level but not at the α = .01 level, the p value must be
between .01 and .05, i.e., 0.01 < p < 0.05. Since .063 > .05, this statement is false)
c. The data are statistically significant at the α = .01 level but not at the α = .05 level.
FALSE (this statement can never be true; i.e., p cannot be < 0.01 and > 0.05 at the same time).
d. The data are statistically significant at neither the α = .05 nor the α = .01 level.
TRUE (since p = .063, .063 > .05 and .063 > .01; therefore the data are statistically significant at neither the
α = .05 nor the α = .01 level)
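The four answers follow mechanically from comparing p with each α. A minimal sketch, assuming the usual decision rule that a result is significant when p < α; the helper name `significant_at` is hypothetical.

```python
def significant_at(p_value: float, alpha: float) -> bool:
    """Decision rule assumed here: significant when p < alpha."""
    return p_value < alpha


p = 0.063
statements = {
    "a": significant_at(p, 0.05) and significant_at(p, 0.01),
    "b": significant_at(p, 0.05) and not significant_at(p, 0.01),
    "c": significant_at(p, 0.01) and not significant_at(p, 0.05),
    "d": not significant_at(p, 0.05) and not significant_at(p, 0.01),
}

for label, truth in statements.items():
    print(label, "TRUE" if truth else "FALSE")
```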
3. The mean total cholesterol level is reported as 218.7 mg/dl with a standard deviation of 18.5 mg/dl.
Investigators hypothesize that people who eat oatmeal daily for breakfast would have different total
cholesterol. Investigators plan to run a test of hypothesis and want 90% power to detect a
difference of 10 points in mean total cholesterol levels. How many participants must be enrolled in
the study? Assume α = .05.
$$n = \left(\frac{z_{1-\alpha/2} + z_{1-\beta}}{ES}\right)^2$$
Recall that 1 − α/2 is a probability (i.e., an area underneath the curve, one of the values inside the table); therefore we
look inside the table for the value 0.975, which corresponds to a z-score of 1.96.
$z_{.975} = 1.96$
$1 - \beta = .9$
Power is also a probability (i.e., an area underneath the curve, one of the values inside the table); therefore we
look inside the table for the value 0.9. The closest value to 0.9 is 0.8997, which corresponds to a z-score of 1.28.
$z_{.9} = 1.28$
$$ES = \frac{|\mu_1 - \mu_0|}{\sigma} = \frac{10}{18.5} = 0.54054$$
$$n = \left(\frac{1.96 + 1.28}{0.54054}\right)^2 = 35.92811$$
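The sample size arithmetic can be reproduced as a sketch in Python. Two assumptions are worth flagging: the z-values are computed from the normal distribution rather than read from the table, so the unrounded result differs slightly from 35.92811, and the final step of rounding up to a whole participant is a common convention that the calculation above does not show. The function name is hypothetical.

```python
from math import ceil
from statistics import NormalDist


def sample_size_one_mean(diff, sd, alpha=0.05, power=0.90):
    """n = ((z_{1-alpha/2} + z_{1-beta}) / ES)^2 for a two-sided test of one mean."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 for alpha = .05
    z_power = NormalDist().inv_cdf(power)          # about 1.28 for 90% power
    effect_size = abs(diff) / sd
    return ((z_alpha + z_power) / effect_size) ** 2


n_raw = sample_size_one_mean(diff=10, sd=18.5)
n = ceil(n_raw)  # assumption: round up to the next whole participant

print(round(n_raw, 2), n)
```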
$$73 \pm 1.96 \cdot \frac{11}{\sqrt{83}}$$
$$73 \pm 2.575 \cdot \frac{11}{\sqrt{83}}$$
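The two z-based intervals can be evaluated with a short sketch. It assumes the numbers above are a sample mean of 73, a standard deviation of 11, and n = 83, at 95% and 99% confidence; the z-values are computed rather than read from a table, so the 99% multiplier is 2.576 rather than 2.575. The function name `z_interval` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def z_interval(mean, sd, n, confidence):
    """Confidence interval for a mean using the standard normal multiplier."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sd / sqrt(n)
    return mean - margin, mean + margin


for level in (0.95, 0.99):
    lo, hi = z_interval(73, 11, 83, level)
    print(f"{level:.0%} CI: ({lo:.2f}, {hi:.2f})")
```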
For a 95% CI, we will look in the t table with degrees of freedom = n − 1 = 10 − 1 = 9.
t = 2.262
$$4.67 \pm 2.262 \cdot \frac{1.74}{\sqrt{10}}$$
For a 99% CI, we will look in the t table with degrees of freedom = n − 1 = 10 − 1 = 9.
t = 3.250
$$4.67 \pm 3.250 \cdot \frac{1.74}{\sqrt{10}}$$
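For a 95% CI, we will look in the t table with degrees of freedom = n − 1 = 5 − 1 = 4.
t = 2.776
$$4.67 \pm 2.776 \cdot \frac{1.74}{\sqrt{5}}$$
For a 99% CI with the same degrees of freedom, t = 4.604.
$$4.67 \pm 4.604 \cdot \frac{1.74}{\sqrt{5}}$$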
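The t-based intervals can be evaluated the same way. This sketch assumes the mean is 4.67 and the standard deviation 1.74, and takes the critical values directly from the t table as quoted above, since Python's standard library has no t distribution. The function name `t_interval` is hypothetical.

```python
from math import sqrt


def t_interval(mean, sd, n, t_crit):
    """Confidence interval for a mean using a t critical value from a table."""
    margin = t_crit * sd / sqrt(n)
    return mean - margin, mean + margin


# (n, confidence label, t critical value from the t table with df = n - 1)
cases = [(10, "95%", 2.262), (10, "99%", 3.250), (5, "95%", 2.776), (5, "99%", 4.604)]

for n, label, t_crit in cases:
    lo, hi = t_interval(4.67, 1.74, n, t_crit)
    print(f"n = {n}, {label} CI: ({lo:.2f}, {hi:.2f})")
```

The smaller sample and the higher confidence level each widen the interval, which is why the n = 5, 99% case is the widest of the four.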
6. The following data were collected in a clinical trial to compare a new drug to a placebo for its
effectiveness in lowering total serum cholesterol.
                                  New Drug        Placebo         Total Sample
                                  (n=75)          (n=75)          (n=150)
Mean (SD) total cholesterol       185.0 (24.5)    204.3 (21.8)    194.7 (23.2)
% with total cholesterol < 200    78.00%          65.00%          71.50%
a. Generate a 95% confidence interval for the proportion of all patients with total cholesterol <
200.
We are asked to generate a CI for a proportion not a mean. Therefore, we are not concerned
with the mean cholesterol but instead are concerned with the percent or proportion with total
cholesterol < 200. Additionally, we are asked to generate a CI for all patients, not just those on
the new drug or those on placebo. Therefore n = 150.
Since $\min[n\hat{p},\, n(1-\hat{p})]$
= min[150(.715), 150(1 − .715)]
= min[107.25, 42.75]
= 42.75 ≥ 5, the CI formula is:
$$\hat{p} \pm z\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$$
$$0.715 \pm 1.96\sqrt{\frac{0.715(1-0.715)}{150}}$$
0.72 ± 0.07 or equivalently (0.65, 0.79)
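The proportion interval, including the sample size check, can be sketched as follows. The function name is hypothetical. Because the code does not round the point estimate before adding and subtracting the margin, its endpoints can differ in the last digit from the hand-rounded 0.72 ± 0.07.

```python
from math import sqrt
from statistics import NormalDist


def proportion_interval(p_hat, n, confidence=0.95):
    """Large-sample CI for a proportion; requires min(n*p, n*(1-p)) >= 5."""
    if min(n * p_hat, n * (1 - p_hat)) < 5:
        raise ValueError("sample too small for the normal approximation")
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


lo, hi = proportion_interval(0.715, 150)
print(f"({lo:.3f}, {hi:.3f})")
```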