
HH/HLST 2300: Quantitative Research Methods in Health Studies

Fall Term Assignment 5


Assigned: Saturday November 19, 2016; Due 5PM Friday December 2, 2016
Submit 1 file for Assignment 5: PDF
PDF document name: LASTNAME_FIRSTNAME_FTAssignment5
Submit via Moodle
1. A researcher has collected data for 125 patients discharged from the Emergency Department (ED) of
Hospital XYZ in December 2015 (Excel file: 2300assignment5.xls). The data includes the patient
unique identifier, age (years), gender (female = 1; male = 2), arrival date/time, discharge date/time
and resource intensity weights (RIW) at three different time points.
a. Input the data in SPSS
b. Is age skewed (i.e., non-normal)? Copy and paste the generated SPSS output table and show your
calculations to determine skewness.
Analyze, Descriptive Statistics, Frequencies

Skewness calculation:
IF |skewness| > 2(Std. Error of Skewness)
THEN variable is skewed
Since |-.022| = .022 < 2(.217) = .434, age is not skewed.
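The rule of thumb above can be sketched in Python as a quick check. The function name `is_skewed` is hypothetical; the two numbers are the skewness and standard error quoted in the answer.

```python
def is_skewed(skewness: float, std_error: float) -> bool:
    """Rule of thumb used above: skewed if |skewness| > 2 * SE of skewness."""
    return abs(skewness) > 2 * std_error


# Skewness and standard error quoted in the answer for age.
print(is_skewed(-0.022, 0.217))  # 0.022 is compared against 2 * 0.217 = 0.434
```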
c. Create z-scores for age. Generate a case summary report of the variables age and ZAge (z-scores of the variable age) for the first 10 patients. Copy and paste the generated SPSS case
summary report.
Analyze, Descriptive Statistics, Descriptives

Analyze, Reports, Case Summaries

d. Using the z-score for Patient 1, what percentage of patients were younger than Patient 1? Show
your work.
z-score for Patient 1 = 1.22505
To determine what percent of patients were younger than Patient 1, we need to find:
P(z < 1.22505)
Since Table 1 lists z scores up to two decimal places:
P(z < 1.22505)
= P(z < 1.23)
= .8907
= .89 (rounded to 2 decimal places) or 89.07% (rounded to 2 decimal places), so about 89% of patients were younger than Patient 1.
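As a cross-check on the table lookup, the standard normal CDF can be evaluated directly. This is a minimal sketch using Python's standard library; the variable names are illustrative, and the only input is the z-score reported above.

```python
from statistics import NormalDist

z_patient_1 = 1.22505            # z-score reported for Patient 1
z_table = round(z_patient_1, 2)  # the printed table lists z to two decimals

std_normal = NormalDist()        # mean 0, standard deviation 1
p_table = std_normal.cdf(z_table)      # area to the left of the rounded z
p_exact = std_normal.cdf(z_patient_1)  # area to the left of the unrounded z

print(f"P(z < {z_table}) = {p_table:.4f}")
print(f"P(z < {z_patient_1}) = {p_exact:.4f}")
```

The unrounded z-score gives a slightly smaller area than the table value because 1.22505 was rounded up to 1.23 for the lookup.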
e. Create a new variable called meanRIW using the MEAN function in SPSS that returns the
average of the three RIWs for patients that satisfy the following condition: age 90. Copy and
paste the screenshot of the Compute Variable window used to create this new variable.
Transform, Compute Variable

f. Create a new variable called LOShr_retainfrac that is the length of stay in hours with the
fractional part retained (LOShr_retainfrac = discharge date/time − arrival date/time). Generate
a descriptive table for LOShr_retainfrac that includes the mean, standard deviation, median,
quartiles and skewness. Copy and paste the generated SPSS output table.
Transform, Date and Time Wizard

Analyze, Descriptive Statistics, Frequencies

g. Create a new variable called ED_arrival_hour that extracts the hour at which the patient arrived at
the ED. Copy and paste the screenshots of the Date and Time Wizard windows used to create
this new variable.
Transform, Date and Time Wizard
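For readers who want to see what the Date and Time Wizard computes in parts f and g, the two derived variables can be sketched in Python. The timestamps below are hypothetical and are not taken from 2300assignment5.xls; only the variable names follow the assignment.

```python
from datetime import datetime

# Hypothetical arrival and discharge timestamps (NOT from the assignment data).
arrival = datetime(2015, 12, 1, 13, 45)
discharge = datetime(2015, 12, 1, 19, 15)

# Length of stay in hours with the fractional part retained.
LOShr_retainfrac = (discharge - arrival).total_seconds() / 3600

# Hour of the day (0-23) at which the patient arrived.
ED_arrival_hour = arrival.hour

print(LOShr_retainfrac, ED_arrival_hour)
```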

h. Create a new variable called absZAge that returns the absolute value of ZAge for patients that
satisfy the following condition: age 95 and ED_arrival_hour 13. Copy and paste the
screenshot of the Compute Variable window used to create this new variable.
Transform, Compute Variable

2. In a test of significance, the p value of the test statistic is .063. For each of the statements below,
determine whether the statement is true or false (answer true or false):
a. The data are statistically significant at both α = .05 and α = .01 levels.
FALSE (to be statistically significant at both α = .05 and α = .01 levels, the p value must be < .01.
Since p = .063 > .01, this statement is false)
b. The data are statistically significant at α = .05 level but not at α = .01 level.
FALSE (to be statistically significant at α = .05 level but not at α = .01 level, the p value must be
in between .01 and .05, i.e., 0.01 < p < 0.05. Since .063 > .05, this statement is false)
c. The data are statistically significant at α = .01 level but not at α = .05 level.
FALSE (this statement can never be true; i.e., p cannot be < 0.01 and > 0.05 at the same time).
d. The data are neither statistically significant at α = .05 nor α = .01 levels.
TRUE (since p = .063, .063 > .05 and .063 > .01, therefore it is neither statistically significant at
α = .05 nor α = .01 levels)
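The four answers follow mechanically from comparing p with each significance level. A minimal Python sketch (the helper `is_significant` and the dictionary keys are illustrative names, not part of the assignment):

```python
def is_significant(p_value: float, alpha: float) -> bool:
    """A result is statistically significant at level alpha when p < alpha."""
    return p_value < alpha


p = 0.063
sig_05 = is_significant(p, 0.05)
sig_01 = is_significant(p, 0.01)

# Truth value of each statement (a)-(d) for p = .063.
answers = {
    "a": sig_05 and sig_01,          # significant at both levels
    "b": sig_05 and not sig_01,      # significant at .05 only
    "c": sig_01 and not sig_05,      # significant at .01 only (impossible)
    "d": not sig_05 and not sig_01,  # significant at neither level
}
print(answers)
```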
3. The mean total cholesterol level is reported as 218.7 mg/dl with a standard deviation of 18.5 mg/dl.
Investigators hypothesize that people who eat oatmeal daily for breakfast would have different total
cholesterol. Investigators plan to run a test of hypothesis and want 90% power to detect a
difference of 10 points in mean total cholesterol levels. How many participants must be enrolled in
the study? Assume α = .05.

$$n = \left( \frac{z_{1-\alpha/2} + z_{1-\beta}}{ES} \right)^2$$

$$z_{1-\alpha/2} = z_{1-.05/2} = z_{1-.025} = z_{.975}$$

Recall that 1 − α/2 is a probability (i.e., area underneath the curve, values inside the table), therefore we
look inside the table for the value 0.975 which corresponds to a z-score of 1.96.

$$z_{.975} = 1.96$$

Since the definition of power = 1 − β,

$$z_{1-\beta} = z_{.9}$$

Power is also a probability (i.e., area underneath the curve, values inside the table), therefore we
look inside the table for the value 0.9. The closest value to 0.9 is 0.8997 which corresponds to a z-score of 1.28.

$$z_{.9} = 1.28$$

$$ES = \frac{|\mu_1 - \mu_0|}{\sigma} = \frac{10}{18.5} = 0.54054$$

$$n = \left( \frac{1.96 + 1.28}{0.54054} \right)^2 = 35.92811$$
Therefore, the minimum number of participants to enroll is 36.
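The sample size calculation can be reproduced with a short Python sketch. It computes n twice: once with the rounded table values used above (1.96 and 1.28) and once with z values from the standard library's inverse normal CDF. Variable names are illustrative.

```python
import math
from statistics import NormalDist

alpha = 0.05       # two-sided significance level
power = 0.90       # desired power, 1 - beta
sigma = 18.5       # standard deviation of total cholesterol (mg/dl)
difference = 10    # difference in means to detect (mg/dl)

effect_size = abs(difference) / sigma

# n using the rounded z values read from the table in the worked answer.
n_table = ((1.96 + 1.28) / effect_size) ** 2

# n using unrounded z values from the inverse normal CDF.
z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
z_beta = NormalDist().inv_cdf(power)
n_exact = ((z_alpha + z_beta) / effect_size) ** 2

# Always round up: a fractional participant cannot be enrolled.
print(math.ceil(n_table), math.ceil(n_exact))
```

Both versions round up to the same minimum enrolment, so the table rounding does not change the answer here.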


4. A hypertension study with n = 83 found a mean diastolic blood pressure of 73 with a sample
standard deviation of 11.
a. Calculate the 95% and 99% confidence intervals for the population mean.
Since n ≥ 30, the CI formula is:

$$\bar{x} \pm z \frac{s}{\sqrt{n}}$$

For a 95% CI:

$$73 \pm 1.96 \cdot \frac{11}{\sqrt{83}}$$

73 ± 2.37 or equivalently (70.63, 75.37)

For a 99% CI:

$$73 \pm 2.575 \cdot \frac{11}{\sqrt{83}}$$

73 ± 3.11 or equivalently (69.89, 76.11)


b. In a sentence, interpret what the 95% and 99% confidence intervals mean.
A point estimate for the true mean diastolic blood pressure in the population is 73, and we are
95% confident that the true mean lies between 70.63 and 75.37 units.
A point estimate for the true mean diastolic blood pressure in the population is 73, and we are
99% confident that the true mean lies between 69.89 and 76.11 units.
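The large-sample intervals in part (a) can be checked with a small Python sketch. The function `z_interval` is a hypothetical helper; it takes the critical z value from the standard library's inverse normal CDF rather than from a printed table, so the z values differ from 1.96 and 2.575 only in later decimals.

```python
import math
from statistics import NormalDist


def z_interval(mean, sd, n, confidence):
    """Large-sample CI for a mean: mean +/- z * sd / sqrt(n)."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sd / math.sqrt(n)
    return mean - margin, mean + margin


for level in (0.95, 0.99):
    low, high = z_interval(73, 11, 83, level)
    print(f"{level:.0%} CI: ({low:.2f}, {high:.2f})")
```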
5. A therapy study with a sample size of 10 found that x̄ = 4.67 and s = 1.74.
a. Calculate the 95% and 99% confidence intervals for the population mean.
Since n < 30, the CI formula is:

$$\bar{x} \pm t \frac{s}{\sqrt{n}}$$

For a 95% CI, we will look in the T table with degrees of freedom = n − 1 = 10 − 1 = 9.
t = 2.262

$$4.67 \pm 2.262 \cdot \frac{1.74}{\sqrt{10}}$$

4.67 ± 1.24 or equivalently (3.43, 5.91)

For a 99% CI, we will look in the T table with degrees of freedom = n − 1 = 10 − 1 = 9.
t = 3.250

$$4.67 \pm 3.250 \cdot \frac{1.74}{\sqrt{10}}$$

4.67 ± 1.79 or equivalently (2.88, 6.46)


b. Suppose instead that n = 5, calculate the 95% and 99% confidence intervals for the population
mean.
Since n < 30, the CI formula is:

$$\bar{x} \pm t \frac{s}{\sqrt{n}}$$

For a 95% CI, we will look in the T table with degrees of freedom = n − 1 = 5 − 1 = 4.
t = 2.776

$$4.67 \pm 2.776 \cdot \frac{1.74}{\sqrt{5}}$$

4.67 ± 2.16 or equivalently (2.51, 6.83)

For a 99% CI, we will look in the T table with degrees of freedom = n − 1 = 5 − 1 = 4.
t = 4.604

$$4.67 \pm 4.604 \cdot \frac{1.74}{\sqrt{5}}$$

4.67 ± 3.58 or equivalently (1.09, 8.25)


c. What happens to the confidence intervals as we decrease the sample size?
When we decreased the sample size while holding the confidence level fixed, the confidence
intervals widened: a smaller n gives a larger standard error (s/√n) and, with fewer degrees of
freedom, a larger critical t value.
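The small-sample intervals from parts (a) and (b), and the widening noted in part (c), can be reproduced with the sketch below. Python's standard library has no t distribution, so the critical values are copied from the worked answers into a lookup table; `T_CRITICAL` and `t_interval` are illustrative names.

```python
import math

# Two-sided critical t values copied from the worked answers,
# keyed by (degrees of freedom, confidence level).
T_CRITICAL = {
    (9, 0.95): 2.262,
    (9, 0.99): 3.250,
    (4, 0.95): 2.776,
    (4, 0.99): 4.604,
}


def t_interval(mean, sd, n, confidence):
    """Small-sample CI for a mean: mean +/- t * sd / sqrt(n), df = n - 1."""
    t = T_CRITICAL[(n - 1, confidence)]
    margin = t * sd / math.sqrt(n)
    return mean - margin, mean + margin


for n in (10, 5):
    for level in (0.95, 0.99):
        low, high = t_interval(4.67, 1.74, n, level)
        print(f"n={n}, {level:.0%} CI: ({low:.2f}, {high:.2f}), width {high - low:.2f}")
```

At each confidence level, the printed width for n = 5 is larger than for n = 10.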


6. The following data were collected in a clinical trial to compare a new drug to a placebo for its
effectiveness in lowering total serum cholesterol.
                                          New Drug       Placebo        Total Sample
                                          (n=75)         (n=75)         (n=150)
Mean (SD) Total Serum Cholesterol         185.0 (24.5)   204.3 (21.8)   194.7 (23.2)
% Patients with Total Cholesterol < 200   78.00%         65.00%         71.50%

a. Generate a 95% confidence interval for the proportion of all patients with total cholesterol <
200.
We are asked to generate a CI for a proportion not a mean. Therefore, we are not concerned
with the mean cholesterol but instead are concerned with the percent or proportion with total
cholesterol < 200. Additionally, we are asked to generate a CI for all patients, not just those on
the new drug or those on placebo. Therefore n = 150.
Since min[np̂, n(1 − p̂)]
= min [150(.715), 150(1-.715)]
= min [107.25, 42.75]
= 42.75 ≥ 5, the CI formula is:

$$\hat{p} \pm z \sqrt{\frac{\hat{p}(1 - \hat{p})}{n}}$$

For a 95% CI:

$$0.715 \pm 1.96 \sqrt{\frac{0.715 (1 - 0.715)}{150}}$$

0.72 ± 0.07 or equivalently (0.65, 0.79)
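The proportion interval can be checked with the following Python sketch, which also applies the min[np̂, n(1 − p̂)] ≥ 5 condition before using the normal approximation. The function name `proportion_interval` is hypothetical. Carrying full precision instead of rounding p̂ to 0.72 first gives a lower limit near 0.643 rather than 0.65; the upper limit still rounds to 0.79.

```python
import math


def proportion_interval(p_hat, n, z=1.96):
    """Normal-approximation CI for a proportion: p_hat +/- z * sqrt(p_hat(1-p_hat)/n)."""
    # The approximation is only used when both expected counts are at least 5.
    if min(n * p_hat, n * (1 - p_hat)) < 5:
        raise ValueError("normal approximation not appropriate for this sample")
    margin = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


low, high = proportion_interval(0.715, 150)
print(f"({low:.3f}, {high:.3f})")
```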
