
BIOSTATISTICS lab

INFERENCE FOR NOMINAL DATA


Assumptions in tests for proportions:

From binomial experiments: There are only two possible outcomes.

Sample size n must be large.

The proportions being tested are independent of each other.


INFERENCE ABOUT A PROPORTION

- Data Analysis Plus → Z-test: Proportion
- determines if the sample proportion is a good estimate of the population proportion.

$$z = \frac{\hat{p} - p}{\sqrt{\dfrac{p(1 - p)}{n}}}, \quad \text{where } \hat{p} = \frac{x}{n}$$

INFERENCE ABOUT TWO PROPORTIONS

- Data Analysis Plus → Z-test: Two Proportions
- determines if the two proportions differ significantly.

$$z = \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{\dfrac{\hat{p}(1 - \hat{p})}{n_1} + \dfrac{\hat{p}(1 - \hat{p})}{n_2}}}, \quad \text{where the pooled proportion is } \hat{p} = \frac{x_1 + x_2}{n_1 + n_2}$$

INFERENCE ABOUT TWO OR MORE PROPORTIONS

- Chi-Square Test of Homogeneity
- for raw data, Data Analysis Plus → Contingency Table (Raw Data)
- for cross tabs or Contingency Table, Data Analysis → Contingency Table

$$\chi^2 = \sum \frac{(O - E)^2}{E}, \quad \text{where } E_{ij} = \frac{(\text{Row}_i \text{ total}) \times (\text{Column}_j \text{ total})}{\text{Grand Total}}$$

df = (columns - 1)(rows - 1)
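
As a cross-check on the spreadsheet output, the three test statistics above can be sketched in plain Python. This is only an illustration under stated assumptions: the function names and the sample counts are hypothetical, not taken from the lab data files, and only the standard library is used.

```python
import math


def one_proportion_z(x, n, p0):
    """z statistic for one proportion: x successes in n trials vs. hypothesized p0."""
    p_hat = x / n
    return (p_hat - p0) / math.sqrt(p0 * (1 - p0) / n)


def two_proportion_z(x1, n1, x2, n2):
    """z statistic for two proportions, using the pooled proportion."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) / n1 + pooled * (1 - pooled) / n2)
    return (p1 - p2) / se


def chi_square(observed):
    """Chi-square statistic and df for a contingency table given as a list of rows."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(observed):
        for j, o in enumerate(row):
            # Expected count: (row i total * column j total) / grand total
            e = row_totals[i] * col_totals[j] / grand_total
            stat += (o - e) ** 2 / e
    df = (len(col_totals) - 1) * (len(row_totals) - 1)
    return stat, df


def upper_tail_p(z):
    """One-sided p-value P(Z > z) for a standard normal z."""
    return 0.5 * math.erfc(z / math.sqrt(2))


# Hypothetical counts, for illustration only.
z1 = one_proportion_z(x=30, n=200, p0=0.10)
z2 = two_proportion_z(x1=40, n1=100, x2=30, n2=100)
stat, df = chi_square([[10, 20], [20, 10]])
print(z1, upper_tail_p(z1))
print(z2, 2 * upper_tail_p(abs(z2)))  # two-sided p-value
print(stat, df)
```

The chi-square p-value is left out because the standard library has no chi-square distribution function; compare the statistic with a table critical value at the computed df instead.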

INFERENCE FOR BIVARIATES


PEARSON PRODUCT MOMENT CORRELATION COEFFICIENT
- assumes normal distribution
- used for two independent sets of data in interval/ratio measurement
- may also be used for dichotomous data (nominal)

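$$r_p = \frac{N \sum XY - \sum X \sum Y}{\sqrt{\left[ N \sum X^2 - \left( \sum X \right)^2 \right] \left[ N \sum Y^2 - \left( \sum Y \right)^2 \right]}}$$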

SPEARMAN CORRELATION COEFFICIENT


- may take the role of Pearson r for interval/ratio data without assuming
normal distribution
- used for two independent sets of data in ordinal measurement.

$$r_s = 1 - \frac{6 \sum d^2}{N(N^2 - 1)}$$, where d is the difference between the two ranks of each pair

- for inferences, a significant relationship between the two variables exists if the null hypothesis (r = 0.00) is rejected.
Ho: there is no significant relationship
Ha: there is a significant relationship
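
The two correlation coefficients can be sketched in plain Python as a check on hand computation. The function names and the sample data are hypothetical, and the Spearman shortcut formula used here is exact only when there are no tied ranks.

```python
import math


def pearson_r(xs, ys):
    """Pearson r from the raw-score (computational) formula."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator


def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman_rs(xs, ys):
    """Spearman r_s from the rank-difference formula (assumes no ties)."""
    n = len(xs)
    sum_d2 = sum((rx - ry) ** 2 for rx, ry in zip(ranks(xs), ranks(ys)))
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))


# Hypothetical paired data, for illustration only.
x = [10, 20, 30, 40, 50]
y = [1, 3, 2, 5, 4]
print(pearson_r(x, y))
print(spearman_rs(x, y))
```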
CHI-SQUARE: TEST OF ASSOCIATION (TEST OF INDEPENDENCE)
- may take the role of Pearson r for dichotomous (nominal) data, or Spearman's role for ordinal data, but is less powerful.
- for raw data, Data Analysis Plus → Contingency Table (Raw Data)
- for cross tabs or Contingency Table, Data Analysis → Contingency Table
- a procedure that tests whether the frequencies differ according to categories.
Ho means equal frequencies. Thus, not associated, independent, or do not differ.
Ha means not equal frequencies. Thus, associated, dependent, or differ.

$$\chi^2 = \sum \frac{(O - E)^2}{E}$$, where O = observed frequencies and E = expected frequencies

df = (no. of columns - 1)(no. of rows - 1)
- If an expected frequency is too small (less than 5), it is recommended to apply any of the following:
a) Collapse some rows. That is, combine a row with another row that has the same characteristic(s). You may also disregard a row
or column whose frequencies are relatively very small (most especially if zero).
b) For df = 1, use Yates' correction, where

$$\chi^2 = \sum \frac{(|O - E| - 0.5)^2}{E}$$

or Fisher's exact test, where

$$p = \frac{(A + B)!\,(C + D)!\,(A + C)!\,(B + D)!}{A!\,B!\,C!\,D!\,N!}$$

with A, B, C, D the cell frequencies of the 2 × 2 table and N the grand total. Fisher's exact test is highly recommended if frequencies are less than or equal to 2.

Otherwise, the chi-square test is likely to fail to reject the null hypothesis.
- Chi-square test of association may NOT be applied for multiple-response type of questions. Each sample must represent one
frequency only.
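
For a 2 × 2 table, Yates' correction and the Fisher formula above can be sketched as follows. The function names and the cell counts are hypothetical. Note that the Fisher formula as written gives the probability of the observed table only; a full Fisher exact p-value also adds the probabilities of the more extreme tables with the same margins, which this sketch does not do.

```python
import math


def yates_chi_square(a, b, c, d):
    """Yates-corrected chi-square (df = 1) for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    observed = ((a, b), (c, d))
    stat = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / n
            # Subtract 0.5 from each absolute deviation before squaring
            stat += (abs(observed[i][j] - expected) - 0.5) ** 2 / expected
    return stat


def fisher_table_probability(a, b, c, d):
    """Probability of exactly this 2x2 table, given fixed row and column totals."""
    n = a + b + c + d
    numerator = (math.factorial(a + b) * math.factorial(c + d)
                 * math.factorial(a + c) * math.factorial(b + d))
    denominator = (math.factorial(a) * math.factorial(b) * math.factorial(c)
                   * math.factorial(d) * math.factorial(n))
    return numerator / denominator


# Hypothetical small-count table, for illustration only.
print(yates_chi_square(3, 1, 1, 3))
print(fisher_table_probability(3, 1, 1, 3))
```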

Written by: Asst. Prof. Xandro Alexi A. Nieto of UST Faculty of Pharmacy

EXAMPLES
1. In a random sample of 150 pirated DVDs at Quiapo, it was found that 25 are defective. On the basis of this sample
(dvd.xls; 1 = defective, 0 = nondefective), do we have any reason to believe that more than 8% of all the pirated DVDs at
Quiapo have defects?
Hypotheses:
Ho: ________________________________________________________________________________
Ha: ________________________________________________________________________________

Statistical Test to Use: _____________________ Test Statistic: __________ Critical Value: __________
p-value: ___________ Decision: ___________
Conclusion: ___________________________________________________

2. In a random survey done at UST, 70 Commerce students and 60 Pharmacy students were asked whether they were enjoying their
summer break. Results are in breakpharcom.xls (1 = enjoying; 0 = not enjoying). Test the hypothesis at the 5% level of
significance that the proportions of students enjoying the summer break are the same in both colleges.
Hypotheses:
Ho: ________________________________________________________________________________
Ha: ________________________________________________________________________________

Statistical Test to Use: _____________________ Test Statistic: __________ Critical Value: __________
p-value: ___________ Decision: ___________
Conclusion: ___________________________________________________

3. A researcher wants to determine if there is a significant difference in the proportions of customers who take the three
different brands of vitamins. The data are in vitaminsabc.xls (1 = Taker, 2 = Non-Taker; 1 = Brand A, 2 = Brand B,
3 = Brand C). Test the hypothesis at the 10% level of significance.
Hypotheses:
Ho: ________________________________________________________________________________
Ha: ________________________________________________________________________________

Statistical Test to Use: _____________________ Test Statistic: __________ Critical Value: __________
p-value: ___________ Decision: ___________
Conclusion: ___________________________________________________

4. A researcher wants to determine if the hangout mall of Manila students depends on the school where they study. The results
are in the contingency table of hangout1.xls. Test the hypothesis at the 5% significance level.
Hypotheses:
Ho: ________________________________________________________________________________
Ha: ________________________________________________________________________________

Statistical Test to Use: _____________________ Test Statistic: __________ Critical Value: __________
p-value: ___________ Decision: ___________
Conclusion: ___________________________________________________

5. Modify your answers in #4, using handout2.xls.


Statistical Test to Use: _____________________ Test Statistic: __________ Critical Value: __________
p-value: ___________ Decision: ___________
Conclusion: ___________________________________________________
