
Appendix B

Sampling Design

Sampling Design:

The study will use a stratified cluster random sampling design. The stratification variables and
domains of analysis are the university, college and year level. Sample classes (clusters) will be randomly
selected, and all the students in each selected class will be invited to participate. The sample size
computation is described under Sample Size Calculations.

Selection of the sampling units

The enrolment population in every university will be listed by college, department and year level. The
cluster is a class, and the estimated average class size is 40. The cumulative population will be listed
starting from the first college and department for each year level. The sample classes will be selected by
systematic sampling. First, the sampling interval (k) will be computed by dividing the total population per
year level by the sample size for the year level. A random start (r) will then be drawn from 1 to k.

Table 1. Enrollment Population by College and Department, First Year


Name of College/Department   Enrolled     Cumulative   Sample department based on
                             population   population   r, r+k, (r+k)+k, etc.
College A
  Department A
  Department B
  Department C
College B
  Department A
  Department B
………

The first department in the list with a cumulative population greater than the random start (r) will be the
first sample department. The second sample department will be the first department in the list with a
cumulative population greater than r+k, the third will be the first department with a cumulative population
greater than r+2k, and so on. In general, the sample departments will be the first departments with a
cumulative population greater than r (first sample department), r+k (second sample department), r+2k
(third sample department), r+3k (fourth sample department), r+4k (fifth sample department), and so on.

Within each sample department, the sample class will be drawn by simple random sampling. For example, if
there are 20 classes of first year students in Department A, these will be numbered from 1 to 20. A random
number from 1 to 20 will be drawn, and the class with that number in the list will be the sample class. So
if the number 13 is drawn, the 13th class in the list will be the sample class.
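
The selection steps described above can be sketched in Python. This is a minimal sketch under stated assumptions, not the study's actual procedure: the department figures are hypothetical, the number of selections (17) is an assumed count of sample classes, and the function name `select_sample_classes` is illustrative. The text says "greater than"; the sketch uses "greater than or equal to" so that a threshold equal to the total population still maps to the last department.

```python
import random
from bisect import bisect_left
from itertools import accumulate


def select_sample_classes(departments, n_selections, rng):
    """Systematic selection of departments, then SRS of one class in each.

    departments: list of (name, enrolled_population, number_of_classes).
    Returns (k, r, picks), where picks is a list of (department, class_number).
    """
    cumulative = list(accumulate(pop for _, pop, _ in departments))
    total = cumulative[-1]
    if n_selections < 1 or n_selections > total:
        raise ValueError("n_selections must be between 1 and the total population")
    k = total // n_selections            # sampling interval (integer part)
    r = rng.randint(1, k)                # random start drawn from 1 to k
    picks = []
    for i in range(n_selections):
        threshold = r + i * k            # r, r+k, r+2k, ...
        # First department whose cumulative population is >= threshold
        # (assumption: ">=" instead of the text's "greater than").
        idx = bisect_left(cumulative, threshold)
        name, _, n_classes = departments[idx]
        # Simple random sampling of one class, numbered 1..n_classes.
        picks.append((name, rng.randint(1, n_classes)))
    return k, r, picks


# Hypothetical first-year listing: (department, enrolled population, classes).
departments = [
    ("College A / Department A", 800, 20),
    ("College A / Department B", 400, 10),
    ("College A / Department C", 600, 15),
    ("College B / Department A", 720, 18),
    ("College B / Department B", 480, 12),
]

k, r, picks = select_sample_classes(departments, 17, random.Random(2005))
for department, class_number in picks:
    print(department, "class", class_number)
```

A large department can be selected more than once under this scheme, in which case more than one of its classes would be drawn.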

All the students in the sample class drawn will be interviewed for this study.

Sample Size Calculations

Two sample size formulae are used in this study. The first is the sample size formula for the estimation
of a proportion, based on the objectives of the study. Most of the objectives can be translated into the
proportion of students with adequate practices related to student engagement. Adequacy may be defined
using a cut-off, for example the proportion of students who very often practice 75% of the items in
Construct A of student engagement.

The second formula is for the determination of factors associated with an adequate level of student
engagement.

For the first computation, the estimation of the level of student engagement, the sample size will be
computed for each university, college and year level, because this objective calls for conclusions per
university, college and year level.

A. Sample Size per Year Level for the Objectives on the Estimation of Proportion

The formula used for this sample size determination is the simple random sampling (SRS) formula, adjusted
for the design effect of the cluster sampling design.¹

SRS Formula using Cluster Design Effect (DEFF)²

The formulas used are:

Formula 1 (SRS design):

$$n_{SRS} = \frac{Z^2 P Q}{d^2}$$

Formula 2 (cluster design):

$$n_{Cluster} = n_{SRS} \times DEFF$$

Where:
nSRS - sample size for an SRS
Z - the standard normal deviate corresponding to the specified confidence level
P - the proportion (expressed in percent), which is also the parameter being estimated, for example, the
proportion of students with an adequate level of student engagement
Q - 100-P
d - maximum tolerable error, that is, the error one is willing to accept that the sample may not capture
the true value. This may be expressed as a percentage of P. Example: if the P chosen is 50, you may use 10%
of P, which is 5%. (NOTE: P=50 means that a rough estimate of the proportion of students with adequate
engagement is 50%. This can be gathered from previous studies, if there are any, or from experts such as
school administrators or teachers.)
nCluster - sample size for the cluster design
DEFF - design effect, a multiplier greater than 1. The cluster sampling design is less efficient (it has
higher standard errors) than the simple random sampling design, so the SRS sample size is multiplied by
this factor. For the parameter being estimated, if one student in a sample class has adequate student
engagement (SE), there is a high probability that another student in the same class will also have adequate
SE, because students in the same class may share characteristics related to the parameter (same teachers,
same resources, common peers, etc.).

1 Lemeshow, PS Adequacy of Sample Size, WHO, 19__.
2 Cochran, W.G. (1977). Sampling techniques. New York: John Wiley & Sons.

Computation:
Given:
Z = 1.96 for a 95% confidence level
P = 50%, which gives the maximum sample size (used in the absence of proportion data from the
literature)
Q = 100-50 = 50%
d = 5%
DEFF = 2; it is assumed that there is not much difference in the level of KAP on student
engagement within clusters (classes) of students

$$n_{SRS} = \frac{1.96^2 \times 50 \times 50}{5^2} = 384.16 \approx 385$$

$$n_{cluster1} = 385 \times 2 = 770$$

Adjusting for a 10% refusal rate, the final sample size is

$$n_{cluster2} = \frac{770}{0.90} = 855.6 \approx 856$$
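
The arithmetic above can be checked with a short Python sketch. The function names are illustrative, and rounding each step up to the next whole student is an assumption that matches the figures given in the text.

```python
import math


def srs_sample_size(z, p, d):
    """Formula 1: n_SRS = Z^2 * P * Q / d^2, with P, Q and d in percent."""
    q = 100 - p
    return z ** 2 * p * q / d ** 2


def cluster_sample_size(n_srs, deff):
    """Formula 2: n_Cluster = n_SRS * DEFF."""
    return n_srs * deff


# Assumption: round up at each step, as the text's figures suggest.
n_srs = math.ceil(srs_sample_size(z=1.96, p=50, d=5))    # 384.16 rounded up
n_cluster1 = cluster_sample_size(n_srs, deff=2)
n_cluster2 = math.ceil(n_cluster1 / (1 - 0.10))           # inflate for 10% refusal

print(n_srs, n_cluster1, n_cluster2)  # 385 770 856
```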

One final adjustment to the sample size may be made based on the population per university and year
level. The sample size ncluster2 may be adjusted for the population size (N) in the university and year
level using the following finite population correction formula:

$$n_{final} = \frac{n_{cluster2}}{1 + n_{cluster2}/N}$$

For example, if the sample size is 856 and the population of first year students in University A is 3,000,
the final sample size for the first year level in this university is

$$n_{final} = \frac{856}{1 + 856/3000} \approx 666$$

Thus we will need to interview 666 first year students in University A.
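
A minimal Python sketch of the finite population correction, using the worked example from the text. The function name is illustrative, and rounding to the nearest whole number is an assumption.

```python
def finite_population_correction(n, population):
    """n_final = n / (1 + n / N)."""
    return n / (1 + n / population)


n_final = finite_population_correction(856, 3000)
print(round(n_final))  # 666
```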


B. Sample Size Calculation for Logistic Regression Analysis³, ⁴, ⁵

The sample size will also be computed to meet the objective of determining the factors associated with
student engagement, as well as the association between the level of student engagement and the outcome
in each university. The computation assumes that logistic regression analysis will be used.

The sample size was first computed using the following formula for the crude analysis of association.

$$n_{crude} = \frac{\left( z_\alpha \sqrt{2\bar{P}(1-\bar{P})} + z_\beta \sqrt{P_1(1-P_1) + P_2(1-P_2)} \right)^2}{(P_1 - P_2)^2}$$

Where
zα = normal deviate corresponding to the α error
zβ = normal deviate corresponding to the β error, or (1-β) power
P1 = estimated proportion who have adequate student engagement in one group, say female
P2 = estimated proportion who have adequate student engagement in the other group, say male
P̄ = average of the two proportions, (P1 + P2)/2

The computation used the following assumptions:

Alpha error = 0.05
Beta error = 0.20, or power of the test = 0.80 (80%)
Odds ratio to be detected for having adequate SE between the two groups = 1.5
P1 = proportion of those unexposed to the different factors in the population with adequate SE = at least 30%

From these assumptions, the sample size computed using Epi-Info version 6 software is 892.
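
The crude formula can also be evaluated directly as a rough check. The Python sketch below rests on several assumptions that the text does not state: a two-sided alpha, P̄ taken as the average of P1 and P2, P2 derived from P1 and the odds ratio, and a result read as the number needed per group. It applies no continuity correction and does not attempt to reproduce the Epi-Info figure, whose exact method is not described here. The function names are illustrative.

```python
from math import ceil, sqrt
from statistics import NormalDist


def p2_from_odds_ratio(p1, odds_ratio):
    """Proportion in the second group implied by p1 and the odds ratio."""
    odds2 = odds_ratio * p1 / (1 - p1)
    return odds2 / (1 + odds2)


def crude_sample_size(p1, p2, alpha=0.05, power=0.80):
    """Two-proportion sample size per group, without continuity correction."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)   # assumption: two-sided alpha
    z_beta = NormalDist().inv_cdf(power)
    p_bar = (p1 + p2) / 2                           # assumption: simple average
    numerator = (z_alpha * sqrt(2 * p_bar * (1 - p_bar))
                 + z_beta * sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2
    return numerator / (p1 - p2) ** 2


p2 = p2_from_odds_ratio(0.30, 1.5)
n_per_group = ceil(crude_sample_size(0.30, p2))
print(f"P2 = {p2:.4f}, n per group = {n_per_group}")
```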

The final sample size was adjusted for multivariate analysis using the Hsieh formula for logistic regression
analysis. Using a multiple coefficient of determination R² = 0.6, the final sample size is 1,487.

Using the above computed sample size requirements, the initial estimate of the sample size chosen will be
that for the estimation of the level of student engagement, which is a maximum of 856 per year level or
3,424 per university (856 × 4). This is more than the requirement of the test for association (n = 1,487).
The final sample size requirement will be adjusted based on the finite population correction described in
Section A. Using the population of each institution, the final sample sizes are shown in the following
table.

3 Hsieh, F.Y., Bloch, D.A. and Larsen, M.D. A simple method of sample size calculation for linear and
logistic regression. Statistics in Medicine, 17, 1623-1634 (1998).
4 Whittemore, A. Sample size for logistic regression with small response probability. J Am Stat Assoc, 76,
27-32 (1981).
5 Hsieh, F.Y. Sample size tables for logistic regression. Statistics in Medicine, 8, 795-802 (1989).
Table Population and Sample Size by University

Partner Institutions                          Population Size*   Final Sample Size
 1. Adamson University                                  15,966               2,819
 2. College of St. Benilde                               8,509               2,442
 3. De la Salle University                              16,850               2,846
 4. Emilio Aguinaldo College                             8,770               2,463
 5. Lyceum of the Philippines                            7,687               2,369
 6. Philippine Normal University                         6,583               2,252
 7. Philippine Women University                          3,386               1,702
 8. Philippine Christian University                      4,713               1,983
 9. St. Paul College                                     1,762               1,163
10. St. Scholastica College                              2,572               1,469
11. University of the Philippines, Manila                3,000               1,599
Total                                                   79,798              23,107

* 2005 Population
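
As a check on the table, the sketch below applies the finite population correction to the per-university figure of 3,424 stated in the text. Rounding to the nearest whole number is an assumption; under it, the sketch reproduces the Final Sample Size column. The variable and function names are illustrative.

```python
N_PER_UNIVERSITY = 856 * 4  # 3,424 per university, as stated in the text

# 2005 populations from the table.
populations = {
    "Adamson University": 15966,
    "College of St. Benilde": 8509,
    "De la Salle University": 16850,
    "Emilio Aguinaldo College": 8770,
    "Lyceum of the Philippines": 7687,
    "Philippine Normal University": 6583,
    "Philippine Women University": 3386,
    "Philippine Christian University": 4713,
    "St. Paul College": 1762,
    "St. Scholastica College": 2572,
    "University of the Philippines, Manila": 3000,
}


def finite_population_correction(n, population):
    """n_final = n / (1 + n / N)."""
    return n / (1 + n / population)


# Assumption: round to the nearest whole student.
final_sizes = {
    name: round(finite_population_correction(N_PER_UNIVERSITY, pop))
    for name, pop in populations.items()
}

for name, size in final_sizes.items():
    print(f"{name:<40}{populations[name]:>8,}{size:>8,}")
print(f"{'Total':<40}{sum(populations.values()):>8,}{sum(final_sizes.values()):>8,}")
```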
