
SAMPLING AND SAMPLE SIZE CALCULATION

Danaida B. Marcelo, MS
Clinical Epidemiology Unit, Research Division
De La Salle Health Sciences Institute

THE RESEARCH PROCESS


- Problem Identification
- Objective Formulation
- Review of Related Literature
- Research Design
- Sampling Design and Sample Size
- Data Collection Method
- Data Analysis
- Writing the Report
- Dissemination of Results

Learning Objectives:
At the end of this session, learners should be able to:
1. Understand the concepts of sampling and sample size
2. Define sampling and sampling error
3. Know the different sampling methods
4. Know the requirements for sample size calculation
5. Recognize OpenEpi/Epi Info as tools for sample size calculation in cross-sectional, cohort, and case-control studies

What is sampling?


a procedure of drawing a fraction of a population for the purpose of determining certain characteristics of the population

[Figure: TARGET POPULATION and SAMPLE POPULATION]

Why do we need to sample?




- we cannot study all elements of the population we are interested in
- Advantages of sampling:
  - quicker
  - less expensive
  - more efficient

Basic Concepts in Sampling


 

- target population - group of interest
- sample population - representative subset
- sampling frame - list of sampling units (e.g. a list of names or places)
- sampling unit - the unit of selection
- elementary unit - the unit of measurement

 

The Concept of Sampling


Example: The researcher wants to determine the prevalence of positive PPD among 1-10 yr old children in Muntinlupa
- target population - all 1-10 yr old children in Muntinlupa
- sample population - e.g. 100 children (1-10 yrs old) living in Muntinlupa
- sampling frame - list of the sampling units: names of all 1-10 yr old children, or the barangays, or the households
- sampling unit - the unit of selection: barangays, households, or the children
- elementary unit - the unit of measurement: child, 1-10 yrs old

Sampling Error


 

SAMPLING ERROR
- error due to chance
- random error
- the difference between the sample value and the unknown true value
- cannot be eliminated, but can be minimized

How do we do sampling?


Non-probability sampling
- Judgment or purposive
- Accidental or haphazard

Probability sampling
- Simple random
- Systematic random
- Stratified random
- Cluster random
- Multi-stage random

How do we do sampling?
Probability
- random selection
- sampling frame is needed
- sampling error can be computed
- results can be generalized

Non-probability
- non-random selection
- sampling frame is not required
- sampling error cannot be computed
- results cannot be generalized

How do we do sampling?


Non-probability


Judgment or purposive
- Expert sampling involves assembling a sample of persons with known or demonstrable experience and expertise in some area.
- In snowball sampling, the process starts by identifying someone who meets the criteria for inclusion in the study. The respondent is then asked to recommend others they know who also meet the criteria.


How do we do sampling?


Probability Sampling
    

- Simple random
- Systematic random
- Stratified random
- Cluster random
- Multi-stage random

Simple Random Sampling (SRS)


Example: The researcher wants to determine prevalence of Positive PPD among 1-10 yr old children in Muntinlupa

- target population - all 1-10 yr old children in Muntinlupa; assume N = 1000
- sample population - e.g. 100 1-10 yr old children in Muntinlupa
- sampling frame - list of names of all the 1-10 yr old children (assign numbers 0001 to 1000)
- Randomly generate 100 numbers between 0001 and 1000:
  - using a calculator
  - using a table of random numbers
  - using software
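The "using software" option can be sketched in a few lines of Python. This is only an illustration under assumptions: the function name, the seed, and the use of Python's standard `random` module are not from the slides; the numbers N = 1000 and n = 100 are.

```python
import random


def simple_random_sample(population_size, sample_size, seed=None):
    """Draw `sample_size` distinct ID numbers from 1..population_size."""
    # The seed is an assumption, used only so the illustration is reproducible.
    rng = random.Random(seed)
    # random.sample selects without replacement, so no ID can appear twice.
    return sorted(rng.sample(range(1, population_size + 1), sample_size))


# Sampling frame numbered 0001 to 1000, sample of 100 children.
selected_ids = simple_random_sample(population_size=1000, sample_size=100, seed=2024)
print([f"{i:04d}" for i in selected_ids[:5]])  # first few selected ID numbers
```

Every child in the frame has the same chance of selection, which is the defining property of a simple random sample.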

Simple Random Sampling

Stratified random sampling




- the population is first divided into groups or strata
- a Simple Random Sample is then selected from each stratum
- subgroups of interest are represented adequately

Stratified random sampling


Example: The researcher wants to determine prevalence of benign febrile convulsions among the infants in Dasmariñas, Cavite
- target population - all 1-10 yr old children in Muntinlupa; assume N = 1000
- sample population - e.g. 100 1-10 yr old children in Muntinlupa

- sampling frame - list of 1-10 yr old children per barangay; N = 1000: Bgy A = 500; Bgy B = 300; Bgy C = 200
- from each barangay, select the required number of children using SRS
- proportionate sampling - n = 100: Bgy A = 50; Bgy B = 30; Bgy C = 20
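The proportionate allocation above is n × (stratum size / N) for each barangay. A minimal Python sketch follows; the function names and the child ID labels are hypothetical, and only the stratum sizes and n = 100 come from the slide.

```python
import random


def proportionate_allocation(stratum_sizes, total_sample):
    """Allocate the sample to strata in proportion to stratum size."""
    total_population = sum(stratum_sizes.values())
    # Rounding may leave the total off by one or two in general;
    # with the slide's numbers the allocation is exact.
    return {
        name: round(total_sample * size / total_population)
        for name, size in stratum_sizes.items()
    }


def stratified_sample(frames, allocation, seed=None):
    """Draw a simple random sample of the allocated size from each stratum."""
    rng = random.Random(seed)
    return {name: sorted(rng.sample(frames[name], allocation[name])) for name in frames}


stratum_sizes = {"Bgy A": 500, "Bgy B": 300, "Bgy C": 200}
allocation = proportionate_allocation(stratum_sizes, total_sample=100)
print(allocation)

# Hypothetical sampling frames: one list of child IDs per barangay.
frames = {
    name: [f"{name}-{i:03d}" for i in range(1, size + 1)]
    for name, size in stratum_sizes.items()
}
sample = stratified_sample(frames, allocation, seed=1)
```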

Systematic random sampling




- selection of every kth unit in the population
- k = (total number in population) / (calculated sample size)
- the first unit is selected randomly from among the first k units

Systematic random sampling


Example: The researcher wants to determine the prevalence of positive PPD among 1-10 yr old children in Muntinlupa
- target population - all 1-10 yr old children in Muntinlupa; assume N = 1000
- sample population - e.g. 100 1-10 yr old children in Muntinlupa

- sampling frame - list of all 1-10 yr old children in Muntinlupa
- k = 1000/100 = 10
- choose the random start from numbers 1 to 10
- if the chosen random start is 3, the child with ID no. 3 is included in the sample, then the 13th in the list, then the 23rd, and so on
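The same selection can be sketched in Python. The function name and the optional seed are assumptions for illustration; N = 1000, n = 100, and the random start of 3 are taken from the example.

```python
import random


def systematic_sample(population_size, sample_size, start=None, seed=None):
    """Select every kth ID number, beginning at a random start within 1..k."""
    k = population_size // sample_size  # sampling interval
    if start is None:
        start = random.Random(seed).randint(1, k)  # random start among the first k units
    if not 1 <= start <= k:
        raise ValueError("start must be between 1 and k")
    return list(range(start, population_size + 1, k))[:sample_size]


ids = systematic_sample(population_size=1000, sample_size=100, start=3)
print(ids[:3])  # the first three selected ID numbers
```

This sketch assumes N is an exact multiple of n; when it is not, the interval k has to be rounded and the achieved sample size can differ slightly from n.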

Cluster Sampling


- the population is first divided into clusters, usually based on geographical proximity
- a random sample of such clusters is selected
- all units in the selected clusters are included

Cluster Sampling
Example: The researcher wants to determine the prevalence of positive PPD among 1-10 yr old children in Muntinlupa
- target population - all 1-10 yr old children in Muntinlupa; assume N = 1000
- sample population - e.g. 100 1-10 yr old children in Muntinlupa

- clusters = barangays
- sampling frame - list of barangays
- select clusters (barangays) using simple random sampling
- include all children living in the selected barangays
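A minimal Python sketch of these steps is shown below. The slides do not say how many barangays there are or how many are selected, so the frame of 10 barangays with 100 children each, the barangay names, and the choice of one cluster are all hypothetical.

```python
import random


def cluster_sample(clusters, n_clusters, seed=None):
    """Pick clusters by simple random sampling, then keep every unit in them."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    children = [child for name in chosen for child in clusters[name]]
    return chosen, children


# Hypothetical frame: 10 barangays with 100 children each (N = 1000).
clusters = {
    f"Barangay {b:02d}": [f"B{b:02d}-C{c:03d}" for c in range(1, 101)]
    for b in range(1, 11)
}
chosen, children = cluster_sample(clusters, n_clusters=1, seed=7)
print(chosen, len(children))
```

Note that the sampling frame here is the list of barangays, not the list of children; individual children are never sampled directly.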

Multi-stage sampling design




for sample surveys of wide coverage, e.g. nationwide surveys

- 15 regions
- 2 provinces per region
- 4 towns per province
- 50 elderly per town

RANDOM ALLOCATION in EXPERIMENTAL Studies




Random allocation - the process of assigning subjects to different treatments by using random numbers

Example: Effect of Probiotic Treatment of Acute Tonsillopharyngitis in children 2-5 years of age: A randomized double blind trial
- Assuming a calculated sample size of 50 per group, which patients will receive the probiotic?
- Use software: http://mahmoodsaghaei.tripod.com/Softwares/randalloc.html
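As an alternative to the linked program, the idea can be sketched in Python. This is not that program's method, only a simple illustration of allocation with fixed group sizes; the function name, the arm labels, the patient numbering 1 to 100, and the seed are assumptions.

```python
import random


def allocate(patient_ids, arms=("probiotic", "placebo"), seed=None):
    """Randomly assign patients to arms of equal size."""
    if len(patient_ids) % len(arms) != 0:
        raise ValueError("number of patients must divide evenly among the arms")
    per_arm = len(patient_ids) // len(arms)
    # Build one label per patient, then shuffle the labels into random order.
    labels = [arm for arm in arms for _ in range(per_arm)]
    random.Random(seed).shuffle(labels)
    return dict(zip(patient_ids, labels))


# 100 enrolled patients, 50 per group, numbered in order of enrollment.
assignment = allocate(list(range(1, 101)), seed=42)
print(assignment[1], assignment[2], assignment[3])
```

In a double-blind trial the resulting list would be kept by someone not involved in enrolling or assessing patients, so that neither patients nor investigators know the assignments.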

What makes a good sample population?


A GOOD SAMPLE must be (1) selected at random to reduce bias (2) representative to improve validity and (3) large enough to increase precision.


How many subjects are to be included in the sample?


 

SAMPLE SIZE CALCULATION

Why calculate?
- for planning purposes
- for power of the study
- for meaningful results
- to minimize sampling error

Sample size calculation




Things to know:


   

- type of study: descriptive or analytic (cohort, case-control, clinical trial)?
- study objective: proportions or means?
- usual values?
- amount of deviation from the true value?
- clinically important difference?
- confidence level? power?
- one-tailed or two-tailed hypotheses?

Confidence level, Power


Errors in Hypothesis Testing
                                         TRUTH
DATA support                             Groups are the same    Groups differ
Do not reject Ho: Groups are the same    OK                     Type II error (β)
Reject Ho: Groups differ                 Type I error (α)       OK (1-β), or power

Confidence level, Power


 

 

- Type I error -- rejecting a true Ho
- α -- probability of committing a Type I error
- 1-α -- the confidence level
- usual values: α = 0.05, 1-α = 0.95
- Type II error -- not rejecting a false Ho
- β -- probability of committing a Type II error
- 1-β -- power of the study; ability to detect a true difference
- usual values: β = 0.20, 1-β = 0.80

How do we calculate sample size?


-- Using formulas
-- Using tables of sample sizes
-- Using statistical calculators (StatCalc of Epi Info, OpenEpi)

How do we calculate sample size?


A.J. Dobson's formula (SIMPLE RANDOM SAMPLE)

Sample size for descriptive studies

1. Estimation of a population proportion

$$n = \frac{p(100 - p)}{\Delta^2} \times f(1-\alpha)$$

where
- n = computed sample size
- p = estimate of the proportion (in percent)
- Δ = the desired half-width of the confidence interval (margin of error), in percentage points
- 1-α = confidence level

Sample size for descriptive studies

1. Estimation of a population proportion

Table 1. Values of f(1-α) for various confidence levels, 100(1-α)%

(1-α)       0.80     0.90     0.95     0.99
f(1-α)*     1.642    2.706    3.842    6.635

* f(1-α) is the square of the upper α/2 point of the standard normal distribution
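The footnote's definition can be checked with Python's standard library. This is a small sketch, with the function name chosen here for illustration. The computed value for the 95% level is about 3.841, slightly below the tabulated 3.842; the tabulated figure probably comes from squaring the rounded value 1.96.

```python
from statistics import NormalDist


def f_value(confidence_level):
    """Square of the upper alpha/2 point of the standard normal distribution."""
    alpha = 1 - confidence_level
    z = NormalDist().inv_cdf(1 - alpha / 2)  # upper alpha/2 point
    return z ** 2


for level in (0.80, 0.90, 0.95, 0.99):
    print(level, round(f_value(level), 3))
```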

Sample size for descriptive studies

1. Estimation of a population proportion


A researcher wants to estimate the prevalence of positive PPD among 1-10 yr old children in Muntinlupa. What is the sample size if the expected prevalence is 15% and a 95% confidence interval of ±4% (11-19%) is desired?

$$n = \frac{p(100 - p)}{\Delta^2} \times f(1-\alpha)$$

$$n = \frac{15(100 - 15)}{4^2} \times 3.842 \approx 306$$

To estimate the prevalence of positive PPD among 1-10 yr old children in Muntinlupa with a 4% margin of error at a 95% confidence level, assuming that the population prevalence is 15%, 306 children should be included in the sample.
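The arithmetic behind this result can be reproduced in Python. The function and parameter names are assumptions for illustration; the inputs (15% expected prevalence, 4 percentage-point margin, and the 95% table value 3.842) are the ones used in the example.

```python
import math


def sample_size_proportion(p_percent, margin_percent, f_value=3.842):
    """Sample size for estimating a proportion under simple random sampling.

    p_percent      -- expected proportion, in percent
    margin_percent -- margin of error, in percentage points
    f_value        -- squared normal deviate for the confidence level (3.842 for 95%)
    """
    return p_percent * (100 - p_percent) / margin_percent ** 2 * f_value


n_raw = sample_size_proportion(p_percent=15, margin_percent=4)
print(round(n_raw, 2))    # unrounded result of the formula
print(round(n_raw))       # rounded to the nearest whole number, as in the example
print(math.ceil(n_raw))   # rounded up, a more conservative convention
```

The example rounds to the nearest whole child. Many texts round up instead so that the margin of error is not exceeded, which here would add one child.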

Sample size calculation using EPI-Info6


http://www.cdc.gov/epiinfo/Epi6/ei6.htm


STATCALC program

http://www.openepi.com/Menu/OpenEpiMenu.htm


Calculate sample size: RCT


Example: Efficacy of VCO as an adjunct in primary TB therapy among children 2-9 years old
Objective: To compare the resolution of radiologic signs between patients given VCO and those given placebo

- VCO group (Exposed): 75% with (+) resolution (from related literature)
- Placebo group (Unexposed): 50% with (+) resolution (from related literature)
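For orientation, the calculation can be sketched in Python using a commonly used normal-approximation formula for comparing two proportions with equal group sizes and no continuity correction. This is not necessarily the method a given calculator applies, and a continuity-corrected method will give a larger number. The two-sided α = 0.05 and power = 0.80 are the usual values listed earlier and are assumed here, since the example does not state them; the function name is also an assumption.

```python
import math
from statistics import NormalDist


def sample_size_two_proportions(p1, p2, alpha=0.05, power=0.80):
    """Subjects needed per group to detect p1 vs p2 (two-sided test, equal groups)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # deviate for the confidence level
    z_beta = NormalDist().inv_cdf(power)           # deviate for the power
    p_bar = (p1 + p2) / 2                          # average of the two proportions
    numerator = (
        z_alpha * math.sqrt(2 * p_bar * (1 - p_bar))
        + z_beta * math.sqrt(p1 * (1 - p1) + p2 * (1 - p2))
    ) ** 2
    return numerator / (p1 - p2) ** 2


# 75% resolution expected with VCO, 50% with placebo.
n_raw = sample_size_two_proportions(p1=0.75, p2=0.50)
n_per_group = math.ceil(n_raw)  # round up to whole patients
print(n_per_group)
```

A smaller expected difference between the groups, a higher power, or a higher confidence level would each increase the required number per group.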

SUMMARY


- Statistical inference allows us to generalize sample results to the target population
- Random sampling ensures the representativeness of the sample
- Sample size is based on:
  - the research objectives/design
  - sample estimates and variability from previous studies
  - power and level of confidence
  - operational constraints (time, resources)

THANK YOU
