
Sampling Theory and

Methods
Lesson 1
Basics of Sampling Theory
Learn the reasons for sampling

Develop an understanding of different sampling methods

Distinguish between probability and non-probability sampling

Discuss the relative advantages and disadvantages of each sampling method

Learning Objectives
Scientific research is systematic, controlled,
empirical, and critical investigation of natural
phenomena guided by theory and hypotheses
about the presumed relations among such
phenomena. (Kerlinger, 1986)

Research is an organized and systematic way of finding answers to questions.

What is research?
Problem statement, research questions,
purposes, benefits
Theory, assumptions, background literature
Variables and hypotheses
Operational definitions and measurement
Research design and methodology
Instrumentation, sampling
Data analysis
Conclusions, interpretations, recommendations

Important Components of
Empirical Research
Very important part of the research process.

Concerned with selecting a relatively small number of elements (sample) from a larger
group of elements (population).

Often used in survey research when it is difficult or impossible to conduct a census,
in which data are gathered from every element of
the population.

Sampling
Sample

A sample is a smaller (but hopefully representative) collection of units from a
population used to determine truths about that
population (Field, 2005)

Synonymous to an element of the population

SAMPLING
TERMINOLOGIES
Sampling frame
List from which the potential respondents are
drawn
E.g. registrar's office, class rosters (must assess
sampling frame errors)

Sampling units

Elements of the target population available for selection during the sampling process

SAMPLING
TERMINOLOGIES
Population

A set of elements usually grouped (e.g. people, organizations) from which data pertinent to a
problem may be collected.

Defined target population consists of elements identified as key informants essential
for investigation based on the limitations of the
research project.

SAMPLING
TERMINOLOGIES
What is your population of interest?
To whom do you want to generalize your
results?
E.g. all doctors, school children, women aged
15-45 years, etc.
Can you sample the entire population?

Population of interest
Coverage of a target population by a frame
[Figure: target population, frame population, and covered population]
Undercoverage: elements in the target population missing from the frame,
i.e., non-telephone households, when using a telephone frame to cover the full household population.
Ineligible units: elements in the frame that are not members of the target population,
i.e., business telephone numbers, when using a telephone frame to cover the full household population.
Sampling Distribution

Theoretical concept referring to the frequency distribution of a specific sample statistic.

Sampling Error

Any type of bias that results from mistakes either in the determination of the sample size
or in the utilization of the sampling technique
or method.

SAMPLING
TERMINOLOGIES
Confidence Interval (Margin of Error)

Statistical range of values within which the true value of the target population parameter is
expected to lie. The margin of error is the plus-or-minus figure
usually reported in opinion poll results.

For example, if you use a margin of error of 4 percentage points
and 47% of your sample picks an answer, you can
be reasonably confident (at your chosen confidence level) that, if you had asked the question of
the entire relevant population, between 43% (47-4)
and 51% (47+4) would have picked that answer.

SAMPLING
TERMINOLOGIES
Confidence Level

Tells you how sure you can be. Expressed as a percentage, it represents how often the true
percentage of the population who would pick
an answer lies within the confidence interval.

The most common confidence levels are 90%, 95%, and 99%.

SAMPLING
TERMINOLOGIES
Central Limit Theorem

For almost all target populations, the sampling distribution of the mean or the percentage value
derived from a simple random sample will
be approximately normally distributed, provided
that the sample size is sufficiently large. (Hair, 2006)

The probability is high that the mean of any sample taken from the target population will closely
approach the population mean as the sample size increases.

SAMPLING
TERMINOLOGIES
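A small simulation can make the Central Limit Theorem concrete. The sketch below is illustrative only: the skewed population, the sample sizes, and the helper name `sample_means` are assumptions made for the example, not part of the lesson.

```python
import random
import statistics


def sample_means(population, sample_size, draws, seed=0):
    """Return the means of `draws` simple random samples of `sample_size` units."""
    rng = random.Random(seed)
    return [
        statistics.fmean(rng.sample(population, sample_size))
        for _ in range(draws)
    ]


# Hypothetical, deliberately skewed population (exponentially distributed values).
rng = random.Random(42)
population = [rng.expovariate(1.0) for _ in range(10_000)]
pop_mean = statistics.fmean(population)

means_small = sample_means(population, sample_size=5, draws=2_000)
means_large = sample_means(population, sample_size=100, draws=2_000)

print(f"population mean: {pop_mean:.3f}")
for label, means in (("n=5", means_small), ("n=100", means_large)):
    print(
        f"{label}: mean of sample means = {statistics.fmean(means):.3f}, "
        f"spread = {statistics.stdev(means):.3f}"
    )
```

With the larger sample size, the sample means cluster more tightly around the population mean, which is the behavior the theorem describes.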
Standard Deviation

Variance expected in responses

Using 0.5 is a safe choice: StdDev x (1 - StdDev) is largest at 0.5, which ensures that the
sample will be large enough

SAMPLING
TERMINOLOGIES
3 Factors that influence sample
representativeness
Sampling procedure
Sample size
Participation (response)
When might you sample the entire population?
When your population is very small
When you have extensive resources
When you don't expect a very high response
Factors that determine sample size are:
Nature of data and data analysis
Kind and number of comparisons
Number of variables to be investigated
Desired degree of accuracy
Homogeneity of samples

Determining Sample size for:


Unknown population
Known population

SIZE
DETERMINATION
Confidence level corresponds to a Z-score, a constant
value needed for the equation.

Z-scores for the most common confidence levels:
80% - Z-score = 1.282
90% - Z-score = 1.645
95% - Z-score = 1.96
97% - Z-score = 2.17
99% - Z-score = 2.576

For other values, see
http://www.utdallas.edu/dept/abp/zscoretable.pdf or
http://www.sjsu.edu/faculty/gerstman/StatPrimer/z-two-tails.pdf
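As an alternative to a printed table, the Z-score for any two-tailed confidence level can be computed directly. This is a minimal sketch using Python's standard library; the function name `z_score` is an assumption made for illustration.

```python
from statistics import NormalDist


def z_score(confidence_level):
    """Two-tailed critical value for a confidence level given as a fraction (e.g. 0.95)."""
    # Half of the excluded probability sits in each tail of the standard normal curve.
    return NormalDist().inv_cdf((1 + confidence_level) / 2)


for level in (0.80, 0.90, 0.95, 0.97, 0.99):
    print(f"{level:.0%}: Z-score = {z_score(level):.3f}")
```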

SIZE
DETERMINATION
Sample sizes as small as 30 are generally
adequate to ensure that the sampling
distribution of the mean will approximate the
normal curve (Shott, 1990 as cited in Cristobal,
2013)

When the total population is equal to or less


than 100, this same number may serve as the
sample size (universal sampling).

SIZE
DETERMINATION
Used in determining sample sizes from
unknown population

Necessary Sample Size =
$\dfrac{(\text{Z-score})^2 \times \text{StdDev} \times (1 - \text{StdDev})}{(\text{margin of error})^2}$

Sample size =
$0.25 \times \left(\dfrac{\text{desired certainty}}{\text{acceptable error}}\right)^2$

FORMULA
Unknown Population Size
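The two expressions for an unknown population agree when the standard deviation is set to 0.5. A short derivation, writing $Z$ for the Z-score (the "desired certainty"), $p$ for StdDev, and $e$ for the margin of error (the symbols $p$ and $e$ are shorthand introduced here, not notation from the slides):

```latex
\begin{aligned}
n &= \frac{Z^2 \, p \, (1 - p)}{e^2} \\
  &= \frac{Z^2 \times 0.5 \times 0.5}{e^2} \qquad (p = 0.5) \\
  &= 0.25 \left( \frac{Z}{e} \right)^2
\end{aligned}
```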
Assume you chose a 95% confidence level, a standard deviation
of 0.5, and a margin of error
(confidence interval) of +/- 5%. How many
respondents are needed?

Recall: Necessary Sample Size =
$\dfrac{(\text{Z-score})^2 \times \text{StdDev} \times (1 - \text{StdDev})}{(\text{margin of error})^2}$

$(1.96)^2 \times 0.5 \times (1 - 0.5) / (0.05)^2$
$= (3.8416 \times 0.25) / 0.0025$
$= 0.9604 / 0.0025$
$= 384.16$ (385 respondents, rounded up)

Example 1
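The calculation for an unknown population can be sketched as a small function. The function and parameter names are assumptions chosen for illustration; the formula is the one stated in the lesson, and the result is rounded up to a whole respondent.

```python
import math


def sample_size_unknown_population(z, std_dev, margin_of_error):
    """Unrounded sample size: z^2 * std_dev * (1 - std_dev) / margin_of_error^2."""
    return z ** 2 * std_dev * (1 - std_dev) / margin_of_error ** 2


raw = sample_size_unknown_population(z=1.96, std_dev=0.5, margin_of_error=0.05)
# Round up, because a fraction of a respondent cannot be surveyed.
print(f"unrounded: {raw:.2f}, respondents needed: {math.ceil(raw)}")
```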
If a survey of a group of teachers requires a
confidence level of 95% and 5% acceptable
error, then the researcher needs a sample
of 385 teachers.

For 95% certainty, use 1.96

$0.25 \times (1.96 / 0.05)^2$
$= 0.25 \times 1{,}536.64$
$= 384.16$ (385 teachers, rounded up)

Example 2
If a researcher wants to examine a group of
students with 97% certainty and 3% acceptable error, then:

For 97% certainty, use 2.17

$0.25 \times (2.17 / 0.03)^2$
$= 0.25 \times 5{,}232.111$
$= 1{,}308.03$ (1,309 students, rounded up)

Example 3
Assume you wish to randomly sample teachers
across the southeast USA. With a 90%
confidence level and 10% acceptable error, what sample size would be
needed?

$0.25 \times (1.645 / 0.10)^2$
$= 0.25 \times 270.6025$
$= 67.65$ (68 teachers, rounded up)

Exercise
You wish to estimate the proportion of all
buyers that are young buyers with 95%
confidence. How many observations are
needed to estimate the population proportion
within 0.06 error?
Assuming that nothing is known about the
population proportion.
Assuming that the population proportion has
recently been estimated to be 0.34.

Exercise
Assuming that nothing is known about the
population proportion.

$(1.96)^2 \times 0.5 \times (1 - 0.5) / (0.06)^2$
$= (3.8416 \times 0.25) / 0.0036$
$= 0.9604 / 0.0036$
$= 266.7778$ (267 respondents, rounded up)

Exercise
Assuming that the population proportion has
recently been estimated to be 0.34.

Since a prior estimate of the population proportion is
known (0.34), use it in place of 0.5 as the standard deviation.

$(1.96)^2 \times 0.34 \times (1 - 0.34) / (0.06)^2$
$= (3.8416 \times 0.34 \times 0.66) / 0.0036$
$= (3.8416 \times 0.2244) / 0.0036$
$= 0.8620 / 0.0036$
$\approx 239.4$ (240 respondents, rounded up)

Exercise
Use of Slovin's Formula (used to calculate an
appropriate sample size from a population).

$n = \dfrac{N}{1 + N e^2}$
Where:
n = sample size
N = population size
e = estimate of error (acceptable error)
* The error tolerance, e, can be given (in a question) or,
if you want to figure it out on your own, just
subtract the confidence level from 1.

FORMULA
Known Population Size
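Slovin's formula translates directly into code. In this minimal sketch the function name `slovin` and the population of 1,000 are assumptions made for illustration; the function returns the unrounded value so the rounding step stays visible.

```python
import math


def slovin(population_size, error):
    """Unrounded Slovin sample size: N / (1 + N * e^2)."""
    return population_size / (1 + population_size * error ** 2)


# Hypothetical case: a population of 1,000 with a 5% acceptable error.
raw = slovin(population_size=1_000, error=0.05)
print(f"unrounded: {raw:.2f}, sample size: {math.ceil(raw)}")
```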
Estimate the sample size from a population of
5000 freshmen students using 5% acceptable
error.

$n = \dfrac{N}{1 + N e^2}$
$= 5000 / (1 + (5000)(0.05)^2)$
$= 5000 / (1 + 12.5)$
$= 5000 / 13.5$
$= 370.37$ (371 students, rounded up)

Example 1
A researcher plans to conduct a survey. If the
population of Smith City is 1,000,000, find the
sample size if the margin of error is 15%.

$n = \dfrac{N}{1 + N e^2}$
$= 1{,}000{,}000 / (1 + (1{,}000{,}000)(0.15)^2)$
$= 1{,}000{,}000 / (1 + (1{,}000{,}000)(0.0225))$
$= 1{,}000{,}000 / (1 + 22{,}500)$
$= 1{,}000{,}000 / 22{,}501$
$= 44.44$ (45 people, rounded up)

Exercise
Sampling Theory
and Methods
Lesson 2
Sampling Methods
Suppose a school would like to determine the
weekly food expenditure of its students. If
there are 1,000 students and the guidance
counselor decides to use only 100 students as a
sample, who will be included in the sample?

The Problem
SAMPLING BREAKDOWN
[Figure labels: target population, study population, sample, sampling]
Sampling Method or Sampling Technique is the
process of determining or selecting those
sample units which would provide the
required estimates with associated margins of
error, arising from examining only a part and
not the whole.

Two methods are used: probability sampling


and non-probability sampling.

Sampling Method
Sampling Technique

Probability Sampling          Non-Probability Sampling
Simple Random Sampling        Convenience Sampling
Systematic Random Sampling    Quota Sampling
Stratified Random Sampling    Purposive Sampling
Cluster Sampling
Multi-stage Sampling

Sampling Method
Each eligible member of the population has a
specific and known chance (greater than zero)
of being included in the sample.

Also termed as scientific sampling.

When using this technique, it is important to


have a complete list of the members of the
population.

Probability Sampling
When every element in the population
has the same probability of selection, this is
known as an equal probability of selection
(EPS) design.

Also referred to as self-weighting because all sampled units are given the same weight.

Probability Sampling
Applicable when the population is small,
homogeneous, and readily available

Lottery method
Write names or codes on a piece of paper
Put in a container
Randomly select the desired number of samples

Fishbowl technique

Probability Sampling
Simple Random Sampling
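The lottery method has a direct software equivalent. The sketch below assumes a hypothetical sampling frame of 1,000 student ID codes and draws 100 of them without replacement; the function name and the seed are illustrative choices.

```python
import random


def simple_random_sample(frame, sample_size, seed=None):
    """Draw `sample_size` distinct units; every subset of that size is equally likely."""
    rng = random.Random(seed)
    return rng.sample(frame, sample_size)


# Hypothetical sampling frame: student ID codes 1 to 1000.
student_ids = list(range(1, 1001))
chosen = simple_random_sample(student_ids, 100, seed=2024)
print(sorted(chosen)[:10])
```

Passing a seed makes the draw repeatable for checking; leaving it as `None` gives a fresh draw each time.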
Advantage
Estimates are easy to calculate
Simple random sampling is always an EPS
design but not all EPS designs are simple
random sampling.
Disadvantage
If sampling frame is large, the method is
impracticable
Minority subgroups of interest in population
may not be present in sample in sufficient
numbers for study

Probability Sampling
Simple Random Sampling
Relies on arranging the target population
according to some ordering scheme and then
selecting elements at regular intervals through
that ordered list.

Choose a starting point, then select every kth element of the population, using
k = (population size / sample size).

Probability Sampling
Systematic Random Sampling
Ex. The town of Fairfax is divided into 576
blocks, which are numbered consecutively. A 10
percent sample of blocks is to be taken for the
study. If the random number chosen between 1
and k is 3, which blocks are included in the
sample?

Probability Sampling
Systematic Random Sampling
Solution:
Determine k: k = 576 / (10 percent of 576)
= 576 / 57.6 = 10
This means you have to include every 10th member
of the population after choosing a random start.
Random number = 3 (can be determined via the
lottery method)
Blocks considered for the sample are: 03, 13, 23,
33, 43, ..., 573

Probability Sampling
Systematic Random Sampling
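The Fairfax solution can be reproduced in a few lines. This is a sketch under the example's own numbers (576 blocks, a 10 percent sample, random start 3); the function name `systematic_sample` is an assumption, and in practice the start would be drawn at random between 1 and k.

```python
def systematic_sample(population_size, interval, start):
    """Select unit `start`, then every `interval`-th unit after it (1-based numbering)."""
    if not 1 <= start <= interval:
        raise ValueError("start must lie between 1 and the interval k")
    return list(range(start, population_size + 1, interval))


# k = population size / sample size, with a sample of 10 percent of 576 blocks.
k = round(576 / (0.10 * 576))
blocks = systematic_sample(population_size=576, interval=k, start=3)
print(f"k = {k}, blocks selected = {len(blocks)}")
print(blocks[:5], "...", blocks[-1])
```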
Systematic sampling is an EPS method, because all
elements have the same probability of selection (in
the example given, one in ten). It is not 'simple
random sampling' because different subsets of the
same size have different selection probabilities - e.g.
the set {4,14,24,...,574} has a one-in-ten probability of
selection, but the set {4,13,24,34,...} has zero
probability of selection.

Probability Sampling
Systematic Random Sampling
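The distinction between EPS and simple random sampling can be checked by enumeration. The sketch below uses a hypothetical frame of 20 units with interval k = 5 (numbers chosen only to keep the listing small) and counts how many distinct samples each method can produce.

```python
from math import comb

population = list(range(1, 21))  # hypothetical frame of 20 units
k = 5                            # sampling interval, so each sample has 4 units

# Systematic sampling: exactly one possible sample per random start 1..k.
systematic_samples = [tuple(range(start, 21, k)) for start in range(1, k + 1)]

# Each unit appears in exactly one of the k samples, so P(selection) = 1/k for all.
inclusion_counts = {
    unit: sum(unit in sample for sample in systematic_samples)
    for unit in population
}

# Simple random sampling would allow every 4-unit subset of the frame.
srs_sample_count = comb(len(population), 4)

print("possible systematic samples:", len(systematic_samples))
print("possible simple random samples:", srs_sample_count)
print("every unit in exactly one systematic sample:",
      all(count == 1 for count in inclusion_counts.values()))
```

Every unit has the same chance of selection, yet only a handful of the possible subsets can ever be drawn, which is why the design is EPS but not simple random sampling.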
Advantages
Sample is easy to select
Suitable sampling frame can be identified easily
Sample evenly spread over entire reference
population.

Disadvantages
Sample may be biased if hidden periodicity in
population coincides with that of selection.
Difficult to assess precision of estimate from one
survey.

Probability Sampling
Systematic Random Sampling
Where the population embraces a number of
distinct categories, the frame can be organized
into separate strata.
Each stratum is then sampled as an
independent sub-population, out of which
individual elements can be randomly selected.
Adequate representation of minority
subgroups of interest can be ensured by
stratification and varying sampling fraction
between strata as required.

Probability Sampling
Stratified Random Sampling
Ex. Assuming a barangay of 10,000 families
belonging to different income brackets, a
survey to find out how many are in favor of the
RH bill is to be conducted. To ensure that all
income groups are represented, respondents
will be divided into High-Income, Average-
Income, and Low-Income.

Probability Sampling
Stratified Random Sampling
Strata            Number of families
High Income       2,000
Average Income    5,000
Low Income        3,000
Total             N = 10,000

Probability Sampling
Stratified Random Sampling
Using a 5% margin of error, how many families
should be included in the sample?

Using either proportional or equal allocation,


how many from each group should be taken as
samples?

Probability Sampling
Stratified Random Sampling
Using a 5% margin of error, how many families
should be included in the sample?

$n = 10{,}000 / (1 + (10{,}000)(0.05)^2)$
$n = 10{,}000 / (1 + (10{,}000)(0.0025))$
$n = 10{,}000 / (1 + 25)$
$n = 10{,}000 / 26$
$n = 384.62$ (385 families, rounded up)

Probability Sampling
Stratified Random Sampling
Using either proportional or equal allocation,
how many from each group should be taken as
samples?

Probability Sampling
Stratified Random Sampling
Strata            Number of families   Percent                         Proportional      Equal
High Income       2,000                2,000 / 10,000 = 0.20 or 20%    0.2(385) = 77     385/3 = 128
Average Income    5,000                5,000 / 10,000 = 0.50 or 50%    0.5(385) = 193    129
Low Income        3,000                3,000 / 10,000 = 0.30 or 30%    0.3(385) = 115    128
Total             N = 10,000           100%                            385               385

Probability Sampling
Stratified Random Sampling
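Proportional allocation can be sketched as a function that splits the sample across strata by their share of the population. The rounding rule used here (largest remainder, with ties going to the stratum listed first) is an assumption chosen so that the parts always add up to the sample size; the lesson does not state a rule.

```python
def proportional_allocation(strata_sizes, sample_size):
    """Split `sample_size` across strata in proportion to their population sizes."""
    total = sum(strata_sizes.values())
    # Whole-number share and leftover remainder for each stratum (integer arithmetic).
    shares = {
        name: divmod(size * sample_size, total)
        for name, size in strata_sizes.items()
    }
    allocation = {name: whole for name, (whole, _) in shares.items()}
    # Hand out the units lost to rounding, largest remainder first.
    leftover = sample_size - sum(allocation.values())
    by_remainder = sorted(shares, key=lambda name: shares[name][1], reverse=True)
    for name in by_remainder[:leftover]:
        allocation[name] += 1
    return allocation


families = {"High Income": 2_000, "Average Income": 5_000, "Low Income": 3_000}
for stratum, count in proportional_allocation(families, 385).items():
    print(f"{stratum}: {count}")
```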
Drawbacks to using stratified sampling.
Sampling frame of entire population has to be
prepared separately for each stratum
When examining multiple criteria, stratifying
variables may be related to some, but not to
others, further complicating the design and
potentially reducing the utility of the strata
In some cases (such as design with a large
number of strata, or those with a specified
minimum sample size per group), stratified
sampling can potentially require a larger sample
than would other methods.

Probability Sampling
Stratified Random Sampling
Normally used in large-scale studies in which
the population is geographically spread out
where sampling procedures may be difficult
and time consuming.
Example of two-stage sampling
Process:
First stage, divide the population area into
sections (or clusters) then randomly select a few
of those sections
Second stage, randomly select a sample from
those sections (within those areas selected)

Probability Sampling
Cluster Sampling
Ex. Suppose we want to determine the average
daily expenses of families in the City of Santa
Rosa. We can draw a random sample of 5
barangays from the total 15 using random
sampling and then get a certain number of
families from each of the 5 barangays.

Probability Sampling
Cluster Sampling
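The two-stage process in the Santa Rosa example can be sketched as follows. The barangay names, the 200 families per barangay, and the 20 families drawn per cluster are hypothetical values for illustration; only the 5-of-15 first stage comes from the example.

```python
import random


def two_stage_cluster_sample(clusters, clusters_to_pick, units_per_cluster, seed=None):
    """Stage 1: pick clusters at random. Stage 2: pick units at random within each."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), clusters_to_pick)
    return {name: rng.sample(clusters[name], units_per_cluster) for name in chosen}


# Hypothetical frame: 15 barangays, each listing 200 family IDs.
barangays = {
    f"Barangay {i}": [f"B{i}-F{j}" for j in range(1, 201)]
    for i in range(1, 16)
}

sample = two_stage_cluster_sample(
    barangays, clusters_to_pick=5, units_per_cluster=20, seed=7
)
for name, selected in sample.items():
    print(name, len(selected))
```

Only the selected barangays need a family list, which is where the saving on sampling-frame preparation comes from.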
Advantages
Cuts down on the cost of preparing a sampling
frame
Can reduce travel and other administrative costs

Disadvantages
Sampling error is higher than for a simple random
sample of the same size

Probability Sampling
Cluster Sampling
Complex form of cluster sampling in which
two or more levels of units are embedded one
in the other.

Combination of several sampling techniques


discussed

Essentially the process of taking random


samples of preceding random samples

Probability Sampling
Multi-Stage Sampling
Not as effective as true random sampling, but
probably solves more of the problems inherent
to random sampling.

Used frequently when a complete list of all members of the population does not exist
or is inappropriate.

Probability Sampling
Multi-Stage Sampling
Members of the sample are drawn based
on their availability or because of the judgment
of the researcher.

It is a process of selection in which not all


members of the entire population are given a
chance of being selected as samples.

Sometimes referred to as subjective sampling or


non-scientific sampling.

Non-Probability Sampling
Example: We visit every household in a given
street, and interview the first person to
answer the door. In any household with more
than one occupant, this is a nonprobability
sample, because some people are more likely to
answer the door (e.g. an unemployed person
who spends most of their time at home is
more likely to answer than an employed
housemate who might be at work when the
interviewer calls) and it's not practical to
calculate these probabilities.

Non-Probability Sampling
A type of nonprobability sampling which
involves the sample being drawn from that part
of the population which is close to hand.

Simply use results that are readily available.

Sometimes known as grab or opportunity


sampling. Also called accidental or incidental
sampling.

Non-Probability Sampling
Convenience Sampling
Similar to stratified sampling in which the
researcher first identifies the strata and their
proportions as they are represented in the
population. Then judgment is used to select
subjects or units from each segment based on a
specified proportion.

Convenience or judgment sampling is used to


select the required number of subjects from
each stratum.

Non-Probability Sampling
Quota Sampling
Handpicking of subjects; also called judgmental
sampling

Selects members or elements based on the


particular purpose of the experiment or study.

Useful for situations where you need to reach a


targeted sample quickly and where sampling
for proportionality is not the primary concern.

Non-Probability Sampling
Purposive Sampling
The sampling process comprises several
stages:
Defining the population of concern
Specifying a sampling frame, a set of items or
events possible to measure
Specifying a sampling method for selecting items
or events from the frame.
Determining the sample size
Implementing the sampling plan
Sampling and data collecting
Reviewing the sampling process

Sampling Process
Suppose a school would like to determine the
weekly food expenditure of its students. If there are
1,000 students and the guidance counselor decides
to use only 100 students as a sample, who will be
included in the sample?

Find a partner and help the school decide which sampling technique
it should use in selecting students
for the sample. Present three (3)
choices.

The Problem
What sampling method do you recommend?
Determining proportion of undernourished five
year olds in a village.
Investigating nutritional status of preschool
children.
Selecting maternity records for the study of
previous abortions or duration of postnatal stay.
In estimation of immunization coverage in a
province, data on seven children aged 12-23
months in 30 clusters are used to determine
proportion of fully immunized children in the
province.

Exercise
Identify which sampling technique was used in
the following scenarios:
1. When she wrote Women and Emoticons, author
KC Pascual based conclusions on 4,500
responses from 100,000 questionnaires
distributed to women.
2. The Guidance Counselor of a high school
surveys all students from each of the 20
randomly selected classes.
3. A sociologist at selects 15 men and 15 women
from each of 4 Math classes.

Exercise
Identify which sampling technique was used in
the following scenarios (cont'd):
4. IBM selects every 200th compact disk from the
assembly line and conducts a thorough test of
quality.
5. The court secretary writes the name of each
Municipal Judge on a separate card, shuffles
the cards, and then draws 3 names.
6. Pro-RH bill lobbyists poll 300 men and 300
women about their views concerning the use of
contraceptives.

Exercise
Identify which sampling technique was used in
the following scenarios (cont'd):
7. The marketing manager of ebay.ph tests a new
sales strategy by randomly selecting 150
consumers with less than P100,000 in gross
income and 150 consumers with gross income
of at least P100,000.
8. A market researcher for Champion Detergent
interviews all passengers on each of 10
randomly selected PUVs.

Exercise
Identify which sampling technique was used in
the following scenarios (cont'd):
9. A medical researcher from Unilever Phils.
interviews all leukemia patients in each of 20
randomly selected hospitals.
10. In conducting research for the evening news, a
reporter for ABS-CBN interviews 15 people as
they leave a shopping mall.

Exercise
Solve the following:
A researcher would like to investigate the
perception of students on mathematics. He
divided the population into sub-populations as
shown in the table. Use stratified random
sampling if the sample to be drawn consists of
500 students.

Exercise
Strata         Number of Students
First year     1,600
Second year    1,500
Third year     1,400
Fourth year    1,000

Exercise
The end
Cochran's Formula (William G. Cochran)

Where:

N is the population size, z is the standard normal variate based on the confidence coefficient, p is
the estimate for P, and e is a specified margin of
error.

FORMULA
Known Population Size
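The equation for this formula is not reproduced in the text above, so the sketch below assumes the commonly cited finite-population form, n = n0 / (1 + (n0 - 1) / N) with n0 = z^2 p (1 - p) / e^2, using the symbols N, z, p, and e as defined. Treat that form, the function name, and the example inputs as assumptions.

```python
import math


def cochran_sample_size(population_size, z, p, e):
    """Unrounded sample size under the assumed finite-population Cochran form."""
    # Sample size for a very large (effectively infinite) population.
    n0 = z ** 2 * p * (1 - p) / e ** 2
    # Finite-population correction shrinks n0 when N is not much larger than n0.
    return n0 / (1 + (n0 - 1) / population_size)


# Hypothetical inputs: N = 5,000, 95% confidence (z = 1.96), p = 0.5, e = 0.05.
raw = cochran_sample_size(population_size=5_000, z=1.96, p=0.5, e=0.05)
print(f"unrounded: {raw:.2f}, sample size: {math.ceil(raw)}")
```

For a very large N the correction term vanishes and the result approaches the unknown-population value computed from the same z, p, and e.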
