4 Sampling

Sampling
Why sampling:

Time available to take a decision
Cost of gathering data
Reasonable accuracy of information
Destructive testing
Extensively used in market research (MR), quality control (QC), and economic, biological and pharmaceutical studies
Sampling Errors
Sampling error
Non-sampling error
Sample size
Census
Sampling Errors
Sampling errors: variation in the mean and standard deviation of the survey sample against the population
Non-sampling errors:
  Inaccurate reporting by respondents
  Poor sampling design
  Misinterpretation of questions
  Respondents lying

Sampling Process
Defining the population to be sampled
  Element: unit about which information is collected (consumer, company, dealer, household)
  Population: aggregation of elements; the relevant segment
  Sampling unit: elements available for selection at some stage of the sampling process
  Survey population: aggregation of elements from which the actual survey sample is chosen
Defining the frame (boundaries): subset of the population; geographical, or within some published or available data such as Rotary and similar clubs
Choosing the method of selecting sample units
Deciding on the size of the sample
Identifying and selecting the actual members of the sample
Determining Sample Size

e = tolerable (acceptable) error
Confidence level:
  90% = $1.645\sigma$ (1.645 standard deviations)
  95% = $1.96\sigma$
  99% = $2.58\sigma$
Approximate estimate of standard deviation:
  $s \approx (\text{maximum} - \text{minimum})/6$
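Sampling distribution of sample means
[Figure: normal curve of sample means, marked at 1, 2 and 3 standard deviations on either side of the mean]
  Within $\pm 1\sigma$ = 68.27%
  Within $\pm 2\sigma$ = 95.45%
  Within $\pm 3\sigma$ = 99.73%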
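The coverage figures for 1, 2 and 3 standard deviations can be checked numerically. The sketch below assumes the sample means follow a normal distribution; the function names `coverage_within` and `z_for_confidence` are illustrative and do not come from the slides.

```python
import math
from statistics import NormalDist


def coverage_within(k: float) -> float:
    """Share of a normal distribution lying within +/- k standard deviations."""
    return math.erf(k / math.sqrt(2))


def z_for_confidence(level: float) -> float:
    """Two-sided z factor for a confidence level such as 0.95."""
    return NormalDist().inv_cdf(0.5 + level / 2)


for k in (1, 2, 3):
    print(f"within {k} sd: {coverage_within(k):.2%}")

for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%} confidence: z = {z_for_confidence(level):.3f}")
```

The z factor computed for 99% is about 2.576, which the slides round to 2.58.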
For continuous or interval-scaled variables (scales such as 1-5, 1-7, 1-10):
$n = (z s / e)^2$, where
  z = factor for the desired confidence level: 1.645 for 90%, 1.96 for 95%, 2.58 for 99%
  s = standard deviation, approximated by (max - min)/6
  e = tolerable error in estimating the variable
Sample Sizes
Interval-scaled variable: 1 to 7

Confidence level    Error level (e)
                     0.10     0.05     0.02     0.01
90%                   270     1080     6800    27000
95%                   385     1540     9600    38500
99%                   670     2680    16600    67000
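As a rough check on these sample sizes, the formula n = (z·s/e)^2 can be evaluated directly. This is a minimal sketch, assuming s = (7 - 1)/6 = 1 for the 1-to-7 scale; the function name `sample_size_interval` is illustrative. The table's entries appear to be rounded versions of the values it prints.

```python
def sample_size_interval(z: float, s: float, e: float) -> float:
    """n = (z * s / e)^2 for a continuous or interval-scaled variable."""
    return (z * s / e) ** 2


# z factors as quoted in the slides
Z_FACTORS = {"90%": 1.645, "95%": 1.96, "99%": 2.58}

# Approximate standard deviation for a 1-to-7 scale: (max - min) / 6
s = (7 - 1) / 6

for label, z in Z_FACTORS.items():
    row = [round(sample_size_interval(z, s, e)) for e in (0.10, 0.05, 0.02, 0.01)]
    print(label, row)
```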
When estimating proportions
$n = p q (z/e)^2$, where
p = frequency of occurrence expressed as a proportion, for example 1 in 4 = 0.25, 1 in 10 = 0.10
  Represents things like market share, or the proportion of the target market with respect to variables such as age, gender or profession
  p is always less than 1
q = 1 - p
z = confidence level factor
e = tolerable error expressed as a fraction (%/100), for example 3% error = 0.03, 5% error = 0.05
Sample size is maximum at p = 0.50 for a given z and e
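A short sketch of the proportion formula, which also illustrates why p = 0.50 gives the largest sample. The function name `sample_size_proportion` and the choice to round up to a whole respondent are assumptions made for illustration.

```python
import math


def sample_size_proportion(p: float, z: float, e: float) -> int:
    """n = p * q * (z / e)^2 with q = 1 - p, rounded up to a whole respondent."""
    if not 0 < p < 1:
        raise ValueError("p must lie strictly between 0 and 1")
    q = 1 - p
    return math.ceil(p * q * (z / e) ** 2)


# 95% confidence (z = 1.96) and 5% tolerable error (e = 0.05)
n_quarter = sample_size_proportion(0.25, 1.96, 0.05)  # 1 in 4 -> 289
n_half = sample_size_proportion(0.50, 1.96, 0.05)     # worst case -> 385
print(n_quarter, n_half)
```

Because p·q peaks at 0.25 when p = 0.50, using p = 0.50 is a conservative choice when the true proportion is unknown.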
Sample Sizes
Proportions: error level e (in %) at various frequencies of occurrence (FOQ)
Confidence level: 90%
FOQ: 10, 20, 30
Sample size: 50, 100, 500, 1000, 5000
7.8 10.7 12.3 6.2 4.7 7.2 1.8 1.4 2.1 1.0 0.7 0.4 0.6 0.6
8.5 11.4 13.0 8.0 6.0 9.2 2.7 4.1 1.9 3.6 2.6
9.0 12.0 13.7 9.8 7.3 11.2 5.4 4.0 6.1 2.8 4.3 1.3 3.8 1.7
2.9 0.85 1.1
Cell size analysis:
  Sample size should be more than 10 times the number of cells required
  Cell: one combination of categories across the classifying variables; for example, with 4 age groups and 4 income categories, sample size > 4 × 4 × 10 = 160
In multiple questions with varying interval-scaled variables, set the sample size for the major variable
If wider geographical coverage is required, insist on a minimum sample size at each centre (in case the sample size obtained from the formula becomes too small)
Time and budget constraints
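The cell size rule amounts to multiplying the number of groups in each classifying variable and then multiplying by 10. A minimal sketch follows; the function name `min_sample_for_cells` and its `per_cell` parameter are illustrative.

```python
from math import prod


def min_sample_for_cells(groups_per_variable, per_cell=10):
    """Minimum sample = number of cells x respondents wanted per cell."""
    return prod(groups_per_variable) * per_cell


# 4 age groups x 4 income categories = 16 cells, 10 respondents each
print(min_sample_for_cells([4, 4]))  # 160
```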
Probability Sampling
Simple Random:
  Picking out of the lot at random
  Practical for smaller populations
Systematic Sampling:
  Subdivide the population into as many units as the sample size and choose one element from each unit
  Example: to select 100 out of 2000, size of each unit (sampling interval) = 2000/100 = 20
  Choose 1 out of every 20: pick at random within the first unit, say 6, then keep adding 20 to get 26, 46, 66, etc.
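The two selection methods can be sketched with Python's standard `random` module. The population of 2000 numbered elements and the fixed seed are hypothetical and only make the example repeatable. The function names are illustrative.

```python
import random


def simple_random_sample(population, n, rng):
    """Draw n elements without replacement, each equally likely."""
    return rng.sample(population, n)


def systematic_sample(population, n, rng):
    """Pick a random start in the first interval, then every k-th element."""
    k = len(population) // n   # sampling interval, e.g. 2000 // 100 = 20
    start = rng.randrange(k)   # zero-based position inside the first interval
    return [population[start + i * k] for i in range(n)]


rng = random.Random(42)                # fixed seed, for repeatability only
population = list(range(1, 2001))      # hypothetical elements numbered 1..2000
chosen = systematic_sample(population, 100, rng)
print(chosen[:4])                      # consecutive picks are 20 apart
```

In this sketch only the starting point is random in the systematic method, so the order of the list matters: any periodic pattern in the list that matches the interval would bias the sample.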
Stratified Random Sampling:
Proportionate: dividing into segments based on:
  % of each segment ($w_i$)
  Standard deviation of each segment ($s_i$)
$n = (z/e)^2 \sum_i w_i s_i^2$
Example: z = 1.96 for a 95% confidence level; e = 0.05 error; segment distribution:

Segment            w_i     s_i     Sample size
< 25 years         0.3     1.2     666
26 to 40 years     0.3     0.9     375
> 40 years         0.4     0.7     300

Total sample size = 1341 approx.
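The stratified example can be reworked from the formula. This sketch assumes each segment's share of the sample is its own term (z/e)^2 · w_i · s_i^2, which is one reading of how the per-segment figures were obtained. The function name `stratified_sample_sizes` is illustrative.

```python
def stratified_sample_sizes(weights, std_devs, z, e):
    """Per-stratum sizes (z/e)^2 * w_i * s_i^2; their sum is the total n."""
    factor = (z / e) ** 2
    return [factor * w * s ** 2 for w, s in zip(weights, std_devs)]


sizes = stratified_sample_sizes([0.3, 0.3, 0.4], [1.2, 0.9, 0.7], z=1.96, e=0.05)
total = sum(sizes)
print([round(n) for n in sizes], round(total))
```

The computed values come out close to the approximate sizes quoted in the example, though not identical to them. The small gap is presumably due to rounding in the original.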
Disproportionate: used in special cases to balance:
  Desired z & e
  Degree of heterogeneity
  Relevance of various strata to the study

Type of store          % of all food stores    % of retail food sales
Corporate chains        8%                     26%
Co-operatives          10%                     30%
Large independent      12%                     16%
Medium independent     30%                     19%
Small independent      40%                      9%
Cluster Sampling / Area Sampling:
  In the first stage, clusters are identified and selected
  Sample elements are then selected from these clusters
  Disadvantage: clusters tend to behave similarly
Multi-stage or Combination Sampling:
  Combining cluster sampling and stratified sampling
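The two stages of cluster sampling can be sketched as below. The areas and household identifiers are hypothetical, the seed is fixed only for repeatability, and the function name `two_stage_cluster_sample` is illustrative.

```python
import random


def two_stage_cluster_sample(clusters, n_clusters, n_per_cluster, rng):
    """Stage 1: pick clusters at random. Stage 2: pick elements inside each."""
    picked = rng.sample(sorted(clusters), n_clusters)
    return {
        name: rng.sample(clusters[name], min(n_per_cluster, len(clusters[name])))
        for name in picked
    }


rng = random.Random(7)  # fixed seed, for repeatability only
# Hypothetical areas, each holding 50 household identifiers
clusters = {f"area_{a}": [f"area_{a}_hh_{i}" for i in range(50)] for a in range(10)}
sample = two_stage_cluster_sample(clusters, n_clusters=3, n_per_cluster=5, rng=rng)
for area, households in sample.items():
    print(area, households)
```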
Selecting Sample Units
Non-probability samples
Convenience Sampling: chosen on the basis of convenience or accessibility
Snowball Sampling: further sample units obtained through referrals from earlier sample units
Judgement Sampling: based on the recommendation of experts, or on our own assessment of the spread of the population from previous studies or data
Quota Control Sampling: sample conforms to chosen parameters of the population
Census: total population
