
Sampling

Why sampling:

  Time available to take a decision
  Cost of gathering data
  Reasonable accuracy of information
  Destructive testing

Extensively used in market research (MR), quality control (QC), economic, biological & pharmaceutical studies

Sampling Errors

[Figure: sampling error and non-sampling error plotted against sample size, up to a census]

Sampling Errors

Sampling errors: variation in

  Mean and standard deviation of the survey sample against the population

Non-sampling errors:

  Inaccurate reporting by respondents
  Poor sampling design
  Misinterpretation of questions
  Respondents lying

Sampling Process

Defining population to be sampled

  Element: Unit about which information is collected (consumer, company, dealer, household)
  Population: Aggregation of elements, relevant segment
  Sampling unit: Elements available for selection at some stage of the sampling process
  Survey population: Aggregation of elements from which the actual survey sample is chosen

Defining frame (boundaries): Subset of population; geographical, or within some published or available data such as Rotary and similar clubs
Method of selecting sample units
Decide on size of sample
Identifying & selecting actual members of sample

Determining Sample Size


e = tolerable (acceptable) error

Confidence level:

  90% = 1.645σ (1.645 standard deviations)
  95% = 1.96σ
  99% = 2.58σ

Approximate estimate of standard deviation:

  s = (Maximum - Minimum)/6

Sampling distribution of sample means

[Figure: normal curve marked at 1σ, 2σ and 3σ on either side of the mean]

  Within ±1σ = 68.27%
  Within ±2σ = 95.45%
  Within ±3σ = 99.73%
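The coverage figures for 1, 2 and 3 standard deviations can be checked numerically. This is a minimal sketch, assuming the sample means follow a normal distribution; the function name `coverage_within` is illustrative and not from the slides.

```python
import math


def coverage_within(k: float) -> float:
    """Probability that a normal variable lies within k standard deviations of its mean."""
    return math.erf(k / math.sqrt(2))


# Print the coverage for 1, 2 and 3 standard deviations as percentages.
for k in (1, 2, 3):
    print(f"within {k} sigma: {coverage_within(k) * 100:.2f}%")
```

The same function, called with 1.645, 1.96 or 2.58, returns roughly 0.90, 0.95 and 0.99, which is where the confidence level factors come from.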

Determining Sample Size

For continuous or interval scaled variables like: 1-5, 1-7, 1-10 etc

n = ((z*s)/e)^2

where
  z = factor for the desired confidence level
      If 90% = 1.645
      If 95% = 1.96
      If 99% = 2.58
  s = standard deviation; (max-min)/6
  e = tolerable error in estimating the variable
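The formula above can be sketched in a few lines of Python. The function names `approx_std` and `sample_size_interval` are illustrative, and rounding the result up to the next whole respondent is an assumption of this sketch rather than something the slides state.

```python
import math

# Confidence level (%) -> z factor, as listed in the slides.
Z_FACTORS = {90: 1.645, 95: 1.96, 99: 2.58}


def approx_std(minimum: float, maximum: float) -> float:
    """Rough standard deviation estimate: (max - min) / 6."""
    return (maximum - minimum) / 6


def sample_size_interval(z: float, s: float, e: float) -> int:
    """Sample size n = ((z * s) / e)^2, rounded up to a whole respondent."""
    return math.ceil(((z * s) / e) ** 2)


# Example: a 1-to-7 scale at 95% confidence with a tolerable error of 0.10.
s = approx_std(1, 7)
print(sample_size_interval(Z_FACTORS[95], s, 0.10))
```

Halving the tolerable error roughly quadruples the sample size, because e enters the formula squared.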

Sample Sizes
Interval Scaled Variable: 1 to 7

Error level    Confidence level
               90%      95%      99%
0.10           270      385      670
0.05           1080     1540     2680
0.02           6800     9600     16600
0.01           27000    38500    67000

Determining Sample Size

When estimating proportions


n = p*q*(z/e)^2, where:

  p = Frequency of occurrence expressed as a proportion. Example:
      1 in 4 = 0.25
      1 in 10 = 0.10
      Represents things like market share, or proportion of target market with respect to variables like age, gender, profession etc
      p is always less than 1
  q = 1-p
  z = confidence level factor
  e = tolerable error expressed as a fraction (%/100)
      3% error = 0.03
      5% error = 0.05

Sample size is maximum at p=0.50 for a given z and e
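A minimal sketch of the proportion formula follows; it also shows why p = 0.50 is the worst case. The function name `sample_size_proportion` is illustrative, and rounding up is an assumption of this sketch.

```python
import math


def sample_size_proportion(p: float, z: float, e: float) -> int:
    """Sample size n = p * q * (z / e)^2 with q = 1 - p, rounded up."""
    if not 0 < p < 1:
        raise ValueError("p must be a proportion strictly between 0 and 1")
    q = 1 - p
    return math.ceil(p * q * (z / e) ** 2)


# 95% confidence (z = 1.96) and 5% tolerable error (e = 0.05).
for p in (0.10, 0.25, 0.50, 0.75, 0.90):
    print(p, sample_size_proportion(p, 1.96, 0.05))
```

Because p*q peaks at 0.25 when p = 0.50, using p = 0.50 is a cautious choice when the true frequency of occurrence is unknown.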

Sample Sizes
Proportions: e = error level in % at various FOQ (Frequency of Occurrence)

Columns: Confidence level 90% (FOQ 10, FOQ 20, FOQ 30); Confidence level 95% (FOQ 10, FOQ 20, FOQ 30); Confidence level 99% (FOQ 10, FOQ 20, FOQ 30)
Rows (sample size): 50, 100, 500, 1000, 5000

7.8 10.7 12.3 6.2 4.7 7.2 1.8 1.4 2.1 1.0 0.7 0.4 0.6 0.6

8.5 11.4 13.0 8.0 6.0 9.2 2.7 4.1 1.9 3.6 2.6

9.0 12.0 13.7 9.8 7.3 11.2 5.4 4.0 6.1 2.8 4.3 1.3 3.8 1.7

2.9 0.85 1.1

Determining Sample Size

Cell size analysis:


  Sample size should be more than 10 times the number of cells required
  Cell: each combination of categories; say age = 4 groups and income category = 4 groups; then sample size > 4*4*10, i.e. > 160

In multiple questions with varying interval scaled variables, set the sample size for the major variable
If wider geographical coverage is required, insist on a minimum sample size at each centre (if the sample size obtained from the formula becomes small)
Time and budget constraints
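The cell size rule is simple arithmetic and can be sketched as below. The function name `min_sample_for_cells` and the `per_cell` parameter are illustrative; the value returned is the threshold the sample size should exceed.

```python
import math
from typing import Sequence


def min_sample_for_cells(groups_per_variable: Sequence[int], per_cell: int = 10) -> int:
    """Threshold = (number of cells) * per_cell, where cells = product of group counts."""
    return math.prod(groups_per_variable) * per_cell


# Age in 4 groups and income in 4 groups gives 16 cells.
print(min_sample_for_cells([4, 4]))
```

Adding a third variable with, say, 3 groups would raise the threshold to 4*4*3*10 = 480, which shows how quickly cross-tabulation drives up the required sample.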

Probability Sampling

Simple Random:

  Picking out of the lot at random
  Possible for smaller populations

Systematic Sampling

  Divide the population into as many units as the sample size and choose one element from each unit
  Example: to select 100 out of 2000, size of each unit = 2000/100 = 20
  From every 20 choose 1; if in the first unit we picked, say, 6, then keep adding 20: 26, 46, 66 etc
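Systematic selection with a fixed interval can be sketched as follows. This assumes the population is numbered 1 to N and that N is at least the sample size; the function name `systematic_sample` and its parameters are illustrative.

```python
import random
from typing import List, Optional


def systematic_sample(population_size: int, sample_size: int,
                      start: Optional[int] = None,
                      seed: Optional[int] = None) -> List[int]:
    """Pick one position in the first interval, then every interval-th position after it."""
    interval = population_size // sample_size  # e.g. 2000 // 100 = 20
    if start is None:
        # Random starting point within the first interval (1..interval).
        start = random.Random(seed).randint(1, interval)
    return [start + i * interval for i in range(sample_size)]


# Select 100 out of 2000, starting from element 6 in the first unit.
picked = systematic_sample(2000, 100, start=6)
print(picked[:4], "...", picked[-1])
```

Only the starting point is random; every later pick is determined by it, which is what distinguishes this from simple random sampling.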

Probability Sampling

Stratified Random Sampling:

Proportionate: Dividing into segments based on:

  % of each segment (wi)
  Standard deviation of each segment (si)

n = (z/e)^2 * sum(wi*si^2)

Probability Sampling

Stratified Random Sampling:

Ex: z = 1.96 for confidence level 95%; e = 0.05 error

Segment distribution:

Segment           wi     si     Sample size
< 25 years        0.3    1.2    666
26 to 40 years    0.3    0.9    375
> 40 years        0.4    0.7    300

Total sample size = 1341 approx
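The stratified formula n = (z/e)^2 * sum(wi*si^2) can be evaluated for this example as a check. This is a sketch with illustrative names; evaluated directly, the formula gives about 1338, close to the approximate total quoted in the slide. The per-segment terms printed below are each segment's share of the sum (an assumption about how the slide's per-segment figures were obtained, which they approximately match).

```python
from typing import List, Sequence, Tuple


def stratified_sample_size(z: float, e: float,
                           weights: Sequence[float],
                           stds: Sequence[float]) -> Tuple[float, List[float]]:
    """Return the total n = (z/e)^2 * sum(w_i * s_i^2) and each segment's term."""
    factor = (z / e) ** 2
    terms = [factor * w * s ** 2 for w, s in zip(weights, stds)]
    return sum(terms), terms


total, per_segment = stratified_sample_size(
    z=1.96, e=0.05,
    weights=[0.3, 0.3, 0.4],   # < 25, 26 to 40, > 40 years
    stds=[1.2, 0.9, 0.7],
)
print(round(total, 1), [round(t, 1) for t in per_segment])
```

The youngest segment takes the largest share even though it is not the largest segment, because its standard deviation is the highest.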

Probability Sampling

Stratified Random Sampling:

Disproportionate: Used in special cases to balance:

  Desired z & e
  Degree of heterogeneity
  Relevance of various strata to the study

Type of store         % of all food stores    % retail food sales
Corporate chains      8%                      26%
Co-operatives         10%                     30%
Large independent     12%                     16%
Medium independent    30%                     19%
Small independent     40%                     9%

Probability Sampling

Cluster Sampling / Area Sampling:

  In the first stage clusters are identified and selected
  Sample elements are selected from these clusters
  Disadvantage: Elements within a cluster tend to behave similarly

Multi-stage or Combination Sampling:

  Combining cluster sampling and stratified sampling
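The two stages of cluster sampling (select clusters, then select elements within them) can be sketched as below. The function name, the dictionary layout and the example area names are all hypothetical, chosen only for illustration.

```python
import random
from typing import Dict, List, Optional


def two_stage_cluster_sample(clusters: Dict[str, List[str]],
                             n_clusters: int,
                             n_per_cluster: int,
                             seed: Optional[int] = None) -> Dict[str, List[str]]:
    """Stage 1: pick clusters at random. Stage 2: pick elements within each chosen cluster."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {
        name: rng.sample(clusters[name], min(n_per_cluster, len(clusters[name])))
        for name in chosen
    }


# Hypothetical areas, each holding a list of household identifiers.
areas = {
    "north": [f"N{i}" for i in range(1, 21)],
    "south": [f"S{i}" for i in range(1, 21)],
    "east": [f"E{i}" for i in range(1, 21)],
    "west": [f"W{i}" for i in range(1, 21)],
}
sample = two_stage_cluster_sample(areas, n_clusters=2, n_per_cluster=3, seed=42)
print(sample)
```

A combination design would first stratify the clusters (for example by region or store type) and then run this selection within each stratum.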

Selecting Sample Units

Non-probability samples

  Convenience Sampling: on the basis of convenience or accessibility
  Snowball Sampling: further sample units obtained through referrals from earlier sample units
  Judgement Sampling: based on the recommendation of experts, or on our own assessment of the spread of the population from previous studies or data
  Quota Control Sampling: conforms to chosen parameters of the population
  Census: Total population
