
Sample size, the Margin of Error and the Coefficient of Variation

Robert M. Lynch
University of Northern Colorado
Brian Kim
Illinois State University
Sample size determination to estimate a population mean or population proportion is
usually covered in the first course in statistics. The student is expected to state an
acceptable Margin of Error (ME), a level of confidence, and an estimate for the
population standard deviation or an estimate of the population proportion. Once these are
specified, the formulas for determining sample size can be applied.
Students usually find the ME for a population proportion easier to establish and usually
respond with a 3 percent (.03) or a 4 percent (.04) ME as acceptable. Perhaps this arises
from their exposure to the results from political polls and other polls reported in the
popular media. However, specifying the ME when estimating a population mean appears
to be more challenging for students and others.
Over the past several years, one of the authors has been asked by accountants, financial
administrators and others to determine the sample size required to estimate a population
mean. In each case, the author asked the individual for the ME he would deem
acceptable. In no case were the individuals initially able to specify a value. To facilitate
a solution, the author suggested the individual decide if he would like to be within 2, 3, 4
or 5 percent of the population mean. For example, in a receivables problem, the
individual might believe the average receivable is about $400.00. Applying these
percents to the estimated mean, the MEs would be $8, $12, $16 and $20 for 2, 3, 4, and 5
percent respectively. At this point, the individual found it easy to decide on $12 or 3
percent as an acceptable ME. The notion of 3 percent was much more acceptable than
specifying a particular dollar amount. Once this was decided, the sample size could be
determined.
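
The percent-of-mean step in the receivables example can be sketched in a few lines of Python. This is only an illustration: the function name `margin_of_error_options` is hypothetical, and the $400 input is the estimated average receivable quoted above.

```python
def margin_of_error_options(estimated_mean, percents=(0.02, 0.03, 0.04, 0.05)):
    """Return {percent: dollar margin of error} for each candidate percent."""
    return {p: round(p * estimated_mean, 2) for p in percents}


# Estimated average receivable from the example in the text.
options = margin_of_error_options(400.00)

for pct, me in options.items():
    print(f"{pct:.0%} of the mean -> ME = ${me:,.2f}")
# 2% of the mean -> ME = $8.00
# 3% of the mean -> ME = $12.00
# 4% of the mean -> ME = $16.00
# 5% of the mean -> ME = $20.00
```

Presenting the candidates side by side mirrors how the individual was asked to choose a percent rather than a dollar amount.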
The authors have examined numerous MEs, numerous Coefficients of Variation (CV)
and the resulting sample sizes. It became clear there was a relationship between the MEs,
the CVs and resulting sample sizes. In the next sections, the relationship among these
three is described and demonstrated. The algebraic equivalents are also presented.
Coefficient of Variation The Coefficient of Variation measures the ratio of the standard
deviation of a variable to its mean. It is reported as a proportion or percent and is used to
compare relative variability across several variables. For example, comparing the CVs
across several publicly traded stocks provides insight into relative dispersion across the
issues. The measure assumes the variable is of ratio scale with a mean greater than zero.
The definition provided in introductory texts is:
Coefficient of Variation = standard deviation / mean
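
A minimal sketch of the definition follows, using Python's standard `statistics` module. The helper name `coefficient_of_variation` and both data series are made up for illustration, and the sample standard deviation (`stdev`) is an assumption, since the text does not say whether the sample or population form is intended.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """CV = standard deviation / mean, for ratio-scale data with a positive mean."""
    m = mean(values)
    if m <= 0:
        raise ValueError("the CV assumes a mean greater than zero")
    # Assumption: sample standard deviation (n - 1 in the denominator).
    return stdev(values) / m


# Hypothetical series on very different scales but with the same relative spread.
small_scale = [10, 20, 30]
large_scale = [100, 200, 300]

cv_small = coefficient_of_variation(small_scale)
cv_large = coefficient_of_variation(large_scale)
print(f"CV (small scale) = {cv_small:.2f}")
print(f"CV (large scale) = {cv_large:.2f}")
```

Both series give the same CV, which is the sense in which the measure compares relative variability across variables.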

Determination of sample size Suppose an individual wishes to estimate the population
mean for household income within a highly populated geographical area. The
investigator would like the sample mean to be within $1,000 of the population mean with
a level of confidence at 90 percent. From a one-year-old survey based on 900 households
in the region, the sample mean was $45,000 with a standard deviation of $7,500, while a
more recent pilot study based on 50 households yielded a mean of $50,000. The final
question that is often asked is "What size sample should I select to be within $1,000 of the
population mean?"
For this problem, the investigator wishes to build a 90 percent confidence interval with a
ME at $1,000 or less. From the historical statistics, an estimate of the standard deviation
was provided at $7,500. Assume it is believed to be a good estimate and perhaps is higher
than one would expect for the population in his region. A z equal to 1.645 can be used
to establish the 90 percent confidence interval. The end result would be a confidence
interval of the following form:
Sample Mean +/- Margin of Error
Sample Mean +/- (z * standard deviation) / sqrt(N)
Sample Mean +/- $1000
Introductory statistics textbooks provide the following formula for determining the
sample size required to meet a desired ME,
N = (z^2 * s^2) / e^2
N = (1.645 * 1.645) * (7,500 * 7,500) / (1,000 * 1,000)
N = 152.21
where e is the ME, s is the estimated standard deviation, and z is the value associated with the
desired level of confidence. Because a sample size must be a whole number, the result is
rounded up: a sample of size 153 or more should provide a ME at or below $1,000.
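
The calculation above can be checked with a short Python sketch. The function name `sample_size` is hypothetical; the inputs (z = 1.645, s = $7,500, e = $1,000) are the ones given in the text.

```python
import math


def sample_size(z, s, e):
    """Unrounded sample size N = z^2 * s^2 / e^2."""
    return (z ** 2) * (s ** 2) / (e ** 2)


n_raw = sample_size(z=1.645, s=7500, e=1000)
n_required = math.ceil(n_raw)  # round up to the next whole household

# Margin of error actually achieved with the rounded-up sample size.
achieved_me = 1.645 * 7500 / math.sqrt(n_required)

print(f"N (unrounded) = {n_raw:.2f}")       # 152.21
print(f"N (required)  = {n_required}")      # 153
print(f"ME at N = {n_required}: ${achieved_me:,.2f}")
```

Rounding up rather than to the nearest integer keeps the achieved ME at or below the target.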
In the above problem, the investigator decided a $1,000 ME or less was satisfactory. The
$1,000 ME represented a 2 percent error based on the pilot mean of $50,000. If he were
willing to accept a 3 or 4 percent ME, the dollar ME would be $1,500 or $2,000
respectively. If so, the required sample sizes would be smaller.
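
To see how much smaller, the same formula can be evaluated at each percent. This is a sketch under the text's own inputs (z = 1.645, s = $7,500, pilot mean $50,000); the constant and function names are hypothetical.

```python
import math

Z_90 = 1.645          # z for 90 percent confidence, as in the text
STD_DEV = 7500.0      # standard deviation from the earlier survey
PILOT_MEAN = 50000.0  # mean from the pilot study


def sample_size(z, s, e):
    """Unrounded sample size N = z^2 * s^2 / e^2."""
    return (z ** 2) * (s ** 2) / (e ** 2)


required = {}
for pct in (0.02, 0.03, 0.04):
    me = pct * PILOT_MEAN  # dollar margin of error at this percent
    required[pct] = math.ceil(sample_size(Z_90, STD_DEV, me))
    print(f"{pct:.0%} of mean: ME = ${me:,.0f}, N >= {required[pct]}")
```

Under these inputs the required sample falls from 153 at 2 percent to 68 at 3 percent and 39 at 4 percent.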
Relationship: Coefficient of Variation and Margin of Error and sample size In Table
1, three means and three standard deviations that yield CVs of .20, .40, .60, .80 and 1.00
are noted. For each mean and standard deviation at each CV level, three MEs were
created, at 3 percent, 4 percent and 5 percent of the mean. Sample sizes were determined
using the formula from the previous section.
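
As an illustration of how one block of Table 1 is built, the sketch below recomputes the CV = .20 entries at a 3 percent ME. The standard deviation and mean pairs are the ones listed in Table 1; the variable names are hypothetical.

```python
Z_90 = 1.645  # z for 90 percent confidence

# (standard deviation, mean) pairs from Table 1 that all give CV = .20
pairs = [(100, 500), (200, 1000), (300, 1500)]

sizes = []
for s, mean in pairs:
    cv = s / mean
    me = 0.03 * mean                   # ME set at 3 percent of the mean
    n = Z_90 ** 2 * s ** 2 / me ** 2   # sample size formula from the previous section
    sizes.append(round(n, 2))
    print(f"s = {s}, mean = {mean}, CV = {cv:.2f}, ME = {me:.2f}, N = {n:.2f}")
```

Each pair gives N = 120.27 even though the dollar MEs differ (15, 30 and 45).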

From Table 1, the sample sizes are equal when the CVs are equal and the MEs as a
percent of the mean are the same. For example, when the CV = .20 and the ME is 3
percent of the mean, though the means and standard deviations differ, the sample size
remains the same at 120.27. This occurs at all levels of the CVs across the three levels of
MEs. This suggests a relationship among the ME, the CV and the resulting sample size.
A little algebra can identify the relationship. Begin with the following:
1. CV = s / Mean  =>  Mean = s / CV

2. ME = % * Mean  =>  ME = % * (s / CV)       (substitution)
   ME^2 = %^2 * s^2 / CV^2

3. N = z^2 * s^2 / ME^2                       (sample size formula, ME = e)
   N = z^2 * s^2 / (%^2 * s^2 / CV^2)         (substitution)
   N = z^2 * s^2 * CV^2 / (%^2 * s^2)         (simplify)
   N = z^2 * CV^2 / %^2


From the last formula, one observes that the sample size can be determined by knowledge
of the CV and the percent to be applied to the mean. Table 2 presents the sample sizes
using (3) above for the percents used in Table 1. Observe they are the same as in Table 1.
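
The final formula is easy to evaluate directly. The sketch below recomputes the Table 2 grid and then applies the formula to the household income example, where CV = 7,500 / 50,000 = .15 and the ME is 2 percent of the mean. The function name `sample_size_from_cv` is hypothetical.

```python
def sample_size_from_cv(z, cv, pct):
    """Unrounded sample size N = z^2 * CV^2 / pct^2, with pct as a proportion of the mean."""
    return (z ** 2) * (cv ** 2) / (pct ** 2)


Z_90 = 1.645
cvs = (0.20, 0.40, 0.60, 0.80, 1.00)
pcts = (0.03, 0.04, 0.05)

# One row per percent of the mean, one column per CV, as in Table 2.
table2 = {
    pct: [round(sample_size_from_cv(Z_90, cv, pct), 2) for cv in cvs]
    for pct in pcts
}
for pct, row in table2.items():
    print(f"{pct:.0%} of mean:", "  ".join(f"{n:>9,.2f}" for n in row))

# Household income example: s = 7,500, pilot mean = 50,000, ME = 2 percent.
n_household = sample_size_from_cv(Z_90, cv=7500 / 50000, pct=0.02)
print(f"Household example: N = {n_household:.2f}")  # 152.21
```

The household result agrees with the value obtained earlier from the standard deviation and the dollar ME, as the algebra implies.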
Comments From (3) above, one conclusion and several corollaries can be stated:
a. If two variables have the same CV and the same percent is applied to determine the
MEs, the sample sizes would be the same for both variables.
b. If two variables have different means and different standard deviations but
identical CVs and identical percents are applied to determine the MEs, the sample
sizes would be the same.
c. If two variables have different CVs or different percents are applied to determine
the MEs, the sample sizes will not necessarily be the same.
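
These statements can be checked numerically. The sketch below is illustrative only, with hypothetical function names. It also shows why (c) says "not necessarily": the sample size depends only on the ratio of the CV to the percent, so different CVs paired with different percents can still give the same N, as the repeated 1,082.41 entries in the tables indicate.

```python
import math

Z_90 = 1.645


def n_from_sd(z, s, e):
    """Sample size from the standard deviation and a dollar ME."""
    return (z ** 2) * (s ** 2) / (e ** 2)


def n_from_cv(z, cv, pct):
    """Sample size from the CV and the ME as a proportion of the mean."""
    return (z ** 2) * (cv ** 2) / (pct ** 2)


# (a), (b): different means and standard deviations, identical CV (.20) and percent (3%).
n_first = n_from_sd(Z_90, s=100, e=0.03 * 500)
n_second = n_from_sd(Z_90, s=300, e=0.03 * 1500)
print(math.isclose(n_first, n_second))  # True

# (c): a different CV at the same percent gives a different sample size ...
n_cv40 = n_from_cv(Z_90, cv=0.40, pct=0.03)
print(round(n_first, 2), round(n_cv40, 2))  # 120.27 481.07

# ... but different CVs with different percents can coincide when CV / pct is equal.
n_60_at_3 = n_from_cv(Z_90, cv=0.60, pct=0.03)
n_80_at_4 = n_from_cv(Z_90, cv=0.80, pct=0.04)
n_100_at_5 = n_from_cv(Z_90, cv=1.00, pct=0.05)
print(round(n_60_at_3, 2), round(n_80_at_4, 2), round(n_100_at_5, 2))
```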

Table 1
Sample sizes and Margins of Error at different Coefficients of Variation
Margins of Error at 3%, 4% and 5% of the Mean

   CV  Std. Dev.    Mean |  3% ME  Sample Size |  4% ME  Sample Size |  5% ME  Sample Size
 0.20        100     500 |  15.00       120.27 |  20.00        67.65 |  25.00        43.30
 0.20        200   1,000 |  30.00       120.27 |  40.00        67.65 |  50.00        43.30
 0.20        300   1,500 |  45.00       120.27 |  60.00        67.65 |  75.00        43.30
 0.40        200     500 |  15.00       481.07 |  20.00       270.60 |  25.00       173.19
 0.40        400   1,000 |  30.00       481.07 |  40.00       270.60 |  50.00       173.19
 0.40        600   1,500 |  45.00       481.07 |  60.00       270.60 |  75.00       173.19
 0.60        300     500 |  15.00     1,082.41 |  20.00       608.86 |  25.00       389.67
 0.60        600   1,000 |  30.00     1,082.41 |  40.00       608.86 |  50.00       389.67
 0.60        900   1,500 |  45.00     1,082.41 |  60.00       608.86 |  75.00       389.67
 0.80        400     500 |  15.00     1,924.28 |  20.00     1,082.41 |  25.00       692.74
 0.80        800   1,000 |  30.00     1,924.28 |  40.00     1,082.41 |  50.00       692.74
 0.80      1,200   1,500 |  45.00     1,924.28 |  60.00     1,082.41 |  75.00       692.74
 1.00        500     500 |  15.00     3,006.69 |  20.00     1,691.27 |  25.00     1,082.41
 1.00      1,000   1,000 |  30.00     3,006.69 |  40.00     1,691.27 |  50.00     1,082.41
 1.00      1,500   1,500 |  45.00     3,006.69 |  60.00     1,691.27 |  75.00     1,082.41

1. Coefficient of Variation = s / Mean
2. Margin of Error (ME) = x percent * Mean
3. Sample size = (z*z)(s*s)/(e*e) at 90 percent confidence

Table 2
Sample sizes and Margins of Error at different Coefficients of Variation
N = (z*z)*(cv*cv)/(%mean*%mean)

               CV = .20   CV = .40   CV = .60   CV = .80   CV = 1.00
 3% of Mean      120.27     481.07   1,082.41   1,924.28    3,006.69
 4% of Mean       67.65     270.60     608.86   1,082.41    1,691.27
 5% of Mean       43.30     173.19     389.67     692.74    1,082.41
