ASW, Chapter 8
Any population
A sample size n of greater than 30 is generally considered sufficiently large to use these results from the central limit theorem (CLT).
Probability that a sample mean is within a specified distance of the population mean
Within $100 of the mean is from 2352 - 100 = 2252 to 2352 + 100 = 2452. The sampling distribution of the sample mean is approximately normal since the sample size n = 50 is large, with mean μ = 2352 and standard error σ/√n = 210. The required probability is the area under the normal curve between 2252 and 2452. Obtain the corresponding Z-values: Z = ±100/210, or approximately ±0.48.
In the last example, if n = 200, the standard error is 1485 divided by the square root of 200, or 105. With this larger sample size, the probability that a sample mean is within $100 of the population mean is the area under a normal curve between z = -0.95 and z = 0.95, or 0.6578.
Probability of sample mean being within $100 of μ: 0.37, 0.66, 0.87, 0.97
From these calculations, note how larger sample sizes produce sampling distributions in which the sample mean is generally closer to the population mean μ. The last column shows how the probability that the sample mean is within $100 of the population mean increases as n becomes larger.
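The calculations above can be reproduced with a short Python sketch. It assumes the population standard deviation of 1485 used in the standard error calculation, and relies only on the standard library. The function names and the sample sizes 500 and 1000 are illustrative assumptions rather than values taken from the slides. Because it uses unrounded Z-values, its results differ slightly from table look-ups (roughly 0.659 rather than 0.6578 for n = 200).

```python
from math import erf, sqrt


def normal_cdf(z: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


def prob_within(distance: float, sigma: float, n: int) -> float:
    """P(|sample mean - population mean| <= distance), assuming the CLT applies."""
    standard_error = sigma / sqrt(n)
    z = distance / standard_error
    return normal_cdf(z) - normal_cdf(-z)


SIGMA = 1485.0  # population standard deviation used in the example

# n = 50 and n = 200 come from the example; 500 and 1000 are assumed values.
for n in (50, 200, 500, 1000):
    print(n, round(prob_within(100.0, SIGMA, n), 2))
```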
The margin of error is estimated to be plus or minus 3.51 per cent, 19 times out of 20. This statement is from the Palliser electoral district poll conducted by Sigma Analytics, which reported Conservative at 43.3%, NDP at 35.7%, Liberal at 17.3%, and Green at 3.5% of decided voters.
Source: Leader-Post, Regina, October 3, 2008, pp. A1-A2.
Modified FIGURE 8.1 SAMPLING DISTRIBUTION OF THE SAMPLE MEAN AMOUNT SPENT FROM SIMPLE RANDOM SAMPLES OF 100 CUSTOMERS
A sampling distribution of the sample mean for a simple random sample of 100 individuals from a population with a standard deviation of 20. The mean of the sampling distribution of x̄ is the population mean μ and its standard deviation, or standard error, is σ/√n = 20/√100 = 2. This distribution can be used to construct an interval estimate of μ.
Select a confidence level. The most common level is 95%. Obtain the margin of error associated with the confidence level. For a normal distribution, the interval from Z = -1.96 to Z = 1.96 contains 95% of the area under the curve or of the sample means. See next slide to illustrate this. The 95% interval estimate is
x̄ - 1.96 σ/√n to x̄ + 1.96 σ/√n
Modified FIGURE 8.2. SAMPLING DISTRIBUTION OF x̄ SHOWING THE LOCATION OF SAMPLE MEANS THAT ARE WITHIN 3.92 OF μ
In this example, the standard error is 2 and the margin of error is 2 x 1.96 = 3.92. For the general case, 1.96 is multiplied by the standard error to determine the margin of error.
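As a rough sketch of the general case, the margin of error and the interval estimate can be computed as follows. The function name `interval_estimate` and the sample mean of 82 are hypothetical, chosen only for illustration. The values σ = 20, n = 100 and the multiplier 1.96 are those of this example.

```python
from math import sqrt


def interval_estimate(x_bar, sigma, n, z=1.96):
    """Return (lower, upper, margin) for the interval x_bar +/- z * sigma / sqrt(n)."""
    standard_error = sigma / sqrt(n)
    margin = z * standard_error
    return x_bar - margin, x_bar + margin, margin


# x_bar = 82 is a hypothetical sample mean, not a value from the slides.
lower, upper, margin = interval_estimate(x_bar=82.0, sigma=20.0, n=100)
print(f"margin of error = {margin:.2f}")
print(f"95% interval estimate: {lower:.2f} to {upper:.2f}")
```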
Statistics of total income, Saskatchewan females employed full-time and full-year, by age, 2003 (income in thousands of dollars)

Age group    Mean    Standard deviation    Sample size
25-34        33.3    13.5                  55
35-44        40.3    20.7                  57
45-54        45.1    25.9                  37
55-64        40.1    25.9                  31
Source: Data for this question adapted from Statistics Canada. General Social Survey of Canada, 2003. Cycle 17: Social Engagement [machine readable data file]. 1st Edition. Ottawa, ON: Statistics Canada [publisher and distributor] 10/1/2004. Obtained through University of Regina Data Library Services.
Analysis: The pattern in the samples is clear: mean income increases from ages 25-34 to 45-54, then declines for ages 55-64. However, the data for each of the four age groups come from a sample, so interval estimates are necessary to comment on whether this pattern appears to hold for all females.
Explain why the margins of error differ as they do. Explain the pattern of mean income by age for all females of each age group, now that interval estimates are available.
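One way to obtain the interval estimates for the four age groups is sketched below. It assumes a 95% confidence level with Z = 1.96 and uses each sample standard deviation s in place of σ, which seems reasonable here since every sample size exceeds 30. The variable and function names are illustrative.

```python
from math import sqrt

# (age group, mean, standard deviation, sample size); income in thousands of dollars
groups = [
    ("25-34", 33.3, 13.5, 55),
    ("35-44", 40.3, 20.7, 57),
    ("45-54", 45.1, 25.9, 37),
    ("55-64", 40.1, 25.9, 31),
]

Z_95 = 1.96  # assumed 95% confidence level


def margin_of_error(s, n, z=Z_95):
    """Margin of error z * s / sqrt(n), with s used as the estimate of sigma."""
    return z * s / sqrt(n)


for age, mean, s, n in groups:
    e = margin_of_error(s, n)
    print(f"{age}: {mean:.1f} +/- {e:.1f}  ({mean - e:.1f} to {mean + e:.1f})")
```

The margin of error grows with a larger s and shrinks with a larger n, which is why the two older groups, with the largest standard deviations and smallest samples, have the widest intervals.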
Determination of σ
In order to construct an interval estimate, it is necessary to obtain some estimate of σ, the variability of the population from which the sample is drawn. This is required to obtain an estimate of the standard error of the sample mean, σ/√n. Generally, the sample standard deviation s is used as an estimate of σ. For large sample size, assume the CLT holds and assume s provides a reasonable estimate of σ. For a small sample, where n < 30, the t-distribution should be used, again using s as an estimate of σ. In sections 8.1 and 8.2, ASW distinguish methods for when σ is known and unknown. In practice σ is rarely known, as ASW state in note 1, p. 299. In addition, as n increases, the t-distribution approaches the normal distribution. Thus, so long as n > 30, it is acceptable to use s as an estimate of σ for purposes of constructing an interval estimate.
Next week
t-distribution (ASW, sections 8.1, 8.2). Sample size (ASW, section 8.3). Interval estimates for proportions (ASW, sections 6.3, 7.6, 8.4). Extra office hour Friday, October 10, 1-3 p.m., CL 237.