Lecture 7 N

Highlights of the last lecture
27 September, 2011
STAT 101 -- Part VI
Highlights of the last lecture (contd)
[Flow chart: Assumption: is the population normal? (yes / no). If no: is n large? (yes / no). If no: resampling or nonparametric methods.]
VII. Confidence Intervals

Point and interval estimation of
  Means of normal and nonnormal distributions
  Proportion parameter of the binomial distribution
Determining sample size
  For the mean
  For the proportion
STAT 101 -- Part VII
Confidence interval estimation
Confidence Intervals for the population mean
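Assumptions:
  Population is normally distributed
  Standard deviation of population is given
Under these assumptions, the $100(1-\alpha)\%$ confidence interval for the population mean $\mu$ is $\bar{X} \pm z_{\alpha/2}\,\sigma/\sqrt{n}$.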
Example: protein intake

Find the 95% confidence interval for the average daily protein intake of men aged 20-25. The population standard deviation is 58.6 grams, and a random sample of 267 men aged 20-25 is observed.
The margin of error is $z_{0.025}\,\sigma/\sqrt{n} = 1.96 \times 58.6/\sqrt{267} \approx 7.029$ grams.
Before collecting the data from the 267 men, we can say that there is a 95% chance that the random interval $(\bar{X} - 7.029,\ \bar{X} + 7.029)$ will include $\mu$. Note that the sample mean is still a random variable before any data are collected, so we are still talking about probability.
After collecting the daily protein intake of these 267 men, the sample mean is calculated to be 72.1 grams.
After obtaining the numerical result from sampling, we cannot say that the population mean falls between 65.071 g and 79.129 g with 95% chance.
The correct way to present the result is: the 95% confidence interval for the average daily protein intake of men aged 20-25 is (65.071 g, 79.129 g).
Having determined a numerical result from one specific sample, it is no longer sensible to speak about the probability of its covering the fixed quantity $\mu$.
If many repeated samples of the same size were taken from the same population and a confidence interval were constructed from each, the proportion of intervals containing $\mu$ would be approximately 0.95.
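The arithmetic of the protein intake example can be checked with a short Python sketch. The function name `z_interval` is ours, not from the lecture, and the sketch assumes a normal population with known standard deviation, as in the example.

```python
from math import sqrt
from statistics import NormalDist


def z_interval(sample_mean, sigma, n, confidence=0.95):
    """Confidence interval for a mean when the population sd is known."""
    alpha = 1 - confidence
    # Upper alpha/2 critical value of the standard normal distribution.
    z = NormalDist().inv_cdf(1 - alpha / 2)
    margin = z * sigma / sqrt(n)
    return sample_mean - margin, sample_mean + margin, margin


lower, upper, margin = z_interval(72.1, 58.6, 267)
print(f"margin of error = {margin:.3f} g")
print(f"95% CI = ({lower:.3f} g, {upper:.3f} g)")
```

Rounded to three decimals, the limits should agree with the interval (65.071 g, 79.129 g) quoted in the example.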
Excel output of the example (Protein Intake):
Interpretation of confidence intervals
[Figure: confidence intervals plotted around the true population mean, with values below the true mean on one side and values above the true mean on the other.]
http://www.socr.ucla.edu/Applets.dir/ConfidenceInterval.html
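The repeated-sampling interpretation can also be explored offline with a small simulation. This is only a sketch: the true mean of 70.0 is a hypothetical value chosen for illustration, the population is assumed normal with known standard deviation, and the function name `coverage` is ours.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean


def coverage(mu, sigma, n, confidence=0.95, repetitions=2000, seed=1):
    """Fraction of simulated z-intervals that contain the true mean mu."""
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sigma / sqrt(n)
    hits = 0
    for _ in range(repetitions):
        # Draw one sample of size n and build its interval.
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        xbar = fmean(sample)
        if xbar - margin <= mu <= xbar + margin:
            hits += 1
    return hits / repetitions


# mu = 70.0 is a made-up "true" mean; sigma and n follow the protein example.
print(coverage(mu=70.0, sigma=58.6, n=267))
```

The printed fraction should be close to 0.95. Each individual interval either contains the true mean or does not; the 95% describes the procedure over many samples.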
Confidence intervals for the mean with unknown population variance

The only assumption is that the population distribution is normal. The population standard deviation is unknown, so it is reasonable to estimate it by the sample standard deviation S.
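A minimal sketch of the resulting interval, $\bar{X} \pm t\,S/\sqrt{n}$, is given below. The data are hypothetical, the function name `t_interval` is ours, and the critical value is passed in by hand as it would be read from a t table (2.776 for 95% confidence with 4 degrees of freedom).

```python
from math import sqrt
from statistics import fmean, stdev


def t_interval(data, t_crit):
    """CI for the mean of a normal population with unknown sd.

    t_crit is the upper alpha/2 critical value of the t-distribution
    with len(data) - 1 degrees of freedom, read from a t table.
    """
    n = len(data)
    xbar = fmean(data)
    s = stdev(data)  # sample standard deviation (divisor n - 1)
    margin = t_crit * s / sqrt(n)
    return xbar - margin, xbar + margin


# Hypothetical sample of size 5; table value for 95% confidence, 4 df.
sample = [10.0, 12.0, 9.0, 11.0, 13.0]
print(t_interval(sample, t_crit=2.776))
```

Passing the critical value explicitly keeps the sketch within the Python standard library, which has no t quantile function.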
Why the t-distribution?
Insurance example (contd)
Excel output of the example (insurance):
Large sample size cases
No assumptions about normality of the population distribution or about the population variance are needed. If the sample size is sufficiently large, the Central-Limit Theorem may be applied to guarantee that the sample mean is approximately normally distributed.
Flow Chart for determining the distributions
Is the population distribution normal?
  yes: go to "Is the population standard deviation given?"
  no: Is the sample size sufficiently large (n >= 30), such that the CLT applies?
    yes: go to "Is the population standard deviation given?"
    no: Use other methods
Is the population standard deviation given?
  yes: Normal tables
  no: t-distribution tables (Normal tables for large sample size, > 120)
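One reading of the flow chart can be written as a small decision function. This is a sketch under the assumption that both "yes" branches lead to the question about the population standard deviation; the function name `choose_table` and its return labels are ours.

```python
def choose_table(population_normal, n, sigma_known):
    """Return which table to use, following one reading of the flow chart."""
    if not population_normal and n < 30:
        # Neither normality nor the CLT is available.
        return "other methods"
    if sigma_known:
        return "normal"
    # Sigma unknown: t with n - 1 df; close to normal once n exceeds 120.
    return "normal" if n > 120 else "t"


print(choose_table(population_normal=True, n=15, sigma_known=False))
print(choose_table(population_normal=False, n=200, sigma_known=False))
print(choose_table(population_normal=False, n=10, sigma_known=True))
```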
Factors affecting the length of a confidence interval
The shorter the confidence interval, the better the estimation.
Consider the confidence interval for the population mean, $\bar{X} \pm t_{\alpha/2,\,n-1}\,S/\sqrt{n}$.
The length of the confidence interval is then $2\,t_{\alpha/2,\,n-1}\,S/\sqrt{n}$. The length depends on S, n and $\alpha$:
  n increases, length decreases
  $\alpha$ increases (confidence level decreases), length decreases
  S increases, length increases
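The three effects can be seen numerically in the sketch below. For simplicity it swaps in the normal critical value for the t critical value, which assumes a large sample; the numbers and the function name `interval_length` are illustrative only.

```python
from math import sqrt
from statistics import NormalDist


def interval_length(s, n, confidence):
    """Length 2 * z * s / sqrt(n) of a large-sample interval for the mean."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return 2 * z * s / sqrt(n)


base = interval_length(s=10.0, n=100, confidence=0.95)
print(base)
print(interval_length(s=10.0, n=400, confidence=0.95))  # larger n
print(interval_length(s=10.0, n=100, confidence=0.90))  # lower confidence
print(interval_length(s=20.0, n=100, confidence=0.95))  # larger S
```

Because the length is proportional to $1/\sqrt{n}$, quadrupling the sample size only halves the interval.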
Determining sample size for the mean
The required sample size can be found to reach a desired margin of error with a specified level of confidence.
The margin of error is also called the sampling error.
The margin of error can be interpreted as
  the amount of imprecision in the estimate of the population parameter
  the amount added to and subtracted from the point estimate to form the confidence interval
Requirements of determining sample size
Numerical example

A consumer group wants to estimate the mean electric bill for the month of July for single-family homes in a large city. Based on studies conducted in other cities, the standard deviation is assumed to be $25. The group wants to estimate the mean bill for July to within $5 with 99% confidence. What sample size is needed?
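A hedged sketch of the calculation, using the standard formula $n = (z_{\alpha/2}\,\sigma/e)^2$ rounded up, where e is the desired margin of error. The function name `sample_size_for_mean` is ours; the optional `z` argument lets a rounded table value be supplied instead of the exact quantile.

```python
from math import ceil
from statistics import NormalDist


def sample_size_for_mean(sigma, error, confidence, z=None):
    """Smallest n with z * sigma / sqrt(n) <= error."""
    if z is None:
        z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return ceil((z * sigma / error) ** 2)


print(sample_size_for_mean(sigma=25, error=5, confidence=0.99))
print(sample_size_for_mean(sigma=25, error=5, confidence=0.99, z=2.58))
```

The exact quantile (about 2.576) gives n = 166, while the rounded table value 2.58 gives n = 167, so the answer depends slightly on which critical value is used.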
Estimation for the binomial distribution
Recall the common structure of the binomial distribution:
  A sample of n independent trials
  Each trial can have only two possible outcomes, denoted "success" and "failure"
  The probability of a success at each trial is assumed to be a constant p
  The parameters of the binomial distribution are n and p
Now, assume that p is unknown and we want to use the sample proportion to estimate p.
Point estimation: sample proportion
Sampling distribution of sample proportion
[Figure: repeated samples of size n (1st, 2nd, 3rd, ..., kth) are drawn from the population; their sample proportions form the sampling distribution of $\hat{p}$.]
Sampling distribution of sample proportion
In the previous section, we discussed the normal approximation to the binomial distribution.
In fact, the normal approximation can be justified on the basis of the Central-Limit Theorem, since the sample proportion is just a sample mean.
The textbook uses the rule of CLT:
By the CLT, we get that $\hat{p}$ is approximately normally distributed with mean $p$ and variance $p(1-p)/n$.
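The sampling distribution of the sample proportion can be illustrated by simulation. In this sketch the values p = 0.3 and n = 100 are hypothetical, and the function name `simulate_p_hats` is ours; the simulated mean and standard deviation are compared with the theoretical values $p$ and $\sqrt{p(1-p)/n}$.

```python
import random
from math import sqrt
from statistics import fmean, pstdev


def simulate_p_hats(p, n, repetitions=2000, seed=1):
    """Draw repeated samples of n Bernoulli(p) trials; return each p-hat."""
    rng = random.Random(seed)
    return [
        sum(rng.random() < p for _ in range(n)) / n
        for _ in range(repetitions)
    ]


p, n = 0.3, 100  # hypothetical values chosen for illustration
p_hats = simulate_p_hats(p, n)
print("mean of p-hat:", fmean(p_hats), "theory:", p)
print("sd of p-hat:  ", pstdev(p_hats), "theory:", sqrt(p * (1 - p) / n))
```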
Example
During June and July of 2001, the European Union Executive Commission conducted a study of 6,543 European adults. Of those surveyed, 56% said that the euro single currency would promote economic growth and 73% knew the correct date of the changeover (January 1, 2002). Construct a 95% confidence interval estimate for the proportion of European adults who believe that the euro would promote economic growth. Interpret the interval constructed.
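The interval for the euro example can be sketched with the usual large-sample formula $\hat{p} \pm z_{\alpha/2}\sqrt{\hat{p}(1-\hat{p})/n}$. The function name `proportion_interval` is ours, and the sketch assumes the normal approximation is adequate, which is plausible for n = 6,543.

```python
from math import sqrt
from statistics import NormalDist


def proportion_interval(p_hat, n, confidence=0.95):
    """Large-sample (normal approximation) CI for a binomial proportion."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


lower, upper = proportion_interval(p_hat=0.56, n=6543)
print(f"95% CI for p: ({lower:.4f}, {upper:.4f})")
```

The interval comes out at roughly (0.548, 0.572): the 95% confidence interval for the proportion of European adults who believe that the euro would promote economic growth.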
Excel output of the example (Euro)
Requirements of determining sample size for the proportion
Numerical example:
A study of 658 CEOs conducted by the Conference Board reported that 250 stated that their company's greatest concern was sustained and steady top-line growth ("CEOs' Greatest Concerns," USA Today Snapshots, May 8, 2006, p. 1D). To conduct a follow-up study to estimate the population proportion of CEOs whose greatest concern was sustained and steady top-line growth to within 0.01 with 95% confidence, how many CEOs would you survey?
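A hedged sketch of the calculation, using the standard formula $n = z_{\alpha/2}^2\,p(1-p)/e^2$ rounded up, with the earlier study's proportion 250/658 as the planning value for p. The function name `sample_size_for_proportion` is ours; the optional `z` argument lets a rounded table value be supplied.

```python
from math import ceil
from statistics import NormalDist


def sample_size_for_proportion(p, error, confidence, z=None):
    """Smallest n with z * sqrt(p * (1 - p) / n) <= error."""
    if z is None:
        z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return ceil(z ** 2 * p * (1 - p) / error ** 2)


p_prior = 250 / 658  # proportion reported in the earlier study
print(sample_size_for_proportion(p_prior, error=0.01, confidence=0.95))
print(sample_size_for_proportion(p_prior, error=0.01, confidence=0.95, z=1.96))
```

The exact quantile gives n = 9050 and the rounded table value 1.96 gives n = 9051; either way, about 9,050 CEOs would need to be surveyed.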
Useful and interesting websites

http://www.socr.ucla.edu/Applets.dir/ConfidenceInterval.html
Confidence Intervals simulations
http://en.wikipedia.org/wiki/Confidence_interval
Confidence Intervals information
Assignment 2: Box-and-Whisker Plot

[Figure: box-and-whisker plots for groups G1 and G2; axis from 10 to 40.]
Stem-and-Leaf Display
[Figure: stem-and-leaf display with stems 12 to 40.]
Statistics
  Sample Size      94
  Mean             37.08511
  Median           40
  Std. Deviation   4.760183
  Minimum          12
  Maximum          40
