
Highlight the last lecture

27 September, 2011

STAT 101 -- Part VI

Highlight the last lecture (contd)

Assumption: Is the population normal?
    Yes
    No: Is n large?
        Yes
        No: Resampling, nonparametric


VII. Confidence Intervals


Point and interval estimation of
    Means of normal and non-normal distributions
    Proportion parameter of the binomial distribution
Determining sample size
    For the mean
    For the proportion

STAT 101 -- Part VII

Confidence interval estimation


Confidence Intervals for the population mean

Assumptions:
    Population is normally distributed
    Standard deviation of population is given
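Under these two assumptions the interval follows from standardizing the sample mean. The sketch below uses standard notation that is assumed here rather than taken from the slides: $\bar{X}$ for the sample mean, $\sigma$ for the population standard deviation, and $z_{\alpha/2}$ for the upper $\alpha/2$ point of the standard normal distribution.

```latex
\[
Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \sim N(0,1)
\quad\Longrightarrow\quad
P\!\left(\bar{X} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}} \le \mu \le \bar{X} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}}\right) = 1 - \alpha ,
\]
\[
\text{so a } 100(1-\alpha)\% \text{ confidence interval for } \mu \text{ is }
\bar{X} \pm z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}} .
\]
```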


Example: protein intake


Find the 95% confidence interval for the average daily protein intake of men aged 20-25. The population standard deviation is 58.6 grams, and a random sample of 267 men aged 20-25 is observed.

The margin of error is $1.96 \times 58.6/\sqrt{267} \approx 7.029$ grams. Before collecting the data from the 267 men, we can say that there is a 95% chance that the random interval $(\bar{X} - 7.029,\ \bar{X} + 7.029)$ will include the population mean $\mu$. Note that the sample mean is still a random variable before any data are collected, so we are still talking about probability.

After collecting the daily protein intake of these 267 men, the sample mean is calculated to be 72.1 grams.


Once a numerical result has been obtained from the sample, we cannot say that the population mean $\mu$ falls between 65.071 g and 79.129 g with 95% chance. The correct way to present the result is: the 95% confidence interval for the average daily protein intake of men aged 20-25 is (65.071 g, 79.129 g).

Having determined a numerical result from one specific sample, it is no longer sensible to speak about the probability of its covering the fixed quantity $\mu$. If many repeated samples of the same size were taken from the same population and a confidence interval were constructed from each, the proportion of intervals containing $\mu$ would be approximately 0.95.
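The arithmetic of the protein intake example can be checked with a short Python sketch. The function name `z_interval` is hypothetical, and the critical value comes from the standard library's `NormalDist` rather than from printed tables, so the last digits may differ slightly from a hand calculation with 1.96.

```python
from math import sqrt
from statistics import NormalDist

def z_interval(sample_mean, sigma, n, confidence=0.95):
    """Return (margin, lower, upper) for a z-based interval with known sigma."""
    # Upper alpha/2 critical value of the standard normal distribution.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sigma / sqrt(n)
    return margin, sample_mean - margin, sample_mean + margin

# Values from the example: sample mean 72.1 g, sigma 58.6 g, n = 267.
margin, lower, upper = z_interval(sample_mean=72.1, sigma=58.6, n=267)
print(f"margin of error = {margin:.3f} g")
print(f"95% CI = ({lower:.3f} g, {upper:.3f} g)")
```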

Excel output of the example (Protein Intake):


Interpretation of confidence intervals

[Figure: confidence intervals plotted against the true population mean; labels: "Values below true mean", "Values above true mean"]

http://www.socr.ucla.edu/Applets.dir/ConfidenceInterval.html
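The repeated-sampling interpretation can also be explored with a small simulation, in the spirit of the applet linked above. This is a minimal sketch: the true mean of 70 is a made-up value for illustration, the function name `coverage` is hypothetical, and the population is assumed normal with a known standard deviation.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean

def coverage(mu, sigma, n, trials, confidence=0.95, seed=0):
    """Fraction of simulated z-intervals that contain the true mean mu."""
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sigma / sqrt(n)
    hits = 0
    for _ in range(trials):
        # Draw one sample of size n and compute its sample mean.
        xbar = fmean(rng.gauss(mu, sigma) for _ in range(n))
        if xbar - margin <= mu <= xbar + margin:
            hits += 1
    return hits / trials

# mu = 70 is hypothetical; sigma and n follow the protein intake example.
rate = coverage(mu=70.0, sigma=58.6, n=267, trials=2000)
print(f"proportion of intervals covering mu: {rate:.3f}")
```

The printed proportion should be close to 0.95, although it varies with the seed and the number of trials.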

Confidence intervals for the mean with unknown population variance


The only assumption is that the population distribution is normal. The population standard deviation is unknown, so it is reasonable to estimate it by the sample standard deviation.


Why is it t-distribution?
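A sketch of the standard argument, for a normal population. The symbols $Z$, $V$ and $T$ are notation assumed here, not taken from the slides.

```latex
\[
Z = \frac{\bar{X}-\mu}{\sigma/\sqrt{n}} \sim N(0,1), \qquad
V = \frac{(n-1)S^{2}}{\sigma^{2}} \sim \chi^{2}_{n-1}, \qquad
Z \text{ and } V \text{ independent},
\]
\[
T = \frac{Z}{\sqrt{V/(n-1)}} = \frac{\bar{X}-\mu}{S/\sqrt{n}} \sim t_{n-1},
\qquad\text{giving the interval } \bar{X} \pm t_{\alpha/2,\,n-1}\,\frac{S}{\sqrt{n}} .
\]
```

Replacing $\sigma$ by $S$ adds variability to the standardized statistic, which is why the heavier-tailed $t$ distribution with $n-1$ degrees of freedom takes the place of the standard normal.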


Insurance example (contd)


Excel output of the example (insurance):


Large sample size cases

No assumption is made about normality of the population distribution or about the population variance. If the sample size is sufficiently large, the Central Limit Theorem may be applied to guarantee that the sample mean is approximately normally distributed.
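A minimal sketch of a large-sample interval, assuming the usual form (sample mean plus or minus a normal critical value times the estimated standard error). The exponential data are simulated purely for illustration, and the function name `large_sample_interval` is hypothetical.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean, stdev

def large_sample_interval(data, confidence=0.95):
    """Approximate CI for the mean, relying on the CLT for large n."""
    n = len(data)
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    xbar = fmean(data)
    # The sample standard deviation stands in for the unknown sigma.
    margin = z * stdev(data) / sqrt(n)
    return xbar - margin, xbar + margin

# Simulated, clearly non-normal (exponential) data; illustrative only.
rng = random.Random(1)
sample = [rng.expovariate(0.5) for _ in range(200)]
lower, upper = large_sample_interval(sample)
print(f"approximate 95% CI for the mean: ({lower:.3f}, {upper:.3f})")
```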


Flow Chart for determining the distributions

Is population distribution normal?
    yes: go to "Is population standard deviation given?"
    no: Is sample size sufficiently large (n >= 30), such that CLT applies?
        yes: go to "Is population standard deviation given?"
        no: Use other methods

Is population standard deviation given?
    yes: Normal tables
    no: t-distribution tables; for a large sample size (> 120), Normal tables
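One reading of the flow chart can be written as a small decision function. This is a sketch under the assumption that both "yes" branches lead to the question about the population standard deviation. The function name `choose_table` and its return strings are hypothetical.

```python
def choose_table(population_normal, n, sigma_known):
    """Pick a reference distribution following one reading of the flow chart."""
    # Non-normal population and a small sample: the CLT cannot be relied on.
    if not population_normal and n < 30:
        return "other methods"
    # Known population standard deviation: use the normal tables.
    if sigma_known:
        return "normal tables"
    # Unknown sigma with a very large sample: t is close to normal.
    if n > 120:
        return "normal tables"
    return "t-distribution tables"

for case in [(True, 15, True), (True, 15, False), (False, 15, False), (False, 200, False)]:
    print(case, "->", choose_table(*case))
```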

Factors affecting the length of a confidence interval

The shorter the confidence interval, the more precise the estimation. Consider the confidence interval for the population mean, $\bar{X} \pm t_{\alpha/2,\,n-1}\, S/\sqrt{n}$.

The length of the confidence interval is then $2\, t_{\alpha/2,\,n-1}\, S/\sqrt{n}$. The length depends on $S$, $n$ and $\alpha$:

As $n$ increases, the length decreases.
As $\alpha$ increases (the confidence level decreases), the length decreases.
As $S$ increases, the length increases.
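The three effects can be seen numerically. This sketch swaps the normal critical value in for the t critical value, a simplification that keeps the code within the Python standard library. The numbers are made up and `interval_length` is a hypothetical name.

```python
from math import sqrt
from statistics import NormalDist

def interval_length(s, n, confidence):
    """Length of a z-based interval: 2 * z * s / sqrt(n)."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return 2 * z * s / sqrt(n)

# Hypothetical baseline: s = 10, n = 50, 95% confidence.
base = interval_length(s=10.0, n=50, confidence=0.95)
print(f"baseline length:        {base:.3f}")
print(f"larger n (200):         {interval_length(10.0, 200, 0.95):.3f}")
print(f"lower confidence (90%): {interval_length(10.0, 50, 0.90):.3f}")
print(f"larger s (20):          {interval_length(20.0, 50, 0.95):.3f}")
```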

Determining sample size for the mean

The required sample size can be found to reach a desired margin of error with a specified level of confidence. The margin of error is also called the sampling error. The margin of error can be interpreted as

the amount of imprecision in the estimate of the population parameter
the amount added to and subtracted from the point estimate to form the confidence interval

Requirements of determining sample size
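A sketch of what is typically needed, using the z-based interval with known $\sigma$: the desired confidence level (which fixes $z_{\alpha/2}$), the acceptable sampling error $e$, and the standard deviation $\sigma$. The symbol $e$ is notation assumed here.

```latex
\[
e = z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}}
\quad\Longrightarrow\quad
n = \frac{z_{\alpha/2}^{2}\,\sigma^{2}}{e^{2}},
\qquad\text{rounded up to the next whole number.}
\]
```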


Numerical example

A consumer group wants to estimate the mean electric bill for the month of July for single-family homes in a large city. Based on studies conducted in other cities, the standard deviation is assumed to be $25. The group wants to estimate the mean bill for July to within $5 with 99% confidence. What sample size is needed?
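A minimal sketch of the calculation, assuming the usual formula n = (z × σ / e)² rounded up to a whole number. The function name `sample_size_for_mean` is hypothetical. The critical value comes from `NormalDist`, so the result can differ by one from a hand calculation that uses a rounded table value such as 2.58.

```python
from math import ceil
from statistics import NormalDist

def sample_size_for_mean(sigma, error, confidence):
    """Smallest n whose z-based margin of error is at most `error`."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return ceil((z * sigma / error) ** 2)

# Values from the example: sigma = $25, error = $5, 99% confidence.
n_bills = sample_size_for_mean(sigma=25.0, error=5.0, confidence=0.99)
print(f"required sample size: {n_bills}")
```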


Estimation for the binomial distribution

Recall the common structure of the binomial distribution:

A sample of n independent trials
Each trial can have only two possible outcomes, denoted as 'success' and 'failure'
The probability of a success at each trial is assumed to be a constant p
The parameters of the binomial distribution are n and p

Now, assume that p is unknown and we want to use the sample proportion to estimate p.

Point estimation: sample proportion


Sampling distribution of sample proportion

[Figure: the 1st, 2nd, 3rd, ..., kth samples of size n are drawn from the population; the resulting sample proportions form the sampling distribution of the sample proportion]

Sampling distribution of sample proportion (contd)

In the previous section, we discussed the normal approximation to the binomial distribution. In fact, the normal approximation can be justified on the basis of the Central Limit Theorem, since the sample proportion is just a sample mean. The textbook gives a rule for when the CLT can be applied. By the CLT, the sample proportion is approximately normally distributed with mean p and variance p(1 - p)/n.
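In symbols, a sketch of the standard result, with $X$ denoting the number of successes in the $n$ trials and $\hat{p}$ the sample proportion (notation assumed here):

```latex
\[
\hat{p} = \frac{X}{n}, \qquad E(\hat{p}) = p, \qquad
\operatorname{Var}(\hat{p}) = \frac{p(1-p)}{n},
\]
\[
\hat{p} \;\approx\; N\!\left(p,\ \frac{p(1-p)}{n}\right) \text{ for large } n,
\qquad\text{giving the approximate interval }
\hat{p} \pm z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}} .
\]
```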


Example

During June and July of 2001, the European Union Executive Commission conducted a study of 6,543 European adults. Of those surveyed, 56% said that the euro single currency would promote economic growth and 73% knew the correct date of the changeover (January 1, 2002). Construct a 95% confidence interval estimate for the proportion of European adults who believe that the euro would promote economic growth. Interpret the interval constructed.
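The interval for this example can be sketched in Python, assuming the usual large-sample formula (sample proportion plus or minus z times the estimated standard error). The function name `proportion_interval` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def proportion_interval(p_hat, n, confidence=0.95):
    """Large-sample (normal approximation) interval for a proportion."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin

# Values from the example: 56% of 6,543 adults surveyed.
lower, upper = proportion_interval(p_hat=0.56, n=6543)
print(f"95% CI for the proportion: ({lower:.4f}, {upper:.4f})")
```

Following the interpretation given for the mean, the result is reported as the 95% confidence interval for the proportion of European adults who believe the euro would promote economic growth, not as a probability statement about that proportion.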


Excel output of the example (Euro)


Requirements of determining sample size for the proportion


Numerical example:

A study of 658 CEOs conducted by the Conference Board reported that 250 stated that their company's greatest concern was sustained and steady top-line growth ("CEOs' Greatest Concerns," USA Today Snapshots, May 8, 2006, P1D). To conduct a follow-up study to estimate the population proportion of CEOs whose greatest concern was sustained and steady top-line growth to within 0.01 with 95% confidence, how many CEOs would you survey?
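A minimal sketch of the calculation, assuming the usual formula n = z² × p(1 - p) / e² rounded up, with the earlier study's proportion 250/658 used as the planning value for p. The function name `sample_size_for_proportion` is hypothetical. A hand calculation with the rounded value z = 1.96 may land one unit higher.

```python
from math import ceil
from statistics import NormalDist

def sample_size_for_proportion(p, error, confidence):
    """Smallest n whose normal-approximation margin is at most `error`."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return ceil(z ** 2 * p * (1 - p) / error ** 2)

# Planning value from the earlier study: 250 of 658 CEOs.
n_ceos = sample_size_for_proportion(p=250 / 658, error=0.01, confidence=0.95)
# Conservative choice p = 0.5 maximizes p(1 - p).
n_conservative = sample_size_for_proportion(p=0.5, error=0.01, confidence=0.95)
print(f"using p = 250/658: n = {n_ceos}")
print(f"using p = 0.5:     n = {n_conservative}")
```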


Useful and interesting websites


http://www.socr.ucla.edu/Applets.dir/ConfidenceInterval.html
Confidence Intervals simulations

http://en.wikipedia.org/wiki/Confidence_interval
Confidence Intervals information


Assignment 2: Box-and-Whisker Plot


[Figure: box-and-whisker plots for groups G1 and G2 on a common axis from 10 to 40]

Statistics
Sample Size      94
Mean             37.08511
Median           40
Std. Deviation   4.760183
Minimum          12
Maximum          40

[Stem-and-Leaf Display: stems 12 to 40; the stem unit and the assignment of leaves to stems are not recoverable]

