Intensive Course in Statistics by Dr. Poch Bunnak, March 29-April 2, 2010
2010 Dr. Poch Bunnak 1
Content
I. Link between probability and inferential statistics
II. Important probability concepts
III. Types of probability distribution
    Discrete
    Continuous
IV. Normal and standard normal distributions
V. Sampling distributions
VI. Central Limit Theorem
VII. Exercise
Based on this principle of sampling variability, we can make an educated guess of the precision of a single sample mean.
The list must be mutually exclusive, i.e. no two outcomes can occur at the same time:
Die roll {odd number or even number}
A list of exhaustive and mutually exclusive outcomes is called a sample space and is denoted by S.
The outcomes are denoted by O1, O2, ..., Ok.
Using notation from set theory, we can represent the sample space and its outcomes as:
S = {O1, O2, ..., Ok}
Requirements of Probabilities
Given a sample space S = {O1, O2, ..., Ok}, the probabilities assigned to the outcomes must satisfy these requirements:
(1) The probability of any outcome is between 0 and 1
(2) The sum of the probabilities of all the outcomes must equal 1
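The requirements can be checked mechanically. The sketch below assumes the standard pair of conditions (each probability lies between 0 and 1, and the probabilities over the whole sample space sum to 1); the function name `is_valid_assignment` and the two example assignments are illustrative and not taken from the slides.

```python
import math


def is_valid_assignment(probs, tol=1e-9):
    """Return True if every probability is in [0, 1] and they sum to 1."""
    each_in_range = all(0.0 <= p <= 1.0 for p in probs.values())
    sums_to_one = math.isclose(sum(probs.values()), 1.0, abs_tol=tol)
    return each_in_range and sums_to_one


# A fair die: six outcomes, each with probability 1/6.
fair_die = {face: 1 / 6 for face in range(1, 7)}

# An invalid assignment: the two probabilities add up to 1.1.
bad_assignment = {"odd": 0.7, "even": 0.4}

print(is_valid_assignment(fair_die))
print(is_valid_assignment(bad_assignment))
```

The tolerance is there only because floating-point sums such as six copies of 1/6 may differ from 1 by a rounding error.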
Example 1
Six top-rated applicants applied for jobs: M1, M2, F3, M4, F5, M6. Only 2 vacancies are available. Since they are all highly qualified, the selection committee wants to select randomly. They want to know the probability that both female candidates are selected and the probability that one male and one female are selected.
Pairing: See Table 4.1 in SMSS, page 68.
Answer: P(f,f) = .10, P(m,f) = .30, P(m,m) = .30
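The pairing step can be reproduced by listing every equally likely pair and counting. The sketch below uses the six applicant labels as they appear in the example; the helper names are illustrative. For those six labels (four men, two women) the enumeration gives 1/15, 8/15 and 6/15, which does not match the answer quoted above. The quoted .10 and .30 are what the same enumeration gives for a pool of three men and two women (where P(m,f) would be .60), so either the applicant list or the answer may have been altered in transcription; check Table 4.1 in SMSS before relying on either.

```python
from fractions import Fraction
from itertools import combinations

# Applicant labels as listed in the example: M = male, F = female.
applicants = ["M1", "M2", "F3", "M4", "F5", "M6"]

# Every unordered pair is equally likely under random selection.
pairs = list(combinations(applicants, 2))


def count_females(pair):
    """Number of female applicants in a pair."""
    return sum(name.startswith("F") for name in pair)


def prob(n_females):
    """Probability that a random pair contains exactly n_females women."""
    hits = sum(1 for pair in pairs if count_females(pair) == n_females)
    return Fraction(hits, len(pairs))


p_ff, p_mf, p_mm = prob(2), prob(1), prob(0)
print(len(pairs), p_ff, p_mf, p_mm)
```

Changing `applicants` to a five-person list is enough to rerun the check for the smaller pool.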
Example 2
A = sex of students (1 = female, 2 = male)
B = married before graduation (1 = yes, 2 = no)
We want to calculate P(B1 | A1)

          A1     A2    P(Bj)
B1       .11    .06     .17
B2       .29    .54     .83
P(Ai)    .40    .60    1.00

P(B1 | A1) = P(A1 and B1) / P(A1) = .11 / .40 = .275
Thus, there is a 27.5% chance that students are married before graduation given that they are female.
Note on marginal probability: the marginal probabilities P(Ai) and P(Bj) are the column and row totals of the table.
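The same conditional probability can be computed directly from the joint probabilities in the table. This is a minimal sketch; the dictionary layout and function names are illustrative choices, and only the four joint probabilities come from the example.

```python
# Joint probabilities P(Ai and Bj) from the table in Example 2.
joint = {
    ("A1", "B1"): 0.11,
    ("A2", "B1"): 0.06,
    ("A1", "B2"): 0.29,
    ("A2", "B2"): 0.54,
}


def marginal_a(a):
    """Marginal probability P(A = a): sum the joint probabilities over B."""
    return sum(p for (ai, _), p in joint.items() if ai == a)


def conditional_b_given_a(b, a):
    """Conditional probability P(B = b | A = a) = P(a and b) / P(a)."""
    return joint[(a, b)] / marginal_a(a)


p_married_given_female = conditional_b_given_a("B1", "A1")
print(round(marginal_a("A1"), 2), round(p_married_given_female, 3))
```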
Examples:
Number of students in a class
Number of children in the family
Number of TV sets in the household
$\mu = \sum [x P(x)]$
$\mu$ is the mean
P(x) is the probability of the various outcomes x.
The variance, denoted by $\sigma^2$ (sigma squared), measures the amount of spread (variation) of a distribution: $\sigma^2 = \sum [(x - \mu)^2 P(x)]$
The standard deviation, $\sigma$, is the square root of $\sigma^2$
Example of a Discrete Probability Distribution
Compute $\mu$, $\sigma^2$, and $\sigma$ of the number of siblings using these data: 0, 1, 1, 1, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 4, 4, 4, 5, 5
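One way to work this example is to turn the 20 observations into a discrete probability distribution (relative frequencies) and then apply the usual definitions: the mean is the probability-weighted sum of the values, and the variance is the probability-weighted sum of squared deviations from the mean. The sketch below assumes those definitions; the variable names are illustrative.

```python
from collections import Counter
from math import sqrt

siblings = [0, 1, 1, 1, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 4, 4, 4, 5, 5]
n = len(siblings)

# P(x): relative frequency of each distinct value.
dist = {x: count / n for x, count in sorted(Counter(siblings).items())}

mu = sum(x * p for x, p in dist.items())               # mean
var = sum((x - mu) ** 2 * p for x, p in dist.items())  # variance
sd = sqrt(var)                                         # standard deviation

print(dist)
print(round(mu, 2), round(var, 4), round(sd, 2))
```

Worked by hand, the same steps give a mean of 53/20 = 2.65, a variance of 1.7275, and a standard deviation of about 1.31.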
No matter what the values of $\mu$ and $\sigma$ are for a normal probability distribution (NPD), the total area under the curve is equal to one.
We can consider partial areas under the curve as representing probabilities.
Ex. What is the probability that a value z falls above the value of $\mu + 2\sigma$? Below $\mu + 3\sigma$? Etc.
Empirical Rules for NPD (See page 52 in SMSS)
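These partial areas can be computed without a table. The sketch below builds the standard normal cumulative distribution function from `math.erf`; because the questions are phrased in units of standard deviations, the answers do not depend on the particular mean and standard deviation. The function name `phi` is an illustrative choice.

```python
from math import erf, sqrt


def phi(z):
    """Standard normal CDF: area under the curve to the left of z."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


# Area above the mean plus 2 standard deviations.
above_2sd = 1.0 - phi(2.0)

# Area below the mean plus 3 standard deviations.
below_3sd = phi(3.0)

# Empirical rule: area within 1, 2 and 3 standard deviations of the mean.
within = {k: phi(k) - phi(-k) for k in (1, 2, 3)}

print(round(above_2sd, 4), round(below_3sd, 4))
print({k: round(area, 4) for k, area in within.items()})
```

The three `within` values come out at roughly 68%, 95% and 99.7%, the figures usually quoted as the empirical rule.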
What is the probability that a computer is assembled in a time between 45 and 60 minutes? That is, what is P(45 < X < 60)?
Example, cont.
We've converted P(45 < X < 60) for a normal distribution with mean = 50 and st. dev. = 10 to P(-.5 < Z < 1) [a standard normal distribution with mean = 0 and st. dev. = 1], using z = (x - mean) / st. dev., meaning:
A z-value of x=45 is (45 - 50) / 10 = -.5, meaning that x=45 is .5 st. dev. below the mean
A z-value of x=60 is (60 - 50) / 10 = 1, meaning that x=60 is 1 st. dev. above the mean
So, where do we go from here?
Find the probability for the areas between z=-.5 and z=0 and between z=0 and z=1
Based on SNP table:
Area between z=-.5 and z=0: .5000 - .3085 = .1915
Area between z=0 and z=1: .5000 - .1587 = .3413
Both areas = .1915 + .3413 = .5328
Thus, the probability that a computer is assembled between 45 and 60 minutes is 53.28% (See graphs in next slides)
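The table lookup can be cross-checked numerically. This sketch standardizes the two endpoints and takes the difference of the cumulative areas, using a normal CDF built from `math.erf`; the function names are illustrative, while the mean of 50 and standard deviation of 10 come from the example.

```python
from math import erf, sqrt


def normal_cdf(x, mean, sd):
    """P(X < x) for a normal distribution with the given mean and sd."""
    z = (x - mean) / sd
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


mean, sd = 50.0, 10.0  # assembly time in minutes, from the example

z_low = (45.0 - mean) / sd   # -0.5
z_high = (60.0 - mean) / sd  # 1.0

# P(45 < X < 60) is the area to the left of 60 minus the area to the left of 45.
p_between = normal_cdf(60.0, mean, sd) - normal_cdf(45.0, mean, sd)

print(z_low, z_high, round(p_between, 4))
```

The result rounds to .5328, in agreement with the sum of the two table areas.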
Example, cont.
P(-.5 < Z < 1) looks like this:
The probability is the area under the curve
We will add up the two sections: P(-.5 < Z < 0) and P(0 < Z < 1)
[Figure: standard normal curve with z = -.5 and z = 1 marked on the horizontal axis]
Example, cont.
Use Table A on page 527 to look up probabilities P(0 < Z < z)
We can break up P(-.5 < Z < 1) into: P(-.5 < Z < 0) + P(0 < Z < 1)
The distribution is symmetric around zero, so (multiplying by -1 and re-arranging the terms):
P(-.5 < Z < 0) = P(.5 > Z > 0) = P(0 < Z < .5)
Hence: P(-.5 < Z < 1) = P(0 < Z < .5) + P(0 < Z < 1) = .1915 + .3413 = .5328
Practices
Do exercises 1-3 on page 72
Do Problems 11 and 17 on page 89.
Exercise
1. Problem 10, page 88 in SMSS
2. Problem 17, page 89 in SMSS
3. Problem 31, page 92 in SMSS
4. Use census 1998 data:
    1) Find measures of central tendency, dispersion, and shape of age of population
    2) Each student draws a 5% random sample and redoes 1) above.
    3) Compare the results
    4) Redo 2) and 3) by using a 15% random sample.
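The workflow in item 4 can be rehearsed before touching the census file. The sketch below is only an illustration: the `population` list is randomly generated stand-in data, not the 1998 census, and the helper names, the seed, and the choice of summary measures are assumptions made for this example.

```python
import random
import statistics

rng = random.Random(2010)  # fixed seed so the run is repeatable

# Hypothetical stand-in for the age variable; replace with the census data.
population = [rng.randint(0, 90) for _ in range(10_000)]


def skewness(values):
    """Shape measure: the average cubed z-score (0 for a symmetric distribution)."""
    m = statistics.mean(values)
    s = statistics.pstdev(values)
    return sum(((v - m) / s) ** 3 for v in values) / len(values)


def describe(values):
    """Measures of central tendency, dispersion and shape."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "sd": statistics.pstdev(values),
        "range": max(values) - min(values),
        "skewness": skewness(values),
    }


def draw_sample(values, fraction, rng):
    """Simple random sample without replacement of the given fraction."""
    return rng.sample(values, round(len(values) * fraction))


sample_5 = draw_sample(population, 0.05, rng)
sample_15 = draw_sample(population, 0.15, rng)

for label, data in [("population", population), ("5%", sample_5), ("15%", sample_15)]:
    print(label, {k: round(v, 2) for k, v in describe(data).items()})
```

Repeating the draw with different seeds shows how much the sample statistics vary from one sample to the next, and how that variation shrinks as the sampling fraction grows.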