Measures of Dispersion
Probability
Normal Distribution
1
MEASURES OF DISPERSION
2
Definition
• Measures of dispersion are descriptive
statistics that describe how similar a set of
scores are to each other
– The more similar the scores are to each other, the
lower the measure of dispersion will be
– The less similar the scores are to each other, the
higher the measure of dispersion will be
– In general, the more spread out a distribution is,
the larger the measure of dispersion will be
3
Measures of Dispersion
• Which of the distributions of scores has the larger dispersion?
[Figure: two distributions of scores, each plotted on a horizontal axis from 1 to 10 and a vertical axis from 0 to 125]
4
Measures of Dispersion
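• Which of the distributions of scores has the larger dispersion?
[Figure: two distributions of scores, each plotted on a horizontal axis from 1 to 10 and a vertical axis from 0 to 125]
– The distribution with the larger dispersion has scores that are more spread out. That is, they are less similar to
each other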
5
Measures of Dispersion
• Measures of dispersion include:
– The range
– The semi-interquartile range (SIR)
– Variance
– Standard deviation
– Coefficient of variation
6
Measures of Dispersion: The Range
Example:
What is the range of the following data:
4 8 1 6 6 2 9 3 6 9
Solution
• The largest score (XL) is 9; the smallest score (XS) is 1; the range is
XL - XS = 9 - 1 = 8
Range: 1 - 9 (specify minimum and maximum values)
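The range calculation above can be sketched in a few lines of Python. This is only an illustration; the helper name `value_range` is an assumption introduced here, not part of the slides.

```python
# Scores from the range example above.
scores = [4, 8, 1, 6, 6, 2, 9, 3, 6, 9]

def value_range(values):
    """Return (smallest score XS, largest score XL, XL - XS)."""
    smallest, largest = min(values), max(values)
    return smallest, largest, largest - smallest

xs, xl, rng = value_range(scores)
print(f"Range: {xs} - {xl} (XL - XS = {rng})")
```

Reporting the minimum and maximum alongside the difference follows the slide's advice to specify both values.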
7
When To Use the Range
• The range is used to summarize quantitative data.
8
Measures of Dispersion: Semi-Interquartile Range
• The semi-interquartile range (or SIR) is defined as the
difference between the third and first quartiles, divided by two: SIR = (Q3 - Q1) / 2
– The first quartile is the 25th percentile
– The third quartile is the 75th percentile
9
Example: SIR
2 4 6 8 10 12 14 20 30 60
10
SIR Example
• 25 % of the scores are below 5
– 5 is the first quartile (25th %tile)
• 25 % of the scores are above 25
– 25 is the third quartile (75th %tile)
• SIR = (Q3 - Q1) / 2 = (25 - 5) / 2 = 10
Ordered scores, with the quartiles marked:
2 4 | 5 = 25th %tile | 6 8 10 12 14 20 | 25 = 75th %tile | 30 60
11
Example: Inter-quartile range cont.
12
Calculation of the Inter-quartile range for
grouped data
Class interval   Frequency (f)   Cumulative frequency
1-3              10              10
4-6              14              24
7-9              10              34
10-12            6               40
13-15            5               45
16-18            5               50
Total            50
13
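• 1st quartile (or 25th percentile):
Q1 = L + [(n/4 - cf) / f] × w
= 3.5 + [(12.5 – 10) / 14] × 3
= 3.5 + 0.536
= 4.04 years
• 3rd quartile:
Q3 = L + [(3n/4 - cf) / f] × w
= 9.5 + [(37.5 – 34) / 6] × 3
= 9.5 + 1.75
= 11.25 years
where L is the lower boundary of the class containing the quartile, cf is the cumulative frequency below that class, f is the frequency of that class, and w is the class width.
• SIR = ½ (11.25 - 4.04)
= 3.61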
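The grouped-data quartile calculation can be checked with a short Python sketch. The function name `grouped_quantile` and the tuple layout are assumptions made for this illustration; the sketch also assumes integer class limits, so that each class boundary sits 0.5 below its lower limit.

```python
# (lower limit, upper limit, frequency) for each class in the grouped table.
classes = [(1, 3, 10), (4, 6, 14), (7, 9, 10), (10, 12, 6), (13, 15, 5), (16, 18, 5)]

def grouped_quantile(classes, fraction):
    """Interpolate a quantile as L + ((fraction * n - cf) / f) * w."""
    n = sum(f for _, _, f in classes)
    target = fraction * n
    cf = 0  # cumulative frequency below the current class
    for lower, upper, f in classes:
        if cf + f >= target:
            boundary = lower - 0.5       # lower class boundary L
            width = upper - lower + 1    # class width w
            return boundary + (target - cf) / f * width
        cf += f
    raise ValueError("fraction must be between 0 and 1")

q1 = grouped_quantile(classes, 0.25)
q3 = grouped_quantile(classes, 0.75)
sir = (q3 - q1) / 2
print(round(q1, 2), round(q3, 2), round(sir, 2))
```

Working from unrounded quartiles gives the same SIR to two decimal places as the hand calculation.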
14
Measures of Dispersion: Variance
15
Variance
• Variance is defined as the average of the
squared deviations. The population variance is:
$$\sigma^2 = \frac{\sum (X - \mu)^2}{N}$$
16
What Does the Variance Formula
Mean?
• First, it says to subtract the mean from each of
the scores
– This difference is called a deviate or a deviation
score
– The deviate tells us how far a given score is from
the typical, or average, score
– Thus, the deviate is a measure of dispersion for a
given score
17
What Does the Variance Formula
Mean?
18
What Does the Variance Formula Mean?
• One of the definitions of the mean was that it always made the sum of the
scores minus the mean equal to 0
• Thus, the average of the deviates must be 0 since the sum of the deviates
must equal 0
• To avoid this problem, square the deviate scores prior to averaging them
– Squaring the deviate scores makes all of them non-negative, so they can no longer cancel each other out.
• Variance is the mean of the squared deviation scores
• The larger the variance is, the more the scores deviate, on average, away
from the mean
• The smaller the variance is, the less the scores deviate, on average, from
the mean
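The reasoning above can be illustrated with a minimal Python sketch. The six scores are illustrative values assumed for this example; the point is only that the plain deviations average to zero while the squared deviations do not.

```python
# Six illustrative scores (an assumption for this sketch).
scores = [9, 8, 6, 5, 8, 6]
n = len(scores)
mean = sum(scores) / n

# Deviation scores: how far each score is from the mean.
deviations = [x - mean for x in scores]

# The deviations always sum to zero, so their average says nothing about spread.
mean_deviation = sum(deviations) / n

# Squaring first removes the signs; the mean of the squares is the population variance.
variance = sum(d ** 2 for d in deviations) / n

print(deviations, mean_deviation, variance)
```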
19
Standard Deviation
• When the deviate scores are squared in variance, their unit of measure is
squared as well
– E.g. If people’s weights are measured in kilograms, then the variance of
the weights would be expressed in kilograms² (or squared kilograms)
• Since squared units of measure are often awkward to deal with, the square
root of variance is often used instead
– The standard deviation is the square root of variance.
• Standard deviation = √variance
• Variance = standard deviation²
20
Computational Formula
$$\sigma^2 = \frac{\sum X^2 - \frac{(\sum X)^2}{N}}{N} = \frac{\sum (X - \mu)^2}{N}$$
σ² is the population variance, X is a score, µ is the population mean, and
N is the number of scores
21
Computational Formula Example
X        X²        X - µ    (X - µ)²
9        81        2        4
8        64        1        1
6        36        -1       1
5        25        -2       4
8        64        1        1
6        36        -1       1
Σ = 42   Σ = 306   Σ = 0    Σ = 12
22
Computational Formula Example
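$$\sigma^2 = \frac{\sum X^2 - \frac{(\sum X)^2}{N}}{N} = \frac{306 - \frac{42^2}{6}}{6} = \frac{306 - 294}{6} = \frac{12}{6} = 2$$
$$\sigma^2 = \frac{\sum (X - \mu)^2}{N} = \frac{12}{6} = 2$$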
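As a cross-check, the following Python sketch evaluates both the definitional and the computational formula on the six scores of this example and compares them with the standard library's `statistics.pvariance`. The two function names are assumptions introduced for illustration.

```python
import statistics

# The six scores from the computational formula example.
scores = [9, 8, 6, 5, 8, 6]

def variance_definitional(xs):
    """Population variance as the mean of the squared deviations."""
    n = len(xs)
    mu = sum(xs) / n
    return sum((x - mu) ** 2 for x in xs) / n

def variance_computational(xs):
    """Population variance from the sum of X and the sum of X squared only."""
    n = len(xs)
    sum_x = sum(xs)
    sum_x2 = sum(x * x for x in xs)
    return (sum_x2 - sum_x ** 2 / n) / n

print(variance_definitional(scores), variance_computational(scores))
```

The computational form avoids computing each deviation, which is why it is convenient for hand calculation.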
23
Variance of a Sample
• The formula for the variance of a sample is slightly different from the formula for the variance of a
population:
$$s^2 = \frac{\sum (X - \bar{X})^2}{N - 1}$$
s² is the sample variance, X is a score, X̄ is the sample mean, and N is
the number of scores
24
Variance of a Sample
– Is the sum of the squared deviations of the observed values
from the sample mean, divided by N – 1 (approximately the mean squared deviation).
25
No.             1    2    3    4    5    6    7    8    9    10
Salary (in $)   15   18   16   14   15   15   12   17   90   95
Mean: $ 30.7K
26
Example Cont.
Amount (in $)   X – mean   (X – 30.7)²
15              -15.7      246.49
18              -12.7      161.29
16              -14.7      216.09
14              -16.7      278.89
15              -15.7      246.49
15              -15.7      246.49
12              -18.7      349.69
17              -13.7      187.69
90              59.3       3,516.49
95              64.3       4,134.49
Total                      9,584.10
27
Variance, s² = 9,584.10 / (10 – 1)
= 1,064.9
28
Alternative formulae: Variance
Amount (X)   X²
15           225
18           324
16           256
14           196
15           225
15           225
12           144
17           289
90           8,100
95           9,025
Total: 307   19,009
29
• Variance, s² = [Σ(x²) - (Σx)² / n] / (n – 1)
= [19,009 - (307)² / 10] / (10 – 1)
= (19,009 - 9,424.9) / 9
= 1,064.9
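The two routes to the sample variance of the salary data can be compared with a small Python sketch. The variable names are assumptions for illustration, and `statistics.variance` (which divides by n – 1) is used only as an independent check.

```python
import math
import statistics

# Salaries of the ten workers from the example.
salaries = [15, 18, 16, 14, 15, 15, 12, 17, 90, 95]
n = len(salaries)
mean = sum(salaries) / n

# Route 1: sum of squared deviations from the mean.
ss_deviations = sum((x - mean) ** 2 for x in salaries)

# Route 2: alternative formula using only the sum of x and the sum of x squared.
ss_shortcut = sum(x * x for x in salaries) - sum(salaries) ** 2 / n

sample_variance = ss_shortcut / (n - 1)
sample_sd = math.sqrt(sample_variance)
print(round(mean, 1), round(sample_variance, 1), round(sample_sd, 2))
```

Both sums of squares agree up to floating-point rounding, so either formula leads to the same variance.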
30
Example for Grouped Data (Cont.)
31
Mean: 391 / 50
= 7.82 years
Variance: 22.93 years²
32
Standard deviation
• Is the positive square root of the variance.
• The standard deviation is used more frequently than the variance.
– Standard deviation has same units of measurement as the mean.
• The mean and the standard deviation of the set of data can be used to
summarize the characteristics of the entire distribution of values.
» Mean ± SD
Example of workers’ salary:
• Variance for workers’ salary: 9,584.10 / (10 – 1)
= 1,064.9 $²
• Standard deviation: 32.63 $
• Mean: 30.7 $
34
• Standard deviation = √ (variance)
= √{∑ f(m – mean)² / (n – 1)}, where m is the class midpoint and f is the class frequency
= √ {1,123.44 ÷ (50 – 1)}
= 4.79 years.
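The grouped-data mean, variance, and standard deviation can be sketched in Python as follows. The table for this example is not reproduced here, so the sketch assumes the class intervals and frequencies of the earlier grouped-data table (1-3, 4-6, ..., 16-18) and uses their midpoints; that assumption is consistent with the stated mean of 391 / 50.

```python
import math

# (class midpoint m, frequency f); midpoints are an assumption based on
# the class intervals 1-3, 4-6, 7-9, 10-12, 13-15, 16-18.
classes = [(2, 10), (5, 14), (8, 10), (11, 6), (14, 5), (17, 5)]

n = sum(f for _, f in classes)
mean = sum(m * f for m, f in classes) / n

# Sum of f * (m - mean)^2, then divide by n - 1 for the sample variance.
sum_sq = sum(f * (m - mean) ** 2 for m, f in classes)
variance = sum_sq / (n - 1)
sd = math.sqrt(variance)

print(round(mean, 2), round(variance, 2), round(sd, 2))
```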
35
Coefficient of variation
36
                     Cholesterol level   Forced expiratory   Heart rate
                     (mg/100 ml)         volume (liters)     (beats per min.)
Mean                 198.8               2.95                140.9
Standard deviation   43.9                0.62                35.5
CV (%)               22.1                21.0                25.2
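The CV row is consistent with the usual definition of the coefficient of variation, CV = (standard deviation / mean) × 100. A minimal Python sketch, with a function name assumed for illustration:

```python
def coefficient_of_variation(mean, sd):
    """Return the standard deviation as a percentage of the mean."""
    return sd / mean * 100

# (mean, standard deviation) pairs taken from the table.
measurements = {
    "Cholesterol level (mg/100 ml)": (198.8, 43.9),
    "Forced expiratory volume (liters)": (2.95, 0.62),
    "Heart rate (beats per min.)": (140.9, 35.5),
}

cv = {name: round(coefficient_of_variation(m, s), 1) for name, (m, s) in measurements.items()}
for name, value in cv.items():
    print(f"{name}: CV = {value}%")
```

Because the CV is unit-free, it allows the spread of variables measured in different units to be compared directly.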
37
PROBABILITY
38
Why Learn Probability?
39
Why Learn Probability?
• Nothing in life is certain. In everything we do, we gauge the
chances of successful outcomes.
• A probability provides a quantitative description of the
chances or likelihoods associated with various outcomes
• It provides a bridge between descriptive and inferential
statistics
[Diagram: probability reasons from the population to the sample; statistics reasons from the sample back to the population]
40
Why Learn Probability?
41
Basic Concepts
• An experiment is the process by which an observation (or
measurement) is obtained.
• An event is an outcome of an experiment, usually denoted by a
capital letter.
– The basic element to which probability is applied
– When an experiment is performed, a particular event either
happens, or it doesn’t!
42
Experiments and Events
43
Basic Concepts
44
Basic Concepts
• Two events are mutually exclusive if, when one event occurs, the
other cannot, and vice versa.
• Which of the following two sets of events are mutually exclusive?
45
Basic Concepts
• An event that cannot be decomposed is called a simple event.
• Denoted by E with a subscript.
• Each simple event will be assigned a probability, measuring
“how often” it occurs.
• The set of all simple events of an experiment is called the
sample space, S.
46
Example
• The die toss:
• Simple events:
  Face 1: E1
  Face 2: E2
  Face 3: E3
  Face 4: E4
  Face 5: E5
  Face 6: E6
• Sample space: S = {E1, E2, E3, E4, E5, E6}
[Venn diagram: S containing the points E1, E2, E3, E4, E5, E6]
47
Basic Concepts
• An event is a collection of one or more simple
events.
• The die toss:
– A: an odd number
– B: a number > 2
[Venn diagram: within S, A contains E1, E3, E5 and B contains E3, E4, E5, E6]
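Events as collections of simple events map naturally onto sets. The sketch below assumes a fair die (each face equally likely), which the slide does not state explicitly; the helper `prob` is a name introduced here for illustration.

```python
from fractions import Fraction

# Sample space for one toss of a die; each face stands for one simple event.
S = {1, 2, 3, 4, 5, 6}

A = {face for face in S if face % 2 == 1}  # A: an odd number
B = {face for face in S if face > 2}       # B: a number > 2

def prob(event):
    """Probability of an event, assuming equally likely simple events."""
    return Fraction(len(event), len(S))

print(sorted(A), prob(A))
print(sorted(B), prob(B))
print(sorted(A & B), prob(A & B))  # simple events belonging to both A and B
```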
50
Using Simple Events
51
Example 1
1st toss   2nd toss   Simple event   Probability
H          H          E1 = HH        1/4
H          T          E2 = HT        1/4
T          H          E3 = TH        1/4
T          T          E4 = TT        1/4
P(at least 1 head) = P(E1) + P(E2) + P(E3)
= 1/4 + 1/4 + 1/4 = 3/4
52
Example 2
The sample space of throwing a pair of dice is
53
                       2ND ROLL
1ST ROLL   1       2       3       4       5       6
1          (1,1)   (1,2)   (1,3)   (1,4)   (1,5)   (1,6)
2          (2,1)   (2,2)   (2,3)   (2,4)   (2,5)   (2,6)
3          (3,1)   (3,2)   (3,3)   (3,4)   (3,5)   (3,6)
4          (4,1)   (4,2)   (4,3)   (4,4)   (4,5)   (4,6)
5          (5,1)   (5,2)   (5,3)   (5,4)   (5,5)   (5,6)
6          (6,1)   (6,2)   (6,3)   (6,4)   (6,5)   (6,6)
54
Consider the following events:
A - Dice add to 3.
B - Dice add to 6.
C - 1st die shows 1.
D - 2nd die shows 1.
Determine probabilities associated with the four
events.
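One way to work through this exercise is to enumerate the 36 simple events and count those belonging to each event. The sketch below assumes fair dice, so every ordered pair has probability 1/36; the dictionary layout is an assumption for illustration.

```python
from fractions import Fraction
from itertools import product

# All 36 ordered pairs (1st roll, 2nd roll).
S = list(product(range(1, 7), repeat=2))

events = {
    "A": [s for s in S if s[0] + s[1] == 3],  # dice add to 3
    "B": [s for s in S if s[0] + s[1] == 6],  # dice add to 6
    "C": [s for s in S if s[0] == 1],         # 1st die shows 1
    "D": [s for s in S if s[1] == 1],         # 2nd die shows 1
}

# With equally likely simple events, P(event) = (number of simple events) / 36.
probabilities = {name: Fraction(len(ev), len(S)) for name, ev in events.items()}

for name, ev in events.items():
    print(name, ev, probabilities[name])
```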
55
Example 3
Event Simple events Probability
A B A B
61
Event Relations
The intersection of two events, A and B, is
the event that both A and B occur when the
experiment is performed. We write A ∩ B.
[Venn diagram: A ∩ B is the overlap of A and B within S]
P(A ∩ B) = P(A and B)
P(A ∪ B) = P(A) + P(B) – P(A ∩ B)
65
Example: Additive Rule
Example: Suppose that there were 120
students in the classroom, and that they
could be classified as follows:
          Black   Not Black
Male      20      40
Female    30      30
A: black hair, P(A) = 50/120
B: female, P(B) = 60/120
P(A ∪ B) = P(A) + P(B) – P(A ∩ B)
= 50/120 + 60/120 - 30/120
= 80/120 = 2/3
Check: P(A ∪ B) = (20 + 30 + 30)/120
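The additive rule in this example can be verified with exact fractions. The sketch below is illustrative; the dictionary keys and the helper `prob` are names assumed here, and each student is taken to be equally likely to be selected.

```python
from fractions import Fraction

# Classroom counts by (sex, hair colour), taken from the table.
counts = {
    ("male", "black"): 20,
    ("male", "not black"): 40,
    ("female", "black"): 30,
    ("female", "not black"): 30,
}
total = sum(counts.values())

def prob(condition):
    """Probability that a randomly chosen student satisfies the condition."""
    matching = sum(c for (sex, hair), c in counts.items() if condition(sex, hair))
    return Fraction(matching, total)

p_a = prob(lambda sex, hair: hair == "black")                          # A: black hair
p_b = prob(lambda sex, hair: sex == "female")                          # B: female
p_a_and_b = prob(lambda sex, hair: hair == "black" and sex == "female")

p_union_rule = p_a + p_b - p_a_and_b                                   # additive rule
p_union_direct = prob(lambda sex, hair: hair == "black" or sex == "female")
print(p_union_rule, p_union_direct)
```

Subtracting P(A ∩ B) corrects for the 30 black-haired female students, who would otherwise be counted twice.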
Example: Two Dice
A: dice add to 3
B: dice add to 6
72
Calculating Probabilities for Complements
[Venn diagram: S divided into A and its complement A^C]
• We know that for any event A:
– P(A ∩ A^C) = 0
• Since either A or A^C must occur,
P(A ∪ A^C) = 1
• so that P(A ∪ A^C) = P(A) + P(A^C) = 1
P(A^C) = 1 – P(A)
73
Example
Select a student at random
from the classroom. Define:
A: male, P(A) = 60/120
B: female, P(B) = ?
          Black   Not Black
Male      20      40
Female    30      30
P(A|B) is read as the probability of A “given” B
76
Example 1
Toss a fair coin twice. Define
– A: head on second toss
– B: head on first toss
Simple events: HH, HT, TH, TT, each with probability 1/4
P(A|B) = ½
P(A|not B) = ½
P(A) does not change, whether B happens or not…
A and B are independent!
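The independence claim can be checked by enumerating the four equally likely outcomes and computing the conditional probabilities directly. This is a sketch under the stated fair-coin assumption; the helper `prob` and the lambda names are introduced here for illustration.

```python
from fractions import Fraction
from itertools import product

# Sample space for two tosses: ('H','H'), ('H','T'), ('T','H'), ('T','T').
S = list(product("HT", repeat=2))

def prob(event, given=lambda s: True):
    """P(event | given), counting equally likely outcomes."""
    conditioned = [s for s in S if given(s)]
    favourable = [s for s in conditioned if event(s)]
    return Fraction(len(favourable), len(conditioned))

A = lambda s: s[1] == "H"  # head on second toss
B = lambda s: s[0] == "H"  # head on first toss

p_a = prob(A)
p_a_given_b = prob(A, given=B)
p_a_given_not_b = prob(A, given=lambda s: not B(s))

# A and B are independent when conditioning on B leaves P(A) unchanged.
independent = p_a == p_a_given_b == p_a_given_not_b
print(p_a, p_a_given_b, p_a_given_not_b, independent)
```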
77
Example 3: Two Dice
Toss a pair of fair dice. Define
– A: red die shows 1
– B: green die shows 1
81
Random Variables
• A quantitative variable x is a random variable if the value that it
assumes, corresponding to the outcome of an experiment, is a
chance or random event.
• Random variables can be discrete or continuous.
82
Example
Toss a fair coin three times and
define x = number of heads.
Simple event   Probability   x
HHH            1/8           3
HHT            1/8           2
HTH            1/8           2
THH            1/8           2
HTT            1/8           1
THT            1/8           1
TTH            1/8           1
TTT            1/8           0
P(x = 0) = 1/8
P(x = 1) = 3/8
P(x = 2) = 3/8
P(x = 3) = 1/8
x   p(x)
0   1/8
1   3/8
2   3/8
3   1/8
[Figure: Probability Histogram for x]
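The probability distribution of x can be built by listing the eight equally likely outcomes and counting heads in each. A minimal sketch, assuming a fair coin as stated in the example; the variable names are illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# The eight simple events: HHH, HHT, ..., TTT (as tuples of 'H'/'T').
outcomes = list(product("HT", repeat=3))

# x = number of heads in each simple event.
heads_count = Counter(outcome.count("H") for outcome in outcomes)

# p(x) = (number of simple events giving x) / 8.
distribution = {x: Fraction(c, len(outcomes)) for x, c in sorted(heads_count.items())}

for x, p in distribution.items():
    print(x, p)
```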
83
Example
Toss two dice and define
x = sum of two dice.
x    p(x)
2    1/36
3    2/36
4    3/36
5    4/36
6    5/36
7    6/36
8    5/36
9    4/36
10   3/36
11   2/36
12   1/36
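The same counting approach reproduces the table for the sum of two dice. The sketch assumes fair dice, and the unreduced "k/36" formatting is chosen only to match the table.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Count how many of the 36 equally likely pairs give each sum.
sum_counts = Counter(a + b for a, b in product(range(1, 7), repeat=2))

# p(x) for x = 2, ..., 12.
p = {x: Fraction(count, 36) for x, count in sorted(sum_counts.items())}

for x, count in sorted(sum_counts.items()):
    print(f"{x:>2}  {count}/36")
```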
Probability Distributions
Probability distributions can be used to describe the population.
– Shape: Symmetric, skewed, mound-shaped…
– Outliers: unusual or unlikely measurements
– Center and spread: mean and standard deviation. A
population mean is called µ and a population standard
deviation is called σ.
85
THE BINOMIAL DISTRIBUTION
• Many business experiments can be characterized by the
Bernoulli process
• The Bernoulli process is described by the binomial probability
distribution
1. Each trial has only two possible outcomes
2. The probability stays the same from one trial to the next
3. The trials are statistically independent
4. The number of trials is a positive integer
• The binomial distribution is used to find the probability of a
specific number of successes out of n trials
86
The Binomial Distribution
The binomial distribution is used to find the
probability of a specific number of successes out of n
trials
We need to know
n = number of trials
p = the probability of success on any single trial
We let
r = number of successes
q = 1 – p = the probability of a failure
87
The Binomial Distribution
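r   P(r) = n! / (r!(n – r)!) × p^r × q^(n – r), with n = 5 and p = 0.5
0   0.03125 = 5! / (0!(5 – 0)!) × (0.5)^0 × (0.5)^(5 – 0)
1   0.15625 = 5! / (1!(5 – 1)!) × (0.5)^1 × (0.5)^(5 – 1)
2   0.31250 = 5! / (2!(5 – 2)!) × (0.5)^2 × (0.5)^(5 – 2)
3   0.31250 = 5! / (3!(5 – 3)!) × (0.5)^3 × (0.5)^(5 – 3)
4   0.15625 = 5! / (4!(5 – 4)!) × (0.5)^4 × (0.5)^(5 – 4)
5   0.03125 = 5! / (5!(5 – 5)!) × (0.5)^5 × (0.5)^(5 – 5)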
Table 2.7
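Each row of the table applies the binomial formula P(r) = n! / (r!(n – r)!) × p^r × q^(n – r) with n = 5 and p = 0.5. A short Python sketch using the standard library's `math.comb` for the factorial term; the function name `binomial_probability` is an assumption for illustration.

```python
from math import comb

def binomial_probability(r, n, p):
    """Probability of exactly r successes in n independent trials."""
    q = 1 - p  # probability of a failure
    return comb(n, r) * p ** r * q ** (n - r)

n, p = 5, 0.5
table = {r: binomial_probability(r, n, p) for r in range(n + 1)}

for r, prob in table.items():
    print(f"{r}  {prob:.5f}")
```

The six probabilities cover every possible number of successes, so they sum to 1.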
89
Solving Problems with the Binomial Formula
Thus
$$P(4 \text{ successes in 5 trials}) = \frac{5!}{4!\,(5 - 4)!}\,(0.5)^4 (0.5)^{5 - 4}$$
$$= \frac{5(4)(3)(2)(1)}{4(3)(2)(1)(1!)}\,(0.0625)(0.5) = 0.15625$$
Or about 16%
90
The Normal Distribution
[Figure 2.8: three normal curves with the same σ: one centered at µ = 50, one with a smaller µ (µ = 40), and one with a larger µ (µ = 60)]
92
The Normal Distribution
[Figure 2.9: normal curves with the same µ, one with a smaller σ and one with a larger σ]
93
The Normal Distribution
[Figure 2.10: three normal curves showing the area between a and b, where a and b lie 1σ, 2σ, and 3σ on either side of µ]
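The areas within ±1σ, ±2σ, and ±3σ of the mean can be computed with the standard library's `statistics.NormalDist`. This sketch works on the standard normal curve (µ = 0, σ = 1); the areas are the same for any normal distribution when a and b are expressed in standard deviations from µ.

```python
from statistics import NormalDist

standard_normal = NormalDist()  # mean 0, standard deviation 1

# Area between a = µ - kσ and b = µ + kσ for k = 1, 2, 3.
areas = {k: standard_normal.cdf(k) - standard_normal.cdf(-k) for k in (1, 2, 3)}

for k, area in areas.items():
    print(f"within ±{k}σ: {area:.4f}")
```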
94
The Normal Distribution
$$Z = \frac{X - \mu}{\sigma}$$
where
X = value of the random variable we want to measure
µ = mean of the distribution
σ = standard deviation of the distribution
Z = number of standard deviations from X to the mean, µ
Using the Standard Normal Table
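For example, µ = 100, σ = 15, and we want to find the
probability that X is less than 130
$$Z = \frac{X - \mu}{\sigma} = \frac{130 - 100}{15} = \frac{30}{15} = 2 \text{ std dev}$$
[Figure 2.11: normal curve for X = IQ with µ = 100 and σ = 15, with the area P(X < 130) marked; the X axis values 55, 70, 85, 100, 115, 130, 145 line up with Z = –3, –2, –1, 0, 1, 2, 3]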
98
Using the Standard Normal Table
Step 2
Look up the probability from a table of normal curve areas
Use Appendix A or Table 2.9 (portion below)
The column on the left has Z values
The row at the top has second decimal places for the
Z values
Table 2.9
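When a printed table is not at hand, the same lookup can be sketched with Python's `statistics.NormalDist`, whose `cdf` method returns the area to the left of a value. The example reuses µ = 100, σ = 15, and X = 130 from the IQ illustration; the variable names are assumptions.

```python
from statistics import NormalDist

mu, sigma = 100, 15
x = 130

# Step 1: convert X to a Z value.
z = (x - mu) / sigma

# Step 2: area under the standard normal curve to the left of Z,
# which is what a table of normal curve areas reports.
p_from_z = NormalDist().cdf(z)

# Equivalent direct computation on the original scale.
p_direct = NormalDist(mu, sigma).cdf(x)

print(z, round(p_from_z, 5), round(p_direct, 5))
```

A printed table rounded to four or five decimals should agree with the computed area to that precision.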
Standard Normal Table
100
Problems:
1. Suppose the average length of stay in a chronic disease hospital for
a certain type of patient is 60 days, with a standard deviation of 15 days. Assume
that the lengths of stay are approximately normally distributed.
Find the probability that a randomly selected patient from this
group will have a length of hospital stay:
• Greater than 50 days.
• Less than 30 days.
• Between 30 and 60 days.
• Greater than 90 days.
• Greater than 30 days.
101
2. A study of blood pressure of school children gave a distribution of
systolic blood pressures (SBP) close to the normal distribution, with
mean and standard deviation equal to 105.8 mm Hg and 13.4 mm Hg,
respectively.
• What proportion of children would be expected to have SBP greater
than 120 mm Hg?
• What proportion of children would be expected to have SBP less
than 120 mm Hg?
• What proportion of children would be expected to have SBP
between 100 mm Hg and 120 mm Hg?