
Understanding Probability

From Randomness to Probability/ Probability Rules!


What is chance? - Excerpt from War and Peace by Leo Tolstoy
But what is chance? What is genius?
The words chance and genius mean nothing actually
existing, and so cannot be defined. These words
merely denote a certain stage in the comprehension
of phenomena. I do not know how some
phenomenon is brought about; I believe that I cannot
know; consequently I do not want to know and talk
of chance. I see a force producing an effect out of
proportion with the average effect of human powers;
I do not understand how this is brought about, and I
talk about genius.

What is chance? - Excerpt from War and
Peace by Leo Tolstoy - continued
To a herd of rams, the ram the herdsman drives
each evening into a special enclosure to feed and
that becomes twice as fat as the others must
seem to be a genius. And it must appear an
astonishing conjunction of genius with a whole
series of extraordinary chances that this ram,
who instead of getting into the general fold
every evening goes into a special enclosure
where there are oats, that this very ram, swelling
with fat, is killed for meat.
Thought Question
1. If you flip a coin and do it fairly, what is the
probability that it will land heads up?

2. If you were to flip a fair coin six times,
which sequence do you think would be most
likely:

HHHHHH or HHTHTH or HHHTTT?

Dealing with Random Phenomena
A random phenomenon is a situation in which we know
what outcomes could happen, but we don't know which
particular outcome did or will happen.

In general, each occasion upon which we observe a random
phenomenon is called a trial.

At each trial, we note the value of the random phenomenon,
and call it an outcome.

When we combine outcomes, the resulting combination is an
event.
The collection of all possible outcomes is called the sample
space.
Examples
Toss a coin
Sample space S={Heads, Tails}={H,T}

Roll a die
Sample space S={1,2,3,4,5,6}
Event={2,4,6}=even number
The Law of Large Numbers
First a definition . . .
When thinking about what happens with combinations of
outcomes, things are simplified if the individual trials are
independent.

Roughly speaking, this means that the outcome of one trial
doesn't influence or change the outcome of another.

For example, coin flips are independent.
The Law of Large Numbers
The Law of Large Numbers (LLN) says that the long-run
relative frequency of repeated independent events gets
closer and closer to a single value.

We call the single value the probability of the event.

Because this definition is based on repeatedly observing
the event's outcome, this definition of probability is often
called empirical probability.
Coin-Toss Example
Assume coins are made such that they are equally likely to land with heads or
tails up when flipped. The probability of a flipped coin showing heads up is then 1/2.
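A short simulation can make the long-run idea concrete. The sketch below assumes a fair coin and uses Python's pseudo-random generator with a fixed seed; the function name `relative_frequency_of_heads` is made up for this illustration.

```python
import random

def relative_frequency_of_heads(n_flips, seed=0):
    """Flip a simulated fair coin n_flips times; return the fraction of heads."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    heads = sum(rng.random() < 0.5 for _ in range(n_flips))
    return heads / n_flips

# The relative frequency can wander for small n and settles near 0.5 as n grows.
for n in (10, 100, 1_000, 100_000):
    print(n, relative_frequency_of_heads(n))
```

A simulation only suggests the pattern; the LLN itself is a statement about what happens as the number of trials grows without bound.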
The Nonexistent Law of Averages
The LLN says nothing about short-run behavior.

Relative frequencies even out only in the long run, and this
long run is really long (infinitely long, in fact).

The so-called Law of Averages (that an outcome of a
random event that hasn't occurred in many trials is "due"
to occur) doesn't exist at all.
Example: The Gambler's Fallacy
Independent Chance Events Have No Memory

Example:
People tend to believe that a string of good luck will
follow a string of bad luck in a casino. However, making
ten bad gambles in a row doesn't change the
probability that the next gamble will also be bad.
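The "no memory" claim can be checked by simulation. The sketch below uses fair coin flips as a stand-in for gambles (an assumption made for this illustration, as are the function name and the streak length of 5): it looks only at flips that come right after five tails in a row and measures how often they are heads.

```python
import random

def heads_rate_after_tails_streak(n_flips=200_000, streak=5, seed=1):
    """Fraction of heads among flips that immediately follow `streak` tails."""
    rng = random.Random(seed)
    flips = [rng.random() < 0.5 for _ in range(n_flips)]  # True = heads
    followups = [
        flips[i]
        for i in range(streak, n_flips)
        if not any(flips[i - streak:i])  # the previous `streak` flips were all tails
    ]
    return sum(followups) / len(followups)

print(heads_rate_after_tails_streak())
```

If heads were "due" after a run of tails, this rate would sit well above 0.5; for independent flips it stays close to 0.5.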

What looks more likely?
Toss a coin 4 times and record the results of each toss. Which of these outcomes is more
probable?

HTHT TTHH HHHT HHHH

There are 16 possible outcomes:
HHHH THHH HHHT THHT
HHTH THTH HHTT THTT
HTHH TTHH HTHT TTHT
HTTH TTTH HTTT TTTT

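Each individual sequence has probability 1/16, so none of the four sequences above is more likely
than the others. The probability of getting all heads is 1/16, or (0.5)(0.5)(0.5)(0.5) = 0.0625. The
probability of getting 50% heads and 50% tails, in any order, is 6/16 (0.375).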
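The counts above can be verified by listing every sequence. This is a minimal sketch assuming a fair coin, so that all 16 sequences are equally likely; the variable names are chosen for this illustration.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# All sequences of 4 tosses, e.g. "HHHH", "HHHT", ...
outcomes = ["".join(seq) for seq in product("HT", repeat=4)]

# Count how many sequences contain 0, 1, 2, 3, or 4 heads.
heads_counts = Counter(seq.count("H") for seq in outcomes)
distribution = {k: Fraction(heads_counts[k], len(outcomes)) for k in range(5)}

print(len(outcomes))     # 16
print(distribution[4])   # 1/16
print(distribution[2])   # 3/8
```

Two heads is the most likely count only because six different sequences produce it; any one named sequence still has probability 1/16.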


Probability Distribution for the Number of Heads (tossing a coin 4 times)
No. of heads:   0        1      2       3      4
Proportion:     0.0625   0.25   0.375   0.25   0.0625
[Figure: bar chart of the probability distribution for the number of heads in 4 coin tosses; horizontal axis 0 to 4 heads, vertical axis 0 to 0.4]
Modeling Probability
When probability was first studied, a group of French
mathematicians looked at games of chance in which all the
possible outcomes were equally likely. They developed
mathematical models of theoretical probability.

It's equally likely to get any one of six outcomes from the
roll of a fair die.
It's equally likely to get heads or tails from the toss of a fair
coin.

However, keep in mind that events are not always equally
likely.
A skilled basketball player has a better than 50-50 chance
of making a free throw.
Modeling Probability
The probability of an event is the number of outcomes in
the event divided by the total number of possible
outcomes.

P(A) = (# of outcomes in A) / (# of possible outcomes)
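For equally likely outcomes, the formula is just a ratio of counts. The sketch below applies it to the die example; the helper name `probability` is an assumption of this illustration, not a standard function.

```python
from fractions import Fraction

def probability(event, sample_space):
    """P(A) = (# of outcomes in A) / (# of possible outcomes).

    Only valid when every outcome in the sample space is equally likely.
    """
    event = set(event) & set(sample_space)  # ignore anything outside the sample space
    return Fraction(len(event), len(sample_space))

die = {1, 2, 3, 4, 5, 6}
even = {2, 4, 6}
print(probability(even, die))  # 1/2
```

The same function would give a wrong answer for the free-throw example, because "make" and "miss" are not equally likely.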
The Personal-Probability Interpretation
In everyday speech, when we express a degree of
uncertainty without basing it on long-run relative
frequencies or mathematical models, we are stating
subjective or personal probabilities.

The Personal-Probability Interpretation
Personal probability: the degree to which a given individual believes the event
will happen.
Personal-Probability versus Relative Frequency Probability

Probability
What does the word probability mean?

Two distinct interpretations:
For the probability of winning a lottery based on
buying a single ticket -- we can quantify the chances
exactly.

For the probability that we will eventually buy a
home -- we are basing our assessment on personal
beliefs about how life will evolve for us.

Probability - The Relative-Frequency
Interpretation
Relative-frequency interpretation: applies to
situations that can be repeated over and over again.

Examples:
Buying a weekly lottery ticket and observing
whether it is a winner.

Observing births and noting if baby is male or
female.

Idea of Long-Run Relative Frequency
Observe the Relative Frequency
Example: In 1987 there were a total of 3,809,394 live births in the U.S.,
of which 1,951,153 were males.

The probability of a male birth is 1,951,153/3,809,394 = 0.5122

Long-run relative frequency of males born in the United States is about 0.512.

Probability = proportion of time the event occurs over the long run

Possible results for relative frequency of male births:

Formal Probability
1. Two requirements for a probability:

A probability is a number between 0 and 1.
For any event A, 0 ≤ P(A) ≤ 1.

Suppose the probability that a particular flight will be on time is
0.70 and the probability that it will be early is 0.10.

What is the probability it will be late?
Formal Probability
2. Probability Assignment Rule:
The probability of the set of all possible outcomes of a trial
must be 1.
P(S) = 1 (S represents the set of all possible outcomes.)
Formal Probability
3. Complement Rule:
The set of outcomes that are not in the event A is called
the complement of A, denoted A^C.
The probability of an event occurring is 1 minus the
probability that it doesn't occur: P(A) = 1 - P(A^C)
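The complement rule settles the flight question posed earlier. The sketch below assumes that on time, early, and late are the only possible outcomes and that they cannot overlap; exact fractions are used to avoid floating-point rounding.

```python
from fractions import Fraction

p_on_time = Fraction(70, 100)
p_early = Fraction(10, 100)

# Assumption: on time, early, and late are the only outcomes and are disjoint.
p_not_late = p_on_time + p_early
p_late = 1 - p_not_late  # complement rule: P(A) = 1 - P(A^C)

print(float(p_late))  # 0.2
```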
Formal Probability
4. Addition Rule:
Events that have no outcomes in common (and, thus,
cannot occur together) are called disjoint (or mutually
exclusive).

Formal Probability
4. Addition Rule:
For two disjoint events A and B, the probability that one
or the other occurs is the sum of the probabilities of the
two events.
P(A ∪ B) = P(A) + P(B), provided that A and B are
disjoint.

Addition Rule Example:
According to Krantz (1992, p. 190) 25% of all women
give birth to their first child at under 20 years of age
and one-third (or 33%) give birth to their first child
between the ages of 20 and 24.

So the probability that a randomly selected woman will
have given birth to her first child by the time she turns
25 is 0.25 + 0.33 = 0.58.
Age:          Under 20   20 to 24   25 to 29   30 to 34   35 and over
Proportion:   0.25       0.33       0.25       0.125      0.04
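Because the age groups are disjoint, the addition rule reduces to summing table entries. A minimal sketch using the proportions exactly as printed (the dictionary name is chosen for this illustration):

```python
from fractions import Fraction

# Proportions copied from the table; as printed they sum to 0.995, not 1.
first_birth_age = {
    "Under 20": Fraction("0.25"),
    "20 to 24": Fraction("0.33"),
    "25 to 29": Fraction("0.25"),
    "30 to 34": Fraction("0.125"),
    "35 and over": Fraction("0.04"),
}

# The age groups cannot overlap, so their probabilities simply add.
p_before_25 = first_birth_age["Under 20"] + first_birth_age["20 to 24"]
print(float(p_before_25))  # 0.58
```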
Question
Multiplication Rule:
5. Multiplication Rule:
For two independent events A and B, the probability that
both A and B occur is the product of the probabilities of
the two events.
P(A ∩ B) = P(A) x P(B), provided that A and B are
independent.

Example: A woman will have two children. Assume the outcome of the 2nd
birth is independent of the 1st, and the probability that a birth results in a boy is 0.50.

Then the probability of a boy followed by a girl is (0.50)(0.50) = 0.25.
There is about a 25% chance that a woman will have a boy and then a girl.
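The two-birth example can be laid out in full by multiplying along each ordered pair. The sketch relies on the example's own assumptions (independent births, probability of a boy 0.50); the names are illustrative.

```python
from fractions import Fraction
from itertools import product

p_boy = Fraction(1, 2)  # assumption stated in the example
p = {"B": p_boy, "G": 1 - p_boy}

# Independence lets us multiply the per-birth probabilities.
two_births = {a + b: p[a] * p[b] for a, b in product("BG", repeat=2)}

print(two_births["BG"])          # 1/4
print(sum(two_births.values()))  # 1
```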

Formal Probability
5. Multiplication Rule:
Many statistical methods require an Independence
Assumption, but assuming independence doesn't make it
true.

Always think about whether that assumption is reasonable
before using the Multiplication Rule.
Problems
The General Addition Rule
When two events A and B are disjoint, we can use the
addition rule for disjoint events:

P(A ∪ B) = P(A) + P(B)

However, when our events are not disjoint, this earlier
addition rule will double count the probability of both A
and B occurring.

Thus, we need the General Addition Rule.

The General Addition Rule
General Addition Rule:
For any two events A and B,
P(A ∪ B) = P(A) + P(B) - P(A ∩ B)

The following Venn diagram shows a situation in which
we would use the general addition rule:

Example
Hospital records show that 12% of all patients are admitted
for surgical treatment, 16% are admitted for obstetrics, and 2%
receive both obstetrics and surgical treatment. If a new patient
is admitted to the hospital, what is the probability that the
patient will be admitted either for surgery, obstetrics, or both?

S=surgery; O=obstetrics;
P(S)=0.12; P(O)=0.16; P(S and O)=0.02

P(S or O) = P(S) + P(O) - P(S and O)
          = 0.12 + 0.16 - 0.02 = 0.26
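The hospital calculation is a direct application of the General Addition Rule. A minimal sketch, with a helper name (`p_either`) invented for this illustration:

```python
from fractions import Fraction

def p_either(p_a, p_b, p_both):
    """General Addition Rule: P(A or B) = P(A) + P(B) - P(A and B)."""
    return p_a + p_b - p_both

p_surgery = Fraction("0.12")
p_obstetrics = Fraction("0.16")
p_both = Fraction("0.02")

print(float(p_either(p_surgery, p_obstetrics, p_both)))  # 0.26
```

Leaving out the subtraction would count the 2% of patients who receive both treatments twice.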
Conditional Probability
When we want the probability of an event from a conditional
distribution, we write P(B|A) and pronounce it the
probability of B given A.

A probability that takes into account a given condition is
called a conditional probability.

To find the probability of the event B given the event A, we
restrict our attention to the outcomes in A. We then find the
fraction of those outcomes in which B also occurred.

P(B|A) = P(A ∩ B) / P(A)
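With equally likely outcomes, "restricting attention to A" means counting only the outcomes in A. The sketch below uses one roll of a fair die with two events chosen purely for illustration (A = even number, B = greater than 3); the function name is also an assumption.

```python
from fractions import Fraction

def conditional(b, a):
    """P(B|A) for equally likely outcomes: the share of A's outcomes that are also in B."""
    a, b = set(a), set(b)
    return Fraction(len(a & b), len(a))

even = {2, 4, 6}            # event A
greater_than_3 = {4, 5, 6}  # event B
print(conditional(greater_than_3, even))  # 2/3
```

The same number comes out of the formula: P(A and B) = 2/6 and P(A) = 3/6, so P(B|A) = (2/6) / (3/6) = 2/3.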
Independence
Independence of two events means that the outcome of
one event does not influence the probability of the other.

With our new notation for conditional probabilities, we
can now formalize this definition:

Events A and B are independent whenever P(B|A) = P(B).

(Equivalently, events A and B are independent whenever
P(A|B) = P(A).)
Independent ≠ Disjoint
Disjoint events cannot be independent! Well, why not?

Since we know that disjoint events have no outcomes in
common, knowing that one occurred means the other
didn't.

Thus, the probability of the second occurring changed
based on our knowledge that the first occurred.

It follows, then, that the two events are not independent.
Depending on Independence
It's much easier to think about independent events than
to deal with conditional probabilities.

It seems that most people's natural intuition for probabilities
breaks down when it comes to conditional probabilities.

Don't fall into this trap: whenever you see probabilities
multiplied together, stop and ask whether you think they
are really independent.
Conditional Probability Rule
Example: Survey
56% of students live on campus: P(L) = 0.56
62% have a campus meal program: P(M) = 0.62
42% do both: P(L and M) = 0.42

P(M|L) = P(L and M)/P(L) = 0.42/0.56 = 0.75

Are living on campus and having a meal plan
independent? Are they disjoint?
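Both questions can be answered by comparing numbers from the survey. A minimal sketch using exact fractions so the equality checks are not affected by rounding:

```python
from fractions import Fraction

p_l = Fraction("0.56")        # lives on campus
p_m = Fraction("0.62")        # has a meal plan
p_l_and_m = Fraction("0.42")  # both

p_m_given_l = p_l_and_m / p_l
independent = (p_m_given_l == p_m)  # independence requires P(M|L) = P(M)
disjoint = (p_l_and_m == 0)         # disjoint requires P(L and M) = 0

print(float(p_m_given_l), independent, disjoint)  # 0.75 False False
```

Since P(M|L) = 0.75 differs from P(M) = 0.62, the events are not independent, and since 42% of students do both, they are not disjoint either. With real survey data an exact equality test is too strict; one would ask whether the two values are close.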


Problem
The General Multiplication Rule
When two events A and B are independent, we can use
the multiplication rule for independent events:

P(A and B) = P(A) x P(B)

However, when our events are not independent, this
earlier multiplication rule does not work. Thus, we need
the General Multiplication Rule.
The General Multiplication Rule
We encountered the general multiplication rule in the
form of conditional probability.

Rearranging the equation in the definition for conditional
probability, we get the General Multiplication Rule:

For any two events A and B,

P(A and B) = P(A) x P(B|A)
or
P(A and B) = P(B) x P(A|B)
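The campus survey numbers from the earlier example give a quick check of the rule: multiplying P(L) by P(M|L) recovers P(L and M), while multiplying P(L) by P(M) does not.

```python
from fractions import Fraction

p_l = Fraction("0.56")          # P(L): lives on campus
p_m_given_l = Fraction("0.75")  # P(M|L): has a meal plan, given lives on campus

p_l_and_m = p_l * p_m_given_l   # General Multiplication Rule
print(float(p_l_and_m))         # 0.42

# Treating L and M as independent gives a different, incorrect answer.
p_m = Fraction("0.62")
print(float(p_l * p_m))         # 0.3472
```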
Question
Thought Question - iClicker
You test positive for a rare disease. Your original chances
of having the disease are 1 in 1000.

The test has a 10% false positive rate and a 10% false
negative rate, so whether you have the disease or not, the test
is 90% likely to give a correct answer.

Given you tested positive, what do you think is the
probability that you actually have the disease?

A. 90% B. 80% C. 45% D. 1%


Tree Diagrams
A tree diagram helps us think through conditional
probabilities by showing sequences of events as paths
that look like branches of a tree.

Making a tree diagram for situations with conditional
probabilities is consistent with our "make a picture"
mantra.
Example
Binge Drinking on Campus: Results of a National Study:

44% of college students engage in binge drinking, 37%
drink moderately, and 19% abstain entirely.

Another study finds that among binge drinkers aged 21 to
34, 17% have been involved in an alcohol-related accident,
and among nonbingers only 9% have been involved in
such accidents.
Tree Diagrams
The figure shows how we
multiply the probabilities
of the branches together:

What is the probability of
being a binge drinker and
having an accident?

What is the probability
that a student had an
alcohol-related accident?
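Both questions can be worked by multiplying along the branches and then adding the paths that end in an accident. The sketch below is hedged: the text gives the 9% figure for "nonbingers" as a whole, and here it is ASSUMED to apply to moderate drinkers only, with abstainers having no alcohol-related accidents. A different reading would change the second answer.

```python
from fractions import Fraction

# Each branch: (P(group), P(accident | group)).
# ASSUMPTION: 9% applies to moderate drinkers; abstainers have 0% by assumption.
tree = {
    "binge":    (Fraction("0.44"), Fraction("0.17")),
    "moderate": (Fraction("0.37"), Fraction("0.09")),
    "abstain":  (Fraction("0.19"), Fraction("0")),
}

# Multiply along each path to get P(group and accident).
joint = {group: p_group * p_acc for group, (p_group, p_acc) in tree.items()}

# Add the paths that end in an accident.
p_accident = sum(joint.values())

print(float(joint["binge"]))  # 0.0748
print(float(p_accident))      # 0.1081
```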
Bayes's Rule - Reversing the Conditioning
P(B|A) = P(A|B) P(B) / [ P(A|B) P(B) + P(A|B^C) P(B^C) ]

What is the probability that a student is a binge drinker (B), given that
they had an accident (A)?
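A sketch of the reversed calculation for the binge-drinking example. The function name `bayes` is made up for this illustration, and the accident rate among non-binge drinkers rests on an ASSUMPTION: only moderate drinkers (37% of students, 9% accident rate) have alcohol-related accidents, and abstainers have none.

```python
from fractions import Fraction

def bayes(p_a_given_b, p_b, p_a_given_not_b):
    """P(B|A) = P(A|B)P(B) / [P(A|B)P(B) + P(A|B^C)P(B^C)]."""
    numerator = p_a_given_b * p_b
    return numerator / (numerator + p_a_given_not_b * (1 - p_b))

p_binge = Fraction("0.44")
p_acc_given_binge = Fraction("0.17")

# ASSUMPTION: among the 56% who are not binge drinkers, only the moderate
# drinkers contribute accidents, at the 9% rate quoted in the example.
p_acc_given_not_binge = Fraction("0.37") * Fraction("0.09") / (1 - p_binge)

p_binge_given_acc = bayes(p_acc_given_binge, p_binge, p_acc_given_not_binge)
print(round(float(p_binge_given_acc), 3))  # 0.692
```

Under these assumptions, roughly 69% of students who had an alcohol-related accident are binge drinkers, even though binge drinkers are 44% of all students.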
Medical Testing for a Rare Disease
To determine the probability that a positive test result is accurate,
you need:

Sensitivity of the test - the proportion of people who
correctly test positive when they actually have the disease

Specificity of the test - the proportion of people who
correctly test negative when they don't have the disease

Base rate - the probability that you have the disease,
without any knowledge of your test results.

Positive Predictive Rate (PPR) - the probability of being
diseased, given that someone tests positive.


Medical Testing for a Rare Disease
You test positive for a rare disease. Your original
chances of having the disease are 1 in 1000, so the base rate is
0.001.

The test has a 10% false positive rate and a 10% false
negative rate, so whether you have the disease or not, the
test is 90% likely to give a correct answer.

The sensitivity and specificity of the test are both
90%: the proportion of people who correctly test
positive, or correctly test negative, is 90%.


Medical Testing for a Rare Disease
100,000 people
    Have disease (P = 0.001): 100
        Test positive (P = 0.90): 90
        Test negative (P = 0.10): 10
    Do not have disease (P = 0.999): 99,900
        Test negative (P = 0.90): 89,910
        Test positive (P = 0.10): 9,990

P(have disease and test positive) = 0.001 x 0.90 = 0.0009
P(do not have disease and test positive) = 0.999 x 0.10 = 0.0999

90 + 9,990 = 10,080 test positive, but only 90
of those have the disease.

So the probability of having the disease, given that
you test positive, is 90/10,080 = 0.009, or about 1%.

This is called the Positive Predictive Rate.
Bayes's Rule
Let A = test positive and B = have disease.

P(B|A) = P(A|B) P(B) / [ P(A|B) P(B) + P(A|B^C) P(B^C) ]

P(B|A) = (0.90)(0.001) / [ (0.90)(0.001) + (0.10)(0.999) ]
       = 0.0009 / (0.0009 + 0.0999)
       = 0.0009 / 0.1008
       = 0.009, or about 1%
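The rare-disease calculation can be written once in terms of base rate, sensitivity, and specificity. This is a minimal sketch; the function name `positive_predictive_rate` is chosen for this illustration.

```python
def positive_predictive_rate(base_rate, sensitivity, specificity):
    """P(disease | positive test) via Bayes's Rule."""
    true_pos = sensitivity * base_rate                # P(+test and disease)
    false_pos = (1 - specificity) * (1 - base_rate)   # P(+test and no disease)
    return true_pos / (true_pos + false_pos)

ppr = positive_predictive_rate(base_rate=0.001, sensitivity=0.90, specificity=0.90)
print(round(ppr, 4))  # 0.0089
```

Raising the base rate while leaving the test unchanged raises the result, which is why the same test is far more informative for a common condition than for a rare one.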

Medical Testing for a Rare Disease

Notice that of the 10,080 who test positive, only 90 are diseased.
So the probability of being diseased, given that someone tests
positive, is only 90/10,080 = 0.009 or 0.9%.

This is called the positive predictive rate (PPR).

If base rate for disease is very low and test for disease is less than
perfect, there will be a relatively high probability that a positive test
result is a false positive.

Breakdown of Actual Status versus Test Status for a Rare Disease
                    Test shows positive   Test shows negative     Total
Actually sick                        90                    10       100
Actually healthy                  9,990                89,910    99,900
Total                            10,080                89,920   100,000
Example: Mammogram Test for Breast Cancer
The probability of a woman getting breast cancer in her 40s is 0.014 (from
cancer.gov).

The mammogram test has a 5% false positive rate and a 10% false negative
rate (conservative estimates).

The sensitivity of the test is 90% and the specificity of the test is 95%.


Breakdown of Actual Status versus Test Status for Breast Cancer
                    Test shows positive   Test shows negative     Total
Actually sick                     1,260                   140     1,400
Actually healthy                  4,930                93,670    98,600
Total                             6,190                93,810   100,000
Example: Mammogram Test for Breast Cancer

Notice that of the 6,190 who test positive, only 1,260 have
breast cancer.
So the probability of having breast cancer, given that someone
tests positive, is

1,260/6,190 = 0.204, or about 20%
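The breakdown table can be rebuilt from the three inputs by splitting a population of 100,000 into its four cells. A minimal sketch; the function name `status_counts` and the dictionary keys are assumptions made for this illustration.

```python
def status_counts(population, base_rate, sensitivity, specificity):
    """Split a population into the four cells of the actual-vs-test table."""
    sick = round(population * base_rate)
    healthy = population - sick
    sick_pos = round(sick * sensitivity)               # correctly test positive
    healthy_pos = round(healthy * (1 - specificity))   # false positives
    return {
        "sick_positive": sick_pos,
        "sick_negative": sick - sick_pos,
        "healthy_positive": healthy_pos,
        "healthy_negative": healthy - healthy_pos,
    }

counts = status_counts(100_000, base_rate=0.014, sensitivity=0.90, specificity=0.95)
total_positive = counts["sick_positive"] + counts["healthy_positive"]
share_with_cancer = counts["sick_positive"] / total_positive

print(counts)
print(f"{share_with_cancer:.0%}")  # 20%
```

Working with whole-number counts like this gives the same answer as Bayes's Rule and is often easier to explain.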

Example: The Case of Sally Clark and
SIDS (Sudden Infant Death Syndrome)
Her prosecution was controversial due to statistical evidence
presented by pediatrician Professor Sir Roy Meadow, who
testified that the chance of two children from an affluent family
suffering sudden infant death syndrome was 1 in 73 million.

He relied on a study which showed that for non-smoking
professional families, the probability of cot death is 1/8500.

1/8500 x 1/8500 = 1/72,250,000, roughly the 1 in 73 million quoted

He assumed independence of the two events. Do you think
this is a fair assumption to make?
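The arithmetic behind the quoted figure is a single multiplication, and writing it with the General Multiplication Rule shows exactly where the independence assumption enters. A minimal sketch; the function name `p_both` is an assumption of this illustration.

```python
from fractions import Fraction

def p_both(p_first, p_second_given_first):
    """General Multiplication Rule: P(A and B) = P(A) x P(B|A)."""
    return p_first * p_second_given_first

p_cot_death = Fraction(1, 8500)  # figure quoted from the study

# Independence assumption: P(second death | first death) = P(second death).
p_if_independent = p_both(p_cot_death, p_cot_death)
print(p_if_independent)  # 1/72250000
```

If a second cot death is more likely in a family that has already had one, the second argument would be larger than 1/8500 and the product would be larger than the figure presented.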

Example: The Case of Sally Clark and
SIDS (Sudden Infant Death Syndrome)

The Royal Statistical Society later issued a public statement
expressing its concern at the "misuse of statistics in the
courts" and arguing that there was "no statistical basis" for
Meadow's claim.




Problems
