From Randomness to Probability/ Probability Rules!
What is chance? - Excerpt from War and Peace by Leo Tolstoy
But what is chance? What is genius? The words chance and genius mean nothing actually existing, and so cannot be defined. These words merely denote a certain stage in the comprehension of phenomena. I do not know how some phenomenon is brought about; I believe that I cannot know; consequently I do not want to know and talk of chance. I see a force producing an effect out of proportion with the average effect of human powers; I do not understand how this is brought about, and I talk about genius.
To a herd of rams, the ram the herdsman drives each evening into a special enclosure to feed and that becomes twice as fat as the others must seem to be a genius. And it must appear an astonishing conjunction of genius with a whole series of extraordinary chances that this ram, who instead of getting into the general fold every evening goes into a special enclosure where there are oats, that this very ram, swelling with fat, is killed for meat.
Thought Questions
1. If you flip a coin and do it fairly, what is the probability that it will land heads up?
2. If you were to flip a fair coin six times, which sequence do you think would be most likely: HHHHHH, HHTHTH, or HHHTTT?
Dealing with Random Phenomena
A random phenomenon is a situation in which we know what outcomes could happen, but we don't know which particular outcome did or will happen.
In general, each occasion upon which we observe a random phenomenon is called a trial.
At each trial, we note the value of the random phenomenon, and call it an outcome.
When we combine outcomes, the resulting combination is an event. The collection of all possible outcomes is called the sample space.
Examples
Toss a coin: sample space S = {Heads, Tails} = {H, T}
Roll a die: sample space S = {1, 2, 3, 4, 5, 6}; event = {2, 4, 6} = "even number"
The Law of Large Numbers
First, a definition: when thinking about what happens with combinations of outcomes, things are simplified if the individual trials are independent.
Roughly speaking, this means that the outcome of one trial doesn't influence or change the outcome of another.
For example, coin flips are independent.
The Law of Large Numbers (LLN) says that the long-run relative frequency of repeated independent events gets closer and closer to a single value.
We call the single value the probability of the event.
Because this definition is based on repeatedly observing the event's outcome, this definition of probability is often called empirical probability.
Coin-Toss Example
Assume coins are made such that they are equally likely to land with heads or tails up when flipped. The probability of a flipped coin showing heads up is then 1/2.
The Nonexistent Law of Averages
The LLN says nothing about short-run behavior.
Relative frequencies even out only in the long run, and this long run is really long (infinitely long, in fact).
The so-called Law of Averages (that an outcome of a random event that hasn't occurred in many trials is "due" to occur) doesn't exist at all.
Example: The Gambler's Fallacy
Independent chance events have no memory.
Example: People tend to believe that a string of good luck will follow a string of bad luck in a casino. However, making ten bad gambles in a row doesn't change the probability that the next gamble will also be bad.
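The two claims above (relative frequencies settle down only in the long run, and independent flips have no memory) can be explored with a small simulation. This is a minimal sketch, not part of the original slides: the function names, the seed, and the choice of a streak of three tails are all illustrative assumptions.

```python
import random


def running_relative_frequency(n_flips, seed=0):
    """Simulate fair coin flips; return the relative frequency of heads after each flip."""
    rng = random.Random(seed)  # seeded so the run is reproducible
    heads = 0
    frequencies = []
    for i in range(1, n_flips + 1):
        heads += rng.random() < 0.5  # True counts as 1 (heads)
        frequencies.append(heads / i)
    return frequencies


def heads_rate_after_streak(n_flips, streak=3, seed=0):
    """Relative frequency of heads on flips that immediately follow `streak` tails in a row."""
    rng = random.Random(seed)
    flips = [rng.random() < 0.5 for _ in range(n_flips)]  # True = heads
    followers = [
        flips[i]
        for i in range(streak, n_flips)
        if not any(flips[i - streak:i])  # the previous `streak` flips were all tails
    ]
    return sum(followers) / len(followers)


freqs = running_relative_frequency(100_000)
for n in (10, 100, 1_000, 100_000):
    print(n, round(freqs[n - 1], 4))
print("heads after 3 tails:", round(heads_rate_after_streak(100_000), 4))
```

In a run like this, the early relative frequencies can wander well away from 0.5, while the value after many flips should sit close to it. The rate of heads right after a streak of tails should also stay near 0.5, which is what "no memory" means.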
What looks more likely?
Toss a coin 4 times and record the result of each toss. Which of these outcomes is most probable?
HTHT TTHH HHHT HHHH
There are 16 possible outcomes, all equally likely:
HHHH HHHT HHTH HHTT HTHH HTHT HTTH HTTT
THHH THHT THTH THTT TTHH TTHT TTTH TTTT
Each specific sequence, including each of the four above, has probability 1/16.
The probability of getting all heads is 1/16, or (0.5)(0.5)(0.5)(0.5) = 0.0625. The probability of getting 50% heads and 50% tails (two of each, in any order) is 6/16 = 0.375.
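The counting argument above can be checked by listing the sample space directly. The following is a small sketch using exact fractions; the variable names are illustrative, and the only assumption is that the coin is fair, so all 16 sequences are equally likely.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# All sequences of 4 tosses; with a fair coin each is equally likely.
outcomes = ["".join(seq) for seq in product("HT", repeat=4)]
total = len(outcomes)  # 2**4 sequences

# Probability of any one specific sequence, e.g. HHHH or HTHT.
p_sequence = Fraction(1, total)

# Distribution of the number of heads: count the sequences for each head count.
heads_counts = Counter(seq.count("H") for seq in outcomes)
distribution = {k: Fraction(heads_counts[k], total) for k in range(5)}

print(p_sequence)
for k, p in distribution.items():
    print(k, p, float(p))
```

The distinction the slide is making shows up here: a specific ordering such as HTHT is exactly as likely as HHHH, but the event "two heads in any order" collects six orderings.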
Probability Distribution for the Number of Heads (tossing a coin 4 times)
No. of heads:  0       1     2      3     4
Proportion:    0.0625  0.25  0.375  0.25  0.0625
[Figure: bar chart of the probability distribution for the number of heads in 4 coin tosses; x-axis: number of heads (0 to 4), y-axis: proportion (0 to 0.4)]
Modeling Probability
When probability was first studied, a group of French mathematicians looked at games of chance in which all the possible outcomes were equally likely. They developed mathematical models of theoretical probability.
It's equally likely to get any one of six outcomes from the roll of a fair die. It's equally likely to get heads or tails from the toss of a fair coin.
However, keep in mind that events are not always equally likely. A skilled basketball player has a better than 50-50 chance of making a free throw.
When all outcomes are equally likely, the probability of an event is the number of outcomes in the event divided by the total number of possible outcomes:
P(A) = (# of outcomes in A) / (# of possible outcomes)
The Personal-Probability Interpretation
In everyday speech, when we express a degree of uncertainty without basing it on long-run relative frequencies or mathematical models, we are stating subjective or personal probabilities.
Personal probability: the degree to which a given individual believes the event will happen.
Personal Probability versus Relative-Frequency Probability
Probability What does the word probability mean?
Two distinct interpretations: For the probability of winning a lottery based on buying a single ticket -- we can quantify the chances exactly.
For the probability that we will eventually buy a home -- we are basing our assessment on personal beliefs about how life will evolve for us.
Probability - The Relative-Frequency Interpretation Relative-frequency interpretation: applies to situations that can be repeated over and over again.
Examples: Buying a weekly lottery ticket and observing whether it is a winner.
Observing births and noting if baby is male or female.
Idea of Long-Run Relative Frequency
Observe the relative frequency. Example: In 1987 there were a total of 3,809,394 live births in the U.S., of which 1,951,153 were males.
The estimated probability of a male birth is 1,951,153/3,809,394 = 0.5122.
Long-run relative frequency of males born in the United States is about 0.512.
Probability = proportion of time the event occurs over the long run
Possible results for relative frequency of male births:
Formal Probability
1. Two requirements for a probability:
A probability is a number between 0 and 1. For any event A, 0 ≤ P(A) ≤ 1.
Example: Suppose the probability that a particular flight will be on time is 0.70 and the probability that it will be early is 0.10.
What is the probability it will be late?
2. Probability Assignment Rule: The probability of the set of all possible outcomes of a trial must be 1. P(S) = 1 (S represents the set of all possible outcomes.)
3. Complement Rule: The set of outcomes that are not in the event A is called the complement of A, denoted A^C. The probability of an event occurring is 1 minus the probability that it doesn't occur: P(A) = 1 − P(A^C)
4. Addition Rule: Events that have no outcomes in common (and, thus, cannot occur together) are called disjoint (or mutually exclusive).
For two disjoint events A and B, the probability that one or the other occurs is the sum of the probabilities of the two events. P(A ∪ B) = P(A) + P(B), provided that A and B are disjoint.
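The flight question can be worked through with the Addition Rule and the Complement Rule together. This is a sketch under one stated assumption that the slide leaves implicit: a flight is exactly one of early, on time, or late, so those three events are disjoint and make up the whole sample space. Exact fractions are used to avoid floating-point rounding.

```python
from fractions import Fraction

# Probabilities from the flight example, kept exact with Fraction.
p_on_time = Fraction(70, 100)
p_early = Fraction(10, 100)

# Addition Rule: "on time" and "early" are disjoint, so their probabilities add.
p_not_late = p_on_time + p_early

# Complement Rule: "late" is assumed to be the complement of "on time or early".
p_late = 1 - p_not_late

print(float(p_not_late), float(p_late))
```

Under that assumption the three probabilities sum to 1, as the Probability Assignment Rule requires.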
Addition Rule Example: According to Krantz (1992, p. 190) 25% of all women give birth to their first child at under 20 years of age and one-third (or 33%) give birth to their first child between the ages of 20 and 24.
So the probability that a randomly selected woman will have given birth to her first child by the time she turns 25 is 0.25 + 0.33 = 0.58.
Age:         Under 20  20 to 24  25 to 29  30 to 34  35 and over
Proportion:  0.25      0.33      0.25      0.125     0.04
5. Multiplication Rule: For two independent events A and B, the probability that both A and B occur is the product of the probabilities of the two events. P(A ∩ B) = P(A) × P(B), provided that A and B are independent.
Example: A woman will have two children. Assume the outcome of the 2nd birth is independent of the 1st and the probability that a birth results in a boy is 0.50.
Then the probability of a boy followed by a girl is (0.50)(0.50) = 0.25: about a 25% chance that a woman will have a boy and then a girl.
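The two-children example can be laid out as a full sample space, applying the Multiplication Rule to each ordering. This is a minimal sketch that takes the slide's assumptions at face value (independent births, probability 0.50 for a boy); the dictionary names are illustrative.

```python
from fractions import Fraction
from itertools import product

# Assumed per-birth probabilities, as in the example.
p = {"B": Fraction(1, 2), "G": Fraction(1, 2)}

# Multiplication Rule for independent births: P(first, second) = P(first) * P(second).
joint = {
    first + second: p[first] * p[second]
    for first, second in product("BG", repeat=2)
}

print(joint["BG"])          # boy, then girl
print(sum(joint.values()))  # the four orderings cover the sample space
```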
Many Statistics methods require an Independence Assumption, but assuming independence doesn't make it true.
Always think about whether that assumption is reasonable before using the Multiplication Rule.
The General Addition Rule
When two events A and B are disjoint, we can use the addition rule for disjoint events:
P(A ∪ B) = P(A) + P(B)
However, when our events are not disjoint, this earlier addition rule will double count the probability of both A and B occurring.
Thus, we need the General Addition Rule.
General Addition Rule: For any two events A and B, P(A ∪ B) = P(A) + P(B) − P(A ∩ B)
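The double counting can be seen concretely with sets. The sketch below reuses the die from the earlier sample-space example; the event B = "greater than 3" and the helper `prob` are illustrative choices, not taken from the slides.

```python
from fractions import Fraction

sample_space = {1, 2, 3, 4, 5, 6}  # one roll of a fair die
A = {2, 4, 6}                      # even number
B = {4, 5, 6}                      # greater than 3 (illustrative event)


def prob(event):
    """Equally likely outcomes: size of the event over size of the sample space."""
    return Fraction(len(event & sample_space), len(sample_space))


naive = prob(A) + prob(B)                  # double counts the shared outcomes 4 and 6
general = prob(A) + prob(B) - prob(A & B)  # General Addition Rule

print(naive, general, prob(A | B))
```

Subtracting P(A ∩ B) removes the overlap once, so the general rule agrees with the probability computed directly from the union.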
The following Venn diagram shows a situation in which we would use the general addition rule:
Example Hospital records show that 12% of all patients are admitted for surgical treatment, 16% are admitted for obstetrics, and 2% receive both obstetrics and surgical treatment. If a new patient is admitted to the hospital, what is the probability that the patient will be admitted either for surgery, obstetrics, or both?
S=surgery; O=obstetrics; P(S)=0.12; P(O)=0.16; P(S and O)=0.02
P(S or O) = P(S) + P(O) − P(S and O) = 0.12 + 0.16 − 0.02 = 0.26
Conditional Probability
When we want the probability of an event from a conditional distribution, we write P(B|A) and pronounce it "the probability of B given A."
A probability that takes into account a given condition is called a conditional probability.
To find the probability of the event B given the event A, we restrict our attention to the outcomes in A. We then find the fraction of those outcomes in which B also occurred.
P(B|A) = P(A ∩ B) / P(A)
Independence
Independence of two events means that the outcome of one event does not influence the probability of the other.
With our new notation for conditional probabilities, we can now formalize this definition:
Events A and B are independent whenever P(B|A) = P(B).
(Equivalently, events A and B are independent whenever P(A|B) = P(A).)
Independent ≠ Disjoint
Disjoint events (each with nonzero probability) cannot be independent! Well, why not?
Since we know that disjoint events have no outcomes in common, knowing that one occurred means the other didn't.
Thus, the probability of the second occurring changed based on our knowledge that the first occurred.
It follows, then, that the two events are not independent.
Depending on Independence
It's much easier to think about independent events than to deal with conditional probabilities.
It seems that most people's natural intuition for probabilities breaks down when it comes to conditional probabilities.
Don't fall into this trap: whenever you see probabilities multiplied together, stop and ask whether you think they are really independent.
Conditional Probability Rule
Example: Survey
56% of students live on campus: P(L) = 0.56
62% have a campus meal program: P(M) = 0.62
42% do both: P(L and M) = 0.42
P(M|L) = P(L and M)/P(L) = 0.42/0.56 = 0.75
Are living on campus and having a meal plan independent? Are they disjoint?
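Both questions can be answered from the three survey numbers. The sketch below applies the definitions directly: independence would require P(M|L) = P(M), and disjointness would require P(L and M) = 0. The variable names and the tolerance passed to `math.isclose` are illustrative choices.

```python
import math

p_l = 0.56        # P(L): lives on campus
p_m = 0.62        # P(M): has a campus meal plan
p_l_and_m = 0.42  # P(L and M)

# Conditional probability rule.
p_m_given_l = p_l_and_m / p_l

# Independent would require P(M|L) == P(M); disjoint would require P(L and M) == 0.
independent = math.isclose(p_m_given_l, p_m, abs_tol=1e-9)
disjoint = math.isclose(p_l_and_m, 0.0, abs_tol=1e-9)

print(round(p_m_given_l, 2), independent, disjoint)
```

Since P(M|L) differs from P(M), knowing that a student lives on campus changes the probability of having a meal plan, so the events are not independent. Since some students do both, they are not disjoint either.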
The General Multiplication Rule
When two events A and B are independent, we can use the multiplication rule for independent events:
P(A and B) = P(A) × P(B)
However, when our events are not independent, this earlier multiplication rule does not work. Thus, we need the General Multiplication Rule.
We encountered the general multiplication rule in the form of conditional probability.
Rearranging the equation in the definition for conditional probability, we get the General Multiplication Rule:
For any two events A and B,
P(A and B) = P(A) × P(B|A) or P(A and B) = P(B) × P(A|B)
Thought Question
You test positive for a rare disease. Your original chances of having the disease are 1 in 1000.
The test has a 10% false positive rate and a 10% false negative rate, so whether you have the disease or not, the test is 90% likely to give a correct answer.
Given that you tested positive, what do you think is the probability that you actually have the disease?
A. 90% B. 80% C. 45% D. 1%
Tree Diagrams A tree diagram helps us think through conditional probabilities by showing sequences of events as paths that look like branches of a tree.
Making a tree diagram for situations with conditional probabilities is consistent with our "make a picture" mantra.
Example: Binge Drinking on Campus
Results of a national study:
44% of college students engage in binge drinking, 37% drink moderately, and 19% abstain entirely.
Another study finds that among binge drinkers aged 21 to 34, 17% have been involved in an alcohol-related accident, and among nonbingers only 9% have been involved in such accidents.
Tree Diagrams
The figure shows how we multiply the probabilities of the branches together:
What is the probability of being a binge drinker and having an accident?
What is the probability that a student had an alcohol-related accident?
Bayes's Rule - Reversing the Conditioning
P(B|A) = P(A|B) P(B) / [P(A|B) P(B) + P(A|B^C) P(B^C)]
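The tree-diagram questions and Bayes's Rule can be sketched with the binge-drinking figures, where B is "binge drinker" and A is "had an alcohol-related accident". One assumption needs flagging: the code applies the 9% accident rate to every student who is not a binge drinker (moderate drinkers and abstainers alike). The original figure is not reproduced in this text, so its branches may differ, for example if abstainers are treated separately. The function names are illustrative.

```python
def total_probability(prior, p_a_given_b, p_a_given_not_b):
    """P(A) = P(A|B) P(B) + P(A|B^C) P(B^C)."""
    return p_a_given_b * prior + p_a_given_not_b * (1 - prior)


def bayes(prior, p_a_given_b, p_a_given_not_b):
    """P(B|A) by Bayes's Rule."""
    return p_a_given_b * prior / total_probability(prior, p_a_given_b, p_a_given_not_b)


p_binge = 0.44               # P(B)
p_acc_given_binge = 0.17     # P(A|B)
p_acc_given_nonbinge = 0.09  # P(A|B^C): ASSUMED to apply to all 56% of non-binge students

# Multiply along one branch of the tree: binge drinker AND accident.
p_binge_and_acc = p_binge * p_acc_given_binge

# Add the two "accident" branches, then reverse the conditioning.
p_acc = total_probability(p_binge, p_acc_given_binge, p_acc_given_nonbinge)
p_binge_given_acc = bayes(p_binge, p_acc_given_binge, p_acc_given_nonbinge)

print(round(p_binge_and_acc, 4), round(p_acc, 4), round(p_binge_given_acc, 3))
```

Changing `p_acc_given_nonbinge`, or splitting the non-binge group into separate branches, would change the last two results, so those numbers hold only under the stated assumption.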
What is the probability that a student is a binge drinker (B) given that they had an accident (A)?
Medical Testing for a Rare Disease
To determine the probability of a positive test result being accurate, you need:
Sensitivity of the test: the proportion of people who correctly test positive when they actually have the disease
Specificity of the test: the proportion of people who correctly test negative when they don't have the disease
Base rate: the probability that you have the disease, without any knowledge of your test results.
Positive Predictive Rate (PPR) - probability of being diseased, given that someone tests positive.
Medical Testing for a Rare Disease
You test positive for a rare disease. Your original chances of having the disease are 1 in 1000, so the base rate is 0.001.
The test has a 10% false positive rate and a 10% false negative rate, so whether you have the disease or not, the test is 90% likely to give a correct answer.
The sensitivity and specificity of the test are both 90%: the proportion of people who correctly test positive or correctly test negative is 90%.
Tree diagram for 100,000 people:
100,000 people
  Have disease (P = 0.001): 100
    Test positive (P = 0.90): 90
    Test negative (P = 0.10): 10
  Do not have disease (P = 0.999): 99,900
    Test positive (P = 0.10): 9,990
    Test negative (P = 0.90): 89,910
90 + 9,990 = 10,080 test positive, but only 90 of those have the disease.
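So the probability of having the disease given that you test positive is 90/10,080 ≈ 0.009, or about 1%.
This is called the Positive Predictive Rate.
Bayes's Rule
Let A = test positive and B = have disease. Then
P(A|B) P(B) = (0.90)(0.001) = 0.0009 and P(A|B^C) P(B^C) = (0.10)(0.999) = 0.0999, so
P(B|A) = 0.0009 / (0.0009 + 0.0999) = 0.0009 / 0.1008 ≈ 0.009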
Notice that of the 10,080 who test positive, only 90 are diseased. So the probability of being diseased, given that someone tests positive, is only 90/10,080 = 0.009 or 0.9%.
This is called the positive predictive rate (PPR).
If base rate for disease is very low and test for disease is less than perfect, there will be a relatively high probability that a positive test result is a false positive.
Breakdown of Actual Status versus Test Status for a Rare Disease
                  Test shows positive  Test shows negative  Total
Actually sick     90                   10                   100
Actually healthy  9,990                89,910               99,900
Total             10,080               89,920               100,000
Example: Mammogram Test for Breast Cancer
The probability of a woman getting breast cancer in her 40s is 0.014 (from cancer.gov).
The mammogram test has a 5% false positive rate and a 10% false negative rate (conservative estimates).
The sensitivity of the test is 90% and specificity of the test is 95%.
Breakdown of Actual Status versus Test Status for Breast Cancer
                  Test shows positive  Test shows negative  Total
Actually sick     1,260                140                  1,400
Actually healthy  4,930                93,670               98,600
Total             6,190                93,810               100,000
Notice that of the 6,190 who test positive, only 1,260 have breast cancer. So the probability of having breast cancer, given that someone tests positive, is
1,260/6,190 ≈ 0.204, or about 20%
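Both medical-testing examples follow the same recipe: split a population by the base rate, apply sensitivity and specificity, and compare true positives with all positives. The sketch below wraps that recipe in two helper functions. The names `screening_table` and `positive_predictive_rate` and the default population of 100,000 are illustrative, not from the slides.

```python
def screening_table(base_rate, sensitivity, specificity, population=100_000):
    """Expected counts in the sick/healthy by positive/negative table."""
    sick = population * base_rate
    healthy = population - sick
    return {
        "true_pos": sick * sensitivity,
        "false_neg": sick * (1 - sensitivity),
        "false_pos": healthy * (1 - specificity),
        "true_neg": healthy * specificity,
    }


def positive_predictive_rate(base_rate, sensitivity, specificity):
    """P(disease | positive test) = true positives / all positives."""
    t = screening_table(base_rate, sensitivity, specificity)
    return t["true_pos"] / (t["true_pos"] + t["false_pos"])


# Rare disease: base rate 0.001, sensitivity and specificity both 90%.
rare = positive_predictive_rate(0.001, 0.90, 0.90)
# Mammogram: base rate 0.014, sensitivity 90%, specificity 95%.
mammogram = positive_predictive_rate(0.014, 0.90, 0.95)

print(round(rare, 4), round(mammogram, 4))
```

Keeping the counts table as its own function mirrors the breakdown tables in the examples, and varying `base_rate` makes it easy to see how strongly the result depends on how rare the condition is.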
Example: The Case of Sally Clark and SIDS (Sudden Infant Death Syndrome)
Sally Clark's prosecution was controversial due to statistical evidence presented by pediatrician Professor Sir Roy Meadow, who testified that the chance of two children from an affluent family suffering sudden infant death syndrome was 1 in 73 million.
He relied on a study which showed that for non-smoking professional families, the probability of cot death is 1/8500.
1/8500 x 1/8500 = 1/73 million
He assumed independence of the two events. Do you think this is a fair assumption to make?
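To see why the independence assumption matters so much here, the calculation can be redone with the General Multiplication Rule, P(both) = P(first) × P(second | first). In the sketch below, the conditional probability of 1/100 is a purely hypothetical number chosen to show the mechanics. It is not taken from any study or from the case.

```python
from fractions import Fraction

p_first = Fraction(1, 8500)  # figure quoted in the testimony

# Independence assumption: P(second | first) = P(second) = 1/8500.
p_both_independent = p_first * p_first

# General Multiplication Rule: P(both) = P(first) * P(second | first).
# HYPOTHETICAL value, used only to show the effect of dependence; not from any study.
p_second_given_first = Fraction(1, 100)
p_both_dependent = p_first * p_second_given_first

print(p_both_independent)
print(p_both_dependent)
print(p_both_dependent / p_both_independent)  # how many times larger
```

The point of the sketch is only that if the two deaths share causes, P(second | first) can be much larger than 1/8500, and the product is then much larger than the independence calculation suggests.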
The Royal Statistical Society later issued a public statement expressing its concern at the "misuse of statistics in the courts" and arguing that there was "no statistical basis" for Meadow's claim.