
CHAPTER 4: PROBABILITY AND PROBABILITY DISTRIBUTIONS

$E$, $F$: any two events ($\bar{E}$ is the complement of $E$)

Additive law

\[ P(E \text{ or } F) = P(E) + P(F) - P(E \text{ and } F) \]

Multiplicative law

\[ P(E \text{ and } F) = P(E \mid F)\,P(F) = P(F \mid E)\,P(E) \]

Independence

$E$ and $F$ are independent if $P(E \mid F) = P(E)$, or equivalently $P(F \mid E) = P(F)$.

Bayes' formula

\[ P(E \mid F) = \frac{P(F \mid E)\,P(E)}{P(F)} \]

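This is Bayes' formula. Often the denominator probability must be obtained as a sum of joint probabilities; for instance, the event $F$ can be written as the union of ($F$ and $E$) and ($F$ and $\bar{E}$). So

\[ P(F) = P(F \text{ and } E) + P(F \text{ and } \bar{E}) = P(F \mid E)\,P(E) + P(F \mid \bar{E})\,P(\bar{E}) \]

and

\[ P(E \mid F) = \frac{P(F \mid E)\,P(E)}{P(F \mid E)\,P(E) + P(F \mid \bar{E})\,P(\bar{E})}. \]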

Bayes' formula for more than two events $E_1, E_2, \ldots, E_k$ that partition the sample space:

\[ P(E_1 \mid F) = \frac{P(F \mid E_1)\,P(E_1)}{P(F \mid E_1)\,P(E_1) + P(F \mid E_2)\,P(E_2) + \cdots + P(F \mid E_k)\,P(E_k)} \]
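The partition form of Bayes' formula can be sketched in a few lines of Python. This is only an illustration: the function name `bayes_posterior` and the numbers in the example are hypothetical, not taken from the notes. Exact fractions are used so the arithmetic can be checked by hand.

```python
from fractions import Fraction


def bayes_posterior(priors, likelihoods):
    """Return [P(E_1|F), ..., P(E_k|F)] for a partition E_1, ..., E_k.

    priors[i]      = P(E_i), which must add to 1
    likelihoods[i] = P(F | E_i)
    """
    if abs(sum(priors) - 1) > 1e-9:
        raise ValueError("the priors of a partition must add to 1")
    # Joint probabilities P(F and E_i) = P(F | E_i) P(E_i)
    joint = [lik * pri for lik, pri in zip(likelihoods, priors)]
    # Denominator: P(F) as the sum of the joint probabilities
    p_f = sum(joint)
    return [j / p_f for j in joint]


# Hypothetical numbers for a partition with k = 3 events
priors = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 6)]
likelihoods = [Fraction(1, 10), Fraction(1, 5), Fraction(1, 2)]
posterior = bayes_posterior(priors, likelihoods)
print(posterior)  # [Fraction(1, 4), Fraction(1, 3), Fraction(5, 12)]
```

The posterior probabilities add to 1 because the denominator is the same sum $P(F)$ for every $E_i$.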

ex. A drawer has 2 white socks and 2 blue socks. A professor reaches in and draws out 2 socks in succession, without replacement. $W_1$ = white on the first draw, $W_2$ = white on the second draw, $B_1$ = blue ... etc. A tree diagram of the events, and the conditional probabilities, is:

    $W_1$, with $P(W_1) = 1/2$
        $W_2$, with $P(W_2 \mid W_1) = 1/3$
        $B_2$, with $P(B_2 \mid W_1) = 2/3$
    $B_1$, with $P(B_1) = 1/2$
        $W_2$, with $P(W_2 \mid B_1) = 2/3$
        $B_2$, with $P(B_2 \mid B_1) = 1/3$

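Probability of two white socks:

\[ P(W_1 \text{ and } W_2) = P(W_2 \mid W_1)\,P(W_1) = \frac{1}{3} \cdot \frac{1}{2} = \frac{1}{6} \]

Probability of two socks the same color:

\[ P(W_1 \text{ and } W_2) + P(B_1 \text{ and } B_2) = \frac{1}{6} + \frac{1}{6} = \frac{1}{3} \]

Probability that the second sock is blue:

\[ P(B_2) = P(B_2 \mid W_1)\,P(W_1) + P(B_2 \mid B_1)\,P(B_1) = \frac{2}{3} \cdot \frac{1}{2} + \frac{1}{3} \cdot \frac{1}{2} = \frac{1}{2} \]

Probability that the first sock is white, given the second sock is blue (Bayes' rule):

\[ P(W_1 \mid B_2) = \frac{P(B_2 \mid W_1)\,P(W_1)}{P(B_2)} = \frac{P(B_2 \mid W_1)\,P(W_1)}{P(B_2 \mid W_1)\,P(W_1) + P(B_2 \mid B_1)\,P(B_1)} = \frac{\frac{2}{3} \cdot \frac{1}{2}}{\frac{2}{3} \cdot \frac{1}{2} + \frac{1}{3} \cdot \frac{1}{2}} = \frac{2}{3} \]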
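The sock probabilities can be checked by brute force. The sketch below is not part of the original notes: it lists all 12 equally likely ordered draws of 2 socks from the drawer and counts the favorable ones. The helper name `prob` is made up for this illustration.

```python
from fractions import Fraction
from itertools import permutations

socks = ["W", "W", "B", "B"]  # 2 white socks and 2 blue socks

# Every ordered pair of distinct socks (by position in the drawer);
# drawing without replacement makes these 12 pairs equally likely.
draws = list(permutations(range(len(socks)), 2))


def prob(event):
    """Probability of an event given as a function of (first, second) color."""
    favorable = sum(1 for i, j in draws if event(socks[i], socks[j]))
    return Fraction(favorable, len(draws))


p_two_white = prob(lambda first, second: first == "W" and second == "W")
p_same_color = prob(lambda first, second: first == second)
p_second_blue = prob(lambda first, second: second == "B")
p_w1_and_b2 = prob(lambda first, second: first == "W" and second == "B")

# Conditional probability P(W1 | B2) = P(W1 and B2) / P(B2)
p_w1_given_b2 = p_w1_and_b2 / p_second_blue

print(p_two_white, p_same_color, p_second_blue, p_w1_given_b2)
# 1/6 1/3 1/2 2/3
```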

ex. patient with a suspected disease

true condition
    $D$: event that the disease is present. $\bar{D}$: disease absent.
    $P(D)$: background prevalence of the disease in the population.

diagnostic test outcomes
    Pos: positive for the disease. Neg: negative for the disease.
    $P(\text{Pos} \mid D)$: sensitivity
    $P(\text{Neg} \mid \bar{D})$: specificity
    (empirically determined)

typical diagnostic tests are calibrated to have sensitivity considerably larger than specificity (so that the corresponding error probabilities are commensurate with the seriousness of the errors)

                               true disease status
    diagnostic test outcome    $D$                             $\bar{D}$
    Pos                        $P(\text{Pos and } D)$          $P(\text{Pos and } \bar{D})$
    Neg                        $P(\text{Neg and } D)$          $P(\text{Neg and } \bar{D})$

\[ P(D \mid \text{Pos}) = \frac{P(\text{Pos} \mid D)\,P(D)}{P(\text{Pos} \mid D)\,P(D) + P(\text{Pos} \mid \bar{D})\,P(\bar{D})} \]

\[ P(\bar{D} \mid \text{Neg}) = \frac{P(\text{Neg} \mid \bar{D})\,P(\bar{D})}{P(\text{Neg} \mid \bar{D})\,P(\bar{D}) + P(\text{Neg} \mid D)\,P(D)} \]
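A numerical sketch may make the diagnostic-test calculation more concrete. The function name `predictive_values` and the prevalence, sensitivity, and specificity used below are hypothetical values chosen for illustration; they do not come from the notes or from any real test.

```python
from fractions import Fraction


def predictive_values(prevalence, sensitivity, specificity):
    """Return (P(D | Pos), P(not D | Neg)) by Bayes' formula.

    prevalence  = P(D)
    sensitivity = P(Pos | D)
    specificity = P(Neg | not D)
    """
    # The four joint probabilities in the 2 x 2 table
    p_pos_and_d = sensitivity * prevalence
    p_pos_and_not_d = (1 - specificity) * (1 - prevalence)
    p_neg_and_d = (1 - sensitivity) * prevalence
    p_neg_and_not_d = specificity * (1 - prevalence)

    p_d_given_pos = p_pos_and_d / (p_pos_and_d + p_pos_and_not_d)
    p_not_d_given_neg = p_neg_and_not_d / (p_neg_and_not_d + p_neg_and_d)
    return p_d_given_pos, p_not_d_given_neg


# Hypothetical test: prevalence 1%, sensitivity 99%, specificity 90%
p_d_given_pos, p_not_d_given_neg = predictive_values(
    Fraction(1, 100), Fraction(99, 100), Fraction(9, 10)
)
print(p_d_given_pos, p_not_d_given_neg)  # 1/11 8910/8911
```

With these assumed numbers only about 1 positive result in 11 corresponds to a patient who has the disease, because the disease is rare and false positives from the large disease-free group outnumber the true positives.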

Discrete probability distributions


Random variable: a numerical outcome of a random experiment (usually denoted with an upper case letter, e.g. $Y$)

ex.
    # of blue socks
    SAT of a randomly drawn student
    # democrats in a random sample of voters
    1 day's growth (dry weight) of a plant

Probability distribution: collection of all possible outcomes of a random variable, and their associated probabilities (a particular outcome usually denoted with a lower case letter, e.g., $y$)

Discrete probability distribution: random variable has a finite or countably infinite number of states

1. Binomial distribution

$Y$, a random variable, is the number of successes in $n$ independent identical trials, in which each trial can be a success or a failure (possible outcomes are $y = 0, 1, 2, 3, \ldots, n$). For each trial, the probability of success is $\pi$, $0 < \pi < 1$ (not the pi from a circle). The binomial distribution has probabilities given by

\[ P(Y = y) = P(y) = \frac{n!}{y!\,(n-y)!}\, \pi^{y} (1-\pi)^{n-y} \]

for $y = 0, 1, 2, 3, \ldots, n$. These probabilities add to 1: $P(0) + P(1) + P(2) + \cdots + P(n) = 1$.

The expected value or mean of the binomial random variable $Y$:

\[ E(Y) = \mu = 0 \cdot P(0) + 1 \cdot P(1) + \cdots + n \cdot P(n) \overset{\text{ALA}}{=} n\pi \]

Variance of the random variable $Y$:

\[ V(Y) = \sigma^2 = (0-\mu)^2 P(0) + (1-\mu)^2 P(1) + \cdots + (n-\mu)^2 P(n) \overset{\text{ALA}}{=} n\pi(1-\pi) \]

Standard deviation of $Y$:

\[ SD(Y) = \sqrt{V(Y)} \overset{\text{ALA}}{=} \sqrt{n\pi(1-\pi)} \]

Common notation:

\[ Y \sim \text{binomial}(n, \pi) \]

"$Y$ has a binomial distribution with parameters $n$ and $\pi$"

SAS: The function RANBIN(seed, n, p) returns a binomial random variable with number of trials n and success probability p (set seed = 0 and SAS will use the computer clock time as the seed). The function PROBBNML(p, n, x) computes the probability that an observation from a binomial(n, p) distribution will be less than or equal to x.
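The binomial formulas can be checked numerically with a short Python sketch. The values of n and pi below are arbitrary illustrative choices, and the helper names `binomial_pmf` and `binomial_cdf` are made up for this example. The cumulative sum plays the same role as the less-than-or-equal probability the notes describe for PROBBNML, but it is not the SAS function.

```python
from math import comb, sqrt


def binomial_pmf(y, n, pi):
    """P(Y = y) for Y ~ binomial(n, pi)."""
    return comb(n, y) * pi**y * (1 - pi) ** (n - y)


def binomial_cdf(x, n, pi):
    """P(Y <= x), obtained by summing the probabilities P(0), ..., P(x)."""
    return sum(binomial_pmf(y, n, pi) for y in range(x + 1))


n, pi = 10, 0.3  # illustrative values only
probs = [binomial_pmf(y, n, pi) for y in range(n + 1)]

total = sum(probs)                                         # should be 1
mean = sum(y * p for y, p in enumerate(probs))             # should be n * pi
variance = sum((y - mean) ** 2 * p for y, p in enumerate(probs))
sd = sqrt(variance)

print(total, mean, variance, sd)
print(n * pi, n * pi * (1 - pi))  # closed forms for comparison
```

Up to floating-point rounding, the sums computed from the definitions agree with the closed forms $n\pi$ and $n\pi(1-\pi)$.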

exercise (concept of mean and variance of a discrete distribution): suppose $Y$ has a rectangular distribution given by

\[ P(Y = y) = P(y) = \frac{1}{6}, \quad y = 1, 2, 3, 4, 5, 6 \]

(distribution of the result of rolling a die).
(a) Draw a picture of the probability distribution.
(b) Calculate the expected value of $Y$.
(c) Calculate the variance of $Y$.

2. Poisson distribution

\[ P(Y = y) = P(y) = \frac{e^{-\mu} \mu^{y}}{y!}, \quad y = 0, 1, 2, 3, \ldots \]

Here $e = 2.71828\ldots$ and $\mu$ is a parameter ($\mu > 0$). This is a distribution with positive probability on all the nonnegative integers: $P(0) + P(1) + P(2) + \cdots + P(y) + \cdots = 1$.

Poisson distribution arises as a model of rare events:
    # radioactive decays in a unit of time
    # incoming cosmic rays in a unit of time
    # plant stems in a sample plot
    # steelhead caught in 1 hr
    # crimes reported in Moscow, ID in 1 day
    # car accidents reported in a stretch of US95 in 1 week

Note that zero is a possible outcome in the Poisson distribution.

Some other properties:
    $E(Y) = \mu$
    $V(Y) = \sigma^2 = \mu$ (variance equals the mean in the Poisson distribution)
    $P(Y = 0) = P(0) = e^{-\mu}$

Notation: $Y \sim \text{Poisson}(\mu)$.

True fact: suppose $Y \sim \text{binomial}(n, \pi)$, that is,

\[ P(Y = y) = P(y) = \frac{n!}{y!\,(n-y)!}\, \pi^{y} (1-\pi)^{n-y}, \quad y = 0, 1, 2, \ldots, n. \]

Suppose $n$ is large and $\pi$ is small. Then

\[ P(Y = y) \approx \frac{e^{-\mu} \mu^{y}}{y!}, \quad \text{where } \mu = n\pi \]

(Poisson approximation to the binomial)

SAS: RANPOI(seed, m) generates a Poisson random variable with mean m. POISSON(m, x) calculates $P(Y \le x)$, where $Y \sim \text{Poisson}(m)$.
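The Poisson approximation to the binomial can be seen numerically. In the sketch below, n = 1000 and pi = 0.002 are arbitrary illustrative choices of "large n, small pi", and the helper names are made up for this example.

```python
from math import comb, exp, factorial


def binomial_pmf(y, n, pi):
    """P(Y = y) for Y ~ binomial(n, pi)."""
    return comb(n, y) * pi**y * (1 - pi) ** (n - y)


def poisson_pmf(y, mu):
    """P(Y = y) for Y ~ Poisson(mu)."""
    return exp(-mu) * mu**y / factorial(y)


n, pi = 1000, 0.002  # illustrative: n large, pi small
mu = n * pi          # Poisson parameter matching the binomial mean

# Side-by-side probabilities for the first few outcomes
for y in range(6):
    print(y, round(binomial_pmf(y, n, pi), 5), round(poisson_pmf(y, mu), 5))

# Largest pointwise gap between the two distributions over y = 0, ..., 20
max_abs_diff = max(
    abs(binomial_pmf(y, n, pi) - poisson_pmf(y, mu)) for y in range(21)
)

# Mean and variance of the Poisson, from the definitions (sum truncated at 50)
pois_mean = sum(y * poisson_pmf(y, mu) for y in range(51))
pois_var = sum((y - pois_mean) ** 2 * poisson_pmf(y, mu) for y in range(51))
print(max_abs_diff, pois_mean, pois_var)
```

The truncated sums for the Poisson mean and variance both come out essentially equal to mu, matching the property that the variance equals the mean.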

3. Multinomial distribution (a multivariate distribution)

$k$ types in an urn; sample $n$ with replacement.

    $\pi_1$ = proportion of type 1 in the urn
    $\pi_2$ = proportion of type 2 in the urn
    ...
    $\pi_k$ = proportion of type $k$ in the urn

The $\pi_i$'s are constants (parameters); $\pi_1 + \pi_2 + \cdots + \pi_k = 1$.

$Y_1, Y_2, \ldots, Y_k$: random variables

    $Y_1$ = number of type 1 in the sample
    $Y_2$ = number of type 2 in the sample
    ...
    $Y_k$ = number of type $k$ in the sample

$Y_1 + Y_2 + \cdots + Y_k = n$. The $Y_i$'s are dependent: the value of one affects the others.

\[ P(Y_1 = y_1 \text{ and } Y_2 = y_2 \text{ and } \cdots \text{ and } Y_k = y_k) = \frac{n!}{y_1!\, y_2! \cdots y_k!}\, \pi_1^{y_1} \pi_2^{y_2} \cdots \pi_k^{y_k} \]

where $y_1, y_2, \ldots, y_k$ are any nonnegative integers that add to $n$ (all possible outcomes).

examples

$Y_1$ = # democrats, $Y_2$ = # republicans, $Y_3$ = # greens, $Y_4$ = # other, in a random sample of $n$ voters

$Y_1$ = # genotype AA BB, $Y_2$ = # AA Bb, $Y_3$ = # AA bb, $Y_4$ = # genotype Aa BB, ..., $Y_9$ = # genotype aa bb in a random sample of $n$ people

A population has $n$ mice. Catch mice by live-trapping on two sampling occasions; uniquely identify each one caught.
    $Y_1$ = # mice caught in the first sample as well as the second
    $Y_2$ = # mice caught in the first sample but not in the second
    $Y_3$ = # mice not caught in the first sample but caught in the second
    $Y_4$ = # mice not captured in either sample (unobserved)
    $n$: unknown
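The multinomial probability formula can be sketched as follows. The urn proportions and the sample size are hypothetical values picked so that the arithmetic is easy to verify by hand, and `multinomial_pmf` is an illustrative name rather than a library function.

```python
from fractions import Fraction
from math import factorial, prod


def multinomial_pmf(counts, proportions):
    """P(Y_1 = y_1 and ... and Y_k = y_k) for a sample of size n = sum(counts).

    counts      = (y_1, ..., y_k), nonnegative integers
    proportions = (pi_1, ..., pi_k), which add to 1
    """
    n = sum(counts)
    # Multinomial coefficient n! / (y_1! y_2! ... y_k!); each division is exact
    coefficient = factorial(n)
    for y in counts:
        coefficient //= factorial(y)
    return coefficient * prod(p**y for p, y in zip(proportions, counts))


# Hypothetical urn with k = 3 types and a sample of size n = 4
proportions = (Fraction(1, 2), Fraction(1, 3), Fraction(1, 6))
n = 4

p_211 = multinomial_pmf((2, 1, 1), proportions)
print(p_211)  # 1/6, from 12 * (1/2)^2 * (1/3) * (1/6)

# Summing over every outcome (y1, y2, y3) with y1 + y2 + y3 = n gives 1
total = sum(
    multinomial_pmf((y1, y2, n - y1 - y2), proportions)
    for y1 in range(n + 1)
    for y2 in range(n + 1 - y1)
)
print(total)  # 1
```

Only k - 1 of the counts are free in the double loop; the last one is fixed by the constraint that the counts add to n, which is the dependence among the $Y_i$'s noted above.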
