
STATISTICAL LABORATORY, April 5th, 2011 UNIVARIATE AND BIVARIATE PROBABILITY DISTRIBUTIONS

Manuela Cattelan

POISSON DISTRIBUTION

Ex1 Suppose that in a city the number of suicides can be approximated by a Poisson process with rate $\lambda = 0.33$ per month. 1) Find the probability of k suicides in a year for k = 0, 1, .... What is the most probable number of suicides? 2) What is the probability of two suicides in a week? 3) A suicide is reported in today's newspaper. What is the probability that the waiting time for the next suicide is greater than 1 month? 4 months? What is the median of the waiting time? (Rice, 2.32)

1. The number X of suicides in a year has a Poisson distribution with parameter $12\lambda = 3.96$ and probability function $P(X = k) = e^{-3.96}\, 3.96^{k} / k!$, k = 0, 1, .... We use R to derive the first few values of the probability function.

> tab <- data.frame(0:10, round(dpois(0:10, lambda = 12 * 0.33),
+     3), round(ppois(0:10, lambda = 12 * 0.33), 3))
> names(tab) <- c("Values", "Probs", "Cum. Probs")
> tab
   Values Probs Cum. Probs
1       0 0.019      0.019
2       1 0.075      0.095
3       2 0.149      0.244
4       3 0.197      0.441
5       4 0.195      0.637
6       5 0.155      0.791
7       6 0.102      0.893
8       7 0.058      0.951
9       8 0.029      0.980
10      9 0.013      0.992
11     10 0.005      0.997

Based on material prepared by Prof. Mario Romanazzi

From the previous table, the mode is 3 (the most probable number of suicides in a year is 3) and the median is 4.

2. Approximating a month by 4 weeks, the weekly rate is $\lambda = 0.33/4 = 0.0825$. The probability of two suicides in a week is therefore $e^{-0.0825}\, 0.0825^{2} / 2! \approx 0.0031$, a very low value.

3. The waiting time T (in months) between two suicides has an exponential distribution with rate parameter $\lambda = 0.33$, meaning that the average time between two suicides is $1/\lambda \approx 3.030$ months. We use R to answer the questions.

> 1 - pexp(1, rate = 0.33)
[1] 0.7189237
> 1 - pexp(4, rate = 0.33)
[1] 0.2671353
> qexp(0.5, rate = 0.33)
[1] 2.100446

Ex2 Three percent of the light bulbs produced by a factory are faulty. Find the probability that exactly 2 bulbs in a random sample of 100 are faulty.

The number of faulty bulbs is $X \sim \mathrm{Bin}(100, 0.03)$, hence $P(X = 2) = \binom{100}{2}\, 0.03^{2}\, 0.97^{98} \approx 0.2252$. This probability can be computed in R through the command

> dbinom(2, 100, 0.03)
[1] 0.2251530

Since the sample size is quite large and the probability of success is small, the Poisson approximation can be used: X is approximately Poi(3), where the parameter of the distribution is $\lambda = np = 100 \times 0.03 = 3$. The probability can be approximated as

> dpois(2, 3)
[1] 0.2240418

Ex3 The probability that a person is allergic to a certain drug is 0.001. Find the probability that among 2000 people: 1) fewer than 2 are allergic; 2) more than 2 are allergic; 3) exactly 3 are allergic.

1. The number of people allergic to the drug follows a binomial distribution, $X \sim \mathrm{Bin}(2000, 0.001)$. The probability that fewer than two people are allergic is

> pbinom(1, 2000, 0.001)
[1] 0.4058704

which can also be approximated using a Poisson distribution with parameter $2000 \times 0.001 = 2$

> ppois(1, 2)
[1] 0.4060058

2. The probability that more than two people are allergic is

> 1 - pbinom(2, 2000, 0.001)
[1] 0.3233236

or, using the Poisson approximation,

> 1 - ppois(2, 2)
[1] 0.3233236

3. Finally, the probability that exactly 3 people are allergic is

> dbinom(3, 2000, 0.001)
[1] 0.1805373

Again, using the Poisson approximation, the probability is

> dpois(3, 2)
[1] 0.1804470
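The quality of the Poisson approximation in Ex2 and Ex3 can be summarised in one table. The following is a minimal sketch, not part of the original handout; the data frame name `cases` and its column names are chosen here purely for illustration.

```r
# Compare exact binomial probabilities with their Poisson approximations
# (lambda = n * p) for the quantities computed in Ex2 and Ex3.
cases <- data.frame(
  label = c("Ex2: P(X = 2)", "Ex3: P(X <= 1)", "Ex3: P(X > 2)", "Ex3: P(X = 3)"),
  binomial = c(dbinom(2, 100, 0.03),
               pbinom(1, 2000, 0.001),
               1 - pbinom(2, 2000, 0.001),
               dbinom(3, 2000, 0.001)),
  poisson = c(dpois(2, 100 * 0.03),
              ppois(1, 2000 * 0.001),
              1 - ppois(2, 2000 * 0.001),
              dpois(3, 2000 * 0.001))
)
# Absolute error of the approximation in each case.
cases$abs_error <- abs(cases$binomial - cases$poisson)
print(cases, digits = 4)
```

With the values reported above, the largest discrepancy is in Ex2 (about 0.0011), where n is smaller and p is larger than in Ex3.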

GENERAL BIVARIATE DISCRETE DISTRIBUTIONS

Ex1 Consider the bivariate distribution of the random variables X and Y given by

          Y
X         10      11      12
1         1/12    2/12    1/12
2         2/12    4/12    2/12

Compute: 1) the marginal frequency function of X, 2) the marginal frequency function of Y, 3) the conditional frequency function of X given Y = 11.

1. The marginal frequency function of X is

x           1      2
P(X = x)    1/3    2/3

2. The marginal frequency function of Y is

y           10     11     12
P(Y = y)    1/4    2/4    1/4

3. The conditional frequency function of X given Y = 11 is

x                    1      2
P(X = x | Y = 11)    1/3    2/3
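The three answers above can be reproduced from the joint table with a few matrix operations. This is an illustrative sketch; the object names (`joint`, `marg_x`, `marg_y`, `cond_x_given_y11`) are assumptions introduced here, not part of the handout.

```r
# Joint probability table of (X, Y): rows are X = 1, 2; columns are Y = 10, 11, 12.
joint <- matrix(c(1, 2, 1,
                  2, 4, 2) / 12,
                nrow = 2, byrow = TRUE,
                dimnames = list(X = c("1", "2"), Y = c("10", "11", "12")))

marg_x <- rowSums(joint)   # marginal frequency function of X
marg_y <- colSums(joint)   # marginal frequency function of Y

# Conditional frequency function of X given Y = 11:
# the Y = 11 column of the joint table divided by P(Y = 11).
cond_x_given_y11 <- joint[, "11"] / marg_y["11"]

marg_x
marg_y
cond_x_given_y11
```

Here the conditional distribution of X given Y = 11 coincides with the marginal of X because every cell of the table equals the product of its row and column totals, that is, X and Y are independent in this example.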

MULTINOMIAL DISTRIBUTION

Ex1 Three players play 10 independent rounds of a game, and each player has probability 1/3 of winning each round. 1) Find the joint distribution of the numbers of games won by each of the three players. 2) What are the probabilities of the following events: $X_1 = X_2 = 5$, $X_1 = X_2 = 3$? (Rice, 3.3)

Solution. 1. Denote by $X_i$, i = 1, 2, 3, the number of games won by the i-th player. The joint distribution of $(X_1, X_2, X_3)$ is multinomial with parameters n = 10 (number of independent trials) and success probabilities $p_1 = p_2 = p_3 = 1/3$. The probability function is

$$P(X_1 = x_1, X_2 = x_2, X_3 = x_3) = \frac{10!}{x_1!\, x_2!\, (10 - x_1 - x_2)!}\, (1/3)^{x_1} (1/3)^{x_2} (1/3)^{10 - x_1 - x_2},$$

where $x_3 = 10 - x_1 - x_2$ and the $x_i$ satisfy the constraints $0 \le x_i \le 10$, i = 1, 2, with $x_1 + x_2 \le 10$.

2. We use R to answer the questions.

> dmultinom(x = c(5, 5, 0), size = 10, prob = rep(1/3, 3))
[1] 0.004267642
> dmultinom(x = c(3, 3, 4), size = 10, prob = rep(1/3, 3))
[1] 0.07112737
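As a cross-check, the probability function from part 1 can be coded directly and compared with `dmultinom`. The helper `pmf_formula` below is a hypothetical name introduced for this sketch, and it relies on the three winning probabilities being equal.

```r
# Multinomial probability function written out from the formula in part 1.
# Because p1 = p2 = p3 = p, the product of the three powers reduces to p^n.
pmf_formula <- function(x1, x2, n = 10, p = 1/3) {
  x3 <- n - x1 - x2
  factorial(n) / (factorial(x1) * factorial(x2) * factorial(x3)) * p^n
}

p55 <- pmf_formula(5, 5)   # event X1 = X2 = 5 (so X3 = 0)
p33 <- pmf_formula(3, 3)   # event X1 = X2 = 3 (so X3 = 4)
c(p55 = p55, p33 = p33)

# The probabilities over all admissible pairs (x1, x2) should sum to 1.
grid <- expand.grid(x1 = 0:10, x2 = 0:10)
grid <- grid[grid$x1 + grid$x2 <= 10, ]
total <- sum(pmf_formula(grid$x1, grid$x2))
total
```

The two probabilities should agree with the `dmultinom` values reported above, and the sum over the 66 admissible pairs should equal 1 up to rounding error.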

Ex2 Three cards are drawn at random and with replacement from a box containing 20 cards, each card bearing the name of a different Italian region. Recall that there are 8 northern regions (N), 4 central regions (C) and 8 southern regions (S). Let $(X_N, X_C, X_S)$ denote the numbers of drawn regions belonging to the three areas. 1) What is the probability of no southern regions? Of one region from each area? 2) Describe the probability distribution of $X_C \mid X_N = 1$.

Solution. 1. $P(X_S = 0) = 0.6^{3} = 0.216$ and $P(X_N = X_C = X_S = 1) = 3! \times 0.4 \times 0.2 \times 0.4 = 0.192$.

2. This is a binomial distribution: $X_C \mid X_N = 1 \sim \mathrm{Bin}(2, 1/3)$, because each of the two remaining draws is central with probability $0.2/(0.2 + 0.4) = 1/3$. Its possible values are 0, 1, 2, with probabilities 4/9, 4/9, 1/9.
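Both parts of this exercise can be checked numerically. The sketch below is an illustration under the stated setup (20 cards, 8 N, 4 C, 8 S, three draws with replacement); the variable names are assumptions made here.

```r
# Probabilities of drawing a northern, central, or southern region.
p <- c(N = 8, C = 4, S = 8) / 20

# Part 1: no southern region in 3 draws, and one region from each area.
p_no_south <- dbinom(0, size = 3, prob = p[["S"]])
p_one_each <- dmultinom(c(1, 1, 1), size = 3, prob = p)

# Part 2: conditional distribution of X_C given X_N = 1, computed from the
# definition P(X_N = 1, X_C = k) / P(X_N = 1) for k = 0, 1, 2.
joint_k <- sapply(0:2, function(k) dmultinom(c(1, k, 2 - k), size = 3, prob = p))
cond <- joint_k / dbinom(1, size = 3, prob = p[["N"]])

p_no_south
p_one_each
cond
```

The vector `cond` can be compared with `dbinom(0:2, 2, 1/3)`, which is the Bin(2, 1/3) distribution stated in the solution.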

GENERAL BIVARIATE CONTINUOUS DISTRIBUTIONS


Ex1 A bivariate density function is defined as follows:

$$f_{X,Y}(x, y) = \begin{cases} 4x(1 - y), & 0 \le x \le 1,\ 0 \le y \le 1, \\ 0, & \text{elsewhere.} \end{cases}$$

1) What are the marginal distributions? Are they uniform? 2) Are X and Y stochastically independent? 3) Compute the joint cdf values at the points $(2, 1/2)$, $(-1/2, 1/2)$, $(1/2, 1/2)$.

Solution. 1. Marginal densities are obtained by integrating out the other variable.

$$f_X(x) = \int_0^1 f_{X,Y}(x, y)\,dy = 4x \int_0^1 (1 - y)\,dy = 2x, \quad 0 \le x \le 1, \text{ and 0 elsewhere,}$$

$$f_Y(y) = \int_0^1 f_{X,Y}(x, y)\,dx = 4(1 - y) \int_0^1 x\,dx = 2(1 - y), \quad 0 \le y \le 1, \text{ and 0 elsewhere.}$$

The marginal distributions are not uniform, because neither density is constant.
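The two marginal densities can be verified numerically with `integrate`. This is a rough check rather than a derivation; the function names and the evaluation points `x0` and `y0` are arbitrary choices made for the sketch.

```r
# Joint density of Ex1: 4x(1 - y) on the unit square, 0 elsewhere.
f_xy <- function(x, y) {
  ifelse(x >= 0 & x <= 1 & y >= 0 & y <= 1, 4 * x * (1 - y), 0)
}

# Marginal of X at x0: integrate the joint density over y.
f_x_num <- function(x0) integrate(function(y) f_xy(x0, y), lower = 0, upper = 1)$value
# Marginal of Y at y0: integrate the joint density over x.
f_y_num <- function(y0) integrate(function(x) f_xy(x, y0), lower = 0, upper = 1)$value

x0 <- 0.3
y0 <- 0.7
c(numeric = f_x_num(x0), closed_form = 2 * x0)
c(numeric = f_y_num(y0), closed_form = 2 * (1 - y0))
```

In each pair the numerical value should match the closed form 2x or 2(1 - y) up to the integration tolerance.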


2. X and Y are stochastically independent because the joint density is identically equal to the product of the marginal densities: for all pairs of real numbers (x, y), $f_{X,Y}(x, y) = f_X(x) f_Y(y)$.

3. Note that stochastic independence implies $F_{X,Y}(x, y) = F_X(x) F_Y(y)$, where $F_X$ and $F_Y$ are the marginal cdfs. Therefore

$$F_{X,Y}(2, 1/2) = F_X(2) F_Y(1/2) = F_Y(1/2) = 2 \int_0^{1/2} (1 - y)\,dy = 3/4,$$

$$F_{X,Y}(-1/2, 1/2) = F_X(-1/2) F_Y(1/2) = 0,$$

$$F_{X,Y}(1/2, 1/2) = F_X(1/2) F_Y(1/2) = (3/4) F_X(1/2) = (3/2) \int_0^{1/2} x\,dx = 3/16.$$
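These cdf values can be reproduced in R from the marginal cdfs, with a direct double integration as a cross-check. The sketch assumes that the point giving probability 0 has a negative first coordinate; any point with a negative coordinate yields the same result. All function names are introduced here for illustration.

```r
# Marginal cdfs implied by f_X(x) = 2x and f_Y(y) = 2(1 - y) on [0, 1].
F_x <- function(x) pmin(pmax(x, 0), 1)^2
F_y <- function(y) {
  y <- pmin(pmax(y, 0), 1)
  2 * y - y^2
}

# Under independence the joint cdf is the product of the marginal cdfs.
F_xy <- function(x, y) F_x(x) * F_y(y)

F_xy(2, 1/2)      # 3/4
F_xy(-1/2, 1/2)   # 0 (assumed location of the minus sign)
F_xy(1/2, 1/2)    # 3/16

# Cross-check of the last value by integrating the joint density directly.
inner <- function(x) sapply(x, function(xi) {
  integrate(function(y) 4 * xi * (1 - y), lower = 0, upper = 1/2)$value
})
check <- integrate(inner, lower = 0, upper = 1/2)$value
check
```

The nested call works because `integrate` evaluates its integrand on a vector of points, so the inner integral is wrapped in `sapply`.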

Ex2 A bivariate density function is defined as follows:

$$f_{X,Y}(x, y) = \begin{cases} x + y, & 0 \le x \le 1,\ 0 \le y \le 1, \\ 0, & \text{elsewhere.} \end{cases}$$

1) Describe the contours of the bivariate density. 2) Derive the marginal distributions. Are X and Y stochastically independent? 3) Compute the joint cdf value at the point (1/2, 1/2). 4) Obtain the conditional densities of $Y \mid X = x$ and $X \mid Y = y$.

Solution. 1. Observe that on the unit square the joint pdf varies between 0 (at the point (0, 0)) and 2 (at the point (1, 1)). The contours are the subsets of the unit square Q on which the density takes a constant value $0 \le c \le 2$, that is, $\{(x, y) \in Q : x + y = c\}$. Therefore the contours are parallel segments; more precisely, they are the intersections of the parallel lines x + y = c with Q. The figure below shows the plots of the contours and of the bivariate density. The corresponding R code is

> f <- function(x, y) x + y
> x <- seq(0, 1, length = 50)
> y <- seq(0, 1, length = 50)
> z <- outer(x, y, f)
> contour(x, y, z, col = "black", lty = "solid", asp = 1,
+     lwd = 2, xlab = "X", ylab = "Y", main = "Contours of f(x,y) = x+y")
> persp(x, y, z, theta = 30, phi = 30, expand = 0.5, col = "lightblue",
+     ltheta = 120, shade = 0.75, ticktype = "detailed", xlab = "X",
+     ylab = "Y", zlab = "Density", main = "Plot of f(x,y) = x+y")

[Figure: left panel, "Contours of f(x,y) = x+y" (axes X and Y); right panel, "Plot of f(x,y) = x+y" (axes X, Y and Density).]

2. Marginal pdfs are

$$f_X(x) = \int_0^1 f_{X,Y}(x, y)\,dy = \int_0^1 (x + y)\,dy = x + 1/2, \quad 0 \le x \le 1, \text{ and 0 elsewhere,}$$

$$f_Y(y) = \int_0^1 f_{X,Y}(x, y)\,dx = \int_0^1 (x + y)\,dx = y + 1/2, \quad 0 \le y \le 1, \text{ and 0 elsewhere.}$$

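Here the independence test fails, because clearly $f_{X,Y}(x, y) = x + y \ne (x + 1/2)(y + 1/2) = f_X(x) f_Y(y)$.

3. The joint cdf is

$$F_{X,Y}(x, y) = \begin{cases} 0, & x < 0 \text{ or } y < 0, \\ xy(x + y)/2, & 0 \le x \le 1,\ 0 \le y \le 1, \\ x(x + 1)/2, & 0 \le x \le 1,\ y > 1, \\ y(y + 1)/2, & 0 \le y \le 1,\ x > 1, \\ 1, & x > 1,\ y > 1. \end{cases}$$

Hence $F_{X,Y}(1/2, 1/2) = 1/8$.

4. We use the definition of conditional density. For any fixed $0 \le x_0 \le 1$,

$$f_{Y \mid X = x_0}(y) = \frac{f_{X,Y}(x_0, y)}{f_X(x_0)} = \frac{x_0 + y}{x_0 + 1/2}, \quad 0 \le y \le 1, \text{ and 0 elsewhere.}$$

Similarly, for any fixed $0 \le y_0 \le 1$,

$$f_{X \mid Y = y_0}(x) = \frac{f_{X,Y}(x, y_0)}{f_Y(y_0)} = \frac{x + y_0}{y_0 + 1/2}, \quad 0 \le x \le 1, \text{ and 0 elsewhere.}$$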
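The cdf value and the conditional density can be checked numerically. The following is a sketch with names chosen for illustration; the conditioning value 0.25 is an arbitrary choice.

```r
# Joint density of Ex2 on the unit square.
f_xy <- function(x, y) x + y

# F(1/2, 1/2) by nested numerical integration; the closed form gives 1/8.
inner <- function(x) sapply(x, function(xi) {
  integrate(function(y) f_xy(xi, y), lower = 0, upper = 1/2)$value
})
F_half <- integrate(inner, lower = 0, upper = 1/2)$value

# Conditional density of Y given X = x0; it should integrate to 1 over [0, 1].
f_y_given_x <- function(y, x0) (x0 + y) / (x0 + 1/2)
area <- integrate(f_y_given_x, lower = 0, upper = 1, x0 = 0.25)$value

c(F_half = F_half, area = area)
```

The extra argument `x0` is passed through `integrate` to the integrand, so the same function can be reused for other conditioning values in [0, 1].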