
Econ 221, Winter 2018

Old second midterm questions


These are a few second midterms from older classes, in one long list.
1. Suppose you roll a fair six-sided die 2 times. Let X denote the number of unique faces
showing from the rolls (for example, the rolls {r1 , r2 } = {2, 2} result in X = 1).
(a) (5 points) What is pX , the probability density function of X ?

Solution: For X = 1, you need to choose one face, and both rolls have to have
that value, so \( P\{X = 1\} = 6/6^2 = 1/6 \). For X = 2 you need to choose two distinct
faces in order, which means \( P\{X = 2\} = {}_6P_2/6^2 = 30/36 = 5/6 \). Therefore the probability density
function of X is

  k    pX(k)
  1    1/6
  2    5/6

(b) (5 points) Compute E [X ].

Solution: This follows from the definition of the expected value:

\[ E[X] = \sum_{k=1}^{2} k \cdot p_X(k) = 1 \cdot \frac{1}{6} + 2 \cdot \frac{5}{6} = \frac{11}{6}. \]

(c) (5 points) Compute Var (X ).

Solution: This also follows from the definition:

\[ \mathrm{Var}(X) = \sum_{k=1}^{2} (k - E[X])^2 \, p_X(k) = (1 - 11/6)^2 \cdot \frac{1}{6} + (2 - 11/6)^2 \cdot \frac{5}{6} = \frac{25 + 5}{6^3} = \frac{5}{36}. \]

Because in the last part you found E[X], you could also compute the variance
using the formula \( \mathrm{Var}(X) = E[X^2] - (E[X])^2 \), for which you need \( E[X^2] \):

\[ E[X^2] = \sum_{k=1}^{2} k^2 \, p_X(k) = 1^2 \cdot \frac{1}{6} + 2^2 \cdot \frac{5}{6} = \frac{21}{6}. \]

Then the variance is \( \mathrm{Var}(X) = \frac{21 \cdot 6 - 11^2}{36} = \frac{5}{36} \).
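
If you want to check these answers by brute force, the following is a minimal sketch that enumerates all 36 equally likely pairs of rolls. The variable names (`outcomes`, `pmf`, `mean`, `var`) are illustrative choices, not part of the original solution.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely outcomes of rolling a fair six-sided die twice.
outcomes = list(product(range(1, 7), repeat=2))

# X is the number of distinct faces showing; accumulate its probabilities.
pmf = {}
for rolls in outcomes:
    x = len(set(rolls))
    pmf[x] = pmf.get(x, Fraction(0)) + Fraction(1, len(outcomes))

# Expected value and variance straight from their definitions.
mean = sum(k * p for k, p in pmf.items())
var = sum((k - mean) ** 2 * p for k, p in pmf.items())

print(pmf, mean, var)
```

Using exact fractions avoids rounding, so the results can be compared directly with the table and with the two variance calculations above.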

2. Let X be a continuous random variable with density function


f X (x) = K x, x ∈ [0, θ ]
(and f X is zero otherwise) for some real numbers θ > 0 and K.
(a) (5 points) What is the value of K (as a function of θ )?

Solution: It is easy to verify that the density given is nonnegative when K > 0,
but the density also has to satisfy the condition that the total area under the curve
is 1. By computing the area under this curve, which is the same as the area inside
the triangle with vertices at (0, 0), (θ, 0) and (θ, Kθ), you can find that

\[ \int_0^\theta K x \, dx = \frac{K\theta^2}{2}, \]

and setting this area equal to 1 implies that \( K = 2/\theta^2 \). Here's a picture (I arbitrarily chose θ = 3).

[Figure: the density f_X for θ = 3, a triangle with length = θ, height = K × θ, and area = K × θ²/2 = 1.]

(b) (5 points) Find a formula for FX , the cumulative distribution function of X .

Solution: FX(x) gives you the probability that X ≤ x; therefore we need to find
the area under f_X from 0 to x, and that is the formula for FX. The area under
the density function between 0 and x is (analogously to the computations from
part a) \( x^2/\theta^2 \), or

\[ F_X(x) = \frac{x^2}{\theta^2}, \qquad x \in [0, \theta]. \]

Another picture shows how you can find the area for any x (which is the same as
how you do it using θ).

[Figure: the same density with the triangle between 0 and x marked: length = x, height = Kx, and area = FX(x) = K × x²/2.]

(c) (5 points) Let mX be the median of X , (recall, that is the point at which half the
distribution of X is below mX and half is above mX ). Compute mX as a function of θ .

Solution: The definition of the median given in the question means that m_X
satisfies the condition \( F_X(m_X) = 1/2 \). That means we can solve for m_X:

\[ F_X(m_X) = \frac{m_X^2}{\theta^2} = \frac{1}{2} \quad \text{implies that} \quad m_X = \sqrt{\frac{\theta^2}{2}} = \frac{\theta}{\sqrt{2}}. \]
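
A quick numerical check of parts (a) through (c) might look like the sketch below. It assumes θ = 3 (the same arbitrary value used in the picture), and the function names `density` and `cdf` are hypothetical helpers introduced only for this illustration.

```python
import math

theta = 3.0             # arbitrary illustrative value, as in the picture
K = 2.0 / theta ** 2    # normalizing constant from part (a)

def density(x):
    """f_X(x) = K x on [0, theta], zero elsewhere."""
    return K * x if 0.0 <= x <= theta else 0.0

def cdf(x):
    """F_X(x) = x^2 / theta^2 on [0, theta], from part (b)."""
    if x < 0.0:
        return 0.0
    if x > theta:
        return 1.0
    return x ** 2 / theta ** 2

# Midpoint Riemann sum for the total area under the density.
n = 100_000
width = theta / n
area = sum(density((i + 0.5) * width) for i in range(n)) * width

# Median from part (c): the point where the cdf equals 1/2.
median = theta / math.sqrt(2)

print(area, cdf(median))
```

The area should come out at 1 and the cdf at the median at 0.5, up to floating-point error.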

3. Suppose you are the manager of a factory where four complicated electrical components
are built each day. The probability that any one of these components is built wrong the
first time and needs to be worked on again is 0.05. When any components need more
work, you have to spend a fixed cost of $1000, plus $200 per component that needs work.
(a) (5 points) Compute the expected value and standard deviation of the daily number
of components that need more work.

Solution: Let X denote the number of components (in one day) that are badly
built. Then X is a binomial random variable: \( X \sim \mathrm{Bin}(4, 1/20) \). That means that
\( E[X] = 4/20 = 1/5 \) parts and \( \mathrm{Var}(X) = 4 \cdot \frac{1}{20} \cdot \frac{19}{20} = \frac{76}{400} \), which implies the standard
deviation is \( \sqrt{19}/10 \) parts.

(b) (5 points) Compute the expected value and standard deviation of the daily cost of
these badly-built components.

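Solution: If we let C denote the cost of badly-built parts, then we can express C
as a function of X: C = 200X + 1000 (this treats the $1000 fixed cost as paid every day). This implies

\[ E[C] = \$200\,E[X] + \$1000 = \$1040 \quad \text{and} \quad \sigma_C = 200\,\sigma_X = \$20 \cdot \sqrt{19} \approx \$87.18. \]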
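
The arithmetic for both parts can be reproduced with the short sketch below. It assumes the cost model used in the solution, C = 200X + 1000, and the variable names are illustrative.

```python
import math

n, p = 4, 0.05   # four components per day, each needing rework with probability 0.05

# Binomial mean and standard deviation of the number of reworked components.
mean_x = n * p
sd_x = math.sqrt(n * p * (1 - p))

# Cost as a linear function of X: the mean shifts and scales, the sd only scales.
mean_c = 200 * mean_x + 1000
sd_c = 200 * sd_x

print(mean_x, sd_x, mean_c, sd_c)
```

Adding the constant 1000 changes the mean of C but not its standard deviation, which is why only the factor 200 appears in `sd_c`.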

4. Consider the following experiment: there is an urn with 3 balls in it, marked 1, 2 and
3. Draw two balls out of the urn with replacement. Let X be the number of the first ball
drawn, and let Y be the minimum of the two numbers drawn (so for example, the draws
{2, 2} result in (X , Y ) = (2, 2)).
(a) (5 points) Find PX ,Y , the joint probability density function of (X , Y ).

Solution: It is probably easiest to start with the underlying sample space of balls
drawn, using these to define X and Y and to find their joint pdf. Here are a pair of
tables: in the first, the mapping between the sample space and X and Y is shown,
and each of these observations has equal probability of being drawn. These are
collected into the second table showing pX,Y (you get the probabilities from the
relative occurrences in the first table):

  (b1, b2)   X   Y
  (1, 1)     1   1
  (1, 2)     1   1
  (1, 3)     1   1
  (2, 1)     2   1
  (2, 2)     2   2
  (2, 3)     2   2
  (3, 1)     3   1
  (3, 2)     3   2
  (3, 3)     3   3

                 Y
             1     2     3
  X   1     1/3    0     0
      2     1/9   2/9    0
      3     1/9   1/9   1/9

(b) (5 points) What is the probability that X = Y ?

Solution: From the pX,Y table constructed in the last part, you can pick the parts
that satisfy the equation:

\[ P\{Y = X\} = P\left\{ \bigcup_{k=1}^{3} \{X = Y = k\} \right\} = \sum_{k=1}^{3} P\{X = Y = k\} = \frac{3}{9} + \frac{2}{9} + \frac{1}{9} = \frac{2}{3}. \]

(c) (5 points) Compute conditional probability density functions of X given Y = k for
each permissible value of k.

Solution: There are three conditional pdfs, since Y can be 1, 2 or 3. These are
all written together in one big table below. This table is similar to the table from
part (a), but it is turned on its side. To get each row, follow the definition of
the conditional density and divide pX,Y(x, k) by pY(k) for the different values k that Y
may take.

  x               1     2     3
  pX|Y(x|1)      3/5   1/5   1/5
  pX|Y(x|2)       0    2/3   1/3
  pX|Y(x|3)       0     0     1

(d) (5 points) Compute the expected values E [X ] and E [Y ].

Solution: From the table in part a, add up over rows or columns to find pX and
pY respectively. You get (also written in one table like the previous answer)

  k         1     2     3
  pX(k)    1/3   1/3   1/3
  pY(k)    5/9   1/3   1/9

Therefore the expected values of X and Y are

\[ E[X] = 1 \cdot \frac{1}{3} + 2 \cdot \frac{1}{3} + 3 \cdot \frac{1}{3} = 2, \qquad E[Y] = 1 \cdot \frac{5}{9} + 2 \cdot \frac{3}{9} + 3 \cdot \frac{1}{9} = \frac{14}{9}. \]

(e) (5 points) Compute Cov (X , Y ).

Solution: Because you already know the expected values of X and Y , you can
compute the expected value of X Y and use the formula Cov (X , Y ) = E [X Y ] −
E [X ] E [Y ]. Here is a table showing the construction of E [X Y ]. Only the relevant
outcomes are included — that is, the ones that occur with positive probability.

  (x, y)   x·y   pX,Y(x, y)   x·y·pX,Y(x, y)
  (1, 1)    1       1/3            3/9
  (2, 1)    2       1/9            2/9
  (2, 2)    4       2/9            8/9
  (3, 1)    3       1/9            3/9
  (3, 2)    6       1/9            6/9
  (3, 3)    9       1/9            9/9
                           E[XY] = 31/9

Therefore the covariance of X and Y is

\[ \mathrm{Cov}(X, Y) = \frac{31}{9} - 2 \cdot \frac{14}{9} = \frac{1}{3}. \]

(f) (5 points) Find the expected value and variance of X − Y .

Solution: The expected value is the easy part: we can apply a theorem from class
to say that E[X − Y] = E[X] − E[Y] = 4/9. The variance is a little harder
because we need the variances of X and Y. However, similar to the previous
expected value calculations, it is not very difficult to verify that

\[ \mathrm{Var}(X) = \frac{(1 - 2)^2}{3} + \frac{(2 - 2)^2}{3} + \frac{(3 - 2)^2}{3} = \frac{2}{3} \]

and

\[ \mathrm{Var}(Y) = \frac{5 \cdot (1 - 14/9)^2}{9} + \frac{3 \cdot (2 - 14/9)^2}{9} + \frac{(3 - 14/9)^2}{9} = \frac{38}{81}. \]

This means that the variance of X − Y = X + (−1) · Y is

\[ \mathrm{Var}(X - Y) = \mathrm{Var}(X) + \mathrm{Var}(Y) - 2\,\mathrm{Cov}(X, Y) = \frac{2}{3} + \frac{38}{81} - 2 \cdot \frac{1}{3} = \frac{38}{81}. \]
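
Every number in this problem can be recovered by enumerating the nine equally likely draws. The sketch below is one way to do that; the helper `expect` and the other names are illustrative assumptions, not notation from the course.

```python
from fractions import Fraction
from itertools import product

# Nine equally likely ordered draws (with replacement) from balls 1, 2, 3.
draws = list(product([1, 2, 3], repeat=2))
prob = Fraction(1, len(draws))

# Joint pdf of X (first ball) and Y (minimum of the two balls).
joint = {}
for b1, b2 in draws:
    key = (b1, min(b1, b2))
    joint[key] = joint.get(key, Fraction(0)) + prob

def expect(g):
    """Expected value of g(X, Y) under the joint pdf."""
    return sum(g(x, y) * p for (x, y), p in joint.items())

ex = expect(lambda x, y: x)
ey = expect(lambda x, y: y)
cov = expect(lambda x, y: x * y) - ex * ey
# Variance of X - Y computed directly, without the covariance formula.
var_diff = expect(lambda x, y: (x - y) ** 2) - (ex - ey) ** 2
p_equal = sum(p for (x, y), p in joint.items() if x == y)

print(joint, ex, ey, cov, var_diff, p_equal)
```

Computing the variance of X − Y directly from its own distribution gives an independent check on the formula that combines the two variances and the covariance.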

5. Suppose you would like to assess the impact of online advertisement on sales at your
company. The number of views of your company’s banner ad on a popular website is random
from week to week, as are sales. The distribution given below is the joint distribution of the
weekly number of ad views (in millions of views, rounded to the nearest million)
and weekly product sales (measured in thousands of units, to the nearest thousand).

                  Sales
                2      3
  Views   1    1/4    1/12
          2    1/6    1/6
          3     0     1/3
(a) (5 points) What are the average weekly sales in a week in thousands of units?

Solution: You can compute expected sales by finding the marginal density function of sales
and then doing the necessary computations. One way of doing this is like so:

  s        pS(s)    s × pS(s)
  2        5/12     10/12
  3        7/12     21/12
  Total             31/12

So the expected sales are 31/12 (≈ 2.5833) thousands of units.

(b) (5 points) What is the probability that at most 2 million people view the ad and that
sales are 3,000 units?

Solution: This is P {(V ≤ 2) ∩ (S = 3)} = pV,S (1, 3) + pV,S (2, 3) = 3/12 = 1/4.

(c) (5 points) What are the expected sales in a week in which at most 2 million people
view the ad?

Solution: This asks for E[S|V ≤ 2]. That means you need to compute the pdf of
sales given V ≤ 2. Define the event L := {V ≤ 2} (I wrote L for Low number of
views). The pdf pS|L can be found by computing

\[ p_{S|L}(2) = P\{S = 2 \mid L\} = \frac{P\{(S = 2) \cap ((V = 1) \cup (V = 2))\}}{P\{(V = 1) \cup (V = 2)\}} = \frac{5/12}{2/3} = \frac{5}{8} \]

and

\[ p_{S|L}(3) = P\{S = 3 \mid L\} = \frac{P\{(S = 3) \cap ((V = 1) \cup (V = 2))\}}{P\{(V = 1) \cup (V = 2)\}} = \frac{3/12}{2/3} = \frac{3}{8}. \]

Therefore the expected value of sales given a low number of views is

\[ E[S \mid L] = 2 \times \frac{5}{8} + 3 \times \frac{3}{8} = \frac{19}{8} \]

or 2.375 thousands of units (note that this is lower than the unconditional
expected value of sales, because that expected value takes into account the
possibility that there are many views, and accordingly higher sales).

(d) (5 points) What is the covariance of views and sales?

Solution: You can compute the covariance in a few ways. First you should com-
pute the expected value of views, which is 2 (million views). Then perhaps it is
easiest to calculate E [V × S]:

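\[ E[V \times S] = 2 \times \frac{1}{4} + 3 \times \frac{1}{12} + 4 \times \frac{1}{6} + 6 \times \frac{1}{6} + 9 \times \frac{1}{3} = \frac{1}{2} + \frac{1}{4} + \frac{2}{3} + 1 + 3 = \frac{65}{12}. \]

This implies the covariance is \( \mathrm{Cov}(V, S) = \frac{65}{12} - 2 \times \frac{31}{12} = \frac{1}{4} \).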

(e) (5 points) Are number of views and sales statistically independent?

Solution: No; you can check a number of instances where pV,S(v, s) ≠ pV(v) ×
pS(s). For example, let v = 3 and s = 3: pV,S(3, 3) = 1/3, while pV(3) × pS(3) = 1/3 × 7/12 = 7/36.
You could also use the previous part and say that because their covariance is nonzero,
V and S cannot be independent.

For parts (f) and (g), suppose your company receives 2 million views this week. Your
boss is considering the effect of placing an extra ad next week that would ensure
3 million views next week and wants to know what you think. This ad would cost
$1,000 more than you are currently paying for advertisement. You sell your product
for $10 a unit.
(f) (5 points) What is the average number of products sold in a week with 2 million
views of the ad? In a 3-million-view week?

Solution: This question asks for E [S|V = 2] and E [S|V = 3]. You need the con-
ditional pdfs first to compute these expectations. Luckily that’s not too hard: the
desired expected values are

E [S|V = 2] = 2 × 1/2 + 3 × 1/2 = 2.5


E [S|V = 3] = 2 × 0 + 3 × 1 = 3

(g) (5 points) Would you recommend spending the extra $1,000 towards marketing the
product this week? That is, does the expected marginal gain from increasing views
outweigh the marginal cost of this new advertisement?

Solution: Moving from 2 to 3 million views results in an expected change in sales of

\[ G = E[S \mid V = 3] - E[S \mid V = 2] = 3 - 2.5 = 0.5 \]

thousand units, which means $5,000 in expected revenue (500 units times $10 a unit).
Because this exceeds the $1,000 cost of the extra ad, it
would be profitable to use the opportunity to ensure 3 million views.
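
The whole problem works off the one joint table, so it is convenient to check the answers with a single helper that computes (conditional) expectations. This is only a sketch; `expect`, `given`, and the other names are hypothetical.

```python
from fractions import Fraction as F

# Joint pdf of (views in millions, sales in thousands of units) from the table.
joint = {
    (1, 2): F(1, 4), (1, 3): F(1, 12),
    (2, 2): F(1, 6), (2, 3): F(1, 6),
    (3, 2): F(0),    (3, 3): F(1, 3),
}

def expect(g, given=lambda v, s: True):
    """E[g(V, S) | event], where the event is described by `given`."""
    mass = sum(p for (v, s), p in joint.items() if given(v, s))
    total = sum(g(v, s) * p for (v, s), p in joint.items() if given(v, s))
    return total / mass

e_sales = expect(lambda v, s: s)
e_views = expect(lambda v, s: v)
cov = expect(lambda v, s: v * s) - e_views * e_sales
e_sales_low = expect(lambda v, s: s, given=lambda v, s: v <= 2)

# Part (g): expected extra units sold, then revenue at $10 a unit minus the $1,000 ad.
gain_units = 1000 * (expect(lambda v, s: s, given=lambda v, s: v == 3)
                     - expect(lambda v, s: s, given=lambda v, s: v == 2))
net_gain_dollars = 10 * gain_units - 1000

print(e_sales, cov, e_sales_low, gain_units, net_gain_dollars)
```

A positive `net_gain_dollars` corresponds to the conclusion in part (g) that the extra ad is worth buying.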

6. Let the random variable M represent the duration of a major Hollywood movie. Suppose
movie lengths are uniformly distributed between 90 and 120 minutes. That is, the density
function of the distribution of M is

\[ f_M(x) = \begin{cases} K & x \in [90, 120] \\ 0 & \text{otherwise.} \end{cases} \]

(a) (5 points) What is the value of K?

Solution: The pdf f_M needs to integrate to 1, which means that (since this density
is rectangular)

\[ 1 = \int_{90}^{120} f_M(x)\,dx = (120 - 90) \times K \implies K = \frac{1}{30}. \]

The picture is this:

[Figure: "pdf of X". The density fX(x) plotted against x from 80 to 130: constant at height K on [90, 120] and zero elsewhere.]

(b) (5 points) Find the cdf of M , F M , (you can draw a clearly labeled picture of F M as part
of your definition, if you like).

Solution: Recall that F_M tells you P{M ≤ m}, which is the area of the rectangle
under K and between 90 and m (for m ∈ [90, 120]; other values of m are kind of
boring). That area is (m − 90) × K (base times height). Outside of [90, 120], the
cdf has to be flat, and cdfs always have to be zero to the left of the support of a
random variable, and 1 to the right. Therefore the cdf is

\[ F_M(x) = \begin{cases} 0 & x < 90 \\ \dfrac{x - 90}{30} & x \in [90, 120] \\ 1 & x > 120. \end{cases} \]
The picture looks like this:

[Figure: "CDF of X". FX(x) plotted against x from 80 to 130: zero up to x = 90, rising linearly to 1 at x = 120, and flat at 1 afterwards.]

You should at least label the axes at the points on the graph where the function
changes its behavior — that is, at (90,0) and (120,1).

(c) (5 points) What is the expected value of the length of a movie? Please show your
work.

Solution: The expected value is

\[ E[M] = \int_{90}^{120} x \cdot f_M(x)\,dx = \int_{90}^{120} x \cdot \frac{1}{30}\,dx. \]

This expectation can be calculated by drawing a picture: note that the function
g(x) := x/30 between 90 and 120 makes a trapezoid. This trapezoid has sides of
height 3 and 4, and its base is 30 units. That means the area of the trapezoid
is E[M] = (7/2) × 30 = 105. Once again, you can draw a picture: the area under
x · f_M(x) is the trapezoid outlined with one solid and three dashed lines.

[Figure: "calculating the expected value". x · fX(x) plotted against x from 80 to 130, with the trapezoid between x = 90 and x = 120 outlined.]

(d) (5 points) What is the probability that any given movie is longer than 117 minutes?

Solution: \( P\{M > 117\} = 1 - F_M(117) = 1 - \frac{117 - 90}{30} = 0.1. \)

Now suppose you randomly select 10 movies and obtain their runtimes. Call the
average duration of these 10 movies M̄ .
(e) (5 points) What is the probability that [exactly] one of these 10 movies runs for
longer than 117 minutes?

Solution: Let L count the number of movies that last longer than 117 minutes.
Then the probability that exactly one movie is longer than 117 minutes is equal
to P{L = 1}, and L ∼ Bin(10, 0.1) (the probability p comes from the previous
answer). Therefore the probability is \( P\{L = 1\} = \binom{10}{1}\, 0.9^9 \times 0.1^1 = 0.9^9 \approx 0.3874 \).

(f) (5 points) What is the expected value of M̄ ?

Solution: Let each movie runtime be called M_i for i = 1, 2, ..., 10. The expected
value of the average duration is

\[ E[\bar{M}] = E\left[ \frac{1}{10} \sum_{i=1}^{10} M_i \right] = \frac{1}{10} \sum_{i=1}^{10} E[M_i] = \frac{1}{10} \sum_{i=1}^{10} 105 = 105. \]

(g) (5 points) What is the variance of M̄ ? You can use the fact that the variance of M
(that is, the variance of a single movie’s duration) is 75.

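Solution: Using the notation from the last solution, this is

\[ \mathrm{Var}(\bar{M}) = \mathrm{Var}\left( \frac{1}{10} \sum_{i=1}^{10} M_i \right). \]

By our rule about the variance of scaled random variables,

\[ \mathrm{Var}(\bar{M}) = \frac{1}{100}\, \mathrm{Var}\left( \sum_{i=1}^{10} M_i \right), \]

and since the movies are independent,

\[ \mathrm{Var}(\bar{M}) = \frac{1}{100} \sum_{i=1}^{10} \mathrm{Var}(M_i) = \frac{1}{100} \sum_{i=1}^{10} 75 = 7.5. \]
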
The variability of the average movie runtime is much lower than the variability in
the runtime of a single movie.
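
The numbers in parts (d) through (g) can be reproduced with the sketch below. It uses the standard variance formula for a uniform distribution, (b − a)²/12, to recover the value 75 that the question supplies; the variable names are illustrative.

```python
import math

low, high = 90.0, 120.0
K = 1.0 / (high - low)            # height of the uniform density

mean_m = (low + high) / 2         # expected runtime of one movie
var_m = (high - low) ** 2 / 12    # variance of a uniform distribution on [low, high]

# Part (d): probability that one movie runs longer than 117 minutes.
p_long = (high - 117.0) * K

# Part (e): exactly one long movie out of n = 10 independent movies.
n = 10
p_exactly_one = math.comb(n, 1) * p_long * (1 - p_long) ** (n - 1)

# Part (g): variance of the average of n independent runtimes.
var_mean = var_m / n

print(mean_m, var_m, p_long, p_exactly_one, var_mean)
```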

7. Suppose you work for a tax preparer’s office and need to allocate employees towards filling
a certain difficult tax form. You know the time it takes an employee to complete this form
follows a normal distribution with mean 85 minutes and standard deviation 15 minutes.
(a) (5 points) What is the probability that an employee takes less than one hour (60
minutes) to complete the form?

Solution: This can be computed by transforming to a standard normal random
variable and looking up the answer on the normal table: letting T be the random
variable denoting an employee’s time spent on the form,

\[ P\{T < 60\} = P\left\{ \frac{T - 85}{15} < \frac{60 - 85}{15} \right\} = P\{Z < -1.67\} = 1 - 0.9525 = 0.0475. \]

(b) (5 points) What is the time (in minutes) at which 95% of all form-fillers will be fin-
ished?

Solution: You now want to know the time, which I will call t*, such that P{T > t*} =
0.05, or P{T < t*} = 0.95. This may be answered by “inverting” the style of
the calculations in the last part: the normal table says that the z* such that
P{Z < z*} = 0.95 is z* = 1.645. t* is a linear transformation of z*: calculate

\[ \frac{t^* - 85}{15} = 1.645 \implies t^* = 109.675. \]

After 109 to 110 minutes, only 5% of the employees will still be filling in the difficult
form.

(c) (5 points) Suppose you choose 5 employees at random to fill out this form. What
is the expected value of the number of employees who will be done one and a half
hours (90 minutes) from now?

Solution: Let D denote the number of employees who are done. This variable is
binomial, and has parameters n = 5 and p = P {T < 90} = P {Z < 1/3} = 0.6293.
Therefore
E [D] = 5 × 0.6293 = 3.1465.
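
If you have Python available instead of a normal table, the standard library's `statistics.NormalDist` gives the same three answers. This is a sketch with illustrative variable names; expect small differences from the table-based answers, because the table rounds z to two decimal places.

```python
from statistics import NormalDist

# Time to complete the form: normal with mean 85 and standard deviation 15 minutes.
form_time = NormalDist(mu=85, sigma=15)

p_under_hour = form_time.cdf(60)       # part (a): P{T < 60}
t_star = form_time.inv_cdf(0.95)       # part (b): time by which 95% are finished
p_done = form_time.cdf(90)             # part (c): P{T < 90} for one employee
expected_done = 5 * p_done             # mean of a Bin(5, p_done) random variable

print(p_under_hour, t_star, p_done, expected_done)
```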

8. Canadians often buy cigarettes and liquor when they visit the United States, due to the
relatively low price of these goods there, and import them over the border into Canada
when they return. Below is a table showing the number of bottles of liquor and cartons of
cigarettes imported per person through a typical Canadian customs booth.

                   Bottles
                  0       1
  Cartons   0    0.65    0.10
            1    0.20    0.05

(a) (5 points) What is the expected value and variance of the number of bottles of liquor
imported per person?

Solution: Let B be a random variable denoting the number of bottles imported.
The expected value and variance depend on the marginal distribution, which is

  k         0       1
  pB(k)    0.85    0.15

That means that the expected value of the number of bottles is

\[ E[B] = 0.85 \cdot 0 + 0.15 \cdot 1 = 0.15. \]

The variance is

\[ \mathrm{Var}(B) = 0.85\,(0 - 0.15)^2 + 0.15\,(1 - 0.15)^2 = 0.1275. \]

(b) (5 points) What is the expected value and variance of the number of cartons of
cigarettes imported per person?

Solution: Similarly to the previous question, let C denote the number of cartons
imported; then

  j         0       1
  pC(j)    0.75    0.25

Then the expected number is

\[ E[C] = 0.75 \cdot 0 + 0.25 \cdot 1 = 0.25, \]

and the variance is

\[ \mathrm{Var}(C) = 0.75\,(0 - 0.25)^2 + 0.25\,(1 - 0.25)^2 = 0.1875. \]

(c) (5 points) If you know that a person is not importing any cigarettes, what is the
expected number of bottles of liquor imported?

Solution: This depends on a conditional pdf: the pdf of B given that C = 0. This
pdf is

  ℓ             0        1
  pB|C(ℓ|0)   13/15    2/15

Then the expected number of bottles is

\[ E[B \mid C = 0] = \frac{13}{15} \cdot 0 + \frac{2}{15} \cdot 1 = \frac{2}{15}. \]
15 15

(d) (5 points) What is the covariance between the number of cartons of cigarettes and
the number of bottles of liquor imported per person?

Solution: It is easiest to first find the expected value of the product of B and C:

E [BC] = (0.65 + 0.10 + 0.20) · 0 + 0.05 · 1 = 0.05.

Then the covariance is

Cov (B, C) = E [BC] − E [B] E [C]


= 0.05 − 0.15 · 0.25 = 0.0125.

For parts (e)-(g), suppose a new import tax law is passed: at the border, agents will
collect $2 tax per carton of cigarettes imported and $3 per bottle of liquor. Assume
that this new tax does not induce a change in the amount of goods imported.
(e) (5 points) How much tax revenue will be collected on average per person?

Solution: This is a linear function of the number of bottles and cartons; letting
T be the amount of tax collected per person, you know that T = 2C + 3B. That
means

E [T ] = E [2C + 3B] = 2E [C] + 3E [B] = 2 · 0.25 + 3 · 0.15 = 0.95.

(f) (5 points) Given these new tax rules, what is the variance in the amount of tax rev-
enue collected per person?

Solution: Using the previous results, the variance is

Var (T ) = Var (2C + 3B) = 4Var (C) + 9Var (B) + 2 · 2 · 3Cov (B, C)
= 4 · 0.1875 + 9 · 0.1275 + 12 · 0.0125
= 2.0475.

(g) (5 points) Are the number of cartons imported and the number of bottles imported
independent random variables?

Solution: No: because the covariance is nonzero, they are not independent. You
could also check this directly from the joint pdf.
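
The sketch below recomputes the answers to this problem from the joint table using exact fractions. As a cross-check on part (f), it computes the variance of the tax directly from the distribution of T = 2C + 3B rather than from the variance formula. The helper `expect` and the variable names are illustrative.

```python
from fractions import Fraction as F

# Joint pdf with keys (cartons, bottles), taken from the customs table.
joint = {
    (0, 0): F(65, 100), (0, 1): F(10, 100),
    (1, 0): F(20, 100), (1, 1): F(5, 100),
}

def expect(g):
    """Expected value of g(C, B) under the joint pdf."""
    return sum(g(c, b) * p for (c, b), p in joint.items())

e_b = expect(lambda c, b: b)
e_c = expect(lambda c, b: c)
var_b = expect(lambda c, b: b ** 2) - e_b ** 2
var_c = expect(lambda c, b: c ** 2) - e_c ** 2
cov_bc = expect(lambda c, b: b * c) - e_b * e_c

# Conditional expectation of bottles given no cartons.
p_c0 = joint[(0, 0)] + joint[(0, 1)]
e_b_given_c0 = joint[(0, 1)] / p_c0

# Tax per person: $2 per carton plus $3 per bottle.
e_tax = expect(lambda c, b: 2 * c + 3 * b)
var_tax = expect(lambda c, b: (2 * c + 3 * b) ** 2) - e_tax ** 2

print(e_b, var_b, e_c, var_c, cov_bc, e_b_given_c0, e_tax, var_tax)
```

The directly computed `var_tax` should agree with 4 Var(C) + 9 Var(B) + 12 Cov(B, C).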

9. Suppose you and your friend stand in a line of 6 people for a picture, where all arrange-
ments of people are equally likely. Let N be a random variable counting the number of
people standing between you and your friend.
(a) (5 points) Find the pdf of N (including its support).

Solution: You can find the pdf either by enumerating the sample space or using
combinatorial arguments; there are \( \binom{6}{2} = 15 \) total ways to choose the pair of
positions for you and your friend in a line of 6 people. There are several ways to find the answer, but you can
show that pN(k) = (5 − k)/15, for k = 0, ..., 4. You could alternatively express
this as a table, but the formula is a bit more compact.

  k         0      1      2      3      4
  pN(k)    1/3    4/15   1/5    2/15   1/15

(b) (5 points) Find the cdf of N at the values of the support of N .

Solution: The cdf is the cumulative total of the pdf from 0 up to a chosen value
k, and it is in this table:

  k         0      1      2      3       4
  FN(k)    1/3    3/5    4/5    14/15    1

(c) (5 points) Find the pdf of N given that N ≤ 2.

Solution: The cdf tells you that P{N ≤ 2} = FN(2) = 4/5; note that by the
definition of conditional probabilities, the support is now only on 0, 1 and 2. This
means the pdf is

  k              0       1      2
  pN|N≤2(k)     5/12    1/3    1/4

(d) (5 points) What is the expected number of people between you and your friend, given
that at most two people stand between the two of you?

Solution: Once you have the conditional pdf above, the expected value is
straightforward:

\[ E[N \mid N \le 2] = 0 \cdot \frac{5}{12} + 1 \cdot \frac{1}{3} + 2 \cdot \frac{1}{4} = \frac{5}{6}. \]

(e) (5 points) What is the variance of the number of people between you and your friend,
given that at most two people stand between the two of you?

Solution: The variance is

\[ \mathrm{Var}(N \mid N \le 2) = (0 - 5/6)^2 \cdot \frac{5}{12} + (1 - 5/6)^2 \cdot \frac{1}{3} + (2 - 5/6)^2 \cdot \frac{1}{4} = \frac{23}{36} \approx 0.6389. \]
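
One way to confirm the formula pN(k) = (5 − k)/15 and the conditional answers is to enumerate all 720 orderings of the line. The sketch below does this; the labels in `people` and the other names are made up for the illustration.

```python
from fractions import Fraction
from itertools import permutations

people = ["you", "friend", "p3", "p4", "p5", "p6"]

# Count, over all 6! equally likely line-ups, how many people stand between the two.
counts = {}
total = 0
for line in permutations(people):
    gap = abs(line.index("you") - line.index("friend")) - 1
    counts[gap] = counts.get(gap, 0) + 1
    total += 1

pmf = {k: Fraction(c, total) for k, c in sorted(counts.items())}

# Condition on the event N <= 2 by renormalizing the relevant probabilities.
p_le_2 = sum(p for k, p in pmf.items() if k <= 2)
cond = {k: p / p_le_2 for k, p in pmf.items() if k <= 2}
cond_mean = sum(k * p for k, p in cond.items())
cond_var = sum((k - cond_mean) ** 2 * p for k, p in cond.items())

print(pmf, p_le_2, cond, cond_mean, cond_var)
```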

10. Suppose you own two separate forests that you use to produce maple syrup. From year to
year, weather conditions make it so that the syrup produced from a forest may be grade 1, 2
or 3 (grade-1 syrup can be sold for the highest price, so the grade of the syrup is important).
Because of the weather, the trees in forest A have a 65% chance of producing grade-1 syrup
and a 35% chance of producing grade-2 syrup, while forest B produces grade-1 syrup with
probability 30%, grade-2 syrup with probability 45% and the probability of producing
grade-3 syrup is 25%. The forests are distant enough from one another that the syrup
quality from each can be considered independent; forest A contributes 60% of the supply
to your total output and forest B contributes 40%.
(a) (5 points) Suppose that you choose one bottle from each forest and test each for
the grade of the syrup. Find the pdf of this selection (i.e., in terms of pairs of syrup
grades).

Solution: This is the joint pdf of the grades associated with a pair of bottles; let
A denote the grade of a bottle from forest A and B denote the grade of a bottle
from forest B. Then because the forests are independent you can find the joint
pdf by multiplying the marginal pdfs together. This is shown (with marginals in
the margins) in the following table:

                  A
              1         2
  B   1     0.195     0.105     0.30
      2     0.2925    0.1575    0.45
      3     0.1625    0.0875    0.25
            0.65      0.35

(b) (5 points) What is the covariance between the grade from forest A and the grade
from forest B?

Solution: Because the grade variables A and B can be considered independent
random variables, they have 0 covariance.

For questions (c)-(d), suppose a maple syrup regulatory agent comes to your pro-
duction facility and randomly samples two bottles of your syrup from the production
line.

(c) (5 points) Assuming the agent took bottles that contained syrup from forest A, what
is the pdf of the number of grade-1 bottles out of the two selected? What is it for the
bottles from forest B? That is, write down two pdfs (with their supports).

Solution: Let X be the number of bottles graded 1. Given that the bottles come
from forest A, this is a binomial random variable with n = 2 trials and probability
of success p = 0.65. That means the pdf is

\[ p_{X|A}(k \mid A) = \binom{2}{k}\, 0.65^k \cdot 0.35^{2-k}, \]

for k = 0, 1 and 2. Similarly, for bottles from forest B, the pdf is

\[ p_{X|B}(j \mid B) = \binom{2}{j}\, 0.3^j \cdot 0.7^{2-j} \]

for j = 0, 1 or 2.
You could also have written these as tables, in which case the pdfs look like this:

  k              0         1        2
  pX|A(k|A)    0.1225    0.455    0.4225

  j              0       1       2
  pX|B(j|B)    0.49    0.42    0.09

(d) (5 points) What is the pdf of the number of grade-1 syrup bottles that the agent takes,
regardless of which forest they came from? (Hint: your forests form a partition of
the total supply.)

Solution: Let X be the number of grade-1 bottles found in the sample that the
agent inspects. The pdf of X is a description of the probability that the agent takes
a certain number of grade-1 bottles out of two trials. These bottles come from
two sources, and you know the pdfs associated with those sources, which form a
partition of the sample space. So, for example,

\[ P\{X = 0\} = P\{X = 0 \mid A\}\,P\{A\} + P\{X = 0 \mid B\}\,P\{B\} \]
\[ = p_{X|A}(0 \mid A)\,P\{A\} + p_{X|B}(0 \mid B)\,P\{B\} \]
\[ = \binom{2}{0}\, 0.65^0 \cdot 0.35^2 \times 0.6 + \binom{2}{0}\, 0.3^0 \cdot 0.7^2 \times 0.4. \]

Generalizing this to any number k of bottles, for k = 0, 1 or 2, this is

\[ p_X(k) = 0.6 \cdot \binom{2}{k}\, 0.65^k \cdot 0.35^{2-k} + 0.4 \cdot \binom{2}{k}\, 0.3^k \cdot 0.7^{2-k}. \]

In tabular form, this pdf works out to

  k         0         1        2
  pX(k)    0.2695    0.441    0.2895
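
The mixture in part (d) is easy to tabulate in code. The sketch below follows the solution's setup, in which both sampled bottles come from the same forest (forest A with probability 0.6, forest B with probability 0.4); `binom_pmf`, `share`, and `p_grade1` are illustrative names.

```python
from math import comb

def binom_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

share = {"A": 0.6, "B": 0.4}        # each forest's share of total output
p_grade1 = {"A": 0.65, "B": 0.30}   # chance a bottle from each forest is grade 1

# Law of total probability over the partition {A, B}.
pmf = {
    k: sum(share[f] * binom_pmf(k, 2, p_grade1[f]) for f in share)
    for k in range(3)
}

print(pmf)
```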

11. Suppose that you perform the following experiment. First you throw a dart at a dartboard.
The probability that you hit the board is 2/3. If you hit the board with your dart, you
will flip a fair coin twice and record the number of heads that appear (the flips can be
considered independent). If you miss, you will flip the coin once and record how many
heads you flip. Let D record the number of times you hit the dartboard in one run of this
experiment. Let C record the number of heads you flip in one run of this experiment.
(a) (5 points) Write down the probability density function of C given that you miss the
dartboard. Also write down the PDF of C given that you hit the dartboard.

Solution: If you miss the dartboard, you will only toss one coin, which means
that you have a binomial random variable with one trial (also called a Bernoulli
random variable). This has the following PDF (D = 0 when you miss the dartboard):

\[ p_{C|D}(k \mid D = 0) = \begin{cases} \frac{1}{2} & k \in \{0, 1\} \\ 0 & \text{otherwise.} \end{cases} \]

If you hit the dartboard, C ∼ Bin(2, 1/2), so

\[ p_{C|D}(k \mid D = 1) = \begin{cases} \frac{1}{4} & k = 0 \\ \frac{1}{2} & k = 1 \\ \frac{1}{4} & k = 2 \\ 0 & \text{otherwise.} \end{cases} \]

There are other ways of writing down these conditional density functions.

(b) (5 points) Find the joint probability density function of C and D.

Solution: To find joint PDF values for these two random variables, remember
the rule P{A ∩ B} = P{A|B} P{B}. In this case, the probabilities are expressed
as density values, but other than that notational change there is no difference:
written specifically for this problem, the rule is pC,D(j, k) = pC|D(j|k) pD(k). You
also know that

\[ p_D(k) = \begin{cases} \frac{1}{3} & k = 0 \\ \frac{2}{3} & k = 1 \\ 0 & \text{otherwise.} \end{cases} \]

So you can use this information with the densities you wrote in the previous
part to fill in a table.

                  C
              0      1      2
  D   0      1/6    1/6     0      1/3
      1      1/6    1/3    1/6     2/3
             1/3    1/2    1/6      1

The marginal PDFs for C and D are shown on the bottom and right-hand margins
of the table respectively.

(c) (5 points) Find the joint cumulative distribution function of C and D.

Solution: This can be found from the joint PDF by adding up the probabilities
that, in that table, lie to the left of and above the cell you want to find a CDF value
for (including the cell itself). That means you get this table:

                  C
              0      1      2
  D   0      1/6    1/3    1/3
      1      1/3    5/6     1

This table has no margins for the marginal CDFs; you would have to get each of those by adding
up the values of the marginal PDFs.

(d) (5 points) If you repeat this experiment many times, how many heads do you expect
to flip on average? What is the average number of times you hit the dartboard?

Solution: The expected values of C and D are found by applying the expected
value formula, and lead to these calculations:

E [D] = 0 × 1/3 + 1 × 2/3 = 2/3


E [C] = 0 × 1/3 + 1 × 1/2 + 2 × 1/6 = 5/6.

(e) (5 points) Find the variance of C and the variance of D.

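Solution: The variances can be most easily calculated using the variance formula
\( \mathrm{Var}(X) = E[X^2] - E[X]^2 \). First find

\[ E[D^2] = 0^2 \times \frac{1}{3} + 1^2 \times \frac{2}{3} = \frac{2}{3} \]
\[ E[C^2] = 0^2 \times \frac{1}{3} + 1^2 \times \frac{1}{2} + 2^2 \times \frac{1}{6} = \frac{7}{6}. \]

This means

\[ \mathrm{Var}(D) = \frac{2}{3} - \left(\frac{2}{3}\right)^2 = \frac{2}{9} \]
\[ \mathrm{Var}(C) = \frac{7}{6} - \left(\frac{5}{6}\right)^2 = \frac{17}{36}. \]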

(f) (5 points) Calculate Cov (C, D).

Solution: Similar to the last question, it’s probably easiest to use the formula
Cov (C, D) = E [C D] − E [C] E [D]. You just need to calculate the first of these,
which is easy because there are so many values where c × d = 0. Leaving out all
the zero-valued parts, the calculation boils down to

E [C D] = 1 × 1 × 1/3 + 1 × 2 × 1/6 = 2/3.

That means
Cov (C, D) = 2/3 − 2/3 × 5/6 = 1/9.

(g) (5 points) Calculate the mean and variance of C + D.

Solution: The mean of C + D is easy:

E [C + D] = E [C] + E [D] = 5/6 + 2/3 = 3/2.

The variance is a little harder. These variables aren’t independent, so you have to
use the full-blown formula

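\[ \mathrm{Var}(C + D) = \mathrm{Var}(C) + \mathrm{Var}(D) + 2\,\mathrm{Cov}(C, D) \]
\[ = \frac{17}{36} + \frac{2}{9} + 2 \times \frac{1}{9} \]
\[ = \frac{17}{36} + \frac{16}{36} = \frac{11}{12}. \]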
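
The joint table and all of the moments in this problem can be rebuilt from the two-stage description of the experiment. The following is a sketch with illustrative names; it computes Var(C + D) directly from the distribution of the sum, as a check on the variance formula.

```python
from fractions import Fraction as F
from math import comb

p_hit = F(2, 3)

# Build the joint pdf with keys (c, d): a miss (d = 0) means one fair coin flip,
# a hit (d = 1) means two fair coin flips.
joint = {}
for d, flips, p_d in [(0, 1, 1 - p_hit), (1, 2, p_hit)]:
    for c in range(flips + 1):
        joint[(c, d)] = p_d * F(comb(flips, c), 2 ** flips)

def expect(g):
    """Expected value of g(C, D) under the joint pdf."""
    return sum(g(c, d) * p for (c, d), p in joint.items())

e_c = expect(lambda c, d: c)
e_d = expect(lambda c, d: d)
var_c = expect(lambda c, d: c ** 2) - e_c ** 2
var_d = expect(lambda c, d: d ** 2) - e_d ** 2
cov_cd = expect(lambda c, d: c * d) - e_c * e_d
var_sum = expect(lambda c, d: (c + d) ** 2) - (e_c + e_d) ** 2

print(joint, e_c, e_d, var_c, var_d, cov_cd, var_sum)
```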

12. Suppose that a random number generator produces random numbers that are distributed
according to this density:

\[ f_X(x) = \begin{cases} K & x \in [-3, 7] \\ 0 & \text{otherwise} \end{cases} \]

for some K ∈ R.
(a) (5 points) What is K?

Solution: This density function needs to be nonnegative, and it has to integrate
to 1. In order to make it nonnegative you just need to make K ≥ 0, which is easily
satisfied. Now you need to make sure that it integrates to 1. This means

\[ \int_{-3}^{7} K\,dx = 1. \]

Since this is some kind of rectangle, with width 7 − (−3) = 10 and height K, the
integral can be computed, and gives you the right value of K:

\[ 1 = \int_{-3}^{7} K\,dx = 10K \implies K = 1/10. \]

(b) (5 points) What is the probability that a generated number is less than 0?

Solution: This is asking for an answer that can be found by integrating the PDF
up to the asked-for value 0: that’s a rectangle with height K = 1/10 and width
0 − (−3) = 3. Therefore
P {X ≤ 0} = 3/10.
This incidentally is how you would go about finding the CDF FX, just for any value
x: the rectangle always has height 1/10, but its width changes depending on
x, like width = x − (−3) = 3 + x. That means

\[ F_X(x) = \begin{cases} 0 & x < -3 \\ \dfrac{3 + x}{10} & -3 \le x < 7 \\ 1 & x \ge 7. \end{cases} \]

(c) (5 points) Use the definition of expected value to calculate the average of the num-
bers produced by this random number generator.

Solution: To find the expected value, you need to integrate the function x ×
f_X(x) = x/10 over the support [−3, 7], or in other words,

\[ \int_{-3}^{7} K x\,dx. \]

You can plot the function in the integral, and it looks like this figure:

[Figure: "Triangles for the expected value". The line Kx plotted against x from −3 to 7, forming one triangle below the axis between −3 and 0 (reaching −3K) and one triangle above the axis between 0 and 7 (reaching 7K).]

The area of the left (red) triangle has to be subtracted from the area of the right
(blue) triangle. This calculation is done in the next equation:

\[ \text{Area} = \frac{(7 - 0) \times (7K - 0K)}{2} + \frac{(-3 - 0) \times (0K - (-3)K)}{2} = \frac{49K - 9K}{2} = 20K = 2. \]
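
The triangle calculation can be compared against a direct numerical integration of x · f_X(x). The sketch below does both, and also evaluates the CDF at 0 from part (b); the function and variable names are illustrative.

```python
low, high = -3.0, 7.0
K = 1.0 / (high - low)   # height of the uniform density, from part (a)

def cdf(x):
    """F_X(x) for the uniform density on [low, high]."""
    if x < low:
        return 0.0
    if x >= high:
        return 1.0
    return (x - low) * K

# Signed triangle areas under the line K x: positive on [0, 7], negative on [-3, 0].
positive_area = 0.5 * high * (high * K)
negative_area = 0.5 * abs(low) * abs(low * K)
mean_triangles = positive_area - negative_area

# Midpoint Riemann sum of x * f_X(x) over the support, as an independent check.
n = 10_000
width = (high - low) / n
mean_numeric = sum((low + (i + 0.5) * width) * K for i in range(n)) * width

print(cdf(0.0), mean_triangles, mean_numeric)
```

Both estimates of the mean should agree with the midpoint of the interval, (−3 + 7)/2 = 2.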
