
STAT 440: Homework 4 Due: 2/16/2017

For all problems, turn in the full R code you used to answer the questions. In a separate file, use knitr to
report any analytic derivations you made. Clearly mark where your code switches from one problem to the
next. Many of these problems ask you to do multiple data manipulations. Include all analytic calculations
and write out algorithms for your sampling routines.

Upload your code and your report to the Canvas dropbox for HW 4.

1. Simulating from a ZIP Model. The Zero-Inflated Poisson distribution is useful for modeling count
processes where there are additional zero values. It is commonly used to model the counts of rare
events, where most of the time there will be no events. Let $X \sim \mathrm{ZIP}(p, \lambda)$ be a random variable from
the zero-inflated Poisson distribution with occurrence probability $p$ and rate $\lambda$. Then
$$
P(X = i) =
\begin{cases}
(1 - p) + p\,e^{-\lambda}, & i = 0 \\
p\,\dfrac{e^{-\lambda}\lambda^{i}}{i!}, & i = 1, 2, \ldots
\end{cases}
$$

Random variables from a ZIP distribution can also be written as a function of two other random
variables. If $X \sim \mathrm{ZIP}(p, \lambda)$, then
$$X = YZ,$$
where
$$Y \sim \mathrm{Bern}(p), \qquad Z \sim \mathrm{Pois}(\lambda).$$
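
The product representation above translates almost directly into a sampler. The following is a minimal sketch, not a prescribed solution: the helper name `rzip` and the seed are hypothetical, and it assumes Y and Z are drawn independently, which the product form needs in order to reproduce the ZIP probabilities.

```r
# Hypothetical helper (not part of the assignment): draw n ZIP(p, lambda)
# variates using the product representation X = Y * Z.
# Assumption: Y and Z are independent.
rzip <- function(n, p, lambda) {
  y <- rbinom(n, size = 1, prob = p)  # Y ~ Bern(p)
  z <- rpois(n, lambda = lambda)      # Z ~ Pois(lambda)
  y * z                               # X is 0 whenever Y is 0
}

set.seed(440)  # arbitrary seed, only for reproducibility
x <- rzip(10000, p = 0.3, lambda = 7)

# Fraction of zeros; this should be close to (1 - p) + p * exp(-lambda)
prop_zero <- mean(x == 0)
```

Calling `hist(x)` on the result would give the kind of histogram that part (a) asks for.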
(a) Simulate 10,000 iid random variables $X_i \sim \mathrm{ZIP}(0.3, 7)$ and plot a histogram of the
resulting random variables.
(b) Calculate the theoretical probabilities:

$$P(X = i), \quad i = 0, 1, \ldots, 9,$$

and compare them to the Monte Carlo estimates of these probabilities from your simulations.
(c) Estimate $\mu = E(X)$ where $X \sim \mathrm{ZIP}(p, \lambda)$. Use 1,000 Monte Carlo samples to estimate $\mu$, and give
a 95% confidence interval for your estimate.
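
Parts (b) and (c) can be sketched as follows, under stated assumptions: the helper name `dzip` is hypothetical, the values p = 0.3 and rate 7 are carried over from part (a), and the interval is the usual normal-approximation interval (mean plus or minus 1.96 standard errors), which is one reasonable choice rather than the required one.

```r
# Hypothetical helper: ZIP(p, lambda) probability mass function.
# At i = 0 the extra mass (1 - p) is added to the Poisson term.
dzip <- function(i, p, lambda) {
  (1 - p) * (i == 0) + p * dpois(i, lambda)
}

set.seed(440)                  # arbitrary seed
p <- 0.3; lambda <- 7          # assumed: same parameters as part (a)
n <- 1000
x <- rbinom(n, 1, p) * rpois(n, lambda)   # X = Y * Z, with Y and Z independent

# Theoretical vs Monte Carlo probabilities for i = 0, ..., 9
theo <- dzip(0:9, p, lambda)
emp  <- sapply(0:9, function(i) mean(x == i))
comparison <- cbind(i = 0:9, theoretical = theo, monte_carlo = emp)

# Monte Carlo estimate of E(X) with a normal-approximation 95% interval
mu_hat <- mean(x)
se     <- sd(x) / sqrt(n)
ci     <- mu_hat + c(-1, 1) * qnorm(0.975) * se
```

Printing `round(comparison, 4)` shows the two sets of probabilities side by side.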

2. Checking the Accuracy of a Monte Carlo Integration. Use Monte Carlo to estimate the integral

$$\theta = \int_{2}^{4} (3x^2 - 2x - 10)\,dx.$$

Perform this calculation m = 1000 times each for Monte Carlo sample sizes of n = 1,000, n = 10,000,
and n = 100,000. For each n, plot a histogram of $\hat{\theta}^{(n)}_1, \ldots, \hat{\theta}^{(n)}_m$, and calculate the mean squared
error of the estimate,

$$\mathrm{MSE}(n) = \frac{1}{m} \sum_{i=1}^{m} \left(\hat{\theta}^{(n)}_i - \theta_0\right)^2,$$

where $\hat{\theta}^{(n)}_i$ is the $i$th MC estimate of $\theta$ for sample size $n$, and $\theta_0$ is the true value of the integral $\theta$.
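
One way to set this up is to write the integral as (4 - 2) times the expected value of the integrand at a Uniform(2, 4) draw. The sketch below takes that route for a single sample size. It assumes the integrand is 3x^2 - 2x - 10 (the minus signs are reconstructed), and the names `f`, `F_anti`, and `mc_integral` are hypothetical.

```r
# Assumed integrand: 3x^2 - 2x - 10 on [2, 4]
f <- function(x) 3 * x^2 - 2 * x - 10

# One Monte Carlo estimate of the integral from n uniform draws on [a, b]
mc_integral <- function(n, a = 2, b = 4) {
  u <- runif(n, min = a, max = b)
  (b - a) * mean(f(u))
}

# True value from the antiderivative x^3 - x^2 - 10x
F_anti <- function(x) x^3 - x^2 - 10 * x
theta0 <- F_anti(4) - F_anti(2)

set.seed(440)   # arbitrary seed
m <- 1000       # number of repetitions
n <- 1000       # Monte Carlo sample size; repeat for 10,000 and 100,000
theta_hat <- replicate(m, mc_integral(n))

# Mean squared error of the m estimates around the true value
mse <- mean((theta_hat - theta0)^2)
```

Wrapping the last three lines in a loop over the three sample sizes, with `hist(theta_hat)` inside, would produce the histograms and the MSE for each n.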

3. Tail Moments of the Standard Normal Distribution, Revisited. In the last HW, you estimated
the conditional moment of the standard normal distribution, $Z \sim N(0, 1)$:

$$\theta_\alpha = E[Z \mid Z > \alpha]$$

using Monte Carlo. Now you will do the same using importance sampling.
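
As a reminder of the general idea, here is the generic importance sampling identity in notation that is not from the assignment: f is the target density, g is a proposal density that is positive wherever h(x) f(x) is nonzero, and h is the function whose expectation is wanted.

```latex
E_f[h(X)] = \int h(x)\, f(x)\, dx
          = \int h(x)\, \frac{f(x)}{g(x)}\, g(x)\, dx
          = E_g\!\left[ h(X)\, w(X) \right],
\qquad w(x) = \frac{f(x)}{g(x)},

\hat{\theta}_{IS} = \frac{1}{n} \sum_{i=1}^{n} h(X_i)\, w(X_i),
\qquad X_1, \ldots, X_n \overset{iid}{\sim} g .
```

The standard error of the estimator can be estimated from the sample standard deviation of the terms h(X_i) w(X_i), divided by the square root of n.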

(a) First, use the efficient sampler you wrote in your last HW (or use the one in the posted solutions)
to estimate $\theta_{4.5}$ using Monte Carlo with 1,000 MC samples. Report the estimate $\hat{\theta}_{4.5}$, the time it
took to compute the estimator (most of this time will be spent drawing the $Z \mid Z > \alpha$), and your
standard error.
(b) Now you will estimate $\theta_{4.5}$ using an importance sampler. First, use a $N(\mu, \sigma^2)$ as your proposal
distribution. Show how to write the expectation $\theta_{4.5}$ as an expectation with respect to the $N(\mu, \sigma^2)$
distribution.
(c) Write the density of $Z \mid Z > \alpha$ in terms of the normal pdf and the normal cdf. Implement this
density as an R function. Use your function to plot this density for enough values of z between -1
and 10 to make the plot look smooth.
(d) Implement an importance sampler using a $N(\mu, \sigma^2)$ as the proposal density. Again, use 1,000
samples. Estimate $\theta_{4.5}$ using your importance sampler for a few different values of $\mu$ and $\sigma^2$, and
calculate the standard error. For the best values of $\mu$ and $\sigma^2$ that you find, plot the corresponding
density on top of your plot of the density of $Z \mid Z > \alpha$, and report $\hat{\theta}_{4.5}$, how long it took to compute
the estimator, and the standard error.
(e) Now try a different proposal density. Write another importance sampler that uses an $\mathrm{Exp}(\lambda = \alpha)$
proposal density. Run your new importance sampler using 1,000 samples, and report $\hat{\theta}_{4.5}$, how
long it took to compute the estimator, and the standard error.
(f) Make a table that summarizes your three estimators of $\theta_{4.5}$. This table should contain the point
estimate, the running time, and the standard error. Which estimator do you think is the best?
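
The pieces of parts (c) through (e) fit together roughly as in the sketch below. It is an illustration under stated assumptions, not a reference solution: the helper names `dtail` and `is_estimate` are hypothetical, the normal proposal parameters (mean 4.5, standard deviation 1) are untuned placeholders, and the exponential proposal is assumed to be shifted to start at the threshold 4.5 with rate equal to the threshold.

```r
# Density of Z | Z > alpha for Z ~ N(0, 1): normal pdf rescaled by the
# upper-tail probability, and zero at or below alpha.
dtail <- function(z, alpha) {
  ifelse(z > alpha, dnorm(z) / pnorm(alpha, lower.tail = FALSE), 0)
}

# Importance sampling estimate of E[Z | Z > alpha] from proposal draws x,
# where g holds the proposal density evaluated at each draw.
is_estimate <- function(x, g, alpha) {
  hw <- x * dtail(x, alpha) / g   # h(x) * w(x) with h(x) = x
  c(estimate = mean(hw), se = sd(hw) / sqrt(length(hw)))
}

set.seed(440)   # arbitrary seed
alpha <- 4.5
n <- 1000

# Normal proposal; mu and sigma are illustrative values, not tuned
mu <- 4.5; sigma <- 1
x_norm   <- rnorm(n, mean = mu, sd = sigma)
res_norm <- is_estimate(x_norm, dnorm(x_norm, mean = mu, sd = sigma), alpha)

# Assumed form of the exponential proposal: alpha + Exp(rate = alpha)
x_exp   <- alpha + rexp(n, rate = alpha)
res_exp <- is_estimate(x_exp, dexp(x_exp - alpha, rate = alpha), alpha)

# Rows for a summary table; timings could be added with system.time()
summary_table <- rbind(normal = res_norm, exponential = res_exp)
```

When trying other normal proposals, note that a proposal whose tails are much lighter than the target's can produce a few very large weights, which shows up as an unstable estimate and standard error. `curve(dtail(x, 4.5), from = -1, to = 10, n = 1000)` is one way to draw the smooth density plot from part (c).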
