Practice 1
EXERCISE 1
Assume that the measurement error of a certain device follows a uniform
distribution on the interval [0, θ]. We want to estimate the parameter θ from samples of different
sizes, simulated with R.
1. Construct the estimators of θ using the method of moments and the maximum likelihood method.
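As a reference for item 1, the two estimators can be derived as follows. This is a sketch assuming the observations X_1, ..., X_k are independent and uniform on [0, θ]; the symbols \bar{X} (sample mean) and X_{(k)} (sample maximum) are notation introduced here, not taken from the sheet.

```latex
\begin{align*}
\text{Moments:}\quad & E[X] = \frac{\theta}{2}
  \;\Rightarrow\; \bar{X} = \frac{\hat\theta_{MM}}{2}
  \;\Rightarrow\; \hat\theta_{MM} = 2\bar{X}, \\
\text{Likelihood:}\quad & L(\theta) = \prod_{i=1}^{k} \frac{1}{\theta}\,\mathbf{1}\{0 \le x_i \le \theta\}
  = \theta^{-k}\,\mathbf{1}\{\theta \ge x_{(k)}\}.
\end{align*}
```

Since θ^{-k} is decreasing in θ and the likelihood is zero for θ < x_{(k)}, the maximum is reached at the smallest admissible value, so the maximum likelihood estimator is X_{(k)} = max(X_1, ..., X_k).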
2. Simulation exercise:
Generate the value of the parameter, fixing the seed. The true value of the parameter
will be a random value between 0 and 2:
> set.seed(12345)
> theta=runif(1,min=0,max=2)
We run the process once: measure k=10 measurement errors (generated with
R) and estimate θ using both methods. (Hint: in R, uniform values
are generated with the function runif, the maximum is calculated with the max function,
and the mean with the mean function.)
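A single run could look like the sketch below. It assumes the two estimators are 2*mean(x) for the method of moments and max(x) for maximum likelihood, which is what both methods give for a uniform on [0, θ]; the names k, x, theta_mm and theta_ml are illustrative and do not come from the sheet.

```r
set.seed(12345)
theta <- runif(1, min = 0, max = 2)   # true parameter, generated as in the text

k <- 10
x <- runif(k, min = 0, max = theta)   # k simulated measurement errors

theta_mm <- 2 * mean(x)               # method-of-moments estimate (assumed form)
theta_ml <- max(x)                    # maximum-likelihood estimate (assumed form)
c(MM = theta_mm, ML = theta_ml)
```

Note that the maximum of the sample can never exceed the true θ, whereas twice the mean can fall on either side of it.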
We do the simulation 1000 times and record the values of both estimators for each
sample:
#Matrix with 1000 rows and 2 columns for the estimators
> res=matrix(0,ncol=2,nrow=1000)
> for (i in 1:1000){
+ x=...generate k=10 values from a uniform between 0 and theta...
+ res[i,]=c(...value of MM estimator...,...value of ML estimator...)
+ }
We plot the values of the estimators in a boxplot and mark the true value of θ with
a dotted red line.
> boxplot(res)
> abline(h=theta,col=2,lty=3)
Compute the mean and the variance of the calculated statistics:
> apply(res,2,mean)
> apply(res,2,var)
              Moments method        Maximum Likelihood
Sample size   Mean      Variance    Mean      Variance
k = 10
k = 100
k = 1000
3. Repeat this simulation exercise with samples of size k = 100 and k = 1000. Fill in the table above.
4. Based on the results, discuss the advantages and disadvantages of each estimator as a function of the
sample size. According to the minimum mean squared error criterion, which do you think is the
best estimator?
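One possible way to fill the table and compare mean squared errors is sketched below. The helper simulate_estimators and its arguments are hypothetical names introduced for illustration, and the estimators are again assumed to be 2*mean(x) and max(x).

```r
set.seed(12345)
theta <- runif(1, min = 0, max = 2)   # true parameter, generated as in the text

# Hypothetical helper: run nsim replications with samples of size k and
# return the mean, variance and mean squared error of both estimators.
simulate_estimators <- function(k, theta, nsim = 1000) {
  res <- matrix(0, nrow = nsim, ncol = 2,
                dimnames = list(NULL, c("MM", "ML")))
  for (i in seq_len(nsim)) {
    x <- runif(k, min = 0, max = theta)
    res[i, ] <- c(2 * mean(x), max(x))
  }
  rbind(mean = apply(res, 2, mean),
        var  = apply(res, 2, var),
        mse  = apply((res - theta)^2, 2, mean))
}

sizes <- c(10, 100, 1000)
out <- lapply(sizes, simulate_estimators, theta = theta)
names(out) <- paste0("k=", sizes)
out
```

The mse row combines variance and squared bias, so it captures both effects at once: the moments estimator is unbiased, while the sample maximum never exceeds θ and therefore has a mean below θ.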
EXERCISE 2
Assume that the height of the inhabitants of a country can be modeled by a random
variable X ~ N(μ = 175, σ² = 10²).
x <- rnorm(20,mean=175,sd=10)
2. Carry out a descriptive analysis to calculate the mean and the standard deviation.
> summary(x)
> mean(x)
> sd(x)
> ic(x,10)
The code of a function can be saved in a file with the extension .R. Stored functions can be loaded
into the session using
> source("C:/misprogramas/R/ic.R")
where ic.R is the file containing the code of the function ic.
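The body of ic is not shown here, so the following is only a hypothetical reconstruction: a function that returns a confidence interval for the mean of a normal sample when the standard deviation is known. The argument names sigma and conf, and the default level of 0.95, are assumptions.

```r
# Hypothetical reconstruction: the sheet does not show the body of ic().
# Confidence interval for the mean of a normal sample with known sigma.
ic <- function(x, sigma, conf = 0.95) {
  z <- qnorm(1 - (1 - conf) / 2)           # approx. 1.96 for conf = 0.95
  half <- z * sigma / sqrt(length(x))      # half-width of the interval
  c(lower = mean(x) - half, upper = mean(x) + half)
}

set.seed(1)                                # arbitrary seed, for reproducibility only
x <- rnorm(20, mean = 175, sd = 10)
ic(x, 10)
```

Saved as ic.R, a definition like this would be loaded with source as described above.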
4. Calculate the CI99%(μ). How do you explain that this interval is wider but more reliable than the
previous one?
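To see where the extra width comes from, the half-widths of the two intervals can be compared directly. The sketch assumes a known standard deviation of 10 and the sample size n = 20 used in this exercise; half95 and half99 are illustrative names.

```r
n <- 20
sigma <- 10                                   # known standard deviation
half95 <- qnorm(0.975) * sigma / sqrt(n)      # half-width at 95% confidence
half99 <- qnorm(0.995) * sigma / sqrt(n)      # half-width at 99% confidence
c(half95 = half95, half99 = half99, ratio = half99 / half95)
```

Only the normal quantile changes between the two levels, so with known σ the 99% interval is wider by the same factor for every sample.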
Why is this interval expected to be wider than the previous one? Could it have been narrower?
6. The CI95%(μ) assuming σ² unknown can be obtained with the R function t.test:
> t.test(x)
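The interval reported by t.test can also be reproduced by hand, which makes explicit that it uses the sample standard deviation and a Student t quantile with n - 1 degrees of freedom. The seed and the names half and manual are illustrative choices.

```r
set.seed(1)                                   # arbitrary seed
x <- rnorm(20, mean = 175, sd = 10)
n <- length(x)

half <- qt(0.975, df = n - 1) * sd(x) / sqrt(n)   # t-based half-width
manual <- c(mean(x) - half, mean(x) + half)
manual
t.test(x)$conf.int                            # should agree with manual
```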
EXERCISE 3
We will reproduce the results by taking 200 samples of 50 people and measuring their heights. We
take 200 samples of the random variable height X ~ N(μ = 175, σ² = 10²), each of size n = 50.
3. Calculate the mean and the standard deviation of the 200 sample means. Draw a histogram.
Does the shape of the histogram resemble the density of a
normal (Gaussian) distribution?
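A possible way to obtain the 200 sample means is sketched below. The layout (one sample per row) and the names samples and ms are assumptions, since the corresponding steps are not shown here.

```r
set.seed(1)                                   # arbitrary seed
n <- 50                                       # size of each sample
m <- 200                                      # number of samples

# One sample per row: m samples of size n from N(175, 10^2)
samples <- matrix(rnorm(m * n, mean = 175, sd = 10), nrow = m, ncol = n)
ms <- apply(samples, 1, mean)                 # the 200 sample means

c(mean = mean(ms), sd = sd(ms), theoretical_sd = 10 / sqrt(n))
```

The last element is the standard deviation σ/√n that theory predicts for the sample mean, which gives a reference value for the observed sd(ms).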
The vector vs2 now holds 200 values of S², for n = 50. Draw a histogram of vs2.
> hist(vs2)
Does the shape of the histogram resemble the density of a
chi-square distribution?
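The comparison with the chi-square can also be checked numerically. For normal data, (n - 1)S²/σ² follows a chi-square distribution with n - 1 degrees of freedom, whose mean is n - 1 and whose variance is 2(n - 1). In the sketch below vs2 is the name used in this sheet, while samples and q are illustrative names.

```r
set.seed(1)                                   # arbitrary seed
n <- 50
m <- 200
sigma <- 10

samples <- matrix(rnorm(m * n, mean = 175, sd = sigma), nrow = m)
vs2 <- apply(samples, 1, var)                 # 200 values of S^2
q <- (n - 1) * vs2 / sigma^2                  # rescaled sample variances

c(mean_q = mean(q), var_q = var(q),
  chisq_mean = n - 1, chisq_var = 2 * (n - 1))
```

With 49 degrees of freedom the chi-square is only mildly skewed, so the histogram of vs2 may look fairly symmetric.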
5. For each of these 200 samples, calculate the lower and upper ends of the confidence
interval CI95%(μ) with σ known:
Lower end:
Upper end:
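The expressions for the two ends are missing here. A sketch using the standard interval for a normal mean with known σ, that is, the sample mean plus or minus z_{0.975} σ/√n, could be written as follows. The names ll and ul match those used in the plotting code of this exercise; samples, ms and half are assumptions.

```r
set.seed(1)                                   # arbitrary seed
n <- 50
m <- 200
sigma <- 10                                   # known standard deviation

samples <- matrix(rnorm(m * n, mean = 175, sd = sigma), nrow = m)
ms <- apply(samples, 1, mean)                 # sample means

half <- qnorm(0.975) * sigma / sqrt(n)        # same half-width for every sample
ll <- ms - half                               # lower ends
ul <- ms + half                               # upper ends
head(cbind(ll, ul))
```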
6. How many of these intervals do not contain the true value of the parameter μ = 175?
You can draw the 200 intervals in a graph, which also shows the proportion of intervals
that do not contain the true value of μ:
> plot(0,type="n",xlim=c(0,200),ylim=c(167,183))
> abline(h=175,col=4)
> segments(1:200,ll,1:200,ul,col=1+(ll>175 | ul<175))
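The condition that colours the segments can be reused to count the intervals that miss the true mean. The sketch rebuilds ll and ul under the known-σ assumption so that it runs on its own; miss is an illustrative name.

```r
set.seed(1)                                   # arbitrary seed
n <- 50
m <- 200
samples <- matrix(rnorm(m * n, mean = 175, sd = 10), nrow = m)
ms <- apply(samples, 1, mean)
ll <- ms - qnorm(0.975) * 10 / sqrt(n)        # lower ends, sigma known
ul <- ms + qnorm(0.975) * 10 / sqrt(n)        # upper ends, sigma known

miss <- ll > 175 | ul < 175                   # TRUE when 175 lies outside the interval
c(count = sum(miss), proportion = mean(miss))
```

At a 95% confidence level roughly 5% of the intervals, about 10 out of 200, are expected to miss, with some variation from one simulation to another.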
7. Repeat the last two items, constructing the CI95%(μ) with σ unknown.
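When σ is unknown, each interval uses its own sample standard deviation and a Student t quantile with n - 1 degrees of freedom, so the widths differ from sample to sample. A minimal sketch, where samples, ms, ss and half are illustrative names:

```r
set.seed(1)                                   # arbitrary seed
n <- 50
m <- 200
samples <- matrix(rnorm(m * n, mean = 175, sd = 10), nrow = m)

ms <- apply(samples, 1, mean)                 # sample means
ss <- apply(samples, 1, sd)                   # sample standard deviations
half <- qt(0.975, df = n - 1) * ss / sqrt(n)  # one half-width per sample
ll <- ms - half
ul <- ms + half

mean(ll > 175 | ul < 175)                     # proportion of intervals missing mu
```

The vectors ll and ul can be passed to the same plot, abline and segments calls used in the previous item.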