x = rnorm(100,mean = 10, sd = 4)
dnorm(8,mean = 10,sd = 4)
[1] 0.08801633
plot(density(x))
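As a quick sanity check (a sketch added here, not part of the original output), dnorm() simply evaluates the normal density formula, so the value above can be reproduced by hand:

```r
# Evaluate the normal density formula directly and compare with dnorm().
mu <- 10; s <- 4; x0 <- 8
manual <- exp(-(x0 - mu)^2 / (2 * s^2)) / (s * sqrt(2 * pi))
manual                                        # 0.08801633, same as dnorm(8, 10, 4)
all.equal(manual, dnorm(x0, mean = mu, sd = s))
```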
In general, a pair of jointly normal variables is written as
\[
\begin{pmatrix} x_1 \\ x_2 \end{pmatrix} \sim N\!\left( \begin{pmatrix} \mu_1 \\ \mu_2 \end{pmatrix}, \begin{pmatrix} \sigma_1^2 & \sigma_{12} \\ \sigma_{12} & \sigma_2^2 \end{pmatrix} \right)
\]
To work with multivariate distributions, we need to use the package mvtnorm. Install and load the package.
install.packages("mvtnorm")
library(mvtnorm)
Suppose we want to create random data from the following joint distribution.
\[
\begin{pmatrix} x_1 \\ x_2 \end{pmatrix} \sim N\!\left( \begin{pmatrix} 4 \\ 10 \end{pmatrix}, \begin{pmatrix} 2 & 1 \\ 1 & 3 \end{pmatrix} \right)
\]
M = c(4,10)
S = matrix(c(2,1,1,3),nrow = 2,ncol = 2)
S
x = rmvnorm(1000,M,sigma = S)
head(x)
[,1] [,2]
[1,] 3.4458896 9.982241
[2,] 1.6570184 9.352386
[3,] 4.9145387 8.972307
[4,] 5.7328493 10.121716
[5,] 3.8460262 10.883015
[6,] 5.1941458 10.597796
If you want to evaluate the joint density at the values 4 and 9 for the two variables (a density, not a probability, since the variables are continuous):
dmvnorm(c(4,9),mean = M,sigma = S)
[1] 0.05827419
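The same number can be verified against the closed-form bivariate normal density (a check added here for illustration, using the M and S defined above):

```r
library(mvtnorm)
M <- c(4, 10)
S <- matrix(c(2, 1, 1, 3), nrow = 2, ncol = 2)
z <- c(4, 9)
d <- z - M
# f(z) = exp(-0.5 * d' S^{-1} d) / (2*pi*sqrt(det(S))) for a bivariate normal
manual <- exp(-0.5 * t(d) %*% solve(S) %*% d) / (2 * pi * sqrt(det(S)))
as.numeric(manual)                 # 0.05827419, matching dmvnorm(c(4,9), M, S)
```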
To estimate and plot the joint density surface, we use the function kde2d() from the MASS package.
library(MASS)
y = kde2d(x = x[,1],y=x[,2])
persp(y,col="red",theta = 30)
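As a complementary view (an addition, not in the original notes), the same kernel density estimate can also be drawn as a contour plot; the seed and simulated data below are hypothetical stand-ins for the x generated earlier:

```r
library(mvtnorm)
library(MASS)
set.seed(1)                              # hypothetical seed, for reproducibility
x <- rmvnorm(1000, c(4, 10), sigma = matrix(c(2, 1, 1, 3), 2, 2))
y <- kde2d(x = x[, 1], y = x[, 2])       # 2-D kernel density estimate
contour(y, xlab = "x1", ylab = "x2")     # contour view of the same surface
```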
Now consider two separate regressions using the hsb2 data, one for reading scores and one for math scores, each with its own normally distributed error term:
\[
\epsilon_1 \sim N(0, \sigma_1^2), \qquad \epsilon_2 \sim N(0, \sigma_2^2)
\]
reg1 = lm(read~as.numeric(hsb2$female)+as.numeric(hsb2$ses)+socst,data = hsb2)
summary(reg1)
Estimate Std. Error t value Pr(>|t|)
(Intercept) 20.89628 3.87283 5.396 1.96e-07 ***
as.numeric(hsb2$female) 1.77089 1.13889 1.555 0.122
as.numeric(hsb2$ses) -0.88930 0.67222 -1.323 0.187
socst 0.58583 0.05373 10.903 < 2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
reg2 = lm(math~as.numeric(hsb2$female)+as.numeric(hsb2$ses)+science,data = hsb2)
summary(reg2)
An implicit assumption in the above is that the error terms in the two models are uncorrelated. But perhaps they are correlated. For example, a student who does better than expected in reading (positive error) may also do better in math (positive error) because the student is smart overall; or perhaps the errors are negatively correlated (a student who is good in math may not be good in reading, and vice versa). A better approach is to allow for the possibility of correlation between the two error terms, as shown below.
\[
\begin{pmatrix} \epsilon_1 \\ \epsilon_2 \end{pmatrix} \sim N\!\left( \begin{pmatrix} 0 \\ 0 \end{pmatrix}, \begin{pmatrix} \sigma_1^2 & \sigma_{12} \\ \sigma_{12} & \sigma_2^2 \end{pmatrix} \right)
\]
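To see what correlated errors look like in practice, here is a small simulation (hypothetical data, not hsb2) in which the two error terms are drawn jointly with rmvnorm; the cross-equation correlation survives in the residuals of two separate OLS fits:

```r
library(mvtnorm)
set.seed(42)                             # hypothetical seed, for reproducibility
n  <- 500
x1 <- rnorm(n); x2 <- rnorm(n)
Sigma <- matrix(c(1, 0.5, 0.5, 1), 2, 2) # error covariance with s12 = 0.5
e  <- rmvnorm(n, mean = c(0, 0), sigma = Sigma)
y1 <- 1 + 2 * x1 + e[, 1]                # first equation
y2 <- 3 - 1 * x2 + e[, 2]                # second equation
# Residuals from separate OLS fits recover a correlation near 0.5:
cor(resid(lm(y1 ~ x1)), resid(lm(y2 ~ x2)))
```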
This approach is called seemingly unrelated regression. The following code provides the maximum
likelihood estimation for this specification.
library(bbmle)
LLSUR = function(a0,a1,a2,a3,b0,b1,b2,b3,s1,s2,s12)
{
  # Predicted values for the two equations
  y1 = a0 + a1*(as.numeric(hsb2$female)) + a2*(as.numeric(hsb2$ses)) + a3*(hsb2$socst)
  y2 = b0 + b1*(as.numeric(hsb2$female)) + b2*(as.numeric(hsb2$ses)) + b3*(hsb2$science)
  # Residuals (estimated errors) for each equation
  e1 = hsb2$read - y1
  e2 = hsb2$math - y2
  # Error covariance matrix: variances s1, s2 and covariance s12
  S = matrix(c(s1,s12,s12,s2),nrow = 2,ncol = 2)
  # Joint log-likelihood of the residual pairs under the bivariate normal
  LLsum = sum(dmvnorm(cbind(e1,e2),mean = c(0,0),sigma = S,log = TRUE))
  return(-1*LLsum)  # mle2() minimizes the negative log-likelihood
}
res1 = mle2(minuslogl = LLSUR,
            start = list(a0 = mean(hsb2$read), a1 = 0, a2 = 0, a3 = 0,
                         b0 = mean(hsb2$math), b1 = 0, b2 = 0, b3 = 0,
                         s1 = 100, s2 = 100, s12 = cov(hsb2$read, hsb2$math)))
summary(res1)
Coefficients:
Estimate Std. Error z value Pr(z)
a0 25.405590 4.108955 6.1830 6.290e-10 ***
a1 1.705661 1.133366 1.5050 0.132336
a2 -1.055350 0.670755 -1.5734 0.115631
a3 0.508460 0.058770 8.6518 < 2.2e-16 ***
b0 29.949253 3.761965 7.9611 1.706e-15 ***
b1 -0.668848 1.045196 -0.6399 0.522221
b2 -0.918586 0.609210 -1.5078 0.131597
b3 0.495192 0.061306 8.0774 6.615e-16 ***
s1 63.499317 6.428655 9.8775 < 2.2e-16 ***
s2 52.924588 5.424540 9.7565 < 2.2e-16 ***
s12 16.183306 5.338093 3.0317 0.002432 **
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Comparing the results to separate regressions, we see that the coefficients are somewhat different. Is there
correlation between the two error terms?
16.1833/(sqrt(63.499)*sqrt(52.9245))
[1] 0.2791613
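The same correlation can be read off with the base function cov2cor(), applied to the estimated error covariance matrix reported above (values copied from the mle2 output):

```r
# Estimated error covariance matrix from the SUR fit above
Shat <- matrix(c(63.499317, 16.183306,
                 16.183306, 52.924588), nrow = 2, ncol = 2)
cov2cor(Shat)[1, 2]                      # 0.2791613
```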
This is yet another illustration of how the maximum likelihood principle allows you to design and estimate more complicated specifications.