\[
\mathrm{Var}(\varepsilon_j) = \sigma^2, \qquad \mathrm{Cov}(\varepsilon_j, \varepsilon_k) = 0 \ \text{for } j \neq k,
\]
so that $\mathrm{Cov}(Y) = \sigma^2 I$.
Residuals

The residuals are
\[
\hat{\varepsilon} = Y - \hat{Y} = (I - H)Y,
\]
and, since $I - H$ is symmetric and idempotent,
\[
\hat{\varepsilon}'\hat{\varepsilon} = Y'(I - H)'(I - H)Y = Y'(I - H)Y.
\]
Sums of squares

We can partition the variability in $y$ into variability due to changes in the predictors and variability due to random noise (effects other than the predictors). The sum of squares decomposition is:
\[
\underbrace{\sum_{j=1}^{n} (y_j - \bar{y})^2}_{\text{Total SS}}
= \underbrace{\sum_{j=1}^{n} (\hat{y}_j - \bar{y})^2}_{\text{SSReg}}
+ \underbrace{\sum_{j=1}^{n} \hat{\varepsilon}_j^2}_{\text{SSError}}.
\]
The coefficient of determination is
\[
R^2 = \frac{\text{SSR}}{\text{SST}} = 1 - \frac{\text{SSE}}{\text{SST}}.
\]
For the least squares estimator:
\[
E(\hat{\beta}) = \beta, \qquad \mathrm{Cov}(\hat{\beta}) = \sigma^2 (Z'Z)^{-1}.
\]
For residuals:
\[
E(\hat{\varepsilon}) = 0, \qquad \mathrm{Cov}(\hat{\varepsilon}) = \sigma^2 (I - H), \qquad E(\hat{\varepsilon}'\hat{\varepsilon}) = (n - r - 1)\sigma^2.
\]
An unbiased estimate of $\sigma^2$ is
\[
s^2 = \frac{\text{SSE}}{n - r - 1} = \frac{\hat{\varepsilon}'\hat{\varepsilon}}{n - (r + 1)} = \frac{Y'(I - H)Y}{n - r - 1}.
\]
Under normal errors, $\hat{\beta}$ is distributed independent of $s^2$, and furthermore $(n - r - 1)s^2/\sigma^2 \sim \chi^2_{n-r-1}$.
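As a quick numerical check of these formulas, here is a minimal R sketch on simulated toy data (all names and values are illustrative, not from the example analyzed later):

set.seed(1)
n <- 20
z1 <- rnorm(n); z2 <- rnorm(n)
y  <- 1 + 2 * z1 - z2 + rnorm(n)
Z  <- cbind(1, z1, z2)                        # n x (r+1) design matrix, r = 2

H   <- Z %*% solve(t(Z) %*% Z) %*% t(Z)       # hat matrix
e   <- (diag(n) - H) %*% y                    # residuals (I - H)y
SST <- sum((y - mean(y))^2)
SSE <- sum(e^2)                               # = y'(I - H)y
SSR <- SST - SSE
s2  <- SSE / (n - 2 - 1)                      # SSE / (n - r - 1)

fit <- lm(y ~ z1 + z2)
all.equal(s2, summary(fit)$sigma^2)           # TRUE
all.equal(SSR / SST, summary(fit)$r.squared)  # TRUE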
Confidence Intervals

Estimating $\mathrm{Cov}(\hat{\beta})$ as $\widehat{\mathrm{Cov}}(\hat{\beta}) = s^2 (Z'Z)^{-1}$, a $100(1-\alpha)\%$ confidence region for $\beta$ is the set of values of $\beta$ that satisfy:
\[
\frac{(\hat{\beta} - \beta)' Z'Z (\hat{\beta} - \beta)}{s^2} \le (r + 1) F_{r+1,\, n-r-1}(\alpha),
\]
where $r + 1$ is the rank of $Z$.

An unbiased estimate of the mean response at $z_0$ is $\hat{Y}_0 = z_0'\hat{\beta}$, with variance $\sigma^2\, z_0'(Z'Z)^{-1} z_0$.
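Continuing the toy R example, a sketch of the confidence-region check and of the mean-response estimate at an illustrative $z_0$ (the values of z0 are assumptions for illustration):

b    <- solve(t(Z) %*% Z) %*% t(Z) %*% y      # least squares estimate
beta <- c(1, 2, -1)                           # true coefficients (known in the simulation)
Q    <- t(b - beta) %*% t(Z) %*% Z %*% (b - beta) / s2
Q <= (2 + 1) * qf(0.95, 2 + 1, n - 2 - 1)     # TRUE when the region covers beta (about 95% of the time)

z0    <- c(1, 0.5, -0.5)                      # illustrative predictor values
Y0hat <- drop(t(z0) %*% b)                    # z0' beta-hat
se0   <- drop(sqrt(s2 * t(z0) %*% solve(t(Z) %*% Z) %*% z0))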
Multivariate Regression

With $p$ responses measured on each of $n$ units, the model is
\[
Y_{n \times p} = Z_{n \times (r+1)}\, \beta_{(r+1) \times p} + \varepsilon_{n \times p},
\]
where
\[
\beta_{(r+1) \times p} =
\begin{bmatrix}
\beta_{01} & \beta_{02} & \cdots & \beta_{0p} \\
\beta_{11} & \beta_{12} & \cdots & \beta_{1p} \\
\vdots & \vdots & & \vdots \\
\beta_{r1} & \beta_{r2} & \cdots & \beta_{rp}
\end{bmatrix}
=
\begin{bmatrix}
\beta_{(1)} & \beta_{(2)} & \cdots & \beta_{(p)}
\end{bmatrix},
\]
\[
\varepsilon_{n \times p} =
\begin{bmatrix}
\varepsilon_{11} & \varepsilon_{12} & \cdots & \varepsilon_{1p} \\
\varepsilon_{21} & \varepsilon_{22} & \cdots & \varepsilon_{2p} \\
\vdots & \vdots & & \vdots \\
\varepsilon_{n1} & \varepsilon_{n2} & \cdots & \varepsilon_{np}
\end{bmatrix}
=
\begin{bmatrix}
\varepsilon_{(1)} & \varepsilon_{(2)} & \cdots & \varepsilon_{(p)}
\end{bmatrix}
=
\begin{bmatrix}
\varepsilon_1' \\ \varepsilon_2' \\ \vdots \\ \varepsilon_n'
\end{bmatrix}.
\]
The columns of $\varepsilon$ satisfy
\[
\mathrm{Cov}(\varepsilon_{(i)}, \varepsilon_{(k)}) = \sigma_{ik} I, \qquad i, k = 1, 2, \ldots, p.
\]
The least squares estimator is
\[
\hat{\beta} =
\begin{bmatrix}
\hat{\beta}_{(1)} & \hat{\beta}_{(2)} & \cdots & \hat{\beta}_{(p)}
\end{bmatrix}
= (Z'Z)^{-1} Z' Y,
\]
with residual matrix $\hat{\varepsilon} = Y - Z\hat{\beta} = Y - \hat{Y}$ and error sums of squares and cross-products
\[
\hat{\varepsilon}'\hat{\varepsilon} =
\begin{bmatrix}
(Y_{(1)} - Z\hat{\beta}_{(1)})'(Y_{(1)} - Z\hat{\beta}_{(1)}) & \cdots & (Y_{(1)} - Z\hat{\beta}_{(1)})'(Y_{(p)} - Z\hat{\beta}_{(p)}) \\
(Y_{(2)} - Z\hat{\beta}_{(2)})'(Y_{(1)} - Z\hat{\beta}_{(1)}) & \cdots & (Y_{(2)} - Z\hat{\beta}_{(2)})'(Y_{(p)} - Z\hat{\beta}_{(p)}) \\
\vdots & & \vdots \\
(Y_{(p)} - Z\hat{\beta}_{(p)})'(Y_{(1)} - Z\hat{\beta}_{(1)}) & \cdots & (Y_{(p)} - Z\hat{\beta}_{(p)})'(Y_{(p)} - Z\hat{\beta}_{(p)})
\end{bmatrix}.
\]
The usual decomposition into sums of squares and cross-products can be shown to be:
\[
\underbrace{Y'Y}_{\text{Total SSCP}} = \underbrace{\hat{Y}'\hat{Y}}_{\text{Reg. SSCP}} + \underbrace{\hat{\varepsilon}'\hat{\varepsilon}}_{\text{Error SSCP}},
\]
so that $\hat{\varepsilon}'\hat{\varepsilon} = Y'Y - \hat{Y}'\hat{Y}$.
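A minimal R sketch of multivariate least squares and the SSCP decomposition, again on simulated toy data (names and dimensions are illustrative):

set.seed(2)
n <- 20
Z <- cbind(1, rnorm(n), rnorm(n))             # n x (r+1) design, r = 2
B.true <- matrix(c(1, 2, -1, 0, 1, 1), 3, 2)  # true coefficients, p = 2 responses
Y <- Z %*% B.true + matrix(rnorm(2 * n), n, 2)

B    <- solve(t(Z) %*% Z) %*% t(Z) %*% Y      # beta-hat, one column per response
Yhat <- Z %*% B
E    <- Y - Yhat                              # residual matrix

all.equal(t(Y) %*% Y, t(Yhat) %*% Yhat + t(E) %*% E)   # SSCP decomposition, TRUE

fit.full <- lm(Y ~ Z[, 2] + Z[, 3])           # same fit via lm() on a matrix response
all.equal(unname(coef(fit.full)), unname(B))  # TRUE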
Properties of Estimators

For the multivariate regression model, and with $Z$ of full rank $r + 1 < n$:
\[
E(\hat{\beta}) = \beta, \qquad \mathrm{Cov}(\hat{\beta}_{(i)}, \hat{\beta}_{(k)}) = \sigma_{ik} (Z'Z)^{-1}, \qquad i, k = 1, \ldots, p.
\]
The estimated residuals $\hat{\varepsilon}_{(i)}$ satisfy $E(\hat{\varepsilon}_{(i)}) = 0$ and
\[
E(\hat{\varepsilon}_{(i)}'\hat{\varepsilon}_{(k)}) = (n - r - 1)\sigma_{ik},
\]
and therefore
\[
E(\hat{\varepsilon}) = 0, \qquad E(\hat{\varepsilon}'\hat{\varepsilon}) = (n - r - 1)\Sigma.
\]
Prediction

For a given set of predictors $z_0' = [1, z_{01}, \ldots, z_{0r}]$ we can simultaneously estimate the mean responses $z_0'\beta$ for all $p$ response variables as $z_0'\hat{\beta}$. The estimation errors have covariances
\[
E[z_0'(\hat{\beta}_{(i)} - \beta_{(i)})(\hat{\beta}_{(k)} - \beta_{(k)})' z_0]
= z_0'\, E[(\hat{\beta}_{(i)} - \beta_{(i)})(\hat{\beta}_{(k)} - \beta_{(k)})']\, z_0
= \sigma_{ik}\, z_0'(Z'Z)^{-1} z_0.
\]
Prediction

A single observation at $z = z_0$ can also be estimated unbiasedly by $z_0'\hat{\beta}$, but the forecast errors $(Y_{0i} - z_0'\hat{\beta}_{(i)})$ and $(Y_{0k} - z_0'\hat{\beta}_{(k)})$ may be correlated:
\begin{align*}
E[(Y_{0i} - z_0'\hat{\beta}_{(i)})(Y_{0k} - z_0'\hat{\beta}_{(k)})]
&= E[(\varepsilon_{(0i)} - z_0'(\hat{\beta}_{(i)} - \beta_{(i)}))(\varepsilon_{(0k)} - z_0'(\hat{\beta}_{(k)} - \beta_{(k)}))] \\
&= E(\varepsilon_{(0i)}\varepsilon_{(0k)}) + z_0'\, E[(\hat{\beta}_{(i)} - \beta_{(i)})(\hat{\beta}_{(k)} - \beta_{(k)})']\, z_0 \\
&= \sigma_{ik}\,(1 + z_0'(Z'Z)^{-1} z_0).
\end{align*}
The MLE of $\Sigma$ is
\[
\hat{\Sigma} = \frac{1}{n}\hat{\varepsilon}'\hat{\varepsilon} = \frac{1}{n}(Y - Z\hat{\beta})'(Y - Z\hat{\beta}).
\]
The sampling distribution of $n\hat{\Sigma}$ is $W_{p,\,n-r-1}(\Sigma)$, a Wishart distribution.
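In the toy R example, the MLE of $\Sigma$ is the residual cross-product matrix with divisor n; the unbiased version divides by n - r - 1 instead:

Sig.hat <- crossprod(resid(fit.full)) / n             # MLE of Sigma (divisor n)
S       <- crossprod(resid(fit.full)) / (n - 2 - 1)   # unbiased version, divisor n - r - 1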
Likelihood Ratio Tests

To test $H_0: \beta_{(2)} = 0$, we set $Z = [Z_1 \;\; Z_2]$ and partition
\[
\beta =
\begin{bmatrix}
\beta_{(1),\,(q+1) \times p} \\
\beta_{(2),\,(r-q) \times p}
\end{bmatrix}.
\]
The likelihood ratio is
\[
\Lambda = \frac{\max_{\beta_{(1)},\,\Sigma} L(\beta_{(1)}, \Sigma)}{\max_{\beta,\,\Sigma} L(\beta, \Sigma)}
= \frac{L(\hat{\beta}_{(1)}, \hat{\Sigma}_1)}{L(\hat{\beta}, \hat{\Sigma})}
= \left( \frac{|\hat{\Sigma}|}{|\hat{\Sigma}_1|} \right)^{n/2},
\]
so that
\[
-2 \ln \Lambda = -n \ln \frac{|\hat{\Sigma}|}{|\hat{\Sigma}_1|}
= -n \ln \frac{|n\hat{\Sigma}|}{|n\hat{\Sigma} + n(\hat{\Sigma}_1 - \hat{\Sigma})|}.
\]
For large $n$,
\[
-\left[ n - r - 1 - \frac{1}{2}(p - r + q + 1) \right] \ln \frac{|\hat{\Sigma}|}{|\hat{\Sigma}_1|}
\sim \chi^2_{p(r-q)}, \quad \text{approximately.}
\]
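A sketch of this test in R on the toy data above, dropping the last predictor (so q = 1, r = 2, p = 2); the fit objects come from the earlier toy code, not from the worked example below:

p <- 2; r <- 2; q <- 1
fit.red <- lm(Y ~ Z[, 2])                     # reduced model under H0
Sig  <- crossprod(resid(fit.full)) / n        # MLE under the full model
Sig1 <- crossprod(resid(fit.red))  / n        # MLE under H0
stat <- -(n - r - 1 - 0.5 * (p - r + q + 1)) * log(det(Sig) / det(Sig1))
pchisq(stat, df = p * (r - q), lower.tail = FALSE)   # approximate p-value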
Other Tests

Most software (R/SAS) reports values of test statistics such as:

1. Wilks' Lambda: $\Lambda^{*} = \dfrac{|\hat{\varepsilon}'\hat{\varepsilon}|}{|\hat{\varepsilon}_0'\hat{\varepsilon}_0|}$

2. Pillai's trace criterion: $\mathrm{trace}[(\hat{\varepsilon}_0'\hat{\varepsilon}_0 - \hat{\varepsilon}'\hat{\varepsilon})(\hat{\varepsilon}_0'\hat{\varepsilon}_0)^{-1}]$

3. Lawley-Hotelling's trace: $\mathrm{trace}[(\hat{\varepsilon}_0'\hat{\varepsilon}_0 - \hat{\varepsilon}'\hat{\varepsilon})(\hat{\varepsilon}'\hat{\varepsilon})^{-1}]$

4. Roy's maximum root test: largest eigenvalue of $(\hat{\varepsilon}_0'\hat{\varepsilon}_0 - \hat{\varepsilon}'\hat{\varepsilon})(\hat{\varepsilon}'\hat{\varepsilon})^{-1}$

Here $\hat{\varepsilon}_0$ denotes the residuals under $H_0$. Note that Wilks' Lambda is directly related to the likelihood ratio test. Also, $1/\Lambda^{*} = \prod_{i=1}^{p}(1 + l_i)$, where the $l_i$ are the roots of $|(\hat{\varepsilon}_0'\hat{\varepsilon}_0 - \hat{\varepsilon}'\hat{\varepsilon}) - l\, \hat{\varepsilon}'\hat{\varepsilon}| = 0$.
Let $j = r - q$, and set
\[
s = \sqrt{\frac{p^2 j^2 - 4}{p^2 + j^2 - 5}}, \qquad k = (n - r - 1) - \frac{1}{2}(p - j + 1).
\]
Then
\[
\frac{1 - \Lambda^{*1/s}}{\Lambda^{*1/s}} \cdot \frac{ks - pj/2 + 1}{pj} \sim F_{pj,\; ks - pj/2 + 1},
\]
approximately.
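In R, all four statistics are available from anova() on a pair of multivariate lm fits, e.g. with the toy fits above:

anova(fit.full, fit.red, test = "Wilks")      # also "Pillai", "Hotelling-Lawley", "Roy"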
Since
\[
n\hat{\Sigma}_1 = (Y - Z_1\hat{\beta}_{(1)})'(Y - Z_1\hat{\beta}_{(1)}),
\]
the likelihood ratio test is also a test of extra sums of squares and cross-products. If $n\hat{\Sigma}_1 \approx n\hat{\Sigma}$, then the extra predictors do not contribute to reducing the size of the error sums of squares and cross-products; this translates into a small value of $-2\ln\Lambda$, and we fail to reject $H_0$.
Example-Exercise 7.25
Amitriptyline is an antidepressant suspected to have serious
side effects.
Data on Y1 = total TCAD plasma level and Y2 = amount
of the drug present in total plasma level were measured on
17 patients who overdosed on the drug. [We divided both
responses by 1000.]
Potential predictors are:
z1 = gender (1 = female, 0 = male)
z2 = amount of antidepressant taken at time of overdose
z3 = PR wave measurements
z4 = diastolic blood pressure
z5 = QRS wave measurement.
Example-Exercise 7.25

We first fit a full multivariate regression model with all five predictors. We then fit a reduced model with only z1, z2, and performed a likelihood ratio test to decide whether z3, z4, z5 contribute information about the two response variables that is not provided by z1 and z2. The two estimates of $\Sigma$ are
\[
\hat{\Sigma} = \frac{1}{17}
\begin{bmatrix}
0.87 & 0.7657 \\
0.7657 & 0.9407
\end{bmatrix},
\qquad
\hat{\Sigma}_1 = \frac{1}{17}
\begin{bmatrix}
1.8004 & 1.5462 \\
1.5462 & 1.6207
\end{bmatrix}.
\]
Example-Exercise 7.25

We wish to test $H_0: \beta_3 = \beta_4 = \beta_5 = 0$ against $H_1$: at least one of them is not zero, using a likelihood ratio test with $\alpha = 0.05$.

The two determinants are $|\hat{\Sigma}| = 0.0008$ and $|\hat{\Sigma}_1| = 0.0018$. Then, for $n = 17$, $p = 2$, $r = 5$, $q = 2$ we have:
\[
-\left( n - r - 1 - \frac{1}{2}(p - r + q + 1) \right) \ln \frac{|\hat{\Sigma}|}{|\hat{\Sigma}_1|}
= -\left( 17 - 5 - 1 - \frac{1}{2}(2 - 5 + 2 + 1) \right) \ln \left( \frac{0.0008}{0.0018} \right) = 8.92.
\]
Example-Exercise 7.25

The critical value is $\chi^2_{2(5-2)}(0.05) = \chi^2_6(0.05) = 12.59$. Since $8.92 < 12.59$, we fail to reject $H_0$. The three last predictors do not provide much information about changes in the means for the two response variables beyond what gender and dose provide.
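The arithmetic can be checked directly in R:

-(17 - 5 - 1 - 0.5 * (2 - 5 + 2 + 1)) * log(0.0008 / 0.0018)   # test statistic, 8.92
qchisq(0.95, df = 2 * (5 - 2))                                  # critical value, 12.59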
Note that we have used the MLEs of $\Sigma$ under the two hypotheses; that is, we use $n$ as the divisor for the matrix of sums of squares and cross-products of the errors. We do not use the usual unbiased estimator, obtained by dividing the matrix $E$ (or $W$) by $n - r - 1$, the error degrees of freedom.

What we have not done but should: before relying on the results of these analyses, we should carefully inspect the residuals and carry out the usual tests.
Prediction

If the model $Y = Z\beta + \varepsilon$ was fit to the data and found to be adequate, we can use it for prediction.

Suppose we wish to predict the mean response at some value $z_0$ of the predictors. We know that
\[
z_0'\hat{\beta} \sim N_p\!\left( z_0'\beta,\; z_0'(Z'Z)^{-1}z_0\, \Sigma \right).
\]
We can then compute a Hotelling $T^2$ statistic as
\[
T^2 = \left( \frac{z_0'\hat{\beta} - z_0'\beta}{\sqrt{z_0'(Z'Z)^{-1}z_0}} \right)'
\left( \frac{n}{n-r-1}\hat{\Sigma} \right)^{-1}
\left( \frac{z_0'\hat{\beta} - z_0'\beta}{\sqrt{z_0'(Z'Z)^{-1}z_0}} \right).
\]
A $100(1-\alpha)\%$ confidence ellipsoid for $z_0'\beta$ is then given by all $z_0'\beta$ with
\[
\frac{(z_0'\hat{\beta} - z_0'\beta)' \left( \frac{n}{n-r-1}\hat{\Sigma} \right)^{-1} (z_0'\hat{\beta} - z_0'\beta)}{z_0'(Z'Z)^{-1}z_0}
\le \left[ \frac{p(n-r-1)}{n-r-p} \right] F_{p,\,n-r-p}(\alpha).
\]
The corresponding simultaneous $100(1-\alpha)\%$ confidence intervals for the mean responses $E(Y_{0i}) = z_0'\beta_{(i)}$ are
\[
z_0'\hat{\beta}_{(i)} \pm \sqrt{\frac{p(n-r-1)}{n-r-p} F_{p,\,n-r-p}(\alpha)}\,
\sqrt{z_0'(Z'Z)^{-1}z_0 \left( \frac{n}{n-r-1} \right) \hat{\sigma}_{ii}}, \qquad i = 1, \ldots, p.
\]
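A sketch of these intervals in R, continuing the toy fit from above (z0 and alpha are illustrative choices):

alpha <- 0.05
z0    <- c(1, 0.5, -0.5)
crit  <- sqrt(p * (n - r - 1) / (n - r - p) * qf(1 - alpha, p, n - r - p))
h0    <- drop(t(z0) %*% solve(crossprod(Z)) %*% z0)   # z0'(Z'Z)^{-1} z0
ctr   <- drop(t(z0) %*% coef(fit.full))               # z0' beta-hat, one entry per response
half  <- crit * sqrt(h0 * (n / (n - r - 1)) * diag(Sig))
cbind(lower = ctr - half, upper = ctr + half)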
Similarly, a single new observation $Y_0$ at $z_0$ satisfies, with probability $1 - \alpha$,
\[
\frac{(Y_0 - z_0'\hat{\beta})' \left( \frac{n}{n-r-1}\hat{\Sigma} \right)^{-1} (Y_0 - z_0'\hat{\beta})}{1 + z_0'(Z'Z)^{-1}z_0}
\le \left[ \frac{p(n-r-1)}{n-r-p} \right] F_{p,\,n-r-p}(\alpha),
\]
which gives a $100(1-\alpha)\%$ prediction ellipsoid.
The simultaneous $100(1-\alpha)\%$ prediction intervals for the individual responses $Y_{0i}$ are
\[
z_0'\hat{\beta}_{(i)} \pm \sqrt{\frac{p(n-r-1)}{n-r-p} F_{p,\,n-r-p}(\alpha)}\,
\sqrt{(1 + z_0'(Z'Z)^{-1}z_0) \left( \frac{n}{n-r-1} \right) \hat{\sigma}_{ii}}, \qquad i = 1, \ldots, p,
\]
where $\hat{\sigma}_{ii}$ is the $i$-th diagonal element of $\hat{\Sigma}$.
Example-Exercise 7.25

For the reduced model (gender and dose as predictors, so $r = 2$), the estimates are
\[
\hat{\beta} =
\begin{bmatrix}
0.0567 & -0.2413 \\
0.5071 & 0.6063 \\
0.00033 & 0.00032
\end{bmatrix},
\qquad
\hat{\Sigma} =
\begin{bmatrix}
0.1059 & 0.0910 \\
0.0910 & 0.0953
\end{bmatrix}.
\]
At $z_0' = (1, 1, 1000)$,
\[
z_0'\hat{\beta} =
\begin{bmatrix}
0.8938 \\ 0.685
\end{bmatrix},
\qquad
\left( \frac{n}{n-r-1}\hat{\Sigma} \right)^{-1} =
\begin{bmatrix}
43.33 & -41.37 \\
-41.37 & 48.15
\end{bmatrix},
\]
and $z_0'(Z'Z)^{-1}z_0 = 0.100645$. Here $p(n-r-1)/(n-r-p) = 2(14)/13 = 2.1538$ and $F_{2,13}(0.05) = 3.81$, so the $95\%$ confidence ellipsoid for the mean responses is the set of $z_0'\beta$ with
\[
\left( \begin{bmatrix} 0.8938 \\ 0.685 \end{bmatrix} - z_0'\beta \right)'
\begin{bmatrix}
43.33 & -41.37 \\
-41.37 & 48.15
\end{bmatrix}
\left( \begin{bmatrix} 0.8938 \\ 0.685 \end{bmatrix} - z_0'\beta \right)
\le 0.100645\,(2.1538 \times 3.81).
\]
The simultaneous $95\%$ confidence intervals for the mean responses are
\[
0.8938 \pm \sqrt{2.1538 \times 3.81}\, \sqrt{0.100645 \times (17/14) \times 0.1059} = 0.8938 \pm 0.326,
\]
\[
0.685 \pm \sqrt{2.1538 \times 3.81}\, \sqrt{0.100645 \times (17/14) \times 0.0953} = 0.685 \pm 0.309,
\]
for $E(Y_{01})$ and $E(Y_{02})$, respectively.

Note: if we had been using $s_{ii}$ (computed as the $(i, i)$ element of $E$ over $n - r - 1$) instead of $\hat{\sigma}_{ii}$ as an estimator of $\sigma_{ii}$, we would not be multiplying by 17 and dividing by 14 in the expression above.
The corresponding $95\%$ prediction interval for a single new observation $Y_{02}$ is
\[
0.685 \pm \sqrt{2.1538 \times 3.81}\, \sqrt{1.100645 \times (17/14) \times 0.0953} = 0.685 \pm 1.022.
\]
As anticipated, the prediction intervals for single forecasts are wider than the confidence intervals for the mean responses at the same set of predictor values.
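These half-widths can be reproduced in R from the quantities above:

crit <- sqrt(2.1538 * 3.81)                   # sqrt(p(n-r-1)/(n-r-p) * F_{2,13}(0.05))
crit * sqrt(0.100645 * (17/14) * 0.0953)      # 0.309, mean response E(Y02)
crit * sqrt(1.100645 * (17/14) * 0.0953)      # 1.022, single forecast Y02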
Multivariate Tests:

                 Df test stat
Pillai            3 0.6038599
Wilks             3 0.4405021
Hotelling-Lawley  3 1.1694286
Roy               3 1.0758181
Dropping predictors

fit1.lm <- update(fit.lm, . ~ . - PR - dBP - QRS)

Coefficients:
                     TCAD       drug
(Intercept)     0.0567201 -0.2413479
gender1         0.5070731  0.6063097
antidepressant  0.0003290  0.0003243

These coefficients are the $\hat{\beta}$ used in the prediction example above (rounded there to four decimals).
  Res.Df Df Gen.var.   Wilks approx F num Df den Df   Pr(>F)
1     13    0.034850
2     14  1 0.051856 0.38945   9.4065      2     12 0.003489 **
---