
M-Stat-503

Advanced Multivariate Analysis

Multivariate Regression Analysis

Dr. Md. Rezaul Karim


Professor
Department of Statistics, University of Rajshahi
mrezakarim@yahoo.com; mrkarim@ru.ac.bd

February 11, 2018

Multiple Linear Regression


We can distinguish three cases according to the number
of variables:
1. Simple linear regression: one y and one x. For
example, suppose we wish to predict college grade
point average (GPA) based on an applicant’s high
school GPA.
2. Multiple linear regression: one y and several x’s.
For example, we could attempt to improve our
prediction of college GPA by using more than one
independent variable, such as high school GPA,
standardized test scores [e.g., American College
Testing (ACT) or Scholastic Aptitude Test (SAT)], or
a rating of the high school.
3. Multivariate multiple linear regression: several y’s
and several x’s. In the preceding illustration, we may
wish to predict several y’s (such as number of years
of college the person will complete or GPA in the
sciences, arts, and humanities).
• To further distinguish case 2 from case 3, we could
designate case 2 as univariate multiple regression
because there is only one y. Thus in case 3,
multivariate indicates that there are several y’s and
multiple implies several x’s.
• The term multivariate regression usually refers to
case 3.

Model with Two Predictor Variables


First-Order Model with Two Predictor Variables
• When there are two predictor variables X1 and X2,
the regression model:
Yi = β0 +β1Xi1 +β2Xi2 + εi , i =1, 2, …, n
• is called a first-order model with two predictor
variables.
• A first-order model is linear in the predictor
variables.
• Yi denotes as usual the response in the ith trial,
and Xi1 and Xi2 are the values of the two predictor
variables in the ith trial.
• The parameters of the model are β0, β1, and β2,
and the error term is εi.
• Assuming that E{εi } = 0, the regression function
for the model is:
E{Y} = β0 +β1X1 +β2X2
• Analogous to simple linear regression, where the
regression function is a line, regression function
with two predictors is a plane.



Meaning of Regression Coefficients:
• For example, let E{Y} = 10 +2X1 +5X2
• The parameter β0=10 is the Y intercept of the
regression plane.
• If the scope of the model includes X1 =0, X2 = 0,
then β0=10 represents the mean response E{Y} at
X1 =0, X2 = 0.
• Otherwise, β0 does not have any particular
meaning as a separate term in the regression
model.
• The parameter β1 indicates the change in the
mean response E{Y} per unit change in X1 when
X2 is held constant.
• Likewise, β2 indicates the change in the mean
response E{Y} per unit change in X2 when X1 is
held constant.
• The parameters β1 and β2 are sometimes called
partial regression coefficients because they
reflect the partial effect of one predictor variable
when the other predictor variable is included in
the model and is held constant.
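As a quick illustration (a hypothetical simulation, not from the text), data generated from the regression function E{Y} = 10 + 2X1 + 5X2 used above yield fitted partial regression coefficients close to β1 = 2 and β2 = 5:

set.seed(42)
n  <- 200
X1 <- runif(n, 0, 10)
X2 <- runif(n, 0, 10)
Y  <- 10 + 2 * X1 + 5 * X2 + rnorm(n, sd = 1)   # first-order model with two predictors
coef(lm(Y ~ X1 + X2))                           # approximately (10, 2, 5)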

Model with more than two predictor variables

First-order regression model with more than two predictors:
• We consider now the case where there are p-1
predictor variables X1, ..., Xp-1.
• The regression model:
Yi = β0 +β1Xi1 +β2Xi2 + … + βp-1Xi,p-1 + εi
• is called a first-order model with p-1 predictor
variables.

• It can also be written as:
      Yi = β0 + Σ_{k=1}^{p-1} βk Xik + εi
• or, if we let Xi0 = 1, it can be written as:
      Yi = Σ_{k=0}^{p-1} βk Xik + εi

• Assuming that E{εi} = 0, the response function for this regression model is:
      E{Y} = β0 + β1X1 + β2X2 + … + βp-1Xp-1
• This response function is a hyperplane, which is
a plane in more than two dimensions.


• This model is known as the general linear regression model with normal error terms.
• Where,
   o β0, β1, …, βp-1 are parameters
   o X1, X2, …, Xp-1 are known constants
   o εi are independent N(0, σ²)
Note:
• When p-1 = 1, this regression model reduces to
the simple linear regression model.

Model with Qualitative Predictor Variables
• The general linear regression model
encompasses not only quantitative predictor
variables but also qualitative ones, such as
gender (male, female) or disability status (not
disabled, partially disabled, fully disabled), etc.
• We use indicator variables that take on the values 0 and 1 to identify the classes of a qualitative variable.



• Consider a regression analysis to predict the length of hospital stay (Y) based on the age (X1) and gender (X2) of the patient.
• We define X2 as follows:
      X2 = 1 if patient is female, 0 if patient is male
• The first-order regression model then is as follows:
      Yi = β0 + β1Xi1 + β2Xi2 + εi
  where Xi1 = patient's age and Xi2 = 1 if the patient is female, 0 if male.
• The response function for this regression model is:
      E{Y} = β0 + β1X1 + β2X2
• For male patients, X2 = 0 and the response function becomes:
      E{Y} = β0 + β1X1          (male patients)
• For female patients, X2 = 1 and the response function becomes:
      E{Y} = (β0 + β2) + β1X1          (female patients)
• These two response functions represent parallel straight lines with different intercepts.



• In general, we represent a qualitative variable with c
classes by means of c-1 indicator variables.
• For instance, if in the hospital stay example the
qualitative variable disability status is to be added as
another predictor variable, it can be represented as
follows by the two indicator variables X3 and X4:

      X3 = 1 if patient not disabled, 0 otherwise
      X4 = 1 if patient partially disabled, 0 otherwise

• The first-order model with age, gender, and
disability status as predictor variables then is:
      Yi = β0 + β1Xi1 + β2Xi2 + β3Xi3 + β4Xi4 + εi
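A short sketch (with hypothetical data; variable names are illustrative) of how this 0-1 coding arises automatically in R: a factor with c classes is expanded by model.matrix() into c-1 indicator columns, and the factor levels below are chosen so the indicators match X2, X3, and X4 above.

patients <- data.frame(
  age        = c(65, 42, 71, 50),
  gender     = factor(c("female", "male", "female", "male"),
                      levels = c("male", "female")),        # indicator = 1 for female
  disability = factor(c("not disabled", "partially disabled",
                        "fully disabled", "not disabled")))
# Design matrix: intercept, age, one gender indicator, two disability indicators
# (baseline class = first factor level, here "male" and "fully disabled")
model.matrix(~ age + gender + disability, data = patients)
# With a response (length of stay) available, the model would be fitted as
# lm(stay ~ age + gender + disability, data = patients)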


General Linear Regression Model in Matrix Terms


• To express the general linear regression model:
      Yi = β0 + β1Xi1 + β2Xi2 + ... + βp-1Xi,p-1 + εi
• in matrix terms, we need to define the following matrices:
      Y (n×1) = [Y1, Y2, ..., Yn]'                               (vector of responses)
      X (n×p) = matrix whose ith row is [1, Xi1, Xi2, ..., Xi,p-1]   (design matrix)
      β (p×1) = [β0, β1, ..., βp-1]'                             (vector of parameters)
      ε (n×1) = [ε1, ε2, ..., εn]'                               (vector of error terms)
• In matrix terms, the general linear regression model is:
      Y(n×1) = X(n×p) β(p×1) + ε(n×1)
Estimation of Regression Coefficients
1. Least Squares Method:
• To determine the least squares estimator, we write the sum of squares of the residuals as
      SSE = Σ_{i=1}^{n} εi² = ε'ε = (Y − Xβ̂)'(Y − Xβ̂)
          = Y'Y − Y'Xβ̂ − β̂'X'Y + β̂'X'Xβ̂
• Note now that Y'Xβ̂ is 1×1, hence it is equal to its transpose, which is β̂'X'Y.


• Thus, we find:
      SSE = Y'Y − 2β̂'X'Y + β̂'X'Xβ̂
• Setting ∂SSE/∂β̂ = 0 gives the normal equations:
      −2X'Y + 2X'Xβ̂ = 0,   or   X'Xβ̂ = X'Y
• Multiplying both sides by (X'X)⁻¹:
      (X'X)⁻¹X'Xβ̂ = (X'X)⁻¹X'Y
• We get:
      β̂(p×1) = (X'X)⁻¹(p×p) X'Y(p×1)
  since (X'X)⁻¹(X'X) = I and Iβ̂ = β̂.

Variance-Covariance Matrix

      β̂ = (X'X)⁻¹X'Y

      E{β̂} = (X'X)⁻¹X'E{Y} = (X'X)⁻¹X'Xβ = β

      σ²{β̂} = (X'X)⁻¹X' σ²{Y} X(X'X)⁻¹
             = σ²(X'X)⁻¹X'IX(X'X)⁻¹
             = σ²(X'X)⁻¹


Estimation of Regression Coefficients
2. Method of Maximum Likelihood:
• The method of maximum likelihood leads to the same estimators for the normal error regression model as those obtained by the method of least squares.
• The likelihood function for general linear regression is:
      L(β, σ²) = [1/(2πσ²)^(n/2)] exp[ −(1/(2σ²)) Σ_{i=1}^{n} (Yi − β0 − β1Xi1 − ... − βp-1Xi,p-1)² ]


• Maximizing this likelihood function with respect to β0, β1, ..., βp-1 leads to the MLEs of the parameters.
• The least squares and maximum likelihood
estimators are minimum variance unbiased,
consistent, and sufficient estimators.
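A small numeric check (simulated data; names are illustrative) that the log of this likelihood, evaluated at the least squares solution with the MLE of σ² (namely SSE/n), agrees with R's logLik() for the same fit:

set.seed(1)
n  <- 40
x1 <- rnorm(n); x2 <- rnorm(n)
y  <- 10 + 2 * x1 + 5 * x2 + rnorm(n, sd = 2)
fit  <- lm(y ~ x1 + x2)
SSE  <- sum(resid(fit)^2)
sig2 <- SSE / n                                     # MLE of sigma^2 (note: not SSE/(n-p))
ll   <- -n/2 * log(2 * pi * sig2) - SSE / (2 * sig2)
c(formula = ll, logLik = as.numeric(logLik(fit)))   # the two values agree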


Analysis of Variance Results


Sums of Squares and Mean Squares
• The sums of squares for the analysis of variance in matrix terms are:
      SSTO = Σ_{i=1}^{n} (Yi − Ȳ)² = Σ_{i=1}^{n} Yi² − (Σ_{i=1}^{n} Yi)²/n
           = Y'Y − (1/n)Y'JY = Y'[I − (1/n)J]Y
  where J is the n×n square matrix with all elements equal to 1.

      SSE = Σ_{i=1}^{n} εi² = ε̂'ε̂ = (Y − Xβ̂)'(Y − Xβ̂)
          = Y'Y − Y'Xβ̂ − β̂'X'Y + β̂'X'Xβ̂
          = Y'Y − β̂'X'Y          (using X'Xβ̂ = X'Y)
          = Y'[I − H]Y
  where H = X(X'X)⁻¹X'

• The square n × n matrix H is called the hat matrix.
• It plays an important role in diagnostics for regression analysis.
• The matrix H is symmetric, H = H′, and has the special property of idempotency: HH = H.
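A small numeric check (simulated design matrix; names are illustrative) of the hat matrix properties quoted above:

set.seed(7)
X <- cbind(1, rnorm(10), rnorm(10))            # n = 10, p = 3
H <- X %*% solve(t(X) %*% X) %*% t(X)          # H = X (X'X)^(-1) X'
all.equal(H, t(H))                             # symmetric: H = H'
all.equal(H %*% H, H)                          # idempotent: HH = H
sum(diag(H))                                   # trace of H equals p, the number of columns of X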


      SSR = SSTO − SSE
          = Y'[I − (1/n)J]Y − (Y'Y − β̂'X'Y)
          = β̂'X'Y − (1/n)Y'JY
          = Y'[H − (1/n)J]Y



Note:
• In general, a quadratic form is defined as
      Y'AY = Σ_{i=1}^{n} Σ_{j=1}^{n} aij Yi Yj,   where aij = aji.
  A is an n×n symmetric matrix and is called the matrix of the quadratic form.

• Each of the above three sums of squares can now be seen to be of the form Y'AY, where the three A matrices are:
      I − (1/n)J,       I − H,       and       H − (1/n)J
• Since each of these A matrices is symmetric, SSTO, SSE, and SSR are quadratic forms.
• That is, all sums of squares in the analysis of variance for linear statistical models can be expressed as quadratic forms.

ANOVA Table for general linear model

Source of Variation    DF      SS                            MS                 F
Regression             p−1     SSR = β̂'X'Y − (1/n)Y'JY       MSR = SSR/(p−1)    F = MSR/MSE ~ F(p−1, n−p) under H0
Residual (error)       n−p     SSE = Y'Y − β̂'X'Y             MSE = SSE/(n−p)
Total                  n−1     SSTO = Y'[I − (1/n)J]Y
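A sketch (simulated data; names are illustrative) of these sums of squares computed directly as quadratic forms and combined into the F statistic of the table above:

set.seed(11)
n  <- 25
X1 <- rnorm(n); X2 <- rnorm(n)
Y  <- 3 + 2 * X1 - X2 + rnorm(n)
X  <- cbind(1, X1, X2)
p  <- ncol(X)
J  <- matrix(1, n, n)                              # n x n matrix of ones
H  <- X %*% solve(t(X) %*% X) %*% t(X)             # hat matrix
SSTO <- t(Y) %*% (diag(n) - J/n) %*% Y
SSE  <- t(Y) %*% (diag(n) - H)   %*% Y
SSR  <- t(Y) %*% (H - J/n)       %*% Y
c(SSTO, SSE + SSR)                                 # identical: SSTO = SSE + SSR
F.stat <- (SSR / (p - 1)) / (SSE / (n - p))
pf(F.stat, p - 1, n - p, lower.tail = FALSE)       # p-value from F(p-1, n-p)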

Coefficient of Multiple Determination
• The coefficient of multiple determination,
denoted by R2, is defined as follows:
      R² = SSR/SSTO = 1 − SSE/SSTO
• The coefficient of determination R2 will have a
value between 0 and 1.
• Adding more X variables to the regression model
can only increase R2 and never reduce it, because
SSE can never become larger with more X
variables and SSTO is always the same for a
given set of responses.


• Since R2 usually can be made larger by including
a larger number of predictor variables, it is
sometimes suggested that a modified measure be
used that adjusts for the number of X variables in
the model.
• The adjusted coefficient of multiple
determination, denoted by Ra2, adjusts R2 by
dividing each sum of squares by its associated
degrees of freedom:
      Ra² = 1 − [SSE/(n−p)] / [SSTO/(n−1)] = 1 − ((n−1)/(n−p)) · (SSE/SSTO)
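A brief check (simulated data; names are illustrative) that these formulas reproduce the R² and adjusted R² reported by summary(lm):

set.seed(5)
n  <- 30
X1 <- rnorm(n); X2 <- rnorm(n)
Y  <- 1 + X1 + 0.5 * X2 + rnorm(n)
fit  <- lm(Y ~ X1 + X2)
SSE  <- sum(resid(fit)^2)
SSTO <- sum((Y - mean(Y))^2)
p    <- length(coef(fit))
R2  <- 1 - SSE / SSTO
R2a <- 1 - (SSE / (n - p)) / (SSTO / (n - 1))
c(R2,  summary(fit)$r.squared)                     # agree
c(R2a, summary(fit)$adj.r.squared)                 # agree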
Next – Remaining Materials
• Kutner, M., Nachtsheim, C. and Neter, J.
(2007). Applied Linear Statistical Models.
6th Edition, McGraw Hill/Irwin Series.

Chapter 7: Section 7.7 -


Multivariate Multiple Regression


• Here we consider the problem of modeling the
relationship between m responses Y1, Y2, ..., Ym
and a single set of predictor variables z1, z2, …, zr.
• Each response is assumed to follow its own
regression model, so that
      Y1 = β01 + β11 z1 + ... + βr1 zr + ε1
      Y2 = β02 + β12 z1 + ... + βr2 zr + ε2
      ⋮
      Ym = β0m + β1m z1 + ... + βrm zr + εm


The error term ε' = [ε1, ε2, ..., εm] has E(ε) = 0 and Var(ε) = Σ.

Thus, the error terms associated with different responses may be correlated.



To establish notation conforming to the classical linear regression model, let
      [zj0, zj1, ..., zjr] denote the values of the predictor variables for the jth trial,
      Y'j = [Yj1, Yj2, ..., Yjm] be the responses, and
      ε'j = [εj1, εj2, ..., εjm] be the errors.

• In matrix notation, the design matrix
      Z (n×(r+1)) = matrix whose jth row is [zj0, zj1, ..., zjr], j = 1, ..., n
• is the same as that for the single-response regression model.



• The other matrix quantities have multivariate
counterparts.
      Y (n×m) = matrix with (j, i) element Yji = [ Y(1)  Y(2)  ...  Y(m) ],
                where Y(i) is the n×1 vector of observations on the ith response
      β ((r+1)×m) = [ β(1)  β(2)  ...  β(m) ],
                where β(i) = [β0i, β1i, ..., βri]' is the coefficient vector for the ith response

      ε (n×m) = matrix with (j, i) element εji = [ ε(1)  ε(2)  ...  ε(m) ],  with jth row ε'j

• The multivariate linear regression model is
      Y(n×m) = Z(n×(r+1)) β((r+1)×m) + ε(n×m)
  with
      E(ε(i)) = 0,   Cov(ε(i), ε(k)) = σik I,   i, k = 1, 2, ..., m


• The m observations on the jth trial have covariance matrix Σ = {σik}.
• But observations from different trials are uncorrelated.
• Here β and the σik are unknown parameters; the design matrix Z has jth row [zj0, zj1, ..., zjr].

• Simply stated, the ith response Y(i) follows the linear regression model
      Y(i) = Zβ(i) + ε(i),   i = 1, 2, ..., m
  with Cov(ε(i)) = σii I.
• In conformity with the single-response solution, we take
      β̂(i) = (Z'Z)⁻¹Z'Y(i)


• Collecting these univariate least squares estimates, we obtain
      β̂ = [ β̂(1)  β̂(2)  ...  β̂(m) ] = (Z'Z)⁻¹Z' [ Y(1)  Y(2)  ...  Y(m) ]
  or
      β̂ = (Z'Z)⁻¹Z'Y

• For any choice of parameters B = [ b(1)  b(2)  ...  b(m) ], the matrix of errors is Y − ZB.
• The error sum of squares and cross products matrix is
      (Y − ZB)'(Y − ZB) =
      [ (Y(1) − Zb(1))'(Y(1) − Zb(1))    ...    (Y(1) − Zb(1))'(Y(m) − Zb(m)) ]
      [               ⋮                                      ⋮                ]
      [ (Y(m) − Zb(m))'(Y(1) − Zb(1))    ...    (Y(m) − Zb(m))'(Y(m) − Zb(m)) ]


The selection b(i) = β̂(i) minimizes the ith diagonal sum of squares (Y(i) − Zb(i))'(Y(i) − Zb(i)).

Consequently, tr[(Y − ZB)'(Y − ZB)] is minimized by the choice B = β̂.

The generalized variance |(Y − ZB)'(Y − ZB)| is also minimized by the least squares estimates B = β̂.

• Using the least squares estimates β̂, we can form the matrices of
      Predicted values:   Ŷ = Zβ̂ = Z(Z'Z)⁻¹Z'Y
      Residuals:          ε̂ = Y − Ŷ = [I − Z(Z'Z)⁻¹Z']Y
• The orthogonality conditions among the residuals, predicted values, and columns of Z, which hold in classical linear regression, hold in multivariate multiple regression.
• They follow from Z'[I − Z(Z'Z)⁻¹Z'] = Z' − Z' = 0.



Specifically,
      Z'ε̂ = Z'[I − Z(Z'Z)⁻¹Z']Y = 0
so the residuals ε̂(i) are perpendicular to the columns of Z.
Also,
      Ŷ'ε̂ = β̂'Z'[I − Z(Z'Z)⁻¹Z']Y = 0
confirming that the predicted values Ŷ(i) are perpendicular to all residual vectors ε̂(k).

Because Y = Ŷ + ε̂,
      Y'Y = (Ŷ + ε̂)'(Ŷ + ε̂) = Ŷ'Ŷ + ε̂'ε̂ + 0 + 0'


or
      Y'Y          =          Ŷ'Ŷ          +          ε̂'ε̂
  (Total SS and          (Predicted SS and          (Residual SS and
   cross products)        cross products)            cross products)

The residual sum of squares and cross products can also be written as
      ε̂'ε̂ = Y'Y − Ŷ'Ŷ = Y'Y − β̂'Z'Zβ̂



Example – 1: (Ex. 7.8/p390):
• Fitting a multivariate straight-line regression model.
• To illustrate the calculations of β̂, Ŷ, and ε̂, we fit the straight-line regression model
      Yj1 = β01 + β11 zj1 + εj1
      Yj2 = β02 + β12 zj1 + εj2,   j = 1, 2, ..., 5
• to two responses Y1 and Y2 using the following data.
• Also verify the sum of squares and cross-products decomposition Y'Y = Ŷ'Ŷ + ε̂'ε̂.
z1 0 1 2 3 4
y1 1 4 3 8 9
y2 -1 -1 2 3 2

Solution:
• We have
      Z' = [ 1 1 1 1 1 ]          (Z'Z)⁻¹ = [  0.6  −0.2 ]
           [ 0 1 2 3 4 ],                   [ −0.2   0.1 ]

      β̂(1) = (Z'Z)⁻¹Z'y(1) = [1  2]'
      β̂(2) = (Z'Z)⁻¹Z'y(2) = [−1  1]'


1 1
  Z ' Z  Z '  y (1)
1
βˆ  βˆ (1) βˆ (2)     y (2) 
2 1 

yˆ1  1  2 z1 , yˆ 2  1  z2

1 0 1 1
1 1  3 0 
 1 1  
ˆ  Zβˆ  1 2    
Y  5 1
  2 1   
1 3 7 2 
1 4  9 3 


      ε̂ = Y − Ŷ = [ 0   1  −2   1   0 ]'          ε̂'Ŷ = [ 0  0 ]
                   [ 0  −1   1   1  −1 ]                  [ 0  0 ]

      Y'Y = [ 171  43 ],    Ŷ'Ŷ = [ 165  45 ],    ε̂'ε̂ = [  6  −2 ]
            [  43  19 ]            [  45  15 ]            [ −2   4 ]

so the sum of squares and cross-products decomposition Y'Y = Ŷ'Ŷ + ε̂'ε̂ is verified.



R code:

z1 <- c(0, 1, 2, 3, 4)
y1 <- c(1, 4, 3, 8, 9)
y2 <- c(-1, -1, 2, 3, 2)

Z <- matrix(c(rep(1, 5), z1), nrow = 5, ncol = 2)    # design matrix [1, z1]
Y <- matrix(c(y1, y2), nrow = 5, ncol = 2)           # response matrix [y1, y2]

ZtZ <- t(Z) %*% Z
ZtZ.inv <- solve(ZtZ)                                # (Z'Z)^(-1)
beta1.hat <- ZtZ.inv %*% t(Z) %*% y1
beta2.hat <- ZtZ.inv %*% t(Z) %*% y2
beta.hat <- cbind(beta1.hat, beta2.hat)              # beta-hat = (Z'Z)^(-1) Z'Y
Y.hat <- Z %*% beta.hat                              # predicted values
e.hat <- Y - Y.hat                                   # residuals
t(Y) %*% Y
t(Y.hat) %*% Y.hat + t(e.hat) %*% e.hat              # equal, so the decomposition is verified
# Prediction at z1 = 2.5
Z.given <- c(1, 2.5)
Y.pred <- Z.given %*% beta.hat


# using lm command

data.ex7.8 <- data.frame(z1 = c(0, 1, 2, 3, 4),
                         y1 = c(1, 4, 3, 8, 9),
                         y2 = c(-1, -1, 2, 3, 2))

lm.fit <- lm(cbind(y1, y2) ~ z1, data = data.ex7.8)


summary(lm.fit)

# Prediction
Z.given <- c(1, 2.5)
Y.pred <- Z.given %*% coef(lm.fit)
Result 7.9:
For the least squares estimator β̂ = [ β̂(1)  β̂(2)  ...  β̂(m) ] determined under the multivariate multiple regression model with full rank(Z) = r + 1 < n,
      E(β̂(i)) = β(i),   or equivalently   E(β̂) = β,   and
      Cov(β̂(i), β̂(k)) = σik (Z'Z)⁻¹,   i, k = 1, 2, ..., m
continued...


The residuals ε̂ = [ ε̂(1)  ε̂(2)  ...  ε̂(m) ] = Y − Zβ̂ satisfy E(ε̂(i)) = 0 and E(ε̂'(i) ε̂(k)) = (n − r − 1)σik, so
      E(ε̂) = 0   and   E[ ε̂'ε̂ / (n − r − 1) ] = Σ
Also, ε̂ and β̂ are uncorrelated.
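As a small sketch, continuing the R code of Example 7.8 above (it assumes the object e.hat from that code is still available), the unbiased estimate of Σ divides the residual cross products by n − r − 1 = 5 − 1 − 1 = 3, while the maximum likelihood estimate divides by n:

n <- 5; r <- 1
Sigma.unbiased <- t(e.hat) %*% e.hat / (n - r - 1)   # E[e'e/(n-r-1)] = Sigma
Sigma.mle      <- t(e.hat) %*% e.hat / n             # MLE divides by n instead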

Proof of Result 7.9:
The ith response follows the multiple regression model
      Y(i) = Zβ(i) + ε(i),   E(ε(i)) = 0,   E(ε(i) ε'(i)) = σii I
Also,
      β̂(i) = (Z'Z)⁻¹Z'Y(i) = β(i) + (Z'Z)⁻¹Z'ε(i)
      ε̂(i) = Y(i) − Ŷ(i) = [I − Z(Z'Z)⁻¹Z']Y(i)
            = [I − Z(Z'Z)⁻¹Z'][Zβ(i) + ε(i)] = [I − Z(Z'Z)⁻¹Z']ε(i)
so E(β̂(i)) = β(i) and E(ε̂(i)) = 0.



Next,
      Cov(β̂(i), β̂(k)) = E[ (β̂(i) − β(i)) (β̂(k) − β(k))' ]
                       = (Z'Z)⁻¹Z' E(ε(i) ε'(k)) Z(Z'Z)⁻¹ = σik (Z'Z)⁻¹
For a random vector U and a fixed matrix A,
      E(U'AU) = E[ tr(U'AU) ] = tr[ A E(UU') ]
Hence
      E(ε̂'(i) ε̂(k)) = E[ ε'(i) (I − Z(Z'Z)⁻¹Z') ε(k) ]
                     = tr[ (I − Z(Z'Z)⁻¹Z') σik I ] = σik tr[ I − Z(Z'Z)⁻¹Z' ]
                     = σik (n − r − 1)
so that E[ ε̂'ε̂ / (n − r − 1) ] = Σ.
Note:
Let Z be an n×(r+1) matrix with (r+1) ≤ n and full rank, i.e., rank(Z) = r + 1
(this is necessary for (Z'Z)⁻¹ to exist).
Then, using the identity tr(AB) = tr(BA), we have
      tr[ Z(Z'Z)⁻¹Z' ] = tr[ Z'Z(Z'Z)⁻¹ ] = tr[ I_{(r+1)×(r+1)} ] = r + 1 = rank(Z)
      ⟹ tr[ I − Z(Z'Z)⁻¹Z' ] = tr[ I_{n×n} ] − tr[ I_{(r+1)×(r+1)} ] = n − r − 1


Finally,
      Cov(β̂(i), ε̂(k)) = E[ (Z'Z)⁻¹Z' ε(i) ε'(k) (I − Z(Z'Z)⁻¹Z') ]
                       = (Z'Z)⁻¹Z' E(ε(i) ε'(k)) (I − Z(Z'Z)⁻¹Z')
                       = σik (Z'Z)⁻¹Z' (I − Z(Z'Z)⁻¹Z')
                       = σik [ (Z'Z)⁻¹Z' − (Z'Z)⁻¹Z' ] = 0
so that each element of β̂ is uncorrelated with each element of ε̂.


Likelihood Ratio Tests for Regression Parameters
As in the case of the univariate model, we can construct a likelihood ratio test to decide whether a set of r − q predictors zq+1, zq+2, ..., zr is associated with the m responses.
The appropriate hypothesis is
      H0: β(2) = 0,   where   β = [ β(1) ]   with β(1) of dimension (q+1)×m
                                  [ β(2) ]   and β(2) of dimension (r−q)×m.
This implies that the responses do not depend on zq+1, zq+2, ..., zr.
Setting Z = [ Z1  Z2 ], where Z1 is n×(q+1) and Z2 is n×(r−q), we can write the general model as
      E(Y) = Zβ = [ Z1  Z2 ] [ β(1) ] = Z1β(1) + Z2β(2)
                             [ β(2) ]
Under H0: β(2) = 0,   Y = Z1β(1) + ε,
and the likelihood ratio test of H0 is based on the quantities involved in the extra sum of squares and cross products.


Extra sum of squares and cross products
      = SSE(z0, z1, ..., zq) − SSE(z0, z1, ..., zq, zq+1, ..., zr)
      = (Y − Z1β̂(1))'(Y − Z1β̂(1)) − (Y − Zβ̂)'(Y − Zβ̂)
      = n(Σ̂1 − Σ̂)
where β̂(1) = (Z1'Z1)⁻¹Z1'Y,   Σ̂ = n⁻¹(Y − Zβ̂)'(Y − Zβ̂),   and   Σ̂1 = n⁻¹(Y − Z1β̂(1))'(Y − Z1β̂(1)).
From Result 7.10, the likelihood ratio, Λ, can be expressed in terms of generalized variances:
      Λ = [ max over β(1), Σ of L(β(1), Σ) ] / [ max over β, Σ of L(β, Σ) ]
        = L(β̂(1), Σ̂1) / L(β̂, Σ̂) = ( |Σ̂| / |Σ̂1| )^(n/2)
Reject H0 for large values of
      −2 ln Λ = −n ln( |Σ̂| / |Σ̂1| ) = −n ln( |nΣ̂| / |nΣ̂1| )
For n large,
      −[ n − r − 1 − (m − r + q + 1)/2 ] ln( |Σ̂| / |Σ̂1| )   is approximately distributed as   χ²_{m(r−q)}


Example 7.10:
• Companies considering the purchase of a computer must first assess their future needs in order to determine the proper equipment.
• A computer scientist collected data from seven similar company sites so that a forecast equation of computer-hardware requirements for inventory management could be developed.

• The data are given in the following Table for
o Z1 = customer orders (in thousands)
o Z2 = add-delete item count (in thousands)
o Y1 = CPU (central processing unit) time (in hours)
o Y2 = disk input/output capacity
y1 y2 z1 z2
141.5 301.8 123.5 2.108
168.9 396.1 146.1 9.213
154.8 328.2 133.9 1.905
146.5 307.4 128.5 0.815
172.8 362.4 151.5 1.061
160.1 369.5 136.2 8.603
108.5 229.1 92.0 1.125


Questions:
a) Perform a multivariate multiple regression analysis using the two response variables, Y1 and Y2, and the two independent variables, Z1 and Z2.
b) Comment on the fitted models.
c) Predict Y1 and Y2 for the given values Z1 = 130 and Z2 = 3.5.
d) Apply the likelihood ratio test for testing H0: β(2) = 0 and comment.
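A sketch of how parts (a)-(d) might be carried out in R (object names are illustrative; anova() on the two mlm fits provides exact multivariate tests as an alternative to the large-sample chi-square approximation):

ex7.10 <- data.frame(
  y1 = c(141.5, 168.9, 154.8, 146.5, 172.8, 160.1, 108.5),
  y2 = c(301.8, 396.1, 328.2, 307.4, 362.4, 369.5, 229.1),
  z1 = c(123.5, 146.1, 133.9, 128.5, 151.5, 136.2, 92.0),
  z2 = c(2.108, 9.213, 1.905, 0.815, 1.061, 8.603, 1.125))

fit <- lm(cbind(y1, y2) ~ z1 + z2, data = ex7.10)       # (a) full model
summary(fit)                                            # (b) per-response fits

predict(fit, newdata = data.frame(z1 = 130, z2 = 3.5))  # (c) predicted Y1 and Y2

# (d) likelihood ratio test of H0: beta_(2) = 0 (z2 dropped)
fit0 <- lm(cbind(y1, y2) ~ z1, data = ex7.10)           # reduced model
n <- nrow(ex7.10); r <- 2; q <- 1; m <- 2
Sig.hat  <- crossprod(resid(fit))  / n                  # Sigma-hat  (full model)
Sig1.hat <- crossprod(resid(fit0)) / n                  # Sigma1-hat (reduced model)
stat <- -(n - r - 1 - (m - r + q + 1) / 2) *
        log(det(Sig.hat) / det(Sig1.hat))               # Bartlett-corrected statistic
pchisq(stat, df = m * (r - q), lower.tail = FALSE)      # approximate p-value
anova(fit, fit0)                                        # multivariate test for comparison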

References
• Draper, N.R. and Smith, H. (1999). Applied Regression Analysis. 3rd Edition, Wiley.
• Kutner, M., Nachtsheim, C. and Neter, J. (2004). Applied Linear Statistical Models. 5th Edition, McGraw-Hill/Irwin Series.
• Rencher, A.C. (2002). Methods of Multivariate Analysis. 2nd Edition, Wiley, New York.
• Weisberg, S. (2005). Applied Linear Regression. 3rd Edition, John Wiley & Sons, New York.
