
M-Stat-503

Advanced Multivariate Analysis

Multivariate Regression Analysis

Dr. Md. Rezaul Karim


Professor
Department of Statistics, University of Rajshahi
mrezakarim@yahoo.com; mrkarim@ru.ac.bd

February 11, 2018

Multiple Linear Regression


We can distinguish three cases according to the number
of variables:
1. Simple linear regression: one y and one x. For
example, suppose we wish to predict college grade
point average (GPA) based on an applicant’s high
school GPA.
2. Multiple linear regression: one y and several x’s.
For example, we could attempt to improve our
prediction of college GPA by using more than one
independent variable, such as high school GPA,
standardized test scores [e.g., American College
Testing (ACT) or Scholastic Aptitude Test (SAT)], or
a rating of the high school.
3. Multivariate multiple linear regression: several y’s
and several x’s. In the preceding illustration, we may
wish to predict several y’s (such as number of years
of college the person will complete or GPA in the
sciences, arts, and humanities).
• To further distinguish case 2 from case 3, we could
designate case 2 as univariate multiple regression
because there is only one y. Thus in case 3,
multivariate indicates that there are several y’s and
multiple implies several x’s.
• The term multivariate regression usually refers to
case 3.

Model with Two Predictor Variables


First-Order Model with Two Predictor Variables
• When there are two predictor variables X1 and X2,
the regression model:
Yi = β0 +β1Xi1 +β2Xi2 + εi , i =1, 2, …, n
• is called a first-order model with two predictor
variables.
• A first-order model is linear in the predictor
variables.
• Yi denotes as usual the response in the ith trial,
and Xi1 and Xi2 are the values of the two predictor
variables in the ith trial.
• The parameters of the model are β0, β1, and β2,
and the error term is εi.
• Assuming that E{εi } = 0, the regression function
for the model is:
E{Y} = β0 +β1X1 +β2X2
• Analogous to simple linear regression, where the
regression function is a line, regression function
with two predictors is a plane.



Meaning of Regression Coefficients:
• For example, let E{Y} = 10 +2X1 +5X2
• The parameter β0=10 is the Y intercept of the
regression plane.
• If the scope of the model includes X1 =0, X2 = 0,
then β0=10 represents the mean response E{Y} at
X1 =0, X2 = 0.
• Otherwise, β0 does not have any particular
meaning as a separate term in the regression
model.
• The parameter β1 indicates the change in the
mean response E{Y} per unit change in X1 when
X2 is held constant.
• Likewise, β2 indicates the change in the mean
response E{Y} per unit change in X2 when X1 is
held constant.
• The parameters β1 and β2 are sometimes called
partial regression coefficients because they
reflect the partial effect of one predictor variable
when the other predictor variable is included in
the model and is held constant.
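As a quick illustration (a hypothetical simulation, not from the text), data generated from the regression function E{Y} = 10 + 2X1 + 5X2 used above yield fitted partial regression coefficients close to β1 = 2 and β2 = 5:

set.seed(42)
n  <- 200
X1 <- runif(n, 0, 10)
X2 <- runif(n, 0, 10)
Y  <- 10 + 2 * X1 + 5 * X2 + rnorm(n, sd = 1)   # first-order model with two predictors
coef(lm(Y ~ X1 + X2))                           # approximately (10, 2, 5)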

Model with more than two predictor variables

First-order regression model with more than two predictors:
• We consider now the case where there are p-1
predictor variables X1, ..., Xp-1.
• The regression model:
Yi = β0 +β1Xi1 +β2Xi2 + … + βp-1Xi,p-1 + εi
• is called a first-order model with p-1 predictor
variables.

• It can also be written as:
      Yi = β0 + Σ_{k=1}^{p-1} βk Xik + εi
• or, if we let Xi0 = 1, it can be written as:
      Yi = Σ_{k=0}^{p-1} βk Xik + εi

• Assuming that E{εi} = 0, the response function for this regression model is:
      E{Y} = β0 + β1X1 + β2X2 + … + βp-1Xp-1
• This response function is a hyperplane, which is
a plane in more than two dimensions.


• This model is known as the general linear regression model with normal error terms.
• Where,
   o β0, β1, …, βp-1 are parameters
   o X1, X2, …, Xp-1 are known constants
   o εi are independent N(0, σ²)
Note:
• When p-1 = 1, this regression model reduces to
the simple linear regression model.

Model with Qualitative Predictor Variables
• The general linear regression model
encompasses not only quantitative predictor
variables but also qualitative ones, such as
gender (male, female) or disability status (not
disabled, partially disabled, fully disabled), etc.
• We use indicator variables that take on the values 0 and 1 to identify the classes of a qualitative variable.



• Consider a regression analysis to predict the length of hospital stay (Y) based on the age (X1) and gender (X2) of the patient.
• We define X2 as follows:
      X2 = 1 if patient is female, 0 if patient is male
• The first-order regression model then is as follows:
      Yi = β0 + β1Xi1 + β2Xi2 + εi
  where Xi1 = patient's age and Xi2 = 1 if the patient is female, 0 if male.
• The response function for this regression model is:
      E{Y} = β0 + β1X1 + β2X2
• For male patients, X2 = 0 and the response function becomes:
      E{Y} = β0 + β1X1          (male patients)
• For female patients, X2 = 1 and the response function becomes:
      E{Y} = (β0 + β2) + β1X1          (female patients)
• These two response functions represent parallel straight lines with different intercepts.



• In general, we represent a qualitative variable with c
classes by means of c-1 indicator variables.
• For instance, if in the hospital stay example the
qualitative variable disability status is to be added as
another predictor variable, it can be represented as
follows by the two indicator variables X3 and X4:

      X3 = 1 if patient not disabled, 0 otherwise
      X4 = 1 if patient partially disabled, 0 otherwise

• The first-order model with age, gender, and
disability status as predictor variables then is:
      Yi = β0 + β1Xi1 + β2Xi2 + β3Xi3 + β4Xi4 + εi
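A short sketch (with hypothetical data; variable names are illustrative) of how this 0-1 coding arises automatically in R: a factor with c classes is expanded by model.matrix() into c-1 indicator columns, and the factor levels below are chosen so the indicators match X2, X3, and X4 above.

patients <- data.frame(
  age        = c(65, 42, 71, 50),
  gender     = factor(c("female", "male", "female", "male"),
                      levels = c("male", "female")),        # indicator = 1 for female
  disability = factor(c("not disabled", "partially disabled",
                        "fully disabled", "not disabled")))
# Design matrix: intercept, age, one gender indicator, two disability indicators
# (baseline class = first factor level, here "male" and "fully disabled")
model.matrix(~ age + gender + disability, data = patients)
# With a response (length of stay) available, the model would be fitted as
# lm(stay ~ age + gender + disability, data = patients)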


General Linear Regression Model in Matrix Terms


• To express the general linear regression model:
      Yi = β0 + β1Xi1 + β2Xi2 + ... + βp-1Xi,p-1 + εi
• in matrix terms, we need to define the following matrices:
      Y (n×1) = [Y1, Y2, ..., Yn]'                               (vector of responses)
      X (n×p) = matrix whose ith row is [1, Xi1, Xi2, ..., Xi,p-1]   (design matrix)
      β (p×1) = [β0, β1, ..., βp-1]'                             (vector of parameters)
      ε (n×1) = [ε1, ε2, ..., εn]'                               (vector of error terms)
• In matrix terms, the general linear regression model is:
      Y(n×1) = X(n×p) β(p×1) + ε(n×1)
Estimation of Regression Coefficients
1. Least Squares Method:
• To determine the least squares estimator, we write the sum of squares of the residuals as
      SSE = Σ_{i=1}^{n} εi² = ε'ε = (Y − Xβ̂)'(Y − Xβ̂)
          = Y'Y − Y'Xβ̂ − β̂'X'Y + β̂'X'Xβ̂
• Note now that Y'Xβ̂ is 1×1, hence it is equal to its transpose, which is β̂'X'Y.


• Thus, we find:
      SSE = Y'Y − 2β̂'X'Y + β̂'X'Xβ̂
• Setting ∂SSE/∂β̂ = 0 gives the normal equations:
      −2X'Y + 2X'Xβ̂ = 0,   or   X'Xβ̂ = X'Y
• Multiplying both sides by (X'X)⁻¹:
      (X'X)⁻¹X'Xβ̂ = (X'X)⁻¹X'Y
• We get:
      β̂(p×1) = (X'X)⁻¹(p×p) X'Y(p×1)
  since (X'X)⁻¹(X'X) = I and Iβ̂ = β̂.

Variance-Covariance Matrix

      β̂ = (X'X)⁻¹X'Y

      E{β̂} = (X'X)⁻¹X'E{Y} = (X'X)⁻¹X'Xβ = β

      σ²{β̂} = (X'X)⁻¹X' σ²{Y} X(X'X)⁻¹
             = σ²(X'X)⁻¹X'IX(X'X)⁻¹
             = σ²(X'X)⁻¹


Estimation of Regression Coefficients
2. Method of Maximum Likelihood:
• The method of maximum likelihood leads to the same estimators for the normal error regression model as those obtained by the method of least squares.
• The likelihood function for general linear regression is:
      L(β, σ²) = [1/(2πσ²)^(n/2)] exp[ −(1/(2σ²)) Σ_{i=1}^{n} (Yi − β0 − β1Xi1 − ... − βp-1Xi,p-1)² ]


• Maximizing this likelihood function with respect to β0, β1, ..., βp-1 leads to the MLEs of the parameters.
• The least squares and maximum likelihood
estimators are minimum variance unbiased,
consistent, and sufficient estimators.
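A small numeric check (simulated data; names are illustrative) that the log of this likelihood, evaluated at the least squares solution with the MLE of σ² (namely SSE/n), agrees with R's logLik() for the same fit:

set.seed(1)
n  <- 40
x1 <- rnorm(n); x2 <- rnorm(n)
y  <- 10 + 2 * x1 + 5 * x2 + rnorm(n, sd = 2)
fit  <- lm(y ~ x1 + x2)
SSE  <- sum(resid(fit)^2)
sig2 <- SSE / n                                     # MLE of sigma^2 (note: not SSE/(n-p))
ll   <- -n/2 * log(2 * pi * sig2) - SSE / (2 * sig2)
c(formula = ll, logLik = as.numeric(logLik(fit)))   # the two values agree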


Analysis of Variance Results


Sums of Squares and Mean Squares
• The sums of squares for the analysis of variance in matrix terms are:
      SSTO = Σ_{i=1}^{n} (Yi − Ȳ)² = Σ_{i=1}^{n} Yi² − (Σ_{i=1}^{n} Yi)²/n
           = Y'Y − (1/n)Y'JY = Y'[I − (1/n)J]Y
  where J is the n×n square matrix with all elements equal to 1.

      SSE = Σ_{i=1}^{n} εi² = ε̂'ε̂ = (Y − Xβ̂)'(Y − Xβ̂)
          = Y'Y − Y'Xβ̂ − β̂'X'Y + β̂'X'Xβ̂
          = Y'Y − β̂'X'Y          (using X'Xβ̂ = X'Y)
          = Y'[I − H]Y
  where H = X(X'X)⁻¹X'

• The square n × n matrix H is called the hat matrix.
• It plays an important role in diagnostics for regression analysis.
• The matrix H is symmetric, H = H′, and has the special property of idempotency: HH = H.
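A small numeric check (simulated design matrix; names are illustrative) of the hat matrix properties quoted above:

set.seed(7)
X <- cbind(1, rnorm(10), rnorm(10))            # n = 10, p = 3
H <- X %*% solve(t(X) %*% X) %*% t(X)          # H = X (X'X)^(-1) X'
all.equal(H, t(H))                             # symmetric: H = H'
all.equal(H %*% H, H)                          # idempotent: HH = H
sum(diag(H))                                   # trace of H equals p, the number of columns of X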


      SSR = SSTO − SSE
          = Y'[I − (1/n)J]Y − (Y'Y − β̂'X'Y)
          = β̂'X'Y − (1/n)Y'JY
          = Y'[H − (1/n)J]Y



Note:
• In general, a quadratic form is defined as
      Y'AY = Σ_{i=1}^{n} Σ_{j=1}^{n} aij Yi Yj,   where aij = aji.
  A is an n×n symmetric matrix and is called the matrix of the quadratic form.

• Each of the above three sums of squares can now be seen to be of the form Y'AY, where the three A matrices are:
      I − (1/n)J,       I − H,       and       H − (1/n)J
• Since each of these A matrices is symmetric, SSTO, SSE, and SSR are quadratic forms.
• That is, all sums of squares in the analysis of variance for linear statistical models can be expressed as quadratic forms.

ANOVA Table for general linear model

Source of Variation    DF      SS                            MS                 F
Regression             p−1     SSR = β̂'X'Y − (1/n)Y'JY       MSR = SSR/(p−1)    F = MSR/MSE ~ F(p−1, n−p) under H0
Residual (error)       n−p     SSE = Y'Y − β̂'X'Y             MSE = SSE/(n−p)
Total                  n−1     SSTO = Y'[I − (1/n)J]Y
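A sketch (simulated data; names are illustrative) of these sums of squares computed directly as quadratic forms and combined into the F statistic of the table above:

set.seed(11)
n  <- 25
X1 <- rnorm(n); X2 <- rnorm(n)
Y  <- 3 + 2 * X1 - X2 + rnorm(n)
X  <- cbind(1, X1, X2)
p  <- ncol(X)
J  <- matrix(1, n, n)                              # n x n matrix of ones
H  <- X %*% solve(t(X) %*% X) %*% t(X)             # hat matrix
SSTO <- t(Y) %*% (diag(n) - J/n) %*% Y
SSE  <- t(Y) %*% (diag(n) - H)   %*% Y
SSR  <- t(Y) %*% (H - J/n)       %*% Y
c(SSTO, SSE + SSR)                                 # identical: SSTO = SSE + SSR
F.stat <- (SSR / (p - 1)) / (SSE / (n - p))
pf(F.stat, p - 1, n - p, lower.tail = FALSE)       # p-value from F(p-1, n-p)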

Coefficient of Multiple Determination
• The coefficient of multiple determination,
denoted by R2, is defined as follows:
      R² = SSR/SSTO = 1 − SSE/SSTO
• The coefficient of determination R2 will have a
value between 0 and 1.
• Adding more X variables to the regression model
can only increase R2 and never reduce it, because
SSE can never become larger with more X
variables and SSTO is always the same for a
given set of responses.


• Since R2 usually can be made larger by including
a larger number of predictor variables, it is
sometimes suggested that a modified measure be
used that adjusts for the number of X variables in
the model.
• The adjusted coefficient of multiple
determination, denoted by Ra2, adjusts R2 by
dividing each sum of squares by its associated
degrees of freedom:
      Ra² = 1 − [SSE/(n−p)] / [SSTO/(n−1)] = 1 − ((n−1)/(n−p)) · (SSE/SSTO)
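A brief check (simulated data; names are illustrative) that these formulas reproduce the R² and adjusted R² reported by summary(lm):

set.seed(5)
n  <- 30
X1 <- rnorm(n); X2 <- rnorm(n)
Y  <- 1 + X1 + 0.5 * X2 + rnorm(n)
fit  <- lm(Y ~ X1 + X2)
SSE  <- sum(resid(fit)^2)
SSTO <- sum((Y - mean(Y))^2)
p    <- length(coef(fit))
R2  <- 1 - SSE / SSTO
R2a <- 1 - (SSE / (n - p)) / (SSTO / (n - 1))
c(R2,  summary(fit)$r.squared)                     # agree
c(R2a, summary(fit)$adj.r.squared)                 # agree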
Next – Remaining Materials
• Kutner, M., Nachtsheim, C. and Neter, J.
(2007). Applied Linear Statistical Models.
6th Edition, McGraw Hill/Irwin Series.

Chapter 7: Section 7.7 -


Multivariate Multiple Regression


• Here we consider the problem of modeling the
relationship between m responses Y1, Y2, ..., Ym
and a single set of predictor variables z1, z2, …, zr.
• Each response is assumed to follow its own
regression model, so that
      Y1 = β01 + β11 z1 + ... + βr1 zr + ε1
      Y2 = β02 + β12 z1 + ... + βr2 zr + ε2
      ⋮
      Ym = β0m + β1m z1 + ... + βrm zr + εm


The error term ε' = [ε1, ε2, ..., εm] has E(ε) = 0 and Var(ε) = Σ.

Thus, the error terms associated with different responses may be correlated.



To establish notation conforming to the classical linear regression model, let
      [zj0, zj1, ..., zjr] denote the values of the predictor variables for the jth trial,
      Y'j = [Yj1, Yj2, ..., Yjm] be the responses, and
      ε'j = [εj1, εj2, ..., εjm] be the errors.

• In matrix notation, the design matrix
      Z (n×(r+1)) = matrix whose jth row is [zj0, zj1, ..., zjr], j = 1, ..., n
• is the same as that for the single-response regression model.



• The other matrix quantities have multivariate
counterparts.
      Y (n×m) = matrix with (j, i) element Yji = [ Y(1)  Y(2)  ...  Y(m) ],
                where Y(i) is the n×1 vector of observations on the ith response
      β ((r+1)×m) = [ β(1)  β(2)  ...  β(m) ],
                where β(i) = [β0i, β1i, ..., βri]' is the coefficient vector for the ith response

      ε (n×m) = matrix with (j, i) element εji = [ ε(1)  ε(2)  ...  ε(m) ],  with jth row ε'j

• The multivariate linear regression model is
      Y(n×m) = Z(n×(r+1)) β((r+1)×m) + ε(n×m)
  with
      E(ε(i)) = 0,   Cov(ε(i), ε(k)) = σik I,   i, k = 1, 2, ..., m


• The m observations on the jth trial have covariance matrix Σ = {σik}.
• But observations from different trials are uncorrelated.
• Here β and the σik are unknown parameters; the design matrix Z has jth row [zj0, zj1, ..., zjr].

• Simply stated, the ith response Y(i) follows the linear regression model
      Y(i) = Zβ(i) + ε(i),   i = 1, 2, ..., m
  with Cov(ε(i)) = σii I.
• In conformity with the single-response solution, we take
      β̂(i) = (Z'Z)⁻¹Z'Y(i)


• Collecting these univariate least squares estimates, we obtain
      β̂ = [ β̂(1)  β̂(2)  ...  β̂(m) ] = (Z'Z)⁻¹Z' [ Y(1)  Y(2)  ...  Y(m) ]
  or
      β̂ = (Z'Z)⁻¹Z'Y

• For any choice of parameters B = [ b(1)  b(2)  ...  b(m) ], the matrix of errors is Y − ZB.
• The error sum of squares and cross products matrix is
      (Y − ZB)'(Y − ZB) =
      [ (Y(1) − Zb(1))'(Y(1) − Zb(1))    ...    (Y(1) − Zb(1))'(Y(m) − Zb(m)) ]
      [               ⋮                                      ⋮                ]
      [ (Y(m) − Zb(m))'(Y(1) − Zb(1))    ...    (Y(m) − Zb(m))'(Y(m) − Zb(m)) ]


The selection b(i) = β̂(i) minimizes the ith diagonal sum of squares (Y(i) − Zb(i))'(Y(i) − Zb(i)).

Consequently, tr[(Y − ZB)'(Y − ZB)] is minimized by the choice B = β̂.

The generalized variance |(Y − ZB)'(Y − ZB)| is also minimized by the least squares estimates B = β̂.

• Using the least squares estimates β̂, we can form the matrices of
      Predicted values:   Ŷ = Zβ̂ = Z(Z'Z)⁻¹Z'Y
      Residuals:          ε̂ = Y − Ŷ = [I − Z(Z'Z)⁻¹Z']Y
• The orthogonality conditions among the residuals, predicted values, and columns of Z, which hold in classical linear regression, hold in multivariate multiple regression.
• They follow from Z'[I − Z(Z'Z)⁻¹Z'] = Z' − Z' = 0.



Specifically,
      Z'ε̂ = Z'[I − Z(Z'Z)⁻¹Z']Y = 0
so the residuals ε̂(i) are perpendicular to the columns of Z.
Also,
      Ŷ'ε̂ = β̂'Z'[I − Z(Z'Z)⁻¹Z']Y = 0
confirming that the predicted values Ŷ(i) are perpendicular to all residual vectors ε̂(k).

Because Y = Ŷ + ε̂,
      Y'Y = (Ŷ + ε̂)'(Ŷ + ε̂) = Ŷ'Ŷ + ε̂'ε̂ + 0 + 0'


or
      Y'Y          =          Ŷ'Ŷ          +          ε̂'ε̂
  (Total SS and          (Predicted SS and          (Residual SS and
   cross products)        cross products)            cross products)

The residual sum of squares and cross products can also be written as
      ε̂'ε̂ = Y'Y − Ŷ'Ŷ = Y'Y − β̂'Z'Zβ̂



Example – 1: (Ex. 7.8/p390):
• Fitting a multivariate straight-line regression model.
• To illustrate the calculations of β̂, Ŷ, and ε̂, we fit the straight-line regression model
      Yj1 = β01 + β11 zj1 + εj1
      Yj2 = β02 + β12 zj1 + εj2,   j = 1, 2, ..., 5
• to two responses Y1 and Y2 using the following data.
• Also verify the sum of squares and cross-products decomposition Y'Y = Ŷ'Ŷ + ε̂'ε̂.
z1 0 1 2 3 4
y1 1 4 3 8 9
y2 -1 -1 2 3 2

Solution:
• We have
      Z' = [ 1 1 1 1 1 ]          (Z'Z)⁻¹ = [  0.6  −0.2 ]
           [ 0 1 2 3 4 ],                   [ −0.2   0.1 ]

      β̂(1) = (Z'Z)⁻¹Z'y(1) = [1  2]'
      β̂(2) = (Z'Z)⁻¹Z'y(2) = [−1  1]'


1 1
  Z ' Z  Z '  y (1)
1
βˆ  βˆ (1) βˆ (2)     y (2) 
2 1 

yˆ1  1  2 z1 , yˆ 2  1  z2

1 0 1 1
1 1  3 0 
 1 1  
ˆ  Zβˆ  1 2    
Y  5 1
  2 1   
1 3 7 2 
1 4  9 3 


      ε̂ = Y − Ŷ = [ 0   1  −2   1   0 ]'          ε̂'Ŷ = [ 0  0 ]
                   [ 0  −1   1   1  −1 ]                  [ 0  0 ]

      Y'Y = [ 171  43 ],    Ŷ'Ŷ = [ 165  45 ],    ε̂'ε̂ = [  6  −2 ]
            [  43  19 ]            [  45  15 ]            [ −2   4 ]

so the sum of squares and cross-products decomposition Y'Y = Ŷ'Ŷ + ε̂'ε̂ is verified.



R code:

z1 <- c(0, 1, 2, 3, 4)
y1 <- c(1, 4, 3, 8, 9)
y2 <- c(-1, -1, 2, 3, 2)

Z <- matrix(c(rep(1, 5), z1), nrow = 5, ncol = 2)    # design matrix [1, z1]
Y <- matrix(c(y1, y2), nrow = 5, ncol = 2)           # response matrix [y1, y2]

ZtZ <- t(Z) %*% Z
ZtZ.inv <- solve(ZtZ)                                # (Z'Z)^(-1)
beta1.hat <- ZtZ.inv %*% t(Z) %*% y1
beta2.hat <- ZtZ.inv %*% t(Z) %*% y2
beta.hat <- cbind(beta1.hat, beta2.hat)              # beta-hat = (Z'Z)^(-1) Z'Y
Y.hat <- Z %*% beta.hat                              # predicted values
e.hat <- Y - Y.hat                                   # residuals
t(Y) %*% Y
t(Y.hat) %*% Y.hat + t(e.hat) %*% e.hat              # equal, so the decomposition is verified
# Prediction at z1 = 2.5
Z.given <- c(1, 2.5)
Y.pred <- Z.given %*% beta.hat


# using lm command

data.ex7.8 <- data.frame(z1 = c(0, 1, 2, 3, 4),
                         y1 = c(1, 4, 3, 8, 9),
                         y2 = c(-1, -1, 2, 3, 2))

lm.fit <- lm(cbind(y1, y2) ~ z1, data = data.ex7.8)


summary(lm.fit)

# Prediction
Z.given <- c(1, 2.5)
Y.pred <- Z.given %*% coef(lm.fit)
Result 7.9:
For the least squares estimator β̂ = [ β̂(1)  β̂(2)  ...  β̂(m) ] determined under the multivariate multiple regression model with full rank(Z) = r + 1 < n,
      E(β̂(i)) = β(i),   or equivalently   E(β̂) = β,   and
      Cov(β̂(i), β̂(k)) = σik (Z'Z)⁻¹,   i, k = 1, 2, ..., m
continued...


The residuals ε̂ = [ ε̂(1)  ε̂(2)  ...  ε̂(m) ] = Y − Zβ̂ satisfy E(ε̂(i)) = 0 and E(ε̂'(i) ε̂(k)) = (n − r − 1)σik, so
      E(ε̂) = 0   and   E[ ε̂'ε̂ / (n − r − 1) ] = Σ
Also, ε̂ and β̂ are uncorrelated.
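As a small sketch, continuing the R code of Example 7.8 above (it assumes the object e.hat from that code is still available), the unbiased estimate of Σ divides the residual cross products by n − r − 1 = 5 − 1 − 1 = 3, while the maximum likelihood estimate divides by n:

n <- 5; r <- 1
Sigma.unbiased <- t(e.hat) %*% e.hat / (n - r - 1)   # E[e'e/(n-r-1)] = Sigma
Sigma.mle      <- t(e.hat) %*% e.hat / n             # MLE divides by n instead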

Proof of Result 7.9:
The ith response follows the multiple regression model
      Y(i) = Zβ(i) + ε(i),   E(ε(i)) = 0,   E(ε(i) ε'(i)) = σii I
Also,
      β̂(i) = (Z'Z)⁻¹Z'Y(i) = β(i) + (Z'Z)⁻¹Z'ε(i)
      ε̂(i) = Y(i) − Ŷ(i) = [I − Z(Z'Z)⁻¹Z']Y(i)
            = [I − Z(Z'Z)⁻¹Z'][Zβ(i) + ε(i)] = [I − Z(Z'Z)⁻¹Z']ε(i)
so E(β̂(i)) = β(i) and E(ε̂(i)) = 0.



Next,
      Cov(β̂(i), β̂(k)) = E[ (β̂(i) − β(i)) (β̂(k) − β(k))' ]
                       = (Z'Z)⁻¹Z' E(ε(i) ε'(k)) Z(Z'Z)⁻¹ = σik (Z'Z)⁻¹
For a random vector U and a fixed matrix A,
      E(U'AU) = E[ tr(U'AU) ] = tr[ A E(UU') ]
Hence
      E(ε̂'(i) ε̂(k)) = E[ ε'(i) (I − Z(Z'Z)⁻¹Z') ε(k) ]
                     = tr[ (I − Z(Z'Z)⁻¹Z') σik I ] = σik tr[ I − Z(Z'Z)⁻¹Z' ]
                     = σik (n − r − 1)
so that E[ ε̂'ε̂ / (n − r − 1) ] = Σ.
Note:
Let Z be an n×(r+1) matrix with (r+1) ≤ n and full rank, i.e., rank(Z) = r + 1
(this is necessary for (Z'Z)⁻¹ to exist).
Then, using the identity tr(AB) = tr(BA), we have
      tr[ Z(Z'Z)⁻¹Z' ] = tr[ Z'Z(Z'Z)⁻¹ ] = tr[ I_{(r+1)×(r+1)} ] = r + 1 = rank(Z)
      ⟹ tr[ I − Z(Z'Z)⁻¹Z' ] = tr[ I_{n×n} ] − tr[ I_{(r+1)×(r+1)} ] = n − r − 1


Finally,
      Cov(β̂(i), ε̂(k)) = E[ (Z'Z)⁻¹Z' ε(i) ε'(k) (I − Z(Z'Z)⁻¹Z') ]
                       = (Z'Z)⁻¹Z' E(ε(i) ε'(k)) (I − Z(Z'Z)⁻¹Z')
                       = σik (Z'Z)⁻¹Z' (I − Z(Z'Z)⁻¹Z')
                       = σik [ (Z'Z)⁻¹Z' − (Z'Z)⁻¹Z' ] = 0
so that each element of β̂ is uncorrelated with each element of ε̂.


Likelihood Ratio Tests for Regression Parameters
As in the case of the univariate model, we can construct a likelihood ratio test to decide whether a set of r − q predictors zq+1, zq+2, ..., zr is associated with the m responses.
The appropriate hypothesis is
      H0: β(2) = 0,   where   β = [ β(1) ]   with β(1) of dimension (q+1)×m
                                  [ β(2) ]   and β(2) of dimension (r−q)×m.
This implies that the responses do not depend on zq+1, zq+2, ..., zr.
Setting Z = [ Z1  Z2 ], where Z1 is n×(q+1) and Z2 is n×(r−q), we can write the general model as
      E(Y) = Zβ = [ Z1  Z2 ] [ β(1) ] = Z1β(1) + Z2β(2)
                             [ β(2) ]
Under H0: β(2) = 0,   Y = Z1β(1) + ε,
and the likelihood ratio test of H0 is based on the quantities involved in the extra sum of squares and cross products.


Extra sum of squares and cross products
      = SSE(z0, z1, ..., zq) − SSE(z0, z1, ..., zq, zq+1, ..., zr)
      = (Y − Z1β̂(1))'(Y − Z1β̂(1)) − (Y − Zβ̂)'(Y − Zβ̂)
      = n(Σ̂1 − Σ̂)
where β̂(1) = (Z1'Z1)⁻¹Z1'Y,   Σ̂ = n⁻¹(Y − Zβ̂)'(Y − Zβ̂),   and   Σ̂1 = n⁻¹(Y − Z1β̂(1))'(Y − Z1β̂(1)).
From Result 7.10, the likelihood ratio, Λ, can be expressed in terms of generalized variances:
      Λ = [ max over β(1), Σ of L(β(1), Σ) ] / [ max over β, Σ of L(β, Σ) ]
        = L(β̂(1), Σ̂1) / L(β̂, Σ̂) = ( |Σ̂| / |Σ̂1| )^(n/2)
Reject H0 for large values of
      −2 ln Λ = −n ln( |Σ̂| / |Σ̂1| ) = −n ln( |nΣ̂| / |nΣ̂1| )
For n large,
      −[ n − r − 1 − (m − r + q + 1)/2 ] ln( |Σ̂| / |Σ̂1| )   is approximately distributed as   χ²_{m(r−q)}


Example 7.10:
• Companies considering the purchase of a computer must first assess their future needs in order to determine the proper equipment.
• A computer scientist collected data from seven similar company sites so that a forecast equation of computer-hardware requirements for inventory management could be developed.

• The data are given in the following Table for
o Z1 = customer orders (in thousands)
o Z2 = add-delete item count (in thousands)
o Y1 = CPU (central processing unit) time (in hours)
o Y2 = disk input/output capacity
y1 y2 z1 z2
141.5 301.8 123.5 2.108
168.9 396.1 146.1 9.213
154.8 328.2 133.9 1.905
146.5 307.4 128.5 0.815
172.8 362.4 151.5 1.061
160.1 369.5 136.2 8.603
108.5 229.1 92.0 1.125


Questions:
a) Perform a multivariate multiple regression analysis using the two response variables, Y1 and Y2, and the two independent variables, Z1 and Z2.
b) Comment on the fitted models.
c) Predict Y1 and Y2 for the given values Z1 = 130 and Z2 = 3.5.
d) Apply the likelihood ratio test for testing H0: β(2) = 0 and comment.
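A sketch of how parts (a)-(d) might be carried out in R (object names are illustrative; anova() on the two mlm fits provides exact multivariate tests as an alternative to the large-sample chi-square approximation):

ex7.10 <- data.frame(
  y1 = c(141.5, 168.9, 154.8, 146.5, 172.8, 160.1, 108.5),
  y2 = c(301.8, 396.1, 328.2, 307.4, 362.4, 369.5, 229.1),
  z1 = c(123.5, 146.1, 133.9, 128.5, 151.5, 136.2, 92.0),
  z2 = c(2.108, 9.213, 1.905, 0.815, 1.061, 8.603, 1.125))

fit <- lm(cbind(y1, y2) ~ z1 + z2, data = ex7.10)       # (a) full model
summary(fit)                                            # (b) per-response fits

predict(fit, newdata = data.frame(z1 = 130, z2 = 3.5))  # (c) predicted Y1 and Y2

# (d) likelihood ratio test of H0: beta_(2) = 0 (z2 dropped)
fit0 <- lm(cbind(y1, y2) ~ z1, data = ex7.10)           # reduced model
n <- nrow(ex7.10); r <- 2; q <- 1; m <- 2
Sig.hat  <- crossprod(resid(fit))  / n                  # Sigma-hat  (full model)
Sig1.hat <- crossprod(resid(fit0)) / n                  # Sigma1-hat (reduced model)
stat <- -(n - r - 1 - (m - r + q + 1) / 2) *
        log(det(Sig.hat) / det(Sig1.hat))               # Bartlett-corrected statistic
pchisq(stat, df = m * (r - q), lower.tail = FALSE)      # approximate p-value
anova(fit, fit0)                                        # multivariate test for comparison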

References
• Draper, N.R. and Smith, H. (1999). Applied Regression Analysis. 3rd Edition, Wiley.
• Kutner, M., Nachtsheim, C. and Neter, J. (2004). Applied Linear Statistical Models. 5th Edition, McGraw-Hill/Irwin Series.
• Rencher, A.C. (2002). Methods of Multivariate Analysis. 2nd Edition, Wiley, New York.
• Weisberg, S. (2005). Applied Linear Regression. 3rd Edition, John Wiley & Sons, New York.
