Introduction to the Gauss-Markov Linear Model
Copyright © 2012 Dan Nettleton (Iowa State University) Statistics 511 1 / 36
Random Vectors
\[ y = \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix} \]
is a random vector if and only if each element of $y$ is a random variable (i.e., $y_i$ is a random variable $\forall\, i = 1, \ldots, n$).
The mean of the random vector $y$ is
\[ E(y) = \begin{bmatrix} E(y_1) \\ E(y_2) \\ \vdots \\ E(y_n) \end{bmatrix}. \]
The variance of the random vector $y$ is the matrix whose $(i, j)$th element is
\[ \mathrm{Cov}(y_i, y_j) = E(y_i y_j) - E(y_i)E(y_j). \]
Example: Variance of a Random Vector
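For example, the variance of $y = [y_1, y_2, y_3]'$ is
\[
\mathrm{Var}(y) =
\begin{bmatrix}
\mathrm{Cov}(y_1, y_1) & \mathrm{Cov}(y_1, y_2) & \mathrm{Cov}(y_1, y_3) \\
\mathrm{Cov}(y_2, y_1) & \mathrm{Cov}(y_2, y_2) & \mathrm{Cov}(y_2, y_3) \\
\mathrm{Cov}(y_3, y_1) & \mathrm{Cov}(y_3, y_2) & \mathrm{Cov}(y_3, y_3)
\end{bmatrix}
=
\begin{bmatrix}
\mathrm{Var}(y_1) & \mathrm{Cov}(y_1, y_2) & \mathrm{Cov}(y_1, y_3) \\
\mathrm{Cov}(y_2, y_1) & \mathrm{Var}(y_2) & \mathrm{Cov}(y_2, y_3) \\
\mathrm{Cov}(y_3, y_1) & \mathrm{Cov}(y_3, y_2) & \mathrm{Var}(y_3)
\end{bmatrix}.
\]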
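The elementwise definition can be checked numerically. The following is only a sketch: the simulated draws, the mixing matrix `A`, and the helper name `sample_var` are assumptions made for illustration and do not come from the slides. It computes the sample analogue of $E(y_i y_j) - E(y_i)E(y_j)$ for a three-element vector and compares it with NumPy's `np.cov`.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: 10,000 draws of a correlated 3-element random vector.
A = np.array([[1.0, 0.0, 0.0],
              [0.5, 1.0, 0.0],
              [0.2, 0.3, 1.0]])
draws = rng.standard_normal((10_000, 3)) @ A.T


def sample_var(y):
    """Sample analogue of E(y_i y_j) - E(y_i) E(y_j) for every pair (i, j)."""
    mean = y.mean(axis=0)
    second_moment = y.T @ y / y.shape[0]
    return second_moment - np.outer(mean, mean)


V = sample_var(draws)
print(V)
```

The diagonal of `V` holds the sample variances of the individual elements, and `V` is symmetric because $\mathrm{Cov}(y_i, y_j) = \mathrm{Cov}(y_j, y_i)$.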
The Gauss-Markov Linear Model
\[ y = X\beta + \varepsilon \]
$y$ is an $n \times 1$ random vector of responses.
$X$ is an $n \times p$ matrix of constants with columns corresponding to explanatory variables. $X$ is sometimes referred to as the design matrix.
$\beta$ is an unknown parameter vector in $\mathbb{R}^p$.
$\varepsilon$ is an $n \times 1$ random vector of errors.
$E(\varepsilon) = 0$ and $\mathrm{Var}(\varepsilon) = \sigma^2 I$, where $\sigma^2$ is an unknown parameter in $\mathbb{R}^+$.
The Gauss-Markov Linear Model
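Note that the model is not completely specified because the distribution of $y$ is not completely specified.
\[ y = X\beta + \varepsilon, \quad E(\varepsilon) = 0, \quad \mathrm{Var}(\varepsilon) = \sigma^2 I \]
\[ \Longrightarrow \quad E(y) = X\beta, \quad \mathrm{Var}(y) = \sigma^2 I \]
\[ \Longrightarrow \quad y \sim (X\beta, \sigma^2 I) \]
$y$ has a distribution with mean $X\beta$ and variance $\sigma^2 I$.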
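Because only the first two moments are specified, the errors need not be normal. As a rough illustration (the design matrix, parameter values, and the choice of uniform errors are all assumptions made here, not part of the slides), the sketch below simulates $y = X\beta + \varepsilon$ with uniform errors scaled to have mean 0 and variance $\sigma^2$, then checks the moments empirically.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical design: n = 4 observations, p = 2 columns.
X = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0]])
beta = np.array([1.0, 0.5])   # assumed parameter values
sigma = 2.0                   # assumed error standard deviation

# Non-normal errors: Uniform(-a, a) has mean 0 and variance a^2 / 3,
# so a = sqrt(3) * sigma gives variance sigma^2.
a = np.sqrt(3.0) * sigma
reps = 20_000
eps = rng.uniform(-a, a, size=(reps, X.shape[0]))
y = X @ beta + eps            # each row is one draw of the response vector

mean_y = y.mean(axis=0)           # should be close to X @ beta
var_y = np.cov(y, rowvar=False)   # should be close to sigma^2 * I
print(mean_y)
print(var_y.round(2))
```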
The Normal Theory Gauss-Markov Linear Model
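We often add an assumption of multivariate normality to the Gauss-Markov linear model: $\varepsilon \sim N(0, \sigma^2 I)$.
The assumption $\varepsilon \sim N(0, \sigma^2 I)$ is equivalent to $\varepsilon_1, \ldots, \varepsilon_n \overset{\text{i.i.d.}}{\sim} N(0, \sigma^2)$.
The assumption $\varepsilon \sim N(0, \sigma^2 I) \Longrightarrow y \sim N(X\beta, \sigma^2 I)$, i.e.,
$y_1, \ldots, y_n$ are independent normal random variables,
$\mathrm{Var}(y_i) = \sigma^2 \ \forall\, i = 1, \ldots, n$, and
$E(y_i) = x_{(i)}'\beta$ (where $x_{(i)}'$ is the $i$th row of $X$) $\forall\, i = 1, \ldots, n$.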
Goal of Analysis
\[ y = X\beta + \varepsilon \]
The goal of analysis often focuses on answering questions about certain linear functions of $\beta$ of the form $C\beta$ for a specified matrix $C$.
The normality assumption is useful for constructing confidence intervals and performing tests concerning $C\beta$.
Example 1
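Researchers harvested five randomly selected ears of corn from a field. For $i = 1, \ldots, 5$, let $y_i$ denote the weight in grams of the $i$th ear.
\[ y_1, \ldots, y_5 \overset{\text{i.i.d.}}{\sim} N(\mu, \sigma^2) \]
\[ y_i = \mu + \varepsilon_i, \quad i = 1, \ldots, 5; \qquad \varepsilon_1, \ldots, \varepsilon_5 \overset{\text{i.i.d.}}{\sim} N(0, \sigma^2) \]
\[
\begin{aligned}
y_1 &= \mu + \varepsilon_1 \\
y_2 &= \mu + \varepsilon_2 \\
y_3 &= \mu + \varepsilon_3 \\
y_4 &= \mu + \varepsilon_4 \\
y_5 &= \mu + \varepsilon_5
\end{aligned}
\qquad \varepsilon_1, \ldots, \varepsilon_5 \overset{\text{i.i.d.}}{\sim} N(0, \sigma^2)
\]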
Example 1 (continued)
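\[
\begin{aligned}
y_1 &= \mu + \varepsilon_1 \\
y_2 &= \mu + \varepsilon_2 \\
y_3 &= \mu + \varepsilon_3 \\
y_4 &= \mu + \varepsilon_4 \\
y_5 &= \mu + \varepsilon_5
\end{aligned}
\qquad \varepsilon_1, \ldots, \varepsilon_5 \overset{\text{i.i.d.}}{\sim} N(0, \sigma^2)
\]
\[
\begin{bmatrix} y_1 \\ y_2 \\ y_3 \\ y_4 \\ y_5 \end{bmatrix}
=
\begin{bmatrix} \mu \\ \mu \\ \mu \\ \mu \\ \mu \end{bmatrix}
+
\begin{bmatrix} \varepsilon_1 \\ \varepsilon_2 \\ \varepsilon_3 \\ \varepsilon_4 \\ \varepsilon_5 \end{bmatrix},
\qquad
[\varepsilon_1, \varepsilon_2, \varepsilon_3, \varepsilon_4, \varepsilon_5]' \sim N(0, \sigma^2 I)
\]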
Example 1 (continued)

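\[
\begin{bmatrix} y_1 \\ y_2 \\ y_3 \\ y_4 \\ y_5 \end{bmatrix}
=
\begin{bmatrix} \mu \\ \mu \\ \mu \\ \mu \\ \mu \end{bmatrix}
+
\begin{bmatrix} \varepsilon_1 \\ \varepsilon_2 \\ \varepsilon_3 \\ \varepsilon_4 \\ \varepsilon_5 \end{bmatrix},
\qquad
[\varepsilon_1, \varepsilon_2, \varepsilon_3, \varepsilon_4, \varepsilon_5]' \sim N(0, \sigma^2 I)
\]
\[
\begin{bmatrix} y_1 \\ y_2 \\ y_3 \\ y_4 \\ y_5 \end{bmatrix}
=
\begin{bmatrix} 1 \\ 1 \\ 1 \\ 1 \\ 1 \end{bmatrix}
[\mu]
+
\begin{bmatrix} \varepsilon_1 \\ \varepsilon_2 \\ \varepsilon_3 \\ \varepsilon_4 \\ \varepsilon_5 \end{bmatrix},
\qquad
[\varepsilon_1, \varepsilon_2, \varepsilon_3, \varepsilon_4, \varepsilon_5]' \sim N(0, \sigma^2 I)
\]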
Example 1 (continued)

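\[
\begin{bmatrix} y_1 \\ y_2 \\ y_3 \\ y_4 \\ y_5 \end{bmatrix}
=
\begin{bmatrix} 1 \\ 1 \\ 1 \\ 1 \\ 1 \end{bmatrix}
[\mu]
+
\begin{bmatrix} \varepsilon_1 \\ \varepsilon_2 \\ \varepsilon_3 \\ \varepsilon_4 \\ \varepsilon_5 \end{bmatrix},
\qquad
[\varepsilon_1, \varepsilon_2, \varepsilon_3, \varepsilon_4, \varepsilon_5]' \sim N(0, \sigma^2 I)
\]
\[ y = X\beta + \varepsilon, \quad \varepsilon \sim N(0, \sigma^2 I) \]
\[ C\beta = [1][\mu] = \mu \]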
Example 2
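Researchers randomly assigned eight experimental units to two treatments and measured a response of interest. For $i = 1, 2$, let $y_{i1}, y_{i2}, y_{i3}, y_{i4}$ denote the responses of the experimental units in the $i$th treatment group.
\[ y_{11}, y_{12}, y_{13}, y_{14} \overset{\text{i.i.d.}}{\sim} N(\mu_1, \sigma^2) \]
independent of
\[ y_{21}, y_{22}, y_{23}, y_{24} \overset{\text{i.i.d.}}{\sim} N(\mu_2, \sigma^2) \]
\[ y_{ij} = \mu_i + \varepsilon_{ij}, \quad i = 1, 2; \ j = 1, \ldots, 4 \]
\[ \varepsilon_{11}, \varepsilon_{12}, \varepsilon_{13}, \varepsilon_{14}, \varepsilon_{21}, \varepsilon_{22}, \varepsilon_{23}, \varepsilon_{24} \overset{\text{i.i.d.}}{\sim} N(0, \sigma^2) \]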
Example 2 (continued)
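\[
\begin{aligned}
y_{11} &= \mu_1 + \varepsilon_{11} \\
y_{12} &= \mu_1 + \varepsilon_{12} \\
y_{13} &= \mu_1 + \varepsilon_{13} \\
y_{14} &= \mu_1 + \varepsilon_{14} \\
y_{21} &= \mu_2 + \varepsilon_{21} \\
y_{22} &= \mu_2 + \varepsilon_{22} \\
y_{23} &= \mu_2 + \varepsilon_{23} \\
y_{24} &= \mu_2 + \varepsilon_{24}
\end{aligned}
\]
\[ \varepsilon_{11}, \varepsilon_{12}, \varepsilon_{13}, \varepsilon_{14}, \varepsilon_{21}, \varepsilon_{22}, \varepsilon_{23}, \varepsilon_{24} \overset{\text{i.i.d.}}{\sim} N(0, \sigma^2) \]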
Example 2 (continued)

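\[
\begin{bmatrix} y_{11} \\ y_{12} \\ y_{13} \\ y_{14} \\ y_{21} \\ y_{22} \\ y_{23} \\ y_{24} \end{bmatrix}
=
\begin{bmatrix} \mu_1 \\ \mu_1 \\ \mu_1 \\ \mu_1 \\ \mu_2 \\ \mu_2 \\ \mu_2 \\ \mu_2 \end{bmatrix}
+
\begin{bmatrix} \varepsilon_{11} \\ \varepsilon_{12} \\ \varepsilon_{13} \\ \varepsilon_{14} \\ \varepsilon_{21} \\ \varepsilon_{22} \\ \varepsilon_{23} \\ \varepsilon_{24} \end{bmatrix},
\qquad
[\varepsilon_{11}, \varepsilon_{12}, \varepsilon_{13}, \varepsilon_{14}, \varepsilon_{21}, \varepsilon_{22}, \varepsilon_{23}, \varepsilon_{24}]' \sim N(0, \sigma^2 I)
\]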
Example 2 (continued)

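\[
\begin{bmatrix} y_{11} \\ y_{12} \\ y_{13} \\ y_{14} \\ y_{21} \\ y_{22} \\ y_{23} \\ y_{24} \end{bmatrix}
=
\begin{bmatrix} 1 & 0 \\ 1 & 0 \\ 1 & 0 \\ 1 & 0 \\ 0 & 1 \\ 0 & 1 \\ 0 & 1 \\ 0 & 1 \end{bmatrix}
\begin{bmatrix} \mu_1 \\ \mu_2 \end{bmatrix}
+
\begin{bmatrix} \varepsilon_{11} \\ \varepsilon_{12} \\ \varepsilon_{13} \\ \varepsilon_{14} \\ \varepsilon_{21} \\ \varepsilon_{22} \\ \varepsilon_{23} \\ \varepsilon_{24} \end{bmatrix},
\qquad
[\varepsilon_{11}, \varepsilon_{12}, \varepsilon_{13}, \varepsilon_{14}, \varepsilon_{21}, \varepsilon_{22}, \varepsilon_{23}, \varepsilon_{24}]' \sim N(0, \sigma^2 I)
\]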
Example 2 (continued)

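\[
\begin{bmatrix} y_{11} \\ y_{12} \\ y_{13} \\ y_{14} \\ y_{21} \\ y_{22} \\ y_{23} \\ y_{24} \end{bmatrix}
=
\begin{bmatrix} 1 & 0 \\ 1 & 0 \\ 1 & 0 \\ 1 & 0 \\ 0 & 1 \\ 0 & 1 \\ 0 & 1 \\ 0 & 1 \end{bmatrix}
\begin{bmatrix} \mu_1 \\ \mu_2 \end{bmatrix}
+
\begin{bmatrix} \varepsilon_{11} \\ \varepsilon_{12} \\ \varepsilon_{13} \\ \varepsilon_{14} \\ \varepsilon_{21} \\ \varepsilon_{22} \\ \varepsilon_{23} \\ \varepsilon_{24} \end{bmatrix},
\qquad
[\varepsilon_{11}, \varepsilon_{12}, \varepsilon_{13}, \varepsilon_{14}, \varepsilon_{21}, \varepsilon_{22}, \varepsilon_{23}, \varepsilon_{24}]' \sim N(0, \sigma^2 I)
\]
\[ y = X\beta + \varepsilon, \quad \varepsilon \sim N(0, \sigma^2 I) \]
\[ C\beta = [1, -1] \begin{bmatrix} \mu_1 \\ \mu_2 \end{bmatrix} = \mu_1 - \mu_2 \]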
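The two-treatment design matrix and the contrast $C\beta = \mu_1 - \mu_2$ can be written out directly. This is a minimal sketch; the numeric values of `mu` are hypothetical and chosen only so the result is easy to read.

```python
import numpy as np

# Two treatments with four units each: X is the Kronecker product I_2 (x) 1_4.
X = np.kron(np.eye(2), np.ones((4, 1)))

mu = np.array([10.0, 7.5])      # hypothetical treatment means (mu_1, mu_2)
C = np.array([[1.0, -1.0]])     # picks out mu_1 - mu_2

mean_y = X @ mu                 # E(y): mu_1 four times, then mu_2 four times
contrast = C @ mu               # difference between the treatment means
print(X)
print(mean_y, contrast)
```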
Example 3
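Suppose eight fertilizer amounts denoted $x_1, \ldots, x_8$ were randomly assigned to eight field plots. For $i = 1, \ldots, 8$, let $y_i$ denote the yield of the plot that received fertilizer amount $x_i$.
\[ y_i = \beta_0 + \beta_1 x_i + \varepsilon_i, \quad i = 1, \ldots, 8 \]
\[ \varepsilon_1, \ldots, \varepsilon_8 \overset{\text{i.i.d.}}{\sim} N(0, \sigma^2) \]
\[
\begin{aligned}
y_1 &= \beta_0 + \beta_1 x_1 + \varepsilon_1 \\
y_2 &= \beta_0 + \beta_1 x_2 + \varepsilon_2 \\
y_3 &= \beta_0 + \beta_1 x_3 + \varepsilon_3 \\
y_4 &= \beta_0 + \beta_1 x_4 + \varepsilon_4 \\
y_5 &= \beta_0 + \beta_1 x_5 + \varepsilon_5 \\
y_6 &= \beta_0 + \beta_1 x_6 + \varepsilon_6 \\
y_7 &= \beta_0 + \beta_1 x_7 + \varepsilon_7 \\
y_8 &= \beta_0 + \beta_1 x_8 + \varepsilon_8
\end{aligned}
\]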
Example 3 (continued)

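\[
\begin{bmatrix} y_1 \\ y_2 \\ y_3 \\ y_4 \\ y_5 \\ y_6 \\ y_7 \\ y_8 \end{bmatrix}
=
\begin{bmatrix}
\beta_0 + \beta_1 x_1 \\ \beta_0 + \beta_1 x_2 \\ \beta_0 + \beta_1 x_3 \\ \beta_0 + \beta_1 x_4 \\
\beta_0 + \beta_1 x_5 \\ \beta_0 + \beta_1 x_6 \\ \beta_0 + \beta_1 x_7 \\ \beta_0 + \beta_1 x_8
\end{bmatrix}
+
\begin{bmatrix} \varepsilon_1 \\ \varepsilon_2 \\ \varepsilon_3 \\ \varepsilon_4 \\ \varepsilon_5 \\ \varepsilon_6 \\ \varepsilon_7 \\ \varepsilon_8 \end{bmatrix},
\qquad
[\varepsilon_1, \ldots, \varepsilon_8]' \sim N(0, \sigma^2 I)
\]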
Example 3 (continued)

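\[
\begin{bmatrix} y_1 \\ y_2 \\ y_3 \\ y_4 \\ y_5 \\ y_6 \\ y_7 \\ y_8 \end{bmatrix}
=
\begin{bmatrix} 1 & x_1 \\ 1 & x_2 \\ 1 & x_3 \\ 1 & x_4 \\ 1 & x_5 \\ 1 & x_6 \\ 1 & x_7 \\ 1 & x_8 \end{bmatrix}
\begin{bmatrix} \beta_0 \\ \beta_1 \end{bmatrix}
+
\begin{bmatrix} \varepsilon_1 \\ \varepsilon_2 \\ \varepsilon_3 \\ \varepsilon_4 \\ \varepsilon_5 \\ \varepsilon_6 \\ \varepsilon_7 \\ \varepsilon_8 \end{bmatrix},
\qquad
[\varepsilon_1, \ldots, \varepsilon_8]' \sim N(0, \sigma^2 I)
\]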
Example 3 (continued)

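\[
\begin{bmatrix} y_1 \\ y_2 \\ y_3 \\ y_4 \\ y_5 \\ y_6 \\ y_7 \\ y_8 \end{bmatrix}
=
\begin{bmatrix} 1 & x_1 \\ 1 & x_2 \\ 1 & x_3 \\ 1 & x_4 \\ 1 & x_5 \\ 1 & x_6 \\ 1 & x_7 \\ 1 & x_8 \end{bmatrix}
\begin{bmatrix} \beta_0 \\ \beta_1 \end{bmatrix}
+
\begin{bmatrix} \varepsilon_1 \\ \varepsilon_2 \\ \varepsilon_3 \\ \varepsilon_4 \\ \varepsilon_5 \\ \varepsilon_6 \\ \varepsilon_7 \\ \varepsilon_8 \end{bmatrix},
\qquad
[\varepsilon_1, \ldots, \varepsilon_8]' \sim N(0, \sigma^2 I)
\]
\[ y = X\beta + \varepsilon, \quad \varepsilon \sim N(0, \sigma^2 I) \]
\[ C\beta = [0, 1] \begin{bmatrix} \beta_0 \\ \beta_1 \end{bmatrix} = \beta_1 \]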
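The regression design matrix has a column of ones and a column of fertilizer amounts. In the sketch below, the fertilizer amounts `x` and the parameter values in `beta` are hypothetical and serve only to show how $X\beta$ and $C\beta$ are formed.

```python
import numpy as np

# Hypothetical fertilizer amounts for the eight plots.
x = np.array([0.0, 10.0, 20.0, 30.0, 40.0, 50.0, 60.0, 70.0])

# Design matrix: intercept column followed by the fertilizer amounts.
X = np.column_stack([np.ones_like(x), x])

beta = np.array([2.0, 0.05])    # assumed (beta_0, beta_1)
C = np.array([[0.0, 1.0]])      # picks out the slope beta_1

mean_y = X @ beta               # E(y_i) = beta_0 + beta_1 * x_i
slope = C @ beta
print(mean_y, slope)
```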
Example 4
Eight hogs were randomly assigned to two diets and two inoculations
such that two hogs received each combination of diet and inoculation.
This experiment involves two factors: diet and inoculation.
In this case, each factor has two levels (denoted here generically
as 1 and 2).
A combination of one level from each factor forms a treatment.
In this case, we have four treatments:
Treatment Diet Inoculation
1 1 1
2 1 2
3 2 1
4 2 2
Example 4 (continued)
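For $i = 1, 2$; $j = 1, 2$; and $k = 1, 2$; let $y_{ijk}$ denote the average daily gain of the $k$th hog that received diet $i$ and inoculation $j$.
\[ y_{ijk} = \mu + \varepsilon_{ijk}, \quad i = 1, 2; \ j = 1, 2; \ k = 1, 2; \]
\[ \varepsilon_{111}, \varepsilon_{112}, \varepsilon_{121}, \varepsilon_{122}, \varepsilon_{211}, \varepsilon_{212}, \varepsilon_{221}, \varepsilon_{222} \overset{\text{i.i.d.}}{\sim} N(0, \sigma^2) \]
Under this model, neither diet nor inoculation affects average daily gain.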
Example 4 (continued)
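For $i = 1, 2$; $j = 1, 2$; and $k = 1, 2$; let $y_{ijk}$ denote the average daily gain of the $k$th hog that received diet $i$ and inoculation $j$.
\[ y_{ijk} = \mu + \alpha_i + \varepsilon_{ijk}, \quad i = 1, 2; \ j = 1, 2; \ k = 1, 2; \]
\[ \varepsilon_{111}, \varepsilon_{112}, \varepsilon_{121}, \varepsilon_{122}, \varepsilon_{211}, \varepsilon_{212}, \varepsilon_{221}, \varepsilon_{222} \overset{\text{i.i.d.}}{\sim} N(0, \sigma^2) \]
Under this model, only diet affects average daily gain.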
Example 4 (continued)
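For $i = 1, 2$; $j = 1, 2$; and $k = 1, 2$; let $y_{ijk}$ denote the average daily gain of the $k$th hog that received diet $i$ and inoculation $j$.
\[ y_{ijk} = \mu + \beta_j + \varepsilon_{ijk}, \quad i = 1, 2; \ j = 1, 2; \ k = 1, 2; \]
\[ \varepsilon_{111}, \varepsilon_{112}, \varepsilon_{121}, \varepsilon_{122}, \varepsilon_{211}, \varepsilon_{212}, \varepsilon_{221}, \varepsilon_{222} \overset{\text{i.i.d.}}{\sim} N(0, \sigma^2) \]
Under this model, only inoculation affects average daily gain.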
Example 4 (continued)
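\[ y_{ijk} = \mu + \alpha_i + \beta_j + \varepsilon_{ijk}, \quad i = 1, 2; \ j = 1, 2; \ k = 1, 2; \]
\[ \varepsilon_{111}, \varepsilon_{112}, \varepsilon_{121}, \varepsilon_{122}, \varepsilon_{211}, \varepsilon_{212}, \varepsilon_{221}, \varepsilon_{222} \overset{\text{i.i.d.}}{\sim} N(0, \sigma^2) \]
Under this model, factors diet and inoculation affect the mean average daily gain in an additive manner.
There is no interaction between the factors diet and inoculation.
\[
\begin{array}{c|cc|c}
 & \text{inoculation 1} & \text{inoculation 2} & \text{inoculation difference} \\ \hline
\text{diet 1} & \mu + \alpha_1 + \beta_1 & \mu + \alpha_1 + \beta_2 & \beta_1 - \beta_2 \\
\text{diet 2} & \mu + \alpha_2 + \beta_1 & \mu + \alpha_2 + \beta_2 & \beta_1 - \beta_2 \\ \hline
\text{diet difference} & \alpha_1 - \alpha_2 & \alpha_1 - \alpha_2 &
\end{array}
\]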
Example 4 (continued)
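\[ y_{ijk} = \mu + \alpha_i + \beta_j + \gamma_{ij} + \varepsilon_{ijk}, \quad i = 1, 2; \ j = 1, 2; \ k = 1, 2; \]
\[ \varepsilon_{111}, \varepsilon_{112}, \varepsilon_{121}, \varepsilon_{122}, \varepsilon_{211}, \varepsilon_{212}, \varepsilon_{221}, \varepsilon_{222} \overset{\text{i.i.d.}}{\sim} N(0, \sigma^2) \]
Under this model, there is one mean for each combination of diet and inoculation.
Those four means are free to take any four values with no restrictions.
\[
\begin{array}{c|cc|c}
 & \text{inoculation 1} & \text{inoculation 2} & \text{inoculation difference} \\ \hline
\text{diet 1} & \mu + \alpha_1 + \beta_1 + \gamma_{11} & \mu + \alpha_1 + \beta_2 + \gamma_{12} & \beta_1 - \beta_2 + \gamma_{11} - \gamma_{12} \\
\text{diet 2} & \mu + \alpha_2 + \beta_1 + \gamma_{21} & \mu + \alpha_2 + \beta_2 + \gamma_{22} & \beta_1 - \beta_2 + \gamma_{21} - \gamma_{22} \\ \hline
\text{diet difference} & \alpha_1 - \alpha_2 + \gamma_{11} - \gamma_{21} & \alpha_1 - \alpha_2 + \gamma_{12} - \gamma_{22} &
\end{array}
\]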
Example 4 (continued)
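An equivalent model is the so-called cell means model:
\[ y_{ijk} = \mu_{ij} + \varepsilon_{ijk}, \quad i = 1, 2; \ j = 1, 2; \ k = 1, 2; \]
\[ \varepsilon_{111}, \varepsilon_{112}, \varepsilon_{121}, \varepsilon_{122}, \varepsilon_{211}, \varepsilon_{212}, \varepsilon_{221}, \varepsilon_{222} \overset{\text{i.i.d.}}{\sim} N(0, \sigma^2) \]
\[
\begin{array}{c|cc|c}
 & \text{inoculation 1} & \text{inoculation 2} & \text{inoculation difference} \\ \hline
\text{diet 1} & \mu_{11} & \mu_{12} & \mu_{11} - \mu_{12} \\
\text{diet 2} & \mu_{21} & \mu_{22} & \mu_{21} - \mu_{22} \\ \hline
\text{diet difference} & \mu_{11} - \mu_{21} & \mu_{12} - \mu_{22} &
\end{array}
\]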
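The link between the two parameterizations is $\mu_{ij} = \mu + \alpha_i + \beta_j + \gamma_{ij}$. The sketch below uses hypothetical effect values (none of these numbers come from the slides) to compute the four cell means and the row and column differences shown in the table.

```python
import numpy as np

# Hypothetical effects-model parameter values.
mu = 10.0
alpha = np.array([1.0, -1.0])           # diet effects alpha_1, alpha_2
beta = np.array([0.5, -0.5])            # inoculation effects beta_1, beta_2
gamma = np.array([[0.2, -0.2],
                  [-0.3, 0.3]])         # interaction effects gamma_ij

# Cell means mu_ij = mu + alpha_i + beta_j + gamma_ij
# (rows index diet, columns index inoculation).
cell_means = mu + alpha[:, None] + beta[None, :] + gamma

inoculation_diff = cell_means[:, 0] - cell_means[:, 1]   # mu_i1 - mu_i2 for each diet
diet_diff = cell_means[0, :] - cell_means[1, :]          # mu_1j - mu_2j for each inoculation
print(cell_means)
print(inoculation_diff, diet_diff)
```

Many different effect vectors give the same four cell means, which is why the cell means model carries the same information about the mean of $y$ with only four mean parameters.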
Example 4 (continued)
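\[ y_{ijk} = \mu + \alpha_i + \beta_j + \gamma_{ij} + \varepsilon_{ijk}, \quad i = 1, 2; \ j = 1, 2; \ k = 1, 2; \]
\[
\begin{aligned}
y_{111} &= \mu + \alpha_1 + \beta_1 + \gamma_{11} + \varepsilon_{111} \\
y_{112} &= \mu + \alpha_1 + \beta_1 + \gamma_{11} + \varepsilon_{112} \\
y_{121} &= \mu + \alpha_1 + \beta_2 + \gamma_{12} + \varepsilon_{121} \\
y_{122} &= \mu + \alpha_1 + \beta_2 + \gamma_{12} + \varepsilon_{122} \\
y_{211} &= \mu + \alpha_2 + \beta_1 + \gamma_{21} + \varepsilon_{211} \\
y_{212} &= \mu + \alpha_2 + \beta_1 + \gamma_{21} + \varepsilon_{212} \\
y_{221} &= \mu + \alpha_2 + \beta_2 + \gamma_{22} + \varepsilon_{221} \\
y_{222} &= \mu + \alpha_2 + \beta_2 + \gamma_{22} + \varepsilon_{222}
\end{aligned}
\]
\[ \varepsilon_{111}, \varepsilon_{112}, \varepsilon_{121}, \varepsilon_{122}, \varepsilon_{211}, \varepsilon_{212}, \varepsilon_{221}, \varepsilon_{222} \overset{\text{i.i.d.}}{\sim} N(0, \sigma^2) \]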
Example 4 (continued)

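\[
\begin{bmatrix} y_{111} \\ y_{112} \\ y_{121} \\ y_{122} \\ y_{211} \\ y_{212} \\ y_{221} \\ y_{222} \end{bmatrix}
=
\begin{bmatrix}
\mu + \alpha_1 + \beta_1 + \gamma_{11} \\
\mu + \alpha_1 + \beta_1 + \gamma_{11} \\
\mu + \alpha_1 + \beta_2 + \gamma_{12} \\
\mu + \alpha_1 + \beta_2 + \gamma_{12} \\
\mu + \alpha_2 + \beta_1 + \gamma_{21} \\
\mu + \alpha_2 + \beta_1 + \gamma_{21} \\
\mu + \alpha_2 + \beta_2 + \gamma_{22} \\
\mu + \alpha_2 + \beta_2 + \gamma_{22}
\end{bmatrix}
+
\begin{bmatrix} \varepsilon_{111} \\ \varepsilon_{112} \\ \varepsilon_{121} \\ \varepsilon_{122} \\ \varepsilon_{211} \\ \varepsilon_{212} \\ \varepsilon_{221} \\ \varepsilon_{222} \end{bmatrix},
\qquad
[\varepsilon_{111}, \varepsilon_{112}, \varepsilon_{121}, \varepsilon_{122}, \varepsilon_{211}, \varepsilon_{212}, \varepsilon_{221}, \varepsilon_{222}]' \sim N(0, \sigma^2 I)
\]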
Example 4 (continued)

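\[
\begin{bmatrix} y_{111} \\ y_{112} \\ y_{121} \\ y_{122} \\ y_{211} \\ y_{212} \\ y_{221} \\ y_{222} \end{bmatrix}
=
\begin{bmatrix}
1 & 1 & 0 & 1 & 0 & 1 & 0 & 0 & 0 \\
1 & 1 & 0 & 1 & 0 & 1 & 0 & 0 & 0 \\
1 & 1 & 0 & 0 & 1 & 0 & 1 & 0 & 0 \\
1 & 1 & 0 & 0 & 1 & 0 & 1 & 0 & 0 \\
1 & 0 & 1 & 1 & 0 & 0 & 0 & 1 & 0 \\
1 & 0 & 1 & 1 & 0 & 0 & 0 & 1 & 0 \\
1 & 0 & 1 & 0 & 1 & 0 & 0 & 0 & 1 \\
1 & 0 & 1 & 0 & 1 & 0 & 0 & 0 & 1
\end{bmatrix}
\begin{bmatrix} \mu \\ \alpha_1 \\ \alpha_2 \\ \beta_1 \\ \beta_2 \\ \gamma_{11} \\ \gamma_{12} \\ \gamma_{21} \\ \gamma_{22} \end{bmatrix}
+
\begin{bmatrix} \varepsilon_{111} \\ \varepsilon_{112} \\ \varepsilon_{121} \\ \varepsilon_{122} \\ \varepsilon_{211} \\ \varepsilon_{212} \\ \varepsilon_{221} \\ \varepsilon_{222} \end{bmatrix}
\]
\[ y = X\beta + \varepsilon, \quad \varepsilon \sim N(0, \sigma^2 I) \]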
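The $8 \times 9$ design matrix above can be assembled from Kronecker products. The sketch below is one possible construction (the helper name `ones_col` and the block layout are choices made here, not taken from the slides). It also checks numerically that $X$ has rank 4, the same as the cell means design formed by its last four columns.

```python
import numpy as np


def ones_col(n):
    """Column vector of n ones."""
    return np.ones((n, 1))


# Rows are ordered 111, 112, 121, 122, 211, 212, 221, 222.
intercept = ones_col(8)                                               # mu
diet = np.kron(np.eye(2), ones_col(4))                                # alpha_1, alpha_2
inoc = np.kron(ones_col(2), np.kron(np.eye(2), ones_col(2)))          # beta_1, beta_2
interaction = np.kron(np.eye(4), ones_col(2))                         # gamma_11 .. gamma_22

X = np.hstack([intercept, diet, inoc, interaction])

rank_X = np.linalg.matrix_rank(X)
rank_cell_means = np.linalg.matrix_rank(interaction)
print(X.astype(int))
print(rank_X, rank_cell_means)
```

Because the interaction columns are themselves columns of $X$ and both matrices have rank 4, the two designs span the same column space, which is consistent with the effects model and the cell means model being equivalent.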
Example 4 (continued)
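\[ \beta = [\mu, \alpha_1, \alpha_2, \beta_1, \beta_2, \gamma_{11}, \gamma_{12}, \gamma_{21}, \gamma_{22}]' \]
\[
\begin{array}{c|cc}
 & \text{inoculation 1} & \text{inoculation 2} \\ \hline
\text{diet 1} & \mu + \alpha_1 + \beta_1 + \gamma_{11} & \mu + \alpha_1 + \beta_2 + \gamma_{12} \\
\text{diet 2} & \mu + \alpha_2 + \beta_1 + \gamma_{21} & \mu + \alpha_2 + \beta_2 + \gamma_{22} \\ \hline
\text{diet difference} & \alpha_1 - \alpha_2 + \gamma_{11} - \gamma_{21} & \alpha_1 - \alpha_2 + \gamma_{12} - \gamma_{22}
\end{array}
\]
Is the difference between diet means for inoculation 1 the same as the difference between diet means for inoculation 2?
\[ C\beta = [0, 0, 0, 0, 0, 1, -1, -1, 1]\,\beta = \gamma_{11} - \gamma_{12} - \gamma_{21} + \gamma_{22} = 0? \]
This question asks if there is interaction between the factors diet and inoculation.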
Example 4 (continued)
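\[ \beta = [\mu, \alpha_1, \alpha_2, \beta_1, \beta_2, \gamma_{11}, \gamma_{12}, \gamma_{21}, \gamma_{22}]' \]
\[
\begin{array}{c|cc|c}
 & \text{inoculation 1} & \text{inoculation 2} & \text{inoculation difference} \\ \hline
\text{diet 1} & \mu + \alpha_1 + \beta_1 + \gamma_{11} & \mu + \alpha_1 + \beta_2 + \gamma_{12} & \beta_1 - \beta_2 + \gamma_{11} - \gamma_{12} \\
\text{diet 2} & \mu + \alpha_2 + \beta_1 + \gamma_{21} & \mu + \alpha_2 + \beta_2 + \gamma_{22} & \beta_1 - \beta_2 + \gamma_{21} - \gamma_{22}
\end{array}
\]
Is the difference between inoculation means for diet 1 the same as the difference between inoculation means for diet 2?
\[ C\beta = [0, 0, 0, 0, 0, 1, -1, -1, 1]\,\beta = \gamma_{11} - \gamma_{12} - \gamma_{21} + \gamma_{22} = 0? \]
This question also asks if there is interaction between the factors diet and inoculation.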
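Both ways of posing the interaction question lead to the same coefficient vector. The sketch below checks this by writing each cell mean as a row of coefficients on $\beta = [\mu, \alpha_1, \alpha_2, \beta_1, \beta_2, \gamma_{11}, \gamma_{12}, \gamma_{21}, \gamma_{22}]'$. The names `M`, `m11`, and so on are illustrative assumptions.

```python
import numpy as np

# Each row gives one cell mean as a linear function of beta:
# mu_11, mu_12, mu_21, mu_22 (parameter order as in the slides).
M = np.array([
    [1, 1, 0, 1, 0, 1, 0, 0, 0],
    [1, 1, 0, 0, 1, 0, 1, 0, 0],
    [1, 0, 1, 1, 0, 0, 0, 1, 0],
    [1, 0, 1, 0, 1, 0, 0, 0, 1],
], dtype=float)
m11, m12, m21, m22 = M

# Diet difference under inoculation 1 minus diet difference under inoculation 2.
diet_diff_change = (m11 - m21) - (m12 - m22)
# Inoculation difference under diet 1 minus inoculation difference under diet 2.
inoc_diff_change = (m11 - m12) - (m21 - m22)

C_interaction = np.array([0, 0, 0, 0, 0, 1, -1, -1, 1], dtype=float)
print(diet_diff_change, inoc_diff_change)
```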
Example 4 (continued)
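\[ \beta = [\mu, \alpha_1, \alpha_2, \beta_1, \beta_2, \gamma_{11}, \gamma_{12}, \gamma_{21}, \gamma_{22}]' \]
\[
\begin{array}{c|cc|c}
 & \text{inoculation 1} & \text{inoculation 2} & \text{Diet Means} \\ \hline
\text{diet 1} & \mu + \alpha_1 + \beta_1 + \gamma_{11} & \mu + \alpha_1 + \beta_2 + \gamma_{12} & \mu + \alpha_1 + \bar{\beta}_{\cdot} + \bar{\gamma}_{1\cdot} \\
\text{diet 2} & \mu + \alpha_2 + \beta_1 + \gamma_{21} & \mu + \alpha_2 + \beta_2 + \gamma_{22} & \mu + \alpha_2 + \bar{\beta}_{\cdot} + \bar{\gamma}_{2\cdot}
\end{array}
\]
Is the average over inoculation means for diet 1 different from the average over inoculation means for diet 2?
\[ C\beta = [0, 1, -1, 0, 0, .5, .5, -.5, -.5]\,\beta = \alpha_1 - \alpha_2 + \bar{\gamma}_{1\cdot} - \bar{\gamma}_{2\cdot} = 0? \]
This question asks about the main effect of the factor diet.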
Example 4 (continued)
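\[ \beta = [\mu, \alpha_1, \alpha_2, \beta_1, \beta_2, \gamma_{11}, \gamma_{12}, \gamma_{21}, \gamma_{22}]' \]
\[
\begin{array}{c|cc}
 & \text{inoculation 1} & \text{inoculation 2} \\ \hline
\text{diet 1} & \mu + \alpha_1 + \beta_1 + \gamma_{11} & \mu + \alpha_1 + \beta_2 + \gamma_{12} \\
\text{diet 2} & \mu + \alpha_2 + \beta_1 + \gamma_{21} & \mu + \alpha_2 + \beta_2 + \gamma_{22} \\ \hline
\text{Inoculation Means} & \mu + \bar{\alpha}_{\cdot} + \beta_1 + \bar{\gamma}_{\cdot 1} & \mu + \bar{\alpha}_{\cdot} + \beta_2 + \bar{\gamma}_{\cdot 2}
\end{array}
\]
Is the average over diet means for inoculation 1 different from the average over diet means for inoculation 2?
\[ C\beta = [0, 0, 0, 1, -1, .5, -.5, .5, -.5]\,\beta = \beta_1 - \beta_2 + \bar{\gamma}_{\cdot 1} - \bar{\gamma}_{\cdot 2} = 0? \]
This question asks about the main effect of the factor inoculation.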
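The main-effect coefficient vectors arise from averaging cell means over the levels of the other factor. As a check, the sketch below rebuilds both vectors from the cell-mean rows; the variable names are illustrative assumptions.

```python
import numpy as np

# Cell means mu_11, mu_12, mu_21, mu_22 as linear functions of
# beta = [mu, alpha_1, alpha_2, beta_1, beta_2, gamma_11, gamma_12, gamma_21, gamma_22]'.
M = np.array([
    [1, 1, 0, 1, 0, 1, 0, 0, 0],
    [1, 1, 0, 0, 1, 0, 1, 0, 0],
    [1, 0, 1, 1, 0, 0, 0, 1, 0],
    [1, 0, 1, 0, 1, 0, 0, 0, 1],
], dtype=float)
m11, m12, m21, m22 = M

# Diet main effect: average over inoculations for diet 1 minus the same for diet 2.
C_diet = (m11 + m12) / 2 - (m21 + m22) / 2
# Inoculation main effect: average over diets for inoculation 1 minus the same for inoculation 2.
C_inoc = (m11 + m21) / 2 - (m12 + m22) / 2
print(C_diet)
print(C_inoc)
```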
Example 4 (continued)
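\[ \beta = [\mu, \alpha_1, \alpha_2, \beta_1, \beta_2, \gamma_{11}, \gamma_{12}, \gamma_{21}, \gamma_{22}]' \]
\[
\begin{array}{c|cc}
 & \text{inoculation 1} & \text{inoculation 2} \\ \hline
\text{diet 1} & \mu + \alpha_1 + \beta_1 + \gamma_{11} & \mu + \alpha_1 + \beta_2 + \gamma_{12} \\
\text{diet 2} & \mu + \alpha_2 + \beta_1 + \gamma_{21} & \mu + \alpha_2 + \beta_2 + \gamma_{22} \\ \hline
\text{diet difference} & \alpha_1 - \alpha_2 + \gamma_{11} - \gamma_{21} &
\end{array}
\]
Is there a difference between the diet means for inoculation 1?
\[ C\beta = [0, 1, -1, 0, 0, 1, 0, -1, 0]\,\beta = \alpha_1 - \alpha_2 + \gamma_{11} - \gamma_{21} = 0? \]
This question asks about the simple effect of the factor diet for the first level of the factor inoculation.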
Example 4 (continued)
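\[ \beta = [\mu, \alpha_1, \alpha_2, \beta_1, \beta_2, \gamma_{11}, \gamma_{12}, \gamma_{21}, \gamma_{22}]' \]
\[
\begin{array}{c|cc|c}
 & \text{inoculation 1} & \text{inoculation 2} & \text{inoculation difference} \\ \hline
\text{diet 1} & \mu + \alpha_1 + \beta_1 + \gamma_{11} & \mu + \alpha_1 + \beta_2 + \gamma_{12} & \beta_1 - \beta_2 + \gamma_{11} - \gamma_{12} \\
\text{diet 2} & \mu + \alpha_2 + \beta_1 + \gamma_{21} & \mu + \alpha_2 + \beta_2 + \gamma_{22} & \beta_1 - \beta_2 + \gamma_{21} - \gamma_{22} \\ \hline
\text{diet difference} & \alpha_1 - \alpha_2 + \gamma_{11} - \gamma_{21} & \alpha_1 - \alpha_2 + \gamma_{12} - \gamma_{22} &
\end{array}
\]
Are all four treatment means identical?
\[
C\beta =
\begin{bmatrix}
0 & 0 & 0 & 1 & -1 & 1 & -1 & 0 & 0 \\
0 & 0 & 0 & 1 & -1 & 0 & 0 & 1 & -1 \\
0 & 1 & -1 & 0 & 0 & 1 & 0 & -1 & 0
\end{bmatrix}
\beta
=
\begin{bmatrix}
\beta_1 - \beta_2 + \gamma_{11} - \gamma_{12} \\
\beta_1 - \beta_2 + \gamma_{21} - \gamma_{22} \\
\alpha_1 - \alpha_2 + \gamma_{11} - \gamma_{21}
\end{bmatrix}
=
\begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}?
\]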
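The three rows of $C$ are differences of cell means: $\mu_{11} - \mu_{12}$, $\mu_{21} - \mu_{22}$, and $\mu_{11} - \mu_{21}$. If all three are zero, the four treatment means coincide. The sketch below rebuilds $C$ from the cell-mean rows and evaluates $C\beta$ for two hypothetical parameter vectors; `beta_equal` and `beta_unequal` are made-up values used only for illustration.

```python
import numpy as np

# Cell means mu_11, mu_12, mu_21, mu_22 as linear functions of
# beta = [mu, alpha_1, alpha_2, beta_1, beta_2, gamma_11, gamma_12, gamma_21, gamma_22]'.
M = np.array([
    [1, 1, 0, 1, 0, 1, 0, 0, 0],
    [1, 1, 0, 0, 1, 0, 1, 0, 0],
    [1, 0, 1, 1, 0, 0, 0, 1, 0],
    [1, 0, 1, 0, 1, 0, 0, 0, 1],
], dtype=float)
m11, m12, m21, m22 = M

# Rows: mu_11 - mu_12, mu_21 - mu_22, mu_11 - mu_21.
C = np.vstack([m11 - m12, m21 - m22, m11 - m21])
rank_C = np.linalg.matrix_rank(C)

# Hypothetical parameter vectors.
beta_equal = np.array([5.0, 1, 1, 2, 2, 0, 0, 0, 0])      # every cell mean equals 8
beta_unequal = np.array([5.0, 1, -1, 0, 0, 0, 0, 0, 0])   # diets differ

print(C @ beta_equal)      # all zeros
print(C @ beta_unequal)    # third entry is nonzero
```

Three linearly independent differences are enough here because equality of four means imposes three independent constraints.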