

Linear Statistical Models: Random vectors


Notes by Yao-ban Chan and Owen Jones



Random vectors
The theory of linear algebra gives us a good grounding for analysing our linear models. However, we must still do some more groundwork. Once we have done this, the theoretical results come out quite easily!

Previously, we were thinking of matrices and vectors simply as a bunch of numbers. However, there is no reason why we can't think of them as a bunch of random variables!

We can then extend the traditional concepts of expectation, variance, etc. to random vectors.



Expectation
Although traditionally random variables are denoted with capital
letters, in keeping with our linear algebra notation, we will denote
them by lowercase.
We define the expectation of a random vector y to be the vector
of expectations of its components:

$$\text{If } y = \begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_k \end{pmatrix} \text{ then } E[y] = \begin{pmatrix} E[y_1] \\ E[y_2] \\ \vdots \\ E[y_k] \end{pmatrix}.$$



Expectation properties
- If $a$ is a vector of constants, then $E[a] = a$.
- If $a$ is a vector of constants, then $E[a^T y] = a^T E[y]$.
- If $A$ is a matrix of constants, then $E[Ay] = A E[y]$.

Example. Let
$$A = \begin{pmatrix} 2 & 3 \\ 1 & 4 \end{pmatrix}, \quad y = \begin{pmatrix} y_1 \\ y_2 \end{pmatrix}$$
and assume that $E[y_1] = 10$ and $E[y_2] = 20$. Then
$$A E[y] = \begin{pmatrix} 2 & 3 \\ 1 & 4 \end{pmatrix} \begin{pmatrix} 10 \\ 20 \end{pmatrix} = \begin{pmatrix} 80 \\ 90 \end{pmatrix}.$$



On the other hand,
$$E[Ay] = E\begin{pmatrix} 2y_1 + 3y_2 \\ y_1 + 4y_2 \end{pmatrix} = \begin{pmatrix} E[2y_1 + 3y_2] \\ E[y_1 + 4y_2] \end{pmatrix} = \begin{pmatrix} 2E[y_1] + 3E[y_2] \\ E[y_1] + 4E[y_2] \end{pmatrix} = \begin{pmatrix} 80 \\ 90 \end{pmatrix} = A E[y].$$



Variance
Defining the variance of a random vector is slightly trickier. We want to include not just the variances of the variables themselves, but also how the variables covary with each other.

Recall that the variance of a random variable $Y$ with mean $\mu$ is defined to be $E[(Y - \mu)^2]$. Now let $y$ be as before, a $k \times 1$ vector of random variables. We define the variance of $y$ (sometimes known as the covariance matrix) to be
$$\operatorname{var} y = E[(y - \mu)(y - \mu)^T]$$
where $\mu = E[y]$.



The diagonal elements of the covariance matrix are just the variances of the individual elements of $y$:
$$[\operatorname{var} y]_{ii} = \operatorname{var} y_i, \quad i = 1, 2, \ldots, k.$$

The off-diagonal elements of the covariance matrix are the covariances of the individual elements:
$$[\operatorname{var} y]_{ij} = \operatorname{cov}(y_i, y_j) = E[(y_i - \mu_i)(y_j - \mu_j)].$$

This means that all covariance matrices are symmetric.



Variance properties

Suppose that $y$ is a random vector with $\operatorname{var} y = V$. Then

- If $a$ is a vector of constants, then $\operatorname{var} a^T y = a^T V a$.
- If $A$ is a matrix of constants, then $\operatorname{var} Ay = A V A^T$.

These can be derived from first principles quite easily.

It follows that any covariance matrix is symmetric and positive semidefinite.
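As a quick sanity check, we can verify $\operatorname{var} Ay = A V A^T$ by simulation. This is only a rough sketch: the particular $V$ and $A$ below are arbitrary choices, and mvrnorm comes from the MASS package used again later in these notes.

library(MASS)                                    # for mvrnorm

V <- matrix(c(2, 1, 1, 5), 2, 2)                 # an example covariance matrix
A <- matrix(c(1, 0, 1, 1), 2, 2)                 # an arbitrary matrix of constants
y <- mvrnorm(100000, mu = c(0, 0), Sigma = V)    # rows are observations

var(y %*% t(A))                                  # sample covariance of Ay ...
A %*% V %*% t(A)                                 # ... should be close to A V A^T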



Example. Let
$$y = \begin{pmatrix} y_1 \\ y_2 \\ y_3 \end{pmatrix}$$
be a random vector, such that $\operatorname{var} y_i = \sigma^2$ for all $i$, and that the elements of $y$ are independent. This means that $\operatorname{cov}(y_i, y_j) = 0$ for $i \neq j$, so the covariance matrix of $y$ is
$$\operatorname{var} y = V = \begin{pmatrix} \sigma^2 & 0 & 0 \\ 0 & \sigma^2 & 0 \\ 0 & 0 & \sigma^2 \end{pmatrix} = \sigma^2 I.$$



Example continued. Assume that $X$ is a matrix of full rank (with more rows than columns), which implies that $X^T X$ is nonsingular. Let
$$z = (X^T X)^{-1} X^T y = Ay,$$
then
$$\begin{aligned}
\operatorname{var} z = A V A^T &= [(X^T X)^{-1} X^T] \, \sigma^2 I \, [(X^T X)^{-1} X^T]^T \\
&= (X^T X)^{-1} X^T (X^T)^T [(X^T X)^{-1}]^T \sigma^2 \\
&= (X^T X)^{-1} X^T X [(X^T X)^T]^{-1} \sigma^2 \\
&= (X^T X)^{-1} \sigma^2.
\end{aligned}$$
We will be using this quite a bit later on!
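We can check this variance formula numerically with a small simulation sketch. The design matrix $X$ below is an arbitrary illustration, $\sigma^2$ is set to 4, and the mean of $y$ is taken to be 0 for simplicity (the variance does not depend on it).

set.seed(1)
X <- cbind(1, 1:10)                     # an arbitrary full-rank 10 x 2 matrix
sigma2 <- 4
XtXinv <- solve(t(X) %*% X)

Y <- matrix(rnorm(nrow(X) * 50000, sd = sqrt(sigma2)), nrow = nrow(X))  # each column is one y
Z <- XtXinv %*% t(X) %*% Y              # each column is one z = (X'X)^{-1} X' y

var(t(Z))                               # sample covariance of z ...
sigma2 * XtXinv                         # ... should be close to (X'X)^{-1} sigma^2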



Matrix square root


A square matrix $A$ has a square root if there exists a matrix $B$ of the same size such that $B^2 = A$. In general the square root is not unique.

For a symmetric positive semidefinite matrix $A$, there is a unique symmetric positive semidefinite square root, called the principal root, denoted $A^{1/2}$.

Suppose that $P$ diagonalises $A$, that is $P^T A P = \Lambda$. Then (since $P^T P = I$)
$$\begin{aligned}
A &= P \Lambda P^T \\
&= P \Lambda^{1/2} P^T P \Lambda^{1/2} P^T \\
A^{1/2} &= P \Lambda^{1/2} P^T.
\end{aligned}$$
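In R, the principal root can be computed directly from the eigendecomposition; a minimal sketch (the matrix $A$ here is just an example):

A <- matrix(c(2, 1, 1, 5), 2, 2)         # symmetric positive definite
e <- eigen(A)                            # P = e$vectors, Lambda = diag(e$values)
Ahalf <- e$vectors %*% diag(sqrt(e$values)) %*% t(e$vectors)
Ahalf %*% Ahalf                          # recovers A, up to rounding error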



Multivariate normal
Definition
Let $z$ be a $k \times 1$ vector of i.i.d. standard normal r.v.s, $A$ an $n \times k$ matrix, and $b$ an $n \times 1$ vector; then we say that
$$x = Az + b$$
is (an $n$-dimensional) multivariate normal, with mean $\mu = E\,x = b$ and covariance matrix $\Sigma = \operatorname{var} x = AA^T$.

We write $x \sim \text{MVN}(\mu, \Sigma)$ or just $x \sim N(\mu, \Sigma)$.

For any $\mu$ and any symmetric positive semidefinite matrix $\Sigma$, let $z$ be a vector of i.i.d. standard normals; then
$$\mu + \Sigma^{1/2} z \sim \text{MVN}(\mu, \Sigma).$$



If $x \sim \text{MVN}(\mu, \Sigma)$ and $\Sigma$ is $k \times k$ positive definite, then $x$ has the density
$$f(x) = \frac{1}{(2\pi)^{k/2} |\Sigma|^{1/2}} \, e^{-\frac{1}{2}(x - \mu)^T \Sigma^{-1} (x - \mu)}.$$

Note that a symmetric positive definite matrix is necessarily invertible. Also, be aware that some authors require the covariance matrix to be positive definite, rather than just positive semi-definite.
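The density formula translates directly into R; a rough sketch (the function name dmvn is ours, chosen for illustration):

# direct transcription of the MVN density formula (illustrative only)
dmvn <- function(x, mu, Sigma) {
  k <- length(mu)
  d <- x - mu
  exp(-0.5 * t(d) %*% solve(Sigma) %*% d) / ((2 * pi)^(k / 2) * sqrt(det(Sigma)))
}

mu <- c(3, 1)
Sigma <- matrix(c(1, .8, .8, 1), 2, 2)
dmvn(c(3, 1), mu, Sigma)                 # density at the mean: 1 / (2 pi |Sigma|^{1/2})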



If $x \sim \text{MVN}(\mu, \Sigma)$ is $k \times 1$, $A$ is $n \times k$, and $b$ is $n \times 1$, then
$$y = Ax + b \sim \text{MVN}(A\mu + b, \, A \Sigma A^T).$$
To see why, put $x = \Sigma^{1/2} z + \mu$; then
$$y = A \Sigma^{1/2} z + A\mu + b.$$



If the random vector $z = (z_1, z_2)^T$ is multivariate normal, then $z_1$ and $z_2$ are independent if and only if they are uncorrelated.

In general, if $z_1$ and $z_2$ are normal random variables, $z = (z_1, z_2)^T$ does not have to be multivariate normal. Moreover, $z_1$ and $z_2$ can be uncorrelated but not independent.

For example, suppose that $z_1 \sim N(0, 1)$ and $u \sim U(-1, 1)$; then $z_2 = z_1 \operatorname{sign}(u) \sim N(0, 1)$, but $z = (z_1, z_2)^T$ is not multivariate normal. (Consider its support.) Moreover $z_1$ and $z_2$ are uncorrelated, but clearly dependent.
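This counterexample is easy to simulate; a quick sketch:

set.seed(42)
z1 <- rnorm(10000)
u  <- runif(10000, -1, 1)
z2 <- z1 * sign(u)            # marginally N(0, 1), but constructed from z1

cor(z1, z2)                   # close to 0: uncorrelated
plot(z1, z2)                  # all points lie on the two diagonals, so (z1, z2) is not MVN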



R example: multivariate normal

To generate a sample of size 100 with distribution
$$\text{MVN}\left( \begin{pmatrix} 3 \\ 1 \end{pmatrix}, \begin{pmatrix} 1 & 0.8 \\ 0.8 & 1 \end{pmatrix} \right):$$

> library(MASS)
> a <- matrix(c(3, 1), 2, 1)
> V <- matrix(c(1, .8, .8, 1), 2, 2)
> y <- mvrnorm(100, mu = a, Sigma = V)
> plot(y[,1], y[,2])


[Figure: scatter plot of the simulated sample, y[,2] against y[,1]]


Alternatively, starting with standard normals:

> P <- eigen(V)$vectors
> sqrtV <- P %*% diag(sqrt(eigen(V)$values)) %*% t(P)
> z <- matrix(rnorm(200), 2, 100)
> y_new <- sqrtV %*% z + rep(a, 100)
> points(y_new[1,], y_new[2,], col = "red")


[Figure: the same scatter plot with the new sample y_new overlaid in red]


Random quadratic forms

Just as we can consider vectors and matrices to be composed of random variables, we can see what happens when these random vectors are combined into quadratic forms. The result is a function of random variables which is scalar (not a vector), and so it is itself a random variable.

Quadratic forms will pop up regularly in our analysis of linear models. To fully analyse our models, we will want to know the distribution of these forms, under the assumptions that we make on the distribution of the variables in the model.



Theorem
Let $y$ be a random vector with $E[y] = \mu$ and $\operatorname{var} y = V$, and let $A$ be a matrix of constants. Then
$$E[y^T A y] = \operatorname{tr}(AV) + \mu^T A \mu.$$




Example. Let $y$ be a $2 \times 1$ random vector with
$$\mu = \begin{pmatrix} 1 \\ 3 \end{pmatrix}, \quad V = \begin{pmatrix} 2 & 1 \\ 1 & 5 \end{pmatrix}.$$
Let
$$A = \begin{pmatrix} 4 & 1 \\ 1 & 2 \end{pmatrix}.$$
Consider the quadratic form
$$y^T A y = 4y_1^2 + 2 y_1 y_2 + 2 y_2^2.$$



The expectation of this form is
$$E[y^T A y] = 4 E[y_1^2] + 2 E[y_1 y_2] + 2 E[y_2^2].$$
From the definition of variance and the given covariance matrix,
$$\begin{aligned}
2 &= \operatorname{var} y_1 = E[y_1^2] - E[y_1]^2 = E[y_1^2] - 1 \\
5 &= \operatorname{var} y_2 = E[y_2^2] - E[y_2]^2 = E[y_2^2] - 9
\end{aligned}$$
so $E[y_1^2] = 3$ and $E[y_2^2] = 14$.



From the definition of covariance and the given covariance matrix,
$$1 = \operatorname{cov}(y_1, y_2) = E[y_1 y_2] - E[y_1] E[y_2] = E[y_1 y_2] - 3$$
so $E[y_1 y_2] = 4$. This gives
$$E[y^T A y] = 4 \cdot 3 + 2 \cdot 4 + 2 \cdot 14 = 48.$$



From the theorem,
$$\begin{aligned}
E[y^T A y] &= \operatorname{tr}(AV) + \mu^T A \mu \\
&= \operatorname{tr}\left( \begin{pmatrix} 4 & 1 \\ 1 & 2 \end{pmatrix} \begin{pmatrix} 2 & 1 \\ 1 & 5 \end{pmatrix} \right) + \begin{pmatrix} 1 & 3 \end{pmatrix} \begin{pmatrix} 4 & 1 \\ 1 & 2 \end{pmatrix} \begin{pmatrix} 1 \\ 3 \end{pmatrix} \\
&= \operatorname{tr} \begin{pmatrix} 9 & 9 \\ 4 & 11 \end{pmatrix} + \begin{pmatrix} 1 & 3 \end{pmatrix} \begin{pmatrix} 7 \\ 7 \end{pmatrix} \\
&= 9 + 11 + 7 + 21 = 48.
\end{aligned}$$
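We can also check this example by simulation; a rough sketch using mvrnorm from MASS (the sample size is an arbitrary choice):

library(MASS)
mu <- c(1, 3)
V  <- matrix(c(2, 1, 1, 5), 2, 2)
A  <- matrix(c(4, 1, 1, 2), 2, 2)

sum(diag(A %*% V)) + t(mu) %*% A %*% mu                # tr(AV) + mu' A mu = 48

y <- mvrnorm(100000, mu = mu, Sigma = V)
mean(apply(y, 1, function(yi) t(yi) %*% A %*% yi))     # sample mean of y'Ay, close to 48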



Noncentral χ² distribution

Definition
Let $y = (y_i)$ be a $k \times 1$ normally distributed random vector with mean $\mu$ and variance $I$. Then $x = y^T y = \sum_{i=1}^k y_i^2$ follows a noncentral $\chi^2$ distribution with $k$ degrees of freedom and noncentrality parameter $\lambda = \frac{1}{2} \mu^T \mu$. We write $x \sim \chi^2_{k, \lambda}$.

Warning: some authors define $\lambda$ to be $\mu^T \mu$.

Note that the distribution of $x$ depends on $\mu$ only through $\lambda$.
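A quick simulation sketch with an arbitrary mean vector. Note that R's dchisq and rchisq parameterise the noncentrality as ncp $= \mu^T \mu$, i.e. $2\lambda$ in our notation, which is why 2*lambda appears in the R example later in these notes.

mu <- c(1, 0, 2)                                  # k = 3, lambda = sum(mu^2)/2 = 2.5
x  <- replicate(100000, sum((rnorm(3) + mu)^2))   # y ~ N(mu, I), x = y'y

mean(x)                                           # close to k + 2*lambda = 8
hist(x, freq = FALSE, breaks = 50)
curve(dchisq(x, df = 3, ncp = sum(mu^2)), add = TRUE)   # R's ncp = mu'mu = 2*lambda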



Suppose $y \sim \text{MVN}(\mu, I_k)$ and $x = y^T y \sim \chi^2_{k, \lambda}$. Then
$$E[x] = \operatorname{tr}(I_k) + \mu^T \mu = k + 2\lambda.$$

The noncentrality parameter $\lambda = \frac{1}{2} \mu^T \mu$ is zero if and only if $\mu = 0$, in which case $x$ is just the sum of squares of $k$ i.i.d. standard normals. That is, $x$ has an ordinary (central) $\chi^2$ distribution with $k$ degrees of freedom.


[Figure: noncentral χ² densities with 4 degrees of freedom and λ = 0, 1, 2]


Theorem
Let $X^2_{k_1, \lambda_1}, X^2_{k_2, \lambda_2}, \ldots, X^2_{k_n, \lambda_n}$ be a collection of $n$ independent noncentral $\chi^2$ random variables, with $k_1, k_2, \ldots, k_n$ degrees of freedom respectively and noncentrality parameters $\lambda_1, \lambda_2, \ldots, \lambda_n$ respectively. Then
$$\sum_{i=1}^n X^2_{k_i, \lambda_i}$$
has a noncentral $\chi^2$ distribution with $\sum_{i=1}^n k_i$ degrees of freedom and noncentrality parameter $\sum_{i=1}^n \lambda_i$.

If we set $\lambda_i = 0$ for all $i$, we get the result that the sum of independent $\chi^2$ variables is another $\chi^2$ variable.



Distribution of quadratic forms

Theorem
Let $y$ be an $n \times 1$ normally distributed random vector with mean $\mu$ and variance $I$, and let $A$ be an $n \times n$ symmetric matrix. Then $y^T A y$ has a noncentral $\chi^2$ distribution with $k$ degrees of freedom and noncentrality parameter $\lambda = \frac{1}{2} \mu^T A \mu$ if and only if $A$ is idempotent and has rank $k$.




Corollary
Let $y$ be an $n \times 1$ normally distributed random vector with mean $0$ and variance $I$, and let $A$ be an $n \times n$ symmetric matrix. Then $y^T A y$ has an (ordinary) $\chi^2$ distribution with $k$ degrees of freedom if and only if $A$ is idempotent and has rank $k$.

Corollary
Let $y$ be an $n \times 1$ normally distributed random vector with mean $\mu$ and variance $\sigma^2 I$, and let $A$ be an $n \times n$ symmetric matrix. Then $\frac{1}{\sigma^2} y^T A y$ has a noncentral $\chi^2$ distribution with $k$ degrees of freedom and noncentrality parameter $\lambda = \frac{1}{2\sigma^2} \mu^T A \mu$ if and only if $A$ is idempotent and has rank $k$.



Example. Let $y_1$ and $y_2$ be independent normal random variables with means 3 and $-2$ respectively and common variance 1. Let
$$A = \frac{1}{2} \begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix}.$$
It is easy to verify that $A$ is symmetric and idempotent, and has rank 1. Therefore
$$y^T A y = \frac{1}{2} \begin{pmatrix} y_1 & y_2 \end{pmatrix} \begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix} \begin{pmatrix} y_1 \\ y_2 \end{pmatrix} = \frac{1}{2} y_1^2 + y_1 y_2 + \frac{1}{2} y_2^2$$
has a noncentral $\chi^2$ distribution with 1 degree of freedom and noncentrality parameter
$$\lambda = \frac{1}{2} \cdot \frac{1}{2} \begin{pmatrix} 3 & -2 \end{pmatrix} \begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix} \begin{pmatrix} 3 \\ -2 \end{pmatrix} = \frac{1}{4}.$$


What happens if y does not have variance I ?

Theorem
Let $y$ be an $n \times 1$ normal random vector with mean $\mu$ and variance $V$, and let $A$ be an $n \times n$ symmetric matrix. Then $y^T A y$ has a noncentral $\chi^2$ distribution with $k$ degrees of freedom and noncentrality parameter $\lambda = \frac{1}{2} \mu^T A \mu$ if and only if $AV$ is idempotent and has rank $k$.



Corollary
Let $y$ be an $n \times 1$ normal random vector with mean $0$ and variance $V$, and let $A$ be an $n \times n$ symmetric matrix. Then $y^T A y$ has an (ordinary) $\chi^2$ distribution with $k$ degrees of freedom if and only if $AV$ is idempotent and has rank $k$.

Corollary
Let $y$ be an $n \times 1$ normal random vector with mean $\mu$ and variance $V$ of full rank. Then $y^T V^{-1} y$ has a noncentral $\chi^2$ distribution with $n$ degrees of freedom and noncentrality parameter $\lambda = \frac{1}{2} \mu^T V^{-1} \mu$.



R example: noncentral chi-squared

Consider the quadratic form $y^T A y$ with
$$y \sim \text{MVN}\left( a = \begin{pmatrix} 3 \\ 1 \end{pmatrix}, \; V = \begin{pmatrix} 1 & 0.8 \\ 0.8 & 1 \end{pmatrix} \right), \quad A = \frac{1}{3.6} \begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix}.$$

> A <- matrix(1/3.6, 2, 2)
> A %*% V
     [,1] [,2]
[1,]  0.5  0.5
[2,]  0.5  0.5
> library(Matrix)
> (df <- rankMatrix(A %*% V)[1])
[1] 1
> (lambda <- t(a) %*% A %*% a / 2)
         [,1]
[1,] 2.222222


> quadform <- function(y, A) t(y) %*% A %*% y
> x <- apply(y, 1, quadform, A = A)
> mean(x)
[1] 5.198274
> df + 2*lambda
         [,1]
[1,] 5.444444
> hist(x, freq=F)
> curve(dchisq(x, df, 2*lambda), add = TRUE)


[Figure: histogram of x with the noncentral χ² density curve overlaid]


Example. Let $y_1$ and $y_2$ follow a multivariate normal distribution with means $-1$ and $4$ respectively, and covariance matrix
$$V = \begin{pmatrix} 3 & 2 \\ 2 & 2 \end{pmatrix}.$$
Then
$$V^{-1} = \frac{1}{3 \cdot 2 - 2 \cdot 2} \begin{pmatrix} 2 & -2 \\ -2 & 3 \end{pmatrix} = \begin{pmatrix} 1 & -1 \\ -1 & 3/2 \end{pmatrix},$$
and the quadratic form
$$y^T V^{-1} y = y_1^2 - 2 y_1 y_2 + \frac{3}{2} y_2^2$$
has a noncentral $\chi^2$ distribution with 2 degrees of freedom and noncentrality parameter
$$\lambda = \frac{1}{2} \begin{pmatrix} -1 & 4 \end{pmatrix} \begin{pmatrix} 1 & -1 \\ -1 & 3/2 \end{pmatrix} \begin{pmatrix} -1 \\ 4 \end{pmatrix} = \frac{33}{2}.$$



Independence of quadratic forms

Sometimes we will want to know when two quadratic forms are independent. The next theorem tells us when this happens.

Theorem
Let $y$ be an $n \times 1$ normal random vector with mean $\mu$ and variance $V$ of full rank, and let $A$ and $B$ be symmetric $n \times n$ matrices. Then $y^T A y$ and $y^T B y$ are independent if and only if
$$A V B = 0.$$




Example. Let $y_1$ and $y_2$ follow a multivariate normal distribution with covariance matrix
$$V = \begin{pmatrix} 1 & c \\ c & 1 \end{pmatrix}.$$
Consider the symmetric matrices
$$A = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}, \quad B = \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix}.$$
It is obvious that
$$y^T A y = y_1^2, \quad y^T B y = y_2^2.$$



Now these quadratic forms will be independent if and only if
$$AVB = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix} \begin{pmatrix} 1 & c \\ c & 1 \end{pmatrix} \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} 1 & c \\ 0 & 0 \end{pmatrix} \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} 0 & c \\ 0 & 0 \end{pmatrix}$$
is the 0 matrix. But this happens if and only if $c = 0$, i.e. if $y_1$ and $y_2$ have zero covariance.



Corollary
Let $y$ be a random normal vector with mean $\mu$ and variance $\sigma^2 I$, and let $A$ and $B$ be symmetric matrices. Then $y^T A y$ and $y^T B y$ are independent if and only if $AB = 0$.



Next we consider when a quadratic form is independent of a random vector. Firstly, we define a random variable to be independent of a random vector if and only if it is independent of all elements of that vector.

Theorem
Let $y$ be an $n \times 1$ normal random vector with mean $\mu$ and variance $V$, and let $A$ be an $n \times n$ symmetric matrix and $B$ an $m \times n$ matrix. Then $y^T A y$ and $B y$ are independent if and only if $BVA = 0$.

Lastly, we can combine several of the theorems we have seen before to tell when a group of quadratic forms (more than two) are independent.
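One classical instance of this theorem: for $y \sim N(\mu, \sigma^2 I)$, the sample mean $\bar{y} = By$ with $B = \frac{1}{n}\mathbf{1}^T$ is independent of the sum of squared deviations $y^T A y$ with $A = I - \frac{1}{n}J$ ($J$ the all-ones matrix), because $BVA = \sigma^2 BA = 0$. A rough simulation sketch (zero sample correlation is only a sanity check, not a proof of independence):

n <- 5
A <- diag(n) - matrix(1, n, n) / n       # centering matrix: y'Ay = sum((y_i - ybar)^2)
B <- matrix(1, 1, n) / n                 # By = ybar
B %*% A                                  # all zeros, so BVA = sigma^2 B A = 0

sims <- replicate(50000, {
  y <- rnorm(n, mean = 2)                # mean vector (2, ..., 2), variance I
  c(B %*% y, t(y) %*% A %*% y)
})
cor(sims[1, ], sims[2, ])                # near 0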



Theorem
Let $y$ be a normal random vector with mean $\mu$ and variance $I$, and let $A_1, A_2, \ldots, A_m$ be a collection of $m$ symmetric matrices. If any two of the following statements are true:

- All $A_i$ are idempotent;
- $\sum_{i=1}^m A_i$ is idempotent;
- $A_i A_j = 0$ for all $i \neq j$;

then so is the third, and

- For all $i$, $y^T A_i y$ has a noncentral $\chi^2$ distribution with $r(A_i)$ degrees of freedom and noncentrality parameter $\lambda_i = \frac{1}{2} \mu^T A_i \mu$;
- $y^T A_i y$ and $y^T A_j y$ are independent for $i \neq j$; and
- $\sum_{i=1}^m r(A_i) = r\left( \sum_{i=1}^m A_i \right)$.



When $\sum_i A_i = I$, the previous result can be seen as a special case of the following result (which we will not prove):

Theorem (Cochran-Fisher Theorem)
Let $y$ be an $n \times 1$ normal random vector with mean $\mu$ and variance $\sigma^2 I$. Decompose the sum of squares of $y/\sigma$ into the quadratic forms
$$\frac{1}{\sigma^2} y^T y = \sum_{i=1}^m \frac{1}{\sigma^2} y^T A_i y.$$
Then the quadratic forms are independent and have noncentral $\chi^2$ distributions with parameters $r(A_i)$ and $\frac{1}{2\sigma^2} \mu^T A_i \mu$, respectively, if and only if
$$\sum_{i=1}^m r(A_i) = n.$$
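As an illustration (a sketch; $n$ and the mean are arbitrary choices, with $\sigma^2 = 1$), take $A_1 = \frac{1}{n}J$ and $A_2 = I - \frac{1}{n}J$, so that $A_1 + A_2 = I$ and $r(A_1) + r(A_2) = 1 + (n - 1) = n$:

n  <- 6
A1 <- matrix(1, n, n) / n                # rank 1
A2 <- diag(n) - A1                       # rank n - 1; A1 + A2 = I

qr(A1)$rank + qr(A2)$rank                # equals n, so the theorem applies

mu <- rep(2, n)                          # constant mean, so mu' A2 mu = 0 and y'A2 y is central
x2 <- replicate(50000, {
  y <- rnorm(n, mean = 2)
  t(y) %*% A2 %*% y
})
mean(x2)                                 # close to n - 1 = 5, the mean of a central chi-square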



Example
> A <- matrix(1, 2, 2)
> B <- matrix(c(1,-1,-1,1), 2, 2)
> A %*% B
     [,1] [,2]
[1,]    0    0
[2,]    0    0
> y <- mvrnorm(200, c(0, 0), diag(c(2, 2)))
> x1 <- apply(y, 1, quadform, A = A)
> x2 <- apply(y, 1, quadform, A = B)
> cor(x1, x2)
[1] 0.0662571


> plot(x1, x2)

[Figure: scatter plot of x2 against x1]
