
LN02: Vector Differentiation

Yasutomo Murasawa
October 19, 2004
Contents
1 Vector Differentiation
1.1 Scalar-Valued Functions
1.2 Vector-Valued Functions
1.3 Differentiation Rules
2 Application to OLS
2.1 Unconstrained Maximization
2.2 OLS
3 Application to RLS
3.1 Constrained Maximization
3.2 RLS
1 Vector Differentiation
1.1 Scalar-Valued Functions
Let X ⊂ ℝ^n, Y ⊂ ℝ^m, and f : X → Y. Let m = 1.
Definition 1 The gradient of f(·) at x is
\[
\frac{df}{dx}(x) := \begin{bmatrix} \dfrac{\partial f}{\partial x_1}(x) \\ \vdots \\ \dfrac{\partial f}{\partial x_n}(x) \end{bmatrix}.
\]
Remark: We also write ∇f(x).
Definition 2 The Hessian matrix of f(·) at x is
\[
\frac{d^2 f}{dx\,dx'}(x) := \begin{bmatrix}
\dfrac{\partial^2 f}{\partial x_1^2}(x) & \dots & \dfrac{\partial^2 f}{\partial x_1 \partial x_n}(x) \\
\vdots & \ddots & \vdots \\
\dfrac{\partial^2 f}{\partial x_n \partial x_1}(x) & \dots & \dfrac{\partial^2 f}{\partial x_n^2}(x)
\end{bmatrix}.
\]
Remark: We also write ∇²f(x).
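As an illustrative check of these definitions (not part of the original notes), the sketch below compares the analytic gradient and Hessian of a quadratic function with central finite differences; NumPy is assumed to be available, and the particular f is an arbitrary example.

import numpy as np

# Illustrative scalar-valued f(x) = (1/2) x'Ax + b'x with symmetric A.
rng = np.random.default_rng(0)
n = 4
A = rng.standard_normal((n, n))
A = (A + A.T) / 2                      # symmetrize so the Hessian is simply A
b = rng.standard_normal(n)

f = lambda x: 0.5 * x @ A @ x + b @ x
grad = lambda x: A @ x + b             # analytic gradient df/dx(x)
hess = lambda x: A                     # analytic Hessian d^2 f / (dx dx')(x)

def num_grad(fun, x, h=1e-6):
    """Central-difference approximation of the gradient."""
    g = np.zeros_like(x)
    for i in range(x.size):
        e = np.zeros_like(x)
        e[i] = h
        g[i] = (fun(x + e) - fun(x - e)) / (2 * h)
    return g

def num_hess(gradfun, x, h=1e-6):
    """Central-difference approximation of the Hessian, column by column."""
    H = np.zeros((x.size, x.size))
    for j in range(x.size):
        e = np.zeros_like(x)
        e[j] = h
        H[:, j] = (gradfun(x + e) - gradfun(x - e)) / (2 * h)
    return H

x = rng.standard_normal(n)
print(np.allclose(num_grad(f, x), grad(x), atol=1e-5))       # gradient matches
print(np.allclose(num_hess(grad, x), hess(x), atol=1e-5))    # Hessian matches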
1.2 Vector-Valued Functions
Let m > 1.
Definition 3 The Jacobian matrix of f(·) at x is
\[
\frac{df}{dx'}(x) := \begin{bmatrix} \dfrac{df_1}{dx'}(x) \\ \vdots \\ \dfrac{df_m}{dx'}(x) \end{bmatrix}.
\]
Remark: We define
\[
\frac{df}{dx}(x) := \left(\frac{df}{dx'}(x)\right)'.
\]
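A minimal sketch of the row-stacking convention (row i of df/dx'(x) is the transposed gradient of f_i), assuming NumPy and an illustrative map f(x) = A tanh(x), which is not taken from the notes:

import numpy as np

rng = np.random.default_rng(1)
m, n = 3, 4
A = rng.standard_normal((m, n))

f = lambda x: A @ np.tanh(x)                    # illustrative f : R^n -> R^m
jac = lambda x: A * (1 - np.tanh(x) ** 2)       # analytic df/dx'(x), an m x n matrix

def num_jac(fun, x, h=1e-6):
    """Column j of df/dx'(x) is the partial derivative of f with respect to x_j."""
    cols = []
    for j in range(x.size):
        e = np.zeros_like(x)
        e[j] = h
        cols.append((fun(x + e) - fun(x - e)) / (2 * h))
    return np.column_stack(cols)

x = rng.standard_normal(n)
print(np.allclose(num_jac(f, x), jac(x), atol=1e-6))   # True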
1.3 Differentiation Rules
Theorem 1 ∀ a, x ∈ ℝ^n,
\[
\frac{d\,a'x}{dx} = a.
\]
Proof. By the definition of the gradient,
\[
\frac{d\,a'x}{dx}
:= \begin{bmatrix} \dfrac{\partial\,a'x}{\partial x_1} \\ \vdots \\ \dfrac{\partial\,a'x}{\partial x_n} \end{bmatrix}
= \begin{bmatrix} \dfrac{\partial}{\partial x_1} \sum_{i=1}^n a_i x_i \\ \vdots \\ \dfrac{\partial}{\partial x_n} \sum_{i=1}^n a_i x_i \end{bmatrix}
= \begin{bmatrix} a_1 \\ \vdots \\ a_n \end{bmatrix}
= a. \qquad \square
\]
Theorem 2 ∀ A ∈ ℝ^{m×n}, ∀ x ∈ ℝ^n,
\[
\frac{dAx}{dx'} = A.
\]
Proof. Write a_i' for the i-th row of A. By the definition of the Jacobian matrix,
\[
\frac{dAx}{dx'}
:= \begin{bmatrix} \dfrac{d\,a_1'x}{dx'} \\ \vdots \\ \dfrac{d\,a_m'x}{dx'} \end{bmatrix}
= \begin{bmatrix} a_1' \\ \vdots \\ a_m' \end{bmatrix}
= A. \qquad \square
\]
Theorem 3 ∀ A ∈ ℝ^{n×n}, ∀ x ∈ ℝ^n,
\[
\frac{d\,x'Ax}{dx} = (A + A')x.
\]
Proof. We can write
\[
x'Ax = \sum_{i=1}^n \sum_{j=1}^n x_i a_{i,j} x_j.
\]
By the definition of the gradient,
\begin{align*}
\frac{d\,x'Ax}{dx}
&:= \begin{bmatrix} \dfrac{\partial\,x'Ax}{\partial x_1} \\ \vdots \\ \dfrac{\partial\,x'Ax}{\partial x_n} \end{bmatrix}
= \begin{bmatrix} \dfrac{\partial}{\partial x_1} \sum_{i=1}^n \sum_{j=1}^n x_i a_{i,j} x_j \\ \vdots \\ \dfrac{\partial}{\partial x_n} \sum_{i=1}^n \sum_{j=1}^n x_i a_{i,j} x_j \end{bmatrix} \\
&= \begin{bmatrix} \dfrac{\partial}{\partial x_1} \Bigl( x_1 \sum_{j=1}^n a_{1,j} x_j + \sum_{i \neq 1} \sum_{j=1}^n x_i a_{i,j} x_j \Bigr) \\ \vdots \\ \dfrac{\partial}{\partial x_n} \Bigl( x_n \sum_{j=1}^n a_{n,j} x_j + \sum_{i \neq n} \sum_{j=1}^n x_i a_{i,j} x_j \Bigr) \end{bmatrix} \\
&= \begin{bmatrix} \sum_{j=1}^n a_{1,j} x_j + \sum_{i=1}^n a_{i,1} x_i \\ \vdots \\ \sum_{j=1}^n a_{n,j} x_j + \sum_{i=1}^n a_{i,n} x_i \end{bmatrix}
= \begin{bmatrix} a_1'x + a^{(1)\prime}x \\ \vdots \\ a_n'x + a^{(n)\prime}x \end{bmatrix} \\
&= \begin{bmatrix} a_1' \\ \vdots \\ a_n' \end{bmatrix} x + \begin{bmatrix} a^{(1)\prime} \\ \vdots \\ a^{(n)\prime} \end{bmatrix} x
= Ax + A'x = (A + A')x,
\end{align*}
where a_i' denotes the i-th row of A and a^{(j)} its j-th column. \square
Remark: If A is symmetric, then
\[
\frac{d\,x'Ax}{dx} = 2Ax.
\]
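The rules above can be verified numerically; the following sketch (illustrative a, A, and x, NumPy assumed) checks Theorems 1 and 3 and the symmetric case with central differences. Theorem 2 can be checked the same way with a finite-difference Jacobian as in Section 1.2.

import numpy as np

rng = np.random.default_rng(2)
n = 5
a = rng.standard_normal(n)
A = rng.standard_normal((n, n))
x = rng.standard_normal(n)

def num_grad(fun, x, h=1e-6):
    """Central-difference gradient, as in Section 1.1."""
    g = np.zeros_like(x)
    for i in range(x.size):
        e = np.zeros_like(x)
        e[i] = h
        g[i] = (fun(x + e) - fun(x - e)) / (2 * h)
    return g

# Theorem 1: d(a'x)/dx = a
print(np.allclose(num_grad(lambda z: a @ z, x), a))
# Theorem 3: d(x'Ax)/dx = (A + A')x
print(np.allclose(num_grad(lambda z: z @ A @ z, x), (A + A.T) @ x, atol=1e-5))
# Remark: for symmetric A the gradient is 2Ax
S = (A + A.T) / 2
print(np.allclose(num_grad(lambda z: z @ S @ z, x), 2 * S @ x, atol=1e-5))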
2 Application to OLS
2.1 Unconstrained Maximization
Let X ⊂ ℝ^n be open, Y ⊂ ℝ, and f : X → Y be differentiable and concave. Consider
\[
\max_{x \in X} f(x).
\]
The FOC is
\[
\nabla f(x^*) = 0.
\]
Since f(·) is concave, the FOC is also sufficient.
2.2 OLS
Let (y, X) be a (1 + k)-variate random sample of size n. Assume a linear regression model for y given X such that
\[
E(y \mid X) = X\beta.
\]
The ordinary least squares (OLS) estimator of β solves
\[
\max_{\beta \in \mathbb{R}^k} \; -\frac{1}{2}(y - X\beta)'(y - X\beta).
\]
By Theorems 1 and 3, the FOC is
\[
X'(y - Xb) = 0.
\]
Assume that X has full column rank. Then
\[
b = (X'X)^{-1}X'y.
\]
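A minimal numerical sketch of the OLS formula (simulated data with arbitrary illustrative coefficients; NumPy assumed):

import numpy as np

rng = np.random.default_rng(3)
n, k = 200, 3
X = rng.standard_normal((n, k))
beta = np.array([1.0, -2.0, 0.5])              # illustrative true coefficients
y = X @ beta + rng.standard_normal(n)

b = np.linalg.solve(X.T @ X, X.T @ y)          # b = (X'X)^{-1} X'y
print(np.allclose(X.T @ (y - X @ b), 0, atol=1e-8))            # FOC: X'(y - Xb) = 0
print(np.allclose(b, np.linalg.lstsq(X, y, rcond=None)[0]))    # agrees with a library solver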
3 Application to RLS
3.1 Constrained Maximization
Let X ⊂ ℝ^n be open and convex, Y ⊂ ℝ, f : X → Y be differentiable and concave, and g : X → ℝ^m be differentiable and convex. Consider
\[
\max_{x \in X} f(x) \quad \text{s.t. } g(x) = 0.
\]
Assume that a constraint qualification holds. The Lagrange function is
\[
L(x; \lambda) = f(x) + \lambda'g(x).
\]
The FOCs are
\[
\nabla f(x^*) + \frac{dg}{dx}(x^*)\,\lambda^* = 0, \qquad g(x^*) = 0.
\]
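For a concave quadratic objective and an affine constraint, these FOCs form a linear system in (x*, λ*). The sketch below (illustrative Q, c, G, d, none of them from the notes; NumPy assumed) solves that system directly:

import numpy as np

rng = np.random.default_rng(4)
n, m = 5, 2

# Illustrative concave objective f(x) = -(1/2) x'Qx + c'x with Q symmetric positive definite.
M = rng.standard_normal((n, n))
Q = M @ M.T + n * np.eye(n)
c = rng.standard_normal(n)

# Illustrative affine constraint g(x) = Gx - d = 0 (G has full row rank).
G = rng.standard_normal((m, n))
d = rng.standard_normal(m)

# FOCs: -Q x* + c + G' lam* = 0 and G x* - d = 0, a linear system in (x*, lam*).
KKT = np.block([[Q, -G.T], [G, np.zeros((m, m))]])
rhs = np.concatenate([c, d])
sol = np.linalg.solve(KKT, rhs)
x_star, lam_star = sol[:n], sol[n:]

print(np.allclose(-Q @ x_star + c + G.T @ lam_star, 0))    # stationarity
print(np.allclose(G @ x_star - d, 0))                      # feasibility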
3.2 RLS
The restricted least squares (RLS) estimator of β subject to Rβ = r solves
\[
\max_{\beta \in \mathbb{R}^k} \; -\frac{1}{2}(y - X\beta)'(y - X\beta) \quad \text{s.t. } R\beta - r = 0.
\]
Assume that R has full row rank (constraint qualification). The Lagrange function is
\[
L(\beta; \lambda) := -\frac{1}{2}(y - X\beta)'(y - X\beta) + \lambda'(R\beta - r)
= -\frac{1}{2}y'y + \frac{1}{2}\beta'X'y + \frac{1}{2}y'X\beta - \frac{1}{2}\beta'X'X\beta + \lambda'R\beta - \lambda'r.
\]
The FOCs are
\[
X'y - X'Xb^* + R'\lambda^* = 0, \qquad Rb^* - r = 0.
\]
Premultiplying the first equation by R(X'X)^{-1},
\[
R(X'X)^{-1}X'y - Rb^* + R(X'X)^{-1}R'\lambda^* = 0,
\]
or, using the second FOC (Rb^* = r),
\[
Rb - r + R(X'X)^{-1}R'\lambda^* = 0,
\]
where b is the OLS estimator of β. Assume that X has full column rank. Then
\[
\lambda^* = -\left[R(X'X)^{-1}R'\right]^{-1}(Rb - r).
\]
Plugging this back into the first equation,
\[
X'y - X'Xb^* - R'\left[R(X'X)^{-1}R'\right]^{-1}(Rb - r) = 0,
\]
or
\[
X'Xb^* = X'y - R'\left[R(X'X)^{-1}R'\right]^{-1}(Rb - r),
\]
or
\[
b^* = b - (X'X)^{-1}R'\left[R(X'X)^{-1}R'\right]^{-1}(Rb - r).
\]
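A minimal numerical sketch of the RLS formula, with simulated data and illustrative restrictions R and r (NumPy assumed):

import numpy as np

rng = np.random.default_rng(5)
n, k = 200, 4
X = rng.standard_normal((n, k))
y = X @ np.array([1.0, 0.5, -0.5, 2.0]) + rng.standard_normal(n)

# Illustrative restrictions R beta = r: beta_1 + beta_2 = 0 and beta_4 = 1.
R = np.array([[1.0, 1.0, 0.0, 0.0],
              [0.0, 0.0, 0.0, 1.0]])
r = np.array([0.0, 1.0])

XtX_inv = np.linalg.inv(X.T @ X)
b = XtX_inv @ X.T @ y                          # OLS: b = (X'X)^{-1} X'y
W = R @ XtX_inv @ R.T                          # R (X'X)^{-1} R'
lam_star = -np.linalg.solve(W, R @ b - r)      # Lagrange multipliers
b_star = b + XtX_inv @ R.T @ lam_star          # equals b - (X'X)^{-1} R' W^{-1} (Rb - r)

print(np.allclose(R @ b_star, r))                                              # restriction holds
print(np.allclose(X.T @ y - X.T @ X @ b_star + R.T @ lam_star, 0, atol=1e-8))  # first FOC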