The Role of the Implicit Function in the Lagrange Multiplier Rule


E. Bendito, A. Carmona, A.M. Encinas

Departament de Matemàtica Aplicada III

Universitat Politècnica de Catalunya
Abstract
In this note we present the necessary condition for a constrained extremum point and the criteria for its classification, using the implicit function, the chain rule and elementary techniques of linear algebra.
Key words. Multiplier rule, Implicit Function Theorem.
AMS subject classification. 26-01, 26B10
The constrained extremum problem is a topic that appears in most calculus courses. However, it seems to be difficult to decide the kind of critical point. We may find two types of textbooks in the literature:
Those that introduce the Lagrange Function to give the necessary condition but do not explain the sufficient condition because its proof exceeds their scope. (See [1, 2, 5].)
Those that give the necessary condition by means of the Lagrange Function and, to obtain the sufficient condition, introduce the concept of differential manifold and its tangent space. (See [3, 4].)
Our purpose is to show that the constrained extremum problem is a natural extension
of the unconstrained case, and therefore it can be handled in a similar way. Firstly, recall
that an unconstrained extremum problem can be written as follows:
Given $\Omega_n \subset \mathbb{R}^n$ an open set and $f : \Omega_n \to \mathbb{R}$ a $C^2$-function, obtain the local extrema of $f$.
Both the necessary and the sufficient conditions for this problem are well known. (See [1].)
On the other hand, a constrained extremum problem can be written as follows:

e-mail: angeles.carmona@upc.es
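Given $\Omega_n \subset \mathbb{R}^n$ an open set, $f : \Omega_n \to \mathbb{R}$ and $g : \Omega_n \to \mathbb{R}^m$ $C^2$-functions such that $\operatorname{rank}(Dg(y)) = m < n$, $y \in \Omega_n$, and $M = \{y \in \Omega_n : g(y) = 0\}$, obtain the local extrema of $f|_M$.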
How can we handle this problem? Why must we introduce the Lagrange function?
A possible answer to the last question could be that it transforms the constrained extremum problem into an unconstrained one. Let us examine the Lagrange function, which is defined as
$$L : \Omega_n \times \mathbb{R}^m \to \mathbb{R}, \qquad L(y, \lambda) = f(y) - \lambda_1 g_1(y) - \cdots - \lambda_m g_m(y).$$
We can see that this function depends on $m$ new variables, so the dimension of the underlying space has been increased. It would be desirable that this increase allowed us to apply the techniques of unconstrained extremum classification, but this is not possible, as the following simple example shows:
The function $f(y_1, y_2) = y_1^2 + y_2^2$ attains a local minimum on $M = \{(y_1, y_2) \in \mathbb{R}^2 : y_2 = 1\}$ at the point $(0, 1)$ with multiplier $\lambda = 2$. Nevertheless, the point $(0, 1, 2)$ is a saddle point of the function $L$.
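The saddle behaviour in this example can be checked by hand or with a few lines of code. The sketch below assumes the constraint is written as $g(y) = y_2 - 1$ (a choice consistent with the multiplier $\lambda = 2$), so that $L(y_1, y_2, \lambda) = y_1^2 + y_2^2 - \lambda (y_2 - 1)$. The names `HESSIAN_L` and `quadratic_form` are illustrative only.

```python
# Hessian of L(y1, y2, lam) = y1**2 + y2**2 - lam*(y2 - 1) with respect to
# (y1, y2, lam). L is a quadratic polynomial, so this matrix is constant.
# Assumption: the constraint y2 = 1 is encoded as g(y) = y2 - 1.
HESSIAN_L = [
    [2, 0, 0],
    [0, 2, -1],
    [0, -1, 0],
]


def quadratic_form(matrix, v):
    """Return <matrix v, v> for a square matrix and a vector of equal size."""
    size = len(v)
    return sum(matrix[i][j] * v[i] * v[j]
               for i in range(size) for j in range(size))


# Two directions through the critical point (0, 1, 2) of L.
up = quadratic_form(HESSIAN_L, (1, 0, 0))    # pure y1 direction
down = quadratic_form(HESSIAN_L, (0, 1, 2))  # mixed (y2, lam) direction

print(up, down)  # 2 -2
```

Since the quadratic form takes both signs, the Hessian of $L$ is indefinite and the critical point is a saddle, even though $f$ restricted to $M$ has a minimum there.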
Hence, we cannot proceed as in the unconstrained case. It is worth reflecting on the initial problem, that is, to obtain the local extrema of $f$ on $M = \{y \in \Omega_n : g(y) = 0\}$. The difference between this problem and the unconstrained one is that the set $M$ is a closed differential manifold instead of an open set of $\mathbb{R}^n$. Is it then necessary to introduce the theory of differential manifolds to classify the critical points of a constrained extremum problem? The answer is negative, because it suffices to consider the implicit function that describes $M$ from an open set of $\mathbb{R}^{n-m}$. Let us show how to do it.
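Let $y_0 \in \Omega_n$. As $\operatorname{rank}(Dg(y_0)) = m < n$, we can apply the Implicit Function Theorem to $g(y) = 0$. Then, if $x = (y_1, \dots, y_k)$, $k = n - m$, there exist an open set $\Omega_k \subset \mathbb{R}^k$ and a $C^2$-function $\varphi : \Omega_k \to \Omega_n$ such that
$$\varphi(x) = (x, y_{k+1}(x), \dots, y_n(x)), \quad \text{with } \varphi(x_0) = y_0, \tag{1}$$
and $g \circ \varphi \equiv 0$. Hence, $f \circ \varphi = f|_M$.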
Then, the problem of finding the local extrema of $f|_M$ is equivalent to the problem of finding the local extrema of $f \circ \varphi : \Omega_k \to \mathbb{R}$ with the additional information given by the identity $g \circ \varphi \equiv 0$.
Consequently, we have achieved our purpose, since the constrained extremum problem can be treated as an unconstrained one. In addition, the dimension of the underlying space is smaller, which is natural, since $g(y) = 0$ defines some relations between the variables.
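As a small illustration of this reduction, the following sketch uses a hypothetical example that is not taken from the note: $f(y) = y_1^2 + y_2^2$ with the constraint $g(y) = y_1 + y_2 - 2$. Here the implicit function can be written explicitly as $x \mapsto (x, 2 - x)$, and the constrained problem in two variables becomes an unconstrained one in a single variable. The function names and the sampling grid are assumptions made for the example.

```python
def f(y):
    """Objective function on the plane."""
    return y[0] ** 2 + y[1] ** 2


def g(y):
    """Constraint function; M is the line g(y) = 0."""
    return y[0] + y[1] - 2


def phi(x):
    """Explicit solution of g(y) = 0 for y2 in terms of x = y1."""
    return (x, 2 - x)


def reduced(x):
    """The unconstrained one-variable function f o phi."""
    return f(phi(x))


# Sample the parameter x on a grid from -2.0 to 4.0 in steps of 0.1.
samples = [i / 10 for i in range(-20, 41)]

# g o phi vanishes identically (up to rounding), so phi parametrizes M.
constraint_ok = all(abs(g(phi(x))) < 1e-12 for x in samples)

# Minimizing the reduced function needs no multipliers at all.
best_x = min(samples, key=reduced)

print(constraint_ok, best_x, phi(best_x))
```

On this grid the reduced function $2x^2 - 4x + 4$ is smallest at $x = 1$, which corresponds to the point $(1, 1)$ of $M$.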
We must note that the above-mentioned references arrive at this point. However, they abandon this approach in favour of the Lagrange Function. We will keep to it and proceed by applying the unconstrained extremum theory.
Let us abstract the role played by the implicit function in order to clarify the problem.
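Study the local extrema of the function $f \circ h$, knowing that $g \circ h \equiv 0$, where $f : \Omega_n \to \mathbb{R}$, $g : \Omega_n \to \mathbb{R}^m$, and $h : \Omega_k \to \mathbb{R}^n$ are $C^2$-functions with $\Omega_n \subset \mathbb{R}^n$, $\Omega_k \subset \mathbb{R}^k$ open sets and $h(\Omega_k) \subset \Omega_n$.
Let $x_0 \in \Omega_k$ be a point where an extremum of $f \circ h$ occurs such that $\operatorname{rank}(Dg(y_0)) = n - \operatorname{rank}(Dh(x_0))$, where $y_0 = h(x_0)$.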
It will be useful to consider the transposed Jacobian matrices of $h$ and $g$, denoted by
$$A = (\nabla h_1(x_0), \dots, \nabla h_n(x_0)), \qquad B = (\nabla g_1(y_0), \dots, \nabla g_m(y_0)),$$
respectively, and
$$V = \operatorname{span}\{\nabla g_1(y_0), \dots, \nabla g_m(y_0)\}.$$
By applying the chain rule and using that $g \circ h \equiv 0$, we obtain
$$Dg(y_0)\, Dh(x_0) = 0.$$
If we transpose this expression we get
$$AB = 0,$$
and hence $V \subset \operatorname{Ker} A$. As $\dim V = \operatorname{rank}(B) = n - \operatorname{rank}(A) = \dim \operatorname{Ker} A$, we arrive at
$$V = \operatorname{Ker} A. \tag{2}$$
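The identity $AB = 0$ can be verified on a concrete case. The sketch below uses a hypothetical example that does not appear in the note: the unit sphere $g(y) = y_1^2 + y_2^2 + y_3^2 - 1$ with the parametrization $h(x) = (x_1, x_2, \sqrt{1 - x_1^2 - x_2^2})$, so $n = 3$, $k = 2$, $m = 1$. Exact rational arithmetic is used, and the point is chosen so that all entries are rational.

```python
from fractions import Fraction as Fr

# Assumed example: x0 = (3/5, 0), hence y0 = h(x0) = (3/5, 0, 4/5).
x0 = (Fr(3, 5), Fr(0))
y0 = (Fr(3, 5), Fr(0), Fr(4, 5))

# A is k x n: its columns are the gradients of h_1, h_2, h_3 at x0.
# For h_3 = sqrt(1 - x1^2 - x2^2) the gradient is (-x1 / h_3, -x2 / h_3).
A = [
    [Fr(1), Fr(0), -x0[0] / y0[2]],
    [Fr(0), Fr(1), -x0[1] / y0[2]],
]

# B is n x m: its single column is the gradient of g at y0, that is 2 * y0.
B = [[2 * y0[0]], [2 * y0[1]], [2 * y0[2]]]


def matmul(P, Q):
    """Plain matrix product of two nested-list matrices."""
    return [[sum(P[i][s] * Q[s][j] for s in range(len(Q)))
             for j in range(len(Q[0]))]
            for i in range(len(P))]


AB = matmul(A, B)
print(AB)
```

In this example $\operatorname{rank}(A) = 2$ and $\operatorname{rank}(B) = 1 = n - \operatorname{rank}(A)$, so the gradient of $g$ at $y_0$ spans the whole kernel of $A$, as stated in (2).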
Let us examine the necessary condition. If $f \circ h$ attains an extremum at the point $x_0$, then $D(f \circ h)(x_0) = 0$. Applying again the chain rule we obtain
$$Df(y_0)\, Dh(x_0) = 0.$$
If we transpose this expression we get
$$A \nabla f(y_0) = 0,$$
which means that $\nabla f(y_0) \in \operatorname{Ker} A$. Hence, by (2) there exists $\lambda \in \mathbb{R}^m$ such that
$$\nabla f(y_0) = \sum_{l=1}^{m} \lambda_l \nabla g_l(y_0). \tag{3}$$
Note that $\lambda$ is not unique, except when $\operatorname{rank}(Dg(y_0)) = m$.
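For a single constraint ($m = 1$), condition (3) says that $\nabla f(y_0)$ is a multiple of $\nabla g(y_0)$. A minimal sketch of this test follows, applied to the note's example $f(y) = y_1^2 + y_2^2$ on $y_2 = 1$. It assumes the constraint is encoded as $g(y) = y_2 - 1$, and the helper name `multiplier` is hypothetical.

```python
def multiplier(grad_f, grad_g):
    """Best-fit lam for grad_f = lam * grad_g (single constraint, m = 1).

    Returns lam together with the residual grad_f - lam * grad_g.
    Condition (3) holds exactly when the residual is the zero vector.
    """
    dot_fg = sum(a * b for a, b in zip(grad_f, grad_g))
    dot_gg = sum(b * b for b in grad_g)
    lam = dot_fg / dot_gg
    residual = [a - lam * b for a, b in zip(grad_f, grad_g)]
    return lam, residual


def grad_f_at(y):
    """Gradient of f(y) = y1^2 + y2^2."""
    return (2 * y[0], 2 * y[1])


GRAD_G = (0.0, 1.0)  # gradient of g(y) = y2 - 1, the same at every point

# At the constrained minimum (0, 1) the residual vanishes and lam = 2.
lam, residual = multiplier(grad_f_at((0.0, 1.0)), GRAD_G)

# At another point of M, for instance (1, 1), condition (3) fails.
lam_other, residual_other = multiplier(grad_f_at((1.0, 1.0)), GRAD_G)

print(lam, residual)
print(lam_other, residual_other)
```

The nonzero residual at $(1, 1)$ shows that this point of $M$ is not a candidate for an extremum of $f|_M$.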
To derive the sufficient condition, we must know $H(f \circ h)$, the Hessian matrix of $f \circ h$. By the chain rule,
$$D_{ij}(f \circ h)(x_0) = \sum_{p,q=1}^{n} D_{pq} f(y_0)\, D_j h_q(x_0)\, D_i h_p(x_0) + \sum_{p=1}^{n} D_p f(y_0)\, D_{ij} h_p(x_0). \tag{4}$$
Analogously, for all $l = 1, \dots, m$,
$$D_{ij}(g_l \circ h)(x_0) = \sum_{p,q=1}^{n} D_{pq} g_l(y_0)\, D_j h_q(x_0)\, D_i h_p(x_0) + \sum_{p=1}^{n} D_p g_l(y_0)\, D_{ij} h_p(x_0) = 0. \tag{5}$$
From (3) and (5) we get
$$\sum_{p=1}^{n} D_p f(y_0)\, D_{ij} h_p(x_0) = -\sum_{l=1}^{m} \sum_{p,q=1}^{n} \lambda_l\, D_{pq} g_l(y_0)\, D_j h_q(x_0)\, D_i h_p(x_0). \tag{6}$$
Replacing the last term of (4) by the right-hand side of equation (6) we obtain
$$D_{ij}(f \circ h)(x_0) = \sum_{p,q=1}^{n} D_{pq}\Big(f - \sum_{l=1}^{m} \lambda_l g_l\Big)(y_0)\, D_j h_q(x_0)\, D_i h_p(x_0).$$
Finally,
$$H(f \circ h)(x_0) = A\, H\Big(f - \sum_{l=1}^{m} \lambda_l g_l\Big)(y_0)\, A^T. \tag{7}$$
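Identity (7) can be checked on a small case where both sides are computable by hand. The sketch below uses a hypothetical example that is not from the note: $f(y) = y_1^2 + (y_2 - 1)^2$, $g(y) = y_2 - y_1^2$ and $h(x) = (x, x^2)$, so $n = 2$, $k = 1$, $m = 1$. Then $g \circ h \equiv 0$ and $(f \circ h)(x) = x^4 - x^2 + 1$ has a critical point at $x_0 = 0$.

```python
# Assumed example: f(y) = y1^2 + (y2 - 1)^2, g(y) = y2 - y1^2, h(x) = (x, x^2).
x0 = 0
y0 = (x0, x0 ** 2)

# A is 1 x 2: its columns are the derivatives h_1'(x0) and h_2'(x0).
A = [[1, 2 * x0]]

grad_f = (2 * y0[0], 2 * (y0[1] - 1))  # (0, -2) at y0
grad_g = (-2 * y0[0], 1)               # (0, 1) at y0
lam = grad_f[1] / grad_g[1]            # multiplier from (3)

hess_f = [[2, 0], [0, 2]]
hess_g = [[-2, 0], [0, 0]]

# Hessian of f - lam * g at y0.
hess_lagrangian = [[hess_f[i][j] - lam * hess_g[i][j] for j in range(2)]
                   for i in range(2)]

# Right-hand side of (7): A * H(f - lam g)(y0) * A^T, a 1 x 1 matrix here.
row = A[0]
rhs = sum(row[p] * hess_lagrangian[p][q] * row[q]
          for p in range(2) for q in range(2))

# Left-hand side of (7): second derivative of x^4 - x^2 + 1 at x0.
lhs = 12 * x0 ** 2 - 2

print(lam, lhs, rhs)
```

Both sides equal $-2$. Note that $H(f)(y_0)$ alone is positive definite, so the multiplier term in (7) is what makes the sign come out right.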
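Let $q$ and $Q$ be the quadratic forms on $\mathbb{R}^k$ and $\mathbb{R}^n$ defined by
$$q(w) = \langle H(f \circ h)(x_0)\, w, w \rangle \quad \text{and} \quad Q(v) = \Big\langle H\Big(f - \sum_{l=1}^{m} \lambda_l g_l\Big)(y_0)\, v, v \Big\rangle,$$
respectively. Then from (7),
$$q(w) = \langle H(f \circ h)(x_0)\, w, w \rangle = \Big\langle H\Big(f - \sum_{l=1}^{m} \lambda_l g_l\Big)(y_0)\, A^T w, A^T w \Big\rangle = Q(A^T w)$$
and hence $q = Q|_{\operatorname{Im} A^T}$. On the other hand, from (2) and taking into account that $(\operatorname{Ker} A)^{\perp} = \operatorname{Im} A^T$, we get $\operatorname{Im} A^T = V^{\perp}$ and therefore $q = Q|_{V^{\perp}}$. Hence, $H(f \circ h)(x_0)$ is definite [strictly definite] iff $H\big(f - \sum_{l=1}^{m} \lambda_l g_l\big)(y_0)$ is definite [strictly definite] on $V^{\perp}$.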
So, applying the standard criterion for the classification of unconstrained extrema, we have that: If the function $f \circ h$ attains a local extremum at $x_0$, then $H\big(f - \sum_{l=1}^{m} \lambda_l g_l\big)(y_0)$ is definite on $V^{\perp}$. Moreover, if $H\big(f - \sum_{l=1}^{m} \lambda_l g_l\big)(y_0)$ is strictly definite on $V^{\perp}$, then the function $f \circ h$ attains a local extremum at $x_0$.
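In the simplest setting $n = 2$, $m = 1$, the subspace $V^{\perp}$ is a line, so the criterion reduces to the sign of a single number. The sketch below is one possible way to code this special case and is not a general implementation. The function name `classify` and the second test case ($f(y) = y_1^2 + (y_2 - 1)^2$ on $y_2 = y_1^2$ at the origin, with $\lambda = -2$) are assumptions introduced for illustration.

```python
def classify(hess_lagrangian, grad_g):
    """Classify a constrained critical point for n = 2 and m = 1.

    hess_lagrangian is H(f - lam * g)(y0) as a 2 x 2 nested list and grad_g
    is the (nonzero) gradient of g at y0. The decision uses the sign of the
    quadratic form Q on the orthogonal complement of span{grad_g}.
    """
    # A vector orthogonal to grad_g spans the complement in the plane.
    t = (-grad_g[1], grad_g[0])
    value = sum(hess_lagrangian[i][j] * t[i] * t[j]
                for i in range(2) for j in range(2))
    if value > 0:
        return "local minimum"
    if value < 0:
        return "local maximum"
    return "inconclusive"


# The note's example: f = y1^2 + y2^2 on y2 = 1, at (0, 1) with lam = 2.
# The constraint is linear, so H(f - lam * g) = H(f).
note_example = classify([[2, 0], [0, 2]], (0, 1))

# Assumed example: f = y1^2 + (y2 - 1)^2 on y2 = y1^2, at (0, 0), lam = -2.
parabola_example = classify([[-2, 0], [0, 2]], (0, 1))

print(note_example, parabola_example)
```

The zero case is reported as inconclusive because the necessary condition only gives definiteness, while the sufficient condition requires strict definiteness on $V^{\perp}$.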
Note that the function $h$ has disappeared from both the necessary and the sufficient conditions.
An immediate consequence of the above results is the Lagrange Multiplier Rule for the solution of the constrained extremum problem. For this, it suffices to take as $h$ the function $\varphi$ defined in (1).
Lagrange Multiplier Rule. Let $y_0 \in \Omega_n$ be such that $\operatorname{rank}(Dg(y_0)) = m$. If the function $f$ attains a local extremum on $M$ at the point $y_0$, then there exists a unique $\lambda \in \mathbb{R}^m$ such that
$$\nabla f(y_0) = \sum_{i=1}^{m} \lambda_i \nabla g_i(y_0).$$
Moreover, we have obtained a criterion for the classification of such critical points:
If the function $f$ attains a local extremum on $M$ at the point $y_0$, then $H_y(L)(y_0, \lambda)$ is definite on $\operatorname{span}\{\nabla g_1(y_0), \dots, \nabla g_m(y_0)\}^{\perp}$. Moreover, if $H_y(L)(y_0, \lambda)$ is strictly definite on $\operatorname{span}\{\nabla g_1(y_0), \dots, \nabla g_m(y_0)\}^{\perp}$, then $f$ attains a local extremum on $M$ at the point $y_0$, where $L$ is the Lagrange Function.
We must note that the Lagrange Function appears here as a consequence of the reasoning and not as an a priori construction.
To sum up, the role played by the implicit function in the constrained extremum problem is to turn it into an unconstrained one. Moreover, the additional functional information $g \circ \varphi \equiv 0$ allows us to obtain the successive derivatives of $f \circ h$ independently of the knowledge of the function $\varphi$. Therefore, the Implicit Function Theorem has been used here only once. In contrast, the textbooks that classify constrained extrema need the Implicit Function Theorem again to introduce the concept of differential manifold.
Finally, in the language of differential manifolds, we get from (7) that
$$H(f|_M)(y_0) = H_y(L)(y_0, \lambda)\big|_{T_{y_0}(M) \times T_{y_0}(M)}.$$
This result was already obtained in [6].
References
[1] T. M. Apostol, 1974, Mathematical Analysis, (2nd Ed., Addison Wesley).
[2] R. Courant, F. John, 1974, Introduction to Calculus and Analysis, Vol. II, (John Wiley & Sons).
[3] C.H. Edwards, 1973, Advanced Calculus of Several Variables, (Academic Press).
[4] W. Fleming, 1977, Functions of Several Variables, (Springer-Verlag).
[5] W.H. Protter, C.B. Morrey Jr., 1977, A First Course in Real Analysis, (Springer-Verlag).
[6] P. Shutler, 1995, Constrained Critical Points, Amer. Math. Monthly, 102, 49–52.