Section 3.5 Extremum Problems and Lagrange Multipliers 147
148 Chapter 3 Calculus in Banach Spaces
This $H$ is a function of $(x_1, x_2, x_3, \lambda, \mu)$. The five equations to solve are
$$2(x_1 - c_1) + \lambda a_1 + \mu b_1 = 2(x_2 - c_2) + \lambda a_2 + \mu b_2 = 2(x_3 - c_3) + \lambda a_3 + \mu b_3 = 0$$
$$\langle a, x\rangle - k = \langle b, x\rangle - \ell = 0$$
We see that $x$ is of the form $x = c + \alpha a + \beta b$. When this is substituted in the second set of equations, we obtain two linear equations for determining $\alpha$ and $\beta$:
$$\langle a, a\rangle\alpha + \langle a, b\rangle\beta = k - \langle a, c\rangle \quad\text{and}\quad \langle a, b\rangle\alpha + \langle b, b\rangle\beta = \ell - \langle b, c\rangle$$
•

Theorem 2. Lagrange Multiplier. Let $f$ and $g$ be continuously differentiable real-valued functions on an open set $\Omega$ in a Banach space. Let $M = \{x \in \Omega : g(x) = 0\}$. If $x_0$ is a local minimum point of $f|M$ and if $g'(x_0) \ne 0$, then $f'(x_0) = \lambda g'(x_0)$ for some $\lambda \in \mathbb{R}$.
Proof. Let $X$ be the Banach space in question. Select a neighborhood $U$ of $x_0$ such that
$$x \in U \cap M \implies f(x_0) \le f(x)$$
We can assume $U \subset \Omega$. Define $F : U \to \mathbb{R}^2$ by $F(x) = (f(x), g(x))$. Then $F(x_0) = (f(x_0), 0)$ and $F'(x)v = (f'(x)v, g'(x)v)$ for all $v \in X$. Observe that if $r < f(x_0)$, then $(r, 0)$ is not in $F(U)$. Hence $F(U)$ is not a neighborhood of $F(x_0)$. By the Corollary in Section 3.4, $F'(x_0)$ is not surjective (as a linear map from $X$ to $\mathbb{R}^2$). Hence $F'(x_0)v = \alpha(v)(\theta, \mu)$ for some continuous linear functional $\alpha$. (Thus $\alpha \in X^*$.) It follows that $f'(x_0)v = \alpha(v)\theta$ and $g'(x_0)v = \alpha(v)\mu$. Since $g'(x_0) \ne 0$, $\mu \ne 0$. Therefore, $f'(x_0)v = \alpha(v)\theta = (\theta/\mu)\,g'(x_0)v$ for all $v \in X$, which is the assertion with $\lambda = \theta/\mu$.
•

Theorem 3. Lagrange Multipliers. Let $f, g_1, \ldots, g_n$ be continuously differentiable real-valued functions defined on an open set $\Omega$ in a Banach space $X$. Let $M = \{x \in \Omega : g_1(x) = \cdots = g_n(x) = 0\}$. If $x_0$ is a local minimum point of $f|M$ (the restriction of $f$ to $M$), then there is a nontrivial linear relation of the form
$$\mu f'(x_0) + \lambda_1 g_1'(x_0) + \cdots + \lambda_n g_n'(x_0) = 0$$
Proof. Select a neighborhood $U$ of $x_0$ such that $U \subset \Omega$ and such that $f(x_0) \le f(x)$ for all $x \in U \cap M$. Define $F : U \to \mathbb{R}^{n+1}$ by the equation
$$F(x) = (f(x), g_1(x), \ldots, g_n(x))$$
If $r < f(x_0)$, then the point $(r, 0, 0, \ldots, 0)$ is not in $F(U)$. Thus $F(U)$ does not contain a neighborhood of the point $(f(x_0), g_1(x_0), \ldots, g_n(x_0)) = (f(x_0), 0, 0, \ldots, 0)$. By the Corollary in Section 3.4, page 143, $F'(x_0)$ is not surjective. Since the range of $F'(x_0)$ is a linear subspace of $\mathbb{R}^{n+1}$, we now know that it is a proper subspace of $\mathbb{R}^{n+1}$. Hence it is contained in a hyperplane through the origin. This means that for some $\mu, \lambda_1, \ldots, \lambda_n$ (not all zero) we have
$$\mu f'(x_0)v + \lambda_1 g_1'(x_0)v + \cdots + \lambda_n g_n'(x_0)v = 0$$
for all $v \in X$. This implies the equation in the statement of the theorem.
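The two-constraint example earlier in this section reduces to a 2-by-2 linear system for $\alpha$ and $\beta$, and its stationarity equations are consistent with minimizing $\|x - c\|^2$ subject to $\langle a, x\rangle = k$ and $\langle b, x\rangle = \ell$. A minimal Python sketch of that computation is given here. The sample vectors, the constants, and the helper names `dot` and `nearest_point` are illustrative assumptions and do not come from the text.

```python
# Sketch only: sample data and helper names are assumptions for illustration.

def dot(u, v):
    """Euclidean inner product of two equal-length lists."""
    return sum(ui * vi for ui, vi in zip(u, v))

def nearest_point(a, b, c, k, ell):
    """Solve the 2x2 system for (alpha, beta); return x = c + alpha*a + beta*b."""
    aa, ab, bb = dot(a, a), dot(a, b), dot(b, b)
    r1 = k - dot(a, c)        # right-hand side of the first equation
    r2 = ell - dot(b, c)      # right-hand side of the second equation
    det = aa * bb - ab * ab   # nonzero when a and b are linearly independent
    alpha = (r1 * bb - ab * r2) / det   # Cramer's rule
    beta = (aa * r2 - ab * r1) / det
    return [ci + alpha * ai + beta * bi for ai, bi, ci in zip(a, b, c)]

# Made-up sample data in R^3.
a, b, c = [1.0, 1.0, 0.0], [0.0, 1.0, 1.0], [1.0, 2.0, 3.0]
k, ell = 1.0, 2.0

x = nearest_point(a, b, c, k, ell)
print(x)                      # the candidate minimizer
print(dot(a, x), dot(b, x))   # should reproduce k and ell up to rounding
```

Because $x - c$ lies in the span of $a$ and $b$, the computed point is the orthogonal projection of $c$ onto the intersection of the two planes, which is what the multiplier equations predict for this sample.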
•

Example 4. Let $A$ be a compact Hermitian operator on a Hilbert space $X$. Then $\|A\| = \max\{|\lambda| : \lambda \in \Lambda(A)\}$, where $\Lambda(A)$ is the set of eigenvalues of $A$. This is proved by Lemma 2, page 92, together with Problem 22, page 101. Then by Lemma 2 in Section 2.3, page 85, we have
$$\|A\| = \sup\{|\langle Ax, x\rangle| : \|x\| = 1\}.$$
Hence we can find an eigenvalue of $A$ by determining an extremum of $\langle Ax, x\rangle$ on the set defined by $\|x\| = 1$. An alternative is given by the next result.
•

Lemma. If $A$ is Hermitian, then the "Rayleigh Quotient" $f(x) = \langle Ax, x\rangle / \langle x, x\rangle$ has a stationary value at each eigenvector.

Proof. Let $Ax = \lambda x$, $x \ne 0$. Then $f(x) = \langle Ax, x\rangle / \langle x, x\rangle = \lambda$. Recall that the eigenvalues of a Hermitian operator are real. Let us compute the derivative of $f$ at $x$ and show that it is 0.
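$$
\begin{aligned}
\lim_{h \to 0} \frac{|f(x+h) - f(x)|}{\|h\|}
&= \lim_{h \to 0} \left| \frac{\langle Ax + Ah, x + h\rangle}{\langle x + h, x + h\rangle} - \lambda \right| \Big/ \|h\| \\
&= \lim_{h \to 0} \frac{\left| \langle Ax, x\rangle + \langle Ah, x\rangle + \langle Ax, h\rangle + \langle Ah, h\rangle - \lambda \|x + h\|^2 \right|}{\|h\|\,\|x + h\|^2} \\
&= \lim_{h \to 0} \frac{\left| \langle h, Ax\rangle + \lambda\langle x, h\rangle + \langle Ah, h\rangle - 2\lambda \operatorname{Re}\langle x, h\rangle - \lambda\langle h, h\rangle \right|}{\|h\|\,\|x + h\|^2} \\
&= \lim_{h \to 0} \frac{\left| \lambda\langle h, x\rangle + \lambda\langle x, h\rangle + \langle Ah, h\rangle - 2\lambda \operatorname{Re}\langle x, h\rangle - \lambda\langle h, h\rangle \right|}{\|h\|\,\|x + h\|^2} \\
&= \lim_{h \to 0} \frac{\left| \langle Ah, h\rangle - \lambda\langle h, h\rangle \right|}{\|h\|\,\|x + h\|^2} \\
&= \lim_{h \to 0} \frac{\left| \langle Ah - \lambda h, h\rangle \right|}{\|h\|\,\|x + h\|^2} \\
&\le \lim_{h \to 0} \frac{\|Ah - \lambda h\|\,\|h\|}{\|h\|\,\|x + h\|^2} \\
&\le \lim_{h \to 0} \frac{\|A - \lambda I\|\,\|h\|}{\|x + h\|^2} = 0
\end{aligned}
$$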
Thus from the definition of $f'(x)$ as the operator that makes the equation
$$\lim_{h \to 0} \frac{|f(x+h) - f(x) - f'(x)h|}{\|h\|} = 0$$
true, we have $f'(x) = 0$.
•

Since the Rayleigh quotient can be written as
$$\frac{\langle Ax, x\rangle}{\|x\|^2} = \left\langle A\frac{x}{\|x\|}, \frac{x}{\|x\|} \right\rangle$$
it is possible to consider the simpler function $F(x) = \langle Ax, x\rangle$ restricted to the unit sphere.
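The lemma can be checked numerically on a small real symmetric matrix by approximating directional derivatives of the Rayleigh quotient with central differences. The sketch below is only an illustration: the matrix, the test direction `h`, the step `t`, and all function names are assumptions chosen for the example.

```python
# Sketch only: matrix, direction, and names are assumptions for illustration.

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def matvec(A, x):
    return [dot(row, x) for row in A]

def rayleigh(A, x):
    """Rayleigh quotient <Ax, x> / <x, x> for a real symmetric A."""
    return dot(matvec(A, x), x) / dot(x, x)

def directional_derivative(A, x, h, t=1e-6):
    """Central-difference estimate of d/dt f(x + t h) at t = 0."""
    xp = [xi + t * hi for xi, hi in zip(x, h)]
    xm = [xi - t * hi for xi, hi in zip(x, h)]
    return (rayleigh(A, xp) - rayleigh(A, xm)) / (2.0 * t)

A = [[2.0, 1.0],
     [1.0, 2.0]]
eigvec = [1.0, 1.0]   # A applied to (1, 1) gives (3, 3), so the eigenvalue is 3
other = [1.0, 0.0]    # not an eigenvector of A
h = [0.3, -0.7]       # arbitrary direction

value = rayleigh(A, eigvec)
d_eig = directional_derivative(A, eigvec, h)
d_other = directional_derivative(A, other, h)
print(value)     # equals the eigenvalue
print(d_eig)     # approximately zero: stationary at the eigenvector
print(d_other)   # clearly nonzero: not stationary elsewhere
```

For this matrix the quotient equals $2 + 2x_1x_2/(x_1^2 + x_2^2)$, so the derivative at $(1, 0)$ in the direction $h$ is $2h_2 = -1.4$, which the finite difference should reproduce closely.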
Theorem 4. If $A$ is a Hermitian operator on a Hilbert space, then each local constrained minimum or maximum point of $\langle Ax, x\rangle$ on the unit sphere is an eigenvector of $A$. The value of $\langle Ax, x\rangle$ is the corresponding eigenvalue.

Proof. Use $F(x) = \langle Ax, x\rangle$ and $G(x) = \|x\|^2 - 1$. Then
$$F'(x)h = 2\langle Ax, h\rangle \qquad G'(x)h = 2\langle x, h\rangle$$
Our theorem about Lagrange multipliers gives a necessary condition in order that $x$ be a local extremum, namely that $\mu F'(x) + \lambda G'(x) = 0$ in a nontrivial manner. Since $\|x\| = 1$, $G'(x) \ne 0$. Hence $\mu \ne 0$, and by the homogeneity we can set $\mu = -1$. This leads to
$$-2\langle Ax, h\rangle + 2\lambda\langle x, h\rangle = 0 \qquad (h \in X)$$
whence $Ax = \lambda x$.
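Theorem 4 suggests a numerical route to an eigenvector: move $x$ uphill on $\langle Ax, x\rangle$ and renormalize so that it stays on the unit sphere. The sketch below does this with a fixed step size on a real symmetric matrix. The iteration scheme, the 3-by-3 matrix, the starting vector, the step size, and every function name are illustrative assumptions; the text itself prescribes no algorithm.

```python
# Sketch only: the ascent scheme and sample matrix are assumptions.
import math

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def matvec(A, x):
    return [dot(row, x) for row in A]

def normalize(x):
    n = math.sqrt(dot(x, x))
    return [xi / n for xi in x]

def maximize_on_sphere(A, x0, step=0.25, iters=200):
    """Gradient ascent on F(x) = <Ax, x>, renormalizing after every step."""
    x = normalize(x0)
    for _ in range(iters):
        Ax = matvec(A, x)
        # F'(x)h = 2<Ax, h>, so 2*Ax is the ascent direction.
        x = normalize([xi + step * 2.0 * axi for xi, axi in zip(x, Ax)])
    return x

A = [[2.0, 1.0, 0.0],
     [1.0, 2.0, 1.0],
     [0.0, 1.0, 2.0]]

x = maximize_on_sphere(A, [1.0, 0.0, 0.0])
lam = dot(matvec(A, x), x)   # value of <Ax, x> at the computed point
residual = [axi - lam * xi for axi, xi in zip(matvec(A, x), x)]
res_norm = math.sqrt(dot(residual, residual))
print(lam)        # should be close to the largest eigenvalue, 2 + sqrt(2)
print(res_norm)   # should be close to zero, i.e. Ax = lam * x
```

A small residual $\|Ax - \lambda x\|$ with $\lambda = \langle Ax, x\rangle$ is exactly the conclusion of the theorem at a constrained maximum. With the sample step size the update is a power iteration on $I + \tfrac{1}{2}A$, which is why it settles on the largest eigenvalue here.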
•

Extremum problems with inequality constraints can also be discussed in a general setting free of dimensionality restrictions. This leads to the so-called Kuhn-Tucker Theory.
Inequalities in a vector space require some elucidation. An ordered vector space is a pair $(X, \ge)$ in which $X$ is a real vector space and $\ge$ is a partial order in $X$ that is consistent with the linear structure. This means simply that
$$x \ge y \implies x + z \ge y + z$$
$$x \ge y,\ \lambda \ge 0 \implies \lambda x \ge \lambda y$$
In an ordered vector space, the positive cone is
$$P = \{x : x \ge 0\}$$
A cone having vertex $v$ is a set $C$ such that $v + \lambda(x - v) \in C$ when $x \in C$ and $\lambda \ge 0$. It is elementary to prove that $P$ is a convex cone having vertex at 0. Also, the partial order can be recovered from $P$ by defining $x \ge y$ to mean $x - y \in P$. If $X$ is a normed space with an order as described, then $X^*$ is ordered in a standard way; namely, we define $\phi \ge 0$ to mean $\phi(x) \ge 0$ for all $x \ge 0$. Here $\phi \in X^*$. These matters are well illustrated by the space $C[a, b]$, in which the natural order $f \ge g$ is defined to mean $f(t) \ge g(t)$ for all $t \in [a, b]$. The conjugate space consists of signed measures.
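A finite-dimensional analogue may make these definitions concrete. The sketch below uses $\mathbb{R}^2$ with the componentwise order, which is an assumption made purely for illustration (the text's own example is $C[a, b]$). The names `in_positive_cone`, `geq`, and `phi`, and the weights in `phi`, are hypothetical.

```python
# Sketch only: R^2 with the componentwise order, chosen for illustration.

def in_positive_cone(x):
    """Membership in P = {x : x >= 0} for the componentwise order."""
    return all(xi >= 0 for xi in x)

def geq(x, y):
    """Recover the order from the cone: x >= y means x - y lies in P."""
    return in_positive_cone([xi - yi for xi, yi in zip(x, y)])

def add(x, y):
    return [xi + yi for xi, yi in zip(x, y)]

def scale(lam, x):
    return [lam * xi for xi in x]

def phi(x, w=(2.0, 0.5)):
    """A linear functional with nonnegative weights, hence phi >= 0 on P."""
    return sum(wi * xi for wi, xi in zip(w, x))

x, y, z = [3.0, 2.0], [1.0, 2.0], [-5.0, 7.0]
u, v = [1.0, 0.0], [0.0, 1.0]

print(geq(x, y))                            # x - y = [2, 0] lies in P
print(geq(add(x, z), add(y, z)))            # translation preserves the order
print(geq(scale(4.0, x), scale(4.0, y)))    # so does scaling by lambda >= 0
print(geq(u, v), geq(v, u))                 # neither holds: the order is partial
print(phi([1.0, 4.0]) >= 0)                 # phi is nonnegative on a point of P
```

The pair `u`, `v` shows why the order is only partial: neither difference lies in the cone, so the two vectors are not comparable.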
In the next theorem, $X$ and $Y$ are normed linear spaces, and $Y$ is an ordered vector space. Differentiable functions $f : X \to \mathbb{R}$ and $G : X \to Y$ are given. We seek necessary conditions for a point $x_0$ to maximize $f(x)$ subject to $G(x) \ge 0$.

Theorem 5. If $x_0$ is a local maximum point of $f$ on the set $\{x : G(x) \ge 0\}$ and if there is an $h \in X$ such that $G(x_0) + G'(x_0)h$ is an interior point of the positive cone, then there is a nonnegative functional $\phi \in Y^*$ such that $\phi(G(x_0)) = 0$ and $f'(x_0) = -\phi \circ G'(x_0)$.
Proof.