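3.4 Variational Proofs of Some Basic Theorems of Nonlinear Analysis

Lemma 3.20. Let $A : X \to Y$ be a continuous linear mapping from a Banach space $X$ onto a Banach space $Y$. Then there exists a constant $\tau > 0$ such that $\tau B_Y \subseteq \overline{A(B_X)}$, where $B_X = \{x \in X : \|x\| \le 1\}$ and $B_Y = \{y \in Y : \|y\| \le 1\}$ are the closed unit balls in $X$ and $Y$, respectively.

Proof. Since $A$ is onto,
$$Y = A(X) = A\Big(\bigcup_{n=1}^{\infty} nB_X\Big) = \bigcup_{n=1}^{\infty} A(nB_X) = \bigcup_{n=1}^{\infty} nA(B_X).$$
It follows from the Baire category theorem that the closure of at least one set $nA(B_X)$ contains an open set. This implies that $\overline{A(B_X)}$ contains an open set, say $y + \tau B_Y \subseteq \overline{A(B_X)}$. Since $B_X = -B_X$, we also have $-y + \tau B_Y \subseteq \overline{A(B_X)}$.

If $z \in Y$ is such that $\|z\| < \tau$, then there exist $\{u_k\}_1^\infty$ and $\{v_k\}_1^\infty$ in $B_X$ such that $y + z = \lim Au_k$ and $-y + z = \lim Av_k$. Since $(u_k + v_k)/2 \in B_X$ by convexity, we obtain $z = \lim A\big((u_k + v_k)/2\big) \in \overline{A(B_X)}$. Because the right-hand side is closed, this proves $\tau B_Y \subseteq \overline{A(B_X)}$. □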
We are now ready to state and prove Graves’s theorem. Note that the proof below needs only Lemma 3.20 and not the full statement of the open mapping theorem, that it is shorter, and that it proves a somewhat stronger result than Theorem C.2. Moreover, we deduce the full open mapping theorem from it.
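Theorem 3.21. (Graves’s theorem) Let $X$ and $Y$ be Banach spaces, $r > 0$, and let $f : rB_X \to Y$ be a mapping such that $f(0) = 0$. Let $A : X \to Y$ be a continuous linear mapping onto $Y$ satisfying $\tau B_Y \subseteq \overline{A(B_X)}$. Let $f - A$ be Lipschitz continuous on $rB_X$ with a constant $\delta$, $0 \le \delta < \tau$, that is,
$$\|f(x_1) - f(x_2) - A(x_1 - x_2)\| \le \delta \|x_1 - x_2\| \quad \text{for all } x_1, x_2 \in rB_X.$$
Then
$$(\tau - \delta)\, r B_Y \subseteq f(rB_X), \tag{3.11}$$
that is, the equation $y = f(x)$ has a solution $\|x\| \le r$ whenever $\|y\| < (\tau - \delta) r$.
Moreover,
$$c\, d(x, f^{-1}(y)) \le \|f(x) - y\| \quad \text{for all } \|x\| \le r,\ \|y\| < cr, \tag{3.12}$$
where $c := \tau - \delta > 0$.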
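Proof. Define $D = rB_X$, and for each point $y \in Y$, define the function $f_y$ on $D$ by
$$f_y(x) := \|f(x) - y\|.$$
We first claim that
$$|\nabla f_y|(x) \ge c > 0 \quad \text{for all } x \in D \text{ such that } f_y(x) \ne 0.$$
If $x \in D$ is such a point, then Lemma 3.20 implies that there exists a sequence $\{d_n\} \subset X$ with $\|d_n\| \le 1$ such that
$$-\tau\, \frac{f(x) - y}{\|f(x) - y\|} = -\tau\, \frac{f(x) - y}{f_y(x)} = \lim_{n \to \infty} A d_n.$$
We have
$$f(x + t d_n) - y = \big[(f - A)(x + t d_n) - (f - A)(x)\big] + t\Big[A d_n + \tau\, \frac{f(x) - y}{f_y(x)}\Big] + \Big(1 - \frac{t\tau}{f_y(x)}\Big)(f(x) - y),$$
implying that for sufficiently small $t > 0$ and sufficiently large $n$,
$$\begin{aligned}
f_y(x + t d_n) &\le t\delta \|d_n\| + o(t) + \Big(1 - \frac{t\tau}{f_y(x)}\Big) f_y(x) \\
&\le f_y(x) + t(\delta - \tau) + o(t) \\
&< f_y(x),
\end{aligned}$$
where the last inequality follows from $\delta < \tau$. This shows that $x$ is not a local minimizer of $f_y$. We have therefore
$$|\nabla f_y|(x) = \limsup_{z \to x} \frac{f_y(x) - f_y(z)}{\|x - z\|} \ge \limsup_{t \searrow 0,\, n \to \infty} \frac{f_y(x) - f_y(x + t d_n)}{t \|d_n\|} \ge \limsup_{t \searrow 0,\, n \to \infty} \frac{f_y(x) - f_y(x + t d_n)}{t} \ge \tau - \delta = c,$$
proving our claim.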
If (3.11) is false, then there exists $y \in Y$, $\|y\| < cr$, such that $f_y$ is positive on $D$. Choosing $\varepsilon := \|y\| < cr$ and noting that $\|y\| = f_y(0) \le \inf_D f_y + \varepsilon$, it follows from Theorem 3.2 (with $\lambda = r$) that there exists a point $\bar{x} \in D$ satisfying
$$f_y(\bar{x}) \le f_y(x) + \frac{\varepsilon}{r} \|x - \bar{x}\| \quad \text{for all } x \in D.$$
Therefore, $c \le |\nabla f_y|(\bar{x}) \le \varepsilon / r < c$, where the first inequality follows from our first claim, and the second one follows since $\bar{x}$ is an $(\varepsilon/r)d$-point. This is a contradiction that settles (3.11).
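Finally, let $\|y\| < cr$ and $x \in D$ be such that $f(x) \ne y$. Define $\delta := d(x, f^{-1}(y)) > 0$ (in this paragraph $\delta$ denotes this distance, not the Lipschitz constant), $\varepsilon := f_y(x) = \|f(x) - y\| > 0$, and pick $0 < \lambda < \delta$. Since $\inf_D f_y = 0$, we have $f_y(x) = \inf_D f_y + \varepsilon$, and Theorem 3.2 implies that there exists a point $\bar{x} \in D$ satisfying $\|x - \bar{x}\| \le \lambda < \delta$, so that $f_y(\bar{x}) > 0$, and that
$$f_y(\bar{x}) \le f_y(z) + \frac{\varepsilon}{\lambda} \|\bar{x} - z\| \quad \text{for all } z \in D,$$
so that $|\nabla f_y|(\bar{x}) \le \varepsilon / \lambda$. Therefore,
$$c \le |\nabla f_y|(\bar{x}) \le \frac{\varepsilon}{\lambda} = \frac{\|f(x) - y\|}{\lambda}$$
for all $0 < \lambda < \delta = d(x, f^{-1}(y))$. Letting $\lambda \nearrow \delta$ gives $c\, d(x, f^{-1}(y)) \le \|f(x) - y\|$, which proves (3.12). □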
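To make the two conclusions concrete, here is a minimal one-dimensional illustration. The particular choices $X = Y = \mathbb{R}$, $Ax = 2x$, $f(x) = 2x + \sin x$, and $r = 1$ are assumptions made only for this example and do not come from the text.

```latex
% Illustrative example (all specific choices below are assumptions):
% X = Y = \mathbb{R}, \; A x = 2x, \; f(x) = 2x + \sin x, \; r = 1.
\[
  A(B_X) = [-2, 2] \;\Longrightarrow\; \tau = 2, \qquad
  |(f - A)(x_1) - (f - A)(x_2)| = |\sin x_1 - \sin x_2| \le |x_1 - x_2|
  \;\Longrightarrow\; \delta = 1, \quad c = \tau - \delta = 1 .
\]
% Conclusion (3.11): f is continuous and increasing on [-1, 1], so
\[
  f([-1, 1]) = [-(2 + \sin 1),\, 2 + \sin 1] \supseteq [-1, 1] = (\tau - \delta)\, r B_Y .
\]
% Conclusion (3.12): for |y| < 1 let x^* be the unique solution of f(x^*) = y in [-1, 1].
% By the mean value theorem, for some \xi between x and x^*,
\[
  |f(x) - y| = |f'(\xi)|\, |x - x^*| = (2 + \cos \xi)\, |x - x^*| \ge 1 \cdot |x - x^*|
  = c\, d(x, f^{-1}(y)) .
\]
```

In this example the bound (3.12) is not tight, since $2 + \cos\xi \ge 2 + \cos 1$ on $[-1, 1]$; the theorem only guarantees the constant $c = \tau - \delta$.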
Corollary 3.22. (Open mapping theorem) Let $X$ and $Y$ be Banach spaces and let $A : X \to Y$ be a continuous linear mapping onto $Y$. Then $A$ is an open mapping, that is, if $O \subseteq X$ is open, then $A(O)$ is open in $Y$.

Proof. Applying Theorem 3.21 with $f = A$, $y = 0$, and $\delta = 0$, we obtain
$$\tau\, d(x, A^{-1}(0)) \le \|Ax\|$$
for all $x$ with small enough norm, and hence for all $x$, by the homogeneity of the above inequality. If $y = Ax$ satisfies $\|y\| < \tau$, then $d(x, A^{-1}(0)) < 1$. Thus there exists a point $u \in X$ such that $Au = 0$ and $\|x - u\| < 1$. Since $A(x - u) = y$, this shows that every $y$ with $\|y\| < \tau$ lies in $A(B_X)$. Since $A$ is linear, it follows that $A$ is an open mapping. □
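The homogeneity step used in the proof above can be spelled out as follows. This is a sketch; the radius $r$ and the scaling factor $s$ are notation introduced here for illustration.

```latex
% Assumption: the bound \tau\, d(x, A^{-1}(0)) \le \|Ax\| is known for \|x\| \le r, for some r > 0.
% Let x \ne 0 be arbitrary and set s := r / \|x\| > 0, so that \|sx\| = r.
% Since A^{-1}(0) = \operatorname{Ker} A is a linear subspace, s \operatorname{Ker} A = \operatorname{Ker} A, hence
\[
  d(sx, A^{-1}(0)) = \inf_{u \in \operatorname{Ker} A} \|sx - su\| = s\, d(x, A^{-1}(0)),
  \qquad
  \|A(sx)\| = s\, \|Ax\| .
\]
% Applying the known bound at the point sx and dividing by s gives the bound at x:
\[
  \tau\, d(x, A^{-1}(0)) = \frac{\tau}{s}\, d(sx, A^{-1}(0)) \le \frac{1}{s}\, \|A(sx)\| = \|Ax\| .
\]
```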
The references [16] and [145] contain more applications of Ekeland’s $\varepsilon$-variational principle along these lines.
3.4.2 Lyusternik’s Theorem
Theorem 3.23. (Lyusternik’s theorem) Let $X$ and $Y$ be Banach spaces, $U \subseteq X$ an open set, and $f : U \to Y$. Let $T_M(x_0)$ be the tangent cone of the level set $M = f^{-1}(f(x_0))$ at the point $x_0 \in U$.
If $f$ is $C^1$ in a neighborhood of the point $x_0$ and $Df(x_0)$ is a linear mapping onto $Y$, then $T_M(x_0)$ is the null space of the linear map $Df(x_0)$, that is,
$$T_M(x_0) = \operatorname{Ker} Df(x_0) := \{d \in X : Df(x_0)(d) = 0\}. \tag{3.13}$$

Proof. As in the proof of Theorem 2.29, we may assume that $x_0 = 0$ and $f(x_0) = 0$. Define $A = Df(0)$. The proof of the inclusion $T_M(0) \subseteq \operatorname{Ker} A$ is the standard one given there, namely if $d \in T_M(0)$, then there exist points $x(t) = td + o(t)$ in $M$, so that
$$0 = f(td + o(t)) = f(0) + t\, Df(0)(d) + o(t) = tAd + o(t).$$
Dividing both sides by $t$ and letting $t \to 0$, we obtain $Ad = 0$.
To prove the reverse inclusion $\operatorname{Ker} A \subseteq T_M(0)$, note that Theorem 1.18 implies
$$\|f(x_2) - f(x_1) - A(x_2 - x_1)\| \le \|x_2 - x_1\| \cdot \sup_{t \in [0,1]} \|Df(x_1 + t(x_2 - x_1)) - A\|.$$
Therefore, given $\varepsilon > 0$, there exists a neighborhood $U \ni 0$ such that $f - A$ is Lipschitz continuous with a constant $\varepsilon$ on $U$. It follows from Theorem 3.21 that there exists a constant $c > 0$ such that $c\, d(x, M) \le \|f(x)\|$ for all $x$ in a neighborhood $V \ni 0$.
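If $d \in \operatorname{Ker} A$, then $f(td) = f(0) + tAd + o(t) = o(t)$, so that $d(td, M) = o(t)$.
Thus, there exists $x(t) \in M$ satisfying $x(t) - td = o(t)$. We obtain
$$\frac{x(t) - 0}{t} = d + \frac{o(t)}{t} \to d \quad \text{as } t \to 0,$$
proving that $d \in T_M(0)$. □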
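As a quick sanity check of (3.13), consider the unit circle in the plane. The function $f(x_1, x_2) = x_1^2 + x_2^2$ and the base point $x_0 = (1, 0)$ are choices made for this illustration only.

```latex
% Illustrative example (assumed data): X = \mathbb{R}^2, \; Y = \mathbb{R},
% f(x_1, x_2) = x_1^2 + x_2^2, \; x_0 = (1, 0), so M = f^{-1}(1) is the unit circle.
\[
  Df(x_0)(d) = 2 d_1 \quad (\text{onto } \mathbb{R}), \qquad
  \operatorname{Ker} Df(x_0) = \{(0, s) : s \in \mathbb{R}\} .
\]
% Every d = (0, s) in the kernel is a tangent direction: the curve
\[
  x(t) = (\cos(ts), \sin(ts)) \in M
\]
% satisfies
\[
  x(t) - x_0 - t d = (\cos(ts) - 1,\; \sin(ts) - ts) = (O(t^2),\, O(t^3)) = o(t),
\]
% so T_M(x_0) is the vertical line through the origin, in agreement with (3.13).
```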
The references [78, 146, 145] provide much more information on Lyusternik’s theorem and its uses in optimization and related fields. The reference Ioffe [145] is a recent survey on Graves’s theorem and the associated concept of metric regularity.
3.4.3 The Inverse and Implicit Function Theorems
The inverse function theorem and the closely related implicit function theorems are among the most important results in all of nonlinear analysis. We turn to these results next.
Theorem 3.24. (Inverse function theorem) Let $X$ and $Y$ be Banach spaces, $x_0 \in X$, and let $f$ be a $C^1$ mapping from a neighborhood $\Omega$ of $x_0$ into $Y$.
If $Df(x_0) : X \to Y$ is invertible, then $f$ is a $C^1$ diffeomorphism on a neighborhood of $x_0$.
Moreover, if $f$ is $C^k$ on $\Omega$, then $f$ is a $C^k$ diffeomorphism on a neighborhood of $x_0$.
Proof. Define $A = Df(x_0)$. Corollary 3.22 implies that $A^{-1}$ is a continuous linear map; thus $\|A^{-1}\| = \sup_{\|y\| = 1} \|A^{-1} y\| < \infty$. As noted in the proof of Theorem 3.23, given $\varepsilon > 0$, there exists a neighborhood $U$ of $x_0$ such that
$$\|f(x_2) - f(x_1) - A(x_2 - x_1)\| \le \varepsilon \|x_2 - x_1\| \quad \text{for all } x_1, x_2 \in U. \tag{3.14}$$
This implies that $f$ is one-to-one in a neighborhood of $x_0$, because if $x_1 \ne x_2$ and $f(x_1) = f(x_2)$ in the above inequality, then
$$\|A(x_1 - x_2)\| = \|f(x_1) - f(x_2) - A(x_1 - x_2)\| \le \varepsilon \|x_1 - x_2\| \le \varepsilon \|A^{-1}\| \cdot \|A(x_1 - x_2)\|,$$
which cannot hold if $\varepsilon > 0$ is small enough.

Moreover, the inclusion (3.11) in Theorem 3.21 implies that, with $y_0 := f(x_0)$, there exist open neighborhoods $U \ni x_0$ and $V \ni y_0$ such that $f : U \to V$ is one-to-one and onto, and the inequality (3.12) implies that $f^{-1} : V \to U$ is continuous.
Setting $y_i := f(x_i)$, $i = 1, 2$, we have
$$\begin{aligned}
\|f^{-1}(y_2) - f^{-1}(y_1) - A^{-1}(y_2 - y_1)\|
&= \|x_2 - x_1 - A^{-1}(f(x_2) - f(x_1))\| \\
&\le \|A^{-1}\| \cdot \|f(x_2) - f(x_1) - A(x_2 - x_1)\| \\
&\le \varepsilon \|A^{-1}\| \cdot \|x_2 - x_1\| \\
&\le \frac{\varepsilon \|A^{-1}\|}{c}\, \|y_2 - y_1\|,
\end{aligned}$$
for all $y_1, y_2 \in V$, where the second inequality follows from (3.14), and the last inequality from (3.12). Since $\varepsilon > 0$ can be taken arbitrarily small by shrinking the neighborhoods, this proves that $f^{-1}$ is differentiable at $y_0$ with $Df^{-1}(y_0) = A^{-1} = Df(x_0)^{-1}$.
If the neighborhood $U$ containing $x_0$ is chosen small enough that $Df(x)$ is invertible for every $x \in U$, then $f$ is a diffeomorphism between $U$ and $V = f(U)$. It follows that if $y = f(x)$, then
$$Df^{-1}(y) = Df(x)^{-1},$$
that is,
$$D(f^{-1}) = \mathrm{Inv} \circ Df \circ f^{-1} \tag{3.15}$$
on $f(U)$, where $\mathrm{Inv}$ is the map sending a nonsingular matrix to its inverse.
Since $\mathrm{Inv}$ is infinitely differentiable (see Example 1.28) and $Df$ and $f^{-1}$ are continuous, it follows that $f^{-1} \in C^1$. If, moreover, $f \in C^2$, that is, $Df \in C^1$, it follows from (3.15) that $D(f^{-1}) \in C^1$, that is, $f^{-1} \in C^2$. Induction on $k$ settles the general case. □
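The derivative formula $Df^{-1}(y) = Df(x)^{-1}$ can be checked by hand on a simple scalar map. The function $f(x) = x + x^2$ and the base point $x_0 = 0$ below are assumptions chosen only because the local inverse has a closed form.

```latex
% Illustrative example (assumed data): X = Y = \mathbb{R}, \; f(x) = x + x^2, \; x_0 = 0, \; y_0 = f(x_0) = 0.
% Df(x) = 1 + 2x, so Df(x_0) = 1 is invertible. Solving y = x + x^2 near x = 0 gives
\[
  f^{-1}(y) = \frac{-1 + \sqrt{1 + 4y}}{2}, \qquad y > -\tfrac{1}{4} .
\]
% Differentiating the closed form directly:
\[
  D f^{-1}(y) = \frac{1}{\sqrt{1 + 4y}} .
\]
% Evaluating the right-hand side of the formula at x = f^{-1}(y), where 1 + 2x = \sqrt{1 + 4y}:
\[
  Df(x)^{-1} = \frac{1}{1 + 2x} = \frac{1}{\sqrt{1 + 4y}} = D f^{-1}(y),
  \qquad D f^{-1}(y_0) = 1 = Df(x_0)^{-1} .
\]
```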
Remark 3.25. The above proof of the inverse function theorem differs from the usual proofs in that Graves’s theorem (for which we gave a variational as well as a classical proof) is used instead of Banach’s fixed point theorem to establish the fact that $f$ is an open mapping near $x_0$. The common proof based on the latter theorem can be found in many books; see for example Lang [183] for a clean presentation.
The inverse function theorem may fail if the continuity assumption on $Df$ is removed; see an example in [77], page 273.
If $X$ and $Y$ are Banach spaces, $x_0 \in X$, $y_0 \in Y$, and $f$ is a $C^1$ map in a neighborhood of $(x_0, y_0)$, we denote by $D_x f(x_0, y_0)$ the derivative of $f$ at $(x_0, y_0)$ with respect to $x$, that is, $D_x f(x_0, y_0)$ is the derivative of the map $x \mapsto f(x, y_0)$ at $x_0$.
Theorem 3.26. (Implicit function theorem) Let $X$, $Y$ be Banach spaces, $x_0 \in X$, $y_0 \in Y$, and $f$ a $C^1$ map in a neighborhood of $(x_0, y_0)$. Define $w_0 = f(x_0, y_0)$.
If $D_y f(x_0, y_0) : Y \to Y$ is a linear isomorphism, then there exist neighborhoods $U \ni x_0$ and $V \ni y_0$, and a $C^1$ mapping $y : U \to V$ such that a point $(x, y) \in U \times V$ satisfies $f(x, y) = w_0$ if and only if $y = y(x)$. The derivative of $y$ at $x_0$ is given by
$$Dy(x_0) = -D_y f(x_0, y_0)^{-1} D_x f(x_0, y_0).$$
Moreover, if $f$ is $C^k$, then so is $y(x)$.
Proof. Define the map $F(x, y) = (x, f(x, y))$. At the point $z_0 = (x_0, y_0)$, it follows from Taylor’s formula
$$\begin{aligned}
F(x_0 + th, y_0 + tk) &= \big(x_0 + th,\; f(x_0 + th, y_0 + tk)\big) \\
&= \big(x_0 + th,\; f(x_0, y_0) + t[D_x f(x_0, y_0) h + D_y f(x_0, y_0) k] + o(t)\big) \\
&= F(x_0, y_0) + t\big(h,\; D_x f(x_0, y_0) h + D_y f(x_0, y_0) k\big) + o(t),
\end{aligned}$$
that the derivative $DF(x_0, y_0) : X \times Y \to X \times Y$ given by
$$DF(x_0, y_0)(h, k) = \big(h,\; D_x f(x_0, y_0) h + D_y f(x_0, y_0) k\big)$$
is a linear isomorphism. It follows from Theorem 3.24 that there exist neighborhoods $U \ni x_0$, $V \ni y_0$, and $W \ni (x_0, w_0)$ such that $F : U \times V \to W$ is a bijection and $F^{-1}$ is $C^1$ ($C^k$ if $f$ is $C^k$). Note that $F^{-1}(x, y) = (x, g(x, y))$ for some function $g \in C^1$.
Let $(x, y) \in U \times V$. We have $f(x, y) = w_0$ if and only if $F(x, y) = (x, w_0)$, or $(x, y) = F^{-1}(x, w_0) = (x, g(x, w_0))$, that is, if and only if $y = g(x, w_0) =: y(x)$.
Since $g \in C^1$, we have $y(x) \in C^1$ as well.
The chain rule applied to the function $f(x, y(x)) = w_0$ gives
$$D_x f(x, y(x)) + D_y f(x, y(x))\, Dy(x) = 0,$$
leading to the formula for $Dy(x_0)$. □
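A small worked example may help in reading the derivative formula. The data $f(x, y) = x^2 + y^2$ and $(x_0, y_0) = (0, 1)$ are assumptions chosen for illustration, since the implicit function is then available in closed form.

```latex
% Illustrative example (assumed data): X = Y = \mathbb{R}, \; f(x, y) = x^2 + y^2,
% (x_0, y_0) = (0, 1), so w_0 = f(x_0, y_0) = 1.
\[
  D_x f(x, y) = 2x, \qquad D_y f(x, y) = 2y, \qquad D_y f(x_0, y_0) = 2 \ne 0 .
\]
% Near (0, 1) the equation x^2 + y^2 = 1 is solved by
\[
  y(x) = \sqrt{1 - x^2}, \qquad |x| < 1 .
\]
% Direct differentiation agrees with the formula obtained from the chain rule:
\[
  Dy(x) = -\frac{x}{\sqrt{1 - x^2}} = -\big(2 y(x)\big)^{-1} (2x)
        = -D_y f(x, y(x))^{-1} D_x f(x, y(x)),
  \qquad Dy(x_0) = 0 .
\]
```

At the point $(1, 0)$ the hypothesis fails, since $D_y f(1, 0) = 0$, and the circle is indeed not the graph of a function of $x$ near that point.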