Chapter 5 Changes of Variable
5.3 The Divergence Theorem
5.3.2. Mass Transport
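To be precise, if one interprets $\int \mathbf 1_G(x)\,dx-\int \mathbf 1_G\circ\Phi(t,x)\,dx$ as the net loss or gain due to the flow at time $t$, then (5.3.5) says that $\int_G \operatorname{div}V(x)\,dx$ is the rate of loss or gain at time $t=0$. On the other hand, there is another way in which to think about this computation. Namely,
\[
(\ast)\qquad \int \mathbf 1_G(x)\,dx-\int \mathbf 1_G\circ\Phi(t,x)\,dx
=\int_G \mathbf 1_{G^\complement}\bigl(\Phi(t,x)\bigr)\,dx-\int_{G^\complement}\mathbf 1_G\bigl(\Phi(t,x)\bigr)\,dx,
\]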
which indicates that one should be able to do the same calculation by observing how much mass has moved in each direction across the boundary of $G$ during the time interval $[0,t]$. To carry out this approach, I will assume that $G$ is a non-empty, bounded open set that is a smooth region in the sense that for each $p\in\partial G$ there exist an open neighborhood $W\ni p$ and an $F\in C^3(W;\mathbb R)$ such that $|\nabla F|>0$ and $G\cap W=\{x\in W:\ F(x)<0\}$. In particular, $\partial G$ is a compact hypersurface and, for $x\in W\cap\partial G$, $\frac{\nabla F}{|\nabla F|}(x)$ is the outward pointing unit normal to $\partial G$ at $x$.
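The identity $(\ast)$ is a matter of bookkeeping. As a sketch, writing $G^\complement$ for the complement of $G$ (a notational assumption made here), split each integral according to whether the starting point and its image lie in $G$:

```latex
\begin{align*}
\int \mathbf 1_G(x)\,dx
  &= \int_G \mathbf 1_G\bigl(\Phi(t,x)\bigr)\,dx
   + \int_G \mathbf 1_{G^\complement}\bigl(\Phi(t,x)\bigr)\,dx,\\
\int \mathbf 1_G\bigl(\Phi(t,x)\bigr)\,dx
  &= \int_G \mathbf 1_G\bigl(\Phi(t,x)\bigr)\,dx
   + \int_{G^\complement} \mathbf 1_G\bigl(\Phi(t,x)\bigr)\,dx.
\end{align*}
```

Because $G$ is bounded, the common term $\int_G \mathbf 1_G(\Phi(t,x))\,dx$ is finite, and subtracting the second line from the first leaves exactly the two boundary-crossing terms in $(\ast)$.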
A key role will be played by the following application of the ideas in §5.2.2.
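Lemma 5.3.6. Assume that $G$ is a bounded, smooth region in $\mathbb R^N$. Then there exist a $\rho>0$ and twice continuously differentiable maps $\mathbf n:(\partial G)^{(\rho)}\to S^{N-1}$, $p:(\partial G)^{(\rho)}\to\partial G$, and $\xi:(\partial G)^{(\rho)}\to(-\rho,\rho)$ such that, for each $x\in(\partial G)^{(\rho)}$, $p(x)$ is the one and only $p\in\partial G$ satisfying $|x-p|=|x-\partial G|$, $\mathbf n(x)=\mathbf n\bigl(p(x)\bigr)\perp T_{p(x)}\partial G$, $\xi(x)<0\iff x\in G\cap(\partial G)^{(\rho)}$, and $x=p(x)+\xi(x)\mathbf n(x)$.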
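Proof: Given $q\in\partial G$, choose an open $W\ni q$ and an $F_q\in C^3(W;\mathbb R)$ such that $|\nabla F_q|>0$ and $G\cap W=\{F_q<0\}$. By Lemma 5.2.11, there exist an $r_q>0$ with $B_{\mathbb R^N}(q,3r_q)\subset\subset W$ and twice continuously differentiable maps (cf. (5.2.8))
\[
p_q:\bigl(\partial G\cap B_{\mathbb R^N}(q,3r_q)\bigr)^{(3r_q)}\to\partial G\cap B_{\mathbb R^N}(q,3r_q)
\quad\text{and}\quad
\xi_q:\bigl(\partial G\cap B_{\mathbb R^N}(q,3r_q)\bigr)^{(3r_q)}\to(-3r_q,3r_q)
\]
such that, for each $x\in\bigl(\partial G\cap B_{\mathbb R^N}(q,3r_q)\bigr)^{(3r_q)}$, $p_q(x)+\xi_q(x)\frac{\nabla F_q}{|\nabla F_q|}\bigl(p_q(x)\bigr)$ is the one and only way to write $x$ in the form $p+\xi\frac{\nabla F_q}{|\nabla F_q|}(p)$ with $p\in\partial G\cap B_{\mathbb R^N}(q,3r_q)$ and $|\xi|<3r_q$. Now choose $q_1,\dots,q_M\in\partial G$ such that $\partial G\subseteq\bigcup_{m=1}^M B_{\mathbb R^N}(q_m,r_m)$, where $r_m\equiv r_{q_m}$, and set $\rho=r_1\wedge\cdots\wedge r_M$. If $x\in\bigl(\partial G\cap B_{\mathbb R^N}(q_m,r_m)\bigr)^{(\rho)}$, then, for any $p\in\partial G$ satisfying $|x-p|<\rho$,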
|p−qm| ≤ |p−x|+|x−qm|<3ρ≤3rm.
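Hence $p_{q_m}(x)$ is the one and only $p\in\partial G$ for which $|p-x|<\rho$ and $x-p\perp T_p(\partial G)$, and so we can define $x\mapsto p(x)$, $x\mapsto\xi(x)$, and $x\mapsto\mathbf n(x)$ on $(\partial G)^{(\rho)}$ by taking $p(x)=p_{q_m}(x)$, $\xi(x)=\xi_{q_m}(x)$, and $\mathbf n(x)=\frac{\nabla F_{q_m}}{|\nabla F_{q_m}|}\bigl(p_{q_m}(x)\bigr)$ when $x\in\bigl(\partial G\cap B_{\mathbb R^N}(q_m,r_m)\bigr)^{(\rho)}$. Finally, to check that $|x-p(x)|=|x-\partial G|$, suppose that $p\in\partial G$ and that $|p-x|=|x-\partial G|$. Then, for any differentiable $\gamma:(-\epsilon,\epsilon)\to\partial G$ with $\gamma(0)=p$, the function $t\mapsto|x-\gamma(t)|^2$ is minimized at $t=0$, and so
\[
-2\bigl(x-p,\dot\gamma(0)\bigr)_{\mathbb R^N}=\frac{d}{dt}|x-\gamma(t)|^2\Big|_{t=0}=0.
\]
Thus $x-p\perp T_p(\partial G)$, which is possible only if $p=p(x)$.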
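Returning to the problem posed earlier, begin with the observation that, because $|\Phi(t,x)-x|\le\|V\|_{\mathrm u}t$, $x\in G$ and $\Phi(t,x)\notin G$ only if $|x-\partial G|\le\|V\|_{\mathrm u}t$.
Thus if $\rho$ is taken as in Lemma 5.3.6 and $T>0$ is chosen such that $\|V\|_{\mathrm u}T<\rho$, then, in the notation of that lemma, we know that, for each $t\in[0,T]$, $x\in G$ and $\Phi(t,x)\notin G$ if and only if
\[
x\in G\cap(\partial G)^{(\|V\|_{\mathrm u}t)}\quad\text{and}\quad
\Bigl(\mathbf n\bigl(\Phi(t,x)\bigr),\Phi(t,x)-p\bigl(\Phi(t,x)\bigr)\Bigr)_{\mathbb R^N}\ge0.
\]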
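Next, write
\begin{align*}
\Bigl(\mathbf n\bigl(\Phi(t,x)\bigr),\Phi(t,x)-p\bigl(\Phi(t,x)\bigr)\Bigr)_{\mathbb R^N}
&=\Bigl(\mathbf n\bigl(\Phi(t,x)\bigr),\Phi(t,x)-x\Bigr)_{\mathbb R^N}
+\Bigl(\mathbf n\bigl(\Phi(t,x)\bigr),x-p(x)\Bigr)_{\mathbb R^N}\\
&\quad+\Bigl(\mathbf n\bigl(\Phi(t,x)\bigr),p(x)-p\bigl(\Phi(t,x)\bigr)\Bigr)_{\mathbb R^N}.
\end{align*}
Because $|x-p(x)|\vee|\Phi(t,x)-x|\le\|V\|_{\mathrm u}t$, we have that
\[
\Bigl(\mathbf n\bigl(\Phi(t,x)\bigr),\Phi(t,x)-x\Bigr)_{\mathbb R^N}
=t\Bigl(\mathbf n\bigl(\Phi(t,x)\bigr),V(x)\Bigr)_{\mathbb R^N}+O(t^2)
=t\Bigl(\mathbf n\bigl(p(x)\bigr),V\bigl(p(x)\bigr)\Bigr)_{\mathbb R^N}+O(t^2)
\]
and
\[
\Bigl(\mathbf n\bigl(\Phi(t,x)\bigr),x-p(x)\Bigr)_{\mathbb R^N}
=\Bigl(\mathbf n\bigl(p(x)\bigr),x-p(x)\Bigr)_{\mathbb R^N}+O(t^2).
\]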
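At the same time, because $p\bigl(\Phi(t,x)\bigr)\in\partial G$ for all $t\in[0,T]$ and therefore $\frac{d}{dt}p\bigl(\Phi(t,x)\bigr)\big|_{t=0}\in T_{p(x)}(\partial G)$,
\[
\Bigl(\mathbf n\bigl(\Phi(t,x)\bigr),p(x)-p\bigl(\Phi(t,x)\bigr)\Bigr)_{\mathbb R^N}=O(t^2).
\]
Thus, we now know that, for $t\in[0,T]$, $x\in G$ and $\Phi(t,x)\notin G$ if and only if
\[
0<\Bigl(\mathbf n\bigl(p(x)\bigr),x-p(x)\Bigr)_{\mathbb R^N}
+t\Bigl(\mathbf n\bigl(p(x)\bigr),V\bigl(p(x)\bigr)\Bigr)_{\mathbb R^N}+E(t,x),
\]
where $|E(t,x)|\le Ct^2$ for some $C<\infty$.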
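Now let $(U,\Psi)$ be a coordinate chart for $\partial G$, and define the associated diffeomorphism $\tilde\Psi$ on the open set $\tilde U\subseteq\mathbb R^N$ as in Lemma 5.2.11. Given $(u,t)\in U\times[0,T]$, set
\[
I(t,u)=\Bigl\{\xi:\ -t\Bigl(\mathbf n\bigl(\Psi(u)\bigr),V\bigl(\Psi(u)\bigr)\Bigr)_{\mathbb R^N}-E\bigl(t,\tilde\Psi(u,\xi)\bigr)<\xi<0\Bigr\}.
\]
Then for sufficiently small $t>0$, the preceding says that $\tilde\Psi(u,\xi)\in G\cap\tilde\Psi(\tilde U)$ and $\Phi\bigl(t,\tilde\Psi(u,\xi)\bigr)\notin G$ if and only if $u\in U$ and $\xi\in I(t,u)$. Hence, if $\Gamma\in\mathcal B_{\partial G}$ with $\Gamma\subseteq\Psi(U)$ and $\tilde\Gamma=\{x\in(\partial G)^{(\rho)}:\ p(x)\in\Gamma\}$, then, by Theorems 5.2.2 and 4.1.6 and (5.2.13),
\begin{align*}
\int_{G\cap\tilde\Gamma}\mathbf 1_{G^\complement}\bigl(\Phi(t,x)\bigr)\,dx
&=\int_{\Psi^{-1}(\Gamma)}\biggl(\int_{I(t,u)}J\tilde\Psi(u,\xi)\,d\xi\biggr)du\\
&=t\int_{\Psi^{-1}(\Gamma)}\Bigl(\mathbf n\bigl(\Psi(u)\bigr),V\bigl(\Psi(u)\bigr)\Bigr)^+_{\mathbb R^N}J\Psi(u)\,du+O(t^2)
\end{align*}
for sufficiently small $t\in(0,T]$. After an obvious covering argument, this leads (cf. (5.2.17)) to the conclusion that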
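\[
\lim_{t\searrow0}\frac1t\int_G\mathbf 1_{G^\complement}\bigl(\Phi(t,x)\bigr)\,dx
=\int_{\partial G}\bigl(\mathbf n(x),V(x)\bigr)^+_{\mathbb R^N}\,\lambda_{\partial G}(dx).
\]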
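By essentially the same argument, we also have that
\[
\lim_{t\searrow0}\frac1t\int_{G^\complement}\mathbf 1_G\bigl(\Phi(t,x)\bigr)\,dx
=\int_{\partial G}\bigl(\mathbf n(x),V(x)\bigr)^-_{\mathbb R^N}\,\lambda_{\partial G}(dx)
\]
and therefore, by $(\ast)$, that
\[
\lim_{t\searrow0}\frac1t\biggl(\int\mathbf 1_G(x)\,dx-\int\mathbf 1_G\bigl(\Phi(t,x)\bigr)\,dx\biggr)
=\int_{\partial G}\bigl(\mathbf n(x),V(x)\bigr)_{\mathbb R^N}\,\lambda_{\partial G}(dx).
\]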
By combining the preceding calculation with the earlier one, the one that was based on (5.3.5), we arrive at the following statement.
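Theorem 5.3.7 (Divergence Theorem). Let $G$ be a bounded, smooth region in $\mathbb R^N$ and $V:\mathbb R^N\to\mathbb R^N$ a twice continuously differentiable vector field with uniformly bounded first derivative. Then
\[
\int_G\operatorname{div}(V)(x)\,dx=\int_{\partial G}\bigl(\mathbf n(x),V(x)\bigr)_{\mathbb R^N}\,\lambda_{\partial G}(dx),
\]
where $\mathbf n(x)$ is the outward pointing unit normal to $\partial G$ at $x\in\partial G$.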
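As a quick sanity check of the statement, here is a sketch for the unit ball. It assumes $G=B_{\mathbb R^N}(0,1)$ and $V(x)=x$, writes $\Omega_N$ for the Lebesgue measure of the unit ball (a symbol introduced here for illustration) and $\omega_{N-1}$ for the surface measure of $S^{N-1}$:

```latex
\begin{align*}
\operatorname{div}(V)(x) &= \sum_{i=1}^N \partial_{x_i}x_i = N
  &&\Longrightarrow\quad \int_G \operatorname{div}(V)(x)\,dx = N\,\Omega_N,\\
\mathbf n(x) &= x \ \text{ on } \partial G = S^{N-1}
  &&\Longrightarrow\quad \int_{\partial G}\bigl(\mathbf n(x),V(x)\bigr)_{\mathbb R^N}\,\lambda_{\partial G}(dx)
     = \int_{S^{N-1}} |x|^2\,\lambda_{S^{N-1}}(dx) = \omega_{N-1}.
\end{align*}
```

In this case the theorem reduces to the familiar relation $\omega_{N-1}=N\Omega_N$ between the area of the unit sphere and the volume of the unit ball.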
There are so many applications of Theorem 5.3.7 that it is hard to choose among them. However, here is one that is particularly useful. In its statement, $L_V$ is the directional derivative operator $\sum_{i=1}^N V_i\partial_{x_i}$ determined by $V$, and $L_V^\top$ is the corresponding formal adjoint operator given by
\[
L_V^\top f=-\sum_{i=1}^N\partial_{x_i}(fV_i)=-L_Vf-f\operatorname{div}(V).
\]
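Corollary 5.3.8. Referring to the preceding, one has
\[
\int_G fL_Vg\,d\lambda_{\mathbb R^N}=\int_G gL_V^\top f\,d\lambda_{\mathbb R^N}+\int_{\partial G}fg\,\bigl(\mathbf n,V\bigr)_{\mathbb R^N}\,d\lambda_{\partial G}
\]
for all $f,g\in C_{\mathrm b}^2(\mathbb R^N;\mathbb R)$.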
Proof: Simply observe that $\operatorname{div}(fgV)=gL_Vf-fL_V^\top g$, and apply Theorem 5.3.7 to the vector field $fgV$.
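For the record, the identity used in this proof follows from the product rule. The following sketch spells it out, using only the definitions of $L_V$ and $L_V^\top$ given above:

```latex
\begin{align*}
\operatorname{div}(fgV)
  &= \sum_{i=1}^N \partial_{x_i}\bigl(f\,gV_i\bigr)
   = g\sum_{i=1}^N V_i\,\partial_{x_i}f + f\sum_{i=1}^N \partial_{x_i}(gV_i)
   = g\,L_Vf - f\,L_V^\top g .
\end{align*}
```

Since $\operatorname{div}(fgV)$ is symmetric in $f$ and $g$, it also equals $fL_Vg-gL_V^\top f$, and integrating this form over $G$ and applying Theorem 5.3.7 gives the formula in the order stated in the corollary.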
Exercises for §5.3
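Exercise 5.3.9. Let $G$ and $V$ be as in Theorem 5.3.7. Choose $\rho>0$, $p:(\partial G)^{(\rho)}\to\partial G$, $\mathbf n:(\partial G)^{(\rho)}\to S^{N-1}$, and $\xi:(\partial G)^{(\rho)}\to(-\rho,\rho)$ as in Lemma 5.3.6; and assume that $\bigl(V(p),\mathbf n(p)\bigr)_{\mathbb R^N}=0$ for all $p\in\partial G$.
(i) Show that $\nabla\xi=\mathbf n$ on $(\partial G)^{(\rho)}$.
Hint: From $x=p(x)+\xi(x)\mathbf n(x)$, show that $e_i=\partial_{x_i}p+(\partial_{x_i}\xi)\mathbf n+\xi\,\partial_{x_i}\mathbf n$, where $(e_i)_j=\delta_{i,j}$. Show that $\partial_{x_i}p(x)\in T_{p(x)}(\partial G)$ and that $\bigl(\partial_{x_i}\mathbf n,\mathbf n\bigr)_{\mathbb R^N}=\frac12\partial_{x_i}|\mathbf n(x)|^2=0$, and conclude that $\mathbf n(x)_i=\partial_{x_i}\xi$.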
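(ii) Show that there is a $C<\infty$ such that
\[
\Bigl|\bigl(V(x),\mathbf n(p(x))\bigr)_{\mathbb R^N}\Bigr|\le C|\xi(x)|
\]
for all $x\in(\partial G)^{(\rho)}$.
(iii) Show that there is a $T>0$ such that $\Phi(t,p)\in(\partial G)^{(\rho)}$ for all $p\in\partial G$ and $|t|\le T$. Next, set $u(t,p)=\xi\circ\Phi(t,p)$ for $|t|\le T$, and show that there exists a $C<\infty$ such that $|\dot u(t,p)|\le C|u(t,p)|$ and therefore that $\Phi(t,p)\in\partial G$ for all $p\in\partial G$ and $|t|\le T$.
Hint: Use induction on $n\ge0$ to show that $|u(t,p)|\le\rho\frac{(C|t|)^n}{n!}$ for $|t|\le T$.
(iv) Show that $\Phi(t,x)\in\partial G$ for all $t\in\mathbb R$ if $\Phi(s,x)\in\partial G$ for some $s\in\mathbb R$, and use this to conclude that, depending on whether $x\in G$ or $x\notin\bar G$, $\Phi(t,x)\in G$ or $\Phi(t,x)\notin\bar G$ for all $t\ge0$.
Hint: Use (iii) and the flow property $\Phi(s+t,p)=\Phi\bigl(t,\Phi(s,p)\bigr)$.
(v) Under the additional assumption that $\operatorname{div}(V)=0$ on $G$, show that $\bigl(\Phi(t,\cdot)\restriction G\bigr)_*\lambda_G=\lambda_G$ for all $t\in\mathbb R$, where $\lambda_G=\lambda_{\mathbb R^N}\restriction\mathcal B_G$.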
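Exercise 5.3.10. Use $\Delta=\sum_{i=1}^N\partial_{x_i}^2$ to denote the Euclidean Laplacian on $\mathbb R^N$. Given a pair of functions $u,v\in C_{\mathrm b}^2(\mathbb R^N;\mathbb R)$ and a bounded smooth region $G$ in $\mathbb R^N$, prove Green’s formula:
\[
\int_G\bigl(u\Delta v-v\Delta u\bigr)\,d\lambda_{\mathbb R^N}
=\int_{\partial G}\Bigl(u\bigl(\mathbf n,\nabla v\bigr)_{\mathbb R^N}-v\bigl(\mathbf n,\nabla u\bigr)_{\mathbb R^N}\Bigr)\,d\lambda_{\partial G},
\]
where $\mathbf n$ denotes the outward pointing unit normal. In particular,
\[
\int_G\Delta u\,d\lambda_{\mathbb R^N}=\int_{\partial G}\bigl(\mathbf n,\nabla u\bigr)_{\mathbb R^N}\,d\lambda_{\partial G}.
\]
Hint: Note that $u\Delta v-v\Delta u=\operatorname{div}\bigl(u\nabla v-v\nabla u\bigr)$.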
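The identity in the hint is a one-line product-rule computation. A sketch, using $\operatorname{div}(\varphi W)=(\nabla\varphi,W)_{\mathbb R^N}+\varphi\operatorname{div}(W)$ for a scalar $\varphi$ and a vector field $W$:

```latex
\begin{align*}
\operatorname{div}\bigl(u\nabla v - v\nabla u\bigr)
  &= \bigl(\nabla u,\nabla v\bigr)_{\mathbb R^N} + u\,\Delta v
   - \bigl(\nabla v,\nabla u\bigr)_{\mathbb R^N} - v\,\Delta u
   = u\,\Delta v - v\,\Delta u .
\end{align*}
```

The special case in the exercise is Green’s formula with the first function taken to be identically $1$, since then its gradient and Laplacian vanish.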
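Exercise 5.3.11. Take $N=2$, and assume that $\partial G$ is a closed, simple curve in the sense that there is a $\gamma\in C^2\bigl([0,1];\mathbb R^2\bigr)$ with the properties that $t\in[0,1)\mapsto\gamma(t)\in\partial G$ is an injective surjection, $\gamma(0)=\gamma(1)$, $\dot\gamma(0)=\dot\gamma(1)$, $\ddot\gamma(0)=\ddot\gamma(1)$, and $|\dot\gamma(t)|>0$ for $t\in[0,1]$.
(i) Show that
\[
\int_{\partial G}\varphi\,d\lambda_{\partial G}=\int_{[0,1]}\varphi\circ\gamma(t)\,|\dot\gamma(t)|\,dt
\]
for all bounded measurable $\varphi$ on $\partial G$.
(ii) Let $\mathbf n(t)$ denote the outer unit normal to $G$ at $\gamma(t)$, check that
\[
\mathbf n(t)=\pm|\dot\gamma(t)|^{-1}\bigl(\dot\gamma_2(t),-\dot\gamma_1(t)\bigr),
\]
with the same sign for all $t\in[0,1)$, and assume that $\gamma$ has been parametrized so that the plus sign is the correct one. Next, suppose that $u,v\in C_{\mathrm b}^2\bigl(\mathbb R^2;\mathbb R\bigr)$, and set $f=u+iv$. If $\partial_{\bar z}\equiv\frac12(\partial_x+i\partial_y)$, show that
\[
2i\int_G\partial_{\bar z}f\,d\lambda_{\mathbb R^2}=\int_0^1f\bigl(\gamma(t)\bigr)\,dz(t),
\]
where $z(t)\equiv\gamma_1(t)+i\gamma_2(t)$ and $dz(t)=\dot z(t)\,dt$. When $f$ is analytic in $G$, the Cauchy–Riemann equations imply that $\partial_{\bar z}f=0$ there, and the preceding becomes the renowned Cauchy Integral Theorem. See Exercise 5.3.14 for a continuation of this exercise.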
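Hint: Check that $2\partial_{\bar z}f=\operatorname{div}(V)+i\operatorname{div}(W)$, where
\[
V=\begin{pmatrix}u\\-v\end{pmatrix}\quad\text{and}\quad W=\begin{pmatrix}v\\u\end{pmatrix}.
\]
Now apply the Divergence Theorem, and check that, along $\gamma$, $|\dot\gamma|\bigl((V,\mathbf n)_{\mathbb R^2}+i(W,\mathbf n)_{\mathbb R^2}\bigr)=-if\dot z$.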
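The two checks requested in this hint can be sketched as follows, assuming the plus sign for $\mathbf n(t)$ as in part (ii) and abbreviating $u=u(\gamma(t))$, $v=v(\gamma(t))$. The factor $|\dot\gamma|$ is kept explicit because it is the one supplied by the arc-length formula in part (i):

```latex
\begin{align*}
2\partial_{\bar z}f
  &= (\partial_x + i\partial_y)(u+iv)
   = (\partial_xu-\partial_yv) + i(\partial_xv+\partial_yu)
   = \operatorname{div}(V) + i\operatorname{div}(W),\\
|\dot\gamma|\bigl[(V,\mathbf n)_{\mathbb R^2} + i(W,\mathbf n)_{\mathbb R^2}\bigr]
  &= (u\dot\gamma_2 + v\dot\gamma_1) + i(v\dot\gamma_2 - u\dot\gamma_1),\\
-i f\dot z
  &= -i(u+iv)(\dot\gamma_1+i\dot\gamma_2)
   = (u\dot\gamma_2 + v\dot\gamma_1) + i(v\dot\gamma_2 - u\dot\gamma_1).
\end{align*}
```

Combining these with the Divergence Theorem and part (i) gives $2i\int_G\partial_{\bar z}f\,d\lambda_{\mathbb R^2}=i\int_0^1\bigl(-if(\gamma(t))\dot z(t)\bigr)\,dt=\int_0^1f(\gamma(t))\,dz(t)$.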
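Exercise 5.3.12. Suppose that $u\in C_{\mathrm b}^2\bigl(\mathbb R^N;\mathbb R\bigr)$ and that $\Delta u=0$ on $B_{\mathbb R^N}(x,R)$.
The goal of this exercise is to prove the mean-value property
\[
(5.3.13)\qquad u(x)=\frac1{\omega_{N-1}}\int_{S^{N-1}}u(x+R\omega)\,\lambda_{S^{N-1}}(d\omega),
\]
and the first step is to show that it suffices to handle the case in which $x$ is the origin. Second, observe that there is hardly anything to do when $N=1$, since in that case there exist $a,b\in\mathbb R$ for which $u(x)=ax+b$ for $|x|\le R$, and so $u(0)=\frac{u(-R)+u(R)}2$. Thus, assume that $N\ge2$, and define
\[
g_N(r)=\begin{cases}\log\frac1r&\text{if }N=2\\[2pt](N-2)^{-1}r^{2-N}&\text{if }N\ge3\end{cases}
\]
for $r>0$, and, for $\epsilon>0$, set
\[
v_\epsilon(x)=g_N\Bigl(\sqrt{\epsilon^2+|x|^2}\Bigr)-g_N\Bigl(\sqrt{\epsilon^2+R^2}\Bigr).
\]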
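(i) Given $0<r<R$, set $G_r=B_{\mathbb R^N}(0,R)\setminus\overline{B_{\mathbb R^N}(0,r)}$, and show that
\begin{align*}
\int_{\partial G_r}\Bigl(\bigl(\mathbf n,\nabla v_\epsilon\bigr)_{\mathbb R^N}u-\bigl(\mathbf n,\nabla u\bigr)_{\mathbb R^N}v_\epsilon\Bigr)\,d\lambda_{\partial G_r}
&=\Bigl(\frac r{\sqrt{\epsilon^2+r^2}}\Bigr)^N\int_{S^{N-1}}u(r\omega)\,\lambda_{S^{N-1}}(d\omega)\\
&\quad-\Bigl(\frac R{\sqrt{\epsilon^2+R^2}}\Bigr)^N\int_{S^{N-1}}u(R\omega)\,\lambda_{S^{N-1}}(d\omega)\\
&\quad+r^{N-1}A_\epsilon(r,R)\int_{S^{N-1}}\bigl(\omega,\nabla u(r\omega)\bigr)_{\mathbb R^N}\,\lambda_{S^{N-1}}(d\omega),
\end{align*}
where $A_\epsilon(r,R)\equiv g_N\bigl(\sqrt{\epsilon^2+R^2}\bigr)-g_N\bigl(\sqrt{\epsilon^2+r^2}\bigr)$.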
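(ii) Using the fact that $\Delta w(x)=\varphi''(|x|)+(N-1)\frac{\varphi'(|x|)}{|x|}$ if $w(x)=\varphi(|x|)$, show that
\[
\Delta v_\epsilon(x)=-\frac{N\epsilon^2}{(\epsilon^2+|x|^2)^{1+\frac N2}},
\]
and combine this with (cf. Exercise 5.3.10) Green’s formula and (i) to conclude that
\begin{align*}
-N\epsilon^2\int_{G_r}\frac{u(x)}{(\epsilon^2+|x|^2)^{1+\frac N2}}\,\lambda_{\mathbb R^N}(dx)
&=\Bigl(\frac r{\sqrt{\epsilon^2+r^2}}\Bigr)^N\int_{S^{N-1}}u(r\omega)\,\lambda_{S^{N-1}}(d\omega)\\
&\quad-\Bigl(\frac R{\sqrt{\epsilon^2+R^2}}\Bigr)^N\int_{S^{N-1}}u(R\omega)\,\lambda_{S^{N-1}}(d\omega)\\
&\quad+r^{N-1}A_\epsilon(r,R)\int_{S^{N-1}}\bigl(\omega,\nabla u(r\omega)\bigr)_{\mathbb R^N}\,\lambda_{S^{N-1}}(d\omega).
\end{align*}
Now let $\epsilon\searrow0$ and then $r\searrow0$ to arrive at (5.3.13).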
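The Laplacian in part (ii) can be checked directly from the radial formula. In the sketch below, $\epsilon>0$ is the small parameter of the exercise, $\varphi(\rho)=g_N\bigl(\sqrt{\epsilon^2+\rho^2}\bigr)$ up to an additive constant, and the only input is that $g_N'(s)=-s^{1-N}$ for every $N\ge2$:

```latex
\begin{align*}
\varphi'(\rho) &= -\bigl(\epsilon^2+\rho^2\bigr)^{\frac{1-N}{2}}\cdot\frac{\rho}{\sqrt{\epsilon^2+\rho^2}}
               = -\rho\,\bigl(\epsilon^2+\rho^2\bigr)^{-\frac N2},\\
\varphi''(\rho) &= -\bigl(\epsilon^2+\rho^2\bigr)^{-\frac N2}
                 + N\rho^2\bigl(\epsilon^2+\rho^2\bigr)^{-\frac N2-1},\\
\varphi''(\rho) + (N-1)\frac{\varphi'(\rho)}{\rho}
  &= -N\bigl(\epsilon^2+\rho^2\bigr)^{-\frac N2}
     + N\rho^2\bigl(\epsilon^2+\rho^2\bigr)^{-\frac N2-1}
   = -\frac{N\epsilon^2}{\bigl(\epsilon^2+\rho^2\bigr)^{1+\frac N2}} .
\end{align*}
```

For fixed $r>0$ the integrand on $G_r$ is bounded by a constant times $\epsilon^2r^{-N-2}$, which is why the volume integral disappears when $\epsilon\searrow0$.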
Exercise 5.3.14. Refer to the setting in Exercise 5.3.11, especially part (ii), and use $f(z)$ to denote $f(x,y)$ when $z=x+iy$. The goal of this exercise is to show that
\[
2i\int_G\frac{\partial_{\bar z}f(z)}{z-\zeta}\,d\lambda_{\mathbb R^2}=\int_0^1\frac{f\bigl(z(t)\bigr)}{z(t)-\zeta}\,dz(t)-2\pi if(\zeta)\quad\text{for }\zeta\in G.
\]
In particular, when $f$ is analytic in $G$, this proves the Cauchy Integral Formula.
(i) First reduce to the case $0\in G$ and $\zeta=0$. Second, show that $\partial_{\bar z}\frac{f(z)}z=\frac{\partial_{\bar z}f(z)}z$ for $z\ne0$.
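(ii) Define
\[
\eta(z)=\begin{cases}\dfrac1z&\text{if }|z|>1\\[6pt]\dfrac1z\Bigl(1-16\bigl(1-|z|\bigr)^4\Bigr)^4&\text{if }\frac12<|z|\le1\\[6pt]0&\text{if }|z|\le\frac12,\end{cases}
\]
and check that $\eta\in C_{\mathrm b}^2\bigl(\mathbb R^2;[0,1]\bigr)$.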
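(iii) Given $r>0$ with $B(0,r)\subset\subset G$, set $f_r(z)=\eta(r^{-1}z)f(z)$, and apply parts (i) here and (ii) of Exercise 5.3.11 to $f_r$ to see that
\[
2i\int_{G\setminus B(0,r)}\frac{\partial_{\bar z}f(z)}z\,d\lambda_{\mathbb R^2}=\int_{\partial G}\frac{f\bigl(z(t)\bigr)}{z(t)}\,dz(t)-2\pi i\int_0^1f\bigl(re^{i2\pi t}\bigr)\,dt.
\]
Finally, let $r\searrow0$.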
Chapter 6 Basic Inequalities and Lebesgue Spaces
I have already introduced (cf. §§3.1.2 and 3.2.3) the vector space $L^1(\mu;\mathbb R)$ with the norm¹ $\|\cdot\|_{L^1(\mu;\mathbb R)}$ and shown it to be a Banach space: that is, a normed vector space that is complete with respect to the metric determined by its norm. Although, from the measure-theoretic point of view, $L^1(\mu;\mathbb R)$ is an obvious space with which to deal, from a geometric standpoint, it is flawed. To understand its flaw, consider the two point space $E=\{1,2\}$ and the measure $\mu$ that assigns measure 1 to both points. Then $L^1(\mu;\mathbb R)$ is easily identified with $\mathbb R^2$, and the length that $\|\cdot\|_{L^1(\mu;\mathbb R)}$ assigns to $x=(x_1,x_2)\in\mathbb R^2$ is $|x_1|+|x_2|$. Hence, the unit ball in this space is the equilateral diamond whose center is the origin and whose vertices lie on the coordinate axes, and, as such, its boundary has nasty corners. For this reason, it is reasonable to ask whether there are measure-theoretically natural Banach spaces that have better geometric properties.
In the first part of this chapter I will develop a few inequalities that will allow me in the second part to introduce the sort of Banach spaces alluded to in the preceding. Once I have done so, I will conclude with a cursory presentation of results about the boundedness properties of linear maps between these spaces.
§6.1 Jensen, Minkowski, and Hölder
In this section I will derive some inequalities that generalize the inequalities, like the triangle inequality, which are familiar in the Euclidean context.
Since all the inequalities here are consequences of convexity considerations, I will begin by reviewing a few elementary facts about convex sets and concave functions on them. Let $V$ be a real or complex vector space. A subset $C\subseteq V$ is said to be convex if $(1-\alpha)x+\alpha y\in C$ whenever $x,y\in C$ and $\alpha\in[0,1]$.
Given a convex set $C\subseteq V$, $g:C\to\mathbb R$ is said to be a concave function on $C$ if
\[
g\bigl((1-\alpha)x+\alpha y\bigr)\ge(1-\alpha)g(x)+\alpha g(y)\quad\text{for all }x,y\in C\text{ and }\alpha\in[0,1].
\]
¹ Given a vector space $V$, a norm $\|\cdot\|$ on $V$ is a non-negative map with the properties that $\|v\|=0$ if and only if $v=0$, $\|\alpha v\|=|\alpha|\|v\|$ for all $\alpha\in\mathbb R$ and $v\in V$, and $\|v+w\|\le\|v\|+\|w\|$ for all $v,w\in V$. The metric on $V$ determined by the norm $\|\cdot\|$ is the one for which $\|w-v\|$ gives the distance between $v$ and $w$.
D.W. Stroock, Essentials of Integration Theory for Analysis, Graduate Texts in Mathematics 262, DOI 10.1007/978-1-4614-1135-2_6, © Springer Science+Business Media, LLC 2011
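Note that $g$ is concave on $C$ if and only if $\{(x,t)\in C\times\mathbb R:\ t\le g(x)\}$ is a convex subset of $V\oplus\mathbb R$. In addition, one can use induction on $n\ge2$ to see that
\[
\sum_{k=1}^n\alpha_ky_k\in C\quad\text{and}\quad g\Bigl(\sum_{k=1}^n\alpha_ky_k\Bigr)\ge\sum_{k=1}^n\alpha_kg(y_k)
\]
for all $n\ge2$, $\{y_1,\dots,y_n\}\subseteq C$ and $\{\alpha_1,\dots,\alpha_n\}\subseteq[0,1]$ with $\sum_{k=1}^n\alpha_k=1$.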
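Namely, if $n=2$ or $\alpha_n\in\{0,1\}$, then there is nothing to do. On the other hand, if $n\ge3$ and $\alpha_n\in(0,1)$, set $x=(1-\alpha_n)^{-1}\sum_{k=1}^{n-1}\alpha_ky_k$, and, assuming the result for $n-1$, conclude that
\begin{align*}
g\Bigl(\sum_{k=1}^n\alpha_ky_k\Bigr)&=g\bigl((1-\alpha_n)x+\alpha_ny_n\bigr)\\
&\ge(1-\alpha_n)\,g\Bigl(\sum_{k=1}^{n-1}\alpha_k(1-\alpha_n)^{-1}y_k\Bigr)+\alpha_ng(y_n)\ge\sum_{k=1}^n\alpha_kg(y_k).
\end{align*}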
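The final inequality in this display is the induction hypothesis applied with renormalized weights. As a sketch, put $\beta_k=\alpha_k/(1-\alpha_n)$ for $1\le k\le n-1$ (the symbol $\beta_k$ is introduced here only for illustration):

```latex
\begin{align*}
\sum_{k=1}^{n-1}\beta_k &= \frac{1}{1-\alpha_n}\sum_{k=1}^{n-1}\alpha_k
   = \frac{1-\alpha_n}{1-\alpha_n} = 1,\\
(1-\alpha_n)\,g\Bigl(\sum_{k=1}^{n-1}\beta_k y_k\Bigr)
  &\ge (1-\alpha_n)\sum_{k=1}^{n-1}\beta_k\,g(y_k)
   = \sum_{k=1}^{n-1}\alpha_k\,g(y_k).
\end{align*}
```

Adding $\alpha_ng(y_n)$ to both sides gives the stated bound, and the first line also shows that $x=\sum_{k=1}^{n-1}\beta_ky_k$ is a convex combination and therefore lies in $C$.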
The essence of the relationship between these notions and measure theory is contained in the following.
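Theorem 6.1.1 (Jensen’s inequality). Let $C$ be a closed, convex subset of $\mathbb R^N$, and suppose that $g$ is a continuous, concave, non-negative function on $C$. If $(E,\mathcal B,\mu)$ is a probability space and $\mathbf F:E\to C$ a measurable function on $(E,\mathcal B)$ with the property that $|\mathbf F|\in L^1(\mu;\mathbb R)$, then
\[
\int_E\mathbf F\,d\mu\equiv\begin{pmatrix}\int_EF_1\,d\mu\\\vdots\\\int_EF_N\,d\mu\end{pmatrix}\in C
\quad\text{and}\quad
\int_Eg\circ\mathbf F\,d\mu\le g\Bigl(\int_E\mathbf F\,d\mu\Bigr).
\]
(See Exercise 6.1.9 for another derivation.)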
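Proof: First assume that $\mathbf F$ is simple. Then $\mathbf F=\sum_{k=0}^ny_k\mathbf 1_{\Gamma_k}$ for some $n\in\mathbb Z^+$, $y_0,\dots,y_n\in C$, and cover $\{\Gamma_0,\dots,\Gamma_n\}$ of $E$ by mutually disjoint elements of $\mathcal B$. Thus, since $\sum_{k=0}^n\mu(\Gamma_k)=1$ and $C$ is convex, $\int_E\mathbf F\,d\mu=\sum_{k=0}^ny_k\mu(\Gamma_k)\in C$ and, because $g$ is concave,
\[
g\Bigl(\int_E\mathbf F\,d\mu\Bigr)=g\Bigl(\sum_{k=0}^ny_k\mu(\Gamma_k)\Bigr)\ge\sum_{k=0}^ng(y_k)\mu(\Gamma_k)=\int_Eg\circ\mathbf F\,d\mu.
\]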
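Now let $\mathbf F$ be general. The idea is to approximate $\mathbf F$ by $C$-valued simple functions. For this purpose, choose and fix some element $y_0$ of $C$, and let $\{y_k:\ k\ge1\}$ be a dense sequence in $C$. Given $m\in\mathbb Z^+$, choose $R_m>0$ and $n_m\in\mathbb Z^+$ for which
\[
\int_{\{|\mathbf F|\ge R_m\}}\bigl(|\mathbf F|+|y_0|\bigr)\,d\mu\le\frac1m\quad\text{and}\quad C\cap B(0,R_m)\subseteq\bigcup_{k=1}^{n_m}B\bigl(y_k,\tfrac1m\bigr).
\]
Next, set $\Gamma_{m,0}=\{\xi\in E:\ |\mathbf F(\xi)|\ge R_m\}$, and use induction to define
\[
\Gamma_{m,\ell}=\Bigl\{\xi\in E\setminus\bigcup_{k=0}^{\ell-1}\Gamma_{m,k}:\ \mathbf F(\xi)\in B\bigl(y_\ell,\tfrac1m\bigr)\Bigr\}
\]
for $1\le\ell\le n_m$. Finally, set $\mathbf F_m=\sum_{k=0}^{n_m}y_k\mathbf 1_{\Gamma_{m,k}}$.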
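By construction, the $\mathbf F_m$’s are simple and $C$-valued. Hence, by the preceding,
\[
\int_E\mathbf F_m\,d\mu\in C\quad\text{and}\quad g\Bigl(\int_E\mathbf F_m\,d\mu\Bigr)\ge\int_Eg\circ\mathbf F_m\,d\mu
\]
for each $m\in\mathbb Z^+$. Moreover, since $|\mathbf F-\mathbf F_m|\le\frac1m$ on $\bigcup_{\ell=1}^{n_m}\Gamma_{m,\ell}=E\setminus\Gamma_{m,0}$,
\[
\int_E\bigl|\mathbf F-\mathbf F_m\bigr|\,d\mu=\sum_{\ell=0}^{n_m}\int_{\Gamma_{m,\ell}}\bigl|\mathbf F-\mathbf F_m\bigr|\,d\mu\le\frac1m+\int_{\Gamma_{m,0}}\bigl(|\mathbf F|+|y_0|\bigr)\,d\mu\le\frac2m.
\]
Thus, $\bigl\||\mathbf F_m-\mathbf F|\bigr\|_{L^1(\mu;\mathbb R)}\to0$ as $m\to\infty$; and so, because $C$ is closed, we now see that $\int_E\mathbf F\,d\mu\in C$. At the same time, because (cf. Exercise 3.2.17) $g$ is continuous, $g\circ\mathbf F_m\to g\circ\mathbf F$ in $\mu$-measure as $m\to\infty$. Hence, by the version of Fatou’s Lemma in Theorem 3.2.12,
\[
\int_Eg\circ\mathbf F\,d\mu\le\varliminf_{m\to\infty}\int_Eg\circ\mathbf F_m\,d\mu\le\varliminf_{m\to\infty}g\Bigl(\int_E\mathbf F_m\,d\mu\Bigr)=g\Bigl(\int_E\mathbf F\,d\mu\Bigr).
\]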
In order to apply Jensen’s inequality, we need to develop a criterion for recognizing when a function is concave. Such a criterion is contained in the next theorem. Recall that the Hessian matrix $H_g(x)$ of a function $g$ that is twice continuously differentiable at $x$ is the symmetric matrix given by
\[
H_g(x)\equiv\Bigl(\frac{\partial^2g}{\partial x_i\partial x_j}(x)\Bigr)_{1\le i,j\le N}.
\]
Also, a symmetric, real $N\times N$ matrix $A$ is said to be non-positive definite if all of its eigenvalues are non-positive, or, equivalently, if $\bigl(\xi,A\xi\bigr)_{\mathbb R^N}\le0$ for all $\xi\in\mathbb R^N$.
Lemma 6.1.2. Suppose that $U$ is an open, convex subset of $\mathbb R^N$, and set $C=\bar U$. Then $C$ is also convex. Moreover, if $g:C\to\mathbb R$ is continuous and $g\restriction U$ is concave, then $g$ is concave on all of $C$. Finally, if $g:C\to\mathbb R$ is continuous and $g\restriction U$ is twice continuously differentiable, then $g$ is concave on $C$ if and only if its Hessian matrix is non-positive definite for each $x\in U$.
Proof: The convexity of $C$ is obvious. In addition, if $g\restriction U$ is concave, the concavity of $g$ on $C$ follows trivially by continuity. Thus, what remains to show is that if $g:U\to\mathbb R$ is twice continuously differentiable, then $g$ is concave on $U$ if and only if its Hessian is non-positive definite at each $x\in U$.
In order to prove that $g$ is concave on $U$ if $H_g(x)$ is non-positive definite at every $x\in U$, we will use the following simple result about functions on the interval $[0,1]$. Namely, suppose that $u\in C^2\bigl([0,1];\mathbb R\bigr)$, $u(0)=0=u(1)$, and $u''\le0$. Then $u\ge0$. To see this (cf. Exercise 6.1.10 for another approach), let $\epsilon>0$ be given, and consider the function $u_\epsilon\equiv u+\epsilon t(1-t)$. Clearly it is enough to show that $u_\epsilon\ge0$ on $[0,1]$ for every $\epsilon>0$. Note that $u_\epsilon(0)=u_\epsilon(1)=0$ and $u_\epsilon''(t)<0$ for every $t\in[0,1]$. On the other hand, if $u_\epsilon(t)<0$ for some $t\in[0,1]$, then there is an $s\in(0,1)$ at which $u_\epsilon$ achieves its absolute minimum. But this is impossible, since then, by the second derivative test, we would have that $u_\epsilon''(s)\ge0$.
Now assume that $H_g(x)$ is non-positive definite for every $x\in U$. Given $x,y\in U$, define $u(t)=g((1-t)x+ty)-(1-t)g(x)-tg(y)$ for $t\in[0,1]$. Then $u(0)=u(1)=0$ and
\[
u''(t)=\Bigl(y-x,H_g\bigl((1-t)x+ty\bigr)(y-x)\Bigr)_{\mathbb R^N}\le0
\]
for every $t\in[0,1]$. Hence, by the preceding paragraph, $u\ge0$ on $[0,1]$; and so $g((1-t)x+ty)\ge(1-t)g(x)+tg(y)$ for all $t\in[0,1]$. In other words, $g$ is concave on $U$ and therefore on $C$.
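The formula for the second derivative used in this paragraph is two applications of the chain rule. As a sketch, write $z_t=(1-t)x+ty$ (a shorthand introduced here), so that $\dot z_t=y-x$:

```latex
\begin{align*}
u'(t)  &= \bigl(\nabla g(z_t),\,y-x\bigr)_{\mathbb R^N} + g(x) - g(y),\\
u''(t) &= \sum_{i,j=1}^N \frac{\partial^2 g}{\partial x_i\partial x_j}(z_t)\,(y-x)_i\,(y-x)_j
        = \bigl(y-x,\;H_g(z_t)(y-x)\bigr)_{\mathbb R^N}.
\end{align*}
```

The affine part $-(1-t)g(x)-tg(y)$ contributes only the constant $g(x)-g(y)$ to $u'$ and nothing to $u''$.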
To complete the proof, suppose that $H_g(x)$ has a positive eigenvalue for some $x\in U$. We can then find an $e\in S^{N-1}$ and an $\epsilon>0$ such that $\bigl(e,H_g(x)e\bigr)_{\mathbb R^N}>0$ and $x+te\in U$ for all $t\in(-\epsilon,\epsilon)$. Set $u(t)=g(x+te)$ for $t\in(-\epsilon,\epsilon)$. Then $u''(0)=\bigl(e,H_g(x)e\bigr)_{\mathbb R^N}>0$. On the other hand,
\[
u''(0)=\lim_{t\to0}\frac{u(t)+u(-t)-2u(0)}{t^2},
\]
and, if $g$ were concave,
\[
2u(0)=2u\Bigl(\frac{t-t}2\Bigr)=2g\Bigl(\tfrac12(x+te)+\tfrac12(x-te)\Bigr)\ge g(x+te)+g(x-te)=u(t)+u(-t),
\]
from which we would get the contradiction $0<u''(0)\le0$.
When N = 2, the following lemma provides a useful test for non-positive definiteness.
Lemma 6.1.3. Let $A=\begin{pmatrix}a&b\\b&c\end{pmatrix}$ be a real symmetric matrix. Then $A$ is non-positive definite if and only if both $a+c\le0$ and $ac\ge b^2$. In particular, for each $\alpha\in(0,1)$, the functions $(x,y)\in[0,\infty)^2\mapsto x^\alpha y^{1-\alpha}$ and $(x,y)\in[0,\infty)^2\mapsto\bigl(x^\alpha+y^\alpha\bigr)^{\frac1\alpha}$ are continuous and concave.
Proof: In view of Lemma 6.1.2, it suffices to check the first assertion. To this end, let $T=a+c$ be the trace and $D=ac-b^2$ the determinant of $A$. Also, let $\lambda$ and $\mu$ denote the eigenvalues of $A$. Then, $T=\lambda+\mu$ and $D=\lambda\mu$. If $A$ is non-positive definite and therefore $\lambda\vee\mu\le0$, then it is obvious that $T\le0$ and that $D\ge0$. Conversely, if $D>0$, then either both $\lambda$ and $\mu$ are positive or both are negative. Hence if, in addition, $T\le0$, then $\lambda$ and $\mu$ are negative. Finally, if $D=0$ and $T\le0$, then either $\lambda=0$ and $\mu=T\le0$ or $\mu=0$ and $\lambda=T\le0$.
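The phrase "it suffices to check the first assertion" hides a short Hessian computation on the open quadrant $(0,\infty)^2$. The following sketch carries it out for both functions, with $\alpha\in(0,1)$ and the shorthand $s=x^\alpha+y^\alpha$ (introduced here for illustration):

```latex
\begin{align*}
g(x,y) &= x^\alpha y^{1-\alpha}: &
g_{xx} &= -\alpha(1-\alpha)\,x^{\alpha-2}y^{1-\alpha}, &
g_{yy} &= -\alpha(1-\alpha)\,x^{\alpha}y^{-\alpha-1}, &
g_{xy} &= \alpha(1-\alpha)\,x^{\alpha-1}y^{-\alpha},\\[4pt]
h(x,y) &= s^{\frac1\alpha}: &
h_{xx} &= -(1-\alpha)\,s^{\frac1\alpha-2}x^{\alpha-2}y^{\alpha}, &
h_{yy} &= -(1-\alpha)\,s^{\frac1\alpha-2}x^{\alpha}y^{\alpha-2}, &
h_{xy} &= (1-\alpha)\,s^{\frac1\alpha-2}x^{\alpha-1}y^{\alpha-1}.
\end{align*}
```

In both cases the trace is non-positive and the determinant vanishes, $g_{xx}g_{yy}-g_{xy}^2=0=h_{xx}h_{yy}-h_{xy}^2$, so the criterion $a+c\le0$, $ac\ge b^2$ applies, and Lemma 6.1.2 extends concavity from $(0,\infty)^2$ to its closure $[0,\infty)^2$.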
My first application of these considerations provides a generalization, known as Minkowski’s inequality, of the triangle inequality.
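Theorem 6.1.4 (Minkowski’s inequality). Let $f_1$ and $f_2$ be non-negative, measurable functions on the measure space $(E,\mathcal B,\mu)$. Then, for every $p\in[1,\infty)$,
\[
\Bigl(\int_E\bigl(f_1+f_2\bigr)^p\,d\mu\Bigr)^{\frac1p}\le\Bigl(\int_Ef_1^p\,d\mu\Bigr)^{\frac1p}+\Bigl(\int_Ef_2^p\,d\mu\Bigr)^{\frac1p}.
\]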
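Proof: The case $p=1$ follows from (3.1.10), and so we will assume that $p\in(1,\infty)$. Also, without loss of generality, we will assume that $f_1^p$ and $f_2^p$ are $\mu$-integrable and that $f_1$ and $f_2$ are $[0,\infty)$-valued.
Let $p\in(1,\infty)$ be given. If we assume that $\mu(E)=1$ and take $\alpha=\frac1p$, then, by Lemma 6.1.3 and Jensen’s inequality,
\begin{align*}
\int_E\bigl(f_1+f_2\bigr)^p\,d\mu&=\int_E\Bigl[\bigl(f_1^p\bigr)^\alpha+\bigl(f_2^p\bigr)^\alpha\Bigr]^{\frac1\alpha}\,d\mu\\
&\le\biggl[\Bigl(\int_Ef_1^p\,d\mu\Bigr)^\alpha+\Bigl(\int_Ef_2^p\,d\mu\Bigr)^\alpha\biggr]^{\frac1\alpha}
=\biggl[\Bigl(\int_Ef_1^p\,d\mu\Bigr)^{\frac1p}+\Bigl(\int_Ef_2^p\,d\mu\Bigr)^{\frac1p}\biggr]^p.
\end{align*}
More generally, if $\mu(E)=0$ there is nothing to do, and if $0<\mu(E)<\infty$ we can replace $\mu$ by $\frac\mu{\mu(E)}$ and apply the preceding. Hence, all that remains is the case $\mu(E)=\infty$. But if $\mu(E)=\infty$, take $E_n=\bigl\{f_1\vee f_2\ge\frac1n\bigr\}$, note that $\mu(E_n)\le n^p\int f_1^p\,d\mu+n^p\int f_2^p\,d\mu<\infty$, apply the preceding to $f_1$, $f_2$, and $\mu$ all restricted to $E_n$, and let $n\to\infty$.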
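The reduction from a finite measure to a probability measure works because both sides of Minkowski’s inequality scale by the same factor. A sketch, with $\nu=\mu/\mu(E)$ (the symbol $\nu$ is introduced here for illustration) and $0<\mu(E)<\infty$:

```latex
\begin{align*}
\Bigl(\int_E (f_1+f_2)^p\,d\nu\Bigr)^{\frac1p}
  &= \mu(E)^{-\frac1p}\Bigl(\int_E (f_1+f_2)^p\,d\mu\Bigr)^{\frac1p},\\
\Bigl(\int_E f_1^p\,d\nu\Bigr)^{\frac1p} + \Bigl(\int_E f_2^p\,d\nu\Bigr)^{\frac1p}
  &= \mu(E)^{-\frac1p}\biggl[\Bigl(\int_E f_1^p\,d\mu\Bigr)^{\frac1p}
     + \Bigl(\int_E f_2^p\,d\mu\Bigr)^{\frac1p}\biggr].
\end{align*}
```

Multiplying the inequality for the probability measure $\nu$ by $\mu(E)^{\frac1p}$ therefore yields the inequality for $\mu$.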
The next application, which is known as Hölder’s inequality, gives a generalization of the inner product inequality $|(\xi,\eta)_{\mathbb R^N}|\le|\xi||\eta|$ for $\xi,\eta\in\mathbb R^N$. In the Euclidean context, this inequality can be seen as an application of the law of the cosine, which says that the inner product of vectors is the product of their lengths and the cosine of the angle between them.
Theorem 6.1.5 (Hölder’s inequality). Given $p\in(1,\infty)$, define the Hölder conjugate $p'$ of $p$ by the equation $\frac1p+\frac1{p'}=1$ (i.e., $p'=\frac p{p-1}$). Then, for every pair of non-negative, measurable functions $f_1$ and $f_2$ on the measure space $(E,\mathcal B,\mu)$,
\[
\int_Ef_1f_2\,d\mu\le\Bigl(\int_Ef_1^p\,d\mu\Bigr)^{\frac1p}\Bigl(\int_Ef_2^{p'}\,d\mu\Bigr)^{\frac1{p'}}
\]
for every $p\in(1,\infty)$.
Proof: First note that if either factor on the right-hand side of the above inequality is 0, then $f_1f_2=0$ (a.e., $\mu$), and so the left-hand side is also 0. Thus we will assume that both factors on the right are strictly positive, in which case, we may and will assume in addition that both $f_1^p$ and $f_2^{p'}$ are $\mu$-integrable and that $f_1$ and $f_2$ are both $[0,\infty)$-valued. Also, just as in the proof of Minkowski’s inequality, we can reduce everything to the case $\mu(E)=1$. But then we can apply Jensen’s inequality and Lemma 6.1.3 with $\alpha=\frac1p$ to see that
\[
\int_Ef_1f_2\,d\mu=\int_E\bigl(f_1^p\bigr)^\alpha\bigl(f_2^{p'}\bigr)^{1-\alpha}\,d\mu\le\Bigl(\int_Ef_1^p\,d\mu\Bigr)^\alpha\Bigl(\int_Ef_2^{p'}\,d\mu\Bigr)^{1-\alpha}
=\Bigl(\int_Ef_1^p\,d\mu\Bigr)^{\frac1p}\Bigl(\int_Ef_2^{p'}\,d\mu\Bigr)^{\frac1{p'}}.
\]
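Two small points in this proof can be made explicit. The sketch below records why the integrand factors as claimed when $\alpha=\frac1p$, and why the reduction to $\mu(E)=1$ is legitimate; $\nu=\mu/\mu(E)$ is a symbol introduced here for illustration, with $0<\mu(E)<\infty$:

```latex
\begin{align*}
1-\alpha &= 1-\tfrac1p = \tfrac1{p'}
  \quad\Longrightarrow\quad
  \bigl(f_1^p\bigr)^{\alpha}\bigl(f_2^{p'}\bigr)^{1-\alpha} = f_1 f_2,\\
\int_E f_1f_2\,d\nu &= \mu(E)^{-1}\int_E f_1f_2\,d\mu,\\
\Bigl(\int_E f_1^p\,d\nu\Bigr)^{\frac1p}\Bigl(\int_E f_2^{p'}\,d\nu\Bigr)^{\frac1{p'}}
  &= \mu(E)^{-\frac1p-\frac1{p'}}
     \Bigl(\int_E f_1^p\,d\mu\Bigr)^{\frac1p}\Bigl(\int_E f_2^{p'}\,d\mu\Bigr)^{\frac1{p'}} .
\end{align*}
```

Since $\frac1p+\frac1{p'}=1$, both sides carry the same factor $\mu(E)^{-1}$, so the inequality for $\nu$ is equivalent to the one for $\mu$.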
Exercises for §6.1
Exercise 6.1.6. Here are a couple of easy applications of the preceding ideas.
(i) Show that $\log$ is continuous and concave on every interval $[\epsilon,\infty)$ with $\epsilon>0$. Use this together with Jensen’s inequality to show that for every $n\in\mathbb Z^+$, $\mu_1,\dots,\mu_n\in(0,1)$ satisfying $\sum_{m=1}^n\mu_m=1$, and $a_1,\dots,a_n\in[0,\infty)$,
\[
\prod_{m=1}^na_m^{\mu_m}\le\sum_{m=1}^n\mu_ma_m.
\]
In particular, when $\mu_m=\frac1n$ for every $1\le m\le n$, this yields $\bigl(a_1\cdots a_n\bigr)^{\frac1n}\le\frac1n\sum_{m=1}^na_m$, which is the statement that the arithmetic mean dominates the geometric mean.
(ii) Let $n\in\mathbb Z^+$, and suppose that $f_1,\dots,f_n$ are non-negative, measurable functions on the measure space $(E,\mathcal B,\mu)$. Given $p_1,\dots,p_n\in(1,\infty)$ satisfying $\sum_{m=1}^n\frac1{p_m}=1$, show that
\[
\int_Ef_1\cdots f_n\,d\mu\le\prod_{m=1}^n\Bigl(\int_Ef_m^{p_m}\,d\mu\Bigr)^{\frac1{p_m}}.
\]
Exercise 6.1.7. When $p=2$, Minkowski’s and Hölder’s inequalities are intimately related and are both very simple to prove. Indeed, let $f_1$ and $f_2$ be bounded, measurable functions on the finite measure space $(E,\mathcal B,\mu)$. Given any $\alpha\ne0$, observe that
\[
0\le\int_E\Bigl(\alpha f_1\pm\frac1\alpha f_2\Bigr)^2\,d\mu=\alpha^2\int_Ef_1^2\,d\mu\pm2\int_Ef_1f_2\,d\mu+\frac1{\alpha^2}\int_Ef_2^2\,d\mu,
\]
from which it follows that
\[
2\Bigl|\int_Ef_1f_2\,d\mu\Bigr|\le t\int_Ef_1^2\,d\mu+\frac1t\int_Ef_2^2\,d\mu
\]
for every $t>0$. If either integral on the right vanishes, show from the preceding that $\int_Ef_1f_2\,d\mu=0$. On the other hand, if neither integral vanishes, choose $t>0$ so that the preceding yields
\[
(6.1.8)\qquad\Bigl|\int_Ef_1f_2\,d\mu\Bigr|\le\Bigl(\int_Ef_1^2\,d\mu\Bigr)^{\frac12}\Bigl(\int_Ef_2^2\,d\mu\Bigr)^{\frac12}.
\]
Hence, in any case, (6.1.8) holds. Finally, argue that one can remove the restriction that $f_1$ and $f_2$ be bounded, and then remove the condition that $\mu(E)<\infty$. In particular, even if they are not bounded, so long as $f_1^2$ and $f_2^2$ are $\mu$-integrable, conclude that $f_1f_2$ must be $\mu$-integrable and that (6.1.8) continues to hold.
Clearly (6.1.8) is the special case of Hölder’s inequality when $p=2$. Because it is a particularly significant case, it is often referred to by a different name and is called Schwarz’s inequality. Assuming that both $f_1^2$ and $f_2^2$ are $\mu$-integrable, show that the inequality in Schwarz’s inequality is an equality if and only if there exist $(\alpha,\beta)\in\mathbb R^2\setminus\{0\}$ such that $\alpha f_1+\beta f_2=0$ (a.e., $\mu$).
Finally, use Schwarz’s inequality to obtain Minkowski’s inequality for the case $p=2$. Notice the similarity between the development here and that of the classical triangle inequality for the Euclidean metric on $\mathbb R^N$.
Exercise 6.1.9. A geometric proof of Jensen’s inequality can be based on the following. Given a closed, convex subset $C$ of $\mathbb R^N$, show that $q\notin C$ if and only if there is an $e_q\in S^{N-1}$ such that $\bigl(e_q,q-x\bigr)_{\mathbb R^N}>0$ for all $x\in C$. Next, given a probability space $(E,\mathcal B,\mu)$ and a $\mu$-integrable $\mathbf F:E\to C$, use the preceding to show that $p\equiv\int\mathbf F\,d\mu\in C$. Finally, let $g:C\to[0,\infty)$ be a continuous, concave function, and use the first part to prove Jensen’s inequality. Here are some steps that you might want to follow in proving Jensen’s inequality.
(i) Show that if $g_1$ and $g_2$ are continuous, concave functions on $C$, then so is $g_1\wedge g_2$. In particular, if $g$ is a non-negative, continuous, concave function, then $g\wedge n$ is also, and use this to reduce the proof of Jensen’s inequality to the case in which $g$ is bounded.
(ii) Assume that $g:C\to[0,\infty)$ is a bounded, continuous, concave function, and set $\hat C=\{(x,t)\in\mathbb R^N\times\mathbb R:\ x\in C\text{ and }t\le g(x)\}$. Show that $\hat C$ is a closed, convex subset of $\mathbb R^{N+1}$. Next, define $\hat{\mathbf F}:E\to\hat C$ by $\hat{\mathbf F}=\begin{pmatrix}\mathbf F\\g\circ\mathbf F\end{pmatrix}$, note that $\hat{\mathbf F}$ is $\mu$-integrable, and apply the first part to see that its $\mu$-integral is an element of $\hat C$. Finally, notice that
\[
\int\hat{\mathbf F}\,d\mu\in\hat C\implies\int g\circ\mathbf F\,d\mu\le g\Bigl(\int\mathbf F\,d\mu\Bigr).
\]
Exercise 6.1.10. Suppose that $u\in C^2\bigl([0,1];\mathbb R\bigr)$ satisfies $u(0)=0=u(1)$. The goal of this exercise is to show that
\[
(\ast)\qquad u(t)=-\int_{[0,1]}(s\wedge t-st)\,u''(s)\,ds\quad\text{for }t\in[0,1].
\]
In particular, if $u''\le0$, then $u\ge0$.
(i) Use integration by parts to show that
\[
u(t)=tu'(0)+\int_{[0,t]}(t-s)\,u''(s)\,ds\quad\text{for }t\in[0,1].
\]
(ii) Using (i), show that $u'(0)=-\int_{[0,1]}(1-s)\,u''(s)\,ds$ and therefore that $(\ast)$ holds.
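The "in particular" claim rests on the sign of the kernel in $(\ast)$. A sketch of that check, for $s,t\in[0,1]$:

```latex
\begin{align*}
s\wedge t - st =
\begin{cases}
  s(1-t) \ge 0 & \text{if } s\le t,\\[2pt]
  t(1-s) \ge 0 & \text{if } s\ge t.
\end{cases}
\end{align*}
```

Hence, when $u''\le0$, the integrand $(s\wedge t-st)\,u''(s)$ is non-positive, and the minus sign in front of the integral gives $u(t)\ge0$.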
§6.2 The Lebesgue Spaces
In the first part of this section I will introduce and briefly discuss the standard Lebesgue spaces $L^p(\mu;\mathbb R)$. In the second part, I will look at mixed Lebesgue spaces, one of the many useful variations on the standard ones.
§6.2.1. The $L^p$-Spaces: Given a measure space $(E,\mathcal B,\mu)$ and a $p\in[1,\infty)$, define
\[
\|f\|_{L^p(\mu;\mathbb R)}=\Bigl(\int_E|f|^p\,d\mu\Bigr)^{\frac1p}
\]
for measurable functions $f$ on $(E,\mathcal B)$. Also, if $f$ is a measurable function on $(E,\mathcal B)$, define
\[
\|f\|_{L^\infty(\mu;\mathbb R)}=\inf\bigl\{M\in[0,\infty]:\ |f|\le M\ \text{(a.e., }\mu)\bigr\}.
\]
Although information about $f$ can be gleaned from a study of $\|f\|_{L^p(\mu;\mathbb R)}$ as $p$ changes (for example, spikes in $f$ will be emphasized by taking $p$ to be large), all these quantities share the same flaw as $\|f\|_{L^1(\mu;\mathbb R)}$: they cannot detect properties of $f$ that occur on sets having $\mu$-measure 0. Thus, before we can hope to use any of them to put a metric on measurable functions, we must invoke the same subterfuge that I introduced in §3.1.2 in connection with the space $L^1(\mu;\mathbb R)$. That is, for $p\in[1,\infty]$, denote by $L^p(\mu;\mathbb R)=L^p(E,\mathcal B,\mu;\mathbb R)$