Mass Transport - The Divergence Theorem

Chapter 5 Changes of Variable

5.3 The Divergence Theorem

5.3.2. Mass Transport

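To be precise, if one interprets $\int \mathbf{1}_G(x)\,dx-\int \mathbf{1}_G\circ\Phi(t,x)\,dx$ as the net loss or gain due to the flow at time $t$, then (5.3.5) says that $\int_G \operatorname{div}V(x)\,dx$ is the rate of loss or gain at time $t=0$. On the other hand, there is another way in which to think about this computation. Namely,

\[
(\ast)\qquad \int \mathbf{1}_G(x)\,dx-\int \mathbf{1}_G\circ\Phi(t,x)\,dx
=\int_G \mathbf{1}_{G^\complement}\bigl(\Phi(t,x)\bigr)\,dx-\int_{G^\complement}\mathbf{1}_G\bigl(\Phi(t,x)\bigr)\,dx,
\]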

which indicates that one should be able to do the same calculation by observing how much mass has moved in each direction across the boundary of $G$ during the time interval $[0,t]$. To carry out this approach, I will assume that $G$ is a non-empty, bounded open set that is a smooth region in the sense that for each $p\in\partial G$ there exist an open neighborhood $W\ni p$ and an $F\in C^3(W;\mathbb{R})$ such that $|\nabla F|>0$ and $G\cap W=\{x\in W:\,F(x)<0\}$. In particular, $\partial G$ is a compact hypersurface and, for $x\in W\cap\partial G$, $\frac{\nabla F}{|\nabla F|}(x)$ is the outward pointing unit normal to $\partial G$ at $x$.

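A key role will be played by the following application of the ideas in §5.2.2.

Lemma 5.3.6. Assume that $G$ is a bounded, smooth region in $\mathbb{R}^N$. Then there exist a $\rho>0$ and twice continuously differentiable maps $n:(\partial G)^{(\rho)}\longrightarrow S^{N-1}$, $p:(\partial G)^{(\rho)}\longrightarrow\partial G$, and $\xi:(\partial G)^{(\rho)}\longrightarrow(-\rho,\rho)$ such that, for each $x\in(\partial G)^{(\rho)}$, $p(x)$ is the one and only $p\in\partial G$ satisfying $|x-p|=|x-\partial G|$, $n(x)=n\bigl(p(x)\bigr)\perp T_{p(x)}\partial G$, $\xi(x)<0\iff x\in G\cap(\partial G)^{(\rho)}$, and $x=p(x)+\xi(x)n(x)$.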

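Proof: Given $q\in\partial G$, choose an open $W\ni q$ and an $F_q\in C^3(W;\mathbb{R})$ such that $|\nabla F_q|>0$ and $G\cap W=\{F_q<0\}$. By Lemma 5.2.11, there exists an $r_q>0$ with $B_{\mathbb{R}^N}(q,3r_q)\subset\subset W$ and twice continuously differentiable maps (cf. (5.2.8))

\[
p_q:\bigl(\partial G\cap B_{\mathbb{R}^N}(q,3r_q)\bigr)^{(3r_q)}\longrightarrow\partial G\cap B_{\mathbb{R}^N}(q,3r_q)
\quad\text{and}\quad
\xi_q:\bigl(\partial G\cap B_{\mathbb{R}^N}(q,3r_q)\bigr)^{(3r_q)}\longrightarrow(-3r_q,3r_q)
\]

such that, for each $x\in\bigl(\partial G\cap B_{\mathbb{R}^N}(q,3r_q)\bigr)^{(3r_q)}$, $p_q(x)+\xi_q(x)\frac{\nabla F_q}{|\nabla F_q|}\bigl(p_q(x)\bigr)$ is the one and only way to write $x$ in the form $p+\xi\frac{\nabla F_q}{|\nabla F_q|}(p)$ with $p\in\partial G\cap B_{\mathbb{R}^N}(q,3r_q)$ and $|\xi|<3r_q$. Now choose $q_1,\dots,q_M\in\partial G$ such that $\partial G\subseteq\bigcup_{m=1}^M B_{\mathbb{R}^N}(q_m,r_m)$, where $r_m\equiv r_{q_m}$, and set $\rho=r_1\wedge\cdots\wedge r_M$. If $x\in\bigl(\partial G\cap B_{\mathbb{R}^N}(q_m,r_m)\bigr)^{(\rho)}$, then, for any $p\in\partial G$ satisfying $|x-p|<\rho$,

\[
|p-q_m|\le|p-x|+|x-q_m|<3\rho\le 3r_m.
\]

Hence $p_{q_m}(x)$ is the one and only $p\in\partial G$ for which $|p-x|<\rho$ and $x-p\perp T_p(\partial G)$, and so we can define $x\rightsquigarrow p(x)$, $x\rightsquigarrow\xi(x)$, and $x\rightsquigarrow n(x)$ on $(\partial G)^{(\rho)}$ by taking $p(x)=p_{q_m}(x)$, $\xi(x)=\xi_{q_m}(x)$, and $n(x)=\frac{\nabla F_{q_m}}{|\nabla F_{q_m}|}\bigl(p_{q_m}(x)\bigr)$ when $x\in\bigl(\partial G\cap B_{\mathbb{R}^N}(q_m,r_m)\bigr)^{(\rho)}$. Finally, to check that $|x-p(x)|=|x-\partial G|$, suppose that $p\in\partial G$ and that $|p-x|=|x-\partial G|$. Then, for any $\gamma:(-\epsilon,\epsilon)\longrightarrow\partial G$ with $\gamma(0)=p$,

\[
2\bigl(x-p,\dot\gamma(0)\bigr)_{\mathbb{R}^N}=\frac{d}{dt}|x-\gamma(t)|^2\Big|_{t=0}=0,
\]

and so $x-p\perp T_p(\partial G)$, which is possible only if $p=p(x)$.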

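Returning to the problem posed earlier, begin with the observation that, because $|\Phi(t,x)-x|\le\|V\|_{\mathrm u}t$, $x\in G$ and $\Phi(t,x)\notin G$ only if $|x-\partial G|\le\|V\|_{\mathrm u}t$.

Thus if $\rho$ is taken as in Lemma 5.3.6 and $T>0$ is chosen such that $\|V\|_{\mathrm u}T<\rho$, then, in the notation of that lemma, we know that, for each $t\in[0,T]$, $x\in G$ and $\Phi(t,x)\notin G$ if and only if

\[
x\in G\cap(\partial G)^{(\|V\|_{\mathrm u}t)}
\quad\text{and}\quad
\Bigl(n\bigl(\Phi(t,x)\bigr),\Phi(t,x)-p\bigl(\Phi(t,x)\bigr)\Bigr)_{\mathbb{R}^N}\ge 0.
\]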

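Next, write

\[
\Bigl(n\bigl(\Phi(t,x)\bigr),\Phi(t,x)-p\bigl(\Phi(t,x)\bigr)\Bigr)_{\mathbb{R}^N}
=\Bigl(n\bigl(\Phi(t,x)\bigr),\Phi(t,x)-x\Bigr)_{\mathbb{R}^N}
+\Bigl(n\bigl(\Phi(t,x)\bigr),x-p(x)\Bigr)_{\mathbb{R}^N}
+\Bigl(n\bigl(\Phi(t,x)\bigr),p(x)-p\bigl(\Phi(t,x)\bigr)\Bigr)_{\mathbb{R}^N}.
\]

Because $|x-p(x)|\vee|\Phi(t,x)-x|\le\|V\|_{\mathrm u}t$, we have that

\[
\Bigl(n\bigl(\Phi(t,x)\bigr),\Phi(t,x)-x\Bigr)_{\mathbb{R}^N}
=t\Bigl(n\bigl(\Phi(t,x)\bigr),V(x)\Bigr)_{\mathbb{R}^N}+O(t^2)
=t\Bigl(n\bigl(p(x)\bigr),V\bigl(p(x)\bigr)\Bigr)_{\mathbb{R}^N}+O(t^2)
\]

and

\[
\Bigl(n\bigl(\Phi(t,x)\bigr),x-p(x)\Bigr)_{\mathbb{R}^N}
=\Bigl(n\bigl(p(x)\bigr),x-p(x)\Bigr)_{\mathbb{R}^N}+O(t^2).
\]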

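At the same time, because $p\bigl(\Phi(t,x)\bigr)\in\partial G$ for all $t\in[0,T]$ and therefore $\frac{d}{dt}p\bigl(\Phi(t,x)\bigr)\big|_{t=0}\in T_{p(x)}(\partial G)$,

\[
\Bigl(n\bigl(\Phi(t,x)\bigr),p(x)-p\bigl(\Phi(t,x)\bigr)\Bigr)_{\mathbb{R}^N}=O(t^2).
\]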

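Thus, we now know that, for $t\in[0,T]$, $x\in G$ and $\Phi(t,x)\notin G$ if and only if

\[
0<-\Bigl(n\bigl(p(x)\bigr),x-p(x)\Bigr)_{\mathbb{R}^N}
\le t\Bigl(n\bigl(p(x)\bigr),V\bigl(p(x)\bigr)\Bigr)_{\mathbb{R}^N}+E(t,x),
\]

where $|E(t,x)|\le Ct^2$ for some $C<\infty$.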

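Now let $(U,\Psi)$ be a coordinate chart for $\partial G$, and define the associated diffeomorphism $\tilde\Psi$ on the open set $\tilde U\subseteq\mathbb{R}^N$ as in Lemma 5.2.11. Given $(u,t)\in U\times[0,T]$, set

\[
I(t,u)=\Bigl\{\xi:\ -t\Bigl(n\bigl(\Psi(u)\bigr),V\bigl(\Psi(u)\bigr)\Bigr)_{\mathbb{R}^N}-E\bigl(t,\tilde\Psi(u,\xi)\bigr)<\xi<0\Bigr\}.
\]

Then for sufficiently small $t>0$, the preceding says that $\tilde\Psi(u,\xi)\in G\cap\tilde\Psi(\tilde U)$ and $\Phi\bigl(t,\tilde\Psi(u,\xi)\bigr)\notin G$ if and only if $u\in U$ and $\xi\in I(t,u)$. Hence, if $\Gamma\in\mathcal{B}_{\partial G}$ with $\Gamma\subseteq\Psi(U)$ and $\tilde\Gamma=\{x\in(\partial G)^{(\rho)}:\,p(x)\in\Gamma\}$, then, by Theorems 5.2.2 and 4.1.6 and (5.2.13),

\[
\int_{G\cap\tilde\Gamma}\mathbf{1}_{G^\complement}\bigl(\Phi(t,x)\bigr)\,dx
=\int_{\Psi^{-1}(\Gamma)}\left(\int_{I(t,u)}J\tilde\Psi(u,\xi)\,d\xi\right)du
=t\int_{\Psi^{-1}(\Gamma)}\Bigl(n\bigl(\Psi(u)\bigr),V\bigl(\Psi(u)\bigr)\Bigr)^{+}_{\mathbb{R}^N}J\Psi(u)\,du+O(t^2)
\]

for sufficiently small $t\in(0,T]$. After an obvious covering argument, this leads (cf. (5.2.17)) to the conclusion that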

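\[
\lim_{t\searrow 0}\frac1t\int_G\mathbf{1}_{G^\complement}\bigl(\Phi(t,x)\bigr)\,dx
=\int_{\partial G}\bigl(n(x),V(x)\bigr)^{+}_{\mathbb{R}^N}\,\lambda_{\partial G}(dx).
\]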
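The limit just obtained can be checked by hand in one very simple case. The following is a minimal sketch under assumptions that are not in the text: $G$ is taken to be the unit disk in $\mathbb{R}^2$ and $V$ the constant field $(1,0)$, so that $\Phi(t,x)=x+tV$ and the mass leaving $G$ by time $t$ is the area of the disk minus the area of the lens shared by two unit disks whose centers are a distance $t$ apart. The function names are hypothetical.

```python
import math


def mass_leaving_unit_disk(t: float) -> float:
    """Area of {x in G : x + t*e1 not in G} for G the unit disk and 0 <= t <= 2."""
    # Area of the intersection of two unit disks whose centers are t apart.
    lens = 2.0 * math.acos(t / 2.0) - (t / 2.0) * math.sqrt(4.0 - t * t)
    return math.pi - lens


def outward_flux_positive_part(steps: int = 100_000) -> float:
    """Midpoint-rule value of the integral over the unit circle of (n, V)^+ = max(cos(theta), 0)."""
    h = 2.0 * math.pi / steps
    return h * sum(max(math.cos((k + 0.5) * h), 0.0) for k in range(steps))


if __name__ == "__main__":
    for t in (0.1, 0.01, 0.001):
        print(t, mass_leaving_unit_disk(t) / t)
    print(outward_flux_positive_part())
```

As $t$ decreases, the printed difference quotients should approach the value of the boundary integral, which is what the displayed limit asserts for this choice of $G$ and $V$.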

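By essentially the same argument, we also have that

\[
\lim_{t\searrow 0}\frac1t\int_{G^\complement}\mathbf{1}_G\bigl(\Phi(t,x)\bigr)\,dx
=\int_{\partial G}\bigl(n(x),V(x)\bigr)^{-}_{\mathbb{R}^N}\,\lambda_{\partial G}(dx)
\]

and therefore, by $(\ast)$, that

\[
\lim_{t\searrow 0}\frac1t\left(\int\mathbf{1}_G(x)\,dx-\int\mathbf{1}_G\bigl(\Phi(t,x)\bigr)\,dx\right)
=\int_{\partial G}\bigl(n(x),V(x)\bigr)_{\mathbb{R}^N}\,\lambda_{\partial G}(dx).
\]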

By combining the preceding calculation with the earlier one, the one that was based on (5.3.5), we arrive at the following statement.

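Theorem 5.3.7 (Divergence Theorem). Let $G$ be a bounded, smooth region in $\mathbb{R}^N$ and $V:\mathbb{R}^N\longrightarrow\mathbb{R}^N$ a twice continuously differentiable vector field with uniformly bounded first derivative. Then

\[
\int_G\operatorname{div}(V)(x)\,dx=\int_{\partial G}\bigl(n(x),V(x)\bigr)_{\mathbb{R}^N}\,\lambda_{\partial G}(dx),
\]

where $n(x)$ is the outward pointing unit normal to $\partial G$ at $x\in\partial G$.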
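A quick numerical sanity check of Theorem 5.3.7 can be made as follows. This is only a sketch under assumptions chosen for illustration and not taken from the text: $G$ is the unit disk in $\mathbb{R}^2$, and $V(x,y)=(\sin x+\cos y,\ \sin(x+y))$, a field whose first derivatives are uniformly bounded. Both sides are approximated with midpoint rules in polar coordinates.

```python
import math


def V(x: float, y: float) -> tuple[float, float]:
    """Illustrative vector field with uniformly bounded first derivatives."""
    return (math.sin(x) + math.cos(y), math.sin(x + y))


def div_V(x: float, y: float) -> float:
    """div V = d/dx (sin x + cos y) + d/dy sin(x + y)."""
    return math.cos(x) + math.cos(x + y)


def volume_side(nr: int = 400, ntheta: int = 200) -> float:
    """Midpoint rule in polar coordinates for the integral of div V over the unit disk."""
    dr, dth = 1.0 / nr, 2.0 * math.pi / ntheta
    total = 0.0
    for i in range(nr):
        r = (i + 0.5) * dr
        for j in range(ntheta):
            th = (j + 0.5) * dth
            total += div_V(r * math.cos(th), r * math.sin(th)) * r
    return total * dr * dth


def boundary_side(ntheta: int = 2000) -> float:
    """Midpoint rule for the integral of (n, V) over the unit circle, where n = (cos, sin)."""
    dth = 2.0 * math.pi / ntheta
    total = 0.0
    for j in range(ntheta):
        c, s = math.cos((j + 0.5) * dth), math.sin((j + 0.5) * dth)
        vx, vy = V(c, s)
        total += c * vx + s * vy
    return total * dth


if __name__ == "__main__":
    print(volume_side(), boundary_side())
```

The two printed numbers should agree up to the quadrature error of the radial midpoint rule.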


There are so many applications of Theorem 5.3.7 that it is hard to choose among them. However, here is one that is particularly useful. In its statement, $L_V$ is the directional derivative operator $\sum_{i=1}^N V_i\partial_{x_i}$ determined by $V$, and $L_V^\top$ is the corresponding formal adjoint operator given by

\[
L_V^\top f=-\sum_{i=1}^N\partial_{x_i}(fV_i)=-L_Vf-f\operatorname{div}(V).
\]

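Corollary 5.3.8. Referring to the preceding, one has

\[
\int_G fL_Vg\,d\lambda_{\mathbb{R}^N}=\int_G gL_V^\top f\,d\lambda_{\mathbb{R}^N}+\int_{\partial G}fg\,(n,V)_{\mathbb{R}^N}\,d\lambda_{\partial G}
\]

for all $f,g\in C_b^2(\mathbb{R}^N;\mathbb{R})$.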

Proof: Simply observe that $\operatorname{div}(fgV)=gL_Vf-fL_V^\top g$, and apply Theorem 5.3.7 to the vector field $fgV$.
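The pointwise identity used in this proof is easy to test numerically. The sketch below does so with central finite differences in $\mathbb{R}^2$; the particular $f$, $g$, and $V$, the step size `H`, and all helper names are assumptions made for the illustration, not part of the text.

```python
import math

H = 1e-5  # finite-difference step (an arbitrary illustrative choice)


def partial(fn, i, x):
    """Central-difference approximation of the i-th partial derivative of fn at x."""
    xp, xm = list(x), list(x)
    xp[i] += H
    xm[i] -= H
    return (fn(xp) - fn(xm)) / (2.0 * H)


def V(x):
    return (math.sin(x[1]) + 2.0, math.cos(x[0]))


def f(x):
    return math.sin(x[0]) * math.cos(x[1])


def g(x):
    return 1.0 / (1.0 + x[0] ** 2 + x[1] ** 2)


def L(h, x):
    """L_V h = sum_i V_i * d_i h."""
    v = V(x)
    return sum(v[i] * partial(h, i, x) for i in range(2))


def L_adj(h, x):
    """L_V^T h = -sum_i d_i (h * V_i)."""
    return -sum(partial(lambda y, i=i: h(y) * V(y)[i], i, x) for i in range(2))


def div_fgV(x):
    """div(f g V) = sum_i d_i (f * g * V_i)."""
    return sum(partial(lambda y, i=i: f(y) * g(y) * V(y)[i], i, x) for i in range(2))


if __name__ == "__main__":
    x = [0.3, -0.7]
    print(div_fgV(x), g(x) * L(f, x) - f(x) * L_adj(g, x))
```

Up to discretization error the two printed values coincide, in line with $\operatorname{div}(fgV)=gL_Vf-fL_V^\top g$.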

Exercises for §5.3

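Exercise 5.3.9. Let $G$ and $V$ be as in Theorem 5.3.7. Choose $\rho>0$, $p:(\partial G)^{(\rho)}\longrightarrow\partial G$, $n:(\partial G)^{(\rho)}\longrightarrow S^{N-1}$, and $\xi:(\partial G)^{(\rho)}\longrightarrow(-\rho,\rho)$ as in Lemma 5.3.6; and assume that $\bigl(V(p),n(p)\bigr)_{\mathbb{R}^N}=0$ for all $p\in\partial G$.

(i) Show that $\nabla\xi=n$ on $(\partial G)^{(\rho)}$.

Hint: From $x=p(x)+\xi(x)n(x)$, show that $e_i=\partial_{x_i}p+(\partial_{x_i}\xi)n+\xi\,\partial_{x_i}n$, where $(e_i)_j=\delta_{i,j}$. Show that $\partial_{x_i}p(x)\in T_{p(x)}(\partial G)$ and that $\bigl(\partial_{x_i}n,n\bigr)_{\mathbb{R}^N}=\tfrac12\partial_{x_i}|n(x)|^2=0$, and conclude that $n(x)_i=\partial_{x_i}\xi$.

(ii) Show that there is a $C<\infty$ such that

\[
\Bigl|\bigl(V(x),n(p(x))\bigr)_{\mathbb{R}^N}\Bigr|\le C|\xi(x)|
\]

for all $x\in(\partial G)^{(\rho)}$.

(iii) Show that there is a $T>0$ such that $\Phi(t,p)\in(\partial G)^{(\rho)}$ for all $p\in\partial G$ and $|t|\le T$. Next, set $u(t,p)=\xi\circ\Phi(t,p)$ for $|t|\le T$, and show that there exists a $C<\infty$ such that $|\dot u(t,p)|\le C|u(t,p)|$ and therefore that $\Phi(t,p)\in\partial G$ for all $p\in\partial G$ and $|t|\le T$.

Hint: Use induction on $n\ge 0$ to show that $|u(t,p)|\le\frac{\rho(Ct)^n}{n!}$ for $|t|\le T$.

(iv) Show that $\Phi(t,x)\in\partial G$ for all $t\in\mathbb{R}$ if $\Phi(s,x)\in\partial G$ for some $s\in\mathbb{R}$, and use this to conclude that, depending on whether $x\in G$ or $x\notin\bar G$, $\Phi(t,x)\in G$ or $\Phi(t,x)\notin\bar G$ for all $t\ge 0$.

Hint: Use (iii) and the flow property $\Phi(s+t,p)=\Phi\bigl(t,\Phi(s,p)\bigr)$.

(v) Under the additional assumption that $\operatorname{div}(V)=0$ on $G$, show that $\bigl(\Phi(t,\cdot)\upharpoonright G\bigr)_*\lambda_G=\lambda_G$ for all $t\in\mathbb{R}$, where $\lambda_G=\lambda_{\mathbb{R}^N}\upharpoonright\mathcal{B}_G$.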

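Exercise 5.3.10. Use $\Delta=\sum_{i=1}^N\partial_{x_i}^2$ to denote the Euclidean Laplacian on $\mathbb{R}^N$. Given a pair of functions $u,v\in C_b^2(\mathbb{R}^N;\mathbb{R})$ and a bounded smooth region $G$ in $\mathbb{R}^N$, prove Green's formula:

\[
\int_G\bigl(u\Delta v-v\Delta u\bigr)\,d\lambda_{\mathbb{R}^N}
=\int_{\partial G}\Bigl(u\,(n,\nabla v)_{\mathbb{R}^N}-v\,(n,\nabla u)_{\mathbb{R}^N}\Bigr)\,d\lambda_{\partial G},
\]

where $n$ denotes the outward pointing unit normal. In particular,

\[
\int_G\Delta u\,d\lambda_{\mathbb{R}^N}=\int_{\partial G}(n,\nabla u)_{\mathbb{R}^N}\,d\lambda_{\partial G}.
\]

Hint: Note that $u\Delta v-v\Delta u=\operatorname{div}\bigl(u\nabla v-v\nabla u\bigr)$.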

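Exercise 5.3.11. Take $N=2$, and assume that $\partial G$ is a closed, simple curve in the sense that there is a $\gamma\in C^2\bigl([0,1];\mathbb{R}^2\bigr)$ with the properties that $t\in[0,1)\longmapsto\gamma(t)\in\partial G$ is an injective surjection,

\[
\gamma(0)=\gamma(1),\quad\dot\gamma(0)=\dot\gamma(1),\quad\ddot\gamma(0)=\ddot\gamma(1),\quad\text{and}\quad|\dot\gamma(t)|>0\ \text{for } t\in[0,1].
\]

(i) Show that

\[
\int_{\partial G}\varphi\,d\lambda_{\partial G}=\int_{[0,1]}\varphi\circ\gamma(t)\,|\dot\gamma(t)|\,dt
\]

for all bounded measurable $\varphi$ on $\partial G$.

(ii) Let $n(t)$ denote the outer unit normal to $G$ at $\gamma(t)$, check that $n(t)=\pm|\dot\gamma(t)|^{-1}\bigl(\dot\gamma_2(t),-\dot\gamma_1(t)\bigr)$ with the same sign for all $t\in[0,1)$, and assume that $\gamma$ has been parametrized so that the plus sign is the correct one. Next, suppose that $u,v\in C_b^2\bigl(\mathbb{R}^2;\mathbb{R}\bigr)$, and set $f=u+iv$. If $\partial_{\bar z}\equiv\tfrac12(\partial_x+i\partial_y)$, show that

\[
2i\int_G\partial_{\bar z}f\,d\lambda_{\mathbb{R}^2}=\int_0^1 f\bigl(\gamma(t)\bigr)\,dz(t),
\]

where $z(t)\equiv\gamma_1(t)+i\gamma_2(t)$ and $dz(t)=\dot z(t)\,dt$. When $f$ is analytic in $G$, the Cauchy–Riemann equations imply that $\partial_{\bar z}f=0$ there, and the preceding becomes the renowned Cauchy Integral Theorem. See Exercise 5.3.14 for a continuation of this exercise.

Hint: Check that $2\partial_{\bar z}f=\operatorname{div}(V)+i\operatorname{div}(W)$, where

\[
V=\begin{pmatrix}u\\-v\end{pmatrix}\quad\text{and}\quad W=\begin{pmatrix}v\\u\end{pmatrix}.
\]

Now apply the Divergence Theorem, and check that $(V,n)_{\mathbb{R}^2}+i(W,n)_{\mathbb{R}^2}=-if\dot z$.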


Exercise 5.3.12. Suppose that $u\in C_b^2\bigl(\mathbb{R}^N;\mathbb{R}\bigr)$ and that $\Delta u=0$ on $B_{\mathbb{R}^N}(x,R)$.

The goal of this exercise is to prove the mean-value property

\[
(5.3.13)\qquad u(x)=\frac{1}{\omega_{N-1}}\int_{S^{N-1}}u(x+R\omega)\,\lambda_{S^{N-1}}(d\omega),
\]

and the first step is to show that it suffices to handle the case in which $x$ is the origin. Second, observe that there is hardly anything to do when $N=1$, since in that case there exist $a,b\in\mathbb{R}$ for which $u(x)=ax+b$ for $|x|\le R$, and so $u(0)=\frac{u(-R)+u(R)}{2}$. Thus, assume that $N\ge 2$, and define

\[
g_N(r)=\begin{cases}\log\frac1r&\text{if } N=2\\[2pt](N-2)^{-1}r^{2-N}&\text{if } N\ge 3\end{cases}
\]

for $r>0$, and, for $\epsilon>0$, set

\[
v_\epsilon(x)=g_N\bigl(\sqrt{\epsilon^2+|x|^2}\bigr)-g_N\bigl(\sqrt{\epsilon^2+R^2}\bigr).
\]

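(i) Given $0<r<R$, set $G_r=B_{\mathbb{R}^N}(0,R)\setminus\overline{B_{\mathbb{R}^N}(0,r)}$, and show that

\[
\begin{aligned}
\int_{\partial G_r}\Bigl((n,\nabla v_\epsilon)_{\mathbb{R}^N}u-(n,\nabla u)_{\mathbb{R}^N}v_\epsilon\Bigr)\,d\lambda_{\partial G_r}
&=\left(\frac{r}{\sqrt{\epsilon^2+r^2}}\right)^{N}\int_{S^{N-1}}u(r\omega)\,\lambda_{S^{N-1}}(d\omega)\\
&\quad-\left(\frac{R}{\sqrt{\epsilon^2+R^2}}\right)^{N}\int_{S^{N-1}}u(R\omega)\,\lambda_{S^{N-1}}(d\omega)\\
&\quad+r^{N-1}A_\epsilon(r,R)\int_{S^{N-1}}\bigl(\omega,\nabla u(r\omega)\bigr)_{\mathbb{R}^N}\,\lambda_{S^{N-1}}(d\omega),
\end{aligned}
\]

where $A_\epsilon(r,R)\equiv g_N\bigl(\sqrt{\epsilon^2+R^2}\bigr)-g_N\bigl(\sqrt{\epsilon^2+r^2}\bigr)$.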

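(ii) Using the fact that $\Delta w(x)=\varphi''(|x|)+(N-1)\frac{\varphi'(|x|)}{|x|}$ if $w(x)=\varphi(|x|)$, show that

\[
\Delta v_\epsilon(x)=-\frac{N\epsilon^2}{(\epsilon^2+|x|^2)^{1+\frac N2}},
\]

and combine this with (cf. Exercise 5.3.10) Green's formula and (i) to conclude that

\[
\begin{aligned}
N\epsilon^2\int_{G_r}\frac{u(x)}{(\epsilon^2+|x|^2)^{1+\frac N2}}\,\lambda_{\mathbb{R}^N}(dx)
&=\left(\frac{r}{\sqrt{\epsilon^2+r^2}}\right)^{N}\int_{S^{N-1}}u(r\omega)\,\lambda_{S^{N-1}}(d\omega)\\
&\quad-\left(\frac{R}{\sqrt{\epsilon^2+R^2}}\right)^{N}\int_{S^{N-1}}u(R\omega)\,\lambda_{S^{N-1}}(d\omega)\\
&\quad+r^{N-1}A_\epsilon(r,R)\int_{S^{N-1}}\bigl(\omega,\nabla u(r\omega)\bigr)_{\mathbb{R}^N}\,\lambda_{S^{N-1}}(d\omega).
\end{aligned}
\]

Now let $\epsilon\searrow 0$ and then $r\searrow 0$ to arrive at (5.3.13).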
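The mean-value property (5.3.13) is easy to observe numerically when $N=2$, where $\omega_1=2\pi$. The following sketch uses a harmonic polynomial chosen purely for illustration; it is not bounded, so it does not literally belong to $C_b^2$, but the property is local and only the values of $u$ on the ball matter. The center, radius, and function names are likewise assumptions.

```python
import math


def u(x: float, y: float) -> float:
    """Harmonic polynomial: u_xx + u_yy = 2 - 2 = 0."""
    return x * x - y * y + 3.0 * x - 2.0 * y + 1.0


def circle_average(fn, cx: float, cy: float, radius: float, steps: int = 360) -> float:
    """(1 / omega_1) * integral of fn(c + R*omega) over S^1, by the midpoint rule in the angle."""
    h = 2.0 * math.pi / steps
    total = sum(
        fn(cx + radius * math.cos((k + 0.5) * h), cy + radius * math.sin((k + 0.5) * h))
        for k in range(steps)
    )
    return total * h / (2.0 * math.pi)


if __name__ == "__main__":
    print(circle_average(u, 0.4, -1.2, 2.5), u(0.4, -1.2))
```

The average over the circle and the value at the center agree to rounding error, since the midpoint rule in the angle integrates low-degree trigonometric polynomials exactly.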

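Exercise 5.3.14. Refer to the setting in Exercise 5.3.11, especially part (ii), and use $f(z)$ to denote $f(x,y)$ when $z=x+iy$. The goal of this exercise is to show that

\[
2i\int_G\frac{\partial_{\bar z}f(z)}{z-\zeta}\,d\lambda_{\mathbb{R}^2}=\int_0^1\frac{f\bigl(z(t)\bigr)}{z(t)-\zeta}\,dz(t)-2\pi i f(\zeta)\quad\text{for }\zeta\in G.
\]

In particular, when $f$ is analytic in $G$, this proves the Cauchy Integral Formula.

(i) First reduce to the case $0\in G$ and $\zeta=0$. Second, show that $\partial_{\bar z}\frac{f(z)}{z}=\frac{\partial_{\bar z}f(z)}{z}$ for $z\ne 0$.

(ii) Define

\[
\eta(z)=\begin{cases}1&\text{if } |z|>1\\[2pt]\bigl(1-16(1-|z|)^4\bigr)^4&\text{if } \tfrac12<|z|\le 1\\[2pt]0&\text{if } |z|\le\tfrac12,\end{cases}
\]

and check that $\eta\in C_b^2\bigl(\mathbb{R}^2;[0,1]\bigr)$.

(iii) Given $r>0$ with $B(0,r)\subset\subset G$, set $f_r(z)=\eta(r^{-1}z)f(z)$, and apply parts (i) here and (ii) of Exercise 5.3.11 to $f_r$ to see that

\[
2i\int_{G\setminus B(0,r)}\frac{\partial_{\bar z}f(z)}{z}\,d\lambda_{\mathbb{R}^2}=\int_{\partial G}\frac{f\bigl(z(t)\bigr)}{z(t)}\,dz(t)-2\pi i\int_0^1 f\bigl(re^{i2\pi t}\bigr)\,dt.
\]

Finally, let $r\searrow 0$.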

Chapter 6 Basic Inequalities and Lebesgue Spaces

I have already introduced (cf. §§3.1.2 and 3.2.3) the vector space $L^1(\mu;\mathbb{R})$ with the norm¹ $\|\cdot\|_{L^1(\mu;\mathbb{R})}$ and shown it to be a Banach space: that is, a normed vector space that is complete with respect to the metric determined by its norm. Although, from the measure-theoretic point of view, $L^1(\mu;\mathbb{R})$ is an obvious space with which to deal, from a geometric standpoint, it is flawed. To understand its flaw, consider the two point space $E=\{1,2\}$ and the measure $\mu$ that assigns measure 1 to both points. Then $L^1(\mu;\mathbb{R})$ is easily identified with $\mathbb{R}^2$, and the length that $\|\cdot\|_{L^1(\mu;\mathbb{R})}$ assigns $x=(x_1,x_2)\in\mathbb{R}^2$ is $|x_1|+|x_2|$. Hence, the unit ball in this space is the equilateral diamond whose center is the origin and whose vertices lie on the coordinate axes, and, as such, its boundary has nasty corners. For this reason, it is reasonable to ask whether there are measure-theoretically natural Banach spaces that have better geometric properties.

In the first part of this chapter I will develop a few inequalities that will allow me in the second part to introduce the sort of Banach spaces alluded to in the preceding. Once I have done so, I will conclude with a cursory presentation of results about the boundedness properties of linear maps between these spaces.

§6.1 Jensen, Minkowski, and Hölder

In this section I will derive some inequalities that generalize the inequalities, like the triangle inequality, which are familiar in the Euclidean context.

Since all the inequalities here are consequences of convexity considerations, I will begin by reviewing a few elementary facts about convex sets and concave functions on them. Let $V$ be a real or complex vector space. A subset $C\subseteq V$ is said to be convex if $(1-\alpha)x+\alpha y\in C$ whenever $x,y\in C$ and $\alpha\in[0,1]$.

Given a convex set $C\subseteq V$, $g:C\longrightarrow\mathbb{R}$ is said to be a concave function on $C$ if

\[
g\bigl((1-\alpha)x+\alpha y\bigr)\ge(1-\alpha)g(x)+\alpha g(y)\quad\text{for all } x,y\in C\text{ and }\alpha\in[0,1].
\]

¹ Given a vector space $V$, a norm $\|\cdot\|$ on $V$ is a non-negative map with the properties that $\|v\|=0$ if and only if $v=0$, $\|\alpha v\|=|\alpha|\|v\|$ for all $\alpha\in\mathbb{R}$ and $v\in V$, and $\|v+w\|\le\|v\|+\|w\|$ for all $v,w\in V$. The metric on $V$ determined by the norm $\|\cdot\|$ is the one for which $\|w-v\|$ gives the distance between $v$ and $w$.

D.W. Stroock, Essentials of Integration Theory for Analysis, Graduate Texts in Mathematics 262, DOI 10.1007/978-1-4614-1135-2_6, © Springer Science+Business Media, LLC 2011

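Note that $g$ is concave on $C$ if and only if $\{(x,t)\in C\times\mathbb{R}:\,t\le g(x)\}$ is a convex subset of $V\oplus\mathbb{R}$. In addition, one can use induction on $n\ge 2$ to see that

\[
\sum_{k=1}^n\alpha_ky_k\in C\quad\text{and}\quad g\left(\sum_{k=1}^n\alpha_ky_k\right)\ge\sum_{k=1}^n\alpha_kg(y_k)
\]

for all $n\ge 2$, $\{y_1,\dots,y_n\}\subseteq C$ and $\{\alpha_1,\dots,\alpha_n\}\subseteq[0,1]$ with $\sum_{k=1}^n\alpha_k=1$.

Namely, if $n=2$ or $\alpha_n\in\{0,1\}$, then there is nothing to do. On the other hand, if $n\ge 3$ and $\alpha_n\in(0,1)$, set $x=(1-\alpha_n)^{-1}\sum_{k=1}^{n-1}\alpha_ky_k$, and, assuming the result for $n-1$, conclude that

\[
g\left(\sum_{k=1}^n\alpha_ky_k\right)=g\bigl((1-\alpha_n)x+\alpha_ny_n\bigr)
\ge(1-\alpha_n)\,g\left(\sum_{k=1}^{n-1}\alpha_k(1-\alpha_n)^{-1}y_k\right)+\alpha_ng(y_n)
\ge\sum_{k=1}^n\alpha_kg(y_k).
\]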

The essence of the relationship between these notions and measure theory is contained in the following.

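Theorem 6.1.1 (Jensen's inequality). Let $C$ be a closed, convex subset of $\mathbb{R}^N$, and suppose that $g$ is a continuous, concave, non-negative function on $C$. If $(E,\mathcal{B},\mu)$ is a probability space and $F:E\longrightarrow C$ a measurable function on $(E,\mathcal{B})$ with the property that $|F|\in L^1(\mu;\mathbb{R})$, then

\[
\int_E F\,d\mu\equiv\begin{pmatrix}\int_E F_1\,d\mu\\\vdots\\\int_E F_N\,d\mu\end{pmatrix}\in C
\quad\text{and}\quad
\int_E g\circ F\,d\mu\le g\left(\int_E F\,d\mu\right).
\]

(See Exercise 6.1.9 for another derivation.)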
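To see the inequality in the simplest setting, here is a minimal sketch on a finite probability space, where integrals reduce to weighted sums. The data, the choice $g=\sqrt{\cdot}$ on $C=[0,\infty)$, and the helper names are all illustrative assumptions rather than anything taken from the text.

```python
import math


def integrate(values, weights):
    """Integral of a function on a finite space: sum of value times mass."""
    return sum(v * w for v, w in zip(values, weights))


def jensen_sides(F, mu, g):
    """Return (integral of g∘F, g(integral of F)) for a probability vector mu."""
    assert abs(sum(mu) - 1.0) < 1e-12, "mu must be a probability measure"
    return integrate([g(v) for v in F], mu), g(integrate(F, mu))


if __name__ == "__main__":
    F = [0.0, 1.0, 4.0, 9.0]   # values of F at the four points of E
    mu = [0.1, 0.2, 0.3, 0.4]  # masses of those points
    lhs, rhs = jensen_sides(F, mu, math.sqrt)
    print(lhs, "<=", rhs)
```

Because the square root is concave, the first number never exceeds the second, whatever probability vector is supplied.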

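Proof: First assume that $F$ is simple. Then $F=\sum_{k=0}^n y_k\mathbf{1}_{\Gamma_k}$ for some $n\in\mathbb{Z}^+$, $y_0,\dots,y_n\in C$, and cover $\{\Gamma_0,\dots,\Gamma_n\}$ of $E$ by mutually disjoint elements of $\mathcal{B}$. Thus, since $\sum_{k=0}^n\mu(\Gamma_k)=1$ and $C$ is convex, $\int_E F\,d\mu=\sum_{k=0}^n y_k\mu(\Gamma_k)\in C$ and, because $g$ is concave and $\sum_{k=0}^n\mu(\Gamma_k)=1$,

\[
g\left(\int_E F\,d\mu\right)=g\left(\sum_{k=0}^n y_k\mu(\Gamma_k)\right)\ge\sum_{k=0}^n g(y_k)\mu(\Gamma_k)=\int_E g\circ F\,d\mu.
\]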

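Now let $F$ be general. The idea is to approximate $F$ by $C$-valued simple functions. For this purpose, choose and fix some element $y_0$ of $C$, and let $\{y_k:\,k\ge 1\}$ be a dense sequence in $C$. Given $m\in\mathbb{Z}^+$, choose $R_m>0$ and $n_m\in\mathbb{Z}^+$ for which

\[
\int_{\{|F|\ge R_m\}}\bigl(|F|+|y_0|\bigr)\,d\mu\le\frac1m
\quad\text{and}\quad
C\cap B(0,R_m)\subseteq\bigcup_{k=1}^{n_m}B\bigl(y_k,\tfrac1m\bigr).
\]

Next, set $\Gamma_{m,0}=\{\xi\in E:\,|F(\xi)|\ge R_m\}$, and use induction to define

\[
\Gamma_{m,\ell}=\left\{\xi\in E\setminus\bigcup_{k=0}^{\ell-1}\Gamma_{m,k}:\ F(\xi)\in B\bigl(y_\ell,\tfrac1m\bigr)\right\}
\]

for $1\le\ell\le n_m$. Finally, set $F_m=\sum_{k=0}^{n_m}y_k\mathbf{1}_{\Gamma_{m,k}}$.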

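By construction, the $F_m$'s are simple and $C$-valued. Hence, by the preceding,

\[
\int_E F_m\,d\mu\in C\quad\text{and}\quad g\left(\int_E F_m\,d\mu\right)\ge\int_E g\circ F_m\,d\mu
\]

for each $m\in\mathbb{Z}^+$. Moreover, since $|F-F_m|\le\frac1m$ on $\bigcup_{\ell=1}^{n_m}\Gamma_{m,\ell}=E\setminus\Gamma_{m,0}$,

\[
\int_E|F-F_m|\,d\mu=\sum_{\ell=0}^{n_m}\int_{\Gamma_{m,\ell}}|F-F_m|\,d\mu\le\frac1m+\int_{\Gamma_{m,0}}\bigl(|F|+|y_0|\bigr)\,d\mu\le\frac2m.
\]

Thus, $\bigl\||F_m-F|\bigr\|_{L^1(\mu;\mathbb{R})}\longrightarrow 0$ as $m\to\infty$; and so, because $C$ is closed, we now see that $\int_E F\,d\mu\in C$. At the same time, because (cf. Exercise 3.2.17) $g$ is continuous, $g\circ F_m\longrightarrow g\circ F$ in $\mu$-measure as $m\to\infty$. Hence, by the version of Fatou's Lemma in Theorem 3.2.12,

\[
\int_E g\circ F\,d\mu\le\liminf_{m\to\infty}\int_E g\circ F_m\,d\mu\le\liminf_{m\to\infty}g\left(\int_E F_m\,d\mu\right)=g\left(\int_E F\,d\mu\right).
\]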

In order to apply Jensen's inequality, we need to develop a criterion for recognizing when a function is concave. Such a criterion is contained in the next theorem. Recall that the Hessian matrix $H_g(x)$ of a function $g$ that is twice continuously differentiable at $x$ is the symmetric matrix given by

\[
H_g(x)\equiv\left(\frac{\partial^2g}{\partial x_i\partial x_j}(x)\right)_{1\le i,j\le N}.
\]

Also, a symmetric, real $N\times N$ matrix $A$ is said to be non-positive definite if all of its eigenvalues are non-positive, or, equivalently, if $(\xi,A\xi)_{\mathbb{R}^N}\le 0$ for all $\xi\in\mathbb{R}^N$.

Lemma 6.1.2. Suppose that $U$ is an open, convex subset of $\mathbb{R}^N$, and set $C=\bar U$. Then $C$ is also convex. Moreover, if $g:C\longrightarrow\mathbb{R}$ is continuous and $g\upharpoonright U$ is concave, then $g$ is concave on all of $C$. Finally, if $g:C\longrightarrow\mathbb{R}$ is continuous and $g\upharpoonright U$ is twice continuously differentiable, then $g$ is concave on $C$ if and only if its Hessian matrix is non-positive definite for each $x\in U$.

Proof: The convexity of $C$ is obvious. In addition, if $g\upharpoonright U$ is concave, the concavity of $g$ on $C$ follows trivially by continuity. Thus, what remains to show is that if $g:U\longrightarrow\mathbb{R}$ is twice continuously differentiable, then $g$ is concave on $U$ if and only if its Hessian is non-positive definite at each $x\in U$.

In order to prove that $g$ is concave on $U$ if $H_g(x)$ is non-positive definite at every $x\in U$, we will use the following simple result about functions on the interval $[0,1]$. Namely, suppose that $u\in C^2\bigl([0,1];\mathbb{R}\bigr)$, $u(0)=0=u(1)$, and $u''\le 0$. Then $u\ge 0$. To see this (cf. Exercise 6.1.10 for another approach), let $\epsilon>0$ be given, and consider the function $u_\epsilon\equiv u+\epsilon t(1-t)$. Clearly it is enough to show that $u_\epsilon\ge 0$ on $[0,1]$ for every $\epsilon>0$. Note that $u_\epsilon(0)=u_\epsilon(1)=0$ and $u_\epsilon''(t)<0$ for every $t\in[0,1]$. On the other hand, if $u_\epsilon(t)<0$ for some $t\in[0,1]$, then there is an $s\in(0,1)$ at which $u_\epsilon$ achieves its absolute minimum. But this is impossible, since then, by the second derivative test, we would have that $u_\epsilon''(s)\ge 0$.

Now assume that $H_g(x)$ is non-positive definite for every $x\in U$. Given $x,y\in U$, define $u(t)=g\bigl((1-t)x+ty\bigr)-(1-t)g(x)-tg(y)$ for $t\in[0,1]$. Then $u(0)=u(1)=0$ and

\[
u''(t)=\Bigl(y-x,H_g\bigl((1-t)x+ty\bigr)(y-x)\Bigr)_{\mathbb{R}^N}\le 0
\]

for every $t\in[0,1]$. Hence, by the preceding paragraph, $u\ge 0$ on $[0,1]$; and so $g\bigl((1-t)x+ty\bigr)\ge(1-t)g(x)+tg(y)$ for all $t\in[0,1]$. In other words, $g$ is concave on $U$ and therefore on $C$.

To complete the proof, suppose that $H_g(x)$ has a positive eigenvalue for some $x\in U$. We can then find an $e\in S^{N-1}$ and an $\epsilon>0$ such that $\bigl(e,H_g(x)e\bigr)_{\mathbb{R}^N}>0$ and $x+te\in U$ for all $t\in(-\epsilon,\epsilon)$. Set $u(t)=g(x+te)$ for $t\in(-\epsilon,\epsilon)$. Then $u''(0)=\bigl(e,H_g(x)e\bigr)_{\mathbb{R}^N}>0$. On the other hand,

\[
u''(0)=\lim_{t\to 0}\frac{u(t)+u(-t)-2u(0)}{t^2},
\]

and, if $g$ were concave,

\[
2u(0)=2u\left(\frac{t-t}{2}\right)=2g\bigl(\tfrac12(x+te)+\tfrac12(x-te)\bigr)\ge g(x+te)+g(x-te)=u(t)+u(-t),
\]

from which we would get the contradiction $0<u''(0)\le 0$.

When N = 2, the following lemma provides a useful test for non-positive definiteness.

Lemma 6.1.3. Let $A=\begin{pmatrix}a&b\\b&c\end{pmatrix}$ be a real symmetric matrix. Then $A$ is non-positive definite if and only if both $a+c\le 0$ and $ac\ge b^2$. In particular, for each $\alpha\in(0,1)$, the functions $(x,y)\in[0,\infty)^2\longmapsto x^\alpha y^{1-\alpha}$ and $(x,y)\in[0,\infty)^2\longmapsto\bigl(x^\alpha+y^\alpha\bigr)^{\frac1\alpha}$ are continuous and concave.


Proof: In view of Lemma 6.1.2, it suffices to check the first assertion. To this end, let $T=a+c$ be the trace and $D=ac-b^2$ the determinant of $A$. Also, let $\lambda$ and $\mu$ denote the eigenvalues of $A$. Then, $T=\lambda+\mu$ and $D=\lambda\mu$. If $A$ is non-positive definite and therefore $\lambda\vee\mu\le 0$, then it is obvious that $T\le 0$ and that $D\ge 0$. Conversely, if $D>0$, then either both $\lambda$ and $\mu$ are positive or both are negative. Hence if, in addition, $T\le 0$, then $\lambda$ and $\mu$ are negative. Finally, if $D=0$ and $T\le 0$, then either $\lambda=0$ and $\mu=T\le 0$ or $\mu=0$ and $\lambda=T\le 0$.
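The trace and determinant criterion can be compared directly with the eigenvalues, which for a symmetric $2\times 2$ matrix are the roots of $\lambda^2-T\lambda+D$. The sketch below does this for a few sample matrices; the function names and test matrices are illustrative choices.

```python
import math


def is_nonpositive_definite(a: float, b: float, c: float) -> bool:
    """Trace/determinant test of Lemma 6.1.3 for the matrix [[a, b], [b, c]]."""
    return (a + c <= 0.0) and (a * c >= b * b)


def eigenvalues(a: float, b: float, c: float) -> tuple[float, float]:
    """Eigenvalues of [[a, b], [b, c]] as roots of lambda^2 - T*lambda + D."""
    trace, det = a + c, a * c - b * b
    # T^2 - 4D = (a - c)^2 + 4 b^2 is non-negative; max() only guards against rounding.
    root = math.sqrt(max(trace * trace - 4.0 * det, 0.0))
    return ((trace - root) / 2.0, (trace + root) / 2.0)


if __name__ == "__main__":
    for a, b, c in [(-2.0, 1.0, -1.0), (-1.0, 2.0, -1.0), (1.0, 0.0, -3.0), (-1.0, 1.0, -1.0)]:
        print((a, b, c), is_nonpositive_definite(a, b, c), eigenvalues(a, b, c))
```

In each case the test returns `True` exactly when the larger eigenvalue is non-positive.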

My first application of these considerations provides a generalization, known as Minkowski’s inequality, of the triangle inequality.

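Theorem 6.1.4 (Minkowski's inequality). Let $f_1$ and $f_2$ be non-negative, measurable functions on the measure space $(E,\mathcal{B},\mu)$. Then, for every $p\in[1,\infty)$,

\[
\left(\int_E(f_1+f_2)^p\,d\mu\right)^{\frac1p}\le\left(\int_E f_1^p\,d\mu\right)^{\frac1p}+\left(\int_E f_2^p\,d\mu\right)^{\frac1p}.
\]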

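Proof: The case $p=1$ follows from (3.1.10), and so we will assume that $p\in(1,\infty)$. Also, without loss of generality, we will assume that $f_1^p$ and $f_2^p$ are $\mu$-integrable and that $f_1$ and $f_2$ are $[0,\infty)$-valued.

Let $p\in(1,\infty)$ be given. If we assume that $\mu(E)=1$ and take $\alpha=\frac1p$, then, by Lemma 6.1.3 and Jensen's inequality,

\[
\int_E(f_1+f_2)^p\,d\mu=\int_E\Bigl[\bigl(f_1^p\bigr)^\alpha+\bigl(f_2^p\bigr)^\alpha\Bigr]^{\frac1\alpha}d\mu
\le\left[\left(\int_E f_1^p\,d\mu\right)^{\alpha}+\left(\int_E f_2^p\,d\mu\right)^{\alpha}\right]^{\frac1\alpha}
=\left[\left(\int_E f_1^p\,d\mu\right)^{\frac1p}+\left(\int_E f_2^p\,d\mu\right)^{\frac1p}\right]^{p}.
\]

More generally, if $\mu(E)=0$ there is nothing to do, and if $0<\mu(E)<\infty$ we can replace $\mu$ by $\frac{\mu}{\mu(E)}$ and apply the preceding. Hence, all that remains is the case $\mu(E)=\infty$. But if $\mu(E)=\infty$, take $E_n=\bigl\{f_1\vee f_2\ge\frac1n\bigr\}$, note that $\mu(E_n)\le n^p\int f_1^p\,d\mu+n^p\int f_2^p\,d\mu<\infty$, apply the preceding to $f_1$, $f_2$, and $\mu$ all restricted to $E_n$, and let $n\to\infty$.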
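The key step of this proof, Jensen's inequality applied to the concave function $(x,y)\longmapsto(x^\alpha+y^\alpha)^{1/\alpha}$ with $\alpha=\frac1p$, can be watched at work on a finite probability space. The following sketch uses made-up data and hypothetical helper names.

```python
def integrate(values, weights):
    """Integral on a finite space: sum of value times mass."""
    return sum(v * w for v, w in zip(values, weights))


def h(x: float, y: float, alpha: float) -> float:
    """The concave function (x, y) -> (x**alpha + y**alpha)**(1/alpha) of Lemma 6.1.3."""
    return (x ** alpha + y ** alpha) ** (1.0 / alpha)


def minkowski_sides(f1, f2, mu, p: float):
    """Return (integral of (f1+f2)^p, h(integral of f1^p, integral of f2^p)) with alpha = 1/p."""
    alpha = 1.0 / p
    lhs = integrate([(a + b) ** p for a, b in zip(f1, f2)], mu)
    rhs = h(integrate([a ** p for a in f1], mu), integrate([b ** p for b in f2], mu), alpha)
    return lhs, rhs


if __name__ == "__main__":
    mu = [0.1, 0.2, 0.3, 0.4]  # a probability vector
    f1 = [1.0, 0.0, 2.0, 0.5]
    f2 = [0.0, 3.0, 1.0, 2.0]
    for p in (1.5, 2.0, 3.0):
        print(p, minkowski_sides(f1, f2, mu, p))
```

Taking $p$-th roots of both numbers in each printed pair gives the two sides of Minkowski's inequality for these data.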

The next application, which is known as Hölder's inequality, gives a generalization of the inner product inequality $|(\xi,\eta)_{\mathbb{R}^N}|\le|\xi||\eta|$ for $\xi,\eta\in\mathbb{R}^N$. In the Euclidean context, this inequality can be seen as an application of the law of the cosine, which says that the inner product of vectors is the product of their lengths and the cosine of the angle between them.

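Theorem 6.1.5 (Hölder's inequality). Given $p\in(1,\infty)$, define the Hölder conjugate $p'$ of $p\in(1,\infty)$ by the equation $\frac1p+\frac1{p'}=1$ (i.e., $p'=\frac{p}{p-1}$). Then, for every pair of non-negative, measurable functions $f_1$ and $f_2$ on the measure space $(E,\mathcal{B},\mu)$,

\[
\int_E f_1f_2\,d\mu\le\left(\int_E f_1^p\,d\mu\right)^{\frac1p}\left(\int_E f_2^{p'}\,d\mu\right)^{\frac1{p'}}
\]

for every $p\in(1,\infty)$.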

Proof: First note that if either factor on the right-hand side of the above inequality is 0, then $f_1f_2=0$ (a.e., $\mu$), and so the left-hand side is also 0. Thus we will assume that both factors on the right are strictly positive, in which case, we may and will assume in addition that both $f_1^p$ and $f_2^{p'}$ are $\mu$-integrable and that $f_1$ and $f_2$ are both $[0,\infty)$-valued. Also, just as in the proof of Minkowski's inequality, we can reduce everything to the case $\mu(E)=1$. But then we can apply Jensen's inequality and Lemma 6.1.3 with $\alpha=\frac1p$ to see that

\[
\int_E f_1f_2\,d\mu=\int_E\bigl(f_1^p\bigr)^{\alpha}\bigl(f_2^{p'}\bigr)^{1-\alpha}d\mu
\le\left(\int_E f_1^p\,d\mu\right)^{\alpha}\left(\int_E f_2^{p'}\,d\mu\right)^{1-\alpha}
=\left(\int_E f_1^p\,d\mu\right)^{\frac1p}\left(\int_E f_2^{p'}\,d\mu\right)^{\frac1{p'}}.
\]
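Hölder's inequality, together with the case of equality that occurs when $f_2^{p'}$ is proportional to $f_1^p$, can be illustrated on a finite measure space. This is a sketch with invented data; the measure need not be a probability measure, and the helper names are hypothetical.

```python
def integrate(values, weights):
    """Integral on a finite space: sum of value times mass."""
    return sum(v * w for v, w in zip(values, weights))


def holder_sides(f1, f2, mu, p: float):
    """Return both sides of Hölder's inequality for exponent p and its conjugate p/(p-1)."""
    q = p / (p - 1.0)  # the Hölder conjugate p'
    lhs = integrate([a * b for a, b in zip(f1, f2)], mu)
    rhs = integrate([a ** p for a in f1], mu) ** (1.0 / p) * integrate([b ** q for b in f2], mu) ** (1.0 / q)
    return lhs, rhs


if __name__ == "__main__":
    mu = [0.5, 2.0, 1.5, 3.0]
    f1 = [1.0, 0.2, 2.0, 0.5]
    f2 = [0.3, 3.0, 1.0, 2.0]
    print(holder_sides(f1, f2, mu, 3.0))
    # Equality case: f2 = f1**(p-1), so that f2**p' = f1**p.
    print(holder_sides(f1, [a ** 2.0 for a in f1], mu, 3.0))
```

The first pair shows a strict inequality, while in the second pair the two sides agree up to rounding.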

Exercises for §6.1

Exercise 6.1.6. Here are a couple of easy applications of the preceding ideas.

(i) Show that $\log$ is continuous and concave on every interval $[\epsilon,\infty)$ with $\epsilon>0$. Use this together with Jensen's inequality to show that for every $n\in\mathbb{Z}^+$, $\mu_1,\dots,\mu_n\in(0,1)$ satisfying $\sum_{m=1}^n\mu_m=1$, and $a_1,\dots,a_n\in[0,\infty)$,

\[
\prod_{m=1}^n a_m^{\mu_m}\le\sum_{m=1}^n\mu_ma_m.
\]

In particular, when $\mu_m=\frac1n$ for every $1\le m\le n$, this yields $\bigl(a_1\cdots a_n\bigr)^{\frac1n}\le\frac1n\sum_{m=1}^n a_m$, which is the statement that the arithmetic mean dominates the geometric mean.

(ii) Let $n\in\mathbb{Z}^+$, and suppose that $f_1,\dots,f_n$ are non-negative, measurable functions on the measure space $(E,\mathcal{B},\mu)$. Given $p_1,\dots,p_n\in(1,\infty)$ satisfying $\sum_{m=1}^n\frac1{p_m}=1$, show that

\[
\int_E f_1\cdots f_n\,d\mu\le\prod_{m=1}^n\left(\int_E f_m^{p_m}\,d\mu\right)^{\frac1{p_m}}.
\]


Exercise 6.1.7. When $p=2$, Minkowski's and Hölder's inequalities are intimately related and are both very simple to prove. Indeed, let $f_1$ and $f_2$ be bounded, measurable functions on the finite measure space $(E,\mathcal{B},\mu)$. Given any $\alpha\ne 0$, observe that

\[
0\le\int_E\left(\alpha f_1\pm\frac1\alpha f_2\right)^2d\mu=\alpha^2\int_E f_1^2\,d\mu\pm 2\int_E f_1f_2\,d\mu+\frac1{\alpha^2}\int_E f_2^2\,d\mu,
\]

from which it follows that

\[
2\left|\int_E f_1f_2\,d\mu\right|\le t\int_E f_1^2\,d\mu+\frac1t\int_E f_2^2\,d\mu
\]

for every $t>0$. If either integral on the right vanishes, show from the preceding that $\int_E f_1f_2\,d\mu=0$. On the other hand, if neither integral vanishes, choose $t>0$ so that the preceding yields

\[
(6.1.8)\qquad\left|\int_E f_1f_2\,d\mu\right|\le\left(\int_E f_1^2\,d\mu\right)^{\frac12}\left(\int_E f_2^2\,d\mu\right)^{\frac12}.
\]

Hence, in any case, (6.1.8) holds. Finally, argue that one can remove the restriction that $f_1$ and $f_2$ be bounded, and then remove the condition that $\mu(E)<\infty$. In particular, even if they are not bounded, so long as $f_1^2$ and $f_2^2$ are $\mu$-integrable, conclude that $f_1f_2$ must be $\mu$-integrable and that (6.1.8) continues to hold.

Clearly (6.1.8) is the special case of Hölder's inequality when $p=2$. Because it is a particularly significant case, it is often referred to by a different name and is called Schwarz's inequality. Assuming that both $f_1^2$ and $f_2^2$ are $\mu$-integrable, show that the inequality in Schwarz's inequality is an equality if and only if there exist $(\alpha,\beta)\in\mathbb{R}^2\setminus\{0\}$ such that $\alpha f_1+\beta f_2=0$ (a.e., $\mu$).

Finally, use Schwarz's inequality to obtain Minkowski's inequality for the case $p=2$. Notice the similarity between the development here and that of the classical triangle inequality for the Euclidean metric on $\mathbb{R}^N$.

Exercise 6.1.9. A geometric proof of Jensen's inequality can be based on the following. Given a closed, convex subset $C$ of $\mathbb{R}^N$, show that $q\notin C$ if and only if there is an $e_q\in S^{N-1}$ such that $\bigl(e_q,q-x\bigr)_{\mathbb{R}^N}>0$ for all $x\in C$. Next, given a probability space $(E,\mathcal{B},\mu)$ and a $\mu$-integrable $F:E\longrightarrow C$, use the preceding to show that $p\equiv\int_E F\,d\mu\in C$. Finally, let $g:C\longrightarrow[0,\infty)$ be a continuous, concave function, and use the first part to prove Jensen's inequality. Here are some steps that you might want to follow in proving Jensen's inequality.

(i) Show that if $g_1$ and $g_2$ are continuous, concave functions on $C$, then so is $g_1\wedge g_2$. In particular, if $g$ is a non-negative, continuous, concave function, then $g\wedge n$ is also, and use this to reduce the proof of Jensen's inequality to the case in which $g$ is bounded.

(ii) Assume that $g:C\longrightarrow[0,\infty)$ is a bounded, continuous, concave function, and set $\hat C=\{(x,t)\in\mathbb{R}^N\times\mathbb{R}:\,x\in C\text{ and }t\le g(x)\}$. Show that $\hat C$ is a closed, convex subset of $\mathbb{R}^{N+1}$. Next, define $\hat F:E\longrightarrow\hat C$ by $\hat F=\begin{pmatrix}F\\g\circ F\end{pmatrix}$, note that $\hat F$ is $\mu$-integrable, and apply the first part to see that its $\mu$-integral is an element of $\hat C$. Finally, notice that

\[
\int_E\hat F\,d\mu\in\hat C\implies\int_E g\circ F\,d\mu\le g\left(\int_E F\,d\mu\right).
\]

Exercise 6.1.10. Suppose that $u\in C^2\bigl([0,1];\mathbb{R}\bigr)$ satisfies $u(0)=0=u(1)$.

The goal of this exercise is to show that

\[
(\ast)\qquad u(t)=-\int_{[0,1]}(s\wedge t-st)\,u''(s)\,ds\quad\text{for } t\in[0,1].
\]

In particular, if $u''\le 0$, then $u\ge 0$.

(i) Use integration by parts to show that

\[
u(t)=tu'(0)+\int_{[0,t]}(t-s)\,u''(s)\,ds\quad\text{for } t\in[0,1].
\]

(ii) Using (i), show that $u'(0)=-\int_{[0,1]}(1-s)\,u''(s)\,ds$ and therefore that $(\ast)$ holds.
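The representation $(\ast)$ in Exercise 6.1.10 can be tested numerically before it is proved. In the sketch below, the choice $u(t)=\sin(\pi t)$, the midpoint quadrature, and the function names are assumptions made for the illustration.

```python
import math


def kernel(s: float, t: float) -> float:
    """The non-negative kernel s∧t - s*t appearing in (*)."""
    return min(s, t) - s * t


def reconstruct(u_second, t: float, steps: int = 20_000) -> float:
    """Midpoint-rule value of -∫_[0,1] (s∧t - s t) u''(s) ds."""
    h = 1.0 / steps
    return -h * sum(kernel((k + 0.5) * h, t) * u_second((k + 0.5) * h) for k in range(steps))


def u(t: float) -> float:
    return math.sin(math.pi * t)  # satisfies u(0) = 0 = u(1)


def u_second(t: float) -> float:
    return -(math.pi ** 2) * math.sin(math.pi * t)  # u'' <= 0 on [0, 1]


if __name__ == "__main__":
    for t in (0.25, 0.5, 0.9):
        print(t, reconstruct(u_second, t), u(t))
```

Since the kernel is non-negative, the same formula makes the implication $u''\le 0\Rightarrow u\ge 0$ visible at a glance.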

§6.2 The Lebesgue Spaces

In the first part of this section I will introduce and briefly discuss the standard Lebesgue spaces $L^p(\mu;\mathbb{R})$. In the second part, I will look at mixed Lebesgue spaces, one of the many useful variations on the standard ones.

§6.2.1. The $L^p$-Spaces: Given a measure space $(E,\mathcal{B},\mu)$ and a $p\in[1,\infty)$, define

\[
\|f\|_{L^p(\mu;\mathbb{R})}=\left(\int_E|f|^p\,d\mu\right)^{\frac1p}
\]

for measurable functions $f$ on $(E,\mathcal{B})$. Also, if $f$ is a measurable function on $(E,\mathcal{B})$ define

\[
\|f\|_{L^\infty(\mu;\mathbb{R})}=\inf\bigl\{M\in[0,\infty]:\,|f|\le M\ (\text{a.e., }\mu)\bigr\}.
\]
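On a finite measure space both definitions reduce to elementary computations, which makes it easy to see how the choice of $p$ weights large values of $f$ and why sets of measure 0 are invisible. The following is a minimal sketch with invented data and hypothetical function names.

```python
def lp_norm(f, mu, p: float) -> float:
    """(∫ |f|^p dµ)^(1/p) on a finite space with point masses mu."""
    return sum(abs(v) ** p * w for v, w in zip(f, mu)) ** (1.0 / p)


def linf_norm(f, mu) -> float:
    """Essential supremum of |f|: points of measure zero are ignored."""
    return max((abs(v) for v, w in zip(f, mu) if w > 0.0), default=0.0)


if __name__ == "__main__":
    f = [1.0, 1.0, 10.0, 100.0]
    mu = [0.5, 0.45, 0.05, 0.0]  # the last point carries no mass
    for p in (1.0, 2.0, 10.0, 50.0):
        print(p, lp_norm(f, mu, p))
    print("inf", linf_norm(f, mu))
```

The value 100 never influences any of the norms because it is taken on a set of measure 0, while the spike of height 10 on a set of small positive measure dominates more and more as $p$ grows.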

Although information about $f$ can be gleaned from a study of $\|f\|_{L^p(\mu;\mathbb{R})}$ as $p$ changes (for example, spikes in $f$ will be emphasized by taking $p$ to be large), all these quantities share the same flaw as $\|f\|_{L^1(\mu;\mathbb{R})}$: they cannot detect properties of $f$ that occur on sets having $\mu$-measure 0. Thus, before we can hope to use any of them to put a metric on measurable functions, we must invoke the same subterfuge that I introduced in §3.1.2 in connection with the space $L^1(\mu;\mathbb{R})$. That is, for $p\in[1,\infty]$, denote by $L^p(\mu;\mathbb{R})=L^p(E,\mathcal{B},\mu;\mathbb{R})$
