Jensen, Minkowski, and Hölder - Basic Inequalities and Lebesgue Spaces

Chapter 6 Basic Inequalities and Lebesgue Spaces

6.1 Jensen, Minkowski, and Hölder

In this section I will derive some inequalities that generalize inequalities, such as the triangle inequality, that are familiar in the Euclidean context.

Since all the inequalities here are consequences of convexity considerations, I will begin by reviewing a few elementary facts about convex sets and concave functions on them. Let $V$ be a real or complex vector space. A subset $C \subseteq V$ is said to be convex if $(1-\alpha)x + \alpha y \in C$ whenever $x, y \in C$ and $\alpha \in [0,1]$.

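Given a convex set $C \subseteq V$, $g : C \longrightarrow \mathbb{R}$ is said to be a concave function on $C$ if

\[ g\bigl((1-\alpha)x + \alpha y\bigr) \ge (1-\alpha)g(x) + \alpha g(y) \quad \text{for all } x, y \in C \text{ and } \alpha \in [0,1]. \]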

¹ Given a vector space $V$, a norm $\|\cdot\|$ on $V$ is a non-negative map with the properties that $\|v\| = 0$ if and only if $v = 0$, $\|\alpha v\| = |\alpha|\,\|v\|$ for all $\alpha \in \mathbb{R}$ and $v \in V$, and $\|v + w\| \le \|v\| + \|w\|$ for all $v, w \in V$. The metric on $V$ determined by the norm $\|\cdot\|$ is the one for which $\|w - v\|$ gives the distance between $v$ and $w$.

D.W. Stroock, Essentials of Integration Theory for Analysis, Graduate Texts in Mathematics 262, DOI 10.1007/978-1-4614-1135-2_6, © Springer Science+Business Media, LLC 2011

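Note that $g$ is concave on $C$ if and only if $\{(x,t) \in C \times \mathbb{R} : t \le g(x)\}$ is a convex subset of $V \oplus \mathbb{R}$. In addition, one can use induction on $n \ge 2$ to see that

\[ \sum_{k=1}^{n} \alpha_k y_k \in C \quad\text{and}\quad g\Bigl(\sum_{k=1}^{n} \alpha_k y_k\Bigr) \ge \sum_{k=1}^{n} \alpha_k g(y_k) \]

for all $n \ge 2$, $\{y_1, \dots, y_n\} \subseteq C$, and $\{\alpha_1, \dots, \alpha_n\} \subseteq [0,1]$ with $\sum_{k=1}^{n} \alpha_k = 1$.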

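Namely, if $n = 2$ or $\alpha_n \in \{0,1\}$, then there is nothing to do. On the other hand, if $n \ge 3$ and $\alpha_n \in (0,1)$, set $x = (1-\alpha_n)^{-1} \sum_{k=1}^{n-1} \alpha_k y_k$, and, assuming the result for $n-1$, conclude that

\[ g\Bigl(\sum_{k=1}^{n} \alpha_k y_k\Bigr) = g\bigl((1-\alpha_n)x + \alpha_n y_n\bigr) \ge (1-\alpha_n)\, g\Bigl(\sum_{k=1}^{n-1} \alpha_k (1-\alpha_n)^{-1} y_k\Bigr) + \alpha_n g(y_n) \ge \sum_{k=1}^{n} \alpha_k g(y_k). \]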
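As a numerical illustration of the finite inequality just derived, the following minimal sketch evaluates the gap $g(\sum_k \alpha_k y_k) - \sum_k \alpha_k g(y_k)$ for the concave function $g = \sqrt{\cdot}$ on $C = [0,\infty)$. The helper name `concave_gap` and the random test data are assumptions made for this example and do not come from the text.

```python
import math
import random


def concave_gap(g, points, weights):
    """Return g(sum_k a_k y_k) - sum_k a_k g(y_k) for weights a_k summing to 1.

    For a concave g this difference is non-negative.
    """
    if abs(sum(weights) - 1.0) > 1e-12 or min(weights) < 0.0:
        raise ValueError("weights must be non-negative and sum to 1")
    combination = sum(a * y for a, y in zip(weights, points))
    return g(combination) - sum(a * g(y) for a, y in zip(weights, points))


# Hypothetical test data: five random points in [0, 10] with random weights.
random.seed(0)
points = [random.uniform(0.0, 10.0) for _ in range(5)]
raw = [random.random() for _ in range(5)]
weights = [r / sum(raw) for r in raw]

gap = concave_gap(math.sqrt, points, weights)
print(gap >= 0.0)
```

A check of this kind cannot prove concavity, but a negative gap for some choice of points and weights would disprove it.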

The essence of the relationship between these notions and measure theory is contained in the following.

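Theorem 6.1.1 (Jensen’s inequality). Let $C$ be a closed, convex subset of $\mathbb{R}^N$, and suppose that $g$ is a continuous, concave, non-negative function on $C$. If $(E, \mathcal{B}, \mu)$ is a probability space and $F : E \longrightarrow C$ a measurable function on $(E, \mathcal{B})$ with the property that $|F| \in L^1(\mu;\mathbb{R})$, then

\[ \int_E F\,d\mu \equiv \begin{pmatrix} \int_E F_1\,d\mu \\ \vdots \\ \int_E F_N\,d\mu \end{pmatrix} \in C \quad\text{and}\quad \int_E g\circ F\,d\mu \le g\Bigl(\int_E F\,d\mu\Bigr). \]

(See Exercise 6.1.9 for another derivation.)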

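Proof: First assume that $F$ is simple. Then $F = \sum_{k=0}^{n} y_k \mathbf{1}_{\Gamma_k}$ for some $n \in \mathbb{Z}^+$, $y_0, \dots, y_n \in C$, and cover $\{\Gamma_0, \dots, \Gamma_n\}$ of $E$ by mutually disjoint elements of $\mathcal{B}$. Thus, since $\sum_{k=0}^{n} \mu(\Gamma_k) = 1$ and $C$ is convex, $\int_E F\,d\mu = \sum_{k=0}^{n} y_k \mu(\Gamma_k) \in C$ and, because $g$ is concave and $\sum_{k=0}^{n} \mu(\Gamma_k) = 1$,

\[ g\Bigl(\int_E F\,d\mu\Bigr) = g\Bigl(\sum_{k=0}^{n} y_k \mu(\Gamma_k)\Bigr) \ge \sum_{k=0}^{n} g(y_k)\,\mu(\Gamma_k) = \int_E g\circ F\,d\mu. \]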
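The simple-function case can be made concrete on a finite probability space. The sketch below is only an illustration under stated assumptions: the values $y_k$, the masses $\mu(\Gamma_k)$, the function $g(x,y) = \sqrt{xy}$ on $C = [0,\infty)^2$, and the helper name `jensen_sides` are all chosen for the example.

```python
import math


def jensen_sides(g, values, masses):
    """For F = sum_k y_k 1_{Gamma_k} with mu(Gamma_k) = masses[k], return the pair
    (integral of g(F) with respect to mu, g of the integral of F)."""
    dim = len(values[0])
    # Componentwise integral of the vector-valued simple function F.
    mean = tuple(sum(m * y[i] for y, m in zip(values, masses)) for i in range(dim))
    integral_of_g = sum(m * g(y) for y, m in zip(values, masses))
    return integral_of_g, g(mean)


def g(y):
    # g(x, y) = x^(1/2) y^(1/2), which is concave on [0, infinity)^2.
    return math.sqrt(y[0] * y[1])


values = [(1.0, 4.0), (9.0, 1.0), (4.0, 4.0)]   # assumed values y_0, y_1, y_2
masses = [0.25, 0.25, 0.5]                       # assumed mu(Gamma_k), summing to 1

lhs, rhs = jensen_sides(g, values, masses)
print(lhs <= rhs)
```

Here the integral of $g \circ F$ is $0.25\cdot 2 + 0.25\cdot 3 + 0.5\cdot 4 = 3.25$, while $g$ evaluated at the mean $(4.5, 3.25)$ is $\sqrt{14.625}$, which is larger.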

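Now let $F$ be general. The idea is to approximate $F$ by $C$-valued simple functions. For this purpose, choose and fix some element $y_0$ of $C$, and let $\{y_k : k \ge 1\}$ be a dense sequence in $C$. Given $m \in \mathbb{Z}^+$, choose $R_m > 0$ and $n_m \in \mathbb{Z}^+$ for which

\[ \int_{\{|F| \ge R_m\}} \bigl(|F| + |y_0|\bigr)\,d\mu \le \frac{1}{m} \quad\text{and}\quad C \cap B(0, R_m) \subseteq \bigcup_{k=1}^{n_m} B\bigl(y_k, \tfrac{1}{m}\bigr). \]

Next, set $\Gamma_{m,0} = \{\xi \in E : |F(\xi)| \ge R_m\}$, and use induction to define

\[ \Gamma_{m,\ell} = \Bigl\{ \xi \in E \setminus \bigcup_{k=0}^{\ell-1} \Gamma_{m,k} : F(\xi) \in B\bigl(y_\ell, \tfrac{1}{m}\bigr) \Bigr\} \]

for $1 \le \ell \le n_m$. Finally, set $F_m = \sum_{k=0}^{n_m} y_k \mathbf{1}_{\Gamma_{m,k}}$.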

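By construction, the $F_m$’s are simple and $C$-valued. Hence, by the preceding,

\[ \int_E F_m\,d\mu \in C \quad\text{and}\quad g\Bigl(\int_E F_m\,d\mu\Bigr) \ge \int_E g\circ F_m\,d\mu \]

for each $m \in \mathbb{Z}^+$. Moreover, since $|F - F_m| \le \frac{1}{m}$ on $\bigcup_{\ell=1}^{n_m} \Gamma_{m,\ell} = E \setminus \Gamma_{m,0}$,

\[ \int_E |F - F_m|\,d\mu = \sum_{\ell=0}^{n_m} \int_{\Gamma_{m,\ell}} |F - F_m|\,d\mu \le \frac{1}{m} + \int_{\Gamma_{m,0}} \bigl(|F| + |y_0|\bigr)\,d\mu \le \frac{2}{m}. \]

Thus, $\bigl\| |F_m - F| \bigr\|_{L^1(\mu;\mathbb{R})} \longrightarrow 0$ as $m \to \infty$; and so, because $C$ is closed, we now see that $\int_E F\,d\mu \in C$. At the same time, because (cf. Exercise 3.2.17) $g$ is continuous, $g\circ F_m \longrightarrow g\circ F$ in $\mu$-measure as $m \to \infty$. Hence, by the version of Fatou’s Lemma in Theorem 3.2.12,

\[ \int_E g\circ F\,d\mu \le \varliminf_{m\to\infty} \int_E g\circ F_m\,d\mu \le \varliminf_{m\to\infty} g\Bigl(\int_E F_m\,d\mu\Bigr) = g\Bigl(\int_E F\,d\mu\Bigr). \]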

In order to apply Jensen’s inequality, we need to develop a criterion for recognizing when a function is concave. Such a criterion is contained in the next theorem. Recall that the Hessian matrix $H_g(x)$ of a function $g$ that is twice continuously differentiable at $x$ is the symmetric matrix given by

\[ H_g(x) \equiv \Bigl( \frac{\partial^2 g}{\partial x_i\,\partial x_j}(x) \Bigr)_{1 \le i,j \le N}. \]

Also, a symmetric, real $N \times N$ matrix $A$ is said to be non-positive definite if all of its eigenvalues are non-positive, or, equivalently, if $(\xi, A\xi)_{\mathbb{R}^N} \le 0$ for all $\xi \in \mathbb{R}^N$.

Lemma 6.1.2. Suppose that $U$ is an open, convex subset of $\mathbb{R}^N$, and set $C = \overline{U}$. Then $C$ is also convex. Moreover, if $g : C \longrightarrow \mathbb{R}$ is continuous and $g \restriction U$ is concave, then $g$ is concave on all of $C$. Finally, if $g : C \longrightarrow \mathbb{R}$ is continuous and $g \restriction U$ is twice continuously differentiable, then $g$ is concave on $C$ if and only if its Hessian matrix is non-positive definite for each $x \in U$.

Proof: The convexity of $C$ is obvious. In addition, if $g \restriction U$ is concave, the concavity of $g$ on $C$ follows trivially by continuity. Thus, what remains to show is that if $g : U \longrightarrow \mathbb{R}$ is twice continuously differentiable, then $g$ is concave on $U$ if and only if its Hessian is non-positive definite at each $x \in U$.

In order to prove that $g$ is concave on $U$ if $H_g(x)$ is non-positive definite at every $x \in U$, we will use the following simple result about functions on the interval $[0,1]$. Namely, suppose that $u \in C^2([0,1];\mathbb{R})$, $u(0) = 0 = u(1)$, and $u'' \le 0$. Then $u \ge 0$. To see this (cf. Exercise 6.1.10 for another approach), let $\epsilon > 0$ be given, and consider the function $u_\epsilon \equiv u + \epsilon t(1-t)$. Clearly it is enough to show that $u_\epsilon \ge 0$ on $[0,1]$ for every $\epsilon > 0$. Note that $u_\epsilon(0) = u_\epsilon(1) = 0$ and $u_\epsilon''(t) < 0$ for every $t \in [0,1]$. On the other hand, if $u_\epsilon(t) < 0$ for some $t \in [0,1]$, then there is an $s \in (0,1)$ at which $u_\epsilon$ achieves its absolute minimum. But this is impossible, since then, by the second derivative test, we would have that $u_\epsilon''(s) \ge 0$.

Now assume that $H_g(x)$ is non-positive definite for every $x \in U$. Given $x, y \in U$, define $u(t) = g\bigl((1-t)x + ty\bigr) - (1-t)g(x) - tg(y)$ for $t \in [0,1]$. Then $u(0) = u(1) = 0$ and

\[ u''(t) = \bigl( y - x,\ H_g\bigl((1-t)x + ty\bigr)(y - x) \bigr)_{\mathbb{R}^N} \le 0 \]

for every $t \in [0,1]$. Hence, by the preceding paragraph, $u \ge 0$ on $[0,1]$; and so $g\bigl((1-t)x + ty\bigr) \ge (1-t)g(x) + tg(y)$ for all $t \in [0,1]$. In other words, $g$ is concave on $U$ and therefore on $C$.

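To complete the proof, suppose that $H_g(x)$ has a positive eigenvalue for some $x \in U$. We can then find an $e \in S^{N-1}$ and an $\epsilon > 0$ such that $\bigl(e, H_g(x)e\bigr)_{\mathbb{R}^N} > 0$ and $x + te \in U$ for all $t \in (-\epsilon, \epsilon)$. Set $u(t) = g(x + te)$ for $t \in (-\epsilon, \epsilon)$. Then $u''(0) = \bigl(e, H_g(x)e\bigr)_{\mathbb{R}^N} > 0$. On the other hand,

\[ u''(0) = \lim_{t \to 0} \frac{u(t) + u(-t) - 2u(0)}{t^2}, \]

and, if $g$ were concave,

\[ 2u(0) = 2u\Bigl(\frac{t - t}{2}\Bigr) = 2g\Bigl(\tfrac12 (x + te) + \tfrac12 (x - te)\Bigr) \ge g(x + te) + g(x - te) = u(t) + u(-t), \]

from which we would get the contradiction $0 < u''(0) \le 0$.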

When N = 2, the following lemma provides a useful test for non-positive definiteness.

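Lemma 6.1.3. Let $A = \begin{pmatrix} a & b \\ b & c \end{pmatrix}$ be a real symmetric matrix. Then $A$ is non-positive definite if and only if both $a + c \le 0$ and $ac \ge b^2$. In particular, for each $\alpha \in (0,1)$, the functions

\[ (x,y) \in [0,\infty)^2 \longmapsto x^{\alpha} y^{1-\alpha} \quad\text{and}\quad (x,y) \in [0,\infty)^2 \longmapsto \bigl(x^{\alpha} + y^{\alpha}\bigr)^{\frac{1}{\alpha}} \]

are continuous and concave.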

Proof: In view of Lemma 6.1.2, it suffices to check the first assertion. To this end, let $T = a + c$ be the trace and $D = ac - b^2$ the determinant of $A$. Also, let $\lambda$ and $\mu$ denote the eigenvalues of $A$. Then, $T = \lambda + \mu$ and $D = \lambda\mu$. If $A$ is non-positive definite and therefore $\lambda \vee \mu \le 0$, then it is obvious that $T \le 0$ and that $D \ge 0$. Conversely, if $D > 0$, then either both $\lambda$ and $\mu$ are positive or both are negative. Hence if, in addition, $T \le 0$, then $\lambda$ and $\mu$ are negative. Finally, if $D = 0$ and $T \le 0$, then either $\lambda = 0$ and $\mu = T \le 0$ or $\mu = 0$ and $\lambda = T \le 0$.
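The trace and determinant test can be compared against the explicit eigenvalues of a symmetric $2 \times 2$ matrix, and then applied to the Hessian of $x^{\alpha}y^{1-\alpha}$ at an interior point. This is a minimal sketch: the function names and the tolerance `tol`, which absorbs floating-point rounding, are assumptions of the example. The Hessian entries are obtained by differentiating $x^{\alpha}y^{1-\alpha}$ twice by hand, and they satisfy $ac = b^2$ exactly, so the determinant vanishes.

```python
import math


def nonpositive_definite_by_trace_det(a, b, c, tol=0.0):
    """Criterion of Lemma 6.1.3 for [[a, b], [b, c]]: a + c <= 0 and a*c >= b^2."""
    return a + c <= tol and a * c - b * b >= -tol


def eigenvalues_2x2(a, b, c):
    """Eigenvalues of [[a, b], [b, c]], smaller one first."""
    half_trace = (a + c) / 2.0
    radius = math.hypot((a - c) / 2.0, b)
    return half_trace - radius, half_trace + radius


def hessian_power_product(x, y, alpha):
    """Entries (a, b, c) of the Hessian of x^alpha * y^(1 - alpha) for x, y > 0."""
    a = alpha * (alpha - 1.0) * x ** (alpha - 2.0) * y ** (1.0 - alpha)
    b = alpha * (1.0 - alpha) * x ** (alpha - 1.0) * y ** (-alpha)
    c = -alpha * (1.0 - alpha) * x ** alpha * y ** (-alpha - 1.0)
    return a, b, c


# Compare the criterion with the sign of the largest eigenvalue.
for a, b, c in [(-2.0, 1.0, -1.0), (-1.0, 2.0, -1.0), (-3.0, 0.0, 0.0)]:
    largest = eigenvalues_2x2(a, b, c)[1]
    print(nonpositive_definite_by_trace_det(a, b, c), largest <= 0.0)

# Hessian of x^0.3 y^0.7 at the (assumed) sample point (2, 3).
print(nonpositive_definite_by_trace_det(*hessian_power_product(2.0, 3.0, 0.3), tol=1e-12))
```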

My first application of these considerations provides a generalization, known as Minkowski’s inequality, of the triangle inequality.

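Theorem 6.1.4 (Minkowski’s inequality). Let $f_1$ and $f_2$ be non-negative, measurable functions on the measure space $(E, \mathcal{B}, \mu)$. Then, for every $p \in [1,\infty)$,

\[ \Bigl(\int (f_1 + f_2)^p\,d\mu\Bigr)^{\frac1p} \le \Bigl(\int f_1^p\,d\mu\Bigr)^{\frac1p} + \Bigl(\int f_2^p\,d\mu\Bigr)^{\frac1p}. \]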

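Proof: The case $p = 1$ follows from (3.1.10), and so we will assume that $p \in (1,\infty)$. Also, without loss of generality, we will assume that $f_1^p$ and $f_2^p$ are $\mu$-integrable and that $f_1$ and $f_2$ are $[0,\infty)$-valued.

Let $p \in (1,\infty)$ be given. If we assume that $\mu(E) = 1$ and take $\alpha = \frac1p$, then, by Lemma 6.1.3 and Jensen’s inequality,

\[ \int (f_1 + f_2)^p\,d\mu = \int \Bigl[ \bigl(f_1^p\bigr)^{\alpha} + \bigl(f_2^p\bigr)^{\alpha} \Bigr]^{\frac{1}{\alpha}} d\mu \le \Bigl[ \Bigl(\int f_1^p\,d\mu\Bigr)^{\alpha} + \Bigl(\int f_2^p\,d\mu\Bigr)^{\alpha} \Bigr]^{\frac{1}{\alpha}} = \Bigl[ \Bigl(\int f_1^p\,d\mu\Bigr)^{\frac1p} + \Bigl(\int f_2^p\,d\mu\Bigr)^{\frac1p} \Bigr]^{p}. \]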

More generally, if $\mu(E) = 0$ there is nothing to do, and if $0 < \mu(E) < \infty$ we can replace $\mu$ by $\frac{\mu}{\mu(E)}$ and apply the preceding. Hence, all that remains is the case $\mu(E) = \infty$. But if $\mu(E) = \infty$, take $E_n = \bigl\{ f_1 \vee f_2 \ge \frac1n \bigr\}$, note that $\mu(E_n) \le n^p \int f_1^p\,d\mu + n^p \int f_2^p\,d\mu < \infty$, apply the preceding to $f_1$, $f_2$, and $\mu$ all restricted to $E_n$, and let $n \to \infty$.
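On a finite set with a weighted counting measure, both sides of Minkowski’s inequality are finite sums and can be computed directly. The following is a small sketch under assumed data: the point masses `mu`, the functions `f1` and `f2`, and the helper names are hypothetical and serve only to illustrate the statement.

```python
def lp_norm(f, mu, p):
    """(integral of |f|^p d mu)^(1/p) for a measure with point masses mu[k]."""
    return sum(m * abs(v) ** p for v, m in zip(f, mu)) ** (1.0 / p)


def minkowski_sides(f1, f2, mu, p):
    """Return (norm of f1 + f2, norm of f1 plus norm of f2) in L^p(mu)."""
    total = [u + v for u, v in zip(f1, f2)]
    return lp_norm(total, mu, p), lp_norm(f1, mu, p) + lp_norm(f2, mu, p)


mu = [0.5, 2.0, 1.5, 3.0]   # assumed point masses; mu(E) = 7 need not equal 1
f1 = [1.0, 0.0, 2.5, 4.0]   # assumed non-negative functions on a 4-point space
f2 = [3.0, 1.0, 0.5, 0.0]

for p in (1.0, 1.5, 2.0, 7.0):
    lhs, rhs = minkowski_sides(f1, f2, mu, p)
    # A small tolerance covers rounding, in particular the equality case p = 1.
    print(p, lhs <= rhs + 1e-12)
```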

The next application, which is known as Hölder’s inequality, gives a generalization of the inner product inequality $|(\xi, \eta)_{\mathbb{R}^N}| \le |\xi|\,|\eta|$ for $\xi, \eta \in \mathbb{R}^N$. In the Euclidean context, this inequality can be seen as an application of the law of cosines, which says that the inner product of vectors is the product of their lengths and the cosine of the angle between them.

Theorem 6.1.5 (Hölder’s inequality). Given $p \in (1,\infty)$, define the Hölder conjugate $p'$ of $p$ by the equation $\frac1p + \frac1{p'} = 1$ (i.e., $p' = \frac{p}{p-1}$). Then, for every pair of non-negative, measurable functions $f_1$ and $f_2$ on the measure space $(E, \mathcal{B}, \mu)$,

\[ \int f_1 f_2\,d\mu \le \Bigl(\int f_1^p\,d\mu\Bigr)^{\frac1p} \Bigl(\int f_2^{p'}\,d\mu\Bigr)^{\frac1{p'}} \]

for every $p \in (1,\infty)$.

Proof: First note that if either factor on the right-hand side of the above inequality is 0, then $f_1 f_2 = 0$ (a.e., $\mu$), and so the left-hand side is also 0. Thus we will assume that both factors on the right are strictly positive, in which case, we may and will assume in addition that both $f_1^p$ and $f_2^{p'}$ are $\mu$-integrable and that $f_1$ and $f_2$ are both $[0,\infty)$-valued. Also, just as in the proof of Minkowski’s inequality, we can reduce everything to the case $\mu(E) = 1$. But then we can apply Jensen’s inequality and Lemma 6.1.3 with $\alpha = \frac1p$ to see that

\[ \int f_1 f_2\,d\mu = \int \bigl(f_1^p\bigr)^{\alpha} \bigl(f_2^{p'}\bigr)^{1-\alpha}\,d\mu \le \Bigl(\int f_1^p\,d\mu\Bigr)^{\alpha} \Bigl(\int f_2^{p'}\,d\mu\Bigr)^{1-\alpha} = \Bigl(\int f_1^p\,d\mu\Bigr)^{\frac1p} \Bigl(\int f_2^{p'}\,d\mu\Bigr)^{\frac1{p'}}. \]
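Hölder’s inequality can be checked the same way on a finite weighted space. In this sketch the data and the names `holder_conjugate` and `holder_sides` are assumptions made for illustration. The second call takes $p = 2$ and $f_2 = f_1$, in which case the two sides agree up to rounding.

```python
def holder_conjugate(p):
    """Return p' = p / (p - 1), the solution of 1/p + 1/p' = 1, for p in (1, infinity)."""
    if p <= 1.0:
        raise ValueError("p must lie in (1, infinity)")
    return p / (p - 1.0)


def holder_sides(f1, f2, mu, p):
    """Return (integral of f1*f2, product of the L^p norm of f1 and the L^p' norm of f2)."""
    q = holder_conjugate(p)
    lhs = sum(m * u * v for u, v, m in zip(f1, f2, mu))
    norm1 = sum(m * u ** p for u, m in zip(f1, mu)) ** (1.0 / p)
    norm2 = sum(m * v ** q for v, m in zip(f2, mu)) ** (1.0 / q)
    return lhs, norm1 * norm2


mu = [0.5, 2.0, 1.5, 3.0]   # assumed point masses
f1 = [1.0, 0.0, 2.5, 4.0]   # assumed non-negative functions
f2 = [3.0, 1.0, 0.5, 0.0]

lhs, rhs = holder_sides(f1, f2, mu, 3.0)
print(lhs <= rhs)

lhs_eq, rhs_eq = holder_sides(f1, f1, mu, 2.0)
print(abs(lhs_eq - rhs_eq) < 1e-9)
```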

Exercises for §6.1

Exercise 6.1.6. Here are a couple of easy applications of the preceding ideas.

(i) Show that $\log$ is continuous and concave on every interval $[\epsilon, \infty)$ with $\epsilon > 0$. Use this together with Jensen’s inequality to show that for every $n \in \mathbb{Z}^+$, $\mu_1, \dots, \mu_n \in (0,1)$ satisfying $\sum_{m=1}^{n} \mu_m = 1$, and $a_1, \dots, a_n \in [0,\infty)$,

\[ \prod_{m=1}^{n} a_m^{\mu_m} \le \sum_{m=1}^{n} \mu_m a_m. \]

In particular, when $\mu_m = \frac1n$ for every $1 \le m \le n$, this yields $(a_1 \cdots a_n)^{\frac1n} \le \frac1n \sum_{m=1}^{n} a_m$, which is the statement that the arithmetic mean dominates the geometric mean.

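(ii) Let $n \in \mathbb{Z}^+$, and suppose that $f_1, \dots, f_n$ are non-negative, measurable functions on the measure space $(E, \mathcal{B}, \mu)$. Given $p_1, \dots, p_n \in (1,\infty)$ satisfying $\sum_{m=1}^{n} \frac{1}{p_m} = 1$, show that

\[ \int f_1 \cdots f_n\,d\mu \le \prod_{m=1}^{n} \Bigl(\int f_m^{p_m}\,d\mu\Bigr)^{\frac{1}{p_m}}. \]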
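Before attempting part (i), it can help to see the weighted inequality between the geometric and arithmetic means on concrete numbers. The sketch below is only a numerical sanity check, not a solution; the helper name `weighted_means` and the sample values are assumptions of the example.

```python
import math


def weighted_means(a, weights):
    """Return (product of a_m^(mu_m), sum of mu_m * a_m) for weights mu_m summing to 1."""
    if abs(sum(weights) - 1.0) > 1e-12:
        raise ValueError("weights must sum to 1")
    geometric = math.prod(x ** w for x, w in zip(a, weights))
    arithmetic = sum(w * x for x, w in zip(a, weights))
    return geometric, arithmetic


# Equal weights 1/3: the geometric mean of 1, 4, 16 is 4 and the arithmetic mean is 7.
geo, arith = weighted_means([1.0, 4.0, 16.0], [1.0 / 3.0] * 3)
print(geo <= arith)

# Unequal (assumed) weights.
geo_w, arith_w = weighted_means([2.0, 5.0, 0.5], [0.2, 0.5, 0.3])
print(geo_w <= arith_w)
```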

Exercise 6.1.7. When $p = 2$, Minkowski’s and Hölder’s inequalities are intimately related and are both very simple to prove. Indeed, let $f_1$ and $f_2$ be bounded, measurable functions on the finite measure space $(E, \mathcal{B}, \mu)$. Given any $\alpha \ne 0$, observe that

\[ 0 \le \int \Bigl(\alpha f_1 \pm \frac{1}{\alpha} f_2\Bigr)^2 d\mu = \alpha^2 \int f_1^2\,d\mu \pm 2\int f_1 f_2\,d\mu + \frac{1}{\alpha^2} \int f_2^2\,d\mu, \]

from which it follows that

\[ 2\,\Bigl|\int f_1 f_2\,d\mu\Bigr| \le t \int f_1^2\,d\mu + \frac1t \int f_2^2\,d\mu \]

for every $t > 0$. If either integral on the right vanishes, show from the preceding that $\int_E f_1 f_2\,d\mu = 0$. On the other hand, if neither integral vanishes, choose $t > 0$ so that the preceding yields

\[ \Bigl|\int f_1 f_2\,d\mu\Bigr| \le \Bigl(\int f_1^2\,d\mu\Bigr)^{\frac12} \Bigl(\int f_2^2\,d\mu\Bigr)^{\frac12}. \tag{6.1.8} \]

Hence, in any case, (6.1.8) holds. Finally, argue that one can remove the restriction that $f_1$ and $f_2$ be bounded, and then remove the condition that $\mu(E) < \infty$. In particular, even if they are not bounded, so long as $f_1^2$ and $f_2^2$ are $\mu$-integrable, conclude that $f_1 f_2$ must be $\mu$-integrable and that (6.1.8) continues to hold.

Clearly (6.1.8) is the special case of Hölder’s inequality when $p = 2$. Because it is a particularly significant case, it is often referred to by a different name and is called Schwarz’s inequality. Assuming that both $f_1^2$ and $f_2^2$ are $\mu$-integrable, show that the inequality in Schwarz’s inequality is an equality if and only if there exist $(\alpha, \beta) \in \mathbb{R}^2 \setminus \{0\}$ such that $\alpha f_1 + \beta f_2 = 0$ (a.e., $\mu$).

Finally, use Schwarz’s inequality to obtain Minkowski’s inequality for the case $p = 2$. Notice the similarity between the development here and that of the classical triangle inequality for the Euclidean metric on $\mathbb{R}^N$.

Exercise 6.1.9. A geometric proof of Jensen’s inequality can be based on the following. Given a closed, convex subset $C$ of $\mathbb{R}^N$, show that $q \notin C$ if and only if there is an $e_q \in S^{N-1}$ such that $(e_q, q - x)_{\mathbb{R}^N} > 0$ for all $x \in C$. Next, given a probability space $(E, \mathcal{B}, \mu)$ and a $\mu$-integrable $F : E \longrightarrow C$, use the preceding to show that $p \equiv \int F\,d\mu \in C$. Finally, let $g : C \longrightarrow [0,\infty)$ be a continuous, concave function, and use the first part to prove Jensen’s inequality. Here are some steps that you might want to follow in proving Jensen’s inequality.

(i) Show that if $g_1$ and $g_2$ are continuous, concave functions on $C$, then so is $g_1 \wedge g_2$. In particular, if $g$ is a non-negative, continuous, concave function, then $g \wedge n$ is also, and use this to reduce the proof of Jensen’s inequality to the case in which $g$ is bounded.

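(ii) Assume that $g : C \longrightarrow [0,\infty)$ is a bounded, continuous, concave function, and set $\hat{C} = \{(x,t) \in \mathbb{R}^N \times \mathbb{R} : x \in C \text{ and } t \le g(x)\}$. Show that $\hat{C}$ is a closed, convex subset of $\mathbb{R}^{N+1}$. Next, define $\hat{F} : E \longrightarrow \hat{C}$ by $\hat{F} = \begin{pmatrix} F \\ g\circ F \end{pmatrix}$, note that $\hat{F}$ is $\mu$-integrable, and apply the first part to see that its $\mu$-integral is an element of $\hat{C}$. Finally, notice that

\[ \int \hat{F}\,d\mu \in \hat{C} \implies \int g\circ F\,d\mu \le g\Bigl(\int F\,d\mu\Bigr). \]

Exercise 6.1.10. Suppose that $u \in C^2([0,1];\mathbb{R})$ satisfies $u(0) = 0 = u(1)$. The goal of this exercise is to show that

\[ u(t) = -\int_{[0,1]} (s \wedge t - st)\,u''(s)\,ds \quad\text{for } t \in [0,1]. \tag{$*$} \]

In particular, if $u'' \le 0$, then $u \ge 0$.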

(i) Use integration by parts to show that

\[ u(t) = t\,u'(0) + \int_{[0,t]} (t - s)\,u''(s)\,ds \quad\text{for } t \in [0,1]. \]

(ii) Using (i), show that $u'(0) = -\int_{[0,1]} (1 - s)\,u''(s)\,ds$ and therefore that ($*$) holds.
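The representation ($*$) can be tested numerically before it is proved. The sketch below approximates the integral by the midpoint rule for two sample functions vanishing at both endpoints, $u(t) = \sin(\pi t)$ and $u(t) = t(1-t)$. The function name `green_representation`, the step count, and the choice of test functions are assumptions of this example.

```python
import math


def green_representation(u_second, t, steps=20000):
    """Midpoint-rule approximation of -(integral over [0,1] of (min(s,t) - s*t) u''(s) ds)."""
    h = 1.0 / steps
    total = 0.0
    for k in range(steps):
        s = (k + 0.5) * h  # midpoint of the k-th subinterval
        total += (min(s, t) - s * t) * u_second(s)
    return -h * total


def sine_second(s):
    # Second derivative of u(t) = sin(pi t).
    return -math.pi ** 2 * math.sin(math.pi * s)


def parabola_second(s):
    # Second derivative of u(t) = t (1 - t).
    return -2.0


for t in (0.1, 0.5, 0.9):
    err_sine = abs(green_representation(sine_second, t) - math.sin(math.pi * t))
    err_parabola = abs(green_representation(parabola_second, t) - t * (1.0 - t))
    print(t, err_sine < 1e-6, err_parabola < 1e-6)
```

Since the kernel $s \wedge t - st$ is non-negative on $[0,1]^2$, the same computation makes the implication $u'' \le 0 \Rightarrow u \ge 0$ visible term by term.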
