Chapter 6 Basic Inequalities and Lebesgue Spaces
6.1 Jensen, Minkowski, and Hölder
In this section I will derive some inequalities that generalize the inequalities, like the triangle inequality, which are familiar in the Euclidean context.
Since all the inequalities here are consequences of convexity considerations, I will begin by reviewing a few elementary facts about convex sets and concave functions on them. Let $V$ be a real or complex vector space. A subset $C \subseteq V$ is said to be convex if $(1-\alpha)x + \alpha y \in C$ whenever $x, y \in C$ and $\alpha \in [0,1]$.
Given a convex set $C \subseteq V$, $g : C \longrightarrow \mathbb{R}$ is said to be a concave function on $C$ if
\[
g\bigl((1-\alpha)x + \alpha y\bigr) \ge (1-\alpha)g(x) + \alpha g(y) \quad \text{for all } x, y \in C \text{ and } \alpha \in [0,1].
\]
1 Given a vector space $V$, a norm $\|\cdot\|$ on $V$ is a non-negative map with the properties that $\|v\| = 0$ if and only if $v = 0$, $\|\alpha v\| = |\alpha|\,\|v\|$ for all $\alpha \in \mathbb{R}$ and $v \in V$, and $\|v + w\| \le \|v\| + \|w\|$ for all $v, w \in V$. The metric on $V$ determined by the norm $\|\cdot\|$ is the one for which $\|w - v\|$ gives the distance between $v$ and $w$.
D.W. Stroock, Essentials of Integration Theory for Analysis, Graduate Texts in Mathematics 262, DOI 10.1007/978-1-4614-1135-2_6, © Springer Science+Business Media, LLC 2011
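Note that $g$ is concave on $C$ if and only if $\{(x,t) \in C \times \mathbb{R} : t \le g(x)\}$ is a convex subset of $V \oplus \mathbb{R}$. In addition, one can use induction on $n \ge 2$ to see that
\[
\sum_{k=1}^n \alpha_k y_k \in C \quad \text{and} \quad g\Bigl(\sum_{k=1}^n \alpha_k y_k\Bigr) \ge \sum_{k=1}^n \alpha_k g(y_k)
\]
for all $n \ge 2$, $\{y_1, \dots, y_n\} \subseteq C$ and $\{\alpha_1, \dots, \alpha_n\} \subseteq [0,1]$ with $\sum_{k=1}^n \alpha_k = 1$.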
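Namely, if $n = 2$ or $\alpha_n \in \{0,1\}$, then there is nothing to do. On the other hand, if $n \ge 3$ and $\alpha_n \in (0,1)$, set $x = (1-\alpha_n)^{-1} \sum_{k=1}^{n-1} \alpha_k y_k$, and, assuming the result for $n-1$, conclude that
\[
g\Bigl(\sum_{k=1}^n \alpha_k y_k\Bigr) = g\bigl((1-\alpha_n)x + \alpha_n y_n\bigr) \ge (1-\alpha_n)\, g\Bigl(\sum_{k=1}^{n-1} \alpha_k (1-\alpha_n)^{-1} y_k\Bigr) + \alpha_n g(y_n) \ge \sum_{k=1}^n \alpha_k g(y_k).
\]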
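As a numerical sanity check of the finite inequality just derived (an illustration only, not part of the argument), here is a minimal Python sketch. The concave function `math.sqrt` on $[0,\infty)$, the sample points, and the helper names `jensen_gap` and `random_weights` are all assumptions chosen for the example.

```python
import math
import random

def jensen_gap(g, ys, alphas):
    """Return g(sum a_k y_k) - sum a_k g(y_k).

    For a concave g and weights a_k >= 0 summing to 1, this is >= 0.
    """
    mean = sum(a * y for a, y in zip(alphas, ys))
    return g(mean) - sum(a * g(y) for a, y in zip(alphas, ys))

def random_weights(n, rng):
    """Return n non-negative weights normalized to sum to 1."""
    raw = [rng.random() for _ in range(n)]
    total = sum(raw)
    return [r / total for r in raw]

# Hypothetical sample: five points of C = [0, 10] with random convex weights.
rng = random.Random(0)
ys = [rng.uniform(0.0, 10.0) for _ in range(5)]
alphas = random_weights(5, rng)
gap = jensen_gap(math.sqrt, ys, alphas)
print(gap >= 0.0)
```

When all the $y_k$ coincide the gap is zero, which matches the fact that the inequality is an equality for a constant choice of points.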
The essence of the relationship between these notions and measure theory is contained in the following.
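Theorem 6.1.1 (Jensen’s inequality). Let $C$ be a closed, convex subset of $\mathbb{R}^N$, and suppose that $g$ is a continuous, concave, non-negative function on $C$. If $(E, \mathcal{B}, \mu)$ is a probability space and $F : E \longrightarrow C$ a measurable function on $(E, \mathcal{B})$ with the property that $|F| \in L^1(\mu; \mathbb{R})$, then
\[
\int_E F \, d\mu \equiv \begin{pmatrix} \int_E F_1 \, d\mu \\ \vdots \\ \int_E F_N \, d\mu \end{pmatrix} \in C \quad \text{and} \quad \int_E g \circ F \, d\mu \le g\Bigl(\int_E F \, d\mu\Bigr).
\]
(See Exercise 6.1.9 for another derivation.)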
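Proof: First assume that $F$ is simple. Then $F = \sum_{k=0}^n y_k \mathbf{1}_{\Gamma_k}$ for some $n \in \mathbb{Z}^+$, $y_0, \dots, y_n \in C$, and cover $\{\Gamma_0, \dots, \Gamma_n\}$ of $E$ by mutually disjoint elements of $\mathcal{B}$. Thus, since $\sum_{k=0}^n \mu(\Gamma_k) = 1$ and $C$ is convex, $\int_E F \, d\mu = \sum_{k=0}^n y_k \mu(\Gamma_k) \in C$ and, because $g$ is concave and $\sum_{k=0}^n \mu(\Gamma_k) = 1$,
\[
g\Bigl(\int_E F \, d\mu\Bigr) = g\Bigl(\sum_{k=0}^n y_k \mu(\Gamma_k)\Bigr) \ge \sum_{k=0}^n g(y_k) \mu(\Gamma_k) = \int_E g \circ F \, d\mu.
\]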
Now let $F$ be general. The idea is to approximate $F$ by $C$-valued simple functions. For this purpose, choose and fix some element $y_0$ of $C$, and let
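$\{y_k : k \ge 1\}$ be a dense sequence in $C$. Given $m \in \mathbb{Z}^+$, choose $R_m > 0$ and $n_m \in \mathbb{Z}^+$ for which
\[
\int_{\{|F| \ge R_m\}} \bigl(|F| + |y_0|\bigr) \, d\mu \le \frac{1}{m} \quad \text{and} \quad C \cap B(0, R_m) \subseteq \bigcup_{k=1}^{n_m} B\bigl(y_k, \tfrac{1}{m}\bigr).
\]
Next, set $\Gamma_{m,0} = \{\xi \in E : |F(\xi)| \ge R_m\}$, and use induction to define
\[
\Gamma_{m,\ell} = \Bigl\{\xi \in E \setminus \bigcup_{k=0}^{\ell-1} \Gamma_{m,k} : F(\xi) \in B\bigl(y_\ell, \tfrac{1}{m}\bigr)\Bigr\}
\]
for $1 \le \ell \le n_m$. Finally, set $F_m = \sum_{k=0}^{n_m} y_k \mathbf{1}_{\Gamma_{m,k}}$.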
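By construction, the $F_m$’s are simple and $C$-valued. Hence, by the preceding,
\[
\int_E F_m \, d\mu \in C \quad \text{and} \quad g\Bigl(\int_E F_m \, d\mu\Bigr) \ge \int_E g \circ F_m \, d\mu
\]
for each $m \in \mathbb{Z}^+$. Moreover, since $|F - F_m| \le \frac{1}{m}$ on $\bigcup_{\ell=1}^{n_m} \Gamma_{m,\ell} = E \setminus \Gamma_{m,0}$ and $F_m = y_0$ on $\Gamma_{m,0}$,
\[
\int_E |F - F_m| \, d\mu = \sum_{\ell=0}^{n_m} \int_{\Gamma_{m,\ell}} |F - F_m| \, d\mu \le \frac{1}{m} + \int_{\Gamma_{m,0}} \bigl(|F| + |y_0|\bigr) \, d\mu \le \frac{2}{m}.
\]
Thus, $\bigl\| |F_m - F| \bigr\|_{L^1(\mu;\mathbb{R})} \longrightarrow 0$ as $m \to \infty$; and so, because $C$ is closed, we now see that $\int_E F \, d\mu \in C$. At the same time, because (cf. Exercise 3.2.17) $g$ is continuous, $g \circ F_m \longrightarrow g \circ F$ in $\mu$-measure as $m \to \infty$. Hence, by the version of Fatou’s Lemma in Theorem 3.2.12,
\[
\int_E g \circ F \, d\mu \le \liminf_{m \to \infty} \int_E g \circ F_m \, d\mu \le \lim_{m \to \infty} g\Bigl(\int_E F_m \, d\mu\Bigr) = g\Bigl(\int_E F \, d\mu\Bigr).
\]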
In order to apply Jensen’s inequality, we need to develop a criterion for recognizing when a function is concave. Such a criterion is contained in the next theorem. Recall that the Hessian matrix $H_g(x)$ of a function $g$ that is twice continuously differentiable at $x$ is the symmetric matrix given by
\[
H_g(x) \equiv \Bigl(\frac{\partial^2 g}{\partial x_i \, \partial x_j}(x)\Bigr)_{1 \le i, j \le N}.
\]
Also, a symmetric, real $N \times N$ matrix $A$ is said to be non-positive definite if all of its eigenvalues are non-positive, or, equivalently, if $(\xi, A\xi)_{\mathbb{R}^N} \le 0$ for all $\xi \in \mathbb{R}^N$.
Lemma 6.1.2. Suppose that $U$ is an open, convex subset of $\mathbb{R}^N$, and set $C = \overline{U}$. Then $C$ is also convex. Moreover, if $g : C \longrightarrow \mathbb{R}$ is continuous and $g \restriction U$ is concave, then $g$ is concave on all of $C$. Finally, if $g : C \longrightarrow \mathbb{R}$ is continuous and $g \restriction U$ is twice continuously differentiable, then $g$ is concave on $C$ if and only if its Hessian matrix is non-positive definite for each $x \in U$.
Proof: The convexity of $C$ is obvious. In addition, if $g \restriction U$ is concave, the concavity of $g$ on $C$ follows trivially by continuity. Thus, what remains to show is that if $g : U \longrightarrow \mathbb{R}$ is twice continuously differentiable, then $g$ is concave on $U$ if and only if its Hessian is non-positive definite at each $x \in U$.
In order to prove that $g$ is concave on $U$ if $H_g(x)$ is non-positive definite at every $x \in U$, we will use the following simple result about functions on the interval $[0,1]$. Namely, suppose that $u \in C^2([0,1]; \mathbb{R})$, $u(0) = 0 = u(1)$, and $u'' \le 0$. Then $u \ge 0$. To see this (cf. Exercise 6.1.10 for another approach), let $\epsilon > 0$ be given, and consider the function $u_\epsilon \equiv u + \epsilon t(1-t)$. Clearly it is enough to show that $u_\epsilon \ge 0$ on $[0,1]$ for every $\epsilon > 0$. Note that $u_\epsilon(0) = u_\epsilon(1) = 0$ and $u_\epsilon''(t) < 0$ for every $t \in [0,1]$. On the other hand, if $u_\epsilon(t) < 0$ for some $t \in [0,1]$, then there is an $s \in (0,1)$ at which $u_\epsilon$ achieves its absolute minimum. But this is impossible, since then, by the second derivative test, we would have that $u_\epsilon''(s) \ge 0$.
Now assume that $H_g(x)$ is non-positive definite for every $x \in U$. Given $x, y \in U$, define $u(t) = g\bigl((1-t)x + ty\bigr) - (1-t)g(x) - tg(y)$ for $t \in [0,1]$. Then $u(0) = u(1) = 0$ and
\[
u''(t) = \bigl(y - x, \, H_g\bigl((1-t)x + ty\bigr)(y - x)\bigr)_{\mathbb{R}^N} \le 0
\]
for every $t \in [0,1]$. Hence, by the preceding paragraph, $u \ge 0$ on $[0,1]$; and so $g\bigl((1-t)x + ty\bigr) \ge (1-t)g(x) + tg(y)$ for all $t \in [0,1]$. In other words, $g$ is concave on $U$ and therefore on $C$.
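To complete the proof, suppose that $H_g(x)$ has a positive eigenvalue for some $x \in U$. We can then find an $e \in S^{N-1}$ and an $\epsilon > 0$ such that $\bigl(e, H_g(x)e\bigr)_{\mathbb{R}^N} > 0$ and $x + te \in U$ for all $t \in (-\epsilon, \epsilon)$. Set $u(t) = g(x + te)$ for $t \in (-\epsilon, \epsilon)$. Then $u''(0) = \bigl(e, H_g(x)e\bigr)_{\mathbb{R}^N} > 0$. On the other hand,
\[
u''(0) = \lim_{t \to 0} \frac{u(t) + u(-t) - 2u(0)}{t^2},
\]
and, if $g$ were concave,
\[
2u(0) = 2u\Bigl(\frac{t - t}{2}\Bigr) = 2g\Bigl(\tfrac{1}{2}(x + te) + \tfrac{1}{2}(x - te)\Bigr) \ge g(x + te) + g(x - te) = u(t) + u(-t),
\]
from which we would get the contradiction $0 < u''(0) \le 0$.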
When N = 2, the following lemma provides a useful test for non-positive definiteness.
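Lemma 6.1.3. Let $A = \begin{pmatrix} a & b \\ b & c \end{pmatrix}$ be a real symmetric matrix. Then $A$ is non-positive definite if and only if both $a + c \le 0$ and $ac \ge b^2$. In particular, for each $\alpha \in (0,1)$, the functions $(x,y) \in [0,\infty)^2 \longmapsto x^\alpha y^{1-\alpha}$ and $(x,y) \in [0,\infty)^2 \longmapsto \bigl(x^\alpha + y^\alpha\bigr)^{\frac{1}{\alpha}}$ are continuous and concave.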
Proof: In view of Lemma 6.1.2, it suffices to check the first assertion. To this end, let $T = a + c$ be the trace and $D = ac - b^2$ the determinant of $A$. Also, let $\lambda$ and $\mu$ denote the eigenvalues of $A$. Then $T = \lambda + \mu$ and $D = \lambda\mu$. If $A$ is non-positive definite and therefore $\lambda \vee \mu \le 0$, then it is obvious that $T \le 0$ and that $D \ge 0$. Conversely, if $D > 0$, then either both $\lambda$ and $\mu$ are positive or both are negative. Hence if, in addition, $T \le 0$, then $\lambda$ and $\mu$ are negative. Finally, if $D = 0$ and $T \le 0$, then either $\lambda = 0$ and $\mu = T \le 0$ or $\mu = 0$ and $\lambda = T \le 0$.
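The trace and determinant test can be tried out numerically. The Python sketch below is only an illustration under stated assumptions: the function names, the optional tolerance `tol` (added because floating-point rounding can spoil an exact comparison), and the sample matrices are not from the text. It also evaluates the Hessian of $(x,y) \longmapsto x^\alpha y^{1-\alpha}$ at an interior point, computed by hand from the definition; its determinant vanishes identically, so the borderline case $ac = b^2$ is the relevant one there.

```python
import math

def is_nonpositive_definite(a, b, c, tol=0.0):
    """Test of Lemma 6.1.3 for [[a, b], [b, c]]: a + c <= 0 and a*c >= b**2."""
    return a + c <= tol and a * c - b * b >= -tol

def eigenvalues(a, b, c):
    """Eigenvalues of [[a, b], [b, c]] via the quadratic formula (smaller first)."""
    half_trace = (a + c) / 2.0
    radius = math.hypot((a - c) / 2.0, b)
    return half_trace - radius, half_trace + radius

def hessian_geometric_mean(x, y, alpha):
    """Entries (a, b, c) of the Hessian of x**alpha * y**(1 - alpha) for x, y > 0."""
    k = alpha * (1.0 - alpha)
    a = -k * x ** (alpha - 2.0) * y ** (1.0 - alpha)   # second derivative in x
    b = k * x ** (alpha - 1.0) * y ** (-alpha)         # mixed derivative
    c = -k * x ** alpha * y ** (-alpha - 1.0)          # second derivative in y
    return a, b, c

# The criterion agrees with the sign of the larger eigenvalue on a sample matrix.
print(is_nonpositive_definite(-2.0, 1.0, -1.0), eigenvalues(-2.0, 1.0, -1.0)[1] <= 0.0)
# Hessian of the weighted geometric mean at a hypothetical point (2, 3).
print(is_nonpositive_definite(*hessian_geometric_mean(2.0, 3.0, 0.3), tol=1e-12))
```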
My first application of these considerations provides a generalization, known as Minkowski’s inequality, of the triangle inequality.
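Theorem 6.1.4 (Minkowski’s inequality). Let $f_1$ and $f_2$ be non-negative, measurable functions on the measure space $(E, \mathcal{B}, \mu)$. Then, for every $p \in [1, \infty)$,
\[
\Bigl(\int_E (f_1 + f_2)^p \, d\mu\Bigr)^{\frac{1}{p}} \le \Bigl(\int_E f_1^p \, d\mu\Bigr)^{\frac{1}{p}} + \Bigl(\int_E f_2^p \, d\mu\Bigr)^{\frac{1}{p}}.
\]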
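Proof: The case $p = 1$ follows from (3.1.10), and so we will assume that $p \in (1, \infty)$. Also, without loss of generality, we will assume that $f_1^p$ and $f_2^p$ are $\mu$-integrable and that $f_1$ and $f_2$ are $[0,\infty)$-valued.
Let $p \in (1, \infty)$ be given. If we assume that $\mu(E) = 1$ and take $\alpha = \frac{1}{p}$, then, by Lemma 6.1.3 and Jensen’s inequality,
\[
\int_E (f_1 + f_2)^p \, d\mu = \int_E \Bigl[\bigl(f_1^p\bigr)^\alpha + \bigl(f_2^p\bigr)^\alpha\Bigr]^{\frac{1}{\alpha}} d\mu \le \Bigl[\Bigl(\int_E f_1^p \, d\mu\Bigr)^\alpha + \Bigl(\int_E f_2^p \, d\mu\Bigr)^\alpha\Bigr]^{\frac{1}{\alpha}} = \Bigl[\Bigl(\int_E f_1^p \, d\mu\Bigr)^{\frac{1}{p}} + \Bigl(\int_E f_2^p \, d\mu\Bigr)^{\frac{1}{p}}\Bigr]^p.
\]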
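More generally, if $\mu(E) = 0$ there is nothing to do, and if $0 < \mu(E) < \infty$ we can replace $\mu$ by $\frac{\mu}{\mu(E)}$ and apply the preceding. Hence, all that remains is the case $\mu(E) = \infty$. But if $\mu(E) = \infty$, take $E_n = \bigl\{f_1 \vee f_2 \ge \frac{1}{n}\bigr\}$, note that $\mu(E_n) \le n^p \int f_1^p \, d\mu + n^p \int f_2^p \, d\mu < \infty$, apply the preceding to $f_1$, $f_2$, and $\mu$ all restricted to $E_n$, and let $n \to \infty$.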
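For a concrete feel for Minkowski’s inequality, one can take $E$ to be a finite set, so that integrals become weighted sums. The Python sketch below is a minimal illustration under that assumption; the names `lp_norm` and `minkowski_gap`, and the sample data, are hypothetical and chosen only for the example.

```python
def lp_norm(f, weights, p):
    """(sum_i w_i * f_i**p)**(1/p) for non-negative values f_i and weights w_i."""
    return sum(w * x ** p for x, w in zip(f, weights)) ** (1.0 / p)

def minkowski_gap(f1, f2, weights, p):
    """Right-hand side minus left-hand side of Minkowski's inequality."""
    total = [x + y for x, y in zip(f1, f2)]
    return lp_norm(f1, weights, p) + lp_norm(f2, weights, p) - lp_norm(total, weights, p)

# Two functions on a three-point measure space with point masses 0.5, 1.0, 2.0.
weights = [0.5, 1.0, 2.0]
f1 = [1.0, 0.0, 3.0]
f2 = [2.0, 5.0, 0.5]
for p in (1.0, 1.5, 2.0, 4.0):
    print(p, minkowski_gap(f1, f2, weights, p) >= -1e-12)
```

When $f_2$ is a non-negative multiple of $f_1$, the gap is zero up to rounding, which reflects the equality case familiar from the Euclidean triangle inequality.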
The next application, which is known as Hölder’s inequality, gives a generalization of the inner product inequality $|(\xi, \eta)_{\mathbb{R}^N}| \le |\xi||\eta|$ for $\xi, \eta \in \mathbb{R}^N$. In the Euclidean context, this inequality can be seen as an application of the law of the cosine, which says that the inner product of vectors is the product of their lengths and the cosine of the angle between them.
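Theorem 6.1.5 (Hölder’s inequality). Given $p \in (1, \infty)$, define the Hölder conjugate $p'$ of $p \in (1, \infty)$ by the equation $\frac{1}{p} + \frac{1}{p'} = 1$ (i.e., $p' = \frac{p}{p-1}$). Then, for every pair of non-negative, measurable functions $f_1$ and $f_2$ on the measure space $(E, \mathcal{B}, \mu)$,
\[
\int_E f_1 f_2 \, d\mu \le \Bigl(\int_E f_1^p \, d\mu\Bigr)^{\frac{1}{p}} \Bigl(\int_E f_2^{p'} \, d\mu\Bigr)^{\frac{1}{p'}}
\]
for every $p \in (1, \infty)$.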
Proof: First note that if either factor on the right-hand side of the above inequality is 0, then $f_1 f_2 = 0$ (a.e., $\mu$), and so the left-hand side is also 0. Thus we will assume that both factors on the right are strictly positive, in which case, we may and will assume in addition that both $f_1^p$ and $f_2^{p'}$ are $\mu$-integrable and that $f_1$ and $f_2$ are both $[0,\infty)$-valued. Also, just as in the proof of Minkowski’s inequality, we can reduce everything to the case $\mu(E) = 1$. But then we can apply Jensen’s inequality and Lemma 6.1.3 with $\alpha = \frac{1}{p}$ to see that
\[
\int_E f_1 f_2 \, d\mu = \int_E \bigl(f_1^p\bigr)^\alpha \bigl(f_2^{p'}\bigr)^{1-\alpha} d\mu \le \Bigl(\int_E f_1^p \, d\mu\Bigr)^\alpha \Bigl(\int_E f_2^{p'} \, d\mu\Bigr)^{1-\alpha} = \Bigl(\int_E f_1^p \, d\mu\Bigr)^{\frac{1}{p}} \Bigl(\int_E f_2^{p'} \, d\mu\Bigr)^{\frac{1}{p'}}.
\]
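Hölder’s inequality can likewise be checked on a finite measure space, where the integrals are weighted sums. The following Python sketch is an illustration only; the names `holder_conjugate` and `holder_sides` and the sample data are assumptions made for the example. The last case takes $f_2^{p'}$ proportional to $f_1^p$, for which the two sides agree up to rounding.

```python
def holder_conjugate(p):
    """Return p' with 1/p + 1/p' = 1, for p in (1, infinity)."""
    return p / (p - 1.0)

def holder_sides(f1, f2, weights, p):
    """Return (left, right) sides of Holder's inequality for weighted sums."""
    q = holder_conjugate(p)
    left = sum(w * x * y for x, y, w in zip(f1, f2, weights))
    norm1 = sum(w * x ** p for x, w in zip(f1, weights)) ** (1.0 / p)
    norm2 = sum(w * y ** q for y, w in zip(f2, weights)) ** (1.0 / q)
    return left, norm1 * norm2

# Hypothetical data on a two-point space with counting measure.
left, right = holder_sides([1.0, 2.0], [3.0, 4.0], [1.0, 1.0], 2.0)
print(left <= right)

# Equality case: with p = 3 and p' = 3/2, f2 = f1**2 gives f2**p' = f1**p.
eq_left, eq_right = holder_sides([1.0, 2.0], [1.0, 4.0], [1.0, 1.0], 3.0)
print(abs(eq_left - eq_right) < 1e-9)
```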
Exercises for §6.1
Exercise 6.1.6. Here are a couple of easy applications of the preceding ideas.
(i) Show that $\log$ is continuous and concave on every interval $[\epsilon, \infty)$ with $\epsilon > 0$. Use this together with Jensen’s inequality to show that for every $n \in \mathbb{Z}^+$, $\mu_1, \dots, \mu_n \in (0,1)$ satisfying $\sum_{m=1}^n \mu_m = 1$, and $a_1, \dots, a_n \in [0, \infty)$,
\[
\prod_{m=1}^n a_m^{\mu_m} \le \sum_{m=1}^n \mu_m a_m.
\]
In particular, when $\mu_m = \frac{1}{n}$ for every $1 \le m \le n$, this yields $(a_1 \cdots a_n)^{\frac{1}{n}} \le \frac{1}{n} \sum_{m=1}^n a_m$, which is the statement that the arithmetic mean dominates the geometric mean.
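(ii) Let $n \in \mathbb{Z}^+$, and suppose that $f_1, \dots, f_n$ are non-negative, measurable functions on the measure space $(E, \mathcal{B}, \mu)$. Given $p_1, \dots, p_n \in (1, \infty)$ satisfying $\sum_{m=1}^n \frac{1}{p_m} = 1$, show that
\[
\int_E f_1 \cdots f_n \, d\mu \le \prod_{m=1}^n \Bigl(\int_E f_m^{p_m} \, d\mu\Bigr)^{\frac{1}{p_m}}.
\]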
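Before attempting part (i), it may help to see the weighted inequality between geometric and arithmetic means on concrete numbers. The Python sketch below is an illustration only and proves nothing; the function name `weighted_means` and the sample values are assumptions made for the example.

```python
import math

def weighted_means(a, mu):
    """Return (prod a_m**mu_m, sum mu_m * a_m) for a_m >= 0 and weights mu_m summing to 1."""
    geometric = math.prod(x ** m for x, m in zip(a, mu))
    arithmetic = sum(m * x for x, m in zip(a, mu))
    return geometric, arithmetic

# Equal weights: the classical comparison of geometric and arithmetic means.
geo, ari = weighted_means([1.0, 4.0], [0.5, 0.5])
print(geo <= ari)

# Unequal weights on three hypothetical numbers.
geo3, ari3 = weighted_means([2.0, 8.0, 0.5], [0.2, 0.3, 0.5])
print(geo3 <= ari3)
```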
Exercise 6.1.7. When $p = 2$, Minkowski’s and Hölder’s inequalities are intimately related and are both very simple to prove. Indeed, let $f_1$ and $f_2$ be bounded, measurable functions on the finite measure space $(E, \mathcal{B}, \mu)$. Given any $\alpha \ne 0$, observe that
\[
0 \le \int_E \Bigl(\alpha f_1 \pm \frac{1}{\alpha} f_2\Bigr)^2 d\mu = \alpha^2 \int_E f_1^2 \, d\mu \pm 2 \int_E f_1 f_2 \, d\mu + \frac{1}{\alpha^2} \int_E f_2^2 \, d\mu,
\]
from which it follows that
\[
2 \Bigl|\int_E f_1 f_2 \, d\mu\Bigr| \le t \int_E f_1^2 \, d\mu + \frac{1}{t} \int_E f_2^2 \, d\mu
\]
for every $t > 0$. If either integral on the right vanishes, show from the preceding that $\int_E f_1 f_2 \, d\mu = 0$. On the other hand, if neither integral vanishes, choose $t > 0$ so that the preceding yields
\[
(6.1.8) \qquad \Bigl|\int_E f_1 f_2 \, d\mu\Bigr| \le \Bigl(\int_E f_1^2 \, d\mu\Bigr)^{\frac{1}{2}} \Bigl(\int_E f_2^2 \, d\mu\Bigr)^{\frac{1}{2}}.
\]
Hence, in any case, (6.1.8) holds. Finally, argue that one can remove the restriction that $f_1$ and $f_2$ be bounded, and then remove the condition that $\mu(E) < \infty$. In particular, even if they are not bounded, so long as $f_1^2$ and $f_2^2$ are $\mu$-integrable, conclude that $f_1 f_2$ must be $\mu$-integrable and that (6.1.8) continues to hold.
Clearly (6.1.8) is the special case of Hölder’s inequality when $p = 2$. Because it is a particularly significant case, it is often referred to by a different name and is called Schwarz’s inequality. Assuming that both $f_1^2$ and $f_2^2$ are $\mu$-integrable, show that the inequality in Schwarz’s inequality is an equality if and only if there exist $(\alpha, \beta) \in \mathbb{R}^2 \setminus \{0\}$ such that $\alpha f_1 + \beta f_2 = 0$ (a.e., $\mu$).
Finally, use Schwarz’s inequality to obtain Minkowski’s inequality for the case $p = 2$. Notice the similarity between the development here and that of the classical triangle inequality for the Euclidean metric on $\mathbb{R}^N$.
Exercise 6.1.9. A geometric proof of Jensen’s inequality can be based on the following. Given a closed, convex subset $C$ of $\mathbb{R}^N$, show that $q \notin C$ if and only if there is an $e_q \in S^{N-1}$ such that $(e_q, q - x)_{\mathbb{R}^N} > 0$ for all $x \in C$. Next, given a probability space $(E, \mathcal{B}, \mu)$ and a $\mu$-integrable $F : E \longrightarrow C$, use the preceding to show that $p \equiv \int F \, d\mu \in C$. Finally, let $g : C \longrightarrow [0, \infty)$ be a continuous, concave function, and use the first part to prove Jensen’s inequality. Here are some steps that you might want to follow in proving Jensen’s inequality.
(i) Show that if $g_1$ and $g_2$ are continuous, concave functions on $C$, then so is $g_1 \wedge g_2$. In particular, if $g$ is a non-negative, continuous, concave function, then $g \wedge n$ is also, and use this to reduce the proof of Jensen’s inequality to the case in which $g$ is bounded.
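(ii) Assume that $g : C \longrightarrow [0, \infty)$ is a bounded, continuous, concave function, and set $\hat{C} = \{(x, t) \in \mathbb{R}^N \times \mathbb{R} : x \in C \text{ and } t \le g(x)\}$. Show that $\hat{C}$ is a closed, convex subset of $\mathbb{R}^{N+1}$. Next, define $\hat{F} : E \longrightarrow \hat{C}$ by $\hat{F} = \begin{pmatrix} F \\ g \circ F \end{pmatrix}$, note that $\hat{F}$ is $\mu$-integrable, and apply the first part to see that its $\mu$-integral is an element of $\hat{C}$. Finally, notice that
\[
\int \hat{F} \, d\mu \in \hat{C} \implies \int g \circ F \, d\mu \le g\Bigl(\int F \, d\mu\Bigr).
\]
Exercise 6.1.10. Suppose that $u \in C^2([0,1]; \mathbb{R})$ satisfies $u(0) = 0 = u(1)$. The goal of this exercise is to show that
\[
(*) \qquad u(t) = -\int_{[0,1]} (s \wedge t - st) \, u''(s) \, ds \quad \text{for } t \in [0,1].
\]
In particular, if $u'' \le 0$, then $u \ge 0$.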
(i) Use integration by parts to show that
\[
u(t) = t u'(0) + \int_{[0,t]} (t - s) \, u''(s) \, ds \quad \text{for } t \in [0,1].
\]
(ii) Using (i), show that $u'(0) = -\int_{[0,1]} (1 - s) \, u''(s) \, ds$ and therefore that $(*)$ holds.
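The representation $(*)$ can be tested numerically before one proves it. The Python sketch below approximates the integral by a midpoint rule; the function name `green_representation`, the number of steps, and the two test functions $u(t) = t(1-t)$ and $u(t) = \sin(\pi t)$ are assumptions made for this illustration, which is no substitute for the exercise.

```python
import math

def green_representation(u_second, t, steps=2000):
    """Midpoint-rule approximation of -integral over [0,1] of (min(s,t) - s*t) * u''(s) ds."""
    h = 1.0 / steps
    total = 0.0
    for i in range(steps):
        s = (i + 0.5) * h  # midpoint of the i-th subinterval
        total += (min(s, t) - s * t) * u_second(s)
    return -h * total

# u(t) = t(1 - t) has u'' = -2 and vanishes at both endpoints.
approx_quadratic = green_representation(lambda s: -2.0, 0.3)
print(abs(approx_quadratic - 0.3 * 0.7) < 1e-6)

# u(t) = sin(pi t) has u''(s) = -pi**2 * sin(pi s) and also vanishes at 0 and 1.
approx_sine = green_representation(lambda s: -math.pi ** 2 * math.sin(math.pi * s), 0.5)
print(abs(approx_sine - 1.0) < 1e-4)
```

Since $s \wedge t - st \ge 0$ on $[0,1]^2$, the sign of the output is opposite to that of $u''$, which is the content of the remark that $u'' \le 0$ forces $u \ge 0$.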