Conditional Measures and Algebras
5.2 Martingales
\[
\varepsilon\mu(E) \le \int_E E(f \mid \mathcal{A})\,d\mu = \int_E f\,d\mu \le \|f\|_1
\]
as required. The next lemma will be a very useful generalization of this simple observation, which is an analog of a maximal inequality (Theorem 2.24).
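Lemma 5.6 (Doob’s inequality). Let $f \in L^1(X,\mathcal{B},\mu)$, let $\mathcal{A}_1 \subseteq \mathcal{A}_2 \subseteq \cdots \subseteq \mathcal{A}_N \subseteq \mathcal{B}$ be an increasing list of σ-algebras, and fix $\lambda > 0$. Let
\[
E = \{x \mid \max_{1 \le i \le N} E(f \mid \mathcal{A}_i) > \lambda\}.
\]
Then
\[
\mu(E) \le \frac{1}{\lambda}\|f\|_1.
\]
If $(\mathcal{A}_n)_{n \ge 1}$ is an increasing (or decreasing) sequence of σ-algebras then the same conclusion holds for the set
\[
E = \{x \mid \sup_{i \ge 1} E(f \mid \mathcal{A}_i) > \lambda\}.
\]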
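Proof. Assume that $f \ge 0$ (if necessary replacing $f$ by $|f|$, which makes $\mu(E)$ no smaller). Let
\[
E_n = \{x \mid E(f \mid \mathcal{A}_n) > \lambda \text{ but } E(f \mid \mathcal{A}_i) \le \lambda \text{ for } 1 \le i \le n-1\}.
\]
Then $E = E_1 \sqcup \cdots \sqcup E_N$ and $E_n \in \mathcal{A}_n$ since $\mathcal{A}_1, \mathcal{A}_2, \ldots, \mathcal{A}_{n-1} \subseteq \mathcal{A}_n$. (In the decreasing case of finitely many σ-algebras we may reverse the order of the σ-algebras, since the statement we wish to prove is independent of the order.) It follows that
\[
\|f\|_1 \ge \int_E f\,d\mu = \sum_{n=1}^{N} \int_{E_n} f\,d\mu = \sum_{n=1}^{N} \int_{E_n} E(f \mid \mathcal{A}_n)\,d\mu \ge \sum_{n=1}^{N} \lambda\mu(E_n) = \lambda\mu(E).
\]
Taking $N \to \infty$ shows the final remark.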
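As a numerical sanity check of Lemma 5.6, the following Python sketch tests the inequality on a finite probability space. Everything specific here is an assumption made for illustration and not part of the text: the space $X = \{0, \ldots, 2^{10}-1\}$ with uniform measure, the increasing σ-algebras generated by dyadic blocks, and the helper names `cond_exp_dyadic` and `doob_sets`.

```python
import random

def cond_exp_dyadic(f, level):
    """E(f | A_level) on X = {0, ..., len(f)-1} with uniform measure, where
    A_level is generated by 2**level consecutive blocks of equal size
    (len(f) is assumed to be a power of two, at least 2**level)."""
    n = len(f)
    block = n // (2 ** level)
    out = []
    for start in range(0, n, block):
        avg = sum(f[start:start + block]) / block
        out.extend([avg] * block)
    return out

def doob_sets(f, levels, lam):
    """Return (mu(E), ||f||_1 / lam) for E = {max_i E(f | A_i) > lam}."""
    n = len(f)
    exps = [cond_exp_dyadic(f, i) for i in levels]
    maximal = [max(e[x] for e in exps) for x in range(n)]
    mu_E = sum(1 for v in maximal if v > lam) / n
    l1_norm = sum(abs(v) for v in f) / n
    return mu_E, l1_norm / lam

rng = random.Random(0)  # fixed seed so the run is reproducible
f = [rng.gauss(0.0, 1.0) for _ in range(2 ** 10)]
for lam in (0.25, 0.5, 1.0, 2.0):
    mu_E, bound = doob_sets(f, range(0, 11), lam)
    print(lam, mu_E, bound, mu_E <= bound)
```

The last column should always be `True`, since the lemma applies to any integrable function and any increasing list of σ-algebras; the sketch only illustrates the statement and proves nothing.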
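Proof of Theorem 5.5. Using Theorem 5.1(4), we may replace the function $f$ by $E(f \mid \mathcal{A})$ without changing $E(f \mid \mathcal{A}_n)$.
The theorem holds for all $f \in L^1(X,\mathcal{A}_n,\mu)$, $n \ge 1$. Now $\bigcup_{n \ge 1} L^1(X,\mathcal{A}_n,\mu)$ is dense in $L^1(X,\mathcal{A},\mu)$. To see this, notice that
\[
\{B \in \mathcal{A} \mid \text{for every } \varepsilon > 0 \text{ there exist } m \ge 1,\ A \in \mathcal{A}_m \text{ with } \mu(A \triangle B) < \varepsilon\}
\]
is a σ-algebra by Theorem A.7. Given any $f \in L^1(X,\mathcal{A},\mu)$ and $\varepsilon > 0$, find $m$ and $g \in L^1(X,\mathcal{A}_m,\mu)$ with $\|f-g\|_1 < \varepsilon$, so that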
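\[
\|E(f \mid \mathcal{A}_n) - f\|_1 \le \|E(f \mid \mathcal{A}_n) - E(g \mid \mathcal{A}_n)\|_1 + \underbrace{\|E(g \mid \mathcal{A}_n) - g\|_1}_{=\,0 \text{ for } n \ge m} + \|g-f\|_1 < 2\varepsilon
\]
for $n \ge m$. It follows that
\[
\begin{aligned}
\mu\bigl(\{x \mid \limsup_{n\to\infty} |E(f \mid \mathcal{A}_n) - f| > \sqrt{\varepsilon}\}\bigr)
&= \mu\bigl(\{x \mid \limsup_{n\to\infty} |E(f-g \mid \mathcal{A}_n) - (f-g)| > \sqrt{\varepsilon}\}\bigr)\\
&\le \mu\bigl(\{x \mid \sup_{n \ge 1} |E(f-g \mid \mathcal{A}_n)| > \tfrac{1}{2}\sqrt{\varepsilon}\}\bigr) + \mu\bigl(\{x \mid |f-g| > \tfrac{1}{2}\sqrt{\varepsilon}\}\bigr)\\
&\le \frac{2}{\sqrt{\varepsilon}}\|f-g\|_1 + \frac{2}{\sqrt{\varepsilon}}\|f-g\|_1 \le 4\sqrt{\varepsilon}
\end{aligned}
\]
by Lemma 5.6, so
\[
\limsup_{n\to\infty} |E(f \mid \mathcal{A}_n) - f| = 0
\]
almost everywhere, showing the almost everywhere convergence.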
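The $L^1$ convergence in Theorem 5.5 can be observed numerically in a simple case. The sketch below is an illustration under stated assumptions, not part of the text: it takes $f(x) = x^2$ sampled at the midpoints of a grid of $2^{10}$ cells in $[0,1)$, lets $\mathcal{A}_n$ be generated by the dyadic intervals of length $2^{-n}$, and uses the hypothetical helper names `dyadic_cond_exp` and `l1_distance`.

```python
def dyadic_cond_exp(values, n):
    """E(f | A_n) for f sampled on a uniform grid of [0, 1), where A_n is
    generated by the 2**n dyadic intervals of length 2**-n
    (len(values) is assumed to be a power of two, at least 2**n)."""
    block = len(values) // (2 ** n)
    out = []
    for start in range(0, len(values), block):
        chunk = values[start:start + block]
        out.extend([sum(chunk) / block] * block)
    return out

def l1_distance(u, v):
    """L^1 distance with respect to the uniform measure on the grid."""
    return sum(abs(a - b) for a, b in zip(u, v)) / len(u)

K = 10
grid = [(j + 0.5) / 2 ** K for j in range(2 ** K)]  # cell midpoints
f = [t * t for t in grid]
errors = [l1_distance(dyadic_cond_exp(f, n), f) for n in range(K + 1)]
for n, err in enumerate(errors):
    print(n, err)
```

On this grid the finest σ-algebra separates all sample points, so the error vanishes at $n = K$; for a genuine function on $[0,1]$ the error only tends to zero as $n \to \infty$.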
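A similar result holds for decreasing sequences of σ-algebras as follows.
The notation $\mathcal{A}_n \searrow \mathcal{A}_\infty$ used below means that $\mathcal{A}_{n+1} \subseteq \mathcal{A}_n$ for all $n \ge 1$ and
\[
\mathcal{A}_\infty = \bigcap_{n \ge 1} \mathcal{A}_n.
\]
Example 5.7. Let $\mathcal{B}$ denote the Borel σ-algebra on $[0,1]$ and let
\[
\mathcal{A}_n = \{B \in \mathcal{B} \mid B + \tfrac{1}{2^n} = B \pmod 1\}
\]
so that $\mathcal{A}_n \searrow \mathcal{N} = \{\emptyset, X\}$ modulo $m$ (meaning that $\bigcap_{n \ge 1} \mathcal{A}_n \underset{m}{=} \{\emptyset, X\}$, where $m$ denotes Lebesgue measure on $[0,1]$). As before, what is the connection between the convergence of σ-algebras and the convergence of $E(f \mid \mathcal{A}_n)$?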
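For the σ-algebras of Example 5.7 the conditional expectation can be computed by hand: a function is $\mathcal{A}_n$-measurable when it is invariant under $x \mapsto x + 2^{-n} \pmod 1$, so $E(f \mid \mathcal{A}_n)(x)$ is the average of $f$ over the $2^n$ points $x + k/2^n \pmod 1$. The Python sketch below evaluates this average for one test function; the choice $f(x) = x^2$, the sample point $0.3$, and the name `cond_exp_rotation` are illustrative assumptions.

```python
def cond_exp_rotation(f, x, n):
    """E(f | A_n)(x) for A_n = Borel sets invariant under x -> x + 2**-n
    (mod 1): the average of f over the finite orbit of x."""
    steps = 2 ** n
    return sum(f((x + k / steps) % 1.0) for k in range(steps)) / steps

def f(t):
    return t * t  # test function; its integral over [0, 1] is 1/3

for n in (0, 2, 4, 8, 10):
    print(n, cond_exp_rotation(f, 0.3, n))
```

The printed values approach $\int f \, dm = 1/3$, which is the conditional expectation with respect to the trivial σ-algebra $\mathcal{N}$, in line with the decreasing martingale theorem stated below.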
As mentioned at the start of this section, the kind of convergence sought here resembles an ergodic theorem(58). Indeed, the proof is similar in some ways to the proofs of the ergodic theorems (Theorems 2.21 and 2.30). The usual proof of the decreasing martingale theorem is somewhat opaque because it takes place in $L^1$ rather than in $L^2$, forcing us to replace the geometric methods available in Hilbert space with more flexible methods from functional analysis. To illuminate the different approaches (and the more geometrical approach that working in $L^2$ allows) we give two different arguments for the first part of the proof. Of course the theorem itself is an assertion about $L^1$ convergence, so at some point we must work in $L^1$.
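Theorem 5.8 (Decreasing martingale theorem). Let $(X,\mathcal{B},\mu)$ be a probability space. If $\mathcal{A}_n \searrow \mathcal{A}_\infty$ is a decreasing sequence of sub-σ-algebras of $\mathcal{B}$ then
\[
E(f \mid \mathcal{A}_n) \longrightarrow E(f \mid \mathcal{A}_\infty)
\]
almost everywhere and in $L^1$, for any $f \in L^1(X,\mathcal{B},\mu)$.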
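First part of proof of Theorem 5.8, using $L^2$. Recall from the proof of Theorem 5.1 that in $L^2(X,\mathcal{B},\mu)$ the conditional expectation with respect to $\mathcal{A}_n$ (or $\mathcal{A}_\infty$) is precisely the orthogonal projection to $L^2(X,\mathcal{A}_n,\mu)$ (resp. $L^2(X,\mathcal{A}_\infty,\mu)$). Let $V_n = L^2(X,\mathcal{A}_n,\mu)^{\perp}$ and let $V^* = \bigcup_{n \ge 1} V_n$. Notice that for $f \in L^2(X,\mathcal{A}_\infty,\mu) + V^*$ the theorem holds trivially because
\[
E(f \mid \mathcal{A}_n) = E(f \mid \mathcal{A}_\infty)
\]
for sufficiently large $n$. We claim that
\[
V = L^2(X,\mathcal{A}_\infty,\mu) + V^*
\]
is dense in $L^2(X,\mathcal{B},\mu)$ with respect to the $L^2_\mu$ norm. To see this, we may use the Riesz representation theorem (see Sect. B.5). If $V$ is not dense in $L^2(X,\mathcal{B},\mu)$, then there is a continuous non-zero linear functional
\[
f \longmapsto \int f\,\overline{h}\,d\mu
\]
defined by some $h \in L^2(X,\mathcal{B},\mu)$ such that
\[
\int f\,\overline{h}\,d\mu = 0
\]
for all $f \in V$, and this leads to a contradiction as follows. Clearly
\[
h - E(h \mid \mathcal{A}_n) \in V_n \subseteq V^*,
\]
so
\[
\int \bigl(h - E(h \mid \mathcal{A}_n)\bigr)\,\overline{h}\,d\mu = 0.
\]
Since $f \mapsto E(f \mid \mathcal{A}_n)$ is the orthogonal projection, we also have
\[
\int \bigl(h - E(h \mid \mathcal{A}_n)\bigr)\,\overline{E(h \mid \mathcal{A}_n)}\,d\mu = 0,
\]
which implies that
\[
\int \bigl|h - E(h \mid \mathcal{A}_n)\bigr|^2\,d\mu = 0
\]
and so $h = E(h \mid \mathcal{A}_n) \in L^2(X,\mathcal{A}_n,\mu)$ for all $n \ge 1$. We conclude that
\[
h \in L^2(X,\mathcal{A}_\infty,\mu) \subseteq V,
\]
and $\int h\,\overline{h}\,d\mu = 0$, so $h = 0$. This contradiction shows that $V$ is dense in $L^2(X,\mathcal{B},\mu)$ with respect to the $L^2_\mu$ norm.
Now $\|\cdot\|_1 \le \|\cdot\|_2$ and $L^2(X,\mathcal{B},\mu) \subseteq L^1(X,\mathcal{B},\mu)$ is dense with respect to the $L^1_\mu$ norm. It follows that $V$ is also dense in $L^1(X,\mathcal{B},\mu)$ with respect to the $L^1_\mu$ norm.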
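The fact driving the $L^2$ argument, that conditional expectation is the orthogonal projection onto the measurable functions, is easy to check on a finite probability space. The following Python sketch is an illustration only; the 12-point space with uniform measure, the partition `blocks` generating the σ-algebra, and the helper names `cond_exp` and `inner` are assumptions and do not come from the text. It works with real-valued functions, so no complex conjugate appears.

```python
import random

def cond_exp(h, blocks):
    """E(h | A) on a finite uniform probability space, where A is generated
    by the partition `blocks` (a list of lists of indices)."""
    out = [0.0] * len(h)
    for block in blocks:
        avg = sum(h[i] for i in block) / len(block)
        for i in block:
            out[i] = avg
    return out

def inner(u, v):
    """Real L^2 inner product with respect to the uniform measure."""
    return sum(a * b for a, b in zip(u, v)) / len(u)

rng = random.Random(0)  # fixed seed so the run is reproducible
h = [rng.uniform(-1.0, 1.0) for _ in range(12)]
blocks = [[0, 1, 2], [3, 4, 5, 6], [7], [8, 9, 10, 11]]
p = cond_exp(h, blocks)
residual = [a - b for a, b in zip(h, p)]

# h - E(h|A) is orthogonal to E(h|A); this vanishes up to rounding.
print(inner(residual, p))
# Pythagoras: ||h||^2 = ||E(h|A)||^2 + ||h - E(h|A)||^2, up to rounding.
print(inner(h, h), inner(p, p) + inner(residual, residual))
```

This is the finite-dimensional version of the identity $\int (h - E(h \mid \mathcal{A}_n))\,\overline{E(h \mid \mathcal{A}_n)}\,d\mu = 0$ used in the proof above.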
It might seem unsatisfactory to use $L^2$ arguments in this way to avoid the more complicated theory of the space $L^1$ and its dual $L^\infty$. To give an example of how it is sometimes possible to decompose functions in a way that mimics the orthogonal decomposition available in Hilbert space, we now do the same part of the proof avoiding $L^2$.
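First part of proof of Theorem 5.8, using $L^1$ directly. Let
\[
V_n = \{f \in L^1(X,\mathcal{B},\mu) \mid E(f \mid \mathcal{A}_n) = 0\}
\]
for $n \ge 1$, so $V_1 \subseteq V_2 \subseteq \cdots$ is an increasing sequence of subspaces of $L^1(X)$.
We claim that $V^* = \bigcup_{n \ge 1} V_n$ is $L^1$-dense in
\[
V_\infty = \{f \in L^1(X,\mathcal{B},\mu) \mid E(f \mid \mathcal{A}_\infty) = 0\}.
\]
This claim will be crucial for the proof, since it will allow us to split any function $f$ into two parts for which the result will be easier to prove.
By the Hahn–Banach theorem (Theorem B.1), $V^*$ is dense in $V_\infty$ if any continuous linear functional $\Lambda : L^1(X) \to \mathbb{R}$ with $V^* \subseteq \ker\Lambda$ has $V_\infty \subseteq \ker\Lambda$.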
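Any continuous linear functional on $L^1(X)$ has the form
\[
\Lambda_h(f) = \int_X f h\,d\mu
\]
for some $h \in L^\infty(X)$, and $h$ is uniquely determined by $\Lambda_h$. So suppose that $V_n \subseteq \ker\Lambda_h$ for all $n \ge 1$; it follows that
\[
\int \bigl(f - E(f \mid \mathcal{A}_n)\bigr) h\,d\mu = 0
\]
for all $f \in L^1(X)$ and $n \ge 1$. In particular, we may take $f = h$ (since $L^\infty(X)$ is a subset of $L^1(X)$), so
\[
\int \bigl(h - E(h \mid \mathcal{A}_n)\bigr) h\,d\mu = 0.
\]
On the other hand, by Theorem 5.1(3),
\[
\int E(h \mid \mathcal{A}_n)\,E(h \mid \mathcal{A}_n)\,d\mu = \int E\bigl(E(h \mid \mathcal{A}_n)\,h \mid \mathcal{A}_n\bigr)\,d\mu = \int E(h \mid \mathcal{A}_n)\,h\,d\mu
\]
so
\[
\int \bigl(h - E(h \mid \mathcal{A}_n)\bigr)\,E(h \mid \mathcal{A}_n)\,d\mu = 0.
\]
Now
\[
\bigl(h - E(h \mid \mathcal{A}_n)\bigr) h - \bigl(h - E(h \mid \mathcal{A}_n)\bigr) E(h \mid \mathcal{A}_n) = \bigl(h - E(h \mid \mathcal{A}_n)\bigr)^2
\]
and therefore
\[
\int \bigl(h - E(h \mid \mathcal{A}_n)\bigr)^2\,d\mu = 0.
\]
It follows that $h = E(h \mid \mathcal{A}_n) \in L^\infty(X,\mathcal{A}_n,\mu)$, and so $h \in L^\infty(X,\mathcal{A}_\infty,\mu)$.
Thus $E(f \mid \mathcal{A}_\infty) = 0$ implies that
\[
\int f h\,d\mu = \int E(f h \mid \mathcal{A}_\infty)\,d\mu = \int h\,E(f \mid \mathcal{A}_\infty)\,d\mu = 0,
\]
showing that $\ker\Lambda_h \supseteq V_\infty$ whenever $\ker\Lambda_h \supseteq V^*$ as required.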
Clearly the theorem holds for functions in the space
\[
V = L^1(X,\mathcal{A}_\infty,\mu) + V^*,
\]
which is $L^1$-dense in $L^1(X)$ (to see that this space is dense, write any $f \in L^1$ as
\[
f = E(f \mid \mathcal{A}_\infty) + \bigl(f - E(f \mid \mathcal{A}_\infty)\bigr)
\]
where the second term belongs to $V_\infty$).
The remainder of the proof of Theorem 5.8 of necessity takes place in $L^1(X,\mathcal{B},\mu)$.
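Second part of proof of Theorem 5.8. Given $f \in L^1(X)$ and $\varepsilon > 0$, find $g \in V$ with
\[
\|f-g\|_1 < \varepsilon.
\]
Then
\[
\begin{aligned}
\int \bigl|E(f \mid \mathcal{A}_n) - E(f \mid \mathcal{A}_\infty)\bigr|\,d\mu
&\le \int \bigl|E(f-g \mid \mathcal{A}_n) - E(f-g \mid \mathcal{A}_\infty)\bigr|\,d\mu + \int \bigl|E(g \mid \mathcal{A}_n) - E(g \mid \mathcal{A}_\infty)\bigr|\,d\mu\\
&\le 2\int |f-g|\,d\mu + \int \bigl|E(g \mid \mathcal{A}_n) - E(g \mid \mathcal{A}_\infty)\bigr|\,d\mu,
\end{aligned}
\]
so
\[
\limsup_{n\to\infty} \int \bigl|E(f \mid \mathcal{A}_n) - E(f \mid \mathcal{A}_\infty)\bigr|\,d\mu \le 2\int |f-g|\,d\mu \le 2\varepsilon,
\]
which shows the convergence in $L^1$.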
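To see the almost everywhere convergence, notice that
\[
\begin{aligned}
\mu\bigl(\{x \mid \limsup_{n\to\infty} |E(f \mid \mathcal{A}_n) - E(f \mid \mathcal{A}_\infty)| > \sqrt{\varepsilon}\}\bigr)
&\le \mu\bigl(\{x \mid \limsup_{n\to\infty} |E(f-g \mid \mathcal{A}_n) - E(f-g \mid \mathcal{A}_\infty)| + \limsup_{n\to\infty} |E(g \mid \mathcal{A}_n) - E(g \mid \mathcal{A}_\infty)| > \sqrt{\varepsilon}\}\bigr)\\
&\le \mu\bigl(\{x \mid \sup_{n \ge 1} |E(f-g \mid \mathcal{A}_n) - E(f-g \mid \mathcal{A}_\infty)| > \sqrt{\varepsilon}\}\bigr)\\
&\le \mu\bigl(\{x \mid \sup_{n \ge 1} |E(f-g \mid \mathcal{A}_n)| > \tfrac{1}{2}\sqrt{\varepsilon}\}\bigr) + \mu\bigl(\{x \mid \sup_{n \ge 1} |E(f-g \mid \mathcal{A}_\infty)| > \tfrac{1}{2}\sqrt{\varepsilon}\}\bigr)\\
&\le \frac{2}{\sqrt{\varepsilon}}\|f-g\|_1 + \frac{2}{\sqrt{\varepsilon}}\|f-g\|_1 \le 4\sqrt{\varepsilon},
\end{aligned}
\]
by Doob’s inequality (Lemma 5.6), so
\[
\limsup_{n\to\infty} |E(f \mid \mathcal{A}_n) - E(f \mid \mathcal{A}_\infty)| = 0
\]
almost everywhere.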
Exercises for Sect. 5.2
Exercise 5.2.1. Use the increasing martingale theorem (Theorem 5.5) to prove the following version of the Borel–Cantelli lemma (Theorem A.9). Suppose that $(X,\mathcal{B},\mu)$ is a probability space and $(A_n)_{n \ge 1}$ is a completely independent sequence of measurable sets (that is, for any finite sequence of indices $i_1 < \cdots < i_\ell$ we have $\mu(A_{i_1} \cap \cdots \cap A_{i_\ell}) = \mu(A_{i_1}) \cdots \mu(A_{i_\ell})$). If additionally
\[
\sum_{n=1}^{\infty} \mu(A_n) = \infty,
\]
then almost every $x$ is contained in infinitely many of the sets $A_n$; equivalently
\[
\mu\Bigl(\bigcap_{N=1}^{\infty} \bigcup_{n=N}^{\infty} A_n\Bigr) = 1.
\]
Exercise 5.2.2. Use the martingale theorems to prove the following analog of the Lebesgue density theorem (Theorem A.24). Let $m$ be Lebesgue measure on the cube $C = [0,1]^d$. For $n \ge 1$ define the partition $\xi_n$ of $C$ into boxes