Conditional Measures and Algebras

5.3 Conditional Measures

situations null sets with respect to the original measure become more troublesome, since the conditional measures will often be singular with respect to the original measure $\mu$.

Thus we need to pay more attention to null sets, and in particular we will need to specify in a more concrete way with respect to which measure a given set has measure zero. When we say $N$ is a null set we mean that $\mu(N) = 0$; in contrast, for $x \in X$ we will say $N$ is a $\mu_x^{\mathscr{A}}$-null set if $\mu_x^{\mathscr{A}}(N) = 0$.

Similarly, we will need to make a distinction between the notion of "almost everywhere" (true off a $\mu$-null set) and "$\mu_x^{\mathscr{A}}$-almost everywhere". From now on we will also distinguish more carefully between the space $\mathscr{L}^p(X,\mathscr{B},\mu)$ of genuine functions and the more familiar space $L^p(X,\mathscr{B},\mu)$ of equivalence classes of functions; in particular $\mathscr{L}^\infty(X,\mathscr{B})$ denotes the space of bounded measurable functions and will be written $\mathscr{L}^\infty$ if the underlying measure space is clear.

We next formalize our prevailing assumption about the measure spaces we deal with. A probability space is any triple (X,B, μ) where μ is a measure on the σ-algebra B with μ(X) = 1. It turns out that this deﬁnition is too permissive for some—but by no means all—of the natural developments in ergodic theory.

Example 5.11. Let $X = \{0,1\}^{\mathbb{R}}$, with the product topology and the σ-algebra of Borel sets. The product measure $\mu$ of the $(\tfrac12,\tfrac12)$ measure on each of the sets $\{0,1\}$ makes $X$ into a probability space with the property that there is an uncountable collection $\{A_s\}_{s \in \mathbb{R}}$ of measurable sets with the property that $\mu(A_s) = \tfrac12$ for each $s \in \mathbb{R}$ and the sets are all mutually independent:

$$\mu(A_{s_1} \cap \cdots \cap A_{s_n}) = \frac{1}{2^n}$$

for any $n$ distinct reals $s_1, \dots, s_n$. The next definition gives a collection of probability spaces that precludes the possibility of uncountably many independent sets.

Definition 5.12. Let $X$ be a Borel subset of a compact metric space with the restriction of the Borel σ-algebra $\mathscr{B}$ to $X$. Then the pair $(X,\mathscr{B})$ is a Borel space.

Definition 5.13. Let $X$ be a dense Borel subset of a compact metric space $\overline{X}$, with a probability measure $\mu$ defined on the restriction of the Borel σ-algebra $\mathscr{B}$ to $X$. The resulting probability space $(X,\mathscr{B},\mu)$ is a Borel probability space.

For a compact metric space $X$, the space $\mathscr{M}(X)$ of Borel probability measures on $X$ itself carries the structure of a compact metric space with respect to the weak*-topology. In particular, we can define the Borel σ-algebra $\mathscr{B}_{\mathscr{M}(X)}$ on the space $\mathscr{M}(X)$ in the usual way. If $X$ is a Borel subset of a compact metric space $\overline{X}$, then we define

$$\mathscr{M}(X) = \{\mu \in \mathscr{M}(\overline{X}) \mid \mu(\overline{X} \setminus X) = 0\},$$

and we will see in Lemma 5.23 that $\mathscr{M}(X)$ is a Borel subset of $\mathscr{M}(\overline{X})$.

We are now in a position to state and prove the main result of this chapter.

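A set is called conull if it is the complement of a null set. For σ-algebras $\mathscr{C}, \mathscr{C}'$ the relation

$$\mathscr{C} \underset{\mu}{\subseteq} \mathscr{C}'$$

means that for any $A \in \mathscr{C}$ there is a set $A' \in \mathscr{C}'$ with $\mu(A \triangle A') = 0$. We also define

$$\mathscr{C} \underset{\mu}{=} \mathscr{C}'$$

to mean that $\mathscr{C} \underset{\mu}{\subseteq} \mathscr{C}'$ and $\mathscr{C}' \underset{\mu}{\subseteq} \mathscr{C}$.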

A σ-algebra $\mathscr{A}$ on $X$ is countably-generated if there exists a countable set $\{A_1, A_2, \dots\}$ of subsets of $X$ with the property that $\mathscr{A} = \sigma(\{A_1, A_2, \dots\})$ is the smallest σ-algebra containing the sets $A_1, A_2, \dots$ (that is, the intersection of every σ-algebra containing them).

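Theorem 5.14. Let $(X,\mathscr{B},\mu)$ be a Borel probability space, and $\mathscr{A} \subseteq \mathscr{B}$ a σ-algebra. Then there exists an $\mathscr{A}$-measurable conull set $X' \subseteq X$ and a system $\{\mu_x^{\mathscr{A}} \mid x \in X'\}$ of measures on $X$, referred to as conditional measures, with the following properties.

(1) $\mu_x^{\mathscr{A}}$ is a probability measure on $X$ with

$$E(f \mid \mathscr{A})(x) = \int f(y)\, d\mu_x^{\mathscr{A}}(y) \tag{5.3}$$

almost everywhere for all $f \in \mathscr{L}^1(X,\mathscr{B},\mu)$. In other words, for any function∗ $f \in \mathscr{L}^1(X,\mathscr{B},\mu)$ we have that $\int f(y)\, d\mu_x^{\mathscr{A}}(y)$ exists for all $x$ belonging to a conull set in $\mathscr{A}$, that on this set

$$x \mapsto \int f(y)\, d\mu_x^{\mathscr{A}}(y)$$

depends $\mathscr{A}$-measurably on $x$, and that

$$\int_A \int f(y)\, d\mu_x^{\mathscr{A}}(y)\, d\mu(x) = \int_A f\, d\mu$$

for all $A \in \mathscr{A}$.

(2) If $\mathscr{A}$ is countably-generated, then $\mu_x^{\mathscr{A}}([x]_{\mathscr{A}}) = 1$ for all $x \in X'$, where

$$[x]_{\mathscr{A}} = \bigcap_{x \in A \in \mathscr{A}} A$$

is the atom of $\mathscr{A}$ containing $x$; moreover $\mu_x^{\mathscr{A}} = \mu_y^{\mathscr{A}}$ for $x, y \in X'$ whenever $[x]_{\mathscr{A}} = [y]_{\mathscr{A}}$.

(3) Property (1) uniquely determines $\mu_x^{\mathscr{A}}$ for a.e. $x \in X$. In fact, property (1) for a dense countable set of functions in $C(\overline{X})$ uniquely determines $\mu_x^{\mathscr{A}}$ for a.e. $x \in X$.

(4) If $\widetilde{\mathscr{A}}$ is any σ-algebra with $\mathscr{A} \underset{\mu}{=} \widetilde{\mathscr{A}}$, then $\mu_x^{\mathscr{A}} = \mu_x^{\widetilde{\mathscr{A}}}$ almost everywhere.

∗ Notice that we are forced to work with genuine functions in $\mathscr{L}^1$ in order that the right-hand side of (5.3) is defined. As we said before, $\mu_x^{\mathscr{A}}$ may be singular to $\mu$.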

Remark 5.15. Theorem 5.14 is rather technical but quite powerful, so we assemble here some comments that will be useful both in the proof and in situations where the results are applied.

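(a) For a countably-generated σ-algebra $\mathscr{A} = \sigma(\{A_1, A_2, \dots\})$ the atom in (2) is given by

$$[x]_{\mathscr{A}} = \bigcap_{x \in A_i} A_i \cap \bigcap_{x \notin A_i} X \setminus A_i \tag{5.4}$$

and hence is $\mathscr{A}$-measurable (see Exercise 5.3.1). In fact $[x]_{\mathscr{A}}$ is the smallest element of $\mathscr{A}$ containing $x$.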

(b) If $N \subseteq X$ is a null set for $\mu$, then $\mu_x^{\mathscr{A}}(N) = 0$ almost everywhere. In other words, for a $\mu$-null set $N$, the set $N$ is also a $\mu_x^{\mathscr{A}}$-null set for $\mu$-almost every $x$. This follows from property (1) applied to the function $f = \chi_N$. In many interesting cases, the atoms $[x]_{\mathscr{A}}$ are null sets with respect to $\mu$, and so $\mu_x^{\mathscr{A}}$ is singular to $\mu$.

(d) Notice that the uniqueness in property (3) (and similarly for (4)) may require switching to smaller conull sets. That is, if $\mu_x^{\mathscr{A}}$ for $x \in X' \subseteq X$ and $\widetilde{\mu}_x^{\mathscr{A}}$ for $x \in \widetilde{X}' \subseteq X$ are two systems of measures as in (1), then the claim is that there exists a conull subset $X'' \subseteq X' \cap \widetilde{X}'$ with $\mu_x^{\mathscr{A}} = \widetilde{\mu}_x^{\mathscr{A}}$ for all $x \in X''$.

(e) We only ever talk about atoms for countably-generated σ-algebras. The first reason for this is that for a general σ-algebra the expression defined in Theorem 5.14(2) by an uncountable intersection may not be measurable (let alone $\mathscr{A}$-measurable). Moreover, even in those cases where the expression happens to be $\mathscr{A}$-measurable, the definition cannot be used to prove the stated assertions. We also note that it is not true that any sub-σ-algebra of a countably-generated σ-algebra is countably-generated (but see Lemma 5.17 for a more positive statement). For example, the σ-algebra of null sets in $\mathbb{T}$ with respect to Lebesgue measure is not countably-generated (but there are more interesting examples, see Exercise 6.1.2).
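Formula (5.4) in Remark 5.15(a) can be tried out on a finite set, where every σ-algebra is generated by finitely many sets. The following is a minimal sketch under that assumption; the eight-point space, the two generators and the helper name `atom` are illustrative choices, not part of the text.

```python
def atom(x, generators, space):
    """Atom of x in sigma(generators), computed as in (5.4):
    intersect every generator containing x with the complement
    of every generator that does not contain x."""
    result = set(space)
    for a in generators:
        result &= a if x in a else (space - a)
    return frozenset(result)

# Hypothetical finite example: an 8-point space with two generators.
space = frozenset(range(8))
generators = [frozenset({0, 1, 2, 3}), frozenset({2, 3, 4, 5})]

# The distinct atoms form a partition of the space.
atoms = {atom(x, generators, space) for x in space}
for block in sorted(atoms, key=min):
    print(sorted(block))
```

In this finite setting each atom is itself a member of the generated σ-algebra, which mirrors the measurability statement in (a).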

Example 5.16. Let $X = [0,1]^2$ and $\mathscr{A} = \mathscr{B} \times \{\emptyset, [0,1]\}$ as in Example 5.3. In this case Theorem 5.14 claims that any Borel probability measure $\mu$ on $X$ can be decomposed into "vertical components": the conditional measures $\mu_{(x_1,x_2)}^{\mathscr{A}}$ are defined on the vertical line segments $\{x_1\} \times [0,1]$, and these sets are precisely the atoms of $\mathscr{A}$. Moreover,

$$\mu(B) = \int_X \mu_{(x_1,x_2)}^{\mathscr{A}}(B)\, d\mu(x_1, x_2). \tag{5.5}$$

In this example $\mu_{(x_1,x_2)}^{\mathscr{A}} = \nu_{x_1}$ does not depend on $x_2$, so (5.5) may be written as

$$\mu(B) = \int_{[0,1]} \nu_{x_1}(B)\, d\overline{\mu}(x_1) \tag{5.6}$$

where $\overline{\mu} = \pi_*\mu$ is the measure on $[0,1]$ obtained by the projection

$$\pi \colon [0,1]^2 \longrightarrow [0,1], \qquad (x_1, x_2) \longmapsto x_1.$$

While (5.6) looks simpler than (5.5), in order to arrive at it a quotient space and a quotient measure have to be constructed (see Sect. 5.4). For simplicity we will often work with expressions like (5.5) in the general context.

Once $\mu$ is known explicitly, the measures $\mu_{(x_1,x_2)}^{\mathscr{A}}$ can often be computed. For example, if $\mu$ is defined by

$$\int f\, d\mu = \frac13 \int_0^1 f(s,s)\, ds + \int_0^1 \int_0^{\sqrt{s}} f(s,t)\, dt\, ds,$$

then

$$\mu_{(x_1,x_2)}^{\mathscr{A}} = \frac{1}{\sqrt{x_1} + 1/3}\, \delta_{x_1} \times \left( \tfrac13 \delta_{x_1} + m_{[0,\sqrt{x_1}]} \right).$$

To see that this equation holds, the reader should use Theorem 5.14(3). However, the real force of Theorem 5.14 lies in the fact that it allows an unknown measure to be decomposed into components which are often easier to work with.
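The explicit formula in Example 5.16 can be sanity-checked numerically against the averaging identity in Theorem 5.14(1) on a vertical strip $A = [a,b] \times [0,1]$. The sketch below is only a consistency check by midpoint quadrature, not a proof; the test function `f`, the strip $[0.25, 0.75]$ and all helper names are assumptions made for illustration.

```python
import math

def midpoint(g, a, b, n=100):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (k + 0.5) * h) for k in range(n))

def mu_integral(f):
    """Integral of f against mu: one third of the diagonal integral
    plus the area integral over 0 <= t <= sqrt(s)."""
    diagonal = midpoint(lambda s: f(s, s), 0.0, 1.0) / 3.0
    area = midpoint(
        lambda s: midpoint(lambda t: f(s, t), 0.0, math.sqrt(s)), 0.0, 1.0)
    return diagonal + area

def fibre_integral(f, x1):
    """Integral of f against the claimed conditional measure on {x1} x [0, 1]."""
    weight = math.sqrt(x1) + 1.0 / 3.0
    lebesgue_part = midpoint(lambda t: f(x1, t), 0.0, math.sqrt(x1))
    return (f(x1, x1) / 3.0 + lebesgue_part) / weight

def f(s, t):
    # Arbitrary continuous test function (an assumption for this check).
    return s + t * t

def in_strip(s):
    # Indicator of the A-measurable strip [0.25, 0.75] x [0, 1].
    return 0.25 <= s <= 0.75

# Left side: integrate the fibre averages over the strip; right side: integrate f.
lhs = mu_integral(lambda s, t: fibre_integral(f, s) if in_strip(s) else 0.0)
rhs = mu_integral(lambda s, t: f(s, t) if in_strip(s) else 0.0)
total_mass = mu_integral(lambda s, t: 1.0)
print(lhs, rhs, total_mass)
```

The two sides agree up to rounding because the normalising factor $\sqrt{x_1} + 1/3$ is exactly the density of the projected measure $\overline{\mu}$ at $x_1$; the total mass is only approximately 1 because of the quadrature error near $s = 0$.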

Proof of Theorem 5.14. By assumption, $X$ is contained in a compact metric space $\overline{X}$, which is automatically separable. We note that the statement of the theorem for the ambient compact metric space $\overline{X}$ implies the theorem for $X$ by Remark 5.15(b). Hence we may assume that $X = \overline{X}$ is itself a compact metric space.

Suppose first that $\{\rho_x\}$ and $\{\nu_x\}$ are families of measures defined for almost every $x$ that both satisfy (5.3) for a countable dense subset $\{f_n\}_{n \in \mathbb{N}}$ in $C(X)$. Then for each $n \ge 1$ and almost every $x$,

$$\int f_n\, d\rho_x = E(f_n \mid \mathscr{A})(x) = \int f_n\, d\nu_x. \tag{5.7}$$

So there is a common null set $N$ with the property that (5.7) holds for all $n \ge 1$ and $x \notin N$. By uniform approximation and the dominated convergence theorem (Theorem A.18), this easily extends to show that

$$\int f\, d\rho_x = \int f\, d\nu_x$$

for all $f \in C(X)$ and $x \notin N$. Hence $\rho_x = \nu_x$ for $x \notin N$, which shows that the conditional measures, if they exist, must be unique as claimed in (3).

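Now let $\mathscr{A} \underset{\mu}{=} \widetilde{\mathscr{A}}$ and write $\widehat{\mathscr{A}}$ for the smallest σ-algebra containing both $\mathscr{A}$ and $\widetilde{\mathscr{A}}$. Then for any $f \in C(X)$, $g = E(f \mid \mathscr{A})$ (or $E(f \mid \widetilde{\mathscr{A}})$) satisfies the characterizing properties of $E(f \mid \widehat{\mathscr{A}})$, so they are equal almost everywhere. Noting this for a countable dense subset of $C(X)$ shows (as in the proof of uniqueness) that $\mu_x^{\mathscr{A}} = \mu_x^{\widetilde{\mathscr{A}}}$ almost everywhere, showing (4).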

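Turning to existence, let

$$\mathscr{F} = \{f_0 \equiv 1, f_1, f_2, \dots\} \subseteq C(X)$$

be a vector space over $\mathbb{Q}$ that is dense∗ in $C(X)$. For every $i \ge 1$, choose an $\mathscr{A}$-measurable function† $g_i \in \mathscr{L}^1_\mu$ with $g_i$ representing $E(f_i \mid \mathscr{A})$. Define $g_0$ to be the constant function 1. Then

• $g_i(x) \ge 0$ almost everywhere if $f_i \ge 0$;

• $|g_i(x)| \le \|f_i\|_\infty$ almost everywhere;

• if $f_i = \alpha f_j + \beta f_k$ with $\alpha, \beta \in \mathbb{Q}$, then $g_i(x) = \alpha g_j(x) + \beta g_k(x)$ for almost all $x$.

Let $N \in \mathscr{A}$ be the union of all the null sets on the complement of which the properties above hold; since this is a countable union, $N$ is a null set.

For $x \notin N$, define $\Lambda_x(f_i)$ to be $g_i(x)$. Then by the properties above $\Lambda_x$ is a $\mathbb{Q}$-linear map from $\mathscr{F}$ to $\mathbb{R}$ with $\|\Lambda_x\| \le 1$. It follows that $\Lambda_x$ extends uniquely to a continuous positive linear functional

$$\Lambda_x \colon C(X) \to \mathbb{R}.$$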

By the Riesz representation theorem, there is a measure $\mu_x^{\mathscr{A}}$ on $X$ characterized by the property that

$$\Lambda_x(f) = \int f\, d\mu_x^{\mathscr{A}}$$

for all $f \in C(X)$; moreover $\Lambda_x(1) = 1$, so $\mu_x^{\mathscr{A}}$ is a probability measure.

∗ Since $X$ is separable we may find a set $\{h_0 \equiv 1, h_1, h_2, \dots\}$ that is dense in $C(X)$. The vector space over $\mathbb{Q}$ spanned by this set is dense and countable, and may be written in the form $\{f_0 \equiv 1, f_1, f_2, \dots\}$.

† Notice that this is a genuine function rather than an equivalence class of functions, so there is a choice involved despite Theorem 5.1(1).

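By our choice of the set $\mathscr{F}$, for any $f \in C(X)$ there is a sequence $(f_{n_i})$ with $f_{n_i} \to f$ uniformly. We have already established that

$$x \mapsto \int f_{n_i}\, d\mu_x^{\mathscr{A}}$$

is $\mathscr{A}$-measurable (by Theorem 5.14(1)), and that

$$\int_A \int f_{n_i}\, d\mu_x^{\mathscr{A}}\, d\mu(x) = \int_A f_{n_i}\, d\mu$$

for all $A \in \mathscr{A}$. So, by the dominated convergence theorem (Theorem A.18),

$$\int f_{n_i}\, d\mu_x^{\mathscr{A}} \longrightarrow \int f\, d\mu_x^{\mathscr{A}} \tag{5.8}$$

is $\mathscr{A}$-measurable as a function of $x$, and

$$\int_A \int f\, d\mu_x^{\mathscr{A}}\, d\mu(x) = \int_A f\, d\mu \tag{5.9}$$

for all $A \in \mathscr{A}$. For any open set $O$ there is a sequence $(f_{n_i})$ with $f_{n_i} \nearrow \chi_O$, so by the monotone convergence theorem (5.8) and (5.9) hold for $\chi_O$. Thus we have (5.8) and (5.9) for the indicator function of any closed $A \subseteq X$, by taking complements. Similarly, these equations extend to any $G_\delta$-set $G$ and any $F_\sigma$-set $F$. Define

$$\mathscr{M} = \{B \in \mathscr{B} \mid f = \chi_B \text{ satisfies (5.8) and (5.9)}\}.$$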

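By the monotone convergence theorem (Theorem A.16), if $B_1, B_2, \dots \in \mathscr{M}$ with

$$B_1 \subseteq B_2 \subseteq \cdots,$$

then $\bigcup_{n \ge 1} B_n \in \mathscr{M}$, and if $C_1, C_2, \dots \in \mathscr{M}$ with $C_1 \supseteq C_2 \supseteq \cdots$, then $\bigcap_{n \ge 1} C_n \in \mathscr{M}$. Thus $\mathscr{M}$ is a monotone class (see Definition A.3 and Theorem A.4). Define

$$\mathscr{R} = \Bigl\{ \bigsqcup_{i=1}^{n} O_i \cap A_i \Bigm| O_i \subseteq X \text{ is open and } A_i \subseteq X \text{ is closed} \Bigr\}$$

for $n \in \mathbb{N}$. We claim that $\mathscr{R}$ is an algebra (that is, $\mathscr{R}$ is closed under complements, finite intersections and finite unions). To see this, notice that the σ-algebra $\mathscr{C}$ generated by finitely many open and closed sets has the property that every element of $\mathscr{C}$ is a disjoint union of atoms of the partition generated by the same open and closed sets, all of which are precisely of the form $O \cap A$.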

Since any set $O \cap A$ is a $G_\delta$-set and (5.8) and (5.9) are linear conditions, it follows that (5.8) and (5.9) also hold for functions of the form

$$\chi_R = \sum_{i=1}^{n} \chi_{O_i \cap A_i}$$

for all

$$R = \bigsqcup_{i=1}^{n} O_i \cap A_i \in \mathscr{R}.$$

By the monotone class theorem (Theorem A.4), $\mathscr{B} = \sigma(\mathscr{R}) \subseteq \mathscr{M}$. In other words, for any Borel measurable set $B \in \mathscr{B}$, the characteristic function $\chi_B$ satisfies (5.8) and (5.9). By considering simple functions and applying the monotone convergence theorem, it follows that (5.8) and (5.9) also hold for any $\mathscr{B}$-measurable function $f \ge 0$.

Finally, given any $\mathscr{B}$-measurable integrable function $f$, we may write

$$f = f^+ - f^-$$

with $f^+, f^-$ non-negative, measurable, and integrable functions. Then, by (5.9),

$$\int f^+\, d\mu_x^{\mathscr{A}},\ \int f^-\, d\mu_x^{\mathscr{A}} < \infty$$

almost everywhere. In particular, $f$ is $\mu_x^{\mathscr{A}}$-integrable for almost every $x$, and where it is $\mu_x^{\mathscr{A}}$-integrable, $\int f\, d\mu_x^{\mathscr{A}}$ is an $\mathscr{A}$-measurable function of $x$. Finally, (5.9) holds, proving (1).

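Suppose now that $\mathscr{A} = \sigma(\{A_1, A_2, \dots\})$ is countably-generated. Then

$$E(\chi_{A_i} \mid \mathscr{A})(x) = \chi_{A_i}(x) = \mu_x^{\mathscr{A}}(A_i)$$

almost everywhere, for any $i \ge 1$. Collecting all the null sets arising into a single null set $N$ gives

$$\mu_x^{\mathscr{A}}(A_i) = \begin{cases} 1 & \text{if } x \in A_i \setminus N; \\ 0 & \text{if } x \in X \setminus (A_i \cup N). \end{cases}$$

Since $\mu_x^{\mathscr{A}}$ is a measure, it follows by (5.4) that

$$\mu_x^{\mathscr{A}}([x]_{\mathscr{A}}) = 1$$

if $x \notin N$. Writing $X'$ for $X \setminus N$, recall that the map

$$X' \ni x \longmapsto \int f\, d\mu_x^{\mathscr{A}}$$

is $\mathscr{A}$-measurable for any $f \in C(X)$. Thus

$$\int f\, d\mu_x^{\mathscr{A}} = \int f\, d\mu_y^{\mathscr{A}}$$

if $x, y \in X'$ and $[x]_{\mathscr{A}} = [y]_{\mathscr{A}}$, so that $[x]_{\mathscr{A}} = [y]_{\mathscr{A}}$ implies that $\mu_x^{\mathscr{A}} = \mu_y^{\mathscr{A}}$.

One of the many desirable properties of Borel probability spaces is that there is a constraint on the complexity of their sub-σ-algebras.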

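Lemma 5.17. If $(X,\mathscr{B},\mu)$ is a Borel probability space and $\mathscr{A} \subseteq \mathscr{B}$ is a σ-algebra, then there is a countably-generated σ-algebra $\widetilde{\mathscr{A}}$ with $\mathscr{A} \underset{\mu}{=} \widetilde{\mathscr{A}}$.

Proof. Recall that $C(X)$ is separable for any compact metric space $X$ (see Lemma B.8). Since $C(\overline{X})$ is mapped continuously to a dense subspace of $L^1(X,\mathscr{B},\mu)$, the same holds for $L^1(X,\mathscr{B},\mu)$. Since subsets of a separable metric space are separable, it follows that the space

$$\{\chi_A \mid A \in \mathscr{A}\} \subseteq L^1(X,\mathscr{A},\mu) \subseteq L^1(X,\mathscr{B},\mu)$$

is separable. Thus there is a set $\{A_1, A_2, \dots\} \subseteq \mathscr{A}$ such that for any $\varepsilon > 0$ and $A \in \mathscr{A}$ there is some $n$ with

$$\mu(A \triangle A_n) = \|\chi_A - \chi_{A_n}\|_1 < \varepsilon.$$

Let $\widetilde{\mathscr{A}} = \sigma(\{A_1, A_2, \dots\})$, so that $\widetilde{\mathscr{A}} \subseteq \mathscr{A}$ and $\{\chi_A \mid A \in \widetilde{\mathscr{A}}\}$ is dense in

$$\{\chi_A \mid A \in \mathscr{A}\}$$

with respect to the $L^1_\mu$ norm. Given $A \in \mathscr{A}$, we can find a sequence $(n_k)$ for which

$$\|\chi_A - \chi_{A_{n_k}}\|_1 < \frac{1}{k}$$

for $k \ge 1$. Then the sequence $(\chi_{A_{n_k}})$ is Cauchy in $L^1(X,\widetilde{\mathscr{A}},\mu) \subseteq L^1(X,\mathscr{A},\mu)$, so has a limit $f \in L^1(X,\widetilde{\mathscr{A}},\mu)$. We must have $f = \chi_A$ almost everywhere since the limit is unique, so there is some $\widetilde{A} \in \widetilde{\mathscr{A}}$ with $\mu(A \triangle \widetilde{A}) = 0$. It follows that $\mathscr{A} \underset{\mu}{=} \widetilde{\mathscr{A}}$ as required.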

In the remainder of this section we give extensions and reformulations of Theorem 5.14.

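Lemma 5.18. Let $(X,\mathscr{B},\mu)$ be a Borel probability space and let $\mathscr{A} \subseteq \mathscr{B}$ be a countably-generated σ-algebra. If $f \in \mathscr{L}^\infty(X,\mathscr{B})$ is constant on atoms of $\mathscr{A}$, then $f|_{X'}$ is $\mathscr{A}$-measurable, where $X'$ is as in Theorem 5.14.

A set $B$ (or a function $f$) is $\mathscr{A}$-measurable modulo $\mu$ if, after removing a $\mu$-null set, $B$ (or $f$) becomes $\mathscr{A}$-measurable. Thus the conclusion of Lemma 5.18 is that $f$ is $\mathscr{A}$-measurable modulo $\mu$.

Proof of Lemma 5.18. By Theorem 5.14(2), on $X'$ we have

$$\int f\, d\mu_x^{\mathscr{A}} = f(x)$$

since $\mu_x^{\mathscr{A}}([x]_{\mathscr{A}}) = 1$ and, by assumption, $f$ is constant (and equal to $f(x)$) on the set $[x]_{\mathscr{A}}$. By Theorem 5.14(1) we know that $f|_{X'}$ is $\mathscr{A}$-measurable.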

In Theorem 5.14 the conditional measure was characterized in terms of the conditional expectation. The following proposition gives a more geometrical characterization.

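Proposition 5.19. Let $(X,\mathscr{B},\mu)$ be a Borel probability space and let $\mathscr{A}$ be a countably-generated sub-σ-algebra of $\mathscr{B}$. Suppose that there is a set $X' \in \mathscr{B}$ with $\mu(X') = 1$, and a collection $\{\nu_x \mid x \in X'\}$ of probability measures with the property that

• $x \mapsto \nu_x$ is measurable, that is, for any $f \in \mathscr{L}^\infty$ we have that $x \mapsto \int f\, d\nu_x$ is measurable,

• $\nu_x = \nu_y$ for $[x]_{\mathscr{A}} = [y]_{\mathscr{A}}$ and $x, y \in X'$,

• $\nu_x([x]_{\mathscr{A}}) = 1$, and

• $\mu = \int \nu_x\, d\mu(x)$ in the sense that $\int f\, d\mu = \int \int f\, d\nu_x\, d\mu(x)$ for all $f \in \mathscr{L}^\infty$.

Then $\nu_x = \mu_x^{\mathscr{A}}$ for a.e. $x$. The same is true if the properties hold for a dense countable set of functions in $C(\overline{X})$.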

Proof. First notice that we may assume that $\mu_x^{\mathscr{A}}$ and $\nu_x$ are defined on a common conull set $X'$. Moreover, we may replace $X$ by $X'$ and simultaneously replace $\mathscr{A}$ by $\mathscr{A}|_{X'} = \{A \cap X' \mid A \in \mathscr{A}\}$. After this replacement, Lemma 5.18 says that any function $f$ which is constant on $\mathscr{A}$-atoms is $\mathscr{A}$-measurable. In order to apply Theorem 5.14(3) we need to check that

$$\int f\, d\nu_x = E(f \mid \mathscr{A})(x) \tag{5.10}$$

almost everywhere, for all $f$ in a countable dense subset of $C(\overline{X})$.

That $x \mapsto \int f\, d\nu_x$ is measurable is the first assumption on the family of measures in the proposition. Together with Lemma 5.18, the second property shows that $x \mapsto \int f\, d\nu_x$ is actually $\mathscr{A}$-measurable. This is the first requirement in the direction of showing that (5.10) holds.

To show (5.10), we also need to calculate $\int_A \int f\, d\nu_x\, d\mu(x)$ for any $A \in \mathscr{A}$, as in Theorem 5.1(1). We know that $\chi_A$ is constant $\nu_x$-almost everywhere for any $A \in \mathscr{A}$, by the third property. In fact $\chi_A$ equals 1 $\nu_x$-almost everywhere if $x \in A$ and equals 0 otherwise. Therefore, by the fourth property applied to the function $\chi_A f$, we get

$$\begin{aligned}
\int_A \int f(z)\, d\nu_x(z)\, d\mu(x) &= \int \chi_A(x) \int_{[x]_{\mathscr{A}}} f(z)\, d\nu_x(z)\, d\mu(x) \\
&= \int \int \chi_A(z) f(z)\, d\nu_x(z)\, d\mu(x) \\
&= \int \chi_A(z) f(z)\, d\mu(z) = \int_A f\, d\mu
\end{aligned}$$

as required. By Theorem 5.14(3) it follows that $\nu_x = \mu_x^{\mathscr{A}}$ almost everywhere.

It remains to prove the last claim of the proposition. So suppose we only assume the first and fourth properties for all functions in a dense countable subset of $C(\overline{X})$. Using dominated convergence, monotone convergence, and the monotone class theorem (Theorems A.18, A.16 and A.4) just as in the proof of Theorem 5.14, we can extend the first and fourth properties in turn to all $f \in C(\overline{X})$, all $f = \chi_B$ for $B$ any open set, any closed set, any $G_\delta$, any $F_\sigma$, any Borel set, and finally to any $f \in \mathscr{L}^\infty(X)$. This implies the last claim.

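Proposition 5.20. Let $(X,\mathscr{B},\mu)$ be a Borel probability space, and let

$$\mathscr{A} \subseteq \mathscr{A}' \subseteq \mathscr{B}$$

be countably-generated sub-σ-algebras. Then $[z]_{\mathscr{A}'} \subseteq [z]_{\mathscr{A}}$ for $z \in X$, and for almost every $z \in X$ the conditional measures for the measure $\mu_z^{\mathscr{A}}$ with respect to $\mathscr{A}'$ are given for $\mu_z^{\mathscr{A}}$-almost every $x \in [z]_{\mathscr{A}}$ by

$$(\mu_z^{\mathscr{A}})_x^{\mathscr{A}'} = \mu_x^{\mathscr{A}'}.$$

The proof of this result will reveal that it is a reformulation of Theorem 5.1(4).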

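Proof of Proposition 5.20. We will show that the map $x \mapsto \mu_x^{\mathscr{A}'}$ satisfies all the assumptions in Proposition 5.19 with respect to the measure $\mu_z^{\mathscr{A}}$ for almost every $z \in X$. Let $\mu_z^{\mathscr{A}}$ be defined on $X'_{\mathscr{A}} \in \mathscr{A}$ and let $\mu_x^{\mathscr{A}'}$ be defined on $X'_{\mathscr{A}'} \in \mathscr{A}'$ with all the properties in Theorem 5.14. By Remark 5.15(b), we have $\mu_z^{\mathscr{A}}(X'_{\mathscr{A}'}) = 1$ for $\mu$-almost every $z$. Now fix some $z \in X'_{\mathscr{A}}$ with $\mu_z^{\mathscr{A}}(X'_{\mathscr{A}'}) = 1$. For $x, y \in X'_{\mathscr{A}'}$ we know that $\mu_x^{\mathscr{A}'} = \mu_y^{\mathscr{A}'}$ if $[x]_{\mathscr{A}'} = [y]_{\mathscr{A}'}$ and that $\mu_x^{\mathscr{A}'}([x]_{\mathscr{A}'}) = 1$ by Theorem 5.14(2). Also, if $f \in \mathscr{L}^\infty$, we know that $\int f\, d\mu_x^{\mathscr{A}'}$ is measurable by Theorem 5.14(1). Thus we have shown the first three assumptions of Proposition 5.19 on the complement of a single $\mu_z^{\mathscr{A}}$-null set.

It remains to check that

$$\mu_z^{\mathscr{A}} = \int \mu_x^{\mathscr{A}'}\, d\mu_z^{\mathscr{A}}(x) \tag{5.11}$$

for almost every $z$. Let $f \in C(\overline{X})$. By Theorem 5.14(1), for almost every $z$,

$$\int \int f\, d\mu_x^{\mathscr{A}'}\, d\mu_z^{\mathscr{A}}(x) = \int E(f \mid \mathscr{A}')(x)\, d\mu_z^{\mathscr{A}}(x) = E\bigl(E(f \mid \mathscr{A}') \bigm| \mathscr{A}\bigr)(z),$$

which by Theorem 5.1(1) is equal to

$$E(f \mid \mathscr{A})(z) = \int f\, d\mu_z^{\mathscr{A}}$$

for almost every $z$. Using a dense subset $\{f_1, f_2, \dots\} \subseteq C(\overline{X})$, and collecting the countably many null sets arising in these two statements for each $n$ into a single null set, we obtain equality in (5.11) on a conull set $Z$. In other words, we have checked all the requirements of Proposition 5.19 for the family of measures $\nu_x = \mu_x^{\mathscr{A}'}$ (and therefore they are equal almost everywhere to $(\mu_z^{\mathscr{A}})_x^{\mathscr{A}'}$) and for the measure $\mu_z^{\mathscr{A}}$ for $z \in Z \cap X'_{\mathscr{A}}$ with $\mu_z^{\mathscr{A}}(X'_{\mathscr{A}'}) = 1$.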
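On a finite probability space, where σ-algebras correspond to partitions and every point has positive mass, the identity in Proposition 5.20 can be checked exactly. The following is a minimal sketch under those assumptions; the weights, the two nested partitions and the helper names are hypothetical choices made for illustration.

```python
from fractions import Fraction

def atom_of(x, partition):
    """Block of the partition containing x (the atom of x)."""
    return next(block for block in partition if x in block)

def conditional(measure, partition, x):
    """Conditional measure at x: restrict to the atom of x and normalise."""
    block = atom_of(x, partition)
    mass = sum(measure[y] for y in block)
    return {y: measure[y] / mass for y in block}

# Hypothetical measure on {0, ..., 7} with weights (x + 1) / 36.
mu = {x: Fraction(x + 1, 36) for x in range(8)}

# The coarse partition plays the role of A, the fine partition that of A'.
coarse = [frozenset({0, 1, 2, 3}), frozenset({4, 5, 6, 7})]
fine = [frozenset({0, 1}), frozenset({2, 3}), frozenset({4, 5}), frozenset({6, 7})]

# Condition on A first, then condition the result on A', and compare with
# conditioning mu on A' directly.
tower_holds = all(
    conditional(conditional(mu, coarse, z), fine, x) == conditional(mu, fine, x)
    for z in mu
    for x in atom_of(z, coarse)
)
print(tower_holds)
```

Exact rational arithmetic is used so that the comparison is an equality of measures rather than an approximate one.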

Theorem 5.14(3) and the more geometric discussion above highlight the significance of the countably-generated hypothesis on the σ-algebra $\mathscr{A}$, for in that case the conditional measures $\mu_x^{\mathscr{A}}$ can be related to the atoms $[x]_{\mathscr{A}}$. In a Borel probability space it is safe to assume that σ-algebras are countably-generated by Lemma 5.17.

By combining the increasing and decreasing martingale theorems (Theorems 5.5 and 5.8) with the characterizing properties of the conditional measures we get the following corollary (see Exercise 5.3.5).

Corollary 5.21. If $\mathscr{A}_n \nearrow \mathscr{A}$ or $\mathscr{A}_n \searrow \mathscr{A}$ then $\mu_x^{\mathscr{A}_n} \longrightarrow \mu_x^{\mathscr{A}}$ in the weak*-topology for $\mu$-almost every $x$.

This gives an alternative construction of $\mu_x^{\mathscr{A}}$ for a countably-generated σ-algebra. More concretely, if $\mathscr{A} = \sigma(\{A_1, A_2, \dots\})$ and $\mathscr{A}_n = \sigma(\{A_1, \dots, A_n\})$ is the finite σ-algebra generated by the first $n$ generators of $\mathscr{A}$, then $\mu_x^{\mathscr{A}_n}$ is readily defined, and $\mu_x^{\mathscr{A}_n} \to \mu_x^{\mathscr{A}}$.
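For a finite σ-algebra the conditional measure at $x$ is the normalised restriction of $\mu$ to the atom of $x$, provided that atom has positive measure. The sketch below illustrates the refinement $\mathscr{A}_n = \sigma(\{A_1, \dots, A_n\})$ on a finite toy space; the space, the weights, the generators (the binary digits of the points) and the function names are all assumptions made for illustration.

```python
from fractions import Fraction

# Hypothetical finite model: points 0..7 with weights (x + 1) / 36.
points = range(8)
mu = {x: Fraction(x + 1, 36) for x in points}

# Generator i is the set of points whose i-th binary digit equals 1.
generators = [frozenset(x for x in points if (x >> i) & 1) for i in range(3)]

def atom(x, n):
    """Atom of x in the sigma-algebra generated by the first n generators."""
    return frozenset(y for y in points
                     if all((y in a) == (x in a) for a in generators[:n]))

def conditional_measure(x, n):
    """Normalised restriction of mu to the atom of x (all atoms have positive mass)."""
    block = atom(x, n)
    mass = sum(mu[y] for y in block)
    return {y: mu[y] / mass for y in block}

def integrate(f, measure):
    return sum(f(y) * weight for y, weight in measure.items())

def f(y):
    # Arbitrary test function.
    return Fraction(y * y)

# Integrals of f against the conditional measures at x = 5 as n grows.
values = [integrate(f, conditional_measure(5, n)) for n in range(4)]
print(values)

# Averaging identity of Theorem 5.14(1) on every atom of every A_n.
averaging_ok = all(
    sum(mu[x] * integrate(f, conditional_measure(x, n)) for x in atom(z, n))
    == sum(mu[x] * f(x) for x in atom(z, n))
    for n in range(4) for z in points
)
print(averaging_ok)
```

Here three generators already separate all points, so the last conditional measure is the point mass at $x$; in the setting of Corollary 5.21 the limit measure $\mu_x^{\mathscr{A}}$ is in general only supported on the atom $[x]_{\mathscr{A}}$.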

Exercises for Sect. 5.3

Exercise 5.3.1. Prove the equality claimed in (5.4).

Exercise 5.3.2.⁽⁵⁹⁾ Let $T$ be an aperiodic measure-preserving transformation on a Borel probability space $(X,\mathscr{B},\mu)$ (see Exercise 2.9.2 for the definition of aperiodic). Prove that for any $k \ge 1$ there is a set $A \in \mathscr{B}$ with $\mu(A) > 0$ and $\mu(T^{-k}(A) \cap A) = 0$.
