A variation on the main results - Graduate Texts in Mathematics 261

4.9 Theorem. Suppose that (Xn) converges to X in probability. Then the following are equivalent for it:

a) It converges to X in L¹.

b) It is uniformly integrable.

c) It is a sequence in L¹, and X ∈ L¹, and E|X_n| → E|X|.

Proof. We have (a) ⇐⇒ (b) by Theorem 4.6, and (a) ⇒ (c) since | E|X_n| − E|X| | ≤ E|X_n − X|. To complete the proof we show that (c) ⇒ (b). Assume (c), and note that the convergence of (X_n) to X in probability implies the convergence |X_n| → |X| in probability. So, there is no loss of generality in assuming further that the X_n and X are all positive.

Let 0 < a < b < ∞. Define f : R+ → [0, a] by letting f(x) be x on [0, a], letting f decrease continuously from a at a to 0 at b, and letting f remain at 0 over (b, ∞). This f is bounded and continuous. Thus, by Proposition 3.5, f◦X_n → f◦X in probability, and applying the implication (b) ⇒ (c) here to the sequence (f◦X_n), which is bounded and hence uniformly integrable, we see that E f◦X_n → E f◦X, and therefore, E (X_n − f◦X_n) → E (X − f◦X) since E X_n → E X by assumption.

Recall that i_ε is the indicator of (ε, ∞), note that x i_b(x) ≤ x − f(x) and y − f(y) ≤ y i_a(y), replace x with X_n and y with X, and take expectations to conclude that

lim sup E X_n i_b◦X_n ≤ E X i_a◦X.
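The two pointwise bounds used above can be checked numerically. The following is a minimal sketch, assuming that the continuous decrease of f between a and b is linear (the text only requires continuity) and using the hypothetical helper names f, ind, and check_bounds.

```python
def f(x, a, b):
    """Truncation: x on [0, a], linear from a (at a) down to 0 (at b), 0 beyond b.

    Linearity on [a, b] is an assumption made for this sketch only.
    """
    if x <= a:
        return x
    if x < b:
        return a * (b - x) / (b - a)
    return 0.0


def ind(eps, x):
    """i_eps(x): the indicator of the interval (eps, infinity)."""
    return 1.0 if x > eps else 0.0


def check_bounds(a, b, xs):
    """Check x*i_b(x) <= x - f(x) <= x*i_a(x) at every point of xs."""
    return all(x * ind(b, x) <= x - f(x, a, b) <= x * ind(a, x) for x in xs)


grid = [k / 100 for k in range(0, 1001)]  # sample points in [0, 10]
ok = check_bounds(2.0, 3.0, grid)
```

A check on a grid is of course no proof; it only illustrates why x − f(x) vanishes on [0, a] and equals x beyond b.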

Fix ε > 0. By the integrability of X, the right side goes to 0 as a → ∞; choose a so that the right side is less than ε/2. Then, the definition of limit superior shows that there is m such that

sup_{n>m} E X_n i_b◦X_n ≤ ε    4.10

for all b ≥ a + 1. Since X₁, . . . , X_m are in L¹ by assumption, by choosing a still larger if necessary, we ensure that 4.10 holds with the supremum taken over all n. Thus (X_n) is uniformly integrable.
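To see why uniform integrability cannot be dropped, consider an example that is not taken from the text: let U be uniform on (0, 1) and put X_n = n on the event {U < 1/n} and X_n = 0 elsewhere. Then X_n → 0 in probability, yet E X_n = 1 for every n. The sketch below computes the relevant quantities exactly; the function names are hypothetical.

```python
from fractions import Fraction


def prob_exceeds(n, eps):
    """P(X_n > eps), where X_n equals n with probability 1/n and 0 otherwise."""
    return Fraction(1, n) if n > eps else Fraction(0)


def mean(n):
    """E X_n = n * (1/n)."""
    return n * Fraction(1, n)


def tail_mean(n, b):
    """E X_n i_b(X_n): the part of the mean carried by the event {X_n > b}."""
    return mean(n) if n > b else Fraction(0)


# The tail means do not become uniformly small, however large b is taken.
worst_tail = max(tail_mean(n, 100) for n in range(1, 1001))
```

Here P(X_n > ε) = 1/n → 0, while the supremum over n of the tail means equals 1 for every b; so condition (b) fails, and so do (a) and (c), since E|X_n| = 1 does not converge to E|X| = 0.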

Exercises and complements

4.11 Convergence of expectations. Let X₁, X₂, . . . , X be in L¹. Show that X_n → X in L¹ if and only if E X_n 1_H → E X 1_H uniformly in H in ℋ, that is, if and only if

lim_n sup_{H∈ℋ} |E X_n 1_H − E X 1_H| = 0.

4.12 Continuation. Show that if X_n → X in L¹, and V_n → V in L¹, and (V_n) is a bounded sequence, then E X_n V_n → E XV.

4.13 Convergence in L^p, p ∈ [1, ∞). Show that the following are equivalent for every sequence (X_n):

a) The sequence converges in L^p.

b) The sequence is Cauchy in L^p, that is, E |X_m − X_n|^p → 0 as m, n → ∞.

c) The sequence converges in probability and (|X_n|^p) is uniformly integrable.

Hint: Follow the proof of the basic theorem and use the generalization |x + y|^p ≤ 2^(p−1) (|x|^p + |y|^p) of the triangle inequality.
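The inequality in the hint can be spot-checked numerically. The sketch below is only an illustration on a finite grid, with hypothetical function names and a small tolerance for floating-point rounding (equality holds when x = y).

```python
def lhs(x, y, p):
    """|x + y|^p."""
    return abs(x + y) ** p


def rhs(x, y, p):
    """2^(p-1) * (|x|^p + |y|^p)."""
    return 2 ** (p - 1) * (abs(x) ** p + abs(y) ** p)


def holds_on_grid(p, points, tol=1e-12):
    """Check lhs <= rhs (up to rounding) for all pairs of grid points."""
    return all(
        lhs(x, y, p) <= rhs(x, y, p) * (1 + tol) + tol
        for x in points
        for y in points
    )


points = [k / 4 for k in range(-20, 21)]  # grid on [-5, 5]
results = {p: holds_on_grid(p, points) for p in (1, 1.5, 2, 3, 7.25)}
```

The restriction p ≥ 1 matters: for p = 1/2 the pair x = 1, y = 0 already violates the bound, and the same check reports a failure.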

4.14 Weak convergence in L¹. A sequence is uniformly integrable if and only if each of its subsequences has a further subsequence that converges weakly in L¹. This is a deep result.

5 Weak Convergence

This section is about the convergence of sequences of probability measures on a given topological space. We limit ourselves to the space R and make a few remarks for the case of general spaces.

Let (Ω, ℋ, P) be a probability space. Let μ₁, μ₂, . . . , μ be probability measures on R. Let X₁, X₂, . . . , X be R-valued random variables whose respective distributions are μ₁, μ₂, . . . , μ. Finally, let C_b = C_b(R, R), the collection of all bounded continuous functions from R into R.

5.1 Definition. The sequence (μ_n) is said to converge weakly to μ if lim μ_n f = μf for every f in C_b. The sequence (X_n) is said to converge in distribution to X if (μ_n) converges to μ weakly, that is, if

lim E f◦X_n = E f◦X for every f in C_b.

5.2 Remarks. a) Convergence in probability (or in L¹, or almost surely) implies convergence in distribution. To see this, let X_n → X in probability, and let f ∈ C_b. Then, by Theorem 3.3, every subsequence of ℕ has a further subsequence N such that X_n → X along N almost surely, f◦X_n → f◦X along N almost surely by the continuity of f, and E f◦X_n → E f◦X along N by the bounded convergence theorem. By the selection principle 1.6, then, E f◦X_n → E f◦X.

b) There is a partial converse: Suppose that (X_n) converges to X in distribution and X = x₀ for some fixed point x₀. Then, in particular, for f defined by letting f(x) = |x − x₀| ∧ 1,

E |X_n − X| ∧ 1 = E f◦X_n → E f◦X = f(x₀) = 0.

Thus, X_n → X = x₀ in probability by Proposition 3.8.

c) In general, convergence in distribution implies no other kind. For example, if the X_n are independent and have the same distribution as X, then μ₁ = μ₂ = · · · = μ and (X_n) converges in distribution to X. But it does not converge in probability except in the trivial case where X₁ = X₂ = · · · = X = x₀ almost surely for some fixed point x₀.

d) As the preceding remark illustrates, convergence of (X_n) in distribution has little to do with the convergence of (X_n) as a sequence of functions. Convergence in distribution is merely a convenient turn of phrase for the weak convergence of the corresponding probability measures.

5.3 Examples. a) Convergence to Lebesgue measure. Let μ_n be the probability measure that puts mass 1/n at each of the points 1/n, 2/n, . . . , n/n. Then, for f in C_b,

μ_n f = Σ_{k=1}^{n} (1/n) f(k/n) → ∫₀¹ du f(u) = λf,

where λ denotes the Lebesgue measure on [0, 1]. Thus, (μ_n) converges weakly to λ.
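The convergence in Example a) is that of Riemann sums, and it is easy to observe numerically. The sketch below takes f = cos as one arbitrary choice of a bounded continuous function; the name mu_n is a hypothetical helper for the integral of f against μ_n.

```python
import math


def mu_n(f, n):
    """Integral of f against mu_n: the average of f over the points k/n, k = 1..n."""
    return sum(f(k / n) for k in range(1, n + 1)) / n


f = math.cos              # bounded and continuous; chosen only for illustration
exact = math.sin(1.0)     # integral of cos over [0, 1]

# Absolute error of mu_n f against the Lebesgue integral for growing n.
errors = [abs(mu_n(f, n) - exact) for n in (10, 100, 1000)]
```

For this smooth f the error shrinks roughly in proportion to 1/n; for a general f in C_b, only convergence is guaranteed, with no rate.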

b) Quantile functions. Let q : (0, 1) → R be the quantile function corresponding to μ and define q_n similarly for μ_n; see Exercise II.1.18 et seq. Then, μ = λ◦q⁻¹ and μ_n = λ◦q_n⁻¹, where λ is the Lebesgue measure on (0, 1). Suppose that q_n(u) → q(u) for λ-almost every u in (0, 1). For f in C_b, then, f◦q_n(u) → f◦q(u) for λ-almost every u in (0, 1) and therefore

μ_n f = λ(f◦q_n) → λ(f◦q) = μf

by the bounded convergence theorem. Hence, if (q_n) converges to q almost everywhere on (0, 1), then (μ_n) converges to μ weakly. We shall see later in Proposition 5.7 that the converse holds as well.

Characterization theorem

The following basic theorem characterizes weak convergence of (μ_n) to μ in terms of the convergence of numbers μ_n(A). Here, ∂A denotes the boundary of A, Ā its closure, Å its interior (so that Ā = A ∪ ∂A, Å = A \ ∂A, Ā \ Å = ∂A). We also write d(x, y) for the usual distance, |x − y|, between the points of R.

5.4 Theorem. The following are equivalent:

a) (μ_n) converges weakly to μ.

b) lim sup μ_n(A) ≤ μ(A) for every closed set A.

c) lim inf μ_n(A) ≥ μ(A) for every open set A.

d) lim μ_n(A) = μ(A) for every Borel set A with μ(∂A) = 0.

Proof. We shall show that (a) ⇒ (b) ⇐⇒ (c) ⇒ (d) ⇒ (a).

i) Assume (a). Let A be closed. Let d(x, A) = inf{d(x, y) : y ∈ A} and let A_ε = {x : d(x, A) < ε}. Since A is closed, A_ε shrinks to A and, hence, μ(A_ε) ↘ μ(A) as ε ↘ 0. Thus, to show (b), it is sufficient to show that

lim sup μ_n(A) ≤ μ(A_ε)    5.5

for every ε > 0. To this end, fix ε > 0 and define f(x) = (1 − d(x, A)/ε) ∨ 0. Then, f is continuous and bounded, and 1_A ≤ f, and f ≤ 1_{A_ε}. Hence, μ_n(A) ≤ μ_n f, μf ≤ μ(A_ε), and μ_n f → μf, which show that 5.5 holds.
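The function f of step (i) interpolates between the indicators of A and A_ε. A minimal sketch follows, assuming for concreteness that the closed set A is the interval [0, 1] (the helper names and the choice ε = 0.25 are hypothetical); it checks 1_A ≤ f ≤ 1_{A_ε} on a grid.

```python
def dist_to_interval(x, lo, hi):
    """d(x, A) for the closed interval A = [lo, hi]."""
    if x < lo:
        return lo - x
    if x > hi:
        return x - hi
    return 0.0


def f(x, eps, lo=0.0, hi=1.0):
    """f(x) = (1 - d(x, A)/eps) v 0: equal to 1 on A and to 0 outside A_eps."""
    return max(1.0 - dist_to_interval(x, lo, hi) / eps, 0.0)


def ind_A(x, lo=0.0, hi=1.0):
    """Indicator of A = [lo, hi]."""
    return 1.0 if lo <= x <= hi else 0.0


def ind_A_eps(x, eps, lo=0.0, hi=1.0):
    """Indicator of A_eps = {x : d(x, A) < eps}."""
    return 1.0 if dist_to_interval(x, lo, hi) < eps else 0.0


eps = 0.25
grid = [k / 100 for k in range(-200, 301)]  # sample points in [-2, 3]
sandwiched = all(ind_A(x) <= f(x, eps) <= ind_A_eps(x, eps) for x in grid)
```

The function is 1 on A, falls linearly in the distance to A, and reaches 0 exactly where d(x, A) = ε, which is why it is continuous and sits between the two indicators.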

ii) We have (b) ⇐⇒ (c), because the complements of open sets are closed and vice versa, and lim inf (1 − r_n) = 1 − lim sup r_n for every sequence (r_n) in [0, 1].

iii) Suppose that (c) and therefore (b) hold. Let A be a Borel set. Since Ā ⊃ A ⊃ Å, using (b) and (c), we obtain

μ(Ā) ≥ lim sup μ_n(Ā) ≥ lim sup μ_n(A) ≥ lim inf μ_n(A) ≥ lim inf μ_n(Å) ≥ μ(Å).

If μ(∂A) = 0, then μ(Ā) = μ(Å), and all the inequalities here become equalities and show that lim μ_n(A) = μ(A). So, (d) holds.

iv) Suppose that (d) holds. Let f ∈ C_b. Choose a and b in R such that a < f < b. Fix ε > 0 arbitrary. Considering the probability measure μ◦f⁻¹ on (a, b), pick a = a₀ < a₁ < · · · < a_k = b such that a_i − a_{i−1} ≤ ε for all i and no a_i is an atom for μ◦f⁻¹ (this is possible since a probability measure has at most countably many atoms). Let A_i = f⁻¹(a_{i−1}, a_i], define

g = Σ_{i=1}^{k} a_{i−1} 1_{A_i},    h = Σ_{i=1}^{k} a_i 1_{A_i},

and observe that

f − ε ≤ g ≤ f ≤ h ≤ f + ε.    5.6
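The sandwich 5.6 depends only on the spacing of the levels a_i, not on the measure. The sketch below illustrates it for f = sin with a = −2 and b = 2, both arbitrary choices; it uses equally spaced levels and ignores the requirement that no a_i be an atom, which matters only for the boundary argument that follows. The helper names are hypothetical.

```python
import math
from bisect import bisect_left


def make_levels(a, b, eps):
    """Equally spaced levels a = a_0 < ... < a_k = b with spacing below eps."""
    k = math.ceil((b - a) / eps) + 1
    return [a + i * (b - a) / k for i in range(k + 1)]


def g_and_h(value, levels):
    """For a_{i-1} < value <= a_i, return (a_{i-1}, a_i): the values of g and h."""
    i = bisect_left(levels, value)  # smallest i with levels[i] >= value
    return levels[i - 1], levels[i]


f = math.sin                 # bounded continuous, with -2 < f < 2
a, b, eps = -2.0, 2.0, 0.1
levels = make_levels(a, b, eps)

xs = [k / 50 for k in range(-500, 501)]  # sample points in [-10, 10]
sandwich_ok = all(
    f(x) - eps <= g_and_h(f(x), levels)[0] <= f(x) <= g_and_h(f(x), levels)[1] <= f(x) + eps
    for x in xs
)
```

At each point x, g and h are the levels just below and just above f(x), so they differ from f(x) by at most one spacing.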

If x ∈ ∂A_i then f(x) is either a_{i−1} or a_i, neither of which is an atom for μ◦f⁻¹. Thus, μ(∂A_i) = 0 and it follows from assuming (d) that μ_n(A_i) → μ(A_i) as n → ∞ for i = 1, . . . , k. Thus, μ_n g → μg and μ_n h → μh, and 5.6 yields

μf − ε ≤ μg = lim μ_n g ≤ lim inf μ_n f

≤ lim sup μ_n f ≤ lim μ_n h = μh ≤ μf + ε.

In other words, limit inferior and limit superior of the sequence (μ_n f) are sandwiched between the numbers μf − ε and μf + ε for arbitrary ε > 0. So, μ_n f → μf as needed to show that (a) holds.
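The inequalities in (b) and (c) can be strict, and the condition μ(∂A) = 0 in (d) cannot be dropped. A standard illustration, not taken from the text, is μ_n = δ_{1/n} and μ = δ₀, for which μ_n f = f(1/n) → f(0) = μf for every f in C_b. The sketch below represents sets by their indicator functions; all names are hypothetical.

```python
from fractions import Fraction


def dirac(point):
    """The unit mass at `point`, as a map from indicator functions to masses."""
    return lambda indicator: 1 if indicator(point) else 0


mu = dirac(Fraction(0))


def mu_n(n):
    """The unit mass at 1/n."""
    return dirac(Fraction(1, n))


def closed_set(x):
    """Indicator of the closed set {0}."""
    return x == 0


def open_set(x):
    """Indicator of the open set (0, 2)."""
    return 0 < x < 2


def continuity_set(x):
    """Indicator of (-1, 1/2], whose boundary {-1, 1/2} has mu-measure 0."""
    return -1 < x <= Fraction(1, 2)


closed_masses = [mu_n(n)(closed_set) for n in range(1, 200)]          # all 0
open_masses = [mu_n(n)(open_set) for n in range(1, 200)]              # all 1
continuity_masses = [mu_n(n)(continuity_set) for n in range(2, 200)]  # all 1
```

For the closed set {0}, lim sup μ_n(A) = 0 < 1 = μ(A); for the open set (0, 2), lim inf μ_n(A) = 1 > 0 = μ(A). In both cases the boundary contains the point 0, which carries all the mass of μ. For (−1, 1/2], the boundary is μ-null and μ_n(A) → μ(A) = 1, as (d) asserts.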
