Lecture-26: Martingale Concentration Inequalities

(1)

Lecture-26: Martingale Concentration Inequalities

1 Introduction

Consider a probability space(Ω,F,P)and a discrete filtrationF•= (Fn⊆F:n∈N). LetX:Ω→R^N be discrete random process and stopping timeτ:Ω→N, both adapted to the filtrationF•.

Lemma 1.1. If X is a submartingale andτis a bounded stopping time such that P{τ⩽n}=₁_then EX1⩽EXτ⩽EXn.

Proof. Sinceτis bounded, it follows from the optional stopping theorem thatE[X_τ]⩾E[X₁]. Further, we observe that{τ=k} ∈FkandXis a submartingale, and therefore

E[Xn1{τ=k}|_F_k] =₁_{τ=k}_E[Xn|_F_k]⩾1{τ=k}X_k=X_τ1{τ=k}.

It follows thatE[Xn1{τ=k}]⩾E[Xτ1{τ=k}]. In addition, ∑ⁿk=11{τ=k}=1 almost surely, and hence we observe that

EXτ=E[Xτ

∑

n k=1

1{τ=k}]⩽

∑

k=1ⁿ

E[Xn1{τ=k}] =EXn.

Theorem 1.2 (Kolmogorov’s inequality for submartingales). For a non-negative submartingale X and a>0,

P

maxi∈[n]X_i>a

⩽ ^E[Xn] a .

Proof. We define a random timeτa≜inf{i∈_N:X_i>a}and stopping timeτ≜τa∧n. It follows that,

maxi∈[n]Xi>a

=∪_i∈[n]{Xi>a}={Xτ>a}. Using this fact and Markov inequality, we getPn

max_i∈[n]X_i>ao

=P{X_τ>a}⩽^E^[X_a^τ^]. Sinceτ⩽nis a bounded stopping time, result follows from the Lemma 1.1.

Corollary 1.3. For a martingale X and positive constant a,

P

maxi∈[n]|X_i|>a

⩽^E|Xn|

a , P

maxi∈[n]|X_i|>a

⩽^EX

2n

a² .

Proof. The proof the above statements follow from and Kolmogorov’s inequality for submartingales, and by considering the convex functionsf(x) =|x|and f(x) =x².

Theorem 1.4 (Strong Law of Large Numbers). Let S:Ω→R^N be a random walk withi.i.d.step size X having finite meanµ. If the moment generating function M(t) =_E[e^tXⁿ]for random variable Xn exists for all t∈_R₊, then

P

n∈limN

Sn

n =µ

=1.

Proof. For a givenϵ>0, we define g:R+→_R₊ for allt∈_R₊ asg(t)≜ ^e^t_M(t)⁽^µ⁺^ϵ⁾. Then, it is clear that g(₀) =_{1 and}

g^′(0) = ^M(0)(µ+ϵ)−M^′(0)

M²(0) =ϵ>0.

1

(2)

Hence, there exists a valuet0>0 such thatg(t0)>1. We now show that^S_nⁿ can be as large asµ+ϵonly finitely often. To this end, note that

Sn

n ⩾^µ+ϵ

⊆

e^t⁰^Sⁿ

M(t0)ⁿ ⩾g(t0)ⁿ

(1) However,Yn ≜ _M^e^tn⁰(t^Sn₀) =_∏ⁿ_i=1_M(t^e^t^0Xi

0) is a product of independent non negative random variables with unit mean, and hence is a non-negative martingale with sup_nEYn =1. By martingale convergence theorem, the limit limn∈NYnexists and is finite.

Sinceg(t₀)>1, it follows from (1) that P

Sn

n ⩾µ+ϵfor an infinite number of n

=0.

Similarly, defining the functionf(t)≜^e^t_M(t)⁽^µ⁻^ϵ⁾ and noting that since f(0) =1 and f^′(0) =−ϵ, there exists a valuet0<0 such that f(t0)>1, we can prove in the same manner that

P Sn

n ⩽^µ−ϵfor an infinite number of n

=0.

Hence, result follows from combining both these results, and taking limit of arbitraryϵdecreasing to zero.

Definition 1.5. A discrete random process X:Ω→_R^N with distribution function Fn≜F_X_n for each n∈N, is said to beuniformly integrableif for everyϵ>0, there is ay_ϵsuch that for eachn∈_N

E[|Xn|1_{|X_n|>y_ϵ}] = Z

|x|>y_ϵ|x|dFn(x)<ϵ.

Lemma 1.6. If X :Ω→R^N is uniformly integrable then there exists finite M such thatE|Xn|<M for all n∈N.

Proof. Lety₁be as in the definition of uniform integrability. Then E|Xn|=

Z

|x|⩽y1

|x|dFn(x) + Z

|x|>y₁|x|dFn(x)⩽y1+1.

1.1 Generalized Azuma Inequality

Lemma 1.7. For a zero mean random variable X with support[−α,β]and any convex function f Ef(X)⩽ ^β

α+βf(−α) + ^α α+βf(β).

Proof. From convexity of f, any point(X,Y)on the line joining points(−α,f(−α)and(β,f(β))is Y= f(−α) + (X+α)^f(β)−f(−α)

β+α ⩾ f(X). Result follows from taking expectations on both sides.

Lemma 1.8. Forθ∈[0, 1]andθ¯≜1−θ, we haveθe^θx^¯ +θe^¯ ⁻^θx⩽e^x²^/8.

Proof. Definingα≜2θ−1,β≜ ^x₂, and f(α,β)≜coshβ+αsinhβ−e^αβ+β²^/2, we can write θe^θx^¯ +θe^¯ ^−θx−e^x²^/8= (1+α)

2 e^(1−α)β−(1−α)

2 e^−(1+α)β−e^β²^/2=e^−αβf(α,β).

Therefore, we need to show that f(α,β)⩽0 for allα∈[−1, 1] and β∈R. This inequality is true for

|α|=1 and sufficiently largeβ. Therefore, it suffices to show this forβ<Mfor someM. We take the partial derivative of f(α,β)with respect to variablesα,βand equate it to zero to get the stationary point,

sinhβ+αcoshβ= (α+β)e^αβ+β²^/2, sinhβ=βe^αβ+β²^/2. Ifβ̸=0, then the stationary point satisfies 1+αcothβ=1+^α

β, with the only solution beingβ=tanhβ.

By Taylor series expansion, it can be seen that there is no other solution to this equation other than β=0. Sincef(α, 0) =0, the lemma holds true.

2

(3)

Proposition 1.9. Let X be a zero-mean martingale with respect to filtrationF•, such that−α⩽Xn−Xn−1⩽β for each n∈N. Then, for any positive values a and b

P{Xn⩾a+bn for some n}⩽exp

− ^8ab (α+β)²

. (2)

Proof. LetX0=0 andc>0, then we define a random sequenceW:Ω→R^Nadapted to filtrationF•, such that

Wn≜e^c(Xⁿ^−a−bn)=Wn−1e^−cbe^c(Xⁿ^−Xⁿ⁻¹⁾, n∈Z+.

We will show thatWis a supermartingale with respect to the filtrationF•. It is easy to see thatσ(Wn)∈ Fnfor eachn∈N. We can also see thatE|Wn|<_∞for alln. Further, we observe

E[Wn|Fn−1] =Wn−1e^−cbE[e^c(Xⁿ^−Xⁿ⁻¹⁾|Fn−1].

Applying Lemma??to the convex function f(x) =e^cx, replacing expectation with conditional expectation, the fact thatE[Xn−Xn−1|Fn−1] =0, and settingθ=_(α+β)^α ∈[0, 1], we obtain that

E[e^c(Xⁿ^−Xⁿ⁻¹⁾|Fn−1]⩽^βe

−cα+αe^cβ

α+β =θe^¯ ^{−c(α+β)θ}+θe^c(α+β)^θ^¯⩽e^c²^(α+β)²^/8.

The second inequality follows from Lemma??withx=c(α+β)andθ=_(α+β)^α ∈[0, 1]. Fixing the value c= ^8b

(α+β)², we obtain

E[Wn|_F_n−1]⩽Wn−1e^−cb+^c

2(α+β)²

8 =Wn−1.

Thus,Wis a supermartingale. For a fixed positive integerk, define the bounded stopping timeτby τ≜inf{n∈_N:Xn⩾a+bn} ∧k.

Now, using Markov inequality and optional stopping theorem, we get

P{X_τ⩾a+bτ}=P{W_τ⩾1}⩽E[W_τ]⩽E[W0] =e^−ca=e⁻

8ab (α+β)2.

The above inequality is equivalent toP{Xn⩾a+_bn_{for some}_n⩽k}⩽e^{−8ab/(α+β)}². Since, the choice ofkwas arbitrary, the result follow from lettingk→∞.

Theorem 1.10 (Generalized Azuma inequality). Let X be a zero-mean martingale, such that−α⩽Xn− Xn−1⩽βfor all n∈N. Then, for any positive constant c and integer m

P{Xn⩾nc for some n⩾m}⩽e⁻

2mc2

(α+β)², P{Xn⩽−nc for some n⩾m}⩽e⁻

2mc2 (α+β)².

Proof. Observe that if there is an nsuch thatn⩾mandXn⩾ncthen for thatn, we haveXn ⩾nc⩾

mc

2 +^nc₂. Using this fact and previous proposition fora=^mc₂ andb=₂^c, we get P{Xn⩾ncfor somen⩾m}⩽Pn

Xn⩾^mc 2 + ^c

2nfor someno

⩽e⁻

8mc 2 c

2 (α+β)². This proves first inequality, and second inequality follows by considering the martingale−X.

3