getdoc8ebe. 164KB Jun 04 2011 12:04:40 AM

(1)

in PROBABILITY

MAXIMAL ARITHMETIC PROGRESSIONS IN

RAN-DOM SUBSETS

ITAI BENJAMINI

The Weizmann Institute of Science, POB 26, Rehovot 76100, ISRAEL email: [email protected]

ARIEL YADIN

The Weizmann Institute of Science, POB 26, Rehovot 76100, ISRAEL email: [email protected]

OFER ZEITOUNI1

Department of Mathematics, University of Minnesota, MN 55455, USA email: [email protected]

Submitted September 18, 2007, accepted in final form September 26, 2007 AMS 2000 Subject classiﬁcation: 60C05

Keywords: arithmetic progression; random subset; Chen-Stein method; dependency graph; extreme type limit distribution

Abstract

Let U(N) _{denote the maximal length of arithmetic progressions in a random uniform subset} of {0,1}N. By an application of the Chen-Stein method, we show that U(N)₋_{2 log}_N/_{log 2} converges in law to an extreme type (asymmetric) distribution. The same result holds for the maximal length W(N) _{of arithmetic progressions (mod}_N_{). When considered in the} natural way on a common probability space, we observe that U(N)_/_log_N _{converges almost} surely to 2/log 2, while W(N)_/_log_N _{does not converge almost surely (and in particular,} lim supW(N)_/_log_N _≥₃_/_{log 2).}

1 Introduction and Statement of Results

In this note we study the length of maximal arithmetic progressions in a random uniform subset of{0,1}N_{. That is, let}_ξ

1, ξ2, . . . , ξN be a random word in{0,1}N, chosen uniformly. Consider

the (random) set ΞN of elements i such that ξi = 1. Let U(N) denote the maximal length

arithmetic progression in ΞN, and letW(N) denote the maximal length aperiodic arithmetic

progression (modN) in ΞN. A consequence of our main result (Theorem 1) is that the

expectation of both U(N) _and _W(N) _{is roughly 2 log}_{N /}_{log 2, twice the expectation of the} longest run in ΞN, see [3],[4]. We also show that the limit law of the centered version of both

W(N) _and_U(N)_{is of the same extreme type as that of the longest run in Ξ}

N.

We observe two interesting phenomena:

1PARTIALLY SUPPORTED BY NSF GRANT DMS-050377

(2)

• Theorem 1 states that the tails of the distribution ofW(N)_{behave diﬀerently for positive} and negative deviations from the mean. In particular, the probability thatW(N)_deviates by x from its mean, behaves roughly like 1−exp¡

−2−(x+2)¢

for positive x, and like exp¡

−2−(x+2)¢

for negative x. Thus, on the positive side of the mean the tail decays exponentially, and on the negative side of the mean the tail decays doubly-exponential. • One may construct the sets ΞN on the same probability space by considering an inﬁnite

sequence of i.i.d., Bernoulli random variables {ξi}∞i=1. Proposition 2 states that with such a construction, the sequence W(N)_/_log_N _{converges in} _probability _{to the constant} 2/log 2, but a.s. convergence does not hold. This contrasts with the behavior ofU(N), where a.s. convergence ofU(N)_/_log_N _{to 2}_/_{log 2 holds. The seemingly small change of} taking arithmetic progressions that “wrap around” the torus, changes the behavior of the lim sup of the sequence.

The notoriously hard extremal problem, showing that a set of integers of upper positive density contains unbounded arithmetic progressions, and its ﬁnite quantitative versions, is a well studied topic reviewed recently in [7].

1.1 The Model

Letξ1, . . . , ξN, . . . ,be i.i.d. Bernoulli random variables of meanE[ξi] = 1₂. For a non-negative

integer N ands, p∈ {1,2, . . . , N}deﬁne

Ws,p=Ws,p(N)

def = max

(

1≤k≤N ¯

¯ξs= 0 , k

Y

i=1

ξs+ip (modN)= 1

)

.

That is, we consider all arithmetic progressions (mod N) in{1,2, . . . , N} starting ats, with diﬀerencep, and check for the longest one of the form 0,1,1, . . .(the role of the 0 is to avoid considering periodic progressions). Ws,p is the length of such progression. We set

W(N) def= max

s,p Ws,p,

which is the size of the maximal arithmetic progression (mod N) in{1,2, . . . , N}of the form 0,1,1, . . ..

Similarly, deﬁne

Us,p=Us,p(N)

def = max

(

1≤k≤ ⌊N−s

p ⌋

¯

¯ξs= 0, k

Y

i=1

ξs+ip= 1

)

,

and

U =U(N) def= max

s,p≤NUs,p,

that is we only consider s, p, ksuch that©

s+ip¯¯i= 0,1, . . . , k ª

⊆ {1, . . . , N}.

1.2 Results

(3)

Theorem 1. Let{xN} be a sequence such thatClogN+xN ∈Zfor allN, andinfNxN ≥b,

for someb∈R_{. Then, with} _λ₍_x_{) = 2}−(x+2)_{, we have}

lim

N→∞exp (λ(xN))

Ph_W(N)_≤_C_log_N₊_x_Ni_{= 1}_. ₍₁₎

Similarly, let{yN} be a sequence such that ClogN −log(2ClogN) +yN ∈Zfor all N, and

infNyN ≥b, for someb∈R. Then,

lim

N→∞exp (λ(yN))

Ph_U(N)_≤_C_log_N₋_log(2_C_log_N_{) +}_y_Ni_{= 1}_. ₍₂₎

In particular, bothW(N)_/_log_N _and_U(N)_/_log_N _{converge in probability to}_C_.

The dichotomy in the sequential behavior of W(N) _and _U(N) _{is captured in the following} proposition.

Proposition 2. U(N)_/C_log_N _{converges a.s. to}_{1, while}

lim sup

N→∞

W(N)

ClogN ≥

3 2.

In particular,W(N)_/C_log_N _{does not}_{converge a.s. to}_1.

The structure of the note is as follows. In the next section, we introduce dependency graphs and the Arratia-Goldstein-Gordon version of the Chen-Stein method, and perform preliminary computations. After these preliminary computations are in place, the short Section 3 is devoted to the proof of Theorem 1. Section 4 is devoted to the proof of Proposition 2.

2 Preliminaries and auxilliary computations

We introduce the notion of dependency graphs, and the method of Chen and Stein to prove Poisson convergence, that will play an important role in our proof.

2.1 Dependency Graphs

LetX1, X2, . . . , XN be N random variables. Let Gbe a graph with vertices 1,2, . . . , N. We

use the notationi∼jto denote two vertices connected by an edge. AsXi is not independent

of itself, we deﬁne i∼ifor alli (this can be thought of as requiringGto have a self loop at each vertex). Gis called adependency graph of{Xi}Ni=1 if for any vertexi,

Xi is independent of the set {Xj : j6∼i}. (3)

The notion of dependency graphs has been introduced in connection with the Lov´asc Local Lemma, see [1], Chapter 5. Some other results concerning dependency graphs are [5], [6]. We emphasize that there can be many dependency graphs associated to a collection of random variables{Xi}Ni=1.

We deﬁne two quantities associated with a dependency graphGof{Xi}Ni=1.

B1=B1(G) =

N

X

i=1

X

j:Xj∼Xi

(4)

B2=B2(G) =

The following is a simpliﬁed version of Theorem 1 in [2], which in turn is an eﬀective way to apply the Chen-Stein method:

Theorem 3(Arratia, Goldstein, Gordon). Let{Xi}Ni=1beN Bernoulli random variables with

pi=E[Xi]>0. Set

Theorem 3 is useful in proving convergence of sums of “almost” independent variables to the Poisson distribution.

i=0 be the arithmetic progression corresponding to Is,p.

LetGbe the graph with vertex set{(s, p)}N_s,p₌₁, and edges deﬁned by the relations

The following combinatorial proposition proves to be useful. Proposition 4. For alls, p the following holds:

(5)

Proof. Fix 1≤s, p≤N.

Let k ≥ 2. Assume that A(s, p)∩A(t, q) = {x1< x2<· · ·< xk}. Let L = lcm(p, q)

def = min©

L¯

¯∃ a, b∈Z : L=ap=bq ª

.

Claim. {xi}ki=1 is an arithmetic progression withxi+1−xi =L.

Proof. Assume thatL=ap=bq, fora6=b≥1. Fix 1≤i≤k−1. Sincexi+1 > xi are both

inA(s, p)∩A(t, q), we get that xi+1−xi=a′p=b′q for nonnegative integersa′6=b′ ≥1. So

xi+1−xi ≥L.

Considerxi+L. Sincexi∈A(s, p)∩A(t, q), andL=ap=bq, and sincexi< xi+L≤xi+1, it

follows thatxi+L∈A(s, p)∩A(t, q). Soxi+L≥xi+1, concluding the proof of the claim. ⊓⊔

Leta6=b≥1 be such thatL=ap=bq= lcm(p, q). We have the following constraints:

s+ (k−1)ap≤x1+ (k−1)L≤s+M p and t+ (k−1)bq≤x1+ (k−1)L≤t+M q.

Thus, 2≤max{a, b} ≤ M

k−1, or: k≤

M

2 + 1. So fork > M₂ + 1 we get thatDs,p(k) = 0.

Consider k≤ M

2 + 1. Since there are at most

M

k−1 choices fora and forb, and since a choice ofa, bdeterminesq, we have at most M2

(k−1)2 choices forq.

Remark. This can be improved to 2

k+1·

M2

(k−1)2, with a slightly more careful analysis. We

will not need this improvement.

Sincet=x1−iq=s+jp−iqfor some 0≤i, j≤M, there are at most (M + 1)2 choices for

t, once we have ﬁxedq.

Thus, altogether, for 2≤k≤ M₂ + 1,

Ds,p(k)≤

(M + 1)2_M2

(k−1)2 ≤(M + 1) 2_M2_.

If|A(s, p)∩A(t, q)|= 1 then there are at mostN choices forqand (M + 1)2 _{choices for}_t_{, so}

Ds,p(1)≤(M + 1)2N. ⊓⊔

RecallGdeﬁned above, a dependency graph of{Is,p(x)}N_s,p₌₁. Set

B1=B1(x, G) =

X

s,p

X

t,q It,q∼Is,p

E_[_I_s,p_]E_[_I_t,q_]_,

as in (4). Also, set

B2=B2(x, G) =

X

s,p

X

(s,p)6=(t,q)

It,q∼Is,p

E_[_I_s,p_I_t,q_]_,

as in (5).

Proposition 5. For anyδ >0, sup

x∈(−∞,εlogN)

(6)

Proof. We have thatE_[_I_t,q_]_≤₂−(ClogN+x+1)_{, for all}_{t, q}_.

3 Arithmetic Progressions: Proof of Theorem 1

Since the proofs are very similar, we only consider the slightly harderW(N)_{. We write}_W _for

W(N)_{whenever no confusion can occur.} We begin with the following lemma:

Lemma 6. The sequenceW(N)_/C_log_N _{converges to}₁ _{in probability; i.e. for any} _{δ >}_0,

Further, the convergence is almost sure on the subsequence Nk = 2k. Finally, the statements

hold with U(N) _replacing_W(N)_.

Proof of Lemma 6. Again, we consider onlyW(N)_{. Fix}_{ε >}_{0. Note that}

(7)

Thus,

Further, from the same estimates one has that with Yk =W(2

k₎

One then deduces from the Borel-Cantelli lemma the claimed almost sure convergence. ⊓⊔ Proof of Theorem 1. As in the proof of Lemma 6, for x∈ R_{, let}_Z₍_x_{) be a Poisson random}

We also have the equality

(8)

4 Convergence in Probability vs. a.s. Convergence

We begin with the following easy consequence of Lemma 6 applied toU(N)_. Proposition 7. U(N)_/C_log_N _{converges a.s. to}_1.

Proof. The main observation is thatU(N)_{is a monotone increasing sequence. That is, a.s. for} allN,U(N)_≤_U(N+1)_{. Thus, a.s. for all}_N_{, setting}_k₌_⌊log

Proof of Proposition 2. In view of Proposition 7, it remains only to consider the statement concerning W(N)_{. Toward this end, ﬁx 0} _{< β <}_{1. Let}_M Lemma 8. The following holds:

X

Thus, the proof of the lemma is based on controlling the cardinality of the collection of triples (s′_{, p}′_{, N}′₎ _{∈ L}

n, whose associated arithmetic progression intersects in a prescribed number

of points the arithmetic progression associated with a given triple (s, p, N)∈ Ln. We divide

(9)

Similarly, deﬁneS(N, N′_{, p}_{) to be the set of all such triples (}_{s, s}′_{, p}′_{) such that}

We have the following estimates.

Proposition 9. For large enoughn, the following holds: for allN, N′_∈_[_n,₂_n_]_and_p_∈_[1_{, N}_],

Assuming Propositions 9, 10, 11, we have

X

which completes the proof of the lemma. ⊓⊔

Returning to the proof of Proposition 2, let

Λ(n) = X (s,p,N)∈Ln

I(s, p, N)

and note that for all large enoughn,

(10)

Thus,

P_[Λ(_n_{) = 0]}_≤Var (Λ(n))

(E_Λ(_n₎₎2 =O

¡

nβ−1log7(n)¢

. (16)

Since

P

·

∃N ∈[n,2n] : W(N)> 2 +β

log 2 logN

¸

≥P_[Λ(_n₎_>_0]_,

it follows from (16) that

∞

X

k=0

P

·

max

N∈[2k_,₂k+1_] W(N) logN ≤

2 +β

log 2

¸

<∞.

By the Borel-Cantelli lemma, we get that a.s.

lim sup

N→∞

W(N)

logN ≥lim supk→∞ max

N∈[2k_,₂k+1_] W(N) logN >

2 +β

log 2.

Sinceβ ∈(0,1) is arbitrary, this completes the proof of Proposition 2. ⊓⊔

Proof of Proposition 9. If (s, s′_{, p}′₎_{∈ T}₍_{N, N}′_{, p}_{) then (10) implies that there exist}_i_∈_[1_{, M}

N],

j ∈[1, MN′],ki∈[0, MN] andkj′ ∈[0, MN′] such that

s+ip−kiN =s′+jp′−k′jN′ (17)

There are at most (2n)2_{choices for}_s′_and_p′_{. There exists some universal constant}_K_{such that} there are at mostKlog(n) choices for each ofi, j, ki, kj′. Choosings′, p′, i, j, ki, kj′ determiness.

Thus, we have shown that|T(N, N′_{, p}_{)| ≤}₄_Kn2_log4₍_n₎_≤_n2_log5₍_n_{) for large enough}_n_. _⊓_⊔

Proof of Proposition 10. If (s, s′, p′) ∈ S(N, N′, p) then (11) implies that there exist i, r ∈ [1, MN] and j, ℓ∈[1, MN′] such that (i, j)6= (r, ℓ) and

s+ip (mod N) =s′₊_jp′ _(mod _N′₎ ₍₁₈₎

and s+rp (modN) =s′₊_ℓp′ _(mod_N′₎_.

Note that for anyi∈[1, MN] there existski∈[0, MN] such thats+ip (mod N) =s+ip−kiN.

Similarly, for any j ∈ [1, MN′] there exists k_j′ ∈ [0, MN′] such that s′ +jp′ (modN′) =

s′ ₊_jp′₋_k′

jN′. Plugging this into (18), and subtracting equations, we get that there exist

i, r∈[1, MN],j, ℓ∈[1, MN′], ki, kr∈[0, MN] andk′j, kℓ′ ∈[0, MN′] such that

(r−i)p+ (ki−kr)N= (ℓ−j)p′+ (k′j−kℓ′)N′. (19)

There exists some universal constantK such that there are at mostKlog(n) choices for each of i, r, j, ℓ, ki, kr, kj′, k′ℓ, and 2n choices for s. After choosingi, r, j, ℓ, ki, kr, k′j, k′ℓ, s, (18) and

(19) determine s′ _and_p′_{. Thus, we have shown that for large enough}_n_,

|S(N, N′, p)| ≤2n(Klog(n))8≤nlog9(n).

(11)

s+ip (modN)¯

¯i∈[1, MN]

ª

, and let (s′_{, p}′_{, N}′₎_{∈ U}₍_{s, p, N}_). Fori= 1,6,11, . . ., let

Zi=

©

s′+ (i+r)p′ (modN′)¯

¯r= 0,1, . . . ,4 ª

.

This is a partition of the arithmetic progression into packets of ﬁve elements. We then have, by the deﬁnition ofU(s, p, N),

2MN′

5 ≤

X

i

|Zi| ≤

MN′

5 maxi |Zi|.

So there exists some setZisuch that|Zi| ≥2. This implies that there existx < y∈A∩[1, N′],

i∈[1, MN′], andr∈[1,4] such that

s+ip′ (mod N′) =x , (20)

s+ (i+r)p′ (mod N′) =y.

Subtracting equations, and using the fact thatrp′_{< N}′_{, we get that}

rp′=y−x. (21)

Moreover, (12) also implies that there must exist an integerj(perhaps negative) with1 5MN′ ≤ |j| ≤MN′, andz∈A∩[1, N′], such that

s+ (i+j)p′ (mod N′) =z. (22)

For large enoughn, we have that |j| ≥ 7. Since 7p′ _{> N}′_{, we get by subtracting (20) from} (22),

jp′+kN′=z−x, (23)

for somek6= 0, such that|k| ≤MN′.

Sincekr6= 0, equations (21) and (23) have at most one solution forp′, N′, in terms ofx, y, z, r, j

and k. Since there are at most |A|3 _≤_M3

N choices for x, y and z, at most 4 choices for r,

and at most 4M2

N′ choices forj and k, we get that there are at most 16|A|

3_M2

N′ choices for

p′_{, N}′_{. Also, there are at most} _M

N′ choices for iin (20), and ﬁxingp′, N′, xandidetermines

s′_{. Thus,}

|U((s, p, N))| ≤16MN3MN3′.

⊓ ⊔

Open Problem. We conjecture that in fact,

lim sup

N→∞

W(N) logN =

3 2.

(12)

References

[1] N. Alon, J. H. Spencer.The Probabilistic Method (2000), John Wiley & Sons. New York. MR1885388

[2] R. Arratia, L. Goldstein, L. Gordon, Two moments suffice for Poisson approxima-tions: The Chen-Stein method.Ann. Probab.17(1989), 9–25. MR0972770 [3] P. Erdös, A. Rényi, On a new law of large numbers,J. Analyse Math. 23(1970),

103–111. MR0272026

[4] P. Erd¨os, P. Revesz, On the length of the longest head-run, Colloq. Math. Soc. Janos Bolyai 16(1977), 219–228. MR0478304

[5] R. Gradwohl, A. Yehudayoﬀ, t-wise independence with local dependencies.

arXiv:math.PR/0706.1637

[6] S. Janson, Large deviations for sums of partly dependent random variables.Random Structures Algorithms 24 (2004), no. 3, 234–248. MR2068873