
Electronic Journal of Probability

Vol. 15 (2010), Paper no. 41, pages 1319–1343.

Journal URL: http://www.math.washington.edu/~ejpecp/

Small deviations for beta ensembles

Michel Ledoux

Brian Rider

Abstract

We establish various small deviation inequalities for the extremal (soft edge) eigenvalues in the

β-Hermite and β-Laguerre ensembles. In both settings, upper bounds of the anticipated order on the variance of the largest eigenvalue follow immediately.

Key words: Random matrices, eigenvalues, small deviations.

AMS 2000 Subject Classification: Primary 60B20; Secondary 60F99.

Submitted to EJP on December 26, 2009, final version accepted July 25, 2010.

Institut de Mathématiques de Toulouse, Université de Toulouse, F-31062 Toulouse, France. ledoux@math.univ-toulouse.fr. Supported in part by the French ANR GRANDMA.

Department of Mathematics, University of Colorado at Boulder, Boulder, CO 80309. brian.rider@colorado.edu.

1 Introduction

In the context of their original discovery, the Tracy-Widom laws describe the fluctuations of the limiting largest eigenvalues in the Gaussian Orthogonal, Unitary, and Symplectic Ensembles (G{O/U/S}E) [23; 24]. These are random matrices of real, complex, or quaternion Gaussian entries, of mean zero and mean-square one, independent save for the condition that the matrix is symmetric (GOE), Hermitian (GUE), or appropriately self-dual (GSE). The corresponding Tracy-Widom distribution functions have shape

$$F_{TW}(t) \approx e^{-\frac{1}{24}\beta |t|^3} \ \text{ as } t \to -\infty, \qquad 1 - F_{TW}(t) \approx e^{-\frac{2}{3}\beta t^{3/2}} \ \text{ as } t \to \infty, \tag{1.1}$$

where β = 1 in the case of GOE, β = 2 for GUE, and β = 4 for GSE.

Since that time, it has become understood that the three Tracy-Widom laws arise in a wide range of models. First, the assumption of Gaussian entries may be relaxed significantly, see [21], [22] for instance. Outside of random matrices, these laws also describe the fluctuations in the longest increasing subsequence of a random permutation [2], the path weight in last passage percolation [11], and the current in simple exclusion [11; 25], among others.

It is natural to inquire as to the rate of concentration of these various objects about the limiting Tracy-Widom laws. Back in the random matrix setting, the limit theorem reads: with λmax the largest eigenvalue in the n × n GOE, GUE or GSE, it is the normalized quantity n^{1/6}(λmax − 2√n) which converges to Tracy-Widom. Thus, one would optimally hope for estimates of the form:

$$P\left(\lambda_{\max} - 2\sqrt{n} \le -\epsilon\sqrt{n}\right) \le C\, e^{-n^2\epsilon^3/C}, \qquad P\left(\lambda_{\max} - 2\sqrt{n} \ge \epsilon\sqrt{n}\right) \le C\, e^{-n\epsilon^{3/2}/C},$$

for all n ≥ 1, all ε ∈ (0, 1] say, and C a numerical constant. Such are “small deviation” inequalities, capturing exactly the finite-n scaling and limit distribution shape (compare (1.1)). Taking ε beyond O(1) in the above yields more typical large deviation behavior and different (Gaussian) tails (see below).

As discussed in [14; 15], the right-tail inequality for the GUE (as well as for the Laguerre Unitary Ensemble, again see below) may be shown to follow from results of Johansson [11] for a more general invariant model related to the geometric distribution that uses large deviation asymptotics and sub-additivity arguments. The left-tail inequality for the geometric model of Johansson (and thus by some suitable limiting procedure for the GUE and the Laguerre Unitary Ensemble) is established in [3] together with convergence of moments using delicate Riemann-Hilbert methods. We refer to [15] for a discussion and the relevant references, as well as for similar inequalities in the context of last passage percolation, etc. By the superposition-decimation procedure of [10], the GUE bounds apply similarly to the GOE (see also [16]).

Our purpose here is to present unified proofs of these bounds which apply to all of the so-called beta ensembles. These are point processes on ℝ defined by the n-level joint density: for any β > 0,

$$P(\lambda_1, \lambda_2, \ldots, \lambda_n) = \frac{1}{Z_{n,\beta}} \prod_{j<k} |\lambda_j - \lambda_k|^{\beta}\, e^{-(\beta/4)\sum_{k=1}^{n}\lambda_k^2}. \tag{1.2}$$

(1.2) has come to be referred to as the β-Hermite ensemble; we will denote it by H_β. Importantly, off of β = 1, 2, 4, despite considerable efforts (see [9], Chapter 13 for a comprehensive review), there appears to be no characterization of the correlation functions amenable to asymptotics. Still, Ramírez-Rider-Virág [19] have shown the existence of a general β Tracy-Widom law, TW_β, via the corresponding limit theorem: with self-evident notation,

$$n^{1/6}\left(\lambda_{\max}(H_\beta) - 2\sqrt{n}\right) \Rightarrow TW_\beta. \tag{1.3}$$

This result makes essential use of a (tridiagonal) matrix model valid at all beta due to Dumitriu-Edelman [5], and proves the conjecture of Edelman-Sutton [6]. As to finite-n bounds, we have:

Theorem 1. For all n ≥ 1, 0 < ε ≤ 1 and β ≥ 1:

$$P\left(\lambda_{\max}(H_\beta) \ge 2\sqrt{n}\,(1+\epsilon)\right) \le C\, e^{-\beta n\epsilon^{3/2}/C},$$

and

$$P\left(\lambda_{\max}(H_\beta) \le 2\sqrt{n}\,(1-\epsilon)\right) \le C^{\beta}\, e^{-\beta n^2\epsilon^{3}/C},$$

where C is a numerical constant.

The restriction to β ≥ 1 is somewhat artificial, though note that bounds of this type cannot remain meaningful all the way down to β = 0. Our methods do extend, with some caveats, to β < 1.

Theorem 1′. When 0 < β < 1, upper bounds akin to those in Theorem 1 hold as soon as n ≥ 2β^{−1}, with the right hand sides reading $(1-e^{-\beta/C})^{-1} e^{-\beta n\epsilon^{3/2}/C}$ for the right tail and $C\, e^{-\beta n^2\epsilon^3/C}$ for the left tail. A right tail upper bound is available without the restriction on n, but with the right hand side replaced by $(1-e^{-\beta^2/C})^{-1} e^{-\beta^{3/2} n\epsilon^{3/2}/C}$.

Our remaining results will have like extensions to β < 1. We prefer though to restrict the statements to β ≥ 1, which covers the cases of classical interest and allows for cleaner, more unified proofs. At this point we should also mention that for ε beyond O(1), the large-deviation right-tail inequality takes the form

$$P\left(\lambda_{\max}(H_\beta) \ge 2\sqrt{n}\,(1+\epsilon)\right) \le C\, e^{-\beta n\epsilon^{2}/C}. \tag{1.4}$$

For β = 1 and 2 this follows from standard net arguments on the corresponding Gaussian matrices (see e.g. [15]). For other values of β (this again for β ≥ 1), crude bounds on the tridiagonal models discussed below immediately yield the claim.

Continuing, those well versed in random matrix theory will know that this style of small deviation question is better motivated in the context of “null” Wishart matrices, given their application in multivariate statistics. Also known as the Laguerre Orthogonal or Unitary Ensembles (L{O/U}E), these are ensembles of type XX* in which X is an n × κ matrix comprised of i.i.d. real or complex Gaussians.

For β = 2 and κ a fixed multiple of n, a small deviation upper bound at the right tail (as well as the corresponding statement for the minimal eigenvalue in the “soft-edge” scaling) was known earlier (see [14; 15]), extended recently to non-Gaussian matrices in [8].

Once again there is a general beta version. Consider a density of the form (1.2) in which the Gaussian weight $w(\lambda) = e^{-\beta\lambda^2/4}$ on ℝ is replaced by $w(\lambda) = \lambda^{(\beta/2)(\kappa-n+1)-1} e^{-\beta\lambda/2}$, now restricted to ℝ₊. Here κ can be any real number strictly larger than n − 1. It is when κ is an integer and β = 1 or 2 that one recovers the eigenvalue law for the real or complex Wishart matrices just described. For general κ and β > 0 the resulting law on positive points λ₁, ..., λₙ is referred to as the β-Laguerre ensemble, here L_β for short.

Using a tridiagonal model for L_β introduced in [5], it is proved in [19]: for κ + 1 > n → ∞ with κ/n → c ≥ 1,

$$\frac{(\kappa n)^{1/6}}{(\sqrt{\kappa}+\sqrt{n})^{4/3}}\left(\lambda_{\max}(L_\beta) - (\sqrt{\kappa}+\sqrt{n})^2\right) \Rightarrow TW_\beta. \tag{1.5}$$

This covers all previous results for real/complex null Wishart matrices. Comparing (1.3) and (1.5) one sees that O(n^{2/3}ε) deviations in the Hermite case should correspond to deviations of order $(\kappa n)^{1/6}(\sqrt{\kappa}+\sqrt{n})^{2/3}\epsilon = O(\kappa^{1/2} n^{1/6}\epsilon)$ in the Laguerre case. That is, one might expect bounds exactly of the form found in Theorem 1 with appearances of n in each exponent replaced by $\kappa^{3/4} n^{1/4}$. What we have is the following.

Theorem 2. For all κ + 1 > n ≥ 1, 0 < ε ≤ 1 and β ≥ 1:

$$P\left(\lambda_{\max}(L_\beta) \ge (\sqrt{\kappa}+\sqrt{n})^2(1+\epsilon)\right) \le C\, e^{-\beta\sqrt{n\kappa}\,\epsilon^{3/2}\left(\frac{1}{\sqrt{\epsilon}} \wedge \left(\frac{\kappa}{n}\right)^{1/4}\right)/C},$$

and

$$P\left(\lambda_{\max}(L_\beta) \le (\sqrt{\kappa}+\sqrt{n})^2(1-\epsilon)\right) \le C^{\beta}\, e^{-\beta n\kappa\,\epsilon^{3}\left(\frac{1}{\epsilon} \wedge \left(\frac{\kappa}{n}\right)^{1/2}\right)/C}.$$

Again, C is some numerical constant.
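The two regimes of each exponent, and the crossover at the spectral width √(n/κ) discussed below, can be checked by direct arithmetic. A small sketch (the min-type form of the rates is taken from the reconstruction above, modulo the constant C):

```python
import math

# Exponents (modulo C) in the two regimes of Theorem 2:
#   right tail: sqrt(n*kappa) * eps^{3/2} * min(eps^{-1/2}, (kappa/n)^{1/4})
#   left tail:  n*kappa * eps^3 * min(eps^{-1}, (kappa/n)^{1/2})
def right_exponent(eps, n, kappa):
    return math.sqrt(n * kappa) * eps ** 1.5 * min(eps ** -0.5, (kappa / n) ** 0.25)

def left_exponent(eps, n, kappa):
    return n * kappa * eps ** 3 * min(eps ** -1.0, (kappa / n) ** 0.5)

n, kappa = 500, 8000
eps0 = math.sqrt(n / kappa)  # spectral width: where the crossover occurs

# below eps0 the anticipated kappa^{3/4} n^{1/4} rates of the Hermite analogy hold
eps = eps0 / 4
assert math.isclose(right_exponent(eps, n, kappa), kappa ** 0.75 * n ** 0.25 * eps ** 1.5)
assert math.isclose(left_exponent(eps, n, kappa), kappa ** 1.5 * n ** 0.5 * eps ** 3)

# at eps0 the small- and large-deviation regimes of each exponent agree
assert math.isclose(right_exponent(eps0, n, kappa), math.sqrt(n * kappa) * eps0)
assert math.isclose(left_exponent(eps0, n, kappa), n * kappa * eps0 ** 2)
```

In particular both candidate forms of each exponent match at ε = √(n/κ), which is the consistency noted in the discussion following the theorem.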

The right-tail inequality is extended to non-Gaussian matrices in [8]. The rather cumbersome exponents in Theorem 2 do produce the anticipated decay, though only for ε ≤ √(n/κ). For ε ≥ √(n/κ), the right and left tails become linear and quadratic in ε respectively. This is to say that the large deviation regime begins at the order O(√(n/κ)) rather than O(1) as in the β-Hermite case. To understand this, we recall that, normalized by 1/κ, the counting measure of the points is asymptotically supported on the interval with endpoints (1 ± √(n/κ))². This statement is precise with convergent n/κ, the limiting measure then being that of Marčenko-Pastur. Either way, √(n/κ) is identified as the spectral width, in contrast with the semi-circle law appearing in the β-Hermite case which is of width one (after similar normalization). Of course, in the more usual set-up when c₁n ≤ κ ≤ c₂n (c₁ ≥ 1 necessarily) all this is moot: the exponents above may then be replaced with −βnε^{3/2}/C and −βn²ε³/C for ε in an O(1) range with no loss of accuracy. And again, the large deviation tails were known in this setting for β = 1, 2.

An immediate consequence of the preceding is a finite-n (and/or κ) bound on the variance of λmax.

Corollary 3. Take β ≥ 1. Then,

$$\mathrm{Var}\left[\lambda_{\max}(H_\beta)\right] \le C_\beta\, n^{-1/3}, \qquad \mathrm{Var}\left[\lambda_{\max}(L_\beta)\right] \le C_\beta\, \kappa\, n^{-1/3}, \tag{1.6}$$

with now constant(s) C_β dependent upon β. (By Theorem 1′, the Hermite bound holds for β < 1 as well.)
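The n^{−1/3} scaling in the Hermite bound of (1.6) is easy to see numerically via the tridiagonal model introduced in Section 2. The following Monte Carlo sketch is ours (the sampling recipe is the standard Dumitriu-Edelman one; the tolerances are generous, not sharp constants):

```python
import numpy as np

def hermite_lmax(n, beta, rng):
    """Largest eigenvalue of a beta-Hermite tridiagonal matrix (Dumitriu-Edelman)."""
    diag = rng.normal(scale=np.sqrt(2.0), size=n)                 # N(0, 2) diagonal
    off = np.sqrt(rng.chisquare(beta * np.arange(n - 1, 0, -1)))  # chi off-diagonal
    H = (np.diag(diag) + np.diag(off, 1) + np.diag(off, -1)) / np.sqrt(beta)
    return np.linalg.eigvalsh(H)[-1]

rng = np.random.default_rng(1)
n, beta, trials = 100, 2.0, 300
samples = np.array([hermite_lmax(n, beta, rng) for _ in range(trials)])
scaled_var = samples.var(ddof=1) * n ** (1.0 / 3.0)
# Var[lmax] ~ C_beta * n^{-1/3}; for beta = 2 the constant is roughly Var(TW_2)
assert 0.1 < scaled_var < 3.0
```

Repeating the experiment over several n shows the rescaled variance stabilizing, consistent with Corollary 3.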

The same computation behind Corollary 3 implies that

$$\limsup_{n\to\infty}\; n^{p/6}\,\mathbf{E}\left|\lambda_{\max}(H_\beta) - 2\sqrt{n}\right|^p < \infty$$

for any p, and similarly for λmax(L_β). Hence, we also conclude that all moments of the (scaled) maximal eigenvalues converge to those for the TW_β laws (see [3] for β = 2).

Finally, there is the matter of whether any of the above upper bounds are tight. We answer this in the affirmative in the Hermite setting.

Theorem 4. There is a numerical constant C so that

$$P\left(\lambda_{\max}(H_\beta) \ge 2\sqrt{n}\,(1+\epsilon)\right) \ge C^{-\beta}\, e^{-C\beta n\epsilon^{3/2}},$$

and

$$P\left(\lambda_{\max}(H_\beta) \le 2\sqrt{n}\,(1-\epsilon)\right) \ge C^{-\beta}\, e^{-C\beta n^2\epsilon^{3}}.$$

The first inequality holds for all n > 1, 0 < ε ≤ 1, and β ≥ 1. For the second inequality, the range of ε must be kept sufficiently small, 0 < ε ≤ 1/C say.

Our proof of the right-tail lower bound takes advantage of a certain independence in theβ-Hermite tridiagonals not immediately shared by the Laguerre models, but the basic strategy also works in the Laguerre case. Contrariwise, our proof of the left-tail lower bound uses a fundamentally Gaussian argument that is not available in the Laguerre setting.

The next section introduces the tridiagonal matrix models and gives an indication of our approach. The upper bounds (Theorems 1, 1′, 2 and Corollary 3) are proved in Section 3; the lower bounds in Section 4. Section 5 considers the analog of the right-tail upper bound for the minimal eigenvalue in the β-Laguerre ensemble, this case holding the potential for some novelty granted the existence of a different class of limit theorems (hard edge) depending on the limiting ratio n/κ. While our method does produce a bound, the conditions on the various parameters are far from optimal. For this reason we relegate the statement, along with the proof and further discussion, to a separate section.

2 Tridiagonals

The results of [19] identify the general β > 0 Tracy-Widom law through a random variational principle:

$$TW_\beta = \sup_{f \in L}\left\{\frac{2}{\sqrt{\beta}}\int_0^\infty f^2(x)\, db(x) - \int_0^\infty\left[(f'(x))^2 + x f^2(x)\right] dx\right\}, \tag{2.1}$$

in which x ↦ b(x) is a standard Brownian motion and L is the space of functions f which vanish at the origin and satisfy $\int_0^\infty f^2(x)\, dx = 1$.

This variational point of view also guides the proof of the convergence of the centered and scaled λmax (of H_β or L_β) to TW_β. In particular, given the random tridiagonals which we are about to introduce, one always has a characterization of λmax through Rayleigh-Ritz. In [19], the point is to show this “discrete” variational problem goes over to the continuum problem (2.1) in a suitable sense. Furthermore, an analysis of the continuum problem has been shown to give sharp estimates on the tails of the β Tracy-Widom law (again see [19]). Our idea here is therefore to retool those arguments for the finite-n, or discrete, setting.
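The Rayleigh-Ritz characterization invoked here — λmax as the supremum of the quadratic form over unit vectors — can be checked directly on any symmetric tridiagonal matrix (a generic numerical illustration, not specific to the ensembles):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 50
d = rng.normal(size=n)
e = rng.normal(size=n - 1)
A = np.diag(d) + np.diag(e, 1) + np.diag(e, -1)  # symmetric tridiagonal

evals, evecs = np.linalg.eigh(A)
lmax, top = evals[-1], evecs[:, -1]

# Rayleigh-Ritz: <Av, v> <= lmax for every unit v, with equality at the top eigenvector
for _ in range(100):
    v = rng.normal(size=n)
    v /= np.linalg.norm(v)
    assert v @ A @ v <= lmax + 1e-8
assert abs(top @ A @ top - lmax) < 1e-8
```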

We start with the Hermite case. Let g₁, g₂, ..., gₙ be independent Gaussians with mean 0 and variance 2. Let also χ_β, χ_{2β}, ..., χ_{(n−1)β} be independent χ random variables of the indicated parameter. Then, re-using notation, [5] proves that the n eigenvalues of the random tridiagonal matrix

$$H_\beta = \frac{1}{\sqrt{\beta}}\begin{pmatrix} g_1 & \chi_{\beta(n-1)} & & & \\ \chi_{\beta(n-1)} & g_2 & \chi_{\beta(n-2)} & & \\ & \chi_{\beta(n-2)} & g_3 & \ddots & \\ & & \ddots & \ddots & \chi_{\beta} \\ & & & \chi_{\beta} & g_n \end{pmatrix}$$

have the joint law (1.2).
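The tridiagonal model just described is cheap to simulate. The following sketch (the sampling recipe is the standard Dumitriu-Edelman one; the numerical tolerances are our own loose choices) checks that λmax concentrates near 2√n at several β:

```python
import numpy as np

def sample_hermite_lmax(n, beta, rng):
    # diagonal: N(0, 2); off-diagonal: chi variables of parameter beta*(n-1), ..., beta
    diag = rng.normal(scale=np.sqrt(2.0), size=n)
    off = np.sqrt(rng.chisquare(beta * np.arange(n - 1, 0, -1)))
    H = (np.diag(diag) + np.diag(off, 1) + np.diag(off, -1)) / np.sqrt(beta)
    return np.linalg.eigvalsh(H)[-1]

rng = np.random.default_rng(7)
n = 300
for beta in (1.0, 2.0, 4.0):
    lmax = sample_hermite_lmax(n, beta, rng)
    # lmax = 2*sqrt(n) + O(n^{-1/6}) by (1.3)
    assert 0.9 < lmax / (2 * np.sqrt(n)) < 1.05
```

One n × n eigenvalue computation per sample suffices, which is what makes the finite-n questions of this paper easy to probe empirically.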

The problem at hand (Theorem 1) then becomes that of estimating

$$P\left(\sup_{\|v\|_2=1} H(v) \ge 2\epsilon\sqrt{n}\right) \quad \text{and} \quad P\left(\sup_{\|v\|_2=1} H(v) \le -2\epsilon\sqrt{n}\right), \tag{2.3}$$

in which $H(v) = \langle H_\beta v, v\rangle - 2\sqrt{n}\,\|v\|_2^2$, and where we have introduced the usual Euclidean norm $\|v\|_2^2 = \sum_{k=1}^{n} v_k^2$. To make the connection between H(v) and the continuum form (2.1) even more plain we have the following.

Lemma 5. For any c > 0 define

$$H_c(v) = \frac{1}{\sqrt{\beta}}\left(\sum_{k=1}^{n} g_k v_k^2 + 2\sum_{k=1}^{n-1}\left(\chi_{\beta(n-k)} - \mathbf{E}\chi_{\beta(n-k)}\right) v_k v_{k+1}\right) - c\sqrt{n}\sum_{k=0}^{n}(v_{k+1}-v_k)^2 - \frac{c}{\sqrt{n}}\sum_{k=1}^{n} k v_k^2.$$

Then there are numerical constants a ≥ b > 0 so that, for every v ∈ ℝⁿ,

$$H_a(v) \le H(v) \le H_b(v). \tag{2.5}$$

¹For β = 1 or 2 this can be seen by applying Householder transformations to the “full” GOE or GUE matrices.

We defer the proof until the end of the section, after a description of the allied L_β set-up. The point of Lemma 5 should be clear: for an upper bound on the first probability in (2.3) one may replace H by H_b with any sufficiently small b > 0, and so on. Lemma 5 also marks our first run-in with issues surrounding β ≥ 1 versus β < 1. An available extension to β < 1 reads:

Lemma 5′. If 0 < β < 1, estimates of type (2.5) with numerical a and b hold whenever n ≥ 2β^{−1}. This condition on n may be removed in the upper bound at the cost of choosing b = const·β^{1/2}.

The model for L_β is as follows. For κ > n − 1, introduce the random bidiagonal matrix

$$B_\beta = \frac{1}{\sqrt{\beta}}\begin{pmatrix} \chi_{\beta\kappa} & & & \\ \tilde{\chi}_{\beta(n-1)} & \chi_{\beta(\kappa-1)} & & \\ & \ddots & \ddots & \\ & & \tilde{\chi}_{\beta} & \chi_{\beta(\kappa-n+1)} \end{pmatrix},$$

with the same definition for the χ's and again all variables independent. (The use of χ̃ is meant to emphasize this independence between the diagonals.) Now [5] shows that it is the eigenvalues of L_β = (B_β)(B_β)^T which have the required joint density.² Note that L_β does not have independent entries.
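As with the Hermite case, this bidiagonal model is easy to simulate. A sketch (our indexing of the χ parameters follows the description above; the tolerances are loose assumptions) checking that λmax(L_β) concentrates near the soft edge (√κ + √n)²:

```python
import numpy as np

def sample_laguerre_lmax(n, kappa, beta, rng):
    # B: lower bidiagonal; diagonal chi_{beta*kappa}, ..., chi_{beta*(kappa-n+1)};
    # subdiagonal chi_{beta*(n-1)}, ..., chi_{beta}; then L = B B^T
    diag = np.sqrt(rng.chisquare(beta * (kappa - np.arange(n))))
    sub = np.sqrt(rng.chisquare(beta * np.arange(n - 1, 0, -1)))
    B = (np.diag(diag) + np.diag(sub, -1)) / np.sqrt(beta)
    return np.linalg.eigvalsh(B @ B.T)[-1]

rng = np.random.default_rng(3)
n, kappa = 200, 400
edge = (np.sqrt(kappa) + np.sqrt(n)) ** 2
for beta in (1.0, 2.0):
    lmax = sample_laguerre_lmax(n, kappa, beta, rng)
    assert 0.8 < lmax / edge < 1.1
```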

Similar to before, we define

$$L(v) = \frac{1}{\sqrt{\kappa}}\left(\langle L_\beta v, v\rangle - (\sqrt{\kappa}+\sqrt{n})^2\|v\|_2^2\right).$$

The added normalization by √κ makes for better comparison with the Hermite case. With this, and since κ > n − 1, to prove Theorem 2 is to establish bounds on the following analogs of (2.3):

$$P\left(\sup_{\|v\|_2=1} L(v) \ge \sqrt{\kappa}\,\epsilon\right) \quad \text{and} \quad P\left(\sup_{\|v\|_2=1} L(v) \le -\sqrt{\kappa}\,\epsilon\right).$$

Finally, we state the Laguerre version of Lemma 5. (We prove only the latter as they are much the same.)


Proof of Lemma 5. Writing

$$H(v) = \frac{1}{\sqrt{\beta}}\left(\sum_{k=1}^{n} g_k v_k^2 + 2\sum_{k=1}^{n-1}\chi_{\beta(n-k)}\, v_k v_{k+1}\right) - 2\sqrt{n}\sum_{k=1}^{n} v_k^2$$

shows it is enough to compare, for every v, the deterministic quadratic forms I(v) and J(v) built from the means of the off-diagonal entries. (We implicitly assume here that n > 1; for n = 1 there is nothing to do.) For this, there is the formula $\mathbf{E}\chi_r = 2^{1/2}\,\Gamma((r+1)/2)/\Gamma(r/2)$, along with two-sided bounds on the ratio of gamma functions. These bounds easily translate to

$$\frac{r}{\sqrt{r+1}} \le \mathbf{E}\chi_r \le \sqrt{r}, \tag{2.10}$$

valid for all r > 0 (this is due to Wendel, see now (2.2) of [17]).

In bounding J(v) above, it is the second inequality of (2.9) that is affected for β(n−k) < 1. We still wish it to hold, with perhaps the 2 replaced by some other constant C. That is, making use of (2.10) we want a constant C so that

p

nCpn+β(nk)

 p1

2β

C

p

 ,

and we can take C = 1 if n > 2β^{−1}.

For the lower bound on J(v), note that $\mathbf{E}[\chi_{\beta(n-k)}/\sqrt{\beta}] \ge \sqrt{n}/4$ for k ≤ n/2 still holds (i.e. we can still use (2.8)) when n > 2β^{−1}, and so everything is as before. On the other hand, (2.10) provides $\mathbf{E}[\chi_{\beta(n-k)}/\sqrt{\beta}] \ge (\sqrt{\beta}/2)\sqrt{n}$ on that same range, and so we always have J(v) ≥ b I(v) with b ∼ √β.

3 Upper Bounds

Theorems 1 and 2 are proved, first for the β-Hermite case with all details present (and comments on Theorem 1′ made along the way); a second subsection explains the modifications required for the β-Laguerre case. The proof of Corollary 3 appears at the end.

3.1 Hermite ensembles

Right-tail. This is the more elaborate of the two. The following is a streamlined version of what is needed.

Proposition 7. Consider the model quadratic form,

$$H_b(v,z) = \frac{1}{\sqrt{\beta}}\sum_{k=1}^{n} z_k v_k^2 - b\sqrt{n}\sum_{k=0}^{n}(v_{k+1}-v_k)^2 - \frac{b}{\sqrt{n}}\sum_{k=1}^{n} k v_k^2, \tag{3.1}$$

for fixed b > 0 and independent mean-zero random variables {z_k}_{k=1,...,n} satisfying the uniform tail bound $\mathbf{E}[e^{\lambda z_k}] \le e^{c\lambda^2}$ for all λ ∈ ℝ and some c > 0. There is a C = C(b,c) so that

$$P\left(\sup_{\|v\|_2=1} H_b(v,z) \ge \epsilon\sqrt{n}\right) \le (1-e^{-\beta/C})^{-1}\, e^{-\beta n\epsilon^{3/2}/C}$$

for all ε ∈ (0, 1] and n ≥ 1.

The proof of the above hinges on the following version of integration by parts (as in fact does the basic convergence result in [19]).

Lemma 8. Let s₁, s₂, ..., s_k, ... be real numbers, and set $S_k = \sum_{\ell=1}^{k} s_\ell$, S₀ = 0. Let further t₁, ..., tₙ be real numbers, t₀ = t_{n+1} = 0. Then, for every integer m ≥ 1,

$$\sum_{k=1}^{n} s_k t_k = \frac{1}{m}\sum_{k=1}^{n}\left[S_{k+m-1} - S_{k-1}\right] t_k + \sum_{k=0}^{n}\frac{1}{m}\sum_{\ell=k}^{k+m-1}\left[S_\ell - S_k\right](t_{k+1} - t_k).$$

Proof. For any T_k, k = 0, 1, ..., n, write

$$\sum_{k=1}^{n} s_k t_k = \sum_{k=0}^{n} S_k (t_k - t_{k+1}) = \sum_{k=0}^{n} T_k (t_k - t_{k+1}) + \sum_{k=0}^{n} (S_k - T_k)(t_k - t_{k+1}),$$

and apply Abel summation once more to the middle sum. Conclude by choosing $T_k = \frac{1}{m}\sum_{\ell=k}^{k+m-1} S_\ell$.
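This averaged summation-by-parts identity can be verified mechanically; in the sketch below the sequences are random and the block length m is arbitrary:

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 12, 4
s = rng.normal(size=n + m)                 # s_1, ..., s_{n+m} (1-indexed below)
t = np.zeros(n + 2)                        # t_0 = t_{n+1} = 0
t[1:n + 1] = rng.normal(size=n)
S = np.concatenate(([0.0], np.cumsum(s)))  # S_0 = 0, S_k = s_1 + ... + s_k

# checks: sum_k s_k t_k = (1/m) sum_{k=1}^n [S_{k+m-1} - S_{k-1}] t_k
#                       + sum_{k=0}^n (1/m) sum_{l=k}^{k+m-1} [S_l - S_k] (t_{k+1} - t_k)
lhs = sum(s[k - 1] * t[k] for k in range(1, n + 1))
term1 = sum((S[k + m - 1] - S[k - 1]) * t[k] for k in range(1, n + 1)) / m
term2 = sum((sum(S[l] - S[k] for l in range(k, k + m)) / m) * (t[k + 1] - t[k])
            for k in range(0, n + 1))
assert abs(lhs - (term1 + term2)) < 1e-9
```

The point of the identity in the proofs below is that the noise enters only through windowed increments S_{k+m−1} − S_{k−1}, whose size is controlled by the chaining estimates that follow.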

Next, by the Cauchy-Schwarz inequality, for every λ > 0,

Optimizing in λ, and then applying the same reasoning to the sequence −S, produces (3.4). From (3.4) it follows that

P

We may now dispense with the proof of Theorem 1 (right-tail). Before turning to the proof, we remark that if ε > 1, one may run through the above argument and simply choose m = 1 at the end to produce the classical form of the large deviation inequality (1.4) known previously for β = 1, 2. We turn to the values 0 < ε ≤ 1. The form (2.4) is split into two pieces,

$$H_b(v) = H_{b/2}(v, g) + \tilde{H}_{b/2}(v, \chi),$$

Proposition 7 applying to each.

The first term on the right is precisely of the form (3.1) with each z_k an independent mean-zero Gaussian of variance 2, which obviously satisfies the tail assumption with c = 1. The second term, $\tilde{H}_{b/2}(v,\chi)$, is a bit different, having noise present through the quantity $\sum_{k=1}^{n-1}(\chi_{\beta(n-k)} - \mathbf{E}\chi_{\beta(n-k)})\, v_k v_{k+1}$. But carrying out the integration by parts on $t_k = v_k v_{k+1}$ (and $s_k = \chi_{\beta(n-k)} - \mathbf{E}\chi_{\beta(n-k)}$), matters reduce to the following estimate.

Lemma 9. For χ a χ random variable,

$$\mathbf{E}[e^{\lambda\chi}] \le e^{\lambda\mathbf{E}\chi + \lambda^2/2}, \quad \text{for all } \lambda \in \mathbb{R}. \tag{3.8}$$

Proof. When the parameter r is greater than one, this is a consequence of the Log-Sobolev estimate for general gamma random variables. The density function $f(x) = c_r x^{r-1} e^{-x^2/2}$ on ℝ₊ satisfies $(\log f(x))'' \le -1$ (if r ≥ 1) and so the standard convexity criterion (see e.g. [13]) applies to yield a Log-Sobolev inequality (with the same constant as in the Gaussian case). Then the well known Herbst argument gives the bound (3.8).

For r < 1 set φ(λ) = E[e^{λχ}] and first consider λ ≥ 0. Differentiating twice, then integrating by parts, we have that

$$\varphi''(\lambda) = \lambda\varphi'(\lambda) + r\varphi(\lambda),$$

subject to φ(0) = 1, φ′(0) = Eχ =: e, for short. Note of course that φ′′(0) = E[χ²] = r. Now integrating twice we also find that: with $\psi(\lambda) = e^{-\lambda^2/2}\varphi(\lambda)$ and θ = r/(1+r) < 1,

$$\psi(\lambda) = 1 + e\lambda + \int_0^\lambda \left(r\lambda - (1+r)t\right)\psi(t)\, dt \le 1 + e\lambda + \frac{r^2}{1+r}\int_0^\lambda (\lambda - t)\,\psi(\theta t)\, dt.$$

Next, by the inequality $r/\sqrt{1+r} \le e$ already used above (proof of Lemma 5) we can continue the above as in $\psi(\lambda) \le 1 + e\lambda + e^2\int_0^\lambda(\lambda - t)\,\psi(\theta t)\, dt$. Iterating, we get a next term which reads

$$e^2\int_0^t (t-s)(1+e\theta s)\, ds \le e^2\int_0^t (t-s)(1+es)\, ds = e^2(t^2/2) + e^3(t^3/6),$$

and this easily propagates to complete the proof (which actually works for all r).

To prove (3.8) for λ < 0 set φ(λ) = E[e^{−λχ}] (viewing λ as nonnegative), and the basic differential equation becomes φ′′(λ) = λφ′(λ) + rφ(λ) once again, now with φ′(0) = −e. With p(λ) = φ′(λ)/φ(λ) this transforms to p′(λ) = −p²(λ) + λp(λ) + r. Just using p′(λ) ≤ λp(λ) + r, together with p ≤ 0, we find that

$$p(\lambda) \le p(0) + r\lambda \le -e + r\lambda, \quad \text{or} \quad \varphi(\lambda) \le e^{-e\lambda + r\lambda^2/2},$$

which is what we want (when of course r ≤ 1).
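The r = 2 case (where the density x·e^{−x²/2} has no singularity at the origin) makes for a quick quadrature check of (3.8); the grid and tolerance below are arbitrary choices of ours:

```python
import numpy as np

# chi with parameter r = 2 is the Rayleigh law: density x * exp(-x^2/2) on [0, inf)
def integral(f, x):
    return float(np.sum((f[1:] + f[:-1]) * np.diff(x)) / 2.0)  # trapezoid rule

x = np.linspace(0.0, 60.0, 400001)
dens = x * np.exp(-x ** 2 / 2.0)
mu = integral(x * dens, x)  # E[chi]; analytically sqrt(pi/2)

for lam in np.linspace(-4.0, 4.0, 41):
    mgf = integral(np.exp(lam * x) * dens, x)
    # Lemma 9: E[exp(lam*chi)] <= exp(lam*E[chi] + lam^2/2)
    assert mgf <= np.exp(lam * mu + lam ** 2 / 2.0) * (1.0 + 1e-6)
```

The bound is an equality at λ = 0 (both sides are 1), which is why a small numerical tolerance is included.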

Left-Tail. This demonstrates yet another advantage of the variational picture afforded by the tridiagonal models. Namely, the bound may be achieved by a suitable choice of test vector since

$$P\left(\sup_{\|v\|_2=1} H(v) \le -2C\epsilon\sqrt{n}\right) \le P\left(H(v) \le -2C\epsilon\sqrt{n}\,\|v\|_2^2\right)$$

for whatever {v_k}_{k=1,...,n} on the right hand side. This same idea was used in the large deviation estimates for TW_β in [19]. (Here we have thrown in the additional constant 2C for reasons that will be clear in a moment.) Simplifying, we write

P

Focus on the first term on the right of (3.9), and note that

P

with g a single standard Gaussian. Our choice of v is motivated as follows. The event in question asks for a large eigenvalue (think of p as large for a moment) of an operator which mimics a negative Laplacian plus potential. The easiest way to accomplish this would be for the potential to remain large on a relatively long interval, with a flat eigenvector taking advantage. We choose

vk=

(Here a ∼ b indicates that the ratio a/b is bounded above and below by numerical constants.) Substitution into (3.10) produces, for choice of C = C(a) large enough inside the probability on the left,

$$P\left(H_a(v,g) \le -C\epsilon\sqrt{n}\,\|v\|_2^2\right) \le e^{-\beta n^2\epsilon^3/C} \quad \text{for } n\epsilon^{3/2} \ge 1.$$

The restriction of the range of ε stems from the gradient-squared term; it also ensures that εn ≥ 1, which is required for our test vector to be sensible in the first place.

Next, as a consequence of Lemma 9 (see (3.8)) we have the bound: for c > 0,


With c = Cε√n and v as in (3.11), this may be further bounded by $e^{-\beta n^2\epsilon^3/C}$. Here too we should assume that nε^{3/2} ≥ 1.

Introducing a multiplicative constant of the advertised form extends the above bounds to the full range of ε in the most obvious way. Replacing ε with ε/2C throughout completes the proof.

Remark. With the exception of the restriction on n, nothing changes for the related conclusion in Theorem 1′.

3.2 Laguerre ensembles

Right-Tail. We wish to apply the same ideas from the Hermite case to the Laguerre form L_b(v) (for small b). Recall:

$$L_b(v) = \frac{1}{\sqrt{\beta}}\sum_{k=1}^{n} Z_k v_k^2 + \frac{1}{\sqrt{\beta}}\sum_{k=2}^{n} \tilde{Z}_k v_k^2 + \frac{2}{\sqrt{\beta}}\sum_{k=1}^{n-1} Y_k v_{k+1} v_k - b\sqrt{n}\sum_{k=0}^{n}(v_{k+1}-v_k)^2 - b\frac{1}{\sqrt{n}}\sum_{k=1}^{n} k v_k^2. \tag{3.14}$$

Here, Z_k, Z̃_k and Y_k are as defined in (2.7), and the appropriate versions of the tail conditions for these variables (in order to apply Proposition 7) are contained in the next two lemmas.

Lemma 10. For χ a χ random variable of positive parameter,

$$\mathbf{E}[e^{\lambda\chi^2}] \le e^{\mathbf{E}[\chi^2](\lambda + 2\lambda^2)} \quad \text{for all real } \lambda \le 1/4.$$

Proof. With r = E[χ²] > 0 and λ < 1/2,

$$\mathbf{E}[e^{\lambda\chi^2}] = \left(\frac{1}{1-2\lambda}\right)^{r/2}.$$

Now, since x ≥ −1/2 implies log(1+x) ≥ x − x², for any λ ≤ 1/4 the right hand side of the above is less than $e^{r(\lambda + 2\lambda^2)}$ as claimed.

Lemma 11. Let χ and χ̃ be independent χ random variables. Then, for every λ ∈ ℝ such that |λ| < 1,

$$\mathbf{E}\left[e^{\lambda(\chi\tilde{\chi} - \mathbf{E}[\chi\tilde{\chi}])}\right] \le \frac{1}{\sqrt{1-\lambda^2}}\exp\left(\frac{\lambda^2}{2(1-\lambda^2)}\left[\mathbf{E}[\chi]^2 + \mathbf{E}[\tilde{\chi}]^2 + 2\lambda\,\mathbf{E}[\chi]\mathbf{E}[\tilde{\chi}]\right]\right).$$

Proof. For |λ| < 1, using inequality (3.8) in the χ̃ variable,

$$\mathbf{E}[e^{\lambda\chi\tilde{\chi}}] \le \mathbf{E}\left[e^{\lambda\mathbf{E}[\tilde{\chi}]\chi + \lambda^2\chi^2/2}\right] = \int_{-\infty}^{\infty}\mathbf{E}\left[e^{\lambda(\mathbf{E}[\tilde{\chi}]+s)\chi}\right] d\gamma(s),$$

where γ is the standard normal distribution on ℝ. Now, for every s, with (3.8) in the χ variable,

$$\mathbf{E}\left[e^{\lambda(\mathbf{E}[\tilde{\chi}]+s)\chi}\right] \le e^{\lambda(\mathbf{E}[\tilde{\chi}]+s)\mathbf{E}[\chi] + \lambda^2[\mathbf{E}[\tilde{\chi}]+s]^2/2},$$

and the stated bound follows upon performing the resulting Gaussian integral in s.

What this means for the present application is that

$$\mathbf{E}[e^{\lambda Z_k}],\ \mathbf{E}[e^{\lambda \tilde{Z}_k}] \le e^{2\lambda^2} \quad \text{for all real } \lambda \le \sqrt{\beta\kappa}/4, \tag{3.15}$$

and

$$\mathbf{E}[e^{\lambda Y_k}] \le 2\, e^{12\lambda^2} \quad \text{for all real } \lambda \text{ with } |\lambda| \le \sqrt{\beta\kappa}/2\sqrt{2}. \tag{3.16}$$

To proceed, we split L_b(v) into three pieces now, isolating each of the noise components, and focus on the bound for $\sup_{\|v\|_2=1} L_{b/3}(v, Z)$ (the notation indicating (3.14) with only the Z noise term present). One must take some care when arriving at the analog of (3.4). In obtaining an inequality of the form $P(\triangle_m(J,Z) > t) \le C e^{-t^2/Cm}$ we must be able to apply (3.15) (and (3.16) when considering the Y noise term) with λ = O(t/m). But, examining (3.5) and (3.6) shows we only need consider t's of order $\sqrt{\beta}(\sqrt{n} + \sqrt{\kappa}\,\epsilon)\, m$. Thus we easily get by via

$$P\left(\sup_{\|v\|_2=1} L_b(v) \ge \sqrt{\kappa}\,\epsilon\right) \le C\, P\left(\sup_{\|v\|_2=1} L_{b/3}(v, Z) \ge \sqrt{\kappa}\,\epsilon/C\right) \tag{3.17}$$
$$\le \frac{C}{1 - e^{-\beta\epsilon\sqrt{\kappa/n}\, m^2/C}}\, e^{-\beta\kappa m\epsilon^2/C} + \frac{C}{1 - e^{-\beta/C}}\, e^{-\beta\sqrt{\kappa n}\,\epsilon m/C},$$

with C = C(b), compare (3.7). Setting m to be the nearest integer to $\frac{1}{\sqrt{\epsilon}}\left(\frac{n}{\kappa}\right)^{1/4}$ puts both exponential factors on the same footing, namely on the order of $e^{-\beta\kappa^{3/4} n^{1/4}\epsilon^{3/2}/C}$, and removes all ε, κ, and n dependence in the first prefactor. Certainly the best decay possible, but this requires ε ≤ √(n/κ). Otherwise, if ε ≥ √(n/κ), we simply choose m = 1, in which case the second term of (3.17) is the larger and produces decay $e^{-\beta\sqrt{n\kappa}\,\epsilon/C}$. Happily, both estimates agree at the common value ε = √(n/κ).

Remark. That Lemma 11 holds for χ's of any parameter will allow an extension to β < 1 in the same spirit as the Hermite case, granted working out a Lemma 6′ standing in the same relationship to Lemma 6 as Lemma 5′ does to Lemma 5.

Left-Tail. It is enough to produce the bound for $P(L_a(v,Z) \le -C\sqrt{\kappa}\,\epsilon\,\|v\|_2^2)$ for large a, given v ∈ ℝⁿ and a C = C(a), as in the Hermite case. Indeed, (3.15) and (3.16) show that L_a(v, Z̃) and L_a(v, Y) will follow suit.

We have the estimate

$$P\left(L_a(v,Z) \le -C\sqrt{\kappa}\,\epsilon\,\|v\|_2^2\right) \tag{3.18}$$
$$= P\left(\sum_{k=1}^{n}(-Z_k)\, v_k^2 \ge \sqrt{\beta}\left[C\sqrt{\kappa}\,\epsilon\,\|v\|_2^2 - a\sqrt{n}\,\|\nabla v\|_2^2 - (a/\sqrt{n})\,\|\sqrt{k}\, v\|_2^2\right]\right)$$
$$\le \exp\left(-\frac{\beta\left[C\sqrt{\kappa}\,\epsilon\,\|v\|_2^2 - a\sqrt{n}\,\|\nabla v\|_2^2 - (a/\sqrt{n})\,\|\sqrt{k}\, v\|_2^2\right]^2}{8\,\|v\|_4^4}\right).$$

Here we have introduced the shorthand

$$\|v\|_4^4 = \sum_{k=1}^{n} v_k^4, \qquad \|\nabla v\|_2^2 = \sum_{k=0}^{n}(v_{k+1}-v_k)^2, \qquad \|\sqrt{k}\, v\|_2^2 = \sum_{k=1}^{n} k v_k^2, \tag{3.19}$$


and have also used the fact that (3.15) applies just as well to −Z_k. In fact, the sign precludes any concern over the required choice of λ. For the Y-noise term, care must be taken on this point, but one may check that all is fine given our selection of v below.
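The exponential-moment step behind (3.18) is the usual Chernoff optimization: if each noise term satisfies E[e^{λX_k}] ≤ e^{2λ²}, then for T = Σ X_k v_k², minimizing e^{−λt}·e^{2λ²‖v‖₄⁴} over λ gives e^{−t²/(8‖v‖₄⁴)}. A numeric sketch of that one optimization (the subgaussian constant 2 is the one appearing in (3.15)):

```python
import numpy as np

rng = np.random.default_rng(2)
v = rng.normal(size=30)
t = 3.0
v4 = np.sum(v ** 4)  # ||v||_4^4

# Chernoff: inf_lam exp(-lam*t + 2*lam^2*||v||_4^4), attained at lam = t/(4*||v||_4^4)
lams = np.linspace(1e-4, 5.0, 200001)
numeric = np.min(np.exp(-lams * t + 2.0 * lams ** 2 * v4))
closed_form = np.exp(-t ** 2 / (8.0 * v4))
assert abs(numeric - closed_form) / closed_form < 1e-6
```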

For the small deviation regime, we use a slight modification of the Hermite test vector (3.11), with appearances of ε replaced by ε/δ. Substitution into (3.18) yields

P

For ε > √(n/κ), notice that the particularly simple choice of a constant v gives

P

Combined, these two bounds cover the claimed result, provided that κ^{3/2} n^{1/2} ε³ is chosen larger than one in the former. Extending this to the full range of ε, all remaining considerations are the same as in the Hermite setting.

3.3 Variances

We provide details for λmax(H_β); the Laguerre case is quite the same. (Neither is difficult.) Write

Var[λmax(H_β)] is controlled by the established tail bounds for any ε = O(1). Further, from the tridiagonal model we see that λmax stochastically dominates (1/√β) max_{1≤k≤n} g_k. Hence, for δ > 0 we have the cheap estimate P(λmax(H_β) ≤ −δ

for all ε > 0. This easily produces

Z 2

For the other range, recall that we mentioned at the end of the proof of the right-tail upper bound that the advertised estimate is easily extended to the large deviation regime (cf. (1.4)) to read

4 (Hermite) Lower Bounds

Right-Tail. This follows from another appropriate choice of test vector v. To get started, write

$$P\left(\sup_{\|v\|_2=1} H(v) \ge \epsilon\sqrt{n}\right) \ge P\left(H_a(v) \ge \epsilon\sqrt{n}\,\|v\|_2^2\right) \tag{4.1}$$
$$\ge P\left(H_a(v,g) \ge 2\epsilon\sqrt{n}\,\|v\|_2^2\right) - P\left(\chi(v) < -\epsilon\sqrt{n}\,\|v\|_2^2\right).$$

Here, as before, $\chi(v) = (2/\sqrt{\beta})\sum_{k=1}^{n-1}(\chi_{\beta(n-k)} - \mathbf{E}[\chi_{\beta(n-k)}])\, v_k v_{k+1}$.

Our choice of v is arrived at by examining the first factor above: with, as in the left-tail upper bound, a standard Gaussian g,

$$P\left(H_a(v,g) \ge 2\epsilon\sqrt{n}\,\|v\|_2^2\right) = P\left(\sqrt{\frac{2}{\beta}}\left(\sum_{k=1}^{n} v_k^4\right)^{1/2} g \ge 2\epsilon\sqrt{n}\sum_{k=1}^{n} v_k^2 + a\sqrt{n}\sum_{k=0}^{n}(v_{k+1}-v_k)^2 + \frac{a}{\sqrt{n}}\sum_{k=1}^{n} k v_k^2\right).$$

Now the intuition is that the eigenvalue (of a discretized −d²/dx² + potential) is being forced large positive, so the potential should localize with the eigenvector following suit.

Let then

$$v_k = \sqrt{\epsilon}\, k \wedge (1 - \sqrt{\epsilon}\, k) \ \text{ for } k \le \epsilon^{-1/2}, \ \text{ and otherwise } 0,$$

where we will assume that n ≥ ε^{−3/2} ≥ ε^{−1/2}. With these choices we have

$$\|v\|_2^2 \sim \|v\|_4^4 \sim \frac{1}{\sqrt{\epsilon}}, \qquad \|\nabla v\|_2^2 \sim \sqrt{\epsilon}, \qquad \|\sqrt{k}\, v\|_2^2 \sim \frac{1}{\epsilon},$$

(recall the notation from (3.19)) and thus the existence of a constant C = C(a) so that

$$P\left(H_a(v,g) \ge 2\epsilon\sqrt{n}\,\|v\|_2^2\right) \ge \frac{1}{C}\, e^{-C\beta n\epsilon^{3/2}}.$$

Similarly, returning to the second factor on the right hand side of (4.1) and invoking the estimate (3.13), we also have

$$P\left(\chi(v) \le -\epsilon\sqrt{n}\,\|v\|_2^2\right) \le e^{-\beta n\epsilon^{3/2}/C}$$

for the same choice of v. And granted nε^{3/2} ≥ 1, it follows that $P(\chi(v) \ge -\epsilon\sqrt{n}\,\|v\|_2^2) \ge 1 - e^{-1/C}$ throughout this regime. That is,

$$P\left(\sup_{\|v\|_2=1} H(v) \ge \epsilon\sqrt{n}\right) \ge \frac{1}{C}\, e^{-C\beta n\epsilon^{3/2}} \quad \text{whenever } n\epsilon^{3/2} \ge 1.$$

When nε^{3/2} ≤ 1, write, with ε₀ = n^{−2/3},

$$P\left(\sup_{\|v\|_2=1} H(v) \ge \epsilon\sqrt{n}\right) \ge P\left(\sup_{\|v\|_2=1} H(v) \ge \epsilon_0\sqrt{n}\right) \ge \frac{1}{C}\, e^{-\beta C} \ge \frac{1}{C}\, e^{-\beta C}\, e^{-C\beta n\epsilon^{3/2}},
$$


Left-Tail. This relies heavily on the right-tail upper bound. The first step is to reduce to a Gaussian setting via independence: for whatever b > 0,

P

Here we also use the notation of the proof of Theorem 1 (right-tail), from which we know that

P

(As β ≥ 1 we are simply dropping it from the exponent on the right at this stage.) Hence, if, as we regularly do, we start with an assumption like nε^{3/2} ≥ C₂ ≥ 1, it follows that

Turning to H_b(v,g) we make yet another decomposition of the noise term. Let L be an integer (1 ≤ L ≤ n) to be specified, and set $S_L = \frac{1}{L}\sum_{k=1}^{L} g_k$ (we simply drop the beta dependence at this intermediate stage). Note this choice requires ε ≤ b/6; it is here that the range of valid ε gets cut down in our final statement. In any case, putting the last remarks together we have proved that

again under the constraints nε^{3/2} ≥ C₂ and ε ≤ b/6. The last inequality follows as S_L is a mean-zero Gaussian with variance of order L^{−1}.

The range nε^{3/2} ≤ C₂ is handled as before,

$$P\left(\sup_{\|v\|_2=1} H_{2b}(v) \le -\epsilon\sqrt{n}\right) \ge P\left(\sup_{\|v\|_2=1} H_{2b}(v) \le -\epsilon_0\sqrt{n}\right) \ge \frac{1}{C}\, e^{-\beta C_5} \ge \frac{1}{C}\, e^{-\beta C_5}\, e^{-C\beta n^2\epsilon^3},$$

where ε₀ = (C₂/n)^{2/3}. As ε₀ must lie under b/6, this last selection requires n ≥ (6/b)^{3/2} C₂, but smaller values of n can now be covered by adjusting the constant.

It remains to go back and verify that $P(\sup_{\|v\|_2=1} H_b(v,\eta) \ge \epsilon\sqrt{n}) \le C\, e^{-\beta n\epsilon^{3/2}/C}$. The only reason that Proposition 7 cannot be followed verbatim is that the η_k's are not independent, the first L of them being tied together through S_L. We need the appropriate Gaussian tail inequality for the variables

$$\triangle_m(k,\eta) = \max_{k < \ell \le k+m}\left|\sum_{j=k}^{\ell}\eta_j\right|,$$

and, comparing with (3.4), an estimate of type $P(\triangle_m(k,\eta) > t) \le C\, e^{-t^2/Cm}$ suffices. But

$$\sum_{j=k}^{\ell}\eta_j = \sum_{j=k}^{\ell} g_j + (\ell\wedge L - k\wedge L)\, S_L,$$

and so

$$P\left(\triangle_m(k,\eta) > t\right) \le P\left(\triangle_m(k,g) > t/2\right) + P\left(m\, S_L > t/2\right).$$

The first term we have already seen to be of the required order, and the second is less than $e^{-Lt^2/8m^2}$. Since we only apply this bound in the present setting when L = Cnε and m = [ε^{−1/2}] (the choice made in Proposition 7), we have that $P(m S_L > t/2) \le e^{-t^2/Cm}$, and the proof is complete.

5 Minimal Laguerre Eigenvalue

While not detailed there, the results of [19] imply that

$$\frac{(\kappa n)^{1/6}}{(\sqrt{\kappa}-\sqrt{n})^{4/3}}\left((\sqrt{\kappa}-\sqrt{n})^2 - \lambda_{\min}(L_\beta)\right) \Rightarrow TW_\beta, \tag{5.1}$$

whenever κ, n → ∞, κ/n → c > 1. This appraisal was long understood for the minimal eigenvalue of L{O/U}E, and has recently been extended to non-Gaussian versions of those ensembles in [8]. The condition κ/n → c > 1 keeps the limiting spectral density supported away from the origin, resulting in the same soft-edge behavior that one has for λmax. If instead κ − n remains fixed in the limit, one has a different scaling and different limit law(s) for λmin, the so-called hard-edge distributions. Granted the existence of the “hard-to-soft transition” for all β > 0 (see [4] and [18]) it is believed that (5.1) holds as long as κ − n → ∞, but (to the best of our knowledge) this has not been explicitly worked out in any setting.
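The soft-edge behavior of λmin under κ/n → c > 1 is visible in simulation of the bidiagonal model of Section 2 (again a sketch of ours, with loose tolerances):

```python
import numpy as np

def laguerre_extremes(n, kappa, beta, rng):
    # bidiagonal model: L = B B^T with chi-distributed entries
    diag = np.sqrt(rng.chisquare(beta * (kappa - np.arange(n))))
    sub = np.sqrt(rng.chisquare(beta * np.arange(n - 1, 0, -1)))
    B = (np.diag(diag) + np.diag(sub, -1)) / np.sqrt(beta)
    evals = np.linalg.eigvalsh(B @ B.T)
    return evals[0], evals[-1]

rng = np.random.default_rng(5)
n, kappa, beta = 200, 800, 2.0
lmin, lmax = laguerre_extremes(n, kappa, beta, rng)
# hard/soft edges of the support: (sqrt(kappa) -/+ sqrt(n))^2, cf. (5.1)
low = (np.sqrt(kappa) - np.sqrt(n)) ** 2
high = (np.sqrt(kappa) + np.sqrt(n)) ** 2
assert 0.8 < lmin / low < 1.3
assert 0.85 < lmax / high < 1.1
```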

Theorem 12. Let β ≥ 1 and κ ≥ n + 1. Then the bound (5.2) holds. Our condition on ε is certainly not very satisfactory, although still sensible given the fluctuations in (5.1). One would hope for the range of ε to be understandable in terms of the soft/hard edge picture; what we have here arises from technicalities. On the other hand, if we place an additional, “soft-edge” type restriction on κ and n, we obtain a more natural looking estimate.

Corollary 13. Again take β ≥ 1, but now assume that κ > cn for c > 1. The right hand side of (5.2) may then be replaced by $C\, e^{-\beta n\epsilon^{3/2}/C}$ for a C = C(c), with the resulting bound valid for all 0 < ε ≤ 1.

The last statement should be compared with Corollary V.2.1(b) of[8], which applies to classes of non-Gaussian matrices.

As to the proof, we proceed in a by now familiar way. We first set α = 1 − √(n/κ). Then, after a rescaling of ε, we will prove the equivalent

$$P\left(\sup_{\|v\|=1} L(v) \ge \alpha^{4/3}\epsilon\sqrt{n}\right) \le C\, e^{-\beta n\epsilon^{3/2}/C} \quad \text{for } \epsilon \le \min\left(\alpha^4 n^{4/3},\, \alpha^{8/3} n^{-2/5}\right).$$

Similar to the strategy employed above, a series of algebraic manipulations shows that we can work instead with a simplified quadratic form L′(v).

We remark that under the added condition κ > cn for c > 1, α is bounded uniformly from below, and $\frac{1}{\beta\sqrt{\kappa}}\,\mathbf{E}[\chi_{\beta(\kappa-k+1)}\tilde{\chi}_{\beta(n-k)}]$ is bounded below by a constant multiple of $\sqrt{n-k}$. Hence, the deterministic part of L′ is bounded above by a small negative multiple of $\sqrt{n}\sum_{k=1}^{n-1}(v_{k+1}+v_k)^2 + \frac{1}{\sqrt{n}}\sum_{k=1}^{n} k v_k^2$.

The proof of Corollary 13 is then identical to that of the right-tail upper bound for λmax(H_β). Back to Theorem 12 and α's unbounded from below, we begin by rewriting the noise term in L′ as

in which

$$U_k = \frac{1}{\sqrt{\beta\kappa}}\left[\left(\chi_{\beta(\kappa-k+1)} - \tilde{\chi}_{\beta(n-k)}\right)^2 - \mathbf{E}\left(\chi_{\beta(\kappa-k+1)} - \tilde{\chi}_{\beta(n-k)}\right)^2\right], \quad k = 1, \ldots, n,$$

(with the convention that χ̃₀ = 0). The idea is the following. For moderate k, Var(U_k) = O(α²), and thus it is this contribution to the noise which balances the drift term $\frac{\alpha^2}{\sqrt{n}}\sum_{k=1}^{n} k v_k^2$. Also, one may check that in the continuum limit the optimal v is such that |v_k + v_{k+1}| = o(1), and so the Z̃ and Y terms should “wash out”.

We complete the argument in two steps. In step one, we simply drop the Z̃ and Y terms and apply the method in Proposition 7 to the further simplified form

$$L(v,U) = \frac{1}{\sqrt{\beta}}\sum_{k=1}^{n}(-U_k)\, v_k^2 - \sum_{k=1}^{n-1}\frac{\mathbf{E}[\chi_{\beta(\kappa-k+1)}\tilde{\chi}_{\beta(n-k)}]}{\beta\sqrt{\kappa}}\,(v_{k+1}+v_k)^2 - \frac{\alpha^2}{\sqrt{n}}\sum_{k=1}^{n} k v_k^2. \tag{5.4}$$

Even here we lose a fair bit in our estimates (resulting in non-optimal conditions on ε) due to the variable coefficient in the energy term. Step two shows that, under yet additional restrictions on ε, the Z̃ and Y noise terms may be absorbed into L(v,U).

Step 1. We wish to prove $P(\sup_{\|v\|=1} L(v,U) \ge \alpha^{4/3}\epsilon\sqrt{n}) \le C\, e^{-\beta n\epsilon^{3/2}/C}$ for some range of ε > 0. (The optimal range being 0 < ε ≤ (κ/n)^{1/2}α^{2/3}.) A first ingredient is a tail bound on the U_k variables, for which we first bring in the following.

Lemma 14. (Aida, Masuda, Shigekawa [1]) Given a measure η on the line which satisfies a logarithmic Sobolev inequality with constant C > 0, there is the estimate

$$\int e^{\lambda(F^2 - \mathbf{E}[F^2])}\, d\eta \le 2\, e^{8C^2\lambda^2\,\mathbf{E}[F]^2} \quad \text{whenever } |\lambda| \le \frac{1}{16C},$$

for any 1-Lipschitz function F.

As a consequence, we have that:

Corollary 15. Let χ and χ̃ be independent χ random variables (each of parameter larger than one) and set U = (χ − χ̃)² and σ = E[χ − χ̃]. There exists a numerical constant C > 0 such that

E[e^{λ(U−EU)}] ≤ C e^{Cσ²λ²}

for all real λ ∈ (−1/C, 1/C).

Indeed, by the general theory (see Thm. 5.2 of [13] for example) the distribution of the pair (χ, χ̃) on R⁺ × R⁺ satisfies a logarithmic Sobolev inequality. The lemma then applies with F(x, y) = x − y. In our setting, we record this bound as

E[e^{λU_k}] ≤ C e^{Cσ_k²λ²/βκ} for |λ| < √(βκ)/C and σ_k = E[χ_{β(κ−k+1)} − χ̃_{β(n−k)}].
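To spell out the scaling step (a sketch using only Corollary 15 and the definition of U_k): write V_k = (χ_{β(κ−k+1)} − χ̃_{β(n−k)})² − E(χ_{β(κ−k+1)} − χ̃_{β(n−k)})², so that U_k = V_k/√(βκ), and apply Corollary 15 to V_k at the point λ/√(βκ):

```latex
\mathbf{E}\, e^{\lambda U_k}
  = \mathbf{E}\, e^{(\lambda/\sqrt{\beta\kappa})\, V_k}
  \le C\, e^{C \sigma_k^2 \lambda^2 / \beta\kappa},
  \qquad \text{valid for } \bigl|\lambda/\sqrt{\beta\kappa}\bigr| < 1/C,
  \ \text{that is, } |\lambda| < \sqrt{\beta\kappa}/C .
```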


Picking up the thread of Proposition 7, the variable coefficient in the energy term of L(v,U) is dealt with by applying the Cauchy-Schwarz argument with λ = λ_k defined by

λ_k = (1/(β√κ)) E[χ_{β(κ−k+1)} χ̃_{β(n−k)}],  k = 1, …, n−1,  (5.5)

compare (3.3). Schematically, we are left to bound

Σ_{j=1}^{[n/m]} P( (1/(m√β)) Δ_m(jm,U) − (1/(λ_{jm}β)) Δ_m(jm,U)² ≥ (α²/√n) jm + ǫα^{4/3}√n )  (5.6)

for our choice of integer m. The Δ_m(·,U) notation stands in analogy to that used in Section 3. Note we have taken the liberty to drop various constants and shifts of indices in the above display (which are irrelevant to the upshot).

Here the dependence of σ_k and λ_k on the relationship between n and κ comes into play. While at the top of the form everything works as anticipated, these quantities behave unfavorably for k near n. For this reason we deal with the sum (5.6) by dividing the range into j ≤ n/2m and j > n/2m with the help of the appraisals:

σ_k² ≤ { Cα², 1 ≤ k ≤ n/2;  C, n/2 < k ≤ n },   λ_k ≥ { √n/C, 1 ≤ k ≤ n/2;  (α/C)√(n−k), n/2 < k < n }.  (5.7)

Restricted to j ≤ n/2m (and hence substituting σ²_{jm} = Cα², λ_{jm} = √n/C), the sum (5.6) can be bounded by C e^{−βnǫ^{3/2}/C} upon choosing m = [ǫ^{−1/2}α^{−2/3}]. This holds for all values of ǫ so long as the choice of m is sensible, requiring that ǫ ≥ α^{−4/3}n^{−2}. But this is ensured if κ ≥ n+1 and ǫ^{3/2}n ≥ 1 (the former having been built into the hypotheses and the latter we may always assume). On the range j ≥ n/2m the ǫ term on the right hand side within the probabilities is of no help, and we use, along with σ²_{jm} ≤ C and λ_{jm} ≥ (α/C)√(n−jm), the crude estimates

Σ_{n/2m ≤ j < n/m} P( Δ_m(jm,U) ≥ √β α²m² j/√n ) ≤ C Σ_{j ≥ n/2m} e^{−βm³α⁴j²/Cn} ≤ C m^{−2}α^{−4} e^{−βmα⁴n/C},

and

Σ_{n/2m ≤ j < n/m} P( Δ_m(jm,U)² ≥ 2βα²m λ_{jm} j/√n ) ≤ C Σ_{1 ≤ j ≤ n/2m} e^{−βα³(n/m)^{1/2}j^{1/2}/C} ≤ C (1 + m/(nα⁶)) e^{−βα³(n/m)^{1/2}/C}.
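The tail sums appearing in these crude estimates are controlled by the elementary bound Σ_{j≥j₀} e^{−aj²} ≤ e^{−aj₀²}/(1 − e^{−2aj₀}) ≤ (1 + 1/(2aj₀)) e^{−aj₀²}, obtained from j² ≥ j₀² + 2j₀(j − j₀); with a = βm³α⁴/Cn and j₀ = n/2m this produces the prefactor C m^{−2}α^{−4} above. A quick numerical sanity check of the bound itself (the values of a and j₀ are arbitrary illustrations):

```python
import math

def tail_sum(a, j0):
    # direct evaluation of sum_{j >= j0} exp(-a j^2), stopping once terms underflow
    s = 0.0
    j = j0
    while True:
        t = math.exp(-a * j * j)
        if t < 1e-300:
            break
        s += t
        j += 1
    return s

def geometric_bound(a, j0):
    # j^2 >= j0^2 + 2 j0 (j - j0) for j >= j0 dominates the tail by a geometric series
    return math.exp(-a * j0 * j0) / (1.0 - math.exp(-2.0 * a * j0))

for a, j0 in [(0.01, 10), (0.001, 50), (0.1, 5)]:
    assert tail_sum(a, j0) <= geometric_bound(a, j0)
```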

The choice of m = [ǫ^{−1/2}α^{−2/3}] being fixed, we can bound each of the above by the desired C e^{−βnǫ^{3/2}/C} only by restricting ǫ to be sufficiently small. The first estimate requires ǫ ≤ α^{20/3}, the second requires in addition that ǫ ≤ α^{8/3}n^{−2/5} (and again uses ǫ^{3/2}n ≥ 1).
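For instance, the second restriction can be traced directly through the exponents (a reading of the exponents only, constants suppressed; recall m = ǫ^{−1/2}α^{−2/3}): matching e^{−βα³(n/m)^{1/2}/C} against the target e^{−βnǫ^{3/2}/C} requires

```latex
\alpha^{3} (n/m)^{1/2}
  = \alpha^{3}\, n^{1/2}\, \epsilon^{1/4} \alpha^{1/3}
  = \alpha^{10/3}\, n^{1/2}\, \epsilon^{1/4}
  \ \gtrsim\ n\,\epsilon^{3/2}
  \iff \epsilon^{5/4} \lesssim \alpha^{10/3}\, n^{-1/2}
  \iff \epsilon \lesssim \alpha^{8/3}\, n^{-2/5}.
```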

In summary,

P( sup_{‖v‖=1} L(v,U) ≥ ǫα^{4/3}√n ) ≤ C e^{−βnǫ^{3/2}/C} for ǫ ≤ min(α^{20/3}, α^{8/3}n^{−2/5}).  (5.8)

It is perhaps worth mentioning here that the bounds on λ_k and σ_k² for the range k > n/2 introduced in (5.7) may be improved slightly, though not apparently with great effect on the final result.

Step 2. To absorb the Z̃, Y noise terms, we show that L′(v) ≤ L̃(v,U) + E(Z̃,Y,v) with a new quadratic form L̃(v,U) of the same shape as L(v,U) and an error term E(Z̃,Y,v) collecting the Z̃ and Y contributions. (Recall the definition of λ_k from (5.5).) Then, an application of the Cauchy-Schwarz inequality yields, for all v of length one, a bound on the error term E(Z̃,Y,v); comparing the resulting restrictions on ǫ to those stated in (5.8) completes the proof.


References

[1] AIDA, S., MASUDA, T., AND SHIGEKAWA, I. (1994) Logarithmic Sobolev inequalities and exponential integrability. J. Funct. Anal. 126, no. 1, 83–101. MR1305064

[2] BAIK, J., DEIFT, P., JOHANSSON, K. (1999) On the distribution of the length of the longest

increasing subsequence of random permutations. J. Amer. Math. Soc.12, no. 4, 1119–1178. MR1682248

[3] BAIK, J., DEIFT, P., MCLAUGHLIN, K., MILLER, P., AND ZHOU, Z. (2001) Optimal tail estimates for directed last passage site percolation with geometric random variables. Adv. Theor. Math. Phys. 5, 1207–1250. MR1926668

[4] BORODIN, A., FORRESTER, P. (2003) Increasing subsequences and the hard-to-soft transition in

matrix ensembles.J. Phys. A: Math and Gen.36, no. 12, 2963-2982. MR1986402

[5] DUMITRIU, I.ANDEDELMAN, A. (2002) Matrix models for beta ensembles.J. Math. Phys.43, no.

11, 5830-5847. MR1936554

[6] EDELMAN, A., SUTTON, B. (2007) From random matrices to stochastic operators.J. Stat. Phys. 127, no. 6, 1121-1165. MR2331033

[7] EL KAROUI, N. (2003) On the largest eigenvalue of Wishart matrices with identity covariance when n, p, and p/n → ∞. To appear, Bernoulli.

[8] FELDHEIM, O., SODIN, S. (2009) A universality result for the smallest eigenvalues of certain

sample covariance matrices.To appear, Geom. Funct. Anal.

[9] FORRESTER, P. Log-gases and Random Matrices. London Math. Soc. Monographs. Princeton University Press, 2010. MR2641363

[10] FORRESTER, P., AND RAINS, E. (2001) Interrelations between orthogonal, unitary and symplectic matrix ensembles. In Random Matrix Models and their Applications, volume 40 of Math. Sci. Res. Inst. Publ., 171–207. Cambridge University Press. MR1842786

[11] JOHANSSON, K. (2000) Shape fluctuations and random matrices.Comm. Math. Phys.209, no. 2, 437-476. MR1737991

[12] JOHNSTONE, I. M. (2001) On the distribution of the largest eigenvalue in principal components

analysis.Ann. Statist.29, no. 2, 295-327. MR1863961

[13] LEDOUX, M. The Concentration of Measure Phenomenon. Mathematical Surveys and Monographs 89. Amer. Math. Soc. 2001. MR1849347

[14] LEDOUX, M. (2004) Differential operators and spectral distributions of invariant ensembles

from the classical orthogonal polynomials: The continuous case.Elect. Journal Probab.9, 177-208. MR2041832

[15] LEDOUX, M. (2007) Deviation inequalities on largest eigenvalues. Lecture Notes in Math. 1910, Springer.

[16] LEDOUX, M. (2009) A recursion formula for the moments of the Gaussian Orthogonal Ensemble. Annales Inst. H. Poincaré 45, 754–769. MR2548502

[17] QI, F. (2010) Bounds for the Ratio of Two Gamma Functions. Journal of Inequalities Appl. 2010, Article ID 493058. MR2611044

[18] RAMÍREZ, J., RIDER, B. (2009) Diffusion at the random matrix hard edge.Comm. Math. Phys. 288, no. 3, 887-906. MR2504858

[19] RAMÍREZ, J., RIDER, B., AND VIRÁG, B. (2007) Beta ensembles, stochastic Airy spectrum and a

diffusion.Preprint, arXiv:math.PR/0607331.

[20] SILVERSTEIN, J. (1985) The smallest eigenvalue of a large dimensional Wishart matrix. Ann. Probab. 13, no. 4, 1364–1368. MR0806232

[21] SOSHNIKOV, A. (1999) Universality at the edge of the spectrum in Wigner random matrices.

Comm. Math. Phys.207, 697-733. MR1727234

[22] TAO, T., AND VU, V. (2009) Random matrices: Universality of local eigenvalue statistics up to

the edge.Preprint, arXiv:0908.1982.

[23] TRACY, C.,ANDWIDOM, H. (1994) Level spacing distributions and the Airy kernel.Comm. Math.

Phys.159, no. 1, 151-174. MR1257246

[24] TRACY, C., AND WIDOM, H. (1996) On orthogonal and symplectic matrix ensembles. Comm.

Math. Phys.177, no. 3, 727-754. MR1385083

[25] TRACY, C., AND WIDOM, H. (2009) Asymptotics in ASEP with Step Initial Condition. Comm.

Math. Phys.290, no. 1, 129-154. MR2520510

[26] TROTTER, H. F. (1984) Eigenvalue distributions of large Hermitian matrices; Wigner's semi-circle law and a theorem of Kac, Murdock, and Szegő. Adv. in Math. 54, 67–82.
