getdoc41e2. 222KB Jun 04 2011 12:04:18 AM

(1)

El e c t ro n ic

Jo ur

n a l o

f P

r o

b a b i_{l i t y}

Vol. 14 (2009), Paper no. 83, pages 2391–2417. Journal URL

http://www.math.washington.edu/~ejpecp/

CLT for Linear Spectral Statistics of Wigner matrices

∗

Zhidong Bai

†

Xiaoying Wang

‡

Wang Zhou

§

Abstract

In this paper, we prove that the spectral empirical process of Wigner matrices under sixth-moment conditions, which is indexed by a set of functions with continuous fourth-order deriva-tives on an open interval including the support of the semicircle law, converges weakly in finite dimensions to a Gaussian process.

Key words:Bernstein polynomial, central limit theorem, Stieltjes transform, Wigner matrices. AMS 2000 Subject Classification:Primary 15B52, 60F15; Secondary: 62H99.

Submitted to EJP on July 17, 2009, final version accepted October 14, 2009.

∗_{The first two authors were partially supported by CNSF 10871036 and NUS grant R-155-000-079-112 and the third} author was partially supported by grant R-155-000-076-112 at the National University of Singapore

†_{School of Mathematics and Statistics, Northeast Normal University, Changchun, 130024, P.R.C. Email:} [email protected]

‡_{School of Mathematics and Statistics, Northeast Normal University, Changchun, 130024, P.R.C. Email:} [email protected]

§_{Department of Statistics and Applied Probability, National University of Singapore,117546, Singapore.} _Email:

(2)

1 Introduction and Main result

The random matrices theory originates from the development of quantum mechanics in the 1950’s. In quantum mechanics, the energy levels of a system are described by eigenvalues of a Hermitian operator on a Hilbert space. Hence physicists working on quantum mechanics are interested in the asymptotic behavior of the eigenvalues and from then on, the random matrices theory becomes a very popular topic among mathematicians, probabilitists and statisticians. The leading work, the famous semi-circle law for Wigner matrices was found in[19].

A real Wigner matrix of sizenis a real symmetric matrixW_n= (x_{i j})₁_≤_i_,_j_≤_nwhose upper-triangle en-tries(xi j)1≤i≤j≤nare independent, zero-mean real-valued random variables satisfying the following moment conditions:

(1)._∀i, E_|xii|2=σ2>0; (2).∀i< j, E|xi j|2=1.

The set of these real Wigner matrices is called the Real Wigner Ensemble (RWE).

A complex Wigner matrix of size n is a Hermitian matrix W_n whose upper-triangle entries

(xi j)1≤i≤j≤n are independent, zero-mean complex-valued random variables satisfying the follow-ing moment conditions:

(1).∀i, E_|xii|2=σ2>0; (2).∀i< j, E|xi j|2=1, andExi j2 =0.

The set of these complex Wigner matrices is called the Complex Wigner Ensemble (CWE).

The empirical distribution F_n generated by the n eigenvalues of the normalized Wigner matrix

n−1/2Wn is called the empirical spectral distribution (ESD) of Wigner matrix. The semi-circle law states thatF_na.s. converges to the distributionF with the density

p(x) = 1

2π

p

4−x2_, _x _∈_[₋_{2, 2}_]_.

Its various versions of convergence were later investigated. See, for example,[1],[2].

Clearly, one method of refining the above approximation is to establish the rate of convergence, which was studied in[3],[10],[12],[13],[18],[5]and[8]. Although the exact convergence rate remains unknown for Wigner matrices, Bai and Yao[6]proved that the spectral empirical process of Wigner matrices indexed by a set of functions analytic on an open domain of the complex plane including the support of the semi-circle law converges to a Gaussian process under fourth moment conditions.

To investigate the convergence rate of the ESD of Wigner matrix, one needs to use f as step func-tions. However, many evidences show that the empirical process associated with a step function can not converge in any metric space, see Chapter 9 of[8]. Naturally, one may ask whether it is possible to derive the convergence of the spectral empirical process of Wigner matrices indexed by a class of functions under as less assumption on the smoothness as possible. This may help us to have deeper understanding on the exact convergence rate of ESD to semi-circle law.

(3)

fourth-order derivatives, where_U is an open interval including the interval[₋2, 2], the support of

F(x). The empirical processGn¬{Gn(f)}indexed byC4(U)is given by

G_n(f)¬n

Z ∞

−∞

f(x)[F_n−F](d x), f ∈C4(_U) (1.1)

In order to give a unified treatment for Wigner matrices, we define the parameterκwith values 1 and 2 for the complex and real Wigner matrices respectively. Also setβ=E(_|x12|2−1)2−κ. Our

main result is as follows.

Theorem 1.1. Suppose

E_|x_{i j}_|6_≤M for all i,j. (1.2)

Then the spectral empirical process G_n =_{G_n(f): f _∈C4(_U)_}converges weakly in finite dimensions to a Gaussian process G:=_{G(f): f ∈C4(_U)_}with mean function

EG(f) =κ−1

4 [f(2) + f(−2)]−

κ−1

2 τ0(f) + (σ

2

−κ)τ2(f) +βτ4(f).

and covariance function

c(f,g) ¬ E[_{G(f)₋EG(f)_}{G(g)₋EG(g)_}]

= 1

4π2

Z 2

−2

Z 2

−2

f′(t)g′(s)V(t,s)d t ds

where

V(t.s) =

σ2−κ+1

2βts

p

(4₋t2₎₍₄₋_s2₎

+κlog

 

4₋ts+p(4₋t2₎₍₄₋_s2₎

4−ts−p(4−t2)(4−s2)

 

and

τl(f) = 1 2π

Z π

−π

f(2cosθ)eilθdθ= 1

2π

Z π

−π

f(2cosθ)cos(lθ)dθ

= 1

π

Z 1

−1

f(2t)T_l(t)_p 1

1₋t2 d t.

Here_{T_l,l _≥0_}is the family of Chebyshev polynomials.

(4)

ideal assumptions. Therefore, the quantum physicists may want to test the validity of the ideal as-sumptions. Therefore, they may suitably select one or several f’s so that θ’s may characterize the semicircular law. Using the limiting distribution of G_n(f) =n( ˆθ −θ), one may perform statistical test of the ideal hypothesis. Obviously, one can not apply the limiting distribution ofn( ˆθ−E_θˆ)to the above test.

Remark 1.3. Checking the proof of Theorem 1.1, one finds that the proof still holds under the assumption of bounded fourth moments of the off-diagonal entries is finite, if the approximation

G_n(f_m) ofG_n(f) is of the desired order. Thus, the assumption of 6-th moment is only needed in deriving the convergence rate of_kF_n₋F_k. Furthermore, the assumption of the fourth derivative of f is also related to the convergence rate of kFn−Fk. If the convergence rate of kFn −Fk is improved and/or proved under weaker conditions, then the result of Theorem 1.1 would hold un-der the weakened conditions. We conjecture that the result would be true if we only assume the fourth moments of the off-diagonal elements of Wigner matrices are bounded and f have the first continuous derivative.

Pastur and Lytova in[17]studied asymptotic distributions ofnR f(x)d(F_n(x)₋EF_n(x))under the conditions that the fourth cumulant of off-diagonal elements of Wigner matrices being zero and the Fourier transform of f(x) having 5th moment, which implies that f has the fifth continuous derivative. Moreover, they assume f is defined on the whole real line. These are stronger than our conditions.

Remark 1.4. The strategy of the proof is to use Bernstein polynomials to approximate functions in

C4(_U). This will be done in Section 2. Then the problem is reduced to the analytic case, which has been intensively discussed in Bai and Yao[6]. But the functions in[6]are independent ofnand thus one can choose fixed contour and then prove that the Stieltjes transforms tend to a limiting process. In the present case, the Berstein polynomials depend onnthrough increasing degrees and thus they are not uniformly bounded on any fixed contour. Therefore, we cannot simply use the results of Bai and Yao[6]and thus we have to choose a sequence of contours approaching to the real axes so that the approximating polynomials are uniformly bounded on the corresponding contours. On this sequence of contours, the Stieltjes transforms don’t have a limiting process. Our Theorem 1.1 cannot follow directly from[6]. We have to find alternative ways to prove the CLT.

Remark 1.5. It has been also found in literature that the so-called secondary freeness is proposed and investigated in free probability. Readers of interest are referred to Mingo [14; 15; 16]. We shall not be much involved in this direction in the present paper. As a matter, we only comment here that both the freeness and the secondary freeness are defined on sequences of random matrices (or ensembles) for which the limits of expectations of the normalized traces of all powers of the random matrices and the limits of covariances of unnormalized traces of powers of the random matrices exist. Therefore, for a single sequence of random matrices, the existence of these limits has to be verified and the verification is basically equivalent to moment convergence method. Results obtained in[14]is in some sense equivalent to those of[17].

(5)

2 Bernstein polynomial approximation

It is well-known that if ˜f(y)is a continuous function on the interval[0, 1], the Bernstein polynomials

˜

f_m(y) =

m

X

k=0

_m

k

yk(1₋y)m−kf˜

_k

m

converge to ˜f(y)uniformly on[0, 1]asm_{→ ∞}. Suppose ˜f(y)_∈C4[0, 1]. Taylor expansion gives

˜

f

_k

m

= ˜f(y) +

_k

m− y

˜

f′(y) +1

2

_k

m−y

2

˜

f′′(y)

+1

3!

_k

m−y

3

˜

f(3)(y) + 1

4!

_k

m−y

4

˜

f(4)(ξy).

Hence

˜

fm(y)−f˜(y) =

y(1₋y)f˜′′(y)

2m +O

₁

m2

. (2.1)

For the functionf _∈C4(_U), there existsa>2 such that[₋a,a]_{⊂ U}. Make a linear transformation

y = K x+ 1

2, ε ∈(0, 1/2), whereK = (1−2ε)/(2a), then y ∈[ε, 1−ε]if x ∈[−a,a]. Define

˜

f(y)¬ f(K−1(y₋1/2)) = f(x), y _∈[ε, 1₋ε], and define

f_m(x)¬ ˜f_m(y) =

m

X

k=1

_m

k

yk(1−y)m−kf˜

_k

m

.

From (2.1), we have

f_m(x)₋f(x) = f˜_m(y)₋ ˜f(y) = y(1− y)

˜

f′′(y)

2m +O

₁

m2

.

Since ˜h(y)¬ y(1₋ y)f˜′′(y) has a second-order derivative, we can use Bernstein polynomial ap-proximations once again to get

˜

hm(y)−˜h(y) = m

X

k=1

_m

k

yk(1− y)m−k˜h(k

m)−˜h(y)

= O

₁

m

.

So, withh_m(x) =˜h_m(K x+1

2),

f(x) = fm(x)− 1

2mhm(x) +O

₁

m2

(6)

Therefore,G_n(f)can be split into three parts:

G_n(f) = n

Z ∞

−∞

f(x)[F_n₋F](d x)

= n

Z

f_m(x)[F_n−F](d x)₋ n

2m

Z

h_m(x)[F_n−F](d x)

+n

Z

f(x)₋fm(x) + 1 2mhm(x)

[Fn−F](d x)

= ∆₁+ ∆₂+ ∆₃. For∆₃, by Lemma 6.1 given in Appendix,

kFn−Fk=Op(n−2/5).

Here and in the sequel, the notationZn =Op(cn) means that for anyε > 0, there exists anM >0 such that sup_nP(_|Z_{n| ≤} M c_n) < ε. Similarly, Z_n = o_p(c_n) means that for any ε >0, lim_nP(_|Z_{n| ≤}

εc_n) =0.

Takingm2= [n3/5+ε0_]_{, for some}_ε

0>0 and using integration by parts, we have that

∆₃ = ₋n

Z

f(x)₋f_m(x) + 1

2mhm(x)

′

F_n(x)₋F(x)

d x

= Op(n−ε0)

since f(x)₋ fm(x) +₂1_mhm(x)

′

=O(m−2). From now on we choose ε0 = 1/20, and thenm= n13/40.

Note that f_m(x)andh_m(x)are both analytic. Following from the result proved in Section 5, replac-ing f_m byh_m, we obtain that

∆₂=O(∆1)

m =op(1).

It suffices to consider ∆₁ = Gn(fm). In the above, fm(x) and gm(y) are only defined on the real line. Clearly, the two polynomials can be considered as analytic functions on the complex regions

[₋a,a]_×[₋ξ,ξ]and[ε, 1₋ε]_×[₋Kξ,Kξ], respectively.

Since g_∈C4[0, 1], there is a constant M, such that_|g(y)_|< M, _∀y _∈[ε, 1₋ε]. Noting that for

(u,v)_∈[ε, 1₋ε]_×[₋Kξ,Kξ],

|u+i v_|+_|1₋u₋i v_| = pu2₊_v2₊p₍₁₋_u₎2₊_v2

≤ u

1+ v

2

2u2

+ (1₋u)

1+ v

2

2(1₋u)2

≤ 1+ v

2

ε

we have, for y=K x+1/2=u+i v,

|f_m(x)_| = _|f˜_m(y)_|=

n

X

k=1

_m

k

yk(1₋y)m−kf˜

_k

m

≤ M

1+ v

2

ε

m

(7)

If we take _|ξ| ≤ K/pm, then _|f˜m(y)| ≤ M

1+K2/(mε)m _→ M eK2/ε, as m → ∞. So ˜fm(y) is bounded when y ∈ [ε, 1−ε]_×[₋K/pm,K/pm]. In other words f_m(x) is bounded when x ∈

[₋a,a]_×[₋1/pm, 1/pm]. Let γm be the contour formed by the boundary of the rectangle with vertices(_±a±i/pm). Similarly, we can show thathm(x), fm′(x)andh′m(x)are bounded onγm.

3 Simplification by truncation and Preliminary formulae

3.1 Truncation

As proposed in [9], to control the fluctuations around the extreme eigenvalues under condi-tion (1.2), we will truncate the variables at a convenient level without changing their weak limit.

Condition (1.2) implies the existence of a sequenceηn↓0 such that

1

n2η4

n

X

i,j

E

h

|xi j|4I_|_x i j|≥ηnpn

i

=o(1).

We first truncate the variables as ˆxi j = xi jI|xi j|≤ηnpn and normalize them to ˜xi j = ( ˆxi j−Eˆxi j)/si j,

wheres_{i j} is the standard deviation of ˆx_{i j}.

LetFˆ_n and ˜F_ndenote the ESDs of the Wigner matricesn−1/2( ˆx_{i j})andn−1/2(x˜_{i j}), andGˆ_nand ˜G_nthe corresponding empirical process, respectively. First of all, by (1.2),

P(G_n₆= ˆG_n) ¶ P(F_n= ˆ₆ F_n)¶P( ˆx_{i j}₆= ˜x_{i j})

¶ X

i,j

P(_|x_{i j}_{| ≥}ηn p

n)

¶ (ηn p

n)−4X

i,j

E_|x_{i j|}4I_|_x

i j|≥ηnpn=o(1).

Second, we compare ˜G_n(f_m)andGˆ_n(f_m). Note that

max

i,j |1−si j|

¶ max i,j |1−s

2

i j|

= max i,j

h

E(_|x_{i j|}2I_|_x

i j|≥ηnpn) +|E(xi jI|xi j|≥ηnpn)| 2i

≤ (n−1η−_n2+Mη−_n6n−3)max i,j [

E(_|x_{i j}_|4I_|_x i j|≥ηn

p

n)]→0.

Therefore, there exist positive constantsM₁ andM₂so that

X

i,j

E(_|xi j|2|1−s−i j1|

2₎

≤M1

X

i,j

(1−s2_{i j})2_≤ M2

η4

nn2

X

i,j

E(x_{i j}4I_|_x

(8)

Since f(x)_∈C4(_U)implies that f_m(x)are uniformly bounded in x_∈[₋a,a]andm, we obtain

E_|G˜_n(f_m)₋Gˆ_n(f_m)_|2 _≤ ME

  

n

X

j=1

|λ˜n j−λˆn j|

  

2

≤M nE

n

X

j=1

|λ˜n j−λˆn j|2

= M nEX

i,j

|n−1/2(x˜_{i j}−ˆx_{i j})_|2

≤ M

  

X

i,j

(E_|x_{i j|}2)_|1₋s−_{i j}1_|2+X

i,j

|E( ˆx_{i j})_|2s−_{i j}2

  

= o(1),

where ˜λn j andλˆn j are the jth largest eigenvalues of the Wigner matricesn−1/2(˜xi j)andn−1/2( ˆxi j), respectively.

Therefore, the weak limit of the variables(Gn(fm))is not affected if we substitute the normalized truncated variables ˜x_{i j} for the original x_{i j}.

From the normalization, the variables ˜x_{i j} all have mean 0 and the same absolute second moments as the original variables. But for the CWE, the conditionEx_{i j}2 =0 does no longer remain true after these simplifications. Fortunately, we have the estimateE˜x_{i j}2 = on−1η2_n, which is good enough for our purposes.

For brevity, in the sequel we still use xi j to denote the truncated and normalized variables ˜xi j.

3.2 Preliminary formulae

Recall that for a distribution functionH, its Stieltjes transforms_H(z)is defined by

sH(z) =

Z

1

x₋zd H(x), z∈C.

The Stieltjes transforms(z)of the semicircle law F is given bys(z) =₋1

2(z−sgn(Im(z))

p

z2₋₄₎

forz withIm₍_z₎ ₆₌_{0 which satisfies the equation}_s₍_z₎2₊_zs₍_z_{) +}₁₌_{0. Here and throughout the}

paper,pzof a complex numberzdenotes the square root ofzwith positive imaginary part.

Define D= (n−1/2W_n₋z I_n)−1. Letαk be the kth column ofWn with xkk removed andWn(k)the submatrix extracted fromW_n by removing itskth row andkth column. DefineD_k= (n−1/2W_n(k)₋

(9)

following notations:

βk(z) = − 1 p

nxkk+z+n

−1_α∗

kDkαk,

δn(z) = − 1

n

X

j=1

ǫk(z)

βk(z) ,

ǫk(z) = 1 p

nxkk−

1

nα

∗

kDkαk+Esn(z),

q_k(z) = _p1

nxkk−

1

n

α∗_kD_k(z)αk−t r Dk(z)

,

ω_k(z) = ₋_p1

nxkk+n

−1_α∗

kDkαk−s(z),

b_n(z) = n[Es_n(z)₋s(z)].

The Stieltjes transforms_n(z)ofF_nhas the representation

sn(z) = 1

nt r D=

1

nt r

_W

n p

n−z In

−1

= ₋1

n

X

k=1

1

βk(z)

=₋ 1

z+Es_n(z)+

δn(z)

(z+Es_n(z)). (3.2)

Throughout this paper,M may denote different constants on different occassions andε_na sequence of numbers which converges to 0 asngoes to infinity.

4 The mean function

Letλe x t(n−1/2Wn)denote both the smallest and the largest eigenvalue of the matrixn−1/2Wn (de-fined by the truncated and normalized variables). Forη0<(a−2)/2, define

A_n=_{|λ_{e x t}(n−1/2W_n)_{| ≤}2+η0}.

Then, by Cauchy integral we have

∆₁=G_n(f_m) = 1

2πi

Z I

γm f_m(z)

z₋xn[Fn−F](d x)dz IAn

+

Z

f_m(x)n[F_n₋F](d x)I_Ac n.

SinceP(Ac_n) =o(n−t)for anyt>0 (see the proof of Theorem 2.12 in[4]),

Z

(10)

in probability. It suffices to consider

∆ = ₋ 1

2πi

I

γm

fm(z)n[sn(z)−s(z)]IAndz. (4.3)

In the remainder of this section, we will handle the asymptotic mean function of∆. The convergence of random part of∆will be given in Section 5.

To this end, writeb_n(z) =n[Es_n(z)₋s(z)]andb(z)¬[1+s′(z)]s3(z)[σ2₋₁₊₍_κ₋₁₎_s′₍_z₎₊_β_s2₍_z_)]_.

Then we prove

E∆_h = ₋ 1

2πi

I

γmh

fm(z)b(z)dz

− 1 2πi

I

γmh

fm(z)[n(Esn(z)IAn−s(z))−b(z)]dz+o(1)

¬ R₁+R₂+o(1) (4.4)

∆_v = ₋ 1

2πi

I

γmv

fm(z)n[sn(z)−s(z)]IAndz=op(1), (4.5)

where and in the sequelγmhdenotes the union of the two horizontal parts ofγmandγmvthe union of the two vertical parts. The limit ofR1 is given in the following proposition.

Proposition 4.1. R1 tends to

EG(f) = κ−1

4 [f(2) + f(−2)]−

κ−1

2 τ0(f) + (σ

2

−κ)τ2(f) +βτ4(f)

for both the CWE and RWE.

Proof. Since fm(z)are analytic functions, by[6], we have

R₁ = ₋ 1

2πi

I

γmh

f_m(z)b(z)dz

≃ κ−1

4 [fm(2) +fm(−2)]−

κ−1

2 τ0(fm) + (σ

2

−κ)τ2(fm) +βτ4(fm),

wherea≃ bstands for a/b→1 asn→ ∞,

τl(fm) = 1 2π

Z π

−π

f_m(2cosθ)eilθdθ.

As fm(t)→ f(t)uniformly ont∈[−a,a]as m→ ∞, it follows that

R1−→

κ−1

4 [f(2) + f(−2)]−

κ−1

2 τ0(f) + (σ

2

(11)

Since f_m(z)is bounded on and inγm, in order to prove R2→0 asm→ ∞, it is sufficient to show

where we have used Lemma 6.1. This implies

(12)

where we have used_|t r D(z)₋t r D_k(z)_|¶1/v,E_|t r D(z)₋E t r D(z)_|2l¶cn−2lv−4l(∆ +v)l (l _≥1)

which is derived from Lemma 6.2.

Fact 3.

From (3.2) we have

(13)

where theo(1)is uniform forz_∈γmh.

Finally we deal with

(14)

By the proof of (4.3) in[7], we have

Hence we neglect this item. In order to estimate the expectation in (4.11), we introduce the notationαk= (x1k,x2k, ...,xk−1k,xk+1k, ...,xnk)∗¬(xi)and Dk(z)¬(di j). Note thatαk andDk(z)

Here we need to consider the difference between the CWE and RWE.

For the RWE, all the original and truncated variables have the properties: Ex_{i j} = 0 andE_|x_{i j|}2 = which allow us to have the following unified expression,

(15)

Hence

Eǫ2_k(z) = σ

2 n +

1

n2



κE[t r D2_k(z)] +βE  

X

i

d_ii2

 +o

1

nv2η4

n

 +

M

n2v3,

which combined with

E

1

nt r D 2

k(z)−s′(z)

2

¶ 1

v4(kF

k

n−1−Fnk+kFn−Fk)2

¶ M

v4

1

n+

1

n25−η

2

¶M n−203+2η

and

E_|d_ii2₋s2(z)_| ¶ E_|[D_k(z)]_ii+s(z)_||[D_k(z)]_ii₋s(z)_|

¶ M

₁

v +1

E_|[D_k(z)]_ii₋s(z)_|¶ M v

E_|[D_k(z)]_ii₋s(z)_|21/2

¶ M

v

₁

nv4

1

2

=_pM

nv3,

gives

|S₂+s2(z)[σ2+κs′(z) +βs2(z)]_|¶ _pM nv3.

Summarizing the estimates ofS_i,i=1, ..., 4, we have obtained that

|nEδn(z) +s2(z)[σ2−1+ (κ−1)s′(z) +βs2(z)]|¶

M

p

nv3

Note that

b_n(z) = n[Es_n(z)₋s(z)] = 2sgn(

Im₍_z₎₎_nEδn(z)

p

z2₋₄₍₁₋_E_δ

n(z)) +

p

z2₋₄

= sgn(

Im₍_z₎₎_nEδn(z)

p

z2₋₄

(1+o(1)).

and that

s′(z) =₋ s(z)

z+2s(z) =−

s(z)

sgn(Im₍_z₎₎p_z2₋₄

.

The second equation is equivalent to

sgn(_ℑ(z))

p

z2₋₄

=s(z)(1+s′(z)).

From the above two equations, we conclude

(16)

Then, (4.6) is proved by noticing that_|b_n(z)₋b_nA(z)_{| ≤}nv−1P(Ac_n)_≤o(n−t)for any fixed t. Now, we proceed the proof of (4.5). We shall find a setQn such thatP(Qnc) =o(n−1)and

I

γmv

f_m(z)n(s_n(z)I_Q

n−s(z))dz=op(1). (4.12)

By continuity of s(z), there are positive constants M_l and M_u such that for all z _∈ γmv, Ml ≤ |z+s(z)_{| ≤}M_u. DefineB_n=_{min

k |βk(z)|

I_A

n> ε}, whereε=Ml/4. So

P(B_nc) = P(min k |βk(z)|

I_A

n¶ε)¶

n

X

k=1

P(_|βk(z)|IAn¶ε)

¶

n

X

k=1

P(_|βk(z)−z−s(z)|IAn¾ε) +nP(A

c n)

¶ 1

ε4

n

X

k=1 E

|βk(z)−z−s(z)|4IAn

+nP(Ac_n)

¶ Mh 1 n2E|xkk|

4₊ 1 n4E

|α∗_kDk(z)αk−t r Dk(z)|4IAn

+E

|1

nt r Dk(z)−s(z)| 4_I

An

i

+nP(Ac_n).

LetA_nk=_{|λ_{e x t}(n−1/2W_nk)_{| ≤}2+η₀}. It is trivially obtained thatA_n⊆A_nk(see[11]). Noting the independence ofI_A

nk andαk, we have

n−4E

|α∗_kDk(z)αk−t r Dk(z)|4IAn

≤ n−4E

[Ek_|_α∗

kDk(z)αk−t r Dk(z)|4]IAnk

≤ M n−4E

[ E_|x₁₂_|4t r D_k(z)D∗_k(z)2

+E_|x₁₂_|8t r D_k(z)D∗_k(z)2

]I_A nk

≤ M n−2,

whereEkdenotes the expectation taken only aboutαk. Therefore

P(Bc_n)_≤M n−2+n−2+ (n−2/5+η)4

+nP(Ac_n)_≤M n−8/5+4η.

This givesP(Bc_n) =o(n−1)as n→ ∞. DefineQn¬An∩Bn, we haveP(Qn)→1, asn→ ∞. Similar to (4.7), we have forz_∈γmv,

Es_n(z)I_Q n=−

P(Q_n)

z+Es_n(z)I_Q n

+ δnQ(z)

z+Es_n(z)I_Q n

, (4.13)

where

δnQ(z) = 1

n

X

k=1 EǫkQ

(z)I_Q n

βk(z)

ǫkQ(z) = 1

p_nx_kk₋ 1 nα

∗

(17)

Similar to (4.7), we have

Es_n(z)I_Q n=−

1 2

z₋sgn(Im₍_z₎₎

Æ

z2₋₄₍_P₍_Q

n)−δnQ(z))

.

Therefore,

bnQ(z) = n(Esn(z)IQn−s(z))

= 2n[P(Q

c

n) +δnQ(z)]

sgn(Im₍_z_))[p_z2₋₄₍_P₍_Q_n₎₋_δ_nQ₍_z_{)) +}p_z2₋₄_]

.

We shall prove that bnQ(z)is uniformly bounded for all z∈γmv. Noticing that|z2−4|>(a−2)2 and the factP(Qc_n) =o(1), we only need to show that

nδnQ(z) is uniformly bounded forz∈γmv. (4.14)

Rewrite

nδnQ(z) = n

X

k=1

EǫkQIQn

z+Es_n(z)I_Q n

+

n

X

k=1

E_ǫ2

kQIQn

βk(z)[z+Esn(z)IQn]

.

At first we have

E_ǫ_kQI_A nk =

1

nE

h

t r D(z)I_Q

nP(Ank)−t r Dk(z)IAnk

i

= 1

nE

_I

Qn

βk

P(A_nk)₋t r D_k(z)(I_Qc

nAnk+IQnP(A

c nk))

=O(1/n).

Here, the result follows from facts that 1/βk is uniformly bounded when Qn happens, 1_nDk(z) is bounded whenAnk happens and P(Qcn) = o(1/n). From this and the facts that Esn(z)IQn → s(z)

uniformly onγmv and

E_|ǫkQ|I_Qc nAnk ≤

1

nE

h

t r D(z)I_Q nP(Q

c

nAnk)−t r Dk(z)IQc nAnk

i

= O(1/n),

we conclude that the first term in the expansion ofnδnQ(z)is bounded.

By similar argument, one can prove that the second term of the expansion ofnδnQ(z)is uniformly bounded onγmv. Therefore, we have

Proposition 4.2. H_γ

mv fm

(z)n(Esn(z)IQn−s(z))dz→0in probability as n→ ∞.

Therefore, to complete the proof of (4.5), we only need to show that

Proposition 4.3. H γmv fm

(z)n(sn(z)IQn−Esn(z)IQn)dz→0in probability as n→ ∞.

(18)

5 Convergence of

∆

₋

E

∆

Let_Fk= (x_{i j},k+1_≤i,j_≤n)for 0_≤k_≤nandE_k(_·) =E(_·|Fk). Based on the decreasing filtration

(_Fk), we have the following well known martingale decomposition

n[sn(z)−Esn(z)] =t r D(z)−Et r D(z)

At first, we note that

P(R336=0)≤P(

[

Ac_nk)_≤P(Ac_n)_→0.

Next, we note thatR₃₂is a sum of martingale differences. Thus, we have

(19)

Whenz_∈γmvandAnkhappens, we have|n−1t r Dk(z)Dk(¯z)| ≤η−02. Also,z+n− 1_{t r D}

k(z)→z+s(z) uniformly. Further,_|z+s(z)_|has a positive lower bound onγmv. These facts, together with (5.15), imply that

R₃₂_→0, in probability.

The proof of Proposition 4.3 is the same as those forR₃₂_→0 andR₃₃_→0. We omit the details. Note that whenz_∈γmh

Eqk(z) =0, E

q_k(z)

z+n−1t r Dk(z)

=0.

Taylor expansion of the log function implies

R31 =

1 2πi

n

X

k=1

I

γmh

(E_k₋₁₋E_k)h qk(z)

z+n−1t r Dk(z)

+O

_q

k(z) −z−n−1t r D_k(z)

2 i

f_m′(z)dz

= 1

2πi

n

X

k=1

I

γmh

E_k₋₁ qk(z) z+n−1t r Dk(z)

f_m′(z)dz+op(1)

¬ 1

2πi

n

X

k=1

Ynk+op(1),

where o_p(1) follows from the following Condition 5.1. Clearly Y_nk_{∈ Fk}₋₁ andE_kY_nk=0. Hence {Y_nk,k = 1, 2, ...,n_} is a martingale difference sequence and Pn_k₌₁Y_nk is a sum of a martingale difference sequence. To save notation, we still useG_n(f_m)to denote 1

2πi

Pn

k=1Ynkfrom now on. In order to apply CLT toGn(fm), we need to check the following two conditions:

Condition 5.1Conditional Lyapunov condition.For somep>2,

n

X

k=1

E_k|Y_nk|p_−→0 in Probability.

Condition 5.2. Conditional covariance. Note that Ynk are complex, we need to show that

U2 = Pn_k₌₁E_kY_nk2 converges to a constant limit to guarantee the convergence to complex normal. For simplicity, we may consider two functions f,g_∈C4(_U) and show that C ov_k[G_n(f_m),G_n(g_m)]

converges in probability, where f_m andg_m denote their Bernstein polynomial approximations. That is,

C ovk[Gn(fm),Gn(gm)]→C(f,g) in probability.

The proof of condition 5.1 withp=3.

n

X

k=1

E_k|Y_nk|3=

n

X

k=1 E_k

I

γmh

E_k₋₁ qk(z) z+n−1t r D_k(z)f

′

m(z)dz

3

(20)

Since f_m′(z)is bounded onγmh, it suffices to prove

where we have used the facts that

Then the above gives

n

The proof of Condition 5.2. The conditional covariance is

(21)

(22)

since f_m′(t)_→ f′(t), g_m′ (t)_→g′(t)uniformly on[₋2, 2].

ForQ₂, sinces(z₁)s(z₂)f_m′(z₁)g_m′(z₂)is bounded onγmh×γmh, in order to proveQ2→0 in

probabil-ity, it is sufficient to prove thatΓ_n(z1,z2)converges in probability toΓ(z1,z2)uniformly onγmh×γmh. DecomposeΓ_n(z₁,z₂)as

Γ_n(z1,z2) =

n

X

k=1

E_k[E_k₋₁qk(z1)Ek−1qk(z2)]

=

n

X

k=1 E_k

hx2

kk

n +

1

n2Ek−1[α

∗

kDk(z1)αk−t r Dk(z1)]

×E_k₋₁[α∗_kDk(z2)αk−t r Dk(z2)]

i

= σ2+ κ

n2

n

X

k=1

X

i,j>k

[E_k₋₁Dk(z1)]i j[Ek−1Dk(z2)]ji

+β

n2

n

X

k=1 E_kX

i>k

[E_k₋₁D_k(z₁)]_ii[E_k₋₁D_k(z₂)]_ii

¬ _σ2+_S₁+_S₂.

SinceE_|[D_k(z)]_ii₋s(z)_|2_≤M/(nv4)uniformly onγmh,

E_|E_k[E_k₋₁Dk(z1)]ii[Ek−1Dk(z2)]ii−s(z1)s(z2)| ≤ M nv4

uniformly onγmh×γmh. Hence

E_|S₂₋β

2s(z1)s(z2)| ≤

M nv4

In the following, we consider the limit of_S₁. As proposed in[6], we use the following decomposi-tion. Let

e_j= (0, . . . , 1, . . . , 0)′_n₋₁, j=1, 2, . . . ,k₋1,k+1, . . . ,n

whose jth (or(j−1)th) element is 1, the rest being 0, if j<k(or j>k). Then

D_k−1(z) =_p1

nWn(k)−z In−1=

X

i,j6=k 1 p

nxi jeie

′

j−zin−1

z Dk(z) +I=

X

i,j6=k 1 p

nxi jeie

′

jDk(z)

We introduce the matrix

D_{ki j}=

₁

p

n[Wn(k)−δi j(xi jeie

′

j+xjieje′i)]−z In−1

−1

(23)

From the formulaA−1₋B−1=₋B−1(A₋B)A−1, we have

D_k−D_{ki j}=₋D_{ki j}_p1

nδi j(xi jeie

′

j+xjieje′i)Dk.

From the above, we get

z D_k(z) = ₋I+ X

i,j6=k

n−1/2x_{i j}e_ie′_jD_{ki j}(z)

−X i,j6=k

n−1/2x_{i j}e_ie′_jD_{ki j}(z)n−1/2δi j(xi jeie′j+xjieje′i)Dk(z)

= ₋I+ X

i,j6=k

n−1/2x_{i j}e_ie′_jD_{ki j}(z)₋s(z)n−3/2

n

X

i6=k

e_ie_i′D_k(z)

−X i,j6=k

δi j

n−1_|x_{i j}_|2([D_{ki j}(z)]_{j j}₋s(z)) +n−1(_|x_{i j}_|2₋1)s(z)

×eie′jDk(z)

−X i,j6=k

n−1δi jx2i j[Dki j]jieie′jDk(z).

Therefore

z₁X

i,j>k

[E_k₋₁D_k(z₁)]_{i j}[E_k₋₁D_k(z₂)]_ji

= ₋X

i>k

[E_k₋₁D_k(z₂)]_ii

+n−1/2 X

i,j,l>k

xil[Ek−1Dkil(z1)]l j[Ek−1Dk(z2)]ji

−s(z₁)n−3/2

n

X

i,j>k

[E_k₋₁D_k(z₁)]_{i j}[E_k₋₁D_k(z₂)]_ji

−X i,j>k

l6=k

δilEk−1

|x_il_|2₋1

n s(z1) +

|x_il_|2

n [Dkil(z1)l l−s(z1)]

[D_k(z₁)]_{i j}

×[E_k₋₁D_k(z₂)]_ji

−1

n

X

i,j>k l6=k

δilEk−1x2il[Dkil(z1)]l i[Dk(z1)]l j[Ek−1Dk(z2)]ji

= T1+T2+T∗+T3+T4.

First note that the term T_∗ is proportional to the term of the left hand side. We now evaluate the contributions of the remaining four terms to the sum_S₁.

The termT1→s(z2)inL2since

E

n−2X

k

X

i>k

([E_k₋₁D_k(z₂)]_ii₋s(z₂))

2

≤ M

(24)

by

E_|[D_k(z)]_ii₋s(z)_|2_≤ M

nv4.

The termsT3andT4are negligible. The calculations are lengthy but simple. We omit all the details. T2 can not be ignored. We shall simplify it step by step. SplitT2 as

T2 =

X

i,j,l>k

E_k₋₁_pxl j

n[Dkil(z1)]l j[Ek−1Dk(z2)]ji

= X

i,j,l>k

E_k₋₁_pxil

n[Dkil(z1)]l j[Ek−1(Dk−Dkil)(z2)]ji

+ X

i,j,l>k

E_k₋₁_pxil

n[Dkil(z1)]l j[Ek−1Dkil(z2)]ji

= T2a+T2b.

Again,T2bcan be shown to be ignored. As for the remaining termT2a, we have

ET₂_a

= n−2 X

i,j,l>k

x_il[E_k₋₁D_kil(z₁)]_{l j}[E_k₋₁(D_k₋D_kil)(z₂)]_ji

= ₋n−1 X

i,j,l>k

[E_k₋₁D_kil(z₁)]_{l j}[E_k₋₁D_kil(z₂)δil(x2ileie′l+|xil|2ele′i)Dk(z2)]ji

= J1+J2,

where

E_|J₁_| = n−1 X

i,j,l>k

E_|x2_il_|[E_k₋₁D_kil(z₁)]_{l j}[D_kil(z₂)]_ji[D_k(z₂)]_{l i|}

≤ n−1 X

i,j1,,j2,l>k

E_|x_il_|4_|[E_k₋₁D_kil(z₁)]_{l j}₁_|2_|[D_kil(z₂)]_j₂_i_|2

×X i,l>k

E_|(D_k(z₂))_{l i|}21/2

≤ M n1/2v−3.

Hence, the contribution of this term is negligible. Finally, we have

J2 = −

X

i,j,l>k

E_k₋₁|xil| 2

n [Dki j(z1)]l j[Ek−1Dki j(z2)]jl[Dk(z2)]ii

≃ −s(z2)

X

i,j,l>k

E_k₋₁1

n[Dkil(z1)]l j[Ek−1DkilDkil(z2)]jl

= ₋n−k

n s(z2)

X

j,l>k

(25)

where the last approximation follows from

X

i,j,l>k

[D_kil(z₁)]_{l j}[E_k₋₁D_kil(z₂)]_jl₋(n₋k) X

j,l>k

[E_k₋₁D_k(z₁)]_{l j}[E_k₋₁D_k(z₂)]_jl

≤M n3/2v−3.

Let us define

X_k= X

i,j>k

[E_k₋₁D_k(z₁)]_{i j}[E_k₋₁D_k(z₂)]_ji

Summarizing the estimations of all the termsTi, we have proved that

z1Xk=−s(z1)Xk−(n−k)s(z2)Xk

n₋k n +rk,

where the residual term rk is of order O(pnv−3) uniformly in k = 1, ...,n and z1,z2 ∈ γm. By

z₁+s(z₁) =₋1/s(z₁), the above identity is equivalent to

X_k= (n₋k)s(z₁)s(z₂) + n−k

n s(z1)s(z2)Xk−s(z1)rk.

Consequently,

1

n2

n

X

k=1 Xk =

1

n

X

k=1

n−k

n s(z1)s(z2) 1₋n−_nks(z₁)s(z₂)+

1

n2

n

X

k=1

s(z₁)r_k

1₋ n−_nks(z₁)s(z₂),

which converges in probability to

s(z1)s(z2)

Z 1

0

t

1₋ts(z₁)s(z₂)d t=−1− s(z1)s(z2)

−1

log 1−s(z1)s(z2)

.

As a conclusion,Γ_n(z1,z2)converges in probability to

Γ(z1,z2) =σ2−κ−

κ

s(z₁)s(z₂)log 1−s(z1)s(z2)

₊1

2βs(z1)s(z2).

The proof of Condition 5.2 is then complete.

Although we have completed the proof of Theorem 1.1, we summarize the main steps of the proof for reader’s convenience.

Proof of Theorem 1.1. In Section 2, the weak convergence ofG_n(f)is reduced to that ofG_n(f_m). Then in Section 4, we show that this is equivalent to the weak convergence of

∆ =₋ 1

2πi

I

γm

f_m(z)n[s_n(z)₋s(z)]dz.

(26)

6 Appendix

Lemma 6.1. Under condition(1.2), we have

kEF_n₋F_k = O(n−1/2), kF_n₋F_k = O_p(n−2/5),

kFn−Fk = O(n−2/5+η) a.s., for any η >0.

This follows from Theorems 1.1, 1.2 and 1.3 in[5].

Lemma 6.2. For X = (x₁, ...,x_n)T i.i.d. standardized (complex) entries withEx_i=0andE_|x_i|2=1, and C is an n×n matrix (complex), we have, for any p¾2

E_|X∗C X ₋t r C_|p¶K_p(E_|x₁_|4t r C C∗)p/2+E_|x₁_|2pt r(C C∗)p/2.

This is Lemma 2.7 in[7].

References

[1] Arnold, L. (1967). On the asymptotic distribution of the eigenvalues of random matrices.J. Math. Anal. Appl.20

262–268. MR0217833

[2] Arnold, L. (1971). On Wigner’s semicircle law for the eigenvalues of random matrices.Z. Wahrsch. Verw. Gebiete. 19191–198. MR0348820

[3] Bai, Z. D. (1993). Convergence rate of expected spectral distributions of large random matrices. Part I. Wigner matrices.Ann. Probab.21625–648. MR1217559

[4] Bai, Z. D. (1999). Methodologies in spectral analysis of large dimensional random matrices, a review.Stat. Sin.9

611–677. MR1711663

[5] Bai, Z. D., Miao, B. and Tsay, J. (2002). Convergence rates of the spectral distributions of large Wigner matrices.

Int. Math. J.165–90. MR1825933

[6] Bai, Z. D. and Yao, J. (2005). On the convergence of the spectral empirical process of wigner matrices.Bernoulli 111059-1092. MR2189081

[7] Bai, Z. D. and Silverstein, J. W. (1998). No eigenvalues outside the support of the limiting spectral distribution of large dimensional random matrices.Ann. Probab.26316-345.6 MR1617051

[8] Bai, Z. D. and Silverstein, J. W. (2006).Spectral analysis of large dimensional random matrices. Mathematics Mono-graph Series 2, Science Press, Beijing.

[9] Bai, Z. D. and Yin, Y. Q. (1988). Necessary and sufficient conditions for the almost sure convergence of the largest eigenvalue of Wigner matrices.Ann. Probab.161729–1741. MR0958213

[10] Costin, O. and Lebowitz, J. (1995). Gaussian fluctuations in random matrices.Physical Review Letters7569–72.

[11] Horn, R. A. and Johnson, C. R. (1990).Matrix Analysis. Cambridge University Press. MR1084815

(27)

[13] Khorunzhy, A. M., Khoruzhenko, B. A. and Pastur, L. A. (1996) Asymptotic properties of large random matrices with independent entries.J. Math. Phys.375033¨C-5060.

[14] Collins, B., Mingo, J. A., Sniady, P., and Speicher, R. (2007) Second order freeness and fluctuations of random matrices. III. Higher order freeness and free cumulants.Doc. Math.12, 1–70 (electronic). MR2302524

[15] Mingo, J. A., Sniady, P., and Speicher, R. (2007) Second order freeness and fluctuations of random matrices. II. Unitary random matrices.Adv. Math.209, no. 1, 212–240. MR2294222

[16] Mingo, J. A. and Speicher, R. (2006) Second order freeness and fluctuations of random matrices. I. Gaussian and Wishart matrices and cyclic Fock spaces.J. Funct. Anal.235, no. 1, 226–270. MR2216446

[17] Pastur, L. A. and Lytova A. (2009) Central Limit Theorem for linear eigenvalue statistics of random matrices with independent entries.Ann. Prob.37, 1778-1840.

[18] Sinai, Y. and Soshnikov, A. (1998) Central limit theorem for traces of large random symmetric matrices with independent elements.Bol. Soc. Brasil. Mat. (N.S.)291–24. MR1620151