• Tidak ada hasil yang ditemukan

getdoc47c5. 210KB Jun 04 2011 12:04:20 AM

N/A
N/A
Protected

Academic year: 2017

Membagikan "getdoc47c5. 210KB Jun 04 2011 12:04:20 AM"

Copied!
29
0
0

Teks penuh

(1)

E l e c t ro n ic

J o u r n

a l o

f P

r o b

a b i l i t y

Vol. 8 (2003) Paper no. 7, pages 1–29. Journal URL

http://www.math.washington.edu/˜ejpecp/ Paper URL

http://www.math.washington.edu/˜ejpecp/EjpVol8/paper7.abs.html

THE NORM OF THE PRODUCT OF A LARGE MATRIX AND A RANDOM VECTOR

Albrecht B¨ottcher

Fakult¨at f¨ur Mathematik, TU Chemnitz, 09107 Chemnitz, Germany

[email protected]

Sergei Grudsky 1

Departamento de Matem´aticas, CINVESTAV del I.P.N., Apartado Postal 14-740, 07000 M´exico, D.F., M´exico

[email protected], [email protected]

Abstract Given a real or complexn×nmatrixAn, we compute the expected value and the variance of the random variable kAnxk2/kAnk2, where x is uniformly distributed on the unit sphere ofRnorCn. The result is applied to several classes of structured matrices. It is in particular shown that if An is a Toeplitz matrixTn(b), then for largen the values of kAnxk/kAnk cluster fairly sharply aroundkbk2/kbk∞ ifb is bounded and around zero

in case bis unbounded.

Keywords condition number, matrix norm, random vector, Toeplitz matrix

AMS subject classification (2000) Primary 60H30, Secondary 15A60, 47B35, 65F35 Submitted to EJP on September 9, 2002. Final version accepted on May 14, 2003.

1

(2)

1

Introduction

Let k · k be the Euclidean norm in Rn. For a real n×n matrix A

n, the spectral norm kAnk is defined by

kAnk= max

kxk=1kAnxk= max0<kxk≤1

kAnxk kxk .

Let s1 ≤s2 ≤. . .≤sn be the singular values ofAn, that is, the eigenvalues of (A⊤A)1/2. The set{kAnxk/kAnk:kxk= 1} coincides with the segment [s1/sn,1]. We show that for a randomly chosen unit vector x the value ofkAnxk2/kAnk2 typically lies near

1

s2 n

s21+. . .+s2n

n . (1)

Notice that sn = kAnk and that s21+. . .+s2n = kAnk2F, where kAnkF is the Frobenius (or Hilbert-Schmidt) norm. Thus, if kAnk= 1, then for a typical unit vector x the value of kAnxk2 is close to kAnk2F/n. The purpose of this paper is to use this observation in order to examine the most probable values ofkAnxk/(kAnk kxk) for several classes of large structured matrices An.

Our interest in the problem considered here arose from a talk by Siegfried Rump at a conference in Marrakesh in 2001. Let Mn(R) denote the real n×n matrices and let Circn(R) stand for the circulant matrices in Mn(R). For an invertible matrix An ∈ Circn(R), define the unstructured condition number κ(An, x) of An at a vector x ∈ Rn as limε→0supkδxk/(εkxk), the supremum over all δx such that (An+δAn)(x+δx) =

Anx for some δAn ∈ Mn(R) with kδAnk ≤ εkAnk, and define the structured condition number κcirc(An, x) as limε→0supkδxk/(εkxk), this time the supremum over all δx such that (An+δAn)(x+δx) =Anx for some δAn ∈Circn(R) with kδAnk ≤ εkAnk. A well known result by Skeel says that κ(An, x) =kAnk kAn−1k (for every An∈Mn(R)), and in his talk Rump proved that

κcirc(An, x) = k

Ank kA−1 n xk kxk

(see also [9], [14]). Thus,

κcirc(A n, x)

κ(An, x)

= kA− 1 n xk kA−n1k kxk

, (2)

which naturally leads to the question on the value taken by (2) at a typicalx.

2

General Matrices

Let Bn = {x ∈ Rn : kxk ≤ 1} and Sn−1 = {x ∈ Rn : kxk = 1}. For a given matrix

An∈Mn(R), we consider the random variable

Xn(x) = k

(3)

where xis uniformly distributed on Sn−1.

As the following lemma shows, there is no difference between taking x uniformly on a sphere or in a ball.

Lemma 2.1 For every natural number k,

1

The following result is undoubtedly known. As we have not found an explicit reference, we cite it with a full proof.

(4)

notice that in (5) we first made the substitution Vnx =y and then changed the notation

y back tox. By symmetry, the integrals 1

are independent ofj and hence they are all equal to 1/n. This proves (3). In analogy to (6),

(5)

Since σ2X2

n can be written in the symmetric forms

σ2Xn2 = 2

Obvious modifications of the proof of Theorem 2.2 show that Theorem 2.2 remains true for complex matrices on Cn with the2 norm.

Example 2.3 Let

For EXn2 = 1/nε/2 we therefore obtain from Chebyshev’s inequality that

(6)

with probability at least 90% for n 18 and with probability at least 99% for n 57, and the inequality

(x1+. . .+xn)2 ≤

n

100(x 2

1+. . .+x2n),

is true with probability at least 90% whenevern895 and with probability at least 99% provided n2829. We will return to the present example in Example 7.5.

The following lemma will prove useful when studying concrete classes of matrices. We denote by k · ktr the trace norm, that is, the sum of the singular values.

Lemma 2.4 Let {An}

n=1 be a sequence of matrices An ∈ Mn(K), where K = R or K=C. If

kAnktr

n =O(1) and kAnk → ∞,

then EX2

n→0 as n→ ∞.

Proof. Let s1(An) ≤ s2(An) ≤. . . ≤ sn(An) be the singular values of An and note that

sn(An) =kAnk. By assumption, there is a a finite constantM such that 1

n

n

X

j=1

sj(An)≤M

for alln. Fixε(0,1), for instance,ε= 1/2. LetNndenote the number of alljfor which

sj(An)≥MkAnk1−ε. Then

M 1 n

n

X

j=1

sj(An)≥ 1

nNnMkAnk

1−ε,

whenceNn≤n/kAnk1−ε and thus, by Theorem 2.2,

EXn2 = 1

ns2 n(An)

n

X

j=1

s2j(An)≤

(nNn)M2kAnk2−2ε

nkAnk2 +

NnkAnk2 nkAnk2

≤ M

2 kAnk2ε +

1

kAnk1−ε =o(1) because kAnk → ∞.

We remark that if EXn2 0, then P(Xn ≥ ε) = O(1/n) for each ε > 0: we have

EXn2 ε2/2 for allnn0 and hence

P(Xn≥ε) =P(Xn2 ≥ε2)≤P

µ

Xn2EXn2 ε

2 2

=P

µ

|Xn2EXn| ≥2 ε

2 2

≤ 4σ 2X2

n

(7)

3

Toeplitz Matrices with Bounded Symbols

We need one more simple auxiliary result.

Lemma 3.1 Let EXn2 =µ2n and suppose µn →µ as n→ ∞. If ε >0 and |µn−µ|< ε,

then

P(|Xn−µ| ≥ε)≤

σ2Xn2 µ2

n(ε− |µn−µ|)2

.

Proof. We have

P(|Xn−µ| ≥ε)≤P

³

|Xn−µn| ≥ε− |µn−µ|

´

≤P³|Xn−µn|(Xn+µn)≥µn(ε− |µn−µ|)

´

=P³|Xn2µ2n| ≥µn(ε− |µn−µ|)

´

,

and the assertion is now immediate from Chebyshev’s inequality.

Now let An be a Toeplitz matrix, that is,An=Tn(b) := (bj−k)nj,k=1, where

bk=

Z 2π

0

b(eiθ)e−ijθ dθ

2π (j∈Z). (12)

Clearly, (12) makes sense for everybL1 on the complex unit circle T. Throughout this section we assume that b is a function in L∞. The Avram-Parter theorem says that in this case

lim n→∞

sk

1 +. . .+skn

n =kbk

k k:=

Z 2π

0 |

b(eiθ)|kdθ

2π (13)

for every natural numberk(see [1], [2], [4], [11]). It is also well known thatsn=kTn(b)k → kbkasn→ ∞(see [2] or [4], for example). In what follows we always assume thatbdoes not vanish identically. In Theorems 3.2 to 3.5, the constants hidden in theO’s depend of course on εand δ, respectively.

Theorem 3.2 Let b L∞ and suppose |b| is not constant almost everywhere. Then for each ε >0, there is ann0 =n0(ε) such that

Pµ ¯¯¯¯ kTn(b)xk

kTn(b)k kxk− kbk2

kbk

¯ ¯ ¯ ¯≥ε

n+ 23 ε12

kbk44− kbk22

kbk22kbk2 (14)

for all nn0. If, in addition, b is a rational function, then for each δ >0,

Pµ ¯¯¯¯ kTn(b)xk

kTn(b)k kxk

kbk2

kbk

¯ ¯ ¯ ¯≥

1

n1/2−δ

=O

µ

1

n2δ

(8)

Proof. Since|b|is not constant, it follows thatkbk4 >kbk2. Put

Thus, Lemma 3.1 shows that

P(|Xn−µ| ≥ε)≤

for all sufficiently large n, which is (14). Ifb is a rational function, we even have

sk1+. . .+skn for every natural numberk and

sn=kbk∞+O

Theorem 3.3 Let bL∞ and suppose |b| is constant almost everywhere. Then

P Lemma 3.1 therefore gives

(9)

If bis rational, we have (16) and (17). Thus,

Consequently, by Lemma 3.1,

P

We now consider the case whereAn is the inverse of a Toeplitz matrix. Supposeb is a continuous function onTandbhas no zeros onT. Let windbdenote the winding number of babout the origin.

If windb= 0, then Tn(b) is invertible for all sufficiently largenand

Since bhas no zeros onT, the Avram-Parter formula (13) also holds for negative integers

(10)

In the case where bis rational, one can sharpen (18) and (13) to therefore consider the Moore-Penrose inverse T+

n(b), which coincides with Tn−1(b) in the

Proof. The so-called splitting phenomenon, discovered by Roch and Silbermann [13] (see also [2]), tells us that if |windb|=k1, then ksingular values ofTn(b) converge to zero with exponential speed,

sℓ≤Ce−γn (γ >0) for ℓ≤k, while the remaining singular values stay away from zero,

(11)

for all sufficiently large n. Also by Theorem 2.2,

for all sufficiently large n, and thus,

P(Xn≥ε)≤P

Similarly, for large n,

P

Now supposebis a trigonometric polynomial. ThenTn(b) is a banded matrix for all suffi-ciently large n. For these n, we changeTn(b) to a circulant matrix by adding appropriate entries in the upper-right and lower-left corner blocks. For example, if

(12)

We have

Cn(b) =Un∗diag (b(1), b(ωn), . . . , b(ωnn−1))Un (19) where Un is a unitary matrix and ωn = e2πi/n. Thus, the singular values of Cn(b) are |b(ωn)j |(j = 0, . . . , n1). The only trigonometric polynomialsb of constant modulus are

b(t) =αtk (tT) withαC, and in this casekCn(b)xk=|α| kxkfor all x.

Theorem 4.1 Assume that |b| is not constant. Then for each ε > 0 there exists an

n0 =n0(ε) such that

Pµ ¯¯¯¯ kCn(b)xk

kCn(b)k kxk− kbk2 kbk

¯ ¯ ¯ ¯≥ε

n+ 23 ε12

kbk44− kbk22

kbk22kbk2

for all nn0.

Proof. The proof is analogous to the proof of (14). Note that now (13) amounts to the fact that the integral sum

sk

1+. . .+skn

n =

n−1

X

j=0

|b(e2πij/n)|k1

n

converges to the Riemann integral

Z 1

0 |

b(e2πiθ)|kdθ=

Z 2π

0 |

b(eiθ)|kdθ

2π =:kbk

k k.

Furthermore, it is obvious that sn= max|b(ωn)j | → kbk∞.

Ifbhas no zeros on T, then (19) shows that

Cn−1(b) =Un∗diag (b−1(1), b−1(ωn), . . . , b−1(ωnn−1))Un and hence the argument of the proof of Theorem 4.1 delivers

Pµ ¯¯¯¯ kC−

1 n (b)xk kCn−1(b)k kxk −

kb−1k2

kb−1k

¯ ¯ ¯ ¯≥ε

n+ 23 ε12 kb

−1

k44− kb−1k22

kb−1k22kb−1k2 (20)

for all sufficiently large n.

Example 4.2 Putb(t) = 2 +α+t+t−1, whereα >0 is small. Thus,

C5(b) =

     

2 +α 1 0 0 1

1 2 +α 1 0 0

0 1 2 +α 1 0

0 0 1 2 +α 1

1 0 0 1 2 +α

     

.

If n is large, then kCn(b)k ≈ 4 and kCn−1(b)k ≈ 1/α. Consequently, for the condition numbers defined in the introduction we have

κ(Cn(b), x) =kCn(b)k kCn−1(b)k ≈ 4

(13)

and

κcirc(Cn(b), x) = k

Cn−1(b)xk

kCn−1(b)k kxk

κ(Cn(b), x)≈ 4

α

kCn−1(b)xk

kCn−1(b)k kxk

.

From (20) we therefore obtain that if nis sufficiently large, then with probability near 1,

κcirc(Cn(b), x)≈ 4

α

kb−1k2

kb−1k

= 4

α

α1/4(2 +α)1/2 (4 +α)3/4 ≈

α1/4

2 4

α.

For α= 0.01 this gives

κ(Cn(b), x)≈400, κcirc(Cn(b), x)≈63 with probability near 1, and forα = 0.0001 we get

κ(Cn(b), x)≈40000, κcirc(Cn(b), x)≈2000 with probability near 1.

5

Hankel Matrices We begin with a general result.

Theorem 5.1 Let A = (ajk)∞j,k=1 be an infinite matrix and put An = (ajk)nj,k=1. If A

induces a compact operator on ℓ2, then EXn20 as n→ ∞. Proof. We denote byF

j andFjnthe operators of rank at mostjonℓ2andCn, respectively. The approximation numbers σj of A and An are defined by

σj(A) = dist (A,Fj∞) = inf{kA−Fjk:Fj ∈ Fj∞}, (j= 0,1,2, . . .),

σj(An) = dist (An,Fjn) = inf{kA−Fjk:Fj ∈ Fjn}, (j= 0,1, . . . , n−1). Note that the approximation numbers of An are just the singular values in reverse order,

sn−j(An) =σj(An) for j= 0, . . . , n−1. LetPn stand for the orthogonal projection onto the first ncoordinates. If Fj ∈ Fj∞, then PnFjPn may be identified with a matrix inFjn. Furthermore, a matrix Fj ∈ Fjn may be thought of as a matrix of the formPnFjPn with

Fj ∈ Fj∞. Thus, Fjn=PnFj∞Pn and it follows that

sn−j(An) =σj(An) = inf{kPnAPn−Fjk:Fj ∈ Fjn} = inf{kPnAPn−PnFjPnk:Fj ∈ Fj∞}

≤inf{kAFjk:Fj ∈ Fj∞}=σj(A). From Theorem 2.2 we therefore obtain

EXn2 = 1

n

n

X

k=1

s2k(An) = 1

n

n−1

X

j=0

s2nj(An)≤ 1

n

n−1

X

j=0

(14)

But if A is compact, thenσj(A)→0 as j → ∞. This implies that the right-hand side of (21) goes to zero asn→ ∞.

An interesting concrete situation is the case where A = H(b) is the Hankel matrix (bj+k−1)∞j,k=1 generated by the Fourier coefficients of a functionb∈L1. Ifbis continuous, then the Hankel matrix H(b) induces a compact operator and hence, by Theorem 5.1,

EX2

n→0. The following result shows that, surprisingly,EXn2→0 for all Hankel matrices with L1 symbols (notice that such matrices need not even generate bounded operators). Theorem 5.2 Let b L1 and let An be the n×n principal section of H(b). Then

EX2 n→0.

Proof. As shown by Fasino and Tilli [6], [15], lim

n→∞

F(s1) +. . .+F(sn)

n =F(0)

for every uniformly continuous and bounded function F on R. Suppose first that H(b) induces a bounded operator. Then kAnk ≤ kH(b)k =: d < ∞, which implies that all singular values ofAn lie in the segment [0, d]. Thus, lettingF be a smooth and bounded function such that F(x) = x2 on [0, d], we deduce that Ps2

j/n → 0. Since b is not identically zero, there is anN such thatkANk>0. It follows that 0<kANk ≤ kAnk=sn for all nn. Thus, Psj2/(ns2n)0, and Theorem 2.2 gives the assertion.

Now suppose thatH(b) is not bounded. We claim that thenkAnk → ∞. Indeed, the sequence {kAnk} is monotonically increasing: kAnk ≤ kAn+1k for alln. If there exists a finite constantM such thatkAnk ≤M for alln, then{Anx} is a convergent sequence for each xℓ2. The Banach-Steinhaus theorem (= uniform boundedness principle) therefore implies that the operator Adefined by Ax:= limAnx is bounded onℓ2. But A is clearly given by the matrixH(b). This contradiction proves thatkAnk → ∞. Finally, Fasino and Tilli [6], [15] proved that always

1

nkAnktr≤2kbk1.

Lemma 2.4 now shows thatEXn2 0.

6

Toeplitz Matrices with Unbounded Symbols

Following Tyrtyshnikov and Zamarashkin [18], we consider Toeplitz matrices generated by so-called Radon measures. Thus, given a function β : [π, π]C of bounded variation, we define

(dβ)k= 1 2π

Z π

−π

e−ikθdβ(θ), (22)

the integral understood in the Riemann-Stieltjes sense, and we put

(15)

Ifβ is absolutely continuous, thenβ′L1[π, π] and ()

kis nothing but thekth Fourier coefficient ofβ′, defined in accordance with (12). Consequently, in this case Tn(dβ) is just what we denoted by Tn(β′) in Section 3.

For general β we have β = βa+βj+βs where βa is absolutely continuous with βa′ ∈

L1[π, π],βj is the “jump part”, that is, a function of the form

βj(θ) =

X

θℓ<θ

hℓ,

X

|hℓ|<,

with an at most countable set{θ1, θ2, . . .} ⊂[−π, π), andβs is the “singular part”, that is, a continuous function of bounded variation whose derivative vanishes almost everywhere. This decomposition is unique up to constant additive terms. After partial integration (see, e.g., [7, No. 577]), formula (22) can be written more explicitly as

(dβ)k = 1 2π

Z π

−π

e−ikθβa′(θ)dθ+ 1 2π

X

hℓe−iθℓk

+(−1) k

³

βs(π)βs(π)´+ ik 2π

Z π

−π

e−ikθβs(θ)dθ.

In particular, if β(θ) = 0 forθ [π,0] and β(θ) = 2π forθ (0, π], then (dβ)k = 1 for all k, that is,Tn(dβ) is the matrix (10).

Theorem 6.1 Let β =βa+βj+βs be a nonconstant function of bounded variation and

putAn=Tn(dβ). Then EXn2 converges to a limit as n→ ∞. This limit is positive if and

only if β′

a ∈L∞[−π, π]and βj =βs= 0.

Proof. If β =βa with βa′ ∈L∞[−π, π], then EXn2 converges to kβak′ 2/kβak′ ∞ 6= 0 due to

Theorems 3.2 and 3.3.

So assumeβ=βa+βj+βs. Writeβ=β1−β2+i(β3−β4) with nonnegative functions

βk of bounded variation. We have 1

nkTn(dβ)ktr≤

1

n

4

X

k=1

kTn(dβk)ktr,

The singular values of the positively semi-definite matrices Tn(dβk) coincide with the eigenvalues. Hence

1

nkTn(βk)ktr=

1

ntrTn(βk) = (dβk)0 =

1 2π

Z π

−π

dβk≤ 1

2πVarβk<∞

(this argument is standard; see, e.g., [18]). Consequently, there is a finite constantM such that

1

nkTn(dβ)ktr≤M

(16)

with β′

a ∈ L∞[−π, π]. By virtue of Lemma 2.4, this implies that EXn2 → 0 whenever

β′a/ L∞[π, π] orβj6= 0 or βs 6= 0.

Thus, suppose there is a finite constant C such that kTn(dβ)k ≤ C for all n. Let

ej ∈Cn be thejth vector of the standard basis. Then

kTn(dβ)e1k22 =|(dβ)0|2+. . .+|(dβ)n−1|2 ≤C2,

kTn(dβ)enk22=|(dβ)0|2+. . .+|(dβ)−(n−1)|2 ≤C2

for all n, which tells us that there is a function b L2[π, π] such that (dβ)k = bk for all k. Since the decomposition of β into the absolutely continuous part, the jump part, and the singular part is unique (up to additive constants), it follows that β=βa+βj+βs with βa′ = b and βj = βs = 0. We are left to show that b is in L∞[−π, π]. Using the Banach-Steinhaus theorem as in the proof of Theorem 5.2 we arrive at the conclusion that the Toeplitz matrix T(b) := (bj−k)∞j,k=1 induces a bounded operator onℓ2. By a classical theorem of Toeplitz [16] (full proofs are also in [3] and [8]), this happens if and only if b

is in L∞[π, π].

Theorem 6.1 reveals in particular that EX2

n → 0 if An = Tn(b) with b ∈ L1 \L∞. The following theorem concerns a class of Toeplitz matrices with increasing entries. The notation cj ≃dj means thatcj/dj remains bounded and bounded away from zero. Theorem 6.2 Let An = (bj−k)nj,k=1 where |bj| ≃ eγj as j → +∞ and |b−j| ≃ eδj as

j+. If one of the numbers γ andδ is positive, then EXn2 =O(1/n) as n→ ∞. Proof. For the sake of definiteness, supposeγ >0. We have

kAnk2F = n−1

X

j=0

(nj)|bj|2+ n−1

X

j=1

(nj)|bj|2

≤ C1 n−1

X

j=0

(nj)e2γj +C1 n−1

X

j=1

(nj)e2δj

with some finite constant C1. In the cases δ >0 and δ≤0, this gives kAnk2FC2

³

e2γn+e2δn´ and kAnk2FC2e2γn with some finite constant C2, respectively; note that, for example,

nX−1

j=0

(nj)e2γj = e 2γn

(e2γ1)2 +O(n). On the other hand, consideringkAne1k22 and kAnenk22, we see that

kAnk2 1

2 n−1

X

j=0 |bj|2+

1 2

n−1

X

j=1

(17)

which shows that there is a finite constant C3 >0 such that kAnk2 ≥C3

³

e2γn+e2δn´ and kAnk2≥C3e2γn forδ >0 andδ 0, respectively. Thus, in either case,

EXn2 = kAnk 2 F

nkAnk2 =O

µ

1

n

.

Example 6.3 Let

An=

     

a b̺ b̺2 . . . n−1

cσ a b̺ . . . b̺n−2

cσ2 cσ a . . . b̺n−3 . . . . cσn−1 n−2 n−3 . . . a

     .

Similar matrices are studied in [17]. In the case a=b=c and σ =̺, such matrices are called Kac-Murdock-Szeg¨o matrices [10]. Suppose that a, b, c are nonzero. If|σ|<1 and |̺| <1, then EXn2 converges to a nonzero limit by Theorem 2.2 and (13). If |σ|> 1 or |̺| >1, we can invoke Theorem 6.2 to deduce that EXn2 0. Finally, in the two cases where |σ| ≤1 =|̺|or|̺| ≤1 =|σ|, Theorem 6.1 implies thatEX2

n→0.

7

Appendix: Distribution Functions

The referee suggested that it would be interesting to compute the distribution of Xn2 in some cases and noted that this can probably be done easily for smallnand for the matrix of Example 2.3. The purpose of this section is to address this problem. It will turn out that the referee is right in all respects.

Let An ∈ Mn(R) and let 0 ≤ s1 ≤ . . . ≤ sn be the singular values of An. Suppose

sn > 0. The random variable Xn2 = kAnxk2/kAnk2 assumes its values in [0,1]. With notation as in the proof of Theorem 2.2,

Eξ :=

½

xSn−1: k

Anxk2 kAnk2

< ξ

¾

=

½

xSn−1: k

DnVnxk2

s2 n

< ξ

¾

.

PutGξ ={x∈Sn−1 :kDnxk2/s2n< ξ}. Clearly, Gξ=Vn(Eξ). SinceVn is an orthogonal matrix, it leaves the surface measure on Sn−1 invariant. It follows that |Gξ| =|Vn(Eξ)| and hence

Fn(ξ) :=P(Xn2< ξ) =P

µ

kDnxk2

s2 n

< ξ

=P

µ

s2

1x21+. . .+s2nx2n

s2 n

< ξ

. (24) This reveals first of all that the distribution function Fn(ξ) depends only on the singular values of An.

A real-valued random variableX is said to be B(α, β) distributed on (a, b) if

P(cX < d) =

Z d

c

(18)

where the density function f(ξ) is zero on (−∞, a] and [b,) and equals (ba)1−α−β

B(α, β) (ξ−a)

α−1(bξ)β−1

on (a, b). HereB(α, β) = Γ(α)Γ(β)/Γ(α+β) is the common beta function and it is assumed that α > 0 and β > 0. The emergence of the beta distribution, of the χ2 distribution, of elliptic integrals and Bessel functions in connection with uniform distribution on the unit sphere is no surprise and can be found throughout the literature. Thus, the following results are not at all new. However, they tell a nice story and uncover the astonishing simplicity of Theorem 2.2.

We first consider 2×2 matrices, that is, we let n= 2. From (24) we infer that

F2(ξ) =P

µ

s21 s2 2

x21+x22 < ξ

. (25)

The constellation s1 =s2 is uninteresting, because F2(ξ) = 0 for ξ <1 and F2(ξ) = 1 for

ξ 1 in this case.

Theorem 7.1 If s1< s2, then the random variableX22 is subject to the B

¡1

2,12

¢

distribu-tion on (s2

1/s22,1).

Proof. Putτ =s1/s2. By (25),F2(ξ) is 1 times the length of the piece of the unit circle

x21+x22 = 1 that is contained in the interior of the ellipse τ2x21 +x22 = ξ. This gives

F2(ξ) = 0 for ξ ≤τ2 and F2(ξ) = 1 for ξ ≥1. Thus, let ξ ∈(τ2,1). Then the circle and the ellipse intersect at the four points

Ã

±

r

1ξ

1τ2,±

r

ξτ2 1τ2

!

,

and consequently,

F2(ξ) = 2

πarctan

s

ξτ2 1ξ ,

which implies that F′

2(ξ) equals 1

π(ξ−τ

2)−1/2(1

−ξ)−1/2 = 1

B(1/2,1/2)(ξ−τ

2)−1/2(1

−ξ)−1/2 and proves that X22 has theB¡12,12¢ distribution on (τ2,1).

In the general case, things are more involved. An idea of the variety of possible distribution functions is provided by the class of matrices whose singular values satisfy

0 =s1 =. . .=sn−m < sn−m+1≤. . .≤sn (26) with smallm. Notice thatm is just the rank of the matrix. We put

µn−m+1 =

sn

sn−m+1

, µn−m+2 =

sn

sn−m+2

, . . . , µn=

sn

sn

(19)

Our problem is to find

Fn(ξ) =P

µ

x2nm+1 µ2

n−m+1

+. . .+x 2 n

µ2 n

< ξ

; (27)

the dependence of Fn on m and µn−m+1, . . . , µn will be suppressed.

Example 7.2 In order to illustrate what will follow by a transparent special case, we take n = 3 and suppose that the singular values of A3 satisfy 0 = s1 < s2 < s3. We put µ=s3/s2. Clearly, (27) becomes F3(ξ) =P

³x2 2

µ2 +x23 < ξ

´

. We rename x1, x2, x3 to

x, y, z. The preceding equality tells us that F3(ξ) is 1 times the area of the piece Σ ofe the sphere x2+y2+z2 = 1 that is cut out by the elliptic cylinder y22+z2 < ξ. Let us assume that ξ (0,1/µ2). Then the ellipse y2/µ2+z2 < ξ is completely contained in the disky2+z2 <1. By symmetry, it suffices to consider the part Σ of Σ that lies in thee octant x0, y 0, z0. We have

F3(ξ) = 8 4π

Z

Σ

dσ,

and it easily verified (see the proof of Theorem 7.3) that 8

Z

Σ

dσ= 8 4π/3

Z

co(0,Σ)

dx dy dz,

where co(0,Σ) is the cone ∪sΣ[0, s] (we prefer volume integrals to surface integrals). A parametrization of Σ is

y=µrcosϕ z=rsinϕ x=

q

1r2(µ2cos2ϕ+ sin2ϕ),

where r [0,√ξ) and ϕ[0, π/2]. Consequently, a parametrization of co(0,Σ) is given by

y=tµrcosϕ z=trsinϕ x=t

q

1r2(µ2cos2ϕ+ sin2ϕ)

with t [0,1] and r and ϕ as before. We set v(ϕ) = µ2cos2ϕ+ sin2ϕ and denote the Jacobian ∂(y, z, x)/∂(t, r, ϕ) byJ. By what was said above,

F3(ξ) = 6

π

Z

co(0,Σ)

dx dy dz= 6

π

Z π/2

0

Z √

ξ

0

Z 1

0 |

(20)

Writing downJ and subtracting r/ttimes the second column from the first we get

(21)

We now return to the situation given by (26). Let Q= [0, π/2]. Forϕ1, . . . , ϕk−1 in

Q we introduce the spherical coordinatesω1(k), . . . , ω(k)k by

ω1(k)= cosϕ1

ω2(k)= sinϕ1cosϕ2

ω3(k)= sinϕ1sinϕ2cosϕ3

. . .

ωk(k)1 = sinϕ1sinϕ2. . .sinϕk−2cosϕk−2

ωk(k)= sinϕ1sinϕ2. . .sinϕk−2sinϕk−2. Notice that

∂(rω(k)1 , . . . , rω(k)k )

∂(r, ϕ1, . . . , ϕk−1)

=rk−1sink−2ϕ1sink−3ϕ2. . .sinϕk−2, (29)

Z

Qk1

sink−2ϕ1sink−3ϕ2. . .sinϕk−2dϕ1. . . dϕk−1= 1 2k−1

πk/2

Γ(k/2). (30) We define v=v(ϕ1, . . . , ϕm−1) by

v=µ2nm+11(m)]2+µ2nm+2[ω(m)2 ]2+. . .+µn2[ωn(m)m]2.

Theorem 7.3 Let n 3 and suppose the singular values of An satisfy (26). Then for

ξ (0,1/µ2

n−m+1), the density function of Xn2 is

fn(ξ) =cnξ(m−2)/2

Z

Qm

−1

(1ξv(ϕ))(n−m−2)/2sinm−2ϕ1sinm−3ϕ2. . .sinϕm−2dϕ

with

cn= 2m−1

πm/2

Γ³n 2

´

Γ

µ

nm

2

¶µn−m+1. . . µn.

Proof. We proceed as in Example 7.2. Let Σ denote the set of all points (x1, . . . , xn) for which x21+. . .+xn2 = 1,x1≥0, . . . , xn≥0, and

x2 n−m+1

µ2 n−m+1

+. . .+x 2 n

µ2 n

< ξ. (31)

We have

Fn(ξ) = 2n |Sn−1|

Z

Σ

dσ.

We prefer to switch from the surface integral to a volume integral. Let co(0,Σ) denote the cone formed by all segments [0, s] with sΣ. Because|Sn−1|=n|Bn|, we get

1 |Sn−1|

Z

Σ

dσ= 1

n|Bn|

Z

Σ

dσ= 1 |Bn|

Z 1

0

Z

Σ

tn−1dσ dt= 1 |Bn|

Z

co(0,Σ)

(22)

Since ξµ2

n−m+j <1 forj= 1, . . . , m, the ellipsoid (31) is completely contained in the ball

x2nm+1+. . .+x2n<1.

We start with the parametrization

xn−m+1 = µn−m+1r ω1(m)(ϕ1, . . . , ϕm−1)

xn−m+2 = µn−m+2r ω2(m)(ϕ1, . . . , ϕm−1)

. . .

xn = µnr ω(m)m (ϕ1, . . . , ϕm−1),

where r[0,√ξ) and (ϕ1, . . . , ϕm−1)∈Qm−1. By the definition ofv, x21+. . .+x2nm = 1x2nm+1. . .x2n= 1r2v.

Hence, after letting (θ1, . . . , θn−m−1)∈Qn−m−1 and

x1 =

p

1r2v(ϕ

1, . . . , ϕm−1)ω1(n−m)(θ1, . . . , θn−m−1)

x2 =

p

1r2v(ϕ

1, . . . , ϕm−1)ω2(n−m)(θ1, . . . , θn−m−1)

. . . xn−m =

p

1r2v(ϕ

1, . . . , ϕm−1)ωn(n−−mm)(θ1, . . . , θn−m−1)

we have accomplished the parametrization of Σ. Finally, on multiplying the right-hand sides of the above expressions for x1, . . . , xn by t∈[0,1], we obtain a parametrization of co(0,Σ). The Jacobian

J = ∂(x1, . . . , xn)

∂(t, r, ϕ1, . . . , ϕm−1, θ1, . . . , θn−m−1)

can be evaluated as in Example 7.2: after subtracting r/ttimes the second column from the first and taking into account that

p

1r2vr

∂r

p

1r2v = 1 1r2v

one arrives at a determinant that is the product of an m×m determinant and an (n m)×(nm) determinant; these two determinants can in turn be computed using (29). What results is

|J| = tn−1rm−1µn−m+1. . . µn(1−r2v)(n−m−2)/2 × sinm−2ϕ1sinm−3ϕ2. . .sinϕm−2

(23)

In summary, with cn as in the theorem. Consequently,

Fn′(ξ) =cnξ(m−2)/2

From Theorem 7.3 we therefore deduce that the density functionfn(ξ) is a constant times

ξ(m−2)/2(1ξ)(n−m−2)/2. The constant is

Under the hypothesis of Corollary 7.4, the density function ofXn2 is

fn(ξ) =

It follows that the random variable nX2

(24)

on (0, n). Ifm remains fixed andngoes to infinity, then this has the limit 1

2m/2Γ(m/2)ξ

(m−2)/2e−ξ/2, ξ

∈(0,),

which is the density of the χ2

m distribution.

Example 7.5 Let us consider Example 2.3 again. Thus, suppose An is the matrix (10). The singular values ofAnare 0, . . . ,0, nand hence we can apply Corollary 7.4 withm= 1 to the situation at hand. It follows thatX2

n isB

¡1

2,n−21

¢

distributed on the interval (0,1). If X has theB(α, β) distribution on (0,1), then

EX = α

α+β, σ

2X= αβ

(α+β)2(α+β+ 1). This yields

EXn2= 1

n, σ

2X2 n=

2(n1)

n2(n+ 2),

which is in perfect accordance with (11). In Example 2.3 we were able to conclude that

P(Xn2 ε)8/(n2ε2). Since we know the density, we can now write

P(Xn2ε) = 1

B

µ

1 2,

n1 2

¶ Z 1

ε

ξ−1/2(1ξ)(n−3)/2dξ.

Once partially integrating and using Stirling’s formula we obtain

P(Xn2ε) =

r

2

π

1

(1ε)(n−1)/2

µ

1 +O

µ

1

n

¶¶

,

the O depending on ε. Thus, for each ε (0,1), the probability P(X2

n ≥ ε) actually decays exponentially to zero as n→ ∞.

Example 7.6 Orthogonal projections have just the singular value pattern of Corollary 7.4. This leads to some pretty nice conclusions.

Let E be an N-dimensional Euclidean space and let U be an m-dimensional linear subspace ofE. We denote by PU the orthogonal projection ofE ontoU. Then foryE, the element PUy is the best approximation ofy inU and we have kyk2 =kPUyk2+ky

PUyk2. The singular values of PU are N m zeros and m units. Thus, Corollary 7.4 implies that if y is uniformly distributed on the unit sphere of E, then kPUyk2 has the

B¡m2,N−2m¢ distribution on (0,1). In particular, if N is large, then PUy lies with high probability close to the sphere of radiuspmN and the squared distanceky−PUyk2clusters sharply around 1 mN.

(25)

an m-dimensional linear subspace of Mn(R). Examples include the Toeplitz matrices, Toepn(R)

the Hankel matrices, Hankn(R) the tridiagonal matrices, Tridiagn(R)

the tridiagonal Toeplitz matrices, TridiagToepn(R) the symmetric matrices, Symmn(R)

the lower-triangular matrices, Lowtriangn(R) the matrices with zero main diagonal, Zerodiagn(R) the matrices with zero trace, Zerotracen(R).

The dimensions of these linear spaces are

dim Toepn(R) = 2n1, dim Hankn(R) = 2n1,

dim Tridiagn(R) = 3n2, dim TridiagToepn(R) = 3,

dim Symmn(R) = n 2+n

2 , dim Lowtriangn(R) =

n2+n

2 , dim Zerodiagn(R) =n2n, dim Zerotracen(R) =n2−1.

Supposenis large andYn∈Mn(R) is uniformly distributed on the unit sphere onMn(R), kYnk2F = 1. Let PStructYn be the best approximation of Yn by a matrix in Structn(R). Notice that the determination of PStructYn is a least squares problem that can be easily solved. For instance, PToepYn is the Toeplitz matrix whose kth diagonal, k = −(n− 1), . . . , n1, is formed by the arithmetic mean of the numbers in the kth diagonal of

Yn. Recall that dim Structn(R) =m. From what was said in the preceding paragraph, we conclude thatkPStructYnk2

FisB

³

m 2,n

2

−m 2

´

distributed on (0,1). For example,kPToepYnk2 has theB³2n2−1,n2−2n+12 ´distribution on (0,1). The expected value of the variablekYn− PToepYnk2is 1−n2+n12 and the variance does not exceed n43. Hence, Chebyshev’s inequality gives

P

µ

1 2

n+

1

n2 −

ε

n <kYn− PToepYnk

2 <1

n2 + 1

n2 +

ε n

≥1 4

nε2. (32) Consequently, PToepYn is with high probability found near the sphere with the radius

q

2

n −n12 and kYn− PToepYnk2F is tightly concentrated around 1− 2n+n12.

We arrive at the conclusion that nearly alln×nmatrices of Frobenius norm 1 are at nearly the same distance to the set of all n×n Toeplitz matrices!

This does not imply that the Toeplitz matrices are at the center of the universe. In fact, the conclusion is true for each of the classes Strucn(R) listed above. For instance, from Chebyshev’s inequality we obtain

P

µ

1 2 −

1

2n−ε <kYn− PSymmYnk

2 < 1 2 −

1 2n+ε

≥1 1

(26)

and much sharper estimates. Namely, Lemma 2.2 of [5] in conjunction with Corollary 7.4 implies that ifX2

n has theB

¡m

2,N−2m

¢

distribution on (0,1), then

P³Xn2 σ m be small and choose τ such that

τ

theO depending onε, and hence (34) amounts to

P

which is worse than the Chebyshev estimate (32).

(27)

Proof. We havev(ϕ) =µ2cos2ϕ+ sin2ϕ and hence (28) yields Combining (38) and Theorem 7.3 we arrive at (36). Formula 2.2.6.1 of [12] gives that (38) equals

This in conjunction with Theorem 7.3 proves (37).

Fory(0,1), the complete elliptic integrals K(y) and E(y) are defined by

For smalln’s, Corollary 7.7 delivers the following densities on (0,1/µ2):

f3(ξ) =

(28)

Formula 2.3.6.2 of [12] tells us that the last expression equals

µ

2πe

−ξ/2π e−ξ(µ2

−1)/4I 0

µ

ξ(µ21) 4

= µ 2 e

−ξ(µ2 +1)/4I

0

µ

ξ(µ21) 4

,

where I0 is the modified Bessel function,

I0(y) := 1 +

X

k=1

µ

1

k!

¶2³

y

2

´2k

.

Conclusion. It is clear that estimates based on knowledge of the distribution function are in general better than estimates that are obtained from Chebyshev’s inequality. Examples 2.3 and 7.5 convincingly demonstrate the superiority of the distribution function over Chebyshev estimates. However, the message of this paper is that, for large n, the ratio kAnxk2/(kAnk2kxk2) clusters sharply around a certain number and that this number can be completely identified for important classes of structured matrices. The zoo of distribution functions we encountered above makes us appreciate the beauty of the simple and general Theorem 2.2 and the ease with which we were able to deduce the results of Sections 3 to 6 from this theorem.

Acknowledgment. We thank Florin Avram, Nick Trefethen, and Man-Chung Yeung for encouraging remarks on a baby version of this paper. We are also greatly indebted to David Aldous, Ted Cox, and the referee for giving our attempt at probability a warm reception.

References

[1] F. Avram: On bilinear forms in Gaussian random variables and Toeplitz matrices.

Probab. Theory Related Fields79(1988), 37–45.

[2] A. B¨ottcher and S. Grudsky: Toeplitz Matrices, Asymptotic Linear Algebra, and Functional Analysis. Hindustan Boook Agency, New Delhi 2000 and Birkh¨auser Ver-lag, Basel 2000.

[3] A. B¨ottcher and B. Silbermann: Analysis of Toeplitz Operators. Akademie-Verlag, Berlin 1989 and Springer-Verlag, Berlin 1990.

[4] A. B¨ottcher and B. Silbermann: Introduction to Large Truncated Toeplitz Matrices. Springer-Verlag, New York 1999.

[5] S. Dasgupta and A. Gupta: An elementary proof of a theorem of Johnson and Lin-denstrauss. Random Structures& Algorithms22(2003), 60–65.

[6] D. Fasino and P. Tilli: Spectral clustering properties of block multilevel Hankel ma-trices.Linear Algebra Appl. 306(2000), 155–163.

(29)

[8] P. Halmos: A Hilbert Space Problem Book.D. van Nostrand, Princeton 1967.

[9] N. J. Higham: Accuracy and Stability of Numerical Algorithms. SIAM, Philadelphia, PA 1996.

[10] M. Kac, W. L. Murdock, and G. Szeg¨o: On the eigenvalues of certain Hermitian forms. J. Rat. Mech. Anal.2 (1953), 787–800.

[11] S. V. Parter: On the distribution of the singular values of Toeplitz matrices. Linear Algebra Appl.80(1986), 115–130.

[12] A. P. Prudnikov, Yu. A. Brychkov, and O. I. Marichev: Integrals and Series. Vol. 1. Elementary Functions.Gordon and Breach, New York 1986.

[13] S. Roch and B. Silbermann: Index calculus for approximation methods and singular value decomposition.J. Math. Analysis Appl. 225(1998), 401–426.

[14] S. M. Rump: Structured perturbations. Part I: Normwise distances. Preprint, April 2002.

[15] P. Tilli: A note on the spectral distribution of Toeplitz matrices. Linear and Multi-linear Algebra45 (1998), 147–159.

[16] O. Toeplitz: Zur Theorie der quadratischen und bilinearen Formen von unendlichvie-len Ver¨anderlichen.Math. Ann.70 (1911), 351–376.

[17] W. F. Trench: Asymptotic distribution of the spectra of a class of generalized Kac-Murdock-Szeg¨o matrices.Linear Algebra Appl. 294 (1999), 181–192.

[18] E. E. Tyrtyshnikov and N. L. Zamarashkin: Toeplitz eigenvalues for Radon measures.

Referensi

Dokumen terkait

[r]

Untuk melakukan pengujian Sistem Informasi Digital Library di Perpustakaan SMP Negeri 2 Bandung apakah sudah sesuai dengan yang dibutuhkan oleh siswa dalam menjawab

Finally, as individual independent variable it can be concluded that only dividend yield that is significant in predicting the stock return while price earnings ratio and book

[r]

Berdasarkan hasil uji nilai F yang ditunjukkan pada tabel 4 diperoleh nilai p value sebesar 0.000 &lt; 0.05, maka dapat disimpulkan bahwa persepsi Wajib Pajak atas

Speech acts, which is technical term in linguistics, is the acts of communication which is usually used to make request, asking questions, giving orders,

lebih besar dari 110% (seratus sepuluh perseratus) dari harga satuan yang tercantum dalam HPS, dilakukan klarifikasi. Harga satuan penawaran tersebut dinyatakan timpang dan hanya

However, it should be noted that the result of this research is that the independent variables which is burnout (exhaustion and disengagement) have positive