K-trace - CONCENTRATION OF EIGENVALUE SUMS AND GENERALIZED LIEB’S CONCAVITY THEOREM


5.2 K-trace

In Tropp’s results (5.6), namely the case $k = 1$, the cost of “switching” $\lambda$ and $\mathbb{E}$ is of scale $\log n$. In our estimates (5.8), the gap factor becomes $\log\binom{n}{k} \le k\log n$, which grows only sub-linearly in $k$; this is reasonable, as we are estimating the sum of the $k$ largest (or smallest) eigenvalues. We shall further compare our estimates to another related work. Tropp et al. [40] introduced a subspace argument based on the Courant–Fischer characterization of eigenvalues to prove tail bounds for all eigenvalues of $Y = \sum_{i=1}^{m} X^{(i)}$. Though not stated in [40], the following expectation estimates for all eigenvalues can also be established using the subspace argument.

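Given any finite sequence of independent random matrices $\{X^{(i)}\}_{i=1}^{m}$ under the same assumption as in Theorem 5.1.4, and $Y = \sum_{i=1}^{m} X^{(i)}$, we have for any $1 \le k \le n$,

\[ \mathbb{E}\,\lambda_k(Y) \le \inf_{\theta>0}\ \frac{e^{\theta}-1}{\theta}\,\lambda_k(\mathbb{E}Y) + \frac{c}{\theta}\log(n-k+1), \tag{5.10a} \]

\[ \mathbb{E}\,\lambda_k(Y) \ge \sup_{\theta>0}\ \frac{1-e^{-\theta}}{\theta}\,\lambda_k(\mathbb{E}Y) - \frac{c}{\theta}\log k. \tag{5.10b} \]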

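Summing (5.10a) (or (5.10b)) over the $k$ largest (or smallest) eigenvalues, we immediately obtain

\[ \mathbb{E}\sum_{i=1}^{k}\lambda_i(Y) \le \inf_{\theta>0}\ \frac{e^{\theta}-1}{\theta}\sum_{i=1}^{k}\lambda_i(\mathbb{E}Y) + \frac{c}{\theta}\log\prod_{i=1}^{k}(n-i+1), \tag{5.11a} \]

\[ \mathbb{E}\sum_{i=1}^{k}\lambda_{n-i+1}(Y) \ge \sup_{\theta>0}\ \frac{1-e^{-\theta}}{\theta}\sum_{i=1}^{k}\lambda_{n-i+1}(\mathbb{E}Y) - \frac{c}{\theta}\log\prod_{i=1}^{k}(n-i+1). \tag{5.11b} \]

Therefore, our expectation estimates (5.8a) and (5.8b) are sharper for partial sums of eigenvalues, as $\log\binom{n}{k} < \log\prod_{i=1}^{k}(n-i+1)$ for $k > 1$. In particular, if one chooses $k$ to be a fixed proportion of $n$, then $\log\binom{n}{k} = O(k)$, while $\log\prod_{i=1}^{k}(n-i+1) = O(k\log n)$. Our results are then better by a factor of $\log n$.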
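The comparison of the two gap factors can be checked numerically. The following is a minimal sketch, not part of the original argument; the helper names `log_binom` and `log_falling_product` are hypothetical, and log-gamma is used only to avoid forming huge integers.

```python
import math

def log_binom(n: int, k: int) -> float:
    """log C(n, k), computed through log-gamma."""
    return math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)

def log_falling_product(n: int, k: int) -> float:
    """log prod_{i=1}^k (n - i + 1) = log(n! / (n - k)!)."""
    return math.lgamma(n + 1) - math.lgamma(n - k + 1)

if __name__ == "__main__":
    # k is a fixed proportion (one half) of n.
    for n in (100, 1000, 10000):
        k = n // 2
        print(n, k, log_binom(n, k) / k, log_falling_product(n, k) / (k * math.log(n)))
```

The two quantities differ exactly by $\log k!$, which is why the gap between them is of order $k\log k$ when $k$ is proportional to $n$.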

Finally, we remark that by combining Theorem 5.3.1 with the subspace argument in [40], one should be able to derive similar expectation estimates and tail bounds for the sum of arbitrary successive eigenvalues of $Y = \sum_{i=1}^{m} X^{(i)}$. We leave this potential extension to future work.

In particular, $\mathrm{Tr}_1[A] = \mathrm{Tr}[A]$ is the normal trace of $A$, and $\mathrm{Tr}_n[A] = \det[A]$ is the determinant of $A$. If we write $A_{(i_1\cdots i_k,\, i_1\cdots i_k)}$ for the $k\times k$ principal submatrix of $A$ corresponding to the indices $i_1, i_2, \dots, i_k$, then an equivalent definition of the $k$-trace of $A$ is given by

\[ \mathrm{Tr}_k[A] = \sum_{1\le i_1<i_2<\cdots<i_k\le n} \det[A_{(i_1\cdots i_k,\, i_1\cdots i_k)}], \qquad 1\le k\le n. \tag{5.13} \]

Using the second definition (5.13), one can check that for any $1\le k\le n$, the $k$-trace enjoys the cyclic invariance property, like the normal trace and the determinant. That is, for any $A, B \in \mathbb{C}^{n\times n}$, $\mathrm{Tr}_k[AB] = \mathrm{Tr}_k[BA]$.
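Definition (5.13) translates directly into code. The sketch below is only an illustration under the assumption that a brute-force sum over all index subsets is acceptable (it costs $\binom{n}{k}$ determinants); the function name `k_trace` is hypothetical. It checks the two special cases and the cyclicity property on random complex matrices.

```python
import itertools
import numpy as np

def k_trace(A: np.ndarray, k: int):
    """Sum of all k-by-k principal minors of the square matrix A, as in (5.13)."""
    n = A.shape[0]
    total = 0.0
    for idx in itertools.combinations(range(n), k):
        sub = A[np.ix_(idx, idx)]      # principal submatrix on rows/columns idx
        total += np.linalg.det(sub)
    return total

rng = np.random.default_rng(0)
n = 5
A = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
B = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))

trace_case = np.isclose(k_trace(A, 1), np.trace(A))        # Tr_1 = Tr
det_case = np.isclose(k_trace(A, n), np.linalg.det(A))     # Tr_n = det
cyclic = all(np.isclose(k_trace(A @ B, k), k_trace(B @ A, k)) for k in range(1, n + 1))
```

Cyclicity holds here because $AB$ and $BA$ have the same characteristic polynomial, and the sum of the $k\times k$ principal minors is (up to sign) one of its coefficients.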

The motivation for studying the $k$-trace is to provide effective estimates on the sum of the $k$ largest (or smallest) eigenvalues of, in particular, random Hermitian matrices.

As we know, the sum of the $k$ largest eigenvalues of a Hermitian matrix $A$ (as a variable) is a convex function of $A$. So if $A$ is random, we have, for example, the expectation estimate

\[ \mathbb{E}\sum_{i=1}^{k}\lambda_i(A) \ge \sum_{i=1}^{k}\lambda_i(\mathbb{E}A) \]

by Jensen’s inequality. Recall that $\lambda_i(A)$ denotes the $i$-th largest eigenvalue of $A$. This provides a lower bound for the left-hand side if we know $\mathbb{E}A$, or a way to bound the right-hand side from above if we can sample $A$. However, an estimate between these two quantities in the reverse direction is more interesting and challenging. For the $k = 1$ case, Tropp [122] related the largest eigenvalue to the trace of the exponential using the observation

\[ \lambda_1(A) \le \log \mathrm{Tr}\exp(A) \le \lambda_1(A) + \log n, \qquad A \in \mathbf{H}_n, \tag{5.14} \]

which only introduces a gap of logarithmic scale in the dimension. In particular, the first inequality in (5.14) was applied to the random matrix $A$, and the second was applied to $\mathbb{E}A$. Tropp then applied Lieb’s theorem to the intermediate quantity $\mathrm{Tr}\exp(A)$ (more precisely, with $A = H + \log Y$ for some fixed matrix $H$ and some random matrices $Y$) to derive inverse expectation estimates and a series of matrix concentration inequalities. Inspired by Tropp’s work, we will develop expectation estimates and tail bounds on the sum of the $k$ largest eigenvalues based on an analog of (5.14):

\[ \sum_{i=1}^{k}\lambda_i(A) \le \log \mathrm{Tr}_k\exp(A) \le \sum_{i=1}^{k}\lambda_i(A) + \log\binom{n}{k}, \qquad A \in \mathbf{H}_n. \tag{5.15} \]

This is actually the starting point of this paper. Naturally, manipulating the intermediate quantity $\mathrm{Tr}_k\exp(A)$ in our estimates requires extending Lieb’s theorem to a general $k$-trace version. Note that the sum of the $k$ smallest eigenvalues can be handled in a similar spirit.
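The two-sided bound (5.15) is easy to test on a sample matrix. The following sketch is illustrative only and assumes a real symmetric $A$ (a special case of Hermitian) to keep the arithmetic real; `k_trace` and `sym_exp` are hypothetical helper names, and the $k$-trace is evaluated by brute force through the principal-minor formula (5.13).

```python
import itertools
import math
import numpy as np

def k_trace(A, k):
    """Sum of all k-by-k principal minors of A, as in (5.13)."""
    n = A.shape[0]
    return sum(np.linalg.det(A[np.ix_(idx, idx)])
               for idx in itertools.combinations(range(n), k))

def sym_exp(A):
    """Matrix exponential of a real symmetric A via its eigendecomposition."""
    w, V = np.linalg.eigh(A)
    return (V * np.exp(w)) @ V.T

rng = np.random.default_rng(1)
n, k = 6, 3
G = rng.standard_normal((n, n))
A = (G + G.T) / 2                       # random real symmetric matrix

eigs = np.linalg.eigvalsh(A)            # ascending order
top_k_sum = eigs[-k:].sum()             # sum of the k largest eigenvalues
log_trk = math.log(k_trace(sym_exp(A), k))
gap = math.log(math.comb(n, k))         # log C(n, k)

print(top_k_sum, log_trk, top_k_sum + gap)
```

The reason the bound holds is that $\mathrm{Tr}_k\exp(A)$ is a sum of $\binom{n}{k}$ positive products of the form $e^{\lambda_{i_1}+\cdots+\lambda_{i_k}}$, the largest of which is the exponential of the sum of the $k$ largest eigenvalues.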

Apart from its particular use discussed above, the $k$-trace is of theoretical interest in itself, as it has many interpretations corresponding to different aspects of matrix theory. Writing $D(A^{(1)}, A^{(2)}, \dots, A^{(n)})$ for the mixed discriminant of any $n$ matrices $A^{(1)}, A^{(2)}, \dots, A^{(n)} \in \mathbb{C}^{n\times n}$, we have the identity

\[ \mathrm{Tr}_k[A] = \binom{n}{k}\, D(\underbrace{A,\dots,A}_{k},\ \underbrace{I_n,\dots,I_n}_{n-k}). \]

Also, if we consider the $k$-th exterior algebra $\wedge^k(\mathbb{C}^n)$, we can interpret the $k$-trace of $A$ as

\[ \mathrm{Tr}_k[A] = \mathrm{Tr}_{L(\wedge^k(\mathbb{C}^n))}\big[M_0^{(k)}(A)\big], \]

where $\mathrm{Tr}_{L(\wedge^k(\mathbb{C}^n))}$ is the normal trace on the operator space $L(\wedge^k(\mathbb{C}^n))$, and $M_0^{(k)}(A) \in L(\wedge^k(\mathbb{C}^n))$ is defined by $M_0^{(k)}(A)(v_1\wedge v_2\wedge\cdots\wedge v_k) = Av_1\wedge Av_2\wedge\cdots\wedge Av_k$ for any $v_1\wedge v_2\wedge\cdots\wedge v_k \in \wedge^k(\mathbb{C}^n)$. These two interpretations will, in fact, provide us with important tools for studying the $k$-trace and proving our generalized Lieb’s concavity theorems. We will discuss this further in Sections 5.6.1 and 5.6.2.
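The exterior-algebra interpretation can be made concrete in coordinates. As a sketch under the assumption that one works in the standard basis $e_{i_1}\wedge\cdots\wedge e_{i_k}$ of $\wedge^k(\mathbb{C}^n)$, the operator $M_0^{(k)}(A)$ is represented by the $k$-th compound matrix, whose entries are the $k\times k$ minors of $A$. The function name `compound` is hypothetical, and real matrices are used for simplicity.

```python
import itertools
import numpy as np

def compound(A: np.ndarray, k: int) -> np.ndarray:
    """k-th compound matrix of A: entry (I, J) is the minor det A[I, J]."""
    n = A.shape[0]
    subsets = list(itertools.combinations(range(n), k))
    C = np.empty((len(subsets), len(subsets)),
                 dtype=np.result_type(A.dtype, np.float64))
    for r, I in enumerate(subsets):
        for c, J in enumerate(subsets):
            C[r, c] = np.linalg.det(A[np.ix_(I, J)])
    return C

rng = np.random.default_rng(3)
n, k = 5, 2
A = rng.standard_normal((n, n))
B = rng.standard_normal((n, n))

# The trace of the compound matrix is the sum of the principal minors, i.e. Tr_k[A];
# compare it with the k-th elementary symmetric polynomial of the eigenvalues.
lam = np.linalg.eigvals(A)
e_k = sum(np.prod(lam[list(idx)]) for idx in itertools.combinations(range(n), k))
trace_matches = np.isclose(np.trace(compound(A, k)), e_k)

# Multiplicativity (Cauchy-Binet): compound(AB) = compound(A) compound(B).
multiplicative = np.allclose(compound(A @ B, k), compound(A, k) @ compound(B, k))
```

Multiplicativity together with the cyclicity of the ordinary trace gives another way to see that $\mathrm{Tr}_k[AB] = \mathrm{Tr}_k[BA]$.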

Throughout this work, we will be using the following nice properties of the $k$-trace.

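Proposition 5.2.1. For any positive integers $n, k$ with $1 \le k \le n$, the $k$-trace function $\mathrm{Tr}_k[\cdot]$ satisfies the following:

(i) Cyclicity: $\mathrm{Tr}_k[AB] = \mathrm{Tr}_k[BA]$, $A, B \in \mathbb{C}^{n\times n}$.

(ii) Homogeneity: $\mathrm{Tr}_k[\alpha A] = \alpha^k\,\mathrm{Tr}_k[A]$, $A \in \mathbb{C}^{n\times n}$, $\alpha \in \mathbb{C}$.

(iii) Monotonicity: For any $A, B \in \mathbf{H}_n^+$, $\mathrm{Tr}_k[A] \ge \mathrm{Tr}_k[B]$ if $A \succeq B$; $\mathrm{Tr}_k[A] > \mathrm{Tr}_k[B]$ if $A \succ B$. In particular, $\mathrm{Tr}_k[A] \ge 0$, $A \in \mathbf{H}_n^+$.

(iv) Concavity: The function $A \mapsto (\mathrm{Tr}_k[A])^{1/k}$ is concave on $\mathbf{H}_n^+$.

(v) Hölder’s Inequality: $\mathrm{Tr}_k[|AB|^r]^{1/r} \le \mathrm{Tr}_k[|A|^p]^{1/p}\,\mathrm{Tr}_k[|B|^q]^{1/q}$, for any $r, p, q \in (0, +\infty]$ with $\frac{1}{p} + \frac{1}{q} = \frac{1}{r}$, and any $A, B \in \mathbb{C}^{n\times n}$.

(vi) Consistency: For any $\tilde n$ with $k \le \tilde n \le n$, and any $A \in \mathbb{C}^{\tilde n\times\tilde n}$,

\[ \mathrm{Tr}_k\left[\begin{pmatrix} A & 0 \\ 0 & 0 \end{pmatrix}_{n\times n}\right] = \mathrm{Tr}_k[A]. \]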

Proof. (i), (ii), (iii) and (vi) can be easily verified from the definitions (5.12) and (5.13). (iv) is a consequence of the general Brunn–Minkowski theorem (Corollary 5.6.3) introduced in Section 5.6.1. (v) is a direct result of expression (5.51) in Section 5.6.2. In fact, since the normal trace satisfies Hölder’s inequality, we have

\[ \begin{aligned} \mathrm{Tr}_k[|AB|^r]^{1/r} &= \mathrm{Tr}\big[\,|M_0^{(k)}(A)\,M_0^{(k)}(B)|^r\,\big]^{1/r} \\ &\le \mathrm{Tr}\big[\,|M_0^{(k)}(A)|^p\,\big]^{1/p}\,\mathrm{Tr}\big[\,|M_0^{(k)}(B)|^q\,\big]^{1/q} \\ &= \mathrm{Tr}_k[|A|^p]^{1/p}\,\mathrm{Tr}_k[|B|^q]^{1/q}. \end{aligned} \]

We have used multiple properties of the operator $M_0^{(k)}(A)$ introduced in Section 5.6.2.
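Properties (ii), (iii) and (vi) of Proposition 5.2.1 can be spot-checked numerically. The sketch below is a sanity check on random real positive semidefinite matrices, not a proof; `k_trace` and `random_psd` are hypothetical helper names, and the $k$-trace is computed by brute force from (5.13).

```python
import itertools
import numpy as np

def k_trace(A, k):
    """Sum of all k-by-k principal minors of A, as in (5.13)."""
    n = A.shape[0]
    return sum(np.linalg.det(A[np.ix_(idx, idx)])
               for idx in itertools.combinations(range(n), k))

rng = np.random.default_rng(2)
n, k = 5, 3

def random_psd(m):
    """Random m-by-m real positive semidefinite matrix G G^T."""
    G = rng.standard_normal((m, m))
    return G @ G.T

B = random_psd(n)
A = B + random_psd(n)                   # A - B is PSD, so A dominates B

# (ii) Homogeneity of degree k.
alpha = 1.7
homogeneous = np.isclose(k_trace(alpha * A, k), alpha**k * k_trace(A, k))

# (iii) Monotonicity on the PSD cone.
monotone = k_trace(A, k) >= k_trace(B, k) >= 0

# (vi) Consistency: zero-padding a smaller matrix leaves the k-trace unchanged.
n_small = 4
A_small = random_psd(n_small)
padded = np.zeros((n, n))
padded[:n_small, :n_small] = A_small
consistent = np.isclose(k_trace(padded, k), k_trace(A_small, k))
```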

A Simplified Notation

As we will be working in the general setting of the $k$-trace, there is no need to specify a particular value of $k$. Therefore, we will sometimes write

\[ \phi(A) = (\mathrm{Tr}_k[A])^{1/k} \]

for notational simplicity. Note that the function $\phi$ also satisfies (i) cyclicity, (iii) monotonicity, (v) Hölder’s inequality and (vi) consistency as in Proposition 5.2.1. But now the map $A \mapsto \phi(A)$ is homogeneous of order 1 and is concave on $\mathbf{H}_n^+$. Abusing notation, we will also refer to the function $\phi$ as the $k$-trace.
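The two properties of $\phi$ stated above, homogeneity of order 1 and concavity on the positive semidefinite cone, can be illustrated on random samples. This is a sketch only: `k_trace`, `phi` and `random_psd` are hypothetical names, real symmetric matrices stand in for Hermitian ones, and concavity is tested along a single segment rather than proved.

```python
import itertools
import numpy as np

def k_trace(A, k):
    """Sum of all k-by-k principal minors of A, as in (5.13)."""
    n = A.shape[0]
    return sum(np.linalg.det(A[np.ix_(idx, idx)])
               for idx in itertools.combinations(range(n), k))

def phi(A, k):
    """phi(A) = (Tr_k[A])^(1/k), for positive semidefinite A."""
    return k_trace(A, k) ** (1.0 / k)

rng = np.random.default_rng(4)
n, k = 5, 3

def random_psd(m):
    """Random m-by-m real positive semidefinite matrix G G^T."""
    G = rng.standard_normal((m, m))
    return G @ G.T

A, B = random_psd(n), random_psd(n)

# Homogeneity of order 1 for a positive scalar.
alpha = 2.5
order_one = np.isclose(phi(alpha * A, k), alpha * phi(A, k))

# Concavity along the segment between A and B:
# phi(t A + (1 - t) B) >= t phi(A) + (1 - t) phi(B).
ts = np.linspace(0.0, 1.0, 11)
concave = all(
    phi(t * A + (1 - t) * B, k) >= t * phi(A, k) + (1 - t) * phi(B, k) - 1e-10
    for t in ts
)
```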
