CONCENTRATION OF EIGENVALUE SUMS AND GENERALIZED LIEB’S CONCAVITY THEOREM
5.2 K-trace
In Tropp's results (5.6), namely the case $k = 1$, the cost of "switching" $\lambda$ and $\mathbb{E}$ is of scale $\log n$. In our estimates (5.8), the gap factor becomes $\log\binom{n}{k} \le k\log n$, which grows only sub-linearly in $k$; this is reasonable, as we are estimating the sum of the $k$ largest (or smallest) eigenvalues. We shall further compare our estimates to another related work. Tropp et al. [40] introduced a subspace argument based on the Courant–Fischer characterization of eigenvalues to prove tail bounds for all eigenvalues of $Y = \sum_{i=1}^m X^{(i)}$. Though not stated in [40], the following expectation estimates for all eigenvalues can also be established using the subspace argument.
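Given any finite sequence of independent random matrices $\{X^{(i)}\}_{i=1}^m$ under the same assumption as in Theorem 5.1.4, and $Y = \sum_{i=1}^m X^{(i)}$, we have for any $1 \le k \le n$,
\[ \mathbb{E}\lambda_k(Y) \le \inf_{\theta>0}\ \frac{e^{\theta}-1}{\theta}\,\lambda_k(\mathbb{E}Y) + \frac{c}{\theta}\log(n-k+1), \tag{5.10a} \]
\[ \mathbb{E}\lambda_k(Y) \ge \sup_{\theta>0}\ \frac{1-e^{-\theta}}{\theta}\,\lambda_k(\mathbb{E}Y) - \frac{c}{\theta}\log k. \tag{5.10b} \]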
Summing (5.10a) (or (5.10b)) for the $k$ largest (or smallest) eigenvalues, we immediately obtain
\[ \mathbb{E}\sum_{i=1}^{k}\lambda_i(Y) \le \inf_{\theta>0}\ \frac{e^{\theta}-1}{\theta}\sum_{i=1}^{k}\lambda_i(\mathbb{E}Y) + \frac{c}{\theta}\log\prod_{i=1}^{k}(n-i+1), \tag{5.11a} \]
\[ \mathbb{E}\sum_{i=1}^{k}\lambda_{n-i+1}(Y) \ge \sup_{\theta>0}\ \frac{1-e^{-\theta}}{\theta}\sum_{i=1}^{k}\lambda_{n-i+1}(\mathbb{E}Y) - \frac{c}{\theta}\log\prod_{i=1}^{k}(n-i+1). \tag{5.11b} \]
Therefore, our expectation estimates (5.8a) and (5.8b) are sharper for partial sums of eigenvalues, as $\log\binom{n}{k} < \log\prod_{i=1}^{k}(n-i+1)$ for $k > 1$. In particular, if one chooses $k$ to be a fixed proportion of $n$, then $\log\binom{n}{k} = O(k)$, while $\log\prod_{i=1}^{k}(n-i+1) = O(k\log n)$. Our results are then better by a factor of $\log n$.
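The comparison between the two gap terms can be checked numerically. The following is a minimal sketch using only the Python standard library (`math.comb` and `math.perm`, available from Python 3.8); the helper name `gap_terms` is our own and purely illustrative.

```python
from math import comb, log, perm


def gap_terms(n, k):
    """Return (log C(n, k), log prod_{i=1}^k (n - i + 1)) for 1 <= k <= n."""
    ours = log(comb(n, k))      # the gap term log C(n, k)
    # perm(n, k) = n! / (n - k)! = prod_{i=1}^k (n - i + 1)
    subspace = log(perm(n, k))  # the gap term appearing in (5.11a) and (5.11b)
    return ours, subspace


for n in (10, 100, 1000):
    k = n // 2  # k is a fixed proportion of n
    ours, subspace = gap_terms(n, k)
    # Print both gap terms and their ratio, which grows with n.
    print(n, k, round(ours, 2), round(subspace, 2), round(subspace / ours, 2))
```

For $k = n/2$ the first term stays below $n\log 2 = 2k\log 2$, while the second is of order $k\log n$, so the printed ratio increases with $n$.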
Finally, we remark that by combining Theorem 5.3.1 with the subspace argument in [40], one should be able to derive similar expectation estimates and tail bounds for the sum of arbitrary successive eigenvalues of $Y = \sum_{i=1}^m X^{(i)}$. We leave this potential extension to future work.
In particular, $\mathrm{Tr}_1[A] = \mathrm{Tr}[A]$ is the normal trace of $A$, and $\mathrm{Tr}_n[A] = \det[A]$ is the determinant of $A$. If we write $A_{(i_1\cdots i_k,\, i_1\cdots i_k)}$ for the $k\times k$ principal submatrix of $A$ corresponding to the indices $i_1, i_2, \dots, i_k$, then an equivalent definition of the $k$-trace of $A$ is given by
\[ \mathrm{Tr}_k[A] = \sum_{1\le i_1<i_2<\cdots<i_k\le n} \det\big[A_{(i_1\cdots i_k,\, i_1\cdots i_k)}\big], \quad 1\le k\le n. \tag{5.13} \]
Using the second definition (5.13), one can check that for any $1 \le k \le n$, the $k$-trace enjoys the cyclic invariance property, like the normal trace and the determinant. That is, for any $A, B \in \mathbb{C}^{n\times n}$, $\mathrm{Tr}_k[AB] = \mathrm{Tr}_k[BA]$.
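Definition (5.13) can be evaluated directly for small matrices. The sketch below is a naive illustration in pure Python with exact integer arithmetic; the helper names `det`, `k_trace` and `matmul` are our own, and the Laplace expansion is chosen for clarity rather than efficiency. It checks the two special cases and the cyclic invariance on one pair of integer matrices.

```python
from itertools import combinations


def det(M):
    """Determinant by Laplace expansion along the first row (exact for integers)."""
    n = len(M)
    if n == 0:
        return 1
    total = 0
    for j in range(n):
        minor = [row[:j] + row[j + 1:] for row in M[1:]]
        total += (-1) ** j * M[0][j] * det(minor)
    return total


def k_trace(A, k):
    """Sum of all k x k principal minors of the square matrix A, as in (5.13)."""
    n = len(A)
    return sum(
        det([[A[i][j] for j in idx] for i in idx])
        for idx in combinations(range(n), k)
    )


def matmul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][l] * B[l][j] for l in range(n)) for j in range(n)]
            for i in range(n)]


A = [[2, 1, 0], [1, 3, 1], [0, 1, 4]]
B = [[1, 2, 0], [0, 1, 1], [3, 0, 2]]

print(k_trace(A, 1) == sum(A[i][i] for i in range(3)))  # Tr_1 is the trace
print(k_trace(A, 3) == det(A))                          # Tr_n is the determinant
for k in (1, 2, 3):
    # Cyclic invariance: Tr_k[AB] = Tr_k[BA]
    print(k, k_trace(matmul(A, B), k) == k_trace(matmul(B, A), k))
```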
The motivation for studying the $k$-trace is to provide effective estimates on the sum of the $k$ largest (or smallest) eigenvalues of, in particular, random Hermitian matrices.
As we know, the sum of the $k$ largest eigenvalues of a Hermitian matrix $A$ (as a variable) is a convex function of $A$. So if $A$ is random, we have, for example, the expectation estimate
\[ \mathbb{E}\sum_{i=1}^{k}\lambda_i(A) \ge \sum_{i=1}^{k}\lambda_i(\mathbb{E}A) \]
by Jensen's inequality. Recall that $\lambda_i(A)$ denotes the $i$th largest eigenvalue of $A$. This provides a lower bound for the left-hand side if we know $\mathbb{E}A$, or a way to bound the right-hand side from above if we can sample $A$. However, an estimate between these two quantities in the inverse direction is more interesting and challenging. For the $k = 1$ case, Tropp [122] related the largest eigenvalue to the trace of the exponential using the observation
\[ \lambda_1(A) \le \log\mathrm{Tr}\exp(A) \le \lambda_1(A) + \log n, \quad A\in\mathbb{H}_n, \tag{5.14} \]
which only introduces a gap of logarithmic scale in the dimension. In particular, the first inequality in (5.14) was applied to the random matrix $A$, and the second was applied to $\mathbb{E}A$. Tropp then applied Lieb's theorem to the intermediate quantity $\mathrm{Tr}\exp(A)$ (more precisely, with $A = H + \log Y$ for some fixed matrix $H$ and some random matrices $Y$) to derive inverse expectation estimates and a series of matrix concentration inequalities. Inspired by Tropp's work, we will develop expectation estimates and tail bounds on the sum of the $k$ largest eigenvalues based on the following analog of (5.14):
\[ \sum_{i=1}^{k}\lambda_i(A) \le \log\mathrm{Tr}_k\exp(A) \le \sum_{i=1}^{k}\lambda_i(A) + \log\binom{n}{k}, \quad A\in\mathbb{H}_n. \tag{5.15} \]
This is actually the starting point of this paper. Naturally, manipulating the intermediate quantity $\mathrm{Tr}_k\exp(A)$ in our estimates requires extending Lieb's theorem to a general $k$-trace version. Note that the sum of the $k$ smallest eigenvalues can be handled in a similar spirit.
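The two-sided bound (5.15) can be illustrated numerically. The sketch below assumes the standard fact that the sum of the $k\times k$ principal minors in (5.13) equals the $k$th elementary symmetric polynomial of the eigenvalues, so that $\mathrm{Tr}_k\exp(A)$ is the sum of $\exp(\lambda_{i_1}+\cdots+\lambda_{i_k})$ over all $k$-subsets of eigenvalues of $A$. Since every quantity in (5.15) depends on $A$ only through its eigenvalues, it suffices to work with a list of real eigenvalues; the function name `k_trace_sandwich` is our own.

```python
from itertools import combinations
from math import comb, exp, log


def k_trace_sandwich(eigs, k):
    """Return the three quantities of (5.15) for a Hermitian matrix with eigenvalues eigs."""
    lam = sorted(eigs, reverse=True)
    top_k = sum(lam[:k])  # sum of the k largest eigenvalues
    # Tr_k[exp(A)]: sum over all k-subsets of exp(sum of the chosen eigenvalues)
    tr_k_exp = sum(exp(sum(subset)) for subset in combinations(lam, k))
    return top_k, log(tr_k_exp), top_k + log(comb(len(lam), k))


eigs = [1.5, -0.3, 0.7, 2.0, 0.0]  # arbitrary illustrative eigenvalues
for k in range(1, len(eigs) + 1):
    lower, middle, upper = k_trace_sandwich(eigs, k)
    print(k, lower <= middle + 1e-9, middle <= upper + 1e-9)
```

The lower bound holds because the largest of the $\binom{n}{k}$ summands is $\exp\big(\sum_{i=1}^{k}\lambda_i(A)\big)$; the upper bound holds because every summand is at most that value, with equality when all eigenvalues coincide.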
Apart from its particular use discussed above, the $k$-trace is of theoretical interest in itself, as it has many interpretations corresponding to different aspects of matrix theory. Writing $D(A^{(1)}, A^{(2)}, \dots, A^{(n)})$ for the mixed discriminant of any $n$ matrices $A^{(1)}, A^{(2)}, \dots, A^{(n)} \in \mathbb{C}^{n\times n}$, we have the identity
\[ \mathrm{Tr}_k[A] = \binom{n}{k}\cdot D(\underbrace{A,\dots,A}_{k},\underbrace{I_n,\dots,I_n}_{n-k}). \]
Also, if we consider the $k$th exterior algebra $\wedge^k(\mathbb{C}^n)$, we can then interpret the $k$-trace of $A$ as
\[ \mathrm{Tr}_k[A] = \mathrm{Tr}_{L(\wedge^k(\mathbb{C}^n))}\, M_0^{(k)}(A), \]
where $\mathrm{Tr}_{L(\wedge^k(\mathbb{C}^n))}$ is the normal trace on the operator space $L(\wedge^k(\mathbb{C}^n))$, and $M_0^{(k)}(A) \in L(\wedge^k(\mathbb{C}^n))$ is defined by $M_0^{(k)}(A)(v_1\wedge v_2\wedge\cdots\wedge v_k) = Av_1\wedge Av_2\wedge\cdots\wedge Av_k$ for any $v_1\wedge v_2\wedge\cdots\wedge v_k \in \wedge^k(\mathbb{C}^n)$. These two interpretations will in fact provide us with important tools for studying the $k$-trace and proving our generalized Lieb's concavity theorems. We will discuss this further in Section 5.6.1 and Section 5.6.2.
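Throughout this work, we will be using the following nice properties of the $k$-trace.
Proposition 5.2.1. For any positive integers $n, k$ with $1 \le k \le n$, the $k$-trace function $\mathrm{Tr}_k[\cdot]$ satisfies the following:
(i) Cyclicity: $\mathrm{Tr}_k[AB] = \mathrm{Tr}_k[BA]$, $A, B \in \mathbb{C}^{n\times n}$.
(ii) Homogeneity: $\mathrm{Tr}_k[\alpha A] = \alpha^k\,\mathrm{Tr}_k[A]$, $A \in \mathbb{C}^{n\times n}$, $\alpha \in \mathbb{C}$.
(iii) Monotonicity: For any $A, B \in \mathbb{H}_n^+$, $\mathrm{Tr}_k[A] \ge \mathrm{Tr}_k[B]$ if $A \succeq B$; $\mathrm{Tr}_k[A] > \mathrm{Tr}_k[B]$ if $A \succ B$. In particular, $\mathrm{Tr}_k[A] \ge 0$, $A \in \mathbb{H}_n^+$.
(iv) Concavity: The function $A \mapsto (\mathrm{Tr}_k[A])^{1/k}$ is concave on $\mathbb{H}_n^+$.
(v) Hölder's Inequality: $\mathrm{Tr}_k[|AB|^r]^{1/r} \le \mathrm{Tr}_k[|A|^p]^{1/p}\,\mathrm{Tr}_k[|B|^q]^{1/q}$, for any $r, p, q \in (0, +\infty]$ with $\frac{1}{p} + \frac{1}{q} = \frac{1}{r}$, and any $A, B \in \mathbb{C}^{n\times n}$.
(vi) Consistency: For any $\tilde{n}$ with $k \le \tilde{n} \le n$, and any $A \in \mathbb{C}^{\tilde{n}\times\tilde{n}}$,
\[ \mathrm{Tr}_k\left[\begin{pmatrix} A & 0 \\ 0 & 0 \end{pmatrix}_{n\times n}\right] = \mathrm{Tr}_k[A]. \]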
Proof. (i), (ii), (iii) and (vi) can be easily verified from the definitions (5.12) and (5.13).
(iv) is a consequence of the general Brunn–Minkowski theorem (Corollary 5.6.3) introduced in Section 5.6.1. (v) is a direct result of expression (5.51) in Section 5.6.2.
In fact, since the normal trace satisfies Hölder's inequality, we have
\[ \mathrm{Tr}_k[|AB|^r]^{1/r} = \mathrm{Tr}\big[\big|M_0^{(k)}(A)\,M_0^{(k)}(B)\big|^r\big]^{1/r} \le \mathrm{Tr}\big[|M_0^{(k)}(A)|^p\big]^{1/p}\,\mathrm{Tr}\big[|M_0^{(k)}(B)|^q\big]^{1/q} = \mathrm{Tr}_k[|A|^p]^{1/p}\,\mathrm{Tr}_k[|B|^q]^{1/q}. \]
We have used multiple properties of the operator $M_0^{(k)}(A)$ introduced in Section 5.6.2.
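Properties (ii), (iii) and (vi) of Proposition 5.2.1 are easy to spot-check on small examples. The following sketch is only a sanity check on specific integer matrices, not a proof; it evaluates the $k$-trace through the principal-minor formula (5.13), and the helper names `det`, `k_trace` and `pad_with_zeros` are our own.

```python
from itertools import combinations


def det(M):
    """Determinant by Laplace expansion along the first row (exact for integers)."""
    if len(M) == 0:
        return 1
    return sum(
        (-1) ** j * M[0][j] * det([row[:j] + row[j + 1:] for row in M[1:]])
        for j in range(len(M))
    )


def k_trace(A, k):
    """Sum of all k x k principal minors of A, as in (5.13)."""
    return sum(
        det([[A[i][j] for j in idx] for i in idx])
        for idx in combinations(range(len(A)), k)
    )


def pad_with_zeros(A, n):
    """Embed A as the top-left block of an n x n matrix that is zero elsewhere."""
    m = len(A)
    return [[A[i][j] if i < m and j < m else 0 for j in range(n)] for i in range(n)]


B = [[2, 1, 0], [1, 3, 1], [0, 1, 4]]   # positive definite (strictly diagonally dominant)
A = [[3, 1, 0], [1, 4, 1], [0, 1, 5]]   # A = B + I, so A - B is positive definite
alpha = 3

for k in (1, 2, 3):
    scaled = [[alpha * x for x in row] for row in B]
    homogeneous = k_trace(scaled, k) == alpha ** k * k_trace(B, k)   # property (ii)
    monotone = k_trace(A, k) > k_trace(B, k)                         # property (iii)
    consistent = k_trace(pad_with_zeros(B, 5), k) == k_trace(B, k)   # property (vi)
    print(k, homogeneous, monotone, consistent)
```

Consistency is visible in the code: any principal submatrix that uses a padded index contains a zero row, so its determinant vanishes and only the minors of the original block contribute.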
A Simplified Notation
As we will be working in the general setting of the $k$-trace, there is no need to specify a particular value of $k$. Therefore, we will sometimes write
\[ \phi(A) = (\mathrm{Tr}_k[A])^{1/k} \]
for notational simplicity. Note that the function $\phi$ also satisfies (i) cyclicity, (iii) monotonicity, (v) Hölder's inequality and (vi) consistency as in Proposition 5.2.1.
But now the map $A \mapsto \phi(A)$ is homogeneous of order 1 and is concave on $\mathbb{H}_n^+$. Abusing notation, we will also refer to the function $\phi$ as the $k$-trace.
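For a function that is homogeneous of order 1, concavity on $\mathbb{H}_n^+$ is equivalent to superadditivity, $\phi(A+B) \ge \phi(A) + \phi(B)$, since $\phi(A+B) = 2\,\phi\big(\tfrac{1}{2}(A+B)\big)$. The sketch below spot-checks this inequality on one pair of positive definite integer matrices. It is an illustration under our own choice of example matrices and helper names (`det`, `k_trace`, `phi`), not a verification of the general statement.

```python
from itertools import combinations


def det(M):
    """Determinant by Laplace expansion along the first row (exact for integers)."""
    if len(M) == 0:
        return 1
    return sum(
        (-1) ** j * M[0][j] * det([row[:j] + row[j + 1:] for row in M[1:]])
        for j in range(len(M))
    )


def k_trace(A, k):
    """Sum of all k x k principal minors of A, as in (5.13)."""
    return sum(
        det([[A[i][j] for j in idx] for i in idx])
        for idx in combinations(range(len(A)), k)
    )


def phi(A, k):
    """phi(A) = (Tr_k[A])^(1/k) for a positive semidefinite matrix A."""
    return k_trace(A, k) ** (1.0 / k)


A = [[2, 1, 0], [1, 3, 1], [0, 1, 4]]    # positive definite
B = [[4, -1, 0], [-1, 2, 0], [0, 0, 1]]  # positive definite
S = [[A[i][j] + B[i][j] for j in range(3)] for i in range(3)]

for k in (1, 2, 3):
    # Superadditivity; for k = 1 it is an equality because the trace is linear.
    print(k, phi(S, k) >= phi(A, k) + phi(B, k) - 1e-12)
```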