
CONCENTRATION OF EIGENVALUE SUMS AND GENERALIZED LIEB’S CONCAVITY THEOREM

5.4 Proof of Concavity Theorems

5.4.1 A First Proof by Matrix Derivative

As Theorem 5.3.4 is sufficient to lead to our concentration results, we provide in this subsection an independent proof of it. As mentioned before, this generalized Lieb’s theorem is a joint consequence of the original Lieb’s theorem and the Alexandrov–Fenchel inequality. However, we will not use Lieb’s theorem directly. Instead, we will use the following lemma, also due to Lieb [67], which is equivalent to Lieb’s theorem. We provide the proof here only to show its connection to Lieb’s theorem.

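Lemma 5.4.1. Given any $A \in \mathbf{H}_n^{++}$ and $C \in \mathbf{H}_n$, define

$$
T = \int_0^\infty (A+\tau I)^{-1} C (A+\tau I)^{-1}\, d\tau, \qquad
R = 2\int_0^\infty (A+\tau I)^{-1} C (A+\tau I)^{-1} C (A+\tau I)^{-1}\, d\tau.
$$

Then for any $B \in \mathbf{H}_n^{+}$, we have

$$
\int_0^1 ds\, \operatorname{Tr}\big[T B^s T B^{1-s}\big] - \operatorname{Tr}\big[RB\big] \le 0. \tag{5.21}
$$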

Proof. By Lieb’s theorem (Theorem 6 [67]), for any $H \in \mathbf{H}_n$, the function $g(t) = \operatorname{Tr}\exp(H+\log(A+tC))$ is concave. This function is also smooth in $t$ for $t$ small enough that $A+tC \in \mathbf{H}_n^{++}$. Thus we have $\frac{\partial^2}{\partial t^2} g(t)\big|_{t=0} = g''(0) \le 0$. Write $B(t) = \exp(H+\log(A+tC))$, and

$$
T(t) = \int_0^\infty (A+tC+\tau I)^{-1} C (A+tC+\tau I)^{-1}\, d\tau, \qquad
R(t) = 2\int_0^\infty (A+tC+\tau I)^{-1} C (A+tC+\tau I)^{-1} C (A+tC+\tau I)^{-1}\, d\tau.
$$

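It is easy to check that $\frac{\partial}{\partial t}\log(A+tC) = T(t)$ and $T'(t) = -R(t)$ by formulas (5.58) and (5.59). Then using the derivative formulas (5.57), (5.58) and (5.59), we have

$$
g'(t) = \int_0^1 ds\, \operatorname{Tr}\big[B(t)^s\, T(t)\, B(t)^{1-s}\big] = \operatorname{Tr}\big[T(t)B(t)\big],
$$

and

$$
g''(t) = \operatorname{Tr}\big[T'(t)B(t)\big] + \int_0^1 ds\, \operatorname{Tr}\big[T(t)\, B(t)^s\, T(t)\, B(t)^{1-s}\big].
$$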

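For any $B \in \mathbf{H}_n^{++}$, we may choose $H = \log B - \log A$, so that $B(0) = \exp(H+\log A) = B$. Noting that $T(0) = T$ and $R(0) = R$, we thus have

$$
-\operatorname{Tr}\big[RB\big] + \int_0^1 ds\, \operatorname{Tr}\big[T B^s T B^{1-s}\big] = g''(0) \le 0.
$$

The extension to $B \in \mathbf{H}_n^{+}$ can be done by continuity.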

We use this variant of Lieb’s theorem since it is more convenient for us to choose arbitrary $B \in \mathbf{H}_n^{++}$ in inequality (5.21). In particular, if we choose $B$ to be diagonal with diagonal entries $b_1, b_2, \dots, b_n$, then Lemma 5.4.1 implies that

$$
\sum_{i=1}^n R_{ii} b_i \ \ge\ \int_0^1 ds \sum_{i=1}^n \sum_{j=1}^n T_{ij}\, b_j^{s}\, T_{ji}\, b_i^{1-s}, \tag{5.22}
$$

which is a critical estimate that we will be using.
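As a sanity check of (5.22), the following minimal Python sketch evaluates both sides in a small special case: $n = 2$, with $A = \operatorname{diag}(a_1, a_2)$ also taken diagonal ($a_1 \ne a_2$) and $C$ the matrix with zero diagonal and $C_{12} = C_{21} = 1$. These choices, the function names, and the closed forms for the $\tau$- and $s$-integrals are assumptions made for this illustration (elementary calculus, not formulas from the text). In this case $T_{11} = T_{22} = 0$ and $T_{12} = T_{21}$, so (5.22) reduces to a scalar inequality.

```python
import math

def log_divided_difference(a1, a2):
    """Integral of 1/((a1+tau)(a2+tau)) over tau in (0, inf), for a1 != a2."""
    return (math.log(a1) - math.log(a2)) / (a1 - a2)

def triple_kernel(a1, a2):
    """Integral of 1/((a1+tau)^2 (a2+tau)) over tau in (0, inf), for a1 != a2."""
    return math.log(a1 / a2) / (a1 - a2) ** 2 - 1.0 / (a1 * (a1 - a2))

def log_mean(b1, b2):
    """Integral of b1^s * b2^(1-s) over s in [0, 1] (the logarithmic mean)."""
    if b1 == b2:
        return b1
    return (b1 - b2) / (math.log(b1) - math.log(b2))

def both_sides(a1, a2, b1, b2):
    """Return (lhs, rhs) of (5.22) for A=diag(a1,a2), B=diag(b1,b2), C=[[0,1],[1,0]]."""
    t12 = log_divided_difference(a1, a2)   # T_12 = T_21; T_11 = T_22 = 0
    r11 = 2.0 * triple_kernel(a1, a2)      # R_11
    r22 = 2.0 * triple_kernel(a2, a1)      # R_22
    lhs = r11 * b1 + r22 * b2              # sum_i R_ii b_i
    rhs = 2.0 * t12 ** 2 * log_mean(b1, b2)  # the (i,j)=(1,2) and (2,1) terms
    return lhs, rhs

if __name__ == "__main__":
    cases = [(1.0, 2.0, 1.0, 1.0), (0.1, 10.0, 10.0, 0.1), (3.0, 0.5, 2.0, 7.0)]
    for a1, a2, b1, b2 in cases:
        lhs, rhs = both_sides(a1, a2, b1, b2)
        print(a1, a2, b1, b2, lhs, rhs, lhs >= rhs - 1e-12)
```

In this special case the two sides coincide when $B = A$, which is consistent with the proof above: choosing $H = \log B - \log A = 0$ makes $g(t) = \operatorname{Tr}(A+tC)$ linear in $t$.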

We now prove a trace inequality using Lemma 5.4.1 and the Alexandrov–Fenchel inequality (Theorem 5.6.1). This inequality can be seen as a generalization of Lemma 5.4.1 from $k = 1$ to all $1 \le k \le n$.

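Lemma 5.4.2. For arbitrary $A \in \mathbf{H}_n^{++}$, $B \in \mathbf{H}_n^{+}$, $C \in \mathbf{H}_n$, let

$$
T = \int_0^\infty (A+\tau I)^{-1} C (A+\tau I)^{-1}\, d\tau, \qquad
R = 2\int_0^\infty (A+\tau I)^{-1} C (A+\tau I)^{-1} C (A+\tau I)^{-1}\, d\tau.
$$

Then we have, for all $1 \le k \le n$,

$$
\int_0^1 \operatorname{Tr}\big[\mathcal{M}_1^{(k)}(T B^s; B^s)\, \mathcal{M}_1^{(k)}(T B^{1-s}; B^{1-s})\big]\, ds - \operatorname{Tr}\big[\mathcal{M}_1^{(k)}(RB; B)\big] \ \le\ \operatorname{Tr}\big[\mathcal{M}_2^{(k)}(TB, TB, B)\big]. \tag{5.23}
$$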

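Proof. We first claim that we only need to consider the case when $B = \Lambda$ is a diagonal matrix with diagonal entries $\lambda_1, \lambda_2, \dots, \lambda_n \ge 0$. Indeed, if $B$ is not diagonal, we consider its eigenvalue decomposition $B = U\Lambda U^*$, where $U \in \mathbb{C}^{n\times n}$ is unitary, $U^*$ denotes its conjugate transpose, and $\Lambda$ is a diagonal matrix whose diagonal entries $\lambda_1, \lambda_2, \dots, \lambda_n$ are the eigenvalues of $B$. Since $B \in \mathbf{H}_n^{+}$, we have $\lambda_1, \lambda_2, \dots, \lambda_n \ge 0$. If we introduce $\tilde{A} = U^*AU$, $\tilde{C} = U^*CU$, $\tilde{T} = U^*TU$, $\tilde{R} = U^*RU$, we have

$$
\tilde{T} = \int_0^\infty (\tilde{A}+\tau I)^{-1} \tilde{C} (\tilde{A}+\tau I)^{-1}\, d\tau, \qquad
\tilde{R} = 2\int_0^\infty (\tilde{A}+\tau I)^{-1} \tilde{C} (\tilde{A}+\tau I)^{-1} \tilde{C} (\tilde{A}+\tau I)^{-1}\, d\tau.
$$

Then using the cyclic invariance of the trace and the product properties (5.47), we have, for example,

$$
\begin{aligned}
&\operatorname{Tr}\big[\mathcal{M}_1^{(k)}(T B^s; B^s)\, \mathcal{M}_1^{(k)}(T B^{1-s}; B^{1-s})\big] \\
&\quad= \operatorname{Tr}\big[\mathcal{M}_1^{(k)}(UU^*TU\Lambda^sU^*; U\Lambda^sU^*)\, \mathcal{M}_1^{(k)}(UU^*TU\Lambda^{1-s}U^*; U\Lambda^{1-s}U^*)\big] \\
&\quad= \operatorname{Tr}\big[\mathcal{M}_0^{(k)}(U)\, \mathcal{M}_1^{(k)}(\tilde{T}\Lambda^s; \Lambda^s)\, \mathcal{M}_0^{(k)}(U^*)\, \mathcal{M}_0^{(k)}(U)\, \mathcal{M}_1^{(k)}(\tilde{T}\Lambda^{1-s}; \Lambda^{1-s})\, \mathcal{M}_0^{(k)}(U^*)\big] \\
&\quad= \operatorname{Tr}\big[\mathcal{M}_1^{(k)}(\tilde{T}\Lambda^s; \Lambda^s)\, \mathcal{M}_0^{(k)}(U^*)\, \mathcal{M}_0^{(k)}(U)\, \mathcal{M}_1^{(k)}(\tilde{T}\Lambda^{1-s}; \Lambda^{1-s})\, \mathcal{M}_0^{(k)}(U^*)\, \mathcal{M}_0^{(k)}(U)\big] \\
&\quad= \operatorname{Tr}\big[\mathcal{M}_1^{(k)}(\tilde{T}\Lambda^s; \Lambda^s)\, \mathcal{M}_1^{(k)}(\tilde{T}\Lambda^{1-s}; \Lambda^{1-s})\big].
\end{aligned}
$$

Applying the same trick to the other terms in inequality (5.23), one can show that (5.23) is equivalent to

$$
\int_0^1 \operatorname{Tr}\big[\mathcal{M}_1^{(k)}(\tilde{T}\Lambda^s; \Lambda^s)\, \mathcal{M}_1^{(k)}(\tilde{T}\Lambda^{1-s}; \Lambda^{1-s})\big]\, ds - \operatorname{Tr}\big[\mathcal{M}_1^{(k)}(\tilde{R}\Lambda; \Lambda)\big] \ \le\ \operatorname{Tr}\big[\mathcal{M}_2^{(k)}(\tilde{T}\Lambda, \tilde{T}\Lambda, \Lambda)\big],
$$

which justifies our claim. In what follows, we will still write $A, C, T, R$ for $\tilde{A}, \tilde{C}, \tilde{T}, \tilde{R}$. We now prove (5.23) with $B = \Lambda$ being diagonal with diagonal entries $\lambda_1, \lambda_2, \dots, \lambda_n \ge 0$. Using the product properties (5.47) and the identities in Lemma 5.6.4, we rewrite the quantity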

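$$
\begin{aligned}
I &\triangleq \int_0^1 ds\, \operatorname{Tr}\big[\mathcal{M}_1^{(k)}(T\Lambda^s; \Lambda^s)\, \mathcal{M}_1^{(k)}(T\Lambda^{1-s}; \Lambda^{1-s})\big] - \operatorname{Tr}\big[\mathcal{M}_1^{(k)}(R\Lambda; \Lambda)\big] \\
&= \int_0^1 ds\, \Big( \operatorname{Tr}\big[\mathcal{M}_1^{(k)}(T\Lambda^s T\Lambda^{1-s}; \Lambda)\big] + \operatorname{Tr}\big[\mathcal{M}_2^{(k)}(T\Lambda^s\Lambda^{1-s}, \Lambda^s T\Lambda^{1-s}; \Lambda)\big] \Big) - \operatorname{Tr}\big[\mathcal{M}_1^{(k)}(R\Lambda; \Lambda)\big] \\
&= \int_0^1 ds\, \Big( \sum_{i=1}^n \sum_{j=1}^n T_{ij}\lambda_j^{s} T_{ji}\lambda_i^{1-s}\, d_i^{(n,k)}
+ \sum_{1\le i,j\le n} \big( T_{ii}\lambda_i\lambda_j^{s} T_{jj}\lambda_j^{1-s} - T_{ji}\lambda_i\lambda_i^{s} T_{ij}\lambda_j^{1-s} \big)\, g_{ij}^{(n,k)} \Big)
- \sum_{i=1}^n R_{ii}\lambda_i\, d_i^{(n,k)}.
\end{aligned}
$$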

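Then replacing $b_i$ by $\lambda_i d_i^{(n,k)}$ in (5.22), we have by Lemma 5.4.1

$$
\sum_{i=1}^n R_{ii}\lambda_i\, d_i^{(n,k)} \ \ge\ \int_0^1 ds \sum_{i=1}^n \sum_{j=1}^n T_{ij}\, \big(\lambda_j d_j^{(n,k)}\big)^{s}\, T_{ji}\, \big(\lambda_i d_i^{(n,k)}\big)^{1-s}.
$$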

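Therefore we have

$$
I \le \int_0^1 ds\, \Big( \sum_{1\le i,j\le n} T_{ij}T_{ji}\lambda_j^{s}\lambda_i^{1-s}\, d_i^{(n,k)}
+ \sum_{1\le i,j\le n} \big( T_{ii}T_{jj}\lambda_i\lambda_j - T_{ji}T_{ij}\lambda_i^{1+s}\lambda_j^{1-s} \big)\, g_{ij}^{(n,k)}
- \sum_{1\le i,j\le n} T_{ij}T_{ji}\, \big(\lambda_j d_j^{(n,k)}\big)^{s} \big(\lambda_i d_i^{(n,k)}\big)^{1-s} \Big).
$$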

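We now investigate the integrand for any $s \in [0,1]$. We have

$$
\begin{aligned}
&\sum_{1\le i,j\le n} T_{ij}T_{ji}\lambda_j^{s}\lambda_i^{1-s}\, d_i^{(n,k)}
+ \sum_{1\le i,j\le n} \big( T_{ii}T_{jj}\lambda_i\lambda_j - T_{ji}T_{ij}\lambda_i^{1+s}\lambda_j^{1-s} \big)\, g_{ij}^{(n,k)}
- \sum_{1\le i,j\le n} T_{ij}T_{ji}\, \big(\lambda_j d_j^{(n,k)}\big)^{s} \big(\lambda_i d_i^{(n,k)}\big)^{1-s} \\
&\quad= \sum_{i=1}^n T_{ii}^2\lambda_i\, d_i^{(n,k)}
+ \sum_{1\le i<j\le n} |T_{ij}|^2 \big( \lambda_j^{s}\lambda_i^{1-s} d_i^{(n,k)} + \lambda_i^{s}\lambda_j^{1-s} d_j^{(n,k)} \big)
+ \sum_{1\le i,j\le n} T_{ii}T_{jj}\lambda_i\lambda_j\, g_{ij}^{(n,k)} \\
&\qquad - \sum_{1\le i<j\le n} |T_{ij}|^2 \big( \lambda_i^{1+s}\lambda_j^{1-s} + \lambda_j^{1+s}\lambda_i^{1-s} \big)\, g_{ij}^{(n,k)}
- \sum_{i=1}^n T_{ii}^2\lambda_i\, d_i^{(n,k)} \\
&\qquad - \sum_{1\le i<j\le n} |T_{ij}|^2 \Big( \lambda_j^{s}\lambda_i^{1-s} \big(d_j^{(n,k)}\big)^{s} \big(d_i^{(n,k)}\big)^{1-s} + \lambda_i^{s}\lambda_j^{1-s} \big(d_i^{(n,k)}\big)^{s} \big(d_j^{(n,k)}\big)^{1-s} \Big) \\
&\quad= \sum_{1\le i,j\le n} T_{ii}T_{jj}\lambda_i\lambda_j\, g_{ij}^{(n,k)}
+ \sum_{1\le i<j\le n} |T_{ij}|^2 \Big( \lambda_j^{s}\lambda_i^{1-s} d_i^{(n,k)} + \lambda_i^{s}\lambda_j^{1-s} d_j^{(n,k)}
- \big( \lambda_i^{1+s}\lambda_j^{1-s} + \lambda_j^{1+s}\lambda_i^{1-s} \big)\, g_{ij}^{(n,k)} \\
&\qquad\qquad - \lambda_j^{s}\lambda_i^{1-s} \big(d_j^{(n,k)}\big)^{s} \big(d_i^{(n,k)}\big)^{1-s} - \lambda_i^{s}\lambda_j^{1-s} \big(d_i^{(n,k)}\big)^{s} \big(d_j^{(n,k)}\big)^{1-s} \Big) \\
&\quad\le \sum_{1\le i,j\le n} T_{ii}T_{jj}\lambda_i\lambda_j\, g_{ij}^{(n,k)} - 2\sum_{1\le i<j\le n} |T_{ij}|^2 \lambda_i\lambda_j\, g_{ij}^{(n,k)} \\
&\quad= \sum_{1\le i,j\le n} \big( T_{ii}T_{jj}\lambda_i\lambda_j - T_{ij}T_{ji}\lambda_i\lambda_j \big)\, g_{ij}^{(n,k)}.
\end{aligned}
$$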

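We have used $g_{ij}^{(n,k)} = g_{ji}^{(n,k)}$ and $g_{ii}^{(n,k)} = 0$. The proof of the last inequality above is as follows. For any $s \in [0,1]$, we have a Hölder-type inequality for scalars:

$$
(a+b)^{s}(c+d)^{1-s} \ \ge\ a^{s}c^{1-s} + b^{s}d^{1-s}, \qquad a, b, c, d \ge 0.
$$

Then using the expansion relations (5.53),

$$
d_i^{(n,k)} = \lambda_j\, g_{ij}^{(n,k)} + g_{ij}^{(n,k+1)}, \qquad d_j^{(n,k)} = \lambda_i\, g_{ij}^{(n,k)} + g_{ij}^{(n,k+1)},
$$

we have

$$
\begin{aligned}
&\lambda_j^{s}\lambda_i^{1-s} d_i^{(n,k)} + \lambda_i^{s}\lambda_j^{1-s} d_j^{(n,k)} - \big( \lambda_i^{1+s}\lambda_j^{1-s} + \lambda_j^{1+s}\lambda_i^{1-s} \big)\, g_{ij}^{(n,k)} \\
&\qquad - \lambda_j^{s}\lambda_i^{1-s} \big(d_j^{(n,k)}\big)^{s} \big(d_i^{(n,k)}\big)^{1-s} - \lambda_i^{s}\lambda_j^{1-s} \big(d_i^{(n,k)}\big)^{s} \big(d_j^{(n,k)}\big)^{1-s} \\
&\quad\le \lambda_j^{s}\lambda_i^{1-s} \big( \lambda_j g_{ij}^{(n,k)} + g_{ij}^{(n,k+1)} \big) + \lambda_i^{s}\lambda_j^{1-s} \big( \lambda_i g_{ij}^{(n,k)} + g_{ij}^{(n,k+1)} \big) \\
&\qquad - \big( \lambda_i^{1+s}\lambda_j^{1-s} + \lambda_j^{1+s}\lambda_i^{1-s} \big)\, g_{ij}^{(n,k)} \\
&\qquad - \lambda_j^{s}\lambda_i^{1-s} \big( \lambda_i^{s}\lambda_j^{1-s} g_{ij}^{(n,k)} + g_{ij}^{(n,k+1)} \big) - \lambda_i^{s}\lambda_j^{1-s} \big( \lambda_j^{s}\lambda_i^{1-s} g_{ij}^{(n,k)} + g_{ij}^{(n,k+1)} \big) \\
&\quad= -2\lambda_i\lambda_j\, g_{ij}^{(n,k)}.
\end{aligned}
$$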
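The scalar step above can be spot-checked numerically. The sketch below is a minimal illustration with hypothetical function names: it tests the Hölder-type inequality on a small grid of positive numbers, and then tests the resulting bound on the coefficient of $|T_{ij}|^2$, treating $\lambda_i$, $\lambda_j$, $g_{ij}^{(n,k)}$ and $g_{ij}^{(n,k+1)}$ as free positive inputs and building $d_i^{(n,k)}$, $d_j^{(n,k)}$ from the expansion relations as quoted in the text.

```python
import itertools

def holder_gap(a, b, c, d, s):
    """(a+b)^s (c+d)^(1-s) - (a^s c^(1-s) + b^s d^(1-s)); expected >= 0."""
    return (a + b) ** s * (c + d) ** (1 - s) - (a ** s * c ** (1 - s) + b ** s * d ** (1 - s))

def bracket(lam_i, lam_j, g, g_next, s):
    """Coefficient of |T_ij|^2 in the integrand; g = g_ij^(n,k), g_next = g_ij^(n,k+1)."""
    d_i = lam_j * g + g_next  # expansion relation for d_i^(n,k)
    d_j = lam_i * g + g_next  # expansion relation for d_j^(n,k)
    u = lam_j ** s * lam_i ** (1 - s)
    v = lam_i ** s * lam_j ** (1 - s)
    return (u * d_i + v * d_j
            - (lam_i ** (1 + s) * lam_j ** (1 - s) + lam_j ** (1 + s) * lam_i ** (1 - s)) * g
            - u * d_j ** s * d_i ** (1 - s)
            - v * d_i ** s * d_j ** (1 - s))

if __name__ == "__main__":
    values = [0.2, 1.0, 3.5]
    exponents = [0.0, 0.3, 0.5, 1.0]
    # Largest amount by which the bracket exceeds -2 * lam_i * lam_j * g on the grid.
    worst = max(bracket(li, lj, g, gn, s) + 2.0 * li * lj * g
                for li, lj, g, gn in itertools.product(values, repeat=4)
                for s in exponents)
    print("largest excess over the bound:", worst)
```

The printed excess should be nonpositive up to rounding. At $s = 0$ and $s = 1$ the Hölder-type inequality is an equality, so the bound is attained there.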

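Finally, using Lemma 5.6.4 again, we have

$$
I \le \int_0^1 ds\, \Big( \sum_{1\le i,j\le n} \big( T_{ii}T_{jj}\lambda_i\lambda_j - T_{ij}T_{ji}\lambda_i\lambda_j \big)\, g_{ij}^{(n,k)} \Big) = \operatorname{Tr}\big[\mathcal{M}_2^{(k)}(T\Lambda, T\Lambda, \Lambda)\big].
$$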

This completes the proof of Lemma 5.4.2.

We are now ready to prove Theorem 5.3.1 with all established results.

Proof of Theorem 5.3.1. We first prove the concavity of the function $f_{H,k}(A) = \big(\operatorname{Tr}_k \exp(H+\log A)\big)^{1/k}$.

Notice that given any $A \in \mathbf{H}_n^{++}$ and any $C \in \mathbf{H}_n$, there exists some $\epsilon > 0$ such that $A+tC \in \mathbf{H}_n^{++}$ for $t \in (-\epsilon, \epsilon)$, and $f_{H,k}(A+tC)$ is continuously differentiable with respect to $t$ on $(-\epsilon, \epsilon)$. In what follows, any function of $t$ is always assumed to be defined on a sufficiently small neighborhood of $0$ (so that $A+tC \in \mathbf{H}_n^{++}$).

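Then the concavity of $f_{H,k}(A)$ on $\mathbf{H}_n^{++}$ is equivalent to the statement that $\frac{\partial^2}{\partial t^2} f_{H,k}(A+tC)\big|_{t=0} \le 0$ for all choices of $A \in \mathbf{H}_n^{++}$, $C \in \mathbf{H}_n$. Now fix a pair $A, C$, and define $B(t) = \exp(H+\log(A+tC)) \in \mathbf{H}_n^{++}$ and $g(t) = \operatorname{Tr}_k \exp(H+\log(A+tC)) = \operatorname{Tr}\big[\mathcal{M}_0^{(k)}(B(t))\big] > 0$. Since $f_{H,k}(A+tC) = g(t)^{1/k}$, and

$$
\frac{\partial^2}{\partial t^2} f_{H,k}(A+tC) = \frac{1}{k}\, g(t)^{\frac{1}{k}-2} \Big( g''(t)\, g(t) - \frac{k-1}{k}\, \big(g'(t)\big)^2 \Big),
$$

we then need to show that $g(0)\, g''(0) \le \frac{k-1}{k}\big(g'(0)\big)^2$. Using the derivative formulas (5.58) and (5.59), we have

$$
\begin{aligned}
\frac{\partial}{\partial t} \log(A+tC) &= \int_0^\infty (A+tC+xI_n)^{-1} C (A+tC+xI_n)^{-1}\, dx \ \triangleq\ T(t), \\
\frac{\partial}{\partial t} T(t) &= -2\int_0^\infty (A+tC+xI_n)^{-1} C (A+tC+xI_n)^{-1} C (A+tC+xI_n)^{-1}\, dx \ \triangleq\ -R(t).
\end{aligned}
$$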

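Then using formula (5.57), we can compute the first derivative

$$
\begin{aligned}
g'(t) &= \frac{\partial}{\partial t} \operatorname{Tr}\big[\mathcal{M}_0^{(k)}(B(t))\big] \\
&= \operatorname{Tr}\big[\mathcal{M}_1^{(k)}(B'(t); B(t))\big] \\
&= \operatorname{Tr}\Big[\mathcal{M}_1^{(k)}\Big( \int_0^1 ds\, B(t)^{s}\, T(t)\, B(t)^{1-s};\ B(t) \Big)\Big] \\
&= \int_0^1 ds\, \operatorname{Tr}\big[\mathcal{M}_1^{(k)}\big( B(t)^{s}\, T(t)\, B(t)^{1-s};\ B(t)^{s} B(t)^{1-s} \big)\big] \\
&= \int_0^1 ds\, \operatorname{Tr}\big[\mathcal{M}_0^{(k)}(B(t)^{s})\, \mathcal{M}_1^{(k)}(T(t); I_n)\, \mathcal{M}_0^{(k)}(B(t)^{1-s})\big] \\
&= \int_0^1 ds\, \operatorname{Tr}\big[\mathcal{M}_1^{(k)}(T(t); I_n)\, \mathcal{M}_0^{(k)}(B(t)^{1-s})\, \mathcal{M}_0^{(k)}(B(t)^{s})\big] \\
&= \operatorname{Tr}\big[\mathcal{M}_1^{(k)}(T(t); I_n)\, \mathcal{M}_0^{(k)}(B(t))\big].
\end{aligned}
$$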

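We have used the fact that $\mathcal{M}_1^{(k)}(X; Y)$ is linear in $X$, so that we can pull the integral sign out. Then the second derivative is

$$
\begin{aligned}
g''(t) &= \frac{\partial}{\partial t} \operatorname{Tr}\big[\mathcal{M}_1^{(k)}(T(t); I_n)\, \mathcal{M}_0^{(k)}(B(t))\big] \\
&= \operatorname{Tr}\big[\mathcal{M}_1^{(k)}(T(t); I_n)\, \mathcal{M}_1^{(k)}(B'(t); B(t))\big] + \operatorname{Tr}\big[\mathcal{M}_1^{(k)}(T'(t); I_n)\, \mathcal{M}_0^{(k)}(B(t))\big] \\
&= \int_0^1 ds\, \operatorname{Tr}\big[\mathcal{M}_1^{(k)}(T(t); I_n)\, \mathcal{M}_0^{(k)}(B(t)^{s})\, \mathcal{M}_1^{(k)}(T(t); I_n)\, \mathcal{M}_0^{(k)}(B(t)^{1-s})\big] \\
&\qquad - \operatorname{Tr}\big[\mathcal{M}_1^{(k)}(R(t); I_n)\, \mathcal{M}_0^{(k)}(B(t))\big].
\end{aligned}
$$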

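Write $T = T(0)$, $R = R(0)$ and $B = B(0)$. We then apply Lemma 5.4.2 to reach

$$
\begin{aligned}
g(0)\, g''(0)
&= \operatorname{Tr}\big[\mathcal{M}_0^{(k)}(B)\big] \Big( \int_0^1 ds\, \operatorname{Tr}\big[\mathcal{M}_1^{(k)}(T; I_n)\, \mathcal{M}_0^{(k)}(B^{s})\, \mathcal{M}_1^{(k)}(T; I_n)\, \mathcal{M}_0^{(k)}(B^{1-s})\big] - \operatorname{Tr}\big[\mathcal{M}_1^{(k)}(R; I_n)\, \mathcal{M}_0^{(k)}(B)\big] \Big) \\
&= \operatorname{Tr}\big[\mathcal{M}_0^{(k)}(B)\big] \Big( \int_0^1 ds\, \operatorname{Tr}\big[\mathcal{M}_1^{(k)}(T B^{s}; B^{s})\, \mathcal{M}_1^{(k)}(T B^{1-s}; B^{1-s})\big] - \operatorname{Tr}\big[\mathcal{M}_1^{(k)}(RB; B)\big] \Big) \\
&\le \operatorname{Tr}\big[\mathcal{M}_0^{(k)}(B)\big]\, \operatorname{Tr}\big[\mathcal{M}_2^{(k)}(TB, TB, B)\big].
\end{aligned}
$$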

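To continue, we use definitions (5.46), identity (5.50) and the Alexandrov–Fenchel inequality (Theorem 5.6.1) to obtain

$$
\begin{aligned}
&\operatorname{Tr}\big[\mathcal{M}_0^{(k)}(B)\big]\, \operatorname{Tr}\big[\mathcal{M}_2^{(k)}(TB, TB, B)\big] \\
&\quad= \frac{n!}{k!\,(n-k)!}\, D(\underbrace{B, \dots, B}_{k}, \underbrace{I_n, \dots, I_n}_{n-k})
\times \frac{n!}{(k-2)!\,(n-k)!}\, D(TB, TB, \underbrace{B, \dots, B}_{k-2}, \underbrace{I_n, \dots, I_n}_{n-k}) \\
&\quad\le \frac{k-1}{k} \Big( \frac{n!}{(k-1)!\,(n-k)!}\, D(TB, \underbrace{B, \dots, B}_{k-1}, \underbrace{I_n, \dots, I_n}_{n-k}) \Big)^{2} \\
&\quad= \frac{k-1}{k} \Big( \operatorname{Tr}\big[\mathcal{M}_1^{(k)}(TB; B)\big] \Big)^{2} \\
&\quad= \frac{k-1}{k}\, \big(g'(0)\big)^{2}.
\end{aligned}
$$

We therefore have proved that

$$
g(0)\, g''(0) \le \frac{k-1}{k}\, \big(g'(0)\big)^{2}.
$$

The concavity of $f_{H,k}(A)$ on $\mathbf{H}_n^{++}$ then follows.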
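The reduction from concavity of $g^{1/k}$ to the inequality $g\,g'' \le \frac{k-1}{k}(g')^2$ rests on the second-derivative identity for $g(t)^{1/k}$ used in this proof. The minimal Python sketch below compares that closed form against a central finite difference. The test function `sample_g`, the step size, and the function names are assumptions chosen for illustration only; `sample_g` is not the $g$ of the proof.

```python
def second_derivative_fd(func, t, h=1e-3):
    """Central finite-difference approximation of func''(t)."""
    return (func(t + h) - 2.0 * func(t) + func(t - h)) / (h * h)

def root_second_derivative(g0, g1, g2, k):
    """(1/k) g^(1/k - 2) (g'' g - (k-1)/k (g')^2), given g, g', g'' at one point."""
    return (1.0 / k) * g0 ** (1.0 / k - 2.0) * (g2 * g0 - (k - 1.0) / k * g1 ** 2)

def sample_g(t):
    """Hypothetical positive test function with g(0)=3, g'(0)=1, g''(0)=1."""
    return 3.0 + t + 0.5 * t * t - t ** 3

if __name__ == "__main__":
    for k in (1, 2, 3, 4):
        numeric = second_derivative_fd(lambda t: sample_g(t) ** (1.0 / k), 0.0)
        closed_form = root_second_derivative(3.0, 1.0, 1.0, k)
        print(k, numeric, closed_form)
```

The sign of the closed form is the sign of $g''g - \frac{k-1}{k}(g')^2$, since $g > 0$. For instance, $g(t) = (1+t)^k$ gives equality, matching the fact that $g^{1/k} = 1+t$ is linear.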

Next we prove the equivalence of (i) the concavity of the functions $f_{H,k}(A)$ on $\mathbf{H}_n^{++}$ and (ii) the concavity of the functions $\tilde{f}_{H,k}(A) = \log \operatorname{Tr}_k \exp(H+\log A)$ on $\mathbf{H}_n^{++}$. (i) ⇒ (ii) is trivial. To prove (ii) ⇒ (i), we use Lemma 5.6.9 as follows.

Let $x = (x_1, x_2) \in (0,+\infty)^2$. Define $f(x) = \operatorname{Tr}_k \exp\big(H + \log(x_1A_1 + x_2A_2)\big)$. One can easily verify that $f_{H,k}(A)$ being concave on $\mathbf{H}_n^{++}$ is equivalent to $f(x)^{1/k}$ being concave on $(0,+\infty)^2$ for arbitrary but fixed choices of $A_1, A_2 \in \mathbf{H}_n^{++}$, $H \in \mathbf{H}_n$. Similarly, $\tilde{f}_{H,k}(A)$ being concave on $\mathbf{H}_n^{++}$ is equivalent to $\log f(x)$ being concave on $(0,+\infty)^2$ for arbitrary but fixed choices of $A_1, A_2 \in \mathbf{H}_n^{++}$, $H \in \mathbf{H}_n$. Using the definition of the $k$-trace $\operatorname{Tr}_k$, it is easy to check that $f(x)$ is homogeneous of order $k$. By Lemma 5.6.9, we know that $f(x)^{1/k}$ is concave if and only if $\log f(x)$ is concave.

Therefore we have (i)⇔(ii).
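The homogeneity claim can be illustrated at the level of eigenvalues. For $c > 0$ we have $\log(cA) = (\log c) I + \log A$, and $(\log c) I$ commutes with everything, so $\exp(H + \log(cA)) = c\,\exp(H + \log A)$. The claim therefore reduces to $\operatorname{Tr}_k(cB) = c^k \operatorname{Tr}_k(B)$. The sketch below assumes that $\operatorname{Tr}_k$ is the $k$-th elementary symmetric polynomial of the eigenvalues (the definition is not restated in this subsection), and the function names are hypothetical.

```python
from itertools import combinations
from math import isclose, prod

def k_trace_from_eigenvalues(eigs, k):
    """k-th elementary symmetric polynomial of eigs (assumed definition of the k-trace)."""
    return sum(prod(combo) for combo in combinations(eigs, k))

def homogeneity_holds(eigs, k, c):
    """Check Tr_k(c B) = c^k Tr_k(B) for a matrix B with the given eigenvalues."""
    scaled = k_trace_from_eigenvalues([c * x for x in eigs], k)
    return isclose(scaled, c ** k * k_trace_from_eigenvalues(eigs, k), rel_tol=1e-9)

if __name__ == "__main__":
    eigenvalues = [0.5, 2.0, 3.0]  # eigenvalues of a sample positive definite B
    for k in (1, 2, 3):
        print(k, k_trace_from_eigenvalues(eigenvalues, k), homogeneity_holds(eigenvalues, k, 3.0))
```

Under this assumption, $\operatorname{Tr}_k(\cdot)^{1/k}$ is homogeneous of order one, which is the setting in which Lemma 5.6.9 is applied above.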

In document De Huang (pages 158-164)