
1.2.4 Minimizing the maximal eigenvalue of a Hermitian matrix function

(2) If the minimum $\lambda^\star_{\max}$ of $L$ is attained at $(t_0^\star,\dots,t_m^\star)\in\mathbb{R}^{m+1}$ and is a simple eigenvalue of $H^\star := G + t_0^\star H_0 + \dots + t_m^\star H_m$, then there exists an eigenvector $u\in\mathbb{C}^n\setminus\{0\}$ of $H^\star$ associated with $\lambda^\star_{\max}$ satisfying
\[
u^* H_j u = 0 \quad \text{for } j = 0,\dots,m. \tag{1.2.21}
\]
(3) Under the assumptions of (2), we have
\[
\sup\left\{ \frac{u^* G u}{u^* u} \;\middle|\; u\in\mathbb{C}^n\setminus\{0\},\ u^* H_j u = 0,\ j = 0,\dots,m \right\} = \lambda^\star_{\max}. \tag{1.2.22}
\]
In particular, the supremum on the left-hand side of (1.2.22) is a maximum and is attained for the eigenvector $u$ in (2).

Proof. (1) The convexity of $L$ is straightforward to check. Concerning the proof that $L$ has a global minimum, we will show that there exists a constant $\varrho > 0$ such that for all $(t_0,\dots,t_m)$ with $t_0^2+\dots+t_m^2 > \varrho^2$ we have $L(t_0,\dots,t_m) \geq L(0,\dots,0)$. Since the closed ball
\[
B_\varrho := \{(t_0,\dots,t_m)\in\mathbb{R}^{m+1} \mid t_0^2+\dots+t_m^2 \leq \varrho^2\}
\]
with center at the origin and radius $\varrho$ is compact, and since $L$ is continuous (eigenvalues depend continuously on the entries of a matrix), $L$ has a global minimum $\lambda^\star_{\max} \leq L(0,\dots,0)$ on $B_\varrho$. By construction we then have $\lambda^\star_{\max} \leq L(t_0,\dots,t_m)$ for all $(t_0,\dots,t_m)\in\mathbb{R}^{m+1}$, i.e., $\lambda^\star_{\max}$ is the global minimum of $L$. Thus, define
\[
c := \inf\left\{ \lambda_{\max}(\alpha_0 H_0+\dots+\alpha_m H_m) \;\middle|\; (\alpha_0,\dots,\alpha_m)\in\mathbb{R}^{m+1},\ \alpha_0^2+\dots+\alpha_m^2 = 1 \right\}.
\]
Then $c \geq 0$, because by hypothesis the matrix $\alpha_0 H_0+\dots+\alpha_m H_m$ is indefinite for all $(\alpha_0,\dots,\alpha_m)\in\mathbb{R}^{m+1}$ with $\alpha_0^2+\dots+\alpha_m^2 = 1$, i.e., it always has at least one positive eigenvalue. Since the function $f\colon (\alpha_0,\dots,\alpha_m)\mapsto\lambda_{\max}(\alpha_0 H_0+\dots+\alpha_m H_m)$ is continuous (again using the well-known fact that eigenvalues depend continuously on the entries of a matrix), the infimum $c$ is attained by the compactness of the unit sphere in $\mathbb{R}^{m+1}$. This implies $c > 0$, because $f$ only takes positive values on the unit sphere.

Next, define
\[
\varrho := \frac{\lambda_{\max}(G)-\lambda_{\min}(G)}{c} \geq 0.
\]

Let $(t_0,\dots,t_m)\in\mathbb{R}^{m+1}$ and $r \geq \varrho$ be such that $t_0^2+\dots+t_m^2 = r^2 \geq \varrho^2$. Using the fact that for two Hermitian matrices $A, B\in\mathbb{C}^{n\times n}$ we have $\lambda_{\max}(A+B)\geq\lambda_{\max}(A)+\lambda_{\min}(B)$ (see [22]), we obtain
\begin{align*}
L(t_0,\dots,t_m) = \lambda_{\max}(G+t_0H_0+\dots+t_mH_m) &\geq \lambda_{\max}(t_0H_0+\dots+t_mH_m)+\lambda_{\min}(G)\\
&= r\cdot\lambda_{\max}\!\Big(\tfrac{t_0}{r}H_0+\dots+\tfrac{t_m}{r}H_m\Big)+\lambda_{\min}(G)\\
&\geq \varrho\cdot c+\lambda_{\min}(G) = \lambda_{\max}(G) = L(0,\dots,0).
\end{align*}
This finishes the proof of (1).

(2) By step (1), the minimum $\lambda^\star_{\max}$ of $L$ exists, and by assumption it is attained at some point $(t_0^\star,\dots,t_m^\star)\in\mathbb{R}^{m+1}$ and is a simple eigenvalue of the corresponding matrix $G+t_0^\star H_0+\dots+t_m^\star H_m$. Then it follows from Theorem 1.2.17 that $L$ is differentiable at $(t_0^\star,\dots,t_m^\star)$ with partial derivatives
\[
\frac{\partial L}{\partial t_j}(t_0^\star,\dots,t_m^\star) = u^* H_j u, \quad j = 0,\dots,m,
\]
where $u$ is an eigenvector of $G+t_0^\star H_0+\dots+t_m^\star H_m$ associated with $\lambda^\star_{\max}$ satisfying $\|u\| = 1$. Since $\lambda^\star_{\max}$ is the global minimum of $L$, this implies $u^* H_j u = 0$ for $j = 0,\dots,m$.

(3) Let $s^\star$ denote the left-hand side of (1.2.22). We show that $s^\star = \lambda^\star_{\max}$.

``$\geq$'': By (2), there exists an eigenvector $u\in\mathbb{C}^n\setminus\{0\}$ of $G+t_0^\star H_0+\dots+t_m^\star H_m$ associated with $\lambda^\star_{\max}$ satisfying $u^* H_j u = 0$ for $j = 0,\dots,m$. Thus, we obtain
\[
\lambda^\star_{\max} = \frac{u^*(G+t_0^\star H_0+\dots+t_m^\star H_m)u}{u^*u} = \frac{u^* G u}{u^* u},
\]
which implies that $s^\star \geq \lambda^\star_{\max}$.

``$\leq$'': Let $u\in\mathbb{C}^n\setminus\{0\}$ be an arbitrary vector satisfying $u^* H_j u = 0$ for $j = 0,\dots,m$ (by ``$\geq$'' such vectors do exist). Then we obtain
\[
\frac{u^* G u}{u^* u} = \frac{u^*(G+t_0^\star H_0+\dots+t_m^\star H_m)u}{u^* u} \leq \lambda_{\max}(G+t_0^\star H_0+\dots+t_m^\star H_m) = \lambda^\star_{\max}.
\]
Since $u$ was arbitrary, this implies $s^\star \leq \lambda^\star_{\max}$. This completes the proof.
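The gradient formula used in step (2) is easy to check numerically. The following is a minimal sketch assuming NumPy; the random Hermitian matrices are purely illustrative (for random data $\lambda_{\max}$ is almost surely simple, which is all the derivative formula needs). It compares $\partial L/\partial t_j = u^*H_ju$ with central finite differences.

```python
import numpy as np

rng = np.random.default_rng(0)

def random_hermitian(n):
    # Random dense Hermitian matrix (illustrative choice).
    A = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    return (A + A.conj().T) / 2

n, m = 4, 2
G = random_hermitian(n)
H = [random_hermitian(n) for _ in range(m + 1)]

def L(t):
    # L(t_0, ..., t_m) = lambda_max(G + t_0 H_0 + ... + t_m H_m)
    return np.linalg.eigvalsh(G + sum(tj * Hj for tj, Hj in zip(t, H)))[-1]

t = rng.standard_normal(m + 1)
w, V = np.linalg.eigh(G + sum(tj * Hj for tj, Hj in zip(t, H)))
u = V[:, -1]  # unit eigenvector for lambda_max (simple a.s. for random data)

eps = 1e-6
for j in range(m + 1):
    e = np.zeros(m + 1)
    e[j] = eps
    fd = (L(t + e) - L(t - e)) / (2 * eps)      # finite-difference derivative
    print(j, fd, np.real(u.conj() @ H[j] @ u))  # agrees with u* H_j u
```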

Remark 1.2.19. We highlight that the applicability of Theorem 1.2.18 relies heavily on the fact that the eigenvalue $\lambda^\star_{\max}$ is a simple eigenvalue. This need not always be the case, as the following example shows.

Example 1.2.20. Consider the Hermitian $2\times 2$ matrices
\[
G = 0, \quad H_0 = \begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix}, \quad\text{and}\quad H_1 = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}.
\]
Then for $t_0, t_1\in\mathbb{R}$, the matrix
\[
H(t_0,t_1) = G+t_0H_0+t_1H_1 = \begin{bmatrix} t_0 & t_1 \\ t_1 & -t_0 \end{bmatrix}
\]
has the eigenvalues $\pm\sqrt{t_0^2+t_1^2}$, which implies in particular that any nonzero linear combination $\alpha_0 H_0+\alpha_1 H_1$ is indefinite. Moreover, the function $L\colon\mathbb{R}^2\to\mathbb{R}$ given by $L(t_0,t_1) := \lambda_{\max}\big(H(t_0,t_1)\big)$ has its minimum at $(t_0^\star,t_1^\star) = (0,0)$ with value $\lambda^\star_{\max} = 0$, which happens to be a double eigenvalue of the zero matrix $H(0,0)$. Nevertheless, the vector $u = \begin{bmatrix} 1 & i \end{bmatrix}^\top$ is an eigenvector of $H(0,0)$ associated with $\lambda^\star_{\max} = 0$ satisfying $u^* H_0 u = 0 = u^* H_1 u$.
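A quick numerical confirmation of this example (a sketch assuming NumPy):

```python
import numpy as np

H0 = np.array([[1, 0], [0, -1]], dtype=complex)
H1 = np.array([[0, 1], [1, 0]], dtype=complex)

# Eigenvalues of t0*H0 + t1*H1 are +/- sqrt(t0^2 + t1^2), so every
# nonzero combination is indefinite:
t0, t1 = 0.3, -1.2
print(np.linalg.eigvalsh(t0 * H0 + t1 * H1), np.hypot(t0, t1))

# u = (1, i)^T annihilates both quadratic forms, even though 0 is a
# double eigenvalue of H(0, 0) = 0:
u = np.array([1.0, 1.0j])
print(u.conj() @ H0 @ u, u.conj() @ H1 @ u)  # both 0
```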

Example 1.2.20 suggests that statement (2) of Theorem 1.2.18 may still be true even without the hypothesis that $\lambda^\star_{\max}$ is a simple eigenvalue. The next theorem shows that in the pencil case $m = 1$, this is indeed always the case.

Theorem 1.2.21. Let $G, H_0, H_1\in\mathbb{C}^{n\times n}$ be Hermitian matrices. Assume that any linear combination $\alpha_0H_0+\alpha_1H_1$, $(\alpha_0,\alpha_1)\in\mathbb{R}^2\setminus\{0\}$, is indefinite. Then the following statements hold:

(1) The function $L\colon\mathbb{R}^2\to\mathbb{R}$ given by $L(t_0,t_1) := \lambda_{\max}(G+t_0H_0+t_1H_1)$ is convex and has a global minimum $\lambda^\star_{\max}$.

(2) If the minimum $\lambda^\star_{\max}$ of $L$ is attained at $(t_0^\star,t_1^\star)\in\mathbb{R}^2$, then there exists an eigenvector $u\in\mathbb{C}^n\setminus\{0\}$ of $G+t_0^\star H_0+t_1^\star H_1$ associated with $\lambda^\star_{\max}$ satisfying
\[
u^* H_0 u = 0 = u^* H_1 u. \tag{1.2.23}
\]
(3) We have
\[
\sup\left\{ \frac{u^* G u}{u^* u} \;\middle|\; u\in\mathbb{C}^n\setminus\{0\},\ u^* H_0 u = 0,\ u^* H_1 u = 0 \right\} = \min_{t_0,t_1\in\mathbb{R}} L(t_0,t_1) = \lambda^\star_{\max}. \tag{1.2.24}
\]
In particular, the supremum on the left-hand side of (1.2.24) is a maximum and is attained for the eigenvector $u$ in (2).

Proof. In view of Theorem 1.2.18, it remains to prove (2) for the case that $\lambda^\star_{\max}$ is a multiple eigenvalue of $G+t_0^\star H_0+t_1^\star H_1$. Let the columns of $U\in\mathbb{C}^{n\times m}$ form an orthonormal basis of the eigenspace of $G+t_0^\star H_0+t_1^\star H_1$ associated with $\lambda^\star_{\max}$. Moreover, let $\alpha_0, \alpha_1\in\mathbb{R}$ be such that $\alpha_0^2+\alpha_1^2 = 1$. By Theorem 1.2.17, we obtain the existence of the limit defining the one-sided derivative at $t = 0$ of the function
\[
t \mapsto L(t_0^\star+\alpha_0 t,\, t_1^\star+\alpha_1 t) = \lambda_{\max}\big((G+t_0^\star H_0+t_1^\star H_1) + t(\alpha_0 H_0+\alpha_1 H_1)\big),
\]
and this limit must be nonnegative, because there is a global minimum at $(t_0^\star,t_1^\star)$. More precisely, we obtain from Theorem 1.2.17 that
\[
\lambda_{\max}\big(U^*(\alpha_0 H_0+\alpha_1 H_1)U\big) = \lim_{\substack{\varepsilon\to 0\\ \varepsilon>0}} \frac{L(t_0^\star+\alpha_0\varepsilon,\, t_1^\star+\alpha_1\varepsilon)-L(t_0^\star,t_1^\star)}{\varepsilon} \geq 0
\]
for all $\alpha_0,\alpha_1\in\mathbb{R}$ with $\alpha_0^2+\alpha_1^2 = 1$. Thus, for all such $\alpha = (\alpha_0,\alpha_1)$ there exists an eigenvector $x_\alpha\in\mathbb{C}^m$, $\|x_\alpha\| = 1$, associated with $\lambda_{\max}\big(U^*(\alpha_0 H_0+\alpha_1 H_1)U\big)$ such that
\[
x_\alpha^*\, U^*(\alpha_0 H_0+\alpha_1 H_1)U\, x_\alpha = \lambda_{\max}\big(U^*(\alpha_0 H_0+\alpha_1 H_1)U\big) \geq 0. \tag{1.2.25}
\]
We now show the existence of a vector $x\in\mathbb{C}^m$ with $\|x\| = 1$ such that
\[
x^* U^* H_0 U x = 0 = x^* U^* H_1 U x. \tag{1.2.26}
\]
Then $u = Ux$ is the desired eigenvector of $G+t_0^\star H_0+t_1^\star H_1$ satisfying (1.2.23).

Recall that the joint numerical range of two Hermitian matrices $F_1, F_2\in\mathbb{C}^{n\times n}$ is the set
\[
W_0(F_1,F_2) := \{(x^*F_1x,\, x^*F_2x)\in\mathbb{R}^2 \mid x\in\mathbb{C}^n,\ \|x\| = 1\}.
\]
Thus the existence of a vector $x$ with $\|x\| = 1$ satisfying (1.2.26) is equivalent to zero lying in the joint numerical range $W_0 := W_0(U^*H_0U,\, U^*H_1U)$ of the matrices $U^*H_0U$ and $U^*H_1U$. Thus, let us assume that zero is not in $W_0$. Since $W_0$ is a closed convex set [23], by [21, Theorem 4.11, page 51] this implies the existence of $\widetilde\alpha = [\widetilde\alpha_0, \widetilde\alpha_1]^\top\in\mathbb{R}^2\setminus\{0\}$ (without loss of generality we may assume $\widetilde\alpha_0^2+\widetilde\alpha_1^2 = 1$) with
\[
0 > \left\langle \widetilde\alpha, \begin{bmatrix} x^*U^*H_0Ux \\ x^*U^*H_1Ux \end{bmatrix} \right\rangle = x^*U^*(\widetilde\alpha_0 H_0+\widetilde\alpha_1 H_1)Ux
\]
for all $x\in\mathbb{C}^m$ with $\|x\| = 1$, contradicting (1.2.25). Hence, zero is in the joint numerical range of $U^*H_0U$ and $U^*H_1U$, which finishes the proof of (2) and thus of the theorem.

Remark 1.2.22. If $m > 1$ in the above result, then since $0$ is in the joint numerical range of the $m\times m$ Hermitian matrices $U^*H_0U$ and $U^*H_1U$, the Hermitian pencil $zU^*H_0U + U^*H_1U$ is not a definite pencil (see [40] for details). Therefore its eigenvalues do not satisfy the conditions that characterize definite pencils as specified in Theorem 3.2 of [6]. These facts may be used in the numerical computation of the eigenvector $x$ corresponding to $\lambda^\star_{\max}$ such that $x^*U^*H_0Ux = x^*U^*H_1Ux = 0$ when $\lambda^\star_{\max}$ is a multiple eigenvalue of $G+t_0^\star H_0+t_1^\star H_1$.

Remark 1.2.23. Unfortunately, the argument in the proof of Theorem 1.2.21 cannot be generalized to the case $m > 1$, because the joint numerical range of three or more Hermitian matrices need not be convex.

Example 1.2.24. Consider the Hermitian $3\times 3$ matrices $G = \operatorname{diag}(\alpha,\alpha,\beta)$, where $\alpha > \beta \geq 0$, and
\[
H_0 = \begin{bmatrix} 1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & 0 \end{bmatrix}, \quad
H_1 = \begin{bmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}, \quad\text{and}\quad
H_2 = \begin{bmatrix} 0 & i & 0 \\ -i & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}.
\]
Then for $t_0, t_1, t_2\in\mathbb{R}$, the matrix
\[
H(t_0,t_1,t_2) = G+t_0H_0+t_1H_1+t_2H_2 = \begin{bmatrix} \alpha+t_0 & t_1+it_2 & 0 \\ t_1-it_2 & \alpha-t_0 & 0 \\ 0 & 0 & \beta \end{bmatrix}
\]
has the eigenvalues $\beta$ and $\alpha\pm\sqrt{t_0^2+t_1^2+t_2^2}$. Again, any nonzero linear combination $\alpha_0H_0+\alpha_1H_1+\alpha_2H_2$ is indefinite. Similar to Example 1.2.20, the function $L\colon\mathbb{R}^3\to\mathbb{R}$ given by $L(t_0,t_1,t_2) = \lambda_{\max}\big(H(t_0,t_1,t_2)\big)$ has its minimum at $(0,0,0)$ with value $\lambda^\star_{\max} = \alpha$, which happens to be a double eigenvalue of the matrix $H(0,0,0) = G$. In this case, a matrix whose columns form an orthonormal basis of the eigenspace of $H(0,0,0)$ associated with $\alpha$ is the $3\times 2$ matrix $U = [e_1\ e_2]$, where $e_1$ and $e_2$ denote the first two standard basis vectors.

One easily checks that zero is not in the joint numerical range of
\[
U^*H_0U = \begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix}, \quad
U^*H_1U = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}, \quad\text{and}\quad
U^*H_2U = \begin{bmatrix} 0 & i \\ -i & 0 \end{bmatrix} \tag{1.2.27}
\]
and hence no eigenvector $u$ of $H(0,0,0)$ associated with $\lambda^\star_{\max} = \alpha$ satisfies $u^*H_ju = 0$ for $j = 0,1,2$.

Note that in this example, scalar multiples of the third standard basis vector $e_3$ are the only vectors $u$ satisfying $u^*H_ju = 0$ for $j = 0,1,2$, which shows that the left-hand side of (1.2.22) in Theorem 1.2.18 equals $\beta$, which is strictly less than $\alpha = \lambda^\star_{\max}$.

The three Hermitian matrices in (1.2.27) are a classical example of Hermitian matrices whose joint numerical range is not convex [9, 19].
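This can be verified by sampling: for every unit vector $x\in\mathbb{C}^2$ the point $\big(x^*U^*H_0Ux,\ x^*U^*H_1Ux,\ x^*U^*H_2Ux\big)$ lies on the unit sphere in $\mathbb{R}^3$, so the joint numerical range misses zero (a sketch assuming NumPy):

```python
import numpy as np

rng = np.random.default_rng(1)
S = [np.array([[1, 0], [0, -1]], dtype=complex),   # U*H0*U
     np.array([[0, 1], [1, 0]], dtype=complex),    # U*H1*U
     np.array([[0, 1j], [-1j, 0]])]                # U*H2*U

pts = []
for _ in range(2000):
    x = rng.standard_normal(2) + 1j * rng.standard_normal(2)
    x /= np.linalg.norm(x)
    pts.append([np.real(x.conj() @ Sk @ x) for Sk in S])

# Every sampled point of the joint numerical range has norm 1, i.e.,
# the range is the (non-convex) unit sphere and never contains 0.
print(np.allclose(np.linalg.norm(pts, axis=1), 1.0))
```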

Chapter 2

Structured eigenvalue backward errors under the $|||\cdot|||_{w,2}$ norm: Hermitian and related structures

In this chapter, we derive a formula for the backward error of a complex number $\lambda$ when considered as an approximate eigenvalue of a regular Hermitian matrix pencil or polynomial $P(z) = \sum_{j=0}^m z^j A_j$ under Hermitian perturbations measured with respect to the $|||\cdot|||_{w,2}$ norm as defined in (1.2.1). If $P(z)$ is a skew-Hermitian matrix polynomial, then $Q(z) = iP(z)$ is a Hermitian matrix polynomial. Likewise, if $P(z)$ is a $*$-even ($*$-odd) matrix polynomial, then $Q(z) = P(iz)$ (respectively, $Q(z) = iP(iz)$) is a Hermitian matrix polynomial. Due to these facts, structured eigenvalue backward errors of approximate eigenvalues of skew-Hermitian, $*$-even, and $*$-odd matrix pencils and polynomials are also obtained in terms of the appropriate Hermitian eigenvalue backward error. Numerical experiments suggest that in many cases there is a significant difference between the backward errors with respect to perturbations that preserve structure and those with respect to arbitrary perturbations.

2.1 Framework for the Hermitian backward error

Let $P(z) = \sum_{j=0}^m z^j A_j$ be a Hermitian matrix polynomial, i.e., $(A_0,\dots,A_m)\in(\operatorname{Herm}(n))^{m+1}$, and let $\lambda\in\mathbb{C}\setminus\{0\}$ be such that $M = (P(\lambda))^{-1}$ exists. Let $w = (w_0,\dots,w_m)$ be a weight vector. In view of Lemma 1.2.6 we have
\[
\eta^{\mathrm{Herm}}_{w,2}(P,\lambda) = \inf\Big\{ |||(\Delta_0,\dots,\Delta_m)|||_{w,2} \;\Big|\; (\Delta_0,\dots,\Delta_m)\in(\operatorname{Herm}(n))^{m+1},\ \exists\, v_0,\dots,v_m\in\mathbb{C}^n,\ v_\lambda := \textstyle\sum_{j=0}^m \lambda^j v_j \neq 0,\ v_j = \Delta_j M v_\lambda,\ j = 0,\dots,m \Big\}. \tag{2.1.1}
\]

By Theorem 1.2.9, there exist $\Delta_j\in\operatorname{Herm}(n)$ such that $\Delta_j M v_\lambda = v_j$ for $j = 0,\dots,m$ if and only if $v_j^* M v_\lambda\in\mathbb{R}$ for each $j = 0,\dots,m$. Moreover, the minimal 2-norm of all such $\Delta_j$ is $\|v_j\|/\|M v_\lambda\|$.
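A minimal-norm Hermitian mapping of this kind can be written down explicitly. The following sketch, assuming NumPy, builds a Hermitian $\Delta$ with $\Delta x = y$ and $\|\Delta\|_2 = \|y\|/\|x\|$ whenever $x^*y\in\mathbb{R}$; the construction shown is one possible minimizer, not necessarily the one used in the proof of Theorem 1.2.9.

```python
import numpy as np

def min_norm_hermitian_map(x, y, tol=1e-14):
    """Hermitian Delta with Delta @ x = y and spectral norm ||y||/||x||.
    Requires x^* y to be real (the condition from Theorem 1.2.9)."""
    nx = np.linalg.norm(x)
    xh, yh = x / nx, y / nx
    a = np.real(xh.conj() @ yh)   # = x^* y / ||x||^2, real by assumption
    w = yh - a * xh               # component of yh orthogonal to xh
    b = np.linalg.norm(w)
    D = a * np.outer(xh, xh.conj())
    if b > tol:
        z = w / b
        # In the basis (xh, z), D is the 2x2 block [[a, b], [b, -a]],
        # whose spectral norm is sqrt(a^2 + b^2) = ||y|| / ||x||.
        D += b * (np.outer(z, xh.conj()) + np.outer(xh, z.conj()))
        D -= a * np.outer(z, z.conj())
    return D

rng = np.random.default_rng(2)
n = 5
x = rng.standard_normal(n) + 1j * rng.standard_normal(n)
y = rng.standard_normal(n) + 1j * rng.standard_normal(n)
y -= 1j * np.imag(x.conj() @ y) * x / np.linalg.norm(x) ** 2  # force x^* y real

D = min_norm_hermitian_map(x, y)
print(np.allclose(D, D.conj().T), np.allclose(D @ x, y))
print(np.linalg.norm(D, 2), np.linalg.norm(y) / np.linalg.norm(x))  # equal
```

Since any $\Delta$ with $\Delta x = y$ satisfies $\|\Delta\|_2 \geq \|y\|/\|x\|$, attaining this value shows minimality.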

As briefly mentioned in Chapter 1, the aim is to reformulate the original problem of computing $\eta^{\mathrm{Herm}}_{w,2}(P,\lambda)$ into an equivalent optimization problem of maximizing the Rayleigh quotient of a Hermitian matrix under some constraints involving Hermitian matrices. We then convert this reformulated problem into an equivalent problem of minimizing the maximal eigenvalue of a parameter-dependent Hermitian matrix, using the strategy suggested by M. Karow in [26].

In order to illustrate the main ideas, let us first consider the pencil case $m = 1$. Thus, for the moment, assume that $P(z) = A_0 + zA_1$, and for simplicity consider the $|||\cdot|||_{w,2}$ norm (1.2.1) with weight vector $w = (1,1)$. In view of (2.1.1), we need to find vectors $v_0, v_1\in\mathbb{C}^n$ with $v_\lambda := v_0+\lambda v_1 \neq 0$ and matrices $\Delta_0,\Delta_1\in\operatorname{Herm}(n)$ of minimal norm such that
\[
v_0 = \Delta_0 M v_\lambda \quad\text{and}\quad v_1 = \Delta_1 M v_\lambda, \tag{2.1.2}
\]
where $M := P(\lambda)^{-1}$. By Theorem 1.2.9 the minimal norm $|||(\Delta_0,\Delta_1)|||_{w,2}$ for a fixed pair $(v_0,v_1)$ is then given by
\[
|||(\Delta_0,\Delta_1)|||_{w,2}^2 = \|\Delta_0\|^2+\|\Delta_1\|^2 = \frac{\|v_0\|^2}{\|Mv_\lambda\|^2} + \frac{\|v_1\|^2}{\|Mv_\lambda\|^2} = \frac{\|v_0\|^2+\|v_1\|^2}{\|M(v_0+\lambda v_1)\|^2}.
\]

Setting
\[
v := \begin{bmatrix} v_0 \\ v_1 \end{bmatrix} \quad\text{and}\quad G := \begin{bmatrix} M^*M & \lambda M^*M \\ \bar\lambda M^*M & |\lambda|^2 M^*M \end{bmatrix},
\]
and using $\|M(v_0+\lambda v_1)\|^2 = (v_0+\lambda v_1)^* M^*M (v_0+\lambda v_1) = v^*Gv$, we obtain
\[
|||(\Delta_0,\Delta_1)|||_{w,2}^2 = \frac{\|v_0\|^2+\|v_1\|^2}{\|M(v_0+\lambda v_1)\|^2} = \frac{v^*v}{v^*Gv}, \tag{2.1.3}
\]
which is just the reciprocal of the Rayleigh quotient of $v$ with respect to the Hermitian matrix $G$. Since (2.1.3) gives the minimal norm for a fixed pair $(v_0,v_1)$, we now have to minimize (2.1.3) over all admissible pairs $(v_0,v_1)$, i.e., over all pairs for which there exist $\Delta_j\in\operatorname{Herm}(n)$, $j = 0,1$, such that (2.1.2) is satisfied. By Theorem 1.2.9 those are exactly the pairs $(v_0,v_1)$ satisfying
\[
\operatorname{Im}\big(v_0^*M(v_0+\lambda v_1)\big) = 0 = \operatorname{Im}\big(v_1^*M(v_0+\lambda v_1)\big)
\]
and $v_0+\lambda v_1 \neq 0$.
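As a sanity check, the identity $v^*Gv = \|M(v_0+\lambda v_1)\|^2$ behind (2.1.3) is easy to test numerically (a sketch assuming NumPy; $M$ below is a random stand-in for $P(\lambda)^{-1}$):

```python
import numpy as np

rng = np.random.default_rng(3)
n, lam = 4, 0.7 + 0.2j
M = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
MM = M.conj().T @ M

# Block matrix G from the text.
G = np.block([[MM, lam * MM],
              [np.conj(lam) * MM, abs(lam) ** 2 * MM]])

v0 = rng.standard_normal(n) + 1j * rng.standard_normal(n)
v1 = rng.standard_normal(n) + 1j * rng.standard_normal(n)
v = np.concatenate([v0, v1])

lhs = np.linalg.norm(M @ (v0 + lam * v1)) ** 2
rhs = np.real(v.conj() @ G @ v)
print(np.isclose(lhs, rhs))  # True: v*Gv = ||M(v0 + lam*v1)||^2
```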

Setting
\[
H_0 := i\begin{bmatrix} M-M^* & \lambda M \\ -\bar\lambda M^* & 0 \end{bmatrix}
\quad\text{and}\quad
H_1 := i\begin{bmatrix} 0 & -M^* \\ M & \lambda M-\bar\lambda M^* \end{bmatrix},
\]
these identities can be reformulated as
\begin{align}
0 = -2\operatorname{Im}\big(v_0^*M(v_0+\lambda v_1)\big)
&= i\Big( v_0^*M(v_0+\lambda v_1) - \big(M(v_0+\lambda v_1)\big)^*v_0 \Big) \notag\\
&= i\left( \begin{bmatrix} v_0 \\ v_1 \end{bmatrix}^*\begin{bmatrix} M & \lambda M \\ 0 & 0 \end{bmatrix}\begin{bmatrix} v_0 \\ v_1 \end{bmatrix} - \begin{bmatrix} v_0 \\ v_1 \end{bmatrix}^*\begin{bmatrix} M^* & 0 \\ \bar\lambda M^* & 0 \end{bmatrix}\begin{bmatrix} v_0 \\ v_1 \end{bmatrix} \right) = v^*H_0v \tag{2.1.4}
\end{align}
and
\begin{align}
0 = -2\operatorname{Im}\big(v_1^*M(v_0+\lambda v_1)\big)
&= i\Big( v_1^*M(v_0+\lambda v_1) - \big(M(v_0+\lambda v_1)\big)^*v_1 \Big) \notag\\
&= i\left( \begin{bmatrix} v_0 \\ v_1 \end{bmatrix}^*\begin{bmatrix} 0 & 0 \\ M & \lambda M \end{bmatrix}\begin{bmatrix} v_0 \\ v_1 \end{bmatrix} - \begin{bmatrix} v_0 \\ v_1 \end{bmatrix}^*\begin{bmatrix} 0 & M^* \\ 0 & \bar\lambda M^* \end{bmatrix}\begin{bmatrix} v_0 \\ v_1 \end{bmatrix} \right) = v^*H_1v. \tag{2.1.5}
\end{align}
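The identities (2.1.4) and (2.1.5) can be checked the same way (a sketch assuming NumPy, again with a random stand-in for $M$):

```python
import numpy as np

rng = np.random.default_rng(4)
n, lam = 4, 0.7 + 0.2j
M = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
Ms, Z = M.conj().T, np.zeros((n, n), dtype=complex)

H0 = 1j * np.block([[M - Ms, lam * M],
                    [-np.conj(lam) * Ms, Z]])
H1 = 1j * np.block([[Z, -Ms],
                    [M, lam * M - np.conj(lam) * Ms]])

v0 = rng.standard_normal(n) + 1j * rng.standard_normal(n)
v1 = rng.standard_normal(n) + 1j * rng.standard_normal(n)
v = np.concatenate([v0, v1])
w = M @ (v0 + lam * v1)

# v*H0*v = -2 Im(v0* M (v0 + lam*v1)), and likewise for H1:
print(np.isclose(v.conj() @ H0 @ v, -2 * np.imag(v0.conj() @ w)))
print(np.isclose(v.conj() @ H1 @ v, -2 * np.imag(v1.conj() @ w)))
```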

Observe that $v^*Gv = \|M(v_0+\lambda v_1)\|^2 \neq 0$ if and only if $v_0+\lambda v_1 \neq 0$. Thus, it follows from (2.1.1) that for $S = (\operatorname{Herm}(n))^2$ we have

\begin{align}
\eta^{S}_{w,2}(P,\lambda)^2
&= \inf\Big\{ |||(\Delta_0,\Delta_1)|||_{w,2}^2 \;\Big|\; \Delta_j\in\operatorname{Herm}(n),\ \exists\, v_0,v_1\in\mathbb{C}^n:\ \lambda v_1+v_0 \neq 0,\ v_j = \Delta_jM(\lambda v_1+v_0),\ j = 0,1 \Big\} \notag\\
&= \inf\left\{ \frac{v^*v}{v^*Gv} \;\middle|\; v\in\mathbb{C}^{2n},\ v^*Gv \neq 0,\ v^*H_0v = 0,\ v^*H_1v = 0 \right\} \notag\\
&= \left( \sup\left\{ \frac{v^*Gv}{v^*v} \;\middle|\; v\in\mathbb{C}^{2n}\setminus\{0\},\ v^*H_0v = 0,\ v^*H_1v = 0 \right\} \right)^{-1}. \tag{2.1.6}
\end{align}
Note that in the latter identity the condition $v^*Gv \neq 0$ could be dropped, because $\eta^{S}_{w,2}(P,\lambda)$ is finite, which implies that the supremum in (2.1.6) is positive. Therefore, including vectors $v$ satisfying $v^*Gv = 0$ does not change the supremum of the considered set.

From these observations, we see that the structured backward error $\eta^{S}_{w,2}(P,\lambda)$ can be computed by maximizing a Rayleigh quotient under two constraints. Since for Hermitian matrices the maximum of the Rayleigh quotient equals the maximal eigenvalue, the idea is to introduce the function
\[
L\colon\mathbb{R}^2\to\mathbb{R}, \quad (t_0,t_1)\mapsto\lambda_{\max}(G+t_0H_0+t_1H_1), \tag{2.1.7}
\]
and invoke Theorem 1.2.18. This will establish that the supremum in (2.1.6) coincides with the global minimum of $L$ and yield a formula for computing $\eta^{S}_{w,2}(P,\lambda)$. With this objective in mind, we will prove that the matrices $G$, $H_0$, and $H_1$ satisfy the conditions of Theorem 1.2.18.
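Putting the pieces together, the pencil-case backward error can be computed numerically by minimizing the convex function (2.1.7), since $\eta^{S}_{w,2}(P,\lambda) = (\lambda^\star_{\max})^{-1/2}$ by (2.1.6). The following is a minimal sketch assuming NumPy/SciPy and assuming the hypotheses of Theorem 1.2.18 hold for the instance at hand (the section goes on to prove they do for these $G$, $H_0$, $H_1$); a derivative-free method is used because $L$ is nonsmooth where $\lambda_{\max}$ is a multiple eigenvalue.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(5)
n, lam = 4, 0.7 + 0.2j

def random_hermitian(n):
    A = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    return (A + A.conj().T) / 2

A0, A1 = random_hermitian(n), random_hermitian(n)   # P(z) = A0 + z*A1
M = np.linalg.inv(A0 + lam * A1)                    # M = P(lam)^{-1}
Ms, MM, Z = M.conj().T, M.conj().T @ M, np.zeros((n, n), dtype=complex)

G = np.block([[MM, lam * MM], [np.conj(lam) * MM, abs(lam) ** 2 * MM]])
H0 = 1j * np.block([[M - Ms, lam * M], [-np.conj(lam) * Ms, Z]])
H1 = 1j * np.block([[Z, -Ms], [M, lam * M - np.conj(lam) * Ms]])

def L(t):
    # L(t0, t1) = lambda_max(G + t0*H0 + t1*H1), cf. (2.1.7)
    return np.linalg.eigvalsh(G + t[0] * H0 + t[1] * H1)[-1]

res = minimize(L, np.zeros(2), method="Nelder-Mead",
               options={"xatol": 1e-12, "fatol": 1e-14})
eta = 1 / np.sqrt(res.fun)   # eta = (lambda_max^star)^(-1/2) by (2.1.6)
print(res.x, res.fun, eta)
```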

In the next section, we extend these ideas from the pencil case to obtain backward errors of approximate eigenvalues of Hermitian polynomials under structure-preserving perturbations.

2.2 Structured eigenvalue backward errors of Hermit-