vol9_pp112-117. 107KB Jun 04 2011 12:06:16 AM

(1)

VARIATIONAL CHARACTERIZATIONS OF THE SIGN-REAL

AND THE SIGN-COMPLEX SPECTRAL RADIUS∗

SIEGFRIED M. RUMP†

Key words. Generalized spectral radius, sign-real spectral radius, sign-complex spectral radius, Perron-Frobenius theory.

AMS subject classifications.15A48, 15A18

Abstract. The sign-real and the sign-complex spectral radius, also called the generalized spec-tral radius, proved to be an interesting generalization of the classical Perron-Frobenius theory (for nonnegative matrices) to general real and to general complex matrices, respectively. Especially the generalization of the well-known Collatz-Wielandt max-min characterization shows one of the many one-to-one correspondences to classical Perron-Frobenius theory. In this paper the corresponding inf-max characterization as well as variational characterizations of the generalized (real and com-plex) spectral radius are presented. Again those are almost identical to the corresponding results in classical Perron-Frobenius theory.

1. Introduction. _DenoteR₊_:=_{_x_≥_{0 :} _x_∈R_}_{, and let}K_{∈ {}R₊_,R_,C_}_{. The}

generalized spectral radius is defined [6] by

ρK(A) := m ax{|λ|: ∃0=x∈Kn_, _∃_λ_∈K_, _|_Ax_|₌_|_λx_|} _for _A_∈_M_n₍K₎_.

(1.1)

Note that absolute value and comparison of matrices and vectors are always to be

understood componentwise. For example, A ≤ |C| for A ∈ Mn(R), C ∈ Mn(C) is

equivalent toAij ≤ |Cij|for alli, j.

ForK₌R₊ _{the quantity in (1.1) is the classical Perron root, for} K_{∈ {}R_,C_} _it

is the sign-real or sign-complex spectral radius, respectively. Note that the quantities

are only defined for matrices over the specific set K_{, and also note that for}_ρR _the

maximum|λ| is only taken over realλ and realx. Vectors 0=x∈Kn _{and scalars}

λ∈K_{satisfying the nonlinear eigenequation} _|_Ax_| ₌_|_λx_| _{are also called generalized}

eigenvectors and generalized eigenvalues, respectively.

Denote the set of signature matrices overK_by_S₍K_{), which are diagonal matrices}

S with |Sii|= 1 for all i. In short notation S ∈ S(K) :⇔ S ∈Mn(K) and|S|=I.

ForK₌R₊ _{this is just the identity matrix}_I_{, for}K₌R_{the set of}_S _{= diag(}_±_{1) or}

diagonal orthogonal, and forK₌C_{the set of diagonal unitary matrices. Obviously,}

for y ∈ Kn _{there is} _S _{∈ S}₍K_{) with} _Sy _≥ _{0. In case} _|_y_| _> _{0, this} _S _{is uniquely}

determined. Note thatS−1₌_S∗_{∈ S}₍_K_{) for all}_S_{∈ S}₍_K_).

By definition (1.1) there is y∈Kn _with_|_Ay_|₌_|_ry_|₌_r_|_y_| _for_r_:=_ρK₍_A_{), and}

therefore forK_{∈ {}R₊_,R_,C_}_,

∃S∈ S(K₎_, _∃₀₌_y_∈Kn _: _SAy₌_ry

(1.2)

∗_{Received by the editors on 20 April 2002. Final manuscript accepted on 7 June 2002. Handling} editor: Ludwig Elsner.

†_{Institut f¨}_{ur Informatik III, Technical University Hamburg-Harburg, Schwarzenbergstr. 95, 21071} Hamburg, Germany ([email protected]).

(2)

and

∃S1, S2∈ S(K), ∃x≥0, x= 0 : S1AS2x=rx.

(1.3)

Among the variational characterizations of the Perron root are

max

x≥0 xmini=0

(Ax)i xi

=ρR+₍_A_{) =}_ρ₍_A_{) = inf}

x>0maxi

(Ax)i xi

for A≥0

(1.4)

and

max

x≥0 miny≥0 yT

x=0 yT_Ax

yT_x =ρ(A) = m in_y_≥₀ max_x_≥₀ yT

x=0 yT_Ax

yT_x for A≥0.

The purpose of this paper is to prove a generalization of both characterizations for the generalized spectral radius.

We note that the only non-obvious property of the generalized spectral radius we use is [6, Corollary 2.4]

ρK(A[µ])≤ρK(A) forK_{∈ {}R₊_,R_,C_}_{, A}_∈_M_n₍K_{) and}_µ_{⊆ {}₁_{, . . . , n}_}_.

(1.5)

2. Variational characterizations. _{For the following results we need three}

preparatory lemmata, the first showing that for K _{∈ {}R_,C_} _{there exists a}

gener-alized eigenvector in every orthant.

Lemma 2.1. _Let K_{∈ {}R_,C_} _and_A_∈_M_n₍K₎_{be given. Then}

∀S∈ S(K₎_, _∃₀₌_z_∈Kn_, _∃_λ_∈R₊ _: _Sz_≥₀_, _|_Az_|₌_λ_|_z_|_.

Remark 2.2. _{The condition} _Sz _≥_{0 for} _z _∈ Kn _means_Sz _∈Rn _and _Sz _≥_0,

or shortly Sz ∈ Rn

+. Note that Lemma 2.1 is also true for K= R+, in which case

S∈ S(K_{) implies}_S₌_I_.

Proof of Lemma 2.1. Let fixed S ∈ S(K_{) be given and define} _O _:= _{_z _∈ Kn _:

z1= 1, Sz≥0}. The setOis nonempty, compact and convex. If there exists some

z ∈ O withAz = 0 we are finished withλ= 0. SupposeAz = 0 for allz ∈ O and

define ϕ(x) := Ax−11·S∗|Ax|. Then ϕ is well-defined and continuous on O, and

ϕ: O → O, such that by Brouwer’s theoremthere exists a fixed point z∈ O with

ϕ(z) =Az1−1·S∗|Az|=z. Then|Az|=λSz=λ|z|withλ=Az1.

The next lemma states a property of vectors out of the interior of a certain orthant.

Lemma 2.3. _Let K_{∈ {}R_,C_}_,_A_∈_M_n₍K₎_{and define} _r_:=_ρK₍_A_{). Then}

∀S∈ S(K₎_, _∀_{ε >}₀_, _∃_z_∈Kn _: _{Sz >}₀_, _|_Az_{| ≤}₍_r₊_ε₎_{· |}_z_|_.

Proof. We proceed by induction. Forn= 1, it isr=|A11| ∈R+, andz:= sign(S11)∈

K_{does the job. Suppose the lemma is proved for dimension less than} _n_{. For given}

(3)

and |Az| = λ|z|. Then λ ≤ r by definition (1.1). If Sz > 0 we are finished. Let

µ:={j: zj = 0}and let µ:={1, . . . , n}\µ such that

T U V W x 0 =λ x 0 with

T =A[µ], U =A[µ, µ], V =A[µ, µ], W =A[µ], z[µ] =x and z[µ] = 0.

Then|T x|=λ|x|,V x= 0 and|x|>0.

By the induction hypothesis there existsy′_∈_K|µ|_with_S_[_µ_]_y′_>_{0 and}

|W y′| ≤ (ρK(W) +ε)|y′| ≤ (r+ε)|y′|,

where the latter inequality follows by (1.5). Define

α:=    min i xi

(U y′₎ i

forU y′_{= 0}

1 otherwise,

and sety:=αy′_{. Then}_|_y_|_>_{0 and}

A· x εy =

T x+εU y εW y ≤

λ|x|+εα|U y′_| εα(r+ε)|y′_|

≤(r+ε)

|x|

ε|y|

.

The above lemma is obviously not true when replacingr+εbyr, as the example

A=

1 1

0 1

withρK(A) = 1 forK_{∈ {}R₊_,R_,C_} _{shows. It is, at least for}K₌R_,

also not valid for irreducible|A|. Consider

A=

 

0 1 1

−1 0 1

−1 −1 0

 .

It has been shown in [5, Lemma 5.6] thatρR(A) = 1. We show that|Au| ≤uis not

possible foru >0. Setu:= (x, y, z)T_{, then} _|_Au_{| ≤}_u_{is equivalent to}

−x ≤ y+z ≤ x

−y ≤ −x+z ≤ y

−z ≤ −x−y ≤ z.

The second and third row imply that

x≤y+z and y≤ −x+z,

and by the first and second row,

x=y+z and y=−x+z

(4)

Third, we need a generalization of a theoremby Collatz [3, Section 2] to the complex case.

Lemma 2.4. _Let _A _∈_M_n₍C_), _A∗_z ₌_λz _for ₀ ₌ _z _∈Rn_, _λ_∈ C_{. Then for all}

x∈Rn _with_|_x_|_>₀ _and_x_i_z_i_≥₀ _{for all}_i _{the following estimations hold true:}

min Reµi ≤ Reλ ≤ max Reµi

min Imµi ≤ Imλ ≤ max Imµi,

whereµi:= (Ax)i/xi for 1≤i≤n.

Remark 2.5. _{Note that} _x _{and the left eigenvector}_z _of _A _{are assumed to be} real.

Proof of Lemma 2.4. Similar to Collatz’s original proof for the case A ≥0 we note that

i

(λ−µi)xizi=

i

xi(A∗z)i−

i

(Ax)izi=x∗A∗z−z∗Ax= 0,

the latter becausex andz are real. Now xizi are real nonnegative for all i, and by

|x|>0 not all productsxizi can be zero. The assertion follows.

With these preparations we can prove the first two-sided characterization ofρK.

Theorem 2.6. _Let K_{∈ {}R₊_,R_,C_} _and_A_∈_M_n₍K_{). Then}

max

S∈S(K)

max

x∈Kn Sx≥0

min

xi=0

(Ax)i xi =ρ K

(A) = m ax

S∈S(K)

inf

x∈Kn Sx>0 max i

(Ax)i xi . (2.1)

Remark 2.7. _{The characterization is almost identical to the classical}

Perron-Frobenius characterization (1.4). The difference is that for nonnegativeA the

non-negative orthant is the generic one, and vectors xcan be restricted to this generic

orthant. For general real or complex matrices, there is no longer a generic orthant, and therefore the max-min and inf-max characterization is maximized over all orthants.

Note that in the left hand side the two maximums can be replaced by max_x_∈Kn, but

are separated for didactic purposes.

Proof of Theorem 2.6. The result is well-known forK₌R₊_{, and the left equality}

was shown in [5, Theorem3.1] forK₌R_{, and for}K₌C_{it was shown in a different}

context in [4] and [2]; see also [6, Theorem2.3]. We need to prove the right equality for

K_{∈ {}R_,C_}_{. Let}_S_{∈ S}₍K_{) be fixed but arbitrary and denote}_r_:=_ρK₍_A_{). By Lemma}

2.3, for every ε >0 there exists som e x∈ Kn _with _{Sx >} _{0 and} _|_Ax_{| ≤} ₍_r₊_ε₎_|_x_|_,

so thatr is larger than or equal to the r.h.s. of (2.1). We will showris less than or

equal to the r.h.s. of (2.1) to finish the proof. By (1.3) andρK(A∗_{) =}_ρK₍_A_{) there}

isS1, S2∈ S(K) and 0=z∈Rn withz≥0 andS1A∗S2z=rz. Then for anyx∈Kn

withS1x >0, Lemma 2.4 implies that

max i

(Ax)i xi

= m ax

i

((S∗

2AS1∗)·S1x)i

(S1x)i

≥Rer=r.

(5)

Theorem 2.8. _Let K_{∈ {}R₊_,R_,C_} _and_A_∈_M_n₍K_{). Then}

max

S1,S2∈S(K)

max

x∈Kn S1x≥0

min

y∈Kn S2y≥0

|y∗_{| |}_x_|₌₀

|y∗_Ax_|

|y∗_{| |}_x_| =ρ

K

(A) = m ax

S1,S2∈S(K)

min

y∈Kn S2y≥0

max

x∈Kn S1x≥0

|y∗_{| |}_x_|₌₀

|y∗_Ax_|

|y∗_{| |}_x_|. (2.2)

Proof. Let, according to (1.2), SAx = rx for S ∈ S(K_{), 0} ₌ _x _∈ Kn _and

r=ρK(A). Define S1 such thatS1x≥0 and set S2=S1S. Then for everyy ∈Kn

withS2y≥0 and|y∗| |x| = 0, it isS1x=|x|,S2y=|y|,S∗2S1S=I and

y∗Ax=y∗S2∗S1SAx=ry∗S2∗S1x=r|y∗| |x|, or |y∗Ax|=r|y∗| |x|.

That means for the specific choice ofS1,S2 andx, the ratio|y∗Ax|/(|y∗| |x|) is equal

to r independent of the choice ofy providedS2y ≥0. Therefore, both the left and

the right hand side of (2.2) are greater than or equal tor=ρK(A). This proves also

that the extrema are actually achieved.

On the other hand, letS1, S2∈ S(K) andx∈Kn,S1x≥0 be fixed but arbitrarily

given. Denote µ:={j : xj = 0}, k:=|µ|, and µ:={1, . . . , n}\µ. By Lem m a 2.1,

there exists ˜y ∈ Kk _{with ˜}_y _{= 0,} _S₂_[_µ_{] ˜}_y _≥ _{0 and} _|_A∗_[_µ_]_· _y_˜_| ₌ _λ_|_y_˜_| _for _λ _≥ _0.

Thereforeλ≤ρK(A∗_[_µ_{]) =}_ρK₍_A_[_µ_{]). Define}_y _∈_Kn _by_y_[_µ_{] := ˜}_y _and _y_[_µ_{] := 0.}

Then|y∗_{| |}_x_|₌_|_y_[_µ_]∗_{| |}_x_[_µ_]_|_{= 0 and}_x_[_µ_{] = 0 imply that}

|y∗Ax|=|y[µ]∗A[µ]x[µ]| ≤ |y[µ]∗A[µ]| · |x[µ]|=λ|y[µ]∗| |x[µ]|=λ|y∗| |x|.

By (1.5),

|y∗_Ax_|

|y∗_{| |}_x_| ≤λ≤ρ

K

(A).

Therefore, for that choice ofy (depending onS1,S2andx) the left hand side of (2.2)

is less than or equal toρK(A). It remains to prove that the right hand side of (2.2)

is less than or equal toρK(A). LetS1, S2 be given, fixed but arbitrary. By Lemma

2.1, there exists 0=y∈Kn _with_S₂_y_≥_{0 and}_|_A∗_y_|₌_λ_|_y_|_for _λ_∈_R

+. Then for all

x∈Kn_,

|y∗Ax| ≤ |y∗A| |x|=λ|y∗| |x|,

such that for that choice ofy (depending onS1, S2) the ratio|y∗Ax|/(|y∗| |x|) is less

than or equal toλfor allx∈Kn _with_|_y∗_{| |}_x_|_{= 0. It follows that the right hand side}

of (2.2) is less than or equal toλ≤ρK(A∗_{) =}_ρK₍_A_{), and the proof is finished.}

We note that Theorem2.8 and its proof cover the case K₌R₊_{, where in this}

caseS(R₊_{) consists only of the identity matrix. That means for general}_A_≥_0,

max

x≥0 miny≥0 yT

x=0 yT_Ax

yT_x =ρ(A) = m in_y_≥₀ max_x_≥₀ yT

x=0 yT_Ax

(6)

Finally we note that for the classical Perron-Frobenius theory this characteri-zation is mentioned without proof in the classical book by Varga [7] for irreducible matrices. As in other textbooks, the result is referenced as if it were included in [1], where in turn we only found a reference to an internal report.

REFERENCES

[1] G. Birkhoff and R.S. Varga. Reactor Criticality and Nonnegative Matrices.J. Soc. Indust. Appl. Math., 6:354–377, 1958.

[2] B. Cain. private communication, 1998.

[3] L. Collatz. Einschließungssatz f¨ur die charakteristischen Zahlen von Matrizen.Math. Z., 48:221– 226, 1942.

[4] J.C. Doyle. Analysis of Feedback Systems withStructured Uncertainties.IEE Proceedings, Part D, 129:242–250, 1982.

[5] S.M. Rump. Theorems of Perron-Frobenius type for matrices without sign restrictions. Linear Algebra Appl., 266:1–42, 1997.

[6] S.M. Rump. Perron-Frobenius Theory for Complex Matrices, to appear inLinear Algebra Appl., 2002.