Trace Inequalities, I.D. Coope and P.F. Renaud, vol. 10, iss. 4, art. 92, 2009


TRACE INEQUALITIES WITH APPLICATIONS TO

ORTHOGONAL REGRESSION AND MATRIX

NEARNESS PROBLEMS

I.D. COOPE AND P.F. RENAUD

Department of Mathematics and Statistics University of Canterbury

Christchurch, New Zealand

Email: {ian.coope,peter.renaud}@canterbury.ac.nz

Received: 19 February, 2009

Accepted: 17 September, 2009

Communicated by: S.S. Dragomir

2000 AMS Sub. Class.: 15A42, 15A45, 15A51, 65F30.

Key words: Trace inequalities, stochastic matrices, orthogonal regression, matrix nearness problems.

Abstract: Matrix trace inequalities are finding increased use in many areas such as analysis, where they can be used to generalise several well known classical inequalities, and computational statistics, where they can be applied, for example, to data fitting problems. In this paper we give simple proofs of two useful matrix trace inequalities and provide applications to orthogonal regression and matrix nearness problems.


Contents

1 Introduction

2 A Matrix Trace Inequality

3 An Application to Data Fitting

4 A Stronger Result

5 A Matrix Nearness Problem


1. Introduction


2. A Matrix Trace Inequality

The following result contains the basic ideas we need when considering best approximation problems. Although the result is well known, an alternative proof paves the way for the applications which follow.

Theorem 2.1. Let $X$ be an $n \times n$ Hermitian matrix with $\mathrm{rank}(X) = r$ and let $Q_k$ be an $n \times k$ matrix, $k \le r$, with $k$ orthonormal columns. Then, for given $X$, $\mathrm{tr}(Q_k^* X Q_k)$ is maximized when $Q_k = V_k$, where $V_k = [v_1, v_2, \ldots, v_k]$ denotes a matrix of $k$ orthonormal eigenvectors of $X$ corresponding to the $k$ largest eigenvalues.

Proof. Let $X = V D V^*$ be a spectral decomposition of $X$ with $V$ unitary and $D = \mathrm{diag}[\lambda_1, \lambda_2, \ldots, \lambda_n]$, the diagonal matrix of (real) eigenvalues ordered so that

(2.1) $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_n$.

Then,

(2.2) $\mathrm{tr}(Q_k^* X Q_k) = \mathrm{tr}(Z_k^* D Z_k) = \mathrm{tr}(Z_k Z_k^* D) = \mathrm{tr}(P D),$

where $Z_k = V^* Q_k$ and $P = Z_k Z_k^*$ is a projection matrix with $\mathrm{rank}(P) = k$. Clearly, the $n \times k$ matrix $Z_k$ has orthonormal columns if and only if $Q_k$ has orthonormal columns. Now

$\mathrm{tr}(P D) = \sum_{i=1}^{n} P_{ii} \lambda_i$

with $0 \le P_{ii} \le 1$, $i = 1, 2, \ldots, n$, and $\sum_{i=1}^{n} P_{ii} = k$ because $P$ is a Hermitian projection matrix with rank $k$. Hence,

$\mathrm{tr}(Q_k^* X Q_k) \le L,$


where $L$ denotes the maximum value attained by the linear programming problem:

(2.3) $\max \left\{ \sum_{i=1}^{n} \lambda_i p_i \;:\; 0 \le p_i \le 1, \; \sum_{i=1}^{n} p_i = k \right\}.$

An optimal basic feasible solution to the LP problem (2.3) is easily identified (noting the ordering (2.1)) as $p_j = 1$, $j = 1, 2, \ldots, k$; $p_j = 0$, $j = k+1, k+2, \ldots, n$, with $L = \sum_{1}^{k} \lambda_i$. However, $P = E_k E_k^*$ gives $\mathrm{tr}(P D) = L$, where $E_k$ is the matrix with orthonormal columns formed from the first $k$ columns of the $n \times n$ identity matrix; therefore (2.2) provides the required result that $Q_k = V E_k = V_k$ maximizes $\mathrm{tr}(Q_k^* X Q_k)$.

Corollary 2.2. Let $Y$ be an $m \times n$ matrix with $\mathrm{rank}(Y) = r$ and let $Q_k$ be an $n \times k$ matrix, $k \le r$, with $k$ orthonormal columns. Then $\mathrm{tr}(Q_k^* Y^* Y Q_k)$ is maximized when $Q_k = V_k$, where $V_k \in \mathbb{R}^{n \times k}$ denotes a matrix of $k$ orthonormal right singular vectors of $Y$ corresponding to the $k$ largest singular values.

Corollary 2.3. If a minimum rather than maximum is required, then substitute the $k$ smallest eigenvalues/singular values in the above results and reverse the ordering (2.1).
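Theorem 2.1 is easy to verify numerically. The sketch below is not part of the paper; it assumes NumPy and, for simplicity, a real symmetric $X$. It checks that the top-$k$ eigenvector matrix $V_k$ attains the sum of the $k$ largest eigenvalues, and that no random orthonormal $Q_k$ does better.

```python
import numpy as np

rng = np.random.default_rng(0)
n, k = 6, 2

# Random real symmetric X (the real special case of Hermitian).
M = rng.standard_normal((n, n))
X = (M + M.T) / 2

# Eigen-decomposition; numpy.linalg.eigh returns eigenvalues in ascending
# order, so reverse to match the ordering (2.1).
eigvals, eigvecs = np.linalg.eigh(X)
order = np.argsort(eigvals)[::-1]
lam = eigvals[order]
Vk = eigvecs[:, order[:k]]

# tr(V_k^T X V_k) equals the sum of the k largest eigenvalues.
best = np.trace(Vk.T @ X @ Vk)
assert np.isclose(best, lam[:k].sum())

# No random Q_k with orthonormal columns exceeds the bound.
for _ in range(200):
    Q, _ = np.linalg.qr(rng.standard_normal((n, k)))
    assert np.trace(Q.T @ X @ Q) <= best + 1e-9
```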


3. An Application to Data Fitting

Suppose that data is available as a set of $m$ points in $\mathbb{R}^n$ represented by the columns of the $n \times m$ matrix $A$, and it is required to find the best $k$-dimensional linear manifold $L_k \subset \mathbb{R}^n$ approximating the set of points, in the sense that the sum of squares of the distances of each data point from its orthogonal projection onto the linear manifold is minimized. A general point in $L_k$ can be expressed in parametric form as

(3.1) $x(t) = z + Z_k t, \quad t \in \mathbb{R}^k,$

where $z$ is a fixed point in $L_k$ and the columns of the $n \times k$ matrix $Z_k$ can be taken to be orthonormal. The problem is now to identify a suitable $z$ and $Z_k$. Now the orthogonal projection of a point $a \in \mathbb{R}^n$ onto $L_k$ can be written as

$\mathrm{proj}(a, L_k) = z + Z_k Z_k^T (a - z),$

and hence the Euclidean distance from $a$ to $L_k$ is

$\mathrm{dist}(a, L_k) = \|a - \mathrm{proj}(a, L_k)\|_2 = \|(I - Z_k Z_k^T)(a - z)\|_2.$

Therefore, the total least squares data-fitting problem is reduced to finding a suitable $z$ and corresponding $Z_k$ to minimize the sum-of-squares function

$SS = \sum_{j=1}^{m} \|(I - Z_k Z_k^T)(a_j - z)\|_2^2,$

where $a_j$ is the $j$th data point ($j$th column of $A$). A necessary condition for $SS$ to be minimized with respect to $z$ is

$0 = \sum_{j=1}^{m} (I - Z_k Z_k^T)(a_j - z) = (I - Z_k Z_k^T) \sum_{j=1}^{m} (a_j - z),$

which is satisfied if $\sum_{j=1}^{m} (a_j - z)$ lies in the


column space of $Z_k$. The parametric representation (3.1) shows that there is no loss of generality in letting $\sum_{j=1}^{m} (a_j - z) = 0$, or

(3.2) $z = \frac{1}{m} \sum_{j=1}^{m} a_j.$

Thus, a suitable $z$ has been determined, and it should be noted that the value (3.2) solves the zero-dimensional case corresponding to $k = 0$. It remains to find $Z_k$ when $k > 0$, which is the problem:

(3.3) $\min_{Z_k} \sum_{j=1}^{m} \|(I - Z_k Z_k^T)(a_j - z)\|_2^2,$

subject to the constraint that the columns of $Z_k$ are orthonormal and that $z$ satisfies equation (3.2). Using the properties of orthogonal projections and the definition of the vector 2-norm, (3.3) can be rewritten

(3.4) $\min_{Z_k} \sum_{j=1}^{m} (a_j - z)^T (I - Z_k Z_k^T)(a_j - z).$

Ignoring the terms in (3.4) independent of $Z_k$ then reduces the problem to

(3.5) $\max_{Z_k} \sum_{j=1}^{m} \mathrm{tr}\left[ (a_j - z)^T Z_k Z_k^T (a_j - z) \right].$


The introduction of the trace operator in (3.5) is allowed because the argument to the trace function is a matrix with only one element. The commutative property of the trace then shows that problem (3.5) is equivalent to

$\max \; \mathrm{tr} \sum_{j=1}^{m} Z_k^T (a_j - z)(a_j - z)^T Z_k \;\equiv\; \max \; \mathrm{tr}\, Z_k^T \hat{A} \hat{A}^T Z_k,$

where $\hat{A}$ is the matrix

$\hat{A} = [a_1 - z, \; a_2 - z, \; \ldots, \; a_m - z].$

Theorem 2.1 and its corollary then show that the required matrix $Z_k$ can be taken to be the matrix of $k$ left singular vectors of the matrix $\hat{A}$ (right singular vectors of $\hat{A}^T$) corresponding to the $k$ largest singular values.
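The whole fitting procedure — centre the data at $z$ from (3.2), then take $Z_k$ from an SVD of $\hat{A}$ — can be sketched in a few lines of NumPy. This is an illustration, not code from the paper; the data set is synthetic (points scattered about a line in $\mathbb{R}^3$).

```python
import numpy as np

rng = np.random.default_rng(1)
n, m, k = 3, 50, 1

# Synthetic data: columns of A are points near a line through a random centre.
centre = rng.standard_normal((n, 1))
direction = rng.standard_normal(n)
params = rng.standard_normal(m)
A = centre + np.outer(direction, params) + 0.05 * rng.standard_normal((n, m))

z = A.mean(axis=1, keepdims=True)        # (3.2): z is the centroid
Ahat = A - z                             # Ahat = [a_1 - z, ..., a_m - z]
U, s, Vt = np.linalg.svd(Ahat)
Zk = U[:, :k]                            # k left singular vectors of Ahat

def ss(Z):
    """Sum of squared distances from the data to the manifold spanned by Z."""
    residual = (np.eye(n) - Z @ Z.T) @ Ahat
    return (residual ** 2).sum()

# The optimal SS equals the sum of the discarded squared singular values.
assert np.isclose(ss(Zk), (s[k:] ** 2).sum())

# A random orthonormal Z_k never beats the SVD choice.
for _ in range(100):
    Q, _ = np.linalg.qr(rng.standard_normal((n, k)))
    assert ss(Q) >= ss(Zk) - 1e-9
```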

This result shows, not unexpectedly, that the best point lies on the best line, which lies in the best plane, etc. Moreover, the total least squares problem described above clearly always has a solution, although it will not be unique if the $(k+1)$th largest singular value of $\hat{A}$ has the same value as the $k$th largest. For example, if the data points are the 4 vertices of the unit square in $\mathbb{R}^2$,

$A = \begin{bmatrix} 0 & 1 & 1 & 0 \\ 0 & 0 & 1 & 1 \end{bmatrix},$

then any line passing through the centroid of the square is a best line in the total least squares sense, because the matrix $\hat{A}$ for this data has two equal non-zero singular values.
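The degenerate unit-square case can be checked directly (a small NumPy illustration, not from the paper): the centred matrix $\hat{A}$ has both singular values equal to 1, so no direction is preferred.

```python
import numpy as np

# Vertices of the unit square as the columns of A.
A = np.array([[0., 1., 1., 0.],
              [0., 0., 1., 1.]])

z = A.mean(axis=1, keepdims=True)        # centroid (0.5, 0.5)
Ahat = A - z
s = np.linalg.svd(Ahat, compute_uv=False)

# Ahat @ Ahat.T is the 2x2 identity, so both singular values equal 1:
# every line through the centroid is an equally good total least squares fit.
assert np.isclose(s[0], 1.0) and np.isclose(s[1], 1.0)
```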

The total least squares problem above (also referred to as orthogonal regression) has been considered by many authors; see, for example, the discussion in [7, p. 4].


4. A Stronger Result

The proof of Theorem 2.1 relies on maximizing $\mathrm{tr}(D P)$ where $D$ is a (fixed) real diagonal matrix and $P$ varies over all rank-$k$ projections. Since any two rank-$k$ projections are unitarily equivalent, the problem is now to maximize $\mathrm{tr}(D U^* P U)$ (for fixed $D$ and $P$) over all unitary matrices $U$. Generalizing from $P$ to a general Hermitian matrix leads to the following theorem.

Theorem 4.1. Let $A, B$ be $n \times n$ Hermitian matrices. Then

$\max_{U \text{ unitary}} \mathrm{tr}(A U^* B U) = \sum_{i=1}^{n} \alpha_i \beta_i,$

where

(4.1) $\alpha_1 \ge \alpha_2 \ge \cdots \ge \alpha_n, \qquad \beta_1 \ge \beta_2 \ge \cdots \ge \beta_n$

are the eigenvalues of $A$ and $B$ respectively, both similarly ordered.

Clearly, Theorem 2.1 can be recovered, since a projection of rank $k$ has eigenvalue $1$ repeated $k$ times and $0$ repeated $n - k$ times.

Proof. Let $\{e_i\}_{i=1}^{n}$ be an orthonormal basis of eigenvectors of $A$ corresponding to the eigenvalues $\{\alpha_i\}_{i=1}^{n}$, written in descending order. Then

$\mathrm{tr}(A U^* B U) = \sum_{i=1}^{n} \alpha_i (B U e_i, U e_i).$

Let $B = V D V^*$ be a spectral decomposition of $B$, where $D$ is diagonal and $V$ is unitary. Writing $W = V^* U$ gives

$\mathrm{tr}(A U^* B U) = \sum_{i,j=1}^{n} \alpha_i \beta_j p_{ij},$


where the $\beta_j$'s are the elements on the diagonal of $D$, i.e. the eigenvalues of $B$, and $p_{ij} = |(W e_i)_j|^2$.

Note that since $W$ is unitary, the matrix $P = [p_{ij}]$ is doubly stochastic, i.e., it has non-negative entries whose rows and columns sum to 1. The theorem will therefore follow once it is shown that for $\alpha_1 \ge \alpha_2 \ge \cdots \ge \alpha_n$ and $\beta_1 \ge \beta_2 \ge \cdots \ge \beta_n$,

(4.2) $\max_{P} \sum_{i,j=1}^{n} \alpha_i \beta_j p_{ij} = \sum_{i=1}^{n} \alpha_i \beta_i,$

where the maximum is taken over all doubly stochastic matrices $P = [p_{ij}]$.

For fixed $P$ doubly stochastic, let

$\chi = \sum_{i,j=1}^{n} \alpha_i \beta_j p_{ij}.$

If $P \ne I$, choose indices $k < m$ and $l > k$ with $p_{kl} > 0$ and $p_{mk} > 0$, and let $0 < \epsilon \le \min(p_{kl}, p_{mk})$. Then the matrix $P'$ with entries

$p'_{kk} = p_{kk} + \epsilon, \quad p'_{kl} = p_{kl} - \epsilon, \quad p'_{mk} = p_{mk} - \epsilon, \quad p'_{ml} = p_{ml} + \epsilon,$

and $p'_{ij} = p_{ij}$ otherwise, is doubly stochastic.


If we write

$\chi' = \sum_{i,j=1}^{n} \alpha_i \beta_j p'_{ij},$

then

$\chi' - \chi = \epsilon(\alpha_k \beta_k - \alpha_k \beta_l - \alpha_m \beta_k + \alpha_m \beta_l) = \epsilon(\alpha_k - \alpha_m)(\beta_k - \beta_l) \ge 0,$

which means that the term $\sum \alpha_i \beta_j p_{ij}$ is not decreased. Clearly $\epsilon$ can be chosen to reduce a non-diagonal term in $P$ to zero. After a finite number of iterations of this process it follows that $P = I$ maximizes this term. This proves (4.2) and hence Theorem 4.1.
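Theorem 4.1 can also be checked numerically. The sketch below is not from the paper; it assumes NumPy and uses real symmetric matrices as the real special case of Hermitian ones. Aligning the two eigenbases (similarly ordered) attains the bound $\sum \alpha_i \beta_i$, and random orthogonal $U$ never exceed it.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 5

def sym(M):
    """Symmetrize: the real analogue of a Hermitian matrix."""
    return (M + M.T) / 2

A = sym(rng.standard_normal((n, n)))
B = sym(rng.standard_normal((n, n)))

# eigh returns eigenvalues in ascending order for both A and B,
# so the two spectra are similarly ordered, as (4.1) requires.
alpha, Ua = np.linalg.eigh(A)
beta, Ub = np.linalg.eigh(B)
bound = alpha @ beta                     # sum of alpha_i * beta_i

# U = Ub Ua^T maps the eigenbasis of A onto that of B and attains the bound:
# tr(A U^T B U) = tr(D_alpha D_beta).
U = Ub @ Ua.T
assert np.isclose(np.trace(A @ U.T @ B @ U), bound)

# Random orthogonal U never exceeds the bound.
for _ in range(200):
    Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
    assert np.trace(A @ Q.T @ B @ Q) <= bound + 1e-9
```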

Corollary 4.2. If a minimum rather than maximum is required, then reverse the ordering for one of the sets (4.1).

Note that this theorem can also be regarded as a generalization of the classical rearrangement inequality: if $\{\alpha_i\}_{i=1}^{n}$, $\{\beta_i\}_{i=1}^{n}$ are similarly ordered real sequences, then $\sum_{i=1}^{n} \alpha_i \beta_{\sigma(i)} \le \sum_{i=1}^{n} \alpha_i \beta_i$ for any permutation $\sigma$.


5. A Matrix Nearness Problem

Theorem 4.1 also allows us to answer the following problem. If $A, B$ are Hermitian $n \times n$ matrices, what is the smallest distance between $A$ and a matrix $B'$ unitarily equivalent to $B$? Specifically, we have:

Theorem 5.1. Let $A, B$ be Hermitian $n \times n$ matrices with ordered eigenvalues $\alpha_1 \ge \alpha_2 \ge \cdots \ge \alpha_n$ and $\beta_1 \ge \beta_2 \ge \cdots \ge \beta_n$ respectively. Let $\|\cdot\|$ denote the Frobenius norm. Then

(5.1) $\min_{Q \text{ unitary}} \|A - Q^* B Q\| = \left( \sum_{i=1}^{n} (\alpha_i - \beta_i)^2 \right)^{1/2}.$

Proof. Expanding the squared norm gives

$\|A - Q^* B Q\|^2 = \mathrm{tr}(A^2) + \mathrm{tr}(B^2) - 2\,\mathrm{tr}(A Q^* B Q).$

So by Theorem 4.1,

$\min_{Q} \|A - Q^* B Q\|^2 = \sum_{i=1}^{n} \alpha_i^2 + \sum_{i=1}^{n} \beta_i^2 - 2 \sum_{i=1}^{n} \alpha_i \beta_i = \sum_{i=1}^{n} (\alpha_i - \beta_i)^2,$

and the result follows.

An optimal $Q$ for problem (5.1) is clearly given by $Q = V U^*$, where $U$ and $V$ are unitary matrices that diagonalize $A$ and $B$ respectively (with


similarly ordered eigenvalues). This follows because $A = U D_\alpha U^*$, $B = V D_\beta V^*$, where $D_\alpha, D_\beta$ denote the diagonal matrices of eigenvalues $\{\alpha_i\}, \{\beta_i\}$ respectively, and so

$\|A - Q^* B Q\|^2 = \|D_\alpha - U^* Q^* V D_\beta V^* Q U\|^2 = \sum_{i=1}^{n} (\alpha_i - \beta_i)^2 \quad \text{if } Q = V U^*.$
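A numerical check of Theorem 5.1 (an illustration assuming NumPy, with real symmetric matrices standing in for Hermitian ones): the choice $Q = V U^*$ attains the distance $\left(\sum (\alpha_i - \beta_i)^2\right)^{1/2}$, and no random unitary equivalence of $B$ gets closer to $A$.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 5

def sym(M):
    """Symmetrize: the real analogue of a Hermitian matrix."""
    return (M + M.T) / 2

A = sym(rng.standard_normal((n, n)))
B = sym(rng.standard_normal((n, n)))

# Both spectra in ascending (hence similarly ordered) order.
alpha, U = np.linalg.eigh(A)
beta, V = np.linalg.eigh(B)

# Claimed minimum distance from Theorem 5.1.
dist = np.sqrt(((alpha - beta) ** 2).sum())

# The optimal Q = V U^* (here V U^T) attains it.
Q = V @ U.T
assert np.isclose(np.linalg.norm(A - Q.T @ B @ Q, 'fro'), dist)

# No random orthogonal W brings B closer to A.
for _ in range(200):
    W, _ = np.linalg.qr(rng.standard_normal((n, n)))
    assert np.linalg.norm(A - W.T @ B @ W, 'fro') >= dist - 1e-9
```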

Problem (5.1) is a variation on the well-known Orthogonal Procrustes Problem (see, for example, [4], [5]), where an orthogonal (unitary) matrix is sought to solve

$\min_{Q \text{ unitary}} \|A - B Q\|.$

In this case $A$ and $B$ are no longer required to be Hermitian (or even square). A minimizing $Q$ for this problem can be obtained from a singular value decomposition of $B^* A$.


References

[1] I.D. COOPE, On matrix trace inequalities and related topics for products of Hermitian matrices, J. Math. Anal. and Appl., 188(3) (1994), 999–1001.

[2] C. ECKART AND G. YOUNG, The approximation of one matrix by another of lower rank, Psychometrika, 1 (1936), 211–218.

[3] G.H. GOLUB AND C.F. VAN LOAN, An analysis of the total least squares problem, SIAM J. Numer. Anal., 17 (1980), 883–893.

[4] G.H. GOLUB AND C.F. VAN LOAN, Matrix Computations, Johns Hopkins University Press, Baltimore, MD, 3rd edition, 1996.

[5] N.J. HIGHAM, Computing the polar decomposition - with applications, SIAM J. Sci. and Stat. Comp., 7 (1986), 1160–1174.

[6] R.A. HORN AND C.R. JOHNSON, Matrix Analysis, Cambridge University Press, New York, 1985.

[7] S. VAN HUFFEL AND J. VANDEWALLE, The total least squares problem: computational aspects and analysis, SIAM Publications, Philadelphia, PA, 1991.

[8] M.V. JANKOVIC, Modulated Hebb-Oja learning rule - a method for principal subspace analysis, IEEE Trans. on Neural Networks, 17 (2006), 345–356.

[9] JUNG-TAI LIU, Performance of multiple space-time coded MIMO in spatially correlated channels, IEEE Wireless Communications and Networking Conference, 2003, 1 (2003), 349–353.


[11] F. WANG, Toward understanding predictability of climate: a linear stochastic modeling approach, PhD thesis, Texas A&M University, Institute of Oceanography, August, 2003.

[12] L. WOLF, Learning using the Born rule, Technical Report, Computer Science and Artificial Intelligence Laboratory, MIT-CSAIL-TR-2006-036, 2006.

[13] K. ZARIFI AND A.B. GERSHMAN, Performance analysis of blind
