J. KSIAM Vol.14, No.1, 29–34, 2010
A NEW UNDERSTANDING OF THE QR METHOD
CHOHONG MIN
DEPARTMENT OF MATHEMATICS, KYUNG HEE UNIVERSITY, SOUTH KOREA
E-mail address: [email protected]
ABSTRACT. The QR method is one of the most common methods for calculating the eigenvalues of a square matrix; however, understanding it usually requires complicated and sophisticated mathematical reasoning. In this article, we present a simple way to understand the QR method with only minimal mathematical knowledge. A deflation technique is introduced, and its combination with the power iteration leads to extracting all the eigenvectors. The orthogonal iteration is then shown to be compatible with the combination of deflation and power iteration. The connection of the QR method to the orthogonal iteration is then briefly reviewed. Among the many accounts of the QR method, our presentation is unique and easy to understand in that it introduces the orthogonal iteration in terms of deflation and power iteration.
1. INTRODUCTION
Since its inception by Francis [4, 5] and Kublanovskaya [8], the QR method has been the most widely used and most popular method for calculating the eigenvalues of a full matrix.
It has been generalized to a wider range of eigenvalue problems: the QZ method, one of its variants, solves the generalized eigenvalue problem Ax = λBx [7], and a generalization of the QR method called the GR method has been studied by Watkins et al. [11]. For specific applications, the method has been tuned and upgraded; the restarted QR method targets comrade and fellow matrices [3], and the QR method with a balancing technique is used for finding the roots of polynomials [1].
Though its importance cannot be overstated, the QR method is normally deferred to graduate courses, or presented without sufficient explanation of why it works. The enigma stems from the fact that the convergence proof is not trivial at all [6]. Even presenting a simple way to understand the QR method has been a research topic in its own right [10, 2]. Most accounts take the approach of explaining the orthogonal iteration and its connection to the QR method [9, 6, 10], and so does this article. But our presentation is different in explaining the orthogonal iteration as a successive application of the power iteration to deflated matrices. This article thus introduces a simple way to understand the QR method with only minimal mathematical knowledge.
Received by the editors November 4, 2009; Accepted March 8, 2010.
2000 Mathematics Subject Classification. 65F15.
Key words and phrases. eigenvalues, eigenvector, QR method.
2. POWER ITERATION AND DEFLATED MATRIX
Throughout this paper, we assume that a matrix A ∈ C^{n×n} has distinct eigenvalues λ1, λ2, ···, λn with associated eigenvectors x1, x2, ···, xn. The eigenvalues are numbered in decreasing order of modulus, |λ1| > |λ2| > ··· > |λn| > 0.
Theorem 2.1. (Power Iteration) Assume that v^0 in Algorithm 1 is chosen randomly enough to have a nonzero component of the eigenvector x1; then the sequence v^k satisfies

lim_{k→∞} dist(v^k, span(x1)) = 0.
Proof. By the assumption, v^0 = a1 x1 + a2 x2 + ··· + an xn with a1 ≠ 0. v^k in Algorithm 1 is parallel to A v^{k−1} whenever k ≥ 1. Repeating this argument leads to the fact that v^k is parallel to A^k v^0, and v^k = A^k v^0 / ‖A^k v^0‖. Since

A^k v^0 = a1 λ1^k x1 + a2 λ2^k x2 + ··· + an λn^k xn,

we have

lim_{k→∞} v^k = lim_{k→∞} (a1 λ1^k x1 + a2 λ2^k x2 + ··· + an λn^k xn) / ‖a1 λ1^k x1 + a2 λ2^k x2 + ··· + an λn^k xn‖
             = lim_{k→∞} (λ1/|λ1|)^k [x1 + (a2/a1)(λ2/λ1)^k x2 + ··· + (an/a1)(λn/λ1)^k xn] / ‖x1 + (a2/a1)(λ2/λ1)^k x2 + ··· + (an/a1)(λn/λ1)^k xn‖
             = lim_{k→∞} (λ1/|λ1|)^k x1/‖x1‖.

Since |λj/λ1| < 1 for every j ≥ 2, all the terms except the x1 term vanish as k → ∞. Hence v^k approaches span(x1), and lim_{k→∞} dist(v^k, span(x1)) = 0. ∎
Algorithm 1 Power iteration: v = power(A)
Input: an n×n matrix A
  v^0: randomly chosen
  for k = 0, 1, 2, ··· do
    v^{k+1} = A v^k / ‖A v^k‖
  end for
Output: v = v^k for some large k
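Algorithm 1 can be sketched in a few lines of NumPy. This is a minimal illustration, not the paper's code: the function name `power`, the fixed iteration count, and the 2×2 test matrix are our own choices (a practical code would test convergence instead of iterating a fixed number of times).

```python
import numpy as np

def power(A, iters=500, seed=0):
    """Power iteration (Algorithm 1): v^{k+1} = A v^k / ||A v^k||.
    Returns a unit vector approximating the dominant eigenvector."""
    v = np.random.default_rng(seed).standard_normal(A.shape[0])
    v /= np.linalg.norm(v)        # random v0: almost surely a1 != 0
    for _ in range(iters):
        v = A @ v
        v /= np.linalg.norm(v)
    return v

# Example: A has eigenvalues 3 and 1 with eigenvectors e1, e2,
# so the iterates converge to span(e1).
A = np.array([[3.0, 0.0],
              [0.0, 1.0]])
v = power(A)
```

As the proof of Theorem 2.1 shows, the convergence is linear with rate |λ2/λ1|.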
The above theorem shows that the sequence of the power iteration eventually belongs to span(x1). Let us set a unit eigenvector v1 ∈ span(x1) as the vector v^k in the power iteration applied to A for some large k. The next theorem suggests a way to calculate the eigenvector of the next largest eigenvalue.
Theorem 2.2. (Deflated Matrix) Let v1 be a unit eigenvector of A associated with λ1; then the matrix (I − v1 v1^T) A, called the deflated matrix of A with v1, has eigenvalues 0, λ2, ···, λn with corresponding eigenvectors v1, (I − v1 v1^T) x2, ···, (I − v1 v1^T) xn.

Proof. Since (I − v1 v1^T) A v1 = λ1 v1 − λ1 v1 = 0, 0 is an eigenvalue of the deflated matrix with eigenvector v1. The deflated matrix annihilates the v1 component, hence

(I − v1 v1^T) A (I − v1 v1^T) xj = (I − v1 v1^T) A xj = λj (I − v1 v1^T) xj.

Note that whenever j ≠ 1, (I − v1 v1^T) xj = xj − (v1 · xj) v1 cannot be the zero vector; otherwise it would contradict the linear independence of eigenvectors belonging to two different eigenvalues. ∎

Corollary 2.3. Let the power iteration, denoted by (v2^k), operate on the deflated matrix (I − v1 v1^T) A; then

lim_{k→∞} dist(v2^k, span(x1, x2)) = 0.

Proof. The largest eigenvalue of the deflated matrix is λ2, and its associated eigenvector is x2 − (v1 · x2) v1. By Theorem 2.1, as k → ∞, v2^k approaches span(x2 − (v1 · x2) v1), which is a subspace of span(x1, x2). ∎
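Theorem 2.2 is easy to check numerically. The 2×2 matrix below is our own illustrative example; the eigenvector of the largest eigenvalue is computed directly with NumPy rather than by the power iteration.

```python
import numpy as np

A = np.array([[4.0, 1.0],
              [0.0, 2.0]])              # eigenvalues 4 and 2
lam, X = np.linalg.eig(A)
v1 = X[:, np.argmax(np.abs(lam))]       # unit eigenvector of lambda1 = 4
v1 /= np.linalg.norm(v1)

# Deflated matrix (I - v1 v1^T) A: eigenvalue 4 is replaced by 0,
# while the other eigenvalue 2 survives (Theorem 2.2).
B = (np.eye(2) - np.outer(v1, v1)) @ A
mu = np.sort(np.abs(np.linalg.eigvals(B)))
```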
Power iteration obtains a unit eigenvector of the eigenvalue with the largest modulus. The deflation annihilates the largest eigenvalue and exposes the next largest one for the power iteration to pick up. In this way, we set a unit vector v2 ∈ span(x1, x2) as v^k in the power iteration applied to (I − v1 v1^T) A for some large k. We repeatedly apply the power iteration to the deflated matrices to obtain unit vectors v1, v2, ···, vn as listed in Algorithm 2.
Algorithm 2 Successive power iterations on deflated matrices
Input: an n×n matrix A
  v1 = power(A)
  v2 = power((I − v1 v1^T) A)
  ...
  vn = power((I − v_{n−1} v_{n−1}^T) ··· (I − v1 v1^T) A)
Output: v1, v2, ···, vn
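Algorithm 2 can be sketched as follows; `power` is a plain power iteration as in Algorithm 1, and the diagonal test matrix is our own example, chosen so that the eigenvectors are the standard basis vectors.

```python
import numpy as np

def power(A, iters=500, seed=0):
    v = np.random.default_rng(seed).standard_normal(A.shape[0])
    for _ in range(iters):
        v = A @ v
        v /= np.linalg.norm(v)
    return v

def successive_power(A, iters=500):
    """Algorithm 2: power iteration on successively deflated matrices."""
    n = A.shape[0]
    vs, D = [], A.copy()
    for _ in range(n):
        v = power(D, iters)
        vs.append(v)
        D = (np.eye(n) - np.outer(v, v)) @ D   # deflate the found direction
    return np.column_stack(vs)

A = np.diag([5.0, 3.0, 1.0])    # eigenvectors e1, e2, e3
V = successive_power(A)
# Theorem 2.4: column j of V lies in span(e1, ..., ej); here V is,
# up to the signs of its columns, the identity matrix.
```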
Theorem 2.4. The vectors v1, v2, ···, vn in Algorithm 2 satisfy

v1 ∈ span(x1),
v2 ∈ span(x1, x2),
...
vn ∈ span(x1, x2, ···, xn).
Proof. Let A_i = (I − v_i v_i^T) ··· (I − v1 v1^T) A be the i-th deflated matrix in Algorithm 2, with A_0 = A. By Theorem 2.2, each deflation annihilates the largest eigenvalue of A_i and adds a scalar multiple of the associated eigenvector to all the other eigenvectors. Thus the largest eigenvalue in modulus of A_i is λ_{i+1}, and its associated eigenvector belongs to span(v1, ···, v_i, x_{i+1}), for each i = 0, 1, ···, n−1. For the matrix A_i, v_{i+1} is an eigenvector associated with the eigenvalue of the largest modulus, and v_{i+1} ∈ span(v1, ···, v_i, x_{i+1}) for each i. By mathematical induction on i, it is clear that v_{i+1} ∈ span(x1, ···, x_i, x_{i+1}) for each i. ∎
3. ORTHOGONAL ITERATION
Algorithm 2 sequentially computes all the eigenvectors. Even though the power iteration for v_i is not run to completion, its temporary value already serves as a good approximation, and Algorithm 3 therefore combines all the power iterations into one iteration by using the approximations v1^k, ···, vn^k for v1, ···, vn, respectively. The two algorithms are just two formulations of the successive power iterations on deflated matrices. Practically, Algorithm 3 is more efficient than Algorithm 2 in the sense that the former can provide a good approximation of the full eigenvector system at any intermediate stage of the iteration, while the latter cannot.
Algorithm 3 Combined power iterations on recursively deflated matrices
for k = 0, 1, 2, ··· do
  v1^{k+1} = A v1^k,  v1^{k+1} = v1^{k+1} / ‖v1^{k+1}‖
  v2^{k+1} = (I − v1^{k+1} (v1^{k+1})^T) A v2^k,  v2^{k+1} = v2^{k+1} / ‖v2^{k+1}‖
  ...
  vn^{k+1} = (I − v_{n−1}^{k+1} (v_{n−1}^{k+1})^T) ··· (I − v1^{k+1} (v1^{k+1})^T) A vn^k,  vn^{k+1} = vn^{k+1} / ‖vn^{k+1}‖
end for
Note that the routine inside Algorithm 3 is nothing but the QR factorization that obtains the orthonormal basis v1^{k+1}, ···, vn^{k+1} from the basis A v1^k, ···, A vn^k. Writing the vectors as the columns of a matrix, V^k = [v1^k, ···, vn^k], Algorithm 4, called the orthogonal iteration, is hence a simple restatement of Algorithm 3.
Algorithm 4 Orthogonal iteration
for k = 0, 1, 2, ··· do
  A V^k = V^{k+1} R^{k+1}: QR factorization of the matrix A V^k
end for
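Algorithm 4 is a one-liner per step with a library QR factorization. In the NumPy sketch below, the 3×3 upper triangular test matrix and the fixed iteration count are our own illustrative choices.

```python
import numpy as np

def orthogonal_iteration(A, iters=300, seed=0):
    """Algorithm 4: QR-factorize A V^k to get V^{k+1}."""
    n = A.shape[0]
    # random orthogonal starting basis V^0
    V, _ = np.linalg.qr(np.random.default_rng(seed).standard_normal((n, n)))
    for _ in range(iters):
        V, _ = np.linalg.qr(A @ V)   # A V^k = V^{k+1} R^{k+1}
    return V

A = np.array([[4.0, 1.0, 0.0],
              [0.0, 2.0, 1.0],
              [0.0, 0.0, 1.0]])      # eigenvalues 4, 2, 1
V = orthogonal_iteration(A)
T = V.T @ A @ V
# By Theorem 3.1, T = (V^k)^T A V^k tends to upper triangular form,
# with the eigenvalues 4, 2, 1 on its diagonal.
```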
Theorem 3.1. Let (V^k)_{k∈N} be the sequence generated by the orthogonal iteration; then

• lim_{k→∞} span(v1^k, ···, vj^k) = span(x1, ···, xj) for j = 1, 2, ···, n,
• lim_{k→∞} vi^k · A vj^k = 0 if i > j.
Proof. The orthogonal iteration is an implementation of the successive application of the power iteration to the deflated matrices. By Theorem 2.4, lim_{k→∞} dist(vj^k, span(x1, ···, xj)) = 0 for each j, and the first assertion follows.

As k → ∞, vj^k belongs to span(x1, ···, xj), and A vj^k belongs to the same space, since the span of eigenvectors is invariant under A. By the first assertion, A vj^k belongs to span(v1^k, ···, vj^k) as k → ∞. In the QR factorization, vi^k is orthogonal to vj^k whenever i > j, so vi^k ⊥ span(v1^k, ···, vj^k). Therefore lim_{k→∞} vi^k · A vj^k = 0 whenever i > j. ∎

4. QR METHOD
The previous section reveals the relation between the column vectors of the orthogonal iteration and the eigenvectors. To retrieve the eigenvalues of A, let us set A_k = (V^k)^T A V^k. Then Theorem 3.1 states that A_k becomes upper triangular as k → ∞. Since A_k is similar to A, the eigenvalues of A appear on the diagonal of that upper triangular matrix. Thus, when the computation of A_k = (V^k)^T A V^k is inserted in the orthogonal iteration, the eigenvalues can be obtained.
From A V^k = V^{k+1} R^{k+1},

A_k = (V^k)^T A V^k = (V^k)^T V^{k+1} R^{k+1},
A_{k+1} = (V^{k+1})^T A V^{k+1} = R^{k+1} (V^k)^T V^{k+1}.

Since V^k and V^{k+1} are orthogonal matrices, Q^{k+1} = (V^k)^T V^{k+1} is also orthogonal. Therefore the above equations can be simply written as

A_k = Q^{k+1} R^{k+1},
A_{k+1} = R^{k+1} Q^{k+1},
where A_k = Q^{k+1} R^{k+1} is the QR factorization of A_k. The above recursive formula shows that the sequence A_k can be generated detached from the orthogonal iteration. Algorithm 5, called the QR method, shows the complete procedure for finding all the eigenvalues of a general matrix. Since A_k is similar to A, the eigenvalues are preserved for each k. By Theorem 3.1, the sequence A_k eventually becomes an upper triangular matrix, which reveals the eigenvalues on its diagonal.
ACKNOWLEDGMENTS
This research was supported by the Korea Research Foundation Grant funded by the Korean Government (MOEHRD, Basic Research Promotion Fund) (KRF-2008-331-C00045).
Algorithm 5 QR Method
Input: an n×n matrix A
  A_0 = A
  for k = 0, 1, 2, ··· do
    A_k = Q^{k+1} R^{k+1}: QR factorization of A_k
    A_{k+1} = R^{k+1} Q^{k+1}
  end for
Output: an upper triangular matrix A_k for some large k
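Algorithm 5 can be sketched in NumPy as below. The symmetric test matrix is our own example, and this unshifted iteration is illustrative only; practical implementations first reduce A to Hessenberg form and add shifts and deflation for speed.

```python
import numpy as np

def qr_method(A, iters=200):
    """Algorithm 5 (unshifted QR method). Each step replaces
    A_k = Q R by A_{k+1} = R Q = Q^T A_k Q, preserving eigenvalues."""
    Ak = A.copy()
    for _ in range(iters):
        Q, R = np.linalg.qr(Ak)
        Ak = R @ Q
    return Ak

A = np.array([[4.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 1.0]])     # symmetric, distinct eigenvalues
T = qr_method(A)
# T is (numerically) upper triangular; its diagonal holds the
# eigenvalues of A.
```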
REFERENCES
[1] G. Ammar, D. Calvetti, W. Gragg, and L. Reichel. Polynomial zerofinders based on Szegő polynomials. J. Comput. Appl. Math., 127:1–16, 2001.
[2] R. Burnside and P. Guest. A simple proof of the transposed QR algorithm. SIAM Review, 38:306–308, 1996.
[3] D. Calvetti, S. Kim, and L. Reichel. The restarted QR-algorithm for eigenvalue computation of structured matrices. J. Comput. Appl. Math., 149:415–422, 2002.
[4] J. Francis. The QR transformation I. Comput. J., 4:265–271, 1961.
[5] J. Francis. The QR transformation II. Comput. J., 4:332–345, 1962.
[6] G. Golub and C. Van Loan. Matrix Computations. The Johns Hopkins University Press, 1989.
[7] B. Kågström and A. Ruhe. Matrix pencils. Springer-Verlag Lecture Notes in Mathematics, 973, 1983.
[8] V. Kublanovskaya. On some algorithms for the solution of the complete eigenvalue problem. USSR Comput. Math. Math. Phys., 3:637–657, 1961.
[9] L. Trefethen and D. Bau III. Numerical Linear Algebra. SIAM, 1997.
[10] D. Watkins. Understanding the QR algorithm. SIAM Review, 24:427–439, 1982.
[11] D. Watkins. QR-like algorithms for eigenvalue problems. J. Comput. Appl. Math., 123:67–83, 2000.