Cross-Level Refinement of Eigenspace - HIERARCHICALLY PRECONDITIONED EIGENSOLVER

HIERARCHICALLY PRECONDITIONED EIGENSOLVER

4.4 Cross-Level Refinement of Eigenspace

Moreover, consider using the PCG method to solve A^(k⁾x = w for w ∈ W_m₊

k with

preconditioner M^(k) and initial guess x₀such thatr₀ = w− A^(k)x₀ ∈W_m⁽^k)₊

k, where W_m^(k)₊

k = span{(Ψ^(k))^Tv_i^(k) : mk <i ≤ N^(k)}. Let x∗be the exact solution, and xt be the solution at thet^{t h} step of the PCG iteration. Then we have

kxt −x∗k_A(k) ≤ 2 qµ⁽_m^k)

k+1δk −1 qµ⁽_m^k)

k+1δk +1 t

kx₀− x∗k_A(k),

and

kxt− x∗k₂ ≤ 2q εkµ^(k)_m

k+1δ²_k qµ^(k)_m

k+1δk −1 qµ^(k)_m

k+1δk +1 t

kx₀−x∗k₂.

Recall that we will use the CG method to implement Lanczos iteration on each levelkto complete the target spectrum. To ensure the efficiency of the CG method, namely to bound the restricted condition numberκ(A^(k)_Ψ ,Z_m⁽^k)₊)on each level, we need a priori knowledge of the spectrum {(µ^(k)_i ,v_i^(k)) : 1 ≤ i ≤ mk} such that µ^(k)_m

k+1δk

is uniformly bounded. This given spectrum should be inductively computed on the lower level k +1. But notice that there is a compression error between each two neighbor levels, which will compromise the orthogonality and thus the theoretical bound for restricted condition number, if we directly use the spectrum of the lower level as a priori spectrum of the current level. Therefore we introduce a refinement method in Section 4.4 to overcome this difficulty.

A^l_st = (Ψ^l)^TAΨ^l = (Ψ^l)^T(Ψ^h)^TAΨ^hΨ^l = (Ψ^l)^TA^h_stΨ^l, B^l_st = (U^l)^TAU^l = (U^l)^T(Ψ^h)^TAΨ^hU^l = (U^l)^TA^h_stU^l,

(A^h_st)⁻¹=Ψ^l(A^l_st)⁻¹(Ψ^l)^T +U^l(B_st^l )⁻¹(U^l)^T,

Θ^h = Ψ^h Ψ^l(A^l_st)⁻¹(Ψ^l)^T +U^l(B_st^l )⁻¹(U^l)^T

(Ψ^h)^T =Θ^l+U^l(B^l_st)⁻¹(U^l)^T. (4.5) Now suppose that we have obtained the firstml essential eigenpairs (µl,i,vl,i), i = 1,· · ·,ml, ofΘ^l. We want to use these eigenpairs as initial guess to obtain the first mhessential eigenpairs ofΘ^h. Recall that we have the estimates

|µh,i− µl,i| ≤ εl, 1 ≤i ≤ ml, and

kΘ^hv_l,i−µ_h,iv_l,ik₂ ≤ 2εl, 1≤ i ≤ ml,

whereεl is the compression error bound. These estimates give us confidence that we can obtain(µh,i,vh,i),i= 1,· · · ,mh, efficiently from(µl,i,vl,i),i =1,· · ·,ml, by using some refinement technique.

Indeed, we will use the Orthogonal Iteration with Ritz Acceleration as our refinement method. Consider an initial guessQ⁽⁰⁾ of the firstmeigenvectors of a PD operator Θ. To obtain more accurate eigenvalues and eigenspace, the Orthogonal Iteration with Ritz Acceleration runs as follows:

Q⁽⁰⁾ ∈R^n×m given with(Q⁽⁰⁾)^TQ⁽⁰⁾ = Im

F⁽⁰⁾ = ΘQ⁽⁰⁾ for k =1,2,· · ·

Q^(k)R^(k) = F^(k−¹⁾ (QR factorization)

F^(k) =ΘQ^(k) (∗)

S^(k) = (Q^(k))^TF^(k)

P⁽^k)D^(k)(P^(k))^T = S^(k) (Schur decomposition) Q^(k) ←Q^(k)P^(k)

F^(k) ← F⁽^k)P^(k) end

To state the convergence property of the Orthogonal Iteration with Ritz Acceleration, we first define the distance between two spaces. LetV₁,V₂⊂ Rⁿbe two linear spaces, andP_V₁,P_V₂ be the orthogonal projections onto V₁,V₂ respectively. We define the distance betweenV₁andV₂as

dist(V₁,V₂)= kPV₁ −PV₂k₂.

We also use the same notation dist(V₁,V₂) when V₁,V₂ are matrices of column vectors. In this case dist(V₁,V₂)means dist(span{V₁},span{V₂}).

Suppose that the diagonal entries µ_i^(k), i = 1,· · ·,m, of D^(k) are in a decreasing order, then µ^(k)_i is a good approximation of thei^{t h} eigenvalue ofΘ, and span{Q⁽_i^k)} is a good approximation of the eigenspace spanned by the firstieigenvectors ofΘ, whereQ_i^(k) denotes the firsticolumns ofQ^(k). We would like to emphasize that the meaning of the superscript(k) of µ_i^(k) is different from those in Section 4.3. More precisely, we have the following convergence estimate:

Theorem (Stewart [117], 1968): Let (µi,vi), i = 1,· · ·,N, be the ordered (essential) eigenpairs of Θ, and let µ^(k)_i , i = 1,· · ·,m, be the ordered eigenvalues of D^(k) = (Q^(k))^TΘQ^(k) given in the Orthogonal Iteration with Ritz Acceleration (∗).

LetVm = [v₁,v₂,· · ·,vm], andd⁽⁰⁾ =dist(Vm,Q⁽⁰⁾). Then we have

|µi−µ⁽_i^k)| ≤ O µm+1

µi

· kΘk₂· (d⁽⁰⁾)² 1−(d⁽⁰⁾)²

, 1 ≤i ≤ m.

Moreover, we have

dist(V_m,Q⁽^k)) ≤O µm+1

µm

· d⁽⁰⁾ p1−(d⁽⁰⁾)²

and fori =1,· · ·,m−1,if we further assume thatαi = µi−µi+1 > 0, then we have dist(Vi,Q_i^(k)) ≤O µm+1

µi

· d⁽⁰⁾ p1−(d⁽⁰⁾)²

√ i αi

· µ²_m₊₁ µmµi

· kΘk₂· (d⁽⁰⁾)² 1−(d⁽⁰⁾)²

, whereVi andQ_i^(k) are the firsticolumns ofVm andQ^(k) respectively.

Now we go back to our problem, where we have Θ = Θ^h, m = ml, and Q⁽⁰⁾ = V_m^l_l = [vl,1,· · ·,vl,m_l]. We next consider the efficiency of this refinement technique

in our problem. As long as the initial distance d⁽⁰⁾ = dist(V_m^h_l,V_m^l_l) < 1, the first mh eigenvalues and the eigenspace of the first mh eigenvectors of Θ^h converges exponentially fast at a rate (^µ_µ^h,_h,^ml+1

mh )^k. We can expect that a few iterations of refinement will be sufficient to give an accurate eigenspace for narrowing down the residual spectrum ofΘ^h, if we can ensure that the ratio ^µ_µ^h,_h,^ml+1

mh is small enough.

This will be verified in our numerical examples to be presented in Section 4.6. In particular, to refine the first mh eigenpairs subject to a prescribed accuracy , we needK = O(log(¹)/log(_µ^µ_h,^h,^mh

ml+1))refinement iterations.

The main cost of the refinement procedure comes from the computation ofΘ^hQ⁽⁰⁾ and the computation ofΘ^hQ^(k) in each iteration. We will reduce the computational cost by using the fact thatQ^(k) is a good approximation of eigenvectors of Θ^h. We first consider how to computeΘ^hQ⁽⁰⁾ efficiently.

Notice that in our problem, we take Q⁽⁰⁾ = V_m^l_l, whose columns are the first ml

eigenvectors ofΘ^l. Therefore by (4.5), we have

Θ^hQ⁽⁰⁾ =Θ^hV_m^l_l =Θ^lV_m^l_l +U^l(B_st^l )⁻¹(U^l)^TV_m^l_l =V_m^l_lD^l_m_l +U^l(B^l_st)⁻¹(U^l)^TV_m^l_l, whereD^l_m_l is a diagonal matrix whose diagonal entries areµl,1, µl,2,· · ·, µl,m_l. Recall that by Lemma 3.1.1 and Corollary 3.1.4, κ(B^l_st) is bounded by εlδh that can be well controlled in the decomposition procedure. Thus it is efficient to solve (B^l_st)⁻¹ using the CG method. As we have mentioned before, applying (U^l)^T orU^l from the left is performed by doing patch-wise Householder transformations that involve only one local Householder vector on each patch, which takesO(N^h)computational cost, whereN^his the compressed dimension on levelhor the size ofA^h_st. Therefore in the CG method, the cost of matrix multiplication of B^l_st = (U^l)^TA^h_stU^l mainly comes from the number of nonzero entries ofA^h_st. Then the total computational cost of computingΘ^hQ⁽⁰⁾ subject to a relative error can be bounded by

O ml·nnz(A^h_st)·εlδh·log(1 )

! .

Next, we consider how to compute Θ^hQ^(k). To do so, we first compute w_i^(k) = (Ψ^h)^Tq_i^(k), where q_i^(k) is the i^{t h} column of Q^(k), then compute (A^h_st)⁻¹w_i^(k), and applyΨ^h. Again we will use the PCG method with predictioner M^h= (Ψ^h)^TΨ^hto compute(A^h_st)⁻¹w_i^(k). As we have discussed in Section 4.3, this is equivalent to using the CG method to compute(A^h_Ψ)⁻¹z_i^(k), whereA^h_Ψ = (M^h)⁻¹²A^h_st(M^h)⁻¹², andz_i^(k) = (M^h)⁻¹²w_i⁽^k) = (M^h)⁻¹²(Ψ^h)^Tq_i^(k). Inspired by Corollary 4.3.6, we seek to provide

a good initial guess for the CG method to ensure efficiency. In the Orthogonal Itera- tion with Ritz Acceleration (∗), one can check that(Q^(k))^T(Θ^hQ⁽^k)−Q⁽^k)D^(k)) =0, whereD⁽^k)is a diagonal matrix with diagonal entriesµ^(k)₁ , µ₂^(k),· · ·, µ⁽m^k)_l, and therefore

(Z^(k))^T (A^h_Ψ)⁻¹Z^(k) − Z^(k)D^(k)

= (Q^(k))^TΨ^h(M^h)⁻¹²

(A^h_Ψ)⁻¹(M^h)⁻¹²(Ψ^h)^TQ⁽^k)−(M^h)⁻¹²(Ψ^h)^TQ^(k)D^(k)

= (Q^(k))^T

Ψ^h(A^h_st)⁻¹(Ψ^h)^TQ⁽^k)−Ψ^h(M^h)⁻¹(Ψ^h)^TQ^(k)D⁽^k)

= (Q^(k))^T

Θ^hQ⁽^k)−Q⁽^k)D^(k)

=0,

where we have used that Q^(k) ∈ span{Ψ^h} and so Ψ^h(M^h)⁻¹(Ψ^h)^TQ^(k) = Q^(k). This observation implies that if we use µ_i^(k)z_i⁽^k) as the initial guess for computing (A_Ψ^h)⁻¹z_i⁽^k)using the CG method, the initial residualz_i⁽^k)−(A_Ψ^h)(µ^(k)_i z_i^(k))is orthogonal to(A_Ψ^h)⁻¹Z⁽^k). SinceQ^(k) are already good approximate essential eigenvectors of Θ^h, Z^(k) are good approximate eigenvectors of (A^h_Ψ)⁻¹, we can expect that the target eigenspaceZm_h, namely the eigenspace of the firstmheigenvectors of(A^h_Ψ)⁻¹, can be well spanned in span{(A^h_Ψ)⁻¹Z^(k)}. Therefore we can reasonably assume that z_i^(k)−(A^h_Ψ)(µ⁽_i^k)z_i^(k⁾) ∈ Z_m+

h = Z_m^⊥_h, and so again we can benefit from the restricted condition number κ(A_Ψ^h,Z_m+

h) ≤ µ_h,m_h₊₁δh as introduced in Section 4.3. More- over, we notice that the spectral residualkΘ^hq_i^(k)− µ_i^(k)q_i^(k)k₂is bounded by 2εl by Lemma 4.2.3, and we have

k(A^h_st)⁻¹w_i^(k)−µ_i^(k)(M^h)⁻¹w_i^(k)k₂ ≤ k(A^h_Ψ)⁻¹z_i^(k)−µ_i^(k)z_i^(k)k₂= kΘ^hq_i⁽^k)−µ^(k)_i q_i^(k)k₂, (4.6) where we have usedλmin(M^h) ≥ 1 (Lemma 4.3.7). Thus if we use µ⁽_i^k)z_i^(k) as the initial guess, the initial error will be bounded by 2εl at most, and the CG procedure will only need

κ(A^h_Ψ,Z_m+

h)·log(εl

) =O

µh,m_h+1δh·log(εl

)

iterations to achieve a relative accuracy, instead ofO(κ(A^h_Ψ,Z_m+

h)·log(¹)). Notice that using the initial guess µ_i^(k)z_i^(k) for (A_Ψ^h)⁻¹z_i^(k) is equivalent to using the initial guess µ_i^(k)(M^h)⁻¹w_i^(k) for (A^h_st)⁻¹w_i^(k).

Supported by the analysis above, we will compute(A^h_st)⁻¹w_i^(k) using the preconditioned CG method with preconditionerM^hand initial guessµ^(k)_i (M^h)⁻¹w_i⁽^k). Again suppose that in each PCG iteration, we also use the CG method to apply (M^h)⁻¹

subject to a higher relative accuracy ˆ, which takesO(nnz(M^h)· κ(M^h) ·log(¹_ˆ)) computational cost. In practice, it is sufficient to take ˆ comparable to. Recall that nnz(M^h) ≤ nnz(A^h_st), andκ(M^h) ≤O(εhδh)(Lemma 4.3.7), the cost of computing Θ^hQ^(k) subject to a relative error is then bounded by

O ml· µh,m_h+1δh·log(εl

)·nnz(A^h_st)·εhδh·log(1 )

! .

Notice that in each refinement iteration we also need to perform one QR factorization and one Schur decomposition, which together cost O(N^h· m_l²). However, as we have mentioned in the introduction, we only consider the asymptotic complexity of our method when the original Abecomes super large. In this case, the number mtar of the target eigenpairs is considered as a fixed constant, and so the term O(N^h· m²_l) ≤ O(N^hm²_tar) is considered to be minor and will be omitted in our complexity analysis. Therefore, the total cost of refining the first mh eigenpairs subject to a prescribed accuracy can be bounded by

O ml·nnz(A^h_st)·εlδh·log(1 )

(4.7) +O ml· µ_h,m_h₊₁δh·log(εl

)·nnz(A^h_st)·εhδh·log(1

)·log(1

)/log( µ_h,m_h µh,m_l+1)

! .

Again we remark that the operator Θ^h, the long vectorsQ^(k), F^(k), V^l andV^h are only for analysis use. Operations on long vectors of sizenwill be very expensive and unnecessary, especially on lower levels where the compression dimension N^h(the size of A^h_st) is small. Notice that all long vectors on theh-level are in span{Ψ^h}as

Q^(k) = Ψ^hQD^(k), F⁽^k) = Ψ^hFD⁽^k), V_m^l_l = Ψ^hVD_m^l_l, V_m^h_h = Ψ^hVD_m^h_h,

we thus only operate on their coefficients in the basisΨ^h. Correspondingly, when- ever we need to consider orthogonality of long vectors, we replace it by the M^h- orthogonality of their coefficient vectors. One can check that all discussions above still apply. Also another advantage of using the coefficient vectors is that in the previ- ous discussions, the good initial guess µ^(k)_i (M^h)⁻¹w_i⁽^k) = µ^(k_i ⁾(M^h)⁻¹(Ψ^h)^Tq_i^(k) = µ⁽_i^k)qˆ⁽^k) is obtained explicitly.

Summarizing the analysis above, we propose the following Algorithm 12 as our refinement method. Since we want the eigenspace spanned by the firstmheigenvectors ofΘ^h to be computed accurately, the refinement stops when dist(Q_m^(k−_h¹⁾,Q^(k)_m_h) <

for some prescribed accuracy , where Q_m_h denotes the first mh columns of Q^(k). SinceQ^(k) is orthogonal, one can check that

dist(Q^(k−_m_h¹⁾,Q⁽m^k)_h)= kQ^(k)m_h −Q^(k−m_h¹⁾(Q_m^(k−_h¹⁾)^TQ^(k)m_hk₂

= kQD^(k)_m_h −QD^(k−_m_h¹⁾(QD_m^(k−_h¹⁾)^TM^hQD_m^(k)_hk_Mh

≤

qλmax(M^h)kQD⁽_m^k)_h −QD_m^(k−_h¹⁾(QD^(k−_m_h¹⁾)^TM^hQD^(k)_m_hk₂

≤p

1+εhδhkQD^(k)_m_h −QD^(k−_m_h¹⁾(QD_m^(k−_h¹⁾)^TM^hQD_m^(k)_hk_F. In practical, we use kQD^(k)_m_h −QD^(k−_m_h¹⁾(QD_m^(k−_h¹⁾)^TM^hQD_m^(k)_hk_F < ^√₁₊_ε

hδ_h as the stopping criterion since it is easy to check. We have used Lemma 4.3.7 to boundλmax(M^h). Algorithm 12Eigenpair Refinement

Input: VD_m^l_l, D^l_m_l, prescribed accuracy, target eigenvalue threshold µh. Output: VD_m^h_h, D_m^h_h.

1: SetQD⁽⁰⁾ =V_m^l_l, D⁽⁰⁾ = D^l_m_l, k = 0;

2: fori =1 :ml do

3: gi = pcg(B^l_st,(U^l)^TM^hqˆ_i⁽⁰⁾,−,−, ); (QD=[ ˆq₁,· · ·,qˆm_l])

4: end for

5: FD⁽⁰⁾ =QD⁽⁰⁾D⁽⁰⁾ +U^lG; (G=[g₁,· · · ,gm_l])

6: repeat

7: k ← k+1;

8: QD^(k)R⁽^k) = FD^(k−¹⁾; (QR factorization with respect toM^horthogonality, i.e.

(QD^(k))^TM^hQD^(k) = I)

9: W^(k) = M^hQD^(k);

10: fori= 1 :ml do

11: fˆ_i⁽^k) = pcg(A^h_st,w_i^(k),M^h, µ^(k−_i ¹⁾qˆ_i^(k⁾, ); (FD= [ ˆf₁,· · ·, fˆm_l])

12: end for

13: S^(k) = (W^(k))^TFD^(k);

14: P^(k)D⁽^k)(P^k))^T = S^(k) (Schur decomposition, diagonals ofD^(k)in decreasing order);

15: renewmhso that µ^(k)_m_h ≥ µh > µ^(k)_m

h+1;

16: QD^(k) ←QD^(k)P^(k), FD^(k) ← FD^(k)P⁽^k);

17: until kQDm^(k)_h −QDm^(k−_h¹⁾(QD^(k−m_h¹⁾)^TM^hQD^(k)m_hk_F < .

18: VD_m^h_h = QD⁽_m^k)_h, D_m^h_h = D⁽_m^k)_h. (D_m^(k)_h denotes the first mh-size block ofD^(k⁾)

Dalam dokumen De Huang (Halaman 115-121)