Chapter 3

MULTIRESOLUTION MATRIX DECOMPOSITION

3.1 Multiresolution Matrix Decomposition (MMD)
In this chapter, we exploit the locality (i.e., sparsity) of the energy decomposition to resolve the difficulty of a large condition number. The main idea is to decompose the computation of $A^{-1}$ into hierarchical resolutions such that (i) the relative condition number on each scale/level can be well bounded, and (ii) the subsystem to be solved on each level is as sparse as the original $A$. Here we make use of the partition $\mathcal{P}$ and the bases $\Phi$ and $\Psi$ obtained in Section 2.2 and Section 2.3 to serve the purpose of multiresolution operator decomposition. In the following, we first construct a one-level decomposition.
3.1.1 One-level Operator Decomposition
Let $\mathcal{P}$, $\Phi$, $\Psi$ and $U$ be constructed as in Algorithm 1, namely
(i) $\mathcal{P} = \{P_j\}_{j=1}^M$ is a partition of $\mathcal{V}$;
(ii) $\Phi = [\Phi_1, \Phi_2, \cdots, \Phi_M]$ such that every $\Phi_j \subset \operatorname{span}\{P_j\}$ has dimension $q_j$;
(iii) $\Psi = A^{-1}\Phi(\Phi^T A^{-1}\Phi)^{-1}$;
(iv) $U = [U_1, U_2, \cdots, U_M]$ such that every $U_j \subset \operatorname{span}\{P_j\}$ has dimension $\dim(U_j) = \dim(\operatorname{span}\{P_j\}) - q_j$ and satisfies $\Phi_j^T U_j = 0$.
Then $[U, \Psi]$ forms a basis of $\mathbb{R}^n$, and we have
$$U^T A \Psi = U^T \Phi (\Phi^T A^{-1}\Phi)^{-1} = 0. \qquad (3.1)$$
Thus the inverse of $A$ can be written as
$$A^{-1} = \begin{bmatrix} U & \Psi \end{bmatrix} \left( \begin{bmatrix} U^T \\ \Psi^T \end{bmatrix} A \begin{bmatrix} U & \Psi \end{bmatrix} \right)^{-1} \begin{bmatrix} U^T \\ \Psi^T \end{bmatrix} = U(U^T A U)^{-1} U^T + \Psi(\Psi^T A \Psi)^{-1}\Psi^T. \qquad (3.2)$$
In the following we denote $\Psi^T A \Psi = A_{st}$ and $U^T A U = B_{st}$ respectively. We also use the phrase “solving $A^{-1}$” to mean “solving $A^{-1}b$ for any $b$”. From (3.2), we observe that solving $A^{-1}$ is equivalent to solving $A_{st}^{-1}$ and $B_{st}^{-1}$ separately. For $B_{st}$, notice that since the space/basis $U$ is constructed locally with respect to each patch $P_j$, $B_{st}$ inherits the sparsity of $A$ if $A$ is local/sparse. Thus it is efficient to solve $B_{st}^{-1}$ with iterative methods, provided the condition number of $B_{st}$ is bounded. In the following, we introduce Lemma 3.1.1, which provides an upper bound on the condition number of $B_{st}$ that ensures the efficiency of solving $B_{st}^{-1}$. The proof of the lemma imitates the proof of Theorem 10.9 of [89], where the required condition (2.18) corresponds to Equation (2.3) in [89].
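To make the one-level decomposition concrete, here is a small NumPy sketch (purely illustrative and not the construction used in this thesis: the random SPD matrix, the three equal patches, and the random choice of each $\Phi_j$ are assumptions) that builds $\Psi$, $U$, $A_{st}$ and $B_{st}$ and checks (3.1) and (3.2) numerically.

```python
# Toy verification of the one-level decomposition (3.1)-(3.2).
import numpy as np

rng = np.random.default_rng(0)
n, q = 12, 2
patches = [np.arange(0, 4), np.arange(4, 8), np.arange(8, 12)]   # partition P = {P_j}

# A generic SPD matrix standing in for A (illustrative only).
G = rng.standard_normal((n, n))
A = G @ G.T + n * np.eye(n)

# Phi_j: q orthonormal columns supported on patch P_j;
# U_j: an orthonormal complement of Phi_j inside span{P_j}, so Phi_j^T U_j = 0.
Phi = np.zeros((n, q * len(patches)))
U = np.zeros((n, n - q * len(patches)))
for j, P in enumerate(patches):
    Q, _ = np.linalg.qr(rng.standard_normal((P.size, P.size)))
    r = P.size - q
    Phi[np.ix_(P, np.arange(j * q, (j + 1) * q))] = Q[:, :q]
    U[np.ix_(P, np.arange(j * r, (j + 1) * r))] = Q[:, q:]

Ainv = np.linalg.inv(A)
Psi = Ainv @ Phi @ np.linalg.inv(Phi.T @ Ainv @ Phi)    # Psi = A^{-1} Phi (Phi^T A^{-1} Phi)^{-1}
A_st = Psi.T @ A @ Psi
B_st = U.T @ A @ U

print(np.allclose(U.T @ A @ Psi, 0))                    # (3.1): U^T A Psi = 0
print(np.allclose(Ainv,                                  # (3.2): A^{-1} splits into two parts
                  U @ np.linalg.inv(B_st) @ U.T + Psi @ np.linalg.inv(A_st) @ Psi.T))
```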
Lemma 3.1.1. If $\Phi$ satisfies the condition (2.18) with constant $\varepsilon$, then
$$\lambda_{\max}(B_{st}) \le \lambda_{\max}(A)\cdot\lambda_{\max}(U^T U), \qquad \lambda_{\min}(B_{st}) \ge \frac{1}{\varepsilon^2}\cdot\lambda_{\min}(U^T U), \qquad (3.3)$$
and thus
$$\kappa(B_{st}) \le \varepsilon^2\cdot\lambda_{\max}(A)\cdot\kappa(U^T U). \qquad (3.4)$$
Proof. For $\lambda_{\max}(B_{st})$, we have
$$\lambda_{\max}(B_{st}) = \|B_{st}\|_2 = \|U^T A U\|_2 \le \|A\|_2\|U\|_2^2 = \|A\|_2\|U^T U\|_2 = \lambda_{\max}(A)\,\lambda_{\max}(U^T U).$$
For $\lambda_{\min}(B_{st})$, since $\Phi$ satisfies the condition (2.18) with constant $\varepsilon$ and $\Phi^T U = 0$, we have
$$\|x\|_2^2 \le \frac{1}{\lambda_{\min}(U^T U)}\, x^T U^T U x \le \frac{\varepsilon^2}{\lambda_{\min}(U^T U)}\, x^T U^T A U x = \frac{\varepsilon^2}{\lambda_{\min}(U^T U)}\, x^T B_{st} x,$$
and thus $\lambda_{\min}(B_{st}) \ge \frac{1}{\varepsilon^2}\lambda_{\min}(U^T U)$.
We now discuss the bound $\varepsilon^2\cdot\lambda_{\max}(A)\cdot\kappa(U^T U)$ in two parts, namely (i) $\varepsilon^2\lambda_{\max}(A)$ and (ii) $\kappa(U^T U)$. Notice that $U^T U$ is actually block-diagonal with blocks $U_j^T U_j$, therefore
$$\kappa(U^T U) = \frac{\lambda_{\max}(U^T U)}{\lambda_{\min}(U^T U)} = \frac{\max_{1\le j\le M}\lambda_{\max}(U_j^T U_j)}{\min_{1\le j\le M}\lambda_{\min}(U_j^T U_j)}. \qquad (3.5)$$
In other words, we can bound $\kappa(U^T U)$ well by choosing a proper $U_j$ for each $P_j$. For instance, if we allow arbitrary local computation on $P_j$, we may simply extend $\Phi_j$ to an orthonormal basis of $\operatorname{span}\{P_j\}$ to get $U_j$ by using a QR factorization [102], as in the sketch below. In this case, we have $\kappa(U^T U) = 1$.
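A full QR factorization of $\Phi_j$ yields such a $U_j$ directly; the helper below is a small illustrative sketch (the function name and the toy $\Phi_j$ are ours, not from the text).

```python
# Extending Phi_j to an orthonormal basis of span{P_j}: the trailing columns of a
# complete QR factorization give U_j with Phi_j^T U_j = 0 and U_j^T U_j = I,
# hence kappa(U^T U) = 1 when every U_j is built this way.
import numpy as np

def complement_basis(Phi_j):
    """Orthonormal basis U_j of the orthogonal complement of range(Phi_j)."""
    n_j, q_j = Phi_j.shape
    Q, _ = np.linalg.qr(Phi_j, mode="complete")   # Q is an n_j x n_j orthogonal matrix
    return Q[:, q_j:]                             # last n_j - q_j columns

# toy check on a random local basis Phi_j with n_j = 5, q_j = 2
Phi_j, _ = np.linalg.qr(np.random.default_rng(1).standard_normal((5, 2)))
U_j = complement_basis(Phi_j)
print(np.allclose(Phi_j.T @ U_j, 0), np.allclose(U_j.T @ U_j, np.eye(3)))
```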
For the part $\varepsilon^2\lambda_{\max}(A)$, recall that we construct $\Phi$ based on a partition $\mathcal{P}$ and an integer $q$ so that $\Phi$ satisfies condition (2.18) with constant $\varepsilon(\mathcal{P},q)$; thus the posterior bound on $\kappa(B_{st})$ is $\varepsilon(\mathcal{P},q)^2\lambda_{\max}(A)$ (when $\kappa(U^T U) = 1$). Recall that $\kappa(A_{st})$ is bounded by $\delta(\mathcal{P},q)\|A^{-1}\|_2$, therefore $\kappa(A_{st})\kappa(B_{st}) \le \varepsilon(\mathcal{P},q)^2\delta(\mathcal{P},q)\kappa(A)$. That is, $A_{st}$ and $B_{st}$ divide the burden of the large condition number of $A$ with an amplification factor $\varepsilon(\mathcal{P},q)^2\delta(\mathcal{P},q)$. We call $\kappa(\mathcal{P},q) := \varepsilon(\mathcal{P},q)^2\delta(\mathcal{P},q)$ the $q$th-order condition number of the partition $\mathcal{P}$. This explains why we attempt to bound $\kappa(\mathcal{P},q)$ in the construction of the partition $\mathcal{P}$.
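Spelled out, the two posterior bounds combine in one line, using $\lambda_{\max}(A)\,\|A^{-1}\|_2 = \kappa(A)$:
$$\kappa(A_{st})\,\kappa(B_{st}) \;\le\; \big(\delta(\mathcal{P},q)\,\|A^{-1}\|_2\big)\cdot\big(\varepsilon(\mathcal{P},q)^2\,\lambda_{\max}(A)\big) \;=\; \varepsilon(\mathcal{P},q)^2\,\delta(\mathcal{P},q)\,\kappa(A).$$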
Ideally, we hope the one-level operator decomposition gives $\kappa(A_{st}) \approx \kappa(B_{st})$, so that the two parts share the burden equally in parallel. But such a result may not be good enough when $\kappa(A_{st})$ and $\kappa(B_{st})$ are still large. To fully decompose the large condition number of $A$, a simple idea is to apply the one-level decomposition recursively. That is, we first choose $\varepsilon(\mathcal{P},q)$ small enough to sufficiently bound $\kappa(B_{st})$; then, if $\kappa(A_{st})$ is still large, we apply the decomposition to $A_{st}^{-1}$ again to further decompose $\kappa(A_{st})$. However, the decomposition of $A^{-1}$ is based on the construction of $\mathcal{P}$ and $\Phi$, namely on the underlying energy decomposition $\mathcal{E} = \{E_k\}_{k=1}^m$ of $A$. Hence, we have to construct the corresponding energy decomposition of $A_{st}$ before we can apply the same operator decomposition to $A_{st}^{-1}$.
3.1.2 Inherited Energy Decomposition
Let $\mathcal{E} = \{E_k\}_{k=1}^m$ be the energy decomposition of $A$; then the inherited energy decomposition of $A_{st} = \Psi^T A \Psi$ with respect to $\mathcal{E}$ is simply given by $\mathcal{E}^{\Psi} = \{E_k^{\Psi}\}_{k=1}^m$, where
$$E_k^{\Psi} = \Psi^T E_k \Psi, \qquad k = 1, 2, \cdots, m. \qquad (3.6)$$
Notice that this inherited energy decomposition of $A_{st}$ with respect to $\mathcal{E}$ has the same number of energy elements as $\mathcal{E}$, which is not preferred and is actually redundant in practice. Therefore we shall consider reducing the energy decomposition of $A_{st}$. Indeed, we will use $\widetilde{\Psi}$ instead of $\Psi$ in practice, where each $\widetilde{\psi}_i$ is some local approximator of $\psi_i$ (obtained by Construction 2.2.9). Specifically, we will actually deal with $\widetilde{A}_{st} = \widetilde{\Psi}^T A \widetilde{\Psi}$, and thus we shall consider finding a proper condensed energy decomposition of $\widetilde{A}_{st}$.
If we view $\widetilde{A}_{st}$ as a matrix on the reduced space $\mathbb{R}^N$, then for any vector $x = (x_1, \ldots, x_N) \in \mathbb{R}^N$, the connection $x \sim E^{\widetilde{\Psi}}$ between $x$ and some $E^{\widetilde{\Psi}} = \widetilde{\Psi}^T E \widetilde{\Psi}$ comes from the connection between $E$ and those $\widetilde{\psi}_i$ corresponding to nonzero $x_i$, and such connections are the key to constructing a partition $\mathcal{P}_{st}$ for $\widetilde{A}_{st}$. Recall that the support of each $\widetilde{\psi}_i$ ($S_k(P_{j_i})$ for some $k$) is a union of patches, so there is no need to distinguish among energy elements interior to the same patch when we deal with the connections between these elements and the basis $\widetilde{\Psi}$. Therefore we introduce the reduced inherited energy decomposition of $\widetilde{A}_{st} = \widetilde{\Psi}^T A \widetilde{\Psi}$ as follows:
Definition 3.1.2 (Reduced inherited energy decomposition). With respect to the underlying energy decomposition $\mathcal{E}$ of $A$, the partition $\mathcal{P}$ and the corresponding $\widetilde{\Psi}$, the reduced inherited energy decomposition of $\widetilde{A}_{st} = \widetilde{\Psi}^T A \widetilde{\Psi}$ is given by $\mathcal{E}^{\widetilde{\Psi}}_{re} = \{A^{\widetilde{\Psi}}_{P_j}\}_{j=1}^M \cup \{E^{\widetilde{\Psi}} : E \in \mathcal{E}^c_{\mathcal{P}}\}$ with
$$A^{\widetilde{\Psi}}_{P_j} = \widetilde{\Psi}^T A_{P_j} \widetilde{\Psi}, \qquad j = 1, 2, \cdots, M, \qquad (3.7)$$
$$E^{\widetilde{\Psi}} = \widetilde{\Psi}^T E \widetilde{\Psi}, \qquad \forall E \in \mathcal{E}^c_{\mathcal{P}}, \qquad (3.8)$$
where $\mathcal{E}^c_{\mathcal{P}} = \mathcal{E} \setminus \mathcal{E}_{\mathcal{P}}$ with $\mathcal{E}_{\mathcal{P}} = \{E \in \mathcal{E} : \exists\, P_j \in \mathcal{P} \text{ s.t. } E \in P_j\}$.
Once we have the underlying energy decomposition of $A_{st}$ (or $\widetilde{A}_{st}$), we can repeat the procedure to decompose $A_{st}^{-1}$ (or $\widetilde{A}_{st}^{-1}$) in $\mathbb{R}^N$ as we did for $A^{-1}$ in $\mathbb{R}^n$. We introduce the multi-level decomposition of $A^{-1}$ in the following subsection.
3.1.3 Multiresolution Matrix Decomposition
Let $A^{(0)} = A$, and construct $A^{(k)}$, $B^{(k)}$ recursively from $A^{(0)}$. More precisely, let $\mathcal{E}^{(k-1)}$ be the underlying energy decomposition of $A^{(k-1)}$, and let $\mathcal{P}^{(k)}$, $\Phi^{(k)}$, $\Psi^{(k)}$ and $U^{(k)}$ be constructed corresponding to $A^{(k-1)}$ and $\mathcal{E}^{(k-1)}$ in the space $\mathbb{R}^{N^{(k-1)}}$, where $N^{(k-1)}$ is the dimension of $A^{(k-1)}$. We use the one-level operator decomposition to decompose $(A^{(k-1)})^{-1}$ as
$$(A^{(k-1)})^{-1} = U^{(k)}\big((U^{(k)})^T A^{(k-1)} U^{(k)}\big)^{-1}(U^{(k)})^T + \Psi^{(k)}\big((\Psi^{(k)})^T A^{(k-1)} \Psi^{(k)}\big)^{-1}(\Psi^{(k)})^T,$$
and then define $A^{(k)} = (\Psi^{(k)})^T A^{(k-1)} \Psi^{(k)}$, $B^{(k)} = (U^{(k)})^T A^{(k-1)} U^{(k)}$, and $\mathcal{E}^{(k)} = (\mathcal{E}^{(k-1)})^{\Psi^{(k)}}_{re}$ as in Definition 3.1.2. Moreover, if we write
$$\boldsymbol{\Phi}^{(1)} = \Phi^{(1)}, \quad \boldsymbol{\Phi}^{(k)} = \Phi^{(1)}\Phi^{(2)}\cdots\Phi^{(k-1)}\Phi^{(k)}, \quad k \ge 2, \qquad (3.9a)$$
$$\boldsymbol{U}^{(1)} = U^{(1)}, \quad \boldsymbol{U}^{(k)} = \Psi^{(1)}\Psi^{(2)}\cdots\Psi^{(k-1)}U^{(k)}, \quad k \ge 2, \qquad (3.9b)$$
$$\boldsymbol{\Psi}^{(1)} = \Psi^{(1)}, \quad \boldsymbol{\Psi}^{(k)} = \Psi^{(1)}\Psi^{(2)}\cdots\Psi^{(k-1)}\Psi^{(k)}, \quad k \ge 2, \qquad (3.9c)$$
then one can prove by induction that for $k \ge 1$,
$$A^{(k)} = (\boldsymbol{\Psi}^{(k)})^T A \boldsymbol{\Psi}^{(k)} = \big((\boldsymbol{\Phi}^{(k)})^T A^{-1} \boldsymbol{\Phi}^{(k)}\big)^{-1}, \qquad B^{(k)} = (\boldsymbol{U}^{(k)})^T A \boldsymbol{U}^{(k)},$$
$$(\boldsymbol{\Phi}^{(k)})^T \boldsymbol{\Phi}^{(k)} = (\boldsymbol{\Phi}^{(k)})^T \boldsymbol{\Psi}^{(k)} = I_{N^{(k)}}, \qquad \boldsymbol{\Psi}^{(k)} = A^{-1}\boldsymbol{\Phi}^{(k)}\big((\boldsymbol{\Phi}^{(k)})^T A^{-1}\boldsymbol{\Phi}^{(k)}\big)^{-1},$$
and for any integer $K$,
$$A^{-1} = (A^{(0)})^{-1} = \sum_{k=1}^{K} \boldsymbol{U}^{(k)}\big((\boldsymbol{U}^{(k)})^T A \boldsymbol{U}^{(k)}\big)^{-1}(\boldsymbol{U}^{(k)})^T + \boldsymbol{\Psi}^{(K)}\big((\boldsymbol{\Psi}^{(K)})^T A \boldsymbol{\Psi}^{(K)}\big)^{-1}(\boldsymbol{\Psi}^{(K)})^T. \qquad (3.10)$$
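For intuition, the following sketch realizes (3.10) recursively on a toy problem. Everything in it is an illustrative stand-in rather than the method of this thesis: indices are simply paired up instead of using the adaptive partitioning of Chapter 2, $q^{(k)} = 1$ with patchwise-constant $\Phi^{(k)}$, and the exact, dense $\Psi^{(k)}$ is used without localization.

```python
# Recursive multilevel application of A^{-1} following the structure of (3.10).
import numpy as np

def one_level(A):
    """Return (Phi, Psi, U) for a pairing partition with q = 1 per patch."""
    n = A.shape[0]
    patches = [np.arange(i, min(i + 2, n)) for i in range(0, n, 2)]
    Phi = np.zeros((n, len(patches)))
    U_cols = []
    for j, P in enumerate(patches):
        Phi[P, j] = 1.0 / np.sqrt(P.size)
        if P.size == 2:                         # local complement of the constant vector
            u = np.zeros(n)
            u[P[0]], u[P[1]] = 1.0, -1.0
            U_cols.append(u / np.sqrt(2.0))
    U = np.column_stack(U_cols) if U_cols else np.zeros((n, 0))
    Ainv = np.linalg.inv(A)
    Psi = Ainv @ Phi @ np.linalg.inv(Phi.T @ Ainv @ Phi)
    return Phi, Psi, U

def solve_multilevel(A, b, n_min=2):
    """Apply A^{-1} to b via the recursive form of (3.10)."""
    if A.shape[0] <= n_min:
        return np.linalg.solve(A, b)            # bottom level: solve (A^{(K)})^{-1} directly
    Phi, Psi, U = one_level(A)
    B = U.T @ A @ U                             # B^{(k)}: well-conditioned, sparse in practice
    A_next = Psi.T @ A @ Psi                    # A^{(k)}: handled recursively
    return U @ np.linalg.solve(B, U.T @ b) + Psi @ solve_multilevel(A_next, Psi.T @ b, n_min)

rng = np.random.default_rng(0)
G = rng.standard_normal((16, 16))
A = G @ G.T + 16 * np.eye(16)
b = rng.standard_normal(16)
print(np.allclose(solve_multilevel(A, b), np.linalg.solve(A, b)))
```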
Remark 3.1.3.
- One should notice that the partition $\mathcal{P}^{(k)}$ on each level $k$ is not a partition of the whole space $\mathbb{R}^n$, but a partition of the reduced space $\mathbb{R}^{N^{(k-1)}}$, and $\Phi^{(k)}$, $\Psi^{(k)}$, $U^{(k)}$ are all constructed corresponding to this $\mathcal{P}^{(k)}$ in the same reduced space. Intuitively, if the average patch size (number of basis functions in a patch) for the partition $\mathcal{P}^{(k)}$ is $s^{(k)}$, then we have $N^{(k)} = \frac{q^{(k)}}{s^{(k)}} N^{(k-1)}$, where $q^{(k)}$ is the integer used for constructing $\Phi^{(k)}$.
- Generally, methods of multiresolution type use nested partitions/meshes that are generated based only on the computational domain [8, 123]. But here the nested partitions are replaced by partitions constructed level by level, which are adaptive to $A^{(k)}$ on each level and require no a priori knowledge of the computational domain/space.
- In the Gamblet setting introduced in [88], equations (3.9) together with (3.10) can be viewed as the Gamblet Transform.
The multiresolution operator decomposition up to a level $K$ is essentially equivalent to a decomposition of the whole space $\mathbb{R}^n$ [88] as
$$\mathbb{R}^n = \boldsymbol{U}^{(1)} \oplus \boldsymbol{U}^{(2)} \oplus \cdots \oplus \boldsymbol{U}^{(K)} \oplus \boldsymbol{\Psi}^{(K)},$$
where we also use $\boldsymbol{U}^{(k)}$ (or $\boldsymbol{\Psi}^{(k)}$) to denote the subspace spanned by the basis $\boldsymbol{U}^{(k)}$ (or $\boldsymbol{\Psi}^{(k)}$). Due to the $A$-orthogonality between these subspaces, using this decomposition to solve $A^{-1}$ is equivalent to solving $A^{-1}$ in each subspace separately (or more precisely, solving $(B^{(k)})^{-1}$, $k = 1, \cdots, K$, or $(A^{(K)})^{-1}$), and by doing so we decompose the large condition number of $A$ into bounded pieces, as the following corollary states.
Corollary 3.1.4. If on each level $\Phi^{(k)}$ is given by Construction 2.2.6 with integer $q^{(k)}$, then for $k \ge 1$ we have
$$\lambda_{\max}(A^{(k)}) \le \delta(\mathcal{P}^{(k)}, q^{(k)}), \qquad \lambda_{\min}(A^{(k)}) \ge \lambda_{\min}(A),$$
$$\lambda_{\max}(B^{(k)}) \le \delta(\mathcal{P}^{(k-1)}, q^{(k-1)})\,\lambda_{\max}\big((U^{(k)})^T U^{(k)}\big), \qquad \lambda_{\min}(B^{(k)}) \ge \frac{1}{\varepsilon(\mathcal{P}^{(k)}, q^{(k)})^2}\,\lambda_{\min}\big((U^{(k)})^T U^{(k)}\big),$$
and thus
$$\kappa(A^{(k)}) \le \delta(\mathcal{P}^{(k)}, q^{(k)})\,\|A^{-1}\|_2, \qquad \kappa(B^{(k)}) \le \varepsilon(\mathcal{P}^{(k)}, q^{(k)})^2\,\delta(\mathcal{P}^{(k-1)}, q^{(k-1)})\,\kappa\big((U^{(k)})^T U^{(k)}\big).$$
For consistency, we write $\delta(\mathcal{P}^{(0)}, q^{(0)}) = \lambda_{\max}(A^{(0)}) = \lambda_{\max}(A)$.
Proof. These results follow directly from Theorem 2.2.25 and Lemma 3.1.1.
Remark 3.1.5. The fact that $\lambda_{\min}(B^{(k)}) \gtrsim \frac{1}{\varepsilon(\mathcal{P}^{(k)}, q^{(k)})^2}$ implies that level $k$ is a level with resolution of scale no greater than $\varepsilon(\mathcal{P}^{(k)}, q^{(k)})$, namely the space $\boldsymbol{U}^{(k)}$ is a subspace of the whole space $\mathbb{R}^n$ of scale finer than $\varepsilon(\mathcal{P}^{(k)}, q^{(k)})$ with respect to $A$. This is essentially what multiresolution means in this decomposition.
Now that we have a multiresolution decomposition of $A^{-1}$, the application of $A^{-1}$ (namely solving the linear system $Ax = b$) breaks into the application of $(B^{(k)})^{-1}$ on each level and the application of $(A^{(K)})^{-1}$ on the bottom level. In what follows, we always assume $\kappa((U^{(k)})^T U^{(k)}) = 1$. Then the efficiency of the multiresolution decomposition in resolving the difficulty of the large condition number of $A$ lies in the effort to bound each $\varepsilon(\mathcal{P}^{(k)}, q^{(k)})^2\,\delta(\mathcal{P}^{(k-1)}, q^{(k-1)})$, so that $B^{(k)}$ has a controlled spectrum width and can be solved efficiently using CG-type methods. Define $\kappa(\mathcal{P}^{(k)}, q^{(k)}) = \varepsilon(\mathcal{P}^{(k)}, q^{(k)})^2\,\delta(\mathcal{P}^{(k)}, q^{(k)})$ and $\gamma^{(k)} = \frac{\varepsilon(\mathcal{P}^{(k)}, q^{(k)})}{\varepsilon(\mathcal{P}^{(k-1)}, q^{(k-1)})}$; then we can write
$$\kappa(B^{(k)}) \le \varepsilon(\mathcal{P}^{(k)}, q^{(k)})^2\,\delta(\mathcal{P}^{(k-1)}, q^{(k-1)}) = (\gamma^{(k)})^2\,\kappa(\mathcal{P}^{(k-1)}, q^{(k-1)}). \qquad (3.11)$$
The partition condition number $\kappa(\mathcal{P}^{(k)}, q^{(k)})$ is level-wise information concerning only the partition $\mathcal{P}^{(k)}$. Similar to what we do in Algorithm 4, we will impose a uniform bound $c$ in the partitioning process so that $\kappa(\mathcal{P}^{(k)}, q^{(k)}) \le c$ on every level. The ratio $\gamma^{(k)}$ reflects the scale gap between levels $k-1$ and $k$, which is why it governs the condition number (spectrum width) of $B^{(k)}$. However, it turns out that the choice of $\gamma^{(k)}$ is not arbitrary, and it is subject to a restriction arising from sparsity considerations.
So far, the $A^{(k)}$ and $B^{(k+1)}$ are dense for $k \ge 1$, since the basis $\Psi^{(k)}$ is global. It would be pointless to bound the condition number of $B^{(k)}$ if we could not take advantage of the locality/sparsity of $A$. So in practice, the multiresolution operator decomposition is performed with localization on each level to ensure locality/sparsity. This gives the modified multiresolution operator decomposition described in the following subsection.