Chapter V: Incomplete Cholesky factorization
5.4 Proof of stability of ICHOL(0)
Theorem 17 implies that the application of Algorithm 3 to a suitable $O(\epsilon)$-perturbation $\Theta - E$ returns an $O(\epsilon)$-accurate Cholesky factorization of $\Theta$ in computational complexity $O\bigl(N \log^2(N) \log^{2d}(N/\epsilon)\bigr)$. In practice, we do not have access to $E$, so we need to rely on the stability of Algorithm 3 to deduce that $\Theta$ and $\Theta - E$ (used as inputs) would yield similar outputs for sufficiently small $E$. Even though such a stability property of ICHOL(0) would also be required by prior works on incomplete LU factorization such as [94], we did not find this type of result in the literature. We also found it surprisingly difficult to prove (and were unable to do so) when using the maximin ordering and sparsity pattern, although we always observed stability of Algorithm 3 in practice, for reasonable values of $\rho$.
The key problem is that the standard perturbation bounds for Schur complements are multiplicative. Therefore, applying them $N$ times (once after each elimination) results in a possible growth of the approximation error that is exponential in $N$ and cannot be compensated for by a logarithmic increase of $\rho$.
However, we have already seen in Section 5.2 that, when using a supernodal multicolor ordering, the incomplete Cholesky factorization can be expressed as a smaller number of groups of independent dense linear algebra operations. In this section, we prove rigorously that the number of colors used by the multicolor ordering is bounded above by $O(\log(N))$ and that this allows us to control the approximation error of the supernodal factorization by invoking only $O(\log(N))$ Schur complement perturbation bounds. Therefore, the error amplification is polynomial in $N$ and can be controlled by choosing $\rho \gtrsim \log(N)$. By relating the ordinary and supernodal Cholesky factorizations, we are able to deduce the same error bounds for the ordinary Cholesky factorization when using a supernodal multicolor ordering and sparsity pattern.
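To make the contrast concrete, the following display is a purely schematic illustration (not part of the formal argument), with $\delta$ standing for a generic per-step amplification factor incurred by one application of a Schur complement perturbation bound:
$$
\underbrace{(1+\delta)^{N}\, \epsilon}_{\text{one bound per elimination}} = e^{\Omega(N)}\, \epsilon
\qquad \text{versus} \qquad
\underbrace{(1+\delta)^{O(\log N)}\, \epsilon}_{\text{one bound per color/level group}} = N^{O(1)}\, \epsilon,
$$
so that in the grouped setting the amplification is only polynomial in $N$ and can be compensated for by choosing the initial accuracy $\epsilon$ smaller by a polynomial factor, which increases $\rho$ only logarithmically in $N$.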
5.4.2 Revisiting the supernodal multicolor ordering
We begin by reintroducing the supernodal multicolor ordering of Section 5.2 in slightly different notation.
For $r > 0$, $1 \le k \le q$ and $i \in J^{(k)}$, write
$$
B^{(k)}_{r}(i) \coloneqq \bigl\{ j \in J^{(k)} \;\big|\; d(i,j) \le r \bigr\}. \tag{5.11}
$$

Construction 3 (Supernodal multicolor ordering and sparsity pattern). Let $\Theta \in \mathbb{R}^{I \times I}$ with $I \coloneqq \bigcup_{1 \le k \le q} J^{(k)}$ and let $d(\cdot,\cdot)$ be a hierarchical pseudometric. For $\rho \ge 1$, define the supernodal multicolor ordering $\prec_\rho$ and sparsity pattern $S_\rho$ as follows. For each $k \in \{1, \dots, q\}$, select a subset $\tilde{J}^{(k)} \subset J^{(k)}$ of indices such that
$$
\forall\, \tilde{i}, \tilde{j} \in \tilde{J}^{(k)}, \quad \tilde{i} \neq \tilde{j} \;\Longrightarrow\; B^{(k)}_{\rho/2}(\tilde{i}) \cap B^{(k)}_{\rho/2}(\tilde{j}) = \emptyset, \tag{5.12}
$$
$$
\forall\, i \in J^{(k)}, \;\exists\, \tilde{i} \in \tilde{J}^{(k)} : i \in B^{(k)}_{\rho}(\tilde{i}). \tag{5.13}
$$
Assign every index in $J^{(k)}$ to the element of $\tilde{J}^{(k)}$ closest to it, using an arbitrary method to break ties. That is, writing $i \rightsquigarrow \tilde{i}$ for the assignment of $i$ to $\tilde{i}$,
$$
\tilde{i} \in \operatorname*{arg\,min}_{\tilde{i}' \in \tilde{J}^{(k)}} d\bigl(i, \tilde{i}'\bigr), \tag{5.14}
$$
for all $i \in J^{(k)}$ and $\tilde{i} \in \tilde{J}^{(k)}$ such that $i \rightsquigarrow \tilde{i}$. Define $\tilde{I} \coloneqq \bigcup_{1 \le k \le q} \tilde{J}^{(k)}$ and define the auxiliary sparsity pattern $\tilde{S}_\rho \subset \tilde{I} \times \tilde{I}$ by
$$
\tilde{S}_\rho \coloneqq \Bigl\{ \bigl(\tilde{i}, \tilde{j}\bigr) \in \tilde{I} \times \tilde{I} \;\Big|\; \exists\, i \rightsquigarrow \tilde{i},\, j \rightsquigarrow \tilde{j} : d(i,j) \le \rho \Bigr\}. \tag{5.15}
$$
Define the sparsity pattern $S_\rho \subset I \times I$ as
$$
S_\rho \coloneqq \Bigl\{ (i, j) \in I \times I \;\Big|\; \exists\, \tilde{i}, \tilde{j} \in \tilde{I} : i \rightsquigarrow \tilde{i},\, j \rightsquigarrow \tilde{j},\, \bigl(\tilde{i}, \tilde{j}\bigr) \in \tilde{S}_\rho \Bigr\} \tag{5.16}
$$
and call the elements of $\tilde{J}^{(k)}$ supernodes. Color each $\tilde{i} \in \tilde{J}^{(k)}$ in one of $p^{(k)}$ colors such that no $\tilde{i}, \tilde{j} \in \tilde{J}^{(k)}$ with $(\tilde{i}, \tilde{j}) \in \tilde{S}_\rho$ have the same color. For $i \in J^{(k)}$, write $\mathrm{node}(i)$ for the $\tilde{i} \in \tilde{J}^{(k)}$ such that $i \rightsquigarrow \tilde{i}$ and write $\mathrm{color}(\tilde{i})$ for the color of $\tilde{i}$. Define the supernodal multicolor ordering $\prec_\rho$ by reordering the elements of $I$ such that
(1) $i \prec_\rho j$ for $i \in J^{(k)}$, $j \in J^{(l)}$ and $k < l$;
(2) within each level $J^{(k)}$, we order the elements of supernodes colored in the same color consecutively, i.e. given $i, j \in J^{(k)}$ such that $\mathrm{color}(\mathrm{node}(i)) \neq \mathrm{color}(\mathrm{node}(j))$, $i \prec_\rho j \implies i' \prec_\rho j'$ for all $i', j'$ with $\mathrm{color}(\mathrm{node}(i')) = \mathrm{color}(\mathrm{node}(i))$ and $\mathrm{color}(\mathrm{node}(j')) = \mathrm{color}(\mathrm{node}(j))$; and
(3) the elements of each supernode appear consecutively, i.e. given $i, j \in J^{(k)}$ such that $\mathrm{node}(i) \neq \mathrm{node}(j)$, $i \prec_\rho j \implies i' \prec_\rho j'$ for all $i', j'$ with $\mathrm{node}(i') = \mathrm{node}(i)$ and $\mathrm{node}(j') = \mathrm{node}(j)$.
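To make the patterns (5.15) and (5.16) of Construction 3 concrete, here is a minimal Python sketch that builds both from the assignment map; the function name, the pair list `close_pairs`, and the dictionary `node` are our own illustrative choices and not part of the construction.

# Sketch: build the auxiliary pattern (5.15) and the induced pattern (5.16)
# from the assignment map i ~> node(i) and the pairs within pseudodistance rho.
def supernodal_patterns(close_pairs, node):
    """close_pairs: iterable of pairs (i, j) with d(i, j) <= rho;
    node: dict mapping each index i in I to its supernode node(i)."""
    S_tilde = set()
    for i, j in close_pairs:                 # (5.15): some i ~> i~, j ~> j~ with d(i, j) <= rho
        S_tilde.add((node[i], node[j]))
        S_tilde.add((node[j], node[i]))      # keep the pattern symmetric

    members = {}                             # supernode -> indices assigned to it
    for i, i_tilde in node.items():
        members.setdefault(i_tilde, []).append(i)

    S = set()
    for i_tilde, j_tilde in S_tilde:         # (5.16): lift the supernodal pattern back to I x I
        for i in members[i_tilde]:
            for j in members[j_tilde]:
                S.add((i, j))
    return S_tilde, S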
Starting from a hierarchical ordering and sparsity pattern, the modified ordering and sparsity pattern can be obtained efficiently:
Lemma 14. In the setting of Examples 1 and 2, given $\{(i,j) \mid d(i,j) \le \rho\}$, there exist constants $C$ and $p_{\max}$ depending only on the dimension $d$ and the cost of computing $d(\cdot,\cdot)$ such that the ordering and sparsity pattern presented in Construction 3 can be constructed with $p^{(k)} \le p_{\max}$, for each $1 \le k \le q$, in computational complexity $C N \rho^{d}$.
Proof. The aggregation into supernodes can be performed by a greedy algorithm that keeps a list of all nodes not yet within distance $\rho/2$ of a supernode and selects them one at a time as new supernodes. After each selection, we go through the $\rho$-neighbourhood of the new supernode and remove the points within distance $\rho/2$ from our list of candidates for future supernodes.
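A minimal Python sketch of this greedy sweep is given below; it assumes the $\rho$-neighbourhood lists of Lemma 14 are available, uses the exclusion radius $\rho/2$ from the argument above, and all function and variable names are our own.

# Sketch of the greedy supernode aggregation described above, for a single level J^(k).
# neighbors[i] is assumed to hold the precomputed pairs (j, d(i, j)) with d(i, j) <= rho.
def select_supernodes(J_k, neighbors, rho):
    candidates = set(J_k)          # nodes not yet within rho/2 of any chosen supernode
    assignment, best_dist = {}, {}
    while candidates:
        s = candidates.pop()       # promote an arbitrary remaining candidate to a supernode
        assignment[s], best_dist[s] = s, 0.0
        for j, dist in neighbors[s]:
            if dist <= rho / 2:
                candidates.discard(j)   # j is now covered, so it cannot become a supernode
            if dist < best_dist.get(j, float("inf")):
                best_dist[j], assignment[j] = dist, s   # track the closest supernode, cf. (5.14)
    return assignment              # maps each i in J^(k) to a supernode node(i)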
To create the coloring, we use the greedy graph coloring of [125] on the undirected graph $G$ with vertices $\tilde{J}^{(k)}$ and edges $\bigl\{ (\tilde{i}, \tilde{j}) \in \tilde{S}_\rho \;\big|\; \tilde{i}, \tilde{j} \in \tilde{J}^{(k)} \bigr\}$. Defining $\deg(G)$ as the maximum number of edges connected to any vertex of $G$, the computational complexity of greedy graph coloring is bounded above by $\deg(G)\, \#\tilde{J}^{(k)}$ and the number of colors used by $\deg(G) + 1$. A sphere-packing argument shows that $\deg(G)$ is at most a constant depending only on the dimension $d$, which yields the result.
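For concreteness, the greedy coloring step can be sketched in a few lines of Python; `adjacency` is assumed to encode, for each supernode of one level, its neighbours in the graph $G$ above, and the names are illustrative rather than taken from [125].

# Greedy graph coloring of the supernodal graph G of one level.
# Each vertex receives the smallest color not used by its already-colored neighbours,
# so at most deg(G) + 1 colors are used in total.
def greedy_coloring(vertices, adjacency):
    color = {}
    for v in vertices:
        taken = {color[w] for w in adjacency[v] if w in color}
        c = 0
        while c in taken:       # v has at most deg(G) neighbours, so c <= deg(G)
            c += 1
        color[v] = c
    return color                # colors 0, 1, ..., at most deg(G)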
5.4.3 Proof of stability of incomplete Cholesky factorization in the supernodal multicolor ordering
We will now bound the approximation error of the Cholesky factors obtained from Algorithm 3, using the supernodal multicolor ordering and sparsity pattern described in Construction 3. For $\tilde{i}, \tilde{j} \in \tilde{I}$, let $\Theta_{\tilde{i},\tilde{j}}$ be the submatrix $(\Theta_{ij})_{i \in \tilde{i},\, j \in \tilde{j}}$ and let $\sqrt{M}$ be the (dense and lower-triangular) Cholesky factor of a matrix $M$.
Algorithm 3 with supernodal multicolor ordering $\prec_\rho$ and sparsity pattern $S_\rho$ is equivalent to the block-incomplete Cholesky factorization described in Algorithm 9, where the function $\mathrm{Restrict!}(\Theta, S_\rho)$ sets all entries of $\Theta$ outside of $S_\rho$ to zero.
Algorithm 9 Supernodal incomplete Cholesky factorization
Input: $\Theta \in \mathbb{R}^{I \times I}$ symmetric
Output: $L \in \mathbb{R}^{I \times I}$ lower triangular
  $\mathrm{Restrict!}(\Theta, S_\rho)$
  for $\tilde{i} \in \tilde{I}$ do
    $L_{:,\tilde{i}} \leftarrow \Theta_{:,\tilde{i}} \big/ \sqrt{\Theta_{\tilde{i},\tilde{i}}}^{\,\top}$
    for $\tilde{j} : (\tilde{j}, \tilde{i}) \in \tilde{S}_\rho$ do
      for $\tilde{k} : (\tilde{k}, \tilde{i}), (\tilde{k}, \tilde{j}) \in \tilde{S}_\rho$ do
        $\Theta_{\tilde{k},\tilde{j}} \leftarrow \Theta_{\tilde{k},\tilde{j}} - \Theta_{\tilde{k},\tilde{i}} \bigl(\Theta_{\tilde{i},\tilde{i}}\bigr)^{-1} \Theta_{\tilde{i},\tilde{j}}$
      end for
    end for
  end for
  return $L$
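The following NumPy sketch mirrors Algorithm 9 on a dense array and is meant only to make the block operations concrete; the inputs `blocks`, `S_tilde`, and `mask` are our own illustrative encodings of the supernodes, the auxiliary pattern $\tilde{S}_\rho$, and the pattern $S_\rho$, and a practical implementation would of course use sparse storage to attain the stated complexity.

import numpy as np

# Dense sketch of Algorithm 9 (illustrative only).
# blocks:  list of integer index arrays, one per supernode, ordered according to <_rho
# S_tilde: set of supernode-index pairs (a, b) in the auxiliary pattern
# mask:    boolean I x I array encoding the sparsity pattern S_rho
def supernodal_ichol(Theta, blocks, S_tilde, mask):
    Theta = np.where(mask, Theta, 0.0)                 # Restrict!(Theta, S_rho)
    L = np.zeros_like(Theta)
    for a, ia in enumerate(blocks):
        C = np.linalg.cholesky(Theta[np.ix_(ia, ia)])  # sqrt of the diagonal block
        rows = np.concatenate(blocks[a:])              # not-yet-eliminated rows
        # L[rows, ia] <- Theta[rows, ia] C^{-T}
        L[np.ix_(rows, ia)] = np.linalg.solve(C, Theta[np.ix_(rows, ia)].T).T
        inv_blk = np.linalg.inv(Theta[np.ix_(ia, ia)])
        for b in range(a + 1, len(blocks)):
            if (b, a) not in S_tilde:
                continue
            for c in range(a + 1, len(blocks)):
                if (c, a) not in S_tilde or (c, b) not in S_tilde:
                    continue
                ib, ic = blocks[b], blocks[c]
                # Schur complement update of the (c, b) block
                Theta[np.ix_(ic, ib)] -= Theta[np.ix_(ic, ia)] @ inv_blk @ Theta[np.ix_(ia, ib)]
    return L

Grouping the iterations of the outer loop by level and color, as in Algorithm 10 below, merges the inner updates of a group into a single dense Schur complementation per $(k, c)$ block, which is the form used in the stability analysis.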
We will now reformulate the above algorithm using the fact that the elimination of nodes of the same color, on the same level of the hierarchy, happens consecutively.
Let $p$ be the maximal number of colors used on any level of the hierarchy. We can then write $I = \bigcup_{1 \le k \le q,\, 1 \le c \le p} J^{(k,c)}$, where $J^{(k,c)}$ is the set of indices on level $k$ colored in the color $c$. Let $\Theta_{(k,c),(l,b)}$ be the restriction of $\Theta$ to $J^{(k,c)} \times J^{(l,b)}$ and write $(l, b) \prec (k, c) \iff l < k$ or ($l = k$ and $b < c$). We can then rewrite Algorithm 9 as:
Algorithm 10 Supernodal incomplete Cholesky factorization
Input: $\Theta \in \mathbb{R}^{I \times I}$ symmetric
Output: $L \in \mathbb{R}^{I \times I}$ lower triangular
  for $1 \le k \le q$ do
    for $1 \le c \le p$ do
      $\mathrm{Restrict!}(\Theta, S_\rho)$
      $L_{(:,:),(k,c)} \leftarrow \Theta_{(:,:),(k,c)} \big/ \sqrt{\Theta_{(k,c),(k,c)}}^{\,\top}$
      $\Theta \leftarrow \Theta - \Theta_{(:,:),(k,c)} \bigl(\Theta_{(k,c),(k,c)}\bigr)^{-1} \Theta_{(k,c),(:,:)}$
    end for
  end for
  return $L$
For $1 \le k \le q$, $1 \le c \le p$ and a matrix $M \in \mathbb{R}^{I \times I}$ with $M_{(:,:),(l,b)}, M_{(l,b),(:,:)} = 0$ for all $(l, b) \prec (k, c)$, let $\mathcal{S}[M]$ be the matrix obtained by applying $\mathrm{Restrict!}(M, S_\rho)$ followed by the Schur complementation $M \leftarrow M - M_{(:,:),(k,c)} \bigl(M_{(k,c),(k,c)}\bigr)^{-1} M_{(k,c),(:,:)}$. We now prove a stability estimate for the operator $\mathcal{S}$. Let $M_{k,(l,b)}$ be the restriction of a matrix $M \in \mathbb{R}^{I \times I}$ to $J^{(k)} \times J^{(l,b)}$.
Lemma 15. For $1 \le k^\circ \le q$ and $1 \le c^\circ \le p$, let $\Theta, E \in \mathbb{R}^{I \times I}$ be such that
$$
\Theta_{(:,:),(l,b)}, \Theta_{(l,b),(:,:)} = 0 \quad \text{for all } (l, b) \prec (k^\circ, c^\circ), \tag{5.17}
$$
and (writing $\Theta_{k,l}$ for the $J^{(k)} \times J^{(l)}$ submatrix of $\Theta$ and $\sigma_{\max}$ for maximal singular values) define
$$
\sigma_{\min} \coloneqq \sigma_{\min}\bigl(\Theta_{k^\circ, k^\circ}\bigr), \qquad \sigma_{\max} \coloneqq \max_{k^\circ \le k \le q} \sigma_{\max}\bigl(\Theta_{k^\circ, k}\bigr). \tag{5.18}
$$
If
$$
\max_{k^\circ \le k, l \le q} \| E_{k,l} \|_{\mathrm{Fro}} \le \epsilon \le \frac{\sigma_{\min}}{2}, \tag{5.19}
$$
then the following perturbation estimate holds:
$$
\max_{k^\circ \le k, l \le q} \bigl\| \bigl( \mathcal{S}[\Theta] - \mathcal{S}[\Theta + E] \bigr)_{k,l} \bigr\|_{\mathrm{Fro}} \le \left( \frac{3}{2} + \frac{2\sigma_{\max}}{\sigma_{\min}} + \frac{8\sigma_{\max}^2}{\sigma_{\min}^2} \right) \epsilon. \tag{5.20}
$$
Proof. Write $\hat{\Theta}, \hat{E}$ for the versions of $\Theta, E$ set to zero outside of $S_\rho$. For $k^\circ \le k, l \le q$,
$$
\begin{aligned}
&\bigl( \mathcal{S}[\Theta + E] - \mathcal{S}[\Theta] \bigr)_{k,l} && (5.21)\\
&\quad = \hat{\Theta}_{k,l} + \hat{E}_{k,l} - \bigl(\hat{\Theta} + \hat{E}\bigr)_{k,(k^\circ,c^\circ)} \bigl(\hat{\Theta} + \hat{E}\bigr)^{-1}_{(k^\circ,c^\circ),(k^\circ,c^\circ)} \bigl(\hat{\Theta} + \hat{E}\bigr)_{(k^\circ,c^\circ),l} && (5.22)\\
&\qquad - \hat{\Theta}_{k,l} + \hat{\Theta}_{k,(k^\circ,c^\circ)} \hat{\Theta}^{-1}_{(k^\circ,c^\circ),(k^\circ,c^\circ)} \hat{\Theta}_{(k^\circ,c^\circ),l} && (5.23)\\
&\quad = \hat{E}_{k,l} + \bigl(\hat{\Theta} + \hat{E}\bigr)_{k,(k^\circ,c^\circ)} \bigl(\hat{\Theta} + \hat{E}\bigr)^{-1}_{(k^\circ,c^\circ),(k^\circ,c^\circ)} \hat{E}_{(k^\circ,c^\circ),(k^\circ,c^\circ)} \hat{\Theta}^{-1}_{(k^\circ,c^\circ),(k^\circ,c^\circ)} \bigl(\hat{\Theta} + \hat{E}\bigr)_{(k^\circ,c^\circ),l} && (5.24)\\
&\qquad - \bigl(\hat{\Theta} + \hat{E}\bigr)_{k,(k^\circ,c^\circ)} \hat{\Theta}^{-1}_{(k^\circ,c^\circ),(k^\circ,c^\circ)} \bigl(\hat{\Theta} + \hat{E}\bigr)_{(k^\circ,c^\circ),l} + \hat{\Theta}_{k,(k^\circ,c^\circ)} \hat{\Theta}^{-1}_{(k^\circ,c^\circ),(k^\circ,c^\circ)} \hat{\Theta}_{(k^\circ,c^\circ),l} && (5.25)\\
&\quad = \hat{E}_{k,l} + \bigl(\hat{\Theta} + \hat{E}\bigr)_{k,(k^\circ,c^\circ)} \bigl(\hat{\Theta} + \hat{E}\bigr)^{-1}_{(k^\circ,c^\circ),(k^\circ,c^\circ)} \hat{E}_{(k^\circ,c^\circ),(k^\circ,c^\circ)} \hat{\Theta}^{-1}_{(k^\circ,c^\circ),(k^\circ,c^\circ)} \bigl(\hat{\Theta} + \hat{E}\bigr)_{(k^\circ,c^\circ),l} && (5.26)\\
&\qquad - \hat{E}_{k,(k^\circ,c^\circ)} \hat{\Theta}^{-1}_{(k^\circ,c^\circ),(k^\circ,c^\circ)} \hat{\Theta}_{(k^\circ,c^\circ),l} - \hat{\Theta}_{k,(k^\circ,c^\circ)} \hat{\Theta}^{-1}_{(k^\circ,c^\circ),(k^\circ,c^\circ)} \hat{E}_{(k^\circ,c^\circ),l} && (5.27)\\
&\qquad - \hat{E}_{k,(k^\circ,c^\circ)} \hat{\Theta}^{-1}_{(k^\circ,c^\circ),(k^\circ,c^\circ)} \hat{E}_{(k^\circ,c^\circ),l}, && (5.28)
\end{aligned}
$$
where the second equality follows from the matrix identity
$$
(A + B)^{-1} = A^{-1} - (A + B)^{-1} B A^{-1}. \tag{5.29}
$$
Now recall that, for all $A \in \mathbb{R}^{n \times m}$ and $B \in \mathbb{R}^{m \times r}$, we have $\|A\| \le \|A\|_{\mathrm{Fro}}$ and $\|A B\|_{\mathrm{Fro}} \le \|A\|\, \|B\|_{\mathrm{Fro}}$. Therefore, $\bigl\| \bigl(\hat{\Theta} + \hat{E}\bigr)^{-1}_{(k^\circ,c^\circ),(k^\circ,c^\circ)} \bigr\| \le 2/\sigma_{\min}$ and $\bigl\| \bigl(\hat{\Theta} + \hat{E}\bigr)_{k,(k^\circ,c^\circ)} \bigr\| \le 2\sigma_{\max}$. Combining these estimates and using the triangle inequality yields
$$
\begin{aligned}
&\bigl\| \bigl( \mathcal{S}[\Theta + E] - \mathcal{S}[\Theta] \bigr)_{k,l} \bigr\|_{\mathrm{Fro}} && (5.30)\\
&\quad \le \|E_{k,l}\|_{\mathrm{Fro}} + \frac{8\sigma_{\max}^2}{\sigma_{\min}^2} \|E_{k^\circ,k^\circ}\|_{\mathrm{Fro}} + \frac{\sigma_{\max}}{\sigma_{\min}} \bigl( \|E_{k,k^\circ}\|_{\mathrm{Fro}} + \|E_{k^\circ,l}\|_{\mathrm{Fro}} \bigr) && (5.31)\\
&\qquad + \sigma_{\min}^{-1} \|E_{k,k^\circ}\|_{\mathrm{Fro}} \|E_{k^\circ,l}\|_{\mathrm{Fro}} && (5.32)\\
&\quad \le \left( 1 + \frac{8\sigma_{\max}^2}{\sigma_{\min}^2} + \frac{2\sigma_{\max}}{\sigma_{\min}} + \frac{\epsilon}{\sigma_{\min}} \right) \epsilon && (5.33)\\
&\quad \le \left( \frac{3}{2} + \frac{2\sigma_{\max}}{\sigma_{\min}} + \frac{8\sigma_{\max}^2}{\sigma_{\min}^2} \right) \epsilon. && (5.34)
\end{aligned}
$$
Recursive application of the above lemma gives a stability result for the incomplete Cholesky factorization.
Lemma 16. For $\rho > 0$, let $\prec_\rho$ and $S_\rho$ be a supernodal ordering and sparsity pattern such that the maximal number of colors used on each level is at most $p$. Let $L^{\rho}$ be an invertible lower-triangular matrix with nonzero pattern $S_\rho$ and define $A \coloneqq L^{\rho} L^{\rho,\top}$. Assume that $A$ satisfies Condition 2 with constant $\kappa$. Then there exists a universal constant $C$ such that, for all $0 < \epsilon < \frac{\lambda_{\min}(A)}{2 q^2 (C\kappa)^{2pq}}$ and all $E \in \mathbb{R}^{I \times I}$ with $\|E\|_{\mathrm{Fro}} \le \epsilon$,
$$
\bigl\| A - \tilde{L}^{\rho} \tilde{L}^{\rho,\top} \bigr\|_{\mathrm{Fro}} \le q^2 (C\kappa)^{2pq}\, \epsilon, \tag{5.35}
$$
where $\tilde{L}^{\rho}$ is the Cholesky factor obtained by applying Algorithm 10 to $A + E$.

Proof. The result follows from applying Lemma 15 at each step of Algorithm 10.
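Schematically, and suppressing the bookkeeping of how the singular-value ratio of Lemma 15 is controlled in terms of the Condition 2 constant along the factorization, the recursion behind this proof reads as follows: if $\epsilon_m$ denotes the maximal blockwise Frobenius error after $m$ of the at most $pq$ color/level elimination steps of Algorithm 10, then Lemma 15 gives
$$
\epsilon_{m+1} \le \left( \frac{3}{2} + \frac{2\sigma_{\max}}{\sigma_{\min}} + \frac{8\sigma_{\max}^2}{\sigma_{\min}^2} \right) \epsilon_m \le (C\kappa)^{2}\, \epsilon_m,
\qquad \text{hence} \qquad
\epsilon_{pq} \le (C\kappa)^{2pq}\, \epsilon_0,
$$
which is the amplification factor appearing in (5.35); the remaining factor $q^2$ can be read as the cost of passing from the blockwise bounds (one per pair of levels) to the full Frobenius norm.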
5.4.4 Conclusion
Using the stability result in Lemma 16, we can finally prove that, when using the supernodal multicolor ordering and sparsity pattern, the incomplete Cholesky factorization applied to $\Theta$ attains an $\epsilon$-accurate Cholesky factorization in computational complexity $O\bigl(N \log^2(N) \log^{2d}(N/\epsilon)\bigr)$.
Theorem 18. In the setting of Examples 1 and 2, there exists a constant $C$ depending only on $d$, $\Omega$, $\|\mathcal{L}\|$, $\|\mathcal{L}^{-1}\|$, $s$, and $\delta$ such that, given the ordering $\prec_\rho$ and sparsity pattern $S_\rho$ defined as in Construction 3 with $\rho \ge C \log(N/\epsilon)$, the incomplete Cholesky factor $L$ obtained from Algorithm 3 has accuracy
$$
\| L L^{\top} - \Theta \|_{\mathrm{Fro}} \le \epsilon. \tag{5.36}
$$
Furthermore, Algorithm 3 has complexity at most $C N \rho^{2d} \log^2(N)$ in time and at most $C N \rho^{d} \log(N)$ in space.
Proof of Theorem 18. Theorem 9 implies that, by choosing $\rho \ge \tilde{C} \log(N/\epsilon)$, there exists a lower-triangular matrix $\tilde{L}^{\rho}$ with sparsity pattern $S_\rho$ such that $\bigl\| \Theta - \tilde{L}^{\rho} \tilde{L}^{\rho,\top} \bigr\|_{\mathrm{Fro}} \le \epsilon$. Theorem 7 implies that Examples 1 and 2 satisfy $\lambda_{\min}(\Theta) \ge 1/\mathrm{poly}(N)$. Therefore, choosing $\rho \ge \tilde{C} \log N$ ensures that $\epsilon < \frac{\lambda_{\min}(\Theta)}{2}$ and thus that $\tilde{\Theta} \coloneqq \tilde{L}^{\rho} \tilde{L}^{\rho,\top}$ satisfies Condition 2 with constant $2\kappa$, where $\kappa$ is the corresponding constant for $\Theta$. By possibly increasing $\tilde{C}$ again, $\rho \ge \tilde{C} \log N$ ensures that
$$
\epsilon \le \frac{\lambda_{\min}(\Theta)}{2 q^2 (2 C \kappa)^{2pq}},
$$
where $C$ is the constant of Lemma 16, since $q \lesssim \log N$ and, by Lemma 14, $p$ is bounded independently of $N$. Thus, by Lemma 16, the Cholesky factor $L^{\rho}$ obtained from applying Algorithm 10 to $\Theta = \tilde{\Theta} + \bigl( \Theta - \tilde{\Theta} \bigr)$ satisfies
$$
\bigl\| \tilde{\Theta} - L^{\rho} L^{\rho,\top} \bigr\|_{\mathrm{Fro}} \le q^2 (4 C \kappa)^{2pq}\, \epsilon \le \mathrm{poly}(N)\, \epsilon, \tag{5.37}
$$
where the polynomial depends only on $C$, $\kappa$, and $p$. Together with $\bigl\| \Theta - \tilde{\Theta} \bigr\|_{\mathrm{Fro}} \le \epsilon$ and the triangle inequality, this bounds $\bigl\| \Theta - L^{\rho} L^{\rho,\top} \bigr\|_{\mathrm{Fro}}$ by $\mathrm{poly}(N)\, \epsilon$; replacing $\epsilon$ by $\epsilon/\mathrm{poly}(N)$ throughout only increases the constant in the requirement $\rho \ge C \log(N/\epsilon)$. Since, for the ordering $\prec_\rho$ and sparsity pattern $S_\rho$, the Cholesky factors obtained via Algorithms 3 and 10 coincide, we obtain the result.
This result holds for both the element-wise and the supernodal factorization, in either its left-, up-, or right-looking form. As remarked in Section 5.3.1, using the two-way supernodal sparsity pattern for the factorization of $\Theta^{-1}$ in the fine-to-coarse ordering degrades the asymptotic complexity. Therefore, the above result does not immediately prove the accuracy of the Cholesky factorization in this setting at optimal complexity. However, the column-supernodal factorization described in Section 5.3 can similarly be described in terms of $O(\log(N))$ Schur complementations. Thus, the above proof can be modified to show that, when using the column-supernodal multicolor ordering and sparsity pattern, ICHOL(0) applied to $\Theta^{-1}$ computes an $\epsilon$-approximation in computational complexity $O(N \log(N/\epsilon))$.
5.5 Numerical example: Compression of dense kernel matrices