Setting and notation - Proving exponential decay of Cholesky factors

Chapter IV: Proving exponential decay of Cholesky factors

4.2 Setting and notation

4.2.1 The class of elliptic operators

Figure 4.1: A regularity criterion. We measure the regularity of the distribution of measurement points as the ratio of $\delta_{\min}$, the smallest distance between neighboring points or between points and the boundary, and $\delta_{\max}$, the radius of the largest ball that does not contain any points.

For our rigorous, a priori, complexity-vs.-accuracy estimates, we assume that $\mathcal{G}$ is the Green's function of an elliptic operator $\mathcal{L}$ of order $2s$ ($s, d \in \mathbb{N}$), defined on a bounded Lipschitz domain $\Omega \subset \mathbb{R}^d$, and acting on $H^s_0(\Omega)$, the Sobolev space of (zero boundary value) functions having derivatives of order $s$ in $L^2(\Omega)$. More precisely, writing $H^{-s}(\Omega)$ for the dual space of $H^s_0(\Omega)$ with respect to the $L^2(\Omega)$ scalar product, our rigorous estimates will be stated for an arbitrary linear bijection

$$\mathcal{L} : H^s_0(\Omega) \to H^{-s}(\Omega) \tag{4.4}$$

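that is symmetric (i.e. $\int_\Omega u \mathcal{L} v \,\mathrm{d}x = \int_\Omega v \mathcal{L} u \,\mathrm{d}x$), positive (i.e. $\int_\Omega u \mathcal{L} u \,\mathrm{d}x \geq 0$), and local in the sense that

$$\int_\Omega u \mathcal{L} v \,\mathrm{d}x = 0 \quad \text{for all } u, v \in H^s_0(\Omega) \text{ such that } \operatorname{supp} u \cap \operatorname{supp} v = \emptyset. \tag{4.5}$$

Let $\|\mathcal{L}\| := \sup_{u \in H^s_0} \|\mathcal{L}u\|_{H^{-s}} / \|u\|_{H^s_0}$ and $\|\mathcal{L}^{-1}\| := \sup_{f \in H^{-s}} \|\mathcal{L}^{-1}f\|_{H^s_0} / \|f\|_{H^{-s}}$ denote the operator norms of $\mathcal{L}$ and $\mathcal{L}^{-1}$. The complexity and accuracy estimates for our algorithm will depend on (and only on) $d$, $s$, $\Omega$, $\|\mathcal{L}\|$, $\|\mathcal{L}^{-1}\|$, and the parameter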

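$$\delta := \frac{\delta_{\min}}{\delta_{\max}} := \frac{\min_{i \neq j \in I} \operatorname{dist}\left(x_i, \{x_j\} \cup \partial\Omega\right)}{\max_{x \in \Omega} \operatorname{dist}\left(x, \{x_i\}_{i \in I} \cup \partial\Omega\right)}, \tag{4.6}$$

the geometric meaning of which is illustrated in Figure 4.1.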
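
As a rough illustration of (4.6), the following sketch computes $\delta_{\min}$ exactly and approximates $\delta_{\max}$ by grid sampling. It assumes, purely for illustration, that $\Omega$ is the unit square; the function names `delta_min` and `delta_max` and the grid resolution `n` are hypothetical and not part of the original text.

```python
import math

def delta_min(points):
    """Smallest distance from a point to another point or to the boundary of [0, 1]^2."""
    best = math.inf
    for i, (x, y) in enumerate(points):
        # Distance from x_i to the boundary of the (assumed) unit square.
        best = min(best, x, 1 - x, y, 1 - y)
        for j, (u, v) in enumerate(points):
            if i != j:
                best = min(best, math.hypot(x - u, y - v))
    return best

def delta_max(points, n=100):
    """Approximate radius of the largest ball free of points, sampled on an n-by-n grid."""
    best = 0.0
    for a in range(n):
        for b in range(n):
            x, y = (a + 0.5) / n, (b + 0.5) / n
            # Distance from the sample to the boundary, then to every point.
            d = min(x, 1 - x, y, 1 - y)
            for (u, v) in points:
                d = min(d, math.hypot(x - u, y - v))
            best = max(best, d)
    return best

points = [(0.25, 0.25), (0.75, 0.25), (0.25, 0.75), (0.75, 0.75)]
delta = delta_min(points) / delta_max(points)
```

Because the grid only samples $\Omega$, `delta_max` underestimates the true maximum slightly, so the computed `delta` is a slight overestimate of the ratio in (4.6).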

4.2.2 Discretization in the abstract

Before talking about computation, we need to discretize the infinite-dimensional spaces $H^s_0(\Omega)$ and $H^{-s}(\Omega)$ by approximating them with finite-dimensional vector spaces. We first introduce this procedure in the abstract.

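For $\mathcal{B}$ a separable Banach space with dual space $\mathcal{B}^*$ (such as $H^s_0(\Omega)$ and $H^{-s}(\Omega)$), we write $[\cdot, \cdot]$ for the duality product between $\mathcal{B}^*$ and $\mathcal{B}$. Let $\mathcal{L} : \mathcal{B} \to \mathcal{B}^*$ be a linear bijection and let $\mathcal{G} := \mathcal{L}^{-1}$. Assume $\mathcal{L}$ to be symmetric and positive (i.e. $[\mathcal{L}u, v] = [\mathcal{L}v, u]$ and $[\mathcal{L}u, u] \geq 0$ for $u, v \in \mathcal{B}$). Let $\|\cdot\|$ be the quadratic (energy) norm defined by $\|u\|^2 := [\mathcal{L}u, u]$ for $u \in \mathcal{B}$ and let $\|\cdot\|_*$ be its dual norm defined by

$$\|\phi\|_* := \sup_{0 \neq u \in \mathcal{B}} \frac{[\phi, u]}{\|u\|} = \sqrt{[\phi, \mathcal{G}\phi]} \quad \text{for } \phi \in \mathcal{B}^*. \tag{4.7}$$

Let $\{\phi_i\}_{i \in I}$ be linearly independent elements of $\mathcal{B}^*$ (known as measurement functions) and let $\Theta \in \mathbb{R}^{I \times I}$ be the symmetric positive-definite matrix defined by

$$\Theta_{ij} := [\phi_i, \mathcal{G}\phi_j] \quad \text{for } i, j \in I. \tag{4.8}$$

We assume that we are given $q \in \mathbb{N}$ and a partition $I = \bigcup_{1 \leq k \leq q} J^{(k)}$ of $I$. We represent $I \times I$ matrices as $q \times q$ block matrices according to this partition. Given an $I \times I$ matrix $M$, we write $M_{k,l}$ for the $(k,l)$-th block of $M$, and $M_{k_1:k_2,\, l_1:l_2}$ for the sub-matrix of $M$ defined by blocks ranging from $k_1$ to $k_2$ and $l_1$ to $l_2$. Unless specified otherwise, we write $L$ for the lower-triangular Cholesky factor of $\Theta$ and define

$$\Theta^{(k)} := \Theta_{1:k,\,1:k}, \quad A^{(k)} := \Theta^{(k),-1}, \quad B^{(k)} := A^{(k)}_{k,k} \quad \text{for } 1 \leq k \leq q. \tag{4.9}$$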
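
For reference, the value of the supremum in (4.7) can be checked in a few lines. The sketch below uses only the symmetry and positivity of $\mathcal{L}$, together with the Cauchy-Schwarz inequality for the energy inner product $\langle v, u \rangle := [\mathcal{L}v, u]$ (this bracket notation is introduced here for the derivation only).

```latex
\begin{align*}
[\phi, u] &= [\mathcal{L}\mathcal{G}\phi, u] \leq \|\mathcal{G}\phi\| \, \|u\|
  && \text{(Cauchy--Schwarz for } \langle v, u \rangle := [\mathcal{L}v, u]\text{)}, \\
\|\mathcal{G}\phi\|^2 &= [\mathcal{L}\mathcal{G}\phi, \mathcal{G}\phi] = [\phi, \mathcal{G}\phi], \\
\|\phi\|_* &= \sup_{0 \neq u \in \mathcal{B}} \frac{[\phi, u]}{\|u\|} = \sqrt{[\phi, \mathcal{G}\phi]}
  && \text{(equality is attained at } u = \mathcal{G}\phi\text{)}.
\end{align*}
```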

We interpret the $\{J^{(k)}\}_{1 \leq k \leq q}$ as labelling a hierarchy of scales with $J^{(1)}$ representing the coarsest and $J^{(q)}$ the finest. We write $I^{(k)}$ for $\bigcup_{1 \leq k' \leq k} J^{(k')}$.

Throughout this section, we assume that the ordering of the set $I$ of indices is compatible with the partition $I = \bigcup_{k=1}^{q} J^{(k)}$, i.e. $k < l$, $i \in J^{(k)}$ and $j \in J^{(l)}$ together imply $i \prec j$. We will write $L$ or $\operatorname{chol}(\Theta)$ for the Cholesky factor of $\Theta$ in that ordering.
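
To make the objects $\Theta$, $\Theta^{(k)}$, and $\operatorname{chol}(\Theta)$ concrete, here is a minimal sketch under strong simplifying assumptions that are not part of the text: $d = 1$, $s = 1$, $\Omega = (0, 1)$, $\mathcal{L} = -\mathrm{d}^2/\mathrm{d}x^2$ with zero Dirichlet boundary values (whose Green's function is $\min(x, y)(1 - \max(x, y))$), and unscaled point evaluations as measurements. The helper names `green`, `cholesky`, and `leading_block` are hypothetical.

```python
import math

def green(x, y):
    """Green's function of -d^2/dx^2 on (0, 1) with zero Dirichlet boundary values."""
    return min(x, y) * (1.0 - max(x, y))

def cholesky(theta):
    """Lower-triangular L with L L^T = theta (plain row-by-row Cholesky)."""
    n = len(theta)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = theta[i][j] - sum(L[i][m] * L[j][m] for m in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    return L

# Assumed hierarchy: J^(1) = {1/2}, J^(2) = {1/4, 3/4}, J^(3) = {1/8, 3/8, 5/8, 7/8}.
levels = [[0.5], [0.25, 0.75], [0.125, 0.375, 0.625, 0.875]]

# Coarse-to-fine ordering, compatible with the partition into levels.
xs = [x for level in levels for x in level]

# Theta_ij = [phi_i, G phi_j] with phi_i the (unscaled) point evaluation at x_i.
theta = [[green(x, y) for y in xs] for x in xs]
L = cholesky(theta)

def leading_block(mat, k):
    """Theta^(k) = Theta_{1:k,1:k}: rows and columns belonging to the first k levels."""
    n = sum(len(level) for level in levels[:k])
    return [row[:n] for row in mat[:n]]
```

Because the ordering is compatible with the partition, each $\Theta^{(k)}$ is simply a leading principal submatrix of $\Theta$, which is what `leading_block` extracts.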

4.2.3 Discretization of $H^s_0(\Omega)$ and $H^{-s}(\Omega)$

While similar results are true for a wide range of measurements $\{\phi_i\} \subset \mathcal{B}^* = H^{-s}(\Omega)$, we will restrict our attention to two archetypical examples given by pointwise evaluation and nested averages.

We will assume (without loss of generality after rescaling) that $\operatorname{diam}(\Omega) \leq 1$. As described in Figure 3.8, successive points of the maximin ordering can be gathered into levels so that, after appropriate rescaling of the measurements, the Cholesky factorization in the maximin ordering falls in the setting of Example 1.

Example 1. Let $s > d/2$. For $h, \delta \in (0,1)$ let $\{x_i\}_{i \in I^{(1)}} \subset \{x_i\}_{i \in I^{(2)}} \subset \cdots \subset \{x_i\}_{i \in I^{(q)}}$ be a nested hierarchy of points in $\Omega$ that are homogeneously distributed at each scale in the sense of the following three inequalities:

(1) $\sup_{x \in \Omega} \min_{i \in I^{(k)}} |x - x_i| \leq h^k$, (2) $\min_{i \in I^{(k)}} \inf_{x \in \partial\Omega} |x - x_i| \geq \delta h^k$, and (3) $\min_{i, j \in I^{(k)} : i \neq j} |x_i - x_j| \geq \delta h^k$.

Let $J^{(1)} := I^{(1)}$ and $J^{(k)} := I^{(k)} \setminus I^{(k-1)}$ for $k \in \{2, \ldots, q\}$. Let $\boldsymbol{\delta}$ denote the unit Dirac delta function and choose

$$\phi_i := h^{\frac{kd}{2}} \boldsymbol{\delta}(x - x_i) \quad \text{for } i \in J^{(k)} \text{ and } k \in \{1, \ldots, q\}. \tag{4.10}$$

The discretization chosen in Example 1 is not applicable for $s < d/2$ since in this case, functions in $H^s_0(\Omega)$ are not defined pointwise and thus $\boldsymbol{\delta} \notin \mathcal{B}^*$. A possible alternative is to replace $\phi_i := h^{\frac{kd}{2}} \boldsymbol{\delta}(x - x_i)$ with $\phi_i := h^{-\frac{kd}{2}} \mathbf{1}_{B_h(0)}(x - x_i)$. However, while the exponential decay result in Theorem 4 seems to be true empirically for this choice of measurements, we are unable to prove it for $s < d/2$.

Furthermore, the numerical homogenization result of Theorem 7 is false for this choice of measurements and 𝑠 < 𝑑/2. However, our results can still be recovered by choosing measurements obtained as a hierarchy of local averages.
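
Returning to Example 1, the three homogeneity inequalities can be checked mechanically for a given point hierarchy. The sketch below does so for an assumed one-dimensional setting, $\Omega = (0, 1)$ with dyadic points, where the fill distance in condition (1) can be computed exactly from the gaps between sorted points. All function names are hypothetical.

```python
def fill_distance(points):
    """sup over x in (0, 1) of the distance to the nearest point (exact in 1D)."""
    pts = sorted(points)
    gaps = [b - a for a, b in zip(pts, pts[1:])]
    return max([pts[0], 1.0 - pts[-1]] + [g / 2 for g in gaps])

def boundary_distance(points):
    """Smallest distance from any point to the boundary {0, 1}."""
    return min(min(x, 1.0 - x) for x in points)

def separation(points):
    """Smallest distance between two distinct points (infinite for a single point)."""
    pts = sorted(points)
    return min((b - a for a, b in zip(pts, pts[1:])), default=float("inf"))

def is_homogeneous(hierarchy, h, delta):
    """Check conditions (1)-(3) of Example 1 on every level k = 1, ..., q."""
    for k, pts in enumerate(hierarchy, start=1):
        if fill_distance(pts) > h**k:                 # condition (1)
            return False
        if boundary_distance(pts) < delta * h**k:     # condition (2)
            return False
        if separation(pts) < delta * h**k:            # condition (3)
            return False
    return True

# Assumed dyadic hierarchy: level k holds the points j / 2^k, j = 1, ..., 2^k - 1.
q = 4
hierarchy = [[j / 2**k for j in range(1, 2**k)] for k in range(1, q + 1)]
```

With $h = 1/2$ this hierarchy satisfies all three conditions for any $\delta \in (0, 1)$, whereas choosing $h = 1/4$ already violates condition (1) on the first level.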

Given subsets $\tilde{I}, \tilde{J} \subset I$, we extend a matrix $M \in \mathbb{R}^{\tilde{I} \times \tilde{J}}$ to an element of $\mathbb{R}^{I \times J}$ by padding it with zeros.
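
The zero-padding convention can be sketched as follows, assuming (for illustration only) that a matrix indexed by abstract index sets is stored as a dictionary keyed by index pairs. The helper `pad` is hypothetical.

```python
def pad(M, rows, cols):
    """Extend M, defined on a subset of rows x cols, to all of rows x cols with zeros."""
    return {(i, j): M.get((i, j), 0.0) for i in rows for j in cols}

I = ["a", "b", "c"]

# A matrix indexed by the subsets {a, c} x {a, c} of I x I.
M = {("a", "a"): 2.0, ("a", "c"): -1.0, ("c", "a"): -1.0, ("c", "c"): 2.0}

M_padded = pad(M, I, I)
```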

Example 2. (See Figure 4.2.) For $h, \delta \in (0,1)$, let $(\tau_i^{(k)})_{i \in I^{(k)}}$ be uniformly Lipschitz convex sets forming a regular nested partition of $\Omega$ in the following sense. For $k \in \{1, \ldots, q\}$, $\Omega = \bigcup_{i \in I^{(k)}} \tau_i^{(k)}$ is a disjoint union except for the boundaries. $I^{(k)}$ is a nested set of indices, i.e. $I^{(k)} \subset I^{(k+1)}$ for $k \in \{1, \ldots, q-1\}$. For $k \in \{2, \ldots, q\}$ and $i \in I^{(k-1)}$, there exists a subset $c_i \subset I^{(k)}$ such that $i \in c_i$ and $\tau_i^{(k-1)} = \bigcup_{j \in c_i} \tau_j^{(k)}$. Assume that each $\tau_i^{(k)}$ contains a ball $B_{\delta h^k}(x_i^{(k)})$ of center $x_i^{(k)}$ and radius $\delta h^k$, and is contained in the ball $B_{h^k}(x_i^{(k)})$. For $k \in \{2, \ldots, q\}$ and $i \in I^{(k-1)}$, let the submatrices $\mathfrak{w}^{(k),i} \in \mathbb{R}^{(c_i \setminus \{i\}) \times c_i}$ satisfy $\sum_{j \in c_i} \mathfrak{w}^{(k),i}_{m,j} \mathfrak{w}^{(k),i}_{n,j} |\tau_j^{(k)}| = \delta_{mn}$ and $\sum_{j \in c_i} \mathfrak{w}^{(k),i}_{l,j} |\tau_j^{(k)}| = 0$ for each $l \in c_i \setminus \{i\}$, where $|\tau_i^{(k)}|$ denotes the volume of $\tau_i^{(k)}$. Let $J^{(1)} := I^{(1)}$ and $J^{(k)} := I^{(k)} \setminus I^{(k-1)}$ for $k \in \{2, \ldots, q\}$. Let $W^{(1)}$ be the $J^{(1)} \times I^{(1)}$ matrix defined by $W^{(1)}_{ij} := \delta_{ij}$. Let $W^{(k)}$ be the $J^{(k)} \times I^{(k)}$ matrix defined by $W^{(k)} := \sum_{i \in I^{(k-1)}} \mathfrak{w}^{(k),i}$ for $k \geq 2$. We then set

$$\phi_i := h^{-\frac{kd}{2}} \sum_{j \in I^{(k)}} W^{(k)}_{i,j} \mathbf{1}_{\tau_j^{(k)}} \quad \text{for each } i \in J^{(k)} \tag{4.11}$$

and define $[\phi_i, u] := \int_\Omega \phi_i u \,\mathrm{d}x$.

Figure 4.2: Hierarchical averaging. We illustrate the construction described in Example 2 in the case $q = 2$. On the left we see the nested partition of the domain, and on the right we see (the signs of) a possible choice for $\phi_1$, $\phi_5$, and $\phi_6$.

In order to keep track of the distance between the different $\phi_i$ of Example 2, we choose an arbitrary set of points $\{x_i\}_{i \in I} \subset \Omega$ with the property that $x_i \in \operatorname{supp}(\phi_i)$ for each $i \in I$.
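
The two conditions imposed on the submatrices $\mathfrak{w}^{(k),i}$ in Example 2 say that their rows are orthonormal in the volume-weighted inner product and have zero weighted mean. One possible way to construct such rows for a single parent cell, given only the volumes of its children, is a weighted Gram-Schmidt procedure started from the constant vector. This is a sketch of one admissible choice, not necessarily the one used by the author; the function name `averaging_weights` is hypothetical.

```python
import math

def averaging_weights(volumes):
    """Return n-1 rows w with sum_j w_m[j] w_n[j] vol[j] = delta_mn and sum_j w_l[j] vol[j] = 0."""
    n = len(volumes)

    def inner(a, b):
        # Volume-weighted inner product over the children of one parent cell.
        return sum(x * y * v for x, y, v in zip(a, b, volumes))

    total = sum(volumes)
    basis = [[1.0 / math.sqrt(total)] * n]      # normalized constant vector
    for m in range(1, n):
        w = [1.0 if j == m else 0.0 for j in range(n)]
        for b in basis:                          # remove components along earlier vectors
            c = inner(w, b)
            w = [x - c * y for x, y in zip(w, b)]
        norm = math.sqrt(inner(w, w))
        basis.append([x / norm for x in w])
    return basis[1:]                             # drop the constant vector

# Two children of equal volume 1/4 give a Haar-like row proportional to (-1, 1).
haar = averaging_weights([0.25, 0.25])
rows = averaging_weights([0.1, 0.2, 0.3, 0.4])
```

Orthogonality to the constant vector in the weighted inner product is exactly the zero-mean condition, which is why the constant vector is placed first and then discarded.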

In the above, we have discretized the Green's functions of the elliptic operators, resulting in the Green's matrix $\Theta$ as the fundamental discrete object. The inverse $A$ of $\Theta$ can be interpreted as the stiffness matrix obtained from the Galerkin discretization of $\mathcal{L}$ using the basis given by

$$\psi_i := \sum_j A_{ij} \mathcal{G}\phi_j \in \mathcal{B}. \tag{4.12}$$

These types of basis functions are referred to as gamblets in the prior works of [188–190] that form the basis for the proofs in this chapter. While exponentially decaying, these basis functions are nonlocal and unknown a priori, hence they cannot be used to discretize a partial differential operator with unknown Green's function.
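
The interpretation of $A = \Theta^{-1}$ as a stiffness matrix can be verified directly from (4.8) and (4.12). The following short computation is a sketch that uses only the symmetry of $\Theta$ and $A$ and the identity $A\Theta = \mathrm{Id}$; it also shows that the $\psi_i$ are biorthogonal to the measurement functions.

```latex
\begin{align*}
[\phi_i, \psi_k] &= \sum_j A_{kj} [\phi_i, \mathcal{G}\phi_j]
  = \sum_j A_{kj} \Theta_{ji} = (A\Theta)_{ki} = \delta_{ik}, \\
[\mathcal{L}\psi_i, \psi_k] &= \sum_{j,l} A_{ij} A_{kl} [\phi_j, \mathcal{G}\phi_l]
  = (A \Theta A^{\top})_{ik} = A_{ik}.
\end{align*}
```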

For $\Theta$ the inverse of a Galerkin discretization of $\mathcal{L}$ in a local basis, analogous results can be obtained by repeating the proofs of Theorems 4 and 7 in the discrete setting.

In the setting of Examples 1 and 2, denoting by $L$ the lower-triangular Cholesky factor of $\Theta$ or $\Theta^{-1}$, we will show that

$$|L_{ij}| \leq \operatorname{poly}(N) \exp(-\gamma\, d(i, j)), \tag{4.13}$$

for a constant $\gamma > 0$ and a suitable distance measure $d(\cdot, \cdot) : I \times I \to \mathbb{R}$.
