
2.3. Smoothing Problem


Fig. 2.8: Dynamics of the Lorenz-96 model in the chaotic regime (F, K) = (8, 40).

Fig. 2.9: Projection of the Lorenz-96 attractor onto two different pairs of coordinates.

We refer to the conditional distribution of the signal, given the data, in the case of either the stochastic dynamics or the deterministic dynamics as the smoothing distribution. It is a random variable that contains all the probabilistic information about the signal, given our observations. The key concept that drives this approach is Bayes's formula from Section 1.1.4, which we use repeatedly in what follows.

2.3.2. Stochastic Dynamics

We wish to find the signal $v$ from (2.1) from a single instance of data $y$ given by (2.2). To be more precise, we wish to condition the signal on the discrete time interval $\mathbb{J}_0 = \{0, \dots, J\}$, given data on the discrete time interval $\mathbb{J} = \{1, \dots, J\}$; we refer to $\mathbb{J}_0$ as the data assimilation window. We define $v = \{v_j\}_{j \in \mathbb{J}_0}$, $y = \{y_j\}_{j \in \mathbb{J}}$, $\xi = \{\xi_j\}_{j \in \mathbb{J}_0}$, and $\eta = \{\eta_j\}_{j \in \mathbb{J}}$. The smoothing distribution here is the distribution of the conditioned random variable $v|y$. Recall that we have assumed that $v_0$, $\xi$, and $\eta$ are mutually independent random variables. With this fact in hand, we may apply Bayes's formula to find the pdf $\mathbb{P}(v|y)$.

Prior. The prior on $v$ is specified by (2.1), together with the independence of $v_0$ and $\xi$ and the i.i.d. structure of $\xi$. First note that using (1.5) and the i.i.d. structure of $\xi$ in turn, we obtain

\[
\begin{aligned}
\mathbb{P}(v) &= \mathbb{P}(v_J, v_{J-1}, \dots, v_0)\\
&= \mathbb{P}(v_J \mid v_{J-1}, \dots, v_0)\,\mathbb{P}(v_{J-1}, \dots, v_0)\\
&= \mathbb{P}(v_J \mid v_{J-1})\,\mathbb{P}(v_{J-1}, \dots, v_0).
\end{aligned}
\]

Proceeding inductively gives

\[
\mathbb{P}(v) = \prod_{j=0}^{J-1} \mathbb{P}(v_{j+1} \mid v_j)\,\mathbb{P}(v_0).
\]

Now
\[
\mathbb{P}(v_0) \propto \exp\Bigl(-\frac{1}{2}\bigl|C_0^{-\frac{1}{2}}(v_0 - m_0)\bigr|^2\Bigr),
\]
while
\[
\mathbb{P}(v_{j+1} \mid v_j) \propto \exp\Bigl(-\frac{1}{2}\bigl|\Sigma^{-\frac{1}{2}}\bigl(v_{j+1} - \Psi(v_j)\bigr)\bigr|^2\Bigr).
\]

The probability distribution $\mathbb{P}(v)$ that we now write down is not Gaussian, but the distribution on the initial condition, $\mathbb{P}(v_0)$, and the conditional distributions $\mathbb{P}(v_{j+1} \mid v_j)$ are all Gaussian, making the explicit calculations above straightforward.

Combining the preceding information, we obtain $\mathbb{P}(v) \propto \exp(-\mathsf{J}(v))$, where

\begin{align}
\mathsf{J}(v) &:= \frac{1}{2}\bigl|C_0^{-\frac{1}{2}}(v_0 - m_0)\bigr|^2 + \sum_{j=0}^{J-1} \frac{1}{2}\bigl|\Sigma^{-\frac{1}{2}}\bigl(v_{j+1} - \Psi(v_j)\bigr)\bigr|^2 \tag{2.19a}\\
&= \frac{1}{2}\bigl|v_0 - m_0\bigr|_{C_0}^2 + \sum_{j=0}^{J-1} \frac{1}{2}\bigl|v_{j+1} - \Psi(v_j)\bigr|_{\Sigma}^2. \tag{2.19b}
\end{align}

The pdf $\mathbb{P}(v) = \rho_0(v)$ proportional to $\exp(-\mathsf{J}(v))$ determines a prior measure $\mu_0$ on $\mathbb{R}^{|\mathbb{J}_0| \times n}$. The fact that the probability is not in general Gaussian follows from the fact that $\Psi$ is not in general linear.
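The functional (2.19), with $|x|_C^2 = x^\top C^{-1} x$ denoting the covariance-weighted norm, is straightforward to evaluate numerically. The following minimal sketch is ours, not from the text; it assumes a user-supplied one-step map `Psi` and parameters `m0`, `C0`, `Sigma` (all names illustrative).

```python
import numpy as np

def neg_log_prior(v, Psi, m0, C0, Sigma):
    """Evaluate J(v) from (2.19) for a trajectory v of shape (J+1, n)."""
    r0 = v[0] - m0
    J = 0.5 * r0 @ np.linalg.solve(C0, r0)          # (1/2)|v_0 - m_0|_{C_0}^2
    for j in range(len(v) - 1):
        rj = v[j + 1] - Psi(v[j])
        J += 0.5 * rj @ np.linalg.solve(Sigma, rj)  # (1/2)|v_{j+1} - Psi(v_j)|_Sigma^2
    return J
```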

Likelihood. The likelihood of the data $y|v$ is determined as follows. It is a (Gaussian) probability distribution on $\mathbb{R}^{|\mathbb{J}| \times m}$, with pdf $\mathbb{P}(y|v)$ proportional to $\exp(-\Phi(v; y))$, where

\[
\Phi(v; y) = \sum_{j=0}^{J-1} \frac{1}{2}\bigl|y_{j+1} - h(v_{j+1})\bigr|_{\Gamma}^2. \tag{2.20}
\]

To see this, note that because of the i.i.d. nature of the sequence η, it follows that

\[
\begin{aligned}
\mathbb{P}(y|v) &= \prod_{j=0}^{J-1} \mathbb{P}(y_{j+1} \mid v)\\
&= \prod_{j=0}^{J-1} \mathbb{P}(y_{j+1} \mid v_{j+1})\\
&\propto \prod_{j=0}^{J-1} \exp\Bigl(-\frac{1}{2}\bigl|\Gamma^{-\frac{1}{2}}\bigl(y_{j+1} - h(v_{j+1})\bigr)\bigr|^2\Bigr)\\
&= \exp(-\Phi(v; y)).
\end{aligned}
\]

In the applied literature, $m_0$ and $C_0$ are often referred to as the background mean and background covariance, respectively; we refer to $\Phi$ as the model–data misfit functional.
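The misfit (2.20) admits an equally direct evaluation; again a hedged sketch with illustrative names, following the same conventions as the `neg_log_prior` sketch above.

```python
import numpy as np

def model_data_misfit(v, y, h, Gamma):
    """Evaluate Phi(v; y) from (2.20); v has shape (J+1, n), y has shape (J, m)."""
    Phi = 0.0
    for j in range(len(y)):
        r = y[j] - h(v[j + 1])                      # y_{j+1} - h(v_{j+1}); y[0] is y_1
        Phi += 0.5 * r @ np.linalg.solve(Gamma, r)  # (1/2)|y_{j+1} - h(v_{j+1})|_Gamma^2
    return Phi
```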

Using Bayes’s formula (1.7), we can combine the prior and the likelihood to determine the posterior distribution, that is, the smoothing distribution, on v|y. We denote the measure with this distribution by μ.

Theorem 2.8. The posterior smoothing distribution on $v|y$ for the stochastic dynamics model (2.1), (2.2) is a probability measure $\mu$ on $\mathbb{R}^{|\mathbb{J}_0| \times n}$ with pdf $\mathbb{P}(v|y) = \rho(v)$ proportional to $\exp(-\mathsf{I}(v; y))$, where

\[
\mathsf{I}(v; y) = \mathsf{J}(v) + \Phi(v; y). \tag{2.21}
\]

Proof Bayes’s formula (1.7) gives us

P(v|y) = P(y|v)P(v) P(y) .

Thus, ignoring constants of proportionality that depend only on y, we have P(v|y) ∝ P(y|v)P(v0)

∝ exp(−Φ(v; y)) exp(−J(v))

= exp(−I(v; y)).

Note that although the preceding calculations required only knowledge of the pdfs of Gaussian distributions, the resulting posterior distribution is non-Gaussian in general: unless $\Psi$ and $h$ are linear, $\mathsf{I}(\cdot; y)$ is not quadratic. We refer to $\mathsf{I}$ as the negative log-posterior. It will be helpful later to note that
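To see the role of linearity concretely, consider the scalar case $n = m = 1$ with $\Psi(v) = av$ and $h(v) = v$; this illustrative special case is ours, not treated at this point in the text. Then
\[
\mathsf{I}(v; y) = \frac{(v_0 - m_0)^2}{2C_0} + \sum_{j=0}^{J-1} \frac{(v_{j+1} - a v_j)^2}{2\Sigma} + \sum_{j=0}^{J-1} \frac{(y_{j+1} - v_{j+1})^2}{2\Gamma},
\]
which is quadratic in $v$, so the posterior is Gaussian; replacing $\Psi$ or $h$ by a nonlinear map destroys this quadratic structure.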

\[
\frac{\rho(v)}{\rho_0(v)} \propto \exp\bigl(-\Phi(v; y)\bigr). \tag{2.22}
\]
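One way (2.22) is used in practice is to reweight prior samples into posterior expectations. The following self-normalized importance-sampling sketch is our illustration, not a method introduced at this point in the text; it reuses the hypothetical `model_data_misfit` from the sketch above.

```python
import numpy as np

def posterior_mean_by_reweighting(prior_samples, y, h, Gamma):
    """Estimate the posterior mean from prior draws using the ratio (2.22).

    prior_samples : array of shape (N, J+1, n), trajectories drawn from mu_0
    """
    logw = np.array([-model_data_misfit(v, y, h, Gamma) for v in prior_samples])
    logw -= logw.max()          # stabilize before exponentiating
    w = np.exp(logw)
    w /= w.sum()                # self-normalize: the constant in (2.22) cancels
    return np.tensordot(w, prior_samples, axes=1)   # weighted average trajectory
```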

2.3.3. Reformulation of Stochastic Dynamics

For the development of algorithms to probe the posterior distribution, the following reformulation of the stochastic dynamics problem can be very useful. For this, we define the vector $\xi = (v_0, \xi_0, \xi_1, \dots, \xi_{J-1}) \in \mathbb{R}^{|\mathbb{J}_0| n}$. The following lemma is key to what follows.

Lemma 2.9. Define the mapping $G : \mathbb{R}^{|\mathbb{J}_0| \times n} \to \mathbb{R}^{|\mathbb{J}_0| \times n}$ by
\[
G_j(v_0, \xi_0, \xi_1, \dots, \xi_{J-1}) = v_j, \qquad j = 0, \dots, J,
\]
where $v_j$ is determined by (2.1). Then this mapping is invertible. Furthermore, if $\Psi \equiv 0$, then $G$ is the identity mapping.

Proof. In words, the mapping $G$ takes the initial condition and the noise into the signal. Invertibility requires determination of the initial condition and the noise from the signal. The initial condition is, of course, specified by the signal itself, and the noise may then be computed from the signal via
\[
\xi_j = v_{j+1} - \Psi(v_j), \qquad j = 0, \dots, J-1.
\]
The fact that $G$ becomes the identity mapping when $\Psi \equiv 0$ follows directly from (2.1) by inspection. $\square$

We may thus consider the smoothing problem as finding the probability distribution of $\xi$, as defined prior to the lemma, given data $y$, with $y$ as defined in Section 2.3.2. Furthermore, using the notion of pushforward, we have

\[
\mathbb{P}(v|y) = G \star \mathbb{P}(\xi|y), \qquad \mathbb{P}(\xi|y) = G^{-1} \star \mathbb{P}(v|y). \tag{2.23}
\]
These formulas mean that it is easy to move between the two measures: samples from one can be converted into samples from the other simply by applying $G$ or $G^{-1}$. Thus algorithms can be applied, for example, to generate samples from $\xi|y$ and then convert them into samples from $v|y$; we will use this later on. In order to use this idea, it will be helpful to have an explicit expression for the pdf of $\xi|y$. We now find such an expression.
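Lemma 2.9 and the pushforward identities (2.23) translate into a few lines of code; a minimal sketch, assuming a one-step map `Psi` acting on vectors (names illustrative):

```python
import numpy as np

def G(xi, Psi):
    """Map xi = (v_0, xi_0, ..., xi_{J-1}) to the signal (v_0, ..., v_J) via (2.1)."""
    v = [xi[0]]
    for j in range(len(xi) - 1):
        v.append(Psi(v[j]) + xi[j + 1])   # v_{j+1} = Psi(v_j) + xi_j
    return np.array(v)

def G_inverse(v, Psi):
    """Recover xi = (v_0, xi_0, ..., xi_{J-1}) from the signal, as in Lemma 2.9."""
    xi = [v[0]]
    for j in range(len(v) - 1):
        xi.append(v[j + 1] - Psi(v[j]))   # xi_j = v_{j+1} - Psi(v_j)
    return np.array(xi)
```

In particular, samples of $\xi|y$ produced by an algorithm working in these variables can be pushed through `G` to give samples of $v|y$, exactly as (2.23) indicates.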

To begin, we introduce the measure $\vartheta_0$ with density $\pi_0$, found from $\mu_0$ and $\rho_0$ in the case $\Psi \equiv 0$. Thus
\begin{align}
\pi_0(v) &\propto \exp\Bigl(-\frac{1}{2}\bigl|C_0^{-\frac{1}{2}}(v_0 - m_0)\bigr|^2 - \sum_{j=0}^{J-1} \frac{1}{2}\bigl|\Sigma^{-\frac{1}{2}} v_{j+1}\bigr|^2\Bigr) \tag{2.24a}\\
&\propto \exp\Bigl(-\frac{1}{2}\bigl|v_0 - m_0\bigr|_{C_0}^2 - \sum_{j=0}^{J-1} \frac{1}{2}\bigl|v_{j+1}\bigr|_{\Sigma}^2\Bigr), \tag{2.24b}
\end{align}
and hence $\vartheta_0$ is a Gaussian measure, independent in each component $v_j$ for $j = 0, \dots, J$.

By Lemma 2.9, we also deduce that the measure $\vartheta_0$ with density $\pi_0$ is the prior on $\xi$ as defined above:

\[
\pi_0(\xi) \propto \exp\Bigl(-\frac{1}{2}\bigl|v_0 - m_0\bigr|_{C_0}^2 - \sum_{j=0}^{J-1} \frac{1}{2}\bigl|\xi_j\bigr|_{\Sigma}^2\Bigr). \tag{2.25}
\]

We now compute the likelihood of $y|\xi$. For this, we define
\[
\mathcal{G}_j(\xi) = h\bigl(G_j(\xi)\bigr) \tag{2.26}
\]
and note that we may then concatenate the data and write
\[
y = \mathcal{G}(\xi) + \eta, \tag{2.27}
\]
where $\eta = (\eta_1, \dots, \eta_J)$ is the Gaussian random variable $N(0, \Gamma_J)$, and $\Gamma_J$ is the block-diagonal $mJ \times mJ$ matrix with $m \times m$ diagonal blocks $\Gamma$. It follows that the likelihood is determined by $\mathbb{P}(y|\xi) = N\bigl(\mathcal{G}(\xi), \Gamma_J\bigr)$. Applying Bayes's formula from (1.7) to find the pdf for $\xi|y$, we obtain the posterior $\vartheta$ on $\xi|y$, as summarized in the following theorem.

Theorem 2.10. The posterior smoothing distribution on $\xi|y$ for the stochastic dynamics model (2.1), (2.2) is a probability measure $\vartheta$ on $\mathbb{R}^{|\mathbb{J}_0| \times n}$ with pdf $\mathbb{P}(\xi|y) = \pi(\xi)$ proportional to $\exp(-\mathsf{I}_r(\xi; y))$, where

\[
\mathsf{I}_r(\xi; y) = \mathsf{J}_r(\xi) + \Phi_r(\xi; y), \tag{2.28}
\]
with
\[
\Phi_r(\xi; y) := \frac{1}{2}\bigl|y - \mathcal{G}(\xi)\bigr|_{\Gamma_J}^2, \qquad
\mathsf{J}_r(\xi) := \frac{1}{2}\bigl|v_0 - m_0\bigr|_{C_0}^2 + \sum_{j=0}^{J-1} \frac{1}{2}\bigl|\xi_j\bigr|_{\Sigma}^2.
\]
We refer to $\mathsf{I}_r$ as the negative log-posterior.
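Combining the pieces, the reformulated negative log-posterior of Theorem 2.10 can be sketched as follows, reusing the hypothetical `G` from the sketch above (all names illustrative):

```python
import numpy as np

def neg_log_posterior_reformulated(xi, y, Psi, h, m0, C0, Sigma, Gamma):
    """Evaluate I_r(xi; y) from (2.28); xi has shape (J+1, n), y has shape (J, m)."""
    # J_r: Gaussian penalties on the initial condition and on each noise term
    r0 = xi[0] - m0
    Ir = 0.5 * r0 @ np.linalg.solve(C0, r0)
    for xj in xi[1:]:
        Ir += 0.5 * xj @ np.linalg.solve(Sigma, xj)
    # Phi_r: map xi to the signal, observe it, and compare with the data
    v = G(xi, Psi)
    for j in range(len(y)):
        r = y[j] - h(v[j + 1])
        Ir += 0.5 * r @ np.linalg.solve(Gamma, r)
    return Ir
```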

2.3.4. Deterministic Dynamics

It is also of interest to study the posterior distribution on the initial condition in the case that the model dynamics contains no noise and is given by (2.3); this we now do. Recall that $\Psi^{(j)}(\cdot)$ denotes the $j$-fold composition of $\Psi(\cdot)$ with itself. In the following, we sometimes refer to $\mathsf{J}_{\mathrm{det}}$ as the background penalization, and $m_0$ and $C_0$ as the background mean and covariance; we refer to $\Phi_{\mathrm{det}}$ as the model–data misfit functional.

Theorem 2.11. The posterior smoothing distribution on $v_0|y$ for the deterministic dynamics model (2.3), (2.2) is a probability measure $\nu$ on $\mathbb{R}^n$ with density $\mathbb{P}(v_0|y) = \varrho(v_0)$ proportional to $\exp(-\mathsf{I}_{\mathrm{det}}(v_0; y))$, where

\begin{align}
\mathsf{I}_{\mathrm{det}}(v_0; y) &= \mathsf{J}_{\mathrm{det}}(v_0) + \Phi_{\mathrm{det}}(v_0; y), \tag{2.29a}\\
\mathsf{J}_{\mathrm{det}}(v_0) &= \frac{1}{2}\bigl|v_0 - m_0\bigr|_{C_0}^2, \tag{2.29b}\\
\Phi_{\mathrm{det}}(v_0; y) &= \sum_{j=0}^{J-1} \frac{1}{2}\bigl|y_{j+1} - h\bigl(\Psi^{(j+1)}(v_0)\bigr)\bigr|_{\Gamma}^2. \tag{2.29c}
\end{align}

Proof. We again use Bayes's rule, which states that
\[
\mathbb{P}(v_0|y) = \frac{\mathbb{P}(y|v_0)\,\mathbb{P}(v_0)}{\mathbb{P}(y)}.
\]
Thus, ignoring constants of proportionality that depend only on $y$, we have
\[
\begin{aligned}
\mathbb{P}(v_0|y) &\propto \mathbb{P}(y|v_0)\,\mathbb{P}(v_0)\\
&\propto \exp\bigl(-\Phi_{\mathrm{det}}(v_0; y)\bigr)\exp\Bigl(-\frac{1}{2}\bigl|v_0 - m_0\bigr|_{C_0}^2\Bigr)\\
&= \exp\bigl(-\mathsf{I}_{\mathrm{det}}(v_0; y)\bigr).
\end{aligned}
\]

Here we have used the fact that $\mathbb{P}(y|v_0)$ is proportional to $\exp\bigl(-\Phi_{\mathrm{det}}(v_0; y)\bigr)$; this follows from the fact that the $y_j|v_0$ form an independent sequence of Gaussian random variables $N\bigl(h(v_j), \Gamma\bigr)$ with $v_j = \Psi^{(j)}(v_0)$. $\square$

We refer to $\mathsf{I}_{\mathrm{det}}$ as the negative log-posterior.
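Finally, (2.29) translates into code in the same illustrative spirit as the earlier sketches; here the noise-free map is iterated in place of forming $\Psi^{(j+1)}(v_0)$ explicitly.

```python
import numpy as np

def neg_log_posterior_deterministic(v0, y, Psi, h, m0, C0, Gamma):
    """Evaluate I_det(v0; y) from (2.29) for an initial condition v0 in R^n."""
    r0 = v0 - m0
    I = 0.5 * r0 @ np.linalg.solve(C0, r0)          # J_det: background penalization
    v = v0
    for j in range(len(y)):
        v = Psi(v)                                  # v = Psi^{(j+1)}(v0)
        r = y[j] - h(v)
        I += 0.5 * r @ np.linalg.solve(Gamma, r)    # Phi_det: model-data misfit
    return I
```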
