Bayesian model class selection for multi-events

3.3 Bayesian approach for multi-events EEW

3.3.1 Bayesian model class selection for multi-events

3.3.1.1 Major notation

The EEW system receives and processes seismic network data continuously to provide updated warnings and earthquake information. In the case of multi-events, the EEW system needs to identify the number of concurrent events given the current data set, as well as their parameters, in order to broadcast accurate warnings to users. For ease of derivation, this chapter adopts the following notation:

• $D_{1:t} = \{D_j(1:t) \mid j = 1, \ldots, N_{st}\}$: set of waveform data for $N_{st}$ stations from initial time step 1 to time step $t$

• $F_t = \{F_j(t) \mid j = 1, \ldots, N_{st}\}$: set of vectors of data features, which are extracted from the waveform data $D_{1:t}$ for each of the $N_{st}$ stations and used for parameter estimation

• $M_n$: model class assuming that $n$ concurrent events are captured within the current data set $D_{1:t}$

• $\Theta = \{\theta_l \mid l = 1, \ldots, n\}$: set of vectors of earthquake parameters $\theta_l$ for each of the $n$ events given $M_n$ (Assumption A1: $\Theta$ is independent of time $t$, i.e., it is a static variable)

3.3.1.2 Probability approach of EEW

Based on the notation in Section 3.3.1.1, in this section the two outputs of EEW are expressed in a mathematical form based on fundamental probability theory.

Starting from some initial time step 1, at every time step t:

1. Find the most probable number of events that explains the current data set at time $t$, i.e., find $\hat{n} = \arg\max_n P(M_n \mid D_{1:t})$.

This optimization problem is often referred to as Bayesian model class selection in the literature (Beck, 2010). To obtain a more explicit expression for $\hat{n}$, one can consider a simple sequential Bayesian network: $(D_{1:t}) \rightarrow (F_t) \rightarrow (\Theta / M_n)$

$$P(M_n \mid D_{1:t}) = \int P(M_n \mid F_t)\, p(F_t \mid D_{1:t})\, dF_t \tag{3.1}$$

Here, $p(F_t \mid D_{1:t})$ is the PDF that accounts for the error in extracting features from the original waveform data.

Assumption A2: For simplicity, take a deterministic model for finding $F_t$ from a given data set up to time $t$, $D_{1:t}$. As a result, $P(M_n \mid D_{1:t}) = P(M_n \mid F_t)$, where $F_t$ is a function of $D_{1:t}$. By Bayes' theorem:

$$P(M_n \mid F_t) = \frac{p(F_t \mid M_n)\, P(M_n)}{p(F_t)} \propto p(F_t \mid M_n)\, P(M_n) \tag{3.2}$$

Assumption A3: Take a non-informative prior, $P(M_n) = \text{constant} \;\; \forall n$, to avoid imposing any bias on any model class before the data are collected. Hence:

$$P(M_n \mid F_t) \propto p(F_t \mid M_n) \;\Rightarrow\; \hat{n} = \arg\max_n P(M_n \mid F_t) = \arg\max_n p(F_t \mid M_n) \tag{3.3}$$

where by the Total Probability Theorem:

$$p(F_t \mid M_n) = \int p(F_t \mid \Theta, M_n)\, p(\Theta \mid M_n)\, d\Theta \tag{3.4}$$

The models for p(F_t|Θ,M_n) and p(Θ|M_n) are introduced later.

2. Find the earthquake parameter values $\Theta$ for all events given $M_{\hat{n}}$, i.e., find $p(\Theta \mid M_{\hat{n}}, D_{1:t})$, the posterior PDF of $\Theta$.

This is the Bayesian inference problem under a specified model class. By Bayes’ theorem (with Assumption A2 applied):

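$$\begin{aligned} p(\Theta \mid M_{\hat{n}}, D_{1:t}) &= p(\Theta \mid M_{\hat{n}}, F_t) \\ &= \frac{p(F_t \mid \Theta, M_{\hat{n}})\, p(\Theta \mid M_{\hat{n}})}{p(F_t \mid M_{\hat{n}})} \\ &\propto p(F_t \mid \Theta, M_{\hat{n}})\, p(\Theta \mid M_{\hat{n}}) \end{aligned} \tag{3.5}$$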

Note that the evidence function $p(F_t \mid M_{\hat{n}})$ is the same one used for finding $\hat{n}$. Hence, one can simply calculate the posterior of the parameters for each possible model class and pick the model class with the maximum evidence value.
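The two steps above can be illustrated with a minimal numerical sketch. Everything specific in it is an assumption made for illustration and is not taken from this study: the features are scalars, the likelihood $p(F_t \mid \Theta, M_n)$ is an equal-weight Gaussian mixture centred on the event parameters, the prior $p(\Theta \mid M_n)$ is uniform, and the evidence integral of Equation 3.4 is estimated by plain Monte Carlo sampling from the prior (not by the RBIS method used later in this chapter). The same weighted samples give a crude posterior mean for Equation 3.5.

```python
import numpy as np

def log_likelihood(features, theta, sigma=0.3):
    """Toy log p(F_t | Theta, M_n): each station feature follows an
    equal-weight Gaussian mixture centred on the n event parameters.
    (Hypothetical model, chosen only to keep the example small.)"""
    # features: (N_st,), theta: (N_s, n) -> residuals: (N_s, N_st, n)
    resid = features[None, :, None] - theta[:, None, :]
    log_comp = -0.5 * (resid / sigma) ** 2 - np.log(sigma * np.sqrt(2.0 * np.pi))
    # Log of the mixture density per station, computed stably in log space.
    peak = log_comp.max(axis=2, keepdims=True)
    log_mix = peak[..., 0] + np.log(np.exp(log_comp - peak).mean(axis=2))
    return log_mix.sum(axis=1)  # stations treated as independent

def evidence_and_posterior_mean(features, n, n_samples=200_000,
                                lo=-5.0, hi=5.0, seed=0):
    """Monte Carlo estimate of log p(F_t | M_n) (Eq. 3.4) and of the
    posterior mean of Theta (Eq. 3.5), using prior samples as proposals."""
    rng = np.random.default_rng(seed)
    # Sorting each sample fixes the event labelling; the likelihood is
    # symmetric in the events, so the evidence estimate is unaffected.
    theta = np.sort(rng.uniform(lo, hi, size=(n_samples, n)), axis=1)
    log_like = log_likelihood(features, theta)
    peak = log_like.max()
    weights = np.exp(log_like - peak)            # unnormalised posterior weights
    log_evidence = peak + np.log(weights.mean())
    post_mean = (weights[:, None] * theta).sum(axis=0) / weights.sum()
    return log_evidence, post_mean

# Hypothetical features from six stations, forming two clusters.
features = np.array([-3.1, -2.9, -3.0, 3.0, 3.1, 2.9])
results = {n: evidence_and_posterior_mean(features, n) for n in (1, 2, 3)}

# Eq. 3.3: with a flat prior on M_n, pick the model class with maximum evidence.
n_hat = max(results, key=lambda n: results[n][0])
theta_hat = results[n_hat][1]
print("n_hat =", n_hat, "posterior mean =", theta_hat)
```

In this toy setting the cost grows with the number of candidate model classes and with the number of samples needed per class, which is the practical difficulty addressed in the next section.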

3.3.1.3 Practical implementation to handle multi-events

An effective EEW system should provide regular updates of the warning using the continuous seismic data stream. For example, the JMA EEW system provides a warning update every second. Hence, there is limited time to perform the full model class selection scheme by calculating the evidence functions of all possible model classes $M_n$. From our experience, existing methods to calculate or estimate the evidence function $p(F_t \mid M_n)$ may not be fast enough for this purpose. This motivates the need for a suboptimal model class selection scheme that is efficient yet robust.

Earthquakes happen in sequence. Therefore, $\hat{n}$ (and hence the selected model class $M_{\hat{n}}$) is a monotonically non-decreasing function of time $t$. Exploiting this pattern, instead of searching for an optimal $\hat{n}$ at every second, one may start with $\hat{n} = 0$ and increase $\hat{n}$ by one every time a quickly computed criterion is met. Intuitively, $\hat{n}$ should be increased by one when the current data set $D_{1:t}$ cannot be explained by any of the identified events given the currently selected model class $M_{\hat{n}}$. In other words, letting $M_{\hat{n}} = \{M_l \mid l = 1, \ldots, \hat{n}\}$, where $M_l$ represents each event identified within $M_{\hat{n}}$, one may increase $\hat{n}$ by one when the following criterion is met (with Assumption A2 applied):

$$p(F_t \mid M_l) < \tau_{new} \quad \forall\, l = 1, \ldots, \hat{n} \tag{3.6}$$

Here, $\tau_{new}$ is some empirical threshold (possibly depending on $\hat{n}$) for how well the current data set features $F_t$ are explained by an event $M_l$, and $p(F_t \mid M_l)$ is calculated by Equation 3.4.
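The incremental scheme built on Equation 3.6 can be sketched as a short loop. This is only an illustration under stated assumptions: the per-event evidence is replaced by a hypothetical Gaussian density around a scalar reference value stored for each event, the feature stream and the threshold value are made up, and the count starts at zero so that the first feature (for which the criterion holds vacuously) creates the first event.

```python
import math

def gaussian_pdf(x, mean, sigma):
    """Density of a normal distribution, used as a toy stand-in for p(F_t | M_l)."""
    return math.exp(-0.5 * ((x - mean) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))

def track_event_count(feature_stream, tau_new, sigma=1.0):
    """Increase the event count whenever no identified event explains the
    current feature, i.e. whenever the criterion of Eq. 3.6 is met."""
    event_refs = []   # one hypothetical reference value per identified event M_l
    history = []      # value of n_hat after each time step
    for f_t in feature_stream:
        evidences = [gaussian_pdf(f_t, ref, sigma) for ref in event_refs]
        # all() is True for an empty list, so n_hat goes from 0 to 1 on the first feature.
        if all(p < tau_new for p in evidences):
            event_refs.append(f_t)   # declare a new event explained by this feature
        history.append(len(event_refs))
    return history

# Made-up scalar features: a second event appears at the fourth time step.
stream = [0.1, 0.2, -0.1, 5.0, 5.2, 0.0]
history = track_event_count(stream, tau_new=0.05)
print(history)
```

Because the count is only ever incremented, each time step requires at most one evidence evaluation per identified event rather than a full search over all candidate model classes.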

Note that with this new approach, $\theta_l$ depends only on $M_l$ within $M_{\hat{n}}$. Hence, the posterior of each $\theta_l \in \Theta$, $p(\theta_l \mid M_{\hat{n}}, D_{1:t}) = p(\theta_l \mid M_l, D_{1:t})$, can be found separately. For notational simplicity, $M_l$ is left implicit whenever $\theta_l$ appears in the rest of this chapter. Hence, the posterior of the parameters for each event becomes:

$$p(\theta_l \mid F_t) = \frac{p(F_t \mid \theta_l)\, p(\theta_l)}{p(F_t \mid M_l)} \propto p(F_t \mid \theta_l)\, p(\theta_l) \tag{3.7}$$

Equation 3.6 involves calculations with the complete feature set $F_t$. One can further simplify the criterion by using only one set of features $F_j(t)$, extracted from a single station $j$. Because most earthquakes have only one first triggered station, each event can be represented by its first triggered station. One can continuously search for newly triggered stations that have a low probability of being caused by any of the existing events in $M_{\hat{n}}$. Those stations are likely to be the first triggered stations of new events. As a result, a more efficient criterion is:

$$p(F_j(t) \mid M_l) < \tau_{new} \quad \text{for newly triggered station } j \text{ and } \forall\, l = 1, \ldots, \hat{n} \tag{3.8}$$

As will be discussed in Section 3.3.2, this study adopts a simple numerical method, called Rao-Blackwellized Importance Sampling (RBIS), to estimate the posterior PDF of the earthquake parameters. Under this method, calculation of $p(F_j(t) \mid M_l)$ may involve integrating information from all samples $\tilde{\theta}^{(i)}$ for $i = 1, \ldots, N_s$. To further reduce computational effort, one may estimate $p(F_j(t) \mid M_l)$ with an optimal value $\hat{\theta}_l$, based on the assumption that the posterior PDF of $\theta_l$, $p(\theta_l \mid F_t)$, has a narrow peak. For example, $\hat{\theta}_l$ can be the mean or the maximizer of $p(\theta_l \mid F_t)$. Hence, the criterion can be changed to:

$$p(F_j(t) \mid \hat{\theta}_l) < \tau_{new} \quad \text{for newly triggered station } j \text{ and } \forall\, l = 1, \ldots, \hat{n} \tag{3.9}$$
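The effect of replacing the sample-based estimate by a point estimate can be checked with a small sketch. All specifics are assumptions for illustration: the station likelihood is a hypothetical Gaussian whose predicted feature equals the scalar parameter, the posterior samples are drawn directly from a narrow normal distribution with equal weights (standing in for the weighted samples an importance sampler would provide), and averaging the likelihood over the samples is only one plausible reading of how the samples enter the criterion of Equation 3.8.

```python
import numpy as np

def station_likelihood(f_j, theta, sigma=0.5):
    """Toy p(F_j(t) | theta_l): Gaussian around a predicted feature equal to theta."""
    return np.exp(-0.5 * ((f_j - theta) / sigma) ** 2) / (sigma * np.sqrt(2.0 * np.pi))

def is_first_station_of_new_event(f_j, theta_hats, tau_new):
    """Eq. 3.9: True if no existing event explains the newly triggered station."""
    return all(station_likelihood(f_j, th) < tau_new for th in theta_hats)

rng = np.random.default_rng(1)
# Stand-in for posterior samples of theta_1 with a narrow peak near 2.0.
samples = rng.normal(2.0, 0.05, size=5000)
weights = np.full(samples.size, 1.0 / samples.size)   # equal weights (assumption)
theta_hat = float(np.sum(weights * samples))          # posterior mean as the point estimate

f_near, f_far = 2.3, 6.0   # made-up features of two newly triggered stations

# Sample-based estimate: average the likelihood over all samples.
sample_avg = float(np.sum(weights * station_likelihood(f_near, samples)))
# Plug-in estimate of Eq. 3.9: a single likelihood evaluation at theta_hat.
plug_in = float(station_likelihood(f_near, theta_hat))
print(sample_avg, plug_in)

tau_new = 0.01
print(is_first_station_of_new_event(f_near, [theta_hat], tau_new))
print(is_first_station_of_new_event(f_far, [theta_hat], tau_new))
```

When the posterior is narrow, the two estimates are close, so the plug-in version replaces $N_s$ likelihood evaluations per event with a single one; for a broad or multimodal posterior the approximation would be expected to degrade.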
