Chapter VI: Epidemic Spread Mitigation in Population Networks
6.3 CHMM Module: Individual-Level Propagation
\[
\forall e \in \{c_1,\cdots,c_7,d_1,\cdots,d_9\}:\quad \frac{dI_k[e](t)}{dt} = \alpha_{\cdot,k}\, E_k[e](t) - \gamma^{(d)}_{\cdot,k}\, I_k[e](t), \tag{6.7c}
\]
\[
\frac{dR_k(t)}{dt} = \sum_{i=1}^{3} \gamma^{(r)}_{i,k}\, I_k(t), \tag{6.7d}
\]
\[
\frac{dD_k(t)}{dt} = \sum_{e\in\{c_1,\cdots,d_9\}} \gamma^{(d)}_{\cdot,k}\, I_k(t), \tag{6.7e}
\]
where the parameters are defined in (6.6) and the subscript "$\cdot$" in (6.7c) and (6.7e) is replaced with the appropriate index.
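As an illustrative numerical sketch (not part of the model specification), the right-hand sides of (6.7c)-(6.7e) can be evaluated as follows; the container layout, the function name, and the per-label resolution of the "$\cdot$" subscripts are assumptions made only for this example.
\begin{verbatim}
import numpy as np

# Labels of the infected sub-compartments appearing in (6.7c); ordering is illustrative.
LABELS = [f"c{i}" for i in range(1, 8)] + [f"d{i}" for i in range(1, 10)]

def infected_derivatives(E_k, I_k, I_total, alpha, gamma_r, gamma_d):
    """Sketch of the right-hand sides of (6.7c)-(6.7e) for one community k.

    E_k, I_k : dicts keyed by LABELS with the current E_k[e](t), I_k[e](t) values
    I_total  : aggregate I_k(t) used in (6.7d) and (6.7e)
    alpha, gamma_d : dicts keyed by LABELS (the "." subscript resolved per label)
    gamma_r  : length-3 array of recovery rates gamma^(r)_{i,k}
    """
    # (6.7c): dI_k[e]/dt = alpha_{.,k} E_k[e] - gamma^(d)_{.,k} I_k[e]
    dI = {e: alpha[e] * E_k[e] - gamma_d[e] * I_k[e] for e in LABELS}
    # (6.7d): dR_k/dt = sum_i gamma^(r)_{i,k} I_k
    dR = float(np.sum(gamma_r) * I_total)
    # (6.7e): dD_k/dt = sum_e gamma^(d)_{.,k} I_k
    dD = float(sum(gamma_d[e] for e in LABELS) * I_total)
    return dI, dR, dD
\end{verbatim}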
Figure 6.5: A sample coupled hidden Markov model relating the health statuses of three individuals in a community, along with the observed symptoms.
individual $n \in \mathcal{V}$, with entries $O_n(Y_{n,j}(t)\,|\,X_n(t))$ denoting the probability of observing the positive presence of symptom $j$ in individual $n$ at time $t$, i.e., $Y_{n,j}(t) = 1$. Note that each row of this matrix does not necessarily sum to 1, since the entries correspond only to the case $y_{n,j} = 1$. The boldfaced notation for $\mathbf{X}$ and $\mathbf{x}$ also extends to $\mathbf{Y}_n, \mathbf{y}_n$ in the same way, as $\{\mathbf{Y}_n(0{:}t) = \mathbf{y}_n(0{:}t)\} \equiv \{Y_{n,1}(0{:}t) = y_{n,1}(0{:}t),\cdots,Y_{n,B}(0{:}t) = y_{n,B}(0{:}t)\}$.
Assumption 12. For each $n \in \mathcal{V}$, the OPM $O_n$ is given and known.
A sample CHMM of a specific community with three individuals, not strongly connected, is visualized in Figure 6.5. Because contact-tracing data only provides information about the evolution of the observed symptoms over time for a subset of tracked individuals, the transition probabilities among the different phases in $\mathcal{X}$ are unknown. Each individual is assigned a vector of unknown parameters similar to $\theta_k$ for the compartmental model of each community $k \in \{1,\cdots,K\}$. For individual $n \in \mathcal{V}$, the full vector of transition probability parameters is given by $\eta_n(t) := [\beta_n(t), \alpha_n, \gamma_n^{(r)}, \gamma_n^{(d)}]$, and the sparsity pattern of the transition probability matrix (TPM) corresponding to the chain of $n$ is given by
\[
P_n(t) :=
\begin{bmatrix}
1-\beta_n(t) & \beta_n(t) & 0 & 0 & 0 \\
0 & 1-\alpha_n & \alpha_n & 0 & 0 \\
0 & 0 & 1-\gamma_n^{(r)}-\gamma_n^{(d)} & \gamma_n^{(r)} & \gamma_n^{(d)} \\
0 & 0 & 0 & 1 & 0 \\
0 & 0 & 0 & 0 & 1
\end{bmatrix}. \tag{6.8}
\]
Figure 6.6: The underlying Markov chain for a single chain of the CHMM module, using transition probabilities as parameters.
Note that the probability of transitioning from $S$ to $E$ is time-varying because it depends on the time-varying health statuses of the individual's immediate neighbors.
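For concreteness, the sparsity pattern in (6.8) can be assembled numerically as follows. This is only a sketch: the state ordering $(S, E, I, R, D)$ follows (6.8), while the function name and the assumption that $\beta_n(t)$ has already been computed from the neighbors' statuses are illustrative.
\begin{verbatim}
import numpy as np

def single_chain_tpm(beta_t, alpha, gamma_r, gamma_d):
    """Build the TPM P_n(t) of (6.8) for states ordered as (S, E, I, R, D).

    beta_t is the (time-varying) infection probability beta_n(t); the remaining
    arguments are the scalar parameters alpha_n, gamma_n^(r), gamma_n^(d).
    """
    P = np.array([
        [1.0 - beta_t, beta_t,      0.0,                     0.0,     0.0],
        [0.0,          1.0 - alpha, alpha,                   0.0,     0.0],
        [0.0,          0.0,         1.0 - gamma_r - gamma_d, gamma_r, gamma_d],
        [0.0,          0.0,         0.0,                     1.0,     0.0],
        [0.0,          0.0,         0.0,                     0.0,     1.0],
    ])
    assert np.allclose(P.sum(axis=1), 1.0)  # each row is a probability distribution
    return P
\end{verbatim}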
Given a complete sequence $\{y_n(0{:}T_{\mathrm{sim}})\}$ of observed symptoms over some time duration $T_{\mathrm{sim}} > 0$, we address two questions for each individual $n \in \mathcal{V}$. Question 1: how can we estimate the values of the TPM $P_n(t)$ in the CHMM? Question 2: given the TPM estimates $\hat{P}_n(t)$ for all $t \in [0, T_{\mathrm{sim}}]$, how can we estimate the true health status $x_n(0{:}t)$? Both questions can be addressed by extending standard HMM techniques (see, e.g., [65]). For Question 1, the forward-backward algorithm (e.g., [126]) and the Baum-Welch (expectation-maximization) algorithm (e.g., [16]) are standard procedures in the HMM literature which can estimate the transition and observation probabilities in $P_n$ and $O_n$. For the purposes of this application, we make two simultaneous extensions: 1) multiple different time series of observations can be incorporated at once, and 2) the unknown parameters are assumed to be time-varying.
Define $f_{n,j}(t,x) := P(X_n(t)=x, Y_{n,j}^{(t)} = y_{n,j}^{(t)})$ to be the probability that the individual is in state $x \in \mathcal{X}$ at time $t$ and the past observed symptom sequence is given by $Y_{n,j}^{(t)} = y_{n,j}^{(t)}$. Define $b_{n,j}(t,x) := P(Y_{n,j}^{(t+1:T)} = y_{n,j}^{(t+1:T)} \,|\, X_n(t)=x)$ to be the probability of observing a future sequence of symptoms $Y_{n,j}^{(t+1:T)} = y_{n,j}^{(t+1:T)}$ given we know the individual is in state $x$. The recursive equations for $f_{n,j}$ and $b_{n,j}$ are then given by:
\[
f_{n,j}(t,x) = \sum_{z\in\mathcal{X}} f_{n,j}(t-1,z)\, O_n(y_{n,j}(t)\,|\,x)\, \hat{P}_{n,j}^{(t-1)}(z,x), \qquad f_{n,j}(0,x) := q_n(x)\, O_n(y_{n,j}(0)\,|\,x), \tag{6.9a}
\]
\[
b_{n,j}(t,x) = \sum_{z\in\mathcal{X}} b_{n,j}(t+1,z)\, \hat{P}_{n,j}^{(t)}(x,z)\, O_n(y_{n,j}(t+1)\,|\,z), \qquad b_{n,j}(T,x) = 1 \;\; \forall x\in\mathcal{X}. \tag{6.9b}
\]
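A minimal sketch of the recursions (6.9a)-(6.9b) for a single observation sequence is given below. The array layout, the callable observation model, and the list of TPM estimates are assumptions made for illustration only.
\begin{verbatim}
import numpy as np

def forward_backward(obs, O, P_hat, q):
    """Sketch of the forward/backward recursions (6.9a)-(6.9b) for one
    observation sequence j of a single individual n.

    obs   : list of observed symptom values y_{n,j}(0..T)
    O     : function O(y, x) -> observation probability O_n(y | x)
    P_hat : list of TPM estimates, P_hat[t][z, x] ~ estimate of P_{n,j} at time t
    q     : initial state distribution q_n over the |X| states
    """
    T = len(obs) - 1
    S = len(q)
    f = np.zeros((T + 1, S))
    b = np.ones((T + 1, S))           # (6.9b) boundary: b(T, x) = 1 for all x
    for x in range(S):                # (6.9a) boundary: f(0, x) = q(x) O(y(0) | x)
        f[0, x] = q[x] * O(obs[0], x)
    for t in range(1, T + 1):         # forward pass (6.9a)
        for x in range(S):
            f[t, x] = O(obs[t], x) * sum(f[t - 1, z] * P_hat[t - 1][z, x]
                                         for z in range(S))
    for t in range(T - 1, -1, -1):    # backward pass (6.9b)
        for x in range(S):
            b[t, x] = sum(b[t + 1, z] * P_hat[t][x, z] * O(obs[t + 1], z)
                          for z in range(S))
    return f, b
\end{verbatim}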
Given the observation sequence $Y_{n,j}(0{:}T_{\mathrm{sim}}) = y_{n,j}(0{:}T_{\mathrm{sim}})$, define $g_{n,j}(t,x)$ to be the probability that the state of individual $n$ at time $t$ is $x$ given observation sequence $j$, and $h_{n,j}(t,x,z)$ to be the probability that the state of individual $n$ makes a transition from $x$ to $z$ at time $t$:
\[
g_{n,j}(t,x) := P\big(X_n(t)=x \,\big|\, Y_{n,j}(0{:}T_{\mathrm{sim}}) = y_{n,j}(0{:}T_{\mathrm{sim}})\big), \tag{6.10a}
\]
\[
h_{n,j}(t,x,z) := P\big(X_n(t)=x, X_n(t+1)=z \,\big|\, Y_{n,j}(0{:}T_{\mathrm{sim}}) = y_{n,j}(0{:}T_{\mathrm{sim}})\big). \tag{6.10b}
\]
The variables defined in (6.9) allow us to simplify (6.10) beyond their definitions:
\[
g_{n,j}(t,x) = \frac{f_{n,j}(t,x)\, b_{n,j}(t,x)}{\sum_{z\in\mathcal{X}} f_{n,j}(t,z)\, b_{n,j}(t,z)}, \tag{6.11a}
\]
\[
h_{n,j}(t,x,z) = \frac{f_{n,j}(t,x)\, \hat{P}_{n,j}^{(t)}(x,z)\, O_n(y_{n,j}(t+1)\,|\,z)\, b_{n,j}(t+1,z)}{\sum_{u,w\in\mathcal{X}} f_{n,j}(t,u)\, \hat{P}_{n,j}^{(t)}(u,w)\, O_n(y_{n,j}(t+1)\,|\,w)\, b_{n,j}(t+1,w)}. \tag{6.11b}
\]
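Continuing the sketch above, the smoothed quantities (6.11) follow directly from the forward and backward arrays; the normalization assumes the denominators in (6.11) are nonzero, and all names remain illustrative.
\begin{verbatim}
import numpy as np

def smoothed_posteriors(f, b, obs, O, P_hat):
    """Sketch of (6.11a)-(6.11b) built from the forward/backward arrays."""
    T_plus_1, S = f.shape
    T = T_plus_1 - 1
    # (6.11a): g(t, x) is proportional to f(t, x) b(t, x), normalized over x
    g = f * b
    g /= g.sum(axis=1, keepdims=True)
    # (6.11b): h(t, x, z) for t = 0..T-1, normalized over (x, z)
    h = np.zeros((T, S, S))
    for t in range(T):
        for x in range(S):
            for z in range(S):
                h[t, x, z] = (f[t, x] * P_hat[t][x, z]
                              * O(obs[t + 1], z) * b[t + 1, z])
        h[t] /= h[t].sum()
    return g, h
\end{verbatim}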
Note that the expressions in (6.11) depend on the previous estimates of the TPM $\hat{P}_{n,j}^{(0:t-1)}$ for each time $t$; essentially, we recursively build the new estimate $\hat{P}_{n,j}^{(t)}$ from the previous time's estimates. For a single individual $n \in \mathcal{V}$, the TPM $P_n(t)$ can be estimated from a single observation sequence $j \in \{1,\cdots,B\}$ using the standard Baum-Welch algorithm [16]. Define $\hat{\eta}_{n,j}^{(t)}$ to be the estimate of the true parameter vector $\eta_n$ at time $t$ based on observation sequence $j$, and define a corresponding auxiliary function as:
\[
Q_{n,j}(t) := \mathbb{E}\left[\, \log p^{(c)}_{n,j}\big(X_n(0{:}t), Y_{n,j}(0{:}t) \,\big|\, \eta_n(0{:}t)\big) \;\Big|\; y_{n,j}(0{:}T),\, \hat{\eta}_{n,j}^{(0:t)} \right], \tag{6.12}
\]
where $p^{(c)}_{n,j}$ denotes the joint probability distribution of observing a complete set of data $\{x_n(0{:}t), y_{n,j}(0{:}t)\}$ for individual $n$:
\[
p^{(c)}_{n,j}\big(x_n(0{:}t), y_{n,j}(0{:}t) \,\big|\, \eta_n(0{:}t)\big) = q_n(x_n(0)) \prod_{s=0}^{t-1} P_n\big(s, x_n(s), x_n(s+1)\big) \prod_{s=0}^{t} O_n\big(y_{n,j}(s) \,\big|\, x_n(s)\big).
\]
We maximize (6.12) to determine the optimal initial probability distribution $\hat{q}_{n,j}^{(t)}$ and the optimal TPM $\hat{P}_{n,j}^{(t)}$. Note that the maximization must be done subject to the regularity conditions $\sum_{u\in\mathcal{X}} \hat{P}_{n,j}^{(t)}(x,u) = 1$ and $\sum_{x\in\mathcal{X}} \hat{q}_{n,j}^{(t)}(x) = 1$ for all $x \in \mathcal{X}$. The optimal point has the following closed-form expression:
\[
\hat{q}_{n,j}^{(t)}(x) = g_{n,j}(0,x), \tag{6.13a}
\]
\[
\hat{P}_{n,j}^{(t)}(x,z) = \left(\sum_{s=0}^{t-1} h_{n,j}(s,x,z)\right) \left(\sum_{s=0}^{t} g_{n,j}(s,x)\right)^{-1}, \tag{6.13b}
\]
where $g_{n,j}(t,x)$ and $h_{n,j}(t,x,z)$ are defined in (6.10). The procedure is repeated for each $t \in [0, T_{\mathrm{sim}}]$ so that we obtain estimates of $\hat{q}_{n,j}^{(t)}$ and $\hat{P}_{n,j}^{(t)}$ which evolve over time.
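A sketch of the closed-form update (6.13) for one observation sequence $j$ at a fixed time $t$, assuming the arrays $g$ and $h$ have been computed as in (6.11), might look as follows.
\begin{verbatim}
import numpy as np

def baum_welch_update(g, h, t):
    """Sketch of the closed-form maximizer (6.13) at time t for one sequence j.

    g : array g[s, x]     from (6.11a), s = 0..T
    h : array h[s, x, z]  from (6.11b), s = 0..T-1
    """
    q_hat = g[0]                              # (6.13a)
    num = h[:t].sum(axis=0)                   # sum_{s=0}^{t-1} h(s, x, z)
    den = g[:t + 1].sum(axis=0)               # sum_{s=0}^{t}   g(s, x)
    P_hat = num / den[:, None]                # (6.13b), rows indexed by x
    return q_hat, P_hat
\end{verbatim}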
In order to account for time-varying parameters, we apply a discounting factor $a \in (0,1]$ which weights past estimates less the further back in the past they were observed. To aggregate multiple observation sequences into a single estimate, define $w \in \mathbb{R}^B$ to be a vector of weights such that $\sum_{j=1}^{B} w_j = 1$:
\[
\hat{P}_n(t,x,z) = \left(\sum_{j=1}^{B} w_j \sum_{s=0}^{t-1} a^{t-s}\, h_{n,j}(s,x,z)\right) \left(\sum_{j=1}^{B} w_j \sum_{s=0}^{t} a^{t-s}\, g_{n,j}(s,x)\right)^{-1}. \tag{6.14}
\]
The weights are assigned according to two criteria: 1) whether the observation sequences are statistically correlated with each other, and 2) whether one observation sequence yields more information about a state than another, e.g., observing a fever in an individual may be more reflective of an ill state than a runny nose. For simplicity, we assume that these weights are known beforehand and that the observation processes are independent of each other, meaning that the weights are chosen only according to how well they represent the true state.
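A sketch of the discounted, weighted aggregation (6.14) is given below; the stacked array layout over the $B$ observation sequences is an assumption of this example.
\begin{verbatim}
import numpy as np

def aggregate_tpm(g_all, h_all, w, a, t):
    """Sketch of the aggregated TPM estimate (6.14) at time t.

    g_all : array of shape (B, T+1, |X|)       g_{n,j}(s, x)
    h_all : array of shape (B, T,   |X|, |X|)  h_{n,j}(s, x, z)
    w     : length-B weight vector summing to 1
    a     : discount factor in (0, 1]
    """
    B = len(w)
    S = g_all.shape[-1]
    num = np.zeros((S, S))
    den = np.zeros(S)
    for j in range(B):
        for s in range(t):                     # numerator sum: s = 0..t-1
            num += w[j] * a ** (t - s) * h_all[j, s]
        for s in range(t + 1):                 # denominator sum: s = 0..t
            den += w[j] * a ** (t - s) * g_all[j, s]
    return num / den[:, None]                  # \hat P_n(t, x, z)
\end{verbatim}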
Question 2 can be addressed by applying the standard Viterbi algorithm to each separate observation sequence, then aggregating them. Specifically, recall that $X_n(t) \in \mathcal{X}$ refers to the hidden state of individual $n \in \mathcal{V}$, and suppose we are given a time series of observations $Y_{n,j}(0{:}t)$ for symptom $j \in \{1,\cdots,B\}$. The estimated time-varying TPM underlying the HMM is given by $\hat{P}_n(t)$ at time $t$, and the known OPM is given by $O_n$. The initial state is known and given by $X_n(0) = x_n(0)$. Then the standard Viterbi algorithm (e.g., [58]) can be applied to observation sequence $j \in \{1,\cdots,B\}$ to estimate the sequence $\hat{x}_{n,j}(1{:}t)$ of likely hidden states over time based on symptom $j$. The probability of observing some specific sequence of health statuses $x_n(1{:}t)$ for some $t \le T_{\mathrm{sim}}$ is given by:
\[
P\big(\{X_n(t) = x_n(t),\; n\in\mathcal{V},\; t\in[0,T]\}\big) = \prod_{n\in\mathcal{V}} q_n(x_n(0)) \prod_{\substack{t\in[0,T-1] \\ n\in\mathcal{V}}} P\big(X_n(t+1) \,\big|\, X_n(t), \{X_m(t),\, m\in\mathcal{N}(n)\}\big),
\]
where $q_n(x)$ denotes the initial probability that individual $n$ starts in state $x$. Based on the observations of an individual's symptoms, we recursively compute:
\[
\delta_{n,j}(0,x) = q_n(x)\, O_n(y_{n,j}(0)\,|\,x), \qquad \delta_{n,j}(t,x) = \max_{z\in\mathcal{X}} \delta_{n,j}(t-1,z)\, \hat{P}_n(t-1)(z,x)\, O_n(y_{n,j}(t)\,|\,x), \quad t \ge 1.
\]
Then for the specific observation sequence $j$, the optimal sequence of states is given by $\hat{x}_{n,j}(t) := \arg\max_{z\in\mathcal{X}} \delta_{n,j}(t,z)$. Thus, $\hat{x}_{n,j}(t) \in \mathcal{X}$ is the most likely health status of individual $n \in \mathcal{V}$ at time $t \in [0, T_{\mathrm{sim}}]$ given observation process $j \in \{1,\cdots,B\}$. The health status $\hat{x}_n(t)$ determined by considering all observation processes simultaneously is then given by whichever phase in $\mathcal{X}$ occurs most often in the aggregate set $\{\hat{x}_{n,1}(t),\cdots,\hat{x}_{n,B}(t)\}$. Ties are broken according to the state which is more “harmful” to the network, e.g., if the most likely state is tied between susceptible ($S$) and exposed ($E$), then we take the individual to be exposed because (s)he is liable to infect more people in the network.
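The per-sequence recursion for $\delta_{n,j}$ and the majority-vote aggregation can be sketched as follows. The numeric state indexing and the full "harmfulness" ranking used for tie-breaking are assumptions of this example (the text above only fixes that exposed beats susceptible).
\begin{verbatim}
import numpy as np
from collections import Counter

# Assumed state indexing (S, E, I, R, D) = (0, 1, 2, 3, 4) and an assumed
# least-to-most harmful ordering used only for tie-breaking.
HARM_ORDER = [3, 4, 0, 1, 2]   # assumption: R, D, S, E, I (E beats S per the text)

def viterbi_path(obs, O, P_hat, q):
    """Sketch of the delta recursion above for one observation sequence j."""
    T = len(obs) - 1
    S = len(q)
    delta = np.zeros((T + 1, S))
    delta[0] = [q[x] * O(obs[0], x) for x in range(S)]
    for t in range(1, T + 1):
        for x in range(S):
            delta[t, x] = max(delta[t - 1, z] * P_hat[t - 1][z, x]
                              for z in range(S)) * O(obs[t], x)
    return delta.argmax(axis=1)               # \hat x_{n,j}(t) = argmax_z delta(t, z)

def aggregate_states(paths):
    """Majority vote across the B per-symptom paths, ties broken by harmfulness."""
    T = len(paths[0])
    agg = []
    for t in range(T):
        counts = Counter(path[t] for path in paths)
        best = max(counts, key=lambda x: (counts[x], HARM_ORDER.index(x)))
        agg.append(best)
    return agg
\end{verbatim}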
6.3.2 Including Multiple Variants
We can extend the CHMM module to account for variant viruses and mutations in a way similar to what was done for the compartmental ODE module (Section 6.2.3). The unknown probabilities for the CHMM module, expanded to consider multiple strains, are given by
\[
\eta_v(t) := \Big[\, \{\beta_{i,v}(t)\}_{\substack{i\in\{1,\cdots,|A(S)|\} \\ j\in\{1,\cdots,K\}}},\; \{\alpha_{i,v}(t)\}_{i\in\{1,\cdots,|A(S)|+|A(I)|\}},\; \{\gamma^{(r)}_{i,v}\}_{i\in\{1,\cdots,A\}},\; \{\gamma^{(d)}_{i,v}(t)\}_{i\in\{1,\cdots,|A(S)|+|A(I)|\}},\; \{\nu_{i,v}(t)\}_{i\in\{1,\cdots,|A(I)|\}} \,\Big]. \tag{6.15}
\]
Furthermore, the TPM $P_v(t)$ for each $v \in \mathcal{V}$ and time $t \in \mathbb{N}$ is updated similarly to (6.7):
\[
P_v(t) =
\begin{bmatrix}
P_{v,SS}(t) & P_{v,SE}(t) & 0 & 0 & 0 \\
0 & P_{v,EE}(t) & P_{v,EI}(t) & 0 & 0 \\
P_{v,IS}(t) & 0 & P_{v,II}(t) & P_{v,IR}(t) & P_{v,ID}(t) \\
0 & 0 & 0 & 1 & 0 \\
0 & 0 & 0 & 0 & 1
\end{bmatrix},
\]
where each of the submatrices $P_{v,\cdot\cdot}(t)$ is defined as follows, using the parameters defined in (6.15):
\[
P_{v,SE}(t) = \mathrm{diag}\Big(\mathrm{diag}\{\beta_{1,v}(t),\cdots,\beta_{A,v}(t)\},\; \mathrm{diag}\{\beta_{A+1,v}(t)+\beta_{A+2,v}(t),\cdots,\beta_{2A-1,v}(t)+\beta_{2A,v}(t)\},\; \sum_{i=2A+1}^{|A(S)|}\beta_{i,v}(t)\Big), \tag{6.16}
\]
\[
P_{v,SS}(t) = I - P_{v,SE}(t),
\]
\[
P_{v,EI}(t) = \mathrm{diag}\{\alpha_{1,v}(t),\cdots,\alpha_{|A(I)|,v}(t)\}, \qquad P_{v,EE}(t) = I - P_{v,EI}(t), \tag{6.17}
\]
\[
P_{v,IS}(t) = \begin{bmatrix} P^{(t)}_{v,IS} & 0_{4\times 12} \end{bmatrix}, \qquad P_{v,IR}(t) = \sum_{i=1}^{A} \gamma^{(r)}_{i,v}, \tag{6.18}
\]
\[
P_{v,ID}(t) = \mathrm{diag}\{\gamma^{(d)}_{1,v}(t),\cdots,\gamma^{(d)}_{|A(I)|,v}(t)\}, \qquad P_{v,II}(t) = I - \sum_{\chi\in\{S,R,D\}} P_{v,I\chi}(t),
\]
where
\[
P^{(t)}_{v,IS} =
\begin{bmatrix}
0_{6\times 3} &
\begin{matrix}
\nu_{1,v}(t) & 0 & \nu_{3,v}(t) & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & \nu_{2,v}(t) & 0 & 0 & \nu_{5,v}(t) & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & \nu_{4,v}(t) & 0 & \nu_{6,v}(t) & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & \nu_{7,v}(t) & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & \nu_{8,v}(t) & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & \nu_{9,v}(t)
\end{matrix}
\end{bmatrix},
\]
and $I - M$ for a rectangular matrix $M \in \mathbb{R}^{n\times m}$ is intended to mean an $n\times n$ matrix whose diagonal elements are 1 minus the row sums of $M$. The procedure for estimating parameters, which was detailed in Section 6.2 (and will be detailed further in Section 6.4), is then used with the multi-strain versions of the ODE and CHMM dynamics.
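A small sketch of the $I - M$ convention just described; treating the off-diagonal entries of the result as zero is an assumption of this example.
\begin{verbatim}
import numpy as np

def identity_minus(M):
    """Sketch of the 'I - M' convention for a rectangular M in R^{n x m}:
    an n x n matrix whose diagonal entries are 1 minus the row sums of M
    (off-diagonal entries taken to be zero, an assumption of this sketch)."""
    return np.diag(1.0 - M.sum(axis=1))
\end{verbatim}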