IIR FILTERING ON GRAPHS WITH RANDOM NODE-ASYNCHRONOUS UPDATES
uniformly randomly. However, the computational stage of [74] consists of a linear mapping followed by a sigmoidal function, whereas Algorithm 5 uses a linear update model. More importantly, aggregations are done synchronously in [74]; that is, all nodes are required to complete the necessary computations before proceeding to the next level of aggregation. On the contrary, in Algorithm 5 nodes aggregate information repetitively and asynchronously without waiting for each other.
3.4 Convergence of the Proposed Algorithm
For convenience of analysis, we define the value that node i aggregates from its incoming neighbors as follows:
\[
\mathbf{x}'_i \;=\; \sum_{j \in \mathcal{N}_{\mathrm{in}}(i)} G_{i,j}\,\mathbf{x}_j \;=\; \sum_{j} \mathbf{X}(k-1)\,\mathbf{e}_j\,\mathbf{e}_j^{\mathsf T}\,\mathbf{G}^{\mathsf T}\mathbf{e}_i \;=\; \mathbf{X}(k-1)\,\mathbf{G}^{\mathsf T}\mathbf{e}_i. \tag{3.40}
\]
Therefore, the next value for its state vector is given as follows:
\[
\mathbf{X}(k)\,\mathbf{e}_i = \Big( \mathbf{A}\,\mathbf{X}(k-1)\,\mathbf{G}^{\mathsf T} + \mathbf{b}\,\big(\mathbf{u} + \mathbf{w}(k-1)\big)^{\mathsf T} \Big)\, \mathbf{e}_i, \qquad i \in \mathcal{T}_k. \tag{3.41}
\]
On the other hand, if the node i does not get into the active stage at the k-th iteration, i.e., i ∉ T_k, its state vector remains unchanged. Thus, we can write the following:
\[
\mathbf{X}(k)\,\mathbf{e}_i = \mathbf{X}(k-1)\,\mathbf{e}_i, \qquad i \notin \mathcal{T}_k. \tag{3.42}
\]
Since both (3.41) and (3.42) are linear in the augmented state variable matrix X(k), we can transpose and then vectorize both equations and represent them as follows:
\[
(\bar{\mathbf{x}}_k)_n =
\begin{cases}
(\bar{\mathbf{A}}\,\bar{\mathbf{x}}_{k-1})_n + (\bar{\mathbf{u}}_{k-1})_n, & n \in \bar{\mathcal{T}}_k, \\[2pt]
(\bar{\mathbf{x}}_{k-1})_n, & n \notin \bar{\mathcal{T}}_k,
\end{cases} \tag{3.43}
\]
where the variables of the vectorized model are as follows:
\[
\bar{\mathbf{x}}_k = \operatorname{vec}\!\big(\mathbf{X}^{\mathsf T}(k)\big), \qquad
\bar{\mathbf{A}} = \mathbf{A} \otimes \mathbf{G}, \qquad
\bar{\mathbf{u}} = \mathbf{b} \otimes \mathbf{u}, \qquad
\bar{\mathbf{w}}_k = \mathbf{b} \otimes \mathbf{w}(k), \tag{3.44}
\]
and ū_k is defined similarly to (3.2) as ū_k = ū + w̄_k. Furthermore, the update set T̄_k of the vectorized model is defined as follows:
\[
\bar{\mathcal{T}}_k = \big\{\, i + s\,N \;\big|\; i \in \mathcal{T}_k, \;\; 0 \le s < L \,\big\}, \tag{3.45}
\]
which follows from the fact that when a node gets into the active stage, it updates all elements of its own state vector simultaneously according to Line 10 of the algorithm.
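The index bookkeeping of the vectorized model can be sanity-checked with a small NumPy sketch; the sizes N = 4, L = 3 and the node index i = 2 are arbitrary examples:

```python
import numpy as np

N, L = 4, 3                                      # example sizes: N nodes, filter order L
X = np.arange(L * N, dtype=float).reshape(L, N)  # state matrix, one column per node
xbar = X.T.flatten(order="F")                    # vec(X^T) as in (3.44)
i = 2                                            # an arbitrary node
Tbar_i = [i + s * N for s in range(L)]           # its entries in the vectorized model (3.45)
assert np.array_equal(xbar[Tbar_i], X[:, i])     # exactly node i's state vector
```

The assertion confirms that when node i updates all L elements of its state vector, exactly the entries i + sN of the vectorized state change.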
We note that the mathematical model in (3.43) appears as a pull-like algorithm, in which nodes retrieve data from their incoming neighbors. However, with the use of a buffer, the model (3.43) can be implemented in a collect-compute-broadcast scheme as proposed in Algorithm 5. See also Figure 3.1.
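The asynchronous recursion above can be simulated directly. The following NumPy sketch runs the noise-free updates (3.41)-(3.42) on a hypothetical random operator G with a placeholder realization (A, b); all sizes, the activation probability, and the scaling are assumptions made for the example:

```python
import numpy as np

rng = np.random.default_rng(0)
N, L = 8, 3                                # assumed sizes: 8 nodes, filter order 3
G = rng.standard_normal((N, N))
G /= 1.1 * np.linalg.norm(G, 2)            # scale G so the iteration is stable
A = 0.5 * np.diag(np.ones(L - 1), 1)       # placeholder realization with ||A||_2 = 0.5
b = np.zeros(L); b[-1] = 1.0               # placeholder input vector of the realization
u = rng.standard_normal(N)                 # graph signal to be filtered

X = np.zeros((L, N))                       # augmented state matrix, one column per node
for _ in range(300):
    X_next = A @ X @ G.T + np.outer(b, u)  # right-hand side of (3.41), noise-free
    active = rng.random(N) < 0.4           # each node activates independently
    X[:, active] = X_next[:, active]       # inactive nodes keep their state, cf. (3.42)

# fixed point of the equivalent vectorized recursion: (I - Abar) xbar* = ubar
xbar_star = np.linalg.solve(np.eye(N * L) - np.kron(A, G), np.kron(b, u))
```

With these choices the sufficiency condition of Theorem 3.3 holds, and X.ravel(), which equals vec(Xᵀ), approaches the fixed point x̄* as the iterations progress.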
When the algorithm is implemented in a synchronous manner, the state recursions of (3.43) reduce to the following form:
\[
\bar{\mathbf{x}}_k = \bar{\mathbf{A}}\,\bar{\mathbf{x}}_{k-1} + \bar{\mathbf{u}}_{k-1}, \tag{3.46}
\]
and the following theorem (whose proof is provided in Section 3.7.3) presents the mean-squared error of the algorithm:
Theorem 3.2. In Algorithm 5, assume that all the nodes on the graph get into the active stage synchronously, and the matrix Ā does not have an eigenvalue equal to 1. Then,
\[
\mathbb{E}\!\left[ \big\| \mathbf{y}(k) - \widetilde{\mathbf{u}} \big\|_2^2 \right]
= \Big\| \big(\mathbf{c}^{\mathsf T} \otimes \mathbf{G}\big)\, \bar{\mathbf{A}}^{\,k-1} \big(\bar{\mathbf{x}}_0 - \bar{\mathbf{x}}^{*}\big) \Big\|_2^2
\;+\; \sum_{s=0}^{k-1} |h_s|^2 \, \operatorname{tr}\!\big( \mathbf{G}^{s}\,\mathbf{Q}\,(\mathbf{G}^{s})^{\mathsf H} \big), \tag{3.47}
\]
where x̄* is the fixed point of (3.46), and the h_s's are the coefficients of the impulse response of the digital filter as in (3.34).
From (3.47) it is clear that as long as
\[
\rho(\bar{\mathbf{A}}) < 1, \tag{3.48}
\]
the first term of (3.47) converges to zero irrespective of the initial vector x̄_0 as the iterations progress. So, from Theorem 3.2, the residual error approaches an error floor:
\[
\lim_{k \to \infty} \mathbb{E}\!\left[ \big\| \mathbf{y}(k) - \widetilde{\mathbf{u}} \big\|_2^2 \right] = \operatorname{tr}(\mathbf{H}\,\mathbf{Q}), \tag{3.49}
\]
where
\[
\mathbf{H} = \sum_{s=0}^{\infty} |h_s|^2 \, (\mathbf{G}^{s})^{\mathsf H}\, \mathbf{G}^{s}. \tag{3.50}
\]
Thus, the error floor in the synchronous case depends on the impulse response of the underlying digital filter as well as the graph operator, but the similarity transform T does not affect the error floor. In short, the similarity transform does not affect either the convergence or the error floor in the synchronous case. Note that the stability condition in (3.48) ensures the convergence of the series in (3.50). Note also that ρ(Ā) = ρ(A) ρ(G) in view of (3.44).
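The identity ρ(Ā) = ρ(A) ρ(G) follows because the eigenvalues of a Kronecker product are the pairwise products of the factors' eigenvalues; a quick numerical check on arbitrary example matrices:

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((3, 3))                      # arbitrary example matrices
G = rng.standard_normal((5, 5))
rho = lambda M: np.abs(np.linalg.eigvals(M)).max()   # spectral radius
# spectral radii multiply under the Kronecker product
assert np.isclose(rho(np.kron(A, G)), rho(A) * rho(G))
```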
Next consider the asynchronous case. The equivalent model of the algorithm in (3.43) is in the form of (3.12); thus, the results presented in Section 3.2 (Corollary 3.1 in particular) can be used to study the convergence of the algorithm. In this regard, we present the following theorem, whose complete proof is given in Section 3.7.4:
Theorem 3.3. In Algorithm 5, let P denote the average node selection matrix and Q the covariance matrix of the measurement noise. If the state transition matrix A of the filter and the operator G of the graph satisfy the following:
\[
\|\mathbf{A}\|_2^2 \;\, \mathbf{G}^{\mathsf H}\, \mathbf{P}\, \mathbf{G} \prec \mathbf{P}, \tag{3.51}
\]
then
\[
\lim_{k \to \infty} \mathbb{E}\!\left[ \big\| \mathbf{y}(k) - \widetilde{\mathbf{u}} \big\|_2^2 \right] \le \operatorname{tr}(\mathbf{R}\,\mathbf{Q}), \tag{3.52}
\]
where
\[
\mathbf{R} = \frac{\|\mathbf{b}\|_2^2 \, \|\mathbf{c}\|_2^2 \, \|\mathbf{G}\|_2^2}{\sigma_{\min}\!\big( \mathbf{P} - \|\mathbf{A}\|_2^2 \, \mathbf{G}^{\mathsf H}\, \mathbf{P}\, \mathbf{G} \big)} \; \mathbf{P} \;+\; |d|^2\, \mathbf{I}. \tag{3.53}
\]
Theorem 3.3 presents an upper bound on the mean-squared error. In the noise-free case (Q = 0), the right-hand side of (3.52) becomes zero, and the condition (3.51) ensures the convergence of the output signal to the desired filtered signal in the mean-squared sense. We note also that the right-hand side of (3.52) is linear in the noise covariance matrix, which implies that the error floor of the algorithm increases at most linearly with the input noise. This will be verified numerically later in Section 3.5.1. (See Figure 3.4b.) In fact, it is possible to integrate the stochastic averaging techniques studied in [6, 7] into Algorithm 5 in order to overcome the error due to noise at the expense of a reduced convergence rate.
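The condition (3.51) and the bound (3.52)-(3.53) are straightforward to evaluate numerically. The sketch below uses a hypothetical operator, a diagonal average selection matrix, and placeholder realization parameters (A, b, c, d); all of these are assumptions for illustration:

```python
import numpy as np

rng = np.random.default_rng(2)
N, L = 6, 2
G = rng.standard_normal((N, N))
G /= 2 * np.linalg.norm(G, 2)               # normalize so that ||G||_2 = 0.5
p = rng.uniform(0.2, 0.9, N)                # assumed per-node activation probabilities
P = np.diag(p)                              # average node selection matrix
A = 0.4 * np.eye(L)                         # placeholder realization, ||A||_2 = 0.4
b = np.ones(L); c = np.ones(L); d = 1.0     # placeholder (b, c, d) of the realization

# condition (3.51): P - ||A||_2^2 G^H P G must be positive definite
M = P - np.linalg.norm(A, 2) ** 2 * G.conj().T @ P @ G
assert np.linalg.eigvalsh(M).min() > 0

# error bound (3.52)-(3.53)
sigma_min = np.linalg.svd(M, compute_uv=False).min()
R = (np.linalg.norm(b) ** 2 * np.linalg.norm(c) ** 2 * np.linalg.norm(G, 2) ** 2
     / sigma_min) * P + abs(d) ** 2 * np.eye(N)
Q = 0.01 * np.eye(N)                        # example noise covariance
error_floor_bound = np.trace(R @ Q)         # upper bound on the limiting MSE
```

Note that the bound scales linearly with Q, matching the discussion above.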
We conclude by noting that the graph filtering implementations considered in [158, 149, 150, 154, 157, 87, 88, 108, 109] are likely to tolerate asynchronicity up to a certain degree. In fact, [109] presented numerical evidence in this regard. This is not surprising because linear asynchronous fixed-point iterations are known to converge under some conditions [32, 14]. The main distinction of Algorithm 5 studied in this chapter is its proven convergence under mild and interpretable conditions in the assumed random asynchronous model (Theorem 3.3).
3.4.1 Selection of the Similarity Transform
In addition to the dependency on the graph operator and the average node selection matrix, the sufficiency condition (3.51) depends also on the realization of the filter of interest. Thus, in the asynchronous case, both the condition for convergence and the error bound depend on the similarity transform. Since the condition becomes more relaxed as the state transition matrix A has a smaller spectral norm, it is important to select the similarity transform T in (3.35) in such a way that A has the minimum spectral norm.
Due to their robustness, minimum-norm realizations of digital filters have been studied extensively in signal processing [12, 120, 198]. A minimum-norm implementation corresponds to an appropriate selection of the similarity transform T in (3.35) due to the following inequality:
\[
\|\mathbf{A}\|_2 \;\ge\; \rho(\mathbf{A}) \;=\; \rho(\widehat{\mathbf{A}}). \tag{3.54}
\]
The lower bound ρ(Â) depends only on the coefficients of the polynomial p(x) due to the definition of Â in (3.36).
The lower bound in (3.54) may not be achieved with equality in general, and we will consider one such example in the next section. Nevertheless, it is known that the companion matrix Â is diagonalizable if and only if the digital filter in (3.34) has L distinct poles [85]. That is to say, when there are L distinct nonzero z_i's such that p(z_i^{-1}) = 0, we can write the following eigenvalue decomposition:
\[
\widehat{\mathbf{A}} = \mathbf{V}_{\widehat{\mathbf{A}}}\; \boldsymbol{\Lambda}_{\widehat{\mathbf{A}}}\; \mathbf{V}_{\widehat{\mathbf{A}}}^{-1}, \tag{3.55}
\]
where Λ_Â is a diagonal matrix with the z_i^{-1}'s on the diagonal, and V_Â is a Vandermonde matrix corresponding to the z_i^{-1}'s. If the similarity transform T is selected according to (3.55), then the bound in (3.54) is indeed achieved. More precisely,
\[
\mathbf{T} = \mathbf{V}_{\widehat{\mathbf{A}}}
\;\implies\;
\mathbf{A} = \boldsymbol{\Lambda}_{\widehat{\mathbf{A}}}
\;\implies\;
\|\mathbf{A}\|_2 = \rho(\widehat{\mathbf{A}}). \tag{3.56}
\]
Thus, the most relaxed version of the sufficiency condition of Theorem 3.3 is obtained when the updates of Algorithm 5 are implemented using the similarity transform given in (3.56).
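The effect of choosing T as the eigenvector matrix can be seen on a small example. The 3×3 matrix below is a hypothetical companion matrix (bottom-row convention, which may differ from the exact layout of (3.36)) with distinct eigenvalues {0.2, 0.3, 0.5}:

```python
import numpy as np

# hypothetical companion matrix with characteristic polynomial
# x^3 - x^2 + 0.31 x - 0.03 = (x - 0.2)(x - 0.3)(x - 0.5)
Ahat = np.array([[0.00,  1.00, 0.0],
                 [0.00,  0.00, 1.0],
                 [0.03, -0.31, 1.0]])
eigvals, V = np.linalg.eig(Ahat)             # V plays the role of the Vandermonde basis
A = np.linalg.inv(V) @ Ahat @ V              # T = V yields a (numerically) diagonal A
spec_norm = np.linalg.norm(A, 2)
rho = np.abs(eigvals).max()
assert np.isclose(spec_norm, rho)            # the bound (3.54) is achieved, cf. (3.56)
assert np.linalg.norm(Ahat, 2) > rho         # the direct form itself is not minimal
```

The second assertion illustrates why the similarity transform matters: the direct-form realization has a strictly larger spectral norm than the diagonalized one.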
When the filter (3.34) has repeated poles, the companion matrix Â is not diagonalizable; hence an implementation achieving the bound (3.54) does not exist [12].
Nevertheless, the study [12] showed that for any ε > 0, there exists a realization with a state transition matrix A such that
\[
\|\mathbf{A}\|_2 \le \rho(\widehat{\mathbf{A}}) + \varepsilon. \tag{3.57}
\]
Therefore, it is always possible to obtain "almost minimum" realizations with a spectral norm arbitrarily close to the lower bound in (3.54). As a particular example, the case of FIR graph filters will be considered in the next section.
3.4.2 The Case of Polynomial Filters
Polynomial (FIR) graph filters can be considered as a special case of the rational graph filter (3.29), in which the denominator is selected as p(x) = 1 so that p(G) = I, and the filtered signal in (3.30) reduces to ũ = q(G) u. In this case, the companion matrix Â (direct form implementation) has the following form:
\[
\widehat{\mathbf{A}} =
\begin{bmatrix}
0 & 1 & 0 & \cdots & 0 \\
0 & 0 & 1 & \cdots & 0 \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
0 & 0 & 0 & \cdots & 1 \\
0 & 0 & 0 & \cdots & 0
\end{bmatrix}
\in \mathbb{R}^{L \times L}, \tag{3.58}
\]
which has all eigenvalues equal to zero, so that ρ(Â) = 0. As a result, no realization of a polynomial filter can achieve the lower bound (3.54), since ‖A‖₂ = 0 implies A = 0. However, the spectral norm of a realization can be made arbitrarily small.
In particular, consider the following similarity transform:
\[
\mathbf{T} = \operatorname{diag}\!\big( \begin{bmatrix} 1 & \sigma & \sigma^{2} & \cdots & \sigma^{L-1} \end{bmatrix} \big), \tag{3.59}
\]
where σ is an arbitrary nonzero complex number. Then, the corresponding realization A can be found as follows:
\[
\mathbf{A} = \mathbf{T}^{-1}\, \widehat{\mathbf{A}}\, \mathbf{T} = \sigma\, \widehat{\mathbf{A}}
\;\implies\;
\|\mathbf{A}\|_2 = |\sigma|. \tag{3.60}
\]
Thus, it is possible to select a value for σ (with a sufficiently small magnitude) in order to satisfy the condition (3.51). (See [101, Fact 2.5.4].) Such a selection is not unique in general, and one can easily find a value for σ satisfying the following:
\[
|\sigma| < \Big( \|\mathbf{G}\|_2 \, \sqrt{ \|\mathbf{P}\|_2 \, \|\mathbf{P}^{-1}\|_2 } \Big)^{-1}, \tag{3.61}
\]
which ensures that the condition (3.51) is met.
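The construction (3.58)-(3.61) can be sketched in a few lines of NumPy; the operator, the selection probabilities, and the sizes below are arbitrary examples:

```python
import numpy as np

rng = np.random.default_rng(4)
N, L = 6, 4
G = rng.standard_normal((N, N))              # hypothetical graph operator
p = rng.uniform(0.3, 0.8, N)                 # assumed activation probabilities
P = np.diag(p)                               # average node selection matrix

Ahat = np.diag(np.ones(L - 1), 1)            # nilpotent direct form (3.58)
bound = 1.0 / (np.linalg.norm(G, 2)
               * np.sqrt(np.linalg.norm(P, 2) * np.linalg.norm(np.linalg.inv(P), 2)))
sigma = 0.5 * bound                          # any |sigma| below the bound (3.61) works
T = np.diag(sigma ** np.arange(L))           # similarity transform (3.59)
A = np.linalg.inv(T) @ Ahat @ T

assert np.allclose(A, sigma * Ahat)          # (3.60): A = sigma * Ahat, ||A||_2 = |sigma|
M = P - np.linalg.norm(A, 2) ** 2 * G.T @ P @ G
assert np.linalg.eigvalsh(M).min() > 0       # condition (3.51) is met
```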
As a result, for any graph operator G and average node selection matrix P, it is always possible to implement any polynomial filter in a random node-asynchronous manner that is guaranteed to converge in the mean-squared sense. However, we note that T given in (3.59) may not be the optimal similarity transform in general.
We also note that when a polynomial filter is implemented in a synchronous manner, Theorem 3.2 shows that the algorithm reaches the error floor after L iterations, since Â in (3.58) is a nilpotent matrix and Â^k = A^k = 0 for k ≥ L. This convergence behavior will be verified numerically later in Section 3.5.4. The error bound still depends on T because of ‖A‖₂.
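The L-step convergence of the synchronous polynomial case can be illustrated by iterating (3.46) directly; the graph, signal, and realization below are placeholder examples with a nilpotent A = σÂ:

```python
import numpy as np

rng = np.random.default_rng(3)
N, L = 5, 3
G = rng.standard_normal((N, N))        # hypothetical graph operator
A = 0.5 * np.diag(np.ones(L - 1), 1)   # nilpotent FIR realization, A^L = 0
b = np.zeros(L); b[-1] = 1.0           # placeholder input vector of the realization
u = rng.standard_normal(N)

Abar = np.kron(A, G)                   # vectorized model (3.44)
ubar = np.kron(b, u)
x = np.zeros(N * L)
states = [x.copy()]
for _ in range(L + 2):
    x = Abar @ x + ubar                # synchronous recursion (3.46)
    states.append(x.copy())

assert np.allclose(np.linalg.matrix_power(Abar, L), 0)  # Abar is nilpotent as well
assert np.allclose(states[L], states[L + 1])            # fixed point reached in L steps
```

Since Ā^L = A^L ⊗ G^L = 0, the iterate stops changing after exactly L synchronous steps, as claimed.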