RANDOM NODE-ASYNCHRONOUS UPDATES ON GRAPHS
2.3 Cascade of Asynchronous Updates
2.3.2 Convergence in Mean-Squared Sense
In the following we will assume that $A$ has a unit eigenvalue with multiplicity $M \geq 1$. This assumption ensures that the asynchronous update equation has a fixed point (Lemma 2.2). Without loss of generality we will order the eigenvalues of $A$ such that $\lambda_i \neq 1$ for $1 \leq i \leq N-M$. Notice that the non-unit eigenvalues are allowed to be complex in general, and complex eigenvalues on or outside the unit circle are not ruled out. Then, the eigenvalue decomposition of $A$ can be written as
$$A = \begin{bmatrix} U & V_1 \end{bmatrix} \, \mathrm{diag}\!\left(\begin{bmatrix} \lambda_1 & \cdots & \lambda_{N-M} & 1 & \cdots & 1 \end{bmatrix}\right) \begin{bmatrix} U & V_1 \end{bmatrix}^H, \qquad (2.29)$$
where $V_1 \in \mathbb{C}^{N \times M}$ is an orthonormal basis for the eigenspace of the unit eigenvalue, and $U \in \mathbb{C}^{N \times (N-M)}$ corresponds to the eigenvectors of the non-unit eigenvalues. Since $A$ is assumed to be a normal matrix, we have $U^H V_1 = 0$ and $U^H U = I$. We now define the following quantities:
$$\mu = \lambda_{\max}\!\left(U^H \mathrm{diag}(U U^H)\, U\right), \qquad (2.30)$$
$$\bar{\mu} = \lambda_{\min}\!\left(U^H \mathrm{diag}(U U^H)\, U\right), \qquad (2.31)$$
which will play a crucial role in the analysis of convergence. Notice that $\mu$ and $\bar{\mu}$ do not depend on the particular selection of the basis matrix $U$; only the column space of $U$ determines their values. More importantly, we have the following property:
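As a quick numerical illustration (a hypothetical example, not part of the analysis), the quantities in (2.30) and (2.31) can be computed for any matrix $U$ with orthonormal columns. The sketch below also checks that they are invariant to the choice of basis for the column space of $U$:

```python
# Numerical sketch: compute mu = lambda_max(U^H diag(U U^H) U) and
# mu_bar = lambda_min(...) for a random orthonormal U, and verify that
# rotating the basis (same column space) leaves both values unchanged.
# The dimensions and the random seed are hypothetical choices.
import numpy as np

def mu_bounds(U):
    """Return (mu_bar, mu) for a matrix U with orthonormal columns."""
    D = np.diag(np.diag(U @ U.conj().T).real)   # diag(U U^H) as a matrix
    M = U.conj().T @ D @ U                      # U^H diag(U U^H) U (Hermitian)
    w = np.linalg.eigvalsh(M)
    return w[0], w[-1]

rng = np.random.default_rng(0)
N, r = 8, 3
U, _ = np.linalg.qr(rng.standard_normal((N, r)) + 1j * rng.standard_normal((N, r)))
mu_bar, mu = mu_bounds(U)

# Rotate the basis: same column space, different orthonormal basis.
Q, _ = np.linalg.qr(rng.standard_normal((r, r)) + 1j * rng.standard_normal((r, r)))
mu_bar2, mu2 = mu_bounds(U @ Q)

assert abs(mu - mu2) < 1e-9 and abs(mu_bar - mu_bar2) < 1e-9
# Consistent with Lemma 2.3: 1/N <= mu_bar <= mu <= 1.
assert 1.0 / N - 1e-9 <= mu_bar <= mu <= 1.0 + 1e-9
```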
Lemma 2.3. The following holds true for any $U \in \mathbb{C}^{N \times (N-M)}$ with $U^H U = I$:
$$\frac{1}{N}\, I \;\preceq\; U^H \mathrm{diag}\!\left(U U^H\right) U \;\preceq\; I. \qquad (2.32)$$
Proof. We note that $U \in \mathbb{C}^{N \times (N-M)}$ has orthonormal columns, i.e., $U^H U = I$, and prove the upper bound in (2.32) first. Note that $U U^H \preceq I$. Then we can write the following:
$$(U U^H)_{i,i} = e_i^H\, U U^H e_i \leq 1 \;\implies\; \mathrm{diag}\!\left(U U^H\right) \preceq I \;\implies\; U^H \mathrm{diag}\!\left(U U^H\right) U \preceq I, \qquad (2.33)$$
where $e_i$ denotes the $i^{th}$ column of the identity matrix of dimension $N$.
We now prove the lower bound in (2.32). Let $u(i)$ denote the $i^{th}$ row of $U$; then it is clear that $(U U^H)_{i,j} = u(i)\, u^H(j)$. Let $x \in \mathbb{C}^N$ be an arbitrary vector. Then,
$$x^H U U^H x = \left| x^H U U^H x \right| = \left| \sum_{i=1}^{N} \sum_{j=1}^{N} x_i^* \, (U U^H)_{i,j} \, x_j \right| = \left| \sum_{i=1}^{N} \sum_{j=1}^{N} x_i^* \, u(i)\, u^H(j) \, x_j \right| \leq \sum_{i=1}^{N} \sum_{j=1}^{N} |x_i| \, \big| u(i)\, u^H(j) \big| \, |x_j| \leq \sum_{i=1}^{N} \sum_{j=1}^{N} |x_i| \, \|u(i)\|_2 \, \|u(j)\|_2 \, |x_j| = \left( \sum_{i=1}^{N} |x_i| \, \|u(i)\|_2 \right)^{\!2} \leq N \sum_{i=1}^{N} |x_i|^2 \, \|u(i)\|_2^2 = N \, x^H \mathrm{diag}\!\left(U U^H\right) x, \qquad (2.34)$$
where the last two inequalities follow from the Cauchy-Schwarz inequality, and the last equality uses $\big(\mathrm{diag}(U U^H)\big)_{i,i} = \|u(i)\|_2^2$. Then, the inequality (2.34) implies that
$$U U^H \preceq N \, \mathrm{diag}\!\left(U U^H\right) \;\implies\; U^H U U^H U \preceq N \, U^H \mathrm{diag}\!\left(U U^H\right) U, \qquad (2.35)$$
which proves the lower bound due to the fact that $U^H U = I$.

So, Lemma 2.3 implies the following inequality regarding the quantities $\bar{\mu}$ and $\mu$:
$$\frac{1}{N} \;\leq\; \bar{\mu} \;\leq\; \mu \;\leq\; 1. \qquad (2.36)$$
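Both ends of (2.36) are attained by concrete subspaces. The brief sketch below (with hypothetical choices of $U$) shows that a coordinate direction attains the upper bound $\mu = 1$, while the normalized all-ones direction attains the lower bound $\bar{\mu} = 1/N$:

```python
# Sketch: tightness of the bounds in (2.36) for two hypothetical choices of U.
import numpy as np

N = 5

# U spanning a coordinate direction: diag(U U^H) = U U^H, so mu = mu_bar = 1.
U1 = np.eye(N)[:, :1]
M1 = U1.T @ np.diag(np.diag(U1 @ U1.T)) @ U1
assert np.isclose(M1[0, 0], 1.0)

# U = all-ones vector / sqrt(N): diag(U U^H) = I/N, so mu = mu_bar = 1/N.
U2 = np.ones((N, 1)) / np.sqrt(N)
M2 = U2.T @ np.diag(np.diag(U2 @ U2.T)) @ U2
assert np.isclose(M2[0, 0], 1.0 / N)
```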
For an arbitrary $x_k$, let $r_k$ denote the residual from the projection of $x_k$ onto the column space of $V_1$. That is,
$$r_k = x_k - V_1 V_1^H x_k = U U^H x_k. \qquad (2.37)$$
Then, the convergence of $x_k$ to an eigenvector of the unit eigenvalue is equivalent to the convergence of $r_k$ to zero. The following theorem, whose proof is presented in Appendix 2.10.2, provides bounds on $r_k$ as follows:
Theorem 2.2. The expected squared $\ell_2$-norm of the residual at the $k^{th}$ iteration is bounded as follows:
$$\Phi^k \, \|r_0\|_2^2 \;\leq\; \mathbb{E}\!\left[\|r_k\|_2^2\right] \;\leq\; \Psi^k \, \|r_0\|_2^2, \qquad (2.38)$$
where
$$\Psi = \max_{1 \leq i \leq N-M} \psi(\lambda_i), \qquad \Phi = \min_{1 \leq i \leq N-M} \bar{\psi}(\lambda_i), \qquad (2.39)$$
$$\psi(\lambda) = 1 + \frac{\mu_T}{N} \left( |\lambda|^2 - 1 + \delta_T\, (\mu - 1) \, |\lambda - 1|^2 \right), \qquad (2.40)$$
$$\bar{\psi}(\lambda) = 1 + \frac{\mu_T}{N} \left( |\lambda|^2 - 1 + \delta_T\, (\bar{\mu} - 1) \, |\lambda - 1|^2 \right).$$
The importance of Theorem 2.2 is twofold. First, it reveals the effect of the eigenvalues ($\lambda_i$), the eigenspace geometry ($\mu$, $\bar{\mu}$), and the amount of asynchronicity of the updates ($\delta_T$) on the rate of convergence. In the synchronous case $\delta_T = 0$ and $\mu_T = N$, hence we have $\Psi = \max_{1 \leq i \leq N-M} |\lambda_i|^2$. This result is consistent with the well-known fact that the rate of convergence of the power iteration is determined by the second largest eigenvalue. However, in the asynchronous case ($\delta_T > 0$), not just the eigenvalues but also the eigenspace geometry of $A$ has an effect. As a result, similar matrices may have different convergence rates due to their different eigenspaces. This point will be elaborated in Section 2.4. Furthermore, in order to guarantee that $\mathbb{E}[\|r_k\|_2^2] \leq \epsilon\, \|r_0\|_2^2$ for a given error threshold $\epsilon$, the inequalities in (2.38) show that it is necessary to have at least $\lfloor \log(\epsilon)/\log(\Phi) \rfloor$ iterations, and sufficient to have $\lceil \log(\epsilon)/\log(\Psi) \rceil$ iterations.
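The rate constants and the iteration counts implied by (2.38) can be sketched numerically. All parameter values below (the eigenvalues, $\mu$, $\bar{\mu}$, $\mu_T$, $\delta_T$, and the threshold $\epsilon$) are hypothetical, chosen only to illustrate the computation:

```python
# Sketch of the rate computation in Theorem 2.2 with hypothetical parameters.
import math

def psi(lam, mu_T, N, delta_T, m):
    # psi(lambda) = 1 + (mu_T/N) * (|lambda|^2 - 1 + delta_T*(m - 1)*|lambda - 1|^2)
    return 1 + (mu_T / N) * (abs(lam)**2 - 1 + delta_T * (m - 1) * abs(lam - 1)**2)

# Hypothetical setting: N nodes, one node updated per iteration.
N, mu_T, delta_T = 10, 1, 1.0
mu, mu_bar = 0.5, 0.2                     # eigenspace-geometry constants (assumed)
lams = [0.9, -0.8, 0.7 + 0.3j]            # non-unit eigenvalues (assumed)

Psi = max(psi(l, mu_T, N, delta_T, mu) for l in lams)       # upper rate in (2.38)
Phi = min(psi(l, mu_T, N, delta_T, mu_bar) for l in lams)   # lower rate in (2.38)

eps = 1e-6
k_sufficient = math.ceil(math.log(eps) / math.log(Psi))
k_necessary = math.floor(math.log(eps) / math.log(Phi))
assert Phi <= Psi < 1          # both rates strictly below one here
assert k_necessary <= k_sufficient
```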
Secondly, Theorem 2.2 reveals a region for the eigenvalues such that the residual error through asynchronous updates is guaranteed to converge to zero in the mean-squared sense. The following corollary presents this result formally.
Corollary 2.2. Assume that all non-unit eigenvalues of $A$ satisfy the following condition:
$$\left| \lambda - \frac{\alpha}{\alpha + 1} \right| < \frac{1}{\alpha + 1}, \qquad (2.41)$$
where
$$\alpha = \delta_T \, (\mu - 1). \qquad (2.42)$$
Then,
$$\lim_{k \to \infty} \mathbb{E}\!\left[\|r_k\|_2^2\right] = 0. \qquad (2.43)$$
Proof. From (2.39) it is clear that $\Psi < 1$ if and only if
$$|\lambda|^2 - 1 + \alpha \, |\lambda - 1|^2 < 0, \qquad (2.44)$$
for all non-unit eigenvalues $\lambda$. The inequality in (2.44) can be equivalently written as in (2.41). Since it implies that $\Psi < 1$, Theorem 2.2 guarantees the convergence of $\mathbb{E}[\|r_k\|_2^2]$ to zero as the number of updates, $k$, goes to infinity.
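The equivalence between the inequality (2.44) and the disk condition (2.41), used in the proof above, can be checked numerically over random test points. The sketch below uses hypothetical values of $\alpha$ in $(-1, 0]$:

```python
# Sketch: for alpha in (-1, 0], the disk condition (2.41) holds exactly when
# |lambda|^2 - 1 + alpha*|lambda - 1|^2 < 0, i.e., condition (2.44).
import numpy as np

rng = np.random.default_rng(1)
checked = 0
for alpha in (-0.9, -0.5, -0.1, 0.0):
    for _ in range(1000):
        lam = complex(rng.uniform(-3, 3), rng.uniform(-3, 3))
        in_disk = abs(lam - alpha / (alpha + 1)) < 1 / (alpha + 1)   # (2.41)
        negative = abs(lam)**2 - 1 + alpha * abs(lam - 1)**2 < 0     # (2.44)
        assert in_disk == negative
        checked += 1
```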
An important remark is as follows: Corollary 2.2 provides a condition under which $r_k$ is guaranteed to converge to a point (zero) as $k$ goes to infinity. On the other hand, $x_k$ itself only converges to a random variable defined over the eigenspace of the unit eigenvalue. This is illustrated in Figure 2.1, where the eigenspace of the unit eigenvalue is spanned by the vector $[1 \;\; 1]^H$, and $x_0 = [-1 \;\; 1]^H$. In the synchronous case the signal converges to a point through a deterministic trajectory, as shown in Figure 2.1a. For the random asynchronous case, Figure 2.1b illustrates the trajectories of the signals for different realizations. Convergence of $r_k$ to zero implies that the limit of $x_k$ always lies in the eigenspace of the unit eigenvalue (with a random orientation). Since any point in the eigenspace is an eigenvector, we can safely say that $x_k$ converges to an eigenvector of the unit eigenvalue.
Notice that the convergence region for the eigenvalues defined in (2.41) is parametrized by $\alpha$, and it is a disk on the complex plane with radius $1/(\alpha+1)$ centered at $\alpha/(\alpha+1)$. This region is visualized in Figure 2.2. Notice that $0 \leq \delta_T \leq 1$ and $0 < \mu \leq 1$ always hold true. As a result $\alpha$ satisfies $-1 < \alpha \leq 0$. The key observation is that the region in (2.41) grows as $\alpha$ approaches $-1$, and it is the smallest (and corresponds to the unit disk) when $\alpha = 0$. The quantity $\beta$ and the large circle in Figure 2.2 will be explained after Corollary 2.4.
Corollary 2.2 reveals the combined effect of the eigenspace geometry of $A$ (quantified with $\mu$) and the amount of asynchronicity (quantified with $\delta_T$) on the convergence of the iterations. In the case of $\delta_T = 0$ the region reduces to the unit disk, which is the well-known condition on the eigenvalues for the synchronous updates to converge. This is an expected result since the case of $\delta_T = 0$ corresponds to the synchronous update itself. More importantly, the synchronous updates imply $\alpha = 0$ independent of the eigenspace geometry of $A$. Therefore, the convergence is determined entirely by the eigenvalues of $A$ in the synchronous case.

Figure 2.1: Some realizations of the trajectories of the signal through updates for (a) the non-random synchronous case, (b) the random asynchronous case.
On the other hand, the case of asynchronous updates results in a larger convergence region for the eigenvalues. First of all, it should be noted that asynchronous updates increase the convergence region if the eigenspace geometry of $A$ permits. If $\mu = 1$ then $\alpha = 0$, and the region of convergence is not improved by asynchronous iterations. However, if $\mu < 1$ (which is the case in most practical applications), then it is possible to enlarge the region of convergence using asynchronous iterations. As $\delta_T$ gets larger (fewer nodes are updated concurrently), $\alpha$ gets smaller, hence the convergence region gets larger. Even if only one index is left unchanged in some iterations, we have $\delta_T > 0$, and the residual $r_k$ can converge to zero, even when non-unit eigenvalues outside the unit circle might exist. This is a remarkable property of the asynchronous updates since the residual (hence the signal itself) would blow up in the case of synchronous updates. Notice that in the extreme case of $\delta_T = 1$, the region of convergence is the largest possible. That is to say, updating exactly one node in each iteration maximizes the region of convergence of the eigenvalues.
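This phenomenon can be illustrated with a small numerical sketch. The $2 \times 2$ matrix below is hypothetical (not taken from the text): it is normal with eigenvalues $1$ and $-1.5$, and the non-unit eigenvalue lies outside the unit circle yet satisfies (2.41) for single-node updates, so random asynchronous updates drive the residual to zero even though synchronous iterations diverge:

```python
# Sketch (hypothetical example): A is normal with eigenvalue 1 (eigenvector
# [1, 1]) and eigenvalue -1.5 (eigenvector [1, -1]). Synchronous iterations
# blow up, but updating one random node per iteration shrinks the residual.
import numpy as np

A = np.array([[-0.25, 1.25],
              [1.25, -0.25]])

rng = np.random.default_rng(0)
x = np.array([2.0, -1.0])

def res_sq(v):
    # Squared residual: projection of v onto the eigenvector [1, -1]/sqrt(2).
    return (v[0] - v[1])**2 / 2

r0 = res_sq(x)
for _ in range(20):
    i = rng.integers(2)          # update exactly one node per iteration
    x[i] = A[i] @ x
rk = res_sq(x)
assert rk < 1e-10 * r0           # residual decays (by a factor 1/16 per update here)

# Synchronous iteration on the same matrix blows up:
y = np.array([2.0, -1.0])
for _ in range(20):
    y = A @ y
assert res_sq(y) > 1e3 * r0
```

In this example either single-node update multiplies the residual component by $-0.25$, so the decay is in fact deterministic, while the synchronous iteration multiplies it by $-1.5$ and diverges.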
On the other extreme, the synchronous update is the most restrictive case, which is formally stated in the following corollary:
Corollary 2.3. If the synchronous updates on $A$ converge, then
$$\lim_{k \to \infty} \mathbb{E}\!\left[\|r_k\|_2^2\right] = 0, \qquad (2.45)$$
for random updates on $A$ with any amount of asynchronicity.
Proof. If the synchronous updates converge, then all non-unit eigenvalues of $A$ satisfy $|\lambda| < 1$. Hence, they also satisfy (2.41) for any value of $\alpha$. Therefore, Corollary 2.2 ensures the convergence of the updates irrespective of the value of $\delta_T$.
It should be clear that the converse of Corollary 2.3 is not true. To see an implication, consider a scenario in which a signal over a network of nodes with autonomous (asynchronous) behavior stays in the steady state. If the nodes start to operate synchronously, then it is possible for the signal to blow up. This happens if some of the eigenvalues fall outside of the reduced convergence region due to the reduction in the amount of asynchronicity.
In fact, the study in [142] claims that large-scale synchronization of neurons is an underlying mechanism of epileptic seizures. Similarly, the study in [196] presents the relation between increased neural synchrony and epilepsy as well as Parkinson's disease. It should be noted that neural networks follow nonlinear models, whereas the model we consider here is linear. Thus, the results presented here do not apply to brain networks. Nevertheless, these neurobiological observations are consistent with the implications of Corollary 2.2 and Corollary 2.3 from a conceptual point of view.
Apart from the convergence of the iterations, Theorem 2.2 is also useful to characterize the case of non-converging iterations. In this regard, the following corollary presents a region for the eigenvalues such that asynchronous updates are guaranteed not to converge.
Corollary 2.4. Assume that all non-unit eigenvalues of $A$ satisfy the following:
$$\left| \lambda - \frac{\beta}{\beta + 1} \right| \geq \frac{1}{\beta + 1}, \qquad (2.46)$$
where
$$\beta = \delta_T \, (\bar{\mu} - 1). \qquad (2.47)$$
Then,
$$\mathbb{E}\!\left[\|r_k\|_2^2\right] \geq \|r_0\|_2^2. \qquad (2.48)$$
Furthermore, if (2.46) is satisfied with strict inequality, then $\mathbb{E}[\|r_k\|_2^2]$ grows unboundedly as $k$ goes to infinity.
Figure 2.2: Regions (given in (2.41) and (2.46)) for the eigenvalues such that random asynchronous updates are guaranteed to converge and diverge, respectively. The unit disk is the convergence region for synchronous updates; the disk centered at $\alpha/(\alpha+1)$ is the convergence region for random asynchronous updates; the exterior of the disk centered at $\beta/(\beta+1)$ is the region of no convergence for random asynchronous updates. In the region between the two circles both (2.41) and (2.46) are violated, and convergence is inconclusive.
Proof. From (2.39) it is clear that $\Phi \geq 1$ if and only if
$$|\lambda|^2 - 1 + \beta \, |\lambda - 1|^2 \geq 0, \qquad (2.49)$$
for all non-unit eigenvalues $\lambda$. The inequality in (2.49) can be equivalently written as in (2.46). Since (2.46) implies that $\Phi \geq 1$, Theorem 2.2 indicates that $\mathbb{E}[\|r_k\|_2^2]$ is lower bounded by $\|r_0\|_2^2$. If (2.46) is satisfied strictly, then $\Phi > 1$. As a result, $\mathbb{E}[\|r_k\|_2^2]$ grows unboundedly as $k$ goes to infinity.
From the definitions in (2.42) and (2.47), note that $\alpha \geq \beta$ is always true due to the fact that $\mu \geq \bar{\mu}$. Therefore, the conditions in (2.41) and (2.46) describe disjoint regions on the complex plane. See Figure 2.2. Corollary 2.4 also shows that the condition $|\lambda - \beta/(\beta+1)| < 1/(\beta+1)$ is necessary for the iterations to converge, whereas the condition in (2.41) is sufficient for the convergence (both in the mean-squared sense). If there exists an eigenvalue that violates both (2.41) and (2.46), then convergence is inconclusive. This region is also indicated in Figure 2.2.
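The disjointness of the two regions can also be verified numerically. The sketch below (with hypothetical parameter ranges) samples pairs $\beta \leq \alpha$ in $(-1, 0]$ and random points $\lambda$, and confirms that no point satisfies (2.41) and (2.46) simultaneously:

```python
# Sketch: since beta <= alpha, the convergence region (2.41) and the
# no-convergence region (2.46) never overlap.
import numpy as np

rng = np.random.default_rng(2)
overlap = 0
for _ in range(2000):
    beta = rng.uniform(-0.99, 0.0)
    alpha = rng.uniform(beta, 0.0)                 # alpha >= beta always holds
    lam = complex(rng.uniform(-4, 4), rng.uniform(-4, 4))
    converges = abs(lam - alpha / (alpha + 1)) < 1 / (alpha + 1)   # (2.41)
    diverges = abs(lam - beta / (beta + 1)) >= 1 / (beta + 1)      # (2.46)
    if converges and diverges:
        overlap += 1
assert overlap == 0
```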
At this point it is important to compare the implications of Corollary 2.2 with the classical result presented in [14, 32]. Under the mild assumption that all the indices are selected sufficiently often (see [32] for the precise definition), the study [32] showed that the linear asynchronous model in (2.5) converges for any index sequence if and only if the spectral radius of $|A|$ is strictly less than unity, where $|A|$ denotes the matrix with element-wise absolute values of $A$. On the other hand, our Corollary 2.2 allows eigenvalues with magnitudes greater than unity. Although these two results appear to be contradictory (when $A$ consists of non-negative elements), the key difference is the notion of convergence. As an example, consider the matrix $A_2$ defined in (2.55). Its spectral radius is exactly 1, and [32] proved that there exists a sequence of indices under which iterations on $A_2$ do not converge. For example, assuming $N$ is odd, consider the index sequence generated as $i_k = (2k-1) \ (\mathrm{mod}\ N) + 1$. However, Corollary 2.2 proves the convergence in a statistical mean-squared sense. (See Figure 2.5.) In short, when compared with [32], Corollary 2.2 requires a weaker condition on $A$ and guarantees convergence in a weaker (and probabilistic) sense.