

2.3 Cascade of Asynchronous Updates

2.3.2 Convergence in Mean-Squared Sense

In the following we will assume that $\mathbf{A}$ has a unit eigenvalue with multiplicity $M \ge 1$.

This assumption ensures that the asynchronous update equation has a fixed point (Lemma 2.2). Without loss of generality we will order the eigenvalues of $\mathbf{A}$ such that $\lambda_j \neq 1$ for $1 \le j \le N-M$. Notice that non-unit eigenvalues are allowed to be complex in general, and complex eigenvalues on or outside the unit circle are not ruled out. Then, the eigenvalue decomposition of $\mathbf{A}$ can be written as

$$ \mathbf{A} = [\,\mathbf{U} \;\; \mathbf{V}_1\,]\; \mathrm{diag}\big(\lambda_1, \,\dots,\, \lambda_{N-M},\, 1, \,\dots,\, 1\big)\; [\,\mathbf{U} \;\; \mathbf{V}_1\,]^H, \tag{2.29} $$

where $\mathbf{V}_1 \in \mathbb{C}^{N \times M}$ is an orthonormal basis for the eigenspace of the unit eigenvalue, and $\mathbf{U} \in \mathbb{C}^{N \times (N-M)}$ corresponds to the eigenvectors of the non-unit eigenvalues.

Since $\mathbf{A}$ is assumed to be a normal matrix, we have $\mathbf{U}^H \mathbf{V}_1 = \mathbf{0}$ and $\mathbf{U}^H \mathbf{U} = \mathbf{I}$. We now define the following quantities:

𝜌 =πœ†

max UH diag(U UH)U

, (2.30)

Β― 𝜌 =πœ†

min UH diag(U UH)U

, (2.31)

which will play a crucial role in the analysis of convergence. Notice that $\rho$ and $\bar{\rho}$ do not depend on the particular selection of the basis matrix $\mathbf{U}$; only the column space of $\mathbf{U}$ determines their values.
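As a quick numerical illustration (our own sketch, not from the original text), the following numpy snippet builds an arbitrary orthonormal basis $\mathbf{U}$, evaluates $\rho$ and $\bar{\rho}$ as the extreme eigenvalues of $\mathbf{U}^H \mathrm{diag}(\mathbf{U}\mathbf{U}^H)\mathbf{U}$, and checks the basis independence by rotating $\mathbf{U}$ with a random unitary matrix:

```python
import numpy as np

rng = np.random.default_rng(0)
N, M = 8, 1

# An arbitrary unitary matrix; its first N-M columns play the role of U.
Q, _ = np.linalg.qr(rng.standard_normal((N, N)) + 1j * rng.standard_normal((N, N)))
U = Q[:, : N - M]

# G = U^H diag(U U^H) U is Hermitian, so its eigenvalues are real.
D = np.diag(np.diag(U @ U.conj().T).real)
G = U.conj().T @ D @ U
rho_bar, rho = np.linalg.eigvalsh(G)[[0, -1]]   # eigvalsh sorts ascending
print(rho_bar, rho)

# Rotating U by any (N-M)x(N-M) unitary W leaves the column space, hence
# rho and rho_bar, unchanged: diag(U U^H) is unaffected and G becomes the
# similarity transform W^H G W.
W, _ = np.linalg.qr(rng.standard_normal((N - M, N - M))
                    + 1j * rng.standard_normal((N - M, N - M)))
G2 = (U @ W).conj().T @ D @ (U @ W)
assert np.allclose(np.linalg.eigvalsh(G2), np.linalg.eigvalsh(G))
```

More importantly, we have the following property: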

Lemma 2.3. The following holds true for any $\mathbf{U} \in \mathbb{C}^{N \times (N-M)}$ with $\mathbf{U}^H\mathbf{U} = \mathbf{I}$:

$$ \frac{1}{N}\,\mathbf{I} \;\preceq\; \mathbf{U}^H \, \mathrm{diag}\!\left(\mathbf{U}\mathbf{U}^H\right) \mathbf{U} \;\preceq\; \mathbf{I}. \tag{2.32} $$

Proof. We note that $\mathbf{U} \in \mathbb{C}^{N \times (N-M)}$ has orthonormal columns, i.e., $\mathbf{U}^H\mathbf{U} = \mathbf{I}$, and prove the upper bound in (2.32) first. Note that $\mathbf{U}\mathbf{U}^H \preceq \mathbf{I}$. Then we can write the following:

$$ (\mathbf{U}\mathbf{U}^H)_{i,i} = \mathbf{e}_i^H\, \mathbf{U}\mathbf{U}^H \mathbf{e}_i \le 1 \;\Longrightarrow\; \mathrm{diag}\!\left(\mathbf{U}\mathbf{U}^H\right) \preceq \mathbf{I} \;\Longrightarrow\; \mathbf{U}^H\,\mathrm{diag}\!\left(\mathbf{U}\mathbf{U}^H\right)\mathbf{U} \preceq \mathbf{I}, \tag{2.33} $$

where $\mathbf{e}_i$ denotes the $i^{th}$ column of the identity matrix of dimension $N$.

We now prove the lower bound in (2.32). Let $\mathbf{u}(i)$ denote the $i^{th}$ row of $\mathbf{U}$, so that $(\mathbf{U}\mathbf{U}^H)_{i,j} = \mathbf{u}(i)\,\mathbf{u}^H(j)$. Let $\mathbf{x} \in \mathbb{C}^N$ be an arbitrary vector. Then,

$$
\begin{aligned}
\mathbf{x}^H \mathbf{U}\mathbf{U}^H \mathbf{x}
&= \left| \mathbf{x}^H \mathbf{U}\mathbf{U}^H \mathbf{x} \right|
= \left| \sum_{i=1}^{N}\sum_{j=1}^{N} x_i^*\,(\mathbf{U}\mathbf{U}^H)_{i,j}\,x_j \right|
= \left| \sum_{i=1}^{N}\sum_{j=1}^{N} x_i^*\,\mathbf{u}(i)\,\mathbf{u}^H(j)\,x_j \right| \\
&\le \sum_{i=1}^{N}\sum_{j=1}^{N} |x_i|\,\big|\mathbf{u}(i)\,\mathbf{u}^H(j)\big|\,|x_j|
\le \sum_{i=1}^{N}\sum_{j=1}^{N} |x_i|\,\|\mathbf{u}(i)\|_2\,\|\mathbf{u}(j)\|_2\,|x_j| \\
&= \left( \sum_{i=1}^{N} |x_i|\,\|\mathbf{u}(i)\|_2 \right)^{\!2}
\le N \sum_{i=1}^{N} |x_i|^2\,\|\mathbf{u}(i)\|_2^2
= N\,\mathbf{x}^H\,\mathrm{diag}\!\left(\mathbf{U}\mathbf{U}^H\right)\mathbf{x},
\end{aligned}
\tag{2.34}
$$

where the second and the last inequalities both follow from the Cauchy–Schwarz inequality (applied to the rows $\mathbf{u}(i)$, $\mathbf{u}(j)$, and to the vector with entries $|x_i|\,\|\mathbf{u}(i)\|_2$ paired with the all-ones vector, respectively).

Then, the inequality (2.34) implies that

$$ \mathbf{U}\mathbf{U}^H \preceq N\,\mathrm{diag}\!\left(\mathbf{U}\mathbf{U}^H\right) \;\Longrightarrow\; \mathbf{U}^H\mathbf{U}\,\mathbf{U}^H\mathbf{U} \preceq N\,\mathbf{U}^H\,\mathrm{diag}\!\left(\mathbf{U}\mathbf{U}^H\right)\mathbf{U}, \tag{2.35} $$

which proves the lower bound due to the fact that $\mathbf{U}^H\mathbf{U} = \mathbf{I}$.

So, Lemma 2.3 implies the following inequality regarding the quantities $\bar{\rho}$ and $\rho$:

$$ \frac{1}{N} \;\le\; \bar{\rho} \;\le\; \rho \;\le\; 1. \tag{2.36} $$
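As a sanity check (again our own sketch, not part of the original text), the sandwich in (2.36) can be verified numerically over many random draws of the basis $\mathbf{U}$; the bounds should never be violated:

```python
import numpy as np

N, M = 8, 2
for seed in range(100):
    rng = np.random.default_rng(seed)
    Q, _ = np.linalg.qr(rng.standard_normal((N, N)) + 1j * rng.standard_normal((N, N)))
    U = Q[:, : N - M]                                    # orthonormal columns
    G = U.conj().T @ np.diag(np.diag(U @ U.conj().T).real) @ U
    e = np.linalg.eigvalsh(G)                            # [rho_bar, ..., rho]
    assert 1 / N - 1e-12 <= e[0] <= e[-1] <= 1 + 1e-12   # Lemma 2.3 / (2.36)
```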

For an arbitrary xπ‘˜, let rπ‘˜ denote the residual from the projection of xπ‘˜ onto the column space ofV1. That is,

rπ‘˜ =xπ‘˜βˆ’V1VH1 xπ‘˜ =U UH xπ‘˜. (2.37)

Then, the convergence of $\mathbf{x}_k$ to an eigenvector of the unit eigenvalue is equivalent to the convergence of $\mathbf{r}_k$ to zero. The following theorem, whose proof is presented in Appendix 2.10.2, provides bounds on $\mathbf{r}_k$ as follows:

Theorem 2.2. The expected squared $\ell_2$-norm of the residual at the $k^{th}$ iteration is bounded as follows:

$$ \psi^k\,\|\mathbf{r}_0\|_2^2 \;\le\; \mathbb{E}\big[\|\mathbf{r}_k\|_2^2\big] \;\le\; \Psi^k\,\|\mathbf{r}_0\|_2^2, \tag{2.38} $$

where

$$ \Psi = \max_{1 \le j \le N-M} c(\lambda_j), \qquad \psi = \min_{1 \le j \le N-M} \bar{c}(\lambda_j), \tag{2.39} $$

$$ c(\lambda) = 1 + \frac{\mu_{\mathcal{T}}}{N}\Big( |\lambda|^2 - 1 + \delta_{\mathcal{T}}\,(\rho - 1)\,|\lambda - 1|^2 \Big), \tag{2.40} $$

$$ \bar{c}(\lambda) = 1 + \frac{\mu_{\mathcal{T}}}{N}\Big( |\lambda|^2 - 1 + \delta_{\mathcal{T}}\,(\bar{\rho} - 1)\,|\lambda - 1|^2 \Big). $$

The importance of Theorem 2.2 is twofold: First, it reveals the effect of the eigenvalues ($\lambda_j$), the eigenspace geometry ($\rho$, $\bar{\rho}$), and the amount of asynchronicity of the updates ($\delta_{\mathcal{T}}$) on the rate of convergence. In the synchronous case $\delta_{\mathcal{T}} = 0$ and $\mu_{\mathcal{T}} = N$, hence we have $\Psi = \max_{1 \le j \le N-M} |\lambda_j|^2$. This result is consistent with the well-known fact that the rate of convergence of the power iteration is determined by the second largest eigenvalue. However, in the asynchronous case ($\delta_{\mathcal{T}} > 0$), not just the eigenvalues but also the eigenspace geometry of $\mathbf{A}$ has an effect. As a result, similar matrices may have different convergence rates due to their different eigenspaces. This point will be elaborated in Section 2.4. Furthermore, in order to guarantee that $\mathbb{E}[\|\mathbf{r}_k\|_2^2] \le \varepsilon\,\|\mathbf{r}_0\|_2^2$ for a given error threshold $\varepsilon$, the inequalities in (2.38) show that it is necessary to have at least $\lfloor \log(\varepsilon) / \log(\psi) \rfloor$ iterations, and sufficient to have $\lceil \log(\varepsilon) / \log(\Psi) \rceil$ iterations.
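To make the bounds concrete, here is a small sketch (ours; $\mu_{\mathcal{T}}$ and $\delta_{\mathcal{T}}$ are taken as given parameters of the random update model, and the example values below are assumptions chosen purely for illustration) that evaluates $\Psi$ and $\psi$ from (2.39)–(2.40) and the resulting iteration counts:

```python
import numpy as np

def rate_bounds(lambdas, rho, rho_bar, mu_T, delta_T, N):
    """Psi and psi of Theorem 2.2 from the non-unit eigenvalues of A."""
    c = lambda lam, r: 1 + (mu_T / N) * (abs(lam) ** 2 - 1
                                         + delta_T * (r - 1) * abs(lam - 1) ** 2)
    Psi = max(c(lam, rho) for lam in lambdas)      # governs the upper bound
    psi = min(c(lam, rho_bar) for lam in lambdas)  # governs the lower bound
    return Psi, psi

# Hypothetical example: single-node updates (mu_T = 1, delta_T = 1).
lambdas = [0.9, 0.5 + 0.4j, -0.7]
Psi, psi = rate_bounds(lambdas, rho=0.8, rho_bar=0.4, mu_T=1, delta_T=1, N=8)

eps = 1e-6  # target: E[||r_k||^2] <= eps * ||r_0||^2
if Psi < 1:
    print("sufficient:", int(np.ceil(np.log(eps) / np.log(Psi))), "iterations")
if 0 < psi < 1:
    print("necessary :", int(np.floor(np.log(eps) / np.log(psi))), "iterations")
```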

Secondly, Theorem 2.2 reveals a region for the eigenvalues such that the residual error of the asynchronous updates is guaranteed to converge to zero in the mean-squared sense. The following corollary presents this result formally.

Corollary 2.2. Assume that all non-unit eigenvalues of $\mathbf{A}$ satisfy the following condition:

$$ \left| \lambda - \frac{\alpha}{\alpha+1} \right| < \frac{1}{\alpha+1}, \tag{2.41} $$

where

$$ \alpha = \delta_{\mathcal{T}}\,(\rho - 1). \tag{2.42} $$

Then,

$$ \lim_{k \to \infty} \mathbb{E}\big[\|\mathbf{r}_k\|_2^2\big] = 0. \tag{2.43} $$

Proof. From (2.39) it is clear that $\Psi < 1$ if and only if

$$ |\lambda|^2 - 1 + \alpha\,|\lambda - 1|^2 < 0 \tag{2.44} $$

for all non-unit eigenvalues $\lambda$. The inequality in (2.44) can be equivalently written as in (2.41). Since it implies that $\Psi < 1$, Theorem 2.2 guarantees the convergence of $\mathbb{E}[\|\mathbf{r}_k\|_2^2]$ to zero as the number of updates, $k$, goes to infinity.
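The equivalence of (2.44) and (2.41) follows by completing the square in $\lambda$; a quick randomized check of this step (our own, not from the text):

```python
import numpy as np

rng = np.random.default_rng(3)
for _ in range(100_000):
    lam = complex(rng.uniform(-2, 2), rng.uniform(-2, 2))
    alpha = rng.uniform(-0.99, 0.0)
    q = abs(lam) ** 2 - 1 + alpha * abs(lam - 1) ** 2           # condition (2.44)
    if abs(q) < 1e-9:
        continue  # skip points numerically on the boundary circle
    in_disk = abs(lam - alpha / (alpha + 1)) < 1 / (alpha + 1)  # condition (2.41)
    assert (q < 0) == in_disk
```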

An important remark is as follows: Corollary 2.2 provides a condition under which $\mathbf{r}_k$ is guaranteed to converge to a point (zero) as $k$ goes to infinity. On the other hand, $\mathbf{x}_k$ itself only converges to a random variable defined over the eigenspace of the unit eigenvalue. This is illustrated in Figure 2.1, where the eigenspace of the unit eigenvalue is spanned by the vector $[1 \;\; 1]^H$, and $\mathbf{x}_0 = [-1 \;\; 1]^H$. In the synchronous case the signal converges to a point through a deterministic trajectory, as shown in Figure 2.1a. For the random asynchronous case, Figure 2.1b illustrates the trajectories of the signals for different realizations. Convergence of $\mathbf{r}_k$ to zero implies that the limit of $\mathbf{x}_k$ always lies in the eigenspace of the unit eigenvalue (with a random orientation). Since any point in the eigenspace is an eigenvector, we can safely say that $\mathbf{x}_k$ converges to an eigenvector of the unit eigenvalue.
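The following Monte Carlo sketch (our illustration, with a randomly generated normal $\mathbf{A}$; single-node updates are one instance of the random asynchronous scheme) shows both effects: the averaged $\|\mathbf{r}_k\|_2^2$ decays toward zero, while the limit of $\mathbf{x}_k$ is a random point in the span of $\mathbf{V}_1$:

```python
import numpy as np

rng = np.random.default_rng(1)
N = 8

# Normal A with a unit eigenvalue (M = 1) and non-unit eigenvalues of
# magnitude 0.8, well inside the convergence region.
Q, _ = np.linalg.qr(rng.standard_normal((N, N)) + 1j * rng.standard_normal((N, N)))
lam = np.concatenate(([1.0], 0.8 * np.exp(2j * np.pi * rng.random(N - 1))))
A = Q @ np.diag(lam) @ Q.conj().T
V1, U = Q[:, :1], Q[:, 1:]
P = U @ U.conj().T                     # r_k = P x_k, the residual projector

K, trials = 300, 100
err = np.zeros(K)
limits = []
for _ in range(trials):
    x = rng.standard_normal(N) + 1j * rng.standard_normal(N)
    for k in range(K):
        err[k] += np.linalg.norm(P @ x) ** 2
        i = rng.integers(N)            # pick one node uniformly at random
        x[i] = (A @ x)[i]              # ... and update only that node
    limits.append((V1.conj().T @ x).item())

print("mean ||r_K||^2:", err[-1] / trials)   # ~ 0: residual dies out
print("limit coords:", limits[:3])           # differ per run: random orientation
```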

Notice that the convergence region for the eigenvalues defined in (2.41) is parametrized by $\alpha$, and it is a disk on the complex plane with radius $1/(\alpha+1)$ centered at $\alpha/(\alpha+1)$. This region is visualized in Figure 2.2. Notice that $0 \le \delta_{\mathcal{T}} \le 1$ and $0 < \rho \le 1$ always hold true. As a result, $\alpha$ satisfies $-1 < \alpha \le 0$. The key observation is that the region in (2.41) grows as $\alpha$ approaches $-1$, and it is the smallest (and corresponds to the unit disk) when $\alpha = 0$. The quantity $\beta$ and the large circle in Figure 2.2 will be explained after Corollary 2.4.

Corollary 2.2 reveals the combined effect of the eigenspace geometry of $\mathbf{A}$ (quantified by $\rho$) and the amount of asynchronicity (quantified by $\delta_{\mathcal{T}}$) on the convergence of the iterations. In the case of $\delta_{\mathcal{T}} = 0$ the region reduces to the unit disk, which is the well-known condition on the eigenvalues for the synchronous updates to converge. This is an expected result since the case of $\delta_{\mathcal{T}} = 0$ corresponds to the synchronous update itself. More importantly, the synchronous updates imply $\alpha = 0$ independent of the eigenspace geometry of $\mathbf{A}$. Therefore, the convergence is determined entirely by the eigenvalues of $\mathbf{A}$ in the synchronous case.

Figure 2.1: Some realizations of the trajectories of the signal through updates for (a) the non-random synchronous case, (b) the random asynchronous case.

On the other hand, the case of asynchronous updates results in a larger convergence region for the eigenvalues. First of all, it should be noted that asynchronous updates increase the convergence region only if the eigenspace geometry of $\mathbf{A}$ permits. If $\rho = 1$ then $\alpha = 0$, and the region of convergence is not improved by asynchronous iterations. However, if $\rho < 1$ (which is the case in most practical applications), then it is possible to enlarge the region of convergence using asynchronous iterations. As $\delta_{\mathcal{T}}$ gets larger (fewer nodes are updated concurrently), $\alpha$ gets smaller, hence the convergence region gets larger. Even if just one index is left unchanged in some of the iterations, we have $\delta_{\mathcal{T}} > 0$, and the residual $\mathbf{r}_k$ can converge to zero, even when non-unit eigenvalues outside the unit circle exist. This is a remarkable property of the asynchronous updates, since the residual (hence the signal itself) would blow up in the case of synchronous updates. Notice that in the extreme case of $\delta_{\mathcal{T}} = 1$, the region of convergence is the largest possible. That is to say, updating exactly one node in each iteration maximizes the region of convergence for the eigenvalues.

At the other extreme, the synchronous update is the most restrictive case, which is formally stated in the following corollary:

Corollary 2.3. If the synchronous updates on $\mathbf{A}$ converge, then

$$ \lim_{k \to \infty} \mathbb{E}\big[\|\mathbf{r}_k\|_2^2\big] = 0 \tag{2.45} $$

for random updates on $\mathbf{A}$ with any amount of asynchronicity.

Proof. If the synchronous updates converge, then all non-unit eigenvalues of $\mathbf{A}$ satisfy $|\lambda| < 1$. Hence, they also satisfy (2.41) for any value of $\alpha$. Therefore, Corollary 2.2 ensures the convergence of the updates irrespective of the value of $\delta_{\mathcal{T}}$.

It should be clear that the converse of Corollary 2.3 is not true. To see an implication of this, consider a scenario in which a signal over a network of nodes with autonomous (asynchronous) behavior stays in the steady state. If the nodes start to operate synchronously, then it is possible for the signal to blow up. This happens if some of the eigenvalues fall outside of the reduced convergence region due to the reduction in the amount of asynchronicity.

In fact, the study in [142] claims that large-scale synchronization of neurons is an underlying mechanism of epileptic seizures. Similarly, the study in [196] presents the relation between increased neural synchrony and epilepsy as well as Parkinson’s disease. It should be noted that neural networks follow nonlinear models whereas the model we consider here is linear. Thus, results presented here do not apply to brain networks. Nevertheless, these neurobiological observations are consistent with the implications of Corollary 2.2 and Corollary 2.3 from a conceptual point of view.

Apart from the convergence of the iterations, Theorem 2.2 is also useful to characterize the case of non-converging iterations. In this regard, the following corollary presents a region for the eigenvalues such that asynchronous updates are guaranteed not to converge.

Corollary 2.4. Assume that all non-unit eigenvalues of $\mathbf{A}$ satisfy the following:

$$ \left| \lambda - \frac{\beta}{\beta+1} \right| \ge \frac{1}{\beta+1}, \tag{2.46} $$

where

$$ \beta = \delta_{\mathcal{T}}\,(\bar{\rho} - 1). \tag{2.47} $$

Then,

$$ \mathbb{E}\big[\|\mathbf{r}_k\|_2^2\big] \ge \|\mathbf{r}_0\|_2^2. \tag{2.48} $$

Furthermore, if (2.46) is satisfied with strict inequality, then $\mathbb{E}[\|\mathbf{r}_k\|_2^2]$ grows unboundedly as $k$ goes to infinity.

Figure 2.2: Regions (given in (2.41) and (2.46)) for the eigenvalues such that random asynchronous updates are guaranteed to converge and diverge, respectively. On the complex $\lambda$-plane, the figure marks the synchronous convergence region (the unit disk), the larger random-asynchronous convergence region (the disk of radius $1/(\alpha+1)$ centered at $\alpha/(\alpha+1)$), the region of no convergence for the random asynchronous case (on or outside the circle of radius $1/(\beta+1)$ centered at $\beta/(\beta+1)$), and the in-between region where both (2.41) and (2.46) are violated and convergence is inconclusive.

Proof. From (2.39) it is clear that $\psi \ge 1$ if and only if

$$ |\lambda|^2 - 1 + \beta\,|\lambda - 1|^2 \ge 0 \tag{2.49} $$

for all non-unit eigenvalues $\lambda$. The inequality in (2.49) can be equivalently written as in (2.46). Since (2.46) implies that $\psi \ge 1$, Theorem 2.2 indicates that $\mathbb{E}[\|\mathbf{r}_k\|_2^2]$ is lower bounded by $\|\mathbf{r}_0\|_2^2$. If (2.46) is satisfied strictly, then $\psi > 1$. As a result, $\mathbb{E}[\|\mathbf{r}_k\|_2^2]$ grows unboundedly as $k$ goes to infinity.

From the definitions in (2.42) and (2.47), note that $\alpha \ge \beta$ is always true due to the fact that $\rho \ge \bar{\rho}$. Therefore, the conditions in (2.41) and (2.46) describe disjoint regions on the complex plane; see Figure 2.2. Corollary 2.4 also shows that the condition $|\lambda - \beta/(\beta+1)| < 1/(\beta+1)$ is necessary for the iterations to converge, whereas the condition in (2.41) is sufficient for the convergence (both in the mean-square sense). If there exists an eigenvalue that violates both (2.41) and (2.46), then convergence is inconclusive. This region is also indicated in Figure 2.2.
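These three regions can be summarized in a small helper (our own, not from the text), classifying a given non-unit eigenvalue for given $\delta_{\mathcal{T}}$, $\rho$, and $\bar{\rho}$:

```python
def classify(lam, delta_T, rho, rho_bar):
    """Place a non-unit eigenvalue into one of the regions of Figure 2.2."""
    alpha = delta_T * (rho - 1)
    beta = delta_T * (rho_bar - 1)
    if abs(lam - alpha / (alpha + 1)) < 1 / (alpha + 1):   # (2.41)
        return "mean-squared convergence guaranteed"
    if abs(lam - beta / (beta + 1)) >= 1 / (beta + 1):     # (2.46)
        return "no convergence"
    return "inconclusive"

# Hypothetical values: lambda = -1.2 lies outside the unit circle, yet for
# delta_T = 1 and rho = 0.5 it falls inside the asynchronous convergence disk.
print(classify(-1.2, delta_T=1.0, rho=0.5, rho_bar=0.2))
```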

At this point it is important to compare the implications of Corollary 2.2 with the classical result presented in [14, 32]. Under the mild assumption that all the indices are selected sufficiently often (see [32] for the precise definition), the study [32] showed that the linear asynchronous model in (2.5) converges for any index sequence if and only if the spectral radius of $|\mathbf{A}|$ is strictly less than unity, where $|\mathbf{A}|$ denotes the matrix with element-wise absolute values of $\mathbf{A}$. On the other hand, our Corollary 2.2 allows eigenvalues with magnitudes greater than unity. Although these two results appear to be contradictory (when $\mathbf{A}$ consists of non-negative elements), the key difference is the notion of convergence. As an example, consider the matrix $\mathbf{A}_2$ defined in (2.55). Its spectral radius is exactly 1, and [32] proved that there exists a sequence of indices under which iterations on $\mathbf{A}_2$ do not converge. For example, assuming $N$ is odd, consider the index sequence generated as $i = (2k-1) \ (\mathrm{mod}\ N) + 1$. However, Corollary 2.2 proves convergence in a statistical, mean-squared sense. (See Figure 2.5.) In short, when compared with [32], Corollary 2.2 requires a weaker condition on $\mathbf{A}$ and guarantees convergence in a weaker (and probabilistic) sense.
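To see the contrast concretely, here is a hedged stand-in experiment (ours; the matrix $\mathbf{A}_2$ of (2.55) is not reproduced in this section, so an $N = 5$ cyclic shift, which likewise has spectral radius exactly 1, is used in its place): the criterion of [32] fails, while every non-unit eigenvalue still satisfies (2.41):

```python
import numpy as np

N = 5
A = np.roll(np.eye(N), 1, axis=0)      # cyclic shift; all |lambda_j| = 1

# Criterion of [32]: spectral radius of |A| strictly below 1. Here it is
# exactly 1, so convergence for *every* index sequence is not guaranteed.
print(np.abs(np.linalg.eigvals(np.abs(A))).max())        # -> 1.0

# Corollary 2.2: with single-node updates (delta_T = 1), every non-unit
# eigenvalue lies strictly inside the alpha-disk of (2.41).
lam, Q = np.linalg.eig(A)              # A is normal with distinct eigenvalues,
idx = np.argmin(np.abs(lam - 1))       # so Q is unitary up to rounding
U = np.delete(Q, idx, axis=1)
G = U.conj().T @ np.diag(np.diag(U @ U.conj().T).real) @ U
rho = np.linalg.eigvalsh(G).max()      # = (N-1)/N for this circulant
alpha = rho - 1
ok = [abs(l - alpha / (alpha + 1)) < 1 / (alpha + 1) for l in np.delete(lam, idx)]
print(all(ok))                         # -> True: mean-squared convergence
```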