
World Scientific Publishing Company DOI: 10.1142/S0217984918502160

A dynamic weighted TOPSIS method for identifying influential nodes in complex networks

Pingle Yang∗,†,‡, Xin Liu and Guiqiong Xu∗,§

School of Management, Shanghai University, Shanghai 200444, China

School of Electrical and Information Engineering, Jiangsu University of Science and Technology, Zhangjiagang 215600, China

[email protected]

§[email protected]

Received 20 December 2017 Revised 26 March 2018 Accepted 24 April 2018 Published 14 June 2018

Identifying the influential nodes in complex networks is a challenging and significant research topic. Though various centrality measures of complex networks have been developed for addressing the problem, they all have some disadvantages and limitations. To make use of the advantages of different centrality measures, one can regard influential node identification as a multi-attribute decision-making problem. In this paper, a dynamic weighted Technique for Order Preference by Similarity to Ideal Solution (TOPSIS) is developed. The key idea is to assign the appropriate weight to each attribute dynamically, based on the grey relational analysis method and the Susceptible–Infected–Recovered (SIR) model. The effectiveness of the proposed method is demonstrated by applications to three actual networks, which indicate that our method has better performance than single indicator methods and the original weighted TOPSIS method.

Keywords: Complex networks; influential nodes; dynamic weighted TOPSIS; grey relational analysis; SIR model.

1. Introduction

Complex networks exist widely in the natural and social sciences, for example telecommunication networks, biological networks, computer networks and social networks.1,2 Evaluating node importance and finding influential nodes is a significant and challenging task: it enables us to better control the propagation of computer viruses,3,4 diseases4–6 and rumors,7 identify opinion leaders in social networks,8 speed up the dissemination of new products,9 etc. Thanks to its great practical value and theoretical significance, influential node identification has attracted many researchers from various fields.10–19 During the past two decades, a number of ranking methods of node importance have been proposed and developed.13–17,20,21

§Corresponding author.

Mod. Phys. Lett. B Downloaded from www.worldscientific.com by UNIVERSITY OF CALIFORNIA @ SANTA BARBARA on 06/23/18. For personal use only.

In the past few years, numerous quantitative analysis methods have been presented to handle the problem.22,23 They generally fall into systems science analysis methods and social network analysis methods. In systems science analysis methods, node importance is equated with the destructiveness of deleting the node from the network.24 These methods change the network topology and leave out the information about the relationships between nodes. Social network analysis methods consider node importance to be equivalent to the measurement of node centrality.25 These methods do not destroy the network connectivity and can ensure the integrity of the network. Therefore, we concentrate on social network analysis methods in this research. A series of centrality measures have been presented to rank the nodes in complex networks,26–28 including degree centrality (DC),26 closeness centrality (CC),26 betweenness centrality (BC)26–28 and so on. These centrality measures depict node importance in different ways.

DC describes node importance through the number of neighbor nodes, but it does not consider the global network structure. BC identifies influential nodes according to the number of shortest paths through the node, but in most real networks information does not flow only along shortest paths. CC evaluates node importance in terms of ease of access to other nodes, but it is not appropriate for non-centralized networks. Up to now, many applications of centrality measures have been developed to identify influential nodes,8,11 most of which adopt only a single indicator to evaluate the importance of nodes, and thus all these methods have their own disadvantages and limitations.

It is an interesting and important research topic to combine the advantages of various centrality measures and make the evaluation of node importance more reasonable.13–15,17,29 Multi-attribute decision-making (MADM) method is a good choice to address this problem, where each node is regarded as an alternative and each measure is regarded as a performance attribute.

Among the various MADM methods, the technique for order preference by similarity to the ideal solution (TOPSIS) is a widely applied and quite effective method to select the best alternative with a number of criteria.30–33 Each performance attribute has a different effect in evaluating the node importance of complex networks. Furthermore, if the differences of network topology are not considered, the weight of a given attribute will remain unchanged in different networks, which will certainly limit the applicability of the algorithm. Therefore, the attribute weight should be changed dynamically from one network to another.17 In this paper, a novel algorithm is proposed to assign the weight of each attribute reasonably and dynamically.

Given the network structure, one can obtain a different attribute sequence for each centrality measure. The attribute sequence that is closer to the real node spreading ability sequence should be more important. Motivated by this idea, we propose a dynamic weighted TOPSIS method to evaluate node importance in complex networks, which is based on the grey relational analysis (GRA)34 and the Susceptible–Infected–Recovered (SIR) model.35 Here, we employ the SIR model to simulate the spreading process and calculate node spreading ability on networks.36 As is well known, GRA can effectively extract the similarity of the change trends of two sequences, and the similarity between two sequences is measured by the grade of relation.34,37–39 In this paper, the node spreading ability sequence is chosen as the reference sequence. According to the GRA method, one can calculate the grey relational degree between each attribute sequence and the node spreading ability sequence. The larger the grey relational degree, the more important the attribute is.

The rest of this paper is organized as follows. Section 2 gives a brief overview of several centrality measures, the GRA method and the SIR model. In Sec. 3, a dynamic weighted TOPSIS method based on GRA and SIR is proposed to evaluate node importance in complex networks. In Sec. 4, the proposed method is illustrated in detail through a simple example, and its efficiency and practicability are demonstrated by applications to three actual complex networks. Finally, some conclusions and discussions are given in Sec. 5.

2. Preliminaries

In this section, we give some necessary notation for several centrality measures; for more details, the reader may refer to Freeman's work.26 In addition, the grey relational analysis method and the SIR model, which form the foundation of the innovation of this paper, are introduced briefly.

2.1. Centrality measures

For an undirected network $G = (V, E)$ with $N$ nodes, where $V$ is the node set and $E$ is the edge set, the DC, CC and BC are defined as follows.

Definition 1 (DC26). The degree centrality of node $u$ is expressed as $\mathrm{DC}(u)$, which can be calculated as

$$\mathrm{DC}(u) = \sum_{v \in V \setminus u} a(u, v), \tag{1}$$

where $a(u, v)$ represents the connection between node $u$ and node $v$. If node $u$ connects with node $v$, $a(u, v)$ is defined as 1, and otherwise 0.

Definition 2 (CC26). The closeness centrality of node $u$ is expressed as $\mathrm{CC}(u)$, which can be calculated as

$$\mathrm{CC}(u) = \frac{N - 1}{\sum_{v \in V \setminus u} d(u, v)}, \tag{2}$$

where $d(u, v)$ represents the shortest distance between node $u$ and node $v$.

However, Eq. (2) only applies to connected networks. In order to extend the application to disconnected networks, the equation needs to be changed as follows:

$$\mathrm{CC}(u) = \sum_{v \in V \setminus u} \frac{1}{d(u, v)}. \tag{3}$$

Definition 3 (BC26–28). The betweenness centrality of node $u$ is expressed as $\mathrm{BC}(u)$, which can be calculated as

$$\mathrm{BC}(u) = \sum_{s \neq u \neq t \in V} \frac{\rho(s, u, t)}{\rho(s, t)}, \tag{4}$$

where $\rho(s, t)$ represents the number of shortest paths connecting node $s$ to node $t$, and $\rho(s, u, t)$ represents the number of shortest paths between node $s$ and node $t$ passing through node $u$.
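As a rough illustration of Eqs. (1), (3) and (4), the following Python sketch computes the three measures on a small undirected graph using breadth-first search. The adjacency-dictionary input, the function names `bfs` and `centralities`, and the choice to count each unordered pair {s, t} once in BC are assumptions made here for illustration; software packages may apply different normalizations.

```python
from collections import deque
from itertools import combinations


def bfs(adj, s):
    """Return (dist, sigma): hop distances from s and shortest-path counts."""
    dist = {s: 0}
    sigma = {s: 1}
    queue = deque([s])
    while queue:
        x = queue.popleft()
        for y in adj[x]:
            if y not in dist:            # first visit fixes the distance
                dist[y] = dist[x] + 1
                sigma[y] = 0
                queue.append(y)
            if dist[y] == dist[x] + 1:   # x precedes y on a shortest path
                sigma[y] += sigma[x]
    return dist, sigma


def centralities(adj):
    """adj maps each node to the set of its neighbours (simple undirected graph)."""
    nodes = list(adj)
    info = {s: bfs(adj, s) for s in nodes}

    dc = {u: len(adj[u]) for u in nodes}                        # Eq. (1)
    cc = {u: sum(1.0 / d for v, d in info[u][0].items() if v != u)
          for u in nodes}                                       # Eq. (3)

    bc = {u: 0.0 for u in nodes}                                # Eq. (4)
    for s, t in combinations(nodes, 2):   # assumption: unordered pairs
        dist_s, sigma_s = info[s]
        if t not in dist_s:
            continue                      # s and t are disconnected
        for u in nodes:
            if u in (s, t) or u not in dist_s:
                continue
            dist_u, sigma_u = info[u]
            # u lies on a shortest s-t path iff the distances add up
            if t in dist_u and dist_s[u] + dist_u[t] == dist_s[t]:
                bc[u] += sigma_s[u] * sigma_u[t] / sigma_s[t]
    return dc, cc, bc


# Hypothetical 4-cycle 0-1-2-3-0, used only to exercise the functions.
cycle = {0: {1, 3}, 1: {0, 2}, 2: {1, 3}, 3: {2, 0}}
dc, cc, bc = centralities(cycle)
```

Running one BFS per node keeps the sketch short; for large networks a dedicated algorithm such as Brandes' method would normally be preferred.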

2.2. Grey relational analysis

Grey system theory, which was originally presented by Deng,34 has been widely used in various fields.36–38 It is one of the common mathematical tools for studying uncertain information systems. Grey system theory includes GRA, which is suitable for solving problems in the presence of complicated interrelationships among multiple factors and variables.37 Thus, GRA has been successfully used to solve uncertainty problems with discrete data and incomplete information.34,37–39 In essence, GRA is the analysis of correlation coefficients, and is used to determine the similarity between two sequences. In other words, the larger the grey correlation coefficient, the closer the two sequences.

The GRA method consists of four steps: grey relational generating, reference sequence definition, grey relational coefficient calculation and grey relational degree calculation. Various techniques have been developed to calculate the grey relational degree, among which the definition proposed by Deng34 has been widely applied. Assuming that the reference sequence is the column vector $(Y_0)^T = \{y_{10}, y_{20}, \ldots, y_{N0}\}$ and the comparability sequences are the column vectors $(Y_j)^T = \{y_{1j}, y_{2j}, \ldots, y_{Nj}\}$ $(j = 1, 2, \ldots, M)$, the grey relational degree $R(Y_0, Y_j)$ of $Y_0$ and $Y_j$ is defined as follows:

$$r(y_{i0}, y_{ij}) = \frac{\min_{\forall i} \min_{\forall j} |y_{i0} - y_{ij}| + \rho \max_{\forall i} \max_{\forall j} |y_{i0} - y_{ij}|}{|y_{i0} - y_{ij}| + \rho \max_{\forall i} \max_{\forall j} |y_{i0} - y_{ij}|}, \quad i = 1, 2, \ldots, N;\ j = 1, 2, \ldots, M, \tag{5}$$

$$R(Y_0, Y_j) = \frac{1}{N} \sum_{i=1}^{N} r(y_{i0}, y_{ij}), \quad j = 1, 2, \ldots, M, \tag{6}$$

where $\rho$ is the distinguishing coefficient, $\rho \in [0, 1]$.
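The two formulas above translate almost directly into code. The Python sketch below is a minimal illustration, assuming the sequences have already been made comparable; the function name `grey_relational_degrees` and the toy sequences are hypothetical. It additionally requires ρ > 0, because with ρ = 0 and a zero difference the coefficient in Eq. (5) becomes 0/0.

```python
def grey_relational_degrees(reference, comparisons, rho=0.5):
    """Grey relational degree of each comparability sequence w.r.t. the reference.

    reference   : list of N numbers (Y_0)
    comparisons : list of M lists of N numbers (Y_1, ..., Y_M)
    rho         : distinguishing coefficient, restricted here to (0, 1]
    """
    if not 0 < rho <= 1:
        raise ValueError("this sketch needs 0 < rho <= 1")

    # Absolute differences |y_i0 - y_ij| for every sequence j and position i.
    deltas = [[abs(r - c) for r, c in zip(reference, seq)] for seq in comparisons]
    d_min = min(min(row) for row in deltas)   # global minimum over i and j
    d_max = max(max(row) for row in deltas)   # global maximum over i and j
    if d_max == 0:                            # every sequence equals the reference
        return [1.0] * len(comparisons)

    degrees = []
    for row in deltas:
        coeffs = [(d_min + rho * d_max) / (d + rho * d_max) for d in row]  # Eq. (5)
        degrees.append(sum(coeffs) / len(coeffs))                          # Eq. (6)
    return degrees


# Toy data: the first sequence coincides with the reference, the second drifts away.
degrees = grey_relational_degrees([0, 1, 2], [[0, 1, 2], [0, 2, 4]])
```

In the toy call the first degree is exactly 1, and the second is smaller because its differences grow along the sequence.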


2.3. Susceptible–Infected–Recovered model

The SIR model was presented by May and Anderson,35 and has been widely applied to epidemic dynamics on networks.36 In the SIR model, each node is in one of three discrete states: susceptible (S), infected (I) or recovered (R). In the beginning, only one node is infected, and the other nodes are susceptible. At each step, each infected node infects each of its susceptible neighbors with probability α, and recovers from the disease with probability β at the next step. Recovered nodes cannot be infected again. The total number of infected and recovered nodes, F(t) = I(t) + R(t), represents the spreading ability of the initially infected node at time t. F(t) does not decrease with t and finally reaches a stable value, denoted by F. For each initially infected node, a larger F indicates a greater influence. In this work, we set α = 0.3 and β = 1, and F is obtained by averaging over 100 independent runs.
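The spreading process described above can be simulated in a few lines. The Python sketch below is one possible discrete-time reading of it, not the authors' code: the function names, the adjacency-dictionary input and the order of operations within a step (infect first, then recover) are assumptions. It requires β > 0 so that the epidemic is guaranteed to die out.

```python
import random


def sir_spread(adj, seed, alpha=0.3, beta=1.0, rng=random):
    """Run one SIR realisation from `seed`; return the final value F = I + R."""
    if not 0 < beta <= 1:
        raise ValueError("this sketch needs 0 < beta <= 1")
    infected = {seed}
    recovered = set()
    while infected:
        newly = set()
        for u in infected:
            for v in adj[u]:
                # each infected node tries to infect each susceptible neighbour
                if v not in infected and v not in recovered and rng.random() < alpha:
                    newly.add(v)
        # infected nodes recover with probability beta
        healed = {u for u in infected if rng.random() < beta}
        infected = (infected - healed) | newly
        recovered |= healed
    return len(recovered)   # no infected nodes remain, so F equals R


def spreading_ability(adj, seed, runs=100, alpha=0.3, beta=1.0, rng=random):
    """Average F over independent realisations, as done for each seed node."""
    return sum(sir_spread(adj, seed, alpha, beta, rng) for _ in range(runs)) / runs


# Hypothetical path graph 0-1-2-3.
path = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
f_estimate = spreading_ability(path, seed=0, runs=100, alpha=0.3, beta=1.0)
```

Passing a seeded `random.Random` instance as `rng` makes a run reproducible.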

3. Proposed Method

In complex networks, each node can be regarded as an evaluation object, and node importance can be evaluated and ranked comprehensively through multiple centrality measures. Evaluating node importance can thus be considered as a MADM problem.

TOPSIS was first presented by Hwang and Yoon,40 and can be used to rank alternatives based on their distances to the ideal solution in a multi-attribute context. Du et al.13 extended the TOPSIS method to evaluate node importance in complex networks, where each attribute has the same weight. Different centrality measures represent different characteristics of network topology, and thus it is necessary to assign a reasonable weight to each attribute. Liu et al.15 adopted the subjective analytic hierarchy process (AHP) method for assigning weights, where the attribute weights remain unchanged in different networks. Because the epidemic spreading process differs from network to network, one centrality measure should perform differently in different systems. Hu and his co-workers proposed an algorithm to determine attribute weights dynamically, and presented the weighted TOPSIS method (W-TOPSIS).17 The main advantage of this method is that the attribute weights change with the original metrics. However, there is an obvious deficiency in their work: they derive the weights from the distance between the attribute sequences and the real node spreading ability sequence. In fact, because the measurement units differ, the attribute weight should instead be related to the similarity between the change trends of the two sequences. The more consistent the change trend of the attribute sequence is with that of the real node spreading ability sequence, the more important the attribute is.

In this paper, we propose a new dynamic weighted TOPSIS method called the GSW-TOPSIS method. It derives the weights from the grey relational degrees between the attribute sequences and the real node spreading ability sequence. GRA can more effectively extract the similarity of change trends between two sequences, and has been widely used for analysis of correlation coefficients.34,37–39 In general, node attributes can be divided into three categories: local property, global property and location property of the network. DC, CC and BC represent the local property, location property and global property of the network, respectively.2 Therefore, DC, CC and BC are selected as the node performance attributes of complex networks. Given the network structure, we can obtain a different attribute sequence for each centrality measure. The attribute sequence that is closer to the real node spreading ability sequence should be more important. Because the information spreading process differs between networks, the same measure should perform differently in different networks.

Inspired by this, a novel algorithm combining the GRA method and the SIR model is presented to assign dynamical weights to attributes. For this purpose, we first apply the SIR model to simulate and calculate the node spreading ability on networks, and the node spreading ability sequence is chosen as the reference sequence. Then we employ the GRA method to extract the similarity between the attribute sequences and the reference sequence. As a result, the centrality measure whose sequence is closer to the real node spreading ability sequence will have a larger attribute weight.

In short, the GSW-TOPSIS method for identifying influential nodes consists of three main phases: (1) constructing and normalizing a multi-attribute matrix of the network, (2) determining the weight of each attribute by GRA and SIR, and (3) evaluating and ranking the nodes of the network with the TOPSIS method.

The process of identifying influential nodes is shown schematically in Fig. 1, and its detailed description is as follows.

Phase 1: Construct multi-attribute matrix

Step 1: Select several centrality measures as evaluation attributes (such as DC, CC, BC), and calculate the values of these attributes in the actual complex network.

Step 2: Construct a multi-attribute matrix $D = (d_{ij})_{N \times M}$, which consists of $M$ centrality measures with respect to $N$ nodes.

$$D = (d_{ij})_{N \times M} = \begin{pmatrix} \mathrm{DC}_1 & \mathrm{CC}_1 & \cdots & \mathrm{BC}_1 \\ \mathrm{DC}_2 & \mathrm{CC}_2 & \cdots & \mathrm{BC}_2 \\ \vdots & \vdots & \ddots & \vdots \\ \mathrm{DC}_N & \mathrm{CC}_N & \cdots & \mathrm{BC}_N \end{pmatrix}_{N \times M}. \tag{7}$$

Step 3: Normalize the multi-attribute matrix $D$.

$$u_{ij} = \frac{d_{ij}}{\sqrt{\sum_{k=1}^{N} d_{kj}^2}}, \quad i = 1, 2, \ldots, N;\ j = 1, 2, \ldots, M. \tag{8}$$

[Figure: flowchart of the three phases and their steps. Phase 1, construct multi-attribute matrix: select and calculate centrality measures; construct a multi-attribute matrix; normalize the multi-attribute matrix. Phase 2, determine attribute weights with GRA and SIR: construct a temporary matrix via SIR; grey relational generation; calculate grey relational degree; compute the weight coefficient. Phase 3, evaluate and rank nodes with TOPSIS: weight the normalized matrix; determine the ideal object; calculate the distance to ideal object; compute the closeness and rank.]

Fig. 1. The flowchart of the proposed method.

Phase 2: Determine attribute weights with GRA and SIR

Step 1: Construct a temporary matrix $E = (e_{ij})_{N \times (M+1)}$ by appending a new column to the attribute matrix $D$, namely the column vector $(F)^T = \{F_1, F_2, \ldots, F_N\}$ generated by the SIR model.

$$E = (e_{ij})_{N \times (M+1)} = \begin{pmatrix} \mathrm{DC}_1 & \mathrm{CC}_1 & \cdots & \mathrm{BC}_1 & F_1 \\ \mathrm{DC}_2 & \mathrm{CC}_2 & \cdots & \mathrm{BC}_2 & F_2 \\ \vdots & \vdots & \ddots & \vdots & \vdots \\ \mathrm{DC}_N & \mathrm{CC}_N & \cdots & \mathrm{BC}_N & F_N \end{pmatrix}_{N \times (M+1)}. \tag{9}$$

Step 2: Grey relational generation.

Due to differences in measurement units, it is necessary to process each attribute into a comparability sequence.

$$y_{ij} = \frac{e_{ij} - \min_{\forall i} e_{ij}}{\sum_{k=1}^{N} e_{kj}}, \quad i = 1, 2, \ldots, N;\ j = 1, 2, \ldots, M + 1. \tag{10}$$

Step 3: Calculate grey relational degree.

Let the column vector $(Y_{M+1})^T = \{y_{1(M+1)}, y_{2(M+1)}, \ldots, y_{N(M+1)}\}$ be the reference sequence, and the column vectors $(Y_j)^T = \{y_{1j}, y_{2j}, \ldots, y_{Nj}\}$ $(j = 1, 2, \ldots, M)$ be the comparability sequences. The grey relational degree $R(Y_{M+1}, Y_j)$ of $Y_{M+1}$ and $Y_j$ is generated by using Eqs. (5) and (6).

Step 4: Compute the weight coefficient $w_j$ of the $j$th attribute.

$$w_j = \frac{R(Y_{M+1}, Y_j)}{\sum_{k=1}^{M} R(Y_{M+1}, Y_k)}, \quad j = 1, 2, \ldots, M. \tag{11}$$

Phase 3: Evaluate and rank nodes with TOPSIS

Step 1: Multiply each attribute of the normalized matrix by the corresponding weight.

$$v_{ij} = w_j u_{ij}, \quad i = 1, 2, \ldots, N;\ j = 1, 2, \ldots, M. \tag{12}$$

Step 2: Determine the positive ideal object $A^+$ and negative ideal object $A^-$.

$$A^+ = \left\{ \max_{\forall i} v_{i1}, \ldots, \max_{\forall i} v_{ij}, \ldots, \max_{\forall i} v_{iM} \right\} = \{v_1^+, \ldots, v_j^+, \ldots, v_M^+\}, \quad i = 1, 2, \ldots, N. \tag{13}$$

$$A^- = \left\{ \min_{\forall i} v_{i1}, \ldots, \min_{\forall i} v_{ij}, \ldots, \min_{\forall i} v_{iM} \right\} = \{v_1^-, \ldots, v_j^-, \ldots, v_M^-\}, \quad i = 1, 2, \ldots, N. \tag{14}$$

Step 3: Based on the Euclidean distance, calculate the distances $(S^+, S^-)$ between each node and the corresponding ideal objects.

$$S_i^+ = \sqrt{\sum_{j=1}^{M} (v_{ij} - v_j^+)^2}, \quad i = 1, 2, \ldots, N. \tag{15}$$

$$S_i^- = \sqrt{\sum_{j=1}^{M} (v_{ij} - v_j^-)^2}, \quad i = 1, 2, \ldots, N. \tag{16}$$

Step 4: Compute the relative closeness $C = \{C_1, C_2, \ldots, C_N\}$ to the ideal object by Eq. (17), and rank the node influence in the complex network in descending order of $C$. The larger the value of $C_i$, the more important node $i$ is.

$$C_i = \frac{S_i^-}{S_i^+ + S_i^-}, \quad i = 1, 2, \ldots, N. \tag{17}$$
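The three phases can be sketched compactly in Python. The following is a minimal illustration of Eqs. (5), (6), (8) and (10)-(17) under stated assumptions, not the authors' implementation: the function name `gsw_topsis` is hypothetical, the spreading abilities F are taken as given rather than simulated, and degenerate inputs (an all-zero attribute column, identical nodes) are not handled. The sample data are the DC, CC, BC and F values of Table 1 in Sec. 4.1.

```python
import math


def gsw_topsis(D, F, rho=0.5):
    """D: N rows of M attribute values; F: N spreading abilities.

    Returns (weights, closeness), both plain lists.
    """
    n, m = len(D), len(D[0])
    cols = [[row[j] for row in D] for j in range(m)]

    # Phase 1, Eq. (8): vector-normalise every attribute column.
    norms = [math.sqrt(sum(x * x for x in col)) for col in cols]
    U = [[D[i][j] / norms[j] for j in range(m)] for i in range(n)]

    # Phase 2, Eq. (10): grey relational generation for each column of E = [D | F].
    def generate(col):
        lo, total = min(col), sum(col)
        return [(x - lo) / total for x in col]

    Y = [generate(col) for col in cols]
    y_ref = generate(F)                      # reference sequence

    # Eqs. (5)-(6): grey relational degree of each attribute with respect to F.
    deltas = [[abs(a - b) for a, b in zip(y_ref, y)] for y in Y]
    d_min = min(map(min, deltas))
    d_max = max(map(max, deltas))
    R = [sum((d_min + rho * d_max) / (d + rho * d_max) for d in row) / n
         for row in deltas]

    # Eq. (11): normalise the degrees into weights.
    total_R = sum(R)
    weights = [r / total_R for r in R]

    # Phase 3, Eqs. (12)-(17): weighted matrix, ideal objects, distances, closeness.
    V = [[weights[j] * U[i][j] for j in range(m)] for i in range(n)]
    a_pos = [max(V[i][j] for i in range(n)) for j in range(m)]
    a_neg = [min(V[i][j] for i in range(n)) for j in range(m)]
    closeness = []
    for row in V:
        s_pos = math.dist(row, a_pos)
        s_neg = math.dist(row, a_neg)
        closeness.append(s_neg / (s_pos + s_neg))
    return weights, closeness


# DC, CC, BC and F of the six-node example (Table 1, Sec. 4.1).
D = [[4, 0.1667, 0.55], [1, 0.1, 0], [2, 0.1111, 0.05],
     [2, 0.125, 0.1], [3, 0.1429, 0.2], [2, 0.125, 0]]
F = [2.5475, 1.6770, 2.0197, 2.0616, 2.3686, 2.1159]

weights, closeness = gsw_topsis(D, F)
# Node labels (1-based) sorted from most to least influential.
ranking = sorted(range(1, len(D) + 1), key=lambda i: closeness[i - 1], reverse=True)
```

On these data the sketch yields weights close to those reported in Sec. 4.1 and the node order 1, 5, 4, 3, 6, 2; the individual closeness values it produces need not coincide with the printed ones.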

4. Case Study

4.1. A simple illustrative example

In what follows, we choose a simple example to illustrate the proposed method given in Sec. 3. The example network is shown in Fig. 2.

Phase 1:Construct multi-attribute matrix

Step 1: Select the centrality measures DC, CC and BC as evaluation attributes, and calculate the values of these attributes by using the Gephi software. Meanwhile, the node spreading ability $F$ is derived from the SIR model. The values of DC, CC, BC and $F$ are shown in Table 1.

Step 2: Construct a multi-attribute matrix $D = (d_{ij})_{6 \times 3} = \{\mathrm{DC}, \mathrm{CC}, \mathrm{BC}\}$.

Step 3: Normalize the matrix $D$ by Eq. (8), and obtain a normalized matrix $U = (u_{ij})_{6 \times 3}$.

$$U = (u_{ij})_{6 \times 3} = \begin{pmatrix} 0.6489 & 0.5225 & 0.9231 \\ 0.1622 & 0.3134 & 0 \\ 0.3244 & 0.3482 & 0.0839 \\ 0.3244 & 0.3918 & 0.1678 \\ 0.4867 & 0.4479 & 0.3357 \\ 0.3244 & 0.3918 & 0 \end{pmatrix}_{6 \times 3}.$$

[Figure: the example network, with nodes labeled 1 to 6.]

Fig. 2. An example network with 6 nodes and 7 edges.

Table 1. Centrality measures DC, CC, BC and node spreading ability F in the example network.

Node No.   DC   CC       BC     F
1          4    0.1667   0.55   2.5475
2          1    0.1      0      1.6770
3          2    0.1111   0.05   2.0197
4          2    0.125    0.1    2.0616
5          3    0.1429   0.2    2.3686
6          2    0.125    0      2.1159

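Phase 2: Determine attribute weights with GRA and SIR

Step 1: Construct a temporary matrix $E = (e_{ij})_{6 \times 4} = \{\mathrm{DC}, \mathrm{CC}, \mathrm{BC}, F\}$.

Step 2: By using Eq. (10), one can get the matrix $Y = (y_{ij})_{6 \times 4}$ as follows:

$$Y = (y_{ij})_{6 \times 4} = \begin{pmatrix} 0.2143 & 0.0865 & 0.6111 & 0.0681 \\ 0 & 0 & 0 & 0 \\ 0.0714 & 0.0144 & 0.0556 & 0.0268 \\ 0.0714 & 0.0324 & 0.1111 & 0.0301 \\ 0.1429 & 0.0557 & 0.2222 & 0.0541 \\ 0.0714 & 0.0324 & 0 & 0.0343 \end{pmatrix}_{6 \times 4}.$$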

Step 3: Let the 4th column $(Y_4)^T = \{y_{14}, y_{24}, \ldots, y_{64}\}$ be the reference sequence, and the first three columns $(Y_j)^T = \{y_{1j}, y_{2j}, \ldots, y_{6j}\}$ $(j = 1, 2, 3)$ be the comparability sequences. Here, we set $\rho = 0.5$; then the grey relational degrees can be generated by Eqs. (5) and (6).

$$R = \{R(Y_4, Y_1), R(Y_4, Y_2), R(Y_4, Y_3)\} = \{0.8350, 0.9785, 0.7522\}.$$

Step 4: By using Eq. (11), we obtain $W = \{w_{\mathrm{DC}}, w_{\mathrm{CC}}, w_{\mathrm{BC}}\} = \{0.3254, 0.3814, 0.2932\}$.
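As a hand check of Steps 3 and 4, one coefficient and one weight can be worked out from the rounded entries of $Y$ (the symbol $\Delta$ is shorthand introduced here for the absolute differences). The smallest difference over all entries is 0 (the row of Node 2), and the largest is $|0.0681 - 0.6111| = 0.5430$ (Node 1, BC), so for Node 1 and DC:

$$\Delta_{11} = |0.0681 - 0.2143| = 0.1462, \qquad r(y_{14}, y_{11}) = \frac{0 + 0.5 \times 0.5430}{0.1462 + 0.5 \times 0.5430} = \frac{0.2715}{0.4177} \approx 0.650.$$

Averaging the six coefficients of the DC column in this way gives $R(Y_4, Y_1) \approx 0.8350$, and Eq. (11) then yields

$$w_{\mathrm{DC}} = \frac{0.8350}{0.8350 + 0.9785 + 0.7522} = \frac{0.8350}{2.5657} \approx 0.3254.$$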

Phase 3: Evaluate and rank nodes with TOPSIS

Step 1: By using Eq. (12), the weighted multi-attribute matrix $V = (v_{ij})_{6 \times 3}$ is obtained as

$$V = (v_{ij})_{6 \times 3} = \begin{pmatrix} 0.2111 & 0.1993 & 0.2707 \\ 0.0528 & 0.1195 & 0 \\ 0.1056 & 0.1328 & 0.0246 \\ 0.1056 & 0.1494 & 0.0492 \\ 0.1584 & 0.1708 & 0.0984 \\ 0.1056 & 0.1494 & 0 \end{pmatrix}.$$

Step 2: Determine the positive ideal object $A^+$ and negative ideal object $A^-$.

$$A^+ = \{v_1^+, v_2^+, v_3^+\} = \{0.2111, 0.1993, 0.2707\}.$$

$$A^- = \{v_1^-, v_2^-, v_3^-\} = \{0.0528, 0.1195, 0.0000\}.$$

Step 3: Calculate the distances $(S^+, S^-)$.

$$S^+ = \{S_1^+, S_2^+, \ldots, S_6^+\} = \{0, 0.3236, 0.2759, 0.2503, 0.1824, 0.2948\}.$$

$$S^- = \{S_1^-, S_2^-, \ldots, S_6^-\} = \{0.3236, 0, 0.0597, 0.0781, 0.1532, 0.0607\}.$$
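For instance, the distance of Node 2 to the positive ideal object follows from Eq. (15) and the second row of $V$:

$$S_2^+ = \sqrt{(0.0528 - 0.2111)^2 + (0.1195 - 0.1993)^2 + (0 - 0.2707)^2} = \sqrt{0.02506 + 0.00637 + 0.07328} \approx 0.3236.$$

Node 2 attains the minimum of every column of $V$, so it coincides with the negative ideal object and its distance to that object is 0. Likewise, Node 1 attains the maximum of every column, so it coincides with the positive ideal object, and its distance to the negative ideal object equals the distance between the two ideal objects, 0.3236.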

Table 2. The ranking lists of node importance by the descending order of DC, CC, BC and C.

DC              CC               BC              C (GSW-TOPSIS)
Value   Nodes   Value    Nodes   Value   Nodes   Value    Nodes
4       1       0.1667   1       0.55    1       1        1
3       5       0.1429   5       0.2     5       0.4793   5
2       4       0.125    4       0.1     4       0.2508   4
2       3       0.125    6       0.05    3       0.1954   3
2       6       0.1111   3       0       6       0.1924   6
1       2       0.1      2       0       2       0        2

Step 4: Compute the relative closeness $C$ to the ideal solution.

$$C = \{C_1, C_2, \ldots, C_6\} = \{1, 0, 0.1954, 0.2508, 0.4793, 0.1924\}.$$

The ranking lists in descending order of DC, CC, BC and C (GSW-TOPSIS) are shown in Table 2. The proposed method gives the same ranking as DC and BC. Its ranking differs from that of CC only in the order of Nodes 3 and 6.

4.2. Three actual complex networks

In this subsection, three real networks are used to verify the effectiveness and practicability of the proposed method. In principle, node importance is closely related to spreading ability: a node with larger spreading ability is more influential. To evaluate the performance of different ranking methods, we use the SIR model described in Sec. 2 to simulate the real spreading process. On the one hand, we compare the proposed method with three centrality measures. On the other hand, the proposed method is compared with the W-TOPSIS method developed by Hu and his co-workers.17

(1) Yeast network

The Yeast network is a protein–protein interaction network, which contains 2361 nodes and 7182 edges. Each node represents a protein and each edge represents the interaction between two proteins. The data can be downloaded from "http://vlado.fmf.uni-lj.si/pub/networks/data/bio/Yeast/Yeast.htm".

Table 3 shows the top-15 lists derived by DC, CC, BC, W-TOPSIS17 and the proposed method, respectively. The proposed method and DC share 13 members in their top-15 lists, and our method and BC share 14. GSW-TOPSIS and W-TOPSIS have the same nodes in their top-14 lists, though not in the same order, and differ in the 15th node. Besides the 15th node, we focus on the nodes that appear in both top-14 lists with different ranking orders.

Firstly, in our method, Node 566 ≻ Node 147, where "≻" represents "more important than", but the situation is opposite in W-TOPSIS. In Fig. 3(a), one can see that Node 566 infects other nodes faster than Node 147 and attains a larger F(t).

Table 3. The top-15 lists ranked by degree centrality (DC), closeness centrality (CC), betweenness centrality (BC), W-TOPSIS and the proposed method (GSW-TOPSIS) in Yeast. Entries are node numbers.

Rankings   DC     CC     BC     W-TOPSIS   GSW-TOPSIS
1          566    951    147    1443       1443
2          302    694    1443   784        784
3          209    1162   784    147        566
4          1443   1364   566    566        147
5          784    2158   209    209        209
6          147    104    549    549        549
7          492    636    302    302        302
8          120    866    508    508        508
9          644    915    2022   2022       2022
10         252    922    120    120        120
11         508    1195   199    199        492
12         2022   1238   492    492        252
13         549    1299   252    252        199
14         61     1301   290    290        290
15         442    1360   283    6          61
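The overlap counts quoted for the Yeast network can be recomputed directly from Table 3. The short Python sketch below does so with set intersections; the dictionary `top15` and the helper `overlap` are names introduced here for illustration, and the node lists are copied from the table.

```python
# Top-15 node lists copied from Table 3 (Yeast network).
top15 = {
    "DC": [566, 302, 209, 1443, 784, 147, 492, 120, 644, 252,
           508, 2022, 549, 61, 442],
    "BC": [147, 1443, 784, 566, 209, 549, 302, 508, 2022, 120,
           199, 492, 252, 290, 283],
    "W-TOPSIS": [1443, 784, 147, 566, 209, 549, 302, 508, 2022, 120,
                 199, 492, 252, 290, 6],
    "GSW-TOPSIS": [1443, 784, 566, 147, 209, 549, 302, 508, 2022, 120,
                   492, 252, 199, 290, 61],
}


def overlap(a, b):
    """Number of nodes that appear in both lists, ignoring their order."""
    return len(set(a) & set(b))


shared = {name: overlap(top15["GSW-TOPSIS"], nodes)
          for name, nodes in top15.items() if name != "GSW-TOPSIS"}

# Nodes that appear in exactly one of the two TOPSIS lists.
only_one = set(top15["GSW-TOPSIS"]) ^ set(top15["W-TOPSIS"])
```

Because sets ignore order, this check only confirms membership; differences in ranking order still have to be read from the table.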

[Figure: four panels plotting infected nodes F(t) against time t for α = 0.2. (a) Node 147 and Node 566; (b) Node 199 and Node 492; (c) Node 199 and Node 252; (d) Node 6 and Node 61.]

Fig. 3. (Color online) The vertical axis represents the spreading ability of the initially infected node at time t. Results are obtained by averaging over 100 independent runs in Yeast.

Secondly, in our method, Node 492 ≻ Node 252 ≻ Node 199, but in W-TOPSIS, Node 199 ≻ Node 492 ≻ Node 252. Therefore, we examine the spreading ability of two pairs of nodes, namely, Node 492 and Node 199, and Node 252 and Node 199. The comparisons are shown in Figs. 3(b) and 3(c), from which one can see that Node 492 infects other nodes faster than Node 199, and Node 252 outperforms Node 199 slightly. Besides, it should be noted that the nodes in each pair reach similar steady states.

Finally, we compare the nodes that appear in only one of the two top-15 lists, either that of the proposed method or that of the W-TOPSIS method. Node 61 appears only in the top-15 list of our method, and Node 6 appears only in the top-15 list of the W-TOPSIS method. The simulation result for these two nodes is depicted in Fig. 3(d), from which one can observe that Node 61 performs better than Node 6. The above analysis indicates that the proposed method performs better than W-TOPSIS in Yeast.

(2) Netscience network

The Netscience network is a scientific collaboration network on network theory and network experiments,41 which contains 1589 nodes and 2742 edges. Each node represents a scientist, and each edge represents the collaboration between two scientists. The data can be downloaded from "http://www-personal.umich.edu/mejn/netdata/".

Table 4 shows the top-15 lists generated by DC, CC, BC, W-TOPSIS17 and the proposed method, respectively. The proposed method and DC, CC, BC have 4, 13 and 15 same nodes in the top-15 lists, respectively. GSW-TOPSIS and W-TOPSIS have the same nodes in their top-15 lists. Here, we focus on the nodes that appear in both top-15 lists but with different ranking orders in GSW-TOPSIS and W-TOPSIS.

Table 4. The top-15 lists derived by degree centrality (DC), closeness centrality (CC), betweenness centrality (BC), W-TOPSIS and the proposed method (GSW-TOPSIS) in Netscience. Entries are node numbers.

Rankings   DC     CC    BC    W-TOPSIS   GSW-TOPSIS
1          34     79    79    79         79
2          35     151   151   151        151
3          79     517   517   517        517
4          55     282   282   282        282
5          295    217   217   217        35
6          1430   35    35    35         217
7          1431   757   757   757        757
8          1432   302   302   302        302
9          217    132   132   204        34
10         63     204   204   132        204
11         646    152   152   34         132
12         1433   34    34    152        152
13         1434   47    47    47         47
14         1435   31    220   220        220
15         1436   308   31    31         31

[Figure: three panels plotting infected nodes F(t) against time t for α = 0.85. (a) Node 217 and Node 35; (b) Node 204 and Node 34; (c) Node 132 and Node 34.]

Fig. 4. (Color online) The vertical axis represents the spreading ability of the initially infected node at time t. Results are obtained by averaging over 100 independent runs in Netscience.

First of all, in our method, Node 35 ≻ Node 217, but the situation is opposite in W-TOPSIS. Figure 4(a) shows that Node 35 can spread the information (or disease) faster and wider than Node 217.

Next, in our method, Node 34 ≻ Node 204 ≻ Node 132, but in W-TOPSIS, Node 204 ≻ Node 132 ≻ Node 34. Therefore, we compare the spreading ability of two pairs of nodes, namely, Node 34 and Node 204, and Node 34 and Node 132, which are shown in Figs. 4(b) and 4(c). In Fig. 4(b), Node 34 does not perform as well as Node 204 between t = 5 and t = 11, but outside that interval the situation is opposite. In Fig. 4(c), although Node 132 performs slightly better than Node 34 between t = 4 and t = 8, Node 34 performs better at the beginning and reaches a slightly larger steady state. In conclusion, the above experiment indicates that our method performs slightly better than W-TOPSIS in Netscience.

Table 5. The top-15 lists ranked by degree centrality (DC), closeness centrality (CC), betweenness centrality (BC), W-TOPSIS and the proposed method (GSW-TOPSIS) in Geom. Entries are node numbers.

Rankings   DC     CC     BC     W-TOPSIS   GSW-TOPSIS
1          36     6322   1812   36         36
2          1812   5190   36     1812       1812
3          405    5583   79     79         79
4          1897   207    405    405        405
5          1207   5444   658    658        1897
6          79     7319   1897   1897       997
7          5864   823    2918   2918       1207
8          3991   5682   1414   1414       5864
9          712    3429   997    997        1414
10         997    4446   5864   5864       3991
11         3031   5102   3991   1207       2918
12         546    5156   1207   3991       658
13         2542   7034   1623   3125       712
14         1571   5666   3125   1623       3125
15         1414   6504   1103   712        546

(3) Geom network

The Geom network is a collaboration network in computational geometry, which consists of 7343 nodes and 11,898 edges. Each node represents an author in computational geometry, and each edge represents joint work between two authors. The data can be downloaded from "http://vlado.fmf.uni-lj.si/pub/networks/data/collab/geom.htm".

Table 5 shows the top-15 lists given by DC, CC, BC, W-TOPSIS17 and the proposed method, respectively. The numbers of members shared by the top-15 list of the proposed method and those of DC, CC and BC are 12, 0 and 13, respectively.

GSW-TOPSIS and W-TOPSIS share fourteen members in their top-15 lists. On the one hand, in our method, Node 1897 ≻ Node 658, Node 997 ≻ Node 2918 and Node 5864 ≻ Node 1414, but the situation is opposite in W-TOPSIS. On the other hand, Node 546 appears only in the top-15 list of our method, and Node 1623 appears only in the top-15 list of the W-TOPSIS method. Thus, we compare the spreading ability of four pairs of nodes, namely, Node 1897 and Node 658, Node 997 and Node 2918, Node 5864 and Node 1414, and Node 546 and Node 1623, which are shown in Fig. 5.

In Fig. 5(a), Node 1897 performs, as well as Node 658. In Figs. 5(b)–5(d), we can see that Node 997, Node 5864, Node 546 can spread the information (or disease) faster than Node 2918, Node 1414, Node 1623, respectively. Moreover, Node 997 and Node 546 reach the larger steady states than Node 2918 and Node 1623, respectively. As discussed above, one can see that our method outperforms W-TOPSIS in Geom.

Fig. 5. (Color online) Number of infected nodes F(t) versus time t in Geom with α = 0.2, for (a) Node 658 and Node 1897, (b) Node 2918 and Node 997, (c) Node 1414 and Node 5864, and (d) Node 1623 and Node 546. The vertical axis represents the spreading ability of the initially infected node at time t. Results are obtained by averaging over 100 independent runs.

4.3. Correlation analysis

A node with larger spreading ability is more important. Kendall’s Tau coefficient is widely used for correlation analysis. A larger correlation coefficient between the list derived by a ranking method and the node spreading ability list ranked by the SIR model indicates that the method performs better. To further compare the proposed method with W-TOPSIS, we use Kendall’s Tau coefficient to perform the correlation analysis.

Assuming a network has $n$ vertices, Kendall’s Tau correlation is defined as (Ref. 42):

$$\tau = \frac{N_c - N_d}{n(n-1)/2}, \tag{18}$$

where $N_c$ and $N_d$ represent the numbers of concordant and discordant pairs, respectively. Kendall’s Tau considers a set of joint observations $\{(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)\}$ from two random variables $X = \{x_1, x_2, \ldots, x_n\}$ and $Y = \{y_1, y_2, \ldots, y_n\}$. If both $x_i < x_j$ and $y_i < y_j$, or both $x_i > x_j$ and $y_i > y_j$, the pair of observations $(x_i, y_i)$ and $(x_j, y_j)$ is said to be concordant. If $x_i < x_j$ and $y_i > y_j$, or $x_i > x_j$ and $y_i < y_j$, the pair is said to be discordant. If $x_i = x_j$ or $y_i = y_j$, the pair is neither concordant nor discordant.
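The definition in Eq. (18) can be written down almost verbatim in code. The sketch below is a direct O(n²) implementation in Python; the function name `kendall_tau` and the example scores are illustrative only. Tied pairs are counted as neither concordant nor discordant, exactly as stated above, while the denominator stays at n(n−1)/2.

```python
from itertools import combinations


def kendall_tau(x, y):
    """Kendall's Tau as in Eq. (18): (N_c - N_d) / (n(n-1)/2)."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length, at least 2")
    concordant = 0
    discordant = 0
    for i, j in combinations(range(n), 2):
        sign = (x[i] - x[j]) * (y[i] - y[j])
        if sign > 0:        # both sequences order the pair the same way
            concordant += 1
        elif sign < 0:      # the two sequences disagree on the pair
            discordant += 1
        # sign == 0: a tie in x or y, neither concordant nor discordant
    return (concordant - discordant) / (n * (n - 1) / 2)


# Illustrative scores: one value per node from a ranking method and from SIR.
method_scores = [0.9, 0.7, 0.4, 0.2]
sir_scores = [120.0, 95.0, 97.0, 30.0]
tau = kendall_tau(method_scores, sir_scores)
```

This corresponds to the variant often called tau-a. SciPy's `scipy.stats.kendalltau` applies a tie correction (tau-b) by default, so the two can differ when the sequences contain ties.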

A larger value of τ indicates that the ranked list generated by a ranking method is more accurate. The ideal case is τ = 1, where the method and the real spreading process yield exactly the same ranking. With a large value of α, the spreading would cover almost all of the network (Ref. 11). In this experiment, we therefore increase the spreading probability α of the SIR model gradually from 0.01 to 0.1.

In Fig. 6(a), GSW-TOPSIS does not perform as well as W-TOPSIS between α = 0.04 and α = 0.06 in Yeast, but it performs better for all other values of α.

From Fig. 6(b), one can see that GSW-TOPSIS outperforms W-TOPSIS in Netscience over the whole range of α. Moreover, the Kendall’s Tau value is as high as 0.89 when α = 0.08. In Fig. 6(c), our method shows a stronger positive correlation with the real spreading process than W-TOPSIS in Geom, except at α = 0.03.

Fig. 6. (Color online) Kendall’s Tau correlation coefficient τ versus spreading probability α for W-TOPSIS and GSW-TOPSIS on three real networks: (a) Yeast, (b) Netscience and (c) Geom. The coefficients are generated by comparing the lists ranked by the two methods with the list ranked by the SIR model. The results are obtained by averaging over 100 independent runs, with α ranging from 0.01 to 0.1.


5. Computing Complexity Analysis

In general, real complex networks contain many nodes. Thus, methods for ranking node importance should be both efficient and reasonable. In this part, we discuss the computing complexity of the proposed method, assuming that an unweighted network has $m$ edges and $n$ vertices.

The proposed method consists of three main phases. In Phase 1, the multi-attribute matrix is constructed by calculating the centrality measures DC, CC and BC. The computational complexity of degree centrality is $O(n)$. BC can be calculated using Brandes’ algorithm (Ref. 43) in time $O(nm) = O(n^2 \langle k \rangle)$, where $\langle k \rangle$ represents the average degree of the network. The calculation of CC takes $O(n^3)$ time with Floyd’s algorithm (Ref. 44). Normalizing the multi-attribute matrix by Euclidean standardization takes $O(n)$ time. In Phase 2, the weights of the given attributes are determined by GRA and SIR. The node spreading ability $F$ can be computed using the SIR model in time $O(nm) \le O(n^3)$, and the grey relational degree can be calculated using the GRA method in time $O(n)$. In Phase 3, the nodes of the network are evaluated and ranked with the TOPSIS method, whose computing complexity is $O(n)$. In conclusion, the overall computing complexity of the proposed method is $O(n^3)$, which is dominated by the calculation of CC.
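To make the linear cost of Phases 2 and 3 concrete, the following Python sketch weights three attribute columns by a grey relational degree and then ranks nodes with TOPSIS. It is an illustration under assumptions, not the exact formulation of the proposed method: the grey relational coefficient is the standard one of Deng with distinguishing coefficient ρ = 0.5, sequences are min-max scaled before comparison, weights are taken proportional to the grey relational degrees, and all attributes are treated as benefit attributes. The function names and the toy numbers are hypothetical.

```python
import math


def normalize_columns(matrix):
    """Euclidean (vector) normalization of each attribute column."""
    norms = [math.sqrt(sum(v * v for v in col)) or 1.0 for col in zip(*matrix)]
    return [[v / norms[j] for j, v in enumerate(row)] for row in matrix]


def gra_weights(matrix, reference, rho=0.5):
    """Weights proportional to the grey relational degree between each
    attribute column and the reference sequence (assumed weighting rule)."""
    def minmax(seq):
        low, high = min(seq), max(seq)
        span = (high - low) or 1.0
        return [(v - low) / span for v in seq]

    ref = minmax(reference)
    deltas = [[abs(r - c) for r, c in zip(ref, minmax(col))]
              for col in zip(*matrix)]
    d_min = min(min(d) for d in deltas)
    d_max = max(max(d) for d in deltas)
    if d_max == 0:  # every attribute matches the reference exactly
        return [1.0 / len(deltas)] * len(deltas)
    degrees = [sum((d_min + rho * d_max) / (d + rho * d_max) for d in delta)
               / len(delta) for delta in deltas]
    total = sum(degrees)
    return [g / total for g in degrees]


def topsis(matrix, weights):
    """Closeness of each row (node) to the ideal solution."""
    weighted = [[w * v for w, v in zip(weights, row)]
                for row in normalize_columns(matrix)]
    best = [max(col) for col in zip(*weighted)]
    worst = [min(col) for col in zip(*weighted)]
    scores = []
    for row in weighted:
        d_pos = math.sqrt(sum((v - b) ** 2 for v, b in zip(row, best)))
        d_neg = math.sqrt(sum((v - w) ** 2 for v, w in zip(row, worst)))
        scores.append(d_neg / (d_pos + d_neg) if d_pos + d_neg else 0.0)
    return scores


# Toy data: rows are nodes, columns are (DC, CC, BC); F is a made-up
# spreading ability per node, standing in for the SIR result.
attributes = [[4, 0.9, 10], [3, 0.7, 6], [2, 0.5, 3], [1, 0.2, 0]]
F = [40, 30, 20, 10]
weights = gra_weights(attributes, F)
scores = topsis(attributes, weights)
ranking = sorted(range(len(scores)), key=lambda i: -scores[i])
```

Each function makes a constant number of passes over an n × 3 matrix, which is consistent with the O(n) estimates given for normalization, GRA and TOPSIS; the cubic term comes entirely from Phase 1.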

6. Conclusions

Influential node identification is an important and challenging issue in complex networks. Ranking the nodes of a network by a single index has limitations. In order to evaluate node importance comprehensively, we propose a dynamic weighted TOPSIS method, based on consideration of local, global and location information of nodes. The proposed method and the W-TOPSIS method (Ref. 17) are based on a common idea: an attribute sequence that is closer to the real node spreading ability sequence is more important. Compared with the weighting algorithm given in Ref. 17, the GRA model is more effective in extracting the similarity between two sequences, and is more suitable for extracting the consistency of trend between the attribute sequences and the real node spreading ability sequence.

Therefore, in the proposed method, we use the GRA method in combination with the SIR model to determine the attribute weights dynamically. That is, the weight assigned to each attribute is given by the grey relational degree between the attribute sequence and the node spreading ability sequence.

The comparative experiments are conducted on three actual complex networks, namely “Yeast”, “Netscience” and “Geom”. To evaluate the performance of different methods, the SIR model is adopted to measure node influence. In the three networks, the proposed method ranks nodes more accurately than the W-TOPSIS method (Ref. 17) in most cases, which suggests that the weights derived from GRA and SIR are more reliable. The proposed method can also be applied to other actual complex networks. Further study of multi-attribute ranking methods and their applications is worthwhile in the future.

Acknowledgments

The authors are very grateful to the editor and the referees for their valuable comments and suggestions that led to an improved version of this paper. This work was supported by National Natural Science Foundation of China (No. 11201290), Shanghai Science and Technology Development Funds Soft Science Research Project (No. 17692104500) and Natural Science Research Project of Jiangsu Province Colleges and Universities (No. 16KJD520002).

References

1. R. Albert and A. L. Barabasi, Rev. Mod. Phys. 74 (2002) 47.
2. M. E. J. Newman, Networks: An Introduction (Oxford University Press, Oxford, 2010).
3. R. Pastor-Satorras, A. Vázquez and A. Vespignani, Phys. Rev. Lett. 87 (2001) 258701.
4. R. Cohen, S. Havlin and D. Ben-Avraham, Phys. Rev. Lett. 91 (2001) 247901.
5. R. Pastor-Satorras and A. Vespignani, Phys. Rev. E 65 (2002) 8.
6. A. L. Barabasi, N. Gulbahce and J. Loscalzo, Nat. Rev. Genet. 12 (2011) 56.
7. J. Borge-Holthoefer and Y. Moreno, Phys. Rev. E 85 (2012) 026116.
8. L. Y. Lü, Y. C. Zhang, C. H. Yeung and T. Zhou, PLoS One 6 (2011) e21202.
9. J. Leskovec, L. A. Adamic and B. A. Huberman, ACM T. Web 1 (2007) 39.
10. D. H. Zanette, Phys. Rev. E 65 (2002) 041908.
11. M. Kitsak, L. K. Gallos, S. Havlin, F. Liljeros, L. Muchnik, H. E. Stanley and H. A. Makse, Nat. Phys. 6 (2010) 888.
12. G. Lawyer, Sci. Rep. 5 (2015) 9.
13. Y. X. Du, C. Gao, Y. Hu, S. Mahadevan and Y. Deng, Physica A 399 (2014) 57.
14. T. Bian and Y. Deng, Chaos 28 (2018) 043109.
15. Z. H. Liu, C. Jiang, J. Y. Wang and H. Yu, Knowl.-Based Syst. 84 (2015) 56.
16. L. Y. Lu, D. B. Chen, X. L. Ren, Q. M. Zhang, Y. C. Zhang and T. Zhou, Phys. Rep. 650 (2016) 1.
17. J. T. Hu, Y. X. Du, H. M. Mo, D. J. Wei and Y. Deng, Physica A 444 (2016) 73.
18. Z. X. Wang, C. J. Du, J. P. Fan and Y. Xing, Neurocomputing 260 (2017) 466.
19. L. G. Fei, H. M. Mo and Y. Deng, Mod. Phys. Lett. B 31 (2017) 17.
20. P. Hu, W. L. Fan and S. W. Mei, Physica A 429 (2015) 169.
21. T. Bian and Y. Deng, Chaos Soliton. Fract. 103 (2017) 101.
22. F. Bauer and J. T. Lizier, Europhys. Lett. 99 (2012) 68007.
23. W. J. Yuan, J. F. Zhou, Q. Li, D. B. Chen and Z. Wang, Phys. Rev. E 88 (2013) 022818.
24. Y. H. Li, Z. A. Bandar and D. McLean, IEEE T. Knowl. Data En. 15 (2003) 871.
25. S. P. Borgatti, A. Mehra, D. J. Brass and G. Labianca, Science 323 (2009) 892.
26. L. C. Freeman, Soc. Networks 1 (1978) 215.
27. L. C. Freeman, S. P. Borgatti and D. R. White, Soc. Networks 13 (1991) 141.
28. M. E. J. Newman, Soc. Networks 27 (2005) 39.
29. W. L. Fan, P. Hu and Z. G. Liu, IET Gener. Transm. Dis. 10 (2016) 2027.
30. S. Opricovic and G. H. Tzeng, Eur. J. Oper. Res. 156 (2004) 445.
31. F. E. Boran, S. Genc, M. Kurt and D. Akay, Expert Syst. Appl. 36 (2009) 11363.
32. J. M. Merigo and A. M. Gil-Lafuente, Inform. Sciences 180 (2010) 2085.
33. D. Kannan, A. Jabbour and C. J. C. Jabbour, Eur. J. Oper. Res. 233 (2014) 432.
34. J. L. Deng, Syst. Control Lett. 1 (1982) 288.
35. R. M. May and R. M. Anderson, Nature 280 (1979) 455.
36. C. Dye, Trends Ecol. Evol. 6 (1991) 340.
37. J. Moran, E. Granada, J. L. Miguez and J. Porteiro, Fuel. Process. Technol. 87 (2006) 123.
38. D. Yamaguchi, G. D. Li and M. Nagai, Inform. Sciences 177 (2007) 4727.
39. G. W. Wei, Expert Syst. Appl. 38 (2011) 4824.
40. C. Hwang and K. Yoon, Multiple Attribute Decision Making Methods and Applications: A State of the Art Survey (Springer, Berlin, 1981).
41. M. E. J. Newman, Phys. Rev. E 74 (2006) 036104.
42. M. G. Kendall, Biometrika 30 (1938) 81.
43. U. Brandes, J. Math. Sociol. 25 (2001) 163.
