Lifetime Maximizing Data Gathering Tree in Wireless Sensor Networks K-means based Prim’s Algorithm


¹Rupali Rohankar & ²C. P. Katti
Email: ¹[email protected], ²[email protected]

Abstract - Energy efficiency is critical when collecting data in a wireless sensor network (WSN) because node energy is limited and transmission is costly. This paper constructs a data gathering tree using a k-means based Prim’s algorithm. A Lifetime Maximizing Minimum Spanning Tree (LMMST) for the complete network is constructed over N sensing points.

The nodes are first partitioned into clusters using k-means. In theory this speeds up the MST computation, as it avoids unnecessary partitions compared with the original Prim’s and Kruskal’s algorithms.

Prim’s algorithm computes the minimum spanning tree by repeatedly selecting the minimum weight edge that connects the tree to a vertex not yet in it.

The complexity of the LMMST computation is approximately O(N^1.5).

Keywords - Wireless sensor network, Minimum Spanning Tree, Network lifetime maximization, k-means clustering, Prim.

I. INTRODUCTION

Advances in micro-electronics have made it possible to integrate sensors, a processor, and radio devices on a single transducer. An important application of wireless sensor networks is continuous monitoring and periodic reporting [1]. The sensors collect data from their surroundings and transmit it to a central processing node called the sink. This category of applications includes habitat monitoring, structure maintenance and so on. The amount of data generated in the network is huge. Data aggregation can be applied to fuse redundant data and thus avoid unnecessary transmissions. Additionally, the main issue in data gathering is conserving node energy and prolonging network lifetime, which is an NP-hard problem. As data transmission consumes the most energy, it is essential to avoid unnecessary transmissions [2]. Like data aggregation, topology control is another crucial method for avoiding unnecessary transmissions. Here the nodes in the network are organized into a predetermined structure or graph, and routing algorithms are then applied. As described in [5], SEECDS is an energy efficient method based on a connected dominating set to prolong network lifetime. Also, [7] describes a lifetime maximizing dominating set based clustering method (LMDSC). As described in [17], energy efficient duty cycle scheduling can be done to prolong network lifetime by putting the

lifetime maximizing data gathering tree based on k-means. The network is divided into √N clusters [15]. The minimum spanning tree of these clusters is constructed using Prim’s algorithm. The initial partitioning using k-means clustering speeds up MST construction.

The outline is as follows. Section II describes related work. Section III gives the system model. Section IV describes the LMMST algorithm along with its theoretical analysis. Section V shows the simulation results, and section VI concludes.

II. RELATED WORK

Cluster and tree based data gathering algorithms have been proposed in the past. In [3], [4], cluster heads are selected probabilistically and clusters are then formed. Using a bottom-up approach, the clusters are connected to the next higher level cluster. Many of the algorithms proposed using the bottom-up approach have large overhead [4], [7], [8]. In [9], a top-down method is used, where clusters are formed starting from the root of the tree. However, this can still result in overlapping clusters, leading to redundancy. The cluster size should remain uniform and topology control should be easily achieved [10] – [12].

In a highly energy constrained sensor network, it is required that the data gathering protocol should be energy efficient and the hop count should be minimal.

Minimum spanning tree algorithms such as Prim’s and Kruskal’s are well known in the literature [13], [14].

The work proposed in [15] uses k-means based divide and conquer for data mining, where the tree is computed over the data points of a data set. LMMST is similar to [15]. The LMMST algorithm proposed here computes a minimum spanning tree for a wireless sensor network using k-means and Prim’s algorithm. Its novelty lies in applying this approach to compute a data gathering tree for WSN. K-means is a robust and easy to implement clustering algorithm [16].


ISSN (Print): 2319-2526, Volume-4, Issue-2, 2015 7

III. NETWORK MODEL, ENERGY MODEL & PROBLEM DEFINITION

A. Network Model

A wireless sensor network of N deployed nodes is modeled as an undirected graph. If a pair of nodes (i, j) is within communication distance, a link of weight w_i,j exists between them in the graph. w_i,j is the communication cost, in terms of the energy consumed to transmit data between i and j, and is proportional to the transmission energy given by equation (1). T_s is the spanning tree consisting of edges from the original graph; the path to a node is selected based on the lower link weight. The nodes continuously monitor the environment and generate t-bit messages at a fixed interval. The messages are to be forwarded to the sink node.

B. Energy Model

Each node has an initial energy level. The energy consumed to transmit an m-bit message is modeled as in [11]:

E_t = m * E_elec + m * E_fs * r²    (1)

where m is the message size in bits, E_elec is the per-bit energy consumed by the radio electronics, r is the transmission radius, and E_fs is the transmitter amplifier energy, whose contribution grows with r². Receiving an m-bit packet consumes

E_r = m * E_elec    (2)
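Equations (1) and (2) can be evaluated directly. The following is a minimal sketch, assuming the parameter values listed in Table I; the function names tx_energy and rx_energy and the sample distances are illustrative and do not come from the paper.

```python
# Illustrative sketch of the radio energy model in equations (1) and (2).
# Parameter values follow Table I; all function names are hypothetical.

E_ELEC = 30e-9   # J/bit, electronics energy (Table I: 30 nJ/bit)
E_FS = 10e-12    # J/bit/m^2, amplifier energy (Table I: 10 pJ/bit/m^2)


def tx_energy(m_bits: int, r: float) -> float:
    """Energy in joules to transmit m_bits over radius r, equation (1)."""
    return m_bits * E_ELEC + m_bits * E_FS * r ** 2


def rx_energy(m_bits: int) -> float:
    """Energy in joules to receive m_bits, equation (2)."""
    return m_bits * E_ELEC


if __name__ == "__main__":
    packet = 1500  # bits, packet size from Table I
    for r in (10.0, 50.0):  # sample distances in metres (assumed)
        print(f"r = {r:5.1f} m: E_t = {tx_energy(packet, r):.3e} J")
    print(f"E_r = {rx_energy(packet):.3e} J")
```

Because the amplifier term grows with r², shortening links, which is what a minimum spanning tree does, reduces E_t more than linearly in distance.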

C. Problem Definition

This paper studies the problem of computing a minimum weight spanning tree. A topology control protocol, the “Lifetime Maximizing Minimum Spanning Tree” (LMMST), is proposed. LMMST is a distributed algorithm that implements k-means based divide and conquer, as described in section IV.

IV. DESIGN OF LMMST ALGORITHM

A. LMMST Algorithm

The minimum spanning tree has been widely studied by researchers seeking to improve its efficiency. Building a lifetime maximizing data gathering tree amounts to constructing a minimum spanning tree while considering each node’s residual energy. LMMST comprises phases in which k-means and Prim’s algorithm are employed one after the other to obtain an approximate minimum spanning tree.

Phase 1: Partition network into clusters using k-means

Due to dense deployment in a WSN, sensor nodes are usually spatially correlated, and clustering has to preserve this spatial correlation. Hence, k-means can be applied to partition the network on the basis of spatial correlation. From [5], the number of clusters has a maximum value of √N. Hence, we set k = √N. This value of k also gives the minimum time complexity for the proposed LMMST algorithm. The k-means clustering algorithm is given as Algorithm 1 in Annexure I.

Phase 2: MST construction

After dividing the network into k clusters using k-means, we find the MST of each cluster using Prim’s algorithm, given as Algorithm 2 in Annexure I. Prim’s algorithm builds a spanning tree of minimum total edge cost (unique when all edge weights are distinct). In each iteration, a new vertex is added to the spanning tree T^k such that it forms a minimum weight edge with some vertex already in T^k.

Phase 3: Combining MSTs of k-clusters

This is the combining phase, in which the k MSTs, T^k, are connected. Once the MSTs of the subsets are constructed, neighboring MSTs are connected by finding the minimum transmission cost between two nodes. The Combine algorithm is Algorithm 3 in Annexure I.

B. Working of LMMST algorithms

The clusters are formed using k-means (Algorithm 1, Annexure I). In the second phase, each cluster head is connected to the nodes of its cluster in a tree by constructing the MST of the individual cluster using Prim’s algorithm. The cluster head collects data from its sub-MST and forwards it via other cluster heads towards the sink node.

Since Prim’s algorithm selects vertices based on the minimum transmission cost, constructing the MST reduces the average transmission range. This reduces the energy consumed by transmissions during data gathering along the path towards the sink node.

Lemma 1: LMMST is more energy efficient than simple MST construction.

MSTs reduce the transmission range, but LMMST is more energy efficient still. It clusters spatially related nodes, which allows more fusion of redundant data and so reduces the amount of data transmitted. The average transmission range within a cluster is also minimized, reducing energy consumption during data gathering. Further, LMMST connects two clusters over the minimum distance, reducing the overall transmission range.

Lemma 2: LMMST is scalable.

When the network area becomes larger, the clustering of spatially related nodes improves. As per Lemma 1, LMMST is significantly more energy efficient than simple MSTs. The larger the network area, the greater the efficiency of LMMST. Hence, LMMST is scalable.

C. Analysis of LMMST

i. Time Complexity Analysis


LMMST consists of 3 phases: the divide phase, MST construction, and the conquer phase. k partitions are formed in phase 1 and k MSTs are built in the second phase. The time complexity of Prim’s algorithm using binary heaps with S iterations is O(N log N), which we approximate as O(N). The conquer stage adds O(k). Hence, the time complexity of all 3 phases taken together is

C = k + k * N + k = 2k + k * N ≈ k * N (if 2k << k * N)    (3)

Since k = √N, (3) becomes

C = √N * N = N^1.5    (4)

Hence, from equation (4) we conclude that the complexity of LMMST is O(N^1.5).
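The approximation in (3) can be checked numerically. The sketch below, with the hypothetical helper lmmst_cost, evaluates C = 2k + k * N for k = √N and compares it with N^1.5; the network sizes are sample values chosen for illustration only.

```python
import math


def lmmst_cost(n: int) -> float:
    """Operation count C = 2k + k*N from equation (3), with k = sqrt(N)."""
    k = math.sqrt(n)
    return 2 * k + k * n


if __name__ == "__main__":
    for n in (100, 300, 1000):  # sample network sizes (assumed)
        exact = lmmst_cost(n)
        approx = n ** 1.5       # equation (4)
        print(f"N = {n:4d}: C = {exact:10.1f}, N^1.5 = {approx:10.1f}, "
              f"ratio = {exact / approx:.4f}")
```

The ratio C / N^1.5 equals 1 + 2/N, so the dropped 2k term contributes less than 1% once N exceeds 200.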

V. SIMULATION

A. Simulation Environment

The simulation is performed in Matlab. 300 nodes are randomly distributed in an area of 100 m x 100 m. The initial energy level varies from 5 pJ to 10 pJ. The energy consumption parameters are given in Table I.

Parameters              Values
E_elec                  30 nJ/bit
E_fs                    10 pJ/bit/m²
Packet size (in bits)   1500

Table I. Energy consumption parameters and their values

The simulation results are compared with the original Prim’s algorithm to show the effect of k-means clustering on data gathering and lifetime maximization.

The algorithms are compared on three metrics: energy consumed in data gathering, network lifetime, and data delivery ratio.

B. Simulation Results

i) Energy Consumed (Hop Counts) –

This metric depicts the energy consumed in data gathering. The number of hops is varied from 5 to 25 in steps of 5. The number of nodes is fixed at 100 and at 300 in an area of 150 m x 150 m.

As shown in Fig. 1, energy consumption rises with the number of hops. Energy consumption in LMMST is lower than in Prim because LMMST collects data in clusters and fuses it before forwarding. Moreover, the distributed LMMST algorithm creates a hierarchical structure before data gathering begins in the network. Fig. 2 depicts the energy consumption during data gathering when the number of nodes is increased to 300. As the number of nodes grows, the network density increases, which causes more message exchange within a cluster and hence more energy consumption.

ii) Network Lifetime (seconds) – It is the lifetime in terms of rounds before the first node depletes its energy completely. In a 150 m x 150 m area, the number of nodes is varied from 20 to 180.
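The lifetime metric defined above can be computed as follows. This is a simplified sketch that assumes each node spends a constant amount of energy per round, which the paper does not state; the function name network_lifetime and the sample values are illustrative.

```python
import math


def network_lifetime(initial_energy, per_round_cost):
    """Rounds completed before the first node exhausts its energy.

    Assumes (for illustration) a constant per-round cost for every node.
    """
    rounds = []
    for e0, cost in zip(initial_energy, per_round_cost):
        if cost <= 0:
            continue  # a node that spends nothing never depletes
        rounds.append(math.floor(e0 / cost))
    return min(rounds) if rounds else math.inf


if __name__ == "__main__":
    energy = [10.0, 8.0, 6.0]   # hypothetical initial energies
    cost = [2.5, 1.0, 4.0]      # hypothetical per-round consumption
    print(network_lifetime(energy, cost))
```

Under this definition the lifetime is set by the most heavily loaded node, which is why balancing relay load across the tree matters as much as reducing total energy.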


In Fig. 3, it is shown that the network lifetime grows with the number of nodes. More nodes means more spatially related nodes. This benefits the clustering, and data gathering improves through redundant data fusion. As shown, LMMST has a greater network lifetime than Prim, owing to the initial cluster formation of spatially related nodes.

Fig. 4 shows the network lifetime of LMMST and Prim’s algorithm when the network area varies. The node density is fixed at 20 and 300 sensor nodes are deployed. As seen in Fig. 4, the network lifetime increases with the area, but the growth is half that in Fig. 3. This is because, although MSTs are used, maintaining connectivity requires a higher minimum transmission range than in Fig. 3, which causes more energy consumption.

iii) Delivery ratio (packets) – It is the ratio of the number of packets received at the sink node to the number of packets produced by the nodes of the network. We calculate the data delivery ratio as the number of network nodes varies from 20 to 180 in an area of 150 m x 150 m.

As shown in Fig. 5, the data delivery ratio is higher for LMMST. Compared with Prim’s algorithm, the LMMST algorithm delivers 33.6% more packets. The delivery ratio also increases with the network size (in terms of nodes). This indicates that LMMST is scalable and efficient for larger networks.

VI. CONCLUSION

In this paper, the “Lifetime Maximizing Minimum Spanning Tree” (LMMST) algorithm is proposed. The algorithm generates initial clusters based on k-means. Minimum spanning trees with minimum weight edges are constructed for each of the k clusters. A shorter path is chosen for connecting the k distinct spanning trees. The weight of an edge corresponds to the transmission energy required on that link between two nodes. Hence, through spanning tree construction we achieve a reduced minimum transmission range for the entire network. Further, clustering exploits spatially related nodes, and data fusion is applied. This further reduces transmission energy consumption, making LMMST more efficient than Prim’s algorithm. As argued in Lemma 2, LMMST is scalable. As the results in Fig. 5 show, LMMST has a higher data delivery ratio and is efficient for larger networks.

VII. REFERENCES

[1] Rupali Rohankar, C. P. Katti, S. Kumar, “Comparison of Energy Efficient Data Collection Techniques in WSN”, in Elsevier Procedia of ICRTC’15, Modinagar, India, pp. 5, 2015.

[2] W. Dargie, Christian P, “Fundamentals Of Wireless Sensor Networks: Theory And Practice”, Wiley Series on Wireless Communications and Mobile Computing, pp. 7-10, 2010.

[3] S. Bandopadhyay, E. Coyle, “An energy efficient hierarchical clustering algorithm for wireless sensor networks”, IEEE INFOCOM, 3, pp. 1713-1723, 2003.

[4] A. Manjeshwar and D. P. Agrawal, “TEEN: a routing protocol for enhanced efficiency in wireless sensor networks,” in Proceedings of the 15th International Parallel and Distributed Processing Symposium, pp. 2009–2015, San Francisco, Calif, USA, April 2001.

[5] Rupali Rohankar, C. P. Katti, “Single Phase Energy Efficient Connected Dominating Set”, Accepted ICAEET 2015.

[6] K. M. Alzoubi, P.-J. Wan, O. Frieder, “Distributed Construction of Connected Dominating Set in Wireless Ad Hoc Networks”, Mobile Networks and Applications, 9, pp. 141-149, 2004.

[7] Rupali Rohankar, C. P. Katti, “Lifetime Maximization Using Dominating set based clustering in WSN”, Accepted ICAEET 2015.

[8] M. Maeda and E. D. Callaway, “Cluster tree protocol (ver. 0.6),” April 2001, http://www.ieee802.org/15/pub/2001/May01/01189r0P802-15 TG4-Cluster-Tree-Network.pdf.

[9] H. M. D. Bandara, A. P. Jayasumana, “An enhanced top-down cluster and the cluster tree formation algo for WSN,” in Proc. ICIIS ’07, pp. 565–570, Peradeniya, Sri Lanka, Aug. 2007.

[10] H. Chan and A. Perrig, “An emergent algorithm for highly uniform cluster formation,” in Proc. of 1st European Workshop on Wireless Sensor Networks, pp. 154–171, 2004.


[11] X. Wang and T. Berger, “Self-organizing redundancy-cellular architecture for wireless sensor networks,” in Proc. of IEEE Wireless Comm. and Net. Conf. (WCNC ’05), pp. 1945–1951, New Orleans, La, USA, March 2005.

[12] M. Demirbas, A. Arora, V. Mittal, and V. Kulathumani, “A fault-local self-stabilizing clustering service for wireless ad hoc networks,” IEEE Trans. on Parallel and Distributed Systems, vol. 17, no. 9, pp. 912–922, 2006.

[13] R. C. Prim, Shortest connection networks and some generalizations, Bell System Technical Journal, Vol. 36, pp. 1389-1401, 1957.

[14] B. Y. Wu, K.-M. Chao, Spanning Trees and Optimization Problems, Chapman & Hall/CRC, 2004.

[15] C. Zhong, M. Malinnen, D. Miao, P. Franti, Fast Approx. MST Algo. Based on k-means, Springer-Verlag, CAIP 2013, Part I, LNCS 8047, pp. 262-269, 2013.

[16] A. K. Jain and R. C. Dubes, Algorithms for Clustering, Prentice Hall, 1988.

[17] Rupali Rohankar, C. P. Katti, “Energy Efficient Duty Cycle Scheduling in WSN”, Accepted ICAEET 2015.

ANNEXURE I

Algorithm 1: Clustering using k-means

Input: Graph containing N sensor nodes.

Output: k clusters

Begin Procedure

1. Set k = √N (for LMMST algorithm)

2. To obtain k clusters, k initial centroids are chosen at random.

3. The Euclidean distance from each node to every centroid is computed, and each node is assigned to the cluster of its closest centroid.

4. This forms k initial clusters.

End Procedure
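The steps of Algorithm 1 can be sketched in Python as below. The centroid refinement loop is an assumption borrowed from standard k-means (Lloyd’s iterations); Algorithm 1 as stated describes only the initial assignment. The names kmeans and assign, the iteration count, and the seed are illustrative.

```python
import math
import random


def assign(points, centroids):
    """Step 3: label each node with the index of its nearest centroid."""
    return [min(range(len(centroids)), key=lambda c: math.dist(p, centroids[c]))
            for p in points]


def kmeans(points, k=None, iterations=10, seed=0):
    """Partition 2-D points (list of (x, y) tuples) into k clusters."""
    rng = random.Random(seed)
    if k is None:
        k = max(1, round(math.sqrt(len(points))))  # step 1: k = sqrt(N)
    centroids = rng.sample(points, k)              # step 2: random centroids
    for _ in range(iterations):
        labels = assign(points, centroids)
        # Assumed refinement, not spelled out in Algorithm 1:
        # move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = (sum(x for x, _ in members) / len(members),
                                sum(y for _, y in members) / len(members))
    return assign(points, centroids), centroids


if __name__ == "__main__":
    rng = random.Random(1)
    nodes = [(rng.uniform(0, 100), rng.uniform(0, 100)) for _ in range(300)]
    labels, centroids = kmeans(nodes)
    print(len(centroids), "clusters for", len(nodes), "nodes")
```

Setting iterations=0 reduces the sketch to the single assignment pass that Algorithm 1 describes.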

Algorithm 2: Prim’s Algorithm

Input:

• Initial vertex set V_k of graph G(V_k, E_k), where k is the number of clusters formed in the graph.

• VX^q, the vertex set of the spanning tree of cluster q, to which vertices are added as they are included in the tree.

• Weighted edges w_i,j in the k clusters.

Output:

• ST^k: k minimum spanning trees

Begin Procedure

1. In a cluster q, initialize the vertex set VX^q = {CH^q}, where q = 1, ..., k and CH^q is the cluster head of cluster q. Remove CH^q from V_q.

2. Select a minimum weight edge w_i,j in cluster q such that node i is in VX^q of spanning tree ST^q and node j is not in VX^q.

3. Add node j to VX^q for inclusion in spanning tree ST^q. Remove node j from V_q.

4. Repeat steps 2 and 3 until V_q becomes empty.

5. Repeat steps 1 through 4 for all k clusters.

End Procedure
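For a single cluster, steps 1 through 4 of Algorithm 2 can be sketched with a binary heap as follows. The sketch assumes every pair of nodes in the cluster can communicate and uses Euclidean distance as the edge weight; since the energy cost in equation (1) increases monotonically with distance, both weightings yield the same tree. The function name prim_mst and the root argument (standing in for the cluster head CH^q) are illustrative.

```python
import heapq
import math


def prim_mst(nodes, root=0):
    """Return MST edges (i, j, weight) over nodes, grown from index root.

    nodes is a list of (x, y) positions for one cluster; root plays the
    role of the cluster head CH^q in step 1 of Algorithm 2.
    """
    n = len(nodes)
    if n == 0:
        return []
    in_tree = [False] * n              # membership in VX^q
    heap = [(0.0, root, root)]         # (weight, tree vertex i, candidate j)
    edges = []
    while heap:
        w, i, j = heapq.heappop(heap)  # step 2: cheapest candidate edge
        if in_tree[j]:
            continue                   # stale entry, j was already added
        in_tree[j] = True              # step 3: add j to the tree
        if i != j:
            edges.append((i, j, w))
        for v in range(n):             # offer edges from j to outside nodes
            if not in_tree[v]:
                heapq.heappush(heap, (math.dist(nodes[j], nodes[v]), j, v))
    return edges


if __name__ == "__main__":
    cluster = [(0, 0), (1, 0), (1, 1), (5, 5)]
    for i, j, w in prim_mst(cluster):
        print(f"{i} -> {j}: {w:.3f}")
```

Step 5 corresponds to calling prim_mst once per cluster produced by k-means.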

Algorithm 3: Combine algorithm

Input: k MSTs, ST^k.

Output: Approximate MST T^k.

Begin Procedure

1. In the spanning tree ST^q of cluster q, find a vertex v₁ such that the distance D between v₁ and c_r, the centroid of cluster r, is minimum.

2. Find a vertex v₂ in cluster r such that the distance D between v₂ and c_q, the centroid of cluster q, is minimum.

3. Add the edge (v₁, v₂) to connect the two MSTs ST^q and ST^r.

4. Repeat steps 1 through 3 for all the spanning trees to obtain an approximate spanning tree T^k formed by joining all k MSTs.

End Procedure
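Steps 1 through 3 of Algorithm 3 can be sketched for one pair of clusters as below. The helper names centroid and connecting_edge are illustrative. Which pairs of clusters count as neighbors in step 4 is not specified in the text, so the sketch leaves that choice to the caller.

```python
import math


def centroid(points):
    """Mean position of a cluster given as a list of (x, y) tuples."""
    n = len(points)
    return (sum(x for x, _ in points) / n, sum(y for _, y in points) / n)


def connecting_edge(cluster_q, cluster_r):
    """Pick the edge (v1, v2) that joins the MSTs of clusters q and r."""
    c_q, c_r = centroid(cluster_q), centroid(cluster_r)
    v1 = min(cluster_q, key=lambda p: math.dist(p, c_r))  # step 1
    v2 = min(cluster_r, key=lambda p: math.dist(p, c_q))  # step 2
    return v1, v2                                         # step 3


if __name__ == "__main__":
    q = [(0, 0), (1, 0), (2, 0)]
    r = [(10, 0), (11, 0), (12, 0)]
    print(connecting_edge(q, r))
```

Choosing v₁ and v₂ by distance to the opposite centroid costs time linear in the two cluster sizes, rather than the quadratic cost of comparing every cross-cluster pair, at the price of an approximate rather than exact MST.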

