

Characterization and Mitigation of the Energy Hole Problem of Many-to-One Communication in Wireless Sensor Networks

Heitor S. Ramos (1,2), Eduardo M. R. Oliveira (1), Azzedine Boukerche (3), Alejandro C. Frery (2), Antonio A. F. Loureiro (1)

(1) Department of Computer Science, Federal University of Minas Gerais, Belo Horizonte, MG, Brazil
(2) Institute of Computing, Federal University of Alagoas, Maceió, AL, Brazil
(3) School of Information Technology and Engineering, University of Ottawa, Ottawa, ON, Canada

{hramos, edumucelli, loureiro}@dcc.ufmg.br, [email protected], [email protected]

Abstract—In Wireless Sensor Networks (WSNs), the lifetime of a uniformly deployed network is impaired by the sensors close to the sink, a phenomenon known as the energy hole problem. These nodes are more likely to relay more packets than those that are farther away. The problem is strongly related to the topology induced by the deployment of nodes over the sensor field. In this work, we explore the complex network properties exhibited by the graphs that represent WSNs and propose a new centrality measure called Sink Betweenness. We use this metric to devise a new data-collection algorithm that alleviates the energy hole effects by evenly balancing the relay load and, thus, increases the network lifetime. Simulation results show that our solution substantially decreases the difference in the number of transmissions among the nodes situated close to the sink.

I. INTRODUCTION

Wireless Sensor Networks (WSNs) suffer from constraints that do not necessarily appear in other networks. For instance, the lifetime of a uniformly deployed WSN is impaired by the sensors close to the sink, a phenomenon known as the energy hole problem. These nodes are more likely to relay more packets than those that are farther away. The problem is strongly related to the topology induced by the deployment of nodes over the sensor field. Wu et al. [1] and Li and Mohapatra [2] present analytical models for this problem and investigate approaches to mitigate it. They conclude that simply increasing the number of randomly placed nodes cannot prolong the network lifetime as desired when a totally random deployment is used, and show that the network lifetime can be improved by spreading more nodes close to the sink.

In this work we propose a new measure of centrality, called Sink Betweenness, which stems from the theory of Complex Networks but is adapted to WSNs to capture information relevant to them. We apply this measure to the energy hole problem, where energy balancing is the most important aspect to address. To the best of our knowledge, there is no similar centrality measure that correlates well with energy consumption in a wide variety of WSN scenarios. We use this metric to devise a new data-collection algorithm, and the simulation results show that it is a promising solution.

The rest of this paper unfolds as follows. Section II presents relevant related work from the literature. Section III motivates the problem we are dealing with, and Section IV reviews centrality metrics and introduces the Sink Betweenness. Section V details our proposal, while Section VI presents the evaluation methodology and the results obtained in the scenarios of interest. The conclusions are discussed in Section VII.

II. RELATED WORK

Li and Mohapatra [2] present the first mathematical model towards the characterization of the energy hole problem. They consider sensor nodes distributed following the URP (Uniform Random Placement) law in a circular region divided into concentric coronas, and observe the impact of four factors on the energy hole problem: node density, hierarchical deployment, source bit rate, and traffic compression. They show that simply adding more nodes to the network does not solve the problem. Liu et al. [3] propose a different approach to the energy hole problem: they consider a nonuniform node deployment and derive a placement function based on the distance to the sink, in hops. An extension of this idea is presented by Wu et al. [1], who show that nearly balanced energy depletion is possible by increasing the density in geometric progression from the outer to the inner coronas; accordingly, they propose a nonuniform node distribution strategy, the Q-Model. Song et al. [4] strive to mitigate the energy hole problem by performing power control, i.e., by adjusting the transmission range in order to increase the node density surrounding the sink node. Liu et al. [5] deal with a similar problem but consider hierarchical networks where cluster heads perform data aggregation; cluster heads closer to the sink node suffer from the energy hole problem, so they use different cluster sizes to mitigate it. Kim et al. [6] propose a dynamic routing algorithm that relies on a mobile sink to balance the workload generated by the relay task and thus alleviate the energy hole problem. Another strategy that relies on mobile nodes is presented in [7], where the authors propose a routing algorithm that uses mobile relay nodes to alleviate the energy hole problem.

This work focuses on the energy hole problem and proposes a new data-collection algorithm that aims at mitigating it. Unlike other works in this area, we use neither different deployment strategies nor mobile nodes. We estimate a centrality metric that characterizes the energy hole problem and devise an algorithm that uses this metric to balance the relay workload.



III. MOTIVATION

The most common solutions in the literature to alleviate the energy hole problem rely on new deployment models that place a higher node density around the sink node. Although these strategies have drawbacks, such as lower coverage and a higher probability of collisions within the high-density areas, the energy hole problem can be alleviated because there are more nodes to share the relay task near the sink. The Q-model introduced in Liu et al. [3] is a good example of this strategy.

Consider a WSN that represents the usual situation where sensor nodes periodically report information to the base station through a tree-based routing infrastructure. For this initial analysis, we consider a simple routing strategy based on shortest paths, where nodes choose their parents after a flooding initiated at the sink node [8], [9]. We used a network consisting of 300 Mica2 CC1000 nodes, with their transmission power level set to −10 dBm, deployed on a square field of side 150 m. In this simulation, we observe that the Q-model distributes the relay task more evenly than the URP model. With URP, more than 60% of the transmissions were performed by nodes situated one hop from the sink; with the Q-model, the nodes in the first hop transmitted less than 30% of the total number of packets.

Despite the more even distribution of transmitted packets among nodes located at different hop levels, the deployment strategy cannot do much about the distribution among nodes situated at the same hop distance. For instance, in Figure 1 we observe that both URP and the Q-model lead to an uneven number of transmissions among nodes at the same hop distance. In this example we depict the first hop only, but this behavior is observed at all levels with different intensities: the workload is most uneven at the first hop and progressively evens out toward the last hop, since nodes at the last hop only transmit their own sensed packets. Observe that with the URP model a few nodes transmit many more packets than the others, and the difference can reach about 5 times more packets. For the Q-model, the total number of transmissions decreases because there are many more nodes one hop away from the sink, but the general behavior is similar, and the difference is also about 5 times more packets. This illustrates that the deployment strategy can alleviate the energy hole among hop levels, but a routing protocol is still necessary to balance the workload of the relay nodes.
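For concreteness, the parent-selection rule used in this motivating experiment (each node adopts the neighbor from which it first receives the sink-initiated flood) behaves like a breadth-first tree rooted at the sink. The sketch below is a minimal, centralized Python stand-in written for this text, assuming the connectivity graph is available as adjacency lists; the function and variable names are ours, not from the paper.

```python
from collections import deque

def flood_tree(adj, sink):
    """Centralized stand-in for the flooding-based parent selection [8], [9]:
    a node's parent is the neighbor whose flood packet arrives first, which
    on an unweighted connectivity graph corresponds to a BFS tree rooted
    at the sink."""
    parent = {sink: None}
    hops = {sink: 0}
    queue = deque([sink])
    while queue:
        v = queue.popleft()
        for w in adj[v]:
            if w not in hops:          # first flooding packet heard by w
                hops[w] = hops[v] + 1  # shortest hop distance to the sink
                parent[w] = v          # fixed parent (the simpleTree rule)
                queue.append(w)
    return parent, hops
```

With the resulting hop counts, the share of transmissions per hop level (e.g., the more than 60% observed at the first hop under URP) can be tallied directly.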


Fig. 1. Uneven distribution of transmissions among nodes one hop from the sink: (a) URP, (b) Q-model (histograms of the number of transmissions per node versus percent of total).

IV. CENTRALITY METRICS

Centrality indexes [10] aim at estimating the importance of a vertex, that is, at ranking it by its topological importance. Vertices positioned in central areas generally possess higher structural importance than those at the border. Whenever data flows across the network, those central vertices are natural and significant information brokers. There are several indexes of centrality based on different graph features, such as distance between vertices, degree, and neighborhood importance. Another widely used concept in centrality indexes is the shortest path; for example, the Shortest Path Betweenness Centrality [11] calculates the centrality of vertex $i$ as the proportion of geodesic paths between any pair of vertices that pass through $i$ with respect to the total number of geodesic paths in the graph.

A. Shortest Path Centrality

Consider a network whose topology is represented by the graph $G(V, E)$, where $V = \{v_1, \ldots, v_n\}$ is the set of $|V| = n$ nodes and $E$ is the set of edges. Let us define the neighborhood of node $v_i$ as $N_i = \{v_j : e_{ji} = e_{ij} \in E\}$. The degree of a vertex $v_i$ is defined as $k_i = |N_i|$. Edges may be weighted, i.e., there may be a function $W : E \to \mathbb{R}$ that associates a weight $w_e$ with every $e \in E$. Thus, the betweenness of node $v$ can be defined as $B(v) = \sum_{s=1}^{n} \sum_{t=1}^{n} \sigma_{st}(v)/\sigma_{st}$, where $\sigma_{st}$ is the number of shortest paths from $s$ to $t$, $s, t \in V$, and $\sigma_{st}(v)$ is the number of shortest paths from $s$ to $t$ that pass through vertex $v \in V$, $s \neq v \neq t$.

Locating and counting shortest paths is difficult in large networks [11], and computational resources are limited in WSNs. The most efficient centralized algorithm to calculate betweenness has running time $O(nm + n^2 \log n)$ for weighted graphs and $O(nm)$ for unweighted graphs, where $n$ and $m$ are the number of vertices and edges, respectively.

In WSN scenarios, communication typically takes place between sensor nodes and the sink, and vice versa. To take this characteristic into account, in [12] we introduce a new centrality metric, which we term Sink Betweenness (SB), that considers only the shortest paths that include the sink as one of the terminal nodes. It is defined, for every $v \in V$, as $SB(v) = \sum_{t \in V,\, t \neq v} \sigma_{ts}(v)/\sigma_{ts}$, where $s$ is the sink, $\sigma_{ts}$ is the number of shortest paths from $t$ to the sink, and $\sigma_{ts}(v)$ is the number of shortest paths from $t$ to the sink that contain $v$. For the sake of simplicity, in this work we consider that WSNs can be represented by unweighted graphs. In some scenarios it is more appropriate to use weighted graphs, and both betweenness and sink betweenness can easily support such a feature.
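As an illustration of the definition above, the following centralized sketch computes SB on an unweighted graph with a single BFS from the sink followed by a Brandes-style dependency accumulation. It is a minimal reference implementation written for this text, not the distributed 2n-message algorithm of [12]; the function and variable names are ours.

```python
from collections import deque

def sink_betweenness(adj, sink):
    """SB(v) = sum over t of sigma_ts(v)/sigma_ts over the shortest paths
    from every node t to the sink s, computed with one BFS from the sink
    plus a Brandes-style dependency accumulation (unweighted graph)."""
    sigma = {v: 0 for v in adj}     # shortest-path counts from the sink
    dist = {v: -1 for v in adj}
    preds = {v: [] for v in adj}    # predecessors on shortest paths
    order = []                      # visit order (non-decreasing distance)

    sigma[sink], dist[sink] = 1, 0
    queue = deque([sink])
    while queue:
        v = queue.popleft()
        order.append(v)
        for w in adj[v]:
            if dist[w] < 0:                 # first time w is reached
                dist[w] = dist[v] + 1
                queue.append(w)
            if dist[w] == dist[v] + 1:      # v lies on a shortest path to w
                sigma[w] += sigma[v]
                preds[w].append(v)

    sb = {v: 0.0 for v in adj}
    for w in reversed(order):               # accumulate dependencies
        for v in preds[w]:
            sb[v] += (sigma[v] / sigma[w]) * (1.0 + sb[w])
    sb[sink] = 0.0
    return sb
```

For a path graph a–b–c with sink a, it returns SB(b) = 1 and SB(c) = 0, since only c's shortest path to the sink passes through b.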


Fig. 2. Betweenness and SB values for two sink positions: center (a: Betweenness, b: SB) and corner (c: Betweenness, d: SB). The sink is represented by a triangle.

Observe that the flow is defined from the regular nodes to the sink in order to capture the data-collection behavior. For other applications, such as data dissemination, SB can easily be adapted as well.

B. Sink Betweenness and Wireless Sensor Networks

SB is similar to betweenness in the sense that it represents centrality in terms of shortest paths, with additional properties that are convenient in the WSN context. For example, SB is much cheaper to calculate than betweenness. In our previous work on the SB metric [12], we present a distributed algorithm that uses only 2n messages to calculate SB in unweighted graphs. Although it requires two floods, shortest-path routing algorithms commonly require at least one flooding to set up the routing structure; thus, the first flooding can piggyback the necessary data, and only one additional flooding is needed to complete our algorithm. Calculating betweenness by a similar approach would require 2n^2 messages, an overhead that is excessive for large WSNs.

Moreover, SB represents the traffic pattern in WSNs better than betweenness. For instance, Figure 2 shows a panorama of the distribution of betweenness and SB. The gray level of each node is proportional to its betweenness or SB: the greater the value, the darker the point. The sink node is represented by a triangle. Figures 2(a) and 2(b) show the betweenness and SB when the sink node is positioned at the center of the network. Observe that both metrics distinguish the nodes that concentrate more routes toward the sink node. Notice also that SB is more selective and presents high values only in nodes that, in fact, participate in more paths to the sink; betweenness presents more nodes with high values far from the sink, since it considers paths among all nodes. These figures offer an interpretation of why SB is more likely to be related to energy depletion than betweenness. When the sink is located far from the center (Figures 2(c) and 2(d)), betweenness fails to highlight the nodes that participate in more paths to the sink and loses the ability to characterize the energy hole problem, while SB maintains this ability.

The last, and possibly most important, property of SB in the context of the energy hole problem is its correlation with the energy spent by the nodes. In the scenarios considered here, transmission is the task that spends the most energy; thus, we are looking for a metric able to indicate which node is more likely to transmit more packets. We used Spearman's rank correlation because it is robust and is recommended when the data do not necessarily come from a bivariate normal distribution.
The number of samples we used guaranteed that the largest standard deviation of these estimates is less than $3 \cdot 10^{-4}$ [13]. We conducted a simulation with 400 nodes randomly placed on the sensor field, the sink placed at the center of the network, and a shortest-path based data-collection algorithm for a continuous data application. SB presented a high correlation (0.85), while betweenness presented only a slight correlation (0.58). When the sink is located far from the center, SB maintains the same level of correlation, while the correlation presented by betweenness degrades.

V. PROPOSED SOLUTION

In order to mitigate the uneven workload observed among nodes located at the same hop distance (cf. Section III), we propose a new data-collection algorithm in which every node $v_i \in V$ knows the SB of the nodes that belong to its neighborhood $N_i$. As the nodes with high SB are more likely to be used as relays by their neighbors, our algorithm uses the SB to decrease the probability of those nodes being chosen and thus balances the load among the neighbors. The decision rule used by node $v_i \in V$ to choose its next hop consists of the following steps:

(i) Sink betweenness calculation and announcement: the sink betweenness of each node is calculated using the distributed algorithm described in Oliveira et al. [12], and the SB announcement is piggybacked within the packet of the second flooding (each node has already calculated its own SB before sending this packet).
(ii) Neighbor filtering: in order to create shortest paths, only the neighbors in $N_i$ that are closer to the sink than node $v_i$ are eligible as relays; $\Omega_i$ denotes this set of eligible nodes. Thus, we ensure the use of shortest paths only.
(iii) SB normalization: the SB is normalized over $\Omega_i$, yielding a metric called nSB (normalized sink betweenness). Since each neighborhood presents SB values of different magnitudes, this step makes the values comparable.
(iv) Neighbor probability assignment: a probability based on nSB is assigned to every eligible node $v_j \in \Omega_i$; this is the probability of node $v_j$ being chosen as the relay of node $v_i$.
(v) Relay selection: the relay node is randomly selected according to the probabilities of step (iv).

To define the probabilities of step (iv), we use a parameter called "temperature", $T$, that controls how strongly node $v_i$ avoids choosing a neighbor with high SB. The probability that node $v_i \in V$ chooses node $v_j \in \Omega_i$ as its relay at temperature $T$ is $\Pr_i^T(j) \propto \exp(-\,\mathrm{nSB}(j)\, T)$; thus, we calculate $k_T^i = \sum_{j \in \Omega_i} \exp(-\,\mathrm{nSB}(j)\, T)$ and obtain

$$\Pr_i^T(j) = \frac{1}{k_T^i} \exp(-\,\mathrm{nSB}(j)\, T). \qquad (1)$$
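As a worked illustration of Equation (1) (the numbers here are ours, chosen only for exposition), suppose $\Omega_i = \{v_a, v_b\}$ with $\mathrm{nSB}(a) = 0$ and $\mathrm{nSB}(b) = 1$. At $T = 3$, $k_T^i = e^{0} + e^{-3} \approx 1.0498$, so $\Pr_i^T(a) \approx 0.95$ and $\Pr_i^T(b) \approx 0.05$: the high-SB neighbor is almost always avoided. At $T = 0$, both probabilities are $0.5$, which reduces to the uniform choice made by the randomTree baseline described in Section VI.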

Since this process is executed on a per-packet basis, we expect the relay load to be balanced, weighted by the SB of each node. The hypothesis we verify is that making this choice on a per-packet basis, with the load among neighbors guided by the SB, distributes the workload among the neighbors as evenly as possible.
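The per-packet relay choice of steps (iii)–(v) can be sketched as follows. This is a minimal Python illustration written for this text: the min–max normalization is one plausible reading of step (iii) (the paper states only that SB is normalized over $\Omega_i$), and the function names are ours.

```python
import math
import random

def normalize_sb(sb_of_eligible):
    """Step (iii): normalize SB within the eligible set Omega_i.
    Min-max scaling is one plausible choice; the paper only says the SB
    values are normalized over Omega_i to make them comparable."""
    lo, hi = min(sb_of_eligible.values()), max(sb_of_eligible.values())
    span = (hi - lo) or 1.0
    return {j: (v - lo) / span for j, v in sb_of_eligible.items()}

def choose_relay(sb_of_eligible, T, rng=random):
    """Steps (iv)-(v): assign Pr(j) proportional to exp(-nSB(j) * T),
    as in Equation (1), and draw one relay at random per packet."""
    nsb = normalize_sb(sb_of_eligible)
    weights = {j: math.exp(-nsb[j] * T) for j in nsb}
    k = sum(weights.values())              # normalization constant k_T^i
    r, acc = rng.random() * k, 0.0
    for j, w in weights.items():           # inverse-CDF sampling
        acc += w
        if r <= acc:
            return j
    return j                               # guard against rounding
```

Setting T = 0 makes all weights equal, recovering a uniform choice among eligible neighbors; larger T shifts traffic away from high-SB neighbors.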


The rationale behind the use of the exponential function in this context is as follows. First, the exponential function is strictly positive, as probabilities are, which simplifies the probability calculation. Moreover, the exponential function presents different decreasing rates in different regions. Since the SB is normalized, the values lie in the interval [0, 1], and the parameter $T$ acts as a scale factor that controls which region of the exponential function is being used and, thus, how different the probability of a node with high SB is compared to that of a node with low SB. Equation (1) therefore provides a consistent and versatile rule to choose relay nodes, controlled by the parameter $T$.

VI. EVALUATION

A. Methodology

The main goal of our simulations is to assess the performance of the data-collection algorithm described in Section V. To do so, we performed a Monte Carlo simulation and evaluated the behavior of the number of transmissions of the nodes located at the first hop (one hop from the sink) through the following metrics:

(i) Maximum number of transmissions: this metric indicates how well our protocol decreases the maximum number of transmissions and thus increases the lifetime of the nodes. The main goal is to decrease the dispersion of the number of transmitted packets; the ideal situation is when all nodes at the same hop distance transmit the same number of packets. Although the maximum is very sensitive to outliers, it gives a good idea of how our solution performs. An analysis of dispersion metrics is less sensitive to outliers and provides another perspective on the evaluation.

(ii) Dispersion analysis: we use the IQR (interquartile range), i.e., the difference between the 3rd and 1st quartiles, to measure the dispersion of the number of transmissions of the nodes located at the first hop. Any other measure of dispersion could be used; we adopted the IQR because it is intuitive.

We conducted our assessment by comparing the performance of the proposed algorithm, namely randomSbetTree, with two other data-collection algorithms, simpleTree and randomTree. All three algorithms are based on an SPT (shortest path tree), and the main difference among them is how they choose the shortest path to deliver the data. The simpleTree algorithm is a routing structure where every node chooses a fixed parent, forming a tree. The tree formation is started by the sink node, which floods the network; when a node receives the first flooding packet, it chooses its parent and forwards the flooding. This algorithm is also known as a delay tree. The randomTree algorithm is a modification of simpleTree that tries to balance the relay load of the nodes: instead of choosing the parent from the first flooding packet, the node waits to receive the flooding packet from every neighbor and builds a vector of eligible neighbors $\Omega_i$; for every packet it uniformly randomly chooses one node $v_j$ from $\Omega_i$ as the relay of that specific packet. This approach is the same as $T = 0$ in the randomSbetTree.
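A compact way to see the relation among the three policies is to express each one's per-packet parent choice, given the hop counts produced by the sink-initiated flood. The sketch below is illustrative only and written for this text; the helper names are ours, and simple_tree_parent uses the first eligible neighbor found as a stand-in for "whichever flood packet arrived first".

```python
import random

def eligible_set(node, adj, hops):
    """Omega_i: neighbors strictly closer to the sink (shortest paths only)."""
    return [w for w in adj[node] if hops[w] < hops[node]]

def simple_tree_parent(node, adj, hops):
    """simpleTree: one fixed parent per node (first eligible neighbor here)."""
    return eligible_set(node, adj, hops)[0]

def random_tree_parent(node, adj, hops, rng=random):
    """randomTree: a fresh uniform choice from Omega_i for every packet,
    i.e., the T = 0 case of randomSbetTree."""
    return rng.choice(eligible_set(node, adj, hops))
```

The randomSbetTree replaces the uniform choice in random_tree_parent with the Equation (1)-weighted selection sketched in Section V (the hypothetical choose_relay helper), applied to the SB values of the nodes in $\Omega_i$.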

Table I presents the simulation scenarios we evaluated.

TABLE I
SIMULATION SCENARIOS

Parameter            Value
sink node            1, the center-most node
network size         {100, 200, 300, 400} nodes
sensor field         square, {750, 1000, 1250} m of side
deployment model     URP
simulation time      3600 s
data rate            1 packet/min per node
collision model      additive model
routing algorithms   randomSbetTree, simpleTree, randomTree
temperature (T)      −2, −1, 1, 2, 3, 5, 10, 15
sensor model         Mica2 CC1000
transmission power   10 dBm

A Monte Carlo simulation was performed, with 30 independent replications for each situation indexed by the parameters shown in Table I. This number of replications was considered sufficient for hypothesis testing of sample mean differences at the 95% confidence level. We used R version 2.12 (R Foundation for Statistical Computing, Vienna, http://www.r-project.org) for node deployment and statistical analysis, the OMNeT++ simulator version 3.3p1 for discrete event simulation, and Castalia version 2.3b (http://castalia.npc.nicta.com.au) for the WSN models. Both the wireless channel and MAC models are already available in Castalia; the routing and application models were implemented as specified above. We also used the Mica2 CC1000 radio module available in Castalia. Seeds, sources, and scripts can be obtained upon request from the first author.

B. Results

In the following, we present the evaluation results of the routing protocol proposed in this work. All results show the behavior of the number of transmissions of the nodes located one hop from the sink. The behavior is similar for the other hop distances but smoother, since the workload tends to be more even as the distance to the sink increases. We evaluated the behavior of our protocol for a wide range of values of the parameter T and observed that, for our scenarios, T = 3 is a good choice; for the rest of the analysis, we consider the randomSbetTree under this value.

Figure 3 shows the maximum number of transmitted packets observed on nodes located one hop away from the sink for all evaluated scenarios. Observe that the randomSbetTree has the desirable property of not increasing the maximum number of packets when the number of nodes increases, while the other algorithms lack this ability. This means that even when we increase the number of nodes, and all those nodes are transmitting packets to the sink node, the network lifetime is maintained when the randomSbetTree is used. This can be explained by the fact that the randomSbetTree distributes the relay task more evenly among the nodes of the same routing level. We expect the number of nodes in a given routing level to increase proportionally to the total number of nodes; thus, an even routing strategy exhibits this property.


Fig. 3. Maximum number of transmissions upon varying the number of nodes (randomSbetTree, randomTree, simpleTree) for field sizes 750, 1000, and 1250 m.

Fig. 4. IQR of transmissions upon varying the number of nodes (randomSbetTree, randomTree, simpleTree) for field sizes 750, 1000, and 1250 m.

The advantage of the randomSbetTree in terms of the maximum number of transmitted packets also increases as the network density increases. For instance, for 400 nodes, the maximum number of transmitted packets under the randomSbetTree is 70%, 40%, and 25% lower than under the other algorithms for field sizes of 750, 1000, and 1250 m, respectively. With 100 nodes, we cannot see a substantial difference among the three routing algorithms, since the density is too low (about 4 to 6 neighbors on average) and there is not enough opportunity to balance the workload. It is noteworthy that simpleTree and randomTree do not show considerable differences in terms of the maximum number of transmitted packets.

Figure 4 shows the behavior of the IQR as a function of the adopted routing protocol for all evaluated scenarios. This metric is more robust to outliers than the maximum number of packets. We observe that the randomSbetTree consistently leads to a lower IQR, which reinforces our hypothesis that its distribution of the relay task is more even than that of the other algorithms. With this metric we show that not only is the maximum value decreased, but the minimum value is increased accordingly; thus, all nodes tend to relay a number of packets closer to the average than when other algorithms are used. For instance, when the number of nodes is 400, the randomSbetTree presents an IQR 80%, 55%, and 35% lower than the randomTree for field sizes of 750, 1000, and 1250 m, respectively. Observe that, for the IQR, the randomTree performs slightly better than the simpleTree.

VII. FINAL REMARKS

In this work we proposed the use of a novel centrality measure called Sink Betweenness to devise a routing algorithm. As shown in our assessment, this metric captures the behavior of the energy spent by the relay task in many typical WSN scenarios; thus, the routing algorithm can be used to alleviate the energy hole problem. This ability, in contrast to the relative insensitivity of the classical betweenness, suggests other possibilities. Sink Betweenness can be used in a wide variety of applications, both in the design and in the operation of WSNs. For instance, a designer can assess the best deployment strategy to create graphs with a more uniform Sink Betweenness distribution. Such assessments should improve the understanding and management of the network lifetime, since the energy consumption becomes more evenly distributed among the nodes. The proposed routing algorithm takes advantage of the knowledge of the SB metric to distribute the relay task evenly and thus alleviate the energy hole; our simulation results show that it is a promising solution. Regarding WSN operation, we envision other topology-aware algorithms that use each node's Sink Betweenness value to increase network performance. For instance, MAC algorithms can take advantage of this measure to optimize the nodes' duty cycle and reduce energy consumption.

REFERENCES

[1] X. Wu, G. Chen, and S. K. Das, "Avoiding Energy Holes in Wireless Sensor Networks with Nonuniform Node Distribution," IEEE Transactions on Parallel and Distributed Systems, vol. 19, no. 5, pp. 710–720, 2008.
[2] J. Li and P. Mohapatra, "Analytical modeling and mitigation techniques for the energy hole problem in sensor networks," Pervasive and Mobile Computing, vol. 3, no. 3, pp. 233–254, 2007.
[3] J. Liu, M. Chu, and J. Reich, "Multitarget Tracking in Distributed Sensor Networks," IEEE Signal Processing Magazine, vol. 24, no. 3, pp. 36–46, May 2007.
[4] C. Song, J. Cao, M. Liu, Y. Zheng, H. Gong, and G. Chen, "Mitigating energy holes based on transmission range adjustment in wireless sensor networks," in Proceedings of the 5th International ICST Conference on Heterogeneous Networking for Quality, Reliability, Security and Robustness (QShine '08), 2008, pp. 32:1–32:7.
[5] A.-F. Liu, X.-Y. Wu, Z.-G. Chen, and W.-H. Gui, "Research on the energy hole problem based on unequal cluster-radius for wireless sensor networks," Computer Communications, vol. 33, no. 3, pp. 302–321, Feb. 2010.
[6] T.-h. Kim, H. Adeli, S.-Y. Choi, J.-S. Kim, S.-J. Han, J.-H. Choi, K.-W. Rim, and J.-H. Lee, Advances in Computer Science and Information Technology, ser. Lecture Notes in Computer Science, T.-h. Kim and H. Adeli, Eds. Springer Berlin Heidelberg, 2010, vol. 6059.
[7] F.-Y. Leu, W.-C. Wu, and H.-W. Huang, "A Routing Scheme with Localized Movement in Event-Driven Wireless Sensor Networks," in Proceedings of the Joint International Conferences on Advances in Data and Web Management (APWeb/WAIM '09). Berlin, Heidelberg: Springer-Verlag, 2009, pp. 576–583.
[8] J. N. Al-Karaki and A. E. Kamal, "Routing techniques in wireless sensor networks: a survey," IEEE Wireless Communications, vol. 11, no. 6, pp. 6–28, 2004.
[9] E. F. Nakamura, H. S. Ramos, L. A. Villas, H. A. de Oliveira, A. L. de Aquino, and A. A. F. Loureiro, "A reactive role assignment for data routing in event-based wireless sensor networks," Computer Networks, vol. 53, no. 12, pp. 1980–1996, Aug. 2009.
[10] L. da F. Costa, F. A. Rodrigues, G. Travieso, and P. R. Villas Boas, "Characterization of complex networks: a survey of measurements," Advances in Physics, vol. 56, pp. 167–242, 2007.
[11] L. C. Freeman, "A Set of Measures of Centrality Based on Betweenness," Sociometry, vol. 40, no. 1, pp. 35–41, Mar. 1977.
[12] E. M. R. Oliveira, H. S. Ramos, and A. A. F. Loureiro, "Centrality-based routing for Wireless Sensor Networks," in 2010 IFIP Wireless Days. IEEE, Oct. 2010, pp. 1–5.
[13] D. G. Bonett and T. A. Wright, "Sample size requirements for estimating Pearson, Kendall and Spearman correlations," Psychometrika, vol. 65, no. 1, pp. 23–28, 2000.