Energy Map: Mining Wireless Sensor Network Data

Song Ci
Department of Computer Science, University of Massachusetts Boston, Boston, MA 02169
Email: [email protected]

Mohsen Guizani
Department of Computer Science, Western Michigan University, Kalamazoo, MI 49008
Email: [email protected]

Abstract— As more and more wireless sensor networks are deployed, designing and managing these large-scale networks, especially for energy efficiency, becomes a very important issue. Current sensor network management tools offer only limited management data, such as the energy level and location of each sensor. These data are isolated snapshots in time and are not enough to reflect the dynamic nature of sensor networks. In this paper, we propose a new framework for the management of large-scale sensor networks, called Energy Map. Based on nonlinear manifold learning algorithms, the framework not only visualizes the energy level and location of each sensor in a network but also finds dynamic patterns in a large volume of sensor network data, such as which set of sensors has been consuming a significant amount of energy or how far the cluster members are from the current cluster head. All of this information is important for developing a good sensor network protocol stack, including clustering algorithms and routing protocols. Our contribution in this paper is to introduce nonlinear manifold learning methods for deriving the energy distribution of a wireless sensor network. We also show several interesting results and discuss their significance for energy-efficient designs of wireless sensor networks.
I. INTRODUCTION

With hundreds or even thousands of wireless sensors configured to communicate through sophisticated protocols, we have a very powerful platform with great potential to change remote sensing applications. The power of a Wireless Sensor Network (WSN) lies in its ability to monitor the physical environment through ad hoc deployment of numerous self-configuring sensor nodes. Wireless sensor networks are useful in a wide spectrum of applications ranging from environmental and biological monitoring to military and homeland security. Unlike traditional wired networks, WSNs are relatively simple and inexpensive to deploy. Moreover, such networks can be extended by simply adding more devices, without any rework or complex reconfiguration. The sensor nodes can ideally run for over a year on a single set of batteries. Given the cost of these sensor nodes, it is not feasible to discard dead sensor nodes, and it is also not possible to replace the batteries on these sensor nodes. Hence, there is a great need for energy-efficient protocols that can greatly reduce power consumption and increase the lifetime of wireless sensor nodes [1], [2]. Much research has been done on developing efficient hardware and software platforms for WSNs from different perspectives, such as MAC [3]–[6], routing [7]–[9], and system design [10].

On the other hand, in wireless sensor networks
abundant sensing data are collected by a large number of sensors. Processing this large, redundant body of data efficiently is critical to prolonging the life span of a wireless sensor network, because each sensor is constrained by its processing power, storage, and battery life, and the longest network life span is expected when the energy consumption of every sensor is maintained at the same level. Therefore, it is important to extract the distribution of energy consumption from the large volume of management data collected from the sensor network, and then to use the extracted information to dynamically adapt the network protocol stack for high energy efficiency.

Due to the multidimensional and nonlinear nature of the collected sensing data, post-processing these data can be very difficult and prohibitively computation-intensive. Data mining has therefore been attracting growing research effort in sensor data processing. The goal of using data mining in sensor networks is either to perform dimensionality reduction, as in data aggregation, or to discover regularities and irregularities in the network, such as network intrusions. Many techniques from multidimensional data series analysis in other scientific areas, such as nearest neighbor search, principal component analysis (PCA), and multidimensional scaling (MDS), have been adopted for mining network data. In [11], two neural-network algorithms were applied to reduce the communication cost of wireless sensor networks through data aggregation. In [12], a data aggregator using a PCA compression technique was proposed to fuse the information from multiple sensors. In [13], an expectation maximization (EM) algorithm was proposed to handle missing values while maximizing the complete likelihood.
A manifold learning (ML)-based tool for visualizing large sets of correlated data collected from the Internet was developed and discussed in [14] for network intrusion detection.

In this paper, we take a different perspective on mining sensor network data. Unlike previous work, we focus on mining the sensor data to obtain and visualize the current energy distribution in a wireless sensor network. The rationale is that for all aspects of energy-efficient sensor network design, such as routing, clustering, positioning, broadcasting, and medium access control, it is crucial to find the current residual energy distribution of the sensor network based on the historical usage information of each
3525 1-4244-0355-3/06/$20.00 (c) 2006 IEEE This full text paper was peer reviewed at the direction of IEEE Communications Society subject matter experts for publication in the IEEE ICC 2006 proceedings.
sensor. However, deriving such an energy distribution is very difficult, because a large volume of high-dimensional sensor data must be reduced to a smaller number of perceptually relevant features. Therefore, in this paper we use nonlinear dimensionality reduction algorithms based on manifold learning, such as ISOMAP [15], to extract the distribution of residual energy from the large volume of high-dimensional sensor data. Our results show that the proposed approach not only visualizes the energy level and location of each sensor in a network but also finds the dynamic energy consumption pattern in the collected sensor network data and infers much useful information from the derived pattern, such as which set of sensors is now consuming more energy than others and how close the cluster members are to the current cluster head. All of this information is important for developing a good protocol stack for wireless sensor networks. For example, when dynamic clustering algorithms are used, the derived energy pattern can be used to determine which sensor will be the best cluster head in the near future. In this paper, some interesting results are demonstrated and their significance to sensor network design is discussed.

The rest of this paper is organized as follows. In Section 2, we describe the network model assumed in this work. In Section 3, we present our preliminary results on extracting the energy distribution from a high volume of high-dimensional sensor data and discuss some insights from the results. We then show our simulation results in Section 4 and conclude in Section 5.

II. BACKGROUND

Nonlinear dimensionality reduction (NLDR) is foundational to this paper. Before going further, we briefly review the research that has been carried out in this area.
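As a concrete reference point for this review, the three steps of ISOMAP [15] (neighbourhood graph, shortest-path geodesic distances, classical MDS) can be sketched in a few lines. This is a minimal, dependency-free sketch and not the implementation used in this paper: the synthetic energy histories, the drain rates, the neighbourhood size k = 2, and all function names are assumptions introduced purely for illustration.

```python
import math
import random


def knn_geodesics(points, k):
    """Isomap steps 1-2: build a k-nearest-neighbour graph, then use
    all-pairs shortest paths (Floyd-Warshall) as geodesic distances.
    Assumes k is large enough for the graph to be connected."""
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]
    geo = [[0.0 if i == j else math.inf for j in range(n)] for i in range(n)]
    for i in range(n):
        others = sorted((j for j in range(n) if j != i), key=lambda j: dist[i][j])
        for j in others[:k]:
            geo[i][j] = geo[j][i] = dist[i][j]  # undirected edge
    for m in range(n):
        for i in range(n):
            for j in range(n):
                if geo[i][m] + geo[m][j] < geo[i][j]:
                    geo[i][j] = geo[i][m] + geo[m][j]
    return geo


def classical_mds(geo, dims, iters=200, seed=0):
    """Isomap step 3: embed the geodesic distance matrix with classical MDS.
    Leading eigenpairs are found by power iteration with deflation."""
    n = len(geo)
    sq = [[d * d for d in row] for row in geo]
    row_mean = [sum(row) / n for row in sq]
    all_mean = sum(row_mean) / n
    # Double centring: B = -1/2 * J * D^2 * J (D^2 is symmetric).
    b = [[-0.5 * (sq[i][j] - row_mean[i] - row_mean[j] + all_mean)
          for j in range(n)] for i in range(n)]
    rng = random.Random(seed)
    coords = [[0.0] * dims for _ in range(n)]
    for d in range(dims):
        v = [rng.uniform(-1.0, 1.0) for _ in range(n)]
        for _ in range(iters):
            w = [sum(bij * vj for bij, vj in zip(row, v)) for row in b]
            norm = math.sqrt(sum(x * x for x in w))
            if norm == 0.0:  # nothing left to extract
                break
            v = [x / norm for x in w]
        bv = [sum(bij * vj for bij, vj in zip(row, v)) for row in b]
        eigval = sum(x * y for x, y in zip(v, bv))
        scale = math.sqrt(max(eigval, 0.0))
        for i in range(n):
            coords[i][d] = scale * v[i]
        for i in range(n):  # deflate so the next pass finds the next eigenpair
            for j in range(n):
                b[i][j] -= eigval * v[i] * v[j]
    return coords


# Hypothetical data: 20 sensors, each described by 8 residual-energy samples
# exp(-rate * t); the drain rate is the single hidden degree of freedom.
rates = [0.01 * (i + 1) for i in range(20)]
histories = [[math.exp(-r * t) for t in range(1, 9)] for r in rates]

embedding = classical_mds(knn_geodesics(histories, k=2), dims=1)
for rate, (x,) in zip(rates, embedding):
    print(f"drain rate {rate:.2f} -> embedded coordinate {x:+.3f}")
```

On this toy input the single embedded coordinate should order the sensors by drain rate, up to an overall sign, so one degree of freedom is recovered from eight-dimensional observations. For realistic network sizes, the cubic-time shortest-path step and the simple power iteration would need to be replaced by optimized routines.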
Manifold learning has been widely used for dimensionality reduction, where the goal is to discover the degrees of freedom m from a large number of observations M, normally, m