Optimal Distributed Detection in Clustered Wireless Sensor Networks: The Weighted Median

Qingjiang Tian and Edward J. Coyle
Center for Wireless Systems and Applications, School of Electrical and Computer Engineering
Purdue University, West Lafayette, IN 47907-2035
{tianq, coyle}@ecn.purdue.edu

Abstract − In a clustered, multi-hop sensor network, a large number of inexpensive, geographically-distributed sensor nodes each use their observations of the environment to make local hard (0/1) decisions about whether an event has occurred. Each node then transmits its local decision over one or more wireless hops to the clusterhead. When all local decisions have been gathered by the clusterhead, it fuses them into a final hard decision about the event. Two sources of error affect the clusterhead's final decision: (i) local decision errors made by the sensor nodes because of noisy measurements or unreliable sensors, and (ii) bit errors affecting each hop on the wireless communication channel. Previous work assumed error-free communication or a single-hop cluster. We show that if both of these sources of error are considered, then the optimal data fusion algorithm at the clusterhead is a weighted median. The optimal weights are shown to be functions of the bit error probability of the channel and the ring from which the local decision originated. We determine: the error probability of this optimal fusion algorithm; the effect of adding more nodes or rings to the cluster; and the tradeoff between energy consumed in the network and the decision error probability. This paper thus provides tools to add the effect of measurement and communication errors to other tradeoffs in the design of clustered sensor networks.
1. INTRODUCTION

In one scenario envisioned for wireless sensor networks, many small, inexpensive nodes with sensing, processing and wireless communication capabilities are scattered over a field. They self-organize into a clustered network; sense the environment; make decisions based on their observations; and send these decisions to their clusterheads via multi-hop wireless communication. Each clusterhead (CH) fuses the decisions of the nodes in its cluster to determine a final decision for that region of the network. A clusterhead's decision may be shared with other clusterheads or forwarded to a processing center. Figure 1 shows a single multi-hop cluster in a sensor network; the CH that collects and fuses the local decisions of the sensor nodes is shown in the center.

In this paper we consider distributed binary hypothesis detection problems for clusters in a sensor network. We assume there are two hypotheses, "0" and "1", with possibly different prior probabilities, from which each node must choose. At some point in time, determined perhaps by an event, each node makes a local decision based on the sensor measurements it has been collecting and transmits this
decision bit to the CH. The CH makes the final decision based on all the decision bits it receives from the nodes in its cluster. We assume that the phenomenon being measured by the nodes in a single cluster has a correlation length that is larger than the diameter of a single cluster. In the ideal case of noise-free measurements and error-free computation, each node would then make the same decision about the phenomenon.

The decisions made by the sensor nodes and the clusterhead may, however, be incorrect. We focus on two sources of incorrect decisions: (i) local measurement and local decision errors by each node in a cluster; and (ii) decision fusion errors at the clusterhead due to communication errors that corrupt the decisions transmitted by the nodes in the cluster. For communication-induced errors, we take into account the multi-hop communication strategies used within clusters in clustering algorithms like those in [1,2]. Clearly, decisions forwarded from nodes at the outer edge of a cluster travel more hops, and are thus more vulnerable to communication error, than those from nodes within one hop of the clusterhead. An appropriate weighting of these decisions should thus depend on the number of communication hops they take.
Figure 1. An example showing MICA2 sensor dots that have self-organized into a two-hop cluster around the clusterhead CH. Each node has a transmission range that depends on many conditions in its local environment but it is generally less than 70 meters.
Given the situation described above, we derive the following results:

• When each hypothesis is equally likely, the weighted median [8-13] is optimal in the MAP sense for fusing the nodes' local hard decisions into a single, cluster-wide hard decision. The optimal weights are shown to be functions of the number of hops and the communication error probability. When the hypotheses are not equally likely, a weighted order statistic is optimal.

• The decision error probability (DEP) resulting from the optimal weighted median is determined, along with the dependence of the error exponent on the number of nodes. To demonstrate the importance of the weighting in different circumstances, we use the (unweighted) median as a benchmark.

• The tradeoff between the energy consumed collecting local decisions from the cluster's nodes and the overall decision error probability at the clusterhead is determined. The cases of additional rings (hops) in the cluster and additional nodes are both evaluated.

These results can be used in the design of clustering algorithms, such as those in [1,2], which bound the number of communication hops per cluster. This bound should be a function of: (i) the energy used by the entire network, which typically decreases as the number of clusters and the number of levels of clustering increases [1]; (ii) the correlation length expected for the phenomenon about which decisions are to be made; and (iii) the desired or acceptable probabilities of error for the decisions made by the clusterheads, which can be determined with the results in this paper.

The rest of the paper is organized as follows. In Section 2, related work is reviewed. In Section 3, we study a fixed cluster and determine the optimal decision strategy. In Section 4, based on the proposed estimator, the decision error probability for a general cluster is analyzed; then the tradeoff between the total energy consumed to collect one packet from each node in the cluster and the decision error probability is investigated. Conclusions and further work are discussed in Section 5.

2. RELATED WORK

In [3-7], a number of aspects of universal decentralized estimation problems are investigated. An excellent tutorial on this subject is provided in [3]. In [4], a single-hop sensor network is considered in which the sensor nodes are subject to a joint power constraint and must transmit data to their CH over a channel that is corrupted by additive noise. The limiting behavior that arises when the number of nodes approaches infinity is determined; it is shown that having each sensor node use the same transmission scheme is optimal. The cases of binary and analog sensor nodes are both considered. The authors of [5] determine an optimal distributed detection strategy in a single-hop network under the assumption of spatially and temporally i.i.d. observations by the sensor nodes. Different types of AWGN channels are
considered, and both soft and hard decision strategies are evaluated. In all cases, error probabilities that decay exponentially in the number of sensors are obtained. A decentralized detection problem under bandwidth-constrained communication is investigated in [6]: each sensor transmits a 1-bit message to the CH, and the CH makes the final decision by averaging the received messages. It is assumed that the noise affecting each node may be different, but the fusion center still treats all the data equally. In [7], a linear estimation scheme is proposed based on the characteristics of the sensor measurement noise; the proposed linear estimator is optimal under the MSE criterion and an error-free channel. Though the channel between sensor nodes is, in practice, error-prone, this linear estimation scheme does not account for communication errors.

There has been a long history of research on order-statistics-based nonlinear filters, such as stack filters [8,9], weighted median filters [10,11], and morphological filters [12]. In such filters, the minimum mean absolute error criterion is usually used [13]. It is shown in [9] that optimal stack filtering under the mean absolute error criterion is analogous to optimal linear filtering under the mean squared error criterion. A good tutorial on weighted median filters can be found in [10], where optimal weighted median filters are also discussed.

Various clustering algorithms for sensor networks have been proposed recently [1,2]. In [1], a multi-hop hierarchical clustering algorithm is proposed. The clustering strategy bounds the number of communication hops in each cluster and then minimizes the total energy required to collect one data packet from each of the nodes in the cluster. All clusters in this paper are assumed to have been generated by this algorithm.

3. THE OPTIMAL WEIGHTED MEDIANS FOR FIXED CLUSTERS

In this paper, we investigate sensor networks that have been organized into multi-hop clusters. We assume identical sensor nodes, which is justified within each ring by the results in [5] and throughout the cluster by the desire to use inexpensive nodes. We assume the CH knows the probability of error of each sensor node and of the communication channel. We first consider the binary detection problem when the number of nodes in each ring of the cluster is pre-determined and known by the CH [1].

Assume that each node makes a decision between two hypotheses, $s_0 = 0$ and $s_1 = 1$, where "1" might denote that an event has occurred and "0" that it has not. Assume: (i) that each sensor node makes its own decision based on noisy measurements it has made with its possibly unreliable sensors; (ii) that the probability this decision is wrong is $p_m$; and (iii) that the decision bits are spatially and temporally i.i.d. across all the sensor nodes. Each node then begins the process of transmitting its decision bit to the CH. The CH collects these local decisions and makes the final decision. Assume that the communication channel between any pair of sensor nodes and between the CH and nodes in the first hop
is a Binary Symmetric Channel (BSC). In other words, if $s$ and $r$ are the transmitted and received signals, respectively, then the probability of a transmission error is given by $p(r = s_1 \mid s = s_0) = p(r = s_0 \mid s = s_1) = p_c$, and the probability of correct reception is given by $p(r = s_0 \mid s = s_0) = p(r = s_1 \mid s = s_1) = 1 - p_c$. We assume $p_m < \frac{1}{2}$ and $p_c < \frac{1}{2}$.

Once the decision bits from every node have been received by the CH, it uses them to make a final decision based on the Maximum A Posteriori (MAP) criterion. That is, given the vector of received bits $r$, the decision bit is the signal corresponding to the maximum of the set of posterior probabilities $\{p(s_i \mid r)\}$. From this point on, we use $p(s_i \mid r)$ as an abbreviation of $p(s_i \text{ occurred} \mid r \text{ is received})$.
Theorem 1. In a one-hop cluster of $N$ nodes, suppose the correct decision is $s$, and note that each decision bit is received at the CH in error with probability $p_{e,1} = p_m(1-p_c) + p_c(1-p_m)$. Let $p = p(s = s_0)$ and define $\chi$ so that $\ln\left(\frac{1-p}{p}\right) = \chi \cdot \ln\left(\frac{1-p_{e,1}}{p_{e,1}}\right)$. Define $W = \left\lceil \frac{N}{2} + \chi \right\rceil$. Then the MAP-based decision bit at the CH is given by

$$\hat{r} = (r)_{(W)} = W\text{th order statistic of } (r_1, r_2, \ldots, r_N),$$

and the decision error probability at the CH, $p_{CH}$, is given by

$$p_{CH} = p(\hat{r} \neq s \mid s, N) = (1-p)\sum_{i=W}^{N} \binom{N}{i} (p_{e,1})^i (1-p_{e,1})^{N-i} + p \sum_{i=N-W+1}^{N} \binom{N}{i} (p_{e,1})^i (1-p_{e,1})^{N-i}. \quad (3)$$

Proof: For each of the possible transmitted bits, $s_0$ or $s_1$, the posterior probability is given by Bayes' rule: $p(s_i \mid r) = \frac{p(r \mid s_i)\, p(s_i)}{p(r)}$, where $p(r) = \sum_{i=0}^{1} p(r \mid s_i)\, p(s_i)$. Let $c$ be the number of occurrences of $s_0$ in $r$ and define $\gamma = p(s_0 \mid r)/p(s_1 \mid r)$. Since the received bits are conditionally independent, and using $\frac{p}{1-p} = \left(\frac{1-p_{e,1}}{p_{e,1}}\right)^{-\chi}$,

$$\gamma = \frac{(1-p_{e,1})^c (p_{e,1})^{N-c}\, p}{(p_{e,1})^c (1-p_{e,1})^{N-c}\, (1-p)} = \left(\frac{1-p_{e,1}}{p_{e,1}}\right)^{2c-N-\chi}.$$

The condition $\gamma > 1$ is thus equivalent to $2c - N - \chi > 0$. Thus, if $\gamma > 1$, the best decision bit is $\hat{r} = s_0$. From $2c - N - \chi > 0$, we see that $c > \frac{N}{2} + \chi$. Since $c$ must be an integer, we know that $c \geq W = \left\lceil \frac{N}{2} + \chi \right\rceil$, which leads to the $W$th order statistic of $(r_1, r_2, \ldots, r_N)$ being $s_0$. Similarly, when $\gamma < 1$, $c \leq W - 1$ and the $W$th order statistic equals $s_1$, so the MAP decision again agrees with $\hat{r} = (r)_{(W)}$. Counting the ways each error can occur under each hypothesis then yields (3). ■

Corollary 1. For any finite $N$, the estimate $\hat{r}$ is biased; it is, however, asymptotically unbiased as $N \to \infty$.

Proof: $p(\hat{r} \neq s \mid s, N) > 0$ when the number of nodes is finite. Thus, $E(\hat{r} \mid s, N) = s \cdot p(\hat{r} = s \mid s, N) + (1-s) \cdot p(\hat{r} = 1-s \mid s, N) \neq s$. The estimate is thus biased. However, since $p_m < \frac{1}{2}$ and $p_c < \frac{1}{2}$, we know that $p(\hat{r} \neq s \mid s, N) \to 0$ as $N \to \infty$. This leads to $p(\hat{r} = s \mid s, N) \to 1$ and $E(\hat{r} \mid s) \to s$ as $N \to \infty$. The estimate is thus asymptotically unbiased. ■

Corollary 2. In Theorem 1, if $p(s = s_0) = p(s = s_1) = \frac{1}{2}$, the MAP-based decision bit is $\hat{r} = \mathrm{Median}(r_1, r_2, \ldots, r_N)$ and the decision bit error probability is given by

$$p(\hat{r} \neq s \mid s, N) = \sum_{i=\lceil N/2 \rceil}^{N} \binom{N}{i} (p_{e,1})^i (1-p_{e,1})^{N-i}. \quad (4)$$

Proof: Since $p(s_i \mid r) = \frac{p(r \mid s_i)\, p(s_i)}{p(r)}$, given the condition $p(s = s_0) = p(s = s_1) = \frac{1}{2}$, selecting the maximum of the posterior probabilities $p(s_i \mid r)$ is equivalent to finding the largest conditional probability $p(r \mid s_i)$. In this case, the MAP criterion is the same as the Maximum Likelihood (ML) criterion. With equal priors, $\chi = 0$ and $W = \lceil N/2 \rceil$, so from Theorem 1 we obtain $\hat{r} = \mathrm{Median}(r_1, r_2, \ldots, r_N)$. Due to the symmetry of the decision error, both sums in (3) reduce to the same expression, and (4) follows. ■
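As an illustration of Theorem 1, the order-statistic rule can be written directly in Python. This is a sketch of our own, not code from the paper; the function name and the prior argument `p0` for $p(s = s_0)$ are ours.

```python
from math import ceil, log

def one_hop_map_decision(bits, pe1, p0=0.5):
    # Theorem 1: decide with the W-th order statistic of the N received
    # bits, where W = ceil(N/2 + chi) and
    # chi = ln((1 - p0)/p0) / ln((1 - pe1)/pe1).
    N = len(bits)
    chi = log((1 - p0) / p0) / log((1 - pe1) / pe1)
    W = ceil(N / 2 + chi)
    # The W-th smallest bit is 0 exactly when at least W bits are zero.
    return 0 if bits.count(0) >= W else 1
```

With `p0 = 0.5`, $\chi = 0$ and the rule reduces to the ordinary median of Corollary 2; for extreme priors, $W$ falls outside $[1, N]$ and one hypothesis is chosen regardless of the data, which is the correct MAP behavior.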
From Corollary 2, we see that the form of the optimal decision for the case with unequal prior probabilities is similar to that with equal prior probabilities; both are order statistics. Thus, for simplicity, we assume in what follows that the prior probabilities are equal: $p(s = s_0) = p(s = s_1) = \frac{1}{2}$.

3.1.1. The case of a large number of sensor nodes

Sensor nodes are usually densely deployed in order to obtain either high information resolution or increased energy efficiency [14]. The decision error probability when there are $N$ nodes within one hop of the CH is given by (3) and (4). In this subsection, we let the number of nodes grow large to obtain asymptotic results. Since $p_{e,1}$ is fixed, we apply the Gaussian approximation and find, for large $N$, that the decision error probability is well approximated by

$$p_{CH} = p(\hat{r} \neq s \mid s, N) = \sum_{i=\lceil N/2 \rceil}^{N} \binom{N}{i} (p_{e,1})^i (1-p_{e,1})^{N-i} \cong G\!\left(\frac{N(1-p_{e,1})}{\sqrt{N p_{e,1}(1-p_{e,1})}}\right) - G\!\left(\frac{N(\frac{1}{2}-p_{e,1})}{\sqrt{N p_{e,1}(1-p_{e,1})}}\right)$$
$$= G\!\left(\frac{\sqrt{N}(1-p_{e,1})}{\sqrt{p_{e,1}(1-p_{e,1})}}\right) - G\!\left(\frac{\sqrt{N}(\frac{1}{2}-p_{e,1})}{\sqrt{p_{e,1}(1-p_{e,1})}}\right) = \frac{1}{2}\,\mathrm{erfc}\!\left(\frac{\sqrt{N}(\frac{1}{2}-p_{e,1})}{\sqrt{2 p_{e,1}(1-p_{e,1})}}\right) - \frac{1}{2}\,\mathrm{erfc}\!\left(\frac{\sqrt{N}(1-p_{e,1})}{\sqrt{2 p_{e,1}(1-p_{e,1})}}\right), \quad (5)$$

where $G(x) = \int_{-\infty}^{x} \frac{1}{\sqrt{2\pi}} \exp(-y^2/2)\, dy$ and $\mathrm{erfc}(x) = \frac{2}{\sqrt{\pi}} \int_{x}^{\infty} \exp(-y^2)\, dy$, $x > 0$.

For a better estimate of $p_{CH}$, we apply the continuity correction in [15] to (5) and find:

$$p_{CH} = \frac{1}{2}\,\mathrm{erfc}\!\left(\frac{\sqrt{N}(\frac{1}{2}-p_{e,1}) - \frac{1}{2\sqrt{N}}}{\sqrt{2 p_{e,1}(1-p_{e,1})}}\right) - \frac{1}{2}\,\mathrm{erfc}\!\left(\frac{\sqrt{N}(1-p_{e,1}) + \frac{1}{2\sqrt{N}}}{\sqrt{2 p_{e,1}(1-p_{e,1})}}\right).$$
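As a sanity check on (4) and (5), the exact binomial DEP and its Gaussian approximation can be compared numerically. This is a sketch of ours, with illustrative function names:

```python
from math import comb, erfc, sqrt, ceil

def dep_exact(N, pe1):
    # Equation (4): DEP of the median fusion rule with equal priors.
    return sum(comb(N, i) * pe1**i * (1 - pe1)**(N - i)
               for i in range(ceil(N / 2), N + 1))

def dep_gaussian(N, pe1, continuity=True):
    # Equation (5); continuity=True applies the correction from [15].
    c = 0.5 / sqrt(N) if continuity else 0.0
    d = sqrt(2 * pe1 * (1 - pe1))
    return (0.5 * erfc((sqrt(N) * (0.5 - pe1) - c) / d)
            - 0.5 * erfc((sqrt(N) * (1.0 - pe1) + c) / d))
```

For moderate $N$ (e.g., `dep_exact(51, 0.3)` vs. `dep_gaussian(51, 0.3)`), the two should agree closely, and the agreement improves as $N$ grows (cf. Figure 2).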
From (5), the decision bit error probability converges to zero as the total number of nodes, $N$, becomes large, as shown in Corollary 1. In Figure 2, we plot the decision error probability at the CH for different numbers of nodes, $N$; the variable $p$ in the figure is shorthand for the one-hop bit error probability $p_{e,1}$. To achieve a given DEP at the CH as $p_{e,1}$ increases, more local decisions, and thus more sensor nodes, are required. These curves show the tradeoffs among the quality of a node's measurements, the quality of the communication channel, and the total number of nodes. The lower the cost of the sensor nodes, the higher the decision and communication error probabilities will be; thus, many cheap nodes may be required to obtain the same DEP at the CH that a small number of expensive nodes achieves. Results like those above allow this tradeoff to be quantified.

The figure also shows that increasing the total number of nodes, $N$, always improves the system's performance; on the other hand, as $N$ becomes large, the DEP at the CH becomes less sensitive to further increases in $N$. From (5), the first-order derivative $p_{CH}'$ with respect to $N$ is given by

$$p_{CH}' = \frac{1}{\sqrt{N}}\left[c_1 \exp\!\left(-\frac{N(\frac{1}{2}-p_{e,1})^2}{2 p_{e,1}(1-p_{e,1})}\right) - c_2 \exp\!\left(-\frac{N(1-p_{e,1})^2}{2 p_{e,1}(1-p_{e,1})}\right)\right],$$

where

$$c_1 = -\frac{\frac{1}{2}-p_{e,1}}{2\sqrt{2\pi\, p_{e,1}(1-p_{e,1})}}, \qquad c_2 = -\frac{1-p_{e,1}}{2\sqrt{2\pi\, p_{e,1}(1-p_{e,1})}}.$$

For fairly large $N$,

$$\exp\!\left(-\frac{N(\frac{1}{2}-p_{e,1})^2}{2 p_{e,1}(1-p_{e,1})}\right) \gg \exp\!\left(-\frac{N(1-p_{e,1})^2}{2 p_{e,1}(1-p_{e,1})}\right).$$

This leads to

$$p_{CH}' \approx \frac{c_1}{\sqrt{N}} \exp\!\left(-\frac{N(\frac{1}{2}-p_{e,1})^2}{2 p_{e,1}(1-p_{e,1})}\right) \quad (6)$$

and

$$-\log|p_{CH}'| = c_3 N + \frac{1}{2}\log(N) - \log|c_1|, \quad (7)$$

where

$$c_3 = \frac{(\frac{1}{2}-p_{e,1})^2}{2 p_{e,1}(1-p_{e,1})}.$$

Thus, when $N \gg \log(N)$, $\log|p_{CH}'| \approx -c_3 N$. Each additional reduction of $p_{CH}$ by 3 dB beyond this point thus requires adding an essentially fixed number of additional nodes.
Figure 2. The one-hop cluster Decision Error Probability (DEP) as the number of nodes in the cluster and the probability of error are varied. Curves compare the Gaussian approximation with simulation for $p = 0.3$, $0.4$, and $0.45$.
3.2. The Multi-Hop Cluster Case

Now consider the case of a multi-hop cluster. We still assume that the local decision error probability is $p_m$ and the one-hop communication error probability is $p_c$. At the CH, an error occurs with probability $p_{e,1} = p_c(1-p_m) + p_m(1-p_c)$ for a local decision bit transmitted from a node that is one hop from the CH. For local decision bits from nodes in the second hop, the BER at the CH is $p_{e,2} = p_{e,1}(1-p_c) + (1-p_{e,1})p_c$ because of the extra communication hop. In general, for a node that is $k$ hops from the CH, the local decision bit received at the CH has a BER of

$$p_{e,k} = p_{e,k-1}(1-p_c) + (1-p_{e,k-1})\,p_c, \quad k \geq 2.$$

This is a first-order difference equation whose solution is

$$p_{e,k} = \frac{1}{2} - \frac{1}{2}(1-2p_m)(1-2p_c)^k, \quad k \geq 1. \quad (8)$$
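A small sketch of ours illustrating (8): the closed form and the hop-by-hop recursion give the same per-hop BER.

```python
def ber_closed_form(k, pm, pc):
    # Equation (8): p_{e,k} = 1/2 - (1/2)(1 - 2 pm)(1 - 2 pc)**k.
    return 0.5 - 0.5 * (1 - 2 * pm) * (1 - 2 * pc) ** k

def ber_recursion(k, pm, pc):
    p = pc * (1 - pm) + pm * (1 - pc)        # p_{e,1}
    for _ in range(k - 1):
        p = p * (1 - pc) + (1 - p) * pc      # one more BSC hop
    return p

# Both give, e.g., p_{e,5} ≈ 0.0571 for pm = pc = 0.01 (up to rounding).
```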
In Figure 3, we plot the BER at the CH of an individual sensor node's decision bit versus the node's distance from the CH in hops. Here, we set $p_c = 0.01$ or $0.001$ and $p_m = 0.01$. It is easy to see that a node's local decision becomes more error-prone at the CH as the node moves away from the CH. As a result, local decisions from closer hops should be weighted more heavily in the CH's decision-making process; in other words, the CH should trust decisions received from inner hops more than those from outer hops.

Figure 3. Decision bit BER at the CH vs. the distance of the sensor from the CH in terms of hops. Here $p_c = 0.01, 0.001$ and $p_m = 0.01$.

What choice of weights minimizes the overall decision error probability at the CH? Before this question can be answered, some definitions are needed. Suppose the CH receives bits $r_{1,1}, r_{1,2}, \ldots, r_{1,N_1}$ from the $N_1$ nodes in the first hop, bits $r_{2,1}, r_{2,2}, \ldots, r_{2,N_2}$ from the $N_2$ nodes in the second hop, and so on. Also assume that the error probabilities are identical for decisions from the same hop and that the error probabilities for decisions from different hops are $p_{e,1}, p_{e,2}, \ldots, p_{e,K}$, respectively. Define $r_k = (r_{k,1}, r_{k,2}, \ldots, r_{k,N_k})$, $k = 1, 2, \ldots, K$. We also introduce the notation $W \diamond x$, which means that $x$ should be duplicated $W$ times; thus, $f(3 \diamond x) = f(x, x, x)$. This is standard notation in the literature on weighted order statistics; see, for example, [16].

Theorem 2. In a $K$-hop cluster, suppose the correct decision is $s$. Define $\chi_k = \ln\left(\frac{1-p_{e,k}}{p_{e,k}}\right)$, and assume these $\chi_k$'s can be scaled such that $\chi_1 : \chi_2 : \cdots : \chi_K = W_1 : W_2 : \cdots : W_K$, where the $W_k$'s are positive integers with $\gcd(W_1, W_2, \ldots, W_K) = 1$. The MAP-based decision bit is then given by

$$\hat{r} = \mathrm{Median}(W_1 \diamond r_1, W_2 \diamond r_2, \ldots, W_K \diamond r_K).$$

The decision error probability at the CH is given by

$$p(\hat{r} \neq s \mid s) = p(\hat{r} \neq s_1 \mid s_1) = \sum_{\sum_{k=1}^{K} W_k(2c_k - N_k) > 0}\ \prod_{k=1}^{K} \binom{N_k}{c_k} (p_{e,k})^{c_k} (1-p_{e,k})^{N_k - c_k},$$

where $c_k$ stands for the number of occurrences of $s_0$ in vector $r_k$, $k = 1, 2, \ldots, K$.

Proof: Similar to the proof of Theorem 1, we only need to compare $p(r_1, r_2, \ldots, r_K \mid s_i)$ for $i = 0, 1$. With $c_k$ the number of occurrences of $s_0$ in vector $r_k$, it follows that

$$p(r_1, r_2, \ldots, r_K \mid s_0) = \prod_{k=1}^{K} (1-p_{e,k})^{c_k} (p_{e,k})^{N_k - c_k}$$

and

$$p(r_1, r_2, \ldots, r_K \mid s_1) = \prod_{k=1}^{K} (1-p_{e,k})^{N_k - c_k} (p_{e,k})^{c_k}.$$

Defining $\gamma = p(r_1, r_2, \ldots, r_K \mid s_0) / p(r_1, r_2, \ldots, r_K \mid s_1)$, we get

$$\gamma = \prod_{k=1}^{K} \frac{(1-p_{e,k})^{c_k} (p_{e,k})^{N_k - c_k}}{(p_{e,k})^{c_k} (1-p_{e,k})^{N_k - c_k}} = \prod_{k=1}^{K} \left(\frac{1-p_{e,k}}{p_{e,k}}\right)^{2c_k - N_k}. \quad (9)$$

With the given condition $\chi_1 : \chi_2 : \cdots : \chi_K = W_1 : W_2 : \cdots : W_K$, we know $\ln\left(\frac{1-p_{e,k}}{p_{e,k}}\right) \Big/ \ln\left(\frac{1-p_{e,K}}{p_{e,K}}\right) = \frac{W_k}{W_K}$. That is,

$$\frac{1-p_{e,k}}{p_{e,k}} = \left(\frac{1-p_{e,K}}{p_{e,K}}\right)^{W_k / W_K}. \quad (10)$$

Substituting (10) into (9), we obtain

$$\gamma = \prod_{k=1}^{K} \left(\frac{1-p_{e,K}}{p_{e,K}}\right)^{\frac{W_k}{W_K}(2c_k - N_k)} = \left(\frac{1-p_{e,K}}{p_{e,K}}\right)^{\frac{1}{W_K}\sum_{k=1}^{K} W_k(2c_k - N_k)},$$

so

$$\gamma^{W_K} = \left(\frac{1-p_{e,K}}{p_{e,K}}\right)^{\sum_{k=1}^{K} W_k(2c_k - N_k)}. \quad (11)$$

Since $\frac{1-p_{e,K}}{p_{e,K}} > 1$ for all finite $K$, from (11) we can see that $\gamma > 1$ if and only if $\sum_{k=1}^{K} W_k(2c_k - N_k) > 0$. In this case, we know that $\hat{r} = s_0$ and the weighted median of the received vectors is $\mathrm{Median}(W_1 \diamond r_1, W_2 \diamond r_2, \ldots, W_K \diamond r_K) = s_0$. Similarly, if $\gamma < 1$, we know that $\sum_{k=1}^{K} W_k(2c_k - N_k) < 0$; this leads to $\hat{r} = s_1$, which agrees with the weighted median since $\mathrm{Median}(W_1 \diamond r_1, W_2 \diamond r_2, \ldots, W_K \diamond r_K) = s_1$. In both cases, we have shown that the optimal decision at the CH is given by $\hat{r} = \mathrm{Median}(W_1 \diamond r_1, W_2 \diamond r_2, \ldots, W_K \diamond r_K)$.

The decision bit error probability is given by $p(\hat{r} \neq s \mid s) = p(\hat{r} = s_0 \mid s_1) \cdot p(s_1) + p(\hat{r} = s_1 \mid s_0) \cdot p(s_0)$. With the symmetry of the decision bit error probability, we know that $p(\hat{r} \neq s \mid s) = p(\hat{r} = s_0 \mid s_1) = p(\hat{r} = s_1 \mid s_0)$. From (11), we know that

$$p(\hat{r} \neq s \mid s) = p(\hat{r} = s_0 \mid s_1) = p(\gamma > 1 \mid s_1 \text{ was transmitted}) = p\!\left(\sum_{k=1}^{K} W_k(2c_k - N_k) > 0 \,\Big|\, s_1 \text{ was transmitted}\right)$$
$$= \sum_{\sum_{k=1}^{K} W_k(2c_k - N_k) > 0}\ \prod_{k=1}^{K} \binom{N_k}{c_k} (p_{e,k})^{c_k} (1-p_{e,k})^{N_k - c_k}. \quad (12)\ \blacksquare$$
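To make the fusion rule concrete, here is a minimal Python sketch of Theorem 2's weighted median decision. It is our own illustration, not code from the paper; the function names are ours, and the rational scaling of the $\chi_k$'s follows the argument given in the Appendix (Python 3.9+ for `math.lcm`).

```python
from fractions import Fraction
from math import log, gcd, lcm

def hop_weights(p_e, max_den=64):
    # chi_k = ln((1 - p_e[k]) / p_e[k]); approximate each ratio
    # chi_k / chi_K by a rational, then scale to integers with gcd 1
    # (the rational-approximation step mirrors Lemma 1 in the Appendix).
    chi = [log((1.0 - p) / p) for p in p_e]
    fr = [Fraction(c / chi[-1]).limit_denominator(max_den) for c in chi]
    den = lcm(*(f.denominator for f in fr))
    w = [int(f * den) for f in fr]
    g = gcd(*w)
    return [x // g for x in w]

def ch_decision(rings, weights):
    # rings[k-1] holds the 0/1 bits received from ring k. With c_k zeros
    # among the N_k bits of ring k, decide s0 = 0 iff
    # sum_k W_k (2 c_k - N_k) > 0 -- Theorem 2's weighted median --
    # with ties broken in favor of s1.
    score = sum(w * (len(b) - 2 * sum(b)) for w, b in zip(weights, rings))
    return 0 if score > 0 else 1
```

For example, with $p_m = p_c = 0.2$ and $K = 3$, (8) gives $p_{e,k} \approx (0.320, 0.392, 0.435)$, so the weights come out roughly proportional to $\chi \approx (0.754, 0.439, 0.261)$; i.e., inner-ring decisions count about three times as much as third-ring ones.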
Equation (12) may be numerically evaluated with moderate computing resources at the CH. If the sensor nodes are densely deployed (i.e., the number of nodes in each hop becomes large), then the Gaussian approximation can be used to simplify (12) as follows:

$$p(\hat{r} \neq s \mid s) = \sum_{\sum_{k} W_k(2c_k - N_k) > 0}\ \prod_{k=1}^{K} \binom{N_k}{c_k} (p_{e,k})^{c_k} (1-p_{e,k})^{N_k - c_k}$$
$$\cong \sum_{\sum_{k} W_k(2c_k - N_k) > 0}\ \prod_{k=1}^{K} \frac{1}{\sqrt{2\pi N_k\, p_{e,k}(1-p_{e,k})}} \exp\!\left(-\frac{(c_k - N_k p_{e,k})^2}{2 N_k\, p_{e,k}(1-p_{e,k})}\right)$$
$$= \sum_{\sum_{k} W_k(2c_k - N_k) > 0} \left(\prod_{k=1}^{K} \frac{1}{\sqrt{2\pi N_k\, p_{e,k}(1-p_{e,k})}}\right) \exp\!\left(-\sum_{k=1}^{K} \frac{(c_k - N_k p_{e,k})^2}{2 N_k\, p_{e,k}(1-p_{e,k})}\right).$$

Remark: In Theorem 2, we assumed that $W_K \cdot \ln\left(\frac{1-p_{e,k}}{p_{e,k}}\right) = W_k \cdot \ln\left(\frac{1-p_{e,K}}{p_{e,K}}\right)$. This is equivalent to $\frac{1-p_{e,k}}{p_{e,k}} = \left(\frac{1-p_{e,K}}{p_{e,K}}\right)^{W_k / W_K}$, where $\frac{W_k}{W_K}$ is a rational number. If the corresponding ratio is irrational, then, since the rational numbers are dense in the reals, we can always find some rational number to approximate each $\chi_k$; a detailed proof of this fact can be found in the Appendix.
3.2.1. A comparison: the weighted median vs. the median

In Theorem 2, we proposed the weighted median of the local decision bits as the optimal decision bit for a fixed multi-hop cluster. A simpler scheme in such a network, if we do not account for communication errors, would be to ignore the multi-hop structure and just use the (unweighted) median of all the local decision bits; this is equivalent to treating the multi-hop cluster as if it were a one-hop cluster. In Fig. 4, we compare the decision error probability when the decision is computed by the weighted median of Theorem 2 with the case when it is computed by the median. We assume sensors that are uniformly distributed, so that the average number of local decisions from ring $k$ is $N_k = 5(2k - 1)$. The decision and one-hop communication error probabilities are set at $p_c = 0.1$, $p_m = 0.2$ for Fig. 4(a) and $p_c = p_m = 0.2$ for Fig. 4(b).

The weighted median and the median perform exactly the same for a one-hop cluster because they are identical in this case: Theorem 2 reduces to Theorem 1. For a multi-hop cluster, we see that the weighted median estimator outperforms the median estimator dramatically, and the improvement becomes more significant as the cluster size increases. As the cluster size increases, more local decisions arrive at the CH through multiple hops; these decisions are less reliable than the decisions from the first hop. As these unreliable decisions eventually dominate the overall set of decisions for a multi-hop cluster, the decision error probability with the median becomes larger and larger. As shown in Fig. 4(b), when the number of rings in the cluster grows beyond 3, the overall CH decision error probability actually increases if the median is used. With the weighted median, decisions from the inner hops are weighted more heavily, by duplicating them multiple times, than those from outer hops. This is the reason the weighted median estimator can perform much better than the median: it correctly accounts for the more error-prone, but still valuable, decisions from the outer hops.

Comparing Fig. 4(a) with Fig. 4(b), the weighted median provides greater improvement in cases where the communication error is larger. While the overall decision error comes from both $p_c$ and $p_m$, as seen in (8), multi-hop communication accumulates only the errors due to $p_c$. If $p_c$ is large compared to $p_m$, $\chi_k$ changes significantly from hop to hop, and this results in relatively larger $W_k$'s for smaller $k$. These larger $W_k$'s in the weighted median can compensate for the unreliable decisions from the outer rings, while the median cannot. Thus, the weighted median provides more improvement as the one-hop communication error $p_c$ increases. In the extreme case in which $p_m = 0$, the weighted median will always perform better than the median in a multi-hop cluster. A Monte Carlo check of this comparison is sketched below.
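The following Monte Carlo sketch is our own illustration of the comparison above, not code from the paper; it reuses the hypothetical helpers `hop_weights` and `ch_decision` defined earlier and assumes the ring populations $N_k = 5(2k-1)$ used for Fig. 4.

```python
import random

def simulate_dep(K, pm, pc, trials=20000):
    # Monte Carlo DEP of weighted median vs. plain median in a K-ring
    # cluster with N_k = 5(2k - 1) nodes in ring k.
    # Reuses hop_weights() and ch_decision() from the sketch above.
    pe = [0.5 - 0.5 * (1 - 2 * pm) * (1 - 2 * pc) ** k   # eq. (8)
          for k in range(1, K + 1)]
    W = hop_weights(pe)
    wm_err = med_err = 0
    for _ in range(trials):
        s = random.randint(0, 1)          # equally likely hypotheses
        rings = [[s ^ (random.random() < pe[k]) for _ in range(5 * (2 * k + 1))]
                 for k in range(K)]
        wm_err += ch_decision(rings, W) != s
        med_err += ch_decision(rings, [1] * K) != s   # unweighted median
    return wm_err / trials, med_err / trials
```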
Figure 4. Comparisons between the decision error probabilities (DEP) at the clusterhead when the Median and Weighted Median are used, as a function of the number of hops (rings) in the cluster. (a) $p_c = 0.1$, $p_m = 0.2$. (b) $p_c = 0.2$, $p_m = 0.2$.

4. SENSOR NETWORK DESIGN: ENERGY EFFICIENCY vs SYSTEM PERFORMANCE

Many different clustering strategies have been proposed in the literature [1,2]. In [1], a multi-hop clustering algorithm is proposed that minimizes the overall energy used by the processing center to collect one local decision from each sensor in the field. In this section, we investigate the tradeoff between energy consumption and network detection error probability. This provides useful guidelines for determining the number of hops to be allowed in each cluster.

Since sensor nodes are battery-powered and thus have limited transmission power, we assume that the transmission range of each sensor node is $r$. Furthermore, we assume the sensor nodes are randomly distributed over a circular area of radius $R$ with node density $\lambda$ per unit area. These nodes thus form a homogeneous spatial Poisson process: for a given region of area $A$, the number of sensor nodes in the region is a Poisson random variable with parameter $A \cdot \lambda$.

4.1. Decision Bit Error Probability in a General Multi-Hop Cluster

In Theorem 2, we determined the optimal decision based on the MAP criterion for multi-hop clusters. There we assumed that the CH received local decisions from a fixed number of nodes in each ring of the cluster. However, sensor nodes are usually randomly scattered over a field and can be assumed to form a homogeneous spatial Poisson process; the number of nodes in each ring is thus a random variable. Define $r_1 = (r_{1,1}, r_{1,2}, \ldots, r_{1,N_1})$ as the received vector of local decisions from ring 1, where $N_1$ is a Poisson random variable with parameter $\lambda_1 = \pi r^2 \lambda$. For simplicity, we abbreviate this as $N_1 \sim \mathrm{Poisson}(\lambda_1)$. Similarly, define $r_k = (r_{k,1}, r_{k,2}, \ldots, r_{k,N_k})$, $2 \leq k \leq K$, where $N_k \sim \mathrm{Poisson}(\lambda_k)$ with $\lambda_k = \pi(2k-1)r^2\lambda$. The goal is now to find the best decision the clusterhead can make when it receives these vectors of local decisions. We continue to assume, without loss of generality, that $p(s = s_0) = p(s = s_1) = \frac{1}{2}$.
Theorem 3. In a $K$-hop circular cluster, suppose the correct decision is $s$. Assume that the error probabilities are identical for decisions from the same hop and that the error probabilities for decisions from different hops are $p_{e,1}, p_{e,2}, \ldots, p_{e,K}$, respectively. Let $\chi_k = \ln\left(\frac{1-p_{e,k}}{p_{e,k}}\right)$, and assume these $\chi_k$'s can be scaled up such that $\chi_1 : \chi_2 : \cdots : \chi_K = W_1 : W_2 : \cdots : W_K$, where the $W_k$'s are positive integers and $\gcd(W_1, W_2, \ldots, W_K) = 1$. The optimal decision bit based on the MAP criterion is given by $\hat{r} = \mathrm{Median}(W_1 \diamond r_1, W_2 \diamond r_2, \ldots, W_K \diamond r_K)$. The decision error probability at the CH is approximately

$$p(\hat{r} \neq s \mid s) = \frac{1}{2}\,\mathrm{erfc}\!\left(\frac{\sum_{k=1}^{K} W_k \lambda_k (1 - 2 p_{e,k})}{\sqrt{2\left(\sum_{k=1}^{K} W_k^2 \lambda_k\right)}}\right).$$
Proof: First note that the cluster in Theorem 2 is just one realization of the general cluster considered here. From Theorem 2, the weight coefficients $W_k$ are functions of $p_{e,k}$ only and are independent of the number of decisions from each ring. We thus know that for any realization of $r_1, r_2, \ldots, r_K$, the optimal decision bit at the CH is always given by $\hat{r} = \mathrm{Median}(W_1 \diamond r_1, W_2 \diamond r_2, \ldots, W_K \diamond r_K)$.
We now calculate the decision bit error probability. Due to the symmetry of the error state space, we know that

$$p(\hat{r} \neq s \mid s) = p(\hat{r} = s_0 \mid s_1) \cdot p(s_1) + p(\hat{r} = s_1 \mid s_0) \cdot p(s_0) = p(\hat{r} = s_1 \mid s_0).$$

Assume for the moment that the correct decision is $s_0$. We know $N_1 \sim \mathrm{Poisson}(\lambda_1)$, and each local decision from the first hop reaches the CH as $s_0$ with probability $1 - p_{e,1}$, independently. Define $N_{1,s_0}$ to be the number of local decisions from the first hop that reach the CH as $s_0$; it is easy to show that $N_{1,s_0} \sim \mathrm{Poisson}(\lambda_1(1-p_{e,1}))$. Defining $N_{1,s_1}$ to be the number of decisions reaching the CH as $s_1$, we know that $N_{1,s_1} \sim \mathrm{Poisson}(\lambda_1 p_{e,1})$. Furthermore, we can show that $N_{1,s_0}$ and $N_{1,s_1}$ are independent. Defining $N_{k,s_0}$ and $N_{k,s_1}$ in a similar fashion for $2 \leq k \leq K$, we know that $N_{k,s_0} \sim \mathrm{Poisson}(\lambda_k(1-p_{e,k}))$ and $N_{k,s_1} \sim \mathrm{Poisson}(\lambda_k p_{e,k})$ are also independent random variables. One observation is that ring $k_1$ is disjoint from ring $k_2$ if $k_1 \neq k_2$, and it thus follows that $N_{k_1,s_0}$ is independent of $N_{k_2,s_0}$ and $N_{k_2,s_1}$.

Let $N_{s_0} = \sum_{k=1}^{K} W_k N_{k,s_0}$ and $N_{s_1} = \sum_{k=1}^{K} W_k N_{k,s_1}$. Note that $N_{s_0}$ and $N_{s_1}$ are no longer Poisson random variables, since each $N_{k,s_0}$ is multiplied by an integer. While we could numerically evaluate the probability mass functions of $N_{s_0}$ and $N_{s_1}$, doing so is computationally difficult. Note that $N_{k,s_0}$ can be written as $N_{k,s_0} = \sum_{i=1}^{N_k} X_{k,i}$, where $p(X_{k,i} = 1) = 1 - p_{e,k}$ and $p(X_{k,i} = 0) = p_{e,k}$. Since sensor nodes are usually densely deployed for information redundancy and network robustness, $\lambda$ is fairly large, so $N_{k,s_0}$ is the sum of a large number of random variables. Using the Central Limit Theorem [17], it is easy to show that

$$N_{s_0} \sim \mathcal{N}\!\left(\sum_{k=1}^{K} W_k \lambda_k (1-p_{e,k}),\ \sum_{k=1}^{K} W_k^2 \lambda_k (1-p_{e,k})\right) \quad\text{and}\quad N_{s_1} \sim \mathcal{N}\!\left(\sum_{k=1}^{K} W_k \lambda_k p_{e,k},\ \sum_{k=1}^{K} W_k^2 \lambda_k p_{e,k}\right).$$

Furthermore, $N_{s_0}$ is independent of $N_{s_1}$. It can thus be concluded that $N_{s_1} - N_{s_0} \sim \mathcal{N}\!\left(\sum_{k=1}^{K} W_k \lambda_k (2p_{e,k}-1),\ \sum_{k=1}^{K} W_k^2 \lambda_k\right)$ and

$$p(\hat{r} = s_1 \mid s_0) = p(N_{s_1} > N_{s_0} \mid s_0) = p(N_{s_1} - N_{s_0} > 0 \mid s_0)$$
$$= \int_{0}^{\infty} \frac{1}{\sqrt{2\pi\left(\sum_{k=1}^{K} W_k^2 \lambda_k\right)}} \exp\!\left(-\frac{\left(x - \sum_{k=1}^{K} W_k \lambda_k (2p_{e,k}-1)\right)^2}{2\sum_{k=1}^{K} W_k^2 \lambda_k}\right) dx = \frac{1}{2}\,\mathrm{erfc}\!\left(\frac{\sum_{k=1}^{K} W_k \lambda_k (1-2p_{e,k})}{\sqrt{2\left(\sum_{k=1}^{K} W_k^2 \lambda_k\right)}}\right). \quad (13)\ \blacksquare$$

In Figure 5 we provide numerical results on the decision error probability for different cluster sizes with fixed $\lambda$ and one-hop transmission error probability $p_c$. Here we assume $\lambda = 10$ and $p_c = p_m = 0.2$. From the figure we see that, for a small cluster, increasing the cluster size improves the system performance. However, for given values of $\lambda$, $p_c$ and $p_m$, beyond a certain threshold, increasing the cluster size and collecting decisions from more nodes improves system performance very little. The problem is that decisions from the outer rings of the cluster travel more hops and thus become more unreliable; in other words, as $K$ increases, more and more of the local decisions arriving at the CH are essentially random bits that are independent of the true hypothesis. Since $\mathrm{erfc}(\cdot)$ is a strictly decreasing function, with (13) we can easily determine the minimum cluster size $K$ that achieves a specified QoS (Quality of Service) for any given $\lambda$, $p_c$ and $p_m$.

Figure 5. Performance of the weighted median decision rule as a function of the number of hops in the cluster. (a) $\lambda = 10$ and $p_c = p_m = 0.2$. (b) $\lambda = 10$ and $p_c = p_m = 0.2$ (enlarged view for 10 to 30 rings).
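Since (13) is invariant to a common scaling of the weights, the $\chi_k$'s can be used directly in place of the integer $W_k$'s when evaluating it numerically. The sketch below is our own, with illustrative function names; it evaluates (13) and searches for the minimum cluster size meeting a DEP target.

```python
from math import erfc, log, pi, sqrt

def dep_cluster(K, r, lam, pm, pc):
    # Equation (13) with lambda_k = pi (2k - 1) r^2 lam; weights are
    # taken as chi_k directly, since a common scaling cancels in (13).
    pe = [0.5 - 0.5 * (1 - 2 * pm) * (1 - 2 * pc) ** k for k in range(1, K + 1)]
    chi = [log((1 - p) / p) for p in pe]
    lamk = [pi * (2 * k - 1) * r * r * lam for k in range(1, K + 1)]
    num = sum(w * l * (1 - 2 * p) for w, l, p in zip(chi, lamk, pe))
    var = sum(w * w * l for w, l in zip(chi, lamk))
    return 0.5 * erfc(num / sqrt(2 * var))

def min_rings(target_dep, r, lam, pm, pc, kmax=50):
    # Smallest K whose predicted DEP meets the target, if any.
    for K in range(1, kmax + 1):
        if dep_cluster(K, r, lam, pm, pc) <= target_dep:
            return K
    return None

# e.g. min_rings(1e-3, r=1, lam=10, pm=0.2, pc=0.2) for the Fig. 5 setting.
```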
4.2. Sensor Network Design: Energy vs Network Performance

In a sensor network, energy efficiency is one of the most important issues, since most sensor nodes are battery-powered and it is usually difficult, if not impossible, to replace their batteries. Suppose the power-loss coefficient for transmissions is $\alpha$, with $\alpha \in [2, 4]$. The power level received at a distance $d$ from the transmitter will then be $P_r(d) = c\left(\frac{P_t}{d^\alpha}\right)$, where $P_t$ is the transmission power used. From this, we can see that a multi-hop transmission scheme is usually more energy-efficient than single-hop transmission for the same level of receiver sensitivity. On the other hand, multi-hop transmissions accumulate communication errors, which reduces the overall system performance at the CH; from the viewpoint of system performance, one-hop communication is thus preferable to multi-hop communication. There is thus a tradeoff between energy efficiency and the decision error rate at the clusterhead that should be characterized.

Consider a circular cluster with radius $R$. Nodes in this cluster form a homogeneous spatial Poisson process with density $\lambda$; i.e., the total number of nodes in the field, $N$, is a Poisson random variable: $N \sim \mathrm{Poisson}(\pi R^2 \lambda)$. Assume now that a node's transmission range is $r$; the cluster size in terms of the number of hops, or rings, is then given by $K = \lceil R/r \rceil$. We further assume that the transmission power is chosen such that the one-hop communication error probability is fixed at $p_c$, and that the power required to achieve error probability $p_c$ at a distance of $r$ is proportional to $r^\alpha$; for simplicity, we take the required power to be equal to $r^\alpha$.

Assume a network routing infrastructure has been established such that it takes $k$ hops for nodes in ring $k$ to communicate with the CH. Since $N_k$, the number of nodes in ring $k$, is $\mathrm{Poisson}(\pi(2k-1)r^2\lambda)$, the total energy consumed to collect one packet from all nodes in the field is given by $E_{total} = \sum_{k=1}^{K} k N_k r^\alpha$. The average energy consumed is given by

$$E(E_{total}) = E\!\left(\sum_{k=1}^{K} k N_k r^\alpha\right) = \sum_{k=1}^{K} k\, E(N_k)\, r^\alpha = \sum_{k=1}^{K} k \pi (2k-1) r^2 \lambda\, r^\alpha = \sum_{k=1}^{K} \pi \lambda (2k-1) k\, r^{\alpha+2}. \quad (14)$$
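A one-line evaluation of (14) (our sketch; `avg_energy` is an illustrative name) that can be paired with (15) below to trace the energy/DEP tradeoff of Fig. 6:

```python
from math import pi, ceil

def avg_energy(R, r, lam, alpha):
    # Equation (14): E(E_total) = sum_k pi * lam * (2k - 1) * k * r**(alpha + 2),
    # with K = ceil(R / r) rings and per-hop transmission cost r**alpha.
    K = ceil(R / r)
    return sum(pi * lam * (2 * k - 1) * k * r ** (alpha + 2)
               for k in range(1, K + 1))

# e.g. sweep r in (1, 3, 5, 7, 9) with R = 10, lam = 6, alpha in (2, 3, 4).
```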
From (13), the average decision error probability at the CH is given by:

$$p(\hat{r} \neq s \mid s) = \frac{1}{2}\,\mathrm{erfc}\!\left(\frac{\sum_{k=1}^{K} W_k \lambda_k (1 - 2 p_{e,k})}{\sqrt{2\left(\sum_{k=1}^{K} W_k^2 \lambda_k\right)}}\right). \quad (15)$$

In (15), to account for an edge effect that may cause the outermost ring of the cluster to be only a partial ring of depth $R - (K-1)r$, we take $\lambda_k = \pi(2k-1)r^2\lambda$ for $1 \leq k \leq K-1$ and $\lambda_K = \pi(R^2 - ((K-1)r)^2)\lambda$.

In Fig. 6, the tradeoff between the energy consumed to empty the cluster and the decision error probability is shown. The cluster has radius R = 10, node density $\lambda = 6$, and $p_c = p_m = 0.2$. For each curve in the figure, the different data points correspond to different transmission ranges, as $r$ takes on the values 1, 3, 5, 7, and 9. For the same system performance requirement, a smaller path-loss coefficient $\alpha$ usually leads to less total energy consumed to empty the whole cluster. To reduce the overall energy consumed, the transmission range of each sensor node can be reduced; this results in more decisions traveling multiple hops to reach the CH, which reduces the overall system performance, as shown in Fig. 6. For $r = 1$, the energy consumed and the system performance are identical for all values of $\alpha$, because for this particular transmission range the one-hop communication energy cost is the same for all three $\alpha$'s (since $r^\alpha = 1$ for every $\alpha$).

5. CONCLUSION

In this paper, we studied the binary event-detection problem in clustered sensor networks. We proposed the weighted median as the optimal estimate of the decision bit at the CH for a multi-hop cluster. The performance of the weighted median estimator was compared with that of the median estimator, and the weighted median significantly outperforms the median. Based on the optimal weighted median strategy, the tradeoff between the total energy consumed to collect one decision from each node and the system performance at the CH was derived and analyzed. Further work includes extending the weighted median estimator to the multi-cluster case.
network is also an interesting problem, where the source may assume continuous value of certain range. 10
Figure 6. Tradeoff between the total energy consumed to empty the cluster and the decision error probability at the CH. Field radius R = 10, communication and decision error probabilities $p_c = p_m = 0.2$, and node density $\lambda = 6$. Curves are shown for $\alpha = 2, 3, 4$; the data points on each curve correspond to transmission ranges $r = 1, 3, 5, 7, 9$.
6. REFERENCES

[1] S. Bandyopadhyay and E. J. Coyle, "Minimizing Communication Costs in Hierarchically Clustered Networks of Wireless Sensors," Computer Networks, Vol. 44, No. 1, pp. 1-16, January 2004.
[2] S. J. Baek, G. de Veciana and X. Su, "Minimizing Energy Consumption in Large-Scale Sensor Networks Through Distributed Data Compression and Hierarchical Aggregation," IEEE Journal on Selected Areas in Communications, Vol. 22, No. 6, pp. 1130-1140, Aug. 2004.
[3] R. S. Blum, S. A. Kassam and H. V. Poor, "Distributed Detection with Multiple Sensors: Part II – Advanced Topics," Proceedings of the IEEE, Vol. 85, No. 1, pp. 64-79, 1997.
[4] J.-F. Chamberland and V. V. Veeravalli, "Asymptotic Results for Decentralized Detection in Power Constrained Wireless Sensor Networks," IEEE J. Select. Areas Commun., Vol. 22, No. 6, pp. 1007-1015, Aug. 2004.
[5] K. Liu and A. M. Sayeed, "Optimal Distributed Detection Strategies for Wireless Sensor Networks," Proc. 42nd Annual Allerton Conference on Communications, Control and Computing, Monticello, IL, October 2004.
[6] Z.-Q. Luo, "Universal Decentralized Detection in a Bandwidth Constrained Sensor Network," IEEE Trans. Signal Processing, 2004.
[7] J. Xiao, S. Cui, Z.-Q. Luo, and A. J. Goldsmith, "Joint Estimation in Sensor Networks under Energy Constraint," IEEE Trans. Signal Processing, to appear.
[8] P. D. Wendt, E. J. Coyle, and N. C. Gallagher, Jr., "Stack Filters," IEEE Trans. Acoust., Speech, Signal Process., Vol. ASSP-34, pp. 898-911, Aug. 1986.
[9] E. J. Coyle and J.-H. Lin, "Stack Filters and the Mean Absolute Error Criterion," IEEE Trans. Acoustics, Speech and Signal Processing, Vol. 36, No. 8, Aug. 1988.
[10] L. Yin, R. Yang, M. Gabbouj and Y. Neuvo, "Weighted Median Filters: A Tutorial," IEEE Trans. Circuits and Systems, Vol. 43, No. 3, Mar. 1996.
[11] G. R. Arce, T. A. Hall and K. E. Barner, "Permutation Weighted Order Statistic Filters," IEEE Trans. Image Processing, Vol. 4, 1995.
[12] P. Maragos and R. W. Schafer, "Morphological Filters - Part I: Their Set-Theoretic Analysis and Relations to Linear Shift-Invariant Filters," IEEE Trans. Acoust., Speech, Signal Process., Vol. ASSP-35, pp. 1153-1169, Aug. 1987.
[13] P. Bloomfield and W. L. Steiger, Least Absolute Deviations: Theory, Applications and Algorithms, Birkhäuser Boston, Inc., 1983.
[14] I. F. Akyildiz, W. Su, Y. Sankarasubramaniam and E. Cayirci, "Wireless Sensor Networks: A Survey," Computer Networks, Vol. 38, No. 4, pp. 393-422, March 2002.
[15] A. Papoulis and S. U. Pillai, Probability, Random Variables and Stochastic Processes, 4th Edition, McGraw-Hill, 2002.
[16] G. R. Arce, Nonlinear Signal Processing: A Statistical Approach, John Wiley & Sons, Inc., 2004.
[17] P. Billingsley, Probability and Measure, 3rd Edition, John Wiley & Sons, Inc., 1995.
7. APPENDIX

Lemma 1. In a $K$-hop cluster, suppose the correct decision is $s$. Assume the CH receives $N_1$ decisions from nodes in the first hop, $N_2$ decisions from nodes in the second hop, and so on. Assume that the error probabilities are identical for decisions from the same hop and that the error probabilities for decisions from different hops are $p_{e,1}, p_{e,2}, \ldots, p_{e,K}$, respectively. Let $c_k$ stand for the number of times that $s_0$ is the decision from nodes in hop $k$; obviously, $0 \leq c_k \leq N_k$. Define $\chi_k = \ln\left(\frac{1-p_{e,k}}{p_{e,k}}\right)$. There is always a set of positive integers $W_1, W_2, \ldots, W_K$ such that $\gcd(W_1, W_2, \ldots, W_K) = 1$ and $\sum_{k=1}^{K} W_k(2c_k - N_k) > 0$ if and only if $\sum_{k=1}^{K} \chi_k(2c_k - N_k) > 0$.

Proof: If all $\chi_k$'s are rational, let $\chi_k = \frac{w_k}{q_k}$, where $w_k$ and $q_k$ are positive integers. Take $W_k^1 = w_k \cdot \prod_{j \neq k} q_j$ and $W_k = \frac{W_k^1}{\gcd(W_1^1, W_2^1, \ldots, W_K^1)}$. Obviously, $\gcd(W_1, W_2, \ldots, W_K) = 1$, and $\sum_{k=1}^{K} W_k(2c_k - N_k) > 0$ is equivalent to $\sum_{k=1}^{K} \chi_k(2c_k - N_k) > 0$.

Suppose, on the other hand, that some of the $\chi_k$'s are irrational. Without loss of generality, let $\chi_1$ be irrational. We show below that we can always find a rational number $\nu_1$ such that $\sum_{k=1}^{K} \chi_k(2c_k - N_k) > 0$ if and only if $\nu_1(2c_1 - N_1) + \sum_{k=2}^{K} \chi_k(2c_k - N_k) > 0$.

Let $\nu_1 = \chi_1 + \alpha_1$ with $\alpha_1 > 0$, and consider non-negative integers $c_1, c_2, \ldots, c_K$ such that $\sum_{k=1}^{K} \chi_k(2c_k - N_k) > 0$. Since the rational numbers are dense in the reals, there exists some positive rational number $\mu_1$ such that $\sum_{k=1}^{K} \chi_k(2c_k - N_k) > \mu_1$. Since $|2c_1 - N_1| \leq N_1$, if we take $\alpha_1 \leq \frac{\mu_1}{N_1}$, we know that $|\alpha_1(2c_1 - N_1)| \leq \mu_1$. This leads to

$$\nu_1(2c_1 - N_1) + \sum_{k=2}^{K} \chi_k(2c_k - N_k) = \sum_{k=1}^{K} \chi_k(2c_k - N_k) + \alpha_1(2c_1 - N_1) > \mu_1 + \alpha_1(2c_1 - N_1) \geq \mu_1 - \mu_1 = 0. \quad (A\text{-}1)$$

For any non-negative integers $c_1, c_2, \ldots, c_K$ such that $\sum_{k=1}^{K} \chi_k(2c_k - N_k) < 0$, we know there is some positive rational number $\mu_2$ such that $\sum_{k=1}^{K} \chi_k(2c_k - N_k) < -\mu_2$. If $\alpha_1 \leq \frac{\mu_2}{N_1}$, we know that

$$\nu_1(2c_1 - N_1) + \sum_{k=2}^{K} \chi_k(2c_k - N_k) = \sum_{k=1}^{K} \chi_k(2c_k - N_k) + \alpha_1(2c_1 - N_1) < -\mu_2 + \alpha_1(2c_1 - N_1) \leq -\mu_2 + \mu_2 = 0. \quad (A\text{-}2)$$

From (A-1) and (A-2), $\alpha_1$ can be taken such that $\alpha_1 \leq \min\left\{\frac{\mu_1}{N_1}, \frac{\mu_2}{N_1}\right\}$. Thus, for any non-trivial set of integers $c_1, c_2, \ldots, c_K$: if $\sum_{k=1}^{K} \chi_k(2c_k - N_k) > 0$, we know that $\nu_1(2c_1 - N_1) + \sum_{k=2}^{K} \chi_k(2c_k - N_k) > 0$; also, $\sum_{k=1}^{K} \chi_k(2c_k - N_k) < 0$ leads to $\nu_1(2c_1 - N_1) + \sum_{k=2}^{K} \chi_k(2c_k - N_k) < 0$. Thus, $\sum_{k=1}^{K} \chi_k(2c_k - N_k) > 0$ if and only if $\nu_1(2c_1 - N_1) + \sum_{k=2}^{K} \chi_k(2c_k - N_k) > 0$.

For the other irrational $\chi_k$'s, we continue this process. We can always find rational numbers $\nu_k$ such that $\sum_{k=1}^{K} \chi_k(2c_k - N_k) > 0$ if and only if $\sum_{k=1}^{K} \nu_k(2c_k - N_k) > 0$. After that, the rational-valued numbers $\nu_k$ can easily be scaled up to obtain the integer-valued $W_k$'s. ■
After that, the rational-valued numbers ν k can easily be scaled up to get the integer-valued W k′s. ■