Distributed Detection with Censoring Sensors under Physical Layer Secrecy


Stefano Marano, Vincenzo Matta, Peter Willett, Fellow, IEEE
Abstract: We consider distributed binary detection problems in which the remote sensors of a network implement a censoring strategy to fulfill energy constraints, and the network works under the attack of an eavesdropper. The attacker wants to discover the state of the nature scrutinized by the system, but the network implements appropriate countermeasures to make this task hopeless. The goal is to achieve perfect secrecy at the physical layer, making the data available at the eavesdropper useless for its detection task. Adopting as performance metric certain Ali-Silvey distances, we characterize the detection performance of the system under physical layer secrecy. Two communication scenarios are addressed: parallel access channels and a multiple access channel. In both cases the optimal operative points from the network perspective are found. The most economic operative solution is shown to lie in the asymptote of low energy regime. How the perfect secrecy requirement affects the achievable performance, with respect to the absence of countermeasures, is also investigated.

Keywords: Distributed detection, physical layer security, censoring nodes, attack countermeasure.

I. INTRODUCTION

Distributed detection in Wireless Sensor Networks (WSNs) deals with the problem of efficient decision making based upon remotely collected observations, taking into account the communication constraints implied by data transmission [1]–[7]. As is well known, wireless networks are usually vulnerable to attacks by intruders of various nature, and there is no doubt that including security issues in the system design would be very desirable. The emerging school of thought suggests that security in WSNs has to be addressed at the physical layer [8]. A substantial body of work on physical layer security is currently available, taking a genuine information theoretical perspective, and focusing on transmission-only problems, instead of detection ones. This problem can be traced back to the seminal 1975 work by Wyner [9] on the wiretap channel, where the tradeoff between the communication rate achievable by the legitimate user and the amount of information intercepted by the wiretapper is addressed. Obviously, Wyner's viewpoint was not focused on modern wireless networks; curiously, however, his models, formalization and results form the solid basis for the modern approach to physical layer security [10]–[13].

S. Marano and V. Matta are with DIIIE, Università degli Studi di Salerno, via Ponte don Melillo I-84084, Fisciano (SA), Italy. E-mails: {marano, vmatta}@unisa.it. P. Willett is with ECE Department, U-2157, University of Connecticut, Storrs, CT 06269 USA. E-mail: willett@engr.uconn.edu. Peter Willett was supported by the Office of Naval Research under contract N00014-07-1-0055.

The reason for addressing the security issue at the physical layer is traditionally motivated by the fact that encryption and other higher-layer methods are typically unsuited to wireless networks, given the lack of infrastructure that characterizes them. However, we are interested in detection and, intriguingly, in this context the physical layer approach arises even from the opposite perspective. Indeed, suppose that the message content is either (i) perfectly secret thanks to message-encryption protocols, or (ii) inherently inaccessible due to structural limitations of the eavesdropper monitoring. Even in these cases, it is certainly very difficult for the system designer to prevent an intruder from accessing some much coarser information, such as the channel accesses and/or the transmission epochs of the messages in the network. In many scenarios of practical interest, as rough as it may be, such information is still relevant for detection purposes, especially when the number of sensors is large.

In this context, it is worth mentioning the emerging perspective referred to as the detection of information flows, and the detection of encrypted stepping-stone connections, where, similarly to our setup, only timing information is available; see the recent works by He and Tong [14]–[16]. They consider the problem of detecting a flow of information among the single units of a network, by simply having access to the time epochs of the packets at the different nodes. Also in that case, there is no possibility of having access to the message contents but, nevertheless, non-trivial detection performance is shown to be achievable even with the rough information represented by the packet epochs.

In this work we address a detection problem in an energy-constrained wireless network, under physical layer secrecy constraints.
[Figure 1: block diagram. Top panel: the i-th sensor likelihood $Y_i$ enters the sensor TX policy block, whose output reaches the fusion center directly and the eavesdropper through a degraded channel. Bottom panel: the same scheme, with an idle/busy output on the eavesdropper side.]

Fig. 1. Conceptual scheme of the addressed problem, with reference to a single sensor of the network. Top panel. The sensor implements a certain transmission rule to comply with its energy limitations. The attacker (eavesdropper) accesses a degraded version of the data available to the fusion center. Bottom panel. The above general scheme is specialized to our model. Each sensor sends the observed likelihood (outer linear region), or sends nothing (inner zero level). On the eavesdropper side, a binary random variable representing the idle/busy state is available at the output of its degraded channel.

We do not adhere to an information theoretical viewpoint, and instead elaborate on the commonly accepted paradigm of censoring sensors [17], [18] for energy-efficient distributed detection [19], [20] in sensor networks, by introducing the additional requirement for the system to be secure at the physical layer. This is done in the following way. There is a communication channel connecting the sensors to a central unit (fusion center), and the aim of the latter is not to recover the remotely observed data, but to carry out a binary hypothesis test about the state of the nature that underlies the statistics of the observations. We shall consider both the cases of Parallel Access Channels (PAC) and of a Multiple Access Channel (MAC). In the former, each remote sensor is connected to the fusion center by a dedicated (and ideal) channel. In the latter, neighboring sensors are forced to share the same communication resource; we still assume that the common channel is ideal (in the sense of being noiseless), and focus on
the collision MAC with a simple slotted ALOHA protocol. The relevance of this model for detection in WSNs on a collision MAC is grounded in the recent cross-layer perspective adopted in [21].

As to the security aspects, we assume that an unauthorized intruder wants to make inference about the state of the nature exactly as the network does. Such an attacker senses the channel by monitoring the transmission activity of the sensors. It can perfectly discern between the idle and the busy state of the channel, and the collection of the transmission activities of the remote nodes constitutes the only set of data available to the eavesdropper to take its own decision. Therefore, the WSN adopts proper countermeasures to make the intruder data (idle/busy channel) useless for inference. This amounts to a perfectly secret system, which is the best operative modality for the network.

As said, the remote units of the system sense the environment and send messages to the fusion center. However, this data delivery does not come for free. The nodes of WSNs are usually severely constrained in terms of their on-board power supplies, so that the energy spent for delivering data to the central unit is a serious constraint for the system designer [20]. According to the viewpoint pioneered by [17], we adopt as measure of the cost the transmission probability $\beta$, which is proportional to the per-sensor energy consumption, see also [18], [19], [22], [23]. This model is suitable for WSNs where the overhead usually associated with the communication task (channel access, connection setup, security/encryption, if any, etc.) is not negligible, and where each time a sensor wakes up attempting to transmit, a significant amount of energy is spent.

Note that, as standard in similar studies, our model obeys the perfect knowledge paradigm: the eavesdropper knows that
the network is energy-constrained and implements the best censoring strategy and, on the other side, in adopting the proper countermeasures the network is aware of the operating modalities of the eavesdropper (i.e., its message-counting capability).

Figure 1 offers a conceptual scheme of the problem addressed, with reference to a single sensor of the network. The picture describes a single dedicated channel between the considered sensor and the fusion center, for the PAC architecture. On the other hand, it is also intended as representative of the MAC case, once the sensor's transmission attempt has been successful. In the top panel, the general viewpoint of an intruder less powerful than the fusion center is emphasized by the presence of a degraded channel, while the energy constraints that each sensor must obey are embodied in the transmission policy block. In the bottom panel we expand the above general scheme: The binary output of the degraded channel models the idle/busy state; the sensor transmission policy amounts to sending the observed likelihood (outer linear region), or sending nothing (inner zero level).

We finally mention that, in the present paper, the detection performance metric is chosen within a sub-class of the Ali-Silvey distances, whose relevance for inference purposes is discussed in [24]. As a consequence, we shall see that a divergence-cost function naturally arises as one of our main analysis tools. Such a function, and the way we look at it, mirrors the well-known capacity-cost function that arises in the study of communication channels [25], [26]. Along the same line, the notion of divergence per unit cost, to be introduced shortly, is directly inspired by the concept of capacity per unit cost, introduced and popularized by [27].
Summarizing, the main actors on the scene are: (i) the performance metric, namely the divergence, say $D$; (ii) the energy consumption, quantified through the transmission probability $\beta$; (iii) the physical layer secrecy of the system with respect to unauthorized intruders having access to the network transmission activity. The goal of the designer is to maximize the overall divergence measured at the fusion center, with a constraint on the energy spent by the sensors, and with adequate protection against the intruder.

A. Main results

This paper formalizes the problem of an energy-constrained WSN engaged in a detection task, that we want to make perfectly secret against detection attempts made by an unauthorized eavesdropper. For the PAC scenario, we introduce and characterize the divergence-cost function $D(\beta)$, that quantifies the system performance under perfect secrecy. It is proved that $D(\beta)$ is strictly increasing and strictly concave over the interval $(0, 1)$. We also show that, for prescribed cost $\beta$, each sensor transmits only if its (local) likelihood ratio lies outside a no-send region in the form $(\gamma_l(\beta), \gamma_u(\beta))$, and there exists a unique pair of censoring thresholds $\gamma_l(\beta)$, $\gamma_u(\beta)$, with $\gamma_l(\beta) \le 1$ and $\gamma_u(\beta) \ge 1$. These functions are shown to be monotonic, and in the limit of vanishingly small $\beta$, the two thresholds converge to the extremes of the likelihood-pdf
support. This corroborates and complements similar previous findings in the context of censoring, see e.g., [17]. Once the divergence-cost function has been characterized, we introduce the divergence per unit cost
$$\bar{D} = \sup_{\beta} \frac{D(\beta)}{\beta},$$
whose operational meaning is that of characterizing the most economic way to convey information specifically suited for the detection task. It is shown that the divergence per unit cost is attained in the limit of vanishingly small per-sensor energy¹: $\beta \to 0$. This also reveals that the most economic way to deliver information for detection amounts to using more and more sensors, with vanishing energy expense for each sensor.

For the MAC scenario, we specifically refer to a simple ALOHA collision protocol in which $m$ sensors share a common channel. For a fixed $\beta$, larger values of $m$ give a potential growth of information in the system but, at the same time, increase the probability of collisions. We therefore take a cross-layer perspective in order to regulate the transmission activities. In the limit of $m \to \infty$, we prove that the optimal operational point of the network is $\bar{D}/e$, thus providing further operational meaning to the divergence per unit cost.

The remainder of this paper is organized as follows. Section II, which addresses the PAC scenario, formalizes and solves the problem. In Sect. III the same tools are exploited for a multiple access environment. Section IV summarizes the main findings, while some mathematical derivations are postponed to the appendix.

II. SECURE DETECTION OVER PAC

We consider a detection problem in which the remote units (nodes) of a WSN sense the surrounding environment and collect data to be delivered to a fusion center that must make the final decision about a binary hypothesis test of $H_0$ against $H_1$. The data collected at the remote nodes are assumed continuous random variables $X_i$, $i = 1, 2, \dots, n$, where $n$ is the number of sensors. These variables are independent and identically distributed (iid) under both the hypotheses, a simplifying assumption often adopted in similar studies, and generally accepted as a reasonable compromise between the accuracy of the mathematical model and its analytical tractability [1].
¹ The capacity per unit cost studied by Verdú [27] shares the same property, when the channel has a free input symbol.

We denote by $f_1(x)$ and $f_0(x)$ the probability density functions corresponding to hypotheses $H_1$ and $H_0$, respectively. Further define the likelihood ratio $Y_i = f_1(X_i)/f_0(X_i)$, whose pdfs under the hypotheses will be denoted by $p_1(y)$ and $p_0(y)$. It is immediate to recognize that the $Y_i$ are non-negative iid random variables. Whenever needed, we assume that $p_1(y)$ and $p_0(y)$ are sufficiently regular and smooth functions; in particular, they are continuous and well-behaved over the whole support, and there is no point-mass under either hypothesis. All the following analysis refers to the case that the common support of $p_1(y)$ and $p_0(y)$ is a connected set (line segment) defined
as $(y_l, y_u)$, where clearly it is not excluded that $y_l = 0$ and/or $y_u = +\infty$.

As to the fusion center, it receives the vector of messages from the sensors and computes the global likelihood, say $Y_{fc}$, which is used for a likelihood ratio test. This latter, as is well known, ensures the best performance according to many commonly adopted optimization criteria (e.g., Neyman-Pearson, Bayesian, minimax) [28]. As a proxy of the detection performance, we may refer to the class of Csiszár $f$-divergences [24], [29], that is $E_{H_0}[C(Y_{fc})]$, where $C(y)$ is a continuous, convex, real-valued function defined for $y > 0$, with $C(1) = 0$. Throughout the paper, however, we restrict the analysis only to additive metrics with strictly convex $C(y)$. This implies that the overall divergence perceived at the fusion center is simply the sum of the divergences measured at the remote nodes, and also that, thanks to Jensen's inequality, the divergence is zero if and only if $f_1(x) = f_0(x)$, but for zero-measure sets. The two Kullback-Leibler numbers $D_{10}$ (set $C(y) = y \log y$; logarithms are to base $e$) and $D_{01}$ (set $C(y) = -\log y$), and the J-divergence (the sum of the former two), fall in this class. The operative meaning of $D_{01}$ and $D_{10}$ is directly implied by Stein's lemma [30], [31].

As to the remote nodes, we assume that each sensor has two possible alternatives: To transmit the observed likelihood $y_i$ or to stay silent. The result is that it may be convenient for the sensors not to deliver those observations that appear less informative for discriminating between $H_0$ and $H_1$. This concept is quantified by the locally computed likelihood ratio $y_i$: Sensors whose observations give very small or very large likelihoods send data (i.e., the likelihoods $y_i$ themselves) to the fusion center, while the others stay silent, thus saving energy. Of course, this is the standard operative modality of WSNs with censoring nodes [17]–[19], [22].
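As a small numerical illustration of the metric $E_{H_0}[C(Y)]$ and of its additivity over independent sensors, the following sketch evaluates it for a pair of toy discrete pmfs. The pmfs, the function names and the use of a discrete alphabet are assumptions made only for this example (the paper works with continuous observations); the two generating functions are the ones named in the text, $y \log y$ and $-\log y$.

```python
import math
from itertools import product

def f_divergence(p0, p1, C):
    """E_{H0}[C(Y)] with Y = p1/p0, for two pmfs on the same finite alphabet."""
    return sum(a * C(b / a) for a, b in zip(p0, p1))

def C_ylogy(y):
    """Generating function y*log(y): yields the Kullback-Leibler number D(H1||H0)."""
    return y * math.log(y)

def C_neglog(y):
    """Generating function -log(y): yields the Kullback-Leibler number D(H0||H1)."""
    return -math.log(y)

def product_pmf(p, n):
    """Joint pmf of n iid copies of a variable with pmf p."""
    return [math.prod(t) for t in product(p, repeat=n)]

# Hypothetical toy pmfs under H0 and H1 (not taken from the paper).
p0 = [0.5, 0.3, 0.2]
p1 = [0.2, 0.3, 0.5]

d_single = f_divergence(p0, p1, C_neglog)
d_pair = f_divergence(product_pmf(p0, 2), product_pmf(p1, 2), C_neglog)

# Direct Kullback-Leibler computation, for comparison.
kl_direct = sum(a * math.log(a / b) for a, b in zip(p0, p1))

print(d_single, d_pair, kl_direct)
```

Under these assumptions the two-sensor value is twice the single-sensor one, which is the additivity property exploited in the text, and the divergence vanishes when the two pmfs coincide.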
The sending region of the $i$th sensor will be denoted by $\mathcal{R}_i$, and the corresponding transmission probabilities by $\beta_j(i) = \Pr(y_i \in \mathcal{R}_i \mid H_j)$, $j = 0, 1$. We impose a balanced constraint such that both costs $\beta_0(i)$ and $\beta_1(i)$ at each sensor do not exceed a common value $\beta$. The local divergence perceived at each sensor is
$$D_i = \int_{\mathcal{R}_i} p_0(y)\,C(y)\,dy + \big(1-\beta_0(i)\big)\, C\!\left(\frac{1-\beta_1(i)}{1-\beta_0(i)}\right), \tag{1}$$
and, according to the above discussion, the goal of the designer is that of maximizing the global divergence $\sum_{i=1}^{n} D_i$ with the per-sensor constraint $\max\{\beta_0(i), \beta_1(i)\} \le \beta$, $\forall i$.

The eavesdropper has no access to the data received by the fusion center, but it can monitor the transmission activity of the sensors. In this respect, its detection capabilities are those of testing the binary pmfs corresponding to the idle/busy channel information. The divergence achieved by the eavesdropper is accordingly
$$D_{eav} = \sum_{i=1}^{n} \left[ \beta_0(i)\, C\!\left(\frac{\beta_1(i)}{\beta_0(i)}\right) + \big(1-\beta_0(i)\big)\, C\!\left(\frac{1-\beta_1(i)}{1-\beta_0(i)}\right) \right].$$
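The eavesdropper's idle/busy divergence can be evaluated with a few lines of code. The sketch below is only illustrative: the function names and the numerical cost values are assumptions, and the generating function is $C(y) = -\log y$, one of the Kullback-Leibler choices mentioned in the text.

```python
import math

def C_neglog(y):
    """Generating function -log(y), one of the Kullback-Leibler choices."""
    return -math.log(y)

def eavesdropper_divergence(beta0, beta1, C):
    """Sum over sensors of the divergence between the idle/busy pmfs.

    beta0[i] and beta1[i] are the transmission probabilities of sensor i
    under H0 and H1, each assumed to lie strictly between 0 and 1.
    """
    total = 0.0
    for b0, b1 in zip(beta0, beta1):
        total += b0 * C(b1 / b0) + (1 - b0) * C((1 - b1) / (1 - b0))
    return total

n = 5  # hypothetical number of sensors

# Equal costs under the two hypotheses: the idle/busy data carry no information.
secret = eavesdropper_divergence([0.3] * n, [0.3] * n, C_neglog)

# Unequal costs: the eavesdropper gains some detection capability.
leaky = eavesdropper_divergence([0.2] * n, [0.4] * n, C_neglog)

print(secret, leaky)
```

With equal costs under the two hypotheses every argument of $C$ equals 1, so the divergence is zero; with unequal costs it is strictly positive.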

To take into account the different constraints imposed on our system, let us consider the following optimization problem:
$$\text{maximize } \sum_{i=1}^{n} D_i \quad \text{subject to} \quad \max\{\beta_1(i), \beta_0(i)\} \le \beta \ \ \forall i, \qquad D_{eav} \le \delta. \tag{2}$$
We note explicitly that, as to the energy constraint $\beta$, different choices have received attention in the relevant literature. For instance, one could impose the constraint only under $H_0$, that is, $\beta_0 \le \beta$, motivated by physical considerations, as reported in [17].

With respect to (2), it is understood that the specific case in which the intruder is completely blind deserves particular attention. As a matter of fact, in the wiretap channel literature it is typical to consider the maximum degree of information achieved by the main user (fusion center, in our terminology), while the information of the eavesdropper is exactly zero. This is commonly referred to as a regime of perfect secrecy [9]: We want to maximize the divergence seen at the fusion center subject to $\max\{\beta_1(i), \beta_0(i)\} \le \beta$, $\forall i$, and $\delta = 0$. Addressing this issue is the main theme of this work.

It is worth noting that designing the system to get the maximum divergence compatible with only an energy limitation would in general give the intruder some detection capability. To avoid this, we are here faced with a maximization on the subspace allowing perfect secrecy. This reduction of the search space, which arises as a natural consequence of the physical layer security issue, has the additional benefit of simplifying the mathematical analysis, providing clean analytical results, as detailed in the forthcoming sections.

A. Divergence-cost function

In order to ensure perfect secrecy, $\delta = 0$, we must impose $\beta_0(i) = \beta_1(i)$, $\forall i$. The optimization of the global divergence in eq. (2) can be accordingly recast as
$$\text{maximize } \sum_{i=1}^{n} D_i \quad \text{subject to} \quad \beta_1(i) = \beta_0(i) \le \beta \ \ \forall i.$$
Due to the additive structure of the objective function, and the balanced per-sensor cost used here, the problem decouples such that it suffices to maximize the local divergence $D_i$, yielding:
$$D(\beta) \triangleq \max_{\mathcal{R}:\ \beta_1 = \beta_0 \le \beta} \int_{\mathcal{R}} p_0(y)\,C(y)\,dy, \tag{3}$$
where index $i$ is omitted, and the dependence of the maximum available divergence on the maximum available cost $\beta$ has been made explicit. Note that the second term in eq. (1) disappears due to the condition $\beta_0(i) = \beta_1(i)$.

The function defined in (3) will be referred to as divergence-cost function and plays a major role in our analysis: under perfect secrecy each sensor contributes by an amount of $D(\beta)$ to the overall detection performance, which therefore amounts to $nD(\beta)$.

Before stating the main theorem pertaining to the PAC case, we need to characterize the censoring regions. Recall that we want to maximize the detection performance of the WSN, with a prescribed constraint on the energy spent by the sensors (transmission probability $\beta$), and also ensuring that the intruder is given no chance to infer the state of the nature ($H_0$ or $H_1$). This setting is formalized by the optimization problem (3), where the search is carried out over all possible sending regions $\mathcal{R}$. However, in the standard censoring literature it is shown that the optimal censoring strategy amounts to activating the transmission of the local likelihoods $y_i$ falling outside a certain singly-connected region. This property holds for a large class of performance figures, including those here considered, and is a consequence of the convexity of the function $C(y)$, see [17, Th. 3]. For all $\beta$, the maximum divergence is thus always obtained with a no-send region which is a single interval, say $y \in [\gamma_l, \gamma_u]$. Accounting for the above property, the costs become
$$\beta_j = \int_{y_l}^{\gamma_l} p_j(y)\,dy + \int_{\gamma_u}^{y_u} p_j(y)\,dy, \qquad j = 0, 1,$$
and the objective function to be maximized is
$$\int_{y_l}^{\gamma_l} p_0(y)\,C(y)\,dy + \int_{\gamma_u}^{y_u} p_0(y)\,C(y)\,dy.$$
The following lemma characterizes the censoring regions.

LEMMA. Let $\beta \in (0, 1)$ be the energy constraint. The unique pair of censoring thresholds $(\gamma_l(\beta), \gamma_u(\beta))$ is such that $\gamma_u(\beta)$ is a strictly decreasing function ranging from $y_u$ to 1, while $\gamma_l(\beta)$ is a strictly increasing function ranging from $y_l$ to 1.

The proof is given in the appendix. We are now ready to state the main result of this section.

THEOREM 1. The divergence-cost function $D(\beta)$ is strictly increasing and strictly concave, for $\beta \in (0, 1)$.

The proof is given in the appendix. The typical shape of $D(\beta)$ is shown in Fig. 2.

[Figure 2: plot of $D(\beta)$ versus $\beta$.]

Fig. 2. Typical behavior of the divergence-cost function $D(\beta)$.

Remark 1. The concavity of $D(\beta)$ has important implications. For example, let us suppose that a fraction $\lambda$ of the sensors works with a cost constraint $\beta'$, and the remaining fraction $\bar{\lambda} = 1 - \lambda$ works with constraint $\beta''$. With $\lambda\beta' + \bar{\lambda}\beta'' = \beta$ an overall cost of $\beta$ is achieved, and the total divergence results in $\lambda D(\beta') + \bar{\lambda} D(\beta'')$. The strict concavity of the divergence-cost function implies that this latter quantity is less than $D(\beta)$, which
can be attained by imposing one and the same constraint $\beta$ on all the sensors. More generally, there is no advantage in working with heterogeneous cost constraints.

Remark 2. An immediate way to obtain a perfectly secret system would be to make the sensors' transmission probability independent of the measured data. Suppose for instance that no censoring is imposed and that each node simply delivers its local likelihood with an a-priori fixed probability $\beta$. The average performance of the system is $n\beta D(1)$, because $n\beta$ sensors are active (on the average) and the contribution of uncensored delivery amounts to the divergence-cost function, when the cost is set to its upper limit 1. Clearly, such a dumb scheme obeys a cost constraint $\beta$. Exactly the same constraint, however, is attained by using the censored strategy which results, as we have seen, in a detection performance of $nD(\beta)$. Now, a simple consequence of Theorem 1 is that $D(\beta)/\beta$ decreases with $\beta$, which implies $D(\beta) > \beta D(1)$, meaning that the less clever approach of a blind limitation of the sensors' transmissions is by a large margin sub-optimal. It is easily seen that the same remains true also if one obtains the desired cost $\beta$ by transmitting with probability $\pi$ the likelihoods censored at level $\beta'$. More explicitly, suppose that we design the censoring strategy in order to obtain a per-sensor performance of $D(\beta')$, which corresponds to a cost $\beta' > \beta$. Then, to fit the cost constraint, the transmission is actually enabled (irrespective of the data) with a probability $\pi$ such that $\pi\beta' = \beta$. The performance of such a scheme is $\pi D(\beta')$, which is less than $D(\beta)$, since $\pi D(\beta') = \beta D(\beta')/\beta' < D(\beta)$. Therefore, Theorem 1 ensures that no advantage can be obtained.

Remark 3. We explicitly note that the proved properties of the divergence-cost $D(\beta)$ rely essentially on the convexity of $C(y)$, that is, they hold true for the whole class of Csiszár $f$-divergences [29].
What, instead, fails to be true in general, is that the maximization of the overall divergence available at the fusion center is achieved by maximizing the local divergence as in eq. (3). Two points should be emphasized. On one hand, focusing on $D(\beta)$ as a global performance metric certainly makes sense for all the separable $f$-divergences [32], not necessarily only the additive ones. On the other hand, even when the global optimization does not decouple, maximizing, e.g., the sum of the local divergences may be appealing for practical reasons, especially when the number of sensors is large, see [19].

B. Divergence per unit cost

We define the divergence per unit cost as the best value of the ratio $D(\beta)/\beta$:
$$\bar{D} \triangleq \sup_{\beta} \frac{D(\beta)}{\beta}. \tag{4}$$
This quantity mirrors the capacity per unit cost popularized by Verdú in his seminal work [27], whose definition relies upon the well-known capacity-cost function (see [26]), exactly as we define $\bar{D}$ through $D(\beta)$.

The characterization of the divergence per unit cost $\bar{D}$ is now in order. Recalling definition (4), since $D(\beta)$ is nonnegative, strictly concave, and $D(0) = 0$, we get
$$\bar{D} = \lim_{\beta \to 0} \frac{D(\beta)}{\beta}, \tag{5}$$
provided that this latter limit exists (finite or infinite). Now, while $D$ is a measure of the detection performance of the WSN per single sensor, $\bar{D}$ characterizes the performance per unit cost, and from eq. (5) we see that the most economic way to deliver information for discriminating between the two hypotheses is that of working in the limit of vanishing $\beta$.

As $\beta$ goes to zero, the no-send region coincides with the support of the likelihood's pdf, implying that, for any fixed $n$, the overall divergence goes to zero: Delivering zero energy cannot provide any successful transfer of information, as we expect. More interesting is the case that $\beta \to 0$ and $n \to \infty$, in such a way that the product $n\beta$ stays constant. Assuming for simplicity that $\beta = 1/n$, the overall system performance is
$$\lim_{n \to \infty} n D(\beta) = \lim_{\beta \to 0} \frac{D(\beta)}{\beta}, \tag{6}$$
so that $\bar{D}$ rules the asymptotic detection capabilities of the WSN.

The divergence per unit cost $\bar{D}$ can be evaluated by using the first derivative
$$D'(\beta) \triangleq \frac{dD(\beta)}{d\beta},$$
since $\lim_{\beta \to 0} D(\beta)/\beta = \lim_{\beta \to 0} D'(\beta)$. From the explicit expression of $D'(\beta)$ in eq. (16) of the appendix, by recalling that, as $\beta$ vanishes, the two censoring thresholds tend to the extremes of the pdf's support, we can work out the required limit. First, assuming that $\lim_{\beta \to 0} D'(\beta)$ is finite, we get
$$\bar{D} = \frac{1 - y_l}{y_u - y_l}\, C(y_u) + \frac{y_u - 1}{y_u - y_l}\, C(y_l). \tag{7}$$
On the other hand, it may also happen that the above limit is infinite when, e.g., $y_u = \infty$ and/or $y_l = 0$. For instance, let $y_l = 0$ and $y_u = \infty$ and consider
$$C_{10}(y) = y \log y, \qquad C_{01}(y) = \log \frac{1}{y},$$
namely, the two Kullback-Leibler distances $D(H_1 \| H_0)$ and $D(H_0 \| H_1)$, respectively [31]. For vanishingly small $\beta$, we get
$$D'_{10}(\beta) \sim \log \gamma_u(\beta) \quad \text{and} \quad D'_{01}(\beta) \sim \log \frac{1}{\gamma_l(\beta)},$$
which both diverge for $\beta \to 0$.

C. Examples
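As a quick numerical companion to eq. (7), the sketch below evaluates the finite-limit expression of the divergence per unit cost for a given support $(y_l, y_u)$ and generating function $C$. The function names and the numerical supports are assumptions chosen for illustration; letting $y_u$ grow mimics an unbounded support.

```python
import math

def C_ylogy(y):
    """Generating function y*log(y), associated with D(H1||H0)."""
    return y * math.log(y)

def C_neglog(y):
    """Generating function -log(y), associated with D(H0||H1)."""
    return -math.log(y)

def divergence_per_unit_cost(C, y_l, y_u):
    """Right-hand side of eq. (7), for a support (y_l, y_u) with 0 < y_l < 1 < y_u."""
    span = y_u - y_l
    return (1 - y_l) / span * C(y_u) + (y_u - 1) / span * C(y_l)

# Bounded (hypothetical) support: both Kullback-Leibler choices give finite values.
bounded_01 = divergence_per_unit_cost(C_neglog, 0.5, 2.0)
bounded_10 = divergence_per_unit_cost(C_ylogy, 0.5, 2.0)

# Letting y_u grow with y_l = 1/2: the -log(y) case settles near log 2,
# while the y*log(y) case keeps growing.
growing_01 = [divergence_per_unit_cost(C_neglog, 0.5, y_u) for y_u in (1e3, 1e6, 1e12)]
growing_10 = [divergence_per_unit_cost(C_ylogy, 0.5, y_u) for y_u in (1e3, 1e6, 1e12)]

print(bounded_01, bounded_10)
print(growing_01)
print(growing_10)
```

The second and third printed lines suggest how, for a fixed lower extreme, one Kullback-Leibler number can have a finite divergence per unit cost while the other diverges as the upper extreme of the support grows.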

The characterization provided in the previous sections is now exploited to address study cases of relevance for practical applications. On one hand, (i) this allows checking the predicted behavior of the divergence function $D(\beta)$ and the relevant properties of the divergence per unit cost; on the other hand, (ii) inspection of particular examples will help in understanding how much is lost by imposing the secrecy requirement.

It seems quite natural to address the classical Gaussian shift-in-mean problem, and we did so for different problem parameters (e.g., different signal-to-noise ratios). However, as far as we can tell, the evaluation of the censoring thresholds needed to work out the divergence-cost function requires numerical computations. It turns out that little physical/general insight is obtained, and we accordingly choose not to detail the numerical evidence, limiting ourselves to summarizing the main findings. As to point (i), the expected properties of $D(\beta)$ have of course been verified. More importantly, as to issue (ii), we compared $D(\beta)$ with the divergence corresponding to the case that the perfect secrecy constraint is relaxed, say $D_{unc}(\beta)$. For the Gaussian case with the elected symmetric cost constraint $\max\{\beta_0, \beta_1\} \le \beta$, we found, for different signal-to-noise ratios, that the maximum divergence is always obtained for equal costs $\beta_0 = \beta_1$: this implies that nothing is lost by imposing the secrecy constraint. As a further check, we also considered the asymmetric cost constraint imposed only under the null hypothesis, that is $\beta_0 \le \beta$, and, as expected, for this case $D_{unc}(\beta) > D(\beta)$, and the equality is lost.

From the previous numerical evidence, it seems that the absence of loss in working with $D_{eav} = 0$ is related to the symmetries of both the Gaussian shift-in-mean detection problem and the imposed cost constraint, while, in general, we expect that something should be paid by requiring perfect secrecy. This pushes us to investigate a different case of practical interest, that is, the well-known exponential shift-in-scale problem. We assume
$$f_0(x) = e^{-x}\, u(x), \qquad f_1(x) = \frac{1}{1+\alpha}\, e^{-\frac{x}{1+\alpha}}\, u(x),$$
where $u(x)$ is the unit-step function. The corresponding likelihood ratio is thus computed as
$$y = \frac{f_1(x)}{f_0(x)} = \frac{1}{1+\alpha}\, e^{\frac{\alpha}{1+\alpha} x}\, u(x),$$
such that the extremal points are $y_l = (1+\alpha)^{-1}$ and $y_u = \infty$. Referring for simplicity to $z = \log y$, it is easily recognized that the log-likelihood $z$ is a shifted exponential random variable, that is
$$q_0(z) = \frac{1}{a_0}\, e^{-\frac{z - \log y_l}{a_0}}\, u(z - \log y_l), \qquad q_1(z) = \frac{1}{a_1}\, e^{-\frac{z - \log y_l}{a_1}}\, u(z - \log y_l),$$
with $a_0 = \alpha/(1+\alpha)$, and $a_1 = \alpha$. The costs $\beta_0$ and $\beta_1$ become
$$\beta_0 = 1 - \left(\frac{y_l}{\gamma_l}\right)^{1+1/\alpha} + \left(\frac{y_l}{\gamma_u}\right)^{1+1/\alpha} \tag{8}$$
and
$$\beta_1 = 1 - \left(\frac{y_l}{\gamma_l}\right)^{1/\alpha} + \left(\frac{y_l}{\gamma_u}\right)^{1/\alpha}. \tag{9}$$
We solve numerically the equations $\beta_1 = \beta_0 = \beta$, getting the values of the thresholds $\gamma_l(\beta)$ and $\gamma_u(\beta)$.

Let us now consider the case of a Kullback-Leibler distance between $H_1$ and $H_0$, i.e., $C(y) = y \log y$. Substituting into the divergence, we have:
$$D(\beta) = \int_{\log y_l}^{\log \gamma_l(\beta)} q_0(z)\, z\, e^{z}\, dz + \int_{\log \gamma_u(\beta)}^{\infty} q_0(z)\, z\, e^{z}\, dz.$$

[Figure 3: two panels, (a) and (b), plotting the detection performance versus the cost $\beta$.]

Fig. 3. The divergence-cost function $D(\beta)$ (curve in bold), versus the cost $\beta$, for the exponential example with $\alpha = 1$. Also shown are the curves pertaining to a system without security countermeasure: $D_{unc}(\beta)$ is the divergence experienced by the network (dashed curve), while $\Delta(\beta)$ is the corresponding (nonzero) divergence attainable by the eavesdropper (shown by the thin solid curve). Panel (a) refers to the Kullback-Leibler distance between $H_0$ and $H_1$, while panel (b) shows the case of the Kullback-Leibler distance between $H_1$ and $H_0$.
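The numerical solution of eqs. (8)-(9) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' procedure: the function names are hypothetical, the scale parameter is called `alpha`, and a plain bisection is used. The substitution $s = (y_l/\gamma_l)^{1/\alpha}$, $t = (y_l/\gamma_u)^{1/\alpha}$ turns eq. (9) into $t = s - (1-\beta)$ and eq. (8) into a single monotone equation in $s$. The divergence for $C(y) = -\log y$ is then obtained in closed form from the shifted-exponential density $q_0(z)$.

```python
import math

def censoring_thresholds(beta, alpha=1.0, iters=200):
    """Solve beta_0 = beta_1 = beta (eqs. (8)-(9)) by bisection, for 0 < beta < 1."""
    y_l = 1.0 / (1.0 + alpha)

    # With s = (y_l/gamma_l)^(1/alpha) and t = (y_l/gamma_u)^(1/alpha),
    # eq. (9) gives t = s - (1 - beta) and eq. (8) becomes g(s) = 0.
    def g(s):
        t = s - (1.0 - beta)
        return s ** (1 + alpha) - t ** (1 + alpha) - (1.0 - beta)

    lo, hi = 1.0 - beta, 1.0  # g(lo) < 0 < g(hi), and g is increasing on [lo, hi]
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if g(mid) < 0:
            lo = mid
        else:
            hi = mid
    s = 0.5 * (lo + hi)
    t = s - (1.0 - beta)
    return y_l / s ** alpha, y_l / t ** alpha

def costs(gamma_l, gamma_u, alpha=1.0):
    """Transmission probabilities (beta_0, beta_1) of eqs. (8)-(9)."""
    y_l = 1.0 / (1.0 + alpha)
    e0, e1 = 1.0 + 1.0 / alpha, 1.0 / alpha
    beta0 = 1 - (y_l / gamma_l) ** e0 + (y_l / gamma_u) ** e0
    beta1 = 1 - (y_l / gamma_l) ** e1 + (y_l / gamma_u) ** e1
    return beta0, beta1

def divergence_cost_01(beta, alpha=1.0):
    """D(beta) for C(y) = -log(y), integrating -z*q0(z) over the sending region."""
    y_l = 1.0 / (1.0 + alpha)
    r = (1.0 + alpha) / alpha   # rate of w = z - log(y_l) under H0
    L = math.log(1.0 + alpha)   # C(y) = L - w on the support
    gamma_l, gamma_u = censoring_thresholds(beta, alpha)
    a = math.log(gamma_l / y_l)  # lower sending region: 0 <= w < a
    b = math.log(gamma_u / y_l)  # upper sending region: w > b
    lower = L * (1 - math.exp(-r * a)) - (1 / r - (a + 1 / r) * math.exp(-r * a))
    upper = (L - b - 1 / r) * math.exp(-r * b)
    return lower + upper

def full_divergence_01(alpha=1.0):
    """Uncensored value D(1) for C(y) = -log(y)."""
    return math.log(1.0 + alpha) - alpha / (1.0 + alpha)

for beta in (0.01, 0.2, 0.5, 0.9):
    g_l, g_u = censoring_thresholds(beta)
    print(beta, g_l, g_u, divergence_cost_01(beta), beta * full_divergence_01())
```

For `alpha = 1` the two equations can also be solved by hand, giving $\gamma_l = 1/(2-\beta)$ and $\gamma_u = 1/\beta$, which offers a simple check of the bisection. The last two printed columns compare the censored strategy with the data-independent scheme of Remark 2.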

Taking the limit yu in eq. (7) tells that D10 = . The above analysis is repeated for the case of the other Kullback-Leibler distance, namely assuming C(y) = log y. A basic difference here is that the divergence per unit cost is nite. Indeed, again evaluating eq. (7) in the limit of large yl gives, for the considered example of = 1, D01 = log 2. A natural question at this point is about how much is lost by imposing the perfect secrecy constraint. Otherwise stated, what would be the gap between our D() and the divergence obtained by optimizing the censoring thresholds without care about the eavesdropper, as done, e.g., in [17]? As already discussed, the additive nature of the divergence, along with the per-sensor cost, allows working in terms of the per-sensor divergence Di , and we do this. Specically, we maximize the per-sensor divergence for a prescribed cost limitation max{0 , 1 } , without any constraint imposed on the divergence perceived by the eavesdropper side. As done before, we accordingly denote the optimized quantity by the symbol Dunc (). As to the intruder, different from the previous analysis, in this setup it will exhibit in general non-zero detection capabilities. We shall denote the per-sensor divergence seen by the eavesdropper with (). In Fig. 3 we display the three relevant quantities D(), Dunc () and (), for the case of the two Kullback-Leibler numbers. Note that, at the extreme points = 0, 1, the costs under different hypotheses must obviously coincide. This implies that D(0) = Dunc (0), D(1) = Dunc (1) and (0) = (1) = 0. Let us rst compare the two curves D() and Dunc (). We see that Dunc () D(), and this is obvious. What is

[Figure 4: two panels, (a) and (b); horizontal axis: $\beta$; vertical axis: relative performance (system minus eavesdropper).]

Fig. 4. With reference to the case study addressed in Fig. 3, we show the difference between the performance of the network and the performance of the eavesdropper in the presence (solid curve) and in the absence (dashed curve) of the perfect secrecy countermeasure. Therefore, the solid curve represents $D(\beta)$, and the dashed curve is $D_{unc}(\beta) - \Delta(\beta)$. As for Fig. 3, panel (a) refers to the Kullback-Leibler distance between $H_0$ and $H_1$, while panel (b) shows the case of the Kullback-Leibler distance between $H_1$ and $H_0$.

not so obvious is the relatively small distance between the two curves: it is remarkable that the loss due to the secrecy requirement, in terms of detection performance, is modest. Switching to the analysis of $\Delta(\beta)$, we observe that $\Delta(\beta) \le D_{unc}(\beta)$, as we expect since the network has information about both the transmission activities and the message content. In the low-cost regime, $\beta \to 0$, the curves $D_{unc}(\beta)$ and $\Delta(\beta)$ behave similarly. This seems to suggest that, when resources are scarce, the best detection is achieved by looking at the transmission activities, rather than at the message content. A different behavior seems to arise for large costs. In this regime the details available about the message information give a substantial improvement with respect to the simple message-counting available at the eavesdropper side. Summarizing, the above analysis emphasizes that, for the considered case study, i) the loss in terms of detection exponent is modest and ii) the lack of countermeasures may be dangerous, especially in the low-cost regime.

It is also of interest to compare the difference between the (per-sensor) divergence of the network and the (per-sensor) divergence of the eavesdropper, for the two distinct schemes under consideration: presence and absence of security constraints. This is explicitly done in Fig. 4. We stress that, for the considered scenario, the curve $D(\beta)$ (recall that the eavesdropper has zero divergence in this case) always stays above the difference $D_{unc}(\beta) - \Delta(\beta)$. Interestingly, for the particular values of the parameters considered in the example, the behavior of $D_{unc}(\beta) - \Delta(\beta)$ can be analytically verified to be exactly quadratic (Fig. 4a) and linear (Fig. 4b), but this cannot be taken as the general rule.

III. DETECTION-SECURE ACCESS OVER COLLISION CHANNELS

In this section we look at the problem of distributed detection with physical layer secrecy, in the context of WSNs where the communication between sensors and a central unit involves a MAC. We borrow the cross-layer viewpoint and the physical model from [21], which is well suited to a detection-centric perspective of the multiple access and, as done for the case of parallel channels, the further requirement of secrecy is accounted for.

Let us first specify the model. Suppose that a field of tiny, cheap, possibly unreliable and battery-limited sensors is monitoring a feature, and is eager to deliver its observations to the fusion center according to the SENMA (Sensor Networks with Mobile Agents) concept [33]. The latter architecture, which we are going to briefly describe, is chosen for its potential efficiency in terms of energy spent for communication, with respect to other protocols (e.g., multihop in flat ad-hoc networks), as well as for its properties of versatility, scalability, robustness, etc.$^2$ Specifically, a roving base station queries the sensors in a certain footprint area by emitting a polling signal. The medium connecting the sensors to the mobile FC is modelled as a multiple access collision channel, and all sensors within the rover's field of view, when queried, respond according to (the simplest version of) the slotted ALOHA policy [34]: if more than one sensor attempts to transmit, then the messages are lost; otherwise the message is assumed to be received correctly. To simplify the analysis it is assumed, as suggested in [21], that different snapshots involve different sensors. This working hypothesis gives insights on the system performance and optimization, and it is also expected to provide accurate results for a relatively large number of sensors.

2 The assumption of working with the SENMA architecture is not restrictive: the actions taken by the mobile fusion center can equally be implemented by a fixed fusion center, in some equivalent fashion. We refer to SENMA mainly for concreteness.

Complying with the same cross-layer philosophy exploited in [21], here we choose to tune the ALOHA transmission probability of each sensor on a detection-centric basis. Specifically, a network node self-regulates its own accesses by censoring the less-informative likelihoods, i.e., the channel access is attempted only if the likelihood lies inside the send-region $R_i$.

A general analysis allowing each sensor in the field of view to use different regions is rather involved. For this reason, and without claiming full generality, in the following we assume that the transmission policy is one and the same for all sensors in the single rover's snapshot, $R_i = R$, $\forall i$. Furthermore, due to the symmetry of the constraints imposed on all sensors in the network, the policy does not change over different snapshots, and the problem reduces to the maximization of the per-snapshot divergence, say $D_a(m)$, for a given number of per-snapshot sensors $m$. As before, an eavesdropper is assumed to overhear the system by monitoring the transmission attempts. The goal of the designer is to maximize the detection performance $D_a(m)$, while ensuring perfect secrecy.

It would clearly be desirable to fully exploit the available $m$ sensors by letting the detection performance grow with $m$. To get this, we must take into account the presence of collisions over the channel, in such a way that each sensor self-regulates its own transmission probability (i.e., the censoring thresholds) as a function of $m$. A tradeoff clearly arises in that the detection performance would benefit from receiving as much data as possible but, if the transmission activity is too high, it is unlikely that any one sensor could successfully communicate with the mobile agent.

At first glance it may appear that we are disregarding the energy constraint, while focusing only on the number of sensors per snapshot. However, this choice is motivated by the fact that, as just observed, the per-sensor cost will be automatically limited by the collisions, and, as we shall prove, it will be approximately of the order of $1/m$. Let us now state the main result on the collision MAC with distributed secure detection.

THEOREM 2. The per-snapshot divergence $D_a(m)$ verifies
$$\lim_{m \to \infty} D_a(m) = D/e,$$

where $D$ is the divergence per unit cost, $1/e$ is the throughput of the considered slotted ALOHA access protocol, and $m$ is the number of sensors in a single query. The case that $D$ is infinite is contemplated. The proof is given in the appendix.

Fig. 5. Divergence function $D_a(m)$ for the MAC scenario as a function of the number of per-snapshot sensors $m$. With reference to the same case study addressed in Fig. 3, panel (a) refers to the Kullback-Leibler distance between $H_0$ and $H_1$, while panel (b) shows the case of the Kullback-Leibler distance between $H_1$ and $H_0$.

Assume that each sensor in a single snapshot uses a transmission probability $\beta_j$, $j = 0, 1$, which in general depends upon the hypothesis. Accounting for the collision model, the probability density of the likelihood received at the fusion center, evaluated at a point belonging to the send-region, is simply

$$m(1-\beta_j)^{m-1}\, p_j(y), \qquad j = 0, 1. \qquad (10)$$

By integrating the above over the send-region $R$, we get the overall probability of getting data at the fusion center in a single query, $m(1-\beta_j)^{m-1}\beta_j$, which must be enforced to be the same under the two hypotheses in order to fulfill the perfect secrecy requirement. This imposes, as expected, $\beta_0 = \beta_1 = \beta$. Moreover, the divergence per single snapshot only contains the term related to eq. (10), and amounts to

$$m(1-\beta)^{m-1} \int_R p_0(y)\, C(y)\, dy.$$

Now, from the results in the previous sections, we know that, for each $\beta \in (0, 1)$, the send-region $R$ should be selected as a connected region in order to maximize the divergence, and that a unique pair of thresholds exists yielding

$$D_a(\beta, m) = m(1-\beta)^{m-1} D(\beta).$$

The above expression shows that it is no longer true that each sensor should use the maximum available transmission probability, as happens in the absence of collisions. Indeed, consider the limiting case $\beta = 1$: due to the presence of collisions on the channel, $D_a(1, m) = 0$, $\forall m > 1$. It turns out, instead, that each sensor should select the most convenient $\beta$. Clearly, the solution is one and the same for all sensors, and corresponds to maximizing the above $D_a(\beta, m)$ with respect to $\beta$, so that the maximum divergence per single snapshot can be defined as

$$D_a(m) \overset{\text{def}}{=} \max_{\beta} D_a(\beta, m) \overset{\text{def}}{=} m[1-\beta_m^*]^{m-1} D(\beta_m^*), \qquad (11)$$

where $\beta_m^*$ is just the maximizer. In general, this problem will not admit a simple closed-form solution. We shall thus focus on the case that the number of sensors per rover's query $m$ is large enough. First, let us investigate the behavior of $\beta_m^*$. From eq. (11) we can write

$$D_a(m) \le m[1-\beta_m^*]^{m-1} D(1),$$

the upper bound following from having substituted the divergence-cost function with its maximum, attained when the argument is unity. This reveals that $\beta_m^*$ must vanish as $m \to \infty$, as otherwise the available divergence would go to zero; that is, for large $m$, the system is forced to work in the low-cost regime.

We can summarize the above analysis by saying that, in the case of access via the ALOHA protocol, the sensors' energy constraints are somehow replaced (or self-regulated) by the physical limitations imposed by the collisions occurring over the channel. The optimal operative point, in the limit of large $m$, is the low-cost regime $\beta \to 0$. Theorem 2 proves that the divergence per unit cost $D$ arises as a natural performance proxy; in particular, the divergence attained at the mobile agent per single snapshot is asymptotically $D/e$.

The same exponential example addressed in the previous section with reference to the parallel architecture is now considered for the MAC scenario. The maximum per-snapshot divergence $D_a(m)$ achievable with $m$ candidate per-snapshot sensors is shown in Fig. 5. Reassuringly, the optimal transmission probability $\beta_m^*$ guarantees a monotonic growth of the detection performance with the available sensor density. Such a desired behavior comes from a careful detection-centric management of the MAC collisions, provided by our cross-layer design. We know that $D_a(m)$ must converge to $D/e$, and this is indeed observed. In the left panel of Fig. 5, where the Kullback-Leibler distance between $H_0$ and $H_1$ is considered, $D_{01} = \log 2$ and accordingly $D_a(m)$ approaches $\log 2/e \approx 0.255$. The right panel refers to the Kullback-Leibler distance between $H_1$ and $H_0$; as already noticed, $D_{10}$ is infinite and, accordingly, $D_a(m)$ grows indefinitely.
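The convergence toward $\log 2/e$ can be checked numerically with a minimal sketch along the following lines. It covers only the finite case (Kullback-Leibler distance between $H_0$ and $H_1$, unit parameter in the exponential example). The closed form used for the divergence-cost function is worked out here from the shifted-exponential density of the log-likelihood and the hand-solved thresholds $1/(2-\beta)$ and $1/\beta$; it is an assumption of this sketch rather than a formula quoted from the text, and the names `divergence_cost` and `snapshot_divergence` are hypothetical. The search is restricted to $(0, 1/m]$: the collision factor peaks at $1/m$, and the ratio of the divergence-cost function to its argument is non-increasing by concavity, so the product cannot peak beyond $1/m$.

```python
import math


def divergence_cost(beta):
    """D(beta) for the exponential example with unit parameter, C(y) = -log y.

    Closed form derived for this sketch (an assumption, not quoted from the
    paper), using thresholds 1/(2 - beta) and 1/beta and the H0 density
    q0(z) = 2 exp(-2 (z + log 2)) for z >= -log 2.
    """
    if beta <= 0.0:
        return 0.0
    low = (1.0 - beta / 2.0) ** 2 * (0.5 - math.log(2.0 - beta))
    high = (beta ** 2 / 4.0) * (0.5 - math.log(beta))
    return low - (0.5 - math.log(2.0)) - high


def snapshot_divergence(m, grid=4000):
    """Grid search for the maximum of m (1 - beta)^(m - 1) D(beta).

    Returns (maximum value, maximizing beta); beta is searched on (0, 1/m].
    """
    best_val, best_beta = 0.0, 0.0
    for k in range(1, grid + 1):
        beta = k / (grid * m)
        val = m * (1.0 - beta) ** (m - 1) * divergence_cost(beta)
        if val > best_val:
            best_val, best_beta = val, beta
    return best_val, best_beta


if __name__ == "__main__":
    limit = math.log(2.0) / math.e
    for m in (10, 100, 1000):
        value, beta_star = snapshot_divergence(m)
        print(m, value, beta_star, limit)
```

Running the loop for increasing values of `m` should show the maximum creeping up toward the limit from below, with the maximizing transmission probability staying close to, and slightly under, $1/m$.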

IV. SUMMARY

In a WSN engaged in a binary detection task, the remote nodes sense the environment and deliver messages to the fusion center for the decision. The main system constraint is the energy spent by the sensors on communication, and a censoring protocol is thus imposed to mitigate that burden. We address two different scenarios for the communication between sensors and fusion center, namely, the case of parallel access channels (PAC) and that of a single multiple access channel (MAC) with a slotted ALOHA protocol.

The wireless nature of the system makes it very vulnerable to intrusion attacks: an unauthorized entity senses the wireless channel and monitors the transmission activity of the sensors with the aim of making the same binary inference as the WSN. The system adopts a drastic attack countermeasure: it is designed in such a way that the intruder is made completely blind. We give the attacker no chance of inference, making its data useless for guessing the state of nature by enforcing a perfect secrecy requirement at the physical layer.

For the PAC model, we characterize the information sources (sensors) in terms of a certain divergence function $D(\beta)$, e.g., the well-known Kullback-Leibler distance. Its monotonicity and concavity properties as a function of $\beta$ are studied in depth and the related physical implications are emphasized. The concept of divergence per unit cost is introduced, and its relevant properties and operative meaning are discussed.

For the MAC case, we propose a cross-layer approach which focuses on a detection-centric perspective, in order to obtain an efficient management of the channel collisions that maximizes the network detection performance. We prove that the maximum divergence compatible with a given sensor density is ruled by the previously introduced divergence per unit cost, scaled by the classical ALOHA throughput $1/e$.

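[Figure 6: curve $h(\gamma)$, with the thresholds $\gamma_l$ and $\gamma_u$ marked on the horizontal axis.]

Fig. 6. Notional sketch of the function $h(\gamma)$.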

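APPENDIX

A. Results for the PAC case

The optimization problem is that of determining the censoring thresholds $\gamma_l$ and $\gamma_u$ which solve

$$D(\beta) = \max_{\gamma_l, \gamma_u} \left\{ \int_{y_l}^{\gamma_l} p_0(y) C(y)\, dy + \int_{\gamma_u}^{y_u} p_0(y) C(y)\, dy \right\},$$
$$\int_{y_l}^{\gamma_l} [p_0(y) - p_1(y)]\, dy = -\int_{\gamma_u}^{y_u} [p_0(y) - p_1(y)]\, dy,$$
$$\int_{y_l}^{\gamma_l} p_0(y)\, dy + \int_{\gamma_u}^{y_u} p_0(y)\, dy = \beta,$$

where the second equation serves to fulfill the perfect secrecy requirement. As can be seen from the third equation, we study the problem with an equality constraint, and then verify that $D(\beta)$ is strictly increasing for $\beta \in (0, 1)$. This ensures that the $D(\beta)$ so computed is in fact the true maximum. Since

$$\int_{y_l}^{y_u} [p_0(y) - p_1(y)]\, dy = 0,$$

the above can be recast as

$$D(\beta) = \max_{\gamma_l, \gamma_u} \left\{ \int_{y_l}^{\gamma_l} p_0(y) C(y)\, dy + \int_{\gamma_u}^{y_u} p_0(y) C(y)\, dy \right\},$$
$$\int_{y_l}^{\gamma_l} [p_0(y) - p_1(y)]\, dy = \int_{y_l}^{\gamma_u} [p_0(y) - p_1(y)]\, dy,$$
$$\int_{y_l}^{\gamma_l} p_0(y)\, dy + \int_{\gamma_u}^{y_u} p_0(y)\, dy = \beta. \qquad (12)$$

Proof of the Lemma. By defining $g(y) = p_0(y) - p_1(y)$, the second equation in (12) can be written

$$\int_{y_l}^{\gamma_l} g(y)\, dy = \int_{y_l}^{\gamma_u} g(y)\, dy. \qquad (13)$$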

By introducing the function $h(\gamma) = \int_{y_l}^{\gamma} g(y)\, dy$, the relationship between $\gamma_l$ and $\gamma_u$ amounts to imposing $h(\gamma_l) = h(\gamma_u)$. Now, $h(\gamma)$ has the following (immediate) properties: $h(y_l) = h(y_u) = 0$, and $dh(\gamma)/d\gamma = g(\gamma)$. From the nesting property of the likelihood ratio [35, p. 44], we further know that $p_1(y)/p_0(y) = y$, yielding

$$g(y) = p_0(y)(1 - y), \qquad (14)$$

that is, $h(\gamma)$ is strictly increasing for $\gamma < 1$, and strictly decreasing for $\gamma > 1$, with a maximum $h(1) = \int_{y_l}^{1} g(y)\, dy > 0$. In view of the above properties, the condition $h(\gamma_l) = h(\gamma_u)$ can be pictorially illustrated as in Fig. 6. For $\beta = 0$, which corresponds to the horizontal axis, we have that $\gamma_l = y_l$ and $\gamma_u = y_u$; all the data are censored, no transmission is allowed and all the sensors stay silent. At the other extreme, in the absence of censoring, $\beta = 1$ and the horizontal line representing $\beta$ has only one intersection with the curve $h(\gamma)$, yielding $\gamma_l = \gamma_u$. For intermediate values of $\beta$, there always exists a single pair of thresholds $\gamma_l \le 1$ and $\gamma_u \ge 1$ such that $h(\gamma_l) = h(\gamma_u)$. Furthermore, it is clear that $\gamma_u(\beta)$ is a strictly decreasing function ranging from $y_u$ to 1, while $\gamma_l(\beta)$ is a strictly increasing function ranging from $y_l$ to 1.

Proof of Theorem 1. In view of the previously proved property

of the pair $(\gamma_l, \gamma_u)$, the function $D(\beta)$ can be evaluated as

$$D(\beta) = \int_{y_l}^{\gamma_l(\beta)} p_0(y) C(y)\, dy + \int_{\gamma_u(\beta)}^{y_u} p_0(y) C(y)\, dy,$$

whose derivative is

$$\delta(\beta) \overset{\text{def}}{=} \frac{dD(\beta)}{d\beta} = \frac{dD/d\gamma_u}{d\beta/d\gamma_u}.$$

Now,

$$\frac{dD}{d\gamma_u} = p_0(\gamma_l) C(\gamma_l) \frac{d\gamma_l}{d\gamma_u} - p_0(\gamma_u) C(\gamma_u), \qquad \frac{d\beta}{d\gamma_u} = p_0(\gamma_l) \frac{d\gamma_l}{d\gamma_u} - p_0(\gamma_u).$$

By differentiating both sides of eq. (13), we see that the function $\gamma_l(\gamma_u)$ has the further property

$$g(\gamma_l) \frac{d\gamma_l}{d\gamma_u} = g(\gamma_u). \qquad (15)$$

The above, along with eq. (14), gives

$$\frac{dD}{d\gamma_u} = p_0(\gamma_u) \frac{1-\gamma_u}{1-\gamma_l} C(\gamma_l) - p_0(\gamma_u) C(\gamma_u), \qquad \frac{d\beta}{d\gamma_u} = p_0(\gamma_u) \frac{1-\gamma_u}{1-\gamma_l} - p_0(\gamma_u),$$

yielding

$$\delta(\beta) = \rho\, C(\gamma_u) + (1-\rho)\, C(\gamma_l), \quad \text{with } \rho = \frac{1-\gamma_l}{\gamma_u - \gamma_l} \in (0, 1). \qquad (16)$$

From the above we immediately see that $D(\beta)$ is strictly increasing. Indeed, by convexity of $C(y)$, we have

$$\delta(\beta) \ge C(\rho \gamma_u + (1-\rho) \gamma_l) = C(1) = 0. \qquad (17)$$

We now show that the divergence-cost function $D(\beta)$ is strictly concave. It is apparent from eq. (16) that $\delta(\beta)$ is a convex combination of the function $C(y)$ evaluated at the extremes of the interval $(\gamma_l(\beta), \gamma_u(\beta))$, and corresponds to the intersection of the segment joining $C(\gamma_l)$ and $C(\gamma_u)$ with the vertical line $y = \rho \gamma_u + (1-\rho) \gamma_l = 1$. Now, for $\beta_1 > \beta_2$, we get $\gamma_u(\beta_2) > \gamma_u(\beta_1)$ and $\gamma_l(\beta_2) < \gamma_l(\beta_1)$, that is, the interval becomes larger and larger as $\beta$ decreases, see Fig. 7. As a consequence, the convex combination corresponding to $\beta_2$ is larger than that corresponding to $\beta_1$. That is to say, $\delta(\beta)$ is strictly decreasing, and this implies that $D(\beta)$ is in fact strictly concave.

Fig. 7. Notional sketch of the function $C(y)$.

B. Results for the MAC case

Proof of Theorem 2. It is easy to upper bound $D_a(m)$ as

$$D_a(m) \le m \max_{\beta \in [0,1]} \left[ (1-\beta)^{m-1} \beta \right] \sup_{\beta \in (0,1)} \frac{D(\beta)}{\beta}.$$

Now, the first maximum is obtained at $\beta = 1/m$, giving

$$D_a(m) \le (1 - 1/m)^{m-1} D,$$

where we just applied the definition of divergence per unit cost. On the other hand, a simple lower bound is obtained by considering $\beta = 1/m$, which will give in general a value smaller than the true maximum $D_a(m)$, that is

$$D_a(m) \ge (1 - 1/m)^{m-1}\, \frac{D(1/m)}{1/m}.$$

Combining the two bounds gives

$$(1 - 1/m)^{m-1}\, \frac{D(1/m)}{1/m} \le D_a(m) \le (1 - 1/m)^{m-1} D. \qquad (18)$$

Now, the fundamental properties of the divergence-cost function $D(\beta)$ allow concluding that the divergence per unit cost $D$ is just attained in the limit of small $\beta$, that is

$$\lim_{m \to \infty} \frac{D(1/m)}{1/m} = D.$$

Consider the case of finite $D$. The sandwich theorem gives

$$\lim_{m \to \infty} D_a(m) = D/e,$$

that is, $D_a(m)$ is asymptotically a fraction of $D$ given by the well-known ALOHA throughput $1/e$. As to the other case of infinite $D$, the lower bound is sufficient to conclude that $D_a(m)$ diverges as $m$ goes to infinity.

REFERENCES
[1] P. K. Varshney, Distributed Detection and Data Fusion. New York, NY: Springer, 1997.
[2] R. Viswanathan and P. K. Varshney, "Distributed detection with multiple sensors: Part I - fundamentals," Proc. IEEE, vol. 85, no. 1, pp. 54-63, Jan. 1997.
[3] R. S. Blum, A. Kassam, and H. V. Poor, "Distributed detection with multiple sensors: Part II - advanced topics," Proc. IEEE, vol. 85, no. 1, pp. 64-79, Jan. 1997.
[4] J.-F. Chamberland and V. V. Veeravalli, "Decentralized detection in sensor networks," IEEE Trans. Signal Processing, vol. 51, no. 2, pp. 407-416, Feb. 2003.
[5] R. R. Tenney and N. R. Sandell Jr., "Detection with distributed sensors," IEEE Trans. Aerosp. Electron. Syst., vol. AES-17, pp. 501-510, 1981.
[6] J. N. Tsitsiklis, "Decentralized detection," in Advances in Signal Processing, H. V. Poor and J. B. Thomas, Eds. JAI Press, 1993, pp. 297-344.


[7] S. S. Pradhan, J. Kusuma, and K. Ramchandran, "Distributed compression in a dense microsensor network," IEEE Signal Processing Mag., vol. 19, pp. 51-60, Mar. 2002.
[8] V. Poor, "Physical layer security in wireless networks: Some recent results," in Communication Theory Workshop (CTW), St. Croix, US Virgin Islands, May 11-14, 2008.
[9] A. D. Wyner, "The wire-tap channel," AT & T Bell Labs Tech. J., vol. 54, no. 8, pp. 1355-1387, Oct. 1975.
[10] I. Csiszár and J. Körner, "Broadcast channels with confidential messages," IEEE Trans. Inform. Theory, vol. 24, no. 3, pp. 339-348, May 1978.
[11] Y. Liang, H. V. Poor, and S. Shamai (Shitz), "Secrecy capacity region of parallel broadcast channels," in Proc. of Information Theory and Applications Workshop, Nice, France, Jan. 29-Feb. 2, 2007, pp. 245-250.
[12] Y. Liang, H. V. Poor, and S. Shamai (Shitz), "Secure communication over fading channels," IEEE Trans. Inform. Theory, vol. 54, no. 6, pp. 2470-2492, June 2008.
[13] Y. Liang and H. V. Poor, "Multiple-access channels with confidential messages," IEEE Trans. Inform. Theory, vol. 54, no. 3, pp. 976-1002, Mar. 2008.
[14] T. He and L. Tong, "Distributed detection of information flows in chaff," in Proc. IEEE International Symposium on Information Theory (ISIT'07), Nice, France, June 2007.
[15] T. He and L. Tong, "Detecting information flows: Improving chaff tolerance by joint detection," in Proc. IEEE Conference on Information Sciences and Systems 2007 (CISS'07), Baltimore, MD, Mar. 2007.
[16] T. He and L. Tong, "Detecting encrypted stepping-stone connections," IEEE Trans. Signal Processing, vol. 55, no. 5, pp. 1612-1623, May 2007.
[17] C. Rago, P. Willett, and Y. Bar-Shalom, "Censoring sensors: a low-communication-rate scheme for distributed detection," IEEE Trans. Aerosp. Electron. Syst., vol. 32, no. 2, pp. 554-568, Apr. 1996.
[18] S. Appadwedula, V. V. Veeravalli, and D. Jones, "Decentralized detection with censoring sensors," IEEE Trans. Signal Processing, vol. 56, pp. 1362-1373, Apr. 2008.
[19] S. Appadwedula, V. V. Veeravalli, and D. Jones, "Energy efficient detection in sensor networks," IEEE J. Select. Areas Commun., vol. 23, pp. 639-702, Apr. 2005.
[20] B. M. Sadler, "Fundamentals of energy-constrained sensor network systems," IEEE Aerosp. Electron. Syst. Mag, vol. 20, no. 8, Aug. 2005.
[21] P. Willett and L. Tong, "One aspect to cross-layer design in sensor networks," in Proceedings of MILCOM 2004, Monterey, CA, USA, Oct. 2004, pp. 688-693.
[22] S. Marano, V. Matta, P. Willett, and L. Tong, "Cross-layer design of sequential detectors in sensor networks," IEEE Trans. Signal Processing, vol. 54, no. 11, pp. 4105-4117, Nov. 2006.
[23] W. P. Tay, J. N. Tsitsiklis, and M. Z. Win, "Asymptotic performance of a censoring sensor network," IEEE Trans. Inform. Theory, vol. 53, pp. 4191-4209, Nov. 2007.
[24] H. V. Poor, "Fine quantization in signal detection and estimation," IEEE Trans. Inform. Theory, vol. IT-34, no. 5, pp. 960-972, Sept. 1988.
[25] C. E. Shannon, "A mathematical theory of communication," Bell Syst. Tech. Journ., vol. 27, pp. 379-423, 623-656, July-Oct. 1948.
[26] R. J. McEliece, The Theory of Information and Coding. London, UK: Addison-Wesley, 1997.
[27] S. Verdú, "On channel capacity per unit cost," IEEE Trans. Inform. Theory, vol. 36, no. 5, pp. 1019-1030, Sept. 1990.
[28] H. V. Poor, An Introduction to Signal Detection and Estimation. New York: Springer-Verlag, 1988.
[29] I. Csiszár and P. C. Shields, Information Theory and Statistics: A Tutorial. Hanover, MA, USA: now Publishers Inc., 2004.
[30] H. Chernoff, "A measure of asymptotic efficiency for tests of a hypothesis based on a sum of observations," Annals Math. Statist, vol. 23, pp. 493-507, 1952.
[31] T. Cover and J. Thomas, Elements of Information Theory. New York: John Wiley & Sons, 1991.
[32] D. Warren and P. Willett, "Optimum quantization for detection fusion: some proofs, examples, and pathology," Journal of the Franklin Institute, vol. 336, pp. 323-359, 1999.
[33] L. Tong, Q. Zhao, and S. Adireddy, "Sensor networks with mobile agents," in Proceedings of MILCOM 2003, Boston MA, Oct. 2003.
[34] D. P. Bertsekas and R. G. Gallager, Data Networks, 2nd ed. Prentice Hall, 1991.
[35] H. L. Van Trees, Detection, Estimation, and Modulation Theory. Part I. New York: John Wiley & Sons, Inc., 1968 (reprinted, 2001).
