Green Opportunistic Access for Cognitive Radio Networks: A Minority Game Approach

Mouna Elmachkour, MIS, ENSIAS, Mohammed V-Souissi University, BP 713, Rabat, Morocco
Imane Daha, Faculty of Sciences, Sidi Mohamed Ben Abdellah University, BP 1796, Fez, Morocco
Essaid Sabir, ENSEM-GREENTIC, Hassan II University, BP 8118, Oasis, Casablanca, Morocco
Abdellatif Kobbane, MIS, ENSIAS, Mohammed V-Souissi University, BP 713, Rabat, Morocco
Jalel Ben-othman, L2TI, University of Paris 13, Paris, France
Abstract: We investigate the energy conservation and system performance of a decentralized resource allocation scheme for cognitive radio networks that relies entirely on the competitive behavior of secondary users. Contention for a data channel left unoccupied by its licensed user yields a single winner, but costs every contending node energy. In this paper, we apply the minority game (MG) to the most important phase of the opportunistic spectrum access (OSA) process: the sensing phase. We attempt to bring about cooperation in a non-cooperative environment with no information exchange. We study the Nash equilibrium solution for pure and fully mixed strategies, and we use distributed learning algorithms that enable cognitive users to learn the Nash equilibrium. Finally, we provide numerical results to validate the proposed approach. Resource allocation based on the minority game improves secondary users' battery life and the performance of the network.

Keywords: Energy consumption, network performance, resource allocation, minority game, opportunistic access.
I. INTRODUCTION
A. Emergence of the Cognitive Radio Paradigm

The rapid growth of wireless communication has created a huge demand for radio spectrum, which has become a scarce resource. Nevertheless, it has been found that most licensed bands are underutilized while some of the remaining bands are heavily used [1]. This is partly due to current resource allocation strategies, which assign a fixed frequency band to a single licensed system. Cognitive Radio (CR) is a promising solution to this spectrum wastage: it allows Primary Users (PUs) and Secondary Users (SUs) to share the same spectrum band by exploiting the spatial, temporal and frequency spectrum holes left by idle PUs [1]. CR was introduced to improve spectrum utilization by using licensed but currently unused spectrum [1]; the idea is to detect available channels in the wireless spectrum without disturbing the PU, the legitimate user of the spectrum. In return, the activities of the CR process require a large amount of energy, specifically during:
1) the spectrum sensing stage, 2) the reporting stage, 3) the negotiation phase, and 4) the transmission phase; see, e.g., [2].

B. Literature Review

The ultimate goal of channel sensing is to locate spectrum holes through the embedded sensing capability. Various methods exist to detect free holes, e.g., [3] and [4]. This way, Secondary Users (SUs) are able to track the spectrum activity in their local vicinity and share the spectrum when it is unused. Fundamentally, secondary users should not interfere with the PU. Much research has aimed at reducing energy consumption, in particular during the sensing stage; see for example [5], whose authors proposed a combined sleeping and censoring scheme to minimize the energy consumption incurred in distributed sensing, given constraints on the probabilities of detection and false alarm. This is achieved by optimally designing the sleeping rate and the censoring thresholds. The authors of [11] proposed a new two-step game where sensing and opportunistic access are jointly considered. They provide a full characterization of the Nash equilibria and an analysis of the optimal pricing policy, from the network owner's point of view, for both the centralized and the decentralized settings. They then propose a combined, fully distributed learning algorithm that allows the cognitive users to learn their optimal payoffs and strategies in both symmetric and asymmetric cases.

C. Minority Games Review

The Minority Game (MG) was originally introduced to simplify the El Farol bar problem [6]. In the El Farol bar problem, N users decide independently whether to go to a unique bar or to stay at home. However, the bar is small, and people are happy only if the number of attendees is appropriate, in which case they obtain a reward r for going to the bar. Otherwise, they can stay home with utility 0. Minority game theory models the interaction among a population of N agents competing in a repeated game, where N
is an odd integer. At each round of the game, every agent has to choose between two possible actions, for example "buy" or "sell", represented by "0" and "1": each agent i = 1, ..., N, at time t, picks either the action $a_i(t) = 1$ or the action $a_i(t) = 0$. The choice of player i depends on its strategy $s_i(t)$ and on the history $h(t)$, and is written $a_i(s_i(t), h(t))$. The collective sum of the actions of all agents at time step t is called the attendance A(t):

$$A(t) = \sum_{i=1}^{N} a_i(s_i(t), h(t)).$$

In the traditional minority game the threshold is N/2: if $A(t) \le N/2$, the users with $a_i(t) = 1$ win and the others lose. Otherwise, the users with $a_i(t) = 0$ win and the others lose. The MG concept is used for delay tolerant networks in [7]: each relay has two strategies, to participate in message relaying or not, and its outcome depends on whether it belongs to the minority subset. When it is not involved in relaying the message, the tagged relay saves energy. The authors build an MG framework to find the equilibria of the game and present a stochastic learning algorithm to discover the equilibrium solutions of the system.

D. Our Contribution

Guaranteeing cooperation among SUs inevitably requires a heavy exchange of information and a complex signaling mechanism. This is mainly due to the distributed nature of such a system, which encourages aggressive, non-cooperative behaviour rather than cooperation. We propose in this paper a distributed scheme built around a minority game core. The core idea is, from an SU's point of view, to decide whether or not to sense the channel depending on the outcome of the minority game. To the best of our knowledge, this is the first work devoted to this kind of analysis.
The SUs that belong to the minority obtain a reward, whereas the majority experiences a penalty in the form of regret. When the minority users decide to keep quiet, they save energy: the competition over the spectrum is then very high (the majority senses the channel and consequently attempts transmission), which implies a low chance of a successful transmission and a high risk of spending energy uselessly. On the other hand, when the minority SUs decide to sense the channel, they experience a comfortable probability of successful transmission, because the number of competitors, and hence the competition over the spectrum, is low. Belonging to the minority therefore offers good opportunities, which generates a virtual coordination among the SUs. This interpretation supports the finding presented in [8], whose authors show how CR networks can achieve cooperation without explicit communication or a master controller scheme. They study a simple resource sharing case by allowing the SUs to play an MG in order to allocate the communication channels among themselves. One of our key results indicates that the MG formulation provides acceptable solutions, which might not necessarily be
optimal. However, given the absence of coordination among the SUs, it is a tempting approach to the coordination problem, especially in highly congested cases. We believe that the MG framework is an attractive and elegant way to share intelligence among SUs without harming the PU and without saturating the spectrum with useless signaling messages. Unlike [8], we address the existence and uniqueness of the Nash equilibrium. Next, we propose two distributed algorithms that converge to the pure Nash equilibrium and to the mixed Nash equilibrium, respectively. We then show that the energy consumption in the CR network is minimized under the MG framework. A special feature is that a delay-energy tradeoff can be defined efficiently, especially for real-time applications.

The rest of the paper is organized as follows. In Section II we present the system model. The minority game formulation is introduced in Section III. In Section IV we analyze the pure Nash equilibrium and the fully mixed Nash equilibrium. Two distributed learning schemes that ensure convergence to the NE solutions are presented in Section V. We present numerical results in Section VI and conclude the paper in Section VII.

II. SYSTEM MODEL
We consider a cognitive radio network with M active secondary users, C primary data channels and one control channel. Channels are slotted in time, and the communication between users is synchronized. In the OSA context, primary users are the licensed users of the spectrum, while secondary users are allowed to use the data channels opportunistically without affecting primary users' communication. Hence, the state of a data channel alternates between the active state (idle) and the inactive state (busy). The data channel occupancy is expressed as

$$z_i = \frac{\mu_i}{\lambda_i + \mu_i}, \tag{1}$$

where $\lambda_i$ (resp. $\mu_i$) is the probability that channel i transits from the inactive state to the active state (resp. from the active state to the inactive state), $1 \le i \le C$. As in the MAC protocol proposed in [9], secondary users are equipped with two transceivers. The first is an SDR module used to sense, receive, and transmit signals, while the second operates over the control channel to obtain information about available channels and to negotiate with other users via the contention-based distributed coordination function (DCF) with CSMA/CA on the corresponding channel.
Each time slot consists of four phases: the sensing phase, the reporting phase, the negotiation phase, and the data transmission phase. At the beginning of a time slot, each secondary user randomly chooses one data channel to sense. The sensing phase lasts one mini-slot and is followed by the reporting phase on the control channel. The reporting phase is subdivided into C mini-slots, so that the reporting for data channel i takes place in mini-slot i. Then, in the negotiation phase, all secondary users that sensed the i-th data channel idle negotiate through the CSMA/CA mechanism. At the end of this phase, a single winner starts transmitting its packets on the channel for the remaining period of the ongoing time slot. The total time slot duration can be written as

$$\begin{aligned} T &= T_S + T_R + T_N + T_{Data} \\ &= T_{ms} + i\,T_{ms} + T_N + T_{Data}, \end{aligned} \tag{2}$$
where $T_{ms}$ is the mini-slot duration, and $T_S$, $T_R$, $T_N$ and $T_{Data}$ are the durations of the sensing, reporting, negotiation and data transmission phases, respectively. A secondary user can be prioritized if the length of its data packets exceeds a prioritization threshold. Prioritized secondary users are allowed to continue their data transmission during the next slots if no primary user appears on the data channel, or even on several available channels. Those cases are not addressed in this paper, but the analysis developed here remains valid for them.

A. Transmission delay

We consider the transmission delay over y time slots. We assume saturation conditions (i.e., each node always has a packet immediately available for transmission). Once a secondary user has chosen to sense the i-th channel, it uses its SDR transceiver to detect the channel state. If a primary user signal is detected, the secondary user abandons the process and waits for the next time slot to sense a channel; otherwise, it passes to the reporting phase. During the reporting phase, the secondary users that chose to sense the i-th data channel send beacons on the control channel in the i-th mini-slot. Once the number $u_i$ of secondary users contending for channel i is known, they move immediately to the next phase. The CSMA/CA protocol with the RTS/CTS mechanism is applied as follows: each node starts by sensing the channel. If it is idle, the node sends an RTS packet over the channel. When the receiving station detects the RTS, it responds with a CTS after a SIFS. If the channel is busy, the station keeps checking the channel until it becomes idle. Each node thus has a transmission probability determined by the back-off stage and the collision probability. When a collision occurs, all nodes back off, wait for a random time and retry. The negotiation time can be expressed as follows:

$$T_N = T_{RTS} + T_{SIFS} + \left( \sum_{i=0}^{m} q_i \left( b_i + T_{DIFS} + T_{RTS} + T_{SIFS} \right) \right) + T_{CTS} + T_{SIFS}, \tag{3}$$
where $q_i$ denotes the probability of collision on channel i, $b_i = 2^i \times W_{min} \times T_{ms}$ is the back-off time at stage i, $W_{min}$ is the minimum back-off window, and m is the maximum back-off stage. $T_{RTS}$ and $T_{CTS}$ denote the times needed to send the RTS and CTS packets, respectively. $T_{DIFS}$ and $T_{SIFS}$ are the distributed and short inter-frame space intervals, respectively. Following the reasoning of Bianchi's model, the probability that a given station transmits on channel i after a collision is

$$\tau_i = \frac{2(1 - 2q_i)}{(1 - 2q_i)(W + 1) + q_i W \left(1 - (2q_i)^m\right)}, \tag{4}$$

where the probability that a transmitted packet collides with at least one of the $u_i - 1$ remaining nodes is

$$q_i = 1 - (1 - \tau_i)^{u_i - 1}. \tag{5}$$
Let $p^N_i$ be the probability that a secondary user wins the contention for a given channel, i.e., the probability that at least one station transmits and that a successful transmission occurs on the channel. This probability is

$$p^N_i = \frac{u_i \tau_i (1 - \tau_i)^{u_i - 1}}{1 - (1 - \tau_i)^{u_i}}. \tag{6}$$
In particular, the resource reservation model in [9] has shown that system performance can be improved effectively by allowing the winner of the negotiation phase to continue its data transmission in the next time slot if no primary user comes back to the channel. We therefore assume that, after transmitting during the remaining time of the ongoing time slot in which channel i was sensed idle, a secondary user can carry on with its data transmission until the channel becomes busy. The probability that a secondary user transmits its data packets on a given channel in a given time slot is

$$p_i = (1 - z_i)(1 - q_i)\, p^N_i. \tag{7}$$
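Equations (4) and (5) define $\tau_i$ and $q_i$ in terms of each other, so they have to be solved jointly before (6) and (7) can be evaluated. The following is a minimal numerical sketch, not the authors' implementation: it assumes a plain bisection on $\tau$, rewrites (4) in the algebraically equivalent form $2 / (1 + W + qW\sum_{k=0}^{m-1}(2q)^k)$ to avoid the 0/0 case at $q = 1/2$, and uses hypothetical transition probabilities `lam` and `mu`. All function names are illustrative.

```python
def occupancy(lam: float, mu: float) -> float:
    """Channel occupancy z = mu / (lam + mu), Eq. (1)."""
    return mu / (lam + mu)


def tau_of_q(q: float, W: int, m: int) -> float:
    """Transmission probability of Eq. (4), in an equivalent form.

    Dividing numerator and denominator of (4) by (1 - 2q) gives
    2 / (1 + W + q*W*sum_{k=0}^{m-1} (2q)^k), which is defined at q = 1/2.
    """
    return 2.0 / (1.0 + W + q * W * sum((2.0 * q) ** k for k in range(m)))


def q_of_tau(tau: float, u: int) -> float:
    """Collision probability of Eq. (5) for u contending users."""
    return 1.0 - (1.0 - tau) ** (u - 1)


def solve_contention(u: int, W: int, m: int, tol: float = 1e-12):
    """Solve (4)-(5) jointly by bisection on tau (an assumed solver).

    h(tau) = tau - tau_of_q(q_of_tau(tau)) is negative at tau = 0,
    positive at tau = 1 and increasing, so it has a single root.
    """
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if mid - tau_of_q(q_of_tau(mid, u), W, m) < 0.0:
            lo = mid
        else:
            hi = mid
    tau = 0.5 * (lo + hi)
    return tau, q_of_tau(tau, u)


def win_probability(tau: float, u: int) -> float:
    """Probability of winning the contention, Eq. (6)."""
    return u * tau * (1.0 - tau) ** (u - 1) / (1.0 - (1.0 - tau) ** u)


def transmit_probability(z: float, q: float, p_win: float) -> float:
    """Probability of transmitting in a given slot, Eq. (7)."""
    return (1.0 - z) * (1.0 - q) * p_win


if __name__ == "__main__":
    # W = 32 and m = 3 as in Table I; lam and mu are hypothetical values
    # chosen so that the occupancy is close to z = 0.3.
    z = occupancy(lam=0.7, mu=0.3)
    for u in (7, 13):
        tau, q = solve_contention(u, W=32, m=3)
        p = transmit_probability(z, q, win_probability(tau, u))
        print(f"u={u}: tau={tau:.4f}, q={q:.4f}, p={p:.4f}")
```

Bisection is used here only because the fixed point is one-dimensional and the residual is monotone; any root finder would serve.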
The average transmission delay for a secondary user whose data packets require y time slots on channel i is [9]:

$$D_i = \alpha_i \left( T_S + T_{R_i} + T_{N_i} + T_{Data_i} \right) + \sum_{j=1}^{y} \beta_i^{j} \left( T_S + T_{Data_i} \right), \tag{8}$$
where $\alpha_i$ denotes the average number of attempts needed to reach the first successful transmission in the ordinary process on channel i (sense, report, negotiate, and transmit for the remaining time), calculated as $\alpha_i = 1/p_i$, and $\beta_i$ denotes the average number of attempts needed to reach a successful transmission in the following time slots (sense and transmit), $\beta_i = 1/[(1 - z_i)(1 - q_i)]$.

B. Energy Consumption

The main purpose of the proposed green resource allocation approach is to lead secondary users to use their energy efficiently by consuming as little as possible. A secondary user goes through the four phases of the process to transmit its packets. Each phase has a required power and a duration that determine its energy consumption. We assume that all nodes have the same initial battery capacity. The power consumed by a node belongs to one of these four classes:

• power for sensing data channels $P_s$,
• power for transmitting data packets $P_{tx}$,
• power for receiving data packets $P_{rx}$,
• power for waiting for the next opportunity $P_w$.
The energy consumed by each secondary user activity is obtained as follows:

• energy consumed by channel sensing:

$$E_S = P_s T_S = P_s T_{ms},$$

• energy consumed by reporting:

$$E_R = P_{tx} T_{beacon} + \sum_{i=1}^{u_i - 1} P_{rx} T_{beacon},$$

• energy consumed by negotiation:

$$E_N = P_{tx} T_{RTS} + P_w T_{SIFS} + \left( \sum_{i=1}^{m} q \left( b_i + P_w T_{DIFS} + P_{tx} T_{RTS} + P_w T_{SIFS} \right) \right) + P_{rx} T_{CTS} + P_w T_{SIFS},$$

• energy consumed by data transmission:

$$E_{Tr} = P_{tx} T_{Data}.$$

Hence, the total energy consumption of a secondary user with data packets to transmit over y time slots on data channel i is

$$E_i = \alpha_i \left( E_S + E_{R_i} + E_{N_i} + E_{Tr_i} \right) + \sum_{j=1}^{y} \beta_i^{j} \left( E_S + E_{Tr_i} \right). \tag{9}$$

III. MINORITY GAME FORMULATION

As mentioned above, after the negotiation phase, only one secondary user among the $u_i$ users contending for data channel i transmits in the data transmission phase. Thus, the $u_i - 1$ secondary users that do not win the contention for channel i lose the energy spent during the previous phases (sensing, reporting, and negotiation). There is therefore a trade-off between attempting to send data packets and conserving battery energy. We consider a traditional MG, i.e., the capacity level is $\phi = 1/2$. For the rest of this paper we assume that the C data channels are symmetric, i.e., all channels have the same number of users, the same QoS, and the same primary user utilization, so that $u_i = u = M/C$, $i = 1, \ldots, C$. At the beginning of a time slot, each player j among the u players (u an odd number) has two strategies: either to sense the channel, 's', or not to sense it, 'n', i.e., $a_j \in \{s, n\}$, $j = 1, \ldots, u$. Let $u_s$ (resp. $u_n$) denote the number of players that select strategy 's' (resp. 'n') for a given channel, with $u = u_s + u_n$. The comfort level of this traditional MG is therefore such that $(u_s, u_n) = (\psi, u - \psi)$, with $\psi = \lfloor u\phi \rfloor$.

The payoff function of our game, i.e., the function that represents the loss or win fed back to a target player, depends on the two possible actions a = s, n. If a target player senses the channel, a = s, its payoff is the difference between the corresponding reward and the energy spent. The reward typically reflects the benefit a player draws from the whole process in a given game stage. In our case, the reward $r(s, u_s)$ equals the time remaining to transmit the data packets, $r = T_{Data}$. To transmit data packets, a user obviously consumes part of its energy; we denote the corresponding term, for a given game stage and a target player, by $g(s, u_s)$, which equals $B - E$, where B is the initial battery capacity and E is the energy consumed, given in (9). If the player chooses not to sense the channel, a = n, there is no loss of energy, but the player loses the chance to transmit its data packets. Its payoff is then represented by a constant regret:

$$f(a, u_a) = \begin{cases} p \cdot r(a, u_a) - g(a, u_a) & \text{if } a = s, \\ -\eta & \text{if } a = n, \end{cases} \tag{10}$$

where p is the probability that the target player transmits on a given channel, given in (7), and $\eta$ is a positive number. Thus, a player who chooses not to sense the channel, a = n, receives a payoff equal to $-\eta$ regardless of whether it is in the majority or in the minority. If the played action is a = s and the player belongs to the minority group, it receives a positive payoff $f^+$; otherwise it receives a negative payoff $f^-$, with the assumption that $-\eta > f^-$.
Lemma 3.1: $f(a, u_a)$ is a decreasing concave function of $u_a$, inasmuch as p(u) is a decreasing concave function of u. Moreover, $f(a, u_a)$ is continuous in $u_a$, and the strategy space of each player is compact, convex, and nonempty. The traditional Minority Game considered here is therefore an n-person concave game.
So we have the following result, a detailed proof of which is given in [10].

Theorem 3.2 (Rosen (1965), existence of Nash equilibrium): Every n-person concave game admits a Nash equilibrium solution.

IV. NASH EQUILIBRIUM ANALYSIS
In this section we discuss both the pure strategy and the mixed strategy Nash equilibria. In one time step, each player decides how to win more and lose less, i.e., how to end up on the minority side. A Nash Equilibrium is a set of strategies from which no player can benefit by unilaterally changing its strategy.

A. Pure Strategy

Definition: A Nash Equilibrium in pure strategies must satisfy the following two conditions:

$$f(n, \psi) \ge f(s, \psi + 1), \tag{11}$$

$$f(n, \psi - 1) \le f(s, \psi). \tag{12}$$
By the definition above, no player can do better by unilaterally deviating from the equilibrium.

Proposition 4.1: The pure strategy Nash Equilibria are the profiles in which exactly $\psi$ secondary users choose to sense the channel, i.e., $(u_s, u_n) = (\psi, u - \psi)$.

Proof: By contradiction. Suppose that $u_s > \psi$; then $f(n, u_s) = -\eta \ge f^- = f(s, u_s + 1)$, so the first condition (11) is satisfied, but $f(n, u_s - 1) = -\eta > f^- = f(s, u_s)$, so (12) fails. Suppose that $u_s < \psi$; then $f(n, u_s - 1) = -\eta \le f(s, u_s) = f^+$, so (12) holds, but $f(n, u_s) = -\eta < f(s, u_s + 1) = f^+$, so (11) fails. Finally, $u_s = \psi$ satisfies both conditions: $f(n, \psi) = -\eta \ge f(s, \psi + 1) = f^-$, so (11) is satisfied, and $f(n, \psi - 1) = -\eta \le f(s, \psi) = f^+$, so (12) is satisfied too.
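Proposition 4.1 can be checked by brute force for small u. The sketch below enumerates every pure profile and tests all unilateral deviations. It assumes $\psi = \lfloor u/2 \rfloor$ and a step payoff for sensing ($f^+$ when at most $\psi$ players sense, $f^-$ otherwise); the numeric values of `F_PLUS`, `F_MINUS` and `ETA` are hypothetical and only need to satisfy $f^- < -\eta < f^+$.

```python
from itertools import product

# Hypothetical payoff levels; only the ordering F_MINUS < -ETA < F_PLUS matters.
F_PLUS, F_MINUS, ETA = 1.0, -5.0, 2.5


def payoff(action: str, u_s: int, psi: int) -> float:
    """Payoff of one player when u_s players (including itself) sense."""
    if action == "n":
        return -ETA                      # constant regret for not sensing
    return F_PLUS if u_s <= psi else F_MINUS


def is_pure_ne(profile, psi: int) -> bool:
    """True if no player gains by unilaterally switching its action."""
    u_s = profile.count("s")
    for action in profile:
        if action == "s":
            current = payoff("s", u_s, psi)
            deviate = payoff("n", u_s - 1, psi)
        else:
            current = payoff("n", u_s, psi)
            deviate = payoff("s", u_s + 1, psi)
        if deviate > current:
            return False
    return True


def pure_equilibria(u: int):
    """All pure-strategy NE profiles of the u-player game."""
    psi = u // 2                         # comfort level for capacity 1/2
    return [p for p in product("sn", repeat=u) if is_pure_ne(p, psi)]


if __name__ == "__main__":
    for profile in pure_equilibria(3):
        print(profile)
```

For u = 3 the enumeration returns the profiles with exactly one sensing player, in agreement with the proposition.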
Example: NE for a game with three players. Columns give the action of player 1, rows the action of player 2, and each table corresponds to one action of player 3; entries are the payoffs of players (1, 2, 3), and Nash equilibria are marked with *.

$$\text{Player 3 plays } s: \quad \begin{array}{c|cc} & s & n \\ \hline s & (f^-, f^-, f^-) & (-\eta, f^-, f^-) \\ n & (f^-, -\eta, f^-) & (-\eta, -\eta, f^+)^* \end{array}$$

$$\text{Player 3 plays } n: \quad \begin{array}{c|cc} & s & n \\ \hline s & (f^-, f^-, -\eta) & (-\eta, f^+, -\eta)^* \\ n & (f^+, -\eta, -\eta)^* & (-\eta, -\eta, -\eta) \end{array}$$

(s, n, n), (n, s, n), and (n, n, s) are the three possible Nash equilibrium solutions for u = 3. We emphasize that there are exactly u asymmetric pure-strategy Nash Equilibria for the minority game.

[Fig. 1: Pure strategy NE convergence: different initial points. (a) x0 = (0.2 0.2 0.3 0.8 0.7); (b) x0 = (0.7 0.9 0.5 0.1 0.5). Each panel plots the probability of playing channel sensing, x1,t(a=s) and x2,t(a=s), against the number of iterations.]

B. Mixed Strategy

In a mixed strategy, each player has a probability distribution over the two possible actions, i.e., player i plays a = s with probability $x_i$ and a = n with probability $1 - x_i$. We consider fully mixed strategies, in which every action is selected with probability greater than 0. We denote by $x = (x_1, x_2, \ldots, x_u)$, $0 < x_i < 1\ \forall i$, the mixed strategy profile of our game.

Definition: A fully mixed strategy Nash Equilibrium specifies a fully mixed strategy $x_i^* \in\, ]0, 1[$ for each player i (where $i = 1, \ldots, u$) such that

$$f_i(x_i^*, x_{-i}^*) \ge f_i(x_i, x_{-i}^*)$$

for every fully mixed strategy $x_i \in\, ]0, 1[$, where $x_{-i} = (x_1, \ldots, x_{i-1}, x_{i+1}, \ldots, x_u)$.

In the equilibrium state, all players choose strategies that maximize their expected payoff.

Proposition 4.2: There exists a unique fully mixed Nash Equilibrium $x^*$, which is the solution to

$$\frac{\partial f(x^*)}{\partial x^*} = \sum_{l=1}^{u} \binom{u-1}{l-1} (x^*)^{l-1} (1 - x^*)^{u-l} \left[ f(s, l) + \eta \right] = 0. \tag{13}$$

The utility of player i under a strategy profile $(\tilde{x}_i, x_{-i})$ is

$$\begin{aligned} f_i(\tilde{x}_i, x_{-i}) = {} & \tilde{x}_i \sum_{l=1}^{u} \binom{u-1}{l-1} x_{-i}^{\,l-1} (1 - x_{-i})^{u-l} f(s, l) && (14) \\ & + (1 - \tilde{x}_i)(-\eta). && (15) \end{aligned}$$

[Fig. 2: Fully mixed strategy NE convergence. (a) Different initial points x0; (b) u = 3, 21; (c) Different regret values −η, u = 3; (d) −η = −10, −5, 0, u = 5. Panels (a), (b) and (d) plot x1,t against the number of iterations; panel (c) plots x1,t against the regret −η.]
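Condition (13) is a scalar equation in $x^*$ and can be solved numerically. The sketch below is illustrative only: it assumes the same step payoff as in the pure-strategy analysis ($f(s, l) = f^+$ for $l \le \psi$ and $f^-$ otherwise, with $\psi = \lfloor u/2 \rfloor$), hypothetical payoff values with $f^- < -\eta < f^+$, and a bisection solver. Under these assumptions the left-hand side of (13) is positive near x = 0, negative near x = 1 and decreasing, so the root is unique.

```python
from math import comb


def f_sense(l: int, psi: int, f_plus: float, f_minus: float) -> float:
    """Assumed step payoff f(s, l) when l players sense the channel."""
    return f_plus if l <= psi else f_minus


def indifference(x: float, u: int, eta: float,
                 f_plus: float, f_minus: float) -> float:
    """Left-hand side of Eq. (13) for a symmetric sensing probability x."""
    psi = u // 2
    return sum(
        comb(u - 1, l - 1) * x ** (l - 1) * (1.0 - x) ** (u - l)
        * (f_sense(l, psi, f_plus, f_minus) + eta)
        for l in range(1, u + 1)
    )


def mixed_ne(u: int, eta: float, f_plus: float, f_minus: float,
             tol: float = 1e-12) -> float:
    """Root of Eq. (13) in (0, 1) by bisection (requires f_minus < -eta < f_plus)."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if indifference(mid, u, eta, f_plus, f_minus) > 0.0:
            lo = mid        # sensing still pays more than the regret
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    # Hypothetical payoff levels; eta = 2.5 matches the value used in the figures.
    for u in (3, 5, 21):
        print(u, round(mixed_ne(u, eta=2.5, f_plus=1.0, f_minus=-5.0), 4))
```

For u = 3 the equation reduces to $(1 - x)^2 = -(f^- + \eta)/(f^+ - f^-)$, which gives a closed form to compare the solver against.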
V. DISTRIBUTED LEARNING NE SOLUTIONS
The distributed learning algorithm involves repeated plays of a static "one-shot" game. Players rely on local observations to adaptively adjust their strategies. Each player changes its previously selected strategy with some probability if it learns that another strategy provides a higher payoff. The decision to switch strategy is made via a learning rule, which drives convergence towards the equilibrium point. For the pure strategy Nash equilibrium we opted for the Linear Reward Inaction (LRI) algorithm [12], and for the mixed strategy Nash equilibrium for the gradient descent algorithm.

A. Linear reward inaction (LRI) algorithm

This algorithm converges to one of the pure strategy coordination equilibria when the initial population distributions are asymmetric. Initially, each secondary user i chooses an action (s or n) according to the distribution x(0). In each round, a secondary user decides on the action that maximizes its payoff through the LRI updating rule. At time t, player i chooses action $a_{i,t}$. The probability of choosing action $a_i$ is then updated as

$$x_{i,t+1}(a_i) = x_{i,t}(a_i) + \rho f_{i,t}(a_i) \left( 1_{\{a_{i,t} = a_i\}} - x_{i,t}(a_i) \right), \tag{16}$$

where $0 < \rho < 1$ is the step-size parameter and $1_{\{\cdot\}}$ denotes the indicator function.

Figure 1 plots the probability of choosing the 'sensing' action in pure strategy while learning with the LRI algorithm for u = 5 and −η = −2.5. Figures 1a and 1b show that the players' strategies converge to the NE in remarkably few iterations (fewer than 60), and also highlight the impact of the initial distribution on the convergence time of the algorithm. We take two arbitrary initial points; for x0 = (0.7 0.9 0.5 0.1 0.5) the algorithm converges slightly more slowly than for x0 = (0.2 0.2 0.3 0.8 0.7), because of how close each starting point is to the optimal solution. The two curves in each plot show the probability that the minority group (blue curve) and the majority group (pink curve) play the action a = s. We adopt a constant step size, ρ = 0.01, rather than a decreasing step size, so as not to affect the convergence of the learning process and its accuracy.

TABLE I: Parameters used in numerical simulations

Parameter    Value
C            10
T            3000 µs
Tms          9 µs
RTS          352 bits
CTS          304 bits
SIFS         15 µs
DIFS         34 µs
W, m         32, 3
z            0.3
Ptx          0.5 W
Prx          0.3 W
Pw           0.05 W
Ps           0.2 W
Tbeacon      292 µs

[Fig. 3: Network performance. (a) The average energy consumption vs y, u; (b) The average transmission delay vs y, u. Panel (a) plots the average energy consumption (Wh) and panel (b) the average transmission delay (µs) against the number of time slots, for u = 7 and u = 13, with and without MG; panel (b) also marks DMax.]

B. Gradient descent algorithm

At each step, players update their beliefs by choosing the best response to the new action profile, computed as a gradient response. The algorithm starts from the distribution x(0). At time t, player i chooses action $a_{i,t}$ and updates its strategy:

$$x_{i,t+1}(a_i) = x_{i,t} - \rho \nabla_{x_{i,t}} f_{i,t}, \tag{17}$$
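The LRI rule (16) can be sketched for the two-action case as follows. This is an illustration under stated assumptions rather than the simulation code behind Figure 1: payoffs are rescaled to [0, 1] before being used as the reinforcement signal (an assumption made here so that the updated probabilities stay valid), the payoff levels are hypothetical, and only the sensing probability $x_i(s)$ is stored, with $x_i(n) = 1 - x_i(s)$.

```python
import random


def lri_update(x_s: float, action: str, reward: float, rho: float) -> float:
    """One step of rule (16) for the sensing probability x_s.

    If 's' was played, x(s) grows by rho*reward*(1 - x(s)); if 'n' was
    played, x(n) grows by the same rule, i.e. x(s) shrinks by rho*reward*x(s).
    """
    if action == "s":
        return x_s + rho * reward * (1.0 - x_s)
    return x_s - rho * reward * x_s


def normalised_payoff(action: str, u_s: int, psi: int,
                      f_plus: float, f_minus: float, eta: float) -> float:
    """Step payoff rescaled to [0, 1] (rescaling is an assumption of this sketch)."""
    raw = -eta if action == "n" else (f_plus if u_s <= psi else f_minus)
    return (raw - f_minus) / (f_plus - f_minus)


def run_lri(x0, rho: float = 0.01, rounds: int = 2000,
            f_plus: float = 1.0, f_minus: float = -5.0,
            eta: float = 2.5, seed: int = 0):
    """Repeated play in which every player applies the LRI update each round."""
    rng = random.Random(seed)
    x = list(x0)
    psi = len(x) // 2
    for _ in range(rounds):
        actions = ["s" if rng.random() < xi else "n" for xi in x]
        u_s = actions.count("s")
        x = [
            lri_update(xi, a,
                       normalised_payoff(a, u_s, psi, f_plus, f_minus, eta),
                       rho)
            for xi, a in zip(x, actions)
        ]
    return x


if __name__ == "__main__":
    # Initial point taken from Figure 1a.
    print([round(v, 3) for v in run_lri([0.2, 0.2, 0.3, 0.8, 0.7])])
```

With rewards in [0, 1] and a step size below 1, each update is a convex combination of the current probability and 0 or 1, so the probabilities never leave the unit interval.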
Here, $f_i$ is a continuous and differentiable function. Figure 2 illustrates the convergence to the fully mixed NE using the gradient descent algorithm with a constant step size ρ = 0.01. Figure 2a shows the probabilities $x_{1,t}$ with which a player on the minority side chooses the two actions, for different initial points. The player converges rapidly to the equilibrium, and the closer the starting point is to the equilibrium state, the faster the convergence to the NE. Note that even though the starting points differ, the algorithm converges to the same NE. In Figure 2b we investigate the convergence to the NE for two different values of u and a fixed regret value −η = −2.5. The number of secondary users competing for the channel affects the speed of convergence to the equilibrium. For u = 3, secondary users reach the equilibrium state in only 70 iterations, while the algorithm requires some 450 iterations for u = 21. We expect that the larger the number of users contending for a given channel, the slower the convergence to the equilibrium. In our model we assume that η ≥ 0. In Figure 2c we plot the probability $x_{1,t}$ as a function of the regret −η, which takes negative and positive values, for u = 3, to highlight the effect of this parameter on the player's action. The player tends to choose the action a = n when this parameter is larger, and the probability of playing a = s decreases even if the player is on the minority side. This is due to the trade-off between conserving battery energy and decreasing the transmission delay. When the regret is positive, a player choosing a = n receives a positive reward and
conserves its energy, which seems useful for reducing channel competition among secondary users in crowded networks. The regret value also affects the convergence points, as shown in Figure 2d.

VI. NUMERICAL RESULTS
In this section, we evaluate the energy consumption and the performance of the cognitive network under the MG approach. The performance metric studied is the transmission delay. The proposed minority game approach allows each secondary user to decide at the beginning of the time slot whether to sense the channel or to remain inactive. Each secondary user thereby weighs energy consumption against performance improvement. Furthermore, secondary users seeking to use a target data channel end up cooperating with each other with no intention of doing so. Indeed, the (non-cooperative) behavior of the secondary users improves the overall system performance and also ensures efficient battery energy management, because it reduces the number of active users in the system to the equilibrium fraction. The parameter values are given in Table I. For simplicity, we consider u = 7, 13, 5, and compare the system performance with and without the MG approach. The simulation results are depicted in Fig. 3. Figure 3a shows the average energy consumption as a function of the number of time slots y required to complete the data transmission and of the number u of secondary users contending for the channel. The average energy consumption decreases as y and u increase for the two approaches. The decrease is considerably less rapid for the MG approach, and secondary users can significantly preserve their battery energy thanks to the energy awareness of our model. Nevertheless, the average transmission delay is larger under MG, as depicted in
Figure 3b, because of the time slots in which some of the SUs decide to remain inactive. This does not undermine the relevance of our approach. Suppose that the application layer of the target secondary user specifies a maximum transmission delay DMax = 8 s; then for y = 40 and u = 13 there is a difference of almost 3.5 µs between the optimal formulation and the MG formulation. For the same values of y and u, the difference in energy consumption between the two approaches is substantial, and thus the proposed MG approach achieves a good trade-off between energy consumption and transmission delay.

VII. CONCLUSIONS
In this paper, we proposed a green opportunistic channel access scheme for cognitive networks based on the minority game to improve battery life and system performance. We emphasized the impact of the number of users involved in the sensing phase on the system as a whole. We designed the payoff mechanism and studied the Nash equilibrium solutions for both pure and fully mixed strategies. We also used learning algorithms with a constant step size to converge to the NE. Finally, we showed through simulations that the cognitive users' battery life and the system performance improve with the MG-based opportunistic channel access scheme.

REFERENCES

[1] J. Mitola and G.Q. Maguire, "Cognitive radio: making software radio more personal," IEEE Personal Comm. Mag., vol. 6, no. 4, pp. 13–18, 1999.
[2] Moshe Masonta, Yoram Haddad, Luca De Nardis, Adrian Kliks, and Oliver Holland, "Energy Efficiency in Future Wireless Networks: Cognitive Radio Standardization Requirements," 2012.
[3] Sana Ziafat, Waleed Ejaz, and HabibUllah Jamal, "Spectrum Sensing Techniques for Cognitive Radio Networks: Performance Analysis," IEEE MTT-S, 2011.
[4] Waleed Ejaz, Najam ul Hasan, Seok Lee, and Hyung Seok Kim, "I3S: Intelligent spectrum sensing scheme for cognitive radio networks," EURASIP Journal on Wireless Communications and Networking, 2013.
[5] Sina Maleki, Ashish Pandharipande, and Geert Leus, "Energy-Efficient Distributed Spectrum Sensing for Cognitive Sensor Networks," IEEE SENSORS, 2011.
[6] Sebastien Turban, "Game theory," Columbia University, 2011.
[7] Habib B.A. Sidi, Wissam Chahin, Rachid El-Azouzi, and Francesco De Pellegrini, "Energy efficient Minority Game for Delay Tolerant Networks," arXiv:1207.6760v1, 2012.
[8] Petri Mähönen and Marina Petrova, "Minority game for cognitive radios: Cooperating without cooperation," Physical Communication, 2008.
[9] Mouna El machkour, Abdellatif Kobbane, Essaid Sabir, and Mohammed El koutbi, "New insights from a delay analysis for cognitive radio networks with and without reservation," IWCMC, 2012.
[10] J.B. Rosen, "Existence and uniqueness of equilibrium point for concave n-person games," Econometrica, 1965.
[11] E. Sabir, T. Hamidou, and M. Haddad, "Joint strategic spectrum sensing and opportunistic access for cognitive radio networks," in IEEE GLOBECOM, December 2012.
[12] M.A.L. Thathachar, P.S. Sastry, and V.V. Phansalkar, "Decentralized learning of Nash equilibria in multiperson stochastic games with incomplete information," IEEE Transactions on Systems, Man, and Cybernetics, 24(5), 1994.