Repeated Game-Inspired Spectrum Sharing for Clustering Cognitive Ad Hoc Networks

The paper studies the cooperative spectrum sharing among multiple secondary users (SUs) in a clustering cognitive ad hoc network. The problem is formulated as a repeated game with the aim of maximizing the total transmission rate of SUs. Firstly, a clustering formation procedure is proposed to reduce the overhead and delay of game process in cognitive radio network (CRN). Then the repeated game-inspired model for SUs is introduced. With the model, the convergence condition of the proposed spectrum-sharing algorithm is conducted, and the convergence performance is investigated by considering the effects of three key factors: transmission power, discount factor, and convergence coefficient. Furthermore, the fairness of spectrum sharing is analyzed, and numerical results show a significant performance improvement of the proposed strategy when compared to other similar spectrum-sharing algorithms.


Introduction
The scarcity of spectrum has become a major bottleneck of the development of next generation wireless communication system. Cognitive radio (CR), which allows unlicensed or secondary users (SUs) to share the spectrum with licensed or primary users (PUs), shows great promise to enhance the spectrum utilization efficiency [1,2]. The CR technology enables the SUs to opportunistically access the available spectrum bands through four main functionalities: spectrum sensing, spectrum managing, spectrum mobility, and spectrum sharing [3]. Among these, spectrum sharing is one of the most important functions in cognitive radio, which allows SUs to share the available spectrum bands among the coexisting PUs [4]. It is essential for improving spectrum utilization.
There exist some research efforts on the problem of spectrum sharing in CR. Among these studies, a centralized spectrum management scheme was proposed in [4]. It greatly improves the system performance over the (iterative waterfilling) IWF scheme by utilizing a centralized spectrum management center (SMC). However, due to the heterogeneous and dynamic nature of cognitive radio, centralized approach is not practical. Instead, in some studies, the distributed approach which does not need any central controller is suggested [5]; in [5], asynchronous distributed pricing scheme is proposed, based on the signal exchange via coordination between users to compensate the ascendant interference level. Distributed approach provides the better adaptation capability to CR in the dynamically changing heterogeneous environment, but the coordination among SUs results in significant amount of coordination delay. In [6], a novel distance-dependent MAC protocol for CRN is proposed, which attempts to maximize the CRN throughput. Regretfully, these protocols do not consider the fairness of the spectrum sharing.
Game theory, which analyzes the conflict and cooperation among decision makers (users), has widely used in designing efficient spectrum sharing. A dynamic game model is presented in [7], in which the SUs can iteratively adapt their strategies in terms of requested spectrum size. The stability condition of the dynamic behavior for the spectrum-sharing scheme is investigated. In [8,9], the authors investigate whether spectrum efficiency and fairness can be obtained by modeling the spectrum sharing as a repeated game. In [10], the authors model the channel assignment and power  control problems as a noncooperative game, in which all wireless users jointly pick an optimal channel and power level to minimize a joint cost function. A no-regret learning algorithm using the correlated equilibrium concept to coordinate the secondary spectrum access is considered in [11]. In [12], a self-enforcing truth-telling game mechanism is used to suppress cheating and collusion behavior of selfish users for spectrum sharing. It is shown that the SUs can get the highest rate utility only by announcing their true private information under the assumption that all the SUs have the same maximal transmission power. Based on the Nash bargaining in cooperative game, an improved utility function is proposed in [13] to maximize the profit product of all the SUs. Consider that the CRN is characterized by a lack of centralized control and the restriction that global information is not available, which requires that the game algorithms for spectrum sharing should be completely distributed relying on local information. It motivate us to employ local interaction games [14], which have been recently introduced in CRN research known as graphical games in [15]. In [16], two cases of local interaction game are proposed to cope with the lack of centralized control and local influences. The first is local altruistic game, in which each user considers the payoffs of itself as well as its neighbors rather than considering itself only. The second is local congestion game, in which each user minimizes the number of competing neighbors. It is shown that, with the local games, global optimization is achieved with local information. Although some progresses have been achieved in the above approaches, the problem of spectrum sharing more efficient and fair is not yet solved.
In this paper, a repeated game theoretic model is proposed, with the aim to maximize the total rate revenue of SUs in cognitive ad hoc network. Clustering is executed firstly in the model to avoid frequent collisions and to reduce the coordination delay between SUs. With the model, the convergence condition for the total revenue maximum is studied. The transmission power, discount factor, and convergence coefficient, which impact the convergence behavior, are explored. Besides, the fairness of spectrum sharing is investigated.
The rest of the paper is organized as follows. In Section 2, the system model is introduced. In Section 3, the total rate revenue of the spectrum-sharing algorithm, the convergence behavior and fairness of the algorithm are analyzed. In Section 4, the performances of the rate revenue, convergence, and fairness are simulated and evaluated. Finally, conclusions are drawn in Section 5.

System Model and Descriptions
Spectrum-sharing model based on repeated game theory is presented in Figure 1. In the scenario, N secondary users compete to access the available spectrum to transmit data. Note that the wireless channels are assumed to be quasistatic for each time slot, that is, channels remain unchanged within the time slot duration, but they vary from one slot to another one.
The model consists of two parts: cluster formation and spectrum sharing. With no clustering in a cognitive network, the collision happened as long as ( ≥ 2) SUs in the transmission range of each other have data to transmit simultaneously [17]. By introducing the idea of clustering in the network, ordinary (cluster-member) SU communicates with only cluster-head (CH) SU, and collisions between SUs can be decreased greatly. A combined weight metrics to elect the CHs is considered in the cluster formation process.
For the spectrum sharing, a static repeated game among SUs can be used to obtain the Pareto optimality in a clustering fashion by the "grim trigger" strategy. Definition 1. The Nash equilibrium is a set of strategies, one for each player, such that no player has incentive to unilaterally change her action. Players are in equilibrium if a change in strategies by any one of them would lead that player to earn less than if she remained with her current strategy. For games in which players randomize (mixed strategies), the expected, or average payoff (also termed as revenue, utility or outcome) must be at least as large as that obtained by any other strategy.

Definition 2.
The Pareto optimality is a measure of efficiency. An outcome of a game is Pareto optimal, if there is no other outcome that makes every player at least as well off and at least one player strictly better off. That is, a Pareto optimal outcome cannot be improved upon without hurting at least one player. Often, a Nash Equilibrium is not Pareto optimal implying that the players' payoffs can all be increased.
International Journal of Distributed Sensor Networks 3 In order to stimulate cooperation among selfish players (SUs) and achieve the Pareto optimality, the "grim trigger" strategy is adopted in our spectrum-sharing model. The "grim trigger" is a trigger strategy employed in a repeated game [18]. Initially, a player using grim trigger will cooperate, as soon as the opponent defects (thus satisfying the trigger condition), and the player using grim trigger will defect for the remainder of the iterated game. Since a single defect by the opponent triggers defection forever, grim trigger is the most strictly unforgiving of strategies in an iterated game.

Clustering Formation.
In this section, a clustering procedure that exploits combined weight metrics is proposed. The idea of using combined weight metrics, including the ideal degree, transmission power, and battery power, has been considered in the literature [19]. In this paper, in addition to the above weight metrics, the clustering stability and node type have also been considered as weight metrics to elect the cluster-heads (CHs). The clustering stability is used to retain the stability of the network topology, and node type helps to reflect the realistic characteristics of multiple types of nodes in cognitive ad hoc networks.
The network formed by the SUs and transmission links can be represented by an undirected graph = ( , ), where V represents the set of SUs and E represents the set of links . Clustering can be thought as a graph-partitioning problem with some constraints. Look for the set of vertices ⊆ ( ), such that where [ ] is the neighborhood of SU-. The set S is called a dominating set such that every vertex of G belongs to S or has a neighbor in S. The dominating set of the graph is the set of CHs. It might be possible that a node is physically nearer to a CH but is the member of another CH.
The following metrics are considered in our clustering procedure for a cognitive ad hoc network.
(i) Ideal Degree. Each CH can ideally support ideal (a predefined threshold) nodes to ensure that CHs are not overloaded, and the efficiency of the system is maintained at the expected level. If the CH tries to serve more nodes than it is capable of, the system efficiency decreases in the sense, because the nodes have to wait longer for their turn to get the share of the resource. A high system throughput can be achieved by limiting or optimizing the degree of each CH.
The degree difference from the ideal degree helps in efficient MAC functions and load balancing because it is always desirable for a CH to handle up to a certain number of nodes in its cluster.
The neighbors of each SU-(i.e., SUs within its transmission range) is defined as the degree of node , The degree difference Δ for every node is computed as where ideal is the number of nodes that a CH can handle ideally.
(ii) Transmission Power. It is known that more power is required to communicate to a larger distance. As the nodes move away from the CH, the communication may become difficult due mainly to signal attenuation with increasing distance. The usual attenuation in the signal strength is inversely proportional to some exponent of the distance, which is usually approximated to 4 in cellular networks, where the distance between mobiles and base stations is of the order of 2-3 miles. In ad hoc networks, the distances involved are rather small (approximately hundreds of meters). In this range, the attenuation can be assumed to be linear [20].
For every node, the sum of the distances, , with all its neighbors can be computed: (iii) Battery Power. The battery power can be efficiently used within certain transmission range; that is, it takes less power for a node to communicate with other nodes if they are within close distance to each other. A CH consumes more battery power than an ordinary node, since it has extra responsibilities to carry out for its members.
We consider a heterogeneous network with multiple initial energy levels. The cumulative time, , during which an SU-acts as a CH, implies how much battery power has been consumed.
(iv) Clustering Stability. In order to avoid frequent CH changes, it is desirable to elect a CH that does not move very quickly. The focus of the most existing literatures has mostly been on absolute mobility of the nodes without taking into consideration the relative mobility. Thus, the stability of the network is perturbed. Our clustering procedure achieves the network stability by considering the relative mobility of nodes.
The average distance for every node from its neighbor till current time T is calculate as where is the distance between node and at time . Let LS represent the link stability between and , expressed by The average of the link stability for every node , with all its neighbors [21], is given by The cluster stability can be obtained as 4 International Journal of Distributed Sensor Networks From (8), we can see that the CStab is somewhat like that of variance, which reflects the relative mobility of the nodes.
(v) Node Types. In many realistic ad hoc networks, multiple types of nodes do coexist [22]. For example, in a battlefield network, portable wireless devices are carried by soldiers, and more powerful and reliable communication devices are carried by vehicles, tanks, aircrafts, and satellites; these devices/nodes have different communication characteristics in terms of transmission power, data rate, processing capability, reliability, security level, and so forth.
In heterogeneous ad hoc networks, it would be more realistic to elect CHs for considering the different types of nodes. For simplicity, two types of nodes are considered in the network. One type of node has larger transmission range (power) and data rate and better processing capability and is more reliable and robust than the other types. Accordingly, the mapping values of the two types of nodes might be denoted by 1 and 2.
The combined weight for each SU-can be calculate by Subject to: where 1 , 2 , 3 , 4 , and 5 are the weighing factors for the corresponding metrics. The contribution of the individual metrics can be tuned by choosing the appropriate combination of the weighing factors [21]. The node with the smallest would be selected as the CH. All the neighbors of the chosen CH are no longer allowed to participate in the election procedure. The clustering procedure continues until the remaining nodes are selected as CHs or assigned to a cluster.
The first component in (9), Δ , contributing towards the combined metric helps in efficient MAC functioning because it is always desirable for a CH to handle up to a certain number of nodes in its cluster. The motivation of is mainly related to energy consumption. A CH is able to communicate better with its neighbors having closer distances from it within the transmission range. The third component, , is measured as the total (cumulative) time a node acts as a CH. As a heterogeneous network with multiple initial energy levels is considered, the power currently available at the node depends on the node's initial power, the actual network traffic, and the length of the links. The component of CStab is measured as the clustering stability, which is mainly related to the velocity and direction of the mobile nodes, especially the mobility relative to CHs. The nodes' association and dissociation to and from clusters perturb the stability of the network, and thus reconfiguration of CHs is unavoidable. It is desirable to elect a CH that does not move very quickly relative to its neighbors. The last component is related to the types of nodes. The more powerful, the more responsibility for communication nodes must be taken. As a result, it would be more appropriate for a node with more powerful capacity to be a CH. The proposed clustering strategy is demonstrated with the help of Figures 2(a)-2(d). All numeric values obtained from clustering process are given in Table 1. Figure 2(a) shows the initial configuration of the nodes (SUs) in a cognitive radio network, where an edge (link) between two nodes in the figure signifies that the nodes are neighbors of each other, and the length of a link represents the distance of two nodes. In the figure, the degree difference, Δ , of each node with ideal node degree ideal = 2 is computed. The arrows in Figure 2(b) represent the speed and direction of movement associated with every node. A longer arrow represents faster movement, and a shorter arrow indicates slower movement. Some arbitrary values for are chosen which represent the amount of time a node has acted as a CH. The values for are chosen randomly. If = 1, it implies that a node is more reliable and robust than the nodes whose  Figure 2(c) shows how a node with minimum is selected as the CH, where the pink solid nodes represent the CHs elected for the network. Figure 2(d) shows the initial clusters formed by execution of our clustering strategy and the achieved connectivity in the network, where a dashed ellipse is used to express a cluster. We can see that no two CHs are immediate neighbors, since all the neighbors of the chosen CH belong to the same cluster. The network connectivity is achieved through the higher power transmission range of CHs. Also, it can be noted that a single component graph is obtained in this case which means that there is a path from a node to any other node. For simplicity, the ideal node degree is set to 2 in this paper. Without loss of generality, the ideal node degree can be set as an arbitrary positive integer.

Repeated Game-Based Spectrum
Sharing. The SUs sharing the spectrum of licensed users (PUs) may lead to the following conflicting problems: (i) limitation on the transmission power in each channel for minimum interference to coexisting PUs and (ii) certain signal-to-noise ratio (SNR) required for data transmission of SUs without substantial performance degradation. Cooperation among SUs has been proved to be beneficial in solving such conflicting interests [23]. Such cooperation can be achieved with the application of the concepts of game theory. But, cooperative game faces scalability problem to be implemented in CR networks. In a large CR network, the overhead and delay of the game process will be unbearable, if all the SUs play a single game; on the other hand, it is not reasonable to let SUs that are set apart far way (thus has little direct mutual impacts) play the same game. So it is necessary and reasonable to group the cognitive radios network into multiple clusters firstly.
In order to model and analyze long-term interactions among players, the repeated game model is used where the game is played for multiple stages. A repeated game is a special form of an extensive-form game in which each stage is a repetition of the same strategic-form game. Particularly, the spectrum-sharing problem can be modeled as the outcome of a repeated game, in which the players are the SUs their strategies (actions) are the choice of spectrum resources.
In a repeated game, a normal-form game is mathematically defined as where V is the finite set of SUs and is the set of strategies associated with SU-. Define B = × ∈ as the strategy space and : B → R as the set of utility functions that the SUs associate with their strategies. For every SU-in game A, the utility function, , is a function of , the strategy selected by SU-, and of the current strategy profile of its opponents: − .
In analyzing the outcome, as the decision of one SU is influenced by the other SUs' decisions, we are interested to determine, if there exists a convergence point, that is, Nash equilibrium (NE), for the spectrum-sharing algorithm, from which no SU would deviate anymore. A strategy profile for the SUs, = [ 1 , 2 , . . . , ], is an NE if and only if If the equilibrium strategy profile in (12) is deterministic, a pure strategy NE exists. When there is more than one NE in the repeated game process, it is natural to ask whether there exists an optimal one, that is, Pareto optimality. In order to stimulate cooperation among selfish SUs and achieve the optimality, the "grim trigger" strategy is adopted. For the case of no deviation from cooperation in a repeated game, the utility function at every stage for SU-is unchangeable. The overall utility for SUin a repeated game is represented as the discounted sum of immediate utilities from each stage; that is, where (0 < < 1) is the discount factor which measures how much the SUs value the future utility over the current utility. The larger the value is, the more patient the SUs are. In general, is close to 1 for cooperative spectrum sharing in a repeated game. For finite K-( > 1) stages repeated game, (13) can be rewritten as Without loss of generality, consider a repeated game with two SUs (one is CH and the other is cluster-member) competing for the limited spectrum resources. If an SU senses the PU at the licensed spectrum, it moves to another spectrum hole or stays in the same band without interfering with the PU by adapting its communication parameters such as transmission power or modulation scheme. For this paper, the total transmission power is constrained for SUs to make them stay in the same band during the spectrum sharing. Figure 3 illustrates the utility region of a repeated game with two SUs for the Gaussian interference channel.

Performance Evaluation
The cognitive ad hoc network considered consists of multiple SUs is illustrated in Figure 1. In the figure, TDMA scheme is suggested for every cluster; that is, the TDMA frame is divided into M time slots, and every cluster is assigned one unique spectrum-sharing slot in a frame. The mth cluster always transmits in slots assigned to it.
For a cluster containing only one SU, the SU might occupy the assigned slots for transmission. However, if two or more SUs exist in a cluster, the repeated game should be executed to compete for the assigned slot. Take SU-1 and SU-2 in Figure 1 as an example; they consist of a set of two transmittingreceiving (T-R) pairs with channel gain of 1 for each T-R pair. Assume that the sharing channels, width is normalized to 1, which is divided into two independent channels with width of 1/2 and with the external noise power of 0 . It is noticeable that the transmission powers for SU-1 and SU-2 are 1 and 2 , respectively, and the total power constraint P for two users is constant with respect to threshold interference power of interference temperature mechanism for CRs.

Rate Utilities and Total Rate Revenue.
When the two users transmit over the same channel, the interference is looked as the Gaussian noise, and the Gaussian interference game was defined in [24,25].
The interference measured at the receiver associated with transmitter is shown to be [26] where denotes the set of transmission power associated with user over the transmission channel and the interference gain with symmetric and identical channel conditions.
( , ) is the interference function characterizing the interference caused by SU-to SU-and is defined as 1 if transmitters and are transmitting over the same channel, 0 otherwise. (16) It is apparent that utilities of SUs are closely related to interferences. Moreover, the performance of the spectrumsharing algorithm depends significantly on the choice of the utility function which characterizes the preference of a user for a particular channel. The choice of a utility function is not unique. It must be selected to have physical meaning for the particular application and also to have appealing mathematical properties that guarantee equilibrium convergence for the game process. Our objective is to maximize the total rate revenue of SUs by cooperatively sharing the spectrum. Let 1 and 2 be the utility functions of SU-1 and SU-2 for each stage of the repeated game, respectively, according to Shannon theory, which can be obtained by where 1 and 2 are the transmitting rates available for SU-1 and SU-2 and 1 and 2 are the transmission power of the two SUs. By substituting (17) into (14), 1 ( ) and 2 ( ) are given by the following equations: where K is the number of stages for repeated game and 1 ( ) and 2 ( ) are the rate utilities of SU-1 and SU-2 for repeated game spectrum sharing. Then the total rate revenue is 3.2. Convergence Analysis. As mentioned, revenue stability is closely related to the number of stages. The definition is following: when the revenue differences Δ ( ) between kstages repeated game, and ( − 1)-stages repeated game is less than convergence coefficient ( ≥ 0) that is, Δ ( ) satisfies; International Journal of Distributed Sensor Networks It is believed that the spectrum sharing achieves convergence. According to (19), Δ ( ) is shown to be Then, the convergent condition of the algorithm is written as The convergent condition is satisfied, if and only if ≥ log ( 2 log 2 (1 + 1/ 0 + 1 2 / 2 0 ) ) + 1⋅ Let cvg denote the number of stages for convergence; that is, cvg = ⌊log ( 2 log 2 (1 + 1/ 0 + 1 2 / 2 0 ) ) + 1⌋ ⋅ (24)

Fairness Analysis and Improvements.
To maintain reliable communication, a certain transmitting rate threshold required for SUs is necessary. If one user transmits with high power while the other uses low transmission power (as point C or D in Figure 3), the normal communication is not well guaranteed.
How to improve the fairness of spectrum sharing is an urgent issue to be settled. It is assumed that 1 or 2 is smaller than the given rate threshold, min , at the beginning. The rate utilities 1 and 2 can be changed by adjusting the transmission powers. The power adjustment procedure for SU-i is executed as follows.
Let the fixed-step size of power adjustment, ΔP, be where ini is the initial transmission power of the user, min is the minimum power required correspondence with the rate threshold min for reliable communication, and m is the number of times for power adjustment. The transmission power for the kth game can be expressed by where −1 is the transmission power of the ( − 1)th game.

Simulation Results
It is assumed that the noise power 0 = 0.01 W and the total power constraint = 1 + 2 = 1W. Figure 4(a) shows the impact of the number of stages K on the total rate revenue with varying transmission powers for = 0.95. When the transmission powers for SU-1 and SU-2 are equal (i.e., ratio of transmission powers = 1 : 2 = 1), the total revenue reaches the maximum. It can be seen that, as expected, with the number of stages K increasing, the total revenue increases. It is noticeable that the total revenue changes slowly when K is more than 30. This is because the revenue is tending towards stable condition. Compared with the spectrum allocation algorithm in [13], it is shown clearly that our spectrum-sharing strategy outperforms the algorithm in [13], and the algorithm in [13] only performs well at high SNR.
The total rate revenue versus different number of stages with varying discount factors for = 1 is plotted in Figure 4(b). We can see that the total revenue increases with increasing. Figure 5(a) shows the impact of transmission power on the convergence rate when the noise power 0 = 0.01 W, discount factor = 0.95, and convergence coefficient = 0.1. The number of stages for convergence cvg increases when the difference between 1 and 2 is large enough (i.e., < 0.4). Nevertheless, with r increasing, the number of stages for convergence almost remains unchanged. In this case, the total revenue can reach the maximum (as shown in Figure 4). Figure 5(b) plots the convergence behavior with varying discount factor for noise power 0 = 0.01 W, ratio of transmission powers = 1, and convergence coefficient = 0.1. There is a significant increase in the number of stages for convergence cvg with increasing. This is because the larger the value is, the more patient the players are, and the convergence rate becomes more slowly. Specifically, cvg is an exponential function of when the value is more than 0.96.  It can be seen that the smaller value is, that is, more stringent convergence condition, the more slowly convergence rate is.
The rate utilities 1 and 2 for different transmission powers are presented in Figure 6. It is observed that when the transmission power difference between the two users is large at the beginning of a repeated game, that is, 1 or 2 is smaller than the given rate threshold (a predefined value), then the reliable communication is not to be guaranteed. Spectrum sharing is only one side for this case, and the fairness of SUs is nothing to speak of. However, with the transmission power difference decreasing, the fairness is improved.
The adaptation of rate utility due to the transmission power adjustment is shown in Figure 7. As expected, when the rate utility 1 is smaller than the given threshold, 1 can be changeable for meeting the requirement of transmitting rate with very little rate utility loss of 2 . Thus, the fairness of the algorithm is well guaranteed.
The total revenue comparison between the proposed algorithm with the method (LAG) in [16] is plotted in Figure 8. It can be observed that the total rate revenue increases with better fairness (i.e., larger ratio of the transmission powers, r). Especially, the total revenue is maximum when = 1. It is also noted from the figure that when the access probability is less than a value, that is, ≤ 0.7 the obtained total rate revenue of the proposed algorithm   outperforms the LAG algorithm. As the access probability increases, that is, > 0.7, there is an increasing revenue gap for LAG algorithm. However, as mentioned in [17], the collision happened as long as ( ≥ 2) SUs in the transmission range of each other has data to transmit simultaneously. That is, larger access probability is hardly guaranteed in CRN.

Conclusions
In this paper, the spectrum sharing is modeled as a repeated game under which cluster-member SU communicates with only cluster-head SU, and frequent collisions between SUs International Journal of Distributed Sensor Networks are avoided than that under no-cluster policy. The aim of this work is to maximize the total rate revenue of SUs under the repeated game model, while meeting the fairness requirements for transmitting data. The first step toward this work is to analyze convergence condition under which the total rate revenue of SUs is maximized. The analysis shows that the transmission powers, discount factor, and convergence coefficient affect the total rate revenue. The analysis continues by considering the fairness of spectrum sharing, and the fairness is improved by adjusting the transmission powers. Simulation results demonstrate that the proposed spectrumsharing algorithm can achieve better performance than the preexisting ones in terms of the total rate revenue and fairness of spectrum sharing.
For future research, the QoS (Quality-of-Service) requirement for SUs will be considered for spectrum sharing.