Bayesian analysis of spectrum occupancy prediction in cognitive radio

Abstract Efficient spectrum sensing is an important requirement for the success of a cognitive radio system. The presence of primary users over a specific band has to be monitored periodically in each time slot. Throughput of the system can be improved by sensing only those channels with a higher probability of being idle. In this paper, we suggest a prediction-based spectrum sensing scheme and propose two simple and fast approaches based on Bayesian inference to predict the probability of a busy/idle next state. Further analysis is done to study the impact of the various parameters associated with them. This channel prediction helps to select suitable channels for spectrum sensing from a rank list prepared according to the probability of each channel being idle. It is seen that channel ranking using the Bayesian approaches closely follows the actual ranking. The proposed approaches are compared, in terms of prediction performance and computational complexity, with approaches based on EWMA, HMM, and Neural Networks that are already available in the literature. Data from spectrum occupancy measurement are used to compare the performance of all the above methods. It is seen that the Bayesian approaches have low computational complexity and are hence faster. Their performance is also superior to that of the other methods, which establishes that Bayesian methods are potential candidates for spectrum prediction in a realistic scenario.


Introduction
Cognitive Radio (CR) is considered a solution to spectrum scarcity in the present communication scenario, and it will enable new wireless services to be introduced. CR technology has the potential to alleviate spectrum scarcity by increasing spectral efficiency, thereby allowing greater usage of this natural resource. Spectrum sensing is a very important activity that the CR uses to understand the spectrum occupancy of the primary user (PU). The biggest challenge in spectrum sensing is developing sensing techniques that are able to detect very weak PU signals while being sufficiently fast and inexpensive to implement [1]. Cooperative spectrum sensing is proposed in the literature as a solution to issues that arise in spectrum sensing due to noise, fading, shadowing, and hidden terminals [2]. Even though distributed sensing reduces errors in individual sensing, extra hardware is required by the secondary user (SU) to fully exploit the spectrum opportunity. Various methods are proposed in the literature to sense spectrum holes. A CR user develops a spectrum pool consisting of all the spectrum holes in a range of frequencies and chooses the optimum one for its future usage. Channel capacity can be increased using a proper spectrum sharing policy. CR users are supposed to operate within very small time slots for spectrum sensing and for communicating with other users. Spectrum sensing, spectrum decision, and spectrum sharing lead to considerable time delays. The more time these activities take, the less time is available for data communication, and the throughput of the system comes down.
Spectrum prediction is an alternative approach that saves sensing time. If there is a higher probability for a channel to be busy, the CR can skip sensing that channel and instead sense channels that are less likely to be busy. Prediction methods are used to predict the usage behavior of a frequency band based on the channel usage patterns of the PU, so that a CR can decide whether or not to move to another frequency band. Spectrum prediction in CR networks is a challenging problem that involves several subtopics such as channel status prediction, PU activity prediction, radio environment prediction, and transmission rate prediction [3]. Prediction-based spectrum sensing [4], prediction-based spectrum decision, and prediction-based spectrum mobility [5] have been presented in the literature. Prediction-based sensing is explored in this paper. In this case, a channel predicted to be busy can be omitted from sensing by the SU, which saves both time and energy. In CR networks, since SUs are sensing and observing the spectrum all the time, they can learn the usage pattern of the spectrum and use such information to predict the future status of the spectrum.
In this paper, we suggest a prediction-based spectrum sensing scheme, propose two simple and fast approaches based on Bayesian inference, and carry out their analysis. Their performance is compared with similar prediction methods based on a statistical approach, Neural Networks, and the Hidden Markov Model. The major contribution of this work is its low computational complexity and its adaptability to a real scenario.
The organization of the paper is as follows: In Section 2, related works are presented, and in Section 3, the system model is described with the proposed Bayesian approaches for spectrum prediction and a brief description of the other approaches used for comparison. Simulation results are presented in Section 4 with conclusions in Section 5.

Related work
Several prediction methods are proposed in the literature. Predicting the duration of spectrum holes of the PU using an HMM is proposed in [5]. The authors have assumed that the channel state occupancy of PUs is Poisson distributed; based on the prediction, a CR can continue to use a channel or relinquish it. A linear filter model followed by a sigmoid transform is used in [6] for spectrum prediction, where spectrum occupancy is characterized as a binary time series. The authors have considered two types of spectrum occupancy schemes, namely deterministic and non-deterministic. These models have been used to provide predicted information for the SUs' 'next-step decision.' Multilayer Perceptron (MLP)-based approaches for spectrum prediction are presented in [7,8]. The parameters of the MLP predictor are updated using the back propagation (BP) algorithm. An exponential weighted moving average (EWMA) approach is proposed in [9]. It uses the previous status of spectrum occupancy to predict the probability of the next state. A modified HMM method for channel state prediction is proposed in [4], and its performance is compared with the 1-NN approach. An implementation of a Hidden Markov Model spectrum prediction algorithm is presented in [10] with some analysis. A beta distribution is considered in [11] to represent the channel occupancy pattern of the PU, and it is validated in [12].

System model
In order to improve the throughput of the system, we suggest a prediction-based spectrum sensing scheme as given in Figure 1. Here the trend of the spectrum occupancy of each channel is assumed to be available to the CR as a record, which is called the 'prior.' It is obtained through spectrum measurement by an external system. The spectrum measurement block is shown with dotted lines to convey that the measurement is done outside the CR. A spectrum prediction unit can combine the prior and recent observations to predict the probability of the next state of a channel being idle/busy. This information allows the CR to build a channel ranking so that it needs to perform spectrum sensing only on selected channels. Channels that are predicted to have a higher probability of being busy can be omitted. Cooperative spectrum sensing can be performed later to arrive at the list of vacant channels. The remaining part of the paper focuses on spectrum prediction approaches and their analysis.
We have considered that a PU operates on a specific frequency band and that each channel is occupied by various PUs. Time is divided into slots of a specific duration, and it is assumed that the channel is stable within the time slot; i.e. the channel state is constant for one slot, and if a PU is not detected during the initial period of the slot, an SU can use the remaining duration of the slot without causing any interference to the PU. Typically, the duration of the time slot is one millisecond or smaller. The spectrum occupancy status of PUs is represented as 'present' or 'absent' in a specific time slot. Gray boxes in Figure 2 represent the absence of a PU and magenta boxes represent the presence of a PU in the respective time slots. Spectrum sensing is an important task to be performed by each SU to sense its opportunity to use vacant slots.
Spectrum sensing must be performed in the initial period of the time slot, followed by data transmission or reception. Spectrum prediction helps to skip some channels during spectrum sensing. The spectrum predictor takes the status of the N previous time slots into account and tries to predict the next state. The cyan block in Figure 2 represents the time slot to be predicted by the CR using the spectrum occupancy status of the previous time slots. It is assumed that the spectrum sensing by the CR terminals is correct. In the following paragraphs, prediction approaches based on Bayesian inference are proposed, followed by a brief description of some prominent prediction approaches from the literature that are implemented for performance comparison with our proposed methods.
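The slot model described above can be made concrete with a minimal sketch. It assumes that a channel's occupancy history is stored as a list of 0/1 values (1 = PU present, 0 = PU absent). The function name `split_blocks` and the parameters `prior_len` and `obs_len` are hypothetical names introduced here for illustration; they do not come from the paper.

```python
from typing import List, Tuple


def split_blocks(occupancy: List[int], t: int, prior_len: int,
                 obs_len: int) -> Tuple[List[int], List[int]]:
    """Return (prior_block, recent_block) of the slots that precede slot t.

    recent_block holds the obs_len slots immediately before slot t;
    prior_block holds the prior_len slots immediately before recent_block.
    """
    start = t - obs_len - prior_len
    if start < 0 or t > len(occupancy):
        raise ValueError("not enough history before slot t")
    prior_block = occupancy[start:t - obs_len]
    recent_block = occupancy[t - obs_len:t]
    return prior_block, recent_block


# Toy history (illustrative data only): 1 = busy slot, 0 = idle slot.
history = [1, 1, 0, 0, 0, 0, 1, 1, 0, 1, 1, 1, 1, 0]
prior_block, recent_block = split_blocks(history, t=12, prior_len=8, obs_len=4)
```

Here slot 12 is the slot to be predicted, the four slots before it act as the recent observation, and the eight slots before those act as the history from which a prior could be estimated. Sliding `t` forward by one slot at a time moves both blocks along the history.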

Bayesian model for spectrum prediction
Bayesian inference is an approach in which Bayes' rule is used to update the probability distribution of a hypothesis as additional evidence is learned. In CR networks, a CR user can compute a probability distribution (known as the prior) of a system parameter θ, such as the spectrum occupancy status of a PU, denoted by $P(\theta)$, from the observations made and subjective assessment. Through spectrum sensing, some data $X = [x_1, x_2, \ldots, x_n]$ are observed for n time slots. Then a likelihood function of the parameter θ, denoted by $L(\theta)$, is calculated by the CR user as the probability of the observed data given that parameter. After acquiring the prior probability distribution and the likelihood function, Bayesian inference can be used to derive the posterior probability distribution of the system parameter θ conditioned on the data $X = [x_1, x_2, \ldots, x_n]$ [3]. Bayes' rule is given as

$$P(\theta \mid X) = \frac{P(X \mid \theta)\, P(\theta)}{P(X)} \tag{1}$$

The posterior probability is proportional to the product of the prior probability and another term $P(X \mid \theta)$, the probability of the data given the parameter, commonly known as the likelihood. Likelihoods are the critical bridge from priors to posteriors, re-weighting each parameter value by how well it predicts the observed data. Different choices of the prior, $P(\theta)$, lead to different inferences about the value of θ. The posterior distribution over θ contains more information than a single point estimate. It indicates not just which values of θ are probable, but also how much uncertainty there is about those values. However, two methods are commonly used to obtain a point estimate from a posterior distribution. The first method is maximum a posteriori (MAP) estimation: choosing the value of θ that maximizes the posterior probability. The second method is computing the posterior mean of the quantity, which is a weighted average of all possible values of the quantity, where the weights are given by the posterior distribution. The system scenario is formulated to match the Bayesian problem in our work [13], and its extension is presented as approach 1. In approach 2, the posterior mean is employed to predict the probability of a 'busy' next state. In the following discussion, the term 'history' is used to represent the prior information. This is the average occupancy of the channel in the past. The term 'recent observation' refers to the data used to estimate the likelihood.

Approach 1
According to Bayes' rule, the posterior probability is proportional to the product of the prior probability and the likelihood. In this approach, the prior is the probability of channel occupancy by a PU in a particular channel, based on a large amount of data observed already. The likelihood is the probability of a busy previous state given a busy next state. It is calculated from the data observed recently. The posterior probability is the probability of a busy next state (to be sensed) given a busy previous state. The prior is calculated by observing the spectrum occupancy status of the PU for $M_1$ previous time slots. Let $S_p$ be the number of busy slots and $N_p$ be the number of idle slots in the spectrum occupancy status of the PU. The prior probability of the channel being busy is given by

$$P(S) = \frac{S_p}{S_p + N_p} \tag{2}$$

Hence, the prior probability of the channel being idle is

$$P(N) = \frac{N_p}{S_p + N_p} \tag{3}$$

where S stands for the 'sensed state' or 'busy state' of a PU within a time slot and N stands for the 'not sensed state' or 'idle state.' Let X be the recently obtained result with $M_2$ observations such that $M_1 \gg M_2$. $S_r$ is the number of cases where both the next and the previous states are busy, and $N_r$ is the number of cases where both the next and the previous states are idle. Let $P(C \mid S)$ be the probability of the previous state being busy given that the next state is also busy, $P(D \mid S)$ be the probability of the previous state being idle given that the next state is busy, $P(C \mid N)$ be the probability of the previous state being busy given that the next state is idle, and $P(D \mid N)$ be the probability of the previous state being idle given that the next state is also idle. Now the probability of the next state being busy, given that the previous state is also busy, can be calculated as

$$P(S \mid C) = \frac{P(C \mid S)\, P(S)}{P(C)} \tag{4}$$

where

$$P(C) = P(C \mid S)\, P(S) + P(C \mid N)\, P(N) \tag{5}$$

And the probability of the next state being busy, given that the previous state is idle, can be calculated as

$$P(S \mid D) = \frac{P(D \mid S)\, P(S)}{P(D)} \tag{6}$$

where

$$P(D) = P(D \mid S)\, P(S) + P(D \mid N)\, P(N) \tag{7}$$

This approach is implemented as case-I, and its variants are implemented as case-II and case-III. The differences between the three cases are given below.
Case-I: The next state is predicted by looking only at the present state and the previous statistics. In this case, processing N cases requires N + 1 observations. The calculations are as given above. The present state may be either S or N.
Case-II: A state is predicted considering the two previous states and the statistics of the observed duration. In this case, the (N + 1)th state is predicted by considering the pattern of the (N − 1)th and Nth states and the previous statistics. Hence, the present states may take any one of the combinations 'SS,' 'SN,' 'NS,' 'NN.'
Case-III: A state is predicted considering the three previous states and the statistics of the observed duration. In this case, the (N + 1)th state is predicted by considering the pattern of the (N − 2)th, (N − 1)th, and Nth states and the previous statistics. Here the present states have eight combinations: 'SSS,' 'SSN,' 'SNS,' ... 'NNN.'

Approach 2
Let us consider that the spectrum sensing results of a CR over a specific number of time slots follow a binary pattern represented as 'sensed (S)' and 'not sensed (N).' It is assumed that the sensing result in each time slot is drawn independently from a Bernoulli distribution with parameter θ. It may look like SSNNNNSSNSSSSN…. By looking at the recent sensing results, the probability of a busy (S) state in the next time slot is to be predicted. The belief about the arrival of the PU is called the 'prior,' which is formed based on the history of its arrival pattern. It is considered that this sequence forms a beta distribution. Let X be the observed result with $S_p$ and $N_p$ as the numbers of 'sensed' and 'not sensed' results, respectively, in the history considered. Similarly, $S_r$ and $N_r$ are the corresponding numbers in the recently observed results. Using a beta prior with the Bernoulli likelihood, the posterior distribution can be obtained based on [14] as

$$P(\theta \mid X) \propto \theta^{S_p + S_r} (1 - \theta)^{N_p + N_r} \tag{8}$$

which is a Beta$(S_p + S_r + 1,\; N_p + N_r + 1)$ distribution. A point estimate of θ from this distribution is obtained through the MAP estimate, which is given as

$$\hat{\theta}_{\mathrm{MAP}} = \frac{S_p + S_r}{S_p + S_r + N_p + N_r} \tag{9}$$

And the posterior mean of the distribution is calculated as

$$\bar{\theta} = \frac{S_p + S_r + 1}{S_p + S_r + N_p + N_r + 2} \tag{10}$$

The probability of the channel being busy can be estimated from Equation (9) or (10). There will be a slight difference between the two results. In a practical situation, a CR is expected to hold a prior probability of a PU's arrival based on the history, and this will be updated regularly. The likelihood of the data can be calculated from the recent observations. In practical cases, $N_p$ and $S_p$ may not be available, and only a prior probability may be known. Since the denominator of Equation (1) only normalizes the posterior probability, the posterior probability can then be calculated as the product of the prior and the likelihood.

Neural network approach for spectrum prediction
Neural network approaches are presented in [7,8]. In line with the methods used in those papers, a Neural Network is implemented to compare its computational complexity and prediction performance with the proposed Bayesian approaches. The MLP network used here is a multi-layered structure consisting of an input layer and two hidden layers, each with 15 neurons. The output layer consists of a single neuron. The network has N inputs and one output. The parameters of the MLP predictor are updated using the BP algorithm. Spectrum sensing results are applied to the network as a binary sequence. For training the network, a sequence of N inputs, say $x_n, x_{n-1}, x_{n-2}, x_{n-3}, \ldots, x_{n-(N-1)}$, is applied and $x_{n+1}$ is supplied as the desired response. The training is carried out in this way with a sufficient number of sequences, and the network is later used for prediction.

Discrete time hidden Markov model
An HMM-based approach is presented in [5] for spectrum prediction in CR. The sequence of spectrum states is modeled through a two-state Markov chain with the spectrum in either state $S_t = 1$ or $S_t = 0$. A Markov chain has the property that the probability of future states depends only on the past m states, where m is the order of the Markov chain. The parameters of the HMM are updated using the Baum-Welch algorithm. This algorithm uses the observed sequence of spectrum states to infer the underlying HMM transition matrix. Since perfect sensing is assumed for the CR, the emission matrix is not considered. This method is also implemented to compare its computational complexity and prediction rate with the proposed Bayesian approaches.

EWMA-based prediction approach
An EWMA-based approach is presented in [9]. The authors have considered the channel occupying probability π(t) as a Wiener process, which has the following two properties [15]. The change of π(t) during a small period of time Δt is

$$\Delta\pi = \varepsilon \sqrt{\Delta t} \tag{11}$$

where Δt can be defined as the prediction interval and ε follows a standardized normal distribution (a normal distribution with mean 0 and standard deviation 1). The values of Δπ for any two different prediction intervals Δt are independent. Hence, the mean and standard deviation of Δπ are 0 and $\sqrt{\Delta t}$, respectively. In a prediction time interval Δt, the change Δπ in the value of π(t) can be defined as

$$\Delta\pi = \mu\, \Delta t + \sigma\, \varepsilon \sqrt{\Delta t} \tag{12}$$

where ε is a variable that follows a standardized normal distribution; thus Δπ has a normal distribution with mean $\mu \Delta t$ and standard deviation $\sigma \sqrt{\Delta t}$. μ and σ are known as the expected drift rate and the standard deviation rate of Δπ [15]. The drift rate can be calculated by Equation (13), and the estimators $\hat{\mu}$ of μ and $\hat{\sigma}$ of σ are given by Equations (14) and (15).

Result and discussion
In this section, the performance of our proposed Bayesian approaches is analyzed first, and their performance is then compared with the other methods mentioned above. Binary data with a Beta(α, β) distribution are used to represent the spectrum occupancy pattern of a PU. Both the prior and the recent observations are extracted from the same distribution, and based on these data, the probability of the next state is predicted using the methods presented above. The effect of various parameters on the predicted probability is also analyzed.
As shown in Figure 3, consecutive blocks of data for the prior and the observation are selected, and the probability of the next state being busy is predicted. For simulation, the number of time slots considered for the prior should be large. In a practical situation a CR will only have an estimate of the prior.
In order to reduce the computational complexity, the quantity of data considered as recent observation needs to be small. The cyan block shown in Figure 3 as the prediction is of one time slot duration, and its probability needs to be predicted. The comparison of predicted probability is dealt with first. A comparison of the predictions by the Bayesian approaches and the EWMA approach for 15 channels is given in Figures 4 and 5. The probability of the next state being idle is calculated here. The actual probability is calculated from the known data. It can be seen that the predictions by the Bayesian approaches are very close to the actual probability. The relative differences in predicted probability between channels are also close to the actual values. The number of recent observations N considered for this prediction is 12. All the 15 channels were drawn from different beta distributions.
Next, the predictions over 25-50 consecutive time slots are presented for the various approaches. In order to predict consecutive time slots, prior and observation blocks of a specific size are moved forward n times over the time slots, and the predicted values are plotted in Figure 6. In practice, each SU will have the recent observations and a subjective estimate of the PU's arrival rate. From these a node will infer the probability of the next state being busy. Figure 6(a) shows the comparison of the predicted probability of the three cases of the Bayesian-1 approach with the actual probability. This observation is arrived at with a data distribution of (0.5, 0.5) and an observation block size of 30. The three cases of the Bayesian-1 approach show similarity among themselves, and their magnitudes increase from case 1 to case 3. It is seen that the Bayesian estimate moves around the actual probability. It was observed in our trials that as the size of the observation block increases, the variation of the Bayesian estimate from the actual value becomes smaller and smaller. Since a uniform data distribution is considered, the actual probability in this context is around 0.5. Case-1 shows more similarity with the actual probability than the other cases. On observing the direction of transition with respect to the previous step at each time instant in the graph, evidence of correlation can be established between the actual and the Bayesian approaches. The comparison of the Bayesian-2 approach is shown in Figure 6(b). In this case, a (2, 1) distribution is used for analysis. The EWMA approach is also compared here with the Bayesian-2 approach. It is found that both the Bayesian and the EWMA patterns try to follow the actual probability pattern. Considering the magnitude of the predicted value at each time slot, EWMA looks closer to the actual value than Bayesian-2. But it can be seen that this difference is in the average value of the plot. On observing the transition pattern, the Bayesian approach follows the actual transition pattern with a delay of one time slot, and EWMA follows the actual pattern with a delay of two time slots. Hence, it can be considered that the output of Bayesian-2 is closer to the actual probability. It shows that the Bayesian approach is a better choice for spectrum prediction in CR.
In all the Bayesian cases, it is possible to vary the amount of data considered for the prior and the amount of data considered for the observation. In the next part, the effect of prior size and observation block size on the predicted value is analyzed. A typical case with a known next state and its respective prediction by the various methods is shown below. Two cases of next state ('sensed' and 'not sensed') with their respective predictions are shown in Figures 7-9.
Here the aim is to find the minimal sizes of the prior block and the observation block for a reliable prediction. In all these cases, the scale used for the x-axis is 1:5 and that of the y-axis is 1:10. Figure 7 shows that Bayesian-1 (case-1 is considered) gives a stable result with an observation block size of around 10 and a prior size of around 100. Generally, a larger block size for the prior gives a more stable result. Figure 8 shows that the Bayesian-2 approach requires a slightly larger block size than Bayesian-1. A slightly larger block size for the prior is also required here. In Figure 9 the EWMA approach is analyzed only for the block size under consideration, and it is repeated several times to make it comparable to the other approaches. It is found that EWMA needs more data to reach a stable prediction.
All the above analyses were carried out with data generated under various beta distributions. In order to match a real situation, we have performed spectrum occupancy measurement of the GSM-900 band using NI-USRP. A 24 h measurement of spectrum occupancy in each time slot, per channel, is obtained. The duration of a time slot was around 1 s. The ON/OFF status of the PU in each time slot is represented with 1s and 0s. These data from nine channels are used for comparing the performance of the proposed Bayesian approaches with the other methods.
Channel ranking, based on the probability of the channel being idle, is used to compare the performance of the proposed Bayesian methods with approaches based on EWMA, HMM, and Neural Network. All the methods try to rank the channels after observing the spectrum occupancy status of N recent time slots. They also use an estimate of the prior data for the ranking. This estimate is taken only once; it may be updated later, but not on a regular basis. The N recent observations are considered regularly to prepare the ranking. This is to reduce the time consumption. The channel rankings produced by all the methods are shown in Figure 10. The x-axis in each subplot is the channel number and the y-axis gives the predicted estimate. The channel with the highest estimate is the best channel, with the highest probability of being idle. N = 20 is used for Figure 10(a) and N = 10 is used for Figure 10(b). The actual ranking is obtained from the available data and is shown in subplot-1. When N = 20, the ranking by Bayesian-1 is very close to the actual ranking. The estimates by Bayesian-2, EWMA, and HMM give a similar pattern, but with a slight deviation from the actual pattern. Even though they use the 'prior,' it seems that the effect of the recent observation is dominant in the estimate. The ranking pattern of the Neural Network deviates more from the actual one.
When N = 10, the Neural Network shows a closer relation with the actual case than in the previous case. But the ranking patterns of the other methods deviate from the actual pattern. This is because of the insufficient size of N. During the experiment it was seen that N > 15 gave a good result for Bayesian-1, and that as N increases, Bayesian-2, EWMA, and HMM gave estimates closer to the actual case.
An analysis of computational complexity is carried out, and the result is shown in Figure 11. These approaches were run on an Intel Core 2 Duo CPU with a 2 GHz clock, and the time elapsed for each method is tabulated and compared. In the case of the NN and HMM approaches, training time is not considered in the comparison. Figure 11 clearly shows that the Bayesian approaches are less expensive in terms of computational cost. Considering the time required for spectrum sensing and the time available for data transmission within a specific slot, as in Figure 3, a successful prediction is going to improve the throughput of the system. The low prediction time and higher detection rate of the Bayesian approaches can make them useful candidates for a CR system.

Conclusion
Spectrum prediction is a very useful task that improves the efficiency of spectrum sensing, and it will lead to an increase in the throughput of the CR system. In this paper, we proposed some Bayesian approaches for spectrum prediction and carried out their analysis. The predicted probability can be used to rank the channels so that channels with lower rank can be skipped during spectrum sensing. The proposed methods are evaluated on generated data as well as on real data obtained through spectrum measurement. Their performance is compared with existing approaches such as EWMA [10], Neural Network [9], and HMM [6]. Considering the prediction performance and the computational cost, the Bayesian approaches give a better performance, and it is found that they are promising approaches to improve the throughput of the system.
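The channel-ranking idea at the heart of this paper can be illustrated with a minimal sketch. It assumes the beta-Bernoulli setting of approach 2, where the posterior over the busy probability is Beta(S p + S r + 1, N p + N r + 1), and it ranks channels by one minus the posterior mean. The function names `beta_posterior_busy` and `rank_channels_by_idle` and the toy data are hypothetical and introduced only for illustration; this is not the authors' implementation.

```python
from typing import Dict, List


def beta_posterior_busy(s_p: int, n_p: int, s_r: int, n_r: int) -> Dict[str, float]:
    """Point estimates of the busy probability under a Beta posterior.

    The posterior is Beta(s_p + s_r + 1, n_p + n_r + 1), where s_* and n_*
    count busy and idle slots in the history (p) and recent (r) blocks.
    """
    busy = s_p + s_r
    idle = n_p + n_r
    total = busy + idle
    if total == 0:
        raise ValueError("at least one observed slot is required")
    return {
        "map": busy / total,               # mode of the Beta posterior
        "mean": (busy + 1) / (total + 2),  # mean of the Beta posterior
    }


def rank_channels_by_idle(history: List[List[int]],
                          recent: List[List[int]]) -> List[int]:
    """Return channel indices ordered from most to least likely idle."""
    idle_prob = []
    for past, new in zip(history, recent):
        est = beta_posterior_busy(sum(past), len(past) - sum(past),
                                  sum(new), len(new) - sum(new))
        idle_prob.append(1.0 - est["mean"])
    return sorted(range(len(idle_prob)),
                  key=lambda ch: idle_prob[ch], reverse=True)


# Toy data for three channels (illustrative only): 1 = busy, 0 = idle.
history = [[1, 1, 1, 0, 1, 1, 0, 1],
           [0, 0, 1, 0, 0, 0, 0, 1],
           [1, 0, 1, 0, 1, 0, 1, 0]]
recent = [[1, 1, 0, 1],
          [0, 0, 0, 1],
          [1, 0, 0, 1]]
ranking = rank_channels_by_idle(history, recent)
```

With these toy counts, channel 1 is busy in 3 of 12 observed slots, so it is ranked first for sensing, while channel 0 (busy in 9 of 12 slots) is ranked last and would be the first candidate to skip. Using the MAP estimate instead of the posterior mean changes the numbers slightly but not the ordering in this example.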