Cooperative Multiband Spectrum Sensing Using Radio Environment Maps and Neural Networks

Cognitive radio networks (CRNs) require high capacity and accuracy to detect the presence of licensed or primary users (PUs) in the sensed spectrum. In addition, they must correctly locate the spectral opportunities (holes) so that these become available to nonlicensed or secondary users (SUs). In this research, a centralized network of cognitive radios for monitoring a multiband spectrum in real time is proposed and implemented in a real wireless communication environment using generic communication devices such as software-defined radios (SDRs). Locally, each SU uses a monitoring technique based on sample entropy to determine spectrum occupancy. The determined features (power, bandwidth, and central frequency) of the detected PUs are uploaded to a database and then processed by a central entity. The objective of this work was to determine the number of PUs, their carrier frequencies and bandwidths, and the spectral gaps in the sensed spectrum in a specific area through the construction of radioelectric environment maps (REMs). To this end, we compared the results of classical digital signal processing methods and neural networks performed by the central entity. The results show that both proposed cognitive networks (one whose central entity uses classical signal processing and one using neural networks) accurately locate PUs and give SUs the information needed to transmit while avoiding the hidden terminal problem. However, the best-performing cognitive radio network was the one using neural networks, which detected PUs more accurately in both carrier frequency and bandwidth.


Introduction
Cognitive radio (CR) is a concept involving a communication device capable of knowing the spectral behavior of its environment and adapting to it. CR technology allows nonlicensed or secondary users (SUs) to detect and take advantage of spectral gaps (or holes) left unused by licensed users, also known as primary users (PUs) [1]. Specifically, the operation of CR involves four stages or functions: spectrum sensing, spectrum sharing, spectrum decision, and spectrum mobility. Spectrum sensing (SS) is the fundamental task of detecting one or more PUs; this stage shows whether the sensed spectrum is occupied or empty [2]. Usually, this task is carried out in single bands; however, the current paradigm of multiband spectrum sensing (MBSS) involves multiple bands that are not necessarily contiguous [3].
Radio Environment Maps

A radioelectric environment map (REM) is a spatial representation of the power distribution of the emitted signals, as shown in Figure 1. These maps are generally used to evaluate and predict the behavior of radio signals in a given environment, which is essential for capacity planning, communication system design, and troubleshooting related to interference. In addition, REMs are used in CR applications, where devices can use the information provided by the REM to select the most suitable frequency on which to transmit data and avoid interference with other devices in the same location [20]. Collecting and interpolating power measurements are the two essential processes in constructing a correct map. Here, interpolation methods [26] are critical for correctly estimating an unknown value between two or more known points (measurement points). In other words, a REM is simply a smoothed estimate of the power distribution based on a few known values of measured power. Two interpolation methods widely used to build REMs are inverse distance weighting (IDW) and Kriging; both are explained below.

IDW Method
The inverse distance weighting interpolation method [27,28] is simple and easy to implement, and it is widely used in applications such as precipitation estimation, topographic data interpolation, and air pollution estimation. IDW is a classical interpolation method in spatial analysis and is most commonly used in geostatistical and mathematical interpolation [29].
IDW estimates the value of a variable at a specific point based on known values at nearby points. This method rests on the idea that nearby points have a greater impact on the estimate than more distant points. Thus, IDW assigns a weight to each known point based on its distance to the unknown point; the weights are the inverse of the distance, so the closest points have higher weights and the farther ones lower weights. For the IDW method, we considered the vector of observations z = (z(s_1), ..., z(s_n))^T from a random process {Z(s) : s ∈ D}, D ⊂ R², at known geographic locations s_1, ..., s_n; in our case, z contained the average power at each point of the sampled region (see Figure 1). The final estimate is obtained as the weighted sum of the known values at the nearby points, given by [28]

$$\hat{Z}(s) = \sum_{i=1}^{n} w_i(s)\, z(s_i), \qquad w_i(s) = \frac{d(s_i, s)^{-p}}{\sum_{j=1}^{n} d(s_j, s)^{-p}},$$

where Ẑ(s) is the estimated value of the variable at the unknown point s, z(s_i) is the known value of the variable at the i-th point s_i, d(s_i, s) is the Euclidean distance between the unknown point s and the point s_i, p ∈ R is a smoothing parameter controlling the influence of the known points over the estimated values, and w_i(s) is the weight assigned to each known point s_i. In practice, a value of p = 2 or p = 3 is usually used. The estimated values of Ẑ(s) form a mesh over the unknown points. However, this method has limitations, such as a tendency to smooth out data variability and to produce inaccurate estimates in areas with few known points. In this work, Ẑ(s) represents the power spectral density (PSD) at the different points around the sensed area.
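As an illustration, the following minimal Python sketch (our own; the function name and example values are not from the paper) implements the IDW estimate above to build a power mesh from a few measurement points:

```python
import numpy as np

def idw_interpolate(known_xy, known_z, query_xy, p=2):
    """Inverse distance weighting: estimate values at query points
    from measured power values at known points (equation above)."""
    # Pairwise Euclidean distances between query and known points
    d = np.linalg.norm(query_xy[:, None, :] - known_xy[None, :, :], axis=2)
    d = np.maximum(d, 1e-12)   # avoid division by zero at known points
    w = d ** (-p)              # inverse-distance weights
    return (w @ known_z) / w.sum(axis=1)

# Example: 8 SU positions with measured average power (dBm) on a 12 x 12 m area
rng = np.random.default_rng(0)
su_xy = rng.uniform(0, 12, size=(8, 2))
su_power = rng.uniform(-95, -60, size=8)
gx, gy = np.meshgrid(np.linspace(0, 12, 50), np.linspace(0, 12, 50))
grid = np.column_stack([gx.ravel(), gy.ravel()])
rem = idw_interpolate(su_xy, su_power, grid).reshape(gx.shape)  # the REM mesh
```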

Kriging Method
The Kriging method [30] is an interpolation technique derived from regionalized variable theory. It expresses the spatial variation of the property through the variogram and minimizes the prediction errors [31]. The goal of Kriging is to find the most accurate and least uncertain estimate by considering not only the value of the variable at the known points but also the distribution of the variable in space and its correlation with the known points. Following [30], the Kriging method can be established by taking the points and the region mentioned in (1) and assuming the model

$$Z(s) = \mu + \delta(s), \quad s \in D,$$

where µ is known and δ(·) is a zero-mean stochastic process with known covariance function C(·, ·). The best linear unbiased predictor of Z(s_0) takes the form Σ_{i=1}^{n} λ_i Z(s_i) and is obtained by minimizing the mean-squared prediction error [32]

$$E\left[\left(Z(s_0) - \sum_{i=1}^{n} \lambda_i Z(s_i)\right)^{2}\right]$$

over λ_1, ..., λ_n, subject to

$$\sum_{i=1}^{n} \lambda_i = 1.$$

The optimal values are computed using the method of Lagrange multipliers, which yields the weights

$$\boldsymbol{\lambda}^{T} = \left(\boldsymbol{\gamma} + \mathbf{1}\,\frac{1 - \mathbf{1}^{T}\Gamma^{-1}\boldsymbol{\gamma}}{\mathbf{1}^{T}\Gamma^{-1}\mathbf{1}}\right)^{T}\Gamma^{-1},$$

where γ ≡ (γ(s_0, s_1), ..., γ(s_0, s_n))^T, Γ is an n × n matrix whose (i, j)-th element is γ(s_i, s_j), and 2γ(s_i, s_j) ≡ var(Z(s_i) − Z(s_j)) is the variogram (γ(s_i, s_j) is named the semivariogram [33]); equivalently, the system can be written in terms of the covariance vector c ≡ (C(s_0, s_1), ..., C(s_0, s_n))^T. The variogram, a valuable tool in modeling spatial variables, describes how the data are correlated with distance and, in doing so, allows us to accomplish spatial interpolation, using the sampled data and the variogram information to estimate the variance of the values of the variable at the unsampled points [33,34]. In summary, Kriging is more precise than other interpolation methods, such as IDW or cubic interpolation [29], and is suitable for applications where the data have a certain spatial dependence.
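For illustration, here is a minimal ordinary kriging sketch under an assumed exponential semivariogram (the paper does not specify its variogram model; all names and parameters here are ours):

```python
import numpy as np

def ordinary_kriging(known_xy, known_z, query_xy, sill=1.0, rng_m=10.0):
    """Ordinary kriging with an assumed exponential semivariogram
    gamma(h) = sill * (1 - exp(-h / rng_m)); the model choice is illustrative."""
    gamma = lambda h: sill * (1.0 - np.exp(-h / rng_m))
    n = len(known_z)
    # Semivariogram matrix between known points, bordered by the
    # Lagrange-multiplier row/column enforcing sum(lambda) = 1
    H = np.linalg.norm(known_xy[:, None, :] - known_xy[None, :, :], axis=2)
    A = np.ones((n + 1, n + 1))
    A[:n, :n] = gamma(H)
    A[n, n] = 0.0
    est = np.empty(len(query_xy))
    for k, s0 in enumerate(query_xy):
        b = np.ones(n + 1)
        b[:n] = gamma(np.linalg.norm(known_xy - s0, axis=1))
        lam = np.linalg.solve(A, b)[:n]   # kriging weights lambda_1..lambda_n
        est[k] = lam @ known_z
    return est
```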

Neural Networks
Neural networks are artificial intelligence models inspired by the structure and functioning of the human brain. These networks comprise many nodes, known as "neurons", connected to each other through "synapses". Each neuron receives inputs from other neurons, processes them through an activation function, and sends its output to other neurons, as shown in Figure 2. The combination of inputs and connections between neurons allows a neural network to learn complex tasks from the selected training data. Neural networks are a powerful tool for machine learning because they can model complex relationships between inputs and outputs and make accurate inferences from real data. However, these tasks require a large amount of training data and are computationally intensive (sometimes prohibitively so) to train and use in real-time implementations [35]. In this work, we used the multilayer perceptron (MLP), as shown in Figure 2. This type of artificial neural network consists of an input layer, one or more hidden layers, and one output layer. In the MLP, neurons in each layer are connected to neurons in the next layer, and they use an activation function to determine the neuron's output. Information flows in only one direction, from the input to the output layer, passing through the hidden intermediate layers [36]. The general formula for calculating the output of a neuron in a multilayer perceptron is given by [37]

$$O_j = f\left(\sum_{i=1}^{n} W_{i,j}\, I_i + b_j\right),$$

where O_j is the output of the j-th neuron, f is the activation function, which can be a step function, a sigmoid, a rectified linear unit (ReLU), etc., W_{i,j} are the synaptic weights connecting the input I_i to the current neuron, I_i is the input to the multilayer perceptron, and b_j is the bias, with i = 1, ..., n and j = 1, ..., k, where n is the number of input features to the NN (corresponding to the number of neurons in the input layer), k is the number of neurons in a given hidden layer, and m is the number of provided outputs (i.e., the number of neurons in the output layer of the NN).
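The following short sketch (ours, not the paper's implementation) applies the neuron equation above layer by layer to compute an MLP forward pass, assuming ReLU hidden activations and a linear output:

```python
import numpy as np

def mlp_forward(x, weights, biases):
    """Forward pass of a fully connected MLP (equation above, layer by layer).
    weights[l] has shape (n_in, n_out); ReLU hidden layers, linear output."""
    a = x
    for l, (W, b) in enumerate(zip(weights, biases)):
        z = a @ W + b                                            # weighted sum plus bias
        a = z if l == len(weights) - 1 else np.maximum(z, 0.0)   # ReLU activation
    return a

# Example: 13 inputs (as in the proposal), one hidden layer of 16 neurons, 3 outputs
rng = np.random.default_rng(1)
W = [rng.normal(size=(13, 16)), rng.normal(size=(16, 3))]
b = [np.zeros(16), np.zeros(3)]
y = mlp_forward(rng.normal(size=13), W, b)
```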
The formula is applied to every neuron in every layer of the network, including the input, hidden, and output layers. Each neuron in a hidden layer takes the outputs of all the neurons in the previous layer as its inputs (i.e., assuming a fully connected network), and the outputs of the neurons in the output layer form the network's final output. During the training phase, the weights and biases are adjusted to minimize the error between the network output and the desired output. This is achieved by a learning algorithm, such as the backpropagation algorithm [38], which updates the weights and biases in the direction of the negative gradient [39].

Preliminary Work
This section briefly describes the preliminary work on a novel MBSS technique based on sample entropy (SampEn), developed and published by the authors in [39], which constitutes the basis of the new proposal. The MBSS technique presented in [39] was the result of the authors' previous efforts to refine and complement it with modules implemented in real time to, for instance, diminish the noise introduced by SDR devices. Several comparative studies have been carried out to measure the performance of the proposed MBSS technique against other classical methods, such as energy detectors. The efficiency of this MBSS technique in locating PUs stands out at SNR values close to 0 dB [11]. Table 1 highlights the principal contributions of these previous works. The last MBSS technique in Table 1 was implemented in a real environment using low-cost SDR devices. Each connected SDR device was considered an SU, independent from the others, processing its information locally in a determined single band to achieve ensemble wideband spectrum sensing. The MBSS technique used in this proposal, however, combines different SDR devices, namely the RTL-SDR [42], the HackRF One [43], and the LimeSDR mini [44], to constitute each considered SU, forming a CR infrastructure that permits MBSS in a specific location. Figure 3 shows how this MBSS method works. The MBSS blocks are as follows (a minimal sketch of the first two blocks appears after this list):
• Sliding window of 100 ms. In this block, the complex signal x_{I_{i,l}}(n) + jx_{Q_{i,l}}(n) from each SDR in the time domain was received and updated every 100 ms from the radio environment of the i-th SU, integrated by z different SDR devices;
• Power spectral density (PSD) estimation. In this module, the Welch method [45] was applied to each signal x_{I_{i,l}}(n) + jx_{Q_{i,l}}(n) in order to obtain, on a linear scale, the wideband PSD R_{i,l}(k) from the SU ensemble;
• Impulsive noise reduction. In this block, impulsive noise, high-frequency noise, and abrupt changes (many generated by the SDR devices themselves) in the signal R_{i,l}(k) were eliminated or diminished through discrete wavelets via multiresolution analysis [46], resulting in a denoised version of R_{i,l}(k);
• Estimation of frequency bands and detection of primary users. In this module, the MBSS technique based on SampEn, the K-means algorithm [47] (permitting the optimization of specific detection parameters), and discrete wavelets was applied to each SU's analyzed wideband spectrum in order to obtain the spectrum occupation, given by three vectors: the first, [b_{i,1}, b_{i,2}, ..., b_{i,N−1}]^T_t, contained binary values indicating occupied ("1") and empty ("0") bands; the second, [L_{i,1}, L_{i,2}, ..., L_{i,N}]^T_t, contained the corresponding computed boundaries for each detected band; and the third, [P_{i,1}, P_{i,2}, ..., P_{i,N−1}]^T_t, contained the power for each detected band.
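As referenced in the list above, here is a minimal sketch of the first two blocks (100 ms framing and Welch PSD estimation); the sample rate and the toy signal are our assumptions, not the paper's values:

```python
import numpy as np
from scipy.signal import welch

fs = 2.4e6                                  # assumed SDR sample rate (Hz)
rng = np.random.default_rng(0)
t = np.arange(int(0.1 * fs)) / fs           # one 100 ms sliding-window frame
iq = np.exp(2j * np.pi * 200e3 * t)         # toy complex baseband x_I + j*x_Q
iq = iq + 0.1 * (rng.standard_normal(t.size) + 1j * rng.standard_normal(t.size))

# Welch PSD estimate of the frame on a linear scale (the R_{i,l}(k) stage)
f, psd = welch(iq, fs=fs, nperseg=1024, return_onesided=False)
```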
One of the primary motivations for this technique was to propose an MBSS method (i) adequate to correctly detect a primary user and the operating parameters, such as carrier frequency, bandwidth, and power, with (ii) computational complexity permitting a real-time implementation. Both aims were fulfilled, and the results were used to create a compact CRN capable of sharing spectral information with a central entity in order to estimate the area occupied by different detected PUs in the studied zone. A wireless communications environment was proposed considering these SDR devices.

New Proposal and Methodology
This section presents the new research proposal and the implemented methodology.

Proposal
The novel idea of our previous work was to consider a computer integrating different SDR devices, each detecting PUs in a single band, so as to sense a wide frequency range, as shown in Figure 4a. This model was treated as a single entity containing different SUs. Our follow-up work proposed replicating this entity of connected SDR devices to create a cooperative CR network for sensing a wide frequency range over a broader geographical region. Each entity was considered an SU containing different technologies for sensing a broad spectrum, as shown in Figure 4b. In addition to sensing the radio spectrum and uncovering the PUs' behavior in a given geographical region, this cooperative CR network had the task of avoiding the hidden terminal problem. One of the main strengths of this new proposal (see Figure 4b) over previous work is the cooperation of the different SUs in geographically locating the PUs through the REMs and the ability to carry out this processing in real time.

Methodology
The proposed methodology can be outlined in three big blocks, shown in Figure 5:

• The collection of information obtained by the SUs. Each SU locally processes the sensed data to be sent to a central entity, including the occupancy of the observed spectrum in its geographic location and the frequency band edges and estimated power vectors;
• The database for storing the information obtained by each SU at a specific time;
• The central entity overseeing the processing, which determines the geographic area occupied by the detected PUs in the radio spectrum.

Each block is described in detail below.

Collection of Information Locally by the SUs
The information collection was carried out every 100 ms for each SU, resulting in the occupation Occupation_{T_t}{R_{i,l}(k)} of the analyzed spectral signal, as mentioned in Section 3 and displayed in Figure 3. This information was conveyed by three vectors: (i) the edge detector vector [L_{i,1}, L_{i,2}, ..., L_{i,N}]^T_t, which stores the frequency edges where it is possible to find the presence of PUs; (ii) the binary decision vector [b_{i,1}, b_{i,2}, ..., b_{i,N−1}]^T_t, which stores a binary decision for each corresponding delimited band where noise or a possible PU transmission is detected; and (iii) the power vector [P_{i,1}, P_{i,2}, ..., P_{i,N−1}]^T_t, which stores the received average power corresponding to each classified window (i.e., each binary decision). The vectors in (ii) and (iii) are the same size.

Database
The edge detector, binary decision, and power vectors were stored in the database. It is important to note that the information stored in the database from each SU was not necessarily similar; the lengths of the vectors differed for each SU. Indeed, each SU observed a different behavior of the radioelectric spectrum because the SUs were placed randomly at different geographic coordinates. To account for this discrepancy, all vectors were labeled with the exact sensing time T_x (see Table 2), i.e., synchronized. Information was uploaded to a server, saving only the relevant data (three vectors per SU) in the database to permit fast storage and extraction, taking advantage of the SUs' hardware performing the MBSS technique locally. Indeed, the three uploaded vectors for each SU take up very little memory space. In the best case (without a single PU in the sensed spectrum), the edge detection vector had a length of two, while the binary decision vector and the power vector each had a length of one. In the worst case, when the SNR was low (close to 0 dB), multiple windowing appeared due to noise, increasing the vectors' lengths. However, regardless of the size of the three vectors, the information was compact and facilitated the reconstruction of the spectrum signal occupation, including the average power for each reconstructed window.
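For illustration, here is a minimal sketch of how one SU's upload could be stored; the schema and field names are ours, not the paper's:

```python
import sqlite3, json, time

con = sqlite3.connect("crn_rem.db")
con.execute("""CREATE TABLE IF NOT EXISTS sensing (
    su_id INTEGER,      -- SU identifier
    t_x REAL,           -- synchronized sensing time T_x
    edges TEXT,         -- [L_{i,1},...,L_{i,N}], JSON-encoded
    decisions TEXT,     -- [b_{i,1},...,b_{i,N-1}], JSON-encoded
    powers TEXT         -- [P_{i,1},...,P_{i,N-1}], JSON-encoded
)""")

# One SU upload: edges in Hz, binary decisions, average powers in dBm
con.execute("INSERT INTO sensing VALUES (?,?,?,?,?)",
            (1, time.time(),
             json.dumps([699.0e6, 700.0e6, 701.0e6]),
             json.dumps([1, 0]),
             json.dumps([-72.5, -96.1])))
con.commit()
```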

Central Entity
The main tasks of the central entity were (i) to indicate how many PUs appear in the spectrum as perceived by the SUs; (ii) to build a REM for each detected PU, using as parameters the location of each SU, the power perceived by each SU, and the time of radio space monitoring; and (iii) to show the area covered by the PUs.
Two ways of implementing these tasks were developed, the first through classical digital processing and the second using machine learning techniques (specifically, the NN). Both approaches are detailed below.

Central Entity Implementing Classical Digital Signal Processing
Digital signal processing, which uses programmed algorithms to process and analyze data, is a well-established and stable technology that has been used for decades in a variety of applications. It is relatively easy to implement and can be very fast and efficient in situations where the datasets and tasks are specific. In this case, the central entity was implemented with classical techniques for reconstructing the spectrum occupation from the shared information (edge detector, binary decision, and power vectors) obtained by each SU in the database. The central entity processing the information is shown in Figure 6 and described by Algorithm 1.

Figure 6. Implementation of the central entity using classical digital signal processing.

Algorithm 1 Central Entity Computed by Traditional Signal Processing
Step 1.1
The central entity collects the edge detector, binary decision, and power vectors of each SU at a determined time instant T_t from the database. With this information, the central entity reconstructs the occupancy signal Occupation_{T_t}{R_i(k)} and simultaneously constructs an approximation of the power spectral density PSD_rec_{T_t}{R_i(k)} using only the average powers (power vector) associated with each SU. The length of these frames is set to 1024 samples.

Step 1.2
Once the reconstruction of the occupation of each SU is performed, the mean value E[Occupation_{T_t}{R_i(k)}] is computed.

Step 1.4
Knowing the central frequency and the corresponding bandwidth of each detected PU, it is possible to locate them in each reconstructed PSD and to extract a scalar aux_psd_{i,N} for each SU. In the strict sense, this scalar represents the average power in the carrier, sensed by each SU, where PU_N is or should be.

Step 1.5
With the scalar aux_psd_{i,N} of each SU located at a specific coordinate, the REM is built using a double interpolation: the first interpolation is done through the IDW method, and for the second one, the Kriging method is applied. Because only eight geographical points are considered in our case (i.e., only eight SUs are implemented), this double interpolation is carried out to describe the behavior of the radioelectric space more precisely in the environment described later. In this case, each REM is constructed with the values aux_psd_{i,N} that specifically correspond to the average power in the bandwidth B_{PU_N}.

Step 1.6
Finally, the active area of each PU is determined according to the information collected by each secondary entity SU_i. For this, a threshold of −80 dBm, chosen in [41] for a wireless environment, is used to classify the area estimated by the REMs. That is, regions in the REM that are above this threshold correspond to the active area of detected PUs.
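A minimal sketch of Steps 1.5 and 1.6 under our reading of the double interpolation (a coarse IDW mesh refined by kriging, then thresholded at −80 dBm), reusing the idw_interpolate and ordinary_kriging helpers sketched earlier:

```python
import numpy as np

# Assumed inputs: 8 SU positions and their aux_psd_{i,N} values (dBm)
rng = np.random.default_rng(3)
su_xy = rng.uniform(0, 12, size=(8, 2))
su_power = rng.uniform(-95, -60, size=8)

# First pass: IDW onto a coarse 10 x 10 mesh over the 12 x 12 m area
coarse_xy = np.column_stack([g.ravel() for g in
            np.meshgrid(np.linspace(0, 12, 10), np.linspace(0, 12, 10))])
coarse_z = idw_interpolate(su_xy, su_power, coarse_xy)

# Second pass: ordinary kriging refines the coarse mesh to a 60 x 60 mesh
fine_xy = np.column_stack([g.ravel() for g in
          np.meshgrid(np.linspace(0, 12, 60), np.linspace(0, 12, 60))])
rem = ordinary_kriging(coarse_xy, coarse_z, fine_xy)

# Step 1.6: threshold the REM at L = -80 dBm and integrate the active area
active = rem > -80.0
area_pu = active.sum() * (12 / 60) ** 2   # m^2; one grid cell is ~(12/60)^2
```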

Central Entity Implementing Neural Networks
The central entity was implemented using ML techniques, specifically NNs. NNs have the ability to learn from data and improve their performance as they absorb more information. This makes them particularly useful in applications where the data are complex and difficult to understand and process using traditional methods. Furthermore, NNs can perform tasks that traditional methods cannot, such as recognizing patterns in unstructured data or processing input signals that change over time. In this case, some modules in the central entity were replaced by NNs, as shown in Figure 7. The steps of this operation (Algorithm 2) are described below.

Algorithm 2 Central Entity Computed by NNs
Step 2.1
The central entity collects the data from each of the SUs in the database at a specific time T_x. In the input estimator and processor module, processing is carried out to estimate the inputs driving an ensemble of NNs. As mentioned above, the three vectors coming from each SU at time T_x do not have the same length as the vectors at time T_{x+t_1} or T_{x−t_2}. Moreover, the size of the vectors differs from one SU to another, even at the same time T_x. Consequently, this module oversees building the input vectors PSD_rec_{T_t}{R_i(k)} and Edge_rec_{T_t}{R_i(k)} driving the NNs. For this, these vectors must always have the same length (in our case, set to 13 samples). This block is detailed below.

Step 2.2
The vector PSD_rec_{T_t}{R_i(k)} is evaluated by NN_1; at the same time, the vector Edge_rec_{T_t}{R_i(k)} is evaluated by NN_2 and by NN_3. This evaluation is performed sequentially, i.e., each vector of each SU is evaluated by its corresponding NN, one after the other, until the result of the i-th connected SU is obtained. As a result of this step, NN_1 provides an approximate number of detected PUs and their powers, NN_2 gives an approximate number of detected PUs and their central frequencies, and NN_3 returns an approximate number of detected PUs and their bandwidths. What happens when the number of PUs detected by each NN differs is discussed at the end of this section.

Step 2.3
The information obtained from Step 2.2 is evaluated to determine the number of PUs in the spectrum and their corresponding power, bandwidth, and carrier frequency. This evaluation is detailed below.

Step 2.4
The information obtained from Step 2.3 will be shared with the REM estimator module. This block receives (i) the geographic coordinates of the SUs in the network, (ii) the data of the possible PUs in the spectrum (power, carrier, and bandwidth), and (iii) an identifier corresponding to the time T_x at which the spectrum was monitored.
The remaining steps of this algorithm correspond to Steps 1.5 and 1.6 of the methodology described in Algorithm 1 (Section 4.2.3.1). The input estimator and processor (Step 2.1) and the information collector (Step 2.3) modules are described below.

Figure 8 shows the process of generating PSD_rec_{T_t}{R_i(k)}, the input vector of NN_1. First, the power, edge detector, and binary decision vectors were used, with samples ranging in the intervals [1–20], [1–20], and [2–21], respectively, according to the studies carried out in our preliminary work [39]. As mentioned above, using these vectors, we obtained (i) the frequency range sensed by each SU, (ii) the number of samples of the estimated power spectral density, and (iii) the number of windows into which the sensed spectrum was partitioned. An occupation value was assigned to each detected window: 1 for a possible PU transmission or 0 for noise. Accompanying each occupation value was an average power value, i.e., the power vector element representing the power estimate in each detected window. The occupation and power signals were reconstructed using these average values of each detected window, setting both vectors to a length of 1024 samples each. After that, the power- and occupation-reconstructed vectors were multiplied column by column, resulting in the Power_Occupation vector of 1024 samples. In this last vector, locations with estimated noise had a value of 0, and those with the possible transmission of one or several PUs had a value corresponding to the average power of the detected windows. A downsampling of the Power_Occupation vector was performed to reduce it to only 10 samples and to combine it with the coordinates of the SU (to which the analyzed vectors correspond) and the time T_x at which the signal Occupation_{T_t}{R_{i,l}(k)} was sensed, thus finally integrating the input vector for NN_1. It should be noted that the time parameter is extremely important, since the spectrum behaves differently over time.

Figure 9 shows the construction of the vector Edge_rec_{T_t}{R_i(k)}. Here, the occupancy and frequency band vectors corresponding to the frequency interval perceived by the corresponding SU were reconstructed. The frequency band vector was also multiplied column by column with the occupation vector, resulting in the Freq_Occupation vector of 1024 samples in length. This latter vector was assigned zeros where it corresponded to noise and nonzero values for the samples corresponding to one or several PU transmissions. Again, downsampling was applied to reduce this vector to only 10 samples and join it with the coordinates of the SUs and the time T_x corresponding to the sensing period. This concatenation drives the input of NN_2 and NN_3. A sketch of this construction for NN_1 follows this paragraph.

The NN_1 output corresponds to the power of each PU detected by each SU; when the PU power value was less than −80 dBm, it was interpreted as no transmission, corresponding to noise. The output of NN_2 is the carrier on which a possible PU was located. Finally, the NN_3 output delivers the transmission bandwidths corresponding to each detected PU in the analyzed spectrum.
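As referenced above, here is a minimal sketch of the input construction for NN_1 (function and variable names are ours): reconstruct occupation and power over 1024 bins, multiply them element-wise, downsample to 10 samples, and append the SU coordinates and sensing time to reach 13 values:

```python
import numpy as np

def build_nn1_input(edges, decisions, powers, su_xy, t_x, n_fft=1024, n_ds=10):
    """Sketch of the 'input estimator and processor' (Step 2.1)."""
    f = np.linspace(edges[0], edges[-1], n_fft)
    occupation = np.zeros(n_fft)
    power = np.zeros(n_fft)
    for lo, hi, b, p in zip(edges[:-1], edges[1:], decisions, powers):
        mask = (f >= lo) & (f < hi)
        occupation[mask] = b            # 1 = possible PU, 0 = noise
        power[mask] = p                 # average power of the window (dBm)
    power_occupation = occupation * power           # zeros where noise
    ds = power_occupation[:: n_fft // n_ds][:n_ds]  # naive downsampling
    return np.concatenate([ds, su_xy, [t_x]])       # n_ds + 2 + 1 = 13 values

x_nn1 = build_nn1_input([699.0e6, 700.0e6, 701.0e6], [1, 0], [-72.5, -96.1],
                        su_xy=[3.2, 7.5], t_x=1.0)
```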
The resulting NN outputs for each SU are shown in Figure 10. These outputs can be classified into three large blocks: the powers, carrier frequencies, and transmission bandwidths of each PU detected by the corresponding SU. The information collector block analyzes the output of each SU making up the network and sends it to the REM estimator module. For example, Figure 10a shows that the power of PU_1 is less than −80 dBm; the information collector interprets that PU_1 does not correspond to a transmission and, when sharing the information delivered by SU_1 with the REM estimator, assumes that SU_1 observes (j − 1) PUs (where j represents the number of possible PU transmissions). Figure 10b shows that the carrier frequencies of PU_1 and PU_j are outside the monitored space, so the information collector shares with the REM estimator that SU_2 only observes (j − 2) PUs. In Figure 10c, the B of PU_1 is a very small value (possibly corresponding to impulse noise); thus, the information collector shares with the REM estimator module that SU_n observes (j − 1) PUs. It is important to point out that any of the combinations shown in Figure 10 may change the number of observed PUs; i.e., if the power of a first PU does not exceed the threshold of −80 dBm, the carrier of a second PU is not in a correct frequency range, and the B of a third PU tends to zero, then these three PUs will be considered noise.
The proposed methodology was implemented in a real wireless communication environment. This controlled environment is explained in the next section.

Figure 10. (a) SU_1 observes j PUs, but PU_1 is discarded because it does not meet the threshold. (b) SU_2 observes j PUs; nevertheless, PU_1 and PU_j are discarded because they are not in the correct frequency range. (c) SU_n observes j PUs, but PU_1 is discarded for not having a large enough bandwidth to be considered a transmission.

Figure 11 shows the real, controlled environment implemented in our research. In this proposed scenario, we considered two PUs located at the center of the studied area while, at the same time, eight SUs were sensing their behavior in their geographical zone of influence. It is important to note that the SUs and PUs did not share their geographic coordinates among themselves, nor was that information used in spectrum sensing. The PUs were located at the center in the hope that most SUs could receive part of the PUs' signal. The SUs were placed randomly in the area of study.

Figure 11. The real implemented scenario.

Table 3 specifies the parameters of both the SUs and PUs. The SUs shared the channel occupancy with the database and the central entity in order to determine (i) how many PUs on average were observed in this environment, (ii) the B and F_C at which the detected PUs were located, and (iii) the area occupied by the detected PUs. The PUs and SUs were deployed in a 12 × 12 m² area containing structures such as walls, doors, windows, and columns, as indicated in Figure 11.

Results
This section outlines the results of implementing the proposed methodology in a real wireless communication environment. They are divided into two subsections; the first corresponds to the central entity based on digital signal processing, and the second concerns the central entity based on NNs.

Results with a Central Entity Based on Digital Signal Processing
This section shows the results of the implemented scenario presented in Figure 11. Figure 12 shows the reconstruction of the occupation of each SU by the central entity. In this case, we can highlight the following important points:

• There were SUs that failed to perceive both PUs;
• The central frequency F_C of each PU was perceived at a different frequency by each SU;
• The bandwidth B of each PU varied for each SU;
• The SUs farthest from the PUs failed to detect both PUs.

Figure 13 shows the result of constructing an approximation of the PSD for each SU, considering only mean power values for each spectrum section. These approximations were used for the estimation of the REM.

Figures 15a and 16a show the REMs of PU_1 and PU_2, respectively. These maps were created based on the information collected by the SUs at their different locations. Figures 15b and 16b show the respective areas occupied by PU_1 and PU_2. The values of the occupied area for each PU are area_{PU_1} = 60.76 m² and area_{PU_2} = 56.39 m². These results were obtained by placing a threshold L = −80 dBm on the obtained REM; in this way, the area with a power greater than this threshold was considered a space occupied by the PU.

Figure 15b shows that SU_6 and SU_8 did not observe the transmission of PU_1 (i.e., neither SU_6 nor SU_8 appears in this area of influence). This effect was a result of the structural distribution in which the implementation of the real scenario occurred. Nevertheless, collaboration between SUs might be expected to improve this result.

Results with a Central Entity Based on NNs
This section details the results obtained using a central entity and NNs. First, the NN training is presented, and then the results obtained by applying ML algorithms to this stage are shown.

Training
For the training of NN_1, NN_2, and NN_3, the backpropagation algorithm with the Levenberg-Marquardt training method, 1000 epochs, the mean square error as the loss function, and a low learning rate were used. In addition, 9000 input vectors and their corresponding outputs were used. To determine the best NN, the number of layers to use, and the number of neurons per layer, 12 configurations were studied for each of NN_1, NN_2, and NN_3: each NN was evaluated with la = 1, 2, 3, 4 layers and ne = 16, 32, 64 neurons per layer, giving 12 possible combinations of layers and neurons. These were compared to find the most convenient NN configuration for this work.
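For illustration, here is a training-grid sketch in Python; scikit-learn does not provide Levenberg-Marquardt, so the quasi-Newton 'lbfgs' solver stands in for it here, and the data are random placeholders rather than the paper's 9000 sensing vectors:

```python
import numpy as np
from itertools import product
from sklearn.neural_network import MLPRegressor

# Placeholder training set: 9000 input vectors of length 13, 3 toy targets
rng = np.random.default_rng(2)
X = rng.normal(size=(9000, 13))
y = rng.normal(size=(9000, 3))

# Grid of la = 1..4 layers and ne = 16/32/64 neurons per layer (12 configs);
# MSE is the loss in both this sketch and the paper's setup.
for la, ne in product([1, 2, 3, 4], [16, 32, 64]):
    net = MLPRegressor(hidden_layer_sizes=(ne,) * la,
                       solver="lbfgs", max_iter=1000)
    net.fit(X, y)
```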
The results are presented both for the entity using digital signal processing and for the one using the three NNs. The first parameter is the training time for each NN, shown in Table 4, indicating the time each NN spent in the training stage. Additionally, Figure 17 shows that the more neurons and layers an NN had, the longer its training time was. All the processing was carried out on the same computer (a MacBook Pro with 8 GB RAM and a first-generation M1 processor), and the training time for each NN in its different versions tended to follow similar behavior.

Statistics
This section analyzes the statistical results of the three NNs used in this proposal. To obtain these results, 27,000 entries were considered for each NN. Figure 18a presents the results of NN_1, responsible for granting the power value of the detected PUs, showing the power difference between the real values and those obtained with the different versions of NN_1. The difference between these values was, on average, −0.01 dBm for each version of NN_1, highlighting the accuracy of the predictive ability of the NN-based approach. Figure 18b shows only the average value, indicating that the margin of error between the expected value and those given by the different NNs was very small; the values were practically the same.

Figure 19 shows the result of NN_2, the network that gives the carrier value of each detected PU. In the proposed environment, NN_2 detected two PUs, and Figure 19a shows its precision in detecting each PU carrier, with F_{C1}, on average, at 699.4 MHz, regardless of the configuration of NN_2. Figure 19b shows the precision of NN_2 in detecting the PU_2 carrier, indicating that, on average, the carrier value was 700.494 MHz. Even though the network with 2 layers and 64 neurons per layer deviated slightly from the ideal value (700.5 MHz) compared with the other NN configurations, it continued to provide reasonably accurate results. In both images, the value obtained by applying digital signal processing is shown in blue and is quite close to the ideal values of 699.4934 MHz and 700.4935 MHz for PU_1 and PU_2, respectively, indicating a comparable level of accuracy for the NN-based approach.

Figure 20 presents the results of NN_3, the network that estimates the bandwidth B of each detected PU (Figure 20a shows B for PU_1). Moreover, Figure 20b shows that, notwithstanding the NN structure, the value of B for PU_2 is practically 1 MHz, which perfectly coincides with the ideal value (see Table 3). The blue dot in the image indicates the result of digital signal processing, with B_1 = 0.425 MHz and B_2 = 0.824 MHz. For B_1, the margins of error/precision of DSP and the NN were quite similar (DSP overestimates while the NN underestimates). For B_2, however, the NN was much more precise, suggesting that the NN-based method tends to improve on DSP or, in the worst case, provide a comparable absolute error.

To measure the precision of this methodology, the F1 score (F1) [48] was used, a common evaluation metric in the ML field for assessing the accuracy of binary classification models. This metric combines the precision and recall of the model into a single measure.
Precision refers to the ratio of true positives (TP) to the sum of true and false positives (FP), while recall refers to the ratio of true positives to the sum of true positives and false negatives (FN). The F1 value expresses the harmonic mean of precision and recall, giving more weight to low values. The F1 formula is

$$F1 = 2 \cdot \frac{\text{precision} \cdot \text{recall}}{\text{precision} + \text{recall}}.$$

An F1 value of 1 indicates that precision and recall are perfect, while an F1 of 0 indicates that the model is unable to correctly classify any of the samples. The F1 metric is valuable for comparing different binary classification models and selecting the best model for a given classification task. To determine this parameter (i.e., F1), the following four possible cases were considered (see Figure 21):

1. A window corresponding to a PU transmission and classified as such by the SU is a true positive (TP) value;
2. A frequency window corresponding to a PU transmission but classified as noise by the SU is a false negative (FN) value;
3. A window corresponding to noise but classified by the SU as a PU transmission is a false positive (FP) value;
4. A frequency window corresponding to noise and classified as such by the SU is a true negative (TN) value.

Figure 22 shows the F1 of NN_1, which indicates whether the PUs were correctly located. In this image, all the NNs have a nearly perfect performance, around 0.98.
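A short view of how F1 follows from these window counts (the counts here are toy numbers, not the paper's):

```python
def f1_from_counts(tp, fp, fn):
    """F1 from the four window-classification cases above (TN is not needed)."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

print(f1_from_counts(tp=980, fp=15, fn=25))   # ~0.98
```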
In the case of B, the F1 was estimated from the following cases (see Figure 23):

• The resulting NN_3 B matching the ideal bandwidth that corresponds to a transmission is a true positive (TP) case;
• The resulting NN_3 B corresponding to an ideal bandwidth close to zero is a true negative (TN) case;
• The resulting NN_3 B much greater than an ideal bandwidth that is close to zero is a false positive (FP) case;
• The resulting NN_3 B that should correspond to an ideal transmission B but is a value close to zero is a false negative (FN) case.

Figure 23 shows a ∆B value, which is the difference between the real and estimated bandwidths when there is a PU. When this parameter grows, it provides flexibility so that the resulting NN_3 B matches an ideal B corresponding to one transmission. However, when the value of ∆B is very small, the system becomes more inaccurate, since it detects as TP only those values that are close to the ideal value.

Figure 24 shows the F1 of NN_3. The NN with four layers and the NN with two layers both had an undesirable performance. This result can be explained by observing Figure 20, which shows that the result for B_1 is very close to the minimum value (ideal value − ∆B). Nevertheless, this undesirable performance was observed only when a low number of neurons per layer (i.e., 16) was employed. By properly configuring the number of neurons per layer to a sufficiently high value (e.g., 32 or greater), accurate performance, with F1 values around 0.95 and 0.96, was obtained with both two and four layers in the NN. Figure 24 also suggests overfitting in the NNs with two and four layers: the models for the 2-layer and 4-layer NNs, both with 16 neurons per layer, fit the training data too well and, as a result, do not generalize well to new data. Despite this, they have F1 values of 0.9 and 0.79, respectively.

Figure 25 shows the ratio of F1 score to training time for NN_1 and NN_3 (Figure 25a and Figure 25b, respectively). The value obtained from this relationship indicates which NN provides the best trade-off between the F1 accuracy result and the time invested in training the network to attain it. As can be noticed, an NN with a single layer provides the best F1 performance for the training time required to obtain it. Further increasing the number of layers increases the F1 score but requires a disproportionately longer training time, thus leading to a worse trade-off between benefit (F1 performance) and cost (required training time).

Finally, Figure 26 shows the behavior of the proposed methodology over time, during which the sensed spectrum was analyzed in a specific geographic area and the SUs collaborated to obtain the REM of the different detected PUs. As can be appreciated, the proposed methodology was able to characterize the dynamic temporal evolution of the REM in the geographical area of interest, thus providing a valuable tool for the study of CRNs.

Conclusions
In this work, a real wireless communication scenario was implemented to detect the occupancy of multiple PUs through several SUs via a central entity, permitting the determination of the area used by the PUs based on REM estimations. Apart from the REM constructions, other features, such as power, bandwidth, and central frequency of possible detected PUs from the multiband spectrum frames, were estimated by the considered SUs.
For this task, we proposed using neural networks as a substitute for the classical digital signal processing used in our preliminary work, expecting better performance and faster processing. Clearly, the training stage of the NNs, as shown by the results, is a factor to be considered. In this work, using the NNs provided higher precision in locating the PUs and less processing time for the central entity. However, it must be considered that the NN training time was not negligible and could even be high, depending on the number of layers and neurons in each NN. The difference between the real values and those of the NNs was minimal; it could even be said they were practically the same. In some cases, the NNs showed even better results than the digital signal processing, for instance, when detecting the PUs' carriers.
Neural networks are a powerful and useful tool in many applications but are not always the best option. In some cases, classical digital signal processing may be sufficient, while in others, neural networks can significantly improve performance and accuracy. Future work will aim to determine the optimal number of SUs needed to obtain the area occupied by the PUs, thus avoiding the excessive processing of the central entity.

Acknowledgments:
The authors are grateful to CONAHCYT, the Department of Electrical Engineering of UAM-Iztapalapa, and the Department of Electrical Engineering and Electronics of the University of Liverpool for providing the necessary support for this work.

Conflicts of Interest:
The authors declare no conflict of interest.