Dynamic Power Optimization for Secondary Wearable Biosensors in E-Healthcare Leveraging Cognitive WBSNs with Imperfect Spectrum Sensing



Introduction
With the rapid development of the Internet of Things (IoT), artificial intelligence, wireless communications, mobile edge cloud, smart devices, blockchain, wearable computing, big data, etc., electronic healthcare (e-healthcare) technologies are gradually replacing traditional paper-based healthcare systems as a new generation of pervasive healthcare solutions in medicine and public health [1,2,3]. Instead of relying on on-site face-to-face healthcare, e-healthcare has the potential to bring multiple benefits to patients and healthcare providers, such as time savings, improved information sharing, reduced medical errors, enhanced point of care, a more personalized patient experience, and better utilization of healthcare resources.
As one of the primary technologies for ubiquitous IoT, wireless body sensor networks (WBSNs), also known as wireless body area networks (WBANs), have been recognized as a promising solution to provide a flexible infrastructure tailored for e-healthcare and telemedicine systems [4,5,6]. In a WBSN, a set of resource-constrained body-centric radio biosensors (e.g., physiological, biokinetic, and ambient sensors [7]) can be attached on, placed in close proximity to, or implanted in a patient at home or in hospital to collect the patient's physiological data continuously, including electrocardiogram (ECG), electroencephalogram (EEG), electromyogram (EMG), blood oxygen saturation (SpO2), etc. With the aid of 5G and short-range wireless technologies, the collected data is then transmitted to medical servers or physicians via a coordinator, which is capable of data fusion and decision making. In this way, the integration of sensing, computing, and communications into e-healthcare enables continuous, reliable, non-intrusive remote health monitoring and point of patient care.
In spite of the advantages of e-healthcare using WBSNs, electromagnetic interference generated by wireless transmissions of various medical devices in hospital settings and many different biosensors in WBSNs is increasingly severe, which seriously affects system performance [8,9]. Particularly, many medical devices are susceptible to the electromagnetic interference incurred by wireless transmissions. Even worse, because of the interference effect on electrical circuits, a mistake or malfunction can occur in a medical device, which may be harmful to the patient who is currently using that device. Moreover, in realistic hospital settings, a large number of medical devices operate in the unlicensed Industrial, Scientific and Medical radio bands, which also overlap with the frequency bands used by most legacy short-range wireless technologies. Therefore, the limited available spectrum is another major concern for e-healthcare applications due to the coexistence of medical devices and other wireless devices. Beyond that, from the perspective of medical-grade mission-critical e-healthcare applications, different types of medical devices in hospital settings and biosensors should have different quality of service (QoS) requirements, i.e., different levels of priority, to obtain channel access [8,7]. For instance, a medical device with high priority (e.g., an anesthesia machine) has the privilege to access the wireless channel ahead of an ordinary biosensor with low priority, e.g., a wearable motion detection sensor. As a result, effective channel access for multiple medical devices and biosensors over the crowded radio spectrum, which avoids harmful electromagnetic interference to medical devices while meeting the QoS constraints with differentiated priorities, has become a focus of future e-healthcare.
One of the most promising and enabling approaches to solve the aforementioned challenges is to integrate cognitive radio into WBSN-assisted e-healthcare systems, so as to design an e-healthcare framework leveraging cognitive WBSNs (CWBSNs) [8,9,7,10,11]. In a traditional cognitive radio system, unlicensed secondary users are allowed to opportunistically access idle spectrum bands temporarily unused by licensed or primary users without interfering with them. For an emerging e-healthcare framework leveraging CWBSNs, patient-centric biosensors with low priority acting as secondary users are permitted to sense, identify, and access the spectrum opportunities while incurring no harmful interference to the coexisting primary users, i.e., primary medical devices (PMDs) with higher priority. The resulting e-healthcare based on the cognitive radio idea has been envisioned as a new solution for next generation healthcare services, and also has the potential to improve the spectrum efficiency of the system.
Fig. 1 illustrates a typical application scenario of our proposed e-healthcare system in a hospital environment including a hospital room and a central monitoring room. In the considered scenario, four underlay CWBSNs coexist with a primary system, sharing the authorized spectrum simultaneously in the hospital room. The primary system consists of four protected PMDs, i.e., a wireless ECG monitor, an infusion pump, an anesthesia machine, and a telemetry monitor, which are sensitive to electromagnetic interference. Within each CWBSN, several secondary wearable biosensors (SWBs) are attached to a patient to sense his or her physiological data and send the collected data to a near-body secondary coordinator (SC) through uplink transmission. To obtain the spectrum opportunities for SWBs, a highly capable cognitive gateway (CG) located in the hospital room performs spectrum sensing over all licensed channels continuously. According to the spectrum sensing results (SSR), the CG makes a global decision on the occupancy status of the licensed band in order to allocate an idle channel to every CWBSN. Additionally, the CG is able to forward the collected physiological data sent by SCs to medical servers or physicians by means of a cloud-assisted e-healthcare system in the central monitoring room [12].
Based on the proposed e-healthcare framework, the SC not only gathers all the physiological data sampled by every SWB within its CWBSN, but also transmits the physiological data to the CG. Although the harmful electromagnetic interference to PMDs is well mitigated via the cognitive radio approach, there still exists additional inter-CWBSN interference caused by overlapping transmission ranges among multiple nearby CWBSNs in the hospital room scenario. Such inevitable inter-CWBSN interference lowers the signal-to-interference-plus-noise ratio (SINR) and causes more losses of packets carrying critical physiological data, which can seriously threaten the life of patients. A lower SINR subsequently reduces the throughput or uplink capacity of every CWBSN, thereby decreasing transmission reliability and efficiency. The resulting inter-CWBSN electromagnetic interference is one of the top technical challenges that severely deteriorate the performance of the e-healthcare system. On the other hand, battery-powered SWBs attached to the patient have limited resources and energy, and in many cases it is either impossible or difficult to replace the batteries [13]. Hence, energy efficiency is a crucial issue for prolonging the lifetime of lightweight, energy-constrained SWBs. High energy consumption due to increased transmit power of radio transceivers in SWBs, as well as unreliable data transmissions caused by larger packet losses and more inter-CWBSN electromagnetic interference, are the main hurdles for the migration to the proposed e-healthcare framework.
These challenges motivate the need for a better understanding of the impact of inter-CWBSN electromagnetic interference and energy efficiency on the performance of the proposed e-healthcare system, which typically requires a trade-off between an SWB's transmit power and the total system capacity. Firstly, to mitigate the inter-CWBSN interference among coexisting SWBs, an effective transmit power control method is necessary for every SWB. By optimizing the transmit power, the power supply of each SWB can be coordinated to improve the energy efficiency and maintain long-term low-power operation. Secondly, the uplink capacity between any SWB and its associated SC in a CWBSN is a concave function of the transmit power of this SWB and the channel conditions [14]. Although the uplink capacity can be improved by allocating higher transmit power to every SWB, the inter-CWBSN interference will also increase, resulting in lower total system capacity. Overall, proper design of transmit power control and optimization schemes is vital to ensure reasonable performance of the proposed e-healthcare framework.
To the best of our knowledge, existing works on transmit power control for interference mitigation in WBANs and WBSNs, although providing valuable insights into the impact of inter-network interference and energy efficiency on the system performance, have one common limitation: most of them considered only pure WBAN or WBSN settings without capturing the integration of cognitive radio with such system scenarios. Although existing efforts such as [8,10] build a clear understanding of the integration of cognitive radio and WBSNs, they assume that spectrum sensing performed by secondary wireless devices is fully reliable, without taking spectrum sensing errors into account. Owing to hardware limitations, sensing errors, i.e., false alarm and miss detection, are inevitable in practical hospital scenarios [15,16]. Additionally, transmit power should be allocated and optimized for all SWBs locally in a distributed manner, because the SC lacks the global information needed for centralized scheduling. In this regard, the objective of power optimization for every SWB is to maximize its own utility. Note that the utility functions of SWBs are typically conflicting and their decisions are interactive. Moreover, because of the complex and dynamic environment, the CWBSN configuration can be extended to a more general scenario by relaxing the discrete time constraint to a continuous time model. With the continuous time setup in mind, it is of interest to investigate a distributed dynamic power optimization that captures the practical dynamic interactive process among all SWBs. This motivates us to resort to continuous-time differential game theory, rather than static game frameworks and decentralized optimization approaches, for modeling the problem of transmit power control and optimization in our proposed e-healthcare system.
In this paper, we investigate the problem of dynamic power optimization for SWBs in the proposed e-healthcare system, aiming to mitigate inter-CWBSN co-channel interference and improve the energy efficiency of SWBs. As opposed to existing studies in the literature where spectrum sensing is assumed to be perfect, we capture the unreliable spectrum sensing of the CG. To this end, we develop a distributed framework of dynamic power optimization for SWBs based on the theory of differential games and demonstrate how this framework can be implemented in our proposed e-healthcare system with imperfect spectrum sensing. The main contributions of this paper can be summarized as follows:
• We propose a distributed optimization framework of dynamic power optimization for SWBs, by jointly considering utility maximization in the competitive and cooperative scenarios, imperfect spectrum sensing, and the quality of physiological data sampling in e-healthcare leveraging CWBSNs. Particularly, the proposed framework not only mitigates the inter-CWBSN co-channel electromagnetic interference, but also achieves a trade-off between the transmit power of every SWB and the total system capacity.
• We formulate the problem of dynamic power optimization for SWBs in our proposed e-healthcare system as a differential game model, which maximizes the utility function throughout a continuous time interval while satisfying the evolution law of energy consumption in the battery of every SWB. To achieve the optimal allocation of the player's individual utility, we then transform the game model into two subproblems, i.e., a utility maximization problem and a total utility maximization problem, corresponding to the competitive and fully cooperative scenarios for each SWB under the proposed game framework.
• For the utility maximization problem in the competitive scenario, the non-cooperative optimal solution for power optimization is derived as a unique Nash equilibrium (NE) point via Bellman's dynamic programming. Meanwhile, we further obtain the cooperative optimal solution for power optimization with respect to the total utility maximization problem in cooperative scenarios by invoking Pontryagin's maximum principle. We also analyze the performance of dynamic power optimization according to the non-cooperation and cooperation relations among SWBs. Our analysis provides quantitative insights into the impact of non-cooperation and cooperation relations on the optimal power and the actual utility gained by each player.
The key acronyms and main notations used throughout this paper are summarized in Table 1 and Table 2 for ease of reference. The remainder of this paper is organized as follows. In Section 2, we provide an overview of the related work. Section 3 presents the system model, including the system configuration, the imperfect spectrum sensing model, and the physiological data sampling model. In Section 4, we propose the dynamic power optimization framework and derive the non-cooperative and cooperative optimal solutions. Moreover, we discuss and analyze the utilities of the non-cooperative and cooperative dynamic power optimization. Simulation results are provided in Section 5, and we conclude the paper in Section 6.

Related Work
In the past few years, a number of studies have investigated the problem of transmit power control in WBANs and WBSNs for interference management from different perspectives, including link quality [17,13,18], energy harvesting [19], and relay selection [20,21].
By analyzing the properties of link states via the received signal strength indicator (RSSI), Kim and Eom in [17] proposed a link-state-estimation-based power control scheme, which enables the transceiver of a sensor node to adapt its transmit power level and target the RSSI threshold range. In [13], Zang et al. developed an accelerometer-aided transmit power control mechanism by incorporating the impact of human body movement on link quality measured by the received RSSI value. The local accelerometer, a common WBAN device, was utilized to acquire the periodic link quality information without generating additional costs. With a hybrid strategy that jointly considers closed-loop control and posture/motion detection, Fernandes et al. in [18] devised a proactive power control method, which uses the RSSI value to predict the fading signal and the acceleration signal to determine the position of wearable devices during the gait cycle of the human body.
Different from the link quality-aware power control approaches, Liu et al. in [19] proposed a two-phase resource allocation scheme to jointly optimize the allocation of transmit power, source rate, and time slots in the context of energy harvesting-powered WBANs. On the basis of the time-varying and heterogeneous energy harvesting states, the long- and short-term QoS performances in the two adopted phases were markedly improved by analyzing the statistical knowledge of energy harvesting. In [20], Dong and Smith presented an integrated scheme of joint two-hop relay selection and power control that takes advantage of channel prediction to balance power saving against interference mitigation. In [21], a relay-aided transmit power control method was designed, which can automatically switch the transmitter's strategy between direct transmission and relay-aided transmission according to the on-body channel conditions. All the above works in [13,17,18,19,20,21], even though they provide valuable insights into the potential of bringing power control into WBANs and WBSNs, do not capture the integration of dynamic spectrum access with the considered system scenarios to protect the primary devices with high priority from harmful interference.
Alternatively, some studies in the literature have analyzed the performance of power control in WBANs by using machine learning to enable adaptive learning and intelligent decision making. To combat jamming attacks, Chen et al. [22] developed a reinforcement learning-based power control scheme for in-body sensors. Particularly, hotbooting Q-learning was adopted to assist the coordinator in optimizing the transmit power without knowledge of the in-body channel states and the utility models of other sensor nodes. In [23], Kazemi et al. further employed reinforcement Q-learning to obtain dynamic power optimization for the power controller in WBANs, which can explicitly learn from the environment and improve the performance, i.e., achieve substantial savings in energy consumption.

However, the machine learning-based power control methods in [22,23] apply only to the scenario of WBANs and do not apply to the emerging e-healthcare framework leveraging CWBSNs.
Game theoretic approaches for power allocation and optimization within a WBAN setup have received significant interest recently. By designing a cost function with a QoS requirement and an energy constraint for each sensor node, Zhao et al. in [24] formulated the problem of power control as a non-cooperative game model, and also derived an NE as the optimal power allocation. With the assistance of dynamic social interaction information for inter-WBAN relations, Zhang et al. in [25] proposed a power control game to maximize the network utility while minimizing the total power consumption, and further proved the existence of an NE point of the game. In [26], Moosavi et al. also utilized the non-cooperative game theoretic approach to tackle the problem of joint relay selection and power control, aiming to maximize the energy efficiency and meet the QoS requirements. Similarly, the idea of using a non-cooperative game framework as an analytical method to optimize the transmit power in WBANs has been further explored in [27,28]. However, the game theoretic models presented in these works [24,25,26,27,28] focused only on the non-cooperative scenarios, without capturing the potential benefits of designing cooperative power optimization strategies. Moreover, the existing game theoretic methods using the static game framework cannot capture how to dynamically optimize the transmit power with respect to the current time instant in a more realistic and dynamic environment. By contrast, we consider the actual behaviors of individual game players in both the competitive and cooperative scenarios, and investigate their impacts on the performance of dynamic power optimization.
To date, only a few efforts have been devoted to addressing the issue of power allocation and optimization in e-healthcare systems by tightly integrating cognitive radio with WBSNs to protect PMDs from harmful interference. Considering the issues of harmful interference to primary medical devices and differentiated QoS requirements, Phunchongharn et al. in [8] demonstrated the potential advantages of using cognitive radio in the design of wireless communication systems, particularly for e-healthcare applications in a hospital environment. With this idea, to obtain effective channel access, an interference-aware time-slotted handshaking protocol was designed for the primary and secondary medical devices with the desired QoS differentiation. The proposed protocol only extended the traditional carrier sense multiple access with collision avoidance mechanism to the cognitive radio based e-healthcare scenario. That work is instructive, although it does not investigate the power optimization issue that reveals an interplay between the transmit power of low priority medical devices and the total system capacity. In [10], Naeem et al. focused on a modern automated hospital scenario with various wireless devices, e.g., wireless sensor devices, personal wireless hubs, central controllers, etc., in healthcare facilities. By applying the cognitive radio idea, a limited set of unlicensed secondary wireless devices (i.e., wireless sensor devices and personal wireless hubs) was allowed to exploit the unused spectrum that was assigned to the licensed primary wireless devices, including all other healthcare equipment. Based on these settings, the authors presented a framework of interference-aware joint power control and assignment of multiple personal wireless hubs, which can maximize the total transmission data rate under the acceptable interference constraints. Despite its novel insights, the joint optimization framework in [10] is formulated under the ideal assumption of perfect spectrum sensing carried out by secondary wireless devices. Nevertheless, perfect spectrum sensing with no sensing errors is extremely difficult to achieve in practical CWBSN-based e-healthcare systems. To that end, the work presented in this paper adopts more practical limitations, e.g., imperfect spectrum sensing and the quality of physiological data sampling, to investigate the issue of transmit power control and optimization in the proposed e-healthcare system.
To sum up, although many works have addressed the power control problem in pure WBAN/WBSN scenarios, and a limited number of efforts have attempted to couple cognitive radio with e-healthcare, there still seems to be a gap between e-healthcare systems and the efficient integration of cognitive radio with WBSNs. To be more concrete, several fundamental technical difficulties remain:
• Most of the existing studies rely on the static game framework without capturing the practical constraints of dynamic interactive actions among all the sensor nodes subject to the dynamic time-varying transmit power adjustment in WBANs. From a practical viewpoint, the update of the transmit power level of each sensor node should be continuous in time. Hence, it is imperative to develop a general optimization framework to achieve dynamic power control via continuous-time differential game theory.
• In practice, errors in the spectrum sensing conducted by secondary wireless devices in e-healthcare systems that integrate cognitive radio and WBSNs are unavoidable as a result of hardware limitations. However, it is a widespread assumption in the existing works that spectrum sensing is fully reliable or perfect without any explicit errors. Hence, it is necessary to reconsider spectrum sensing to adapt to the practical environment.
In this paper, we tackle the above issues by providing a distributed optimization framework of dynamic power control for SWBs in our proposed e-healthcare system with imperfect spectrum sensing.

System Configuration
Consider an application scenario of the e-healthcare framework leveraging CWBSNs in a hospital environment with a central monitoring room and a hospital room, as shown in Fig. 1. We concentrate on one CG and $n_W$ underlay CWBSNs, denoted by $\mathcal{N}_W = \{1, 2, \cdots, n_W\}$, coexisting with a primary system in a hospital room and sharing the authorized spectrum in the same frequency band simultaneously. The whole authorized spectrum band is divided into $n_C$ licensed channels with equal bandwidth $W_C$, denoted by $\mathcal{N}_C = \{1, 2, \cdots, n_C\}$. We assume that the CG is equipped with an omnidirectional antenna, a predefined common control channel, and an energy detector that continuously senses all licensed channels through local real-time measurements. Based on the SSR, the CG decides whether or not the licensed channels are vacant and allocates an unoccupied channel to each CWBSN via the common control channel. The primary system consists of $n_P$ PMDs, denoted by $\mathcal{N}_P = \{1, 2, \cdots, n_P\}$, which have the full privilege of accessing the licensed channels from time to time. Particularly, the occupation time of every licensed channel assigned to a PMD follows an independent and identically distributed (i.i.d.) alternating ON-OFF process. The OFF state indicates the idle state, in which the unoccupied channel can be freely accessed by each CWBSN.
We use a time-slotted frame structure, as illustrated in Fig. 2, where a cognitive frame with duration $T_F$ for the CG is divided into four time slots with different durations, i.e., a spectrum sensing phase with duration $\tau_S$, an SSR reporting phase with duration $\tau_R$, a physiological data receiving phase with duration $\tau_{DR}$, and a physiological data transmission phase with duration $\tau_{DT}$. If one licensed channel is sensed to be idle in the spectrum sensing phase, the CG allocates the unoccupied channel to each CWBSN in the SSR reporting phase, and then receives and transmits the physiological data in the subsequent phases. Otherwise, if all the licensed channels are sensed to be busy, the CG keeps silent in the subsequent phases.
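The frame structure described above can be illustrated with a minimal Python sketch. The class and function names (`CognitiveFrame`, `slot_duration`, `cg_action`) are hypothetical and introduced only for illustration. The sketch assumes the frame duration is the sum of the four phases, and that the SSR reporting and data receiving phases are split equally among the SWBs of one CWBSN, in line with the equal-slot assumption the text adopts for the SC.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class CognitiveFrame:
    """Durations (in seconds) of the four phases of one cognitive frame."""
    tau_s: float   # spectrum sensing phase
    tau_r: float   # SSR reporting phase
    tau_dr: float  # physiological data receiving phase
    tau_dt: float  # physiological data transmission phase

    @property
    def t_f(self) -> float:
        """Total frame duration, assumed to be the sum of the four phases."""
        return self.tau_s + self.tau_r + self.tau_dr + self.tau_dt

    def slot_duration(self, n_swb: int) -> float:
        """Equal TDMA slot per SWB (assumption: the SSR reporting and data
        receiving phases are shared equally by the n_swb SWBs of a CWBSN)."""
        if n_swb <= 0:
            raise ValueError("n_swb must be positive")
        return (self.tau_r + self.tau_dr) / n_swb


def cg_action(sensed_busy: Sequence[bool]) -> str:
    """CG behaviour for the remaining phases, given per-channel sensing results."""
    # The CG stays silent only when every licensed channel is sensed busy.
    return "silent" if all(sensed_busy) else "allocate_and_relay"
```

For example, with illustrative (not source-given) durations `CognitiveFrame(0.002, 0.004, 0.012, 0.002)` and four SWBs, each SWB would receive a slot of about 4 ms.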
Let us assume that the unoccupied channel allocated to each CWBSN in the SSR reporting phase is denoted by channel $k$. Once the corresponding time slot has been obtained, regardless of the type of coordination mechanism, the SWB enters a physiological data sampling phase with duration $t_D$ and a physiological data transmission phase with duration $t_T$. To simplify the analysis, we assume that SC $C_m$ employs both the SSR reporting phase and the physiological data receiving phase of the cognitive frame to allocate time slots with equal duration $D_{\text{Slot}}$ to the $n_S$ SWBs in CWBSN $m$.
Apparently, the duration of each time slot assigned to each SWB in CWBSN $m$ can be expressed as:
$$D_{\text{Slot}} = \frac{\tau_R + \tau_{DR}}{n_S}.$$
Under the constraint of equal slot duration, we adopt the continuous time model to represent each SWB's operation duration, which is characterized by the time interval $[t_0, t_0 + D_{\text{Slot}}]$, where $t_0$ denotes the initial time of the time slot. Because of the adopted TDMA-based access policy, there is only one active SWB in each CWBSN within the time interval $[t_0, t_0 + D_{\text{Slot}}]$. Note that an SWB is an active node when it can currently access the associated time slot by using ordered or disordered channel access. Without loss of generality, we define an $n_W$-tuple $A_k(t)$ as follows to describe the current active SWB set, in which each SWB is an active node working over channel $k$ at time $t \in [t_0, t_0 + D_{\text{Slot}}]$:
$$A_k(t) = \left(s_{1,k}(t), s_{2,k}(t), \cdots, s_{n_W,k}(t)\right),$$
where $s_{m,k}(t)$ is the current active SWB in CWBSN $m$ over channel $k$ at time $t$. For notational brevity, we denote the current active SWB in CWBSN $m$ over channel $k$ at time $t$ by SWB $i$, i.e., $i \triangleq s_{m,k}(t)$, for $i \in \mathcal{N}_{m,S}$. We further use $t_0'$ to stand for the initial time of the physiological data transmission phase of every SWB, as shown in Fig. 2. Thus, the physiological data transmission phase is specified by the time interval $[t_0', t_0 + D_{\text{Slot}}]$, and we easily have $t_T = D_{\text{Slot}} + t_0 - t_0'$.
To deal with the co-channel electromagnetic interference problem among CWBSNs, we turn our attention to the uplink transmission by exploring a transmit power optimization mechanism for the current active SWB during the physiological data transmission phase. In fact, the transmit power of SWB $i$ in CWBSN $m$ is also limited by a maximum value $P^{\max}_{m,i}$. Based on the Friis formula in free space, similar to [29,30], the maximum transmit power of SWB $i$ in CWBSN $m$ can be approximately determined from $P_0$ (in dBm), the reference power received by SC $C_m$ at a reference distance $d_0$; $d_{m,i}$, the distance between SWB $i$ and SC $C_m$ in CWBSN $m$; and $\omega \ge 2$, the path-loss exponent. We assume that the uplink wireless channel state remains unchanged during the time interval $[t_0, t_0 + D_{\text{Slot}}]$ and is subject to distance-dependent power attenuation. Thereby, we can use a slow flat fading channel model to characterize the channel gain from SWB $i$ to SC $C_m$ over channel $k$. The SINR received at SC $C_m$ is then determined by $p_{j,s_{j,k}(t),k}(t)$, the instantaneous transmit power of SWB $s_{j,k}(t)$ in CWBSN $j$ over channel $k$ at time $t$; $g_{j,s_{j,k}(t),k}$, the interference gain from SWB $s_{j,k}(t)$ in CWBSN $j$ to SC $C_m$ over channel $k$; and $n_0$, the background noise power spectral density at SC $C_m$. According to Shannon's capacity formula, the uplink capacity (in bps) between SWB $i$ and SC $C_m$ in CWBSN $m$ over channel $k$ at time $t$ is calculated from this SINR scaled by a constant processing gain factor, whose constants $\beta_1$ and $\beta_2$ depend on the acceptable bit error rate (BER), modulation strategy, and coding scheme over channel $k$. Although the overall spectrum utilization efficiency is improved, the opportunistic uplink transmission of each SWB during the physiological data transmission phase may also generate extra interference to PMDs. In this case, an interference power constraint should be imposed to protect PMDs from the unavoidable electromagnetic interference caused by all SWBs. For PMD $n$, the accumulated interference caused by the current active SWB set working over channel $k$ at time $t$ must be kept below the interference temperature limit $I^{\max}_{n,k}$, given as follows [31]:
$$\sum_{m \in \mathcal{N}_W} p_{m,s_{m,k}(t),k}(t)\, g_{n,s_{m,k}(t),k} \le I^{\max}_{n,k},$$
where $p_{m,s_{m,k}(t),k}(t)$ is the transmit power of SWB $i$ in CWBSN $m$ over channel $k$ at time $t$, and $g_{n,s_{m,k}(t),k}$ is the channel gain from SWB $i$ in CWBSN $m$ to PMD $n$ over channel $k$.
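To make the interplay between transmit power, inter-CWBSN interference, and PMD protection concrete, the following Python sketch evaluates an SINR, a Shannon-type capacity, and the interference temperature check for one channel. It is an illustration under stated assumptions rather than the paper's exact expressions: the noise power is assumed to be $n_0 W_C$, and `gain_factor` is a hypothetical stand-in for the BER-dependent processing gain factor, whose closed form is not reproduced here. All function names are introduced for illustration only.

```python
import math


def sinr(p_own, g_own, p_others, g_others, n0, bandwidth):
    """SINR at the SC: desired received power over interference plus noise.

    Assumption: noise power = n0 * bandwidth, since n0 is a spectral density.
    p_others / g_others hold the powers and interference gains of the active
    SWBs of the other CWBSNs on the same channel.
    """
    interference = sum(p * g for p, g in zip(p_others, g_others))
    return (p_own * g_own) / (interference + n0 * bandwidth)


def uplink_capacity(bandwidth, sinr_value, gain_factor=1.0):
    """Shannon-type capacity in bps; gain_factor is a hypothetical placeholder
    for the processing gain factor mentioned in the text."""
    return bandwidth * math.log2(1.0 + gain_factor * sinr_value)


def pmd_protected(powers, gains_to_pmd, i_max):
    """True if the accumulated interference at one PMD is within its limit."""
    return sum(p * g for p, g in zip(powers, gains_to_pmd)) <= i_max
```

Raising the power of an interfering SWB in `p_others` lowers the SINR and hence the capacity of the observed link, which is the coupling that the power optimization has to balance.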

Imperfect Spectrum Sensing Model
Recall that the CG performs spectrum sensing during the spectrum sensing phase via local real-time measurements to identify the occupation status of licensed channels. Spectrum sensing can be formulated as a binary hypothesis testing problem with two hypotheses based on the presence and absence of the PMD over channel $k$, denoted by $H_k^B$ (busy) and $H_k^I$ (idle), respectively. We denote the sensing result that channel $k$ is occupied by the PMD by $\hat{H}_k^B$ and the sensing result that channel $k$ is vacant by $\hat{H}_k^I$. We also assume that the signal of the PMD follows an i.i.d. random process, and that the noise at the CG is i.i.d. circularly symmetric complex Gaussian with zero mean and variance $\sigma^2$. Let $\varepsilon_k$ denote the decision threshold used by the CG to decide whether the licensed channel is occupied by the PMD. For the CG, the detection probability $\phi_k^d$ and the false alarm probability $\phi_k^f$ when detecting the status of channel $k$ during the spectrum sensing phase with duration $\tau_S$ can be expressed as [32]:
$$\phi_k^d = Q\left(\left(\frac{\varepsilon_k}{\sigma^2} - \gamma_k - 1\right)\sqrt{\frac{\tau_S f_S}{2\gamma_k + 1}}\right), \qquad \phi_k^f = Q\left(\left(\frac{\varepsilon_k}{\sigma^2} - 1\right)\sqrt{\tau_S f_S}\right),$$
where $\gamma_k$ is the received average SINR from the PMD at the CG over channel $k$, $f_S$ is the signal sampling frequency over channel $k$, and $Q(x)$ is the standard Gaussian Q-function defined by
$$Q(x) = \frac{1}{\sqrt{2\pi}} \int_x^{\infty} \exp\left(-\frac{u^2}{2}\right) du.$$
In practical scenarios, the CG cannot guarantee perfect spectrum sensing due to channel variation and fading.
As such, imperfect spectrum sensing should be fully considered. In general, two kinds of spectrum sensing errors are inevitable: false alarm, in which the CG detects channel $k$ as occupied by the PMD although it is actually idle, and miss detection, in which the CG detects channel $k$ as vacant although it is actually busy. Particularly, miss detection causes severe co-channel interference to the PMD, while false alarm causes each CWBSN to miss a transmission opportunity. It is clear that imperfect spectrum sensing of the CG will seriously degrade the performance of CWBSNs. By combining the actual channel state with the sensing result of the CG, there are four different cases for spectrum sensing, which are summarized in Table 3. More specifically, for Case 1 and Case 4, the CG makes the correct decision on the occupancy status of the licensed channel. However, Case 2 is a false alarm, in which the CG and each CWBSN keep silent until the next cognitive frame, and Case 3 is a miss detection, in which the transmission of sampled physiological data by each SWB collides with the PMD. By using Bayes' rule, the probabilities of Cases 1, 2, 3, and 4 for channel $k$ can be summarized as follows:
$$\Pr\{\text{Case 1}\} = (1 - \phi_k)(1 - \phi_k^f), \quad \Pr\{\text{Case 2}\} = (1 - \phi_k)\,\phi_k^f, \quad \Pr\{\text{Case 3}\} = \phi_k (1 - \phi_k^d), \quad \Pr\{\text{Case 4}\} = \phi_k\, \phi_k^d,$$
where $\phi_k$ is the prior probability that channel $k$ is occupied by the PMD. Based on the probabilities of Cases 1, 2, 3, and 4 and the CWBSN actions listed in Table 3, we notice that only Case 1 achieves normal transmission during the SSR reporting phase and the physiological data receiving phase. Here, we focus on the uplink transmission of collected physiological data in each CWBSN. Let $\Psi_{m,i,k}$, $\Psi_{m,k}$, and $\Psi_k$ represent the average number of transmitted bits of physiological data for SWB $i$ in CWBSN $m$, for CWBSN $m$, and for the $n_W$ CWBSNs over channel $k$, respectively.
For SWB $i$ in CWBSN $m$ over channel $k$, the average number of transmitted bits of physiological data within the equal slot duration $D_{\mathrm{Slot}}$ assigned to each SWB can be approximately calculated as follows [33]:

Likewise, the average numbers of transmitted bits of physiological data for CWBSN $m$ and for the $n_W$ CWBSNs over channel $k$ can be derived as $\Psi_{m,k} = \sum_{i \in N_{m,S}} \Psi_{m,i,k}$ and $\Psi_k = \sum_{m=1}^{n_W} \Psi_{m,k}$, respectively.
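As a minimal sketch of the sensing model above, the following Python snippet evaluates the detection and false alarm probabilities and the probabilities of the four sensing cases. Two points are assumptions rather than statements from the text: the closed forms for the detection and false alarm probabilities are the standard energy-detector expressions for a complex-valued signal in circularly symmetric complex Gaussian noise, and the case probabilities are taken to be the joint probabilities of actual state and sensing result, with Case 1 meaning "idle and sensed idle". All function and parameter names are hypothetical.

```python
import math


def q_function(x):
    """Standard Gaussian Q-function, Q(x) = 0.5 * erfc(x / sqrt(2))."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def sensing_probabilities(eps, sigma2, gamma, tau_s, f_s):
    """Return (detection, false alarm) probabilities for one channel.

    Assumed energy-detector form (not reproduced in the surviving text):
      P_d = Q((eps/sigma2 - gamma - 1) * sqrt(tau_s * f_s / (2*gamma + 1)))
      P_f = Q((eps/sigma2 - 1) * sqrt(tau_s * f_s))
    """
    n = tau_s * f_s  # number of samples collected during the sensing phase
    p_d = q_function((eps / sigma2 - gamma - 1.0) * math.sqrt(n / (2.0 * gamma + 1.0)))
    p_f = q_function((eps / sigma2 - 1.0) * math.sqrt(n))
    return p_d, p_f


def case_probabilities(prior_busy, p_d, p_f):
    """Joint probabilities of the four (actual state, sensing result) cases.

    Assumed mapping: 1 = idle/sensed idle, 2 = idle/sensed busy (false alarm),
    3 = busy/sensed idle (miss detection), 4 = busy/sensed busy.
    """
    prior_idle = 1.0 - prior_busy
    return {
        1: prior_idle * (1.0 - p_f),
        2: prior_idle * p_f,
        3: prior_busy * (1.0 - p_d),
        4: prior_busy * p_d,
    }


if __name__ == "__main__":
    p_d, p_f = sensing_probabilities(eps=1.1, sigma2=1.0, gamma=0.2, tau_s=1e-3, f_s=1e6)
    print(p_d, p_f, case_probabilities(0.3, p_d, p_f))
```

Under these assumptions only the Case 1 probability scales the expected number of transmitted bits, since Case 1 is the only case with normal transmission.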

Physiological Data Sampling Model
We assume that the physiological data sampling phase with duration $t_D$ is divided into $K$ equal sampling intervals for each SWB, with one physiological data sampling operation in each sampling interval. Conceptually, each SWB can sample one or more attributes. For a given sampling interval, the physiological data sampling value of sampled attribute $\alpha$ acquired by SWB $i$ in CWBSN $m$ over channel $k$ is defined as:

where $A$ is the number of sampled attributes of physiological data for SWB $i$ in CWBSN $m$, for $i \in N_{m,S}$. Over the $K$ sampling intervals, the sampling values acquired by SWB $i$ in CWBSN $m$ over channel $k$ for sampled attribute $\alpha$ can be given by a sampling value vector. The physiological data sampling values collected by SWB $i$ in CWBSN $m$ over channel $k$ during the physiological data sampling phase can then be expressed by a sampling value matrix as follows:

Algorithm 1 Approximate Probability Distribution Generation Algorithm
Calculate the number of sampling values within subinterval $(\Lambda_{l-1}, \Lambda_l]$, denoted by $z_{m,i,k,\alpha}(l)$.
Calculate the probability of sampling value.

So far, we have characterized the sampling value $X_{m,i,k,\alpha}(\cdot)$ in each sampling interval for sampled attribute $\alpha$ sensed by SWB $i$ in CWBSN $m$ over channel $k$. Based on this description, we are also interested in the probability distribution of the sampling values, viewed as a statistical property of the physiological data sampling. To this end, we use a mathematical statistics method to construct the approximate probability distribution of the sampling values. Specifically, we propose Algorithm 1 to obtain the approximate probability distribution $P[X_{m,i,k,\alpha}(\cdot)]$ of sampling value $X_{m,i,k,\alpha}(\cdot)$.
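The two surviving steps of Algorithm 1 (counting the sampling values that fall in each subinterval and converting the counts into probabilities) can be sketched as a simple histogram estimate. The following Python snippet is an illustration under stated assumptions, not the full algorithm: the probability of subinterval $l$ is taken to be the count $z(l)$ divided by the number of sampling values $K$, the subintervals are equal-width and half-open on the left, and a sample equal to the minimum value is placed in the first subinterval. The function name and arguments are hypothetical.

```python
import math


def approximate_distribution(samples, x_min, x_max, num_subintervals):
    """Histogram estimate of the sampling-value distribution for one attribute.

    Subinterval l (1-based) is (L_{l-1}, L_l], with L_0 = x_min and equal widths.
    Assumption: P(l) = z(l) / K, where z(l) counts samples in subinterval l.
    """
    if not samples:
        raise ValueError("at least one sampling value is required")
    if num_subintervals <= 0 or x_max <= x_min:
        raise ValueError("invalid subinterval configuration")
    width = (x_max - x_min) / num_subintervals
    counts = [0] * num_subintervals
    for x in samples:
        if not (x_min <= x <= x_max):
            raise ValueError("sampling value outside [x_min, x_max]")
        # ceil maps x in (L_{l-1}, L_l] to index l; x == x_min goes to l = 1
        l = min(max(math.ceil((x - x_min) / width), 1), num_subintervals)
        counts[l - 1] += 1
    k = len(samples)
    return [z / k for z in counts]


if __name__ == "__main__":
    print(approximate_distribution([1.0, 2.0, 3.0, 4.0], 0.0, 4.0, 2))
```

Values that sit numerically very close to a subinterval boundary can be affected by floating-point rounding, so an implementation working on real measurements may prefer integer-scaled boundaries.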
In practice, the physiological data sampling values of each SWB do not remain constant. Anomalies or outliers occur when the current sampling values monitored by a SWB deviate significantly from the normal pattern of the physiological data. Abnormal sampling may result from many causes, such as emergency situations (e.g., health state degradation and heart attack) and malfunctions of the SWB (e.g., improperly attached SWB, detached SWB, and false treatment) [34]. For instance, the normal SpO2 ratio is larger than 95%, and when this ratio falls below 90%, an emergency alarm must be triggered as a consequence of lung problems or respiratory complications [35]. In order to describe the normal pattern of physiological data for sampled attribute $\alpha$ acquired by SWB $i$ in CWBSN $m$ over channel $k$, we define a statistical probability distribution of sampling value $X_{m,i,k,\alpha}(\cdot)$ as a priori information:

Note that the smaller the relative divergence between the approximate probability distribution and the statistical probability distribution, the better the quality of physiological data sampling. It has been shown that the Kullback-Leibler divergence is an effective tool to measure how one probability distribution diverges from another reference probability distribution [36]. Therefore, the relative divergence between the approximate and statistical probability distributions can be calculated under the Kullback-Leibler divergence framework as follows:

For the $A$ sampled attributes of SWB $i$ in CWBSN $m$ over channel $k$, the relative divergences are collected in a vector.
Correspondingly, for the $A$ sampled attributes, the weighted average relative divergence for SWB $i$ in CWBSN $m$ over channel $k$ is given by (18), where $w_{m,i,k,\alpha}$ is the weight of sampled attribute $\alpha$ monitored by SWB $i$. Note that the weight of sampled attribute $\alpha$ reflects the importance of this attribute among all the sampled attributes. From (18), the smaller the weighted average relative divergence, the better the quality of physiological data sampling. The weighted average relative divergence can therefore be used to characterize the quality of physiological data sampling for each SWB.
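The divergence computation described above can be sketched in a few lines of Python. The snippet makes three assumptions that the surviving text does not confirm: the divergence is taken from the approximate distribution to the statistical (reference) distribution, the natural logarithm is used, and the weighted average is the plain weighted sum of the per-attribute divergences with weights that sum to one. Function names are hypothetical.

```python
import math


def kl_divergence(approx, reference):
    """Kullback-Leibler divergence D(approx || reference) in nats.

    Terms with approx[l] == 0 contribute 0; a zero reference probability
    where approx[l] > 0 gives an infinite divergence.
    """
    if len(approx) != len(reference):
        raise ValueError("distributions must have the same support size")
    total = 0.0
    for p_l, q_l in zip(approx, reference):
        if p_l == 0.0:
            continue
        if q_l == 0.0:
            return math.inf
        total += p_l * math.log(p_l / q_l)
    return total


def weighted_average_divergence(divergences, weights):
    """Weighted sum of per-attribute divergences (weights assumed to sum to 1)."""
    if len(divergences) != len(weights):
        raise ValueError("one weight per attribute is required")
    return sum(w * d for w, d in zip(weights, divergences))


if __name__ == "__main__":
    d = [kl_divergence([0.5, 0.5], [0.25, 0.75]),
         kl_divergence([0.2, 0.8], [0.2, 0.8]),
         kl_divergence([0.9, 0.1], [0.5, 0.5])]
    print(weighted_average_divergence(d, [0.3, 0.3, 0.4]))
```

A SWB whose histogram matches the reference pattern for every attribute obtains a weighted average divergence of zero, while abnormal sampling on a heavily weighted attribute raises it the most.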

Dynamic Power Optimization Framework: Problem Formulation, Optimal Solution, and Utility Discussion

Problem Formulation
Because the SC lacks the global information required for centralized scheduling, transmit power should be allocated in a distributed manner by the currently active SWBs. In general, a currently active SWB has to reduce its transmit power to cope with the inter-CWBSN co-channel electromagnetic interference problem. However, reducing its transmit power comes at the expense of its own uplink capacity, because the uplink capacity between the SWB and the SC is a concave function of the current transmit power and channel conditions according to (5). Meanwhile, the objective of transmit power allocation for each SWB is to maximize its own utility function. The utility functions of the currently active SWBs conflict with one another, and their decisions are interdependent. Moreover, it is more realistic to optimize the transmit power dynamically with respect to the current time instant. This motivates us to formulate the problem of dynamic power optimization for the currently active SWB in each CWBSN as a differential game model.

Definition 1. (Differential Game Framework) The differential game theoretic framework for dynamic power optimization of the currently active SWB set working over channel $k$ at time $t \in [t_0, t_0 + D_{\mathrm{Slot}}]$ is defined as a 4-tuple $G_k = \langle A_k(t), \{\mathcal{P}_{m,i,k}(t)\}, \{\mathcal{E}_{m,i,k}(t)\}, \{U_{m,i,k}\} \rangle$, where
• Set of players $A_k(t)$: the set of the currently active SWBs playing the game. The players are rational decision makers and act throughout a time interval in game $G_k$.
• Set of strategies $\{\mathcal{P}_{m,i,k}(t)\}$: The strategy of a player is defined as its instantaneous transmit power, bounded by the maximum transmit power. $\mathcal{P}_{m,i,k}(t) = \{p_{m,i,k}(t) \mid \forall t\}$ is the strategy space of SWB $i$ in CWBSN $m$ over channel $k$.
• Set of states $\{\mathcal{E}_{m,i,k}(t)\}$: The state of a player refers to its instantaneous energy consumption. $\mathcal{E}_{m,i,k}(t)$ is the state space of SWB $i$, where $E_{m,i,k}(t)$ denotes the energy consumption value of SWB $i$ in CWBSN $m$ over channel $k$.
• Set of utility functions $\{U_{m,i,k}\}$: $U_{m,i,k}(p_{m,i,k}, E_{m,i,k})$ is the utility function of SWB $i$ in CWBSN $m$ over channel $k$. The objective of each SWB is to maximize its utility function by rationally selecting the optimal strategy and state.
Note that a differential game is a continuous-time dynamic game in which the utility function of a player depends on its own strategy and state. Hereinafter, the terms "player" and "SWB" are used interchangeably unless explicitly stated otherwise. Under this game framework, we focus on how to formulate the state space $\mathcal{E}_{m,i,k}(t)$ and the utility function $U_{m,i,k}$ of SWB $i$ in CWBSN $m$ over channel $k$. Because the state of SWB $i$ must satisfy the state dynamics as in [37], we introduce a linear differential equation to represent the state space of SWB $i$. Let $E^R_{m,i,k}(t)$ denote the residual energy in the battery of SWB $i$ in CWBSN $m$ over channel $k$ at time $t \in [t_0, t_0 + D_{\mathrm{Slot}}]$. Similar to [38], the evolution law of energy consumption in the battery of SWB $i$ can be defined by a linear differential equation as follows:

We next discuss how to characterize the utility function $U_{m,i,k}$ of SWB $i$ in CWBSN $m$ over channel $k$. Given the maximum transmit power $P^{\max}_{m,i}$, the power reduction of SWB $i$ is $P^{\max}_{m,i} - p_{m,i,k}(t)$. Hence, the power reduction efficiency for SWB $i$ at time $t$ can be expressed as:

Recall the weighted average relative divergence in (18), which characterizes the quality of physiological data sampling for each SWB. To formulate the revenue of power reduction for each SWB, we devise a revenue pricing factor for the power reduction performed by each SWB that jointly considers the power reduction efficiency and the weighted average relative divergence. This requires understanding the interplay between the power reduction efficiency and the weighted average relative divergence, which typically involves a trade-off between them. A similar trade-off design for a pricing factor can be found in the formulation of the energy-per-capacity factor [31]. Inspired by this design, the revenue pricing factor for every SWB should reflect two points. On the one hand, the higher the power reduction efficiency, the higher the pricing for a SWB, which means that a higher reward is attached to that SWB for its contribution to power reduction. On the other hand, the worse the quality of physiological data sampling because of abnormal sampling caused by emergency situations, the larger the weighted average relative divergence for a SWB. Thus, a higher pricing, corresponding to a larger reward, should be offered to that SWB because of its importance in reflecting the current emergency, e.g., heart attack or health state degradation.
By combining these considerations, for SWB $i$ in CWBSN $m$ over channel $k$, the revenue pricing factor for power reduction is defined as the product of the power reduction efficiency and the weighted average relative divergence, which can be written as follows:

Clearly, an increase in either $\eta_{m,i}$ or $E[d_{m,i,k}]$ results in a higher revenue pricing factor for SWB $i$. Consequently, the revenue of power reduction for SWB $i$ in CWBSN $m$ over channel $k$ at time $t$ is obtained as the product of the revenue pricing factor and the value of power reduction for SWB $i$, i.e.:

So far, we have obtained the revenue of power reduction for each SWB. Next, we formulate the cost of energy consumption for each SWB in order to derive the utility function $U_{m,i,k}$ of SWB $i$ introduced in Definition 1. The cost is based on the energy consumption per unit of time for SWB $i$ in CWBSN $m$ over channel $k$ within the physiological data transmission phase with duration $t_T$. We denote by $\lambda^C_E$ the cost pricing for energy consumption per unit of time for each SWB. As a result, the cost of energy consumption for SWB $i$ in CWBSN $m$ over channel $k$ at time $t \in [t_0, t_0 + D_{\mathrm{Slot}}]$ can be expressed as:

Under the above setup, the utility function of SWB $i$ in CWBSN $m$ over channel $k$ at time $t \in [t_0, t_0 + D_{\mathrm{Slot}}]$ can be characterized by:

Our target is to maximize the utility function throughout the time interval $[t_0, t_0 + D_{\mathrm{Slot}}]$ by adaptively deriving the optimal transmit power $p^{OP}_{m,i,k}(t)$ and the optimal energy consumption $E^{OE}_{m,i,k}(t)$ for SWB $i$ in CWBSN $m$ over channel $k$, which can be obtained by:

where $r \in (0, 1)$ is the discount factor. This target allows us to formulate the utility maximization problem as (27), where $\tilde{E}_{m,i,k}(t = t_0)$ is the initial energy consumption value at the initial time $t_0$. Constraint (27b) limits the transmit power level of SWB $i$. Constraint (27c) represents the evolution law of energy consumption in the battery of SWB $i$. Finally, constraint (27d) expresses the initial value of energy consumption at the initial time of the physiological data transmission phase. To solve the utility maximization problem in (27), we focus on two dynamic power optimization strategies based on the actual behaviors of individual players in the proposed game framework. The first strategy considers a competitive scenario in which the players act independently to maximize their own individual utility without binding agreements with other players. The alternative strategy considers a cooperative scenario in which the players make joint strategies from a social point of view to maximize the overall utilities through full cooperation among all players.
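The verbal construction of the utility function in this subsection can be collected in the following LaTeX sketch. It is a hedged summary rather than a reproduction of the numbered equations: the symbols $\lambda^{R}_{m,i,k}$, $R_{m,i,k}$, and $C_{m,i,k}$ are hypothetical names introduced here for the revenue pricing factor, the revenue, and the cost; the explicit forms of the power reduction efficiency $\eta_{m,i}$ and of the cost are not restated; and writing the utility as revenue minus cost is an assumption about how the two parts are combined.

```latex
% Hypothetical symbols: \lambda^{R}, R, C are illustrative names only.
\begin{align}
  \lambda^{R}_{m,i,k}(t) &= \eta_{m,i}(t)\, E[d_{m,i,k}]
    && \text{(efficiency times weighted average divergence)} \\
  R_{m,i,k}(t) &= \lambda^{R}_{m,i,k}(t)\,\bigl(P^{\max}_{m,i} - p_{m,i,k}(t)\bigr)
    && \text{(pricing factor times power reduction)} \\
  U_{m,i,k}(t) &= R_{m,i,k}(t) - C_{m,i,k}(t)
    && \text{(assumed: revenue minus cost)}
\end{align}
```

Read this way, a SWB with a large weighted average relative divergence is rewarded more strongly for each unit of power reduction, which matches the two design points stated for the revenue pricing factor.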

Non-cooperative Optimal Solution
Here, we focus on the utility maximization problem posed by the competitive scenario, in which each SWB aims to maximize its own individual utility within the physiological data transmission phase. In this scenario, the optimal solution of the non-cooperative game $G_k$ is the NE point, provided that the NE exists and is unique.
where the state $E_{m,i,k}(t)$ of SWB $i$ should satisfy constraints (27c) and (27d) on the interval $[t_0, t_0 + D_{\mathrm{Slot}}]$.
We wish to remark that the NE solution or optimal strategy of SWB
where $\chi_i(\cdot)$ and $\phi_i(\cdot)$ are continuously differentiable functions for SWB $i$ under the general differential game model.
Proof: The proof of the lemma is omitted due to space limitations. A similar detailed proof can be found in [37].
To be precise, within our differential game theoretic framework, both of the differentiable functions in Lemma 1 can be instantiated for SWB $i$ in CWBSN $m$ over channel $k$. Following Lemma 1 and the above analysis, we also assume henceforth that there exists a continuously differentiable auxiliary value function for SWB $i$ in CWBSN $m$ over channel $k$, denoted by $L_{m,i,k}(p_{m,i,k}, E_{m,i,k})$, which is subject to the following partial differential equation:

Theorem 1. The non-cooperative optimal solution $p^*_{m,i,k}(t)$ to the utility maximization problem in (27) constitutes the NE of the non-cooperative game $G_k$ if and only if the non-cooperative optimal solution $p^*_{m,i,k}(t)$ and the value function $L_{m,i,k}(p^*_{m,i,k}, E^*_{m,i,k})$ are formulated as in (31) and (32), respectively.
Proof: Please refer to Appendix A.
Note that Theorem 1 guarantees the existence and uniqueness of the NE point of the non-cooperative game $G_k$, in that a specific fixed value quantifies the NE point for each SWB. Theorem 1 further ensures the convergence of the non-cooperative optimal solution $p^*_{m,i,k}(t)$ to the NE point. Applying Theorem 1, the average number of transmitted bits of physiological data within slot duration $D_{\mathrm{Slot}}$ under imperfect spectrum sensing can be rewritten as:

Meanwhile, the evolution law of the non-cooperative optimal energy consumption $E^*_{m,i,k}(t)$ for SWB $i$ in CWBSN $m$ over channel $k$ can be represented by (34). Solving (34), the non-cooperative optimal energy consumption $E^*_{m,i,k}(t)$ is given by (35), where $\xi_1$ is a constant. Substituting (31), (32), and (35) into (A.3), we further obtain (36).

Cooperative Optimal Solution
We now consider the alternative cooperative dynamic power optimization strategy, in which all SWBs fully cooperate in their common interest to obtain the highest total utility. Our objective is to maximize the sum of the utility functions of all SWBs throughout the time interval $[t_0, t_0 + D_{\mathrm{Slot}}]$ while simultaneously satisfying constraints (27b)-(27d). This is attained by a suitable choice of the cooperative optimal transmit power and energy consumption for each SWB, as detailed below. The total utility maximization problem can be modeled as (37). Let $p_{m,i,k}(t)$ and $E_{m,i,k}(t)$ be the cooperative optimal transmit power and energy consumption for SWB $i$ in CWBSN $m$ over channel $k$, respectively. Invoking Pontryagin's maximum principle, we further assume that there exists a continuously differentiable auxiliary value function $F_{m,i,k}(p_{m,i,k}, E_{m,i,k})$ satisfying the partial differential equation given as follows [37]:

Theorem 2. The cooperative optimal transmit power $p_{m,i,k}(t)$ constitutes the cooperative optimal solution to the total utility maximization problem in (37) if and only if the cooperative optimal transmit power $p_{m,i,k}(t)$ and the value function $F_{m,i,k}(p_{m,i,k}, E_{m,i,k})$ can be formulated as:

Proof: Please refer to Appendix B.
By applying the cooperative optimal solution $p_{m,i,k}(t)$ in Theorem 2, the average number of transmitted bits of physiological data within slot duration $D_{\mathrm{Slot}}$ under imperfect spectrum sensing is equivalent to:

According to the above result, the evolution law of the cooperative optimal energy consumption $E_{m,i,k}(t)$ for SWB $i$ in CWBSN $m$ over channel $k$ can be characterized by (42). Finally, solving (42), the cooperative optimal energy consumption $E_{m,i,k}(t)$ is given by (43), where $\xi_2$ is a constant.
Based on the outcomes of Theorem 1 and Theorem 2, we present a distributed dynamic optimal transmit power update algorithm that jointly realizes the dynamic optimal transmit power allocation and its update, which is sketched in Algorithm 2. The proposed algorithm returns the non-cooperative and cooperative optimal power allocation for each SWB under the two dynamic power optimization strategies. First, we initialize and calculate all the necessary parameter values as indicated in the algorithm. Then, we compute and allocate the optimal transmit power in the non-cooperative and cooperative modes according to the proposed differential game theoretic framework. Next, we devise an iterative process to ensure fast convergence of the transmit power update. The optimal transmit power of each SWB is updated until the interference temperature limit for each PMD is guaranteed, at which point the iteration terminates.
Algorithm 2: for $m = 1$ to $n_W$ do ... end for
Complexity Analysis: We now analyze the complexity of Algorithm 2. The main computational cost of Algorithm 2 lies in obtaining the optimal transmit power and then implementing the power update. The total number of operations for obtaining the optimal power allocation is $|A_k(t)|\, n_W$. Note that $A_k(t)$ is an $n_W$-tuple player set representing the currently active SWB set. Thus, the complexity order is $O(n_W^2)$. For the power update implementation, there are at most $|A_k(t)|\, n_W$ iterations in the worst case to guarantee the interference temperature limit for each PMD. Hence, the complexity of these inner iterations is also on the order of $O(n_W^2)$. In addition, since there are $n_P$ PMDs in the primary system, the power update process entails at most $O(n_P)$ operations. The complexity of the power update implementation is therefore $O(n_P n_W^2)$. Consequently, the overall computational complexity of the proposed algorithm is on the order of $O(n_W^2) + O(n_P n_W^2) = O(n_P n_W^2)$.
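The termination rule of the power update (iterate until the interference temperature limit of every PMD is respected) can be illustrated with the following Python sketch. It is not Algorithm 2 itself, whose listing is not reproduced here: the uniform scaling step, the linear interference model, and all names (`update_powers`, `gains`, `step`) are assumptions made for illustration. Powers and limits are in linear units such as mW.

```python
def aggregate_interference(powers_mw, gains):
    """Interference at each PMD: sum over active SWBs of gain * transmit power."""
    return [sum(g * p for g, p in zip(row, powers_mw)) for row in gains]


def update_powers(powers_mw, gains, limit_mw, step=0.9, max_iter=1000):
    """Scale all transmit powers down until every PMD sees <= limit_mw.

    gains[n][j] is the (hypothetical) channel gain from active SWB j to PMD n.
    Assumption: a uniform multiplicative back-off by `step` per iteration.
    """
    if not 0.0 < step < 1.0:
        raise ValueError("step must lie in (0, 1)")
    powers = list(powers_mw)
    for _ in range(max_iter):
        if all(i <= limit_mw for i in aggregate_interference(powers, gains)):
            return powers
        powers = [p * step for p in powers]
    raise RuntimeError("interference limit not met within max_iter iterations")


if __name__ == "__main__":
    gains = [[0.5, 0.5], [0.2, 0.9]]  # two PMDs, two active SWBs
    print(update_powers([50.0, 80.0], gains, limit_mw=60.0))
```

Each pass over the PMDs costs on the order of the number of PMDs times the number of active SWBs, which is the kind of per-iteration term that enters the complexity count above.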

Utility Discussion
As described above, we have derived the non-cooperative and cooperative optimal solutions to the utility maximization problem and the total utility maximization problem, respectively. We can observe that the cooperative optimal power is lower than the non-cooperative optimal power, because the performance of the CWBSNs improves when the SWBs cooperate with each other to achieve optimal uplink transmission within each CWBSN. To gain further insight into the performance of dynamic power optimization for the individual players under non-cooperation and cooperation, we consider another performance metric: the actual utility distributed to each player.
According to differential game theory [37], the utility that each player gains in the non-cooperative dynamic power optimization can be defined as the value function $L_{m,i,k}(p^*_{m,i,k}, E^*_{m,i,k})$, which has been reformulated in (36). It then remains to calculate the utility of each player under the cooperative dynamic power optimization. Note that the utility of each player is taken to be the Shapley value in the cooperative game framework, which aims at the fairest allocation of the utilities obtained by all the collaborative players. Let us first define a nonvoid proper subset $M_k(t)$ of player set $A_k(t)$, in which the $|M_k(t)| = \vartheta$ players collaborate fully during the game, with $M_k(t) \subset A_k(t)$ and $M_k(t) \neq \emptyset$ over channel $k$ at time $t$. The other $n_W - \vartheta$ players are non-cooperative players. In other words, $\vartheta$ collaborative players agree to form a partial coalition $M_k(t)$ of player set $A_k(t)$ to play against the other $n_W - \vartheta$ non-cooperative players. The following definition gives a formal formulation of the Shapley value that is used to distribute the total cooperative utilities to the collaborative players.

Definition 3. (Shapley Value) The Shapley value $O_{m,i,k}(p_{m,i,k}, E_{m,i,k})$ for SWB $i$ in CWBSN $m$ over channel $k$ under the cooperative dynamic power optimization is defined as in (44), where $W_{m,i,k}(M_k(t))$ is the predefined value function of the cooperative dynamic power optimization under partial coalition $M_k(t)$, and $L_{m,i,k}(p^*_{m,i,k}, E^*_{m,i,k})$ is the value function derived under the non-cooperative dynamic power optimization.
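To make the role of the Shapley value concrete, the following Python sketch computes the textbook Shapley value for a small cooperative game. It illustrates the general concept only and is not the specific expression in (44): the characteristic function `value` is a hypothetical stand-in, which in the present setting would be built from the coalition value function and the non-cooperative value function.

```python
from itertools import combinations
from math import factorial


def shapley_values(players, value):
    """Textbook Shapley value of each player.

    `value` maps a frozenset of players to the utility that coalition can secure.
    Each player receives its marginal contribution averaged over all join orders.
    """
    n = len(players)
    result = {}
    for i in players:
        others = [j for j in players if j != i]
        total = 0.0
        for size in range(n):
            # weight of a coalition of this size that player i joins
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                coalition = frozenset(subset)
                total += weight * (value(coalition | {i}) - value(coalition))
        result[i] = total
    return result


if __name__ == "__main__":
    # Hypothetical symmetric three-player game: v(S) = |S|^2.
    print(shapley_values([1, 2, 3], lambda s: float(len(s)) ** 2))
```

The allocation is efficient, meaning that the individual values sum to the value of the grand coalition, which is the property that makes it a natural rule for distributing the total cooperative utility.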
With this definition, we need a theoretical derivation of the predefined value function in order to calculate the Shapley value. Notice that the partial coalition $M_k(t)$ refers to a partially cooperative scenario in which $\vartheta$ players agree to cooperate, in contrast to a fully cooperative scenario in which all $n_W$ players make joint strategies via full cooperation. Under this scenario, we build a dynamic programming formulation to derive an optimal solution to the cooperative dynamic power optimization under partial coalition $M_k(t)$. In this formulation, our target is to maximize the total utilities of all SWBs belonging to the partial coalition $M_k(t)$ throughout the time interval $[t_0, t_0 + D_{\mathrm{Slot}}]$ while satisfying constraints (27b)-(27d), which can be expressed as the problem in (45), where constraint (45c) indicates that the power values of the other $n_W - |M_k(t)|$ non-cooperative players are given by the non-cooperative optimal power in (31). Let $p^{M_k(t)}_{m,i,k}(t)$ and $E^{M_k(t)}_{m,i,k}(t)$ denote the cooperative optimal transmit power and energy consumption for SWB $i$ in CWBSN $m$ over channel $k$ under partial coalition $M_k(t)$, respectively. Invoking Pontryagin's maximum principle, we also assume that there exists a continuously differentiable auxiliary value function $W_{m,i,k}(M_k(t))$ satisfying the partial differential equation in (46) [37].
Theorem 3. The cooperative optimal transmit power $p^{M_k(t)}_{m,i,k}(t)$ constitutes the cooperative optimal solution to the total utility maximization problem in (45) if and only if the cooperative optimal transmit power $p^{M_k(t)}_{m,i,k}(t)$ and the value function $W_{m,i,k}(M_k(t))$ can be formulated as:

Proof: The proof of the theorem is similar to that of Theorem 2 and is thus omitted due to space limitations.
We now claim that the optimal power of all players, including the partial coalition $M_k(t)$ and the other $n_W - \vartheta$ non-cooperative players, can be determined as follows:

Similarly, the cooperative optimal energy consumption under partial coalition $M_k(t)$ is given by (50), where $\xi_3$ is a constant. Substituting (50) and the results of Theorem 3 into (46), after some algebraic manipulations, we obtain (51). Applying the value functions given by (51) and (36) to (44), we immediately obtain the Shapley value under the cooperative dynamic power optimization by direct computation. The general expression of the Shapley value is rather complicated. For ease of discussion, let us consider a simplified scenario in which $n_W = 3$ players, denoted by $N_W = \{1, 2, 3\}$, play the game, and the partial coalition comprises $\vartheta = 2$ players, denoted by $M_k(t) = \{1, 2\}$.
Under the above setting, the corresponding Shapley value can be reformulated as (52). We are now ready to compare the utilities of the non-cooperative and cooperative dynamic power optimization in order to evaluate their performance from the perspective of non-cooperation and cooperation relations. For simplicity, let $\xi_1 = \xi_3 = 1$. From our previous calculations in (52), we deduce that:

if and only if the following inequalities hold simultaneously:

Proof: Please refer to Appendix C.
We wish to remark that Theorem 4 provides the fundamental condition under which the utility of the cooperative scenario is larger than that of the competitive scenario, according to the actual behaviors of players in the proposed game framework.
In summary, we conclude that the cooperative dynamic power optimization performs better than the non-cooperative dynamic power optimization in two respects: the optimal power and the actual utility gained by each player.
Fig. 4: In this scenario, the circles with dashed lines correspond to the PMDs, and the dash-dotted lines stand for the CWBSNs.

Simulation Setup
In this section, we present simulation results to verify our theoretical analysis and evaluate the performance of the proposed optimization framework under a given system configuration. As depicted in Fig. 4, the simulations are conducted in a hospital room scenario within a three-dimensional space of 40 m × 40 m × 3 m. In this scenario, one CG and $n_W = 4$ underlay CWBSNs coexist with a primary system consisting of $n_P = 2$ PMDs in a hospital room.
More precisely, the CG is deployed at the center of the roof, i.e., the three-dimensional spatial point (20, 20, 3) 4. Moreover, the two PMDs, marked PMD 1 and PMD 2, are located at the points (4, 7, 1.5) m and (35, 14, 2) m, respectively. We would like to mention that the proposed optimization framework is evaluated within this simulation configuration, which can be interpreted as a specific predefined scenario with a limited number of network nodes owing to space limitations. However, the results for this framework extend readily to the general case with an increased density of nodes. The durations of the physiological data sampling phase and the physiological data transmission phase for every SWB are set to $t_D = 10$ ms and $t_T = 12$ ms, respectively. For TDMA-based access, we consider two access methods in the simulations, i.e., ordered channel access and disordered channel access. In ordered channel access, the $n_S = 5$ SWBs in every CWBSN access the time slots successively, as shown in Fig. 3(a). In disordered channel access, the time slots are obtained by the $n_S = 5$ SWBs in every CWBSN according to the discrete uniform distribution on the integers $1, 2, \ldots, n_S$, coordinated by the SC. In this case, each SWB in every CWBSN has equal probability 1/5 of obtaining one specific time slot to access the channel. For instance, SWB 1, SWB 2, SWB 3, SWB 4, and SWB 5 in CWBSN 1 all have the same probability 1/5 of accessing slot 1 in Fig. 3(b).
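The two TDMA access methods can be sketched as slot assignments in Python. The snippet assumes that disordered access is realized as a uniformly random permutation of the SWBs over the slots of one frame, which gives every SWB probability $1/n_S$ of obtaining any specific slot while avoiding collisions inside a CWBSN; the text only states the uniform distribution, so the permutation is an assumption, and the function names are hypothetical.

```python
import random


def ordered_access(num_swbs):
    """Slot s is used by SWB s (successive access)."""
    return {slot: slot for slot in range(1, num_swbs + 1)}


def disordered_access(num_swbs, rng):
    """Assign SWBs to slots by a uniformly random permutation.

    Assumption: each SWB gets exactly one slot per frame, so any given SWB
    lands in any given slot with probability 1 / num_swbs.
    """
    swbs = list(range(1, num_swbs + 1))
    rng.shuffle(swbs)
    return {slot: swb for slot, swb in enumerate(swbs, start=1)}


if __name__ == "__main__":
    rng = random.Random(2024)  # seeded only to make the demo repeatable
    print(ordered_access(5))
    print(disordered_access(5, rng))
```

With $n_S = 5$, repeating the random assignment many times places SWB 1 in slot 1 in roughly one fifth of the frames, in line with the stated probability 1/5.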
The whole authorized spectrum is assumed to be divided into $n_C = 12$ licensed channels, and the bandwidth of every channel is set to $W_C = 5$ MHz. In particular, we employ a slow flat fading channel model to characterize the uplink wireless transmission from every SWB to the corresponding SC in a CWBSN. In the considered model, the path-loss exponent is set to $\omega = 2$. We suppose that the reference received power at every SC takes the same value $P_0 = 8$ dBm at the reference distance $d_0 = 1$ m. The background noise power spectral density at every SC is assumed to be $n_0 = -15$ dBm. We further adopt a constant processing gain factor given by $\Gamma = -1.5/\log_2(5 \cdot \mathrm{BER})$, where the acceptable BER is set to $10^{-3}$ for multiple quadrature amplitude modulation with symbol period 52.5 µs.
In addition, the interference temperature limit is initialized as $I^{\max}_{n,k} = 20$ dBm, $\forall n \in N_P = \{1, 2\}$ and $\forall k \in N_C = \{1, 2, \ldots, 12\}$, for every PMD to protect the PMDs from harmful interference generated by all the currently active SWBs.
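The numerical settings above can be turned into linear quantities with a few helper functions. The Python sketch below evaluates the processing gain factor exactly as written in the text, converts dBm values to mW, and computes a relative path gain under an assumed log-distance model $(d_0/d)^{\omega}$. The path gain model is an assumption, since the surviving text only states the exponent and the reference distance, and the function names are hypothetical.

```python
import math


def dbm_to_mw(value_dbm):
    """Convert a power level from dBm to milliwatts."""
    return 10.0 ** (value_dbm / 10.0)


def processing_gain(ber):
    """Processing gain factor as written in the text: -1.5 / log2(5 * BER)."""
    return -1.5 / math.log2(5.0 * ber)


def relative_path_gain(distance_m, d0_m=1.0, omega=2.0):
    """Assumed log-distance attenuation relative to the reference distance."""
    if distance_m <= 0.0:
        raise ValueError("distance must be positive")
    return (d0_m / distance_m) ** omega


if __name__ == "__main__":
    print(dbm_to_mw(20.0))            # interference temperature limit in mW
    print(dbm_to_mw(8.0))             # reference received power in mW
    print(processing_gain(1e-3))      # processing gain for BER = 1e-3
    print(relative_path_gain(2.0))    # attenuation at 2 m relative to 1 m
```

Working in mW makes it straightforward to compare the aggregate interference of the active SWBs with the 20 dBm limit, which corresponds to 100 mW.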
In all the simulations, the detection probability $\varphi^d_k$ and the false alarm probability $\varphi^f_k$ of each licensed channel for the CG are uniformly distributed over [0.95, 0.99] and [0.01, 0.2], respectively. The prior probability $\varphi_k$ of PMD occupation on each licensed channel is also assumed to be uniformly distributed over [0, 1]. For physiological data sampling, we suppose that each SWB samples $A = 3$ attributes of physiological data during the physiological data sampling phase, namely, Attribute 1, Attribute 2, and Attribute 3. The number of sampling intervals of the physiological data sampling phase is set to the same value for each SWB, i.e., $K = 20$. Due to the lack of empirical data about physiological data sampling for different attributes, we assume that the minimum and maximum values of physiological data sampling used by Algorithm 1 for the three attributes of each SWB are initialized as in Fig. 5, and the same number of equal subintervals is used for every sampled attribute. Based on this setting, the approximate probability distribution of every sampled attribute generated by Algorithm 1 is assumed to follow the distribution depicted in Fig. 6 for each SWB in the considered CWBSNs. For simplicity, we assume that the statistical probability distribution of every attribute of sampling value is the same for each SWB over the corresponding CWBSN, as shown in Fig. 7. For Attribute 1, Attribute 2, and Attribute 3, the weight of the sampled attribute for every SWB is set to 0.3, 0.3, and 0.4, respectively. As a result, we can obtain the relative divergence between the approximate probability distribution and the statistical probability distribution for every attribute of sampling value, as well as the weighted average relative divergence for each SWB over the corresponding CWBSN (see Table 5).
The proposed optimization framework using Algorithm 2 is compared against the classical distributed constrained power control (DCPC) algorithm in [39]. The DCPC algorithm can be regarded as an SINR-balancing constrained power allocation algorithm that distributively and iteratively searches for the decentralized transmit power, updated from one iteration to the next.

The results in Fig. 8 are obtained with D Slot = 22 ms and λ C E = 0.008. It is observed that the optimal transmit power of every SWB in the corresponding CWBSN, in both the non-cooperative and cooperative scenarios, gradually increases as discount factor r grows from 0.1 to 0.9. This occurs because discount factor r has a significant impact on the optimal transmit power of every SWB according to Theorem 1 and Theorem 2. That is, a larger discount factor in the proposed optimization framework means higher transmit power for every SWB, and vice versa. We observe from Fig. 8 that the cooperative optimal transmit power of every SWB increases greatly with the growth of discount factor r. However, for the same value of the discount factor, the cooperative optimal transmit power of every SWB is clearly lower than the non-cooperative optimal power. Hence, the cooperative dynamic power optimization outperforms the non-cooperative scheme, which is in line with the theoretical analysis.

Fig. 9 depicts the optimal transmit power using the proposed algorithm versus discount factor r for n S = 5 SWBs of CWBSN 1 under ordered channel access with D Slot = 22 ms and λ C E = 0.05. In order to focus on the main findings and keep the plot clear, we only show the results for n S = 5 SWBs of CWBSN 1. It is interesting to observe that the non-cooperative optimal transmit power of every SWB in CWBSN 1 using our presented algorithm is always larger than the cooperative optimal transmit power of the same SWB. From Fig. 9, we also see that the optimal transmit power of every SWB in CWBSN 1 under a given discount factor satisfies the interference temperature constraint I max n,k = 20 dBm for any PMD.

From the results in Fig. 10, we can immediately observe that the cooperative optimal transmit power of SWB 1 is much lower than both the non-cooperative optimal power and the results of the DCPC algorithm with 250 iterations. Meanwhile, the non-cooperative optimal power of SWB 1 is also smaller than that of the benchmark DCPC algorithm. This figure further provides a hint for choosing appropriate power control mechanisms. In addition, we can find from this figure that the transmit power of SWB 1 under the DCPC algorithm takes two fixed values, i.e., nearly 74 mW for CWBSN 1 and 42 mW for CWBSN 3. This behavior can be interpreted as follows: the transmit power of SWB 1 under the benchmark algorithm converges to an expected equilibrium point after 250 iterations. Therefore, compared to the DCPC algorithm, both the cooperative and non-cooperative optimal transmit power results achieve better performance, which validates the effectiveness of the proposed optimization framework in e-healthcare leveraging CWBSNs.

Fig. 11 presents the impact of the cost pricing factor for energy consumption on the optimal transmit power of SWB 1 in CWBSN 1 and CWBSN 3 for the proposed algorithm and the DCPC algorithm under ordered channel access with D Slot = 22 ms and r = 0.5. It can be seen from the figure that the optimal transmit power of SWB 1 with our proposed optimization framework monotonically decreases as the cost pricing factor for energy consumption increases. Also note that the transmit power of SWB 1 obtained by the DCPC algorithm approaches two constant values after 250 iterations, i.e., approximately 74 mW for CWBSN 1 and 42 mW for CWBSN 3. From Fig. 11, we can find that the optimal transmit power of SWB 1 under our proposed optimization

In Fig. 12, we show the comparison of average transmitted bits of physiological data for every CWBSN between the proposed optimization framework and the DCPC algorithm under ordered channel access with D Slot = 22 ms and λ C E = 0.01. From Fig. 12(a), with the given discount factor r = 0.2, we can observe that the proposed optimization framework provides more average transmitted bits of physiological data to every CWBSN than the DCPC algorithm with 250 iterations. It can be concluded that, despite the reduced transmit power under the proposed framework, the overall system capacity increases due to the improved SINR. From the perspective of the individual CWBSNs, the average transmitted bits of physiological data in CWBSN 1 are larger than those of the other CWBSNs. Fig. 12(b) shows the results of average transmitted bits of physiological data for every CWBSN with discount factor r = 0.8, which are similar to the results in Fig. 12(a). In comparison, the average transmitted bits of physiological data for every CWBSN with discount factor r = 0.8 exceed those obtained with discount factor r = 0.2. This highlights the importance of properly tuning the discount factor, which in theory allows better system capacity for every CWBSN. Overall, our proposed optimization framework via Algorithm 2 performs much better than the compared DCPC algorithm.
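The relation between received SINR and transmitted bits invoked above can be illustrated with a minimal sketch. It assumes a Shannon-type rate model and a hypothetical bandwidth of 1 MHz, neither of which is taken from the simulations; only the slot duration of 22 ms comes from the setup. The function names are illustrative.

```python
import math

D_SLOT_S = 22e-3     # slot duration used in the simulations (22 ms)
BANDWIDTH_HZ = 1e6   # hypothetical bandwidth, chosen only for illustration


def db_to_linear(value_db):
    """Convert a ratio expressed in dB to linear scale."""
    return 10.0 ** (value_db / 10.0)


def bits_per_slot(sinr_db, bandwidth_hz=BANDWIDTH_HZ, slot_s=D_SLOT_S):
    """Shannon-type upper bound on the bits sent in one time slot.

    This rate model is an assumption of the sketch, not the paper's
    definition of average transmitted bits.
    """
    return slot_s * bandwidth_hz * math.log2(1.0 + db_to_linear(sinr_db))


if __name__ == "__main__":
    # A higher received SINR yields more bits within the same slot.
    for sinr_db in (0.0, 8.0, 10.0):
        print(sinr_db, bits_per_slot(sinr_db))
```

Under this assumed model, the number of bits per slot grows monotonically with the received SINR, which is the mechanism the discussion of Fig. 12 relies on.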

Performance of the Proposed Optimization Framework in Disordered Channel Access
In this set of results, we move on to explore the performance of the proposed optimization framework in disordered channel access, wherein the n S = 5 SWBs in every CWBSN obtain their time slots according to the discrete uniform distribution. In Fig. 13, the simulated optimal transmit power for n S = 5 SWBs of n W = 4 CWBSNs in the competitive and fully cooperative scenarios using the proposed optimization framework is plotted against discount factor r with D Slot = 22 ms and λ C E = 0.008. Note that different SWBs in each CWBSN access the same time slot because of the disordered channel access adopted in the TDMA-based access scheme. From Fig. 13, we can find that the cooperative optimal transmit power performs better than the non-cooperative optimal power for SWBs in every time slot. This fact validates the analysis in Section 4 of the paper. Furthermore, we observe that the optimal transmit power of each SWB grows slightly, especially the cooperative power, as discount factor r increases. This is because the optimal transmit power of every SWB is influenced by the discount factor.

Referring to Fig. 14, we see that the non-cooperative and cooperative optimal transmit power for SWB 1, SWB 4, and SWB 5 of CWBSN 3 is clearly larger than that of SWB 2 and SWB 3 in CWBSN 3 when discount factor r is greater than 0.2. Moreover, as discount factor r increases, the non-cooperative and cooperative optimal transmit power for SWB 1, SWB 4, and SWB 5 of CWBSN 3 increases markedly. This implies that we need to properly reduce the discount factor to obtain lower transmit power. It is further observed that the non-cooperative optimal transmit power of every SWB in CWBSN 3 via the presented algorithm is always larger than the cooperative optimal power of the same SWB.
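The difference between ordered and disordered channel access can be sketched as follows. The sketch assumes that, under disordered access, every CWBSN draws a uniformly random permutation of its time slots, so that each SWB still occupies exactly one slot per TDMA frame; this reading of the discrete uniform assignment, as well as the function names, is an assumption rather than the simulation code.

```python
import random

N_W = 4  # number of CWBSNs in the simulation scenario
N_S = 5  # number of SWBs (and time slots) per CWBSN


def ordered_schedule(n_w=N_W, n_s=N_S):
    """schedule[m][t] is the 1-based SWB index sending in slot t+1 of CWBSN m+1.

    Ordered access: SWB i always uses time slot i in every CWBSN.
    """
    return [list(range(1, n_s + 1)) for _ in range(n_w)]


def disordered_schedule(n_w=N_W, n_s=N_S, seed=None):
    """Disordered access: each CWBSN shuffles its SWBs over the slots.

    Assumed model: a uniform random permutation per CWBSN.
    """
    rng = random.Random(seed)
    schedule = []
    for _ in range(n_w):
        swbs = list(range(1, n_s + 1))
        rng.shuffle(swbs)
        schedule.append(swbs)
    return schedule


if __name__ == "__main__":
    for m, row in enumerate(disordered_schedule(seed=1), start=1):
        print(f"CWBSN {m}: SWB order over time slots = {row}")
```

With such a schedule, the SWBs that share a given time slot across CWBSNs generally carry different indices, which matches the situation where, for instance, SWB 2 in CWBSN 1 and SWB 5 in CWBSN 3 both transmit during time slot 1.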
In Fig. 15, we compare the optimal transmit power of SWB 2 in CWBSN 1 and SWB 5 in CWBSN 3, i.e., during time slot 1, between the proposed optimization framework and the DCPC algorithm in disordered channel access with D Slot = 22 ms and λ C E = 0.01. We notice from Fig. 15 that the optimal transmit power of SWB 2 in CWBSN 1 and SWB 5 in CWBSN 3 within time slot 1 under the proposed optimization framework is clearly lower than the simulated results of the DCPC algorithm with 250 iterations. As shown in the figure, the cooperative optimal transmit power has the lowest values, followed by the non-cooperative optimal power, and finally the DCPC algorithm as the benchmark. The figure highlights the importance of choosing between the competitive and cooperative scenarios to improve the energy efficiency and maintain long-term low-power operation for every SWB. From the results, we further observe that the transmit power of the DCPC algorithm for SWB 2 in CWBSN 1 and SWB 5 in CWBSN 3 converges to an expected equilibrium value after 250 iterations. Thus, these results together demonstrate that the overall performance of the proposed optimization framework is better than that of the benchmark algorithm.
Next, we analyze the impact of varying the cost pricing factor for energy consumption on the optimal transmit power of SWB 2 in CWBSN 1 and SWB 5 in CWBSN 3 by comparing the proposed optimization framework and the DCPC algorithm in disordered channel access with D Slot = 22 ms and r = 0.5. We let the cost pricing factor λ C E for energy consumption vary from 0.004 to 0.018, while the number of iterations for the DCPC algorithm is again set to 250. The results plotted in Fig. 16 show that, as the cost pricing factor increases, the optimal transmit power of SWB 2 and SWB 5 monotonically decreases for our proposed optimization framework. We can also clearly observe that the transmit power of SWB 2 and SWB 5 obtained by the DCPC algorithm reaches the fixed values 2.2 mW and 21.5 mW, respectively, after 250 iterations. From Fig. 16, we can conclude that the proposed optimization framework obtains a beneficial power reduction compared with the DCPC algorithm by adjusting the cost pricing factor. Moreover, the optimal transmit power of SWB 2 and SWB 5 with cooperative relations is always lower than that of the competitive scenario, which means that the cooperative power control outperforms the non-cooperative one. Such an insight, to some extent, is aligned with the fact that cooperation among the SWBs can achieve optimal intra-CWBSN uplink transmission when compared with competitive relations among the SWBs.
Finally, Fig. 17 exhibits the comparison of the average transmitted bits of physiological data for every CWBSN between the proposed optimization framework and the DCPC algorithm in the context of disordered channel access with D Slot = 22 ms and λ C E = 0.01. From Fig. 17(a), we can easily see that, when discount factor r = 0.2, the proposed optimization framework achieves more average transmitted bits of physiological data than the DCPC algorithm with 250 iterations. It can also be observed that the average transmitted bits of physiological data for every CWBSN in the cooperative scenario are larger than those in the competitive scenario. As expected, with the aid of the cooperative power optimization, the overall system capacity of each CWBSN increases owing to the improved SINR in the cooperative scenario. In addition, the average transmitted bits of physiological data for CWBSN 1 are higher than those of the other CWBSNs when discount factor r = 0.2. The same behavior can be observed in Fig. 17(b), where the proposed optimization framework with discount factor r = 0.8 achieves more average transmitted bits of physiological data than the DCPC algorithm with 250 iterations. Therefore, the results discussed above reveal that, for dynamic power optimization in our proposed e-healthcare system, rationally handling the discount factor and choosing the cooperative mechanism substantially influence the system performance.

Conclusion
In this paper, we studied the dynamic power optimization problem for e-healthcare leveraging CWBSNs with imperfect spectrum sensing, and proposed a distributed optimization framework of dynamic power optimization to mitigate inter-CWBSN co-channel electromagnetic interference and improve the energy efficiency of SWBs. We employed the mathematical statistics method to obtain the approximate probability distribution of the sampling value for every SWB during the physiological data sampling phase. With this distribution, we designed the weighted average relative divergence for every SWB via the Kullback-Leibler divergence to characterize the quality of physiological data sampling. By considering the quality of physiological data sampling, we further defined the utility function of every SWB. The problem of dynamic power optimization for SWBs was then formulated as a differential game model, in which the utility function is maximized over a continuous time interval while satisfying the evolution law of energy consumption in the battery of every SWB. From the perspective of the non-cooperative and cooperative relations among SWBs, we converted the game model into two subproblems, namely, the utility maximization problem and the total utility maximization problem, and derived the non-cooperative and cooperative optimal solutions for power optimization. We further showed that the cooperative dynamic power optimization performs better than the non-cooperative optimization in terms of both the optimal power and the actual utility gained by each SWB. Our simulation results demonstrated the practicality of implementing the proposed optimization framework to achieve a considerable amount of power savings and improved system capacity in the e-healthcare scenario.
Besides, we showed that better performance can be achieved by using the cooperative mechanism to maximize the overall utilities rather than the competitive scheme to obtain the individual utility maximization.

which constitutes the NE to the non-cooperative game G k. This concludes the proof of our theorem.

Appendix B. Proof of Theorem 2
The proof is very similar to that of Theorem 1. The only difference is that the optimization objective of the total utility maximization problem in (37) is to maximize the sum of the utility functions of all SWBs. Likewise, in this appendix, we are also ready to transform the total utility maximization problem in (37) λ C E = n W λ C E for each SWB. Upon solving the partial differential equation, we can immediately obtain the auxiliary function as the partial differential equation calculated by (40). By substituting the expression of the value function in (40) into (B.1), we can easily get the cooperative optimal solution as formulated by (39). As a result, Theorem 2 is derived.

Appendix C. Proof of Theorem 4
From the right hand side of (53), we observe that it can be verified that the component For the discount factor 0 < r < 1, it then follows from (C.3) and (C.4) that the inequalities (54) and (55) are both exactly satisfied.This completes the proof.

Figure 1: Illustration of the application scenario of e-healthcare framework leveraging CWBSNs in a hospital environment.
channel k, for k ∈ N C. For CWBSN m, there is one near-body SC, denoted by C m, and n S SWBs, which are located on the patient to transmit the sampled physiological data, including the patient's vital signs and movements, to SC C m over channel k, for m ∈ N W. Let N m,S = {1, 2, ..., n S} be the set of SWBs in CWBSN m. Based on the physiological data collected via uplink transmission, every SC performs data fusion and decision making. Under the uplink

Figure 3: SC coordination mechanism for every SWB in the corresponding CWBSN based on TDMA-based co-channel media access policy.

Figure 4: Simulation scenario with one CG, n W = 4 underlay CWBSNs, and n P = 2 PMDs located in a hospital room. In this scenario, the circles with dashed lines correspond to the PMDs, and the dash-dotted lines stand for the CWBSNs.

Figure 5: Comparison between minimum value and maximum value of physiological data sampling used in Algorithm 1 among A = 3 attributes, marked by A1, A2, and A3, respectively, for n S = 5 SWBs.

Figure 6: Approximate probability distribution of A = 3 sampled attributes for n S = 5 SWBs in n W = 4 CWBSNs.

Figure 7: Statistical probability distribution of A = 3 sampled attributes for n S = 5 SWBs for every CWBSN.
from the $\ell$-th iteration to the $(\ell+1)$-th iteration. Let us use $\gamma^{\mathrm{tar}}_{m,i,k}$ to denote the target received SINR of SC $C_m$ in CWBSN $m$ over channel $k$ to maintain a certain QoS requirement. In the simulations, the target received SINR is set to $\gamma^{\mathrm{tar}}_{m,i,k} = 8$ dB, $\forall m \in N_W = \{1, 2, 3, 4\}$, $\forall i \in N_{m,S} = \{1, 2, \ldots, 5\}$, and $\forall k \in N_C = \{1, 2, \ldots, 12\}$. In this way, the DCPC algorithm iteratively updates the transmit power of SWB $i$ in CWBSN $m$ over channel $k$ by using the following iterative function:

$$p^{(\ell+1)}_{m,i,k} = \min\left\{ P^{\max}_{m,i},\ \frac{\gamma^{\mathrm{tar}}_{m,i,k}}{\gamma^{(\ell)}_{m,i,k}}\, p^{(\ell)}_{m,i,k} \right\}, \quad \ell = 0, 1, 2, \ldots \tag{56}$$

Performance of the Proposed Optimization Framework in Ordered Channel Access

We first evaluate the results of the non-cooperative and cooperative optimal transmit power allocation for the proposed optimization framework using Algorithm 2, by varying the discount factor in the context of ordered channel access. In Fig. 8, we show the simulated non-cooperative and cooperative optimal transmit power of the proposed algorithm versus discount factor r for n S = 5 SWBs of n W = 4 CWBSNs in different time slots with D Slot = 22 ms and λ C E = 0.008.
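Returning to the DCPC benchmark, its iteration can be illustrated with a minimal sketch. The sketch assumes the standard DCPC form, in which each transmitter scales its power by the ratio of target to measured SINR and clips the result at its maximum power. The two-link gain matrix, noise power, and power cap are hypothetical values chosen for illustration; only the 8 dB target SINR and the 250 iterations come from the simulation setup.

```python
def sinr(powers, gains, noise):
    """Received SINR per link; gains[i][j] is the gain from transmitter j to receiver i."""
    n = len(powers)
    result = []
    for i in range(n):
        interference = sum(gains[i][j] * powers[j] for j in range(n) if j != i)
        result.append(gains[i][i] * powers[i] / (interference + noise))
    return result


def dcpc(p0, gains, noise, gamma_tar, p_max, iterations=250):
    """Standard DCPC iteration (assumed form): p <- min(p_max, gamma_tar / gamma * p).

    Initial powers must be strictly positive so that the measured SINR is nonzero.
    """
    p = list(p0)
    for _ in range(iterations):
        measured = sinr(p, gains, noise)
        p = [min(p_max[i], gamma_tar / measured[i] * p[i]) for i in range(len(p))]
    return p


if __name__ == "__main__":
    gamma_tar = 10.0 ** (8.0 / 10.0)      # 8 dB target SINR in linear scale
    gains = [[1.0, 0.01], [0.01, 1.0]]    # hypothetical link gains
    powers = dcpc([0.01, 0.01], gains, 1e-3, gamma_tar, [0.1, 0.1])
    print(powers, sinr(powers, gains, 1e-3))
```

When the target SINR is feasible, the iteration settles at a fixed power per link that is independent of quantities such as the discount factor, which is consistent with the constant DCPC power levels reported in the comparisons. When the target is infeasible, the powers saturate at the cap.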

Figure 8: Non-cooperative and cooperative optimal transmit power versus discount factor r for n S = 5 SWBs of n W = 4 CWBSNs using the proposed algorithm under ordered channel access, where D Slot = 22 ms and λ C E = 0.008.
(a) Non-cooperative optimal transmit power (b) Cooperative optimal transmit power

Figure 9: Optimal transmit power versus discount factor r for n S = 5 SWBs of CWBSN 1 using the proposed algorithm under ordered channel access, where D Slot = 22 ms and λ C E = 0.05.

Figure 10: Optimal transmit power versus discount factor r between the proposed algorithm and the DCPC algorithm during time slot 1 under ordered channel access, where D Slot = 22 ms and λ C E = 0.01.

Figure 11: Optimal transmit power versus cost pricing factor for energy consumption λ C E between the proposed algorithm and the DCPC algorithm during time slot 1 under ordered channel access, where D Slot = 22 ms and r = 0.5.

Fig. 10 displays the results of the optimal transmit power of SWB 1 (i.e., during time slot 1) in CWBSN 1 and CWBSN 3 versus discount factor r for the proposed optimization framework and the DCPC algorithm in the context of ordered channel access with D Slot = 22 ms and λ C E = 0.01.
(a) Discount factor r = 0.2 and iteration index = 250 (b) Discount factor r = 0.8 and iteration index = 250

Figure 12: Comparison of average transmitted bits of physiological data between the proposed algorithm and the DCPC algorithm under ordered channel access, where D Slot = 22 ms and λ C E = 0.01.

Figure 13: Non-cooperative and cooperative optimal transmit power versus discount factor r for n S = 5 SWBs of n W = 4 CWBSNs using the proposed algorithm under disordered channel access, where D Slot = 22 ms and λ C E = 0.008.
(a) Non-cooperative optimal transmit power (b) Cooperative optimal transmit power

Figure 14: Optimal transmit power versus discount factor r for n S = 5 SWBs of CWBSN 3 using the proposed algorithm under disordered channel access, where D Slot = 22 ms and λ C E = 0.02.
Fig. 14 compares the non-cooperative and cooperative optimal transmit power for n S = 5 SWBs of CWBSN 3 in disordered channel access by employing the proposed algorithm, with the discount factor r varying from 0.1 to 0.9, D Slot = 22 ms, and λ C E = 0.02.

Figure 15: Optimal transmit power versus discount factor r between the proposed algorithm and the DCPC algorithm during time slot 1 under disordered channel access, where D Slot = 22 ms and λ C E = 0.02.

Figure 16: Optimal transmit power versus cost pricing factor for energy consumption λ C E between the proposed algorithm and the DCPC algorithm during time slot 1 under disordered channel access, where D Slot = 22 ms and r = 0.5.

Figure 17: Comparison of average transmitted bits of physiological data between the proposed algorithm and the DCPC algorithm under disordered channel access, where D Slot = 22 ms and λ C E = 0.01.

Table 1: Summary of key acronyms used in our proposed e-healthcare system.

Table 2: Summary of main notations.

Table 3: Four different cases for spectrum sensing carried out by the CG.
Definition 2. (Nash Equilibrium) A set of strategies $\{p^{*}_{m,1,k}(t), p^{*}_{m,2,k}(t), \ldots, p^{*}_{m,n_W,k}(t)\}$ associated with all the currently active SWBs is the NE to the non-cooperative game $G_k$ if and only if the following inequality is satisfied for SWB $i$, for $p_{m,i,k}(t) \in \mathcal{P}_{m,i,k}(t)$ and $E_{m,i,k}(t) \in \mathcal{E}_{m,i,k}(t)$:

$i$ is the steady state of the non-cooperative game $G_k$, which depends only on the present time $t \in [t_0, t_0 + D_{\mathrm{Slot}}]$ and the present state $E_{m,i,k}(t)$, but not on the initial state $E_{m,i,k}(t_0)$. From a perspective of locality, our objective is to optimize the transmit power for every SWB individually over the considered interval $[t_0, t_0 + D_{\mathrm{Slot}}]$. Thus, to simplify the analysis of the problem, it is reasonable to relax the terminal time $t_0 + D_{\mathrm{Slot}}$ to an infinite time horizon, i.e., $t_0 + D_{\mathrm{Slot}} \to +\infty$. Thereby, the utility maximization problem in (27) is converted into an infinite horizon differential game model with the help of the time relaxation mechanism. With such a relaxation process in mind, Bellman's dynamic programming technique can be applied to derive the NE solution by solving the partial differential equation associated with each player [37]. The technique is given by Lemma 1 as detailed below.

Lemma 1. An $n_W$-tuple set of feedback strategies $\{p^{*}_{m,i,k}(t) \in \mathcal{P}_{m,i,k}(t) \mid \forall i \in A_k(t)\}$ provides an NE solution to the infinite horizon differential game framework based on the utility maximization problem in (27), if there exists a continuously differentiable value function $\Xi_i(p_{m,i,k}, E_{m,i,k})$ for SWB $i$ in CWBSN $m$ over channel $k$, satisfying the following partial differential equation:

Algorithm 2 Distributed Dynamic Optimal Transmit Power Update Algorithm
Input: Current transmit power of $n_W$ active SWBs $\{p_{m,i,k}(t) \in \mathcal{P}_{m,i,k}(t) \mid \forall m, \forall i\}$.
Output: Optimal transmit power of $n_W$ active SWBs $\{p_{m,i,k}(t) \in \mathcal{P}_{m,i,k}(t) \mid \forall m, \forall i\}$.
1: Initialization:
   i) System configuration: Set of $n_W$ active SWBs $A_k(t)$ and number of CWBSNs $n_W$.
   ii) Transmission model: Reference distance $d_0$, path-loss exponent $\omega$, and distance from SWB to SC $\{d_{m,i} \mid \forall m, \forall i\}$.
   iii) Sampling model: Number of attributes $A$, attribute weight $\{w_{m,i,k,\alpha} \mid \forall m, \forall i\}$, relative divergence $\{D_{m,i,k,\alpha} \mid \forall m, \forall i\}$.
   iv) Game model: Cost pricing $\lambda^{C}_{E}$ and discount factor $r$.
2: Calculation: Maximum transmit power $\{P^{\max}_{m,i} \mid \forall m, \forall i\}$ and weighted average relative divergence $\{E[d_{m,i,k}] \mid \forall m, \forall i\}$.
3: Dynamic Optimal Transmit Power Allocation:
4: for $i = 1$ to $|A_k(t)|$ do
8: end for
9: Transmit Power Update Implementation:
10: for $n = 1$ to $n_P$ do
    for $i = 1$ to $|A_k(t)|$ do
14: for $m = 1$ to $n_W$ do
15: Update power $p_{m,i,k}(t) \leftarrow p_{m,i,k}(t) \times p_{m,i,k}(t) \big/ \sum_{i=1}^{n_W} p_{m,i,k}(t)$.

Table 4: The predefined location points of the SWBs for every CWBSN in simulation scenario.

Table 5: The relative divergence between the approximate probability distribution and the statistical probability distribution as well as the weighted average relative divergence used in the simulations.
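As an illustration of how quantities of the kind summarized in Table 5 could be computed, the following minimal sketch evaluates a Kullback-Leibler divergence per attribute and then a weighted average over the attributes. The direction of the divergence (approximate distribution relative to statistical distribution), the use of the natural logarithm, the normalization by the sum of weights, and all function names are assumptions of this sketch and may differ from the definitions used in the paper.

```python
import math


def kl_divergence(p, q):
    """KL divergence D(p || q) in nats for two discrete distributions.

    Terms with p_i == 0 contribute zero; q_i must be positive wherever p_i > 0.
    """
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def weighted_average_divergence(weights, divergences):
    """Weighted mean of per-attribute divergences (weights are normalized here)."""
    total = sum(weights)
    return sum(w * d for w, d in zip(weights, divergences)) / total


if __name__ == "__main__":
    # Hypothetical distributions for A = 3 sampled attributes of one SWB.
    approx = [[0.2, 0.5, 0.3], [0.1, 0.6, 0.3], [0.4, 0.4, 0.2]]
    stat = [[0.25, 0.45, 0.3], [0.2, 0.5, 0.3], [0.3, 0.5, 0.2]]
    weights = [0.5, 0.3, 0.2]
    divs = [kl_divergence(a, s) for a, s in zip(approx, stat)]
    print(divs, weighted_average_divergence(weights, divs))
```

A smaller weighted average indicates that the approximate distributions track the statistical ones more closely across the attributes.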
(38) an infinite horizon differential game model by relaxing the terminal time t 0 + D Slot to an infinite time horizon. Performing the maximization operation on the right-hand side of (38) with respect to the variable p m,i,k (t), we have the following result: ∂F m,i,k (p m,i,k , E m,i,k ) / ∂E m,i,k (t)

Plugging (B.1) back into (38), after some algebraic manipulations, we immediately have: F m,i,k (p m,i,k , E m,i,k ) = ∂F m,i,k (p m,i,k , E m,i,k )

Following (B.2), we start by deriving the derivative of the value function F m,i,k (p m,i,k , E m,i,k ) with respect to the cooperative optimal energy consumption E m,i,k (t) in (41).

It is sufficient to prove that 2rE [d m,i,k ] ∂F m,i,k p m,i,k , E m,i,k ∂E m,i,k (t) 2 + E R m,i,k (t) − t T p m,i,k (t) − E m,i,k (t) r • ∂F m,i,k p m,i,k , E m,i,k ∂E m,i,k (t) 1 6 (•) since we can exactly get the following main result: E (r + 1) E [d m,i,k ] > 0. (C.2)

Furthermore, in order to guarantee that O m,i,k (p m,i,k , E m,i,k ) > L m,i,k (p * m,i,k , E * m,i,k ), we can also obviously check that: