QoE-Maximized Beam Assignment and Rate Control for Dynamic mm-Wave-Based Full-Duplex Small Cell Networks

This study investigates the resource management problem for millimeter-wave-based switched beam (SWB) full-duplex small cell networks, taking into account the quality of experience (QoE) requirements of user equipment (UE) and time-varying wireless channels. An optimization problem is formulated to maximize the long-term QoE through beam assignment (BA) and rate control (RC) under short-term beam constraints and long-term energy efficiency constraints. By leveraging the Lyapunov optimization technique, the original problem is converted into a series of BA and RC problems, one in each time slot. To solve the converted problem with affordable complexity, novel closed-form solutions for BA and RC are first derived by considering the beam constraints of SWB systems. A decomposition-based BA and RC (DBR) algorithm with only polynomial computational complexity is then proposed based on the derived closed-form solutions. The simulation results demonstrate that the proposed DBR method effectively balances performance and complexity: it outperforms the benchmark scheme and achieves nearly optimal performance in terms of system delay and QoE.


I. INTRODUCTION
THE immense growth of mobile traffic has led the 5G standard to aim for about 100 times the aggregate data rate of 4G [1]. This explosive growth imposes stringent data rate requirements on 5G mobile communication systems. Owing to the large bandwidth available in millimeter-wave (mm-wave) bands, mm-wave communications have been regarded as a promising technology to meet the data rate requirements of future wireless cellular networks. The main challenges of mm-wave links stem from severe path loss and signal blockage [2], [3].
Reliable communication in the mm-wave band relies on the beamforming technique [4], [5] to compensate for the path loss, whereas the performance gain of mm-wave transmission depends on how efficiently the beamforming resources are utilized [6], [7]. In [4], a hybrid beamforming algorithm was developed to minimize power consumption while guaranteeing the receive quality of users in the network. Reference [5] extended the application of the beamforming technique to high-speed railway systems and demonstrated that reliable transmission along the railway can be achieved by using the proposed beam boundary determination method. In [6], a suboptimal beam allocation algorithm was proposed to avoid high computational complexity. To provide further insight into the performance of the suboptimal algorithm, [7] applied submodular optimization theory to analyze the upper-bound performance of its proposed beam selection algorithm. It can be observed from [4], [5], [6], [7] that current beamforming and beam allocation schemes are implemented in half-duplex (HD) mode, which reduces the spectral efficiency (SE) [8]. By transmitting and receiving signals in the same time slot and frequency band, full-duplex (FD) has the potential to double the SE of HD communications. The major challenge of FD is the self-interference (SI) from the downlink (DL) transmission to the uplink (UL) receiver [9]. With the recent breakthroughs in SI cancellation techniques [10], [11], [12], [13], approximately 100 dB of the SI can be eliminated [14], which makes FD communications feasible. In [15], a joint power allocation and clustering problem for drones in FD networks is considered. The proposed algorithm, which adopts a multi-agent reinforcement-learning framework and fractional programming, possesses feasible complexity and superior performance over benchmark schemes.
To balance system performance and clustering cost, [16] considers a joint clustering and duplex mode selection problem for ultradense networks. A deep reinforcement learning scheme is proposed to implement clustering, duplex mode switching, and resource allocation. The advantages of smart duplex over both HD and FD systems are validated by its numerical results. However, FD networks operating in the mm-wave band are not considered in [15] and [16].
Moreover, energy efficiency (EE) is an important metric in mm-wave FD systems [8], [17], [18], [19], because the hardware operating in mm-wave band typically consumes more power compared to hardware working at a lower frequency [17]. Currently, SI mitigation in FD leads to considerable power consumption [19]. To obtain higher EE performance, [18] revealed that a small cell base station (SBS) is suitable for deploying FD because its low power and short transmission distance will maintain residual SI at low-level power. Reference [17] first investigated the EE-oriented resource allocation via power, subcarrier and throughput assignment under cross-layer constraints. To illustrate the influence of the cross-layer effect on the EE performance of FD, [19] discussed the power consumption and SI amount by applying different SI cancellation techniques, including passive suppression (PS), analog cancellation (AC) and digital cancellation (DC). Aside from EE performance, [8] regarded the quality of service (QoS) requirement in small cell networks (SCN) considering the residual SI and inter-user interference and ensuring that each flow can maintain high-rate transmission.
However, with the universal application of wireless services, user equipment (UEs) may require different types of quality of experience (QoE), e.g., for watching high-definition video or downloading a file. Conventional metrics such as data rate and EE cannot directly reflect the satisfaction of UEs [20], [21]. Therefore, the QoE and QoE requirement of each UE, which evaluate the performance of communication systems in terms of the UEs' subjective opinions, should be considered. Additionally, the current literature focuses only on designs for stationary channel models, i.e., the snapshot-based model and the infinite backlog assumption. In particular, the problems in [8], [17], [18], [19] are subject to short-term EE or QoS constraints over a transmission period. These designs may result in inferior system performance, since radio resources will be overused to meet a rigorous EE or QoS constraint when the channel condition in certain transmission periods is poor. Moreover, common optimization tools for snapshot-based models, such as the interior point method, are not practical under dynamic wireless channel conditions, since setting the initial parameters and step sizes that guarantee convergence is time-consuming and the parameters have to be readjusted whenever the channel condition changes [22]. Furthermore, in practical systems, the quantity of data that can be transmitted at the physical layer is influenced by the result of the rate control (RC) at the application layer and by the amount of data in the queue backlog [23]. Consequently, radio resources can be utilized more efficiently if the transmission rate is directed to the UEs that have additional data to transmit or receive. Motivated by these factors, this study investigates the resource management problem of beam assignment (BA) and RC for a mm-wave-based switched beam (SWB) FD SCN considering QoE requirements and dynamic channel conditions.
Unlike current designs for mm-wave FD systems, the objective function of the formulated problem aims to maximize the long-term QoE of DL and UL communications subject to short-term beam constraints and long-term EE constraints at the physical layer, as well as a long-term queue backlog stability constraint at the application layer. The main contributions of this study are summarized as follows:
• Development of a novel QoE maximization framework: This framework jointly considers QoE maximization, EE requirements, queue stability, and beam usage limitations for the optimization of mm-wave-based FD networks. Moreover, the queue stability, average queue length, and QoE performance are investigated via both analytical and simulation results.
• Comprehensive analytical results of mm-wave-based FD networks: By exploiting the beam constraints in the considered networks and rigorous mathematical formulations, several analytical results for mm-wave-based FD networks are derived as follows: (1) [...] is comparably much lower than that of the ES method. Thus, the proposed DBR method achieves a feasible tradeoff between system performance and computational complexity.
The rest of the paper is organized as follows. Section II describes the system model and the problem formulation. Section III introduces the proposed DBR algorithm. The performance analysis is discussed in Section IV. Section V provides numerical results to illustrate the performance of the DBR method. Finally, conclusions are drawn in Section VI.
The key abbreviations and notations are summarized in Tables 1 and 2, respectively.

A. SYSTEM MODEL
As shown in Fig. 1, we consider an mm-wave based FD SCN consisting of K D DL UEs, K U UL UEs and N SWB FD SBSs. Each SBS has a linear array of M equally spaced identical isotropic antenna elements such that each SBS can form M beams. Each beam is able to implement signal transmission and reception simultaneously. Note that the SWB system serves multiple UEs by generating a fixed number of beams and pointing to predetermined directions [7]. Hence, multiple simultaneous beams in Fig. 1 are used to illustrate the directions of each beam. Moreover, the FD SCN operates in a time-slotted manner with the duration normalized to integer units; that is, slot t refers to [t, t + 1), t ∈ {0, 1, 2, . . .}. For each time slot, DL UEs require video services from servers on the Internet and video data is sent to the FD SBSs after the server receives requests. Part of the arrived data can be queued via the RC. These data are first buffered at the FD SBSs queue and sent to the corresponding DL UE via wireless channels. Moreover, UL UEs transmit queue backlog updates every time slot and the associated FD SBS will determine the amount of data entered into the corresponding data queue by sending permission signals via dedicated control channels.
Note that for UL transmission, the queue backlog resides at each UE. Traffic enters the queue backlog when a UE uploads data to the core network, e.g., for streaming live video, which is a valid scenario as in [24]. Moreover, at the beginning of each time slot, each SBS obtains the DL queue state information (QSI) by observing its own queue backlog and acquires the UL QSI by receiving queue update signals from the UEs. In addition, channel state information (CSI) is collected by the FD SBSs through feedback channels. After the collection of QSI and CSI is completed, all SBSs forward this information directly to the central scheduler (CS) via wired backhauls. With the joint QSI and CSI, the CS determines an appropriate resource allocation through BA and RC. The CS is located at the core unit and can access all the data processing. Note that the considered scenario is consistent with the centralized architecture of the functional split network option, which is regarded as a feasible system in future networks [25]. The BA result of the considered network in each time slot determines the UEs' transmission rates at the physical layer, whereas the RC policy determines the service quality for a user by adjusting the amount of data entering the queue backlogs at the application layer, as indicated in [26].

B. SIGNAL MODEL
Let the data transmission rates from the $i$-th FD SBS to the $k$-th DL UE and from the $m$-th UL UE to the $n$-th FD SBS at time slot $t$ be denoted by $R^{D}_{i,k}(t)$ and $R^{U}_{m,n}(t)$, respectively. $R^{D}_{i,k}(t)$ can be calculated using where $d^{D}_{i,k}(t)$ is the power of the desired signal from the $i$-th FD SBS to the $k$-th DL UE at time slot $t$ and is given as . The parameter $c^{D}_{i,j,k}(t) \in \{0, 1\}$ is a binary indicator, where $c^{D}_{i,j,k}(t) = 1$ means that the $i$-th FD SBS allocates its $j$-th beam to the $k$-th DL UE at time slot $t$. $p^{D}_{i,j}(t)$ is the transmission power that the $i$-th FD SBS allocates to its $j$-th beam at time slot $t$. The total transmit power of the $i$-th FD SBS is fixed at $P^{D}_{i}(t)$ and is equally allocated to the beams selected for DL transmission. Hence, $p^{D}_{i,j}(t)$ in $d^{D}_{i,k}(t)$ can be expressed as follows: with $\rho_{D_{i,k}}(t)$ is the distance between the $i$-th FD SBS and the $k$-th DL UE at time slot $t$ and $\alpha$ is the path loss exponent. $D_j(\theta_{D_{i,k}}(t))$ denotes the directivity of the $j$-th beam with regard to the angle of departure (AoD) $\theta_{D_{i,k}}(t)$ between the $i$-th FD SBS and the $k$-th DL UE at time slot $t$. In this study, the beams in the FD SBSs are formed by applying the Butler method such that $D_j(\theta_{D_{i,k}}(t))$ can be rewritten as [7], [27] with $M = 2^{p}$ (where $p \ge 1$ is an integer), and the array factor with Note that to create fixed beams, the Butler method is applied as it is a representative solution with the advantage of a straightforward design [28]. The number of beams formed by the conventional Butler method equals the number of antenna elements, which is restricted to an integral power of 2 [29]. Therefore, with $M$ antenna elements, each SBS can form $M$ beams in this paper. For more details on the Butler method, please refer to [30]. This assumption is considered reasonable and has been widely adopted in the current literature [27], [28], [29]. Moreover, the formulation of $d^{D}_{i,k}(t)$ reflects that line-of-sight (LoS) transmission is predominant in the mm-wave band.
Moreover, the dynamic variation of wireless channels is realized by varying the values of ρ −α D i,k (t) and D j (θ D i,k (t)) in each time slot. Such variation in channel condition can be interpreted as the outcome of the UEs' mobility. In addition, since it is difficult to obtain fast channel variations when the UE has high mobility, only the average CSI is considered in this study, which is indicated to be practical as shown in [31].
is the inter-user interference (IUI) power experienced by the k-th DL UE when receiving signals from the i-th FD SBS at time slot t and can be expressed as follows: . σ 2 D i,k is the variance of the additive white Gaussian noise (AWGN). Moreover, R U m,n (t) can be expressed as follows: d U m,n (t) is the power of signals transmitted from m-th UL UE to n-th FD SBS at time slot t and can be written as follows: where c U n,p,m (t), ρ −α U n,m (t), and D p (θ Un,m (t)) are defined in similar manner as c D i,j,k (t), ρ −α D i,k (t) and D j (θ Di,k (t)), respectively, and θ Un,m (t) represents the AoD between n-th FD SBS and m-th UL UE at time slot t. p U m (t) is the transmit power of m-th UL UE at time slot t. I SI m,n (t) in (7) is the SI power received by the n-th FD SBS when processing the m-th UL UE signals at time slot t and can be summarized as and D j (θ Si,n (t)) are defined similar to ρ −α U n,m (t) and D p (θ Un,m (t)) with θ Si,n (t) denoting the AoD between the i-th and n-th FD SBS at time slot t. γ is the SI cancellation amount and can be written as with γ PS , γ AC , and γ DC representing the effects from PS, AC, and DC [17], [19]. Note that the SI mitigation methods have been extensively developed in several literature [10], [11], [13], [17]. Thus, the resource allocation for FD networks consisting of SBSs with SI cancellation capability is considered practical. I U m,n (t) is the IUI power from other UL UEs received by the n-th FD SBSs when processing the m-th UL UE signals at time slot t and can be expressed as In addition, σ 2 U m,n in (7) is the variance of the AWGN. Furthermore, the total power consumption of the n-th FD SBS for UL communications at time slot t can be written as where p c,sta represents the static circuit power consumption. p γ is the power consumption for SI cancellation and can be written as [18] with p γ AC and p γ DC denoting power consumption of AC and DC respectively. 
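As an illustration of the signal model above, the following Python sketch evaluates an UL rate of the form log2(1 + SINR), with the desired power given by transmit power, path loss, and beam directivity, and with the SI, IUI, and noise powers in the denominator. This mirrors the per-beam rate expression quoted later in Lemma 3(a). The function names, the helper `residual_si` (which simply attenuates a raw SI power by a total cancellation amount given in dB), and all numerical values are assumptions made for this example and are not taken from the paper.

```python
import math

def uplink_rate(p_tx, distance, alpha, directivity, si_power, iui_power, noise_var):
    """Rate log2(1 + SINR) for one UL UE on one beam (bit/s/Hz)."""
    # Desired power: transmit power x path loss (distance^-alpha) x beam directivity.
    desired = p_tx * distance ** (-alpha) * directivity
    return math.log2(1.0 + desired / (si_power + iui_power + noise_var))

def residual_si(raw_si_power, cancellation_db):
    """Hypothetical helper: SI power left after cancellation_db dB of suppression."""
    return raw_si_power / 10.0 ** (cancellation_db / 10.0)

# Illustrative numbers only (not taken from the paper).
si = residual_si(raw_si_power=1.0, cancellation_db=100.0)
rate = uplink_rate(p_tx=0.2, distance=50.0, alpha=2.0, directivity=8.0,
                   si_power=si, iui_power=1e-9, noise_var=1e-9)
print(f"residual SI = {si:.1e}, UL rate = {rate:.2f} bit/s/Hz")
```

Under these assumptions, a larger cancellation amount lowers the residual SI and therefore raises the UL rate, which is the qualitative dependence the model describes.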
Note that $P^{U}_{n}(t)$ implies that SI cancellation is not implemented if the $n$-th SBS does not serve any UL UE. The time-average EE of the $n$-th FD SBS for UL communications can therefore be given as follows: where the expectation in (14) is taken with respect to the channel conditions and the resource allocation result. The amounts of data briefly stored in the queue backlog of the $i$-th FD SBS for the $k$-th DL UE and in that of the $m$-th UL UE at time slot $t$ are respectively denoted by $Q^{D}_{i,k}(t)$ and $Q^{U}_{m}(t)$, which evolve according to the following dynamic equations: (15) and Note that "briefly stored" implies that a data packet enters the queue backlog for a certain time duration and is then transmitted to the UEs or SBSs. In addition, the mean rate stability of an individual queue indicates that there exist finite constants $\xi_{D}$ and $\xi_{U}$ such that $\xi_{D} \ge Q^{D}_{i,k}(t)$ and $\xi_{U} \ge Q^{U}_{m,n}(t)$, $\forall t$. In other words, mean rate stability implies that the sums of the transmission rates $R^{D}_{i,k}(t)$ and $\sum_{n=1}^{N} R^{U}_{m,n}(t)$ over long periods of time slots are larger than the amounts of data admitted into the queues, $A^{D}_{i,k}(t)$ and $A^{U}_{m}(t)$ in (15) and (16), respectively, i.e., and Besides, a network is stable if all individual queues in the network are mean rate stable. $A^{D}_{i,k}(t)$ and $A^{U}_{m}(t)$ are the amounts of new data that are admitted into the queue backlog of the $i$-th FD SBS for the $k$-th DL UE and into that of the $m$-th UL UE at time slot $t$, respectively, and are given as and $\mu^{D}_{i,k}(t)$ and $\mu^{U}_{m,n}(t)$ are the admission rates of the data required to enter the queues of the $i$-th FD SBS for the $k$-th DL UE and of the $m$-th UL UE for the $n$-th FD SBS, respectively. After new data enters the queue backlog, the QoE for DL and UL communications can be measured using the following metric function [32]: where $X \in \{D, U\}$ represents either DL or UL communications. In particular, $U^{X}_{k}(\mu^{X}_{k}(t))$ is referred to as the PSNR [26] and can be expressed as where when $X = D$ and $X = U$, respectively.
U max D k and U max U k denote the maximum QoE requirement of the k-th DL UE and UL UE, respectively. ω D k and ω U k are predefined parameters that are related to the service content requested by the k-th DL UE and UL UE, respectively. Specifically, given maximum QoE requirement U max D k and U max U k , the content type for small value of ω D k and ω U k can be considered as high-definition video, which requires large amount of data to enter into the queue in order to satisfy QoE requirement. On the other hand, larger value of ω D k and ω U k corresponds to video with rather still scene, which requires comparably loose admission rate to achieve QoE requirement. The model (22) is suitable for applications with a QoE requirement where for a given user being allocated enough resource to achieve s X k (t) = 1, for 0 ≤ s X k (t) ≤ 1, the performance presented by the defined objective function cannot be further improved. Hence, the scheduler will not assign more resource to the given user and the performance can only be enhanced by assigning remaining resources to other users. In contrast, the satisfaction of quality will be severely degraded when the service rate is lower than the required one.
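To make the queue backlog dynamics described above concrete, the following Python sketch uses the update commonly adopted in Lyapunov optimization, in which the backlog is first reduced by the served data (never below zero) and then increased by the newly admitted data. This form is an assumption of the example, not a transcription of (15) and (16), and the function names `update_queue` and `simulate` are hypothetical.

```python
def update_queue(backlog, served, admitted):
    """One-slot update: serve up to `served`, then add the newly admitted data."""
    return max(backlog - served, 0.0) + admitted

def simulate(arrivals, rates, q0=0.0):
    """Return the backlog trajectory for given per-slot admissions and rates."""
    trajectory = [q0]
    for admitted, served in zip(arrivals, rates):
        trajectory.append(update_queue(trajectory[-1], served, admitted))
    return trajectory

# Illustrative numbers only: the service rate exceeds the admission rate
# after the first slot, so the backlog stays bounded.
traj = simulate(arrivals=[3.0, 3.0, 3.0, 3.0], rates=[2.0, 4.0, 4.0, 4.0])
print(traj)
```

In this toy run the backlog remains bounded because the long-run service rate is not smaller than the admission rate, which is the intuition behind mean rate stability.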

C. PROBLEM FORMULATION
In this study, each FD SBS is considered an SWB system that will limit each DL UE to select at most one beam to receive signals and this applies to UL UE as well to transmit signals. Currently, each beam can serve at most one DL and one UL UEs in each time slot for FD operations [7]. The corresponding constraints can be written as Owing to the practical limitations in the SWB system expressed in (24) to (25), spatial multiplexing cannot be utilized in the considered network. The SWB system is adopted in this study because it is one of the mainstream techniques for multi-user systems in mm-wave bands. This system has the advantage of low hardware complexity and signaling overhead required to serve multiple UEs [7]. The BA result for UEs in the SWB system is based on the received signal strength in the LoS path to achieve efficient beam switching, whereas LoS is dominant in mm-wave bands. Thus, the SWB system is considered suitable for mm-wave transmissions [27], [33]. Note that the system model proposed in this paper is developed based on feasible network scenarios and assumptions. Specifically, to consider beam assignment problem for mm-wave, we consider SWB system with the advantage of low feedback overhead compared with conventional codebook-based beamforming method [33]. To create fixed beam for SWB system, the representative method, i.e., Butler method, is adopted [28]. Moreover, feasible SI cancellation methods are considered in this paper, including PS, AC and DC with practical power consumption model based on current literature [17], [19] in order to provide insight regarding the influence of different SI cancellation schemes. This paper considers LoS scenario because in mm-wave channels, the LoS path is dominant and the non-line-of-sight (NLoS) paths are weak due to high propagation loss, scattering and blockage in mm-wave environments. Such assumption has been regarded as a valid scenario as stated in [7], [27], [34]. 
However, the derived analytical results in this paper are useful for more general cases and for further design. In particular, by adopting the extended Saleh-Valenzuela channel with clustered ray multi-path propagation [33], the analysis in this paper can be generalized to NLoS cases. The methodology discussed in this paper is a general framework that is applicable to both LoS and NLoS environments.
The proposed optimization problem aims to maximize long-term QoE via the BA and RC, with network stability and average EE constraint of each SBS being satisfied. From the abovementioned description, the problem can be formulated as follows: SWB is an analog beamforming-based system, the system is specified to provide sufficient transmission power without employing massive MIMO techniques discussed in current literature. For example, the SWB system discussed in [33] utilizes only 9 antennas for each remote antenna unit. Hence, massive MIMO is not considered in this study. Moreover, according to [35], video traffic accounts for over 75 percent of mobile traffic. QoE, which is defined as the overall acceptability of an application or service perceived subjectively by the end user [36], is a metric used to evaluate the performance of video transmission. Different from conventional communications, a user may experience different QoE even if data rates are the same due to various video characteristics [37]. Hence, from the resource management perspective, it is more efficient to maximize user satisfaction by solving QoE-based resource allocation problem than optimizing traditional mean rate.
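The beam constraints described for the SWB system (each UE uses at most one beam, and each beam serves at most one DL UE and at most one UL UE per time slot) can be checked with a few lines of code. The sketch below is a minimal illustration, assuming the binary indicators are stored as nested lists `c[i][j][k]` (SBS, beam, UE) and that the one-beam-per-UE limit applies across all SBSs; the function name `beam_constraints_ok` is hypothetical. Applying the check separately to the DL and UL indicator arrays covers the FD case.

```python
def beam_constraints_ok(c):
    """Check c[i][j][k] in {0, 1}: SBS i assigns beam j to UE k (one link direction)."""
    n_sbs, n_beams, n_ues = len(c), len(c[0]), len(c[0][0])
    # Each beam serves at most one UE of this link direction.
    for i in range(n_sbs):
        for j in range(n_beams):
            if sum(c[i][j]) > 1:
                return False
    # Each UE uses at most one beam (assumed here to hold across all SBSs).
    for k in range(n_ues):
        used = sum(c[i][j][k] for i in range(n_sbs) for j in range(n_beams))
        if used > 1:
            return False
    return True

# One SBS, two beams, two UEs.
valid = [[[1, 0], [0, 1]]]          # beam 0 -> UE 0, beam 1 -> UE 1
shared_beam = [[[1, 1], [0, 0]]]    # beam 0 serves two UEs
two_beams = [[[1, 0], [1, 0]]]      # UE 0 uses two beams
print(beam_constraints_ok(valid), beam_constraints_ok(shared_beam),
      beam_constraints_ok(two_beams))
```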

III. PROPOSED BEAM ASSIGNMENT AND RATE CONTROL ALGORITHM
Solving the optimization problem (26) is difficult owing to the concurrent consideration of the short-term constraints (24)−(25), (26d), (26e) and the long-term constraints (26b), (26c). The long-term constraints couple the optimization variables with the CSI collected over a long period, which leads to drawbacks such as high system dimensionality and computational complexity. For this reason, the optimization problem (26) is converted into a series of online BA and RC problems that can be solved in real time by leveraging the Lyapunov optimization technique. Here, online means that the converted problem depends only on the CSI and QSI in each time slot. We then propose a DBR algorithm that decouples the converted problem into sub-problems. Note that the original objective function in (26a) implies that the considered problem involves future CSI and QSI. Converting the original problem into an online problem is therefore an important simplification step, since the converted problem depends only on the CSI and QSI in each time slot. After the conversion, the problem can be solved by an algorithm with feasible complexity, and analytical results can be derived to provide further technical insights, e.g., Lemma 3 and Theorems 1 to 3.

A. PROBLEM TRANSFORMATION
It should be noted that (26b) is a time-average constraint on the EE performance of the FD SBSs. This constraint can be handled by constructing a virtual queue $Z^{U}_{n}$ for the $n$-th FD SBS, which evolves according to the following equation [23], [32]: Note that $Z^{U}_{n}$ is not a real data queue; it is created by the proposed algorithm to handle the constraint (26b).
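The role of the virtual queue can be illustrated with a short Python sketch. It assumes the usual virtual-queue construction for a time-average constraint: in each slot the queue grows by the data demanded by the EE requirement (the required EE multiplied by the consumed power) and shrinks by the data actually transmitted, and it never drops below zero. This form is an assumption chosen to be consistent with the interpretation of the EE term given after Lemma 2; the names `update_virtual_queue`, `run`, `eta_req`, and the numerical values are hypothetical.

```python
def update_virtual_queue(z, eta_req, power, total_rate):
    """One-slot update of the virtual queue for the time-average EE constraint."""
    # Arrival: eta_req * power (demanded data); service: total_rate (delivered data).
    return max(z + eta_req * power - total_rate, 0.0)

def run(eta_req, powers, rates):
    """Virtual queue length after processing all slots, starting from zero."""
    z = 0.0
    for power, rate in zip(powers, rates):
        z = update_virtual_queue(z, eta_req, power, rate)
    return z

# Illustrative numbers only.
met = run(eta_req=2.0, powers=[1.0] * 10, rates=[3.0] * 10)      # rate/power >= eta_req
violated = run(eta_req=2.0, powers=[1.0] * 10, rates=[1.0] * 10)  # rate/power < eta_req
print(met, violated)
```

When the delivered rate per unit power stays above the requirement, the virtual queue remains at zero; when the requirement is persistently violated, the queue grows linearly with time, which is why keeping it stable enforces the time-average constraint.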

Lemma 1:
If the virtual queue $Z^{U}_{n}(t)$ is mean rate stable, then the time-average EE constraint (26b) is automatically satisfied by the $n$-th FD SBS.
Proof: Please see Appendix A. Remark 1: From Lemma 1, the original optimization problem (26) can be transformed into a problem of maximizing the long-term QoE of the UEs subject to queue and virtual queue stability constraints together with (24)-(25), (26d), and (26e). The transformed problem is formulated as follows: are mean rate stable, $\forall i, j, m, n, t$. (28) To apply the Lyapunov optimization technique, let The Lyapunov function $L(G(t))$ and the one-slot conditional Lyapunov drift $\Delta(G(t))$ are defined as follows: and Subtracting the expectation of , obtains the following drift-minus-reward term: Note that $V \ge 0$ is a flexible control parameter that can be adjusted to achieve a tradeoff between the system delay and QoE. In particular, a higher QoE can be achieved by increasing $V$, whereas a smaller value of $V$ is chosen if a strict delay requirement is preferred. The impact of $V$ is discussed in Section V. Based on the criterion of the Lyapunov method [23], [26], [32], the optimization problem in (26) can be solved by minimizing the upper bound of the drift-minus-reward term in each time slot $t$, which is given in the following lemma. Lemma 2: Suppose the elements of $\rho(t)$ are i.i.d. over time slots. Under any control algorithm, any $V \ge 0$, and all possible $G(t)$, the drift-minus-reward term has the following upper bound [26], [38]: (34) and $\pi^{D}(t)$ and $\pi^{U}(t)$ can be interpreted as the amounts of arrived data $A^{D}_{i,k}(t)$ and $A^{U}_{m}(t)$ minus the quantities of departed data $R^{D}_{i,k}(t)$ and $\sum_{n=1}^{N} R^{U}_{m,n}(t)$, weighted by the queue lengths $Q^{D}_{i,k}(t)$ and $Q^{U}_{m}(t)$, for the transmission between the $i$-th FD SBS and the $k$-th DL UE and for the $m$-th UL UE at time slot $t$, respectively. Similarly, $\pi^{EE}(t)$ can be regarded as the transmission data $\eta^{U}_{\mathrm{req}} P^{U}_{n}(t)$ demanded by the EE constraint $\eta^{U}_{\mathrm{req}}$ minus the actually transmitted data $\sum_{m=1}^{K_U} R^{U}_{m,n}(t)$, weighted by the virtual queue length $Z^{U}_{n}(t)$ of the $n$-th FD SBS at time slot $t$.
Note that the closed-form solutions for the BA and RC sub-problems in this paper are derived based on (32). Hence, $B$ is an important upper bound used to simplify the objective function and to obtain further analytical results, and is a positive constant that satisfies the following inequality for all $t$: Proof: Please see Appendix B. According to Lemma 2 and the Lyapunov method, the transformed problem that minimizes the right-hand side (RHS) of (32) at each time slot $t$ can be written as Remark 2: Since the problem (37) obtained using the Lyapunov method depends only on the CSI and QSI in each time slot, which have already been collected by the SBSs in the network, the expectation in (37) can be omitted without influencing the solution of the considered problem. For the purposes of brevity, $\pi^{D}(t)$, $\pi^{U}(t)$ and $\pi^{EE}(t)$ denote $\pi^{D}(t)$, $\pi^{U}(t)$, and $\pi^{EE}(t)$, respectively, after neglecting the expectation, e.g., $\pi^{D}(t)$ can be expressed as the solution of problem (37) tends to allocate additional resources to the UE or SBS with additional data to transmit in the data/virtual queue. Besides, note that only the average CSI is assumed to be available in the network. The reason for this assumption is that, in practical networks, collecting the instantaneous CSI between all SBSs and UEs is challenging due to channel variation (small-scale fading) and network sharing latency, as discussed in [39]. With this channel model, we adopt an i.i.d. distribution for the distance $\rho(t)$ to model the scenario in which UEs move with random directions and speeds. It should be noted that the slot duration over which such a channel condition remains unchanged can be up to several seconds according to the numerical results in [40]. Moreover, in the numerical results of this paper, UEs are uniformly distributed in a 100 m × 100 m service area. Combining these settings, the speed of a UE is approximately 100 km/h, which is considered a reasonable assumption in some network scenarios.

B. ALGORITHM DESIGN
It can be observed that the optimization variables of (37), i.e., BA and RC, are tightly coupled. These optimization variables are therefore decoupled to solve the problem with affordable computational complexity. In particular, the UL BA $c^{U}$ is solved for given $c^{D}$ and $\mu$, the DL BA $c^{D}$ is determined for given $c^{U}$ and $\mu$, and $\mu$ is then solved for given $c$.

1) BEAM ASSIGNMENT
First, problem (37) is decomposed into a sub-problem involving only the UL BA, which can be expressed as follows: Then, the following lemma is proposed to develop a closed-form solution for (39): Lemma 3: Considering the SWB FD SCN, the performance indicators in UL and DL communications possess the following properties: (a) The total transmission rates of the $m$-th UL UE and the $k$-th DL UE can be described as where $R^{U}_{m,n,p}(t) = \log_2\left(1 + \frac{p^{U}_{m}(t)\,\rho^{-\alpha}_{U_{n,m}}(t)\,D_p(\theta_{U_{n,m}}(t))}{I^{SI}_{m,n}(t) + I^{U}_{m,n}(t) + \sigma^{2}_{U_{m,n}}}\right)$, and , that can be interpreted as the achievable transmission rates when the $m$-th UL UE utilizes the $p$-th beam on the $n$-th FD SBS and the $k$-th DL UE is served using the $j$-th beam on the $i$-th FD SBS. (b) The PSNR of the $m$-th UL UE can be equivalently expressed as follows: Note that the result of (42) can be directly applied to the PSNR of DL UEs. (c) When each DL UE is served by an SBS in the FD SCN, the transmission rate between the $m$-th UL UE and the $n$-th FD SBS has the following upper bound: with and (t) can be construed as the data rate of the $m$-th UL UE to the $n$-th FD SBS under the influence of the SI power caused by the $j$-th beam on the $i$-th FD SBS. Proof: Please see Appendix C. Based on Lemmas 3(a) and 3(b), the closed-form solution of the UL BA problem is provided as follows: Theorem 1: Let $m^{*}(t) \in \mathbb{R}^{1 \times 2}$ represent the BA result for the $m$-th UL UE at time slot $t$, which is given as with Therefore, we have Proof: Please see Appendix D. Remark 3: The principle behind the proposed BA policy for UL UEs is to allocate to each connected UL UE the beam that contributes the highest transmission rate and QoE, which is in accordance with intuition. This study focuses on the BA design at the physical layer because the major challenge in mm-wave communications is the path loss; hence, the BA result is regarded as the factor that dominates the performance of the network under consideration.
It is worthwhile to note that the analytical results provide important foundation to facilitate joint BA and power allocation design for the mm-wave based FD network. It can be observed from Lemma 3(a) that the DL and UL transmission rate is converted into a form similar to the UE association problem. Hence, joint BA and power allocation design can be obtained using the framework in [41], [42].
By applying Lemma 3, the closed-form solution for the DL BA problem is summarized in the following theorem: Theorem 2: Let k * (t) ∈ R 1×2 denote the BA for the k-th DL UE at time slot t as k * (t) = arg min That is, the BA result for DL UEs is given by: Proof: Please see Appendix E.

Remark 4:
The solution structure of k * (t) or (53) is in good agreement with the intuitional understanding of FD systems as follows: (a) Each DL UE should be allocated with the beam that results in a high transmission rate. (b) Beams that induce a large SI power should not be utilized to avoid severe performance degradation on UL communications.

2) RATE CONTROL
Given BA results c, the RC sub-problem for UL UEs can be expressed as follows: Since s U m (t) is a concave function of μ U m,n (t), it can be verified that problem (54) is a convex optimization problem.
The stationarity condition is obtained by setting the first-order derivative of (54) with respect to μ U m,n to zero. According to the Karush-Kuhn-Tucker (KKT) conditions [26], the optimal RC decision can be expressed as in (55), where μ max U m is the admission rate that corresponds to the QoE requirement U max U m . Moreover, the RC sub-problems for DL UEs can be solved in a similar manner, as in (56). In (55) and (56), c D i,j,k (t) and c U n,p,m (t) are the BA indicators for the j-th beam of the i-th FD SBS to the k-th DL UE and for the p-th beam of the n-th FD SBS to the m-th UL UE, respectively. Q D i,k (t) and Q U m (t) are the amounts of data stored in the queue backlog of the i-th FD SBS for the k-th DL UE and of the m-th UL UE at time slot t, respectively. U max D k and U max U k denote the maximum QoE requirements of the k-th DL UE and UL UE, respectively. ω D k and ω U k are predefined parameters related to the service content requested by the k-th DL UE and UL UE, respectively. V is the flexible control parameter that achieves the tradeoff between system delay and QoE.
Remark 5: μ * U m,n (t) and μ * D i,k (t) imply that the admission rate is adjusted based on both the UEs' QoE requirement and current data queue length. In particular, when the queue length is short, the amount of data permitted into the queue by the DBR algorithm is larger compared to that of a relatively long queue length. Furthermore, the same quantity of admitted data results in a better QoE for UEs with a less strict QoE requirement compared to that of a more stringent requirement based on (22). Therefore, it can be observed that the DBR algorithm tends to admit data to the queue of UEs with a lower QoE requirement to maximize the total QoE of all UEs, which is similar to the water filling principle [43].
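The behavior described in Remark 5 can be sketched with a toy admission rule. Because the closed forms (55) and (56) and the QoE metric (22) are not reproduced here, the sketch swaps in an assumed concave utility ω log(1 + μ) and solves the per-slot problem max V ω log(1 + μ) − Q μ over 0 ≤ μ ≤ μ max; the function name and the utility are assumptions, not the paper's expressions.

```python
def admission_rate(queue_len, v, omega, mu_max):
    """Admission rate maximizing V*omega*log(1+mu) - Q*mu on [0, mu_max].

    Assumed stand-in for the paper's RC rule: the stationary point
    mu = V*omega/Q - 1 is projected onto the feasible interval,
    which is what the KKT conditions give for a box constraint.
    """
    if queue_len <= 0:
        # Empty queue: no backlog penalty, so admit as much as allowed.
        return mu_max
    stationary = v * omega / queue_len - 1.0
    return min(max(stationary, 0.0), mu_max)


# Short queue -> capped at mu_max; longer queue -> smaller admission.
short_q = admission_rate(1.0, 10.0, 1.0, 5.0)
medium_q = admission_rate(2.0, 10.0, 1.0, 5.0)
long_q = admission_rate(100.0, 10.0, 1.0, 5.0)
```

In this sketch, a larger V or a shorter queue raises the admitted rate until the cap μ max is reached, mirroring the qualitative dependence on queue length and QoE requirement noted in the remark.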
The proposed DBR algorithm is summarized in Algorithm 1, where the quantities with subscript 0 (e.g., c 0 and μ U 0 ) are the symbolic representations of the parameter initialization results. After the parameters are initialized, the UL BA result is determined based on the closed-form solution ξ U m,n,p (t) (lines 3 ∼ 7). Next, if one beam is selected by more than one UL UE, it is assigned to the UL UE with the lowest value of ξ U m,n,p (t) or ξ U m′,n,p (t). The UL BA process is complete when each UL UE is assigned to a beam (lines 8 ∼ 18). The UL RC result is updated based on μ * U m,n (t) for the corresponding μ m,n (t) of a non-zero element in U (lines 19 ∼ 21). Note that the element of U at the n-th row and p-th column is denoted by λ m * (t) when m * (t) = [n, p]. DL BA and DL RC are implemented after UL BA and UL RC are accomplished. Since the procedures of DL BA and DL RC are similar to those in UL communications, the steps of DL BA and RC are appended to those of UL BA and RC, respectively.
Moreover, the slash symbol in Algorithm 1 is used to separate the BA and RC processes for UL and DL communications; for example, line 5 indicates that the UL BA result m * (t) is updated from ξ U m,n,p (t) based on c 0 and μ U 0 , whereas the DL BA result k * (t) is updated from ξ D i,j,k (t) based on c D 0 , U and μ D 0 . The complexity of the proposed Algorithm 1 is dominated by the number of UEs and the occurrence of beam reselection when different UEs are assigned to the same beam. In the worst case, all UEs select the same beams such that only one UE completes the BA process in each iteration, which causes complexities of $2K_U^2 + K_U$ and $2K_D^2 + K_D$ for UL and DL, respectively. Hence, the computational complexity of Algorithm 1 can be expressed as $O(K_U^2 + K_D^2)$, where $K_U$ and $K_D$ are the numbers of UL and DL UEs in the considered network, respectively.
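The beam selection and reselection steps described above can be sketched as follows. This is a minimal illustration and not the paper's Algorithm 1: the function name and the dictionary layout of the metric ξ are assumptions, each beam is assumed to serve at most one UE, and the RC update is omitted.

```python
def assign_beams(xi):
    """Greedy beam assignment with conflict resolution.

    xi[m][(n, p)] is the metric of UE m on beam p of SBS n (lower is
    better). Each unassigned UE proposes its best free beam; a contested
    beam goes to the UE with the lowest metric, and the losers reselect
    among the remaining beams in the next round.
    """
    assignment = {}
    taken = set()
    unassigned = set(xi)
    while unassigned:
        proposals = {}
        for m in sorted(unassigned):
            free = {beam: val for beam, val in xi[m].items() if beam not in taken}
            if not free:
                raise ValueError("not enough beams for all UEs")
            best = min(free, key=free.get)
            proposals.setdefault(best, []).append(m)
        # Every proposed beam gets exactly one winner, so at least one
        # UE leaves the unassigned set in each round.
        for beam, ues in proposals.items():
            winner = min(ues, key=lambda m: xi[m][beam])
            assignment[winner] = beam
            taken.add(beam)
            unassigned.discard(winner)
    return assignment


# Both UEs prefer beam (0, 0); UE 1 has the lower metric and wins it.
xi_ul = {0: {(0, 0): 1.0, (0, 1): 2.0}, 1: {(0, 0): 0.5, (0, 1): 3.0}}
result = assign_beams(xi_ul)
```

When all UEs contend for the same beam, only one UE is settled per round, so the number of rounds grows linearly and the number of proposals quadratically in the UE count, consistent with the worst case discussed above.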

IV. PERFORMANCE ANALYSIS
In this section, essential and practical boundedness assumptions are given, and then the performance of the proposed DBR algorithm is analyzed.

A. BOUNDEDNESS ASSUMPTIONS
Let ρ(t) represent the set of all available BA and RC options. For all BA and RC decisions {c(t), μ(t)} ∈ ρ(t) under a given channel condition ρ(t), the following boundedness properties are satisfied, in which τ , S min , and S max are finite constants and η P = η U req P U n . These assumptions are considered feasible because the physical metrics (e.g., admission rate, power consumption, and QoE) are all bounded in real systems.
Lemma 4: Suppose that (26) is feasible, i.e., there exists at least one BA and RC solution that satisfies constraints (24) − (25) and (26b) − (26e) along with the inequalities from the boundedness assumptions. Then, for any δ ≥ 0, there exist a stationary randomized algorithm and an arbitrarily small positive number that satisfy the following conditions. Note that S opt is the theoretical optimum of (26). When the BA and RC decisions of the considered network are given as c * (t) and μ * (t), respectively, the admission and transmission rates of the link between the i-th FD SBS and k-th DL UE are denoted by A * D i,k (t) and R * D i,k (t), respectively; the admission rate of the m-th UL UE and the transmission rate of the link between the m-th UL UE and n-th FD SBS are denoted by A * U m (t) and R * U m,n (t), respectively; and the power consumption of the n-th FD SBS is represented by P * U n .
Proof: The proof of this lemma is straightforward and is discussed on pages 58 to 62 of [38].

B. PERFORMANCE ANALYSIS OF PROPOSED DBR ALGORITHM
The performance analysis of the proposed DBR algorithm is provided as follows:
Theorem 3: Suppose that the elements of ρ(t) are i.i.d. over time slots, problem (26) is feasible, Q(0) < ∞, and Z U (0) < ∞. For any control parameter V ≥ 0, the proposed DBR algorithm has the following properties:
(a) Network stability: All queues {Q(t), Z U (t)} are mean rate stable. Hence, satisfaction of the constraint (28) is guaranteed by Lemma 1.
(b) Optimality of QoE performance: The time-average QoE satisfies the bound in (70).
(c) Average backlog of all queues: The DBR algorithm has the time-average queue bound in (71), with ζ = min{ , τ, Nτ }.
Proof: Please see Appendix F.
Note that the inequalities in Theorem 3(b) and (c) exhibit a tradeoff of [O(1/V), O(V)] between the QoE and system delay.
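The [O(1/V), O(V)] tradeoff can be illustrated with a toy single-queue simulation. This is a sketch under strong assumptions that are not the paper's model: one queue with a constant service rate, an assumed utility ω log(1 + μ) in place of the QoE metric (22), and a drift-plus-penalty admission rule that maximizes V ω log(1 + μ) − Q μ in each slot. All names and default values are hypothetical.

```python
import math


def simulate(v, slots=2000, service=1.0, omega=1.0, mu_max=3.0):
    """Return (time-average utility, time-average backlog) for one queue."""
    q = 0.0
    total_utility = 0.0
    total_backlog = 0.0
    for _ in range(slots):
        # Per-slot admission: stationary point of the assumed objective,
        # clipped to the feasible interval [0, mu_max].
        if q <= 0:
            mu = mu_max
        else:
            mu = min(max(v * omega / q - 1.0, 0.0), mu_max)
        total_utility += omega * math.log1p(mu)
        total_backlog += q
        # Queue update: serve first, then admit new data.
        q = max(q - service, 0.0) + mu
    return total_utility / slots, total_backlog / slots


utility_low_v, backlog_low_v = simulate(1.0)
utility_high_v, backlog_high_v = simulate(100.0)
```

In this toy setup the larger V yields both a higher average utility and a longer average backlog, which is the qualitative behavior that parts (b) and (c) of the theorem quantify.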

V. PERFORMANCE EVALUATION
The performance of the proposed DBR algorithm is evaluated in this section. The elements of ρ(t) are i.i.d. over time slots, with SBSs and UEs uniformly distributed in a service area of 100 m × 100 m. There were K D = 4 DL UEs, K U = 4 UL UEs, and N = 6 SBSs, and the number of beams that can be formed on each SBS was M = 8. The transmission powers of each FD SBS and UL UE were P D i (t) = 30 dBm and p U m (t) = 20 dBm, respectively. The SI cancellation amounts and power consumption were γ PS = 40 dB, γ AC = 20 dB, γ DC = 15 dB, p γ AC = 20 mW, and p γ DC = 30 mW. The simulations were conducted for T = 1000 time slots. The QoE requirements of DL and UL communications were set to U max D k = 25 and U max U m = 15 unless otherwise specified. Note that the simulation parameters in this study were mainly discussed in [19], [26], [44].

A. VALIDATION OF THE PERFORMANCE ANALYSIS
Figures 2 and 3 illustrate the stability of the data and virtual queues under the proposed DBR algorithm by showing the average data queue lengths of DL UEs and UL UEs and the virtual queue length versus time slots under different time-average EE constraints η U req when V = 3800. It can be seen that the proposed DBR algorithm maintains a stable data queue and virtual queue backlog for UEs and FD SBSs, respectively, over a long period of time slots. The correctness of Theorem 3(a) is therefore verified in Figs. 2 and 3. It can be observed that stricter time-average EE constraints cause longer virtual queue lengths. Specifically, the upper bounds are 27.2 and 49.2 bits when η U req = 5 and η U req = 10, respectively. However, when η U req = 20, the virtual queue stores more than 60 bits of data in many slots. This tendency results from the fact that higher values of η U req introduce a larger amount of newly arrived data according to (27). Moreover, the fluctuation of the queue length over time slots reflects the dynamic variation of the wireless channel. In particular, the queue lengths of UL UEs in the lower subplot of Fig. 2 at t = 100 and t = 180 can be interpreted as the result of relatively good and poor channel quality, respectively. The queue length fluctuates because, when the channel condition is poor such that the departure rate of packets is smaller than their arrival rate, the amount of data in the queue backlog increases according to (15) and (16), and vice versa. Moreover, the variables in (15) and (16) are not constants and depend on (i) BA results, (ii) channel conditions, and (iii) rate control results: R D i,k (t) and R U m,n (t) depend on BA results and channel conditions, while A D i,k (t) and A U m (t) are determined by rate control results. Different values of (i)-(iii) therefore lead to different inputs for (15) and (16). Hence, the amount of data in the queue backlog is not constant and usually fluctuates within a certain range over time slots. Because the dynamic variation of channel conditions over time slots is considered in this paper, the channel condition fluctuates between relatively poor and good quality (which is reflected by the different queue lengths over time slots), and certain time slots have worse channel conditions than others.
Figure 4 depicts the QoE performance of the proposed DBR algorithm by showing s D k and s U m versus V when the UEs' QoE requirements (U max D k , U max U m ) are chosen as (16,12), (20,16), and (25,20). It can be observed that the QoE improves as V increases. In addition, although the QoE under a stringent QoE requirement is lower than that under a more tolerant requirement for a fixed value of V (e.g., s U m is lower at U max U m = 20 compared with U max U m = 16 and 12), the proposed DBR algorithm satisfies the UEs' QoE requirement when the control parameter V is sufficiently large. In addition, Fig. 4 validates Theorem 3(b).
Figure 5 investigates the impact of the control parameter V on system delay by showing the queue length versus V. Since the system delay is proportional to the amount of data in the queue backlog, it can be seen that the system delay increases with V. Theorem 3(c) is also validated in Fig. 5, considering that a larger value of V yields a longer queue. Moreover, the flexibility of the control parameter V can be observed by combining the results of Figs. 4 and 5, i.e., increasing V achieves improved QoE at the price of a higher system delay, whereas decreasing V yields a lower system delay at the cost of degraded QoE.

B. IMPACT OF IUI AND PERFORMANCE COMPARISON
The impact of IUI on system delay is studied in Fig. 6 for the DBR method when the UEs' QoE requirement is satisfied. Two baseline methods are also implemented for the performance comparison. Baseline 1 is the dynamic throughput optimal (DTO) method, which neglects the queue backlog and selects the beams that maximize only the sum of the DL and UL transmission rates in the wireless channel. Note that the DTO method can be regarded as an advanced version of the method in [7], whose BA algorithm is designed based only on signal strength, excluding the influence of any interference. Baseline 2 is the ES method, which determines the BA and RC policies after comparing all possible results and can be considered the achievable theoretical limit. However, the ES method requires significantly higher computational complexity than the other schemes.
Since a larger number of UEs increases the IUI power and thereby degrades the transmission rate of the wireless channel, the average queue backlog becomes longer as the number of UEs increases. Moreover, larger U max D k and U max U m result in a longer queue backlog because the central scheduler must admit additional data into the queue to satisfy the UEs' QoE requirement. A small gap between the DBR and ES methods can be observed for a small number of UEs, whereas the gap is noticeable when the UE number is higher than 8. In this case, as the IUI power is stronger and the system operates in the interference-limited region, the gap between the DBR and ES schemes becomes larger owing to the inaccurate estimation of IUI in the proposed DBR scheme. Furthermore, the queue backlog is not taken into account in the DTO scheme, which implies that the DTO method fails to provide resources for UEs that have additional data to transmit in the queue backlog. It can be seen that more data accumulate in the queue backlog under the DTO method than under the proposed DBR algorithm. This demonstrates the importance of considering the finite queue backlog, since the DBR method maintains a shorter queue backlog than the DTO scheme.
The influence of IUI on QoE is shown in Fig. 7 by comparing the value of V required to satisfy the UEs' QoE requirement versus the UE number under different U max D k and U max U m . Note that a higher required value of V implies that a lower QoE is obtained at a fixed V. In particular, the QoE is equal to 1 for the DBR and ES methods when V = 6000 with the UE number K D = K U = 2 and U max D k = U max U m = 20, whereas the QoE of the DTO scheme is lower than 1 because the value of V required to satisfy the UEs' QoE requirement is approximately 7000. It can be observed that a larger value of V is required to satisfy the UEs' QoE requirement as U max D k and U max U m become more stringent, which is in accordance with Theorem 3(b). In addition, it can be observed that the QoE at a fixed V is decreased by a stronger IUI power, since the required value of V becomes larger as the number of UEs increases. This result arises because the large IUI power degrades the transmission rate of the wireless channel, which lengthens the queue backlog; based on the RC policies μ * U m,n (t) and μ * D i,k (t), a larger value of V is then needed to satisfy the UEs' QoE requirement. The performance gap between the DBR and ES algorithms is caused by the inaccurate information of the IUI power, which is similar to that shown in Fig. 6. Furthermore, the DTO method neglects the queue backlog; therefore, it requires a larger value of V to satisfy the UEs' QoE requirement than the proposed DBR scheme.
Figure 8 investigates the impact of the requested content type by showing the QoE s D k and s U m versus V under different ω D k and ω U m conditions. As shown by the metric function (22), low ω D k and ω U m values result in a lower QoE at the same V. Note that low ω D k and ω U m values correspond to a content type such as high-definition video, whereas higher values correspond to content requiring less detail, such as a document file. Hence, low ω D k and ω U m values require additional admitted data to achieve a QoE similar to that obtained using higher values. Moreover, different values of V are needed to satisfy the UEs' QoE requirement under different ω D k and ω U m conditions. This implies that the UEs' requirements for different types of QoE services should not be neglected.
Figure 9 shows the influence of different SI cancellation schemes on system metrics by comparing the average UL transmission rates and average EE versus the PS cancellation amount. Higher UL transmission rates result in a lower system delay because the UL transmission rate indicates the amount of data leaving the queue backlog at each time slot. It is observed that the highest UL transmission rate is achieved by activating all the SI cancellation schemes. The UL transmission rates of PS+AC+DC are higher than those of PS+AC and PS until the PS cancellation amount exceeds 45 dB and 65 dB, respectively. However, since the DC scheme consumes more power than the AC and PS schemes, the average EE of PS+AC+DC is lower than those of the PS+AC and PS schemes when the PS cancellation amount is higher than 40 dB and 55 dB, respectively. The results of Fig. 9 indicate that the appropriate SI cancellation scheme for an FD SCN with a finite queue backlog depends on the selected performance metric (i.e., system delay or EE) and the PS capability.
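The rate-versus-power tradeoff between SI cancellation schemes can be made concrete with simple dB arithmetic on the simulation parameters of this section. The sketch below rests on assumptions that are not stated in the text: cancellation amounts are assumed to add in dB, the residual SI is assumed to equal the SBS transmit power minus the total cancellation (beam gain and coupling loss are ignored), and the PS stage is assumed to draw no extra circuit power. The function and variable names are hypothetical.

```python
def residual_si_dbm(tx_dbm, cancellations_db):
    """Residual SI power in dBm after stacking cancellation stages."""
    return tx_dbm - sum(cancellations_db)


def dbm_to_mw(dbm):
    """Convert a power level from dBm to milliwatts."""
    return 10 ** (dbm / 10)


TX_DBM = 30.0  # FD SBS transmit power
schemes = {
    "PS": [40.0],
    "PS+AC": [40.0, 20.0],
    "PS+AC+DC": [40.0, 20.0, 15.0],
}
# Extra circuit power per scheme in mW (20 mW for AC, 30 mW for DC).
overhead_mw = {"PS": 0.0, "PS+AC": 20.0, "PS+AC+DC": 50.0}

residual = {name: residual_si_dbm(TX_DBM, stages) for name, stages in schemes.items()}
```

Under these assumptions, each added stage lowers the residual SI (and so raises the UL rate) but adds circuit power to the denominator of the EE, which is why the preferred scheme depends on the chosen metric.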

VI. CONCLUSION
This study addressed the resource management problem for mm-wave-based SWB FD SCNs considering each UE's QoE requirement and time-varying wireless channels. The formulated optimization problem aims to maximize the long-term QoE via BA and RC under coexisting short-term and long-term constraints. By leveraging the Lyapunov optimization method, the problem was converted into a series of BA and RC problems that do not require CSI to be collected over time slots. To cope with the highly coupled optimization variables with practical complexity under time-varying channel conditions, we first derived novel closed-form solutions for the BA and RC by exploiting beam constraints in the SWB systems. The DBR method with only polynomial complexity was then proposed based on the derived closed-form solutions. The numerical results illustrate that the proposed DBR algorithm strikes a balance between performance and complexity by outperforming the benchmark scheme and achieving nearly optimal performance in terms of system delay and QoE.

APPENDIX A PROOF OF LEMMA 1
From (27), a one-slot inequality on the virtual queue can be obtained. By taking the iterated expectation and applying telescoping sums over t ∈ {0, 1, ..., T − 1}, a bound on the accumulated constraint terms is obtained, which is then divided by T before taking T → ∞. From Jensen's inequality, we have 0 ≤ |E{Z U n (T)}| ≤ E{|Z U n (T)|}. Thus, if Z U n (T) is mean rate stable, i.e., lim T→∞ (E{|Z U n (T)|}/T) = 0, we have lim T→∞ (E{Z U n (T)}/T) = 0, which proves Lemma 1.
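The chain of steps in this proof can be written out explicitly under an assumed generic form of the virtual queue. The sketch below assumes that (27) has the standard form $Z_{U_n}(t+1) = \max\{Z_{U_n}(t) + y_n(t), 0\}$ with $Z_{U_n}(0) \ge 0$, where $y_n(t)$ is a hypothetical placeholder for the per-slot constraint term in (27) and is not the paper's notation.

```latex
\begin{align}
Z_{U_n}(t+1) &\ge Z_{U_n}(t) + y_n(t), \\
\mathbb{E}\{Z_{U_n}(T)\} - \mathbb{E}\{Z_{U_n}(0)\}
  &\ge \sum_{t=0}^{T-1} \mathbb{E}\{y_n(t)\}, \\
\limsup_{T\to\infty} \frac{1}{T}\sum_{t=0}^{T-1} \mathbb{E}\{y_n(t)\}
  &\le \lim_{T\to\infty} \frac{\mathbb{E}\{|Z_{U_n}(T)|\}}{T} = 0 .
\end{align}
```

The first line drops the max operator, the second sums the first line over the T slots and takes expectations, and the third divides by T, uses $\mathbb{E}\{Z_{U_n}(0)\} \ge 0$, and applies mean rate stability, so the time-average constraint holds.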

APPENDIX B PROOF OF LEMMA 2
We will prove Lemma 2 using the following lemma [38]:
Lemma 5: For any non-negative real numbers Q, b, and A, a quadratic inequality on the queue update holds. Applying the definition of the Lyapunov drift on G(t) together with Lemma 5 yields (76). Equation (32) is obtained after subtracting VE{S(t)|G(t)} from the RHS of (76), which proves Lemma 2.
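For reference, a standard inequality of this type from the Lyapunov optimization literature is sketched below, together with a short verification. It is offered as the form commonly used in such drift bounds and is assumed, not confirmed, to coincide with the statement of Lemma 5.

```latex
\begin{align}
\big(\max\{Q - b, 0\} + A\big)^2
  &= \max\{Q - b, 0\}^2 + A^2 + 2A\max\{Q - b, 0\} \\
  &\le (Q - b)^2 + A^2 + 2AQ \\
  &= Q^2 + A^2 + b^2 + 2Q(A - b).
\end{align}
```

The inequality step uses $\max\{Q-b,0\}^2 \le (Q-b)^2$ and, because $b \ge 0$ and $Q \ge 0$, $\max\{Q-b,0\} \le Q$. Applying a bound of this form to each queue update and summing over all queues is the usual route to a drift bound such as (76).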

APPENDIX C PROOF OF LEMMA 3
(a) Based on the beam constraints in the considered SWB FD systems, the proof of Lemma 3 is accomplished by discussing the scenarios where the constraint (24) (when X = U) is equal to either 0 or 1. When (24) (when X = U) is equal to 0, the expression in Lemma 3(a) can be observed directly. When (24) (when X = U) is equal to 1, without loss of generality, it is assumed that the m-th UL UE is assigned the p′-th beam on the n′-th FD SBS, and the expression can again be observed. Note that the equivalent expressions for the transmission rate and PSNR of the k-th DL UE in Lemma 3(a) and Lemma 3(b) can be proved by following similar derivation steps.
(c) The validity of this part of the lemma follows directly from the fact that I SI i,j m,n (t) ≤ I SI m,n (t).

APPENDIX D PROOF OF THEOREM 1
Based on Lemmas 3(a) and 3(b), the objective function (39) can be rewritten as (80). The BA policy m * (t) = arg min n,p ξ U m,n,p (t), or (50), follows because (80) is minimized if the beam assigned to each UL UE leads to the minimum value of ξ U m,n,p (t).

APPENDIX E PROOF OF THEOREM 2
By applying Lemma 3(a) to Lemma 3(c), the objective function for DL BA in (51) can be rewritten in a form analogous to that in Appendix D. Similar to the proof of Theorem 1 in Appendix D, k * (t) = arg min i,j ξ D i,j,k (t) and (53) hold.

APPENDIX F PROOF OF THEOREM 3
Owing to space limitations, the discussion mainly focuses on the BA results in (84) and (85). Similar results can be obtained via a similar derivation process for other BA outcomes. Because the proposed DBR scheme minimizes the RHS of (32) over the constraints (24) − (25), (26d), and (26e), we have (86). When (84) holds, inequality (87) can be obtained by applying Lemma 4 to (86) and taking δ → 0. Using the iterated expectation and telescoping sums [38] over t ∈ {0, 1, ..., T − 1} in this inequality, we obtain (88). Similarly, a corresponding inequality can be obtained by taking δ → 0 when (85) holds. Using the iterated expectation and telescoping sums over t ∈ {0, 1, ..., T − 1} in that inequality together with the boundedness assumptions, we obtain (91).
(a) For the BA result in (84), rearranging (88), using the boundedness assumption of the QoE S(t), and considering that Q D i,k (t), Q U m (t), and Z U n (t) are nonnegative, we have
$$\mathbb{E}\{Z_{U_n}^2(T)\} \le 2TB + 2VT\,(S_{\max} - S_{\mathrm{opt}}) + 2\,\mathbb{E}\{L(G(0))\}, \quad \forall\, 1 \le n \le N. \tag{92}$$
Because the variance of Z U n (t) is non-negative and can be written as $\mathbb{E}\{Z_{U_n}^2(t)\} - \mathbb{E}^2\{|Z_{U_n}(t)|\}$, we have $\mathbb{E}\{Z_{U_n}^2(t)\} \ge \mathbb{E}^2\{|Z_{U_n}(t)|\}$. Thus, for all slots t, we obtain inequality (93). Dividing it by T and taking the limit T → ∞, we can prove that $\lim_{T\to\infty} \mathbb{E}\{|Z_{U_n}(T)|\}/T = 0$. For the BA result in (85), (92) can be obtained by rearranging the terms in (91) and considering that Q D i,k (t), Q U m (t), Z U n (t), and τ have positive values. Hence, the queue Z U n (T) is mean rate stable by definition (17), and thus constraint (26b) is satisfied based on Lemma 1. This is also applicable to Q D i,k (t) and Q U m (t).
(b) Combining (88) and (91), we obtain (94). Note that ζ D = ζ U = , and ζ Z = 0 for the BA results in (84) using the boundedness assumptions, whereas ζ D = τ , ζ U = Nτ , and ζ Z = K U τ for the BA result in (85) using Lemma 4. Inequality (95) can be obtained after dividing (94) by TV and considering that Q D i,k (t), Q U m (t), Z U n (t), and E{L(G(T))} have nonnegative values. The optimality of QoE in (b) is then proved by taking the limit of this inequality as T → ∞.
(c) Rearranging the terms in (94) and considering that Z U n (t) ≥ 0 and E{L(G(T))} ≥ 0 gives (96). The result of (71) can be achieved by dividing (96) by the smaller value of ζ D and ζ U . Without loss of generality, we assume that ζ D ≤ ζ U . After dividing (96) by Tζ D and taking T → ∞, the claimed bound is obtained, and the proof of (c) is completed considering that ζ U /ζ D ≥ 1. Note that ζ in (71) represents the minimum value of ζ D and ζ U .