Robust Power Optimization for Downlink Cloud Radio Access Networks with Physical Layer Security

Since the cloud radio access network (C-RAN) transmits information signals by joint transmission, multiple copies of the information signals might be eavesdropped on. Therefore, this paper studies a resource allocation algorithm for secure energy optimization in a downlink C-RAN, which jointly designs the base station (BS) mode, beamforming and artificial noise (AN) given imperfect channel state information (CSI) of the information receivers (IRs) and eavesdrop receivers (ERs). The considered resource allocation design problem is formulated as a nonlinear programming problem of power minimization under the quality of service (QoS) constraint for each IR, the power constraint for each BS, and the physical layer security (PLS) constraints for each ER. To solve this non-trivial problem, we first adopt a smooth ℓ0-norm approximation and propose a general iterative difference of convex (IDC) algorithm with provable convergence for a difference of convex programming problem. Then, a three-stage algorithm is proposed to solve the original problem, which first applies the iterative difference of convex programming with the semi-definite relaxation (SDR) technique to provide an approximately sparse solution, then improves the sparsity of the solution using a deflation-based post-processing method, and finally optimizes the transmit power. The effectiveness of the proposed algorithm is validated with extensive simulations for power minimization in secure downlink C-RANs.


Introduction
The cloud radio access network (C-RAN) [1,2] has been recognized as a promising paradigm for the fifth generation (5G) wireless network [3] in reducing both capital and operating expenditures. In C-RAN, the signal processing for filtering, modulation/demodulation, and detection is moved to the baseband unit (BBU) pool or central unit (CU), and all the low-power base stations (BSs) are connected to the BBU pool through high-capacity fibre fronthaul links or microwave wireless links [2]. Although a joint transmission scheme can be adopted to boost the spectral efficiency, one particular technical challenge is the energy consumption due to the fluctuation of data traffic during day and night. Therefore, it is of great importance and interest to reduce the network power consumed by the BSs and the corresponding fronthaul links [4][5][6][7].
To achieve a power-efficient C-RAN, BS mode selection has been proved to be a promising paradigm to save energy [4][5][6][7][8]. In particular, the BSs with their corresponding fronthaul links are switched into sleep mode under low traffic conditions. However, to guarantee the quality-of-service (QoS) requirements of each information receiver (IR), the transmit power consumption of the remaining active BSs must be increased. Hence, the BS mode selection and beamforming are jointly designed to reduce the power consumption [4,6-9]. Due to the broadcast nature of wireless channels, information signals may be eavesdropped on by the eavesdrop receivers (ERs). Although traditional cryptographic techniques may [...]
In conclusion, none of the aforementioned algorithms can be applied to solve the considered problem directly. Consequently, the major contributions of this paper can be summarized as follows:
Firstly, we formulate the joint BS mode, beamforming and AN problem as an integer nonlinear non-convex programming problem, subject to the IRs' SINR requirements, transmit power constraints, and PLS constraints with CSI uncertainties. We reformulate the original problem using a sparse beamforming formulation. By approximating the non-convex ℓ0-norm using a smooth function, we propose a general iterative difference of convex (IDC) based algorithm to find a local optimal solution for the difference of convex (DC) programming problem.
Secondly, using the S-lemma in conjunction with SDR technique, we propose a three-stage algorithm to find the sparse structure of the active BSs. Specifically, the rough sparse solutions are obtained by using the IDC-based SDP algorithm in stage 1. In stage 2, we further improve the sparsity of the solution using a deflation based post processing method, and we solve a transmit power optimization problem in stage 3.
Finally, we validate the effectiveness of the proposed algorithm through extensive simulations. The results suggest that the proposed algorithm provides robust solutions and converges rapidly to a local optimal solution. The proposed algorithm achieves performance comparable to that of the exhaustive search algorithm, and it significantly outperforms the algorithm that does not consider BS mode selection.
The remainder of this paper is organized as follows. The system model and the problem formulation are presented in Section 2. In Section 3, we transform the problem into a difference of convex programming problem, and propose a general iterative difference of convex algorithm. In Section 4, by applying the SDR technique, we propose a three-stage algorithm. Simulation results of the proposed algorithm are presented in Section 5. Finally, the concluding remarks are given in Section 6. Table 1 summarizes the major symbols used for the rest of this paper.
Table 1. List of major symbols used in this paper.

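| Symbol | Description |
|---|---|
| $\mathrm{Tr}[\cdot]$ | trace of a matrix |
| $\mathbb{E}[\cdot]$ | expectation operator |
| $\mathbb{C}$ | set of complex numbers |
| $(\cdot)^T$ | transpose |
| $(\cdot)^H$ | conjugate transpose |
| $\mathbf{I}_N$ | $N \times N$ identity matrix |
| $\mathbf{0}_N$ | $N \times N$ zero matrix |
| $\mathbb{H}^N$ | set of $N \times N$ Hermitian matrices |
| $\mathcal{CN}$ | complex Gaussian distribution |
| $\lVert \mathbf{x} \rVert_0$ | ℓ0-norm |
| $\lVert \mathbf{x} \rVert_1$ | ℓ1-norm |
| $\mathcal{L}$ | BS set |
| $\mathcal{A}$ | active BS set |
| $\mathcal{Z}$ | inactive BS set |
| $\mathcal{L} \setminus \mathcal{Z}$ | set $\mathcal{L}$ excluding $\mathcal{Z}$ |
| $\boldsymbol{\Theta}_k$ | shape of the channel uncertainty ellipsoid of IR $k$ |
| $\bar{\boldsymbol{\Theta}}_m$ | shape of the channel uncertainty ellipsoid of ER $m$ |
| $\mathcal{H}_k$ | channel uncertainty region of IR $k$ |
| $\Omega_m$ | channel uncertainty region of ER $m$ |
| $P_l^{a,bs}$ | hardware power consumption of BS $l$ in the active mode |
| $P_l^{s,bs}$ | power consumption of BS $l$ in the sleep mode |
| $P_l^{c,bs}$ | power saved by switching BS $l$ into the sleep mode |
| $P^{olt}$ | constant power consumption of the OLT |
| $P_l^{a,tl}$ | power consumption of ONU $l$ in the active mode |
| $P_l^{s,tl}$ | power consumption of ONU $l$ in the sleep mode |
| $P_l^{a}$ | power consumption when both BS $l$ and its fronthaul link are in the active mode |
| $P_l^{s}$ | power consumption when both BS $l$ and its fronthaul link are in the sleep mode |

System Model and Problem Formulation
We consider a secure downlink C-RAN consisting of $L$ BSs, each equipped with $N$ antennas, where the BSs jointly serve $K$ single-antenna IRs, as illustrated in Figure 1. Denote by $\mathcal{L}$, $\mathcal{A}$ and $A = |\mathcal{A}|$ the BS set, the active BS set, and the number of active BSs, respectively, with $\mathcal{A} \subseteq \mathcal{L}$. In addition, there are $M$ single-antenna ERs in the network.
The received signals at IR $k$ ($k \in \{1, \cdots, K\}$) and ER $m$ ($m \in \{1, \cdots, M\}$) are respectively given by
$$y_{\mathrm{IR}_k} = \mathbf{h}_k^H \mathbf{x} + n_{\mathrm{IR}_k}, \qquad y_{\mathrm{ER}_m} = \mathbf{g}_m^H \mathbf{x} + n_{\mathrm{ER}_m},$$
where $\mathbf{x} \in \mathbb{C}^{NA \times 1}$ denotes the joint transmit data vector of the $A$ active BSs to the $K$ IRs. The channels from the $A$ active BSs to IR $k$ and ER $m$ are denoted by $\mathbf{h}_k \in \mathbb{C}^{NA \times 1}$ and $\mathbf{g}_m \in \mathbb{C}^{NA \times 1}$, respectively. $n_{\mathrm{IR}_k}$ and $n_{\mathrm{ER}_m}$ are the noise at IR $k$ and ER $m$, which are modeled as additive white Gaussian noise with zero mean and variances $\sigma^2_{\mathrm{IR}_k}$ and $\sigma^2_{\mathrm{ER}_m}$, respectively. The transmitted signal $a_k$ for IR $k$, with $\mathbb{E}[|a_k|^2] = 1$, is beamformed by $\mathbf{w}_k$ before transmission, so the beamformed signal for IR $k$ is $\mathbf{x}_k = \mathbf{w}_k a_k$.
The transmit signal vector $\mathbf{x}$ at the $A$ active BSs is given by
$$\mathbf{x} = \sum_{k=1}^{K} \mathbf{w}_k a_k + \mathbf{v},$$
where $\mathbf{v} = [\mathbf{v}_1^T, \cdots, \mathbf{v}_A^T]^T$ is the AN vector generated by the $A$ active BSs ($\mathbf{v}_l$ is the AN vector transmitted by BS $l$), which is modeled as a complex Gaussian random vector, i.e., $\mathbf{v} \sim \mathcal{CN}(\mathbf{0}, \mathbf{V})$, where $\mathbf{V} \in \mathbb{H}^{NA}$, $\mathbf{V} \succeq \mathbf{0}$, is the covariance matrix of $\mathbf{v}$.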

Channel Model
The channel from the $A$ BSs to the $k$-th IR is $\mathbf{h}_k = [\mathbf{h}_{k1}^T, \cdots, \mathbf{h}_{kA}^T]^T$. The downlink CSI of the BSs-to-IR channels can be obtained by measuring the uplink pilots in the handshaking or beacon signals via channel reciprocity. The BSs are able to refine the estimate of $\mathbf{h}_k$ frequently via the pilot sequences embedded in each acknowledgement packet. Nevertheless, the estimate is not exact, so we assume that the channels from the BSs to IR $k$ are imperfectly known at the BBU pool, and the imperfect channel is modeled as the sum of two parts. According to [18,19], a deterministic model for characterizing the resulting uncertain channel is adopted:
$$\mathbf{h}_k = \hat{\mathbf{h}}_k + \Delta\mathbf{h}_k,$$
where $\hat{\mathbf{h}}_k$ is the estimated channel of IR $k$, and $\Delta\mathbf{h}_k = [\Delta\mathbf{h}_{k1}^T, \cdots, \Delta\mathbf{h}_{kA}^T]^T \in \mathbb{C}^{NA \times 1}$ is the channel uncertainty, which is assumed to satisfy an elliptic model, i.e., $\Delta\mathbf{h}_k^H \boldsymbol{\Theta}_k \Delta\mathbf{h}_k \le 1$, where $\boldsymbol{\Theta}_k \in \mathbb{H}^{NA}$ is the shape of the ellipsoid with $\boldsymbol{\Theta}_k \succ \mathbf{0}$. Define the set $\mathcal{H}_k \triangleq \{\Delta\mathbf{h}_k : \Delta\mathbf{h}_k^H \boldsymbol{\Theta}_k \Delta\mathbf{h}_k \le 1\}$, which contains all possible channel uncertainties of IR $k$.
The ERs do not interact with the BSs during information transmission, and the central unit does not know the locations of the ERs. However, to facilitate the resource allocation algorithm design, we follow the existing research [15] and assume that the CSI between the BSs and the ERs is known at the central unit. As a result, we design the resource allocation algorithm assuming an unfavorable scenario (without the instantaneous CSI of the ERs being known at the central unit, stochastic geometry modeling may be an optimistic approach [23,24]; in that case the system model differs from the one in this paper, and we leave it for future work). In this paper, the same CSI model of the ERs is adopted as in [15,25].
Since the CSI of the ERs may be outdated during transmission, we use a deterministic model [17][18][19] for characterizing the resulting CSI uncertainty. The CSI model from the BSs to the ERs is
$$\mathbf{g}_m = \hat{\mathbf{g}}_m + \Delta\mathbf{g}_m,$$
where $\mathbf{g}_m = [\mathbf{g}_{m1}^T, \cdots, \mathbf{g}_{mA}^T]^T \in \mathbb{C}^{NA \times 1}$, $\hat{\mathbf{g}}_m$ is the estimated channel of ER $m$, and $\Delta\mathbf{g}_m = [\Delta\mathbf{g}_{m1}^T, \cdots, \Delta\mathbf{g}_{mA}^T]^T \in \mathbb{C}^{NA \times 1}$ is the channel uncertainty, which is assumed to satisfy the elliptic model $\Delta\mathbf{g}_m^H \bar{\boldsymbol{\Theta}}_m \Delta\mathbf{g}_m \le 1$, where $\bar{\boldsymbol{\Theta}}_m \in \mathbb{H}^{NA}$ is the shape of the ellipsoid with $\bar{\boldsymbol{\Theta}}_m \succ \mathbf{0}$. (To model the imperfect CSI of the ERs, a deterministic CSI error model [15,18,19] or a probabilistic CSI error model [20] is usually adopted. The probabilistic model investigates the statistics of the CSI; however, since it results in a highly intractable form, this paper focuses only on the deterministic CSI model.) Define the set $\Omega_m \triangleq \{\Delta\mathbf{g}_m : \Delta\mathbf{g}_m^H \bar{\boldsymbol{\Theta}}_m \Delta\mathbf{g}_m \le 1\}$, which contains all possible channel uncertainties of ER $m$.
The spherical error model is obtained when $\boldsymbol{\Theta}_k = \frac{1}{\varepsilon^2_{\mathrm{IR}_k}} \mathbf{I}$ and $\bar{\boldsymbol{\Theta}}_m = \frac{1}{\varepsilon^2_{\mathrm{ER}_m}} \mathbf{I}$, where $\varepsilon_{\mathrm{IR}_k} \ge 0$ and $\varepsilon_{\mathrm{ER}_m} \ge 0$ are the radii of the uncertainty regions. In this case, the uncertainty vectors of the IR and ER satisfy $\lVert \Delta\mathbf{h}_k \rVert_2 \le \varepsilon_{\mathrm{IR}_k}$ and $\lVert \Delta\mathbf{g}_m \rVert_2 \le \varepsilon_{\mathrm{ER}_m}$, respectively, and the CSI of both the IR and the ER becomes perfect as $\varepsilon_{\mathrm{IR}_k}$ and $\varepsilon_{\mathrm{ER}_m}$ approach zero. Unlike approaches that consider the second-order statistics of the channels [26], in this paper we consider the instantaneous channel in every coherence interval. The beamforming vectors, the AN, and the BS mode are recomputed every coherence interval, which can be viewed as an ideal or special case of the second-order statistics model.
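The elliptic and spherical uncertainty sets above can be checked numerically. The following is a minimal sketch in plain Python; the function names (`quad_form`, `spherical_shape`, `in_uncertainty_set`) and the example error vectors are illustrative assumptions, not part of the paper.

```python
def quad_form(delta, theta):
    """Return the real part of delta^H * theta * delta for a Hermitian theta."""
    n = len(delta)
    total = 0j
    for i in range(n):
        for j in range(n):
            total += delta[i].conjugate() * theta[i][j] * delta[j]
    return total.real


def spherical_shape(n, eps_sq):
    """Shape matrix (1 / eps^2) * I_n of the spherical error model."""
    return [[(1.0 / eps_sq if i == j else 0.0) for j in range(n)]
            for i in range(n)]


def in_uncertainty_set(delta, theta, tol=1e-12):
    """True if the error vector satisfies delta^H * theta * delta <= 1."""
    return quad_form(delta, theta) <= 1.0 + tol


if __name__ == "__main__":
    # eps^2 = 0.05 is the default radius used in the simulation section.
    theta = spherical_shape(2, 0.05)
    print(in_uncertainty_set([0.1 + 0.1j, 0.1 + 0j], theta))  # squared norm 0.03
    print(in_uncertainty_set([0.2 + 0.1j, 0.1j], theta))      # squared norm 0.06
```

With the spherical shape matrix, the membership test reduces to comparing the squared Euclidean norm of the error vector with the squared radius.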
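The SINRs at IR $k$ and ER $m$ are respectively
$$\mathrm{SINR}_{\mathrm{IR}_k} = \frac{|\mathbf{h}_k^H \mathbf{w}_k|^2}{\sum_{j \ne k} |\mathbf{h}_k^H \mathbf{w}_j|^2 + \mathbf{h}_k^H \mathbf{V} \mathbf{h}_k + \sigma^2_{\mathrm{IR}_k}},$$
$$\mathrm{SINR}^{k}_{\mathrm{ER}_m} = \frac{|\mathbf{g}_m^H \mathbf{w}_k|^2}{\sum_{j \ne k} |\mathbf{g}_m^H \mathbf{w}_j|^2 + \mathbf{g}_m^H \mathbf{V} \mathbf{g}_m + \sigma^2_{\mathrm{ER}_m}} \overset{(a)}{\le} \frac{|\mathbf{g}_m^H \mathbf{w}_k|^2}{\mathbf{g}_m^H \mathbf{V} \mathbf{g}_m + \sigma^2_{\mathrm{ER}_m}}, \tag{5}$$
where (a) in Equation (5) constitutes an upper bound on the received SINR at ER $m$ for decoding the information of IR $k$ [15]. The upper bound is tight when the beamformed signals of all the other IRs (except for IR $k$) are orthogonal to the channel of the $m$-th ER.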
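To make the roles of the data beams and the AN covariance concrete, the sketch below evaluates the IR SINR and the ER upper bound for given beamformers. It assumes the standard forms implied by the signal model (multi-user interference plus AN plus noise at the IR; AN plus noise only in the ER bound); the function names and the toy numbers are hypothetical.

```python
def inner(a, b):
    """Return the inner product a^H b."""
    return sum(x.conjugate() * y for x, y in zip(a, b))


def quad(a, M):
    """Return the real part of a^H M a."""
    total = 0j
    for i in range(len(a)):
        for j in range(len(a)):
            total += a[i].conjugate() * M[i][j] * a[j]
    return total.real


def sinr_ir(k, h, beams, V, noise_var):
    """SINR at IR k: desired beam over interference + AN + noise."""
    signal = abs(inner(h, beams[k])) ** 2
    interference = sum(abs(inner(h, w)) ** 2
                       for j, w in enumerate(beams) if j != k)
    return signal / (interference + quad(h, V) + noise_var)


def sinr_er_upper(k, g, beams, V, noise_var):
    """Upper bound on the SINR at an ER decoding IR k (interference dropped)."""
    return abs(inner(g, beams[k])) ** 2 / (quad(g, V) + noise_var)


if __name__ == "__main__":
    beams = [[2, 0], [0, 1]]            # two data beams, toy values
    V = [[0.5, 0.0], [0.0, 0.5]]        # AN covariance, toy values
    print(sinr_ir(0, [1, 0], beams, V, 1.0))
    print(sinr_er_upper(0, [1, 1], beams, V, 1.0))
```

Increasing the AN covariance lowers both quantities, which is why the AN must be designed jointly with the data beams rather than simply maximized.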

Power Model
According to [1,27], the network power consumption of C-RAN consists of the power consumed by the BSs and by the fronthaul links. With the baseband processing moved into the CU, a BS transceiver mainly comprises a power amplifier, a radio frequency module, a direct-current (DC)-DC power supply, an active cooling system and an alternating current (AC)-DC main supply [27]. According to [27], the power consumption $P_l^{bs}$ of BS $l$ can be modeled as a typical linear function
$$P_l^{bs} = \begin{cases} \frac{1}{\eta_l} P_l^{tx} + P_l^{a,bs}, & \text{if BS } l \text{ is active}, \\ P_l^{s,bs}, & \text{if BS } l \text{ is in sleep mode}, \end{cases}$$
where $P_l^{tx} = \sum_{k=1}^{K} \lVert \mathbf{w}_{kl} \rVert_2^2 + \lVert \mathbf{v}_l \rVert_2^2$ is the transmit power, $\eta_l$ is the power amplifier efficiency, $P_l^{a,bs}$ is the hardware power consumption of BS $l$ in the active mode, and the constant $P_l^{s,bs}$ (typically nonzero) is the power consumption of BS $l$ in the sleep mode. Since typically $P_l^{s,bs} < P_l^{a,bs}$, it is beneficial to switch BSs into the sleep mode to save energy. The static power for BS $l$ is defined as $P_l^{c,bs} = P_l^{a,bs} - P_l^{s,bs}$, which is the power saved by switching the BS from active into sleep mode. For simplicity, we assume that all the power amplifiers have the same efficiency, i.e., $\eta = \eta_l$, $\forall l$.
According to [28], optical fiber is usually adopted by C-RAN as a fronthaul network to connect the BSs and the CU using a passive optical network (PON). The fronthaul network comprises an optical line terminal (OLT) that connects one optical fibre with a set of optical network units (ONUs). One efficient way to save energy in a PON is to switch ONUs into sleep mode. However, the OLT cannot be switched into sleep mode, and it consumes constant power. Therefore, the overall power consumed by the fronthaul network is
$$P^{fronthaul} = P^{olt} + \sum_{l \in \mathcal{L}} P_l^{tl},$$
where $P^{olt}$ is the constant power consumption of the OLT, $P_l^{tl} = P_l^{a,tl}$ is the power consumption of ONU $l$ in the active mode, and $P_l^{tl} = P_l^{s,tl}$ is the power consumption of ONU $l$ in the sleep mode, with $P^{olt} = 20$ Watt, $P_l^{a,tl} = 3.85$ Watt and $P_l^{s,tl} = 0.75$ Watt. Therefore, switching ONUs into sleep mode is an efficient way to save energy on the fronthaul network.
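When both BS $l$ and its corresponding fronthaul link $l$ are in active mode, the static power consumed by BS $l$ and ONU $l$ is $P_l^{a} = P_l^{a,tl} + P_l^{a,bs}$. We define $P_l^{s} = P_l^{s,tl} + P_l^{s,bs}$ as the power consumption of BS $l$ and ONU $l$ in the sleep mode. Then, the network power consumption of C-RAN is
$$P^{total} = \sum_{l \in \mathcal{A}} \Big( \frac{1}{\eta} P_l^{tx} + P_l^{a} \Big) + \sum_{l \in \mathcal{Z}} P_l^{s} + P^{olt} = \sum_{l \in \mathcal{A}} \Big( \frac{1}{\eta} P_l^{tx} + P_l^{c} \Big) + \sum_{l \in \mathcal{L}} P_l^{s} + P^{olt}, \tag{8}$$
where $P_l^{c} = P_l^{a} - P_l^{s}$ is the power saved by switching BS $l$ and ONU $l$ into the sleep mode, and the second equality in (8) is based on the fact that $\sum_{l \in \mathcal{Z}} P_l^{s} = \sum_{l \in \mathcal{L}} P_l^{s} - \sum_{l \in \mathcal{A}} P_l^{s}$. Since this paper explores the power efficiency in secure downlink C-RAN, the constant term $\sum_{l \in \mathcal{L}} P_l^{s} + P^{olt}$ is omitted. Therefore, the network power (NP) consumption is given by
$$P(\mathcal{A}, \mathbf{w}, \mathbf{v}) = \sum_{l \in \mathcal{A}} \Big( \frac{1}{\eta} \Big( \sum_{k=1}^{K} \lVert \mathbf{w}_{kl} \rVert_2^2 + \lVert \mathbf{v}_l \rVert_2^2 \Big) + P_l^{c} \Big).$$
Note that when more BSs are switched into sleep mode, more transmit power (the sum of the data transmission power (DTP) and the AN power) is consumed by the remaining BSs in order to satisfy the QoS requirements. Therefore, the BS mode and the transmit power should be jointly designed under the given QoS constraints.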
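The bookkeeping behind the power model can be illustrated with a short Python sketch. The OLT and ONU figures are the ones quoted in the text; the BS hardware powers (2.0 W active, 1.0 W sleep), the amplifier efficiency and the transmit powers are purely hypothetical values chosen for the example.

```python
P_OLT = 20.0         # W, constant OLT power (from the text)
P_ONU_ACTIVE = 3.85  # W, ONU in active mode (from the text)
P_ONU_SLEEP = 0.75   # W, ONU in sleep mode (from the text)


def network_power(tx_power, active, eta, p_bs_active, p_bs_sleep):
    """Total power: active BSs/ONUs, sleeping BSs/ONUs, and the OLT."""
    total = P_OLT
    for l, p_tx in enumerate(tx_power):
        if l in active:
            total += p_tx / eta + p_bs_active + P_ONU_ACTIVE
        else:
            total += p_bs_sleep + P_ONU_SLEEP
    return total


def reduced_network_power(tx_power, active, eta, p_bs_active, p_bs_sleep):
    """Network power with the constant term (sleep powers + OLT) omitted."""
    p_c = (p_bs_active + P_ONU_ACTIVE) - (p_bs_sleep + P_ONU_SLEEP)
    return sum(tx_power[l] / eta + p_c for l in active)


if __name__ == "__main__":
    tx = [0.5, 0.0, 0.2]   # hypothetical transmit powers in W
    active = {0, 2}        # BS 1 (index 1) sleeps
    full = network_power(tx, active, 0.25, 2.0, 1.0)
    reduced = reduced_network_power(tx, active, 0.25, 2.0, 1.0)
    constant = len(tx) * (1.0 + P_ONU_SLEEP) + P_OLT
    print(full, reduced, constant)
```

The difference between the two functions is independent of the active set, which is why dropping the constant term does not change the optimal BS mode or beamformers.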

Problem Formulation
To alleviate the impact of limited-capacity fronthaul links on performance, fronthaul compression strategies have been introduced in C-RAN. By introducing compression noise (also called quantization noise), the fronthaul capacity requirement is largely reduced. On the other hand, AN is introduced to interfere with the ERs and thus provide PLS. According to [29], compression noise on the fronthaul links can be viewed as AN that provides PLS for C-RAN, since the two have the same mathematical form. Therefore, adding the fronthaul capacity constraints does not affect the convex form of the considered problem, and the solutions proposed in this paper can be directly extended to solve the problem with fronthaul compression under limited fronthaul capacity constraints.
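We formulate the joint optimization problem of BS mode, AN and beamforming, subject to the IRs' SINR requirements, transmit power constraints, and PLS constraints with CSI uncertainty, as a nonlinear non-convex programming problem, given by
$$\mathcal{P}_0: \min_{\mathcal{A}, \mathbf{w}, \mathbf{v}} \; \sum_{l \in \mathcal{A}} \Big( \frac{1}{\eta} \Big( \sum_{k=1}^{K} \lVert \mathbf{w}_{kl} \rVert_2^2 + \lVert \mathbf{v}_l \rVert_2^2 \Big) + P_l^{c} \Big) \tag{10a}$$
$$\text{s.t.} \quad \min_{\Delta\mathbf{h}_k \in \mathcal{H}_k} \mathrm{SINR}_{\mathrm{IR}_k} \ge \gamma_k, \; \forall k, \tag{10b}$$
$$\max_{\Delta\mathbf{g}_m \in \Omega_m} \mathrm{SINR}^{k}_{\mathrm{ER}_m} \le \Gamma_m^k, \; \forall k, m, \tag{10c}$$
$$\sum_{k=1}^{K} \lVert \mathbf{w}_{kl} \rVert_2^2 + \lVert \mathbf{v}_l \rVert_2^2 \le P_l^{\max}, \; \forall l, \tag{10d}$$
where $\mathbf{w} = [\mathbf{w}_1, \cdots, \mathbf{w}_K]$. Here $\gamma_k$ is the SINR threshold of IR $k$, and constraints (10b) indicate that each IR has a minimum SINR requirement over the channel uncertainty set $\mathcal{H}_k$. $\Gamma_m^k$ is the SINR threshold of ER $m$ for IR $k$, and constraints (10c) indicate that each ER has a maximum tolerable SINR over the channel uncertainty set $\Omega_m$. Constraint (10d) is the power constraint for each BS with maximum transmit power $P_l^{\max}$. It is also of interest to investigate secrecy data rate maximization subject to the same constraints as problem $\mathcal{P}_0$, which we leave for future work.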
The challenges of solving problem P0 arise from: (i) the combinatorial objective function for BS selection; (ii) the non-convex quadratic constraints (10b); and (iii) the infinite number of non-convex PLS constraints (10c). Even with convex constraints, the optimal solution of problem P0 can only be obtained by an exhaustive search over the BS modes, whose computational complexity increases exponentially with the network size. Therefore, this paper proposes a low-complexity algorithm based on a DC procedure to find a local optimal solution to problem P0.

DC-Based Sparse Beamforming Design
In this section, we first reformulate the above problem into a sparse beamforming one and convert the transformed problem into a DC program by using a smooth ℓ0-norm approximation. Then, we propose a DC-based algorithm to find a local optimal solution.

Problem Reformulation
In this subsection, we rewrite the combinatorial composite problem P0 in a sparse beamforming form and approximate the ℓ0-norm using a smooth function.

Sparse Beamforming Problem
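It is observed that the mode of the BSs can be specified by the beamforming $\mathbf{w}$ and the AN $\mathbf{v}$. In particular, when $\sum_{k=1}^{K} \lVert \mathbf{w}_{kl} \rVert_2^2 = 0$ and $\lVert \mathbf{v}_l \rVert_2^2 = 0$, BS $l$ is switched into sleep mode; otherwise the BS should be active. By introducing a non-negative auxiliary variable $s_l = \sum_{k=1}^{K} \lVert \mathbf{w}_{kl} \rVert_2^2 + \lVert \mathbf{v}_l \rVert_2^2$, the network power consumption can be written as
$$\sum_{l \in \mathcal{L}} \Big( \frac{1}{\eta} s_l + P_l^{c} f(s_l) \Big), \qquad f(s_l) = \begin{cases} 1, & s_l > 0, \\ 0, & s_l = 0, \end{cases}$$
where $s_l$ can be viewed as the soft transmit power of BS $l$.
Since the ℓ0-norm counts the number of nonzero elements of a vector, the indicator function $f(s_l)$ can be replaced by the ℓ0-norm of $s_l$ without loss of optimality. Therefore, the original optimization problem P0 is equivalently rewritten as
$$\mathcal{P}_1: \min_{\mathbf{w}, \mathbf{v}, \mathbf{s}} \; \sum_{l \in \mathcal{L}} \Big( \frac{1}{\eta} s_l + P_l^{c} \lVert s_l \rVert_0 \Big) \quad \text{s.t.} \; (10\mathrm{b}), (10\mathrm{c}), \; s_l = \sum_{k=1}^{K} \lVert \mathbf{w}_{kl} \rVert_2^2 + \lVert \mathbf{v}_l \rVert_2^2 \le P_l^{\max}, \; \forall l,$$
where $\mathbf{s} = [s_1, \cdots, s_L]^T$ and the sparsity of the BS mode is controlled by $\mathbf{s}$. Note that if the parameters $\gamma_k$, $\Gamma_m^k$ and $P_l^{\max}$ are poorly chosen, problem P1 may become infeasible. In that case, user admission control or SINR relaxation should be applied, which is beyond the scope of this paper.
Although we have transformed problem P0 into the sparse beamforming problem P1, it is still challenging due to the non-convex ℓ0-norm in the objective, the non-convex quadratic QoS constraints and the infinite number of non-convex PLS constraints.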

Smooth ℓ0-Norm Approximation
To address the non-convex discontinuous ℓ0-norm in the objective, we employ a general smooth function, denoted by $f(x)$, which should satisfy three properties. Specifically, the logarithmic function, the exponential function, and the arctangent function are frequently adopted to approximate the non-convex ℓ0-norm [30,31], as given in (13), where $\tau > 0$ is used to control the smoothness of the approximation. A larger $\tau$ yields a smoother function but a worse approximation of the ℓ0-norm, and vice versa.
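The trade-off controlled by τ can be seen numerically. In the sketch below, the logarithmic form is the normalized one used later in the simulation section, while the exponential and arctangent forms are common textbook choices assumed here for illustration; they may differ in scaling from the expressions in (13).

```python
import math


def f_log(x, tau):
    """Logarithmic surrogate, normalized so that f_log(1, tau) = 1."""
    return math.log(1.0 + x / tau) / math.log(1.0 + 1.0 / tau)


def f_exp(x, tau):
    """Exponential surrogate (assumed form): 1 - exp(-x / tau)."""
    return 1.0 - math.exp(-x / tau)


def f_atan(x, tau):
    """Arctangent surrogate (assumed form), scaled into [0, 1)."""
    return (2.0 / math.pi) * math.atan(x / tau)


def l0_indicator(x):
    """Exact l0 'norm' of a non-negative scalar."""
    return 0.0 if x == 0 else 1.0


if __name__ == "__main__":
    x = 0.01  # a small but nonzero soft transmit power
    for tau in (1.0, 1e-2, 1e-4):
        print(tau, f_log(x, tau), f_exp(x, tau), f_atan(x, tau))
```

All three surrogates are zero at x = 0, concave and increasing for x ≥ 0, and move towards the indicator value 1 at a fixed x > 0 as τ shrinks.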
Then, by replacing the non-convex ℓ0-norm in the objective with the smooth function, problem P1 can be approximated as
$$\mathcal{P}_2: \min_{\mathbf{w}, \mathbf{v}, \mathbf{s}} \; \sum_{l \in \mathcal{L}} \Big( \frac{1}{\eta} s_l + P_l^{c} f_\tau(s_l) \Big), \quad \text{subject to the constraints of problem } \mathcal{P}_1.$$
Since the first term in the objective is convex (linear) in $s_l$ and the smooth function $f_\tau(s_l)$ is concave in $s_l$, the objective function has the form "a convex function + a concave function", i.e., it is the difference between two convex functions. If the constraints in problem P2 can also be rewritten as differences between two convex functions or be transformed into convex ones, problem P2 is a general DC programming problem. Therefore, an IDC procedure can be adopted to find a local optimal solution of problem P2.

Generalized IDC Procedure
In this subsection, the IDC algorithm is proposed to solve problem P 2 .

IDC Procedure
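The main idea of the IDC algorithm is to convexify the concave parts by their first-order Taylor expansions, and then solve a sequence of convex problems successively. In particular, the concave term $f_\tau(s_l)$ in the objective is linearized as
$$f_\tau(s_l) \approx f_\tau\big(s_l^{(t-1)}\big) + f_\tau'\big(s_l^{(t-1)}\big)\big(s_l - s_l^{(t-1)}\big),$$
where $t \ge 1$ is the iteration number, and $f_\tau'(s_l^{(t-1)})$ is the first-order derivative of $f_\tau$ at $s_l^{(t-1)}$. Therefore, with the knowledge obtained from iteration $(t-1)$ and after dropping the constant terms, the following problem is solved at the $t$-th iteration:
$$\mathcal{P}_3: \min_{\mathbf{w}, \mathbf{v}, \mathbf{s}} \; \sum_{l \in \mathcal{L}} \beta_l s_l, \quad \text{subject to the constraints of problem } \mathcal{P}_2,$$
where the superscript $(t)$ is dropped, and $\beta_l = \frac{1}{\eta} + P_l^{c} f_\tau'(s_l^{(t-1)})$. With $\boldsymbol{\beta} = [\beta_1, \cdots, \beta_L]^T$ updated in every iteration, problem P2 is solved iteratively through a sequence of problems P3. Note that $f_\tau(s_l)$ is concave, so its linearization is a global upper bound and the objective value of problem P2 does not increase from one iteration to the next. Therefore, the IDC algorithm is guaranteed to converge to a local optimal solution of problem P2, and the convergence proof can be found in [32,33]. The main challenge of the IDC algorithm is now to solve problem P3. In the next section, the SDR technique with the S-lemma is applied to solve problem P3 efficiently.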
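The linearize-and-resolve loop can be sketched on a toy problem. In the following Python sketch, a single linear coverage constraint (the sum of a_l s_l must reach b) stands in for the SINR and PLS constraints, so each linearized subproblem is a small linear program that a greedy fill solves exactly; in the paper the subproblem is an SDP instead. The logarithmic surrogate is the one used in the simulation section, and all numbers and function names are hypothetical.

```python
import math


def f_tau(s, tau):
    """Logarithmic smooth surrogate of the l0-norm."""
    return math.log(1.0 + s / tau) / math.log(1.0 + 1.0 / tau)


def f_tau_grad(s, tau):
    """Derivative of f_tau with respect to s."""
    return 1.0 / ((tau + s) * math.log(1.0 + 1.0 / tau))


def objective(s, eta, p_c, tau):
    """Smoothed network power: sum of s_l / eta + p_c * f_tau(s_l)."""
    return sum(x / eta + p_c * f_tau(x, tau) for x in s)


def solve_weighted_lp(beta, a, b, s_max):
    """min sum(beta_l s_l) s.t. sum(a_l s_l) >= b, 0 <= s_l <= s_max.

    With positive beta and a, filling BSs in increasing order of
    beta_l / a_l is optimal (fractional covering knapsack).
    """
    order = sorted(range(len(a)), key=lambda l: beta[l] / a[l])
    s = [0.0] * len(a)
    remaining = b
    for l in order:
        if remaining <= 1e-12:
            break
        s[l] = min(s_max, remaining / a[l])
        remaining -= a[l] * s[l]
    return s


def idc(a, b, s_max, eta, p_c, tau, max_iter=20, tol=1e-9):
    """Iteratively linearize the concave term and re-solve."""
    s = [s_max] * len(a)  # feasible start: every BS at full power
    history = [objective(s, eta, p_c, tau)]
    for _ in range(max_iter):
        beta = [1.0 / eta + p_c * f_tau_grad(x, tau) for x in s]
        s = solve_weighted_lp(beta, a, b, s_max)
        history.append(objective(s, eta, p_c, tau))
        if history[-2] - history[-1] < tol:
            break
    return s, history


if __name__ == "__main__":
    s, history = idc([1.0, 0.8, 0.3, 0.2], 1.0, 1.0, 0.25, 5.0, 0.1)
    print(s, history)
```

BSs whose soft power is driven to zero receive a large weight in the next iteration, which keeps them switched off; this is the reweighting effect that produces a sparse active set.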

Updating Rule of τ
As mentioned in the above subsection, the approximation performance of $f_\tau(x)$ depends on the smoothing factor $\tau$. Clearly, when $x$ is small, $\tau$ should be small so that $f_\tau(x)$ approximates the ℓ0-norm well, and $\tau$ should be large when $x$ is large. As shown in [31], when $\tau$ is chosen to maximize the gradient of the approximation function, the three smooth functions in (13) have almost the same performance. Similar to the updating rule of $\tau$ in [31], we set a large value of $\tau$ for initialization, and decrease $\tau$ by a given factor $\varsigma \in (0, 1)$, i.e., $\tau \leftarrow \varsigma\tau$. The value of $\tau$ is iteratively updated until it is sufficiently small.
It is noted that this general IDC algorithm can be applied to solve the unconstrained or linearly constrained problems [34]. Indeed, this algorithm is suitable for solving the general DC programming problems [35,36], for instance, a DC objective function with convex constraints or a convex objective function with DC constraints. However, the difficulties for the considered problem mainly come from the CSI uncertainty and PLS. Even with S-lemma and semi-definite relaxation technique, the rank one solutions still need to be investigated. In the next section, we will propose a three-stage algorithm by applying the IDC framework in conjunction with constraints transformations to solve the considered problem.

Proposed Optimization Algorithm
In this section, we develop a three-stage low-complexity algorithm with the SDR technique. Specifically, in stage 1, a rough sparse solution for the BS mode is obtained by applying the IDC-based SDP algorithm. In stage 2, a post-processing procedure with a newly defined incentive metric is proposed to further improve the sparse structure of the solution, followed by the optimization of the transmit power in stage 3.

Stage 1: IDC-Based SDP Algorithm
The challenge of the IDC procedure for solving problem P2 is the non-convexity of problem P3. This non-convexity comes from the non-convex quadratic SINR constraints (10b) and the infinite number of PLS constraints (10c), which make problem P3 formidable to solve. Therefore, our focus is to transform constraints (10b) and (10c) into tractable ones, and then recast problem P3 as an SDP.
To deal with the non-convex constraints (10b), we first transform them into the equivalent form (17). Letting $\mathbf{W}_k = \mathbf{w}_k \mathbf{w}_k^H$, (17) can be rewritten as (18). Constraint (18) represents an infinite number of constraints because of the uncertainty in $\mathbf{h}_k$. To handle this, we transform (18) into a linear matrix inequality (LMI) using the S-lemma [21]. Specifically, constraint (18) holds for every $\Delta\mathbf{h}_k$ satisfying $\Delta\mathbf{h}_k^H \boldsymbol{\Theta}_k \Delta\mathbf{h}_k - 1 \le 0$ if and only if there exists $x_k \ge 0$, $\forall k$, such that the LMI constraints (19) hold, where $\mathbf{x} = [x_1, \cdots, x_K]$ collects the auxiliary variables introduced for the IRs. Therefore, constraint (19) is a semi-definite constraint. For constraints (10c), we first transform them into the form (20). Following a similar transformation using the S-lemma, (20) holds for every $\Delta\mathbf{g}_m \in \Omega_m$ if and only if there exist $\alpha_m^k \ge 0$ such that the LMI constraints (22) hold. Hence, by directly dropping the rank-one constraints $\mathrm{rank}(\mathbf{W}_k) = 1$, $\forall k$, and $\mathrm{rank}(\mathbf{V}) = 1$, problem P3 can be relaxed to problem P4, whose constraints include (19) and (22). Since problem P4 is an SDP problem, we can solve it using standard convex optimization software, such as CVX [37]. The following proposition shows that the solution $s_l$ of problem P4 cannot be equal to zero, which means that the sparse structure of the active BSs cannot be obtained from a single solve of problem P4.
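As a worked illustration of the S-lemma step, the following derivation states the lemma in its standard form and applies it to a worst-case IR SINR constraint. The SINR expression, the estimate $\hat{\mathbf{h}}_k$, and the auxiliary matrix $\mathbf{Q}_k$ are assumptions introduced here for illustration, so the resulting LMI is one plausible form of (19) rather than necessarily the exact expression of the paper.

```latex
% S-lemma (standard form). Let
%   f_i(x) = x^H A_i x + 2 Re{ b_i^H x } + c_i ,  i = 1, 2,
% with Hermitian A_i, and suppose f_1(\hat{x}) < 0 for some \hat{x}. Then
%   f_1(x) <= 0  ==>  f_2(x) <= 0
% holds if and only if there exists \lambda >= 0 such that
\lambda
\begin{bmatrix} \mathbf{A}_1 & \mathbf{b}_1 \\ \mathbf{b}_1^H & c_1 \end{bmatrix}
-
\begin{bmatrix} \mathbf{A}_2 & \mathbf{b}_2 \\ \mathbf{b}_2^H & c_2 \end{bmatrix}
\succeq \mathbf{0}.

% Application (assumed SINR form). With W_k = w_k w_k^H and
%   Q_k = W_k / \gamma_k - \sum_{j \ne k} W_j - V ,
% the worst-case SINR constraint of IR k reads
(\hat{\mathbf{h}}_k + \Delta\mathbf{h}_k)^H \mathbf{Q}_k
(\hat{\mathbf{h}}_k + \Delta\mathbf{h}_k) - \sigma^2_{\mathrm{IR}_k} \ge 0
\quad \forall\, \Delta\mathbf{h}_k :\;
\Delta\mathbf{h}_k^H \boldsymbol{\Theta}_k \Delta\mathbf{h}_k - 1 \le 0 .

% Taking f_1 = \Delta h^H \Theta_k \Delta h - 1 and f_2 = -(left-hand side above),
% i.e. A_1 = \Theta_k, b_1 = 0, c_1 = -1, A_2 = -Q_k, b_2 = -Q_k \hat{h}_k,
% c_2 = -(\hat{h}_k^H Q_k \hat{h}_k - \sigma^2), the S-lemma gives: there exists
% x_k >= 0 such that
\begin{bmatrix}
x_k \boldsymbol{\Theta}_k + \mathbf{Q}_k & \mathbf{Q}_k \hat{\mathbf{h}}_k \\
\hat{\mathbf{h}}_k^H \mathbf{Q}_k &
\hat{\mathbf{h}}_k^H \mathbf{Q}_k \hat{\mathbf{h}}_k
 - \sigma^2_{\mathrm{IR}_k} - x_k
\end{bmatrix}
\succeq \mathbf{0}.
```

The matrix inequality is linear in $\mathbf{W}_k$, $\mathbf{V}$ and $x_k$, which is what allows the infinitely many constraints over the ellipsoid to be replaced by a single semi-definite constraint per IR.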

Proposition 1.
The solution of problem P4 can be expressed as $s_l^\star = \sum_{k=1}^{K} \mathrm{Tr}(\boldsymbol{\Phi}_l \mathbf{W}_k^\star) + \mathrm{Tr}(\boldsymbol{\Phi}_l \mathbf{V}^\star)$, where $\mathbf{W}_k^\star$ and $\mathbf{V}^\star$ are respectively the optimal solutions of $\mathbf{W}_k$ and $\mathbf{V}$, and the $\mathbf{W}_k^\star$ cannot be zero matrices. The proof can be found in Appendix A.
To illustrate the result in Proposition 1, we give an experimental example in Section 5. Thus, the sparse solution of problem P4 is generated by iteratively penalizing the BSs with smaller transmit power consumption.

Proposition 2. The worst-case SINR constraints of IRs and the SINR constraints of ERs in problem P 4 should be active at the optimal solution.
Proof. Since constraints (19) and (22) are equivalent transformations of (10b) and (10c), respectively, it suffices to prove that (10b) and (10c) are active at the optimal solution. First, we prove that constraints (10b) are active at the optimal solution. If the left-hand side of (10b) were greater than $\gamma_k$, one could decrease the DTP to save power. Decreasing the DTP does not violate constraints (10c), so constraints (10b) must be active at the optimal solution, which completes the first part of the proof. Next, we prove the second part of the proposition. If $\Gamma_m^k$ were greater than the left-hand side of (10c), one could decrease the DTP or increase the AN to save power. However, with the decrease of the DTP or the increase of the AN, the left-hand side of (10b) decreases. Then (10b) no longer holds with equality, which contradicts the first part. This completes the second part of the proof.
Since the rank-one constraints are dropped in problem P4, it is important to investigate the tightness of this relaxation. If the solutions $\mathbf{W}_k^\star$ of problem P4 are rank one, i.e., $\mathrm{rank}(\mathbf{W}_k^\star) = 1$, the optimal beamforming vectors $\mathbf{w}_k^\star$ can be extracted from $\mathbf{W}_k^\star$ by eigenvalue decomposition. Otherwise, the Gaussian randomization method [38] can be employed to obtain approximate solutions.
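The extraction of a beamformer from a rank-one SDR solution can be sketched as follows. To stay dependency-free, the sketch uses power iteration in plain Python instead of a full eigenvalue decomposition (in practice a routine such as `numpy.linalg.eigh` would typically be used); the function names and the example vector are hypothetical, and the input is assumed to be Hermitian positive semi-definite.

```python
import math


def mat_vec(W, x):
    """Matrix-vector product for a matrix stored as nested lists."""
    return [sum(W[i][j] * x[j] for j in range(len(x))) for i in range(len(W))]


def principal_beamformer(W, iters=100):
    """Return (w, lam): w = sqrt(lam) * u for the dominant eigenpair (lam, u).

    Power iteration from the all-ones vector; this assumes the start vector
    is not orthogonal to the dominant eigenvector.
    """
    n = len(W)
    x = [complex(1.0, 0.0) for _ in range(n)]
    for _ in range(iters):
        y = mat_vec(W, x)
        norm = math.sqrt(sum(abs(v) ** 2 for v in y))
        if norm == 0.0:
            return [0j] * n, 0.0
        x = [v / norm for v in y]
    lam = sum(x[i].conjugate() * v for i, v in enumerate(mat_vec(W, x))).real
    return [math.sqrt(max(lam, 0.0)) * v for v in x], lam


def is_rank_one(W, lam, tol=1e-9):
    """For a PSD matrix, rank one holds iff the trace equals the top eigenvalue."""
    trace = sum(W[i][i] for i in range(len(W))).real
    return abs(trace - lam) <= tol * max(1.0, abs(trace))


if __name__ == "__main__":
    w_true = [1 + 1j, 2 - 1j, 0.5j]
    W = [[a * b.conjugate() for b in w_true] for a in w_true]
    w_rec, lam = principal_beamformer(W)
    print(lam, is_rank_one(W, lam))
```

The recovered vector equals the original beamformer only up to a common phase rotation, which does not affect any of the SINR expressions.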
Finally, the algorithm is summarized in Algorithm 1.
Although the active BSs are obtained from Algorithm 1 by checking the nonzero elements of $\mathbf{s}$, the minimum network power consumption may not be attained, since Algorithm 1 converges only to a local optimal solution and tries to switch the maximum number of BSs into sleep mode. Moreover, as mentioned in [4], the performance can be enhanced by investigating the system parameters after obtaining the beamforming and AN. Therefore, we develop a post-processing procedure to further reduce the network power consumption in the next subsection.

Stage 2: Post Processing Procedure
The IDC-based SDP algorithm tries to force the elements of $\mathbf{s}$ to zero, and the BSs with zero soft transmit power are switched into sleep mode. For instance, when $s_l = 0$, BS $l$ is switched into sleep mode. Since Algorithm 1 converges to a local optimal solution, using only $s_l$ to find the active BSs will cause a performance loss. However, the amount of soft transmit power indicates the BS activation priority, and a smaller $s_l$ indicates that BS $l$ has a lower priority to be activated. According to [4,7], a better network power performance can be obtained by exploiting the key system parameters. Hence, in this subsection, we define a new incentive metric $\Lambda_l$ to indicate the priority of activating BS $l$, in which $\sum_{k=1}^{K} \lVert \mathbf{h}_{kl} \rVert_2^2$ indicates the channel gain of BS $l$. A larger $\Lambda_l$ means that BS $l$ has a higher priority to be activated.
It should be noted that the BSs with no transmit power (i.e., $s_l = 0$) should be switched off, since the corresponding $\Lambda_l$ equals zero when $s_l = 0$. Thus, $\mathcal{Z}^{[0]} = \{l : s_l = 0\}$ denotes the rough inactive BS set. Denote by $\hat{\mathcal{A}} = \mathcal{L} \setminus \mathcal{Z}^{[0]}$ the rough active BS set obtained from Algorithm 1, and by $\hat{A} = |\hat{\mathcal{A}}|$ the rough number of active BSs. Then, all the incentive metrics are sorted in ascending order: $\Lambda_{\pi_1} \le \cdots \le \Lambda_{\pi_{\hat{A}}}$, where $\pi_i$ is the index of the $i$-th BS in this order. By fixing the active BS set $\mathcal{A}^{[i]}$ at the $i$-th step, problem P4 becomes problem P5, where $\mathcal{A}^{[i]}$ is the instantaneous active BS set at the $i$-th step. The post-processing procedure switches the BSs with the lowest incentive metric into sleep mode one by one until problem P5 becomes infeasible or the network power consumption of problem P5 at the $i$-th step is larger than the power consumption obtained at the $(i-1)$-th step. Since the BSs are switched into sleep mode one by one, problem P5 needs to be solved no more than $\hat{A}$ times. Moreover, since problem P5 has a similar form to problem P4, the rank-one conditions are also attainable, which can be easily proved.

Stage 3: Transmit Power Optimization
Once the final active BS set $\mathcal{A}^\star$ is obtained using Algorithm 2, the joint beamforming and AN design problem needs to be solved. In other words, by replacing $\mathcal{A}^{[i]}$ with $\mathcal{A}^\star$, we solve problem P6, whose constraints include (22) and (23d), where $A^\star = |\mathcal{A}^\star|$ is the number of active BSs. Problem P6 is an SDP problem which can be solved using the interior-point method [21]. As for problem P5, problem P6 has rank-one solutions. Finally, the post-processing is summarized as Algorithm 2.
Algorithm 2 Post processing procedure for finding the final active BSs.
Step 1: Update the iteration number $i = i + 1$.
Step 2: Solve problem P5 with the active BS set $\mathcal{A}^{[i]}$.
Step 3: If problem P5 is infeasible, go to Step 5; otherwise, go to Step 4.
Step 4: If $P^{[i]}_{total} \le P^{[i-1]}_{total}$, go to Step 1; otherwise, go to Step 5.
Step 5: Obtain the final active BS set $\mathcal{A}^\star$.
Step 6: Solve problem P6 and obtain the final network power consumption $P^{\mathcal{A}^\star}_{total}$, beamforming and AN (i.e., $\{\mathbf{W}_k^\star, \mathbf{V}^\star\}$).
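The deflation loop of the post-processing stage can be sketched in Python as follows. The solver for the fixed-active-set problem is abstracted as a callback `solve_power`, a hypothetical stand-in for solving problem P5 that returns the network power or `None` when the problem is infeasible; the incentive values and the toy power function are invented for the example.

```python
def post_process(incentive, solve_power):
    """Switch off BSs in ascending order of incentive metric.

    incentive: dict mapping BS index -> incentive metric (0 means s_l = 0).
    solve_power: callback taking a frozenset of active BSs and returning
        the network power, or None if the problem is infeasible.
    The initial active set is assumed to be feasible.
    """
    order = sorted((l for l, v in incentive.items() if v > 0),
                   key=lambda l: incentive[l])
    best_set = list(order)
    best_power = solve_power(frozenset(best_set))
    for l in order:
        candidate = [b for b in best_set if b != l]
        power = solve_power(frozenset(candidate))
        # Stop when infeasible or when the network power increases.
        if power is None or power > best_power:
            break
        best_set, best_power = candidate, power
    return set(best_set), best_power


def toy_solver(active):
    """Hypothetical stand-in: at least two BSs are needed to meet the QoS."""
    n = len(active)
    if n < 2:
        return None
    return 5.0 * n + 3.0 / n  # static power grows, transmit power shrinks


if __name__ == "__main__":
    metrics = {1: 0.9, 2: 0.1, 3: 0.5, 4: 0.0}
    print(post_process(metrics, toy_solver))
```

Because BSs are removed one at a time in a fixed order, the callback is invoked at most once per rough active BS plus one initial solve, in line with the bound on the number of solves stated for the post-processing stage.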

Initialization and Complexity Analysis
In this subsection, we first discuss the initial points for Algorithm 1, and then analyze the computational complexity of the proposed algorithm.

Initialization
Since Algorithm 1 needs to solve problem P 4 iteratively, it is important to set the starting point to be feasible. In this paper, we get a feasible point through solving the following initialization problem with all BSs active (i.e., let A = L).

Simulation Results and Discussion
We consider a secure downlink C-RAN with $L = 7$ two-antenna BSs, $K = 4$ single-antenna IRs and $M = 1$ single-antenna ER. One BS is located at the circle centre, and the other six BSs are located on the circle rim with a radius of 0.2 km. The IRs and the ER are uniformly distributed within the circle. The layout of the considered scenario is depicted in Figure 2. We assume that all the BSs have the same fronthaul link power ($P^c = P_l^c$, $\forall l$) and maximum transmit power ($P^{\max} = P_l^{\max}$, $\forall l$). We set $P_l^c = 5$ Watt and $P_l^{\max} = 1$ Watt for all BSs [4]. The channel model considered in this paper consists of path loss, shadowing and small-scale fading. In particular, the channel from BS $l$ to IR $k$ is
$$\hat{\mathbf{h}}_{kl} = 10^{-L(d_{kl})/20} \sqrt{\varphi_{kl} \delta_{kl}} \, \mathbf{b}_{kl},$$
where $L(d_{kl}) = 148.1 + 37.6 \log_{10} d_{kl}$ is the path loss in dB from BS $l$ to IR $k$ at a distance of $d_{kl}$ km [4,40], $\varphi_{kl} = 9$ dBi is the transmit antenna gain at each BS, $\delta_{kl}$ is the log-normal shadowing with zero mean and a standard deviation of 8 dB [4], and $\mathbf{b}_{kl} \sim \mathcal{CN}(\mathbf{0}, \mathbf{I})$ is the small-scale fading coefficient. The channels ($\mathbf{g}_m$, $\forall m$) from the BSs to the ERs are generated similarly. The channel uncertainties of the IRs and ERs are assumed to satisfy $\boldsymbol{\Theta}_k = \frac{1}{\varepsilon^2_{\mathrm{IR}}} \mathbf{I}_{NL}$, $\forall k$ [19], and $\bar{\boldsymbol{\Theta}}_m = \frac{1}{\varepsilon^2_{\mathrm{ER}}} \mathbf{I}_{NL}$, $\forall m$ [15], respectively, and unless otherwise specified $\varepsilon^2_{\mathrm{IR}} = \varepsilon^2_{\mathrm{ER}} = 0.05$. By default, $\Gamma = -10$ dB [15] and $\gamma = \gamma_k = 15$ dB. The noise power (with 10 MHz bandwidth) is $\sigma^2_{\mathrm{IR}_k} = \sigma^2_{\mathrm{ER}_m} = -104$ dBm. We use the smooth function $f(s_l) = \frac{\ln(1 + s_l \tau^{-1})}{\ln(1 + \tau^{-1})}$ [31]. Unless otherwise specified, all the data are averaged over 40 independent IR and ER locations, and the results at each location are averaged over 40 channel realizations. Three algorithms are taken into consideration for comparison.
ℓ1/ℓ2-norm algorithm [7]: this algorithm is also called the re-weighted ℓ1-norm algorithm. It approximates the ℓ0-norm in the objective function of problem P0 by the re-weighted ℓ1/ℓ2-norm, and updates the weights iteratively until the convergence condition is met. The BSs with zero transmit power consumption are switched into sleep mode. Since the ℓ1/ℓ2-norm algorithm needs to solve a series of SDP problems, its computational complexity is in the order of $\mathcal{O}\big(I_{\max} \sqrt{MKNL} \, (MK^3 (NL)^2 + MK^2 (NL)^3)\big)$, where $I_{\max}$ is the maximum iteration number.
ExSearch algorithm: this algorithm computes the network power consumption for all possible combinations of active BSs and chooses the one with the lowest power consumption. It achieves the global optimal solution, since the problem is globally solved for any fixed active BS set. Since the ExSearch algorithm searches over all possible combinations of active BSs, its computational complexity is in the order of $\mathcal{O}\big(2^L \sqrt{MKNL} \, (MK^3 (NL)^2 + MK^2 (NL)^3)\big)$.
Baseline algorithm: this algorithm assumes that all the BSs are active. It is taken for comparison in order to investigate the performance gain of BS mode selection. The ExSearch algorithm coincides with the Baseline algorithm if the active BS set $\mathcal{A}$ returned by the ExSearch algorithm is $\mathcal{L}$. The Baseline algorithm solves the SDP problem only once, and its computational complexity is in the order of $\mathcal{O}\big(\sqrt{MKNL} \, (MK^3 (NL)^2 + MK^2 (NL)^3)\big)$.
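The channel generation described in the simulation setup can be sketched in a few lines of Python. The sketch assumes that the antenna gain and the shadowing are applied as power gains in dB on top of the path loss and that each antenna sees an independent unit-variance complex Gaussian fading coefficient; the function names are hypothetical.

```python
import math
import random


def path_loss_db(d_km):
    """Path loss L(d) = 148.1 + 37.6 log10(d), with d in km."""
    return 148.1 + 37.6 * math.log10(d_km)


def dbm_to_watt(p_dbm):
    """Convert a power level from dBm to Watt."""
    return 10.0 ** ((p_dbm - 30.0) / 10.0)


def channel_coefficient(d_km, rng, antenna_gain_dbi=9.0, shadow_std_db=8.0):
    """One per-antenna channel coefficient from a BS at distance d_km."""
    shadow_db = rng.gauss(0.0, shadow_std_db)          # log-normal shadowing
    gain_db = antenna_gain_dbi + shadow_db - path_loss_db(d_km)
    amplitude = 10.0 ** (gain_db / 20.0)               # power gain -> amplitude
    fading = complex(rng.gauss(0.0, math.sqrt(0.5)),
                     rng.gauss(0.0, math.sqrt(0.5)))   # CN(0, 1)
    return amplitude * fading


if __name__ == "__main__":
    rng = random.Random(0)
    h = [channel_coefficient(0.2, rng) for _ in range(2)]  # two-antenna BS
    print(path_loss_db(0.2), dbm_to_watt(-104.0), h)
```

At the cell radius of 0.2 km the path loss is roughly 122 dB, so the received powers are only a few orders of magnitude above the -104 dBm noise floor, which explains why the SINR targets interact strongly with the number of active BSs.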

Power Distribution of the BSs
In this subsection, we investigate the power distribution of the BSs when all BSs are active, in order to illustrate Proposition 1. Letting $\mathcal{A} = \mathcal{L}$ (all BSs are active), we solve problem P5 using the CVX toolbox. Note that problem P5 is a power minimization problem without BS mode selection, so the resulting BS power distribution gives some insight into which BSs remain active once BS mode selection is taken into consideration. Since the fronthaul link power consumption is fixed, we only analyze the transmit power consumption of each BS for eight typical random channel realizations, and the results are listed in Table 2, where "Ch" is used as the abbreviation of "Channel". Table 2 shows that the transmit power of every BS is nonzero for all eight random channel realizations. In fact, we have tested the transmit power distribution for more than 1000 random channel realizations and found that the transmit power of the BSs is never exactly zero. In other words, the value of the dual variable $\beta_l$ is no less than $1/\eta$ in general. However, the transmit powers of some BSs are close to zero compared with the others. For instance, BSs 2, 3, and 4 consume less transmit power than the other four BSs for Channel 1, and BSs 2, 4, and 7 likewise consume less transmit power than the other four BSs. It is also observed from Table 2 that BS 4 is a likely candidate for sleep mode for Channel 7, since it consumes much less transmit power than the other BSs. The transmit power values therefore provide some insight into which BSs should be switched into sleep mode in the following subsections.
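To make the link between near-zero transmit powers and sleep-mode candidates concrete, the sketch below evaluates the smooth ℓ0 surrogate $f(s_l) = \ln(1 + s_l\tau^{-1}) / \ln(1 + \tau^{-1})$ from the simulation setup on a set of per-BS transmit powers. The power values and the choice $\tau = 10^{-3}$ are hypothetical placeholders, not entries of Table 2, and treating $s_l$ as the transmit power of BS $l$ in watts is an assumption made for illustration.

```python
import math

def smooth_l0(s, tau):
    """Smooth surrogate of the l0 indicator: f(0) = 0 and f(1) = 1."""
    return math.log(1.0 + s / tau) / math.log(1.0 + 1.0 / tau)

# Hypothetical per-BS transmit powers in watts (NOT taken from Table 2).
powers = [0.30, 0.004, 0.02, 0.001, 0.25, 0.40, 0.15]
tau = 1e-3  # hypothetical smoothing parameter

surrogate = [smooth_l0(p, tau) for p in powers]

# BSs ordered from the most to the least likely sleep-mode candidate
# (0-based indices, smallest surrogate value first).
ranking = sorted(range(len(powers)), key=lambda l: surrogate[l])

for l in ranking:
    print(f"BS {l + 1}: power = {powers[l]:.3f} W, f = {surrogate[l]:.3f}")
```

Because $f$ is increasing and concave, BSs whose power is already small contribute a disproportionately large share of the surrogate relative to their power, which is what pushes an iterative scheme to drive them further towards zero.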

Convergence Analysis
To demonstrate the convergence rate of the proposed SDR-based DC algorithm, we plot the objective value of problem P3 for two random channel realizations. Figure 3a shows that the proposed algorithm converges to a locally optimal solution in fewer than 15 iterations. In Figure 3b, we show the corresponding DTP and AN distribution of Algorithm 1. Both DTP and AN increase as the iterations progress, and Algorithm 1 converges when DTP and AN converge to a locally optimal solution. This is mainly because Algorithm 1 tries to switch BSs into sleep mode with the updated weights, so the transmit power of the remaining BSs must increase in order to satisfy the same QoS requirements and PLS constraints. According to the expressions of the SINR of the IRs and ERs, by increasing DTP, the SINRs of the IRs increase while the SINRs of the ERs decrease. On the other hand, the transmitted signals are steered towards the intended IRs, so only a small amount of AN is needed to interfere with the ERs. As a result, the power consumed for data transmission is larger than that consumed for AN (AN accounts for only 19.5% and 13.85% of the overall transmit power for Ch 1 and Ch 2, respectively).
To investigate further, we take the active BSs of Ch 2 (the same channel as in Figure 3) under the different algorithms as an example, as shown in Figure 4. It is observed from Figure 4 that the ℓ1/ℓ2-norm algorithm needs four active BSs to support the required SINR of the IRs and the PLS constraints. In contrast, the proposed three-stage algorithm uses only three active BSs, the same number as the ExSearch algorithm. This is mainly because the additional transmit power is smaller than the power saved by switching one more BS into sleep mode while satisfying the same constraints. These observations reveal that the BSs with lower transmit power tend to be switched into sleep mode.
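The trade-off between extra transmit power and fronthaul power saved can be checked with simple arithmetic. The sketch below assumes a network power model in which each active BS contributes its fronthaul power $P^{c} = 5$ W plus its transmit power divided by an amplifier efficiency $\eta$; the value $\eta = 0.25$ and the per-BS transmit powers are hypothetical and are not results from the paper.

```python
def network_power(transmit_powers_w, p_fronthaul_w=5.0, eta=0.25):
    """Assumed model: sum over active BSs of (P_c + p_l / eta), in watts."""
    return sum(p_fronthaul_w + p / eta for p in transmit_powers_w)

# Hypothetical per-BS transmit powers (W) for two candidate BS modes.
four_active = [0.3, 0.2, 0.2, 0.1]   # lower total transmit power, 4 BSs on
three_active = [0.5, 0.4, 0.3]       # higher total transmit power, 3 BSs on

p4 = network_power(four_active)      # 4 * 5 + 0.8 / 0.25
p3 = network_power(three_active)     # 3 * 5 + 1.2 / 0.25
saving = p4 - p3

print(f"4 active BSs: {p4:.1f} W, 3 active BSs: {p3:.1f} W, saving: {saving:.1f} W")
```

In this toy case the three-BS mode spends 0.4 W more on transmission, which costs 1.6 W after scaling by $1/\eta$, but saves 5 W of fronthaul power, so the smaller active set wins.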
The number of active BSs, the network power, and the transmit power consumption of the proposed three-stage algorithm are given in Table 3. Since all BSs are assumed to be active for the baseline algorithm, it has seven active BSs in Table 3. It can be seen from Table 3 that the proposed algorithm and the ExSearch algorithm consume less network power than the ℓ1/ℓ2-norm algorithm and the baseline algorithm.
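For reference, the exhaustive search used as the optimal benchmark can be sketched as an enumeration over all non-empty active BS sets. In the sketch below, `toy_solver` is a hypothetical stand-in for the per-set SDP: it returns per-BS transmit powers for a fixed active set, or `None` when the set is infeasible. The `QUALITY` values, the demand constant 1.5, and $\eta = 0.25$ are illustrative assumptions only.

```python
from itertools import combinations

def exhaustive_search(num_bs, solve_fixed_set, p_fronthaul_w=5.0, eta=0.25):
    """Try every non-empty active set and keep the lowest network power."""
    best_set, best_power, evaluated = None, float("inf"), 0
    for size in range(1, num_bs + 1):
        for active in combinations(range(num_bs), size):
            evaluated += 1
            powers = solve_fixed_set(active)   # per-set subproblem
            if powers is None:                 # infeasible active set
                continue
            total = sum(p_fronthaul_w + p / eta for p in powers)
            if total < best_power:
                best_set, best_power = active, total
    return best_set, best_power, evaluated

# Hypothetical per-BS link quality (NOT derived from the paper's channels).
QUALITY = [1.0, 0.2, 0.3, 0.1, 0.8, 0.9, 0.6]

def toy_solver(active, p_max_w=1.0):
    """Toy subproblem: required power shrinks as aggregate quality grows."""
    total = 1.5 / sum(QUALITY[l] for l in active)
    share = total / len(active)                # equal split across active BSs
    return None if share > p_max_w else [share] * len(active)

best_set, best_power, evaluated = exhaustive_search(7, toy_solver)
print(best_set, round(best_power, 4), evaluated)
```

With $L = 7$ the loop visits $2^7 - 1 = 127$ candidate sets, each requiring one subproblem solve, which is why the cost of this benchmark grows exponentially in the number of BSs.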

Impact of SINR Threshold of IR on Performance
In Figure 5, we evaluate the impact of the minimum required SINR (SINR threshold) of the IRs on the overall network power consumption. It can be observed that the network power of every algorithm is a monotonically nondecreasing function of $\gamma$. This is mainly because more active BSs and more transmit power are needed to support a higher required SINR at the IRs. The network power consumption of the proposed algorithm is close to that of the ExSearch algorithm, and is lower than that of the baseline and ℓ1/ℓ2-norm algorithms by 35% and 8%, respectively, in the low required-SINR region, and by about 12% and 5%, respectively, in the high required-SINR region.
Figure 6 shows the impact of the ERs' SINR threshold $\Gamma$ on the network power consumption. It is observed from Figure 6 that a smaller $\Gamma$ results in a higher overall power consumption. This is because more AN has to be generated to satisfy a smaller $\Gamma$, which in turn causes more interference. As a result, more DTP is consumed to satisfy the minimum required SINR of the IRs, resulting in a higher overall network power consumption. Moreover, the performance of the proposed algorithm is close to optimal, and it consumes around 5% less network power than the ℓ1/ℓ2-norm algorithm.

Performance of the Proposed Algorithm
In this subsection, we investigate the relationship between the network power consumption and the number of ERs under different channel errors of the IRs and ERs. All the IRs and ERs are randomly distributed in the circular region, and a single channel realization is considered for each IR and ER location. Since the proposed algorithm achieves network performance comparable to that of the ExSearch algorithm, and both perform much better than the ℓ1/ℓ2-norm algorithm and the Baseline algorithm, we only plot the proposed algorithm in this subsection. In the simulations, we assume that all the IRs and ERs have the same CSI error radius, i.e., $\varepsilon^2 = \varepsilon^2_{\mathrm{IR}} = \varepsilon^2_{\mathrm{ER}}$.
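The role of the CSI error radius can be illustrated with a bounded-error sketch. The code below assumes the usual ellipsoidal uncertainty set $\{\Delta h : \Delta h^{H} \Theta\, \Delta h \le 1\}$ with $\Theta = \frac{1}{\varepsilon^2} I_{NL}$, which reduces to the ball $\|\Delta h\|^2 \le \varepsilon^2$; this form of the set, the sampling routine, and the function names are assumptions for illustration and are not taken from the paper.

```python
import math
import random

def sample_csi_error(dim, eps_sq, rng):
    """Draw a complex error vector uniformly from the ball ||dh||^2 <= eps_sq."""
    v = [complex(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)) for _ in range(dim)]
    norm = math.sqrt(sum(abs(x) ** 2 for x in v))
    # A complex vector of length dim has 2*dim real degrees of freedom.
    radius = math.sqrt(eps_sq) * rng.random() ** (1.0 / (2 * dim))
    return [radius * x / norm for x in v]

def quadratic_form(delta, eps_sq):
    """Evaluate dh^H Theta dh for Theta = (1 / eps_sq) * I."""
    return sum(abs(x) ** 2 for x in delta) / eps_sq

rng = random.Random(1)        # illustrative seed
N, L = 2, 7                   # antennas per BS and number of BSs
eps_sq = 0.05                 # default error radius from the setup

delta_h = sample_csi_error(N * L, eps_sq, rng)
print(len(delta_h), quadratic_form(delta_h, eps_sq) <= 1.0 + 1e-9)
```

A robust design must satisfy the QoS and PLS constraints for every error in this set, not just for sampled ones, so a larger $\varepsilon^2$ enlarges the set and can only increase the required power.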

Impact of BS Antennas on Performance
First, we compare the performance of the different algorithms for different numbers of BS antennas, and the results are given in Table 4. Intuitively, the network power consumption decreases as the number of BS antennas increases: more BS antennas provide a larger diversity gain, so fewer active BSs and less transmit power are needed to guarantee the QoS requirements and PLS. Table 4 confirms that the total power consumption decreases as the number of BS antennas increases. With more BS antennas, the degrees of freedom provided by the BSs increase and the desired signals are steered more precisely towards the IRs. As a result, less AN is needed to provide PLS.
To investigate further, we study the impact of different CSI error radii of the IRs and ERs on the performance of the proposed algorithm. The results are given in Table 5. It can be seen from Table 5 that both the transmit power and the network power increase with the CSI error radius of the ERs. This is because a larger CSI error radius of the ERs degrades the information signal quality and also increases the interference for the ERs.

Network Power Consumption versus ER Number
We also investigate the impact of the number of ERs on the network power consumption under different CSI error radii. The results are shown in Figure 7. It can be seen from Figure 7 that the larger the number of ERs, the more overall network power is consumed in order to interfere with them. This is mainly because more power is needed to guarantee that the SINR of every ER is no larger than its SINR threshold. In addition, more network power is consumed as the channel error radius increases. This is because, with a higher CSI error $\varepsilon^2$, more DTP and AN, as well as more active BSs, are required to maintain the SINR thresholds of the ERs and IRs. In particular, when $\varepsilon^2 = 0$, the channels of the IRs and ERs are perfectly known at the BBU pool. In this case, the network power consumption needed to guarantee the QoS is lower than when the channels are imperfectly known.

Conclusions
This paper has investigated the power efficiency of a secure downlink C-RAN system with CSI uncertainty. The BS mode, beamforming, and AN are jointly optimized to minimize the overall network power consumption with imperfect CSI of both the IRs and ERs. With problem transformation and approximation, a general IDC algorithm is proposed to provide a locally optimal solution for the DC programming problem. A three-stage algorithm is then proposed, which combines the IDC-based SDP algorithm with a post-processing method. Specifically, a roughly sparse solution is obtained by the proposed IDC-based SDP algorithm, and the sparsity of the solution is further improved by a post-processing procedure. Numerical results show that the developed algorithm can significantly reduce the overall network power consumption. Moreover, the overall network power consumption increases with the channel error and with the number of ERs. The algorithm in this paper is developed under the assumption of imperfect CSI of the ERs; the more practical scenario in which the CSI of the ERs is unknown is left for future work.