On the Optimality of the Stationary Solution of Secrecy Rate Maximization for MIMO Wiretap Channel

To achieve perfect secrecy in a multiple-input multiple-output (MIMO) Gaussian wiretap channel (WTC), we need to find its secrecy capacity and optimal signaling, which involves solving a difference-of-convex-functions program known to be non-convex in the non-degraded case. A class of existing solutions has been developed to deal with this, but standard convergence analysis guarantees only local optimality. Interestingly, our extensive numerical experiments have shown that these local optimization methods indeed achieve global optimality. In this paper, we provide an analytical proof of this observation. To this end, we show that the Karush-Kuhn-Tucker (KKT) conditions of the secrecy rate maximization problem admit a unique solution for both degraded and non-degraded cases. Motivated by this, we also propose a low-complexity algorithm to find a stationary point. Numerical results are presented to verify the theoretical analysis.

antenna receiver and a multi-antenna eavesdropper (MISOME) system. The problem of computing the secrecy capacity for the case of a multi-antenna transmitter, a multi-antenna receiver and a multi-antenna eavesdropper (MIMOME) was studied in [5], where the authors showed that Gaussian signaling indeed achieves the secrecy capacity and also derived the optimal covariance structure. Independently, the secrecy capacity of the MIMO WTC was also studied in [6]. There is a rich literature on analytical and numerical solutions for finding the secrecy capacity of the MIMO WTC [7]-[14].
It is well known that the secrecy capacity problem for the non-degraded MIMO WTC is non-convex in its original form. Consequently, there has been a concern that algorithms based on convex optimization techniques applied to the original form can be trapped in a locally optimal solution far from the secrecy capacity. To avoid this, the equivalent convex-concave reformulation of the secrecy capacity problem has been used to find the optimal signaling for the non-degraded MIMO WTC [6], [10]. Against this background, the main contributions of this letter are as follows:
• We give a rigorous analytical proof that there exists a unique Karush-Kuhn-Tucker (KKT) point for the secrecy capacity problem for both degraded and non-degraded Gaussian MIMO WTC. This result establishes that existing local optimization methods for the secrecy problem, such as [8], indeed yield the globally optimal solution.
• Motivated by this result, we propose an accelerated gradient projection algorithm with adaptive momentum parameters that solves the secrecy capacity problem directly, rather than the equivalent convex-concave form. The proposed algorithm is provably convergent to a KKT point, and thus solves the MIMOME precoding problem globally and efficiently.
Notation: We use bold uppercase and lowercase letters to denote matrices and vectors, respectively. $\mathrm{tr}(\mathbf{X})$, $|\mathbf{X}|$ and $\|\mathbf{X}\|$ denote the trace, determinant and Frobenius norm of $\mathbf{X}$, respectively. By $\mathbf{X}_{i,j}$ we denote the $j$-th element of the $i$-th row of matrix $\mathbf{X}$. $(\cdot)^\dagger$ and $(\cdot)^T$ denote the Hermitian transpose and (ordinary) transpose, respectively. $\mathbb{C}^{M \times N}$ denotes the space of complex matrices of size $M \times N$, and $E\{\cdot\}$ is the expectation operator. $\mathrm{diag}(\mathbf{x})$ denotes the square diagonal matrix which has the elements of $\mathbf{x}$ on the main diagonal, and $[x]^+ = \max\{x, 0\}$. $\mathbf{I}$ and $\mathbf{0}$ represent identity and zero matrices, respectively. By $\mathbf{A} \succeq (\succ)\ \mathbf{B}$ we mean that $\mathbf{A} - \mathbf{B}$ is positive semidefinite (definite). The maximum eigenvalue of $\mathbf{X}$ is denoted by $\sigma_{\max}(\mathbf{X})$.

II. SYSTEM MODEL AND SECRECY CAPACITY PROBLEM
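We consider a communication system consisting of Alice as the transmitter, Bob as the legitimate receiver, and Eve as the eavesdropper. In this MIMO system, Alice wants to transmit information to Bob in the presence of Eve, where Alice is equipped with $N_t$ transmit antennas, and Bob and Eve are equipped with $N_r$ and $N_e$ antennas, respectively. Let $\mathbf{H} \in \mathbb{C}^{N_r \times N_t}$ and $\mathbf{G} \in \mathbb{C}^{N_e \times N_t}$ denote the channel matrices corresponding to Bob and Eve, respectively. The received signals at the legitimate receiver and at the eavesdropper can be expressed as
$$\mathbf{y}_b = \mathbf{H}\mathbf{q} + \mathbf{z}_b, \qquad \mathbf{y}_e = \mathbf{G}\mathbf{q} + \mathbf{z}_e, \tag{1}$$
where $\mathbf{q} \in \mathbb{C}^{N_t \times 1}$ is the transmitted signal. The additive white Gaussian noise vectors at the legitimate receiver and at the eavesdropper are $\mathbf{z}_b \in \mathbb{C}^{N_r \times 1} \sim \mathcal{CN}(\mathbf{0}, \mathbf{I})$ and $\mathbf{z}_e \in \mathbb{C}^{N_e \times 1} \sim \mathcal{CN}(\mathbf{0}, \mathbf{I})$, respectively. In this paper, we assume that $\mathbf{H}$ and $\mathbf{G}$ are perfectly known at Alice and Bob. Let $\mathbf{Q} = E\{\mathbf{q}\mathbf{q}^\dagger\} \succeq \mathbf{0}$ be the input covariance matrix. Then the secrecy capacity under a sum power constraint (SPC) is expressed as [6]
$$C_s = \max_{\mathbf{Q} \in \mathcal{Q}} C_s(\mathbf{Q}), \tag{2}$$
where $\mathcal{Q} = \{\mathbf{Q} \mid \mathrm{tr}(\mathbf{Q}) \le P_T;\ \mathbf{Q} \succeq \mathbf{0}\}$, $C_s(\mathbf{Q}) = \ln|\mathbf{I} + \mathbf{H}\mathbf{Q}\mathbf{H}^\dagger| - \ln|\mathbf{I} + \mathbf{G}\mathbf{Q}\mathbf{G}^\dagger|$, and $P_T > 0$ is the total transmit power. Problem (2) is non-convex in general.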
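As a minimal numerical sketch of the objective above, the following snippet evaluates $C_s(\mathbf{Q})$ in nats for randomly drawn channels using NumPy. The function names `secrecy_rate` and `random_channel`, the antenna numbers, and the power budget are illustrative assumptions and are not taken from the paper.

```python
import numpy as np

def secrecy_rate(H, G, Q):
    """C_s(Q) = ln|I + H Q H^†| - ln|I + G Q G^†|, in nats."""
    Nr, Ne = H.shape[0], G.shape[0]
    # slogdet returns (sign, log|det|); both determinants are positive for Q >= 0
    _, logdet_bob = np.linalg.slogdet(np.eye(Nr) + H @ Q @ H.conj().T)
    _, logdet_eve = np.linalg.slogdet(np.eye(Ne) + G @ Q @ G.conj().T)
    return logdet_bob - logdet_eve

def random_channel(rows, cols, rng):
    """Matrix with i.i.d. CN(0, 1) entries (hypothetical helper)."""
    real = rng.standard_normal((rows, cols))
    imag = rng.standard_normal((rows, cols))
    return (real + 1j * imag) / np.sqrt(2)

rng = np.random.default_rng(0)
Nt, Nr, Ne, P_T = 4, 3, 2, 10.0          # assumed example dimensions and power
H = random_channel(Nr, Nt, rng)
G = random_channel(Ne, Nt, rng)

Q_iso = (P_T / Nt) * np.eye(Nt)           # feasible point: tr(Q) = P_T, Q >= 0
rate_iso = secrecy_rate(H, G, Q_iso)
rate_zero = secrecy_rate(H, G, np.zeros((Nt, Nt)))  # no transmission gives zero rate
print(f"C_s at isotropic Q: {rate_iso:.4f} nats")
```

Isotropic signaling is only a feasible reference point here; it is generally not the maximizer of (2).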

III. UNIQUENESS OF KKT POINT OF (2)
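We remark that if $\mathbf{H}^\dagger\mathbf{H} - \mathbf{G}^\dagger\mathbf{G}$ is negative semidefinite, then $C_s$ is zero. Thus, the next theorem provides a complete characterization of the uniqueness of the stationary point of (2).
Theorem 1: Suppose that $\mathbf{H}^\dagger\mathbf{H} - \mathbf{G}^\dagger\mathbf{G}$ is positive semidefinite or indefinite. Then problem (2) has a unique KKT point. Therefore, the KKT conditions are necessary and sufficient for the optimality of problem (2).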
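Proof: The Lagrangian function of problem (2) is
$$\mathcal{L}(\mathbf{Q}, \mathbf{Z}, \lambda) = \ln|\mathbf{I} + \mathbf{H}\mathbf{Q}\mathbf{H}^\dagger| - \ln|\mathbf{I} + \mathbf{G}\mathbf{Q}\mathbf{G}^\dagger| - \lambda\big(\mathrm{tr}(\mathbf{Q}) - P_T\big) + \mathrm{tr}(\mathbf{Z}\mathbf{Q}), \tag{3}$$
where $\lambda \ge 0$ and $\mathbf{Z} \succeq \mathbf{0}$ are the Lagrangian multipliers for the constraints $\mathrm{tr}(\mathbf{Q}) \le P_T$ and $\mathbf{Q} \succeq \mathbf{0}$, respectively. Note that the gradient of $C_s(\mathbf{Q})$ is
$$\nabla C_s(\mathbf{Q}) = \mathbf{H}^\dagger(\mathbf{I} + \mathbf{H}\mathbf{Q}\mathbf{H}^\dagger)^{-1}\mathbf{H} - \mathbf{G}^\dagger(\mathbf{I} + \mathbf{G}\mathbf{Q}\mathbf{G}^\dagger)^{-1}\mathbf{G}. \tag{4}$$
Thus, the KKT conditions of (2) consist of the stationarity condition
$$\nabla C_s(\mathbf{Q}) - \lambda\mathbf{I} + \mathbf{Z} = \mathbf{0}, \tag{5a}$$
together with the complementary slackness conditions $\lambda(\mathrm{tr}(\mathbf{Q}) - P_T) = 0$ and $\mathbf{Z}\mathbf{Q} = \mathbf{0}$, and the feasibility conditions $\mathrm{tr}(\mathbf{Q}) \le P_T$, $\mathbf{Q} \succeq \mathbf{0}$, $\mathbf{Z} \succeq \mathbf{0}$ and $\lambda \ge 0$. Throughout the proof we use equality (6), which is a special case of the matrix inversion lemma.

In the first part of the proof we assert that $\lambda > 0$, and thus $\mathrm{tr}(\mathbf{Q}) = P_T$, if $\mathbf{Q}$ is a KKT point. To proceed, we note that applying (6) to (5a) yields (7). Suppose to the contrary that $\lambda = 0$. Then (7) reduces to (8), which is equivalent to (9). It is clear that the right-hand side of (9) is negative semidefinite, which contradicts the assumption that $\mathbf{H}^\dagger\mathbf{H} - \mathbf{G}^\dagger\mathbf{G}$ is positive semidefinite or indefinite. Thus we conclude that $\lambda > 0$ and hence $\mathrm{tr}(\mathbf{Q}) = P_T$.

Next, we show that there is a unique solution $\mathbf{Q}$ to the KKT conditions. Suppose to the contrary that $(\mathbf{Q}_1, \mathbf{Z}_1, \lambda_1)$ and $(\mathbf{Q}_2, \mathbf{Z}_2, \lambda_2)$ are two different KKT points of (2). After introducing suitable auxiliary matrices, including $\mathbf{\Phi}_1$ and $\mathbf{\Phi}_2$, (5a) can be written for each of those two KKT points. Now, subtracting (13) from (12), we obtain (14). Our purpose in the sequel is to show that (14) is impossible if $\mathbf{Q}_1 \ne \mathbf{Q}_2$. To this end, we first express $\mathbf{\Phi}_1 - \mathbf{\Phi}_2$ as in (15). Note that in (15b) we have used the identity $\mathbf{X}^{-1} - \mathbf{Y}^{-1} = \mathbf{X}^{-1}(\mathbf{Y} - \mathbf{X})\mathbf{Y}^{-1}$ for invertible matrices $\mathbf{X}$ and $\mathbf{Y}$, and that (15c) is due to (11). Similarly, we can write (16). Combining (16) with (15c) produces (17). Substituting (17) into (14), we can rewrite the first term on the left-hand side of (14) as in (18).

Comparing (18) and (14), our next step is to bound the first two terms of the right-hand side (RHS) of (18) properly. To this end, we multiply both sides of (11a) with $\mathbf{Q}_1$ and note that $\mathbf{Q}_1\mathbf{Z}_1 = \mathbf{0}$, which leads to (19), where $\mathbf{\Gamma} = \mathbf{H}^\dagger\mathbf{H}\mathbf{Q}_1$. It is easy to see that $(\mathbf{I} + \mathbf{\Gamma})^{-1}\mathbf{\Gamma} \preceq \mathbf{I}$ and thus $\big((\mathbf{I} + \mathbf{\Gamma})^{-1}\mathbf{\Gamma} - \mathbf{I}\big)\mathbf{Z}_2\mathbf{Q}_1 \preceq \mathbf{0}$, which leads to (20). Similarly, we have (21). Adding (20) and (21) gives (22), which results in (23). It is not difficult to check that (23) holds due to (22) and the fact that the last four terms of the RHS of (23) are non-positive. Now, subtracting $\mathrm{tr}(\mathbf{Z}_2\mathbf{Q}_1 + \mathbf{Z}_1\mathbf{Q}_2)$ from both sides of (18) and using (23) produces (24). We remark that if $\mathbf{Q}_1 \ne \mathbf{Q}_2$, then $\lambda_1\lambda_2\,\mathrm{tr}\big((\mathbf{Q}_2 - \mathbf{Q}_1)(\mathbf{Q}_2 - \mathbf{Q}_1)\big) > 0$. Consequently, the inequality in (23) is strict, and so is (24), which contradicts (14) and completes the proof.

An immediate consequence of Theorem 1 is that any local optimization method is also a global optimization method for the secrecy problem. In the next section, we exploit this property to derive an efficient numerical method to solve (2).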
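The stationarity condition used in the proof relies on the gradient of $C_s(\mathbf{Q})$, which for this log-determinant objective is the standard expression $\mathbf{H}^\dagger(\mathbf{I} + \mathbf{H}\mathbf{Q}\mathbf{H}^\dagger)^{-1}\mathbf{H} - \mathbf{G}^\dagger(\mathbf{I} + \mathbf{G}\mathbf{Q}\mathbf{G}^\dagger)^{-1}\mathbf{G}$. The sketch below checks this expression numerically against a central finite difference along a random Hermitian direction. The helper names, dimensions, and step size are assumptions made for illustration only.

```python
import numpy as np

def secrecy_rate(H, G, Q):
    """C_s(Q) = ln|I + H Q H^†| - ln|I + G Q G^†|."""
    _, a = np.linalg.slogdet(np.eye(H.shape[0]) + H @ Q @ H.conj().T)
    _, b = np.linalg.slogdet(np.eye(G.shape[0]) + G @ Q @ G.conj().T)
    return a - b

def grad_secrecy_rate(H, G, Q):
    """Closed-form gradient of C_s at Q (Hermitian matrix)."""
    A = np.linalg.inv(np.eye(H.shape[0]) + H @ Q @ H.conj().T)
    B = np.linalg.inv(np.eye(G.shape[0]) + G @ Q @ G.conj().T)
    return H.conj().T @ A @ H - G.conj().T @ B @ G

rng = np.random.default_rng(1)

def crandn(m, n):
    """Hypothetical helper: i.i.d. CN(0, 1) entries."""
    return (rng.standard_normal((m, n)) + 1j * rng.standard_normal((m, n))) / np.sqrt(2)

Nt = 4
H, G = crandn(3, Nt), crandn(2, Nt)
X = crandn(Nt, Nt)
Q = X @ X.conj().T                      # a positive semidefinite test point
D = crandn(Nt, Nt)
D = (D + D.conj().T) / 2                # Hermitian perturbation direction

t = 1e-6
# Directional derivative by central differences
fd = (secrecy_rate(H, G, Q + t * D) - secrecy_rate(H, G, Q - t * D)) / (2 * t)
# Directional derivative predicted by the gradient: Re tr(grad * D)
an = np.trace(grad_secrecy_rate(H, G, Q) @ D).real
print(f"finite difference: {fd:.8f}, analytic: {an:.8f}")
```

The two numbers should agree to several digits; a mismatch would point to a sign or transpose error in the gradient.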

A. Algorithm Description
The proposed method is based on the accelerated projected gradient method for non-convex programming with adaptive momentum presented in [16]. The pseudo-code of the proposed method is provided in Algorithm 1 (Iterative algorithm for solving (2)), which is explained in detail as follows. Let $\mathbf{Y}_k$ be the current operating point. Then we take a projected gradient step to obtain the current iterate $\mathbf{Q}_k$ (cf. Line 3). Note that the notation $\Pi_{\mathcal{Q}}(\mathbf{X})$ in Line 3 denotes the projection of a given point $\mathbf{X}$ onto the feasible set $\mathcal{Q}$, i.e., $\Pi_{\mathcal{Q}}(\mathbf{X}) = \arg\min\{\|\mathbf{U} - \mathbf{X}\| \mid \mathbf{U} \in \mathcal{Q}\}$.

In contrast to [16], where a constant step size is used, we implement a backtracking line search, as done in Lines 2-7, to find a proper step size, which is adopted from [17]. For this purpose, we define a quadratic model $\mu_\beta(\mathbf{Y}; \mathbf{Q})$ of $C_s(\mathbf{Q})$ around $\mathbf{Y}$, which is a lower bound on $C_s(\mathbf{Q})$ when $\beta$ is sufficiently large [17]. In the Appendix we show that $L = \sigma^2_{\max}(\mathbf{H}^\dagger\mathbf{H}) + \sigma^2_{\max}(\mathbf{G}^\dagger\mathbf{G})$ is a Lipschitz constant of $\nabla C_s(\mathbf{Q})$. To find a proper $\beta$ in each iteration, we start from the value of $\beta$ in the previous iteration and increase it by a factor $\gamma_u > 1$ until $\mu_\beta(\mathbf{Y}_k; \mathbf{Q}_k)$ becomes a lower bound of $C_s(\mathbf{Q}_k)$. In this way, the projected gradient step always produces an improved iterate.

Next, for acceleration, we compute the extrapolated point $\mathbf{Z}_k = \mathbf{Q}_k + \alpha(\mathbf{Q}_k - \mathbf{Q}_{k-1})$, where $\alpha$ is called the momentum parameter (cf. Line 9). For convex optimization, the momentum parameter is fixed. However, since the objective in (2) is non-convex, the extrapolation can be bad, and thus $\alpha$ needs to be adapted in accordance with the extrapolated point. To this end, a monitoring process needs to be considered [16]. Specifically, if the extrapolation reduces the current objective (i.e., a bad extrapolation), then the current iterate is taken as the operating point for the next iteration and $\alpha$ is reduced by a factor $\xi$ (cf. Line 13). Otherwise, the extrapolated point is taken as the operating point for the next iteration and $\alpha$ is increased by a factor $1/\xi$ (cf. Line 11).
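Algorithm 1 stops when the increase in the objective over the last iteration is less than a small predetermined threshold. Algorithm 1 is very simple to implement because $\nabla C_s(\mathbf{Q})$ is given in closed form in (4) and the projection of a given point $\mathbf{X}$ onto $\mathcal{Q}$, $\Pi_{\mathcal{Q}}(\mathbf{X})$, admits a water-filling-like solution [18] of the form $\Pi_{\mathcal{Q}}(\mathbf{X}) = \mathbf{U}\,\mathrm{diag}\big([\mathbf{x} - c]^+\big)\mathbf{U}^\dagger$, where $\mathbf{X} = \mathbf{U}\,\mathrm{diag}(\mathbf{x})\mathbf{U}^\dagger$ is the eigenvalue decomposition of $\mathbf{X}$, $[\cdot]^+$ is applied elementwise, and $c$ is the root of a scalar equation determined by the power constraint.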
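One way to compute the projection $\Pi_{\mathcal{Q}}(\mathbf{X})$ is sketched below: the eigenvalues of $\mathbf{X}$ are clipped at zero if that already satisfies the power constraint, and are otherwise shifted by a water level $c > 0$ chosen so that the clipped eigenvalues sum to $P_T$. The sorting-based search for $c$ and the name `project_onto_Q` are assumptions of this sketch and are not necessarily the procedure of [18].

```python
import numpy as np

def project_onto_Q(X, P_T):
    """Frobenius-norm projection of a Hermitian X onto {Q >= 0, tr(Q) <= P_T}."""
    X = (X + X.conj().T) / 2                 # enforce Hermitian symmetry numerically
    x, U = np.linalg.eigh(X)                 # X = U diag(x) U^†, x ascending
    if np.maximum(x, 0.0).sum() <= P_T:
        c = 0.0                              # power constraint inactive: just clip
    else:
        # Find c with sum_i max(x_i - c, 0) = P_T via the sorted eigenvalues
        xs = np.sort(x)[::-1]
        cand = (np.cumsum(xs) - P_T) / np.arange(1, len(xs) + 1)
        k = np.nonzero(xs > cand)[0][-1]     # last eigenvalue staying above the level
        c = cand[k]
    q = np.maximum(x - c, 0.0)
    return (U * q) @ U.conj().T              # U diag(q) U^†

# Small worked cases with diagonal inputs
Q_a = project_onto_Q(np.diag([3.0, 1.0]), 2.0)    # level c = 1 gives diag(2, 0)
Q_b = project_onto_Q(np.diag([1.0, -1.0]), 2.0)   # only clipping is needed
Q_c = project_onto_Q(np.diag([0.5, 0.2]), 2.0)    # already feasible, unchanged

rng = np.random.default_rng(2)
A = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
Q_rand = project_onto_Q(5.0 * (A + A.conj().T), 3.0)
```

Because the feasible set depends on $\mathbf{X}$ only through its eigenvalues, the projection reduces to a vector problem, and its cost is dominated by one eigenvalue decomposition.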

B. Convergence Analysis
We now show that the iterate sequence $\{\mathbf{Q}_k\}$ returned by Algorithm 1 converges to a stationary point of (2). First, since $\nabla C_s(\mathbf{Q})$ is Lipschitz continuous, the backtracking line search terminates in a finite number of steps. More specifically, the number of line search steps at iteration $k \ge 1$ is bounded by $2 + \log_{\gamma_u}(\beta_k/\beta_{k-1})$. Now suppose that $C_s(\mathbf{Z}_k) \ge C_s(\mathbf{Q}_k)$ and $\mathbf{Z}_k$ is feasible. Then we have $\mathbf{Y}_{k+1} = \mathbf{Z}_k$ (cf. Line 11). The projection at iteration $k + 1$ can be explicitly written as in (28). Note that (28c) implies (29). It follows from the condition in Line 7 that (30) holds, which, by using (29), yields (31). Similarly, if $\mathbf{Y}_{k+1} = \mathbf{Q}_k$, then we can also prove that $C_s(\mathbf{Q}_{k+1}) \ge C_s(\mathbf{Q}_k)$ following the same procedure. Note that the inequality is strict if $\mathbf{Q}_{k+1} \ne \mathbf{Q}_k$. Since the feasible set is compact and convex, we can conclude that the objective sequence $\{C_s(\mathbf{Q}_k)\}$ is convergent and that there exists a subsequence of $\{\mathbf{Q}_k\}$ converging to a limit point $\mathbf{Q}^*$. The proof that $\mathbf{Q}^*$ is a stationary point of (2) is standard [16] and is thus omitted here for the sake of brevity.
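The monotone increase of the objective can be observed with a simplified sketch of a projected gradient method with backtracking and adaptive momentum. This is not a transcription of Algorithm 1: the quadratic model $C_s(\mathbf{Y}) + \mathrm{Re}\,\mathrm{tr}(\nabla C_s(\mathbf{Y})(\mathbf{Q} - \mathbf{Y})) - \tfrac{\beta}{2}\|\mathbf{Q} - \mathbf{Y}\|^2$, the cap on $\alpha$, the initial values, and all function names are assumptions of this illustration.

```python
import numpy as np

def herm(A):
    return A.conj().T

def rate(H, G, Q):
    _, a = np.linalg.slogdet(np.eye(H.shape[0]) + H @ Q @ herm(H))
    _, b = np.linalg.slogdet(np.eye(G.shape[0]) + G @ Q @ herm(G))
    return a - b

def grad(H, G, Q):
    A = np.linalg.inv(np.eye(H.shape[0]) + H @ Q @ herm(H))
    B = np.linalg.inv(np.eye(G.shape[0]) + G @ Q @ herm(G))
    return herm(H) @ A @ H - herm(G) @ B @ G

def project(X, P_T):
    """Projection onto {Q >= 0, tr(Q) <= P_T} via eigenvalue water-filling."""
    x, U = np.linalg.eigh((X + herm(X)) / 2)
    c = 0.0
    if np.maximum(x, 0).sum() > P_T:
        xs = np.sort(x)[::-1]
        cand = (np.cumsum(xs) - P_T) / np.arange(1, len(xs) + 1)
        c = cand[np.nonzero(xs > cand)[0][-1]]
    return (U * np.maximum(x - c, 0)) @ herm(U)

def feasible(X, P_T, tol=1e-12):
    eig_min = np.linalg.eigvalsh((X + herm(X)) / 2).min()
    return np.trace(X).real <= P_T + tol and eig_min >= -tol

def apg(H, G, P_T, beta=1.0, gamma_u=2.0, alpha=0.5, xi=0.5, eps=1e-9, max_iter=500):
    Nt = H.shape[1]
    Q_prev = Y = (P_T / Nt) * np.eye(Nt, dtype=complex)   # feasible starting point
    history = [rate(H, G, Q_prev)]
    for _ in range(max_iter):
        fY, gY = rate(H, G, Y), grad(H, G, Y)
        while True:                        # backtracking: grow beta until model is a lower bound
            Q = project(Y + gY / beta, P_T)
            d = Q - Y
            model = fY + np.trace(gY @ d).real - 0.5 * beta * np.linalg.norm(d) ** 2
            if rate(H, G, Q) >= model - 1e-12:
                break
            beta *= gamma_u
        history.append(rate(H, G, Q))
        if abs(history[-1] - history[-2]) < eps:
            break
        Z = Q + alpha * (Q - Q_prev)       # extrapolated point
        if feasible(Z, P_T) and rate(H, G, Z) >= history[-1]:
            Y, alpha = Z, min(alpha / xi, 1.0)   # good extrapolation: accept, raise momentum
        else:
            Y, alpha = Q, alpha * xi             # bad extrapolation: fall back, lower momentum
        Q_prev = Q
    return Q, history

rng = np.random.default_rng(4)
crandn = lambda m, n: (rng.standard_normal((m, n)) + 1j * rng.standard_normal((m, n))) / np.sqrt(2)
H, G = crandn(4, 4), crandn(3, 4)
Q_opt, hist = apg(H, G, P_T=10.0)
print(f"iterations: {len(hist) - 1}, final secrecy rate: {hist[-1]:.6f} nats")
```

The list `hist` holds $C_s(\mathbf{Q}_k)$ per iteration and is non-decreasing up to numerical tolerance, because each operating point is feasible and the accepted step maximizes a lower-bounding quadratic model over the feasible set.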

V. NUMERICAL RESULTS
To illustrate Theorem 1 and the convergence rate of Algorithm 1, we plot the residual error (i.e., $C_s - C_s(\mathbf{Q}_k)$) for both degraded and non-degraded channels in Figs. 1(a) and 1(b), respectively. The channels $\mathbf{H}$ and $\mathbf{G}$ are generated as $\mathcal{CN}(\mathbf{0}, \mathbf{I})$. The secrecy capacity $C_s$ is found using existing optimal algorithms. More specifically, for the degraded MIMO WTC, problem (2) can be reformulated as a standard semidefinite program and thus can be optimally solved by off-the-shelf solvers such as MOSEK [19]. For the non-degraded case, we implement the barrier method [10, Alg. 3] and [13, Alg. 3], both of which can find the secrecy capacity but are based on the equivalent convex-concave reformulation. Note that Algorithm 1 is applied to problem (2) directly. As can be seen clearly in Fig. 1, Algorithm 1 achieves monotonic convergence, as proved in (31). Also, the residual error is quickly reduced to zero as the iterations proceed for both degraded and non-degraded cases. These results confirm that Algorithm 1, even when applied to the non-convex form of the secrecy capacity problem, can still compute the optimal solution, which is explained by Theorem 1.
To further demonstrate the benefit of Theorem 1 and Algorithm 1, in Fig. 2 we compare the run time of Algorithm 1 with that of existing methods as a function of the number of antennas at Eve. The simulation codes are built in MATLAB and executed on a 64-bit Windows PC with 16 GB RAM and an Intel Core-i7 3.20 GHz processor. We plot the actual run time averaged over 200 different channel realizations. The stopping criterion for all the algorithms is that the increase in the resulting objective is less than $10^{-5}$ during the last 5 iterations. We can see that our proposed algorithm outperforms other known methods, such as [13, Algorithms 1 and 3] and [10, Algorithm 3], in terms of run time.

VI. CONCLUSION
We have proved that the secrecy rate maximization problem of the general MIMOME WTC (i.e., no assumption is made on whether the channel is degraded) under a sum power constraint, despite its non-convexity, has a unique KKT solution. The proof implies that any local optimization method that aims to find a stationary solution can indeed solve the secrecy capacity problem for non-degraded MIMO wiretap channels, which is known to be non-convex. Motivated by this result, we have also presented an accelerated projected gradient method with adaptive momentum to solve the secrecy problem. Simulation results have demonstrated that the proposed algorithm finds the optimal solution with a shorter run time than the existing methods considered.

APPENDIX LIPSCHITZ CONSTANT OF ∇C s (·)
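Recall that $L > 0$ is a Lipschitz constant of $\nabla C_s(\mathbf{Q})$ on $\mathcal{Q}$ if the following inequality holds for all $\mathbf{X}, \mathbf{Y} \in \mathcal{Q}$:
$$\|\nabla C_s(\mathbf{X}) - \nabla C_s(\mathbf{Y})\| \le L\,\|\mathbf{X} - \mathbf{Y}\|.$$
We now recall the following well-known inequality:
$$\|\mathbf{A}\mathbf{B}\| \le \lambda_{\max}(\mathbf{A})\,\|\mathbf{B}\|,$$
where $\lambda_{\max}(\cdot)$ denotes the maximum singular value of the matrix in the argument. Applying the above inequality and noting that $\lambda_{\max}\big((\mathbf{I} + \mathbf{H}\mathbf{X}\mathbf{H}^\dagger)^{-1}\big) \le 1$, we have
$$\|\nabla C_s(\mathbf{X}) - \nabla C_s(\mathbf{Y})\| \le \big(\sigma^2_{\max}(\mathbf{H}^\dagger\mathbf{H}) + \sigma^2_{\max}(\mathbf{G}^\dagger\mathbf{G})\big)\,\|\mathbf{X} - \mathbf{Y}\|,$$
which means that $L = \sigma^2_{\max}(\mathbf{H}^\dagger\mathbf{H}) + \sigma^2_{\max}(\mathbf{G}^\dagger\mathbf{G})$ is a Lipschitz constant of $\nabla C_s(\cdot)$.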
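The bound can be spot-checked numerically. The sketch below draws random pairs of feasible covariance matrices and compares the ratio $\|\nabla C_s(\mathbf{X}) - \nabla C_s(\mathbf{Y})\| / \|\mathbf{X} - \mathbf{Y}\|$ with $L = \sigma^2_{\max}(\mathbf{H}^\dagger\mathbf{H}) + \sigma^2_{\max}(\mathbf{G}^\dagger\mathbf{G})$. The sampling scheme, dimensions, and helper names are assumptions for illustration; such a check can only fail to falsify the bound, it does not prove it.

```python
import numpy as np

def grad(H, G, Q):
    """Gradient of C_s(Q) = ln|I + H Q H^†| - ln|I + G Q G^†|."""
    A = np.linalg.inv(np.eye(H.shape[0]) + H @ Q @ H.conj().T)
    B = np.linalg.inv(np.eye(G.shape[0]) + G @ Q @ G.conj().T)
    return H.conj().T @ A @ H - G.conj().T @ B @ G

rng = np.random.default_rng(3)

def crandn(m, n):
    """Hypothetical helper: i.i.d. CN(0, 1) entries."""
    return (rng.standard_normal((m, n)) + 1j * rng.standard_normal((m, n))) / np.sqrt(2)

def random_feasible(n, P_T):
    """Random positive semidefinite matrix scaled so that tr(Q) = P_T."""
    X = crandn(n, n)
    Q = X @ X.conj().T
    return P_T * Q / np.trace(Q).real

Nt, P_T = 4, 5.0
H, G = crandn(3, Nt), crandn(3, Nt)

# ord=2 gives the largest singular value (spectral norm)
L = np.linalg.norm(H.conj().T @ H, 2) ** 2 + np.linalg.norm(G.conj().T @ G, 2) ** 2

worst_ratio = 0.0
for _ in range(200):
    X, Y = random_feasible(Nt, P_T), random_feasible(Nt, P_T)
    # Default matrix norm in NumPy is the Frobenius norm
    ratio = np.linalg.norm(grad(H, G, X) - grad(H, G, Y)) / np.linalg.norm(X - Y)
    worst_ratio = max(worst_ratio, ratio)

print(f"largest observed ratio: {worst_ratio:.4f}, Lipschitz bound L: {L:.4f}")
```

In practice the observed ratio is often far below $L$, which is one reason a backtracking line search can take larger steps than the fixed step size $1/L$.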