An Efficient Manifold Algorithm for Constructive Interference Based Constant Envelope Precoding

In this letter, we propose a novel manifold-based algorithm to solve the constant envelope (CE) precoding problem with interference exploitation. For a given power budget, we design the precoded symbols subject to the CE constraints, such that the constructive effect of the multiuser interference is maximized. Since the objective function of the original problem is not complex differentiable, we consider a smooth approximation of its real representation and map it onto a Riemannian manifold. By using the Riemannian conjugate gradient algorithm, a local minimizer can be found efficiently. The complexity of the algorithm is analytically derived in terms of floating-point operations (flops) per iteration. Simulations show that the proposed algorithm outperforms the conventional methods in both symbol error rate and computational complexity.


I. INTRODUCTION
AS ONE of the most promising approaches in 5G technology, massive multi-input multi-output (mMIMO) communication systems are expected to provide significant benefits over conventional MIMO systems by employing much larger antenna arrays [1], [2]. Nevertheless, such systems face numerous challenges brought by the increasing number of antennas, e.g., higher hardware costs and power consumption, which may delay their deployment in future 5G systems. Hence, cheap and efficient RF power amplifiers (PAs) are required to make the technology realizable in practical scenarios.
It is important to note that most power-efficient PAs are built from nonlinear components; therefore, waveforms with a low peak-to-average-power ratio (PAPR) are needed to avoid signal distortion when the PA is operated in the saturation region [3]. Pioneered by Mohammed and Larsson [4], [5], constant envelope precoding (CEP) has been proposed as an enabling solution, where the multiuser interference (MUI) is minimized subject to the CE constraints. The optimization in [5] is a nonconvex nonlinear least squares (NLS) problem, and is solved by a sequential gradient descent (GD) method, which converges to a local minimum. To further improve the performance, a cross-entropy optimization (CEO) solver is introduced in [6]. Recent contributions exploit the geometric properties of the single-user CE problem and develop approaches for exact phase recovery [7], [8]. More relevant to this letter, by viewing the feasible region of the CE problem as a complex circle manifold, a Riemannian conjugate gradient (RCG) algorithm is proposed in [9], where the NLS problem is solved with much lower complexity than both GD and CEO. While the interference reduction (IR) methods in the above-mentioned works are straightforward, their performance depends strongly on the constellation energy [5], which is difficult to set optimally in advance. In view of this, a constellation scaling and rotation optimization has been formulated in [10] and solved by semidefinite relaxation (SDR). Noting that the MUI is known to the base station (BS), and thus can be utilized as a source of useful power, the previous work [11] considers a novel CEP approach based on the concept of constructive interference (CI) [12]-[16], which obtains a significant performance improvement over IR approaches. However, both of the methods in [10] and [11] inevitably demand a large amount of computation.
Based on the previous works on manifold optimization [17], [18], we consider a manifold-based algorithm to solve the CI-CEP problem in this letter. Since the objective function is not complex differentiable, we first equivalently transform the problem into its real representation, and use a smooth upper bound to obtain a differentiable approximation. By viewing the feasible region as an oblique manifold, an RCG algorithm is employed to find a local minimizer of the problem. Unlike the relaxed convex problem in [11], the proposed algorithm is guaranteed to yield precoded symbols with exactly constant envelopes, and it performs better than the methods of [11] in terms of both symbol error rate (SER) and complexity.

II. SYSTEM MODEL
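We consider a multiuser multi-input single-output (MU-MISO) downlink scenario where an N-antenna BS transmits signals to M single-antenna users. The received signal vector is given as

$$\mathbf{y} = \mathbf{H}\mathbf{x} + \mathbf{w}, \tag{1}$$

where $\mathbf{y} = [y_1, y_2, \ldots, y_M]^T \in \mathbb{C}^{M \times 1}$ with $y_m$ being the received symbol for the mth user, $\mathbf{x} = [x_1, x_2, \ldots, x_N]^T \in \mathbb{C}^{N \times 1}$ is the transmitted signal vector, $\mathbf{w} \in \mathbb{C}^{M \times 1}$ is the additive noise vector with per-entry variance $N_0$, and $\mathbf{H} = [\mathbf{h}_1, \mathbf{h}_2, \ldots, \mathbf{h}_M]^T \in \mathbb{C}^{M \times N}$ is the channel matrix, with $\mathbf{h}_m$ being the channel vector for the mth user. Without loss of generality, the channel is assumed to be Rayleigh fading, i.e., the entries of $\mathbf{H}$ are independent and identically distributed zero-mean complex Gaussian variables, and $\mathbf{H}$ is perfectly known to the BS. The transmitted signal is required to have a constant envelope, i.e.,

$$x_n = \sqrt{P_T / N}\, e^{j\theta_n}, \quad \forall n, \tag{2}$$

where $P_T$ is the total transmit power and $\theta_n$ is the phase of the nth transmitted symbol. Assume that the desired symbol for the mth user is $s_m = \sqrt{E_m}\, e^{j\phi_m}$, where $E_m$ and $\phi_m$ denote the power and the phase of the symbol, respectively. The received symbol for the mth user can be written as

$$y_m = s_m + \left(\mathbf{h}_m^T \mathbf{x} - s_m\right) + w_m, \tag{3}$$

where the second term represents the interfering signal for the user. The total MUI power is then given by

$$P_{\mathrm{MUI}} = \sum_{m=1}^{M} \left|\mathbf{h}_m^T \mathbf{x} - s_m\right|^2 = \left\|\mathbf{H}\mathbf{x} - \mathbf{s}\right\|^2, \tag{4}$$

where $\mathbf{s} = [s_1, s_2, \ldots, s_M]^T$ is the desired symbol vector.

This work is licensed under a Creative Commons Attribution 3.0 License. For more information, see http://creativecommons.org/licenses/by/3.0/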
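The quantities of the system model can be instantiated numerically. The following NumPy sketch draws a Rayleigh channel, builds a CE transmit vector, and evaluates the MUI power $\|\mathbf{H}\mathbf{x} - \mathbf{s}\|^2$. The function names, the small dimensions, and the choice of unit-energy QPSK symbols with phases $\pi/4 + k\pi/2$ are illustrative assumptions, not part of the letter.

```python
import numpy as np

def rayleigh_channel(M, N, rng):
    """M x N channel matrix with i.i.d. CN(0, 1) entries (row m is h_m^T)."""
    return (rng.standard_normal((M, N)) + 1j * rng.standard_normal((M, N))) / np.sqrt(2)

def ce_signal(theta, P_T=1.0):
    """Constant envelope signal x_n = sqrt(P_T / N) * exp(j * theta_n)."""
    return np.sqrt(P_T / theta.size) * np.exp(1j * theta)

def mui_power(H, x, s):
    """Total multiuser interference power ||Hx - s||^2."""
    return float(np.linalg.norm(H @ x - s) ** 2)

rng = np.random.default_rng(0)
M, N = 4, 16                                   # illustrative sizes (assumption)
H = rayleigh_channel(M, N, rng)
x = ce_signal(rng.uniform(0.0, 2.0 * np.pi, N))
# Assumed QPSK symbols with E_m = 1 and phases pi/4 + k*pi/2
s = np.exp(1j * (np.pi / 4 + np.pi / 2 * rng.integers(0, 4, M)))
p_mui = mui_power(H, x, s)
```

Every entry of `x` has modulus $\sqrt{P_T/N}$ by construction, so the total transmit power equals $P_T$ regardless of the phases.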

III. PROBLEM FORMULATION
Aiming at minimizing the MUI power, the conventional CEP approaches are designed to solve the following optimization problem [5]:

$$\min_{\mathbf{x}} \ \left\|\mathbf{H}\mathbf{x} - \mathbf{s}\right\|^2 \quad \text{s.t.} \ |x_n| = \sqrt{P_T / N}, \ \forall n. \tag{5}$$

Problem (5) is an NLS problem, which is nonconvex and has multiple local minima. Fortunately, it has been proven that most of the local minima yield small values [5], and they can be obtained by a variety of approaches [5], [6], [9]. However, by treating all the interference as harmful, these techniques ignore the fact that MUI can be employed as a green signal power source to benefit the symbol demodulation. This idea was first proposed in [19], where the MUI is classified into constructive and destructive parts. CI-based beamformers aim to minimize the destructive interference while exploiting the constructive interference, which enables a relaxed feasible region for the optimization [13]. Based on this, the previous work [11] focuses on maximizing the constructive effect of the MUI to achieve CE precoding.
While the CI approaches have already been applied to quadrature amplitude modulation (QAM) [20], [21], here we restate the CI-CEP problem for PSK modulation for notational simplicity as follows [11]:

$$\begin{aligned} \min_{\mathbf{x}} \ \max_{m} \ & \left|\mathrm{Im}(t_m)\right| - \mathrm{Re}(t_m) \tan\psi \\ \text{s.t.} \ & t_m = \left(\mathbf{h}_m^T \mathbf{x} - s_m\right) e^{-j\phi_m}, \ \forall m, \\ & |x_n| = \sqrt{P_T / N}, \ \forall n, \end{aligned} \tag{6}$$

where $s_m = u e^{j\phi_m}$, $\psi = \pi / L$, $u$ is the amplitude of the PSK symbols, and $L$ is the PSK modulation order. The above-mentioned problem can be solved suboptimally by CEO, and has been further relaxed into a convex problem by replacing the equality constraints on $x_n$ with inequalities, i.e., $|x_n| \le \sqrt{P_T / N}, \ \forall n$. Such a convex approximation can be solved efficiently by numerical solvers, e.g., the CVX toolbox. The results are then normalized to obtain transmitted symbols with constant envelopes [11]. Nevertheless, using CEO or CVX to solve (6) requires significant computational resources. In the next section, we propose a manifold-based optimization technique to solve (6), which has much lower complexity.
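To make the CI-CEP cost concrete, the sketch below evaluates $\max_m |\mathrm{Im}(t_m)| - \mathrm{Re}(t_m)\tan\psi$ for a given transmit vector. The function name `ci_objective` and its default arguments are hypothetical; the formula follows the rotated interference $t_m = (\mathbf{h}_m^T \mathbf{x} - s_m) e^{-j\phi_m}$ described above.

```python
import numpy as np

def ci_objective(H, x, phi, u=1.0, L=4):
    """Worst-case CI cost max_m |Im(t_m)| - Re(t_m) * tan(pi / L).

    H   : (M, N) channel matrix, row m is h_m^T
    x   : (N,) transmit vector
    phi : (M,) phases of the desired PSK symbols s_m = u * exp(j * phi_m)
    """
    psi = np.pi / L
    s = u * np.exp(1j * phi)
    # Rotate each user's interference so the desired symbol lies on the real axis
    t = (H @ x - s) * np.exp(-1j * phi)
    return float(np.max(np.abs(t.imag) - t.real * np.tan(psi)))
```

A smaller value is better: under the usual CI reading, a non-positive cost indicates that every user's interference term satisfies $|\mathrm{Im}(t_m)| \le \mathrm{Re}(t_m)\tan\psi$, i.e., it pushes the received symbol deeper into the correct decision region.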

IV. PROPOSED ALGORITHM BASED ON OBLIQUE MANIFOLD
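Since $\mathrm{Re}(\cdot)$ and $\mathrm{Im}(\cdot)$ are not complex differentiable, we formulate the real representation of (6). First we rewrite $t_m$ as

$$t_m = \tilde{\mathbf{h}}_m^T \mathbf{x} - u, \tag{7}$$

where $\tilde{\mathbf{h}}_m = \mathbf{h}_m e^{-j\phi_m}$. We then separate the real and imaginary parts of the complex quantities as follows:

$$\tilde{\mathbf{h}}_{R,m} = \mathrm{Re}(\tilde{\mathbf{h}}_m), \ \tilde{\mathbf{h}}_{I,m} = \mathrm{Im}(\tilde{\mathbf{h}}_m), \ \mathbf{x}_R = \mathrm{Re}(\mathbf{x}), \ \mathbf{x}_I = \mathrm{Im}(\mathbf{x}), \ \tilde{\mathbf{H}}_R = \mathrm{Re}(\tilde{\mathbf{H}}), \ \tilde{\mathbf{H}}_I = \mathrm{Im}(\tilde{\mathbf{H}}), \tag{8}$$

where $\tilde{\mathbf{H}} = [\tilde{\mathbf{h}}_1, \tilde{\mathbf{h}}_2, \ldots, \tilde{\mathbf{h}}_M]$. It follows that

$$\mathrm{Re}(t_m) = \tilde{\mathbf{h}}_{R,m}^T \mathbf{x}_R - \tilde{\mathbf{h}}_{I,m}^T \mathbf{x}_I - u, \qquad \mathrm{Im}(t_m) = \tilde{\mathbf{h}}_{I,m}^T \mathbf{x}_R + \tilde{\mathbf{h}}_{R,m}^T \mathbf{x}_I. \tag{9}$$

By using the fact that $|a| = \max(a, -a)$, and denoting $\beta = \tan\psi$, we have

$$\left|\mathrm{Im}(t_m)\right| - \mathrm{Re}(t_m)\tan\psi = \max\left(\mathrm{Im}(t_m) - \beta\,\mathrm{Re}(t_m), \ -\mathrm{Im}(t_m) - \beta\,\mathrm{Re}(t_m)\right) = \max\left(g_{2m-1}, g_{2m}\right), \tag{10}$$

where

$$\begin{aligned} g_{2m-1} &= \left(\tilde{\mathbf{h}}_{I,m} - \beta \tilde{\mathbf{h}}_{R,m}\right)^T \mathbf{x}_R + \left(\tilde{\mathbf{h}}_{R,m} + \beta \tilde{\mathbf{h}}_{I,m}\right)^T \mathbf{x}_I + \beta u, \\ g_{2m} &= -\left(\tilde{\mathbf{h}}_{I,m} + \beta \tilde{\mathbf{h}}_{R,m}\right)^T \mathbf{x}_R - \left(\tilde{\mathbf{h}}_{R,m} - \beta \tilde{\mathbf{h}}_{I,m}\right)^T \mathbf{x}_I + \beta u. \end{aligned} \tag{11}$$

Denoting $\mathbf{X} = \sqrt{N / P_T}\, [\mathbf{x}_R, \mathbf{x}_I]^T \in \mathbb{R}^{2 \times N}$, the real representation of problem (6) can be written compactly as follows:

$$\min_{\mathbf{X}} \ \max_{i} \ g_i(\mathbf{X}) \quad \text{s.t.} \ \left(\mathbf{X}^T \mathbf{X}\right)_{nn} = 1, \ \forall n, \tag{12}$$

where $i = 1, 2, \ldots, 2M$. It is clear that the feasible region of (12) can be given as

$$\mathcal{M} = \left\{\mathbf{X} \in \mathbb{R}^{2 \times N} \ \middle| \ \left(\mathbf{X}^T \mathbf{X}\right)_{nn} = 1, \ \forall n \right\}. \tag{13}$$

We say that $\mathcal{M}$ forms a manifold, and $\mathbf{X}$ is a point on $\mathcal{M}$. To be more specific, $\mathcal{M}$ is an oblique manifold [22] embedded in the 2N-dimensional Euclidean space $\mathbb{R}^{2 \times N}$: every column of $\mathbf{X}$ has unit norm.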
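The real representation can be checked numerically against the complex form. The sketch below builds the four real coefficient matrices from $\tilde{\mathbf{H}}_R$, $\tilde{\mathbf{H}}_I$, and $\beta$, evaluates the $2M$ affine functions $g_i$, and lets one confirm that $\max_i g_i$ equals the complex cost $\max_m |\mathrm{Im}(t_m)| - \mathrm{Re}(t_m)\tan\psi$. Function names are hypothetical, and the $g_i$ are stacked as all odd-indexed terms followed by all even-indexed terms rather than interleaved, which does not affect the maximum.

```python
import numpy as np

def real_coefficients(H, phi, L=4):
    """Coefficient matrices (each N x M) of the real representation."""
    beta = np.tan(np.pi / L)
    Ht = (H * np.exp(-1j * phi)[:, None]).T    # N x M, column m is h_m * exp(-j phi_m)
    HR, HI = Ht.real, Ht.imag
    A = HI - beta * HR
    B = -HI - beta * HR
    C = HR + beta * HI
    D = HR - beta * HI                         # enters g_{2m} with a minus sign
    return A, B, C, D, beta

def g_values(H, x, phi, u=1.0, L=4):
    """Stack [g_1, g_3, ..., g_{2M-1}, g_2, g_4, ..., g_{2M}]."""
    A, B, C, D, beta = real_coefficients(H, phi, L)
    xR, xI = x.real, x.imag
    g_odd = A.T @ xR + C.T @ xI + beta * u     # Im(t_m) - beta * Re(t_m)
    g_even = B.T @ xR - D.T @ xI + beta * u    # -Im(t_m) - beta * Re(t_m)
    return np.concatenate([g_odd, g_even])
```

Because each $g_i$ is affine in $(\mathbf{x}_R, \mathbf{x}_I)$, the only nonsmooth part of the cost is the outer maximum, which is what the smoothing step later in this section addresses.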
In Riemannian geometry, a manifold is defined as a set of points endowed with a locally Euclidean structure near each point.
Given a point $p$ on $\mathcal{M}$, a tangent vector at $p$ is defined as a vector that is tangent to a smooth curve on $\mathcal{M}$ through $p$.
The set of all such vectors at $p$ forms the tangent space, denoted by $T_p\mathcal{M}$, which is a Euclidean space. Specifically, the tangent space at $\mathbf{X}$ is given as [23]

$$T_{\mathbf{X}}\mathcal{M} = \left\{\mathbf{U} \in \mathbb{R}^{2 \times N} \ \middle| \ \left(\mathbf{X}^T \mathbf{U}\right)_{nn} = 0, \ \forall n \right\}. \tag{14}$$

If the tangent spaces of a manifold are equipped with a smoothly varying inner product, the manifold is called a Riemannian manifold [24]. Accordingly, the family of inner products is called the Riemannian metric, which allows the existence of rich geometric structure on the manifold. Here we use the usual Euclidean inner product as the metric, i.e., $\langle \mathbf{U}, \mathbf{V} \rangle_{\mathbf{X}} = \mathrm{tr}\left(\mathbf{U}^T \mathbf{V}\right)$, where $\mathbf{U}, \mathbf{V} \in T_{\mathbf{X}}\mathcal{M}$.
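As a small illustration of the tangent space and the metric, note that each column of a point on the manifold lies on the unit circle, so a tangent column is any multiple of that column rotated by 90 degrees. The sketch below uses this fact; the helper names `is_tangent` and `metric` and the example sizes are assumptions made for illustration.

```python
import numpy as np

def is_tangent(X, U, tol=1e-10):
    """True if every column of U is orthogonal to the matching column of X."""
    return bool(np.all(np.abs(np.sum(X * U, axis=0)) < tol))

def metric(U, V):
    """Euclidean inner product tr(U^T V) used as the Riemannian metric."""
    return float(np.trace(U.T @ V))

theta = np.linspace(0.0, 2.0 * np.pi, 8, endpoint=False)
X = np.vstack([np.cos(theta), np.sin(theta)])   # 2 x N point with unit-norm columns
J = np.array([[0.0, -1.0], [1.0, 0.0]])         # rotation by 90 degrees
U = (J @ X) * np.arange(1, 9)                   # column n scaled by n: a tangent vector
```

Here `is_tangent(X, U)` holds, while `is_tangent(X, X)` does not, since each column of `X` is normal (not tangent) to the circle at that point.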
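The algorithm that we employ is the so-called Riemannian conjugate gradient (RCG) algorithm [25], which performs a gradient-dependent line search on the Riemannian manifold rather than in the Euclidean space. Since the objective function in (12) is still not differentiable, we consider the well-known smooth log-sum-exp upper bound $f(\mathbf{X})$ for the max function [26], which is

$$f(\mathbf{X}) = \varepsilon \log \sum_{i=1}^{2M} \exp\left(\frac{g_i(\mathbf{X})}{\varepsilon}\right), \tag{15}$$

where $\varepsilon > 0$ is some small positive number. The gradient of $f(\mathbf{X})$ is thus given as

$$\nabla f(\mathbf{X}) = \left[\frac{\partial f}{\partial \tilde{\mathbf{x}}_1}, \frac{\partial f}{\partial \tilde{\mathbf{x}}_2}, \ldots, \frac{\partial f}{\partial \tilde{\mathbf{x}}_N}\right], \tag{16}$$

where $\tilde{\mathbf{x}}_n$ is the nth column of $\mathbf{X}$. Noting that $\mathbf{x}_R = \sqrt{P_T / N}\, \mathbf{X}^T(:, 1)$ and $\mathbf{x}_I = \sqrt{P_T / N}\, \mathbf{X}^T(:, 2)$, we have

$$\begin{aligned} g_{2m-1} &= \sqrt{\frac{P_T}{N}} \sum_{n=1}^{N} \left(a_{n,m}\, \mathbf{e}_n^T \mathbf{X}^T(:, 1) + c_{n,m}\, \mathbf{e}_n^T \mathbf{X}^T(:, 2)\right) + \beta u, \\ g_{2m} &= \sqrt{\frac{P_T}{N}} \sum_{n=1}^{N} \left(b_{n,m}\, \mathbf{e}_n^T \mathbf{X}^T(:, 1) - d_{n,m}\, \mathbf{e}_n^T \mathbf{X}^T(:, 2)\right) + \beta u, \end{aligned} \tag{17}$$

where $\mathbf{e}_n \in \mathbb{R}^{N \times 1}$ has all-zero entries except that its nth entry equals 1, and $a_{n,m}$, $b_{n,m}$, $c_{n,m}$, and $d_{n,m}$ denote the $(n, m)$th entries of the coefficient matrices $\mathbf{A}$, $\mathbf{B}$, $\mathbf{C}$, and $\mathbf{D}$, respectively. Based on (17), the nth column of the gradient is given by

$$\frac{\partial f}{\partial \tilde{\mathbf{x}}_n} = \sqrt{\frac{P_T}{N}}\, \frac{\sum_{m=1}^{M} \left\{ \exp\left(\frac{g_{2m-1}}{\varepsilon}\right) \begin{bmatrix} a_{n,m} \\ c_{n,m} \end{bmatrix} + \exp\left(\frac{g_{2m}}{\varepsilon}\right) \begin{bmatrix} b_{n,m} \\ -d_{n,m} \end{bmatrix} \right\}}{\sum_{i=1}^{2M} \exp\left(\frac{g_i}{\varepsilon}\right)}. \tag{18}$$

In the RCG algorithm, (16) is called the Euclidean gradient, and it can be used to compute the Riemannian gradient, which is defined as the tangent vector in $T_{\mathbf{X}}\mathcal{M}$ that indicates the steepest ascent direction of $f(\mathbf{X})$. It can be viewed as the orthogonal projection of the Euclidean gradient onto the tangent space [23], [27], which is given as

$$\mathrm{grad} f(\mathbf{X}) = P_{\mathbf{X}}\left(\nabla f(\mathbf{X})\right) = \nabla f(\mathbf{X}) - \mathbf{X}\, \mathrm{diag}\left(\mathbf{X}^T \nabla f(\mathbf{X})\right), \tag{19}$$

where $P_{\mathbf{X}}(\cdot)$ denotes the projector, and $\mathrm{diag}(\cdot)$ sets all off-diagonal entries of a matrix to zero. At the kth iteration, the descent direction $\boldsymbol{\Pi}_k$ is obtained as

$$\boldsymbol{\Pi}_k = -\mathrm{grad} f(\mathbf{X}_k) + \mu_k P_{\mathbf{X}_k}\left(\boldsymbol{\Pi}_{k-1}\right). \tag{20}$$

Here the projector is used as the vector transport, which maps a vector from one tangent space to another. The coefficient $\mu_k$ is given by the Riemannian Polak-Ribière formula [27]:

$$\mu_k = \frac{\left\langle \mathrm{grad} f(\mathbf{X}_k), \ \mathrm{grad} f(\mathbf{X}_k) - P_{\mathbf{X}_k}\left(\mathrm{grad} f(\mathbf{X}_{k-1})\right) \right\rangle_{\mathbf{X}_k}}{\left\langle \mathrm{grad} f(\mathbf{X}_{k-1}), \ \mathrm{grad} f(\mathbf{X}_{k-1}) \right\rangle_{\mathbf{X}_{k-1}}}. \tag{21}$$

The (k+1)th update is thus given by

$$\mathbf{X}_{k+1} = R_{\mathbf{X}_k}\left(\delta_k \boldsymbol{\Pi}_k\right), \tag{22}$$

where $R_{\mathbf{X}_k}(\cdot)$ is called the retraction, which maps a point on $T_{\mathbf{X}_k}\mathcal{M}$ to $\mathcal{M}$ with a local rigidity condition that preserves gradients at $\mathbf{X}_k$ [23], [27], and is given as

$$R_{\mathbf{X}_k}\left(\delta_k \boldsymbol{\Pi}_k\right) = \left[\frac{\left(\mathbf{X}_k + \delta_k \boldsymbol{\Pi}_k\right)_1}{\left\|\left(\mathbf{X}_k + \delta_k \boldsymbol{\Pi}_k\right)_1\right\|}, \ldots, \frac{\left(\mathbf{X}_k + \delta_k \boldsymbol{\Pi}_k\right)_N}{\left\|\left(\mathbf{X}_k + \delta_k \boldsymbol{\Pi}_k\right)_N\right\|}\right], \tag{23}$$

where $\left(\mathbf{X}_k + \delta_k \boldsymbol{\Pi}_k\right)_n$ is the nth column of the matrix $\mathbf{X}_k + \delta_k \boldsymbol{\Pi}_k$, and the stepsize $\delta_k$ is obtained by a backtracking line search, e.g., the Armijo rule [27], [28]. Fig. 1 shows a single iteration of the RCG algorithm on $\mathcal{M}$, which is also summarized in Algorithm 1. According to [29], the solution obtained by RCG for problem (12) satisfies the Karush-Kuhn-Tucker (KKT) conditions on the manifold.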
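The steps described above (smoothed cost, Euclidean gradient, projection, Polak-Ribière update, column-normalizing retraction, Armijo backtracking) can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions rather than the authors' implementation: the function name `rcg_ci`, the value `eps=0.05`, the Armijo constant `1e-4`, the random initialization, the steepest-descent restart, and the clipping of $\mu_k$ at zero are all choices made here and are not specified in the letter.

```python
import numpy as np

def rcg_ci(H, phi, u=1.0, L=4, P_T=1.0, eps=0.05, iters=200, seed=0):
    """RCG sketch on the oblique manifold; returns (x, history of smoothed costs)."""
    M, N = H.shape
    beta = np.tan(np.pi / L)
    Ht = (H * np.exp(-1j * phi)[:, None]).T            # N x M, columns h_m e^{-j phi_m}
    HR, HI = Ht.real, Ht.imag
    scale = np.sqrt(P_T / N)
    # g_i(X) = cR[i] @ X[0] + cI[i] @ X[1] + beta*u, i = 1..2M (odd terms, then even)
    cR = scale * np.vstack([(HI - beta * HR).T, (-HI - beta * HR).T])
    cI = scale * np.vstack([(HR + beta * HI).T, (-HR + beta * HI).T])

    def cost_and_egrad(X):
        z = (cR @ X[0] + cI @ X[1] + beta * u) / eps
        zmax = z.max()                                  # shift for numerical stability
        w = np.exp(z - zmax)
        f = eps * (zmax + np.log(w.sum()))              # log-sum-exp upper bound
        w /= w.sum()
        return f, np.vstack([w @ cR, w @ cI])           # Euclidean gradient, 2 x N

    def proj(X, U):                                     # U - X diag(X^T U)
        return U - X * np.sum(X * U, axis=0, keepdims=True)

    def retract(Y):                                     # normalize every column
        return Y / np.linalg.norm(Y, axis=0, keepdims=True)

    rng = np.random.default_rng(seed)
    X = retract(rng.standard_normal((2, N)))            # random feasible start (assumption)
    f, eg = cost_and_egrad(X)
    grad = proj(X, eg)
    Pi = -grad
    history = [f]
    for _ in range(iters):
        slope = np.sum(grad * Pi)
        if slope >= 0:                                  # safeguard: restart with steepest descent
            Pi, slope = -grad, -np.sum(grad * grad)
        delta = 1.0
        while delta > 1e-12:                            # Armijo backtracking
            X_new = retract(X + delta * Pi)
            f_new, eg_new = cost_and_egrad(X_new)
            if f_new <= f + 1e-4 * delta * slope:
                break
            delta *= 0.5
        else:
            break                                       # no acceptable step found
        grad_new = proj(X_new, eg_new)
        # Polak-Ribiere coefficient with the projector as vector transport
        mu = np.sum(grad_new * (grad_new - proj(X_new, grad))) / max(np.sum(grad * grad), 1e-30)
        Pi = -grad_new + max(mu, 0.0) * proj(X_new, Pi)
        X, f, grad = X_new, f_new, grad_new
        history.append(f)
        if np.linalg.norm(grad) < 1e-8:
            break
    return scale * (X[0] + 1j * X[1]), history
```

Since every iterate is produced by the retraction, the returned vector has exactly constant modulus $\sqrt{P_T/N}$, and the Armijo condition makes the recorded smoothed cost non-increasing. A smaller `eps` tightens the approximation of the max function but makes the cost less smooth, so its value is a tuning choice.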
Remark: The complexity of Algorithm 1 mainly comes from the computation of the Euclidean gradient in (16), where $16MN + 6M$ flops are needed, leading to a total complexity of $O(MN)$ for each iteration. By contrast, the complexities of GD and RCG-IR are $O(MN^2)$ and $O(MN)$ per iteration [5], [9], respectively. For CEO, the complexity is $O(KMN)$ in each iteration [11], where $K$ stands for the number of random samples, which may be much larger than $M$ and $N$. For clarity, we will show the overall complexity of all the methods numerically in the next section.
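As a rough worked instance of these counts, assume the simulation setting used later in the letter ($N = 64$, $M = 20$, and $K = 500$ for CEO). The flop count of the Euclidean gradient and the dominant scaling products then evaluate to

$$16MN + 6M = 16 \cdot 20 \cdot 64 + 6 \cdot 20 = 20480 + 120 = 20600, \qquad MN = 1280, \qquad KMN = 500 \cdot 1280 = 640000.$$

The last two values are only the arguments of the $O(\cdot)$ expressions, not exact flop counts, so they indicate the relative scaling (a factor of $K$ between CEO and the RCG methods per iteration) rather than measured cost.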

V. NUMERICAL RESULTS
In this section, numerical results based on Monte Carlo simulations are provided to compare the performance of different algorithms. We consider the following six algorithms: 1) the proposed RCG algorithm for CI (RCG-CI); 2) convex relaxation for CI (CVX-CI) [11]; 3) cross-entropy optimization for CI (CEO-CI) [11]; 4) RCG algorithm for IR (RCG-IR) [9]; 5) gradient descent algorithm for IR (GD-IR) [5]; and 6) cross-entropy optimization for IR (CEO-IR) [6]. Without loss of generality, we use QPSK modulation for all the approaches. We set $u = 1, \ \forall m$, which is a common assumption in the related literature, because the optimal $u$ is difficult to determine for IR methods [5], [6], whereas an arbitrary $u$ can be accepted by CI methods [11]. We also assume that $P_T = 1$ and $N = 64$ for all the algorithms, and that each entry of the channel $\mathbf{H}$ follows the standard complex Gaussian distribution, i.e., $h_{n,m} \sim \mathcal{CN}(0, 1), \ \forall n, \ \forall m$. For the CEO methods, we use the same parameter configuration as in [11], which is $T = 1000$ (the number of iterations), $K = 500$ (the number of initialized random samples), $\rho = 0.05$ (quantile), and $\alpha = 0.08$ (the smoothing parameter). For GD-IR, the number of iterations is set to 50.
While the analytic complexity per iteration of most of the algorithms has already been given, we compare the overall complexity in terms of average execution time in Fig. 2(a), since it is difficult to specify the complexity of the CVX-CI approach. The simulation is performed on a computer with a 3.6-GHz Intel Core i7-4790 CPU and 32 GB of RAM. As expected, the RCG methods require the least execution time to solve the problem, while the other methods need much more. Although the proposed RCG-CI algorithm has the same per-iteration complexity as RCG-IR, it generally needs more iterations to converge. This is because the objective function in RCG-IR is simpler than that in RCG-CI. Nevertheless, the total time needed by the two methods is still comparable.
In Fig. 2(b), we show the error performance of all six approaches in terms of SER versus the transmit signal-to-noise ratio (SNR), where $M = 20$ and $\mathrm{SNR} = P_T / N_0$. Note that all the IR methods show negligible differences under the given parameter configuration, and all the CI methods outperform the IR methods thanks to the utilization of the MUI power. It is worth noting that the proposed RCG-CI has the best performance among all six approaches, with a 2 dB gain over the IR methods and a 1 dB gain over the CVX-CI algorithm.
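For readers who want to reproduce an SER point of this kind, a Monte Carlo estimate for a fixed channel and a fixed precoded vector can be sketched as below. The quadrant detector assumes QPSK phases $\pi/4 + k\pi/2$, and the function names `qpsk_detect` and `ser` are hypothetical; the letter does not specify its detector or simulation code, so this is only one plausible setup with noise variance $N_0 = P_T / \mathrm{SNR}$.

```python
import numpy as np

def qpsk_detect(y):
    """Index k in {0, 1, 2, 3} of the quadrant of y (QPSK phases pi/4 + k*pi/2)."""
    return np.floor(np.mod(np.angle(y), 2.0 * np.pi) / (np.pi / 2.0)).astype(int) % 4

def ser(H, x, sym_idx, snr_db, trials, rng, P_T=1.0):
    """Monte Carlo symbol error rate over noise realizations for fixed H and x."""
    N0 = P_T / 10.0 ** (snr_db / 10.0)          # SNR = P_T / N0
    M = H.shape[0]
    errors = 0
    for _ in range(trials):
        # Circularly symmetric complex Gaussian noise with variance N0 per user
        w = np.sqrt(N0 / 2.0) * (rng.standard_normal(M) + 1j * rng.standard_normal(M))
        errors += np.count_nonzero(qpsk_detect(H @ x + w) != sym_idx)
    return errors / (trials * M)
```

Averaging this quantity over many channel realizations and symbol vectors, with `x` recomputed by the precoder each time, would give a curve comparable in spirit to Fig. 2(b).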
We further consider the error performance versus the number of users in Fig. 2(c), where the SNR is fixed at 8 dB and the number of users ranges from 12 to 24. Once again, we see that the proposed RCG-CI achieves the lowest SER among all the approaches. Note that for a small number of users, the IR methods are effective in zeroing the interference [5], which results in an almost constant SINR and leads to the flat SER curves at $10^{-2}$, as shown in Fig. 2(c). Nevertheless, by exploiting the MUI as a useful power source, the resultant SINR for the CI methods increases as the number of users decreases. This yields a larger gap between the two classes of methods for smaller numbers of users in Fig. 2(c).

VI. CONCLUSION
A low-complexity manifold optimization algorithm has been introduced to solve the CEP problem with exploitation of the MUI power. By viewing the feasible region of the optimization as an oblique manifold, the proposed method can efficiently find a near-optimal solution using the Riemannian conjugate gradient algorithm. Numerical results show that the proposed RCG-CI algorithm outperforms the five other existing approaches in terms of error performance, with a complexity comparable to that of the fastest algorithm, RCG-IR.
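In (17) and (18), $\mathbf{x}_R$ and $\mathbf{x}_I$ correspond to the first and second columns of $\mathbf{X}^T$, respectively, and $a_{n,m}$, $b_{n,m}$, $c_{n,m}$, and $d_{n,m}$ denote the $(n, m)$th entries of the following matrices: $\mathbf{A} = \tilde{\mathbf{H}}_I - \beta \tilde{\mathbf{H}}_R$, $\mathbf{B} = -\tilde{\mathbf{H}}_I - \beta \tilde{\mathbf{H}}_R$, $\mathbf{C} = \tilde{\mathbf{H}}_R + \beta \tilde{\mathbf{H}}_I$, $\mathbf{D} = \tilde{\mathbf{H}}_R - \beta \tilde{\mathbf{H}}_I$.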

Fig. 2. Numerical results. (a) Average execution time versus number of users for different algorithms. (b) SER versus SNR for different algorithms. (c) SER versus number of users for different algorithms.