A First-Order Primal-Dual Method for Saddle Point Optimization of PAPR Problem in MU-MIMO-OFDM Systems

This paper investigates a splitting-based optimization technique for the constrained $\ell_\infty$-norm based peak-to-average power ratio (PAPR) reduction problem in multiuser orthogonal frequency-division multiplexing (OFDM) based multiple-input multiple-output (MIMO) systems. PAPR reduction and multi-user interference (MUI) cancelation are considered jointly in a saddle-point formulation on the downlink of a multi-user MIMO-OFDM system, and an efficient primal-dual hybrid gradient (PDHG) inspired algorithm with easy-to-evaluate proximal operators is developed. The proposed algorithm converges significantly faster to satisfactory solutions, with a much improved asymptotic convergence rate, than existing methods. Numerical results illustrate the superior performance of the proposed algorithm over existing methods in terms of PAPR reduction for different MIMO configurations.


Introduction
Very-large multi-user multiple-input multiple-output (MU-MIMO), or massive MIMO, with hundreds of base station antennas, is an emerging technology to meet the explosively increasing demand for throughput. Marzetta [1] first conceived the idea of employing a large number of transmit antennas at the base station (BS), in a co-located or distributed manner, to simultaneously serve dozens of user equipments (UEs) in the same time-frequency resource. Massive MIMO arrays of realistic physical sizes provide substantial improvements in spectral and energy efficiency compared to conventional small-scale MIMO systems. Moreover, the use of large-scale antenna arrays yields favorable propagation, where the channel vectors between the base station and the user terminals become mutually orthogonal. Hence, simple linear signal processing methods become efficient and optimal for multi-user interference suppression [2]. Massive MIMO systems are promising candidates for 5G [3].
Practical wireless channels generally suffer from frequency-selective fading. The intersymbol interference resulting from frequency-selective fading can be compensated by orthogonal frequency division multiplexing (OFDM). OFDM divides the entire bandwidth of a wideband channel into a set of orthogonal narrowband subchannels such that each individual subchannel is exposed to flat fading rather than frequency-selective fading.
MIMO and OFDM are widely used together in the physical layer of wireless communication systems [4], [5]. However, the OFDM signal incurs a high peak-to-average power ratio (PAPR) because the phases of the subcarriers may combine in a constructive or destructive manner. Constructive superposition of subcarriers via the inverse fast Fourier transform (IFFT) operation results in a signal with high envelope peaks. To avoid high out-of-band radiation caused by high side lobes of the modulated subcarriers, and inter-modulation distortion among subcarriers, we require power amplifiers and digital-to-analog converters (DACs) with large linear dynamic ranges, which is costly and power-inefficient [6]. Therefore, signals with low PAPR become more important to make massive MIMO systems affordable as well as power efficient.
Several schemes have been proposed to handle high PAPR in single-input single-output (SISO) OFDM systems, such as companding [7], [8], precoding [9], clipping [10], partial transmit sequence (PTS) [11], tone reservation [12], selected mapping (SLM) [13] and active constellation extension (ACE) [14]. Although it is easy to extend most of these schemes to single-user MIMO systems [15]-[18], their applicability to multi-user (MU) MIMO systems is not straightforward. Because the user terminals are spatially distributed, joint processing of the signals is only feasible at the transmitter end, so it is more challenging to apply PAPR reduction schemes in multiuser scenarios. In [19], the null spaces of massive MU-MIMO-OFDM channels are exploited, based on a linearly constrained $\ell_\infty$ optimization problem, to improve the PAPR. Convex optimization via the fast iterative truncation algorithm (FITRA) jointly performs precoding, OFDM modulation, and PAPR minimization. It is shown theoretically that constant-envelope signals with PAPR close to unity can be obtained in the infinitely-large-antenna limit. FITRA yields a dramatic PAPR improvement (around 10 dB) for massive MIMO systems but shows limited PAPR improvement when applied to small-scale MIMO configurations, e.g., 4 × 2 or 8 × 2 MIMO. Similarly, in [20], perturbation signals lying within the null spaces of the associated channel matrices are added to the precoded signals in order to reduce the PAPRs of the transmitted signals via a convex optimization problem. Unlike [19], where the proximal operator of the $\ell_\infty$-norm is basically a clipping operator, an efficient algorithm is developed by resorting to the alternating direction method of multipliers (ADMM). A PAPR reduction scheme analogous to tone reservation was proposed in [21], where a peak clipping scheme is used for some transmit antennas at the base station, while other antennas are reserved to compensate for the distortions due to peak clipping. This antenna-reservation scheme has a low computational overhead but can achieve only a moderate PAPR reduction. Moreover, the antennas reserved to compensate for clipping distortion may also experience large PAPRs. The methods proposed in [22], [23] formulate the PAPR reduction problem as an approximate message passing (AMP)-based Bayesian inference problem that uses priors to encourage constant-magnitude solutions. The Bayesian approach in [22] treats MUI cancelation as an underdetermined linear inverse problem and offers better PAPR reduction than the FITRA algorithm [19], with much lower computational complexity.
In this paper, we present a generalized usage of the primal-dual hybrid gradient (PDHG) algorithm [24] for joint consideration of PAPR reduction and multi-user precoding in multiuser MIMO-OFDM systems. The PAPR reduction problem is formulated as a saddle-point problem and is solved using a variant of the PDHG algorithm. The algorithm progresses by taking forward and backward proximal steps that alternately maximize and minimize a constrained form of the saddle function. The associated proximal operator for the convex, non-differentiable $\ell_\infty$-norm is implemented in linear time with low computational cost.
The remainder of the paper is structured as follows. Section 2 describes the system model and basic assumptions, and introduces the PAPR reduction problem. Section 3 formulates PAPR reduction as a saddle-point problem and develops the modified PDHG algorithm. Section 4 presents simulation results, and conclusions are drawn in Sec. 5.
Notations: Throughout the paper, lowercase and uppercase symbols such as "x" and "X" are used to represent column vectors and matrices, respectively. The transpose is denoted by $(\cdot)^T$ and the conjugate transpose by $(\cdot)^H$. The inner product of two vectors x and y is denoted by $\langle x, y \rangle = x^T y$. Moreover, $\|x\|_1$, $\|x\|_2$ and $\|x\|_\infty$ denote the $\ell_1$-norm, $\ell_2$-norm and $\ell_\infty$-norm of a vector x. $\Re\{x\}$ and $\Im\{x\}$ represent the real and imaginary parts of a vector x, respectively. The U × U unitary discrete Fourier transform matrix, the N × N identity matrix and the M × N zero matrix are represented as $F_U$, $I_N$ and $0_{M \times N}$, respectively. The symbol $\otimes$ denotes the Kronecker product. The notation abs(x) denotes the component-wise application of the absolute value to a vector x.

System Description

System Model
Consider the multiuser MIMO-OFDM downlink system depicted in Fig. 1, which consists of a base station (BS) with N transmit antennas. The BS simultaneously serves M single-antenna terminals, where $M \ll N$. For a total of U OFDM tones (subcarriers), the M × 1 signal vector $s_u$ contains the information at the u-th tone for each of the M terminals. The subcarriers are divided into two complementary sets $\varphi$ and $\varphi^C$. The set $\varphi$ contains the data transmission tones and $\varphi^C$ indexes the inactive guard-band tones, such that for each tone $u \in \varphi^C$, $s_u = 0_{M \times 1}$.
To eliminate the multi-user interference (MUI) at the receivers, the information symbols on the u-th subcarrier are linearly precoded as

$$p_u = W_u s_u, \tag{1}$$

where $p_u \in \mathbb{C}^{N \times 1}$ is the precoded vector and $W_u \in \mathbb{C}^{N \times M}$ represents the precoding matrix for the u-th OFDM tone. To eliminate MUI, the widely used precoding schemes are maximum ratio transmission (MRT), zero forcing (ZF), and minimum mean-square error (MMSE) precoding [25]. For a known channel matrix $H_u \in \mathbb{C}^{M \times N}$ at the transmitter, the standard ZF linear precoder can be written as

$$W_u = H_u^H \left( H_u H_u^H \right)^{-1}. \tag{2}$$

The normalized precoded vectors $p_u$, $\forall u$, are then reordered to the N transmit antennas according to the following mapping:

$$[r_1, \ldots, r_N] = [p_1, \ldots, p_U]^T, \tag{3}$$

where the U-dimensional vector $r_n$ contains the frequency-domain signal to be transmitted from the n-th antenna.
Then an inverse discrete Fourier transform (IDFT) is applied to the precoded signal to obtain the time-domain signal, i.e.,

$$\hat{a}_n = F_U^H r_n. \tag{4}$$

After applying the IDFT, a cyclic prefix is appended to the time-domain samples of each antenna to avert intersymbol interference (ISI). Lastly, the time-domain samples are transmitted through the wireless channel.
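The transmit chain described above (ZF precoding per tone, reordering to antennas, unitary IDFT) can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `zf_precode_to_time_domain` and the array layouts are hypothetical, the ZF precoder is assumed to be the right pseudo-inverse of the per-tone channel, and power normalization and the cyclic prefix are omitted.

```python
import numpy as np

def zf_precode_to_time_domain(H, S):
    """Hypothetical helper: ZF-precode each tone, reorder per antenna, apply the IDFT.

    H: (U, M, N) array holding the channel matrix H_u of every tone u.
    S: (U, M) array holding the symbol vector s_u of every tone (zeros on guard tones).
    Returns P with shape (U, N) (precoded vectors p_u) and
            A with shape (N, U) (time-domain samples of each antenna).
    """
    U, M, N = H.shape
    P = np.zeros((U, N), dtype=complex)
    for u in range(U):
        Hu = H[u]
        # Assumed ZF precoder: p_u = H_u^H (H_u H_u^H)^{-1} s_u, computed via a linear solve.
        gram = Hu @ Hu.conj().T
        P[u] = Hu.conj().T @ np.linalg.solve(gram, S[u])
    # Reordering: row n of R is the frequency-domain signal r_n of antenna n.
    R = P.T
    # Unitary IDFT along the tone axis (norm="ortho" matches a unitary DFT matrix).
    A = np.fft.ifft(R, axis=1, norm="ortho")
    return P, A
```

With this layout, `H[u] @ P[u]` reproduces `S[u]` on every tone, which is the interference-free condition that the PAPR-aware precoder must also satisfy.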

Peak-to-Average Power Ratio (PAPR) Reduction
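The time-domain samples $\{\hat{a}_n\}$ have a large dynamic range due to the superposition of the individual subcarriers, which can be characterized by the peak-to-average power ratio (PAPR) metric on the n-th antenna as [19]:

$$\mathrm{PAPR}_n \triangleq \frac{2U \, \|\hat{a}_n\|_{\tilde{\infty}}^2}{\|\hat{a}_n\|_2^2}, \tag{5}$$

where the $\ell_{\tilde{\infty}}$-norm is defined as $\|a\|_{\tilde{\infty}} \triangleq \max\{ \|\Re\{a\}\|_\infty, \|\Im\{a\}\|_\infty \}$. The $\ell_{\tilde{\infty}}$-norm is used because RF chains process and modulate the real and imaginary parts independently.
Alternative PAPR definitions also exist in the literature, where the $\ell_\infty$-norm instead of the $\ell_{\tilde{\infty}}$-norm and U instead of 2U are used. The relation [26, Eq. 12] ensures that reducing the PAPR as defined in (5) also reduces an $\ell_\infty$-norm-based PAPR definition.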
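As an illustration, the per-antenna PAPR can be evaluated with a few lines of NumPy. The sketch assumes the definition of [19] discussed above, i.e., the larger of the peak real and peak imaginary magnitudes in the numerator and the factor 2U; the function name `papr_db` is hypothetical.

```python
import numpy as np

def papr_db(a):
    """PAPR in dB of one antenna's time-domain samples a (length U, complex)."""
    a = np.asarray(a, dtype=complex)
    U = a.size
    # Peak taken separately over real and imaginary parts, then the larger of the two.
    peak = max(np.max(np.abs(a.real)), np.max(np.abs(a.imag)))
    papr = 2 * U * peak**2 / np.sum(np.abs(a) ** 2)
    return 10 * np.log10(papr)
```

Under this definition, a constant-envelope vector with equal real and imaginary magnitudes gives 0 dB, while a single real-valued spike gives the largest possible value, 10 log10(2U) dB.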
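The best- and worst-case PAPR are bounded as

$$1 \le \mathrm{PAPR}_n \le 2U. \tag{6}$$

Since the transmit antennas outnumber the user terminals, i.e., $N \gg M$, there are infinitely many precoded signals $p \triangleq [p_1^T, p_2^T, \ldots, p_U^T]^T$ that satisfy the precoding constraint $\bar{s} = \bar{H} p$. In this paper, we search for the precoded signal p whose associated time-domain signals $\{\hat{a}_n\}$ have a low dynamic range and which also satisfies the following conditions to cancel out the MUI:

$$s_u = H_u p_u, \; u \in \varphi, \qquad p_u = 0_{N \times 1}, \; u \in \varphi^C, \tag{7}$$

where $\bar{s} \in \mathbb{C}^{UM \times 1}$ is the aggregation of all user data symbols and inactive tones, and $\bar{H}$ is a block diagonal matrix having the main diagonal blocks $H_u$, $u \in \varphi$, and … As defined in (3), the entries of each precoded vector are assigned to the N transmit antennas through a linear transformation T as

$$r = T p. \tag{8}$$

Using (4) and (7) we obtain

$$\bar{s} = \bar{H} T^T F \hat{a}, \tag{9}$$

where

$$F = I_N \otimes F_U, \qquad \hat{a} = [\hat{a}_1^T, \ldots, \hat{a}_N^T]^T. \tag{10}$$

For the symbol vector $\bar{s}$, the aim is to find time-domain samples $\hat{a}$ that satisfy (9) such that each antenna emits signals with low PAPR. Inspired by [19], the PAPR minimax reduction problem can be framed as an $\ell_{\tilde{\infty}}$-norm based constrained optimization that reduces the largest PAPR among all the transmit antennas,

$$(P_{\tilde{\infty}}): \quad \min_{\hat{a}} \; \max\{ \|\hat{a}_1\|_{\tilde{\infty}}, \ldots, \|\hat{a}_N\|_{\tilde{\infty}} \} \quad \text{subject to} \quad \bar{s} = \bar{H} T^T F \hat{a},$$

which can eventually be transformed into the equivalent real-valued $\ell_\infty$-norm based problem

$$(P_\infty): \quad \min_{x} \; \|x\|_\infty \quad \text{subject to} \quad s = H x, \tag{11}$$

where

$$s = \begin{bmatrix} \Re\{\bar{s}\} \\ \Im\{\bar{s}\} \end{bmatrix}, \qquad x = \begin{bmatrix} \Re\{\hat{a}\} \\ \Im\{\hat{a}\} \end{bmatrix}, \qquad H = \begin{bmatrix} \Re\{\bar{H} T^T F\} & -\Im\{\bar{H} T^T F\} \\ \Im\{\bar{H} T^T F\} & \Re\{\bar{H} T^T F\} \end{bmatrix}.$$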
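The conversion from the complex-valued constraint to a real-valued one can be checked numerically. The sketch below assumes the usual stacking of real and imaginary parts; the helper name `to_real` and the generic complex matrix `A` (standing in for the composite channel-permutation-DFT matrix) are hypothetical.

```python
import numpy as np

def to_real(A, a, s):
    """Stack real and imaginary parts so that the complex system s = A a becomes s_r = H x."""
    H = np.block([[A.real, -A.imag],
                  [A.imag,  A.real]])
    x = np.concatenate([a.real, a.imag])
    s_r = np.concatenate([s.real, s.imag])
    return H, x, s_r

# Small random instance used only to illustrate the equivalence.
rng = np.random.default_rng(0)
A = rng.standard_normal((3, 5)) + 1j * rng.standard_normal((3, 5))
a = rng.standard_normal(5) + 1j * rng.standard_normal(5)
H, x, s_r = to_real(A, a, A @ a)
```

For the stacked vector, the ordinary max-magnitude norm of `x` equals the larger of the peak real and peak imaginary magnitudes of `a`, which is why the real-valued problem can be stated with a plain infinity norm.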

Problem Formulation and Proposed Algorithm
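The problem $(P_\infty)$ in (11) can be rewritten in the constrained form

$$(P_\infty^\delta): \quad \min_{x, v} \; \|x\|_\infty \quad \text{subject to} \quad v = s - Hx, \; \|v\|_2 \le \delta. \tag{12}$$

After introducing the Lagrange multiplier $\lambda \in \mathbb{C}^M$ to enforce the equality constraint $v = s - Hx$, the saddle-point formulation takes the following form:

$$\max_{\lambda} \; \min_{x, v} \; \|x\|_\infty + \chi_\delta(v) + \langle \lambda, \, s - Hx - v \rangle, \tag{13}$$

where the characteristic function $\chi_\delta(v)$, used to remove the constraint $\|v\|_2 \le \delta$, is defined as follows:

$$\chi_\delta(v) = \begin{cases} 0, & \|v\|_2 \le \delta, \\ \infty, & \text{otherwise}. \end{cases} \tag{14}$$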

PDHG-PROXINF Algorithm
The primal-dual hybrid gradient (PDHG) method [24], [28], [29] is an efficient splitting method that works directly with the original saddle-point problem by alternating between primal and dual variables. In our case, the saddle point of the minimax Lagrangian formulation (13) corresponds to a minimizer of $(P_\infty^\delta)$. In order to find the saddle point of (13), we propose a variant of the standard PDHG algorithm, referred to as PDHG-PROXINF, in Algorithm 1.
The parameters $\tau_k$ and $\sigma_k$ are the step sizes of the primal and dual steps, respectively. During any iteration k, PDHG-PROXINF updates both a primal variable (steps 1-2) and a dual variable (steps 4-5) using a combination of forward and backward (or proximal) steps. In steps 1-2, PDHG-PROXINF updates $x \in \mathbb{C}^N$ by first taking a gradient descent step followed by the proximal step involving $\|x\|_\infty$. The dual variable $\lambda \in \mathbb{C}^M$, in turn, is updated through a gradient ascent step (step 4) followed by the backward step (step 5). While the proposed algorithm is nearly as effective with constant, non-adaptive parameters, well-selected adaptive step sizes $\tau_k$, $\sigma_k$ from the recently developed method in [30] can enhance the performance. FITRA [19] requires a manually chosen regularization parameter to balance PAPR reduction against multiuser interference, whereas PDHG-PROXINF is free of this issue.
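The alternating forward and proximal steps described above can be sketched in Python. This is a generic Chambolle-Pock style PDHG iteration under stated assumptions, not a transcription of Algorithm 1: the problem is taken to be real-valued, the auxiliary variable v is eliminated analytically so that the saddle function becomes $\|x\|_\infty + \langle \lambda, s - Hx \rangle - \delta \|\lambda\|_2$, constant step sizes with $\tau \sigma \|H\|_2^2 < 1$ are used, and the $\ell_\infty$ proximal map relies on a sort-based $\ell_1$-ball projection rather than a linear-time procedure. The names `pdhg_linf`, `prox_linf` and `project_l1_ball` are hypothetical.

```python
import numpy as np

def project_l1_ball(w, tau):
    """Euclidean projection of a real vector w onto {x : ||x||_1 <= tau} (sort-based)."""
    if np.abs(w).sum() <= tau:
        return w.copy()
    u = np.sort(np.abs(w))[::-1]           # magnitudes in decreasing order
    css = np.cumsum(u)
    k = np.arange(1, w.size + 1)
    rho = np.nonzero(u * k > css - tau)[0][-1]
    gamma = (css[rho] - tau) / (rho + 1.0)  # soft-threshold level
    return np.sign(w) * np.maximum(np.abs(w) - gamma, 0.0)

def prox_linf(w, tau):
    """prox of tau*||.||_inf via the Moreau relation with the l1-ball projection."""
    return w - project_l1_ball(w, tau)

def pdhg_linf(H, s, delta, iters=2000):
    """Sketch: minimize ||x||_inf subject to ||s - Hx||_2 <= delta (real-valued data)."""
    L = np.linalg.norm(H, 2)                # spectral norm of H
    tau = sigma = 0.95 / L                  # constant steps, tau*sigma*L^2 < 1
    x = np.zeros(H.shape[1])
    lam = np.zeros(H.shape[0])
    for _ in range(iters):
        # Primal: forward (gradient) step on the coupling term, then the l_inf proximal step.
        x_new = prox_linf(x + tau * (H.T @ lam), tau)
        x_bar = 2.0 * x_new - x             # extrapolation used by Chambolle-Pock PDHG
        # Dual: gradient ascent step, then proximal step of delta*||.||_2 (block shrinkage).
        z = lam + sigma * (s - H @ x_bar)
        nz = np.linalg.norm(z)
        lam = z * max(0.0, 1.0 - sigma * delta / nz) if nz > 0 else z
        x = x_new
    return x
```

A quick sanity check is to compare the peak magnitude of the returned vector with that of the least-norm solution `np.linalg.pinv(H) @ s`; the former should not be larger once the iteration has converged.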

Evaluation of the $\ell_\infty$ Proximal Operator
Algorithm 1 initially takes a gradient descent step, followed by the computation of the proximal operator associated with the $\ell_\infty$-norm in step 2. This proximal operator does not have a simple closed-form solution; however, it can be computed explicitly, as explained below.

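The proximal operator $\mathrm{prox}_{\tau \|\cdot\|_\infty}(w)$, defined in step 2, finds a point close to the minimizer of $\|x\|_\infty$ without straying too far from the starting point [31]. The step size $\tau_k$ controls the extent to which the proximal operator maps points towards the minimum of $\|x\|_\infty$. In plain terms, $\mathrm{prox}_{\tau \|\cdot\|_\infty}(w)$ is the function that maps the vector $w \in \mathbb{C}^N$ to the unique solution of

$$\mathrm{prox}_{\tau \|\cdot\|_\infty}(w) = \arg\min_{x} \; \tau \|x\|_\infty + \tfrac{1}{2} \|x - w\|_2^2. \tag{15}$$

The proximal operator and the projection of $w$ onto the closed ball of radius $\tau$ of the dual $\ell_1$-norm satisfy the relation

$$\mathrm{prox}_{\tau \|\cdot\|_\infty}(w) = w - \mathrm{Proj}_{\|\cdot\|_1 \le \tau}(w), \tag{16}$$

where

$$\mathrm{Proj}_{\|\cdot\|_1 \le \tau}(w) = \arg\min_{x} \; \|x - w\|_2^2 \quad \text{subject to} \quad \|x\|_1 \le \tau. \tag{17}$$

Ignoring the trivial case $\|w\|_1 \le \tau$, there exists for each $\tau$ a Lagrange multiplier $\gamma(\tau)$ such that

$$\arg\min_{x} \; \tfrac{1}{2} \|x - w\|_2^2 + \gamma \|x\|_1 \tag{18}$$

shares the same solution as (17).
As shown by Chambolle et al. [32], the solution of the separable convex optimization problem (18) has a closed form, obtained by applying the component-wise soft-thresholding operation $S_\gamma(w)$ to the vector w for a properly chosen value of $\gamma$.
Thus, the proximal operator associated with the $\ell_\infty$-norm in (16) becomes

$$\mathrm{prox}_{\tau \|\cdot\|_\infty}(w) = w - S_\gamma(w), \tag{19}$$

which can be computed by sorting the elements of $w \in \mathbb{C}^N$ in decreasing order [33], followed by component-wise truncation, in $\mathcal{O}(n \log n)$ expected time. An improvement of this algorithm was proposed in [34] that avoids sorting the entire vector w, since only the largest elements of w are involved in the determination of $\gamma$. This heap-sorting based approach reduces the complexity to $\mathcal{O}(\log n)$. We have exploited the linear-time $\mathcal{O}(n)$ median-search-like procedure [35] to compute the proximal operator. The overall algorithm to minimize (15) is detailed in Algorithm 2.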
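As a small worked example of the sort-and-threshold computation (the numbers are chosen here for illustration and do not come from the paper), take the real vector $w = (3, 1, -2)$ and $\tau = 2$, with the proximal map scaled as $\arg\min_x \tau\|x\|_\infty + \tfrac{1}{2}\|x - w\|_2^2$:

```latex
\begin{align*}
&\text{Sorted magnitudes: } (3,\,2,\,1), \qquad \text{cumulative sums: } (3,\,5,\,6).\\
&\text{Largest } j \text{ with } j\,u_j > c_j - \tau:\quad
   1\cdot 3 > 1,\;\; 2\cdot 2 > 3,\;\; 3\cdot 1 \not> 4 \;\Rightarrow\; j = 2.\\
&\gamma = \frac{c_2 - \tau}{2} = \frac{5 - 2}{2} = 1.5.\\
&S_\gamma(w) = (1.5,\; 0,\; -0.5), \qquad \|S_\gamma(w)\|_1 = 2 = \tau.\\
&\mathrm{prox}_{\tau\|\cdot\|_\infty}(w) = w - S_\gamma(w) = (1.5,\; 1,\; -1.5).
\end{align*}
```

The result is simply w clipped at the level 1.5, and the total amount clipped off, (3 - 1.5) + (2 - 1.5) = 2, equals $\tau$, which is the optimality condition for the clipping level.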

Simulation Results
To illustrate the PAPR reduction performance of the PDHG-PROXINF algorithm, simulations are carried out for a massive MIMO scenario with N = 100 transmit antennas and M = 10 single-antenna user terminals. OFDM modulation with U = 128 tones and a spectral map $\varphi$ with 114 data tones for transmission are considered [36]. For the purpose of simulation, coded transmission is employed, i.e., the information bits for each user are encoded by a convolutional encoder (rate 1/2, generator polynomials $[133_o, 171_o]$ and constraint length 7) [37], then interleaved, mapped to a 16-QAM constellation, precoded, and finally transmitted over the assumed frequency-selective channel. A rate-1/2 convolutional code is chosen since it provides significant coding gain and low PAPR values compared to higher coding rates. The channel is modelled as a tap-delay line with D = 4 taps, and the impulse response matrices $\hat{H}_d$, d = 1,…,D, have independent and identically distributed entries drawn from a circularly-symmetric normal distribution. The frequency-domain response $H_u$ on the u-th tone can be expressed as [38]:

$$H_u = \sum_{d=1}^{D} \hat{H}_d \exp\!\left( -\frac{j 2\pi (d-1)(u-1)}{U} \right).$$

After demodulation at each user terminal, a soft-input Viterbi decoder is used to regenerate the transmitted bits.
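The tap-delay-line channel and its per-tone frequency response can be generated as in the following sketch. It assumes unit-variance circularly-symmetric Gaussian taps and a non-normalized DFT across the tap index (the usual convention, with zero-based indices in code); the function names `draw_taps` and `frequency_response` are hypothetical.

```python
import numpy as np

def draw_taps(D, M, N, rng):
    """D impulse-response matrices of size M x N with i.i.d. CN(0, 1) entries."""
    re = rng.standard_normal((D, M, N))
    im = rng.standard_normal((D, M, N))
    return (re + 1j * im) / np.sqrt(2)

def frequency_response(taps, U):
    """Per-tone channel matrices, shape (U, M, N): DFT of the taps, zero-padded to U."""
    return np.fft.fft(taps, n=U, axis=0)
```

For the setup described above one would call `frequency_response(draw_taps(4, 10, 100, rng), 128)`, giving one 10 × 100 channel matrix per tone.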
In the simulations, the probability that the PAPR exceeds a threshold $\mathrm{PAPR}_0$ is used as the performance measure; this is the complementary cumulative distribution function (CCDF) [39].
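An empirical CCDF of this kind can be computed directly from the collected PAPR values. The sketch below is a generic estimator (fraction of samples strictly above each threshold); the function name `empirical_ccdf` and the argument names are hypothetical.

```python
import numpy as np

def empirical_ccdf(papr_values_db, thresholds_db):
    """Fraction of PAPR samples (all antennas, all trials) exceeding each threshold."""
    v = np.asarray(papr_values_db, dtype=float).ravel()
    t = np.asarray(thresholds_db, dtype=float)
    # Compare every sample with every threshold, then average over the samples.
    return (v[None, :] > t[:, None]).mean(axis=1)
```

Passing a grid such as `np.linspace(0, 12, 121)` as thresholds yields the curve that is typically plotted on a logarithmic vertical axis.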
Also, in order to evaluate the out-of-band radiation of the solution, we use the out-of-band (power) ratio (OBR).
To evaluate the performance of the proposed primal-dual hybrid gradient proximal infinity (PDHG-PROXINF) algorithm, we compare it with the fast iterative truncation algorithm (FITRA) [19] and the conventional zero-forcing (ZF) precoding scheme over 1000 independent trials. It is worth mentioning that the time taken per packet transmission, and hence the overall simulation time, is not the same for the considered iterative schemes. The CPU time depends on the tolerance level and the maximum number of iterations. The maximum number of iterations is, however, explicitly stated for each iterative scheme in the ensuing simulation results.
First, we examine the signals estimated by the respective schemes. Figure 2 depicts the real part of the first antenna's time-domain samples (i.e., $\hat{a}_1$) obtained by all considered schemes. For the PDHG-PROXINF and FITRA algorithms, most entries of the solutions are located near a ceiling, which results in a small dynamic range in terms of PAPR. The solution of the ZF scheme shows a large variation with a few high peaks. The simulation results verify that PDHG-PROXINF obtains the lowest PAPR, 1.42 dB, on the first transmit antenna, whereas the ZF and FITRA schemes render higher PAPRs of 9.62 dB and 1.75 dB, respectively.
In order to assess the PAPR reduction performance, we compare in Fig. 3 the CCDF of the PAPR values obtained by the respective schemes in 1000 simulation trials. The PAPR associated with all N transmit antennas is considered while calculating the empirical CCDF. For the system configuration with (N, M) = (32, 4), our proposed algorithm reduces the PAPR by 9.7 dB (at a CCDF of $10^{-3}$) compared to the ZF scheme and by 0.3 dB compared to the FITRA algorithm with 2000 iterations. Meanwhile, PDHG-PROXINF exhibits a much lower OBR (averaged over 1000 independent runs) than the FITRA algorithm: their OBRs are -62.49 dB and -56.51 dB, respectively.
The symbol error rate (SER) performance of each considered scheme is shown in Fig. 4. The average signal-to-noise ratio (SNR) across user terminals is defined as $\|x\|_2^2 / N N_o$, where $N_o$ denotes the noise variance at the receivers. At an SER of $10^{-3}$, we observe that PDHG-PROXINF suffers an SNR loss of about 1.66 dB and 0.38 dB compared to the ZF and FITRA schemes, respectively. This performance loss arises mainly because the norm of the obtained solution increases more significantly. It is also worth noting that ZF precoding attains the least-norm solution.
We now examine the convergence behavior of the FITRA and PDHG-PROXINF algorithms for the system configuration (N, M) = (100, 10). Figures 5(a) and 5(b) show the PAPR and OBR performance of these iterative algorithms as a function of the number of iterations, respectively. We observe in Fig. 5(a) that PDHG-PROXINF clearly yields much faster initial as well as asymptotic convergence than the FITRA algorithm. The PDHG-PROXINF algorithm obtains a PAPR of 2.96 dB within only 500 iterations, while the FITRA algorithm requires about 1550 iterations to achieve the same PAPR reduction performance.
The impact of the transmit antenna configuration and the channel model on the PAPR reduction is investigated in Fig. 6 for a fixed number of user terminals M = 10, where results are averaged over 1000 independent trials. It is evident that increasing the number of transmit antennas from 20 to 100 and the number of non-zero channel taps from 4 to 8 yields improved PAPR performance for both the FITRA and PDHG-PROXINF algorithms. This result is expected because increasing N and the number of channel taps D increases the degrees of freedom (DoF) at the base station. However, PDHG-PROXINF exploits the inherent DoFs more efficiently as the number of transmit antennas and channel taps increases.

Conclusion
We have introduced a variant of the primal-dual hybrid gradient (PDHG) algorithm for joint consideration of PAPR reduction and MUI cancelation in multiuser MIMO-OFDM systems. The associated proximal map is computed using a linear-time projection algorithm. The proposed method exhibits an optimal rate of convergence for the PAPR and OBR metrics in terms of its dependence on the number of iterations. Simulation results show that the proposed algorithm efficiently exploits the large degrees of freedom inherent in large-scale antenna systems to obtain lower PAPRs than the FITRA algorithm [19], while also producing lower out-of-band radiation.

Fig. 1. System model of the MU-MIMO-OFDM downlink scenario, with U OFDM tones, N transmit antennas and M user terminals.