Digital Predistorter Design Using a Reduced Volterra Model to Linearize GaN RF Power Amplifiers

In this paper, a novel method for reducing the size of a Simplified Volterra Series (SVS) model is proposed for the design of GaN RF Power Amplifier (PA) Digital Predistorters (DPD). Using the SVS-modified model, the number of coefficients needed for PA behavioral modeling and predistortion can be reduced by 60% while maintaining acceptable performance. Simulation and implementation tests are performed for a Class AB GaN PA and a Doherty GaN PA using a 20-MHz Long Term Evolution-Advanced (LTE-A) signal. The Adjacent Channel Power Ratio (ACPR) reaches –40 dB and –41 dB for the Doherty and Class AB GaN PAs, respectively. The implementation complexity is also studied, and the results show that the proposed model can linearize the PA using 3% of the Slice LUTs and 87% of the DSP48E1 slices available in the Xilinx Zynq-7000 FPGA.


Introduction
To satisfy the increasing demand for higher data rates, modern communication systems require wideband signals. Given this wideband requirement in Long Term Evolution-Advanced (LTE-A), RF transmitter design becomes more challenging, notably with respect to the behavior of the RF Power Amplifier (PA), which is the major source of nonlinear distortion in RF transmitters [1].
Generally, the PA stage is expected to achieve high power efficiency while meeting the linearity requirements. Thus, in order to enhance the PA efficiency, Digital Predistortion (DPD) has been widely used as one of the most advantageous linearization techniques [1]-[14]. The basic idea of DPD is to produce an inverse mathematical model of the PA nonlinear behavior. This inverse function, called the predistorter function, is then introduced before the PA stage to correct its nonlinearity, so that the final output signal is linear with respect to the original input signal.
Commonly, models with memory, in which the output signal depends on some depth of preceding samples, have been the most widely used, chiefly the conventional Memory Polynomial (MP) model [7]. The Generalized Memory Polynomial (GMP) model adds to the MP basis function a supplementary polynomial function containing cross terms that result from combining the complex input signal with lagging and leading terms [8]. The GMP is mainly suited to multi-carrier signals, where strong nonlinear memory effects are present. The Hybrid Memory Polynomial (HMP) model [9] introduces cross terms by combining the advantages of the MP and the Envelope Memory Polynomial (EMP) models [10]. However, the number of HMP coefficients to be identified is extremely large compared to the other models.
Volterra Series (VS) behavioral models and their reduced forms have gained increased attention in current wireless communication systems such as the LTE-A standard because of their high accuracy compared to memory polynomial models [11], [15], [16]. However, when a wide signal bandwidth is used, the number of coefficients required for PA behavioral characterization and predistortion is extremely large [11]. This design challenge motivates recent research on pruning the proposed models while maintaining acceptable performance. In [12], the order of nonlinearity in each branch of the MP model is manually adjusted. This approach is validated using 3-carrier WCDMA signals, and the results show that the MP model size is reduced by 30% without sacrificing accuracy, but the performance is expected to degrade for wide signal bandwidths. The Sparse Bayesian Learning (SBL) method is used in [13] to extract the coefficients of the PA behavioral model and the parameters of the predistorter inverse function. Test results for the SBL method show a significant model order reduction while meeting the requirements of communication standards. In other works [11], [14], compressed-sensing (CS) theory is applied to an assortment of GMP and MP models to estimate their coefficients and reduce the DPD implementation complexity while maintaining sufficient performance. Test results for a Doherty PA using a 100-MHz LTE-A signal show that the proposed technique reduces the number of coefficients by 90%. However, the processing complexity of CS-based DPD introduces a delay that affects real-time performance.
A Simplified Volterra Series (SVS) model was developed in [11]. First, it takes advantage of an additional complex branch, in which the complex conjugate of the input samples is used to enhance the ability of the DPD to suppress the effects of IQ imbalance [17]. Then, the Dynamic Deviation Reduction (DDR) approach [18] is used to reduce the number of combinations of input samples. Finally, the magnitudes of the input samples help to decrease the overall number of coefficients, as reported in [19]. Even though the SVS model uses simplifications such as the DDR approach to reduce the size of the VS-based model, the number of coefficients needed for PA behavioral characterization is still high.
In this work, a new pruning approach is proposed to reduce the number of coefficients of the SVS model [11] without sacrificing its accuracy. The performance of the proposed model is illustrated by linearizing a Doherty GaN PA and a Class AB GaN PA using a 20-MHz-wide LTE-A signal. An FPGA design is carried out to optimize the implementation architecture of the proposed SVS-based DPD. The Normalized Mean Square Error (NMSE) and the Adjacent Channel Power Ratio (ACPR) metrics are used to compare the performance of the modified model with that of the existing models.
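Since the NMSE is the metric used throughout the pruning steps, a minimal sketch of how it is commonly computed may be helpful. The paper does not spell out its formula, so the usual definition (error energy divided by measured-signal energy, in dB) is assumed here, and the function name `nmse_db` is illustrative only.

```python
import numpy as np


def nmse_db(measured, modeled):
    """Normalized mean square error, in dB, between two complex sample vectors."""
    measured = np.asarray(measured, dtype=complex)
    modeled = np.asarray(modeled, dtype=complex)
    error_power = np.sum(np.abs(measured - modeled) ** 2)
    reference_power = np.sum(np.abs(measured) ** 2)
    return 10.0 * np.log10(error_power / reference_power)


# A model that is 10% off in amplitude everywhere gives an NMSE of about -20 dB.
y_measured = np.ones(4, dtype=complex)
y_modeled = 0.9 * y_measured
print(nmse_db(y_measured, y_modeled))
```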

Proposed New SVS-Modified Model
The SVS model structure, based on the analytical formulation given in the appendix, is illustrated in Fig. 1. The SVS model basis function is given as the combination of 2 × K branches. Branches 1, 2, …, K contain the polynomial functions of y_A(n); correspondingly, Branches 1*, 2*, …, K* contain the conjugate polynomial functions.
As shown in Fig. 2, the number of coefficients needed to obtain the best NMSE is 616 for both PAs. This result corresponds to a nonlinearity order K = 7, a memory depth Q = 4, and a DDR order D = 2, which implies that each of the model outputs y_A(n) and y_B(n) is composed of seven branches.
The first modification that we propose consists of reducing the number of branches used to estimate the output signal. Only a subset of the branches from the SVS model formulation is taken into account. The branches that can be discarded without sacrificing the model accuracy are chosen by calculating the loss of NMSE after the cancellation of each branch.
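The branch-selection idea can be sketched as follows. This is not the authors' SVS implementation: as a simplified stand-in, each "branch" is one nonlinearity order of a memoryless polynomial, the PA data are synthetic, and the names `fit_nmse_db` and `branch_nmse_loss` are hypothetical. For each branch, the model is refitted by least squares without that branch, and the resulting NMSE increase is recorded.

```python
import numpy as np


def nmse_db(y, y_hat):
    return 10.0 * np.log10(np.sum(np.abs(y - y_hat) ** 2) / np.sum(np.abs(y) ** 2))


def fit_nmse_db(phi, y):
    """Least-squares fit of y on the columns of phi, returning the fit NMSE in dB."""
    w, *_ = np.linalg.lstsq(phi, y, rcond=None)
    return nmse_db(y, phi @ w)


def branch_nmse_loss(phi, y, branch_of_column):
    """Map each branch label to the NMSE increase (dB) caused by removing its columns."""
    branch_of_column = np.asarray(branch_of_column)
    full = fit_nmse_db(phi, y)
    losses = {}
    for branch in np.unique(branch_of_column):
        keep = branch_of_column != branch
        losses[int(branch)] = fit_nmse_db(phi[:, keep], y) - full
    return losses


# Synthetic stand-in for PA measurements (assumption, not data from the paper).
rng = np.random.default_rng(0)
x = (rng.standard_normal(2000) + 1j * rng.standard_normal(2000)) / np.sqrt(2)
x /= np.max(np.abs(x))
orders = np.arange(1, 6)  # one "branch" per nonlinearity order 1..5
phi = np.stack([x * np.abs(x) ** (k - 1) for k in orders], axis=1)
noise = 1e-3 * (rng.standard_normal(2000) + 1j * rng.standard_normal(2000))
y = x - 0.2 * x * np.abs(x) ** 2 + noise

losses = branch_nmse_loss(phi, y, orders)
for branch, loss in losses.items():
    print(f"branch {branch}: NMSE loss {loss:.2f} dB")
```

Branches whose removal costs the least NMSE are the natural candidates for elimination; the threshold for an acceptable loss is a design choice.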
The NMSE performance of the SVS model after applying the proposed approach is summarized in Tab. 1. According to the test results reported in Tab. 1, it can be seen that the cancellation of the sixth branches (Branch 6 …
As depicted in Fig. 1, the last branches of the SVS model contain the largest number of coefficients. In our case, where 616 coefficients are used, the seventh branches (Branch 7 + Branch 7*) contain 40% of the coefficients.

Tab. 1. Eliminated branches, number of coefficients, and NMSE (dB).
The second simplification is inspired by the nonuniform memory polynomial model [12] and consists of editing the last branches of the SVS model in order to obtain a reduced-size model while maintaining good accuracy.
Figure 3 shows the block diagram of the K-th branches of the SVS model. They are composed of Q sub-branches, where Q is the memory depth and K is the nonlinearity order of the SVS model. The same nonlinearity order is used for all sub-branches, which causes an overestimation of the model parameters.
The proposed approach consists of sweeping the nonlinearity order in each sub-branch from 1 to K, then evaluating the accuracy of each order using the NMSE as a metric. Consequently, the nonlinearity order of each sub-branch can be defined independently of the others.
In our context, the seventh branches of the SVS model are composed of four sub-branches (Q = 4). The NMSE values for each sub-branch are presented in Fig. 4 as the nonlinearity order is swept from 1 to 7 (K = 7). Accordingly, we can reduce the nonlinearity order of the second sub-branch to 5, of the third sub-branch to 3, and of the fourth sub-branch to 3, noting that this reduction does not lead to a notable degradation in the NMSE.
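The per-sub-branch sweep can be illustrated with a short sketch. It assumes a plain memory-polynomial structure, in which sub-branch q holds the terms x(n − q)|x(n − q)|^(k−1), rather than the actual SVS branch, and uses synthetic data; `basis` and `sweep_sub_branch` are hypothetical helper names. One sub-branch has its order swept from 1 to K while the others stay at K, and the NMSE is recorded for each setting.

```python
import numpy as np


def nmse_db(y, y_hat):
    return 10.0 * np.log10(np.sum(np.abs(y - y_hat) ** 2) / np.sum(np.abs(y) ** 2))


def delayed(x, q):
    """x(n - q), with zeros for the samples before the start of the record."""
    return np.concatenate([np.zeros(q, dtype=complex), x[: len(x) - q]])


def basis(x, orders_per_sub_branch):
    """Regressors where sub-branch q uses orders 1..orders_per_sub_branch[q]."""
    cols = []
    for q, k_max in enumerate(orders_per_sub_branch):
        xq = delayed(x, q)
        cols.extend(xq * np.abs(xq) ** (k - 1) for k in range(1, k_max + 1))
    return np.stack(cols, axis=1)


def sweep_sub_branch(x, y, sub_branch, k_max, q_taps):
    """NMSE (dB) versus the order of one sub-branch, the others being kept at k_max."""
    results = {}
    for k in range(1, k_max + 1):
        orders = [k_max] * q_taps
        orders[sub_branch] = k
        phi = basis(x, orders)
        w, *_ = np.linalg.lstsq(phi, y, rcond=None)
        results[k] = nmse_db(y, phi @ w)
    return results


# Synthetic PA with mild memory (assumption, for illustration only).
rng = np.random.default_rng(0)
x = (rng.standard_normal(2000) + 1j * rng.standard_normal(2000)) / np.sqrt(2)
x /= np.max(np.abs(x))
x1 = delayed(x, 1)
noise = 1e-3 * (rng.standard_normal(2000) + 1j * rng.standard_normal(2000))
y = x - 0.2 * x * np.abs(x) ** 2 + 0.1 * x1 - 0.02 * x1 * np.abs(x1) ** 2 + noise

sweep = sweep_sub_branch(x, y, sub_branch=1, k_max=5, q_taps=3)
for k, value in sweep.items():
    print(f"order {k}: NMSE {value:.2f} dB")
```

The smallest order beyond which the NMSE stops improving noticeably is the one to keep for that sub-branch.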
Table 2 summarizes the number of coefficients and the NMSE performance of the full and the proposed modified SVS models.
The proposed adjustments of the SVS model decrease the overall number of coefficients from 616 (for the full SVS model) to 266, with an NMSE degradation lower than 0.2 dB for the Doherty GaN PA and 0.3 dB for the Class AB GaN PA.
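As a quick arithmetic check on the two model sizes quoted above, the relative reduction in the number of coefficients works out as follows:

```latex
\[
\frac{616 - 266}{616} = \frac{350}{616} \approx 0.568
\]
```

That is, a reduction of about 57%, which corresponds to the rounded figure of 60% quoted elsewhere in the paper. Similarly, 40% of 616 amounts to roughly 246 coefficients in the seventh branches.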

DPD Implementation Design
Based on the PA model defined in Sec. 2, we detail in this section the implementation design of the DPD processing. Figure 6 presents the block diagram of the DPD processing units, where u(n) is the input signal to the predistorter unit, whose output x(n) feeds the PA to produce the output y(n).

DPD Coefficients Identification Processing
Various least squares algorithms have been used to estimate the DPD coefficients. The straightforward method is to formulate this problem using the generic linear system expressed as [20]

$$\tilde{x} = \Phi_y W_y. \tag{1}$$

The vector x̃ of dimension M × 1 is the predistorted input signal as shown in Fig. 6, and M is the total number of samples. Each estimated input sample x̃(n) at an instant n corresponds to a 1 × N vector Φ_y(n) created from the output samples [8]. Φ_y is the M × N matrix that assembles all the Φ_y(n) vectors, where N is the total number of model coefficients. W_y is an N × 1 vector containing the predistorter coefficients. The estimation error is given by

$$e = x - \tilde{x} = x - \Phi_y W_y. \tag{2}$$

To identify the DPD coefficients, we use the least squares (LS) method, which minimizes the squared norm of the residual, ‖e‖_2^2, where ‖·‖_2 stands for the Euclidean norm.
As demonstrated in [8], the predistorter coefficients are defined by

$$\hat{W}_y = (\Phi_y^H \Phi_y)^{-1} \Phi_y^H x, \tag{3}$$

where Ŵ_y is the LS estimate and Φ_y^H denotes the complex conjugate transpose of the matrix Φ_y.
Solving (3) directly requires the inversion of the complex N × N covariance matrix (Φ_y^H Φ_y). That is why the iterative least squares QR (LSQR) algorithm [21] is used in this work.
Based on the bidiagonalization method, LSQR uses a regularization parameter β_LSQR to solve (3) iteratively. At each iteration i, the LSQR algorithm generates a new estimate W_yi in order to minimize the residual norm ‖Φ_y^H x − Φ_y^H Φ_y W_y‖_2.
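The coefficient identification step can be sketched numerically as follows. This is not an LSQR implementation: it uses NumPy's direct dense solver `numpy.linalg.lstsq` instead of iterative bidiagonalization, and it assumes the standard damped least-squares objective ‖Φ_y W − x‖² + β²‖W‖² for the regularized problem, since the paper's exact regularized form is not recoverable from the text. The matrix and vectors below are random stand-ins, and the function names are illustrative.

```python
import numpy as np


def ls_coefficients(phi_y, x):
    """Least-squares solution of phi_y @ w ~ x without forming an explicit inverse."""
    w, *_ = np.linalg.lstsq(phi_y, x, rcond=None)
    return w


def damped_ls_coefficients(phi_y, x, beta):
    """Minimize ||phi_y w - x||^2 + beta^2 ||w||^2 through an augmented system."""
    n_coeffs = phi_y.shape[1]
    phi_aug = np.vstack([phi_y, beta * np.eye(n_coeffs)])
    x_aug = np.concatenate([x, np.zeros(n_coeffs)])
    w, *_ = np.linalg.lstsq(phi_aug, x_aug, rcond=None)
    return w


# Random stand-in for the M x N regression matrix and the target vector.
rng = np.random.default_rng(1)
m_samples, n_coeffs = 200, 5
phi_y = rng.standard_normal((m_samples, n_coeffs)) + 1j * rng.standard_normal((m_samples, n_coeffs))
w_true = rng.standard_normal(n_coeffs) + 1j * rng.standard_normal(n_coeffs)
x = phi_y @ w_true

w_ls = ls_coefficients(phi_y, x)
w_damped = damped_ls_coefficients(phi_y, x, beta=1.0)
print(np.linalg.norm(w_ls - w_true), np.linalg.norm(w_damped))
```

The damping term shrinks the coefficient vector slightly, which helps when the regression matrix is ill-conditioned, as is often the case for high-order polynomial bases.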

Predistorter Processing Implementation Design
The estimated coefficients obtained from the DPD coefficients identification block are copied to the predistorter unit, which uses an identical model to predistort the input signal.
In this work, the full SVS model (616 coefficients) and the SVS-modified one (266 coefficients) have been used for the DPD assessment. A Look-Up Table (LUT) based digital hardware design is considered [22] to avoid the implementation complexity of the large number of multipliers and adders in the polynomial functions of the SVS models. The various power terms can be calculated and stored in advance, and the magnitude of each sample can then be used as the associated address. Accordingly, all the terms related to the same memory depth can be saved in advance in one LUT. With this method, only Q LUTs are employed for each branch, where Q is the model memory depth. Consequently, the number of adders and multipliers decreases, as does the implementation complexity.
In this work, the order of nonlinearity and the memory depth were set to K = 7 and Q = 4, respectively. Thus, only 4 LUTs are employed to implement each branch of the predistorter model.
Figure 7 illustrates the proposed LUT-based implementation architecture, for only one branch, where u(n) is the predistorter complex input signal. The address generator block generates the LUT address from the magnitude of each input sample.
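The LUT principle described above can be illustrated in software. The sketch assumes a memory-polynomial-style branch in which the gain applied to u(n − q) depends only on |u(n − q)|; the table size, the coefficient values, and the function names are hypothetical and are not taken from the paper's Fig. 7 architecture. One table per memory tap stores the precomputed complex gain, and the quantized magnitude serves as the address.

```python
import numpy as np


def delayed(u, q):
    """u(n - q), with zeros for the samples before the start of the record."""
    return np.concatenate([np.zeros(q, dtype=complex), u[: len(u) - q]])


def build_luts(coeffs, lut_size=256, max_mag=1.0):
    """coeffs[q, k] multiplies u(n-q)|u(n-q)|^k; returns a (Q, lut_size) gain table."""
    k_max = coeffs.shape[1]
    bin_centers = (np.arange(lut_size) + 0.5) * max_mag / lut_size
    powers = bin_centers[None, :] ** np.arange(k_max)[:, None]  # row k holds |u|^k
    return coeffs @ powers


def lut_predistort(u, luts, max_mag=1.0):
    """One table lookup and one complex multiply per memory tap."""
    q_taps, lut_size = luts.shape
    out = np.zeros(len(u), dtype=complex)
    for q in range(q_taps):
        uq = delayed(u, q)
        address = np.minimum((np.abs(uq) / max_mag * lut_size).astype(int), lut_size - 1)
        out += uq * luts[q, address]
    return out


def polynomial_predistort(u, coeffs):
    """Direct polynomial evaluation, used here only as a reference."""
    out = np.zeros(len(u), dtype=complex)
    for q in range(coeffs.shape[0]):
        uq = delayed(u, q)
        for k in range(coeffs.shape[1]):
            out += coeffs[q, k] * uq * np.abs(uq) ** k
    return out


# Illustrative coefficients for Q = 3 taps and orders up to 5 (assumed values).
coeffs = np.array([[1.0, 0, -0.1 + 0.05j, 0, 0.02],
                   [0.05, 0, -0.01, 0, 0],
                   [0.01, 0, 0, 0, 0]], dtype=complex)
rng = np.random.default_rng(2)
u = rng.standard_normal(1000) + 1j * rng.standard_normal(1000)
u /= np.max(np.abs(u))

luts = build_luts(coeffs)
max_error = np.max(np.abs(lut_predistort(u, luts) - polynomial_predistort(u, coeffs)))
print(max_error)
```

The residual difference between the two outputs comes from magnitude quantization and shrinks as the table size grows, which is the usual trade-off between memory and accuracy in LUT-based designs.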

Performance Evaluation Results
Matlab-VHDL co-simulation test results are presented and discussed in this section to highlight the linearization performance of the proposed SVS-modified DPD design in comparison with the SVS-CS DPD [11] and the conventional SVS DPD without pruning.

Co-Simulation Set-Up Description
To consider the FPGA-based digital hardware implementation of the DPD, the predistorter VHDL IP is coded, synthesized, and then optimized using the Vivado synthesis tool for the target Xilinx Zynq-7000 SoC FPGA XC7Z020-CLG484-1.
The co-simulation test set-up to evaluate the linearization performance of the proposed DPD design is developed by integrating the following:

• Matlab-based models of the Doherty and Class AB GaN PAs.
• Matlab-based processing of the DPD coefficients identification unit using the LSQR algorithm and experimental measurements from existing PA devices.
• Modelsim-based simulation of the FPGA-based design of the predistorter unit.
A one-carrier LTE-A signal with a 20 MHz channel bandwidth and a baseband sampling frequency of 92.16 MHz is applied to the input of the two PAs. The peak-to-average power ratio (PAPR) value is 7.5 dB.
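For reference, the PAPR of a sampled baseband signal is commonly computed as the ratio of peak to mean instantaneous power. The sketch below assumes that definition; the function name `papr_db` and the test vectors are illustrative and are not taken from the paper.

```python
import numpy as np


def papr_db(x):
    """Peak-to-average power ratio, in dB, of a complex baseband record."""
    power = np.abs(np.asarray(x, dtype=complex)) ** 2
    return 10.0 * np.log10(power.max() / power.mean())


# A constant-envelope signal has a PAPR of 0 dB.
constant_envelope = np.exp(1j * np.linspace(0.0, 2.0 * np.pi, 64))
print(papr_db(constant_envelope))

# A single unit pulse in four samples: peak power 1, mean power 1/4.
pulse = np.array([1.0, 0.0, 0.0, 0.0])
print(papr_db(pulse))
```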

DPD Coefficients Estimation Performances
The DPD coefficients identification block is implemented using the LSQR algorithm. The measured PA input and output vectors are composed of M = 8000 samples. The AM-AM characteristics of the Class AB and Doherty GaN PAs with and without DPD are plotted in Fig. 8.
The results in Tab. 3, in terms of number of coefficients and NMSE, show that the SVS-modified and the conventional SVS models provide similar improvements in linearization performance. This result confirms that the proposed pruning method reduces the DPD complexity with an NMSE degradation lower than 0.3 dB compared to the SVS DPD before pruning. The proposed model also performs better than the SVS-CS one: it achieves an NMSE of -37.13 dB and -41.94 dB for the Doherty and Class AB GaN PAs, respectively, while the SVS-CS model is unable to reach these values using the same number of coefficients.

Predistorter FPGA Design Performances
The reduced complexity of the proposed SVS-modified DPD lowers the logic resources needed for the FPGA implementation of the predistorter unit on the Xilinx Zynq-7000 SoC, as summarized in Tab. 4.
The FPGA implementation complexity analysis shows that the SVS-based predistorter occupies 100% of the DSP48E1 slices, while the proposed SVS-modified one requires only 87.73% of this resource.
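To put the utilization figure in absolute terms, and assuming the 220 DSP48E1 slices listed for the XC7Z020 device in the Xilinx Zynq-7000 product tables (a number not stated in this paper), the percentage translates to:

```latex
\[
0.8773 \times 220 \approx 193 \ \text{DSP48E1 slices}
\]
```

Under the same assumption, the full SVS predistorter at 100% utilization would use all 220 slices, so the pruned model frees roughly 27 of them.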
The spectrum of the PA output signal after applying the proposed DPDs is shown in Fig. 9. This result confirms the improved linearization performance in terms of removing the out-of-band emission. The measured ACPR values of the full and the pruned SVS-based DPDs are given in Tab. 5.
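An ACPR figure of this kind can be estimated from baseband samples by comparing the power in the adjacent channels with the power in the main channel. The sketch below uses a plain FFT periodogram; the 18 MHz integration bandwidth and the 20 MHz channel offset are assumptions for illustration, not values from the paper, and the test signal is a set of synthetic tones rather than an LTE-A waveform.

```python
import numpy as np


def band_power(x, fs, f_center, bandwidth):
    """Sum of periodogram bins within bandwidth/2 of f_center (frequencies in Hz)."""
    spectrum = np.fft.fft(x)
    freqs = np.fft.fftfreq(len(x), d=1.0 / fs)
    in_band = np.abs(freqs - f_center) <= bandwidth / 2
    return np.sum(np.abs(spectrum[in_band]) ** 2)


def acpr_db(x, fs, channel_bw, offset):
    """Lower and upper adjacent channel power relative to the main channel, in dB."""
    main = band_power(x, fs, 0.0, channel_bw)
    lower = band_power(x, fs, -offset, channel_bw)
    upper = band_power(x, fs, +offset, channel_bw)
    return 10.0 * np.log10(lower / main), 10.0 * np.log10(upper / main)


# Synthetic check: one in-band tone plus weak tones at +/- 20 MHz.
fs = 92.16e6
n = np.arange(9216)  # 10 kHz bin spacing, so every tone falls on an FFT bin
x = (np.exp(2j * np.pi * 1e6 * n / fs)
     + 0.01 * np.exp(2j * np.pi * 20e6 * n / fs)
     + 0.001 * np.exp(-2j * np.pi * 20e6 * n / fs))

lower_db, upper_db = acpr_db(x, fs, channel_bw=18e6, offset=20e6)
print(lower_db, upper_db)
```

With real measurements, an averaged spectral estimate over several segments would normally replace the single periodogram used here.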
The proposed SVS-modified DPD provides two crucial advantages. First, the reduced number of coefficients used in PA behavioral modeling leads to a low-complexity implementation of the predistorter processing. Second, the modified model gives similar or even better performance than the existing models (SVS DPD and SVS-CS DPD). Thus, the ACPR is reduced to -40 dB and -41 dB for the Doherty and Class AB GaN PAs, respectively, while the total number of coefficients is reduced by 60%.

Conclusion
The use of wideband signals in modern communication systems requires PA modeling and predistortion with an extremely large number of coefficients. To overcome this limitation, a new approach is proposed to reduce the SVS model size while maintaining sufficient performance. A DPD implementation design is then carried out in order to evaluate the performance of the SVS-modified model in terms of linearization capability.
Matlab and Modelsim co-simulation results obtained for two GaN PAs with a 20-MHz LTE-A input signal show that the proposed approach reduces the number of coefficients by 60% with an NMSE degradation lower than 0.3 dB.
The FPGA implementation agrees with the simulation results and confirms the gain in complexity reduction, since the consumption of logic resources is notably reduced when the DPD based on the modified model is used. Future work will be dedicated to the DSP implementation of the DPD coefficients identification algorithm and to the experimental validation of the proposed DPD.
[Equations of the SVS model outputs y_A(n) and y_B(n)]
where y_svs(n) and x(n) are the complex output and input samples, respectively, x(n)* is the complex conjugate input sample, w_a and w_b are the coefficients of the SVS model, K and Q are the nonlinearity order and the memory depth, respectively, and D represents the DDR order.

Figure 5 shows the accuracy of the proposed SVS-modified model when used to characterize the nonlinear

About the Authors ...
Haithem REZGUI received the B.Eng. degree from the Naval Academy of Menzel Bourguiba, Bizerte, Tunisia, in 2012, and he is currently working toward the Ph.D. degree at the Ecole Supérieure des Communications de Tunis. His research interests include signal processing techniques, digital predistortion of nonlinear power amplifiers, cognitive radio systems and architectures of communication systems.
Fatma ROUISSI received the B.Eng. and the M.Sc.A degrees in Communications from the Ecole Supérieure des Communications de Tunis, Ariana, Tunisia, in 2001 and 2002, respectively. She then received the Ph.D. degree in 2008 in Information and Communication Technology from both the Ecole Supérieure des Communications de Tunis and the Université des Sciences et Technologies de Lille, France. At present, she is an Assistant Professor at the Ecole Nationale d'Ingénieurs de Carthage, Tunisia, and a member of the GRES'COM research Laboratory, Ecole Supérieure des Communications de Tunis. Her current research interests include signal processing, digital communications, architectures of communication systems, and broadband and narrowband PLC system optimization.
Adel GHAZEL received the Electrical Engineer Diploma and the M.Sc.A degree in Systems Analysis and Digital Processing from Ecole Nationale d'Ingénieurs de Tunis (ENIT), Tunis, Tunisia, both in 1990, the Ph.D. degree in Electrical Engineering from ENIT, and the Habilitation degree in ICT from Ecole Supérieure des Communications Sup'Com, Tunisia, in 1996 and 2002, respectively. He is currently a Professor in Telecommunications and the Director of GRESCOM (Green & Smart Communication Systems) Research Lab. at Sup'Com, University of Carthage, Tunisia. He is a visiting professor at the Institute Mines-Telecom in France and is involved as a Senior R&D Consultant, in collaboration with North American industrial partners, in the development of innovative technologies related to wireless communication and intelligent sensors. He started his professional experience in 1990 as a Specialist Engineer for design and field supervision of industrial communication systems. In 1993, he joined the University of Carthage, where he occupied at Sup'Com the position of Head of the Department of Electronics and Propagation from 1999 to 2004 and Dean of Planning from 2005 to 2010. He has supervised over 30 Ph.D. and more than 100 Master's students and participated in more than 50 international R&D projects. His current research interests include Software and Cognitive Radio systems, reconfigurable digital architectures and embedded systems design for energy-efficient heterogeneous communication networks. His research has led to over 350 publications.