Improved Phase Noise Compensation in OFDM Systems

Phase noise (PN) consists of common phase error (CPE) and inter-carrier interference (ICI). Within an OFDM symbol, the CPE affects every subcarrier equally and is therefore easy to suppress. The ICI, however, destroys the orthogonality of the subcarriers and is difficult to eliminate, so the OFDM receiver needs an additional method to compensate for it. Interpolation is considered an effective way to eliminate the ICI caused by PN in OFDM systems. To enhance the accuracy of PN estimation and compensation, we propose a linear method, LI-ICI-EE1, based on LI-ICI-E1. Multiple interpolation slopes are first calculated from multiple pairs of observation samples; the slope with the best linear fit under the least squares (LS) criterion is then selected to improve the linear interpolation (LI) precision. Furthermore, to improve the PN estimation accuracy of LI-ICI-EE1, we propose a shrinkage-based LI-ICI-EE1 method, named SLI-EE1, which adds an $\ell_2$-norm penalty term to the error function. Finally, to address the low accuracy of LI-ICI-EE1 and SLI-EE1 when PN compensation is a high-order problem, we propose a non-linear method, the Shrinkage-based Third-order Lagrange (STL) method. Simulation results show that the improved methods achieve better bit error rate (BER) performance.


Introduction
Orthogonal frequency division multiplexing (OFDM) is a key technology of fifth-generation (5G) communications because of its advantages in high-speed transmission [1]. However, an OFDM system is sensitive to the phase noise (PN) generated by the local oscillators (LOs). PN, which comprises inter-carrier interference (ICI) and common phase error (CPE), can severely degrade the transmission performance of the OFDM system [2].
To reduce the impact of PN by reducing the ICI, the authors of [3] and [4] adopt encoding or a new description of PN based on a new mathematical model. The authors of [3] approximate the PN statistics by a finite number of realizations, i.e., a PN codebook. However, the complexity of that algorithm is determined by the size of the codebook. To lessen the computational burden and delay incurred at the receiver, the authors of [4] estimate PN using only scattered pilot subcarriers, without tentative symbol decisions. To obtain a significant performance improvement at similar complexity, and to outperform the direct local linearization of the observation model in the soft-input extended Kalman smoothing (Soft-in EKS) approach, the authors of [5] design an iterative receiver that performs joint PN estimation, equalization, and decoding by combining belief propagation (BP), mean field (MF), and expectation propagation (EP).
In addition, deep learning (DL) is used in a wide range of physical-layer design problems, including transceiver designs that compensate for PN [6][7][8][9]. The authors of [6] apply a convolutional neural network (CNN) to channel estimation in OFDM systems, modelling the time-frequency (TF) grid of channel coefficients as an image. The estimates obtained at the pilot positions are interpolated in time and frequency and then filtered through a super-resolution network and a denoising network. However, this work does not consider channel estimation in the presence of PN. Several methods have been reported for PN estimation and compensation, e.g., [7][8][9]. To suppress the higher PN encountered in the sub-THz spectrum bands, the authors of [7] proposed a DNN-based PN compensation framework that makes hard decisions on each data subcarrier. The authors of [8] proposed learning-based schemes to estimate PN and decode symbols in doubly selective channels. However, that approach assumes the channel to be static over the subframe duration. The DL methods in [7][8] replace the iterative process of conventional estimators and therefore reduce complexity. To obtain a low-complexity learning-based PN compensation method, the authors of [9] proposed a scheme for channel estimation in time-varying channels and PN compensation in OFDM systems. In that method, a two-dimensional (2D) CNN is employed for effective training and tracking of channel variation in both the frequency and time domains. Using the estimated channel coefficients, a simple and effective PN estimation and compensation scheme is then devised.
The above methods rely on neural networks to solve the PN compensation problem; while they improve the compensation accuracy, they inevitably make hardware implementation more difficult.
To obtain lower complexity without resorting to neural networks, decision-aided iterative ICI mitigation was proposed in [10], which estimates the CPE from pilots and then evaluates the low-order spectral components of the ICI. However, the estimates at the symbol boundaries are inaccurate, since the truncated Fourier series used to approximate the ICI causes severe spectral leakage. To solve this problem, a linear interpolation (LI) based ICI mitigation method, named LI-ICI, was presented in [11]. In addition, the authors of [12] proposed an enhanced LI-based ICI mitigation method, named LI-ICI-E1, with an improved interpolation interval and slope. Unfortunately, the existing LI-based methods use only one pair of PN observation samples to design the interpolation slope in each symbol, which makes them vulnerable to PN and additive white Gaussian noise (AWGN) and limits their performance. As a result, the performance of the existing LI-based methods may degrade significantly in multi-path wireless fading channels. To this end, the authors of [13] applied Kalman filtering to the LI. Furthermore, Lagrange interpolation was employed instead of LI in [14], followed by Kalman filtering, in coherent optical OFDM (CO-OFDM) systems. Projection approximation subspace tracking (PAST) was then introduced in [15] to achieve better performance than the LI-based PN mitigation methods. The remaining problems are that the slope selection in the linear methods is not well founded, and that the accuracy of the non-linear methods still needs to be improved given their high complexity.
In this paper, we propose three ICI mitigation methods to improve the PN compensation performance in OFDM systems. The main contributions of this paper are summarized as follows:
• To optimize the selection of the interpolation slope in linear interpolation methods, we propose the LI-ICI-EE1 method. By comparing the interpolation slopes obtained from different observation samples and using the least squares (LS) criterion to select the slope with the best fit, LI-ICI-EE1 improves the accuracy of PN estimation and compensation.
• To improve the PN estimation accuracy of the traditional LI method, we propose the SLI-EE1 method. By using the shrinkage technique, implemented by adding an $\ell_2$-norm penalty term to the error function, SLI-EE1 reduces the bit error rate (BER) of the OFDM system.
• To address the low accuracy of the LI-ICI-EE1 and SLI-EE1 methods when PN compensation is a high-order problem, we propose a Shrinkage-based Third-order Lagrange method, called STL. By jointly using the shrinkage technique and the third-order Lagrange method, the higher-order PN compensation problem is solved with better BER performance.
The rest of this paper is organized as follows. The system model is introduced in Sec. 2. In Sec. 3, we introduce the proposed improved PN compensation methods. In Sec. 4, simulation results are presented and discussed. Finally, the paper is concluded in Sec. 5.
Notation: Bold uppercase letters denote matrices and bold lowercase letters denote vectors. $\mathbf{A}^{*}$, $\mathbf{A}^{T}$, $\mathbf{A}^{-1}$, and $\mathbf{A}^{H}$ are, respectively, the conjugate, transpose, inverse, and conjugate transpose of $\mathbf{A}$. $\mathbf{A}^{\dagger}$ is defined as $\mathbf{A}^{\dagger} = (\mathbf{A}^{H}\mathbf{A})^{-1}\mathbf{A}^{H}$, which is the Moore-Penrose pseudo-inverse of matrix $\mathbf{A}$.
$[\mathbf{A}]_{i,j}$ denotes the entry in the $j$-th column and the $i$-th row of matrix $\mathbf{A}$, and $[\mathbf{a}]_{i}$ denotes the $i$-th entry of vector $\mathbf{a}$. $\mathrm{diag}(\mathbf{a})$ denotes the diagonal matrix whose diagonal entries are the elements of vector $\mathbf{a}$. $\|\mathbf{A}\|$ is the norm of matrix $\mathbf{A}$.
$\mathbf{I}_{N}$ is the $N$-dimensional identity matrix, the symbol $\otimes$ denotes the Kronecker product, and $\mathrm{vec}(\mathbf{A})$ denotes the column-wise vectorization of matrix $\mathbf{A}$.

System Model
Assuming perfect frequency-offset and timing synchronization, the received samples within the $m$-th OFDM symbol are [10]

$$ r_m(n) = \left[ x_m(n) \otimes h_m(n) \right] e^{j\varphi_m(n)} + w_m(n), \tag{1} $$

where $\otimes$ denotes circular convolution, and $x_m(n)$, $h_m(n)$, $\varphi_m(n)$, and $w_m(n)$ are the transmitted signal, the channel impulse response, the excess phase of the receiver-side frequency synthesizer, and the AWGN, respectively. The AWGN is added after measuring the signal power over the entire desired band, so the subcarriers do not all have the same signal-to-noise ratio (SNR), owing to the frequency selectivity of the channel.
After acquiring the signal, the receiver removes the cyclic prefix (CP) and applies the discrete Fourier transform (DFT) to the data section. The received signal at the $l_k$-th subcarrier of the $m$-th OFDM symbol can then be expressed as

$$ R_m(l_k) = \sum_{l=0,\, l \neq l_k}^{N-1} X_m(l) H_m(l) J_m(l_k - l) + X_m(l_k) H_m(l_k) J_m(0) + W_m(l_k), \tag{2} $$

where $N$ is the number of subcarriers, and $X_m(l)$, $H_m(l)$, and $W_m(l)$ are the frequency-domain expressions of $x_m(n)$, $h_m(n)$, and $w_m(n)$. $J_m(0)$ is the CPE caused by the direct current (DC) component of the PN; it is identical for all the subcarriers in a symbol and can be estimated and eliminated by the decider. The first term in (2) is the ICI, which can be regarded as the weighted sum of the DFT coefficients $J_m(i)$ over all the subcarriers except the $l_k$-th one, where $J_m(i)$ is given by

$$ J_m(i) = \frac{1}{N} \sum_{n=0}^{N-1} e^{j\varphi_m(n)} e^{-j 2\pi n i / N}. \tag{3} $$

According to [10][11][12], the excess phase is assumed to follow a Wiener process, i.e., a running sum of white Gaussian increments, expressed as

$$ \varphi_m(n) = \varphi_m(n-1) + \varepsilon_m(n), \tag{4} $$

where $\varepsilon_m(n)$ is a zero-mean Gaussian variable in the $m$-th OFDM symbol with variance $4\pi \Delta f_{3\mathrm{dB}} T_S$ [12]. Here, $\Delta f_{3\mathrm{dB}}$ denotes the 3-dB bandwidth of the PN, and $T_S$ denotes the sampling period.
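The Wiener model and the split of the PN into a CPE term and ICI terms can be illustrated numerically. The following is a minimal sketch, not the authors' simulator: it assumes the increment variance is $4\pi \Delta f_{3\mathrm{dB}} T_S$, takes $N = 1024$ subcarriers with 15 kHz spacing, and the function names `wiener_phase_noise` and `pn_dft_coefficients` are hypothetical.

```python
import numpy as np

def wiener_phase_noise(num_samples, f3db_hz, ts_s, rng):
    """Discrete Wiener process: running sum of zero-mean Gaussian increments.

    Assumption: each increment has variance 4*pi*f3db*Ts.
    """
    sigma = np.sqrt(4.0 * np.pi * f3db_hz * ts_s)
    return np.cumsum(rng.normal(0.0, sigma, size=num_samples))

def pn_dft_coefficients(phi):
    """J(i) = (1/N) * sum_n exp(j*phi[n]) * exp(-j*2*pi*n*i/N)."""
    return np.fft.fft(np.exp(1j * phi)) / phi.size

rng = np.random.default_rng(0)           # fixed seed, illustrative only
N = 1024                                 # DFT size (number of subcarriers)
Ts = 1.0 / (N * 15e3)                    # sampling period for 15 kHz spacing
phi = wiener_phase_noise(N, 100.0, Ts, rng)
J = pn_dft_coefficients(phi)

cpe = J[0]                               # common phase error term J(0)
ici_power = np.sum(np.abs(J[1:]) ** 2)   # power spread over the other bins
print(abs(cpe) ** 2, ici_power)
```

Because $|e^{j\varphi(n)}| = 1$, Parseval's relation forces the CPE power and the ICI power to sum to one, which is a convenient sanity check for any PN generator.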
Since the PN has a low-pass characteristic [10][11][12], the ICI can be mitigated after CPE removal by estimating only a small part of the PN spectrum, namely the most significant spectral components $J_m(i)$ with $i = -u, \ldots, u$. Accordingly, after removing the CPE obtained from the pilots, equation (2) can be expressed as

$$ \mathbf{R}_{m,q} = \mathbf{A}_m \mathbf{J}_m + \mathbf{W}_m, \tag{5} $$

where $\mathbf{R}_{m,q}$ is the received signal vector on the subcarriers $\{l_1, \ldots, l_q\} \subset \{0, \ldots, N-1\}$ in the $m$-th symbol, $u$ is an empirical value that denotes the size of the truncated ICI, $\mathbf{J}_m = [J_m(-u), \ldots, J_m(u)]^T$, the entries of $\mathbf{A}_m$ are built from $A_m(l) = X_m(l)\hat{H}_m(l)$, and $\mathbf{W}_m$ represents the sum of the AWGN and the residual uncompensated ICI components outside $i = -u, \ldots, u$.

Phase Noise Compensation Methods
In this section, three PN compensation methods are proposed to improve the error performance. To avoid the influence of overfitting, we first propose an LI-based PN estimation method built on [11] and [12], named LI-ICI-EE1. Then, to improve the PN estimation accuracy of the traditional LI method, we propose a shrinkage-based LI-ICI-EE1 method, named SLI-EE1. Finally, we propose a Shrinkage-based Third-order Lagrange method, named STL, to make up for the shortcomings of the first two methods when PN compensation is a higher-order problem and to further improve the robustness and accuracy of the compensation.

Conventional PN Compensation Methods
To eliminate the ICI, the LS-based estimate for (5) is obtained as $\hat{\mathbf{J}}_{\mathrm{LS},m} = \mathbf{A}_m^{\dagger} \mathbf{R}_{m,q}$. To avoid the AWGN amplification caused by the LS-based estimation, the minimum mean squared error (MMSE) based estimate is used instead. It depends on $\sigma^2$, the variance of $\mathbf{W}_m$, and on the covariance matrix of $\hat{\mathbf{J}}_{\mathrm{LS},m}$, which is usually known in the communication system or can be calculated using the PAST method [15].
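The LS step can be sketched on synthetic data. This is a hedged illustration only: it assumes the truncated model has the circular-convolution form $R(l_k) \approx \sum_{i=-u}^{u} J(i) A(l_k - i)$, as in decision-aided schemes such as [10], and the paper's exact matrix layout may differ. The helper names `build_ici_matrix` and `ls_estimate`, the sizes, and the random stand-in data are all hypothetical.

```python
import numpy as np

def build_ici_matrix(a, subcarriers, u):
    """Row k holds a[(l_k - i) mod N] for i = -u..u (circular indexing)."""
    offsets = np.arange(-u, u + 1)
    return a[(subcarriers[:, None] - offsets[None, :]) % a.size]

def ls_estimate(A, r):
    """Least-squares solution of A @ J = r via the pseudo-inverse."""
    return np.linalg.pinv(A) @ r

rng = np.random.default_rng(1)
N, u, q = 64, 3, 50                       # small illustrative sizes
# Stand-in for A(l) = X(l) * H_hat(l); random values, not real symbols.
a = rng.normal(size=N) + 1j * rng.normal(size=N)
J_true = rng.normal(size=2 * u + 1) + 1j * rng.normal(size=2 * u + 1)

subcarriers = np.arange(q)                # the q observed subcarriers
A = build_ici_matrix(a, subcarriers, u)   # q equations, 2u+1 unknowns
r = A @ J_true                            # noise-free observations
J_hat = ls_estimate(A, r)
print(np.max(np.abs(J_hat - J_true)))
```

With $q = 50$ equations and $2u + 1 = 7$ unknowns the system is overdetermined, so the noise-free LS solution recovers the coefficients; with noise, the pseudo-inverse amplifies it, which is the motivation for the MMSE variant.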
The time-domain PN of the data section must be estimated by some method, e.g., linear interpolation. After the time-domain PN sequence $\hat{\varphi}_m(n)$ is obtained, a DFT is applied to calculate its frequency-domain form. At last, the ICI is compensated using $\mathbf{R}_{m,q}$ and $\mathbf{U}$. To improve the estimation accuracy of the PN on the boundaries between adjacent OFDM symbols after the IDFT, the LI-ICI method [11] linearly interpolates $\hat{\varphi}$ over $2\rho N$ samples across adjacent symbols, with the corresponding LI slope given in (8), where $\rho$ is a fraction that denotes the ratio of the number of interpolated samples to the number of subcarriers in one symbol.
An enhanced LI-based method, denoted LI-ICI-E1 [12], changes the interpolation interval from $2\rho N$ samples to $2(\rho + \beta)N + 1$ samples, to make the method more suitable for systems with a large CP. The LI slope is then given in (9), where $\beta$ is also a fraction, representing the ratio of the CP length to the number of subcarriers in one symbol. Figure 1 illustrates the overfitting phenomenon: one problem of the traditional LI-based methods is that they may overfit the PN curve, fitting the AWGN as well as the PN.
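At its core, each LI variant connects two PN observations on either side of a symbol boundary by a straight line. The sketch below shows only that generic step; the sample indices and phase values are hypothetical, and the specific endpoints and interval lengths used by LI-ICI and LI-ICI-E1 are not reproduced here.

```python
import numpy as np

def linear_bridge(n_left, phi_left, n_right, phi_right):
    """Return the slope and the interpolated phase between two observations."""
    slope = (phi_right - phi_left) / (n_right - n_left)
    n = np.arange(n_left, n_right + 1)
    return slope, phi_left + slope * (n - n_left)

# Hypothetical observations: phase 0.10 rad at sample 960 of one symbol and
# 0.35 rad at sample 1087 (i.e., inside the next symbol).
slope, bridge = linear_bridge(960, 0.10, 1087, 0.35)
print(slope, bridge.size)
```

Because the slope is computed from a single pair of noisy observations, any AWGN on either endpoint tilts the whole line, which is the weakness the multi-slope selection is meant to address.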
Since the AWGN is zero-mean and independent, averaging over multiple pairs of PN observations suppresses it: as the number of pairs grows, the average of the noisy observations converges to the expectation of the PN itself. The traditional LI methods do not satisfy this property, as they are not continuous functions in the time domain. The authors of [14][15] recognized this property of PN and designed non-linear functions to approximate the PN, thereby improving the BER.

The Proposed LI-ICI-EE1 PN Compensation Method
We propose a new way to obtain the LI slope. We first calculate multiple interpolation slopes from multiple different pairs of observation samples, and then choose the one with the highest degree of linear fit as the final interpolation slope. In this way, the interpolation fits the real PN curve as closely as possible.
Specifically, the $i$-th slope $k_{\mathrm{EE1},i}$, obtained from the $i$-th pair of observation samples between the symbols with indexes $m$ and $m+1$, is given in (10), where $\rho$ and $\beta$ are both fractions: $\rho$ represents the length of the truncated window on the edge of one symbol and $\beta$ denotes the CP length (e.g., $\beta = 1/16$ in the long term evolution (LTE) system); $\Delta$ is the variable step size and is a positive integer; $\rho_i = \rho + i\Delta/N$ is a fraction denoting the ratio of the number of interpolated samples to the number of subcarriers in one symbol, with $\rho_i < 1$ for $i = 1, 2, \ldots, M$; and $M$ is the number of slopes. The interpolation interval in the proposed method is therefore $(2\rho_i + \beta)N$.
Next, we determine the most reasonable slope among the $M$ obtained slopes. Based on the LS criterion, we calculate for each slope the sum of squared differences between the $i$-th slope and the other $M-1$ slopes, and select the slope with the minimum sum:

$$ k_{\mathrm{EE1}} = \arg\min_{k_{\mathrm{EE1},i}} \sum_{j=1,\, j \neq i}^{M} \left( k_{\mathrm{EE1},i} - k_{\mathrm{EE1},j} \right)^2. \tag{11} $$

In (11), the computational complexity is relatively high: it is proportional to $O(M^2)$ and specifically requires $M(M-1)$ squaring operations. To simplify (11), we use the properties of LS, treat the slope as a continuous variable $k$, and calculate the derivative of the cost:

$$ \frac{\partial}{\partial k} \sum_{j=1}^{M} \left( k - k_{\mathrm{EE1},j} \right)^2 = 2 \sum_{j=1}^{M} \left( k - k_{\mathrm{EE1},j} \right). \tag{12} $$

Then, to find the optimal slope $k_{\mathrm{EE1,opt}}$, i.e., the one that makes (11) theoretically minimal, we set (12) equal to zero and obtain

$$ k_{\mathrm{EE1,opt}} = \frac{1}{M} \sum_{j=1}^{M} k_{\mathrm{EE1},j}. \tag{13} $$

When the AWGN fluctuates strongly, $k_{\mathrm{EE1,opt}}$ in (13) is still not robust enough, as the energy of the AWGN may affect it as well. Moreover, the line with slope $k_{\mathrm{EE1,opt}}$ in (13) may not pass through any PN sample. The flowchart of the LI-ICI-EE1 method is shown in Algorithm 1, and we use a two-step procedure to achieve an equivalent LS operation.
Firstly, (12) and (13) show that $k_{\mathrm{EE1,opt}}$ equals the average of the slopes, which smooths the Gaussian noise. However, since only slopes that correspond to an integer interpolation interval can actually be realized, we should find the realizable slope closest to the ideal one, to guarantee the interpolation performance for that interval.
Secondly, we search for the slope closest to $k_{\mathrm{EE1,opt}}$, which can be expressed as

$$ k_{\mathrm{EE1,final}} = \arg\min_{k_{\mathrm{EE1},i}} \left| k_{\mathrm{EE1},i} - k_{\mathrm{EE1,opt}} \right|. \tag{14} $$

Note that the complexity of (13) and (14) is reduced from $O(M^2)$ to $O(M)$ compared with (11), i.e., the proposed approach contains only one division operation.
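The equivalence between the pairwise LS search and the two-step "average, then nearest" procedure can be checked numerically, since $\sum_j (k_i - k_j)^2 = M (k_i - \bar{k})^2 + \text{const}$. The following sketch uses hypothetical slope values and hypothetical function names; it illustrates the selection rule only, not how the candidate slopes are measured.

```python
import numpy as np

def select_slope_bruteforce(slopes):
    """O(M^2): index of the slope with the smallest sum of squared differences."""
    k = np.asarray(slopes, dtype=float)
    cost = ((k[:, None] - k[None, :]) ** 2).sum(axis=1)
    return int(np.argmin(cost))

def select_slope_fast(slopes):
    """O(M): average the slopes, then pick the candidate closest to the average."""
    k = np.asarray(slopes, dtype=float)
    k_opt = k.mean()
    return k_opt, int(np.argmin(np.abs(k - k_opt)))

# Hypothetical candidate slopes; the third one is an outlier hit by noise.
slopes = [0.10, 0.12, 0.30, 0.11, 0.09]
k_opt, idx = select_slope_fast(slopes)
print(k_opt, idx, select_slope_bruteforce(slopes))
```

Both routines select the same candidate, but the fast version needs only one pass over the slopes and a single division.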
Based on the slope in (14), the interpolation can be performed. Here, the interpolation interval we adopt is $(2\rho_{\mathrm{final}} + \beta)N$ with $\rho_{\mathrm{final}} = \rho + i_{\mathrm{final}}\Delta/N$, where $i_{\mathrm{final}}$ is the index of the slope selected in (14), i.e., $i_{\mathrm{final}} = \arg\min_{i} | k_{\mathrm{EE1},i} - k_{\mathrm{EE1,opt}} |$. When linear interpolation is used, the system performance is mainly determined by the number of truncated ICI components $u$, the number of equations $q$, and the accuracy of the interpolation compensation. Linear interpolation has been shown to be an effective method for solving the PN problem in OFDM systems [11][12]. In [12], a figure shows the PN distribution on the edge of OFDM symbols and the different slopes obtained by different methods for PN interpolation; it is not repeated in this paper for simplicity. From the previous work, we know that both the LI-ICI [11] and LI-ICI-E1 [12] methods interpolate using a slope calculated from only one pair of observation samples and one interval selected randomly within an empirical range. However, the interpolation precision is then affected by the AWGN and by the fluctuation and nonlinearity of the PN.
In the proposed LI-ICI-EE1 method, the interpolation slope is obtained from a set of observations and intervals based on the LS principle, and is therefore the optimal slope for linear interpolation in the LS sense. The interpolation is more robust and adaptive to the PN fluctuation and the AWGN. The MSE between the real $N \times 1$ PN vector $\boldsymbol{\varphi}$ and the approximate fitting result $\hat{\boldsymbol{\varphi}}$, i.e., $E[(\boldsymbol{\varphi} - \hat{\boldsymbol{\varphi}})^{H}(\boldsymbol{\varphi} - \hat{\boldsymbol{\varphi}})]$, will be lower than that of the existing methods.
The performance of the proposed LI-ICI-EE1 method depends on the parameters $\rho$, $M$, and $\Delta$, which should be set properly. According to [11] and [12], $\rho$ should typically be set between 6% and 12%, in which case the system performance can be guaranteed. As the CP parameter $\beta$ is fixed by the system itself, the performance improvement brought by the proposed method is mainly reflected in $M$ and $\Delta$. Therefore, on the premise that $M$ and $\Delta$ are set so that $\rho_i = \rho + i\Delta/N$ lies between 6% and 12%, the following conclusions can be drawn. When $\Delta$ is fixed, the larger $M$ is, the more accurate the fit, because the AWGN and the fluctuation of the PN are smoothed more effectively. When $M$ is fixed, the performance also improves as $\Delta$ increases, because the fit better reflects the fluctuation of the PN.

The Proposed Shrinkage-Based LI-ICI-EE1 Method
To enhance the accuracy of PN estimation and compensation, we use the shrinkage technique from the field of machine learning, adding an $\ell_2$-norm penalty term to the error function.
The proposed LI-ICI-EE1 method can be rewritten in the following form:

$$ \mathbf{Q}\mathbf{A} = \boldsymbol{\varphi}, \tag{16} $$

where $\mathbf{A}$ contains the polynomial coefficients, $\mathbf{Q}$ is the coefficient matrix, whose dimension depends on the order of the polynomial and the number of PN samples, $n$ and $n+1$ are the subscripts indicating the positions of $\hat{\varphi}_{n}$ and $\hat{\varphi}_{n+1}$, and $\boldsymbol{\varphi}$ is a vector containing the PN samples.
In the proposed LI-ICI-EE1 method, if the coefficient matrix is highly ill-conditioned, overfitting will occur. In particular, if the order of the polynomial is less than the number of sample points, there will be multiple eigenvalues describing the same interpolation function. If these eigenvalues are relatively close, the fitted curve may pass through essentially all the sample points; but if they are far apart, the final fitted curve may deviate seriously from some sample values.
The shrinkage technique is used to avoid overfitting in the field of machine learning. The most popular shrinkage techniques are the $\ell_2$-norm-based ridge regression and the $\ell_1$-norm-based least absolute shrinkage and selection operator (LASSO) regression. Since the LASSO method may not converge, in this paper we only consider ridge regression. The $\ell_2$-norm-based feature reduction adds an $\ell_2$ penalty $\|\mathbf{A}\|_2^2$ to the error function $E(\mathbf{A})$:

$$ E_{\lambda}(\mathbf{A}) = \left\| \boldsymbol{\varphi} - \mathbf{Q}\mathbf{A} \right\|_2^2 + \lambda \left\| \mathbf{A} \right\|_2^2, \tag{17} $$

where $\lambda$ is the penalty coefficient, which is undetermined in most communication scenarios. In [16], cross-validation is applied to calculate the value of $\lambda$. As cross-validation requires machine learning techniques that are beyond the scope of this paper, we only apply the ridge regression method to estimate $\mathbf{A}$. Taking the partial derivative of (17) with respect to $\mathbf{A}$ and setting it to zero gives

$$ \hat{\mathbf{A}} = \left( \mathbf{Q}^{H}\mathbf{Q} + \lambda \mathbf{I} \right)^{-1} \mathbf{Q}^{H} \boldsymbol{\varphi}. \tag{18} $$

To reduce the performance deterioration caused by overfitting, we propose a shrinkage-based LI-ICI-EE1 method, named SLI-EE1, to compensate the PN. The flowchart of the SLI-EE1 method is shown in Algorithm 2, and the processing steps of SLI-EE1 are as follows.
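The closed-form ridge solution can be sketched in a few lines. This is an illustration under stated assumptions rather than the paper's Algorithm 2: the coefficient matrix is taken to be a first-order (linear) design matrix $[1, t]$, the PN samples are synthetic, `ridge_fit` is a hypothetical helper name, and the penalty value 0.035 is simply borrowed from the tuning result quoted in the simulation section.

```python
import numpy as np

def ridge_fit(Q, phi, lam):
    """A_hat = (Q^H Q + lam * I)^{-1} Q^H phi, solved without an explicit inverse."""
    QhQ = Q.conj().T @ Q
    return np.linalg.solve(QhQ + lam * np.eye(Q.shape[1]), Q.conj().T @ phi)

rng = np.random.default_rng(2)
t = np.linspace(0.0, 1.0, 20)                  # normalized sample positions
Q = np.column_stack([np.ones_like(t), t])      # first-order polynomial model
phi = 0.05 + 0.4 * t + 0.02 * rng.normal(size=t.size)   # synthetic noisy PN samples

A_ls = ridge_fit(Q, phi, 0.0)                  # lam = 0 reduces to plain LS
A_ridge = ridge_fit(Q, phi, 0.035)             # shrunk coefficients
print(A_ls, A_ridge)
```

Setting the penalty to zero recovers the ordinary LS fit, while any positive penalty shrinks the coefficient norm, trading a small bias for reduced sensitivity to noisy samples.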
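Firstly, we select $\hat{p}$ PN samples to form the vector $\boldsymbol{\varphi} = [\hat{\varphi}_{n}, \hat{\varphi}_{n+1}, \ldots, \hat{\varphi}_{n+\hat{p}}]^{T}$. Secondly, we con- Thirdly, we divide the value interval of the penalty coefficient $\lambda$ and substitute a value into equation (18) to obtain the estimate $\hat{\mathbf{A}}$. Fourthly, letting $\hat{\mathbf{A}}'$ denote the estimate obtained in the previous calculation, we calculate $\|\hat{\mathbf{A}} - \hat{\mathbf{A}}'\|_2^2$ and check whether $\|\hat{\mathbf{A}} - \hat{\mathbf{A}}'\|_2^2 < \varepsilon$ holds. If it does, the current value of $\hat{\mathbf{A}}$ is output; otherwise, we return to the third step and select another value of $\lambda$, until $\|\hat{\mathbf{A}} - \hat{\mathbf{A}}'\|_2^2 < \varepsilon$ is satisfied. Lastly, after $\hat{\mathbf{A}}$ is calculated, the time-domain PN of the data block in every OFDM symbol is estimated; after the time-domain sequence is transformed into the frequency domain, $\mathbf{R}_{m,q}$ and $\mathbf{U}$ are used to perform the ICI compensation.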
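The penalty sweep with its stopping rule can be sketched as a simple loop. This is one possible reading of the steps above, not the authors' Algorithm 2: the penalty grid, the threshold `eps`, the linear design matrix, and the helper names `ridge_fit` and `sweep_penalty` are all assumptions made for illustration.

```python
import numpy as np

def ridge_fit(Q, phi, lam):
    """Ridge estimate (Q^H Q + lam * I)^{-1} Q^H phi."""
    QhQ = Q.conj().T @ Q
    return np.linalg.solve(QhQ + lam * np.eye(Q.shape[1]), Q.conj().T @ phi)

def sweep_penalty(Q, phi, lambdas, eps):
    """Step through the penalty grid until successive estimates differ by < eps."""
    prev = None
    for lam in lambdas:
        cur = ridge_fit(Q, phi, lam)
        if prev is not None and np.sum(np.abs(cur - prev) ** 2) < eps:
            return lam, cur                 # estimate has stabilized
        prev = cur
    return lambdas[-1], prev                # grid exhausted without convergence

t = np.linspace(0.0, 1.0, 20)               # normalized sample positions
Q = np.column_stack([np.ones_like(t), t])   # assumed first-order model
phi = 0.05 + 0.4 * t                        # synthetic, noise-free PN samples
lambdas = np.linspace(0.0, 0.5, 51)         # hypothetical penalty grid
lam_sel, A_sel = sweep_penalty(Q, phi, lambdas, eps=1e-4)
print(lam_sel, A_sel)
```

The number of grid points visited before the threshold is met plays the role of the iteration count that appears in the complexity analysis.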

The Proposed Shrinkage-Based Third-Order Lagrange Method
The LI-ICI-EE1 and SLI-EE1 methods can be applied directly in most communication scenarios, as the PN is quasi-static to some extent. However, when PN compensation is a high-order problem, these two methods are inefficient. To improve the PN compensation accuracy, we propose in this subsection a non-linear PN compensation method called the Shrinkage-based Third-order Lagrange (STL) method.
The non-linear compensation methods can also be modeled in the form of (16). The second-order Lagrange method in [14] can be modeled as in (19), where $\mathbf{A}$ contains the polynomial coefficients, $\mathbf{Q}$ is the coefficient matrix, and $\boldsymbol{\varphi}$ is a vector containing the PN samples.
To improve both the BER performance and the robustness of the communication system, we combine the shrinkage technique with the third-order Lagrange method and propose the STL method, which can be modeled as in (20), where $\mathbf{A}$ contains the polynomial coefficients, $\mathbf{Q}$ is the coefficient matrix, and $\boldsymbol{\varphi}$ is a vector containing the PN samples. Adding an $\ell_2$-norm penalty term to (20) yields the shrinkage-based estimate (21). We use $\mathbf{R}_{m,q}$ and $\mathbf{U}$ to perform the ICI compensation, and the flowchart of the STL method is shown in Algorithm 3.
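A third-order fit with an $\ell_2$ penalty can be sketched by reusing the ridge closed form with a cubic design matrix. This is an assumption-laden illustration: it models the third-order Lagrange step as a cubic polynomial in a normalized time variable (a Vandermonde matrix), which may differ from the paper's exact construction of $\mathbf{Q}$, and the function names, sample positions, and coefficients are hypothetical.

```python
import numpy as np

def cubic_ridge_fit(t, phi, lam):
    """Fit phi(t) ~ a0 + a1*t + a2*t^2 + a3*t^3 with an l2 penalty on the a's."""
    Q = np.vander(t, 4, increasing=True)           # columns: 1, t, t^2, t^3
    return np.linalg.solve(Q.T @ Q + lam * np.eye(4), Q.T @ phi)

def evaluate_cubic(A, t):
    """Evaluate the fitted cubic at the positions t."""
    return np.vander(t, 4, increasing=True) @ A

t_obs = np.linspace(0.0, 1.0, 8)                   # observation positions
true_A = np.array([0.1, -0.3, 0.5, 0.2])           # hypothetical PN trajectory
phi_obs = evaluate_cubic(true_A, t_obs)            # noise-free observations

A0 = cubic_ridge_fit(t_obs, phi_obs, 0.0)          # no shrinkage: exact recovery
A_shrunk = cubic_ridge_fit(t_obs, phi_obs, 0.035)  # with shrinkage
phi_data = evaluate_cubic(A_shrunk, np.linspace(0.0, 1.0, 32))
print(A0, A_shrunk)
```

Compared with the linear model, the cubic model can follow curvature in the PN trajectory, and the penalty limits the oscillation that a high-order fit would otherwise show when the observations are noisy.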
The non-linear compensation methods greatly improve the estimation accuracy of the PN and reduce the BER of the OFDM system. However, all the non-linear compensation methods use LS estimation to solve for the unknowns, so that equation (18) yields the LS solution. Although (18) minimizes the sum of squared errors, it cannot guarantee that the resulting fitted curve is consistent with the trend of the original sample points. Generally speaking, the PN sample points are scattered approximately around a fitting curve. Owing to channel noise and other physical factors, however, some sample points inevitably deviate seriously from this curve. Although the final fitted curve may pass through, or approximately through, all the sample points, it will fluctuate strongly and be obviously inconsistent with the trend of the sample points. This is the overfitting phenomenon known from the field of machine learning.

The Complexity Analysis
We analyze the computational complexity by counting the multiplications in each formula of the proposed methods when each method calculates the slope of the estimated phase. The LI-ICI method in (8) performs 3 multiplications, and the LI-ICI-E1 method in (9) also performs 3 multiplications. The proposed LI-ICI-EE1 method performs 3 multiplications in (10) and $(M-1)^2$ multiplications in (11). For the proposed SLI-EE1 method, i.e., equation (16), we count each row-column multiplication between matrices as one multiplication; equation (16) then performs $O(2T)$ multiplications, where $T$ represents the number of iterations of the proposed method. The proposed STL method in (20) performs $O(3T)$ multiplications. Since the LI-ICI-EE1 method needs to obtain an optimal slope, it requires $(M-1)^2$ more multiplications than the LI-ICI method. In addition, the complexity of the SLI-EE1 and STL methods is determined not only by the number of multiplications in a single pass but also by the threshold for stopping the iteration, so we introduce $T$ to represent the number of iterations, which allows a better analysis of the complexity of each method. A detailed comparison of the number of multiplications required by each method is shown in Tab. 1.

Simulation
To investigate the performance of the proposed methods, we build an OFDM system whose parameters are similar to those of the trial held by the international mobile telecommunications (IMT)-2020 China group in Chengdu, China, in 2016 [17]: 64QAM, $\beta = 25\%$, and $\rho = 6\%$. The total number of channel paths is 6, the number of subcarriers is 1024, and the subcarrier spacing is 15 kHz. The number of equations $q$ in (5) is 50 and the number of iterations is 6. The detailed parameters are shown in Tab. 2.
Initially, the channel frequency response (CFR) in (1) and (2) is unknown, and we use training sequences occupying 64 OFDM time slots to estimate it. As the training sequence uses all the OFDM subcarriers to transmit known signals, the PN approximately follows a Gaussian distribution and can thus be incorporated into the channel noise, so that (1) becomes a standard channel estimation model.
Figure 2 shows the system BER versus $u$ for the different methods, where $\Delta f_{3\mathrm{dB}} = 100$ Hz, $\Delta = 6$, $M = 5$, and SNR = 27 dB. When $u$ increases from 1 to 5, the BER of each method decreases and reaches its lowest point at $u = 5$. When $u$ increases from 5 to 9, the BER gradually increases. The reason is that increasing $u$ improves the reconstruction result but also reduces its averaging effect, since there are $q$ equations and $2u + 1$ unknowns in (5). For $u > 5$, the number of unknowns in (5) keeps growing with $u$, which further increases the uncertainty of (5), so the BER performance of every method, including the PN compensation methods proposed in this paper, degrades. The system is almost underdetermined when $u$ is larger than $(q-1)/2$, so further increasing $u$ leads to a loss in PN recovery. Figure 3 shows the system BER versus SNR of LI-ICI-EE1 and SLI-EE1 for different step sizes $\Delta$ and numbers of observation slopes $M$, where $\Delta f_{3\mathrm{dB}} = 100$ Hz and $u = 9$. All values of $\Delta$ and $M$ are set so that $\rho_i = \rho + i\Delta/N$ lies between 6% and 12%, to guarantee the system performance. Figure 3 shows that the BER performance with $M\Delta = 60$ is significantly better than with $M\Delta = 30$, for the proposed LI-ICI-EE1 and SLI-EE1 methods as well as for the other methods.
As shown in Fig. 3, when $\Delta$ is fixed, the BER is lower for the system with larger $M$. When the searching range $M\Delta$ is fixed, the method with larger $\Delta$ performs slightly worse but has correspondingly lower computational complexity. Considering that the frequency-domain estimation of the PN inevitably suffers from energy leakage, we set $u = 9$ to make the PN estimation result more robust.
To show clearly the relationship between the system performance, the number of samples $M$, and the step size $\Delta$, detailed simulation data are given in Tab. 3 ($M\Delta \le 60$), where $\Delta f_{3\mathrm{dB}} = 100$ Hz, $u = 9$, and SNR = 27 dB. When $\Delta$ is fixed, the BER decreases as $M$ increases. When $M$ is fixed, the performance of the method improves as $\Delta$ increases. When the searching range $M\Delta$ is fixed, e.g., $M\Delta = 10, 30, 60$, the method with larger $\Delta$ performs slightly worse than the one with larger $M$, and correspondingly has lower computational complexity. The above analysis of LI-ICI-EE1 also holds for SLI-EE1.
Figure 4 shows the system BER versus SNR for different values of $\lambda$ in the SLI-EE1 method. When $\lambda$ changes from 0 to 0.5, the system BER performance tends to improve. This is because the shrinkage technique significantly alleviates the overfitting caused by matrix inversion when compensating for the PN and keeps the final interpolation result from deviating excessively from the original curve. However, when $\lambda$ changes from 0.5 to 1, the BER performance of the system degrades. This is because, although shrinkage reduces the ill-conditioning of the matrix to be inverted, it does so at the expense of the MSE performance of the system. If $\lambda$ is too large, the penalty term keeps growing even though each entry in $\hat{\mathbf{A}}$ has already stabilized, while the MSE between the interpolation curve and the fitting curve increases, which makes the interpolation result deviate from the phase noise curve. Therefore, the value of $\lambda$ can be neither too small nor too large. According to the tuning result, the ideal value of $\lambda$ is 0.035. For SLI-EE1, the ideal value of $\lambda$ is slightly lower than 0.035; here we assume that $\lambda$ is also 0.035 for SLI-EE1.
Figure 5 shows the system BER of the different compensation methods versus SNR. Compared with the traditional LI method in [8], the proposed LI-ICI-EE1 method smooths the channel noise and thus achieves a lower BER. However, compared with the shrinkage-based methods, the BER performance in [8] is limited and hard to improve further. The proposed SLI-EE1 method therefore introduces shrinkage to improve the PN compensation accuracy. However, neither the LI-ICI-EE1 method nor the SLI-EE1 method accounts for the case in which PN compensation is a high-order problem, so their BER performance still has room for improvement. Therefore, we further propose the STL method, which improves the BER performance.
The above simulation experiments show that the three proposed ICI mitigation methods achieve higher PN compensation accuracy than the other existing methods. Applying them to existing OFDM communication systems will greatly reduce the effect on channel estimation of the PN caused by the receiver-side frequency synthesizer. They can also be used to improve communication performance in future 5G systems with larger antenna arrays. On the other hand, to implement the proposed PN compensation methods in practice, hardware complexity must be taken into account carefully, together with power consumption and other hardware implementation issues. In addition, it is worth considering whether a common feature, such as the common spatial feature [18], can be used to optimize the algorithm for the PN compensation problem in channel estimation; this needs further study.

Conclusion
In this paper, to improve the accuracy of PN compensation in OFDM systems, we proposed three methods: an improved LI-based ICI mitigation method, named LI-ICI-EE1, which implements a better slope selection; a shrinkage-based LI-ICI-EE1 method, named SLI-EE1, which introduces an $\ell_2$-norm penalty term into the error function; and a Shrinkage-based Third-order Lagrange method, named STL, which improves the compensation accuracy when PN compensation is a high-order problem. Compared with the traditional LI-based methods, LI-ICI-EE1 and SLI-EE1 use a novel interpolation slope calculation. The STL method further improves the PN compensation accuracy when the phase noise is a high-order problem, and achieves an extra performance gain over the traditional LI methods.
In future work, we plan to implement the proposed methods on a universal software radio peripheral (USRP) based hardware experiment platform and test their actual performance. Meanwhile, these methods will be optimized by further exploring how to calculate the coefficients with machine learning for PN compensation in the massive MIMO channel.
Lei QIAN (corresponding author) received the B.Eng. and Ph.D. degrees from the College of Communications Engineering, Jilin University, Changchun, China, in 2016 and 2021, respectively. Now she is with the Tianjin Key Laboratory of Optoelectronic Detection Technology and System, School of Electronic and Information Engineering, Tiangong University, as a Lecturer. She was a visiting Ph.D. student with the School of Engineering, University of British Columbia, Canada, from 2019 to 2020, sponsored by the Chinese Scholarship Council. Her current research interests include delay QoS guarantee, effective capacity, visible light communications, resource allocation, physical layer security and channel estimation. where he is currently an Associate Professor. He has authored/coauthored more than 150 refereed IEEE journal articles and more than 150 IEEE conference proceeding papers that are cited more than 11800 times in Google Scholar. His research interests include signal processing for communications, array signal processing, convex optimizations, and artificial intelligence assisted communications. He has also served as the Symposium Co-Chair for 2019 IEEE Confer