Nonlinear impairment compensation using expectation maximization for dispersion managed and unmanaged PDM 16-QAM transmission.

In this paper, we show numerically and experimentally that the expectation maximization (EM) algorithm is a powerful tool for combating system impairments such as fibre nonlinearities, inphase and quadrature (I/Q) modulator imperfections and laser linewidth. The EM algorithm is an iterative algorithm that can be used to compensate for impairments which leave an imprint on the signal constellation, i.e. rotation and distortion of the constellation points. The EM is especially effective for combating nonlinear phase noise (NLPN), because NLPN severely distorts the signal constellation and this distortion can be tracked by the EM. The gain in nonlinear system tolerance is shown to depend on the transmission scenario. We show experimentally that for a dispersion managed polarization multiplexed 16-QAM system at 14 Gbaud, a gain in the nonlinear system tolerance of up to 3 dB can be obtained. For a dispersion unmanaged system, this gain reduces to 0.5 dB.


Introduction
The application of digital signal processing (DSP) based coherent detection has allowed optical communication systems to operate closer to the nonlinear Shannon capacity limit by employing spectrally efficient modulation formats. Therefore, there is currently a lot of ongoing research on DSP based algorithms for signal detection and optical fibre channel impairment compensation. Linear signal processing algorithms can be used effectively to compensate for linear fibre channel impairments and have been demonstrated successfully for higher order quadrature amplitude modulation (QAM) signaling [1]. However, for long-haul systems employing higher order QAM, nonlinear optical fibre impairments can severely limit the transmission distance as well as the achievable total capacity [2]. Mitigation of optical fibre nonlinearities is therefore crucial, as it allows more power to be launched into the fibre, thereby extending the transmission distance. Additionally, mitigation of fibre nonlinearities helps reduce the nonlinear crosstalk from neighboring channels in a multi-channel transmission system.
It has been shown that nonlinear fibre impairments can be compensated by various techniques: digital backpropagation (DBP), maximum-likelihood sequence estimation, nonlinear polarization crosstalk cancellation, nonlinear pre- and post-compensation, RF-pilot, etc. (see [3][4][5][6][7] and references therein). Some of these methods suffer from high complexity and, additionally, the achievable gain in nonlinear tolerance depends on the particular transmission scenario. Therefore, efficient and widely applicable DSP algorithms for nonlinearity compensation remain an open research topic.
We have already demonstrated that for dispersion managed links, the expectation maximization algorithm can be used to enhance system tolerance towards nonlinearities [8]. However, a dispersion managed link impacts signal propagation differently from a dispersion unmanaged link, and it is therefore essential to investigate the benefits of EM for dispersion unmanaged links as well. In this paper, we consider the dispersion unmanaged link both numerically and experimentally. Transmission distances of 240 km, 400 km and 800 km are investigated experimentally for the dispersion unmanaged link. For consistency, results obtained for the dispersion managed link are included as well. We consider dispersion managed (a link consisting of multiple stages of standard single mode fibre (SSMF) in combination with dispersion compensating fibre (DCF)) and dispersion unmanaged (a link consisting of multiple stages of SSMF) polarization multiplexed 16-QAM single channel transmission. Erbium doped fibre amplifier (EDFA) amplification is employed in the transmission link. First, we investigate numerically, for the back-to-back case, whether the expectation maximization (EM) algorithm can be effective in combating inphase and quadrature (I/Q) modulator nonlinearity, imbalance and laser linewidth. We then investigate, also by numerical simulations, the improvement in nonlinear system tolerance that can be gained for PDM 16-QAM transmission. To begin with, we consider the case in which chromatic dispersion is neglected and the system is impaired only by self phase modulation induced nonlinear phase noise. Finally, we move to dispersion managed and unmanaged transmission systems. As a proof of concept, an experimental set-up employing dispersion managed and unmanaged PDM 16-QAM at 14 Gbaud is constructed and the improvement in nonlinear system tolerance is investigated. The paper ends with a conclusion and the future prospects of the EM algorithm.

Numerical and experimental system set-up
In this section, the numerical and experimental set-ups are presented. We first describe the numerical set-up used for simulations, and then move to the experimental set-up. The section concludes with a subsection describing the DSP algorithms used for signal equalization and demodulation.

Numerical set-up
The set-up used for the numerical investigations is shown in Fig. 1. All simulations are done using MATLAB (R2010a). For all numerical simulations, the baud rate is kept at 28 Gbaud, resulting in a total bit rate of 224 Gb/s for the system under consideration. The transmitter and local oscillator (LO) laser phase noise is modeled as a random walk (Wiener process). The output of the laser is then passed through an optical I/Q modulator. The I/Q modulator is driven by two four level pulse amplitude modulated (4-PAM) electrical signals, 4-PAM(1) and 4-PAM(2), in order to generate the optical 16-QAM signal. The module for generating 4-PAM signals is shown in Fig. 1. It consists of a pseudo-random binary sequence (PRBS) generator (which generates four independent sequences of length $2^{15} - 1$), signal mapping, upsampling, a pulse shaping filter (raised cosine), a digital-to-analog converter (DAC), attenuators and electrical amplifiers. The actual impulse response of the driving amplifiers is not taken into consideration. It is assumed that the electrical amplifiers have sufficient bandwidth such that they do not induce any signal distortion. The method of 4-PAM signal generation is very similar to the one reported in [9]. The output of the I/Q modulator is then passed through a polarization multiplexing stage with a delay of 10 symbols, and the output is then amplified (EDFA). For the back-to-back numerical investigations, the generated PDM 16-QAM signal is coherently detected in a 90 degree optical hybrid, photodetected and sampled at twice the baud rate by the analog-to-digital converter. We assume that the sampling frequency and phase are not synchronized to the incoming signal and that clock recovery is therefore performed by the DSP. The response of the analog-to-digital converter is modeled as a fourth-order Butterworth filter with a 3 dB bandwidth corresponding to 75% of the signal symbol rate.
The sampled signal is then sent to the DSP modules which are described in subsection 2.3.
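The signal model described above can be illustrated with a short sketch. It is not the MATLAB code used for the simulations: the function names, the equally spaced 4-PAM levels and the phase-increment variance $2\pi\Delta\nu T_s$ (the usual Wiener-process linewidth model, with $\Delta\nu$ the combined linewidth and $T_s$ the symbol period) are assumptions made for illustration. Pulse shaping, the DAC and the receiver filter are omitted.

```python
import cmath
import math
import random

# Assumed equally spaced 4-PAM amplitudes (illustrative, not from the paper).
PAM4_LEVELS = (-3.0, -1.0, 1.0, 3.0)


def bit_rate(symbol_rate_hz, bits_per_symbol=4, polarizations=2):
    """Total bit rate of a PDM 16-QAM signal (4 bits/symbol, 2 polarizations)."""
    return symbol_rate_hz * bits_per_symbol * polarizations


def qam16_symbols(n, rng):
    """Draw n 16-QAM symbols, each built from two independent 4-PAM levels."""
    return [complex(rng.choice(PAM4_LEVELS), rng.choice(PAM4_LEVELS))
            for _ in range(n)]


def wiener_phase(n, linewidth_hz, symbol_rate_hz, rng):
    """Random-walk phase with Gaussian increments of variance 2*pi*linewidth*T."""
    sigma = math.sqrt(2.0 * math.pi * linewidth_hz / symbol_rate_hz)
    phase, trace = 0.0, []
    for _ in range(n):
        phase += rng.gauss(0.0, sigma)
        trace.append(phase)
    return trace


def apply_phase_noise(symbols, phases):
    """Rotate each symbol by the corresponding phase sample."""
    return [s * cmath.exp(1j * p) for s, p in zip(symbols, phases)]
```

With these definitions, `bit_rate(28e9)` gives the 224 Gb/s of the numerical set-up, and `apply_phase_noise(qam16_symbols(n, rng), wiener_phase(n, 100e3, 28e9, rng))` produces a phase-noise impaired symbol sequence for a seeded `random.Random` instance `rng`.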
In this paper, we also perform numerical investigations involving fibre transmission, and therefore consider dispersion managed and unmanaged links. The dispersion managed link consists of a variable number of stages, where each stage consists of 80 km of SSMF and 17 km of DCF. EDFA amplification is employed after the SSMF and the DCF spans, respectively, as shown in Fig. 1. For the SSMF we have the following fibre parameters: $\alpha_{smf}$ = 0.2 dB/km, $D_{smf}$ = 17 ps/nm/km and nonlinear coefficient $\gamma_{smf}$ = 1.3 W$^{-1}$km$^{-1}$. For the DCF, we have the following parameters: $\alpha_{dcf}$ = 0.5 dB/km, $D_{dcf}$ = −80 ps/nm/km and nonlinear coefficient is

Experimental set-up
The set-up used for the experimental investigations is very similar to the one shown in Fig. 1. For the experiment, the baud rate is kept at 14 Gbaud, resulting in a total bit rate of 112 Gb/s. The transmitter and LO lasers are both tunable external cavity lasers with a linewidth of ∼100 kHz. The wavelength of the transmitter and LO lasers is set to 1550 nm. A pulse-pattern generator outputs four copies, $(x_1, x_2, x_3, x_4)$, of a true PRBS of length $2^{15} - 1$. The PRBS sequences are first decorrelated by 270 bits, amplified and combined into a 4-PAM electrical signal. The two PRBS sequences $x_1$ and $x_3$ are independent, while $x_2$ and $x_4$ are inverted versions of $x_1$ and $x_3$, respectively. The peak-to-peak amplitude of the 4-PAM signal used to drive the optical I/Q modulator is approximately 3 V. The delay in the polarization multiplexing stage is 10 symbols. For the experimental investigations, we first consider the dispersion managed link and then move to the dispersion unmanaged link. The dispersion managed link consists of 80 km of SSMF and 17 km of DCF with inline EDFA amplification. For the dispersion unmanaged link, the DCF is simply bypassed. The fibre parameters for the SSMF and DCF used in the experiment are similar to those used in the numerical set-up. At the receiver, the 14 Gbaud PDM 16-QAM signal is sampled at 50 Gs/s using a sampling scope with a nominal resolution of 8 bits and an analog bandwidth of 17 GHz. The sampled signal is then sent to the DSP modules for the offline processing described in section 2.3.

Digital signal processing algorithms
The DSP modules consist of an I/Q imbalance compensation module, an interpolation (clock recovery) module, and a joint polarization demultiplexing and carrier recovery stage. We apply the expectation maximization algorithm for nonlinearity compensation after joint polarization demultiplexing and carrier recovery. Symbol demodulation is embedded within the expectation maximization algorithm. The I/Q imbalance compensation algorithm employs Gram-Schmidt orthogonalization and is similar to the one reported in [1]. In order to make sure that we have a control signal for the clock recovery module, irrespective of the polarization mixing angle and the differential group delay (DGD), the method reported in [10] is implemented. The implemented clock recovery module is a feedback structure similar to the one reported in [11]. It consists of an interpolator, a timing error detector, a loop filter and a numerically controlled oscillator (NCO). The timing error detector, which is the most crucial component, is a modified Gardner algorithm [10]. For the loop filter, an averaging filter is used. After the clock recovery module, a decimator is used in order to downsample the signal to one sample per symbol. The algorithm used for signal decimation is based on the maximum search method proposed in [11]. Polarization demultiplexing is performed jointly with carrier frequency and phase estimation. The polarization demultiplexing unit consists of a butterfly structure like the one reported in [1], and the carrier phase and frequency estimation unit is a decision-directed digital phase-locked loop [11]. The decisions from the digital phase-locked loop are then used to form the error signal for the polarization demultiplexing. We found that a significant gain in the performance of the phase-locked loop can be obtained by properly designing the digital loop filter. For the considered case, a proportional-integrator filter was the best choice.
We emphasize that the polarization demultiplexing and digital phase-locked loop are first trained in blind mode using the constant modulus algorithm and then switched to decision-directed mode, as also reported in [1]. This type of joint equalization and phase/frequency estimation is described in more detail in [12]. The EM algorithm is then applied after the polarization demultiplexing and carrier frequency/phase recovery stage. The task of the EM algorithm is to learn the channel properties from the demodulated data without any prior knowledge. The information extracted from the channel is then used to compensate for the channel impairments and perform subsequent signal demodulation. After the EM stage, error counting is performed on ∼100000 received symbols.
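As an illustration of the Gram-Schmidt orthogonalization step mentioned above, a minimal sketch is given below. It follows the textbook procedure (normalize I, remove from Q the component correlated with I, normalize the residual) and is not necessarily identical to the implementation of [1]. The function names and the simple gain/phase imbalance model in `impair_iq` are assumptions, and zero-mean input samples are assumed.

```python
import math


def impair_iq(i_samples, q_samples, gain, phase_rad):
    """Hypothetical imbalance model: Q is scaled and leaks a fraction of I."""
    q_bad = [gain * (math.cos(phase_rad) * q + math.sin(phase_rad) * i)
             for i, q in zip(i_samples, q_samples)]
    return list(i_samples), q_bad


def gram_schmidt_iq(i_samples, q_samples):
    """Return orthogonal, unit-power I and Q components (zero-mean inputs assumed)."""
    n = len(i_samples)
    p_i = sum(v * v for v in i_samples) / n                      # power of I
    rho = sum(a * b for a, b in zip(i_samples, q_samples)) / n   # I/Q correlation
    i_out = [v / math.sqrt(p_i) for v in i_samples]
    # Remove from Q the part that is correlated with I.
    q_res = [b - rho * a / p_i for a, b in zip(i_samples, q_samples)]
    p_q = sum(v * v for v in q_res) / n                          # residual power
    q_out = [v / math.sqrt(p_q) for v in q_res]
    return i_out, q_out
```

After `gram_schmidt_iq`, the two outputs have unit power and zero sample cross-correlation, so a non-ideal phase shift between the I and Q branches no longer skews the constellation.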

Statistical signal representation -mixture of Gaussians
In this section, we first describe how a received signal can be modeled as a so-called "Mixture of Gaussians" (MoG), and then we move to the basic principles of the EM algorithm. For a more detailed treatment of MoG see [13], Chapter 9.2. Throughout the entire section we assume that the signal that is input to the EM algorithm contains one sample per symbol, and is obtained after the polarization demultiplexing, frequency and phase recovery stage. We will refer to this signal as the demodulated signal. The demodulated signal in the x/y-polarization can be considered as a mixture of Gaussian densities consisting of a number of components (clusters), where each component (cluster) can be described by a 2-D Gaussian distribution. For instance, in the case of 16-QAM, we have 16 clusters. For a 16-QAM signal constellation, in the absence of any impairment, we have 16 distinct constellation points. However, in the presence of additive white Gaussian noise, the received symbols spread around each of the 16 constellation points. A cluster is then defined as a grouping of the points/symbols around a mean value. Irrespective of the modulation format applied (PSK or QAM), the demodulated signal can mathematically be expressed as a superposition of M Gaussian densities, where M is the number of clusters and corresponds to the number of constellation points. The probability density function of the demodulated signal is then expressed as:
$$p(x) = \sum_{k=1}^{M} \pi_k\, \mathcal{N}(x|\mu_k, \Sigma_k) \qquad (1)$$
where k refers to each cluster in the constellation and $\pi_k$ is a mixing coefficient (for the considered case of a signal where symbols have a uniform distribution, $\pi_k = 1/M$). $x = [x_1, x_2]$ is a 2-D vector corresponding to a detected symbol in the constellation (inphase/quadrature) plane, and $\mathcal{N}(x|\mu_k, \Sigma_k)$ is a 2-D Gaussian density with mean $\mu_k$ and a 2 × 2 covariance matrix $\Sigma_k$ [13]:
$$\mathcal{N}(x|\mu_k, \Sigma_k) = \frac{1}{2\pi |\Sigma_k|^{1/2}} \exp\left\{-\frac{1}{2}(x - \mu_k)^T \Sigma_k^{-1} (x - \mu_k)\right\} \qquad (2)$$
where $|\Sigma_k|$ is the determinant of the covariance matrix and is a measure of the area covered by the specific cluster k.
The covariance matrix is defined as:
$$\Sigma_k = \begin{bmatrix} \sigma^2_{1,1} & \sigma^2_{1,2} \\ \sigma^2_{2,1} & \sigma^2_{2,2} \end{bmatrix} \qquad (3)$$
where $\sigma^2_{i,j}$ denotes the (co)variance between components i and j of the symbols in cluster k. For the most general case, each cluster is described by its specific covariance matrix $\Sigma_k$. However, when the signal is dominated by additive Gaussian noise, the covariance matrices will be equal and diagonal, i.e. there is no correlation between the inphase and quadrature components of the symbols within a cluster. Additionally, the clusters will be circularly symmetric (equal variances) and the covariance matrix is then expressed as:
$$\Sigma_k = \begin{bmatrix} \sigma^2 & 0 \\ 0 & \sigma^2 \end{bmatrix} \qquad (4)$$
An example of a demodulated 16-QAM signal dominated by additive Gaussian noise is shown in Fig. 2(a). It is observed in Fig. 2(a) that all the clusters look similar. An example of a demodulated signal strongly impaired by laser phase noise is shown in Fig. 2(b). It is observed in Fig. 2(b) that the clusters are no longer similar. Indeed, the clusters belonging to the outer ring are elliptical. Here, we distinguish between two cases: (1) the covariance matrix is still diagonal and $\sigma^2_{1,1} \neq \sigma^2_{2,2}$, so that the clusters are stretched in either the vertical or the horizontal direction; (2) the covariance matrix is non-diagonal, in which case the shape and orientation of the cluster are arbitrary, depending on whether the correlation is positive or negative. Finally, let us look at a third case in which the demodulated signal is severely impaired by nonlinear phase noise, see Fig. 2(c). It is observed in Fig. 2(c) that not only the outer clusters are affected; all the clusters experience distortion. It should also be noticed that the entire constellation is tilted (a phase offset is introduced), and the outer points have been compressed. This compression means that the mean values $\mu_k$ have been altered compared to the reference constellation. By reference constellation, we mean the constellation which is free of any impairment.
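The relation between cluster shape and covariance matrix described above can be checked with a small numerical sketch. The simulation model (a constellation point rotated by a Gaussian phase error plus circular Gaussian noise), the function names and the parameter values are illustrative assumptions and do not reproduce the data of Fig. 2.

```python
import cmath
import random


def simulate_cluster(point, awgn_sigma, phase_sigma, n, rng):
    """Samples of one cluster: the ideal point with phase jitter plus AWGN."""
    pts = []
    for _ in range(n):
        z = point * cmath.exp(1j * rng.gauss(0.0, phase_sigma))
        z += complex(rng.gauss(0.0, awgn_sigma), rng.gauss(0.0, awgn_sigma))
        pts.append((z.real, z.imag))
    return pts


def sample_covariance(points):
    """Sample mean and 2x2 covariance matrix of a list of (I, Q) points."""
    n = len(points)
    m1 = sum(p[0] for p in points) / n
    m2 = sum(p[1] for p in points) / n
    s11 = sum((p[0] - m1) ** 2 for p in points) / n
    s22 = sum((p[1] - m2) ** 2 for p in points) / n
    s12 = sum((p[0] - m1) * (p[1] - m2) for p in points) / n
    return (m1, m2), ((s11, s12), (s12, s22))
```

Under these assumptions, a point on the real axis such as `3 + 0j` yields a diagonal covariance with unequal variances (case 1), a corner point such as `3 + 3j` yields a clearly negative off-diagonal term (case 2), and setting `phase_sigma` to zero recovers the circularly symmetric form of Eq. (4).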
In general, different optical channel impairments will have a different imprint on the received signal constellation. This information can then be used to determine the impairment and make optimal signal detection as explained next.
Optimal signal detection in the maximum likelihood sense is obtained by maximizing the a posteriori probability of the received symbol x belonging to one of the clusters k, where k = 1, ..., M:
$$\hat{k} = \arg\max_{k}\, p(k|x) \qquad (5)$$
or, in other words, by finding the cluster k for which p(k|x) is maximized. The a posteriori probability p(k|x) is obtained from Bayes' theorem [13]:
$$p(k|x) = \frac{\pi_k\, \mathcal{N}(x|\mu_k, \Sigma_k)}{\sum_{j=1}^{M} \pi_j\, \mathcal{N}(x|\mu_j, \Sigma_j)} \qquad (6)$$
Inserting the expression for the Gaussian distribution, the optimal decision, Eq. (5), reduces to a quadratic decision rule:
$$\hat{k} = \arg\max_{k} \left\{ -\frac{1}{2} x^T \Sigma_k^{-1} x + w_k^T x + w_{k0} \right\} \qquad (7)$$
with $w_k = \Sigma_k^{-1}\mu_k$ and $w_{k0} = \log \pi_k - \log|\Sigma_k|/2 - \mu_k^T \Sigma_k^{-1} \mu_k/2$. When the covariance matrices are equal, the quadratic term in the optimal decision rule in Eq. (7) is the same for all k and the decision rule becomes linear:
$$\hat{k} = \arg\max_{k} \left\{ w_k^T x + w_{k0} \right\} \qquad (8)$$
However, when the signal constellation is distorted by nonlinear phase noise, laser phase noise, etc., Eq. (6) needs to be used in order to make the optimum signal detection. In order to evaluate Eq. (6), the M Gaussian densities $\mathcal{N}(x|\mu_k, \Sigma_k)$, and thereby the parameters $\pi \equiv \{\pi_1, ..., \pi_M\}$, $\mu \equiv \{\mu_1, ..., \mu_M\}$ and $\Sigma \equiv \{\Sigma_1, ..., \Sigma_M\}$ describing the Gaussian densities, need to be determined.
Next, we show how to use the EM method in order to determine the parameters that generate the Gaussian mixture model. The EM determines, in a maximum likelihood sense, the most likely parameters Ξ = [π, μ, Σ] that generated the Gaussian densities.
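The quadratic decision rule can be written out directly for the 2-D case. The sketch below is illustrative only: the function names are hypothetical, the 2 × 2 matrix inverse is coded by hand to keep the example self-contained, and the discriminant equals $\log[\pi_k \mathcal{N}(x|\mu_k, \Sigma_k)]$ up to a constant that is common to all clusters.

```python
import math


def discriminant(x, pi_k, mu, sigma):
    """Quadratic discriminant -x'Ax/2 + w'x + w0 with A the inverse of sigma."""
    det = sigma[0][0] * sigma[1][1] - sigma[0][1] * sigma[1][0]
    inv = ((sigma[1][1] / det, -sigma[0][1] / det),
           (-sigma[1][0] / det, sigma[0][0] / det))
    # w_k = inverse covariance times mean
    w = (inv[0][0] * mu[0] + inv[0][1] * mu[1],
         inv[1][0] * mu[0] + inv[1][1] * mu[1])
    w0 = math.log(pi_k) - 0.5 * math.log(det) - 0.5 * (mu[0] * w[0] + mu[1] * w[1])
    quad = -0.5 * (x[0] * (inv[0][0] * x[0] + inv[0][1] * x[1])
                   + x[1] * (inv[1][0] * x[0] + inv[1][1] * x[1]))
    return quad + w[0] * x[0] + w[1] * x[1] + w0


def decide(x, pis, mus, sigmas):
    """Index of the cluster with the largest discriminant (most probable cluster)."""
    scores = [discriminant(x, p, m, s) for p, m, s in zip(pis, mus, sigmas)]
    return max(range(len(scores)), key=scores.__getitem__)
```

With equal covariance matrices, `decide` reduces to picking the nearest mean. With unequal covariances it can assign a symbol to a broad cluster even when the mean of a narrow cluster is closer, which is what makes the decision boundaries nonlinear.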

Expectation maximization algorithm
In general, the EM is a numerical method for producing a solution to a maximum likelihood estimation problem which can be simplified by introducing latent variables [13,14]. In the considered case, the maximum likelihood estimate of the parameters is
$$\Xi_{ML} = \arg\max_{\Xi}\, p(X|\Xi) \qquad (9)$$
where $X = [x_1, ..., x_N]$, N is the length of the observation interval and the likelihood function of Ξ, p(X|Ξ), for independent identically distributed data is expressed as:
$$p(X|\Xi) = \prod_{n=1}^{N} \sum_{k=1}^{M} \pi_k\, \mathcal{N}(x_n|\mu_k, \Sigma_k) \qquad (10)$$
No closed-form analytical solution for Eq. (9) is available. Therefore, the iterative EM framework can be used to find a solution. The EM is a two step iterative procedure which is guaranteed to converge to a (local) maximum likelihood solution of Eq. (9) [14]. The two step procedure, consisting of the so-called expectation (E) step and maximization (M) step, is as follows for the particular case considered in this section [13]:
E-step:
$$\gamma_{nk} = \frac{\pi_k\, \mathcal{N}(x_n|\mu_k, \Sigma_k)}{\sum_{j=1}^{M} \pi_j\, \mathcal{N}(x_n|\mu_j, \Sigma_j)} \quad \text{for } n = 1, ..., N \text{ and } k = 1, ..., M \qquad (11)$$
M-step, in the standard form for Gaussian mixtures given in [13]:
$$N_k = \sum_{n=1}^{N} \gamma_{nk}, \quad \mu_k = \frac{1}{N_k}\sum_{n=1}^{N} \gamma_{nk}\, x_n, \quad \Sigma_k = \frac{1}{N_k}\sum_{n=1}^{N} \gamma_{nk}\, (x_n - \mu_k)(x_n - \mu_k)^T, \quad \pi_k = \frac{N_k}{N}$$
where $\gamma_{nk}$ is called the responsibility and is nothing but the a posteriori probability, Eq. (6), needed for optimal decisions. The flow-chart describing the algorithm is shown in Fig. 3. To begin with, we initialize the EM with initial parameters for the means, covariance matrices and mixing coefficients, and then the algorithm starts to iterate in order to find the most likely parameters. In the E-step, the current values of the parameters, $\Xi_i$ at iteration i, are used to evaluate Eq. (11). The E-step expressed by Eq. (11) computes the probability of the received symbol belonging to each of the clusters, i.e. the posterior probability. In the M-step we use those probabilities to re-estimate the parameters Ξ. In other words, in the M-step we find the parameters that maximize the probability that the data has been generated by a particular cluster. With each parameter update resulting from an E-step followed by an M-step, the likelihood function, $p(X|\Xi_i)$, increases, and it flattens out when the algorithm has converged. The convergence properties of the EM depend strongly on the initialization.
For the considered cases throughout the paper, we found that the EM algorithm converges after 3 iterations. Once the EM algorithm has converged ($i > N_{iter}$), we use the results to perform the optimum signal detection governed by Eq.
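The E-step and M-step described in this section can be sketched for 2-D data as follows. This is a minimal illustration of standard EM for a Gaussian mixture [13], not the implementation used for the results in this paper. The function names, the initial variance `init_var`, the small regularization `reg` added to the variances and the initialization at the ideal constellation points are all assumptions.

```python
import math


def gauss2d(x, mu, s):
    """2-D Gaussian density N(x | mu, s) with s = ((s11, s12), (s12, s22))."""
    det = s[0][0] * s[1][1] - s[0][1] * s[1][0]
    d0, d1 = x[0] - mu[0], x[1] - mu[1]
    quad = (s[1][1] * d0 * d0 - 2.0 * s[0][1] * d0 * d1 + s[0][0] * d1 * d1) / det
    return math.exp(-0.5 * quad) / (2.0 * math.pi * math.sqrt(det))


def e_step(data, pis, mus, sigmas):
    """Responsibilities gamma[n][k]: posterior probability of cluster k given x_n."""
    resp = []
    for x in data:
        w = [p * gauss2d(x, m, s) for p, m, s in zip(pis, mus, sigmas)]
        total = sum(w) + 1e-300  # guards against division by zero on underflow
        resp.append([v / total for v in w])
    return resp


def m_step(data, resp, reg=1e-6):
    """Re-estimate mixing coefficients, means and covariances from responsibilities."""
    n, m = len(data), len(resp[0])
    pis, mus, sigmas = [], [], []
    for k in range(m):
        g = [r[k] for r in resp]
        nk = max(sum(g), 1e-12)  # effective number of symbols in cluster k
        mu0 = sum(gk * x[0] for gk, x in zip(g, data)) / nk
        mu1 = sum(gk * x[1] for gk, x in zip(g, data)) / nk
        s11 = sum(gk * (x[0] - mu0) ** 2 for gk, x in zip(g, data)) / nk + reg
        s22 = sum(gk * (x[1] - mu1) ** 2 for gk, x in zip(g, data)) / nk + reg
        s12 = sum(gk * (x[0] - mu0) * (x[1] - mu1) for gk, x in zip(g, data)) / nk
        pis.append(nk / n)
        mus.append((mu0, mu1))
        sigmas.append(((s11, s12), (s12, s22)))
    return pis, mus, sigmas


def em_fit(data, init_means, n_iter=3, init_var=0.1):
    """Run n_iter EM iterations starting from the given (ideal) means."""
    m = len(init_means)
    pis = [1.0 / m] * m
    mus = list(init_means)
    sigmas = [((init_var, 0.0), (0.0, init_var))] * m
    for _ in range(n_iter):
        resp = e_step(data, pis, mus, sigmas)
        pis, mus, sigmas = m_step(data, resp)
    return pis, mus, sigmas, e_step(data, pis, mus, sigmas)
```

The last element returned by `em_fit` holds the responsibilities under the final parameters, so the detected cluster for symbol n is the index of the largest entry of `resp[n]`. The default of three iterations mirrors the convergence behaviour reported above, but a suitable value in general depends on the data and on the initialization.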

Back-to-back investigation
First, we consider the back-to-back case. We investigate how the EM algorithm can be used to combat the combined effects of I/Q modulator nonlinearities and imperfections and the combined laser linewidth. We deliberately drive the I/Q modulator with a large peak-to-peak amplitude, $V_{pp}$, of the electrical 4-PAM signal, such that the constellation diagram of the 16-QAM signal is distorted. The resulting modulation depth of the modulator is $m = V_{pp}/V_{\pi} = 2.12$. Furthermore, it is assumed that the phase shift between the inphase and quadrature branches of the I/Q modulator deviates from π/2 by 5%. In Fig. 4, -log(BER) is plotted as a function of the combined laser linewidth for an optical signal to noise ratio (OSNR) of 25 dB. Figure 4 shows that, compared to the case when no compensation is used (linear decision boundaries), the EM is very efficient in combating the combined impairments originating from I/Q modulator nonlinearities, the non-ideal phase shift between the I and Q branches and the combined laser linewidth.
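To give a feeling for why a large modulation depth distorts the constellation, the sketch below evaluates the optical field levels produced by a 4-PAM drive signal. The modulator model is an assumption that is not stated in the text: each branch of the I/Q modulator is taken to be a Mach-Zehnder modulator biased at its null point, with a sinusoidal field transfer $\sin(\pi V / 2V_{\pi})$ and a drive signal that swings symmetrically around zero. The function names are hypothetical.

```python
import math


def mzm_field(v, v_pi):
    """Assumed field transfer of a Mach-Zehnder modulator biased at null."""
    return math.sin(math.pi * v / (2.0 * v_pi))


def pam4_optical_levels(m):
    """Optical field levels for a 4-PAM drive with modulation depth m = Vpp/Vpi."""
    v_pi = 1.0
    v_pp = m * v_pi
    # Equally spaced electrical levels spanning -Vpp/2 ... +Vpp/2.
    drive = [v_pp * a / 2.0 for a in (-1.0, -1.0 / 3.0, 1.0 / 3.0, 1.0)]
    return [mzm_field(v, v_pi) for v in drive]


def inner_to_outer_ratio(m):
    """Ratio of inner to outer field level; 1/3 for an undistorted 4-PAM signal."""
    levels = pam4_optical_levels(m)
    return levels[2] / levels[3]
```

Under this model, a small modulation depth keeps the inner-to-outer level ratio close to the ideal 1/3, whereas m = 2.12 pushes it to roughly 0.53: the outer points are compressed towards the inner ones, so the constellation is no longer a regular grid.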

Dispersion managed link
In order to investigate the effects of nonlinear phase noise only, the dispersion is numerically set to zero, i.e. we consider a transmission link consisting of 12 spans of SSMF with dispersion set to zero. The dominant nonlinear impairment is therefore the nonlinear phase noise originating from the EDFA amplification along the link. In Fig. 5(a), -log(BER) is plotted as a function of the input signal power to the transmission span. In order to determine the effectiveness of the EM algorithm, we also plot in the same figure -log(BER) when the k-means algorithm is used [13,15]. The k-means algorithm is another widely used clustering algorithm in nonlinear signal processing. It computes the means of the clusters, and this information is used to perform the minimum distance signal detection expressed by Eq. (8). In general, there is a relatively large improvement in the nonlinear system tolerance when using the EM algorithm. Figure 5(a) shows that by using only the k-means algorithm the system tolerance can be increased by approximately 1 dB at a -log(BER) of 3, while by using the EM the system tolerance can be increased by 2.2 dB compared to the case when neither the EM nor k-means is used. Next, we consider the dispersion managed link consisting of SSMF and DCF as described in section 2.1. The total number of spans is 12. The results of -log(BER) as a function of span input power are shown in Fig. 5(b). It is observed from Fig. 5(b) that by employing the EM, the system tolerance towards nonlinearities can be increased. The improvement in the nonlinear system tolerance is approximately 1.2 dB at a -log(BER) of 3.
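For comparison with the EM, the k-means procedure referred to above can be sketched as follows. This is the textbook (Lloyd) iteration [13], started from the ideal constellation points. The function names and the number of iterations are assumptions, and the sketch is not the code used to produce Fig. 5.

```python
def nearest(x, means):
    """Index of the mean closest to x (minimum Euclidean distance detection)."""
    return min(range(len(means)),
               key=lambda k: (x[0] - means[k][0]) ** 2 + (x[1] - means[k][1]) ** 2)


def kmeans(data, init_means, n_iter=10):
    """Alternate hard assignment of symbols and re-estimation of cluster means."""
    means = list(init_means)
    for _ in range(n_iter):
        sums = [[0.0, 0.0, 0] for _ in means]
        for x in data:
            k = nearest(x, means)
            sums[k][0] += x[0]
            sums[k][1] += x[1]
            sums[k][2] += 1
        # Keep the previous mean if a cluster received no symbols.
        means = [(s[0] / s[2], s[1] / s[2]) if s[2] else m
                 for s, m in zip(sums, means)]
    return means
```

Unlike the EM, k-means tracks only the displacement of the cluster means (for example a rotation or compression of the constellation) and ignores the cluster shapes, which is consistent with the smaller gain it provides when the clusters are strongly non-circular.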

Dispersion unmanaged link
Next, we move to the case of the dispersion unmanaged link. In Fig. 6(a), -log(BER) is plotted as a function of the span input signal power. The total number of spans is 15. It is observed that there is no benefit in using the EM algorithm. This is because for dispersion unmanaged links, the impact of the nonlinearity can be modeled as additive white Gaussian noise [16][17][18]. Therefore, all 16 covariance matrices are circularly symmetric and the optimum decision boundaries become linear, as explained in Section 3. As the laser linewidth also leaves an imprint on the signal constellation, we next investigate whether dispersion unmanaged links can benefit from the EM when they are impaired by laser linewidth and operate in the nonlinear regime. In Fig. 6(b), -log(BER) is plotted as a function of the combined laser linewidth. The number of spans for the unmanaged link is set to 15 and the input signal power to the span is 1.4 dBm. It is observed in Fig. 6(b) that some improvement in the laser linewidth tolerance can be obtained when applying the EM.

Dispersion managed link
First, we demonstrate how the EM can be used effectively to extract information from a severely distorted constellation and use this information to mitigate the impairments. Figure 7 shows a constellation diagram of a signal impaired by nonlinear phase noise after transmission through one span of the dispersion managed link. We plot the recovered constellation diagram of the x polarization after the carrier recovery stage. Together with the constellation, we have also plotted the optimal (nonlinear) decision boundaries obtained by applying Eq. (6) in conjunction with the EM algorithm. It is observed from Fig. 7 that, due to the nonlinear impairments, the outer constellation points in particular are distorted. The inner constellation points are less affected due to their lower power; however, all clusters experience a significant phase shift. When no compensation is applied, -log(BER) is 1.30. When k-means is used, -log(BER) is 2.04, while when the EM is applied -log(BER) improves to 3. This example demonstrates the capability of the EM to compensate distorted and phase shifted constellations.
In Fig. 8(a), we plot the demodulated signal constellation for an input power of $P_{in}$ = 0 dBm and the corresponding optimal decision boundaries after 800 km of transmission. It is observed that the demodulated signal constellation shown in Fig. 8(a) is distorted and therefore the optimal decision boundaries are nonlinear. In Fig. 8(b), we plot -log(BER) as a function of span input power after 800 km of transmission through the dispersion managed link. It is observed that employing the EM algorithm improves the nonlinear system tolerance, which is in accordance with the simulation results. We observe up to 3 dB of improvement in nonlinear tolerance compared to the case when no compensation is used. The larger improvement obtained for the experimental data may be attributed to the fact that the EM is also effective in compensating residual distortion induced on the signal. It is observed from the figure that only very little improvement can be obtained by using the k-means algorithm, and this is also in good agreement with the simulation results.
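Since the results are quoted as -log(BER), it can help to convert them back to bit error rates. The small helper below assumes a base-10 logarithm, which is the usual convention but is not stated explicitly in the text; the function names are hypothetical.

```python
def ber_from_neg_log(value):
    """Bit error rate corresponding to a given -log10(BER) value."""
    return 10.0 ** (-value)


def ber_improvement_factor(neg_log_before, neg_log_after):
    """Factor by which the BER is reduced between two -log10(BER) values."""
    return ber_from_neg_log(neg_log_before) / ber_from_neg_log(neg_log_after)
```

Under this assumption, the values quoted for Fig. 7 correspond to a BER of about 5.0e-2 without compensation, about 9.1e-3 with k-means and 1e-3 with the EM, i.e. roughly a fifty-fold reduction in BER for the EM relative to no compensation.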

Dispersion unmanaged link
For our next investigations, the DCF is removed and we consider 800 km of dispersion unmanaged signal transmission. In Fig. 9(a), the constellation diagram of the recovered signal is plotted together with the optimal decision boundaries for an input power of 0 dBm. It is observed in Fig. 9(a) that the clusters are not distorted in the same way as for the dispersion managed link. Indeed, the clusters seem to be more circularly symmetric and the optimal decision boundaries are very close to linear. This confirms the earlier observations reported in [16][17][18] that, for dispersion unmanaged links, the effect of fibre nonlinearity corresponds to adding white Gaussian noise. There is, however, some slight distortion in the shape of the clusters. In Fig. 9(b), -log(BER) is plotted as a function of span input power. It is observed that for input power exceeding 0 dBm, there is no improvement in the nonlinear system tolerance when k-means or EM is used. However, for input power less than 0 dBm, Fig. 9(b) shows an improvement of ∼0.5 dB.
In order to investigate how nonlinearity compensation by the EM scales with transmission distance, -log(BER) is plotted as a function of span input signal power for transmission distances of 240 km and 400 km, respectively, see Fig. 10. Fig. 10(a) shows an improvement in the nonlinear system tolerance of approximately 0.5 dB over the entire range of the considered input optical power. For the transmission distance of 400 km, Fig. 10(b), a similar improvement of 0.5 dB is observed; however, the gain disappears for input power exceeding 2 dBm. One possible explanation is that for input power exceeding 2 dBm the distortion is large and the EM cannot properly estimate the parameters.

Conclusion
We have shown that the expectation maximization algorithm is a powerful tool for combating system impairments (fibre nonlinearities, I/Q imperfections and laser linewidth) which significantly distort the signal constellation. By distortion, we mean the deviation of the clusters from a circularly symmetric shape. Additionally, we have demonstrated experimentally that by using expectation maximization, up to 3 dB of improvement in nonlinear system tolerance can be obtained for an 800 km long dispersion managed transmission link. For the considered dispersion unmanaged links of 240 km, 400 km and 800 km, an improvement of approximately 0.5 dB is obtained. For a multi-channel WDM transmission system, the EM algorithm can potentially be beneficial, as the inter-channel nonlinear effects will leave an imprint on the constellation of the signal under consideration. The improvement in the nonlinear system tolerance offered by the EM will depend on the modulation format of the neighboring channels, the channel spacing, as well as the number of neighboring channels. However, this is a topic that needs further investigation. In order to see what benefits the EM brings, we need to relate its performance to that of digital backpropagation, which has become a benchmark for nonlinearity compensation techniques. For dispersion managed links, it has been shown numerically that up to 4 dB of improvement can be obtained with digital backpropagation [19]. The gain in the nonlinear system tolerance will, however, depend on the implementation of the digital backpropagation. For instance, this gain reduces to 2 dB if 1 step/fibre digital backpropagation is used [19]. For dispersion unmanaged links, an improvement of 3 dB for a PDM 16-QAM signal has been shown experimentally [20].
As a rule of thumb, we may conclude that for dispersion unmanaged systems digital backpropagation offers better performance, while for dispersion managed systems, depending on the implementation of digital backpropagation, the performance of the EM and digital backpropagation may be comparable. However, one technique does not have to exclude the other, as one may consider combining both techniques to combat optical fibre channel nonlinearities.