Polarization-time coding for PDL mitigation in long-haul PolMux OFDM systems

Abstract: In this paper, we present a numerical, theoretical and experimental study on the mitigation of Polarization Dependent Loss (PDL) with Polarization-Time (PT) codes in long-haul coherent optical fiber transmissions using Orthogonal Frequency Division Multiplexing (OFDM). First, we review the scheme of a polarization-multiplexed (PolMux) optical transmission and the 2 × 2 MIMO model of the optical channel with PDL. Second, we introduce the Space-Time (ST) codes originally designed for wireless Rayleigh fading channels, and evaluate their performance, as PT codes, in mitigating PDL through numerical simulations. The obtained behaviors and coding gains are different from those observed on the wireless channel. In particular, the Silver code performs better than the Golden code and the coding gains offered by PT codes and forward-error-correction (FEC) codes aggregate. We investigate the numerical results through a theoretical analysis based on the computation of an upper bound of the error probability of the optical channel with PDL. The derived upper bound yields a design criterion for optimal PDL-mitigating codes. Furthermore, a transmission experiment of PDL-mitigation in a 1000 km optical fiber link with inline PDL validates the numerical and theoretical findings. The results are shown in terms of Q-factor distributions. The mean Q-factor is improved with PT coding and the variance is also narrowed.

© 2013 Optical Society of America

OCIS codes: (060.0060) Fiber optics and optical communications; (060.1660) Coherent communications; (060.4080) Modulation; (060.4230) Multiplexing.

#195015 - $15.00 USD. Received 31 Jul 2013; revised 29 Aug 2013; accepted 30 Aug 2013; published 20 Sep 2013. (C) 2013 OSA, 23 September 2013 | Vol. 21, No. 19 | DOI:10.1364/OE.21.022773 | OPTICS EXPRESS 22773

References and links
1. P. J. Winzer, “High-Spectral-Efficiency Optical Modulation Formats,” J. Lightwave Technol. 30(24), 3824–3835 (2012).
2. S. Savory, “Digital filters for coherent optical receivers,” Opt. Express 16, 804–817 (2008).
3. S. L. Jansen, I. Morita, T. C. W. Schenk, and H. Tanaka, “121.9-Gb/s PDM-OFDM Transmission With 2-b/s/Hz Spectral Efficiency Over 1000 km of SSMF,” J. Lightwave Technol. 27(3), 177–188 (2009).
4. J. P. Gordon and H. Kogelnik, “PMD fundamentals: Polarization mode dispersion in optical fibers,” Proc. Natl. Acad. Sci. U.S.A. 97(9), 4541–4550 (2000).
5. T. Duthel, C. R. S. Fludger, J. Geyer, and C. Schulien, “Impact of polarization dependent loss on coherent POLMUX-NRZ-DQPSK,” in Proc. of OFC/NFOEC’08, 1–3.
6. A. Juarez, C. Bunge, S. Warm, and K. Petermann, “Perspectives of principal mode transmission in mode-division-multiplex operation,” Opt. Express 20(13), 13810–13824 (2012).
7. W. Shieh, X. Yi, Y. Ma, and Y. Tang, “Theoretical and experimental study on PMD-supported transmission using polarization diversity in coherent optical OFDM systems,” Opt. Express 15(16), 9936–9947 (2007).
8. S. Mumtaz, G. Rekaya, and Y. Jaouën, “Space-Time codes for optical fiber communication with polarization multiplexing,” in Proc. of ICC’10, 1–5.
9. S. Mumtaz, J. Li, S. Koenig, Y. Jaouën, R. Schmogrow, G. Rekaya-Ben Othman, and J. Leuthold, “Experimental demonstration of PDL mitigation using Polarization-Time coding in PDM-OFDM systems,” in Proc. of SPPCom’11, paper SPWB6.
10. S. Mumtaz, G. Rekaya-Ben Othman, Y. Jaouën, J. Li, S. Koenig, R. Schmogrow, and J. Leuthold, “Alamouti code against PDL in polarization multiplexed systems,” in Proc. of SPPCom’11, paper SPTuA2.
11. E. Meron, A. Andrusier, M. Feder, and M. Shtaif, “Use of space-time coding in coherent polarization-multiplexed systems suffering from polarization-dependent loss,” Opt. Lett. 35(21), 3547–3549 (2010).
12. V. Tarokh, A. Naguib, N. Seshadri, and A. R. Calderbank, “Space-time codes for high data rate wireless communication: performance criteria in the presence of channel estimation errors, mobility, and multiple paths,” IEEE Transactions on Communications 47(2), 199–207 (1999).
13. J.-C. Belfiore, G. Rekaya, and E. Viterbo, “The golden code: a 2x2 full-rate space-time code with nonvanishing determinants,” IEEE Transactions on Information Theory 51(4), 1432–1436 (2005).
14. H. Bulow, W. Baumert, H. Schmuck, F. Mohr, T. Schulz, F. Kuppers, and W. Weiershausen, “Measurement of the maximum speed of PMD fluctuation in installed field fiber,” OFC/IOOC’99 2, 83–85.
15. S. Ben Rayana, H. Besbes, G. Rekaya-Ben Othman, and Y. Jaouën, “Joint equalization and polarization-time coding detection to mitigate PMD and PDL impairments,” in Proc. of SPPCom’12, paper SpW2B.3.
16. A. Andrusier, E. Meron, M. Feder, and M. Shtaif, “An optical implementation of a space-time-trellis code for enhancing the tolerance of systems to polarization-dependent loss,” Opt. Lett. 38(2), 118–120 (2013).
17. S. R. Desbruslais and P. R. Morkel, “Simulation of polarisation mode dispersion and its effects in long-haul optically amplified lightwave systems,” IEE Colloquium on International Transmission Systems, 6/1–6/6 (1994).
18. O. Vassilieva, I. Kim, Y. Akasaka, M. Bouda, and M. Sekiya, “Interplay between PDL and nonlinear effects in coherent polarization multiplexed systems,” Opt. Express 19(26), B357–B362 (2011).
19. A. Mecozzi and M. Shtaif, “The statistics of polarization-dependent loss in optical communication systems,” IEEE Photonics Technol. Lett. 14(3), 313–315 (2002).
20. N. Gisin, “Statistics of polarization dependent loss,” Optics Communications 114, Elsevier (1995).
21. L. Nelson, C. Antonelli, A. Mecozzi, M. Birk, P. Magill, A. Schex, and L. Rapp, “Statistics of polarization dependent loss in an installed long-haul WDM system,” Opt. Express 19(7), 6790–6796 (2011).
22. A. Lima, I. Lima Jr., C. Menyuk, and T. Adali, “Comparison of penalties resulting from first-order and all-order polarization mode dispersion distortions in optical fiber transmission systems,” Opt. Lett. 28(5), 310–312 (2003).
23. W. Shieh, “PMD-Supported Coherent Optical OFDM Systems,” IEEE Photonics Technol. Lett. 19(3), 134–136 (2007).
24. X. Liu and F. Buchali, “Intra-symbol frequency-domain averaging based channel estimation for coherent optical OFDM,” Opt. Express 16(26), 21944–21957 (2008).
25. M. Shtaif, “Performance degradation in coherent polarization multiplexed systems as a result of polarization dependent loss,” Opt. Express 16(18), 13918–13932 (2008).
26. J. Proakis and M. Salehi, Digital Communications, Fifth Edition. McGraw-Hill International Edition (2008).
27. P. Delesques, E. Awwad, S. Mumtaz, G. Froc, P. Ciblat, Y. Jaouën, G. Rekaya, and C. Ware, “Mitigation of PDL in coherent optical communications: How close to the fundamental limit?,” in Proc. of ECOC’12, paper P4.13.
28. C. Xie, “Polarization-dependent loss induced penalties in PDM-QPSK coherent optical communication systems,” in Proc. of OFC/NFOEC’10, 1–3.
29. E. Awwad, Y. Jaouën, G. Rekaya-Ben Othman, and E. Pincemin, “Polarization-Time Coded OFDM for PDL Mitigation in Long-Haul Optical Transmission Systems,” in Proc. of ECOC’13, paper P3.4 (to be published).
30. E. Awwad, Y. Jaouën, and G. Rekaya-Ben Othman, “Improving PDL Tolerance of Long-Haul PDM-OFDM Systems Using Polarization-Time Coding,” in Proc. of SPPCom’12, paper SpTu2A.5.


Introduction
Taking advantage of the numerous degrees of freedom in an optical fiber is a key solution to the exponentially increasing demand for capacity [1] over long-haul and metropolitan optical fiber networks. Over the last years, coherent detection [2] has replaced direct detection schemes, offering an enhanced receiver sensitivity and enabling multilevel modulation formats. Besides, wavelength division multiplexing (WDM) enabled the transmission of many independent wavelengths over the same fiber, providing scalability and an important capacity growth to optical networks. Furthermore, coherent polarization division multiplexed (PDM) systems [1,3] were introduced to double the capacity of the optical link by sending two independent signals on two orthogonal polarization states, thus using the amplitude, phase and polarization state of the transmitted signal. Moreover, in order to cope with the increasing number of internet users and the growth of new bandwidth-demanding services such as cloud computing and real-time multimedia services, emerging research projects are studying new physical dimensions in the fiber such as space-division multiplexing (SDM), which consists in using parallel transmission paths per wavelength [1], ideally multiplying the capacity of the link by the number of paths. However, the increase in the spectral efficiency of a multiplexed optical transmission comes at the expense of the performance of the transmission scheme. This is mainly due to coupling effects and differential group delays, such as polarization mode dispersion and rotations of the polarization states in PDM systems [4], as well as differential losses between the multiplexed channels, such as polarization dependent loss (PDL) in PDM systems [5] and mode dependent loss (MDL) in SDM systems [6].
These effects can be modeled using a multi-input multi-output (MIMO) formalism [7], which paves the way to the adaptation of mature digital signal processing (DSP) tools initially designed for wireless MIMO transmissions and to the search for new MIMO techniques specific to the optical fiber channel.
In this paper, we will study the simplest multiplexed scheme which consists in a 2 × 2 polarization-multiplexed (PolMux) MIMO system. The implementation of these systems was made possible with the use of DSP algorithms and the development of high-speed electronics to recover the transmitted data at the receiver side in the electrical domain. The impairments undergone by the high bitrate transmitted signal can be categorized into two classes: linear and nonlinear. The major linear impairments are chromatic dispersion (CD), polarization-mode dispersion (PMD), and polarization-dependent loss (PDL) that can all be modeled using 2 × 2 transfer matrices (Jones matrices) straightforwardly leading to a 2 × 2 MIMO system representation. As for nonlinear effects, they will not be considered in the numerical and theoretical studies where we assess the penalties induced by the linear impairments. However, in the experimental study, we will observe the limitations caused by non-linear effects and determine the optimal operating point of the transmission (in terms of optimal launched signal power).
Dispersive effects preserve the total energy of the transmitted signal, and the equalization of CD and PMD is very effective both in a single-carrier context [2] and in multi-carrier systems such as orthogonal frequency division multiplexing (OFDM) based systems [7]. PDL, on the other hand, remains the main limiting linear impairment. In current systems (both coherent and non-coherent), PDL is not mitigated; instead, system margins are provisioned to absorb the induced penalties and ensure a target performance. In long-haul optical links, PDL is introduced by imperfect optical elements that unequally attenuate the polarization states of the incident signal and induce energy loss and optical signal-to-noise ratio (OSNR) fluctuations. Recently, space-time (ST) codes were suggested to mitigate PDL [8-11]. In wireless MIMO channels, ST coding improves the reliability of the transmission by sending multiple copies of a data symbol to the receiver on different antennas and at different time slots, expecting that the symbol will survive the impairments of the channel in a good enough state to allow reliable decoding [12]. The performance of ST codes, or more appropriately Polarization-Time (PT) codes in this context, such as the Golden code [13], the Silver code and the Alamouti code, was numerically evaluated on an optical channel with PDL [8]. A preliminary experimental demonstration validated the numerical results [9]. It was found that the Silver code performed better than the Golden code, unlike the case of a wireless Rayleigh fading channel. Moreover, the performance of the Alamouti code was found to be independent of the amount of PDL in the link [10]. In this work, we recall these recent numerical results, explain them theoretically, and demonstrate through a transmission experiment that PT codes can be implemented in a PolMux OFDM system and can mitigate PDL.
Accordingly, we start by defining the transmission scheme of a PT-coded coherent optical OFDM system in Section 2 as well as defining the channel model of an optical link with PDL.
In Section 3, we present the simulation results of the investigated ST codes showing their performance and the coding gains in a linear regime. We will also examine the achieved coding gains when a forward-error-correcting (FEC) code is added to the transmission scheme. Afterwards, in Section 4, we derive an upper bound of the error probability of an optical link with PDL to interpret the simulated performance and check whether the used coding schemes are optimal or better codes can be found. Finally, in Section 5, we experimentally validate the numerical and theoretical results in the previous sections and also demonstrate the ability of PT codes to mitigate in-line PDL in a 1000km-long optical link.

PT-coded coherent OFDM system model
In this section, we describe the general transmission scheme in which PT codes will be implemented. We also define the channel model of an optical transmission with PDL that will be considered in the simulations and the theoretical study. In a conventional PolMux OFDM transmission [7], two independently modulated OFDM signals, where each OFDM subcarrier carries an M-QAM symbol, are sent on two orthogonal polarization states. In a PT-coded OFDM system, by contrast, the modulated polarization states are correlated and carry a combination of M-QAM symbols. Indeed, PT coding consists in sending a combination of different modulated symbols on two polarization states ($\mathrm{Pol}_1$, $\mathrm{Pol}_2$) during several time slots. Hence, if we consider two consecutive time slots ($T_1$, $T_2$), the PT codeword matrix X will be:

$$X = \begin{bmatrix} X_{\mathrm{Pol}_1,T_1} & X_{\mathrm{Pol}_1,T_2} \\ X_{\mathrm{Pol}_2,T_1} & X_{\mathrm{Pol}_2,T_2} \end{bmatrix} \tag{1}$$

PolMux coherent OFDM transmission scheme
This operation of combining modulated symbols is represented by the block "Polarization-Time coding" in Fig. 1. The codeword matrices of the investigated coding schemes will be examined in the next section. The first row of X modulates the first polarization state and the second modulates the second orthogonal polarization state. The columns of X will be carried by the same subcarrier of two consecutive OFDM symbols. After the assignment of symbols to the different subcarriers, conventional OFDM processing is realized including an inverse Fast-Fourier Transform (iFFT) and the insertion of a well-dimensioned cyclic prefix (CP) to absorb all CD and PMD [7]. The main advantage of choosing OFDM to deal with dispersion instead of single-carrier based solutions is the reduced complexity of the equalization step, at the receiver side. Indeed, the use of a suitable CP allows to replace the multi-tap equalizer usually used to extract the useful symbol from the interfering symbols, with a single-tap frequency domain equalizer needed to correct any phase or amplitude distortion induced by the channel and separate the two polarizations. Next, the signal is up-converted to the optical domain and a polarization beam combiner combines the OFDM signals on two orthogonal polarization states.

Receiver side
At the receiver side, a polarization beam splitter splits the incident optical signal into two orthogonal polarization states and a dual-polarization coherent receiver down-converts the signal to the electrical domain. Next, OFDM processing is carried out, including CP removal and an FFT. Each OFDM subcarrier sees a non-dispersive channel and the received symbol is given by [7]:

$$Y_{k,i} = H_k(\omega_k)\, X_{k,i} + N_{k,i} \tag{2}$$

where $X_{k,i}$ is the 2 × 2 matrix of transmitted symbols on the $k$-th subcarrier ($k = 1 \ldots n$) of the $i$-th OFDM symbol and $Y_{k,i}$ is the 2 × 2 matrix of the received symbols. $H_k(\omega_k)$ is the 2 × 2 channel matrix including laser phase noise, CD, PMD and PDL. $N_{k,i}$ is a 2 × 2 matrix representing the additive noise. A training sequence known at the receiver is used to estimate the channel matrix for each subcarrier [3]. Then, the common phase error induced by the laser phase noise is corrected and finally the transmitted data symbols can be detected and demodulated. It is important to note that ST codes were designed to bring coding and diversity gains to the MIMO scheme when they are optimally decoded according to the maximum likelihood (ML) criterion instead of using sub-optimal linear decoders. Considering the channel model in Eq. (2) and assuming all codeword matrices X equiprobable, an ML decoder estimates the codeword X with X′ according to the following criterion:

$$X' = \arg\min_{X \in C} \lVert Y - HX \rVert^2 \tag{3}$$

where C is the set of all possible codewords and $\lVert\cdot\rVert$ is the Frobenius norm. The indices k and i were dropped for clarity. PT decoding can only be performed under the assumption of a constant H during the codeword duration (2 time slots in this case), which holds for the optical channel [14]. ML decoding is performed by an exhaustive search, with reasonable complexity when the considered modulation symbols come from a 4-QAM constellation. We point out that the decoding complexity is considerably reduced with the use of OFDM, at the cost of an increased overhead. PT coding could have been implemented in a single-carrier context with time domain equalization [15,16]. However, multi-tap channel matrices would then be required, increasing the decoding complexity.
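The exhaustive ML search described above can be illustrated with a short sketch. It is a minimal illustration, not the decoder used in this work: the helper names (`matmul2`, `frob_dist2`, `uncoded_codebook`, `ml_decode`), the unnormalized 4-QAM alphabet {±1 ± j} and the example channel matrix are all assumptions introduced here. The codebook enumerates uncoded 2 × 2 matrices; a PT code would simply supply a different list of codeword matrices.

```python
import itertools

# Assumed unnormalized 4-QAM alphabet (illustration only).
QAM4 = (1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j)


def matmul2(a, b):
    """Product of two 2x2 complex matrices stored as nested tuples."""
    return tuple(
        tuple(sum(a[r][m] * b[m][c] for m in range(2)) for c in range(2))
        for r in range(2)
    )


def frob_dist2(a, b):
    """Squared Frobenius distance between two 2x2 matrices."""
    return sum(abs(a[r][c] - b[r][c]) ** 2 for r in range(2) for c in range(2))


def uncoded_codebook():
    """All 2x2 matrices filled with four independent 4-QAM symbols.

    Rows are polarizations, columns are time slots. This gives 4**4 = 256
    matrices; without coding the search could also be split per column.
    """
    return [
        ((s1, s3), (s2, s4))
        for s1, s2, s3, s4 in itertools.product(QAM4, repeat=4)
    ]


def ml_decode(y, h, codebook):
    """Return the codeword X minimizing ||Y - H X||^2 over the codebook."""
    return min(codebook, key=lambda x: frob_dist2(y, matmul2(h, x)))


# Noise-free usage example with a hypothetical channel matrix.
codebook = uncoded_codebook()
h_example = ((1.2 + 0j, 0.1j), (0.0 + 0j, 0.6 + 0j))
sent = codebook[37]
decoded = ml_decode(matmul2(h_example, sent), h_example, codebook)
```

The cost grows with the codebook size, which is why the search stays practical for 4-QAM but becomes expensive for larger constellations.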

Long-haul optical link
In this section, we focus on the properties of the long-haul optical channel. We consider a multi-span optical link with $N_S$ spans where each span contains a fiber modeled as a concatenation of random PMD elements and inline components with PDL, as seen in Fig. 2. Each span is followed by an erbium-doped fiber amplifier (EDFA) operating in constant power mode to raise the signal power to the power initially injected at the transmitter. These amplifiers also add noise to the signal, caused by amplified spontaneous emission (ASE). The accumulated ASE noise at the end of the link is the dominant noise source in the whole transmission scheme. In the following, the channel model of the link will be described and simplified in order to carry out the numerical and theoretical studies of PDL mitigation with PT codes. We will, in particular, look at the statistics of PDL, its frequency dependence and the noise properties at the receiver side. We do not consider CD and phase noise because these two effects are polarization independent. Each fiber span is modeled with a concatenation of M random PMD elements where each element has a transfer matrix $H_{PMD}(\omega_k)$ defined as the product of a delay matrix and a rotation matrix as in [17]. These unitary effects cause no loss of energy to the transmitted signal, unlike PDL, which induces OSNR fluctuations. The transfer matrix of a PDL element is given by: The diagonal matrix gives the imbalanced attenuation values of the least and most attenuated polarization states and $R_\alpha$ is a random rotation matrix. α is uniformly drawn in [0, 2π]. ε and γ are defined through Γ dB = 10log 10 where Γ dB ≥ 0 is the PDL coefficient in dB (or simply referred to as PDL) and consists of the ratio between the highest and the lowest losses.
The normalization factor in the model using the variable γ is commonly dropped in literature [5,18] because the diagonal matrix and the rotation matrices are enough to take into account the OSNR inequality between the polarization states as well as their crosstalk.
The individual PDL value of each inline component is kept as low as possible, i.e., 0 ≤ γ ≪ 1. However, many of these components are interspersed in long-haul optical links, leading to a significant accumulated PDL. Previous works on PDL, especially by Mecozzi et al. [19] and Gisin [20], provide statistical descriptions of PDL. In [19], it is found that the probability density function of the global Γ dB of an optical link is a Maxwellian distribution when many low-PDL components are considered. Yet, field measurements of PDL in [21] showed that the probability distribution of Γ dB was truncated because of the presence of a limited number of elements in the link having an appreciable PDL. In all cases, the global Γ dB is a random variable depending on the number of PDL elements in the link and their individual PDL.
For each PDL element, γ is drawn from independent and identically distributed (iid) normal distributions with mean and variance determined by the desired mean global Γ dB and the number of PDL elements in the link as in [19]. The choice of the normal distribution is arbitrary because, regardless of the distributions of the individual PDL elements, the statistics of the global PDL [19] of the whole link approach a Maxwellian distribution when enough PDL elements are inserted. These statistics were simulated with $N_S > 10$ spans and M = 20, and the obtained distributions closely fitted the expected theoretical Maxwellian distributions.

Second, we look at the frequency dependence of the channel due to PMD, which implies a frequency dependence of the global PDL. Current optical fibers have a typical PMD coefficient $\mathrm{PMD}_c$ as low as 0.05 ps/√km. If we consider an L = 2000 km multi-span link, the root mean square accumulated DGD is $\sigma_{DGD} = \mathrm{PMD}_c \sqrt{L} \approx 2.2$ ps, which induces no fluctuations in the PDL experienced by the subcarriers of an OFDM signal occupying a bandwidth smaller than the coherence bandwidth $B_{coh}$ of the optical channel ($B_{coh} \propto \sigma_{DGD}^{-1}$ [22]). Hence, in the rest of our work, we will consider a narrow-band, frequency-independent channel model. Note that the frequency dependence of Γ dB due to PMD in the wide-band case was reported in [23,24]. In [24], the authors found that an increased DGD (up to an accumulated DGD of 100 ps) in the optical link reduced the BER variations caused by PDL but did not improve the degraded mean BER.
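As a quick check of the accumulated DGD quoted above, the following sketch evaluates the product of the PMD coefficient and √L for the stated values. The function name `rms_dgd_ps` and the 20 GHz signal bandwidth used in the last line are hypothetical choices made for illustration; the text does not specify the OFDM bandwidth or the proportionality constant of the coherence bandwidth.

```python
import math


def rms_dgd_ps(pmd_coeff_ps_per_sqrt_km, length_km):
    """RMS accumulated DGD in ps: PMD coefficient times sqrt(link length)."""
    return pmd_coeff_ps_per_sqrt_km * math.sqrt(length_km)


# Values quoted in the text: 0.05 ps/sqrt(km) over a 2000 km link.
sigma_dgd_ps = rms_dgd_ps(0.05, 2000.0)  # about 2.24 ps

# Hypothetical 20 GHz signal bandwidth (assumption, not from the text).
# A bandwidth-DGD product much smaller than 1 is consistent with the
# narrow-band, frequency-independent approximation.
bandwidth_dgd_product = 20e9 * sigma_dgd_ps * 1e-12
```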
Third, we look at the noise in a long-haul optical link. The noise which is added on each amplification stage is modeled as an additive white Gaussian noise (AWGN). However, the PDL elements in the link act as partial polarizers. The signal and the in-line injected noise propagating through the link are both polarized by PDL. Given that the optical amplifiers are operated in a constant output power mode, the received signal Y k at the output of the link can be written in function of the input signal X k and the in-line injected noise: with H j→N S = H Ns H Ns−1 ...H j+1 H j and H j being the transfer matrix of the j th span. Each component of the noise N j added after the j th span is white and Gaussian distributed with a zero mean and a variance N 0 per real dimension. Because of PDL, N k will be polarized and its coherency matrix Q = E N k N † k will not be proportional to the identity matrix. However, we can still define an equivalent system given by [25]: where N k is a zero-mean white Gaussian noise of variance σ 2 = N S N 0 per real dimension. The effective PDL experienced by the signal and given by the ratio of the eigenvalues of H k H † k will be smaller than the PDL given by the ratio of the eigenvalues of H k H † k but will also follow a Maxwellian distribution.
Using a singular value decomposition, $H_k$ can be written as the product of two unitary matrices U and V and a diagonal matrix with $\sqrt{1+\gamma_{eq}}$ and $\sqrt{1-\gamma_{eq}}$ on its diagonal, representing the different loss coefficients between the least and the most attenuated polarization states at the end of the link:

$$H_k = \sqrt{a}\; U \begin{bmatrix} \sqrt{1+\gamma_{eq}} & 0 \\ 0 & \sqrt{1-\gamma_{eq}} \end{bmatrix} V \tag{7}$$

with $\gamma_{eq} = (\lambda_{max} - \lambda_{min})/(\lambda_{max} + \lambda_{min})$ and $a = (\lambda_{max} + \lambda_{min})/2$, where $\lambda_{max}$ and $\lambda_{min}$ are the eigenvalues of $H_k H_k^{\dagger}$, so that $\lambda_{max} = a(1+\gamma_{eq})$ and $\lambda_{min} = a(1-\gamma_{eq})$. a is an inevitable loss coefficient induced by PDL. Being particularly interested in the differential attenuation and crosstalk effects due to PDL and polarization rotations, we will consider unitary rotation matrices. Moreover, knowing that the dynamics of the optical channel present rather slow variations [14], we will only regard constant values of $\Gamma_{dB} = 10\log_{10}\frac{1+\gamma_{eq}}{1-\gamma_{eq}}$ for the numerical and theoretical investigations. Hence, in the rest of the paper, we will consider the following channel matrix H to model an optical link with PDL:

Numerical investigation of PDL mitigation
Many ST codes were designed for wireless MIMO channels. We will consider the best-known ones, which have proven to be the best-performing codes on 2 × 2 and 2 × 1 Rayleigh fading channels: the Golden code, the Silver code and the Alamouti code. In the following, the codeword matrix X and the rate of each coding scheme are given. The rate of a code is defined as the number of transmitted symbols per time slot (ts). In the uncoded case, we simply fill the matrix with 4 different M-QAM symbols and the rate is equal to 2 M-QAM symbols/ts.

The Golden code
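The Golden code [13] has the best performance on 2 × 2 MIMO Rayleigh fading channels. The codeword matrix of the Golden code is:

$$X_G = \frac{1}{\sqrt{5}} \begin{bmatrix} \alpha (S_1 + \theta S_2) & \alpha (S_3 + \theta S_4) \\ i\,\bar{\alpha} (S_3 + \bar{\theta} S_4) & \bar{\alpha} (S_1 + \bar{\theta} S_2) \end{bmatrix}$$

where $\theta = \frac{1+\sqrt{5}}{2}$, $\bar{\theta} = \frac{1-\sqrt{5}}{2}$, $\alpha = 1 + i - i\theta$, $\bar{\alpha} = 1 + i - i\bar{\theta}$ and $S_1, S_2, S_3, S_4$ are M-QAM symbols. The Golden code achieves a full rate of 2 symbols/ts because 4 symbols are transmitted during 2 time slots. Hence, this code introduces no redundancy. Moreover, the minimum determinant of the codeword matrix (corresponding to a coding gain on a Rayleigh fading channel [12]) is proportional to 1/5, which is the highest value obtained for a 2 × 2 ST code.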
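The determinant property mentioned above can be checked by brute force. The sketch below assumes the standard form of the Golden code from [13], with θ = (1+√5)/2, its conjugate (1−√5)/2, α = 1 + i − iθ and the corresponding conjugate of α; the function names and the unnormalized 4-QAM alphabet {±1 ± j} are choices made here for illustration. Because the code is linear, the difference of two codewords is the codeword built from the symbol differences, so it is enough to scan all nonzero difference vectors.

```python
import itertools
import math

THETA = (1 + math.sqrt(5)) / 2
THETA_BAR = (1 - math.sqrt(5)) / 2
ALPHA = 1 + 1j - 1j * THETA
ALPHA_BAR = 1 + 1j - 1j * THETA_BAR


def golden_codeword(s1, s2, s3, s4):
    """Golden codeword (standard form assumed from [13]), as nested tuples."""
    k = 1 / math.sqrt(5)
    return (
        (k * ALPHA * (s1 + THETA * s2), k * ALPHA * (s3 + THETA * s4)),
        (k * 1j * ALPHA_BAR * (s3 + THETA_BAR * s4),
         k * ALPHA_BAR * (s1 + THETA_BAR * s2)),
    )


def det2(m):
    """Determinant of a 2x2 matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


def min_det_sq_4qam():
    """Minimum |det(X - X')|^2 over distinct codewords with 4-QAM symbols."""
    # Differences of two symbols taken from {+-1 +- j} lie in this set.
    diffs = [complex(re, im) for re in (-2, 0, 2) for im in (-2, 0, 2)]
    best = math.inf
    for d in itertools.product(diffs, repeat=4):
        if any(d):  # skip the all-zero difference (identical codewords)
            best = min(best, abs(det2(golden_codeword(*d))) ** 2)
    return best
```

With symbols spaced by 2, the minimum is expected to come out at 16 × 1/5 = 3.2, i.e. the 1/5 quoted for unit-spaced symbols scaled by the fourth power of the symbol spacing. The key point is that it never vanishes.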

The Silver code
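The Silver code has a slightly weaker performance than the Golden code on Rayleigh fading channels but it has the advantage of a reduced decoding complexity due to its particular structure. The codeword matrix of the Silver code is:

$$X_S = \begin{bmatrix} S_1 & -S_2^* \\ S_2 & S_1^* \end{bmatrix} + \begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix} \begin{bmatrix} z_1 & -z_2^* \\ z_2 & z_1^* \end{bmatrix}, \qquad \begin{bmatrix} z_1 \\ z_2 \end{bmatrix} = \frac{1}{\sqrt{7}} \begin{bmatrix} 1+i & -1+2i \\ 1+2i & 1-i \end{bmatrix} \begin{bmatrix} S_3 \\ S_4 \end{bmatrix}$$

where $S_1, S_2, S_3, S_4$ are M-QAM symbols. The minimum determinant of this code is proportional to 1/7. The Silver code also achieves a full rate of 2 symbols/ts because 4 symbols are transmitted during 2 time slots, making it a redundancy-free code as well.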

The Alamouti code
The Alamouti code has the optimal performance on a 2 × 1 MIMO Rayleigh fading channel and can also be used for 2 × 2 MIMO channels. The codeword matrix of the Alamouti code is defined by:

$$X_A = \begin{bmatrix} S_1 & -S_2^* \\ S_2 & S_1^* \end{bmatrix}$$

where $S_1$ and $S_2$ are M-QAM symbols. Note that the codeword matrix has an orthogonal structure, which makes the decoding straightforward. However, the Alamouti code introduces redundancy as it has a rate of only 1 symbol/ts (half-rate code).
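The orthogonal structure mentioned above can be verified numerically. The sketch assumes the usual Alamouti arrangement, with rows [S1, −S2*] and [S2, S1*]; the function names are illustrative. For any pair of symbols, the product of the codeword with its conjugate transpose should be (|S1|² + |S2|²) times the identity.

```python
def alamouti_codeword(s1, s2):
    """Alamouti codeword (usual arrangement assumed): rows are polarizations."""
    return ((s1, -s2.conjugate()), (s2, s1.conjugate()))


def gram(x):
    """Return X X^H for a 2x2 matrix X stored as nested tuples."""
    return tuple(
        tuple(
            sum(x[r][m] * x[c][m].conjugate() for m in range(2))
            for c in range(2)
        )
        for r in range(2)
    )


# Example with two 16-QAM points: |3+1j|^2 + |-1-3j|^2 = 10 + 10 = 20.
g = gram(alamouti_codeword(3 + 1j, -1 - 3j))
# g[0][0] and g[1][1] equal 20, and the off-diagonal entries are 0.
```

This orthogonality is what lets the decoder separate the two symbols with a single linear operation instead of a joint search.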

Performance analysis of PT codes
The performance of a PT-coded OFDM transmission on an optical channel with PDL will be investigated using the following frequency non-selective channel model with H defined in Eq. (8):

$$Y = HX + N$$

The noise N is modeled as additive white noise with iid circularly-symmetric complex Gaussian components $\mathcal{N}(0, 2\sigma^2)$. At the transmitter, 4-QAM symbols will be used for the uncoded case as well as to fill the codeword matrices of the Silver and the Golden codes (full-rate codes). On the other hand, 16-QAM symbols will be used for the Alamouti code (half-rate code) in order to compare the schemes at the same spectral efficiency of 4 information bits/ts. At the receiver, ML decoding is performed by an exhaustive search with a reasonable complexity. In the case of an uncoded scheme, there are 16 possible codewords in C. In the case of the Silver or the Golden code, C has $4^4 = 256$ codewords. The Alamouti code benefits from its orthogonal structure and achieves the ML criterion with a single decoding operation. Monte Carlo simulations are carried out in order to evaluate the performance of the different coding schemes in terms of BER.

Figure 3(a) shows the BER evolution versus the SNR per bit for a PDL of 6 dB. Without PT coding, the SNR degradation induced by PDL is about 2.3 dB at BER = $10^{-3}$. The SNR penalty, defined as the SNR gap to the PDL-free case at a given BER, is only 0.6 dB with the Golden code and 0.3 dB with the Silver code, the latter corresponding to a coding gain of 2 dB. Apart from the important coding gains, these codes do not introduce any spectral efficiency penalties compared to the uncoded case as they are by construction redundancy-free. On the other hand, looking at the performance of the Alamouti code, we notice an SNR penalty of 3.6 dB at BER = $10^{-3}$. This is due to the use of 16-QAM symbols. In Fig. 3(b), we observe the performance of the Alamouti code for 3 different levels of PDL. We notice that the code performs the same independently of the amount of PDL.
The SNR penalties at BER = $10^{-3}$ for different PDL values were evaluated for the first time in [8] and the results were confirmed in [11]. For PDL values less than 4 dB, the Silver code can mitigate almost all PDL effects and always performs better than the Golden code. In [10], the performance of the Alamouti code was also found to be independent of PDL. Moreover, experimental demonstrations of PDL mitigation with PT coding were also realized in a back-to-back (no transmission) scenario [9]. However, all these observations were still analytically unexplained.
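The PDL-independence of the Alamouti code can be probed with a small numerical check. This is only a sanity check under stated assumptions, not the analytical explanation developed in this paper: the channel is taken here as a diagonal PDL matrix diag(√(1+γ), √(1−γ)) followed by a real rotation of angle β, which is an assumed simplified form, and all function names are hypothetical. The sketch converts Γ dB to γ through Γ dB = 10 log10((1+γ)/(1−γ)) and evaluates the squared received distance between two codewords, which is the quantity that drives ML decoding errors.

```python
import math


def gamma_from_pdl_db(pdl_db):
    """Invert Gamma_dB = 10*log10((1 + g) / (1 - g)) for g."""
    ratio = 10 ** (pdl_db / 10)
    return (ratio - 1) / (ratio + 1)


def pdl_db_from_gamma(g):
    """PDL in dB for a given gamma, with 0 <= g < 1."""
    return 10 * math.log10((1 + g) / (1 - g))


def pdl_channel(pdl_db, beta):
    """Assumed channel: diag(sqrt(1+g), sqrt(1-g)) times a rotation by beta."""
    g = gamma_from_pdl_db(pdl_db)
    c, s = math.cos(beta), math.sin(beta)
    a, b = math.sqrt(1 + g), math.sqrt(1 - g)
    return ((a * c, a * s), (-b * s, b * c))


def received_distance2(h, dx):
    """Squared Frobenius norm of H * dX for 2x2 nested-tuple matrices."""
    return sum(
        abs(sum(h[r][m] * dx[m][c] for m in range(2))) ** 2
        for r in range(2)
        for c in range(2)
    )


def alamouti_difference(d1, d2):
    """Difference of two Alamouti codewords with symbol differences d1, d2."""
    return ((d1, -d2.conjugate()), (d2, d1.conjugate()))


gamma_6db = gamma_from_pdl_db(6.0)  # roughly 0.60
# Alamouti: the distance stays at 2 * (|d1|^2 + |d2|^2) for any PDL value.
alamouti_d2 = [
    received_distance2(pdl_channel(p, 0.7), alamouti_difference(2 + 0j, -2j))
    for p in (0.0, 3.0, 6.0)
]
# Uncoded single-symbol error: the distance changes with the PDL value.
uncoded_d2 = [
    received_distance2(pdl_channel(p, 0.7), ((2 + 0j, 0j), (0j, 0j)))
    for p in (0.0, 3.0, 6.0)
]
```

Under this assumed model, the orthogonality of the Alamouti difference matrix makes the received distance depend only on the trace of HH†, which equals 2 whatever the value of γ, whereas the uncoded distance varies with both γ and β.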

Concatenation of FEC and PT coding
While PT coding technique uses the modulated symbols to form a codeword matrix and mitigate some channel effects, other coding techniques, known as forward error-correcting (FEC) or channel codes, operate on the information bits and add some redundancy in order to enhance the performance over the noisy channel. The FEC block at the transmitter side precedes the M-QAM modulation block and a corresponding FEC decoding unit at the receiver side follows the demodulation block. Linear block codes are a major class of channel codes where a binary information sequence of length k is mapped to a binary sequence of length n called a codeword.
The code rate of a block code is denoted by $R_c$:

$$R_c = \frac{k}{n}$$

The optimal decoding strategy of FEC codes is soft decision decoding (SDD) [26, Chap. 7], which receives at its input a real value indicating the reliability of each coded bit (e.g. log-likelihood ratios) and computes the original information bits. A sub-optimal, yet computationally simple, decoding strategy is hard decision decoding (HDD) [26, Chap. 7], which consists in quantizing the samples at the output of the PT demodulator before sending them to the FEC decoder. We are interested, in this section, in evaluating the total gain provided by FEC coding and PT coding through Monte Carlo simulations. The performance on an optical channel with Γ dB = 6 dB of an uncoded 4-QAM scheme (no FEC, no PT) and a Silver-coded scheme is compared with a Bose-Chaudhuri-Hocquenghem (BCH) coded binary sequence, where n = 63, k = 45 and the error correction capability is t = 3, followed by 4-QAM modulation and PT coding (Silver code). The obtained BER curves are plotted in Fig. 4. At a BER = $10^{-4}$, the gain provided by the Silver code alone is 2.4 dB. When HDD is considered, the gain provided by the BCH code alone is 1.8 dB. The concatenation of both codes offers a total coding gain of 3.9 dB, approximately equal to the sum of the separate coding gains, 2.4 + 1.8 = 4.2 dB. The summation of the FEC and PT coding gains is also observed when using SDD. In this case, the gain offered by the BCH code alone at a BER = $10^{-4}$ is 2.8 dB. The concatenation with the Silver code provides a total gain of 4.9 dB and the sum of the separate coding gains is 2.4 + 2.8 = 5.2 dB.
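The figures quoted in this section can be tied together with a few lines of arithmetic. The sketch below takes the code rate of a block code to be k/n, the usual definition, and evaluates it for the BCH(63, 45) code; the variable names are illustrative, and the net information rate per time slot is a derived quantity, not a number reported in the text.

```python
from fractions import Fraction


def code_rate(k, n):
    """Rate of an (n, k) block code as an exact fraction."""
    return Fraction(k, n)


bch_rate = code_rate(45, 63)  # 5/7, about 0.714

# Derived figure (assumption for illustration): two polarizations, each
# carrying one 4-QAM symbol (2 bits) per time slot, so 4 coded bits/ts.
info_bits_per_ts = bch_rate * 2 * 2  # 20/7, about 2.86

# Gap between the sum of separate gains and the measured total gain (dB),
# using the values quoted in the text.
gap_hdd_db = (2.4 + 1.8) - 3.9
gap_sdd_db = (2.4 + 2.8) - 4.9
```

With the quoted values, both gaps come out at about 0.3 dB, which is the sense in which the two coding gains are said to add up approximately.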

Theoretical analysis of PDL mitigation
Three main results are retained from the previous numerical investigations:
• the Silver code performs better than the Golden code, unlike the case of the wireless Rayleigh fading channel.
• the performance of the Alamouti code is independent of the amount of PDL in the link.
• the total coding gain obtained when using both a FEC code and a PT code is equal to the sum of the separate coding gains.
These findings show that MIMO codes behave differently on the optical channel. Further exploration of the mitigation of PDL with MIMO codes is required to understand the obtained results and to check whether the investigated codes are optimal or whether better codes can be found. Many previous works have investigated the performance of PolMux schemes in the presence of PDL in terms of the outage probability [11,25,27]. However, an outage analysis only gives the performance limits of a transmission scheme: the only properties of the transmitted signal that appear in it are the average symbol energy and the transmission rate. On the other hand, an error probability analysis, based on computing an upper bound of the probability that an ML decoder erroneously decodes the transmitted symbol, takes into account the codeword matrix of the transmitted symbols. It is an appropriate tool to extract the criteria that a codeword matrix should satisfy in order to minimize or mitigate the channel impairments. A well-known and extensively used error probability computation for the MIMO Rayleigh fading channel can be found in [12]. This computation revealed the design criteria that were used to construct the Golden and the Silver codes for the wireless channel. In this section, we derive an upper bound of the error probability for an optical channel with PDL.

Upper bound of the error probability
The error probability is defined as: For equiprobable codewords, using the union bound, the error probability is upper-bounded by: where $\mathrm{card}(\mathcal{C})$ is the cardinality of $\mathcal{C}$ and $\Pr(X \to X')$ is the pairwise error probability (PEP), computed as if $X$ and $X'$ were the only possible codewords in the codeword space. To compute the PEP, we define the conditional PEP, which we then average over all possible channel realizations. For an ML decoder, the conditional PEP is defined as: Using the Gaussian properties of the noise $N$ and applying Chernoff's bound, the PEP can be upper-bounded by [26,Chap.4]: where $E_H[\cdot]$ denotes averaging over all possible channel realizations.
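In textbook notation, the quantities described in this paragraph usually take the following form. This is a sketch under stated assumptions: the received block is written $Y = HX + N$ (the symbol $Y$ is an assumption), and the Chernoff constant $c$ depends on the noise normalization, with $c = 1/(8\sigma^2)$ for circular Gaussian noise of variance $\sigma^2$ per real dimension.

```latex
\begin{align}
P_e &\le \frac{1}{\mathrm{card}(\mathcal{C})}
        \sum_{X \in \mathcal{C}} \sum_{X' \neq X} \Pr(X \to X') \\
\Pr(X \to X' \mid H) &= \Pr\left( \|Y - HX'\|^2 \le \|Y - HX\|^2 \,\middle|\, H \right) \\
\Pr(X \to X') &\le E_H\left[ \exp\left( -c \, \|H (X - X')\|^2 \right) \right]
\end{align}
```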
Averaging over $H$ given in Eq. (8), where we consider a constant value of $\Gamma_{dB}$ and a rotation angle uniformly distributed in $[0, 2\pi]$, we get: where $I_0(z)$ is the zeroth-order modified Bessel function of the first kind.
$x_{1,2}$ being row vectors and: We can approximate the error probability expression in Eq. (18) for high SNR values by using a first-order approximation of $I_0(z)$ when $z \to \infty$ and get:
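The appearance of $I_0$ can be checked numerically. The sketch below assumes that, up to a unitary factor that leaves the norm unchanged, the PDL channel can be written as $H = \mathrm{diag}(\sqrt{1+\gamma}, \sqrt{1-\gamma})\,R_\theta$, and that $a = \|x_1\|^2 - \|x_2\|^2$ and $b = 2\,\mathrm{Re}(x_1 x_2^H)$, where $x_1$ and $x_2$ are the rows of the codeword difference $X_\Delta$. These definitions, the constant `kappa` (the Chernoff constant) and all function names are illustrative assumptions, not the paper's notation. Under them, averaging $\exp(-\kappa\|HX_\Delta\|^2)$ over a uniform angle gives $\exp(-\kappa\|X_\Delta\|^2)\,I_0(\kappa\gamma\sqrt{a^2+b^2})$.

```python
import math

def bessel_i0(z, terms=60):
    """Zeroth-order modified Bessel function of the first kind, by power series."""
    total, term = 0.0, 1.0
    for k in range(terms):
        total += term
        term *= (z / 2.0) ** 2 / ((k + 1) ** 2)
    return total

def a_and_b(x_delta):
    """Assumed definitions of a and b from the two rows of the codeword difference."""
    x1, x2 = x_delta
    a = sum(abs(v) ** 2 for v in x1) - sum(abs(v) ** 2 for v in x2)
    b = 2.0 * sum(u * v.conjugate() for u, v in zip(x1, x2)).real
    return a, b

def faded_energy(x_delta, gamma, theta):
    """||H X_delta||^2 for the assumed H = diag(sqrt(1+gamma), sqrt(1-gamma)) @ R(theta)."""
    x1, x2 = x_delta
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    row1 = [cos_t * u - sin_t * v for u, v in zip(x1, x2)]  # first row of R X_delta
    row2 = [sin_t * u + cos_t * v for u, v in zip(x1, x2)]  # second row of R X_delta
    return ((1 + gamma) * sum(abs(v) ** 2 for v in row1)
            + (1 - gamma) * sum(abs(v) ** 2 for v in row2))

def averaged_numeric(x_delta, gamma, kappa, n_theta=2048):
    """Average of exp(-kappa * ||H X_delta||^2) over equally spaced rotation angles."""
    thetas = (2.0 * math.pi * k / n_theta for k in range(n_theta))
    return sum(math.exp(-kappa * faded_energy(x_delta, gamma, t)) for t in thetas) / n_theta

def averaged_closed_form(x_delta, gamma, kappa):
    """exp(-kappa * ||X_delta||^2) * I0(kappa * gamma * sqrt(a^2 + b^2))."""
    a, b = a_and_b(x_delta)
    energy = sum(abs(v) ** 2 for row in x_delta for v in row)
    return math.exp(-kappa * energy) * bessel_i0(kappa * gamma * math.hypot(a, b))

if __name__ == "__main__":
    x_delta = ((1 + 0.5j, -0.3j), (0.2 - 1j, 0.7 + 0.1j))  # arbitrary test difference
    print(averaged_numeric(x_delta, 0.6, 1.5), averaged_closed_form(x_delta, 0.6, 1.5))
```

The two printed values should agree to numerical precision, which illustrates why the bound depends on the codeword difference only through $\|X_\Delta\|^2$ and $\sqrt{a^2+b^2}$.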

Design criterion
Since $I_0(z)$ is monotonically increasing for $z \geq 0$, its minimum is reached at $z = 0$. This corresponds to $a = b = 0$, in which case the obtained error probability is independent of PDL. Consequently:

Proposition 1 A Polarization-Time code completely mitigates PDL if and only if all codeword differences satisfy $a = b = 0$.
When this design criterion is met, we recover the performance over two parallel AWGN channels, which is the best achievable performance (Eq. (22)). The resulting criterion is completely different from the rank and minimum determinant criteria derived for a Rayleigh fading channel [12], which define respectively a diversity gain and a coding gain on the BER curves. If we compare the high-SNR approximation of the error probability in Eq. (21) with the one obtained for a 2 × 2 MIMO Rayleigh fading channel, we notice different behaviors. The error probability of the Rayleigh fading channel decays as $\mathrm{SNR}^{-2r}$ [12], where r is the minimum rank of the matrix $X_\Delta$. The diversity of the system is defined as the exponent of $\mathrm{SNR}^{-1}$ (2r in this case) and can be read graphically as the slope of the BER curves at high SNR values. In contrast, the error probability of the PDL channel decays exponentially as a function of the SNR, as in the case of an additive white Gaussian noise channel in Eq. (22). Hence, Space-Time codes, used as Polarization-Time codes, bring no diversity gain to the optical channel with PDL. They bring only a coding gain, evaluated in the following, which reduces the penalty induced by PDL.

Performance analysis of PT codes
The performance of both coded and uncoded schemes will be examined using the derived upper bound of the pairwise error probability in Eq. (21). To compare the different schemes, we compute the squared distance $d^2 = \|X_\Delta\|^2 - \gamma_{eq}\sqrt{a^2 + b^2}$ for all codeword differences and compare the minimum values of $d^2$, denoted $d^2_{min}$, across the codes. The best code is the one that maximizes this minimum value. In Table 1, we report the minimum values of $\|X_\Delta\|^2 - \gamma_{eq}\sqrt{a^2 + b^2}$ analytically computed for each investigated PT code at four different PDL values. To fill the table, we set a spectral efficiency of 4 bits per time slot for all coding schemes and consider an average symbol energy $E_s = 1$ for all constellations.
First, we notice that the Alamouti code has the same minimum distance for all PDL values, which explains why it performs the same independently of PDL in Fig. 3(b). This is due to the orthogonality of its codeword matrix (Eq. (11)), which induces $a = b = 0$ for all possible codeword differences. Hence, this code satisfies the criterion of Proposition 1. However, its performance is penalized by the use of 16-QAM symbols, which give a squared minimum distance of 0.8. Second, we note that the Silver code is not optimal in mitigating PDL since it does not satisfy the derived design criterion. Unlike the Alamouti code, the Silver code has only some codeword differences with $a = b = 0$. In Fig. 5, we plot the performance of the Silver code for different PDL values. We see that the code mitigates almost all PDL when the PDL is equal to 3dB and 6dB, whereas for a PDL of 10dB the code is not able to completely compensate PDL. The values of $d^2_{min}$ reported in Table 1 explain this behavior. At a PDL of 3dB and 6dB, $d^2_{min}$ is given by the same codeword difference, for which $a = b = 0$, and is equal to 2. At a PDL of 10dB, it falls to 1.23 and is given by another codeword difference, for which $a$ and $b$ are not both zero. Third, in Fig. 3(a), we saw that the Silver code outperforms the Golden code for $\Gamma_{dB}$ = 6dB, and that both reduce the penalty that PDL causes to the uncoded scheme. Again, this result can be explained by looking at Table 1. Indeed, $d^2_{min}$ is the greatest for the Silver code, followed by the Golden code and then the uncoded scheme.
In conclusion, we were able to explain, in terms of error probability bounds, the performance of the Alamouti, the Silver and the Golden codes on an optical channel with PDL. These codes were designed to satisfy the rank and minimum determinant criteria for a wireless channel, which are no longer relevant for the optical channel.
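The Alamouti observations above (a = b = 0 for every codeword difference and a squared minimum distance of 0.8 with unit-energy 16-QAM) can be reproduced with a short script. The sketch assumes the textbook Alamouti matrix, which may differ from Eq. (11) by sign or ordering conventions, and the definitions $a = \|x_1\|^2 - \|x_2\|^2$, $b = 2\,\mathrm{Re}(x_1 x_2^H)$ on the rows of the codeword difference; these definitions and all function names are illustrative assumptions.

```python
import itertools
import math

LEVELS = (-3, -1, 1, 3)
# 16-QAM scaled so that the average symbol energy is E_s = 1.
QAM16 = [complex(i, q) / math.sqrt(10) for i in LEVELS for q in LEVELS]

def alamouti(s1, s2):
    """Textbook Alamouti codeword (assumed form): rows are polarizations, columns are time slots."""
    return ((s1, -s2.conjugate()), (s2, s1.conjugate()))

def difference(xa, xb):
    """Entry-wise difference of two codeword matrices."""
    return tuple(tuple(u - v for u, v in zip(ra, rb)) for ra, rb in zip(xa, xb))

def a_and_b(x_delta):
    """Assumed definitions of a and b from the two rows of the codeword difference."""
    x1, x2 = x_delta
    a = sum(abs(v) ** 2 for v in x1) - sum(abs(v) ** 2 for v in x2)
    b = 2.0 * sum(u * v.conjugate() for u, v in zip(x1, x2)).real
    return a, b

def squared_distance(x_delta, gamma_eq):
    """d^2 = ||X_delta||^2 - gamma_eq * sqrt(a^2 + b^2)."""
    a, b = a_and_b(x_delta)
    energy = sum(abs(v) ** 2 for row in x_delta for v in row)
    return energy - gamma_eq * math.hypot(a, b)

def alamouti_summary(gamma_eq):
    """Largest |a| or |b| and smallest d^2 over all pairs of distinct codewords."""
    codebook = [alamouti(s1, s2) for s1, s2 in itertools.product(QAM16, repeat=2)]
    worst_ab, d2_min = 0.0, math.inf
    for xa, xb in itertools.combinations(codebook, 2):
        delta = difference(xa, xb)
        a, b = a_and_b(delta)
        worst_ab = max(worst_ab, abs(a), abs(b))
        d2_min = min(d2_min, squared_distance(delta, gamma_eq))
    return worst_ab, d2_min

if __name__ == "__main__":
    for gamma_eq in (0.0, 0.3, 0.6, 0.9):
        print(gamma_eq, alamouti_summary(gamma_eq))
```

Because a and b vanish for every pair, the computed minimum does not move with `gamma_eq`, which matches the PDL-independent Alamouti row of Table 1.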

Concatenation of FEC and PT coding
The numerical investigation of the performance of a PolMux scheme using both a FEC code and a PT code showed that the total coding gain is equal to the sum of the separate coding gains brought by each. This result can also be explained theoretically and is mainly due to the exponential decrease of the error probability as a function of the SNR.
The bit error probability of a linear block code over an AWGN channel, after hard decision decoding, is determined by the minimum distance of the considered FEC code $d_{min,FEC}$ and the crossover probability p of the equivalent binary symmetric channel (BSC) [26,Chap.7]: PT using a linear block code and a full-rate PT code. At the same achieved error probability, the coding gain G of the coded scheme is given by: In decibels, we obtain $G_{dB}$: The first term denotes the coding gain of the FEC code and the third term denotes the coding gain provided by PT coding. Equation (30) shows that the total asymptotic gain obtained when concatenating a FEC code and a PT code is the sum of the gains provided by each code separately, validating the third result of our numerical investigation.
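As a rough illustration of why the gains add in decibels, consider the following sketch. It assumes a bounded-distance hard-decision decoder that corrects $t$ errors, a crossover probability that decays exponentially with the SNR, and it ignores polynomial prefactors. The constant $\kappa$ and the symbols $d^2_{\mathrm{PT}}$ and $d^2_{\mathrm{unc}}$ are illustrative assumptions and need not match the terms of Eq. (30).

```latex
\begin{align}
p_{\mathrm{unc}} &\sim \exp\left(-\kappa\, d^2_{\mathrm{unc}}\, \mathrm{SNR}_{bit}\right), \qquad
p_{\mathrm{PT}} \sim \exp\left(-\kappa\, R_c\, d^2_{\mathrm{PT}}\, \mathrm{SNR}_{bit}\right) \\
P_{\mathrm{FEC+PT}} &\sim p_{\mathrm{PT}}^{\,t+1}
  = \exp\left(-\kappa\,(t+1)\, R_c\, d^2_{\mathrm{PT}}\, \mathrm{SNR}_{bit}\right) \\
G &= (t+1)\, R_c \cdot \frac{d^2_{\mathrm{PT}}}{d^2_{\mathrm{unc}}}
\quad\Longrightarrow\quad
G_{dB} = 10\log_{10}\big((t+1)\, R_c\big) + 10\log_{10}\frac{d^2_{\mathrm{PT}}}{d^2_{\mathrm{unc}}}
\end{align}
```

Because both exponents are linear in the SNR, the FEC and PT factors multiply in linear units and therefore add in decibels. On a Rayleigh fading channel, where the error probability decays polynomially, the same argument would not go through.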

Experimental validation of PDL mitigation
The numerical and theoretical validations of PDL mitigation using PT coding are limited to a linear channel with a single-element or lumped PDL and AWGN noise at the receiver side. Initial experimental results limited to the linear regime also showed the efficiency of PT codes [9].
These simplifications allowed us to analyze the potential of MIMO codes for PolMux optical transmission using a handy and well defined channel model. The final remaining step is to test the ability of PT codes to mitigate inline PDL through a transmission experiment with distributed PDL taking into account the interactions of PDL with distributed ASE noise [28] and non-linear effects [18].

Experimental setup
The transmission scheme, shown in Fig. 6, consists of a PolMux OFDM transmitter, a 1000 km optical link and a dual-polarization coherent receiver. The OFDM signals are made of 256 subcarriers including 194 data subcarriers and 10 pilot subcarriers for common phase error estimation. The data subcarriers are alternately modulated with 4-QAM symbols, Silver-, Golden- and 16-QAM Alamouti-coded symbols. An 18-sample CP is appended to each OFDM symbol (7% overhead). A time-multiplexed training sequence [3] is periodically inserted for channel estimation and for time and frequency synchronization. The total effective bitrate is 25 Gbit/s while the raw bitrate is 36 Gbit/s, taking into account the different transmission overheads. The OFDM signals are generated by an arbitrary waveform generator (AWG) with two outputs. Hence, to emulate PolMux, we consider OFDM signals satisfying the Hermitian symmetry property. Next, up-conversion to the optical domain is done using two single-drive Mach-Zehnder modulators (MZM). The two optical signals are then combined by a Polarization Beam Combiner and multiplexed in the optical link among seven 50 GHz-spaced wavelengths. Further details on the experimental setup can be found in [29]. The transmission line consists of a recirculating loop containing 2 spans of 100 km of SMF, a PDL element of 2dB, a polarization scrambler to randomize the polarization state between successive loops and an optical band-pass filter. After additional ASE noise loading at the receiver, the desired wavelength is filtered and the signal is detected with a dual-polarization coherent receiver. The received signal is then acquired with a real-time Tektronix oscilloscope, and offline processing [3,7], ending with ML decoding, is carried out to measure the BER and the corresponding Q-factor in dB, commonly used in optical communications: $Q_{dB} = 20\log_{10}\left(\sqrt{2}\,\mathrm{erfc}^{-1}(2\,\mathrm{BER})\right)$
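The BER-to-Q conversion can be sketched numerically as follows, assuming the usual Gaussian-noise relation BER = ½ erfc(Q/√2), which is equivalent to Q = −Φ⁻¹(BER) with Φ the standard normal distribution function. The function names are illustrative, and only the Python standard library is used.

```python
import math
from statistics import NormalDist

def q_factor_db(ber):
    """Q-factor in dB for a given BER, assuming BER = 0.5 * erfc(Q / sqrt(2))."""
    if not 0.0 < ber < 0.5:
        raise ValueError("BER must lie strictly between 0 and 0.5")
    # 0.5 * erfc(Q / sqrt(2)) is the Gaussian tail probability, so Q = -Phi^{-1}(BER).
    q_linear = -NormalDist().inv_cdf(ber)
    return 20.0 * math.log10(q_linear)

def ber_from_q_db(q_db):
    """Inverse conversion: BER corresponding to a Q-factor given in dB."""
    q_linear = 10.0 ** (q_db / 20.0)
    return 0.5 * math.erfc(q_linear / math.sqrt(2.0))

if __name__ == "__main__":
    for q_db in (10.5, 11.8, 12.3):
        print(q_db, "dB ->", ber_from_q_db(q_db))
```

With this convention, a BER of $10^{-3}$ corresponds to a Q-factor of about 9.8dB.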

PT coding in non-linear propagation regime
We will transmit the PT-coded OFDM in the recirculating loop of Fig. 6 and replace the in-line PDL element by an attenuator having the same insertion loss and emulate PDL at the transmitter side by inserting an attenuator in one branch of the PolMux transmitter (this corresponds to the worst case of OSNR degradation of one polarization tributary [18]). First, BER measurements are carried out in a back-to-back setup [29] validating the previous simulated performance of all coding schemes. Then, BERs are measured after a 1000km-transmission for input powers ranging from -15 to 5dBm. The obtained bathtub-shaped curves are plotted in Fig. 7 for the 4-QAM and Silver-coded schemes at a PDL of 0, 3 and 6dB. First, we notice that all curves have their optimum operating point at an input power of -3dBm. However, the uncoded scheme suffers severely from PDL (Q = 10.5dB at -3 dBm for PDL = 6dB) while the Silver code completely compensates the penalty at PDL = 3dB and nearly all penalty at PDL = 6dB (Q = 12.3dB at -3dBm). Second, the curves can be separated into three regimes: a linear regime, up to -6dBm, where non-linear effects are negligible and the coding gains match the numerical results obtained with a linear channel model; a moderate non-linear regime, between -6dBm and 0dBm, where a minimum BER is reached; and a severe non-linear regime where the performance of all schemes is deteriorated. An important result is that the PT-coded OFDM does not induce any extra penalties in presence of non-linear effects. This experimental validation confirms a previous numerical investigation of the behavior of PT codes in non-linear propagation regime [30].

Mitigation of distributed in-line PDL
We remove the emulated PDL at the transmitter side and insert the PDL element of 2dB into the loop. 2000 Q-factor measurements are then carried out for the different coding schemes at the optimum operating point. After additional ASE noise loading, the measured OSNR at the receiver is 12dB, which allows us to evaluate the complete Q-factor distributions. The OSNR is proportional to the $\mathrm{SNR}_{bit}$ that we used throughout this paper: where $R_b$ is the total transmitted bitrate and $B_{ref}$ is the reference spectral bandwidth of 0.1nm. The obtained Q-factor distributions are shown in Fig. 8. We also show, in the inset, the measured probability distribution of PDL and the fitted theoretical distribution. PDL is measured using the estimated 2 × 2 channel matrices, and a Maxwellian probability distribution function of PDL is obtained with a 4.2dB mean (the expected theoretical mean being 4dB [19]).
The Q-factor distribution for the uncoded 4-QAM subcarriers is asymmetric, with a long tail towards the worst Q-factors, and has a mean of 11.2dB and a standard deviation (std) of 0.55dB. In contrast, the distribution for the Silver- and Golden-coded subcarriers is symmetric, with a mean Q-factor of 11.8dB (the same value observed in the PDL-free case when OSNR = 12dB). The Q-factor distributions are also narrower when PT codes are used. The Silver code gives a distribution slightly narrower than the one obtained with the Golden code: std of 0.35dB and 0.37dB respectively. For the Alamouti code, we observe a mean Q-factor of 8.3dB, due to the use of 16-QAM symbols to guarantee the same spectral efficiency as the other schemes. However, its Q-factor distribution is the narrowest of all, with a std of 0.24dB. This is due to the orthogonal structure of the Alamouti codeword matrix, which makes its performance independent of the amount of accumulated PDL in the link, as found in the theoretical analysis in section 4. The small residual variance of its Q-factor distribution can be ascribed to the increased sensitivity of 16-QAM modulation to phase noise and non-linear effects.
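The proportionality between OSNR and $\mathrm{SNR}_{bit}$ mentioned above can be sketched as follows. The code assumes the commonly used convention $\mathrm{OSNR} = \frac{R_b}{2 B_{ref}}\,\mathrm{SNR}_{bit}$ and takes 0.1nm as roughly 12.5 GHz near 1550 nm. Both the factor of 2 and the function names are assumptions and may differ from the paper's exact expression.

```python
import math

B_REF_HZ = 12.5e9  # 0.1 nm reference bandwidth, approximately 12.5 GHz near 1550 nm

def osnr_db_from_snr_bit_db(snr_bit_db, bit_rate_bps, b_ref_hz=B_REF_HZ):
    """Assumed convention: OSNR = R_b / (2 * B_ref) * SNR_bit, evaluated in dB."""
    return snr_bit_db + 10.0 * math.log10(bit_rate_bps / (2.0 * b_ref_hz))

def snr_bit_db_from_osnr_db(osnr_db, bit_rate_bps, b_ref_hz=B_REF_HZ):
    """Inverse of osnr_db_from_snr_bit_db under the same assumed convention."""
    return osnr_db - 10.0 * math.log10(bit_rate_bps / (2.0 * b_ref_hz))

if __name__ == "__main__":
    # Example with the 12dB OSNR and the two bitrates quoted in the text.
    for rate in (25e9, 36e9):
        print(rate, snr_bit_db_from_osnr_db(12.0, rate))
```

Under this convention, a bitrate of 25 Gbit/s makes OSNR and $\mathrm{SNR}_{bit}$ numerically equal in dB, since $R_b = 2 B_{ref}$.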

Conclusion
We have presented numerical and theoretical studies of PDL mitigation using redundancy-free coding schemes. The Silver code can efficiently mitigate PDL in current long-haul optical links, where PDL values rarely exceed 6dB [21]. However, the theoretical study of PDL mitigation showed that this code is not optimal. The established upper bound of the error probability for an optical channel with PDL yields the design criterion required to construct codes that guarantee a PDL-independent error probability, restoring the performance over an additive white Gaussian noise channel. These design rules open the way for the construction of optimal codes dedicated to PDL mitigation in the optical channel. We have also shown that the total coding gain of a concatenation of a FEC code and a PT code is the sum of the separate coding gains. A transmission experiment with distributed in-line PDL was carried out as well, validating the results obtained with a simplified channel model. PT coding does not induce extra penalties when non-linear effects are considered, and PT codes can both improve the mean and reduce the variance of the Q-factor distributions.