Upper and lower bounds to the information rate transferred through the Pol-Mux channel

Abstract: Pol-Mux transmission is a well-established technique that enhances spectral efficiency by transmitting simultaneously over the horizontal and vertical polarizations of the electric field. However, cross-coupling of the two polarizations impairs transmission. Under the assumption that the cross-coupling matrix is a Markov process with free-running state, we propose upper and lower bounds to the information rate that can be transferred through the channel. Simulation results show that the two bounds are tight for values of the cross-coupling power of practical interest and for modulation formats up to 16-QAM (quadrature amplitude modulation).


Introduction
Simultaneous transmission of modulated signals over the horizontal and vertical polarizations of the electric field is a well-established technique [1][2][3] that improves spectral efficiency by using the same frequency twice. In essence, this technique relies upon the principle of MIMO (Multiple Input Multiple Output) systems, which became popular after the seminal paper [4]. To cancel the interference arising from non-ideal orthogonality between the horizontal and vertical polarizations, linear processing can be adopted [5], although it is well known that non-linear techniques achieve better performance in the presence of interference and additive noise, see e.g. [6,7].
Either implicitly or explicitly, most of the receivers studied in the literature assume that the MIMO channel matrix is static or quasi-static. However, the experimental results of [6] show that the coherence time of the channel is quite small, on the order of 10 to 30 symbol intervals for 112 Gb/s dual-polarization QPSK (Quadrature Phase Shift Keying). Hence tracking the channel becomes an issue. Tracking techniques can be based on pilot symbols, as proposed, for instance, in [8]. Independently of the tracking method, however, a short coherence time of the channel matrix, that is, a fast time-varying channel, makes the channel estimate noisy, because in practice only a short time window spanning a few signal samples can be used for channel estimation at a given time instant. This in turn limits the information rate that can be transmitted through the channel. This observation motivates the study of the information rate transferred through the Pol-Mux channel. The capacity of the fading MIMO channel is a classical topic in information theory, see e.g. [9], and in that context the information rate of channels with free-running state has also been studied [10]. In the context of optical transmission, the information rate is well studied for the phase noise channel, at least for the channel model with free-running state, see e.g. [11][12][13], but less has been done for the Pol-Mux channel, which can be seen as a variant of the phase noise channel where
• the modulus is not constant
• the channel is MIMO.
Therefore, starting from the lower bound for the phase noise channel of [13], we adapt it here to the Pol-Mux channel and introduce a new upper bound based on the Kalman filter.

Channel model
Let lowercase characters indicate possibly complex scalars and column vectors, and let uppercase characters indicate matrices. The notation $a_k^{k+i}$ is used to indicate a column vector (or matrix, when the elements are vectors) made by the chunk of sequence $(a_k, a_{k+1}, \cdots, a_{k+i})^T$, while $\{a_k\}$ is used to indicate the semi-infinite sequence $(a_0, a_1, \cdots)$. The notation $I_m$ is used to indicate the $m \times m$ identity matrix, and the superscript $H$ denotes Hermitian transposition. The output of the Pol-Mux channel at time $k$ is $y_k = M_k x_k + w_k$, where $x_k$ is the $k$-th sample of the i.i.d. input modulation complex vector data sequence, with zero mean vector and covariance matrix […], $M_k$ is the channel matrix, and $w_k$ is the $k$-th element of the i.i.d. complex Gaussian vector noise sequence with zero mean vector and covariance matrix […]. For small to moderate polarization crosstalk, the matrix $M_k$ can be modelled, following [6], as […], where $\lambda_k$ is the $k$-th element of a complex Gaussian random vector sequence, which is hereafter modelled as a free-running 1-causal ARMA (Autoregressive Moving Average) process, hence […], where $v_k$ is the $k$-th sample of a white Gaussian random vector sequence with zero mean and covariance matrix […]. In other words, $\{\lambda_k\}$ is the filtered version of $\{v_k\}$, where the filter is made of two shift registers, one for $\{v_{1,k}\}$ and the other for $\{v_{2,k}\}$, each with $m$ memories, and with 1-causal feedback taps […]. Using the z-transform one writes […], where […]. To cast the model in the framework of linear dynamic systems we need to define the state of the system. To this aim, let us define the vector sequence […], hence $\omega_{k-m}^{k-1}$ is the content of the two shift registers at the $k$-th channel use.
Note that $\lambda_k$ depends only on […]. Therefore one can take […] as the state of the linear dynamic system at time $k$, thus writing the measurement equation and the state transition equation as […] with […], where $0_1^m$ is a column vector of $m$ zeros, and the $2(m+1) \times 2(m+1)$ state transition matrix is […], where […] and $O_m$ is the all-zero square matrix of size $m \times m$. The state transition probability is […], where $g_c(\mu, \Sigma_m; x)$ indicates an $m$-dimensional complex Gaussian probability density function over the complex vector space spanned by $x$, with mean vector $\mu$ and covariance matrix $\Sigma_m$, and $Q$ is the covariance matrix of the process noise, with […]. The joint source and channel output probability, given the hidden state, is […], where […]. The conditional probability of the channel output given the hidden state is […].
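As an illustration of the density $g_c(\mu, \Sigma_m; x)$ used above, the following minimal sketch evaluates it for $m = 2$, the dimension of the Pol-Mux channel output. It assumes the standard circularly symmetric complex Gaussian form $\exp(-(x-\mu)^H \Sigma^{-1} (x-\mu)) / (\pi^m \det \Sigma)$; the function name `complex_gaussian_pdf_2d` is hypothetical and does not come from the paper.

```python
import math


def complex_gaussian_pdf_2d(mu, sigma, x):
    """Density of a circularly symmetric complex Gaussian 2-vector (assumed form).

    mu, x : length-2 sequences of (possibly complex) numbers.
    sigma : 2x2 Hermitian positive-definite covariance, as nested lists.
    """
    (a, b), (c, d) = sigma
    det = (a * d - b * c).real  # the determinant of a Hermitian matrix is real
    if det <= 0:
        raise ValueError("covariance must be positive definite")
    e0, e1 = x[0] - mu[0], x[1] - mu[1]
    # Sigma^{-1} e, using the closed-form inverse of a 2x2 matrix.
    q0 = (d * e0 - b * e1) / det
    q1 = (-c * e0 + a * e1) / det
    # Quadratic form e^H Sigma^{-1} e (real for Hermitian sigma).
    quad = (e0.conjugate() * q0 + e1.conjugate() * q1).real
    return math.exp(-quad) / (math.pi ** 2 * det)


if __name__ == "__main__":
    # White noise with per-component variance 0.5, evaluated at the mean.
    print(complex_gaussian_pdf_2d([0, 0], [[0.5, 0], [0, 0.5]], [0, 0]))
```

The closed-form 2 × 2 inverse keeps the sketch free of external dependencies; a general $m$ would require a linear solver.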

Upper and lower bounds to the information rate by the Kalman filter
Let […]. For the conditional entropy, by the chain rule one writes […], which, by the Shannon-McMillan-Breiman theorem, can be evaluated as […]. Since conditioning does not increase entropy, we have the following upper and lower bounds to the conditional entropy […], which one can use in a straightforward way in the right side of (22) together with (23) to get lower and upper bounds to the information rate. Let us consider the upper bound (26). The probabilities inside the logarithm can be evaluated by the Kalman filter as follows. The knowledge of past transmitted symbols that appear in the conditioning is imported into the Kalman filter by including all the conditions in the measurement, hence by updating the Kalman filter in data-aided mode. Let us write the channel output as […]. The predicted measurement at time $k$ is $\hat{y}_k$ […], where $\hat{s}_k$ denotes the state predicted by the Kalman filter at time $k$, that is, the expectation of the hidden state given past measurements, $\hat{s}_k$ […]. As innovations process we take […]. Starting from an initial pair $(\Sigma_1, \hat{s}_1)$, where $\hat{s}_1$ […], for $k = 1, 2, \cdots$, the state prediction vector and the prediction error covariance matrix evolve as $\hat{s}_{k+1}$ […], where […]. The desired probability is evaluated as […], where, using the predicted state and the prediction error covariance matrix computed by the Kalman filter, one has […]. Similarly, for the lower bound to the conditional entropy, one has […] with […], where $\hat{s}_{fb,k}$ and $\Sigma_{fb,k}$ are the estimates produced by combining a forward and a backward Kalman filter as $\hat{s}_{fb,k}$ […].
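The way a data-aided Kalman filter yields the probabilities inside the logarithm can be illustrated on a toy model. The sketch below is not the Pol-Mux filter of this paper: it assumes a real scalar state $s_k = a s_{k-1} + v_k$ observed as $y_k = x_k s_k + w_k$, with known symbols $x_k$ playing the role of the data-aided measurement matrix. All names (`kalman_neg_log_likelihood`, `simulate`, `a`, `q`, `r`) are hypothetical. The time average of $-\log p(y_k \mid y_1^{k-1}, x_1^k)$ is the kind of quantity that the Shannon-McMillan-Breiman theorem relates to an entropy rate.

```python
import math
import random


def kalman_neg_log_likelihood(y, x, a, q, r, s0=0.0, p0=1.0):
    """Average of -log p(y_k | y_1..y_{k-1}, x_1..x_k), in nats, for the toy model
    s_k = a*s_{k-1} + v_k,  y_k = x_k*s_k + w_k  (real, Gaussian, var q and r)."""
    s_pred, p_pred = s0, p0  # predicted state and prediction error variance
    total = 0.0
    for yk, xk in zip(y, x):
        innov = yk - xk * s_pred          # innovation
        s_var = xk * xk * p_pred + r      # innovation variance
        total += 0.5 * (math.log(2 * math.pi * s_var) + innov * innov / s_var)
        gain = p_pred * xk / s_var        # Kalman gain
        s_filt = s_pred + gain * innov    # measurement update (data-aided: xk known)
        p_filt = (1.0 - gain * xk) * p_pred
        s_pred = a * s_filt               # time update
        p_pred = a * a * p_filt + q
    return total / len(y)


def simulate(n, a, q, r, rng):
    """Draw n BPSK symbols and the corresponding toy-channel outputs."""
    s = rng.gauss(0.0, math.sqrt(q / (1 - a * a)))  # stationary initial state
    x, y = [], []
    for _ in range(n):
        xk = rng.choice((-1.0, 1.0))
        x.append(xk)
        y.append(xk * s + rng.gauss(0.0, math.sqrt(r)))
        s = a * s + rng.gauss(0.0, math.sqrt(q))
    return x, y


if __name__ == "__main__":
    rng = random.Random(0)
    a, q, r = 0.9, 0.19, 0.1
    x, y = simulate(20000, a, q, r, rng)
    print(kalman_neg_log_likelihood(y, x, a, q, r, p0=q / (1 - a * a)))
```

In this toy setting the innovations are white and Gaussian, so the estimate approaches $\frac{1}{2}\log(2\pi e (P_\infty + r))$, where $P_\infty$ is the steady-state prediction error variance.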

Simulation results
Consideration of realistic spectra of the cross-pol coefficients is outside the scope of the present paper and is left to future studies. For practical methods to estimate the strength of cross-pol interference, the reader is referred to [6], where the strength of interference is given by the autocorrelation of interference at time zero. In the following we express the strength of interference by the SIR (Signal-to-Interference Ratio), which is the inverse of the interference autocorrelation at time zero. To derive simulation results, we set $\rho = 0$ and, for each of the two random coefficients appearing in the Pol-Mux matrix, we take the first-order ARMA model […], where $-1 < z_p < 1$ is the pole of the first-order ARMA model. The filtered sequence has zero mean, unit power spectral density at frequency zero and power […], hence the SIR is […]. In the common case where $z_p$ is close to 1, the filtered sequence is a first-order low-pass random sequence with −3 dB normalized bandwidth […] constellation sizes, achievable with the pure AWGN (Additive White Gaussian Noise) channel: 4 bits for 2 × 4-QAM, 8 bits for 2 × 16-QAM and 12 bits for 2 × 64-QAM. Figure 2 gives the same upper and lower bounds obtained with $z_p = 0.887$, that is, SIR = 12.2 dB. In practice this appears to be a strong-interference condition, since the minimum SIR reported in the experimental results of [6] is around 14 dB. In this case, the information rate with 64-QAM at high SNR remains well below the information rate achieved with the AWGN channel, confirming that the Pol-Mux interference becomes the limiting factor of the information rate transferred through the channel. We also note that the spread between the upper and lower bounds becomes large with 64-QAM at high SNR, where the capability of tracking the MIMO channel becomes crucial.
In fact, the lower bound forgoes the blind part of tracking, thus giving up some tracking capability, while the upper bound upgrades blind tracking to data-aided tracking, thus enhancing tracking capability beyond what can actually be achieved.
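The relation between the pole and the SIR can be checked numerically. The sketch below assumes that the first-order filter has transfer function $(1 - z_p) z^{-1} / (1 - z_p z^{-1})$ driven by unit-variance white complex Gaussian noise. This form is an assumption: it is consistent with the unit power spectral density at frequency zero stated above and reproduces the quoted pair $z_p = 0.887$, SIR = 12.2 dB, but it is not taken verbatim from the paper. Under this assumption the interference power is $(1 - z_p)/(1 + z_p)$. The function names are hypothetical.

```python
import math
import random


def crosstalk_power(z_p):
    """Stationary output power of the assumed filter (1 - z_p) z^-1 / (1 - z_p z^-1)
    driven by unit-variance white noise: (1 - z_p)^2 / (1 - z_p^2)."""
    return (1.0 - z_p) / (1.0 + z_p)


def sir_db(z_p):
    """SIR in dB, taken as the inverse of the interference power."""
    return -10.0 * math.log10(crosstalk_power(z_p))


def simulate_crosstalk_power(z_p, n, rng):
    """Empirical power of the filtered sequence over n samples.

    The one-sample delay of the 1-causal filter is omitted here,
    since a pure delay does not change the power."""
    lam = 0j
    acc = 0.0
    sigma = math.sqrt(0.5)  # per-component std of unit-variance complex noise
    for _ in range(n):
        v = complex(rng.gauss(0.0, sigma), rng.gauss(0.0, sigma))
        lam = z_p * lam + (1.0 - z_p) * v
        acc += abs(lam) ** 2
    return acc / n


if __name__ == "__main__":
    print(sir_db(0.887))
    print(simulate_crosstalk_power(0.887, 200000, random.Random(1)))
```

For poles close to 1 the sequence is strongly correlated, so the empirical power converges slowly and a long run is needed to match the closed-form value.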

Conclusions
We have proposed upper and lower bounds to the information rate of the Pol-Mux channel and shown simulation results for a specific channel model. The results show that, with moderate interference, the two bounds are so close that they virtually give the exact information rate. For strong interference and modulation formats with high spectral efficiency there is still some spread between the two, leaving room for future investigation.