BER estimation of QPSK homodyne detection with carrier phase estimation using digital signal processing

Abstract: An approximate analytical expression for the bit error rate (BER) of a QPSK homodyne receiver employing digital signal processing for carrier recovery is derived. The BER estimated using the analytical expression is in excellent agreement with Monte-Carlo simulations. The analytical approximation leads to an intuitive understanding of the tradeoffs in such systems and allows optimization of system parameters without resorting to Monte-Carlo simulations.


Introduction
Coherent detection (CD) continues to be studied because of its potential advantages over direct detection (DD) [1]. CD generally results in higher sensitivity in optical communication links and in better channel selectivity in wavelength-division multiplexed optical networks. In CD, the best sensitivity is achieved when homodyne detection is used. However, in this case both the transmitter and local oscillator (LO) lasers need to have narrow linewidths (LWs) and be phase-locked. These requirements make a homodyne receiver difficult to implement. To circumvent this problem, several receiver schemes employing high-speed digital signal processing (DSP) have been suggested [2,3,4]. These schemes maintain the advantages of homodyne detection without phase locking, using digital feedforward carrier recovery instead.
The scheme in [3] demonstrates an intuitive approach to feedforward carrier estimation for optical QPSK using DSP. Because of its simple implementation, this scheme can potentially be employed in the near future. However, to the best of our knowledge, an analytical derivation of the bit error rate (BER) for this scheme has not been provided in the literature.
Such a derivation is valuable for understanding the effect that each parameter has on the system performance, and it enables a comparison between various receiver types without resorting to time-consuming Monte-Carlo (MC) simulations.
In this paper, we set out to find an estimate of the BER for a QPSK feedforward carrier recovery scheme using DSP. The paper is organized as follows: Section 2 presents in detail the feedforward carrier and data recovery scheme employing DSP. The derivation of the phase estimation error associated with this scheme is provided in Section 3. Section 4 presents comparisons between MC simulations and the approximate analytical expression obtained for the distribution of the phase estimation error and the associated BER. Conclusions are presented in Section 5.

Feedforward carrier recovery using DSP
We begin by presenting the carrier and data recovery process for a DSP-based CD receiver. CD of the incoming optical signal is achieved using a phase diversity receiver followed by a pair of balanced detectors, one for each quadrature [1]. The LO in this scheme is not phase-locked to the carrier, greatly reducing the complexity of the CD process. Analog-to-digital conversion (ADC) of the two quadratures is performed at the symbol rate (e.g., 10 Gsample/s per quadrature for a 20 Gb/s QPSK signal). Each pair of samples (the in-phase and quadrature samples) is combined into a single complex sample.
The digital feedforward carrier recovery scheme, as suggested in [3], is shown in Fig. 1. The incoming sample stream is divided into blocks of $N_b$ samples each. Each sample block is used as input to the next available processing unit (PU), until the last PU is reached. At that point the first PU should be available to receive the next sample block. The scheme can be generalized to any number of PUs, provided that enough PUs are present to give each PU the time it needs to complete its operation before a new sample block is fed to it. Each PU also requires input from the next PU, which is necessary for correct data decoding and phase tracking, as will be discussed later.
Various other impairments (e.g., amplified spontaneous emission from optical amplifiers, quantization noise inherent in the ADC process and the effect of laser relative intensity noise observed with imperfect balanced detection) are not considered in this work.
The factor of $1/4$ in the phase estimation operation introduces a 4-fold phase ambiguity, which may be eliminated by differential precoding of the data quadrant $d_k$ [2,3,5]. Another source of error to consider is the ubiquitous shot noise, which distorts the phase estimate for each sample. To mitigate this effect, an equal-tap-weight transversal filter is employed [2].
In this case the carrier phase estimate, $\phi_{est}^{PU_1}$, is computed once per block. It is crucial to note that although the filtering process reduces the effect of shot noise, it inherently introduces an error in the phase estimate, since $\phi_{est}^{PU_1}$ is used as a common phase estimate for all the samples in the block. There is an evident tradeoff here: a longer filter (larger $N_b$) reduces the shot noise more, but the phase estimate is then common to more samples, which reduces its accuracy for each individual sample. Conversely, a shorter filter (ideally $N_b = 1$, i.e., no filtering at all) tracks the phase noise better in the absence of shot noise, but performs poorly in its presence. An optimal block size can be determined, as will be shown later.
In order to decode the differentially encoded data, the phase threshold operator $T$ extracts the $i$-th symbol quadrant, using an operator that eliminates the fractional part of its argument $X$. Note that this detection scheme is not equivalent to the worse-performing differential detection scheme [5]. Synchronous CD is still employed here; the data are decoded by comparing consecutive quadrant numbers, not by pair-wise comparison of sample phases. The operator $C$ then decodes the data from the quadrant numbers of consecutive samples. Decoding the last symbol in each PU is more involved, because the differential encoding employed dictates that only $N_b - 1$ decoded symbols can be extracted from $N_b$ samples. To decode the last symbol, the quadrant number of the first sample in the next PU is required.
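The block-wise estimation and differential decoding described above can be sketched in a few lines. This is a minimal illustration rather than the implementation of [3]: it assumes the common fourth-power estimator $\frac{1}{4}\arg\sum_k r_k^4$, a constellation at phases $d\pi/2$, and it decodes the last symbol of a block by derotating the first sample of the next block with the current block's estimate. All function names are hypothetical.

```python
import numpy as np

def estimate_block_phase(block):
    """Common phase estimate for one block (assumed form): (1/4) * arg(sum(r_k ** 4))."""
    return 0.25 * np.angle(np.sum(block ** 4))

def quadrant(samples):
    """Quadrant number 0..3 of each sample, with decision thresholds at odd multiples of pi/4."""
    return np.mod(np.floor(np.angle(samples) / (np.pi / 2) + 0.5), 4).astype(int)

def decode_differential(samples, nb):
    """Decode quadrant differences block by block.

    Each block of nb samples is extended by the first sample of the next
    block, so that nb symbols (not nb - 1) are decoded per block. All
    nb + 1 samples share one phase estimate, so the 4-fold ambiguity of
    the estimate cancels in the quadrant differences.
    """
    decoded = []
    for start in range(0, len(samples) - nb, nb):
        block = samples[start:start + nb]
        phi_est = estimate_block_phase(block)
        extended = samples[start:start + nb + 1] * np.exp(-1j * phi_est)
        decoded.append(np.mod(np.diff(quadrant(extended)), 4))
    return np.concatenate(decoded)
```

In this sketch a transmitter would precode the data quadrants `d` as `np.cumsum(d) % 4` before mapping them to phases, so that the decoded differences return `d` regardless of a constant carrier phase offset.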
The BER of a differentially encoded QPSK signal with a noisy phase reference can be derived in a manner similar to that described in [6, Eq. (12)], although the phase estimation errors on two consecutive samples, $(\varepsilon_1, \varepsilon_2)$, are not assumed to be identical here. In the resulting expression, Eq. (2), $\gamma_b = E_b/N_0$ is the electrical signal-to-noise ratio (SNR) per bit, $E_b$ is the energy per bit, $N_0$ is the single-sided power spectral density of the shot noise, and $P_{\Delta\phi}$ is the probability density function (PDF) of the random variable (RV) $\Delta\phi$, the phase estimation error.
Note that the simplification taken in Eq. (2) is made under the assumption of $\gamma_b \gg 1$ and a small phase estimation error. These conditions are easily met in the range of parameters (SNR > 6 dB and beat LW < 2 MHz) considered in this paper. Eq. (2) differs by a factor of approximately 2 from the BER expression for the Gray-coded case (i.e., no differential encoding) [6, Eq. (14)]. This factor originates from the differential encoding, where, to a first-order approximation, any symbol error is manifested twice through differential decoding. From Eq. (2) it is seen that in order to evaluate $P_e$ it is necessary to obtain $P_{\Delta\phi}$, the distribution of $\Delta\phi$.
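To illustrate how a BER of this kind can be evaluated numerically, the sketch below averages a conditional BER over a zero-mean Gaussian phase error PDF. The conditional BER used here is an assumption: it is the standard expression for Gray-coded QPSK with a static phase offset $\varepsilon$, doubled to account for differential decoding, and it may differ in detail from Eq. (2). The function names and the integration grid are likewise illustrative.

```python
import math
import numpy as np

def conditional_ber(eps, gamma_b):
    """Assumed BER for a fixed phase error eps (radians), differential encoding.

    Twice the Gray-coded QPSK BER with a static phase offset:
    0.5 * [erfc(sqrt(g) (cos e + sin e)) + erfc(sqrt(g) (cos e - sin e))].
    """
    a = math.sqrt(gamma_b)
    return 0.5 * (math.erfc(a * (math.cos(eps) + math.sin(eps)))
                  + math.erfc(a * (math.cos(eps) - math.sin(eps))))

def average_ber(snr_db, sigma, n_points=4001):
    """Average the conditional BER over a zero-mean Gaussian PDF with std sigma."""
    gamma_b = 10 ** (snr_db / 10)
    eps = np.linspace(-8 * sigma, 8 * sigma, n_points)
    pdf = np.exp(-eps ** 2 / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))
    integrand = pdf * np.array([conditional_ber(e, gamma_b) for e in eps])
    # Trapezoidal rule written out explicitly.
    return float(np.sum(0.5 * (integrand[1:] + integrand[:-1]) * np.diff(eps)))
```

As the standard deviation `sigma` tends to zero, the result tends to $\mathrm{erfc}(\sqrt{\gamma_b})$, i.e., twice the Gray-coded QPSK BER, consistent with the factor of approximately 2 noted above.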

Distribution of the phase estimation error $\Delta\phi$
The phase estimation error associated with the scheme presented in Fig. 2 is defined by Eq. (3). To see how $\Delta\phi$ is distributed, we start by considering the $(\cdot)^4$ operation, whose expansion is given in Eq. (4). For high SNR, all terms containing the shot noise $n$ to third order and higher can be neglected, since they are small compared with the second-order terms. Subsequent simulation results and analytical considerations will confirm the validity of this assumption for high SNR values. Substituting Eq. (4) into Eq. (3), while making this approximation, yields Eq. (5).
We consider first the phase estimation error in the absence of shot noise. Although shot noise is not considered at first, recall that the filtering operation necessary for optimal phase tracking in the presence of shot noise itself introduces an error in the phase estimate. We set out to investigate this error before introducing the shot noise. The phase estimation error in this case is given by Eq. (6). Eq. (6) may be simplified by noting that the laser phase noise is a Wiener process [7], characterized by a zero-mean white Gaussian frequency noise whose variance is determined by $2\Delta\upsilon$ and $B_r$, the beat LW of the transmitter and LO lasers and the symbol rate, respectively. The frequency noise is independent of the data modulation and the shot noise. The instantaneous phase $\phi_k$ may then be written as an accumulated sum of the frequency noise samples. Consequently, $\Delta\phi_k$ is a linear combination of independent, identically distributed (iid) Gaussian RVs, which may be written conveniently in matrix notation. The variance of $\Delta\phi$ is the sum of the variances of the iid RVs.
We proceed to incorporate the shot noise contribution to the distribution of $\Delta\phi$, starting from a rewritten form of Eq. (5). Since the phase of a complex Gaussian white noise is uniformly distributed, any other arbitrarily distributed angle can be lumped into the phase of the shot noise without affecting its statistical attributes.
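The shot-noise-free tracking error can be explored numerically with a short simulation. The sketch below is only an illustration under two assumptions that are not taken from the derivation above: the per-symbol phase increment variance is set to the standard Wiener relation $\sigma_\delta^2 = 2\pi \cdot (\text{beat LW}) / B_r$, and the block phase estimate is linearized as the mean of the phases in the block. Function names are hypothetical.

```python
import numpy as np

def wiener_phase(n_samples, beat_lw_hz, symbol_rate_hz, rng):
    """Random-walk phase noise; returns the phase samples and the increment variance.

    Assumes sigma_delta^2 = 2 * pi * beat_lw / symbol_rate per symbol.
    """
    sigma_delta2 = 2 * np.pi * beat_lw_hz / symbol_rate_hz
    increments = rng.normal(0.0, np.sqrt(sigma_delta2), n_samples)
    return np.cumsum(increments), sigma_delta2

def tracking_error_variance(phase, nb):
    """Mean squared deviation of each sample phase from its block mean.

    The block mean stands in for the common phase estimate (small-angle,
    shot-noise-free approximation of the fourth-power estimator).
    """
    n_blocks = len(phase) // nb
    blocks = phase[:n_blocks * nb].reshape(n_blocks, nb)
    errors = blocks - blocks.mean(axis=1, keepdims=True)
    return float(np.mean(errors ** 2))
```

For this linearized model the expected value works out to $\sigma_\delta^2 (N_b^2 - 1) / (6 N_b)$, which grows roughly linearly with the block size and vanishes for $N_b = 1$, in line with the tradeoff discussed in Section 2.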
As the shot noise and phase noise are independent, the shot noise contribution to the variance of $\Delta\phi$ is additive. To determine this contribution, the distribution of $\rho_w$ is to be established. Let $\rho$ represent an angle with an arbitrary PDF, independent of the angle of $n$ (note that $\rho_w$ is a random sample of $\rho$). The resulting distribution of $\Delta\phi$, Eq. (10), is approximated by invoking the central limit theorem (CLT) as a zero-mean Gaussian PDF, Eq. (11), whose variance depends on the SNR, the block size $N_b$ and $\sigma_\delta^2$, where $\sigma_\delta^2$ is associated with the beat LW.
Special care should be taken when invoking the CLT, since at high SNR levels the block size $N_b$, which determines the number of summands, is reduced. The presence of heavy tails might render the CLT approximation invalid beyond first order. However, as the SNR increases, even though $N_b$ becomes smaller, the significance of the second-order shot noise diminishes and the distribution of $\Delta\phi$ approaches a Gaussian anyway. To verify the validity of this approach, a series of $5.5\cdot10^{11}$ samples following the distribution of the RV $\Delta\phi$ as defined in Eq. (10) was generated using several computers. The beat LW, SNR and block size used were 2 MHz, 13.5 dB and 8, respectively. The PDF of the obtained series (generated PDF) was compared to a Gaussian PDF defined by Eq. (11), using the same SNR, LW and block size. Figure 3 presents the two PDFs and the associated BERs as these are accumulated under the integral in Eq. (2), as a function of the integration variable. As seen in Fig. 3, the tails of the generated PDF are somewhat wider than those of the Gaussian PDF. However, the respective BER curves show that this widening of the tails does not significantly affect the final BER; the difference in BER between the two cases is small (approximately 5%). Note that the series of generated samples must be long enough to allow for enough events in the tails. It is observed in Fig. 3 that the series used is indeed long enough, since the BER curve for the generated PDF levels off at roughly $\Delta\phi = 0.325$, where the generated PDF still has enough samples to validate this test. Similar results are obtained for a beat LW of 600 kHz, an SNR of 13 dB and $N_b$ of 15, parameters which achieve approximately the same BER. The larger block size used at the lower SNR improves the accuracy of the Gaussian approximation, since the number of summed terms is increased. The above explanation does not imply that the actual phase estimation error variance is better approximated at a lower SNR, but simply justifies the use of a Gaussian approximation of the phase estimation error PDF in the SNR range under consideration.
The variance of the approximated PDF can be minimized with respect to the block size, which gives the optimal $N_b$ at a given SNR and LW, Eq. (12). For the plots in Fig. 4, the symbol rate used was 10 GS/s. Superimposed on these plots is the optimal $N_b$ at each SNR. Note that the log function was used to obtain better contrast in Fig. 4; a lower value is preferable. As seen in Fig. 4, when the SNR is increased, $N_b$ may be reduced, as a wider-bandwidth (i.e., shorter) filter becomes sufficient. The result obtained in Eq. (12) is important, since it allows an accurate determination of the block size when such a scheme is to be implemented, without the need to perform lengthy simulations.
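The tradeoff behind the optimal block size can be illustrated with a simplified variance model. The sketch below is not Eq. (11) or Eq. (12): it keeps only the first-order shot noise contribution of a fourth-power block estimator, $1/(4\gamma_b N_b)$, neglects the second-order shot noise terms retained in the analysis above, and adds the linearized Wiener tracking term $\sigma_\delta^2 (N_b^2 - 1)/(6 N_b)$ with $\sigma_\delta^2 = 2\pi \cdot (\text{beat LW})/B_r$. These expressions and the function names are assumptions made for illustration.

```python
import numpy as np

def phase_error_variance(nb, snr_db, beat_lw_hz, symbol_rate_hz=10e9):
    """First-order model of the phase estimation error variance (assumed form)."""
    gamma_b = 10 ** (snr_db / 10)
    sigma_delta2 = 2 * np.pi * beat_lw_hz / symbol_rate_hz
    shot = 1.0 / (4.0 * gamma_b * nb)                      # falls as the filter lengthens
    tracking = sigma_delta2 * (nb ** 2 - 1) / (6.0 * nb)   # grows as the filter lengthens
    return shot + tracking

def optimal_block_size(snr_db, beat_lw_hz, max_nb=64):
    """Integer block size that minimizes the modeled variance, found by direct search."""
    sizes = np.arange(1, max_nb + 1)
    variances = phase_error_variance(sizes, snr_db, beat_lw_hz)
    return int(sizes[np.argmin(variances)])
```

A direct search over integer block sizes is used because the candidate range is small. Under this model the optimum shrinks as the SNR rises and grows as the beat LW falls, which is the qualitative behavior described for Fig. 4.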

Comparison with Monte-Carlo simulation
MC simulations have been performed to verify the validity of the results obtained in Section 3. A comparison between the MC simulation and the approximate analytical expression for the variance of the phase estimation error is presented in Fig. 5. In this case the MC simulations are a strict implementation of the feedforward carrier recovery scheme, without any assumptions made on the distribution of the phase estimation error. The MC simulation and analytical approximation show excellent agreement, supporting the approximations made in order to arrive at the analytical expression given in Eq. (11). As expected, the analytical expression is more accurate for smaller LW, as seen in Fig. 5.
The BER curves obtained from MC simulations were also compared to the approximate BER calculated using Eq. (2) with $\Delta\phi$ distributed as in Eq. (11). The optimal block size for each SNR and LW considered (given in Eq. (12)) was used. The BER curves are presented in Fig. 6. The MC simulations were performed using a series of $10^6$ samples. Only simulation results with at least 100 errors are included in the BER comparison. Excellent agreement can be observed between the MC simulation and the analytical approximation. The limit curve shown in Fig. 6 is the numerical evaluation of Eq. (2) taking $P_{\Delta\phi}(\varepsilon) = \delta(\varepsilon)$. Since the approximations used in the analytical derivation become more accurate with higher SNR, the analytical approximation for the BER and its MC simulation are expected to agree even better at high SNR values. As seen in Fig. 6, the use of DSP incurs a small power penalty (e.g., approximately 0.3 dB at a beat LW of 600 kHz for a BER of $10^{-9}$) compared to the ideal curve (i.e., no phase estimation error), while avoiding the need to employ carrier phase locking. This reduces the complexity of a CD receiver dramatically. The DSP-based scheme is also observed to significantly outperform differential QPSK detection.
However, it should be noted that the model considered in this analysis does not take into account the various other noise sources mentioned earlier, and may therefore serve as a preliminary, if somewhat optimistic, estimate of the DSP-based system performance.
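For readers who wish to reproduce the flavor of such a comparison, a compact Monte-Carlo sketch is given below. It is a simplified stand-in, not the simulation used for Fig. 6: it assumes the fourth-power block estimator $\frac{1}{4}\arg\sum_k r_k^4$, a unit-amplitude constellation at phases $d\pi/2$, complex Gaussian shot noise with $N_0 = 1/(2\gamma_b)$, Wiener phase noise with $\sigma_\delta^2 = 2\pi \cdot (\text{beat LW})/B_r$, and a Gray mapping of the decoded quadrant differences. All names and defaults are illustrative.

```python
import numpy as np

GRAY = np.array([0, 1, 3, 2])  # quadrant difference -> 2-bit Gray label (assumed mapping)

def simulate_ber(snr_db, beat_lw_hz, nb, n_blocks, symbol_rate_hz=10e9, seed=0):
    """Monte-Carlo BER of block-wise feedforward carrier recovery (simplified sketch)."""
    rng = np.random.default_rng(seed)
    n = n_blocks * nb + 1                       # one extra sample closes the last block
    gamma_b = 10 ** (snr_db / 10)
    data = rng.integers(0, 4, n)                # data quadrants d_k
    q_tx = np.cumsum(data) % 4                  # differential precoding
    sigma_delta = np.sqrt(2 * np.pi * beat_lw_hz / symbol_rate_hz)
    phase = np.cumsum(rng.normal(0.0, sigma_delta, n))
    n0 = 1.0 / (2.0 * gamma_b)                  # unit symbol energy, E_b = 1/2
    noise = np.sqrt(n0 / 2) * (rng.normal(size=n) + 1j * rng.normal(size=n))
    r = np.exp(1j * (q_tx * np.pi / 2 + phase)) + noise

    blocks = r[:-1].reshape(n_blocks, nb)
    phi_est = 0.25 * np.angle(np.sum(blocks ** 4, axis=1))
    # Append the first sample of the next block to each block and derotate
    # all nb + 1 samples with the current block's phase estimate.
    nxt = r[nb::nb]
    ext = np.hstack([blocks, nxt[:, None]]) * np.exp(-1j * phi_est)[:, None]
    q_rx = np.mod(np.floor(np.angle(ext) / (np.pi / 2) + 0.5), 4).astype(int)
    d_rx = np.mod(np.diff(q_rx, axis=1), 4).ravel()
    d_tx = data[1:]

    diff_bits = GRAY[d_rx] ^ GRAY[d_tx]
    n_err = np.sum((diff_bits & 1) + (diff_bits >> 1))
    return float(n_err / (2 * len(d_tx)))
```

Reaching BER levels near $10^{-9}$ this way would require impractically long runs, which is precisely the motivation for the analytical approximation. The sketch is therefore only useful for checking trends at moderate SNR.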

Conclusion
In this paper, an estimate of the BER for the QPSK feedforward carrier recovery scheme using DSP suggested in [3] was obtained analytically through a series of approximations.
The DSP phase estimation scheme was presented in detail. A 4-fold phase ambiguity associated with this detection scheme was resolved by using differential encoding. It was also determined that shot noise filtering is needed to reduce the effect of shot noise on the phase tracking performance. However, the filtering process itself introduces an error in phase noise tracking. A tradeoff between these two factors must be addressed; the variable controlling this tradeoff is the PU block size, which determines both the width of the shot noise filter and the number of samples that share the same phase estimate. Through a series of approximations it was shown that the phase estimation error can be modeled as a zero-mean Gaussian RV. The phase estimation error variance was shown to be associated with the beat LW, electrical SNR and block size. Extensive simulation results show that the Gaussian approximation of the phase estimation error is viable.
To optimize the system performance (i.e., balance between shot noise filtering and phase noise tracking), the variance of the approximated PDF for the phase estimation error was minimized with respect to the block size, thus obtaining an optimal block size at a given SNR and LW. The values obtained from MC simulations and from the analytical expression for the variance of the phase estimation error are in excellent agreement.
The analytical approximation allows prediction of the system performance (i.e., BER) for varying parameters. It was observed that the DSP receiver scheme introduces a small power penalty at a BER level of $10^{-9}$, compared to the ideal (no phase estimation error) case. The need to phase-lock the LO to the carrier is eliminated, dramatically reducing the complexity of CD reception. Using the results obtained in this paper, an intuitive understanding of the design tradeoffs is obtained and optimization may be carried out without resorting to time- and resource-consuming Monte-Carlo simulations.