Simple and efficient frequency offset tracking and carrier phase recovery algorithms in single carrier transmission systems

In this paper, we propose a low-complexity and efficient carrier recovery algorithm for single carrier transmission systems that is capable of tracking frequency offset (FO) variations. Working as an FO tracking estimator, the algorithm demonstrates good accuracy in simulation, and an FO drift of up to 200 MHz/μs can be compensated with minimal degradation in a QPSK system. In 112 Gb/s dual polarization (DP) QPSK experiments, the algorithm recovers a data sequence having >80 MHz of FO drift within 250 μs, providing better performance than a one-time estimator. In a regime that utilizes parallel processing of the data, we further demonstrate FO tracking and carrier phase recovery (CPR) using only one of the streams in a parallelized configuration, and we apply the carrier information to recover neighbouring streams directly. Consequently, the complexity of both the FO tracking and the CPR is further reduced.

©2013 Optical Society of America
OCIS codes: (060.1660) Coherent communications; (060.5060) Phase modulation.
#184962 $15.00 USD Received 6 Feb 2013; revised 16 Mar 2013; accepted 18 Mar 2013; published 28 Mar 2013 (C) 2013 OSA 8 April 2013 | Vol. 21, No. 7 | DOI:10.1364/OE.21.008157 | OPTICS EXPRESS 8157

References and links
1. L. Li, Z. Tao, S. Oda, T. Hoshida, and J. C. Rasmussen, “Wide-range, accurate and simple digital frequency offset compensator for optical coherent receivers,” in Proc. OFC’08, Paper OWT4.
2. M. Selmi, Y. Jaouen, and P. Ciblat, “Accurate digital frequency offset estimator for coherent PolMux QAM transmission systems,” in Proc. ECOC’09, Paper P3.08.
3. J. C. M. Diniz, J. C. R. F. de Oliveira, E. S. Rosa, V. B. Ribeiro, V. E. S. Parahyba, R. da Silva, E. P. da Silva, L. H. H. de Carvalho, A. F. Herbster, and A. C. Bordonalli, “Simple feed-forward wide-range frequency offset estimator for optical coherent receivers,” Opt. Express 19(26), B323–B328 (2011).
4. T. Nakagawa, K. Ishihara, T. Kobayashi, R. Kudo, M. Matsui, Y. Takatori, and M. Mizoguchi, “Wide-range and fast-tracking frequency offset estimator for optical coherent receivers,” in Proc. ECOC’10, Paper We.7.A.2.
5. T. Nakagawa, M. Matsui, T. Kobayashi, K. Ishihara, R. Kudo, M. Mizoguchi, and Y. Miyamoto, “Non-data-aided wide-range frequency offset estimator for QAM optical coherent receivers,” in Proc. OFC’11, Paper OMJ1.
6. X. Zhou, X. Chen, and K. Long, “Wide-range frequency offset estimation algorithm for optical coherent systems using training sequence,” IEEE Photon. Technol. Lett. 24(1), 82–84 (2012).
7. A. Leven, N. Kaneda, U. Koc, and Y. Chen, “Frequency estimation in intradyne reception,” IEEE Photon. Technol. Lett. 19(6), 366–368 (2007).
8. P. Gianni, G. Corral-Briones, C. E. Rodriguez, H. S. Carrer, and M. R. Hueda, “A new parallel carrier recovery architecture for intradyne coherent optical receivers in the presence of laser frequency fluctuations,” in Proc. IEEE GLOBECOM’11, pp. 1–6.
9. S.-H. Fan, J. Yu, D. Qian, and G.-K. Chang, “A fast and efficient frequency offset correction technique for coherent optical orthogonal frequency division multiplexing,” J. Lightwave Technol. 29(13), 1997–2004 (2011).
10. M. Qiu, Q. Zhuge, X. Xu, M. Chagnon, M. Morsy-Osman, and D. V. Plant, “Wide-range, low-complexity frequency offset tracking technique for single carrier transmission systems,” in Proc. OFC’13, Paper OTu3I.8.
11. E. Ip and J. M. Kahn, “Feedforward carrier recovery for coherent optical communications,” J. Lightwave Technol. 25(9), 2675–2692 (2007).
12. T. Pfau, S. Hoffmann, and R. Noé, “Hardware-efficient coherent digital receiver concept with feedforward carrier recovery for M-QAM constellations,” J. Lightwave Technol. 27(8), 989–999 (2009).
13. M. G. Taylor, “Phase estimation methods for optical coherent detection using digital signal processing,” J. Lightwave Technol. 27(7), 901–914 (2009).
14. Q. Zhuge, M. Morsy-Osman, X. Xu, M. E. Mousa-Pasandi, M. Chagnon, Z. A. El-Sahn, and D. V. Plant, “Pilot-aided carrier phase recovery for M-QAM using superscalar parallelization based PLL,” Opt. Express 20(17), 19599–19609 (2012).
15. Z. Tao, L. Li, L. Liu, W. Yan, H. Nakashima, T. Tanimura, S. Oda, T. Hoshida, and J. C. Rasmussen, “Improvements to digital carrier phase recovery algorithm for high-performance optical coherent receivers,” IEEE J. Sel. Top. Quantum Electron. 16(5), 1201–1209 (2010).
16. M. Oerder and H. Meyr, “Digital filter and square timing recovery,” IEEE Trans. Commun. 36(5), 605–612 (1988).


Introduction
Coherent detection combined with digital signal processing (DSP) is widely employed in modern optical communication systems. In coherent transmission systems with a free-running local oscillator (LO), the frequency offset (FO) between the transmit laser and the LO is an inevitable problem. Although this frequency mismatch can be compensated by controlling the laser frequency directly with an analog feedback loop, this type of design is usually complicated and ineffective. In this sense, approaches based on DSP are preferred for FO estimation because they do not need additional manipulation of the LO [1]. Several algorithms have been proposed as FO estimators in single carrier systems. In [2][3][4][5], the authors describe several spectrum-based methods. These methods detect the spectral shift of the received signal, which is proportional to the FO in the system. However, these algorithms require a Fourier transform of the signals, so their computational complexity is generally high. Other approaches utilize the phase increment between adjacent symbols [6,7]. However, to make these algorithms tolerant to a large FO, calculations are required for each symbol to perform the estimation. Given that the computational resources at the receiver are limited, further reduction of the computational complexity is desirable. In addition, the frequencies of the transmitter and the LO fluctuate in practical systems, so the FO is not constant [8,9]. As an example, the frequency variation rate of an external cavity laser (ECL) was experimentally measured at approximately 0.2 MHz/μs in [9]. Consequently, accurate FO tracking is necessary in the carrier recovery algorithm. However, most of the previous estimators only give a constant estimation for the processed sequence, so they are not suitable for FO tracking. Although this problem can be partly solved by duplicating the estimators and employing them at multiple positions of the sequence, the complexity of this implementation is high. Furthermore, most of the previous receiver algorithms perform the carrier phase recovery (CPR) after the FO compensation. Therefore, coordinating the CPR with the FO estimator and reducing the overall complexity is also important in order to achieve efficient implementation in high speed systems.
In this paper, we describe in more detail the FO tracking algorithm proposed in [10]. In addition, we propose an efficient and low-complexity parallel implementation of this algorithm together with carrier phase recovery. The outline of the paper is as follows. In Section 2, the principle of our algorithm is described. Then in Section 3, several numerical and experimental results are provided. Specifically, we first numerically show that our FO tracking algorithm gives <1 MHz estimation error when the FO drift rate is 2 MHz/μs. In addition, we demonstrate that this algorithm is able to correct FO drifts up to 200 MHz/μs with only small degradation for quadrature phase-shift keying (QPSK) signals. Next, we demonstrate experimentally, using a 112 Gb/s dual-polarization (DP) QPSK system, that the algorithm successfully detects an FO drift of more than 80 MHz within 250 μs. We also show that the FO estimations given by the proposed estimator are comparable with the results obtained using the spectrum-based methods. In the regime that utilizes parallel processing, we further numerically and experimentally demonstrate the simplified implementation of FO tracking and CPR that shares one set of carrier frequency and phase estimations among multiple parallel streams. Finally, the conclusions are summarized in Section 4.

Principle
The FO estimator in our algorithm is capable of tracking the FO between the transmit laser and the LO. The initial FO estimation gives the overall frequency mismatch of the received sequence. To achieve this we use the training-symbol-aided method described in [6]. This algorithm gives an FO estimation range of $[-R_s/2, R_s/2]$, where $R_s$ is the symbol rate. This wide estimation range is essential, especially when a large global FO exists between the transmit laser and the LO.
After initial FO estimation, tracking the FO drift is achieved using the data sequence immediately following the training symbols. During tracking, the data sequence is divided into blocks, with the block length L determining the tracking time resolution. Before the (n + 1)th block is processed for CPR, it is pre-compensated using the estimated FO of the nth block, denoted $f_{o,n}$. Then the FO increment $\Delta f_{o,n}$ is estimated for the current block and added to $f_{o,n}$ to obtain the FO estimation $f_{o,n+1}$, as shown in Fig. 1 and Eq. (1):

$$f_{o,n+1} = f_{o,n} + C \cdot \Delta f_{o,n} \qquad (1)$$

where C is the pre-determined weighting coefficient of the estimated FO increment. The theoretical value of C is 1. In both simulation and experiments, the value of C can be optimized between 1 and 2 depending on the expected frequency drift rate, and larger C values usually give better performance when the FO drifts fast. Next, the updated FO estimation $f_{o,n+1}$ is used for pre-compensating the next block, and the operations above are repeated. This process gives a good approximation if we choose the block length properly such that the FO drift inside a block is sufficiently small (e.g., several MHz).

To estimate $\Delta f_{o,n}$, we further divide the block into sub-blocks, each containing K symbols. Then we calculate the phase increment over each sub-block as in Eq. (2):

$$\Delta\varphi_i = \varphi(k_i + K) - \varphi(k_i) \qquad (2)$$

Here i is the index of the sub-block, $\varphi(k_i)$ represents the phase of the $k_i$th symbol after the removal of modulation, and $k_i$ is the index of the first symbol in the corresponding sub-block. In other words, the phase increment is calculated every K symbols within the block. The modulation is removed using a digital phase-locked loop (DPLL) in our work. However, other phase recovery algorithms such as the Viterbi-and-Viterbi method can also work with the proposed estimator. The calculated phase increments are attributed to both the FO-induced phase shift and the random phase errors related to laser phase noise and additive white Gaussian noise (AWGN) [11]. After they are averaged within a block, the random noise component is suppressed. In other words, the average phase increment is mainly attributed to the FO-induced phase shift, and thus $\Delta f_{o,n}$ can be approximated using Eq. (3):

$$\Delta f_{o,n} \approx \frac{\overline{\Delta\varphi}}{2\pi K T_{sym}} \qquad (3)$$

where $\overline{\Delta\varphi}$ is the average phase increment within the processed block and $T_{sym}$ is the symbol duration. Note that longer block lengths help to suppress the random noise, although this limits the time resolution.

To avoid ambiguity, the phase increment should be confined within the range of $[-\pi, \pi)$. The previous phase-increment-based algorithms estimate the absolute value of the FO, which may be as large as several GHz. The value of K therefore cannot be large if the phase increment is to stay below this limit. For example, the estimators in [6,7] investigate the phase increment between adjacent symbols, which corresponds to K = 1. As a result, calculations are required for each symbol in these algorithms. As an improvement, our approach performs the FO pre-compensation described above. By doing this, $\Delta f_{o,n}$, which is the target of the estimation, is the FO increment and is only several MHz if the FO changes smoothly and the FO variation inside a block is sufficiently small. So even if we calculate the phase increment between symbols with a larger spacing, the absolute value of the phase increment can still be smaller than π in a high baud rate system. Therefore, the frequency at which phase increment calculations are performed is reduced, and consequently the complexity of FO estimation is reduced in our algorithm. Specifically, if the computations for CPR and FO compensation are not taken into account, then on average only 2/L real multipliers and 2/K real adders are required for each symbol to perform tracking in the proposed algorithm. The proper value of K here is mainly determined by $\Delta f_{o,n}$, which is further related to the FO drift rate and block length. In the following sections, the value of K is 50.
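The block-wise tracking loop described above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `track_fo` and its parameters are hypothetical, the signal is assumed to be QPSK at one sample per symbol, and the modulation is removed with a 4th-power operation as a simple stand-in for the DPLL used in the paper. Because of the 4th power, the unambiguous range of the phase increment in this sketch shrinks from $[-\pi, \pi)$ to $[-\pi/4, \pi/4)$.

```python
import numpy as np

def track_fo(symbols, fo_init, sym_rate, block_len=10_000, K=50, C=1.0):
    """Block-wise FO tracking sketch for QPSK; returns one FO estimate (Hz) per block.

    Assumption: modulation is removed by a 4th power instead of a DPLL.
    """
    t_sym = 1.0 / sym_rate
    fo = fo_init          # FO used to pre-compensate the current block
    phase0 = 0.0          # compensation phase carried across block boundaries
    estimates = []
    for start in range(0, len(symbols) - block_len + 1, block_len):
        block = symbols[start:start + block_len]
        n = np.arange(block_len)
        # Pre-compensate the block with the FO estimated from the previous block.
        comp = block * np.exp(-1j * (phase0 + 2 * np.pi * fo * t_sym * n))
        phase0 = (phase0 + 2 * np.pi * fo * t_sym * block_len) % (2 * np.pi)
        # Take one symbol every K symbols and strip the QPSK modulation.
        m = comp[::K] ** 4
        # Phase increment over K symbols, one value per sub-block (cf. Eq. (2)).
        dphi = np.angle(m[1:] * np.conj(m[:-1])) / 4.0
        # Average increment converted to an FO increment (cf. Eq. (3)).
        dfo = np.mean(dphi) / (2 * np.pi * K * t_sym)
        # Weighted update of the FO estimate (cf. Eq. (1)).
        fo = fo + C * dfo
        estimates.append(fo)
    return np.array(estimates)
```

With a 28 Gbaud signal and K = 50, the residual FO that this 4th-power variant can resolve without ambiguity is $R_s/(8K) = 70$ MHz, which is still well above the few-MHz increments expected per block.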
In addition to a reduction in complexity, the proposed FO tracking algorithm has other benefits. Firstly, our estimator can compensate the FO dynamically. By doing this, the residual FO is always minimized such that it can be tolerated by CPR algorithms even in the presence of FO drifts. Secondly, the tracking algorithm estimates the FO increment rather than measuring the absolute FO. Therefore, if the initial FO can be estimated within $[-R_s/2, R_s/2]$ without ambiguity and the FO drift rate is within the tolerance of the tracking algorithm, our estimator can keep tracking the FO change, and a large FO can be estimated even when it exceeds the limit of $[-R_s/2, R_s/2]$. This benefit is crucial when we use a low-cost laser source with large end-of-life frequency inaccuracy.
In high data rate systems, parallelization and pipelining are used [12,13]. Specifically, received sequences are de-multiplexed into multiple parallel streams and each stream is processed individually. We note that the parallelization process not only enhances phase noise induced impairments [12] but also amplifies the influence of the FO and its drift. Specifically, if the degree of parallelism is denoted by P, then for a certain stream, one symbol is extracted every P symbols from the original sequence, as shown in Fig. 2 [14]. Therefore, the FO-induced phase shift between adjacent symbols is increased by a factor of P in each de-multiplexed stream. In other words, the equivalent FO of each stream is P times that of the original serial sequence. Assuming the use of parallel processing, we next discuss algorithm implementation in this regime. Note that if stable and narrow-linewidth laser sources such as ECLs are used in a system, the carrier information does not differ much among different streams. In this situation, the proposed FO tracking algorithm can be employed in a subset of the streams (possibly only one), and the estimated FO variation in these streams is shared with the neighbouring streams. In the example shown in Fig. 3, the FO tracking estimator works on the pth stream when the degree of parallelism is P. From the nth block in this stream (B(n, p)), a factor $F_{comp,n}$ is obtained for FO compensation of the next block, and this factor is applied to this stream as well as the neighbouring streams (i.e., Streams 1 to p − 1 and Streams p + 1 to P). By doing this, the FO estimators in the other streams can be removed, and the overall complexity is reduced. Note that although the hardware implementation is facilitated using the parallel structure, its tolerance to both FO and FO drift is reduced compared with the serial structure [15].

Similar ideas also apply to the CPR in our algorithm, and as mentioned above, a DPLL is used for the CPR in this paper. The principle of the DPLL is shown in Fig. 4 [14]. First the current carrier phase error $\varphi_k$ is estimated based on the previous symbols. This estimated phase error is used to de-rotate the phase of the current symbol $r_k$. Next a decision is made on the de-rotated symbol, and the de-rotated symbol is multiplied by the conjugate of this decision to get the complex error $e_k$. Considering that the majority of the FO is pre-compensated by our algorithm, the phase of $e_k$ is small in most cases, so this phase can be approximated by simply extracting the imaginary part of $e_k$, as shown in Fig. 4. Then this phase difference is multiplied by a weighting parameter G and added to the previous phase offset to provide the new estimated phase error. Using this DPLL in one parallel stream, the carrier phase error of each symbol can be obtained. By copying this carrier information directly, as shown in Fig. 3, the phase compensation can be easily performed in the neighbouring streams. Therefore, the DPLL can be removed in those streams, and the overall complexity for CPR is reduced. However, the reduced tolerance to the laser phase noise and residual FO is the trade-off of the proposed parallel processing scheme.
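The DPLL update described above can be sketched as a first-order decision-directed loop. This is an illustrative sketch only: the function names `qpsk_decide` and `dpll_qpsk`, the default gain value, and the restriction to unit-energy QPSK are assumptions made for the example, and the loop structure in Fig. 4 of the paper may differ in detail.

```python
import numpy as np

def qpsk_decide(y):
    """Nearest unit-energy QPSK point (constellation at odd multiples of pi/4)."""
    re = 1.0 if y.real >= 0 else -1.0
    im = 1.0 if y.imag >= 0 else -1.0
    return complex(re, im) / np.sqrt(2)

def dpll_qpsk(rx, gain=0.05):
    """First-order decision-directed DPLL sketch.

    Returns the de-rotated symbols and the phase estimate used for each symbol.
    """
    phi = 0.0
    out = np.empty(len(rx), dtype=complex)
    phases = np.empty(len(rx))
    for k, r in enumerate(rx):
        phases[k] = phi
        y = r * np.exp(-1j * phi)   # de-rotate with the current phase estimate
        d = qpsk_decide(y)          # symbol decision
        e = y * np.conj(d)          # complex error between symbol and decision
        phi += gain * e.imag        # small-angle approximation of angle(e), weighted by G
        out[k] = y
    return out, phases
```

The returned `phases` array is the kind of per-symbol carrier information that could be copied to neighbouring streams in the shared scheme; a larger `gain` tracks faster but passes more noise into the phase estimate.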

Simulation evaluation
First we investigate the regime without parallelization. In this simulation, a 28-Gbaud QPSK sequence with $1.12 \times 10^6$ symbols is generated and contaminated by laser phase noise and AWGN. The laser phase noise is modeled as a Wiener process with a total system linewidth of 200 kHz, corresponding to the ~100 kHz linewidths of the ECLs at both the transmitter and the LO, and the optical signal-to-noise ratio (OSNR) is 13.5 dB. Then we set an FO manually and impose the FO-induced phase shift on the symbols. The back-to-back scenario is considered, without transmission impairments and other effects. $10^4$ symbols are allocated to the training sequence, which is used to estimate the initial FO. The phases of the following data symbols are recovered through the DPLL. Differential coding/decoding is employed to avoid the impact of cycle slips.
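A test signal of the kind described above can be generated with a short NumPy routine. The sketch below is an assumption-laden illustration rather than the simulation code used in the paper: the function name `simulate_qpsk` and all parameter names are hypothetical, the FO is assumed to drift linearly, and the noise level is set by a per-symbol SNR (`snr_db`) instead of OSNR, since the OSNR-to-SNR mapping depends on details (reference bandwidth, polarization) that are not modeled here.

```python
import numpy as np

def simulate_qpsk(n_sym, sym_rate, fo_init, drift_rate, linewidth, snr_db, seed=0):
    """Generate QPSK symbols with a linearly drifting FO, Wiener phase noise and AWGN.

    fo_init in Hz, drift_rate in Hz/s, linewidth (total) in Hz, snr_db per symbol.
    Returns (received symbols, transmitted symbols).
    """
    rng = np.random.default_rng(seed)
    t = np.arange(n_sym) / sym_rate
    data = np.exp(1j * (np.pi / 4 + np.pi / 2 * rng.integers(0, 4, n_sym)))
    # Wiener phase noise: independent Gaussian increments with variance
    # 2*pi*linewidth*T_sym, accumulated over time.
    step_std = np.sqrt(2 * np.pi * linewidth / sym_rate)
    phase_noise = np.cumsum(rng.normal(0.0, step_std, n_sym))
    # Phase accumulated by an FO that starts at fo_init and drifts linearly.
    fo_phase = 2 * np.pi * (fo_init * t + 0.5 * drift_rate * t ** 2)
    # Complex AWGN for unit-energy symbols.
    noise_std = np.sqrt(10 ** (-snr_db / 10) / 2)
    awgn = noise_std * (rng.normal(size=n_sym) + 1j * rng.normal(size=n_sym))
    rx = data * np.exp(1j * (phase_noise + fo_phase)) + awgn
    return rx, data
```

For instance, `simulate_qpsk(1_120_000, 28e9, 1e9, 2e12, 200e3, snr_db)` would correspond to a 1 GHz initial FO drifting at 2 MHz/μs with a 200 kHz total linewidth, for some chosen `snr_db`.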
Figure 5(a) shows a specific case in which the initial FO is 1 GHz, the FO drift rate is 2 MHz/μs and the block length is $10^4$ symbols. From the simulation result, we can see that the variation of the FO is successfully tracked by the proposed algorithm with estimation errors smaller than 1 MHz. This deviation is small enough to be tolerated by CPR algorithms [1]. Figure 5(b) demonstrates our algorithm's tolerance to a very high FO drift rate, assuming the FO drifts linearly in each simulation. The bit error rates (BER) in this figure are obtained with the DPLL parameters and the C value optimized for each case. The FEC limit refers to a BER of $3.8 \times 10^{-3}$. From the figure, we can see that the performance of the DPLL without dynamic FO tracking is degraded significantly as the FO variation rate is increased. The implementation that employs a separate estimation for each block based on the spectrum of the 4th power of the processed symbols [2,5] is tolerant to FO drifts up to 50 MHz/μs. However, the complexity of this implementation is high. Besides, it does not function well at higher FO variation rates, because the absolute value of the FO in some blocks may exceed the estimation limit of this estimator after the FO drifts for a long time. In contrast, with the aid of the proposed FO tracking, FO drifts up to 200 MHz/μs can be compensated with only small degradation. Therefore, by using our FO tracking algorithm, the system reliability can be ensured even in an unstable environment where the laser frequency may change abruptly and unpredictably. Moreover, the complexity of the proposed FO tracking is much lower than that of the implementation using multiple separate estimators.

In a more realistic regime with parallel processing, the BER as a function of OSNR is investigated. QPSK is investigated first, and the sequence is interleaved into 8 parallel streams as shown in Fig. 2, each containing $1.12 \times 10^6$ symbols. The frequency variation rate of the original serial sequence is 0.2 MHz/μs, which is a practical value for ECLs measured in [9]. The 5th stream is chosen for the FO estimation and DPLL-based phase recovery, and the carrier recovery of the other 7 streams is performed based on the estimated FO and phase errors from the 5th stream. Finally, the BERs of streams 1, 5 and 8 are extracted in order to compare the performance of different streams. These results are further compared with the cases without parallel processing and without tracking. The theoretical curve and the FEC limit are also included as references. Figure 6(a) shows the simulation results. All the simulated curves show a certain amount of degradation from theory, and the use of differential coding/decoding is one main reason for this penalty. In the regime with parallel processing, the performance of the 1st, 5th and 8th streams is almost the same. Therefore, we conclude that the tracked FO and carrier phase errors can be shared among the parallel streams. Compared to the case without parallelization, a very slight degradation is introduced because the interleaving enhances both the laser phase noise and the FO-induced phase shift. In addition, the FO tracking provides a 1 dB improvement in terms of OSNR at the FEC limit compared to the regime without the tracking mechanism. When the FO drift rate of the serial sequence is increased to 2 MHz/μs, the demodulation cannot be completed normally if no tracking mechanism is included. In contrast, the system with FO tracking still works and provides performance that is almost the same as in the previous case, as shown in Fig. 6(b). If the modulation format is 16-QAM, similar results are observed, as shown in Fig. 7, when the baud rate is 28 Gbaud and the FO variation rate is 0.2 MHz/μs. Note that the performance degradation caused by the parallel processing is more significant for 16-QAM.
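The interleaving into parallel streams and the sharing of carrier information from one reference stream can be illustrated as follows. This is a hedged sketch: the helper names `demux` and `share_carrier` are hypothetical, and the reference phase is simply passed in as an array, whereas in the scheme described here it would come from the FO tracker and DPLL running on the chosen stream.

```python
import numpy as np

def demux(symbols, P):
    """Split a serial sequence into P interleaved streams.

    Row p holds serial symbols p, p + P, p + 2P, ... (trailing remainder dropped).
    """
    n = len(symbols) // P * P
    return symbols[:n].reshape(-1, P).T

def share_carrier(streams, ref_phase):
    """De-rotate every stream with the carrier phase estimated on one reference stream."""
    return streams * np.exp(-1j * ref_phase)[np.newaxis, :]
```

In each row returned by `demux`, the FO-induced phase step between adjacent symbols is P times the serial step, which is the enhancement noted in the text. After `share_carrier`, a stream that is d positions away from the reference keeps a residual phase of d serial steps; this stays small when the residual FO is small, which is why pre-compensation with the tracked FO matters before the phase is shared.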

Experimental validation
Next we conducted experiments in a 28 Gbaud (112 Gb/s) DP-QPSK system. Figure 8 illustrates the transmission setup. Two ECLs are employed at the transmitter and the LO. The signals provided by the digital-to-analog converters (DACs) drive an IQ-modulator to perform electrical-to-optical conversion. The DP signal is emulated by splitting the signal into two polarizations and re-combining them after delaying one polarization. Next the signal is boosted to the optimal power before it is sent to the re-circulating loop. The loop consists of 320 km of standard single mode fiber (SMF), and an EDFA with a noise figure of 5 dB is placed every 80 km to compensate the fiber loss. The launch power is −2 dBm and the transmission distance is 4800 km. The output from the loop is filtered, amplified and filtered again. After coherent detection, the signals are digitized by two 80 GS/s real-time scopes.
Finally, DSP is performed offline in MATLAB, which includes re-sampling, timing recovery [16], chromatic dispersion (CD) compensation, timing synchronization and the constant modulus algorithm (CMA) before the carrier recovery. The received symbols are interleaved into 8 streams and processed in parallel for the carrier recovery. In the 5th stream, the FO tracking is performed using the proposed FO estimator, and the CPR and symbol decision are achieved through a DPLL and differential coding/decoding. After the FO and carrier phase errors are obtained in the 5th stream, they are shared to recover the other 7 streams. We obtained a serial sequence containing $7.02 \times 10^6$ symbols in total, corresponding to a time duration of 250.71 μs in the 28 Gbaud system. The block length is 2000 in each parallel stream, which gives a time resolution of 0.571 μs considering that the degree of parallelism is 8. The value of K is 50, which means we calculate the phase increment every 50 symbols.
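The duration and time resolution quoted above follow directly from the symbol rate. As a quick check, writing $N$ for the number of serial symbols, $L$ for the block length per stream and $P$ for the degree of parallelism (the symbol $N$ is introduced here only for this calculation):

$$T_{total} = \frac{N}{R_s} = \frac{7.02 \times 10^{6}}{28 \times 10^{9}\ \text{s}^{-1}} \approx 250.71\ \mu\text{s}$$

$$\Delta t = \frac{L \cdot P}{R_s} = \frac{2000 \times 8}{28 \times 10^{9}\ \text{s}^{-1}} \approx 0.571\ \mu\text{s}$$

Each block of one stream therefore spans 16 000 serial symbols, which is why the tracking resolution is P times coarser than it would be for the same block length in a serial implementation.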
Figure 9(a) depicts the estimated FO variation based on the proposed algorithm. During the time duration of 250 μs, the FO drifts approximately 88 MHz, which amounts to an average drift rate of approximately 0.35 MHz/μs. The black stars represent the FO estimations of the discrete data sections, which are estimated based on the spectrum of the 4th power of the processed symbols. From the comparison, we can see that the results based on our approach match the values given by the spectral approach. However, the complexity of the proposed algorithm is much lower.
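For reference, a generic version of the spectral estimator used for this comparison can be written in a few lines. This is a textbook-style 4th-power periodogram for QPSK, offered as a sketch and not as the exact procedure of [2,5]; the function name `fo_fourth_power_fft` and the optional `nfft` parameter are assumptions made for the example.

```python
import numpy as np

def fo_fourth_power_fft(symbols, sym_rate, nfft=None):
    """Estimate the FO (Hz) of a QPSK section from the spectrum of its 4th power.

    The 4th power removes the QPSK modulation and leaves a tone at 4x the FO,
    so the unambiguous range is [-sym_rate/8, sym_rate/8).
    """
    nfft = nfft or len(symbols)
    spectrum = np.abs(np.fft.fft(symbols ** 4, nfft))
    freqs = np.fft.fftfreq(nfft, d=1.0 / sym_rate)
    return freqs[np.argmax(spectrum)] / 4.0
```

The frequency resolution of this sketch is `sym_rate / (4 * nfft)`, and every estimate costs one FFT over the section, which illustrates why repeating such an estimator for every block is more expensive than the phase-increment tracking.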
Next the system performance with and without FO tracking is compared. The same received sequence is loaded into the two algorithms and processed by each. For the algorithm without FO tracking, the sequence is compensated based on the initial FO only, which is estimated from training symbols, and there is no mechanism to track the FO change afterwards. Figure 9(b) summarizes the relationship between the BER and the sequence length. The sequence length here represents the total length of the input serial sequence. The BER in this figure, which is calculated in the 5th stream over all the data in this stream, is obtained using the optimal parameters for each case. As shown in the figure, if the tracking mechanism is not included, the system performance deteriorates gradually as the processed data sequence becomes longer. This is because the entire sequence is compensated using only the first estimated FO. After the frequency drifts, the residual FO in the following sequence may become too large for the DPLL to tolerate. In contrast, with FO tracking, both the initial FO and its variation are corrected. As a result, a low BER is maintained even when the processed sequence becomes long and the FO drift inside the sequence becomes significant.

Finally, the BERs of the different parallel streams are compared, as shown in Fig. 10. The cases with parallelism degree equal to 8, 16 and 32 are investigated using the data above. In these three cases, the FO tracking and the DPLL-based CPR are employed in the 5th stream, the 9th stream and the 17th stream, respectively. The BER level without parallel processing is also plotted as a reference. As a result of parallelization, the performance of each stream is degraded slightly compared with the serial case, as shown in Fig. 10, because the effects of the FO and phase noise are enhanced in each stream by the interleaving. However, similar performance among the different streams is observed, thus validating the proposed parallel processing scheme up to a parallelism degree of 32.

Conclusion
In this paper, we propose a low-complexity and efficient carrier recovery algorithm for the optical coherent receiver in single carrier systems. Working as a frequency offset (FO) estimator, it is capable of tracking the FO variation with low complexity. In simulation, it gives accurate FO estimations with small errors, and FO drifts up to 200 MHz/μs can be compensated effectively for QPSK signals. Experimentally, we successfully recovered a data sequence having >80 MHz FO drift in a 112 Gb/s DP-QPSK system. We show that the algorithm maintains good system performance in the presence of FO drift, in contrast with the significant degradation in the configuration without tracking mechanisms. In the regime with parallelization, we also demonstrate that the carrier information in one stream can be shared with other parallel streams. This further reduces the overall complexity for both the FO tracking and carrier phase recovery (CPR).

Fig. 3. Carrier recovery in the regime with parallelization. B(n, p): the nth block in the pth stream; $\Phi_n$: the carrier phase information of the nth block in the pth stream.

Fig. 5. (a) FO evolution and the estimations using the proposed tracking algorithm. (b) BER versus FO drift rate for systems with different tracking strategies.

Fig. 6. BER versus OSNR in different configurations for QPSK system when the FO drift rate of the serial sequence is (a) 0.2 MHz/μs and (b) 2 MHz/μs.

Fig. 7. BER versus OSNR in different configurations for 16-QAM system when the FO drift rate of the serial sequence is 0.2 MHz/μs.

Fig. 9. (a) Estimated FO versus time and (b) BER versus sequence length for systems with and without FO tracking.