Analytical optimization of wideband nonlinear optical fiber communication systems

In the design of fiber links for both continental and transoceanic optical communication systems, the optimization of span length is of high importance from both performance and cost perspectives. In this work, the maximization of signal-to-noise ratio (SNR) is investigated by optimizing the span length in wideband (up to 4.5-THz) Nyquist-spaced optical fiber communication systems. A simple and accurate closed-form expression of the optimal span length is provided, and a quick estimation of SNR is also described for practically feasible and cost-effective span length values.


Introduction
Currently, over 95% of digital data traffic is carried over optical fiber networks [1]. Advanced techniques such as wavelength division multiplexing (WDM), polarization multiplexing, high-order quadrature amplitude modulation (QAM) and multi-channel digital back-propagation (MC-DBP) have been employed to meet the rapidly growing demand for data volume [2][3][4][5][6][7][8]. Such systems achieve high spectral efficiency, but they either require complex hardware signal processing or are sensitive to system parameters and impairments [9,10]. It is therefore important to study how to improve performance so that optical fiber communication systems can be designed rationally.
Ultra-long-haul terrestrial and submarine communication systems require high and stable signal transmission performance. Designers therefore need to leave enough margin to cope with random and burst interference. It is consequently important to know where the optimal operating point lies and what performance it can achieve, because the gap between the current system performance and its optimal value is a useful design measure. Designers need to analyze these data comprehensively to design appropriate systems, taking into account both system performance and economic cost.
Many research works have been carried out on evaluating the performance of high-capacity optical fiber communication systems. The accuracy of the analytical model of the nonlinear interference (NLI), i.e., the enhanced Gaussian noise (EGN) model, has been demonstrated in many reported works [11][12][13][14][15][16][17][18][19][20][21][22][23][24][25][26][27]. The optimization of the information rate, the span length and the signal power has also been numerically and experimentally investigated [27,28]. However, no simple closed-form optimization results have been provided, which leaves room for improvement. For example, it is intractable to obtain an analytical form of the optimal span length via the full EGN model due to its complex structure. The GN model and some other NLI models are simple but not accurate enough [11,16,20,22]. The approximated EGN model [12][13][14][15][21] is accurate enough without being overly complicated, but obtaining a numerical solution with this model still requires considerable computational resources, which remains inefficient for designers. Therefore, these reported works cannot be easily applied to the design of optical systems with different transmission parameters.
In this paper, we present an analytical formula to calculate the optimal span length based on the approximated EGN model and mathematical approximations. We examine the signal-to-noise ratio (SNR) as an indicator of performance when the transmission span length is varied in practical WDM optical communication systems, taking the transceiver noise into account. In the process of deriving the formula, a simple closed-form estimation of the system SNR is provided. The accuracy of these formulas is also investigated. With these formulas, designers can quickly estimate the maximum system SNR from the transmission parameters to determine whether a system design is feasible. Finally, the SNR is compared for digital signal processing (DSP) schemes with electronic dispersion compensation (EDC) at the optimized span length and with MC-DBP at a commonly used span length (80 km) [29] in dual-polarization quadrature phase shift keying (DP-QPSK), dual-polarization 16-ary quadrature amplitude modulation (DP-16QAM), DP-64QAM and DP-256QAM systems to indicate the improvement obtained via the optimization.
This paper is arranged as follows. The numerical transmission setup is described in Section 2. The EGN model is explained in Section 3. Section 4 details the analyses of theoretical formulas and analytical results. Conclusions are drawn in Section 5.

Transmission system
Figure 1 illustrates the transmission setup of the dual-polarization multi-channel Nyquist-spaced optical system using the modulation formats DP-QPSK, DP-16QAM, DP-64QAM, and DP-256QAM. In the transmitter, the laser comb lines are modulated with in-phase and quadrature (I-Q) modulators. The transmitted symbol sequences in each channel and polarization are fully independent and random. Nyquist pulse shaping (NPS) is implemented by a root-raised cosine (RRC) filter with a roll-off of 0.1%. The standard single-mode fiber (SSMF) is simulated based on the split-step Fourier solution of the Manakov equation with a logarithmic step-size distribution [30].
Erbium-doped fiber amplifiers (EDFAs) are placed periodically to compensate for the loss in the optical fiber. To realize coherent detection of all in-phase and quadrature signal components in each polarization, the received signals are mixed with an ideal free-running local oscillator (LO) laser carrier at the receiver end. In the DSP modules, a frequency-domain equalizer is used to implement the EDC [31,32]. The MC-DBP is realized using the reverse operation of the split-step Fourier solution of the Manakov equation, with the nonlinearity compensation implemented in the middle of each segment [6,33]. For MC-DBP, an ideal RRC filter is used to select the desired back-propagated bandwidth; this filter also removes the unwanted out-of-band amplified spontaneous emission (ASE) noise. After that, a matched filter is used to select the center channel and to remove crosstalk from neighboring channels. In total, over $2^{16}$ symbols are employed to calculate the SNR. The phase noise, the frequency offset between the transmitter and LO lasers, and the differential group delay (DGD) between the two polarizations in the fiber are all neglected [34,35]. Table 1 details the parameters used in the numerical simulations.
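The SNR evaluation over the received symbols can be illustrated with a short sketch. It assumes the usual definition for simulations of this kind, namely the mean transmitted symbol power divided by the mean squared error between received and transmitted symbols; the paper does not spell out its estimator, so this definition, the function names, and the additive Gaussian test channel are all assumptions made for illustration only.

```python
import cmath
import math
import random


def estimate_snr_db(tx, rx):
    """SNR in dB from paired transmitted/received complex symbols.

    Assumed definition: mean |tx|^2 divided by mean |rx - tx|^2.
    """
    signal_power = sum(abs(s) ** 2 for s in tx) / len(tx)
    noise_power = sum(abs(r - s) ** 2 for s, r in zip(tx, rx)) / len(tx)
    return 10 * math.log10(signal_power / noise_power)


def qpsk_symbols(n, rng):
    """Unit-power QPSK symbols drawn independently and uniformly."""
    points = [cmath.exp(1j * (math.pi / 4 + k * math.pi / 2)) for k in range(4)]
    return [rng.choice(points) for _ in range(n)]


# Hypothetical test channel: additive Gaussian noise with total power 0.01,
# i.e. a nominal SNR of 20 dB for unit-power symbols.
rng = random.Random(0)
tx = qpsk_symbols(2 ** 16, rng)
sigma = math.sqrt(0.01 / 2)  # standard deviation per quadrature
rx = [s + complex(rng.gauss(0, sigma), rng.gauss(0, sigma)) for s in tx]
snr_db = estimate_snr_db(tx, rx)
```

With $2^{16}$ symbols the estimate lands close to the nominal 20 dB of the test channel, which suggests why a symbol count of this order gives a stable SNR reading.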

EGN model
Impairments in optical communication systems can be divided into linear distortions and nonlinear distortions [11][12][13][26,27]. For dual-polarization multi-span EDFA-amplified Nyquist-spaced WDM transmission systems without any phase noise, linear distortions comprise ASE noise and transceiver (TRx) noise, while nonlinear distortions include signal-signal (S-S) interaction and signal-noise (S-N) interaction. Therefore, the signal-to-noise ratio can be expressed as follows [16,36]: where ASE noise can be considered as additive Gaussian noise [37], and the TRx noise power is proportional to the signal power P.
where κ is the TRx noise figure, $N_s$ is the number of fiber spans, G is the EDFA gain, $F_n$ is the EDFA noise figure, h is the Planck constant, $\nu_0$ is the center lightwave frequency and $R_s$ is the symbol rate. From [38][39][40], the nonlinear noise can be evaluated by the following formulas [14][15][16]: where $L_{\mathrm{eff}} = (1 - e^{-\alpha L_s})/\alpha$, α is the fiber attenuation coefficient, $\beta_2$ is the second-order dispersion coefficient, $N_{ch}$ is the number of WDM channels, $L_s$ is the fiber span length, ϵ is the coherence factor, and η and ξ are the nonlinear interference (NLI) coefficient and the distance-dependence factor derived by the EGN model [12,41], respectively.
where the constant κ is related to the fourth standardized moment (kurtosis) of the input signal constellation. Values for different modulation formats are shown in Table 2 [26,36]. The function Φ(x) is the digamma function, C ≈ 0.577 is the Euler-Mascheroni constant and γ is the fiber nonlinear coefficient. For a system with MC-DBP, the S-S interaction is compensated to an extent that depends on the number of MC-DBP channels. In that case, $\sigma^2_{s-s}$ can be expressed as follows [17][18][19][20][21][22][23][24][25]: The accuracy of the EGN model has been well tested in previous works [16,18,23,36], and the model is well suited to estimating the system SNR. However, engineers often want to know the optimal span length, or the difference between the actual case and the optimal case, for given system parameters. Therefore, a closed-form optimal span length formula is derived based on the EGN model in this work.
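The linear-noise part of this budget can be sketched numerically. The snippet assumes the standard expressions implied by the symbol definitions in this section: total ASE power $N_s (G-1) F_n h \nu_0 R_s$ with the EDFA gain exactly offsetting the span loss, TRx noise power κP, and the effective length $L_{\mathrm{eff}} = (1 - e^{-\alpha L_s})/\alpha$. The nonlinear noise power is passed in as a plain number rather than computed from the EGN model, and all function names and numerical values (193.4 THz carrier, 32 GBaud, 100 spans of 80 km) are hypothetical choices for illustration.

```python
import math

PLANCK = 6.62607015e-34  # Planck constant in J*s


def db_to_linear(x_db):
    return 10 ** (x_db / 10)


def effective_length_km(alpha_db_per_km, span_km):
    """L_eff = (1 - exp(-alpha * L_s)) / alpha, with alpha converted to 1/km."""
    alpha = alpha_db_per_km / (10 * math.log10(math.e))
    return (1 - math.exp(-alpha * span_km)) / alpha


def total_ase_power_w(n_spans, alpha_db_per_km, span_km, nf_db, nu0_hz, rs_hz):
    """Assumed form: N_s * (G - 1) * F_n * h * nu_0 * R_s, gain = span loss."""
    gain = db_to_linear(alpha_db_per_km * span_km)
    return n_spans * (gain - 1) * db_to_linear(nf_db) * PLANCK * nu0_hz * rs_hz


def snr_linear(p_w, ase_w, nli_w, trx_kappa):
    """SNR = P / (kappa * P + ASE + NLI); nli_w is supplied by the caller."""
    return p_w / (trx_kappa * p_w + ase_w + nli_w)


# Hypothetical example: 8000 km built from 100 spans of 80 km.
l_eff = effective_length_km(0.2, 80.0)
ase = total_ase_power_w(100, 0.2, 80.0, 6.0, 193.4e12, 32e9)
trx_only = snr_linear(1e-3, 0.0, 0.0, db_to_linear(-25.0))
```

For 0.2 dB/km fiber the effective length of an 80 km span comes out near 21 km, and with only transceiver noise present the SNR saturates at 1/κ, which is why κ sets the ceiling on achievable SNR.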

Simpler SNR estimation
By analyzing the influence of the different terms in the EGN model on the signal SNR and applying some mathematical approximations, the following simplified form can be obtained: For the detailed derivation process, please refer to Appendix A. Some of the numbers in Eq. (14) and Eq. (18) arise from fitting ϵ (Eq. (7)) with a biased first-order Taylor expansion, and they are constants. Moreover, A depends only on the square of the bandwidth, and it changes little when the bandwidth exceeds 2 THz. Therefore, the expression can be further simplified by replacing the related parameters with a constant. The accuracy of Eq. (12) has been examined and the result is shown in Fig. 2. Figure 2 shows the difference in predicted SNR between Eq. (12) and the EGN model. The transmission distances are 4000 km in the orange area and 8000 km in the green area. The red dots in Fig. 2 are simulation results. Results are shown for systems operating at the optimal power (-5 dBm in Fig. 2(a) and -2 dBm in Fig. 2(b), solid lines) and at a higher signal power (-3 dBm in Fig. 2(a) and 0 dBm in Fig. 2(b), dashed lines). The discrepancies between Eq. (12) and the EGN model are always smaller than 0.3 dB in SNR, which is within the acceptable range for practical applications. In addition, the results from the approximated EGN model are very close to the simulation results.
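The notion of an optimal signal power used in Fig. 2 can be made concrete with a small sketch. It assumes a reduced noise budget in which the S-N interaction is neglected, so that SNR = P / (κP + σ²_ASE + η_tot P³), where η_tot stands for the accumulated S-S coefficient. Setting the derivative of 1/SNR with respect to P to zero then gives P_opt = (σ²_ASE / (2 η_tot))^(1/3), independent of κ. The numerical values of `ase_w` and `nli_coeff` are hypothetical placeholders, not values from this paper.

```python
import math


def snr_linear(p_w, ase_w, nli_coeff, trx_kappa=0.0):
    """SNR for launch power p_w under the reduced budget (no S-N term)."""
    return p_w / (trx_kappa * p_w + ase_w + nli_coeff * p_w ** 3)


def optimal_launch_power_w(ase_w, nli_coeff):
    """Closed-form maximizer of snr_linear: (ASE / (2 * eta_tot)) ** (1/3)."""
    return (ase_w / (2.0 * nli_coeff)) ** (1.0 / 3.0)


def w_to_dbm(p_w):
    return 10 * math.log10(p_w / 1e-3)


# Hypothetical placeholder values (total ASE in W, accumulated NLI in 1/W^2).
ase_w = 6.3e-5
nli_coeff = 1.5e5
p_opt = optimal_launch_power_w(ase_w, nli_coeff)
p_opt_dbm = w_to_dbm(p_opt)
```

Because the transceiver term κP scales with the signal, it lowers the peak SNR but does not move the optimal power, which the checks on either side of `p_opt` confirm for κ = 0 and κ > 0 alike.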

Optimal span length estimation
Using the simplified EGN model (Eq. (12)) and the corresponding mathematical approximations, the formula for the optimal span length can be obtained as follows: Appendix B shows the derivation procedure in detail. Figure 3 shows the difference between the optimal span length estimated from Eq. (19) and that obtained from the EGN model at a bandwidth of half the C-band (∼ 2 THz), including four groups of curves (a)-(d).
The EDFA noise figure is 6 dB in all four groups. Fig. 4 shows the difference between the optimal span length estimated from Eq. (19) and that obtained from the EGN model at the bandwidth of the C-band (∼ 4.5 THz), including three groups of curves (a)-(c). Group (c) has a lower EDFA noise figure (4.5 dB) than groups (a) and (b) (6 dB). All systems operate near the optimal signal power. It can be seen that the discrepancy decreases as the transmission distance increases, and the difference is always within 5%. In addition, when the EDFA noise figure ($F_n$) increases, the optimal span length $L_s$(Opt.) decreases. This indicates that choosing EDFAs with a lower noise figure could decrease the system cost, since the optimal span length increases: although the unit price of the EDFAs may rise, fewer EDFAs are needed because the optimal spans are longer. This would save total costs in long-haul communication systems. Taking an 8000 km communication system as an example, the use of EDFAs with a noise figure of 4.5 dB rather than 6 dB could reduce the total costs if the unit price of the EDFA increases by less than 12%. On the other hand, the SNR performance improves due to the lower EDFA noise figure, which means that the system has the potential to increase the data rate (e.g., by using higher-order modulation formats) in the future.
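The break-even argument for lower-noise EDFAs can be sketched as simple arithmetic. The snippet assumes that total amplifier cost is the number of amplifiers times the unit price, and that one amplifier is needed per span. The two span lengths (25 km and 28 km) are hypothetical placeholders chosen only to illustrate the calculation; they are not read from Fig. 3 or Fig. 4.

```python
import math


def n_amplifiers(link_km, span_km):
    """One EDFA per span; a partial final span still needs an amplifier."""
    return math.ceil(link_km / span_km)


def break_even_price_increase(link_km, span_old_km, span_new_km):
    """Largest relative unit-price increase that keeps total EDFA cost equal.

    Total cost is n * price, so the break-even ratio is n_old / n_new.
    """
    n_old = n_amplifiers(link_km, span_old_km)
    n_new = n_amplifiers(link_km, span_new_km)
    return n_old / n_new - 1.0


# Hypothetical optimal span lengths for the 6 dB and 4.5 dB EDFA cases.
n_old = n_amplifiers(8000, 25)
n_new = n_amplifiers(8000, 28)
margin = break_even_price_increase(8000, 25, 28)
```

Under these placeholder span lengths the lower-noise amplifier remains cheaper overall as long as its unit price rises by less than roughly the relative reduction in amplifier count, which is the same kind of threshold as the 12% figure quoted for the 8000 km system.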

Analysis of optimal span length considering commercial factors
The optimal span length described above corresponds to the best system performance (SNR) for a specific transmission scheme. It represents the fiber span length at which the maximum SNR is achieved for a given EDFA, modulation format, symbol rate, etc. Usually, the calculated optimum fiber span length is relatively short (⩽ 35 km). In this section, we use the cost/bit as an indicator to analyze the optimal span length taking commercial factors into account; the corresponding cost parameters are shown in Table 3 [42]. The difference in $C_{OA}$ between the two optical amplifiers with different noise figures (4.5 dB and 6 dB) comes from Refs. [43,44]. The system cost is calculated as $C_{tot} = (C_D + C_C + C_F M)L + C_{OA} N M + C_T C$, where $C_D$, $C_C$, $C_F$, $C_{OA}$ and $C_T$ are listed in Table 3, M is the number of spatial paths, which is 16 (8 fiber pairs) in this paper, L is the transmission distance, N is the number of EDFAs, and C is the capacity, calculated as $C = 2MB\log_2(1 + \mathrm{SNR})$, where B is the system bandwidth (4.5 THz) [42].
Figure 5 shows the relationship between the cost/bit (or the capacity) and the fiber span length. The solid lines correspond to the capacity and the dashed lines to the cost/bit. The span length that maximizes the capacity is smaller than the span length that achieves the minimum (best) cost/bit. Considering commercial factors only, the optimal span length is near 60 km, which sacrifices some system capacity and makes the communication system operate sub-optimally. Using the optimal span length derived from Eq. (19) gives the maximum capacity, but such systems cost more to build. Meanwhile, comparing the blue lines and the orange lines in Fig. 5(a) or Fig. 5(b), it can be found that the use of EDFAs with a lower noise figure not only improves the system capacity but also decreases the cost per bit. Taking Fig. 5(b) as an example, if the capacity requirement is 475 Tb/s in the 6 dB EDFA scheme, the fiber span length is near 38 km and the corresponding cost/bit is 0.048. For the 4.5 dB EDFA scheme, the same or a higher capacity can be achieved with a fiber span length between 13 km and 60 km. When the span length is between 40 km and 60 km, the 4.5 dB EDFA transmission scheme both improves the system capacity and reduces the cost/bit. In addition, as mentioned in Section 4.2, replacing the EDFAs with lower-noise-figure ones may also decrease the total system cost because of the longer fiber spans. Therefore, the use of EDFAs with lower noise figures is a way to balance performance and cost.
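The cost/bit indicator can be written out directly from the two expressions in this section, $C_{tot} = (C_D + C_C + C_F M)L + C_{OA} N M + C_T C$ and $C = 2MB\log_2(1 + \mathrm{SNR})$. The sketch below is a minimal transcription under the assumption that cost/bit is simply total cost divided by capacity. The Table 3 values are not reproduced here, so every cost number in the example is a hypothetical placeholder, as are the class and function names.

```python
import math
from dataclasses import dataclass


@dataclass
class CostParams:
    """Cost coefficients named after the Table 3 symbols (values hypothetical)."""
    c_d: float   # C_D, multiplies the distance L
    c_c: float   # C_C, multiplies the distance L
    c_f: float   # C_F, multiplies M * L
    c_oa: float  # C_OA, multiplies N * M (per amplifier, per spatial path)
    c_t: float   # C_T, multiplies the capacity C


def capacity_bps(m_paths, bandwidth_hz, snr_linear):
    """C = 2 * M * B * log2(1 + SNR)."""
    return 2 * m_paths * bandwidth_hz * math.log2(1 + snr_linear)


def total_cost(p, m_paths, length_km, n_edfa, cap_bps):
    """C_tot = (C_D + C_C + C_F * M) * L + C_OA * N * M + C_T * C."""
    return ((p.c_d + p.c_c + p.c_f * m_paths) * length_km
            + p.c_oa * n_edfa * m_paths
            + p.c_t * cap_bps)


def cost_per_bit(p, m_paths, length_km, n_edfa, bandwidth_hz, snr_linear):
    cap = capacity_bps(m_paths, bandwidth_hz, snr_linear)
    return total_cost(p, m_paths, length_km, n_edfa, cap) / cap


# Hypothetical placeholder numbers, chosen only to make the arithmetic easy.
params = CostParams(c_d=1.0, c_c=2.0, c_f=0.5, c_oa=10.0, c_t=1e-12)
cap = capacity_bps(16, 4.5e12, 7.0)            # log2(1 + 7) = 3
cost = total_cost(params, 16, 100.0, 4, cap)   # 1100 + 640 + 432
```

Shorter spans raise N (and hence the amplifier term) while also raising SNR and capacity, so sweeping the span length through `cost_per_bit` reproduces the kind of trade-off plotted in Fig. 5.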

Fast estimation of commonly used span length
When the span length is longer than 50 km, the system SNR can be calculated in a simpler way than with Eq. (12).
The mathematical approximations are detailed in Appendix C. The final formula is shown as follows: Fig. 6 shows the difference between Eq. (20) and the EGN model for different span lengths; the two agree closely.

Improvement from the use of the optimal span length
Using Eq. (19), the optimal SNR of the system can be calculated easily and quickly, as shown by the orange lines in Fig. 7. It is known that the SNRs of systems with 16QAM, 64QAM and 256QAM are close to each other [27], so only the 16QAM case is plotted in Fig. 7(a) and Fig. 7(c). To demonstrate the improvement from the use of the optimal span length, Fig. 7 also plots the SNR of the system with 80 km per span (purple lines). Compared to the 80 km/span system, the improvement is significant. For submarine optical communication systems, the span length has been set to 50 km to cope with the harsh communication environment (dark lines) [29]. However, 50 km is still larger than the optimal span length, and the SNR can be further improved by about 1 dB by reducing the fiber span length, as shown by the light green area in Fig. 7. In addition, we have compared the improvement from the use of MC-DBP with the improvement from the optimization of the span length. Currently, the digital coherent receiver has a maximum bandwidth of 300 GHz [45], which means that only 9-channel MC-DBP for a 32 GBaud system and 5-channel MC-DBP for a 64 GBaud system can be achieved. Interestingly, the SNR of the system with 300 GHz MC-DBP (80 km per span) behaves similarly to that of the system with 50 km per span (EDC only). This means that, as long as the bandwidth of the coherent receiver and the speed of the DSP do not make a breakthrough, optimizing the span length remains an effective option to improve the system performance.

Conclusion
The EGN model can estimate the system SNR accurately; however, its computational complexity is significant and the optimal span length cannot be obtained easily. In this work, we have simplified the EGN model with some mathematical approximations and verified its accuracy. An analytical formula for the optimal span length is then derived by solving the simplified formulas, giving a fast and accurate way to estimate the fiber span length that maximizes the SNR of optical communication systems. Its accuracy has been examined, and it reduces the computational complexity significantly compared to the EGN model. We have also analyzed the optimal span length not only in terms of performance but also with respect to commercial factors. To balance cost and signal performance, the noise figure of the optical amplifiers is an important factor: in general, EDFAs with a lower noise figure bring more commercial benefit, especially in systems with longer transmission distances. For systems with a fiber span length over 50 km, we have further simplified the EGN model; the simplified formula requires only one evaluation of the exponential function and three multiplications. Finally, the improvement obtained with optimal span lengths is compared to that obtained with MC-DBP. Our work provides a simple analytical solution for the design of optical fiber links in both continental and transoceanic optical communication systems.

Appendices
Appendix A. SNR estimation
For a communication system without MC-DBP, the SNR can be calculated using Eq. (1) to Eq. (10).
To simplify the formula, we first ignore some terms that contribute little to the SNR. Considering the nonlinear noise terms, it can be found that they share the same pre-factor, where κ is the TRx noise figure. When $N_s$ is large enough, the second term in brackets in Eq. (21) approximately equals $1.5\,\sigma^2_{totASE}/P$, which is nearly 0 because the signal power P is far greater than the ASE noise power $\sigma^2_{totASE}$. Besides, κ is −25 dB ≈ 0.0032, which is negligible. Therefore, From the EGN model and the normalization of the numerator in Eq. (22), we obtain $\eta_{corr}$ (with pre-factor $\frac{80}{81}$), where $\eta_{GN}$ represents the η derived by the GN model and $\eta_{corr}$ is the correction term derived by the EGN model. When x is large enough, $\sinh^{-1}(x) = \log(x + \sqrt{x^2 + 1})$ is a constant. Let C represent it, and let $C_0$ represent $\frac{\gamma^2}{\pi |\beta_2| R_s^2}$. Substitute $L = N_s L_s$ into Eq. (23) and use $L_{\mathrm{eff}}/L_s$ as an independent variable x. where L is the transmission distance. From the EGN model [16], ϵ approaches 0, and the optimal span is usually small, meaning that x approaches 1 [28]. $N_s$ is finite and ϵ changes little, so it is feasible to fit the corresponding curves with linear functions. As follows, $C = \frac{h\nu_0 R_s F_n L \alpha}{P}$ (28), $\kappa_1 = \frac{8}{27}$ where κ is the TRx noise figure.
The optimal span length is usually small (within 30 km), and x is close to 1. The function $1/x$ can be fitted as follows, Eq. (37) is accurate enough when $L_s$ is within 20-50 km as long as the fiber loss coefficient is 0.2 dB/km. For other types of fiber, the fitting parameters are shown in Table 4. Now we consider the second part of Eq. (35). In the process of deriving Eq. (12), due to some approximations, the optimal point has been shifted to the right-hand side. Note that the value of the second part of Eq. (35) changes very slowly when $L_s$ is close to the optimal value. To derive the analytical formula, we use a value near its minimum as an approximate representation, which also compensates for the error caused by the shift of the optimal point. Finally, the closed-form optimal span length formula can be expressed as follows: $L_s(\mathrm{Opt.}) = \frac{1}{0.076}\log\left(\frac{5.8C + \kappa_1\,(0.13\log(\alpha L) + 1) - 2\kappa_2}{1.3C}\right)$
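A closed-form result of this kind can be sanity-checked against a brute-force sweep over span lengths. The sketch below does not implement the approximated EGN model or Eq. (19). As a stand-in it assumes the simpler closed-form GN-model approximation for Nyquist-spaced WDM (NLI coefficient per span and coherence factor ϵ, as recalled from the GN-model literature), neglects S-N interaction and transceiver noise, and optimizes the launch power analytically at each span length. All fiber and system parameters are hypothetical typical SSMF values, not the Table 1 values, so the number it returns should not be compared with the results in this paper.

```python
import math

PLANCK = 6.62607015e-34  # Planck constant in J*s


def snr_at_optimum_power(span_km, total_km, alpha_db_km=0.2,
                         beta2_s2_km=21.7e-24, gamma_w_km=1.2, nf_db=6.0,
                         rs_hz=32e9, n_ch=141, nu0_hz=193.4e12):
    """Peak SNR (linear) at a given span length under an assumed GN model."""
    alpha = alpha_db_km / (10 * math.log10(math.e))   # power attenuation, 1/km
    l_eff = (1 - math.exp(-alpha * span_km)) / alpha   # effective length, km
    l_eff_a = 1 / alpha                                # asymptotic eff. length
    n_spans = total_km / span_km                       # treated as continuous
    bw_hz = n_ch * rs_hz                               # total WDM bandwidth
    asinh_term = math.asinh(0.5 * math.pi ** 2 * beta2_s2_km * l_eff_a * bw_hz ** 2)
    # Assumed GN-model NLI coefficient for one span, in 1/W^2.
    eta = (8 / 27) * gamma_w_km ** 2 * l_eff ** 2 * asinh_term / (
        math.pi * beta2_s2_km * l_eff_a * rs_hz ** 2)
    # Assumed GN-model coherence factor; it grows as spans get shorter.
    eps = 0.3 * math.log(1 + 6 * l_eff_a / (span_km * asinh_term))
    nli_coeff = n_spans ** (1 + eps) * eta
    gain = math.exp(alpha * span_km)                   # EDFA offsets span loss
    ase = n_spans * (gain - 1) * 10 ** (nf_db / 10) * PLANCK * nu0_hz * rs_hz
    p_opt = (ase / (2 * nli_coeff)) ** (1 / 3)         # optimal launch power
    return p_opt / (ase + nli_coeff * p_opt ** 3)


# Sweep span lengths from 5 km to 120 km in 0.5 km steps for an 8000 km link.
spans_km = [5 + 0.5 * i for i in range(231)]
best_span_km = max(spans_km, key=lambda s: snr_at_optimum_power(s, 8000.0))
gain_db = 10 * math.log10(snr_at_optimum_power(best_span_km, 8000.0)
                          / snr_at_optimum_power(80.0, 8000.0))
```

In this simplified model the interior optimum exists only because ϵ depends on the span length; with purely incoherent accumulation (ϵ = 0) the peak SNR would keep improving as spans shrink. A sweep like this is slow compared with a closed-form expression, which is the practical motivation for deriving one.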
Data availability. Data underlying the results presented in this paper are not publicly available at this time but may be obtained from the authors upon reasonable request.