Properties of nonlinear noise in long, dispersion-uncompensated fiber links

We study the properties of nonlinear interference noise (NLIN) in fiber-optic communications systems with large accumulated dispersion. Our focus is on settling the discrepancy between the results of the Gaussian noise (GN) model (according to which NLIN is additive Gaussian) and a recently published time-domain analysis, which attributes drastically different properties to the NLIN. Upon reviewing the two approaches we identify several unjustified assumptions that are key in the derivation of the GN model, and that are responsible for the discrepancy. We derive the true NLIN power and verify that the NLIN is not additive Gaussian, but rather it depends strongly on the data transmitted in the channel of interest. In addition we validate the time-domain model numerically and demonstrate the strong dependence of the NLIN on the interfering channels' modulation format.


Introduction
The modeling of nonlinear propagation in optical fibers is a key component in the efficient design of fiber-optic communications. Although computer simulations have long reached a state of maturity allowing very accurate prediction of system performance, their use is prohibitively complex in many cases of relevance, where approximate analytical models become invaluable. In a wavelength division multiplexed (WDM) environment, nonlinear propagation phenomena can be classified as either intra-channel [1], or inter-channel [2] effects. Intra-channel effects manifest themselves as nonlinear inter-symbol interference, which can in principle be eliminated by means of post-processing (such as back-propagation [3]), or pre-distortion [4]. Inter-channel effects consist of cross-phase-modulation (XPM) and four-wave-mixing (FWM) between WDM channels, and in a complex network environment, where joint processing is prohibitively complex, distortions due to inter-channel effects are random and it is customary to treat them as noise. The chief goal of analytical models of fiber propagation is to accurately characterize this noise in terms of its statistical properties.
While early attempts of characterizing the properties of nonlinear interference noise (NLIN) in the context of fiber-communications date back to the previous millennium [5], two recent analytical approaches are of particular relevance to this paper. The first approach, which relies on analysis in the spectral domain, originated from the group of P. Poggiolini at the Politecnico di Torino [6][7][8][9][10][11] and its derivation has been recently generalized by Johannisson and Karlsson [12] and by Bononi and Serena [13]. The model generated by this approach is commonly referred to as the Gaussian noise (GN) model and its implications have already started to be addressed in a number of studies [14][15][16]. The second approach has been reported by Mecozzi and Essiambre [17], and it is based on a time-domain analysis. The results of the latter approach [17] are distinctly different from those of the former [6][7][8][9][10][11][12][13]. Most conspicuously, in the results of [6][7][8][9][10][11][12][13], the NLIN is treated as additive Gaussian noise and its power-spectrum is totally independent of modulation format. Conversely, the theory of Mecozzi et al. predicts a strong dependence of the NLIN variance on the modulation format, consistently with recent experimental observations [18]. It also predicts that in the presence of non-negligible intensity modulation a large fraction of NLIN can be characterized as phase noise. This property has a very important practical consequence. If NLIN indeed has a large phase-noise component, as argued in [17], then it can be canceled out easily by making use of its long temporal correlation [19], and the effective NLIN becomes much weaker than suggested by its overall variance. The consequences of this reality in terms of the predicted channel capacity have been recently studied in [19,20].
In this paper we review the essential parts of the time-domain theory of [17], as well as those of the frequency-domain GN approach. We argue that the difference between the two models results from three subtle, but very important shortcomings of the frequency-domain analysis. The first is the implicit assumption that NLIN can be treated as additive noise, while ignoring its statistical dependence on the data in the channel of interest. While it is true that within the framework of a perturbation analysis NLIN can always be expressed as an additive noise term, its dependence on the channel of interest is critical. In the case of phase noise, for example, the signal of interest $s(t)$ changes into $s(t)\exp(i\Delta\theta)$, and the noise $s(t)\exp(i\Delta\theta) - s(t) \simeq i s(t)\Delta\theta$ may be uncorrelated with $s(t)$, but it is certainly not statistically independent of it. The second shortcoming of the frequency-domain approach is the assumption that in the limit of large chromatic dispersion the electric field of the signal and the NLIN that accompanies it can be treated as Gaussian processes whose distribution is uniquely characterized in terms of their power density spectrum. The third shortcoming that we find in the GN analysis is the claim that non-overlapping frequency components of the propagating electric field are statistically independent of each other. We show here that these components are statistically dependent in general, and it is the assumption of independence that is responsible for the fact that the NLIN in [6][7][8][9][10][11][12][13] appears to be independent of modulation format. We supplement the NLIN variance obtained in the frequency-domain analysis of [6] with an extra term that follows from fourth-order frequency correlations and which, as we believe, settles the discrepancy with respect to the time-domain theory of [17].
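The distinction between "uncorrelated" and "statistically independent" in the phase-noise example can be illustrated numerically. The following is a minimal sketch under illustrative assumptions that are not taken from the paper: 16-QAM symbols of unit mean power, a zero-mean Gaussian phase fluctuation with a standard deviation of 0.1 rad drawn independently of the symbols, and the first-order noise term $i s \Delta\theta$. The second-order correlation between noise and signal comes out close to zero, whereas the covariance between the noise power and the signal power does not.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000

# 16-QAM symbols normalized to unit mean power (illustrative choice)
levels = np.array([-3.0, -1.0, 1.0, 3.0])
s = (rng.choice(levels, n) + 1j * rng.choice(levels, n)) / np.sqrt(10)

# Small zero-mean phase fluctuation, independent of s (assumed Gaussian)
dtheta = 0.1 * rng.standard_normal(n)

# First-order phase-noise term: s*exp(i*dtheta) - s ~= i*s*dtheta
noise = 1j * s * dtheta

# Second-order statistic: correlation between noise and signal
correlation = np.mean(noise * np.conj(s))

# Fourth-order statistic: covariance between noise power and signal power
p_s = np.abs(s) ** 2
p_n = np.abs(noise) ** 2
power_cov = np.mean(p_n * p_s) - np.mean(p_n) * np.mean(p_s)

print("|<noise * conj(s)>| =", abs(correlation))
print("cov(|noise|^2, |s|^2) =", power_cov)
```

A nonzero power covariance is enough to rule out independence, even though the noise passes the usual second-order test.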
The study contained in this paper was performed only for the case of single carrier transmission, where XPM constitutes the predominant contribution to NLIN, a fact which is confirmed by our simulations. For this reason the analytical parts of this paper focus exclusively on XPM. Moreover, in order to isolate only the NLIN caused by inter-channel nonlinear interference, we back propagate the channel of interest so as to eliminate the distortions that are induced by SPM and chromatic dispersion.
The paper is organized as follows. In Section 2 we review the main analytical steps of [17], occasionally recasting them in a form that emphasizes the aspects of most relevance to this paper, and supplement them by the calculation of the autocorrelation function of the nonlinear phase-noise [19]. We then review the spectral approach in Section 3 and explain the consequences of the assumptions on Gaussianity and statistical independence that were made in [6,7,12,13]. In Sec. 4 we describe a numerical study that validates the analytical prediction of Secs. 2 and 3. Section 5 is devoted to a summary and discussion.

Time-domain analysis
We consider a channel of interest, whose central frequency is arbitrarily set to zero, and a single interfering channel whose central frequency is set to $\Omega$. Since XPM only involves two-channel interactions, the NLIN contributions of multiple WDM channels add up independently, and there is no need to conduct the initial analysis with more than a single pair. We also ignore nonlinear interactions that involve amplified spontaneous emission noise, which are negligible within the framework of a perturbation analysis such as we are conducting here. While a second-order analysis such as in [21] is possible in principle, we find the first-order approach sufficiently accurate in the context of the study conducted here. As a starting point we express the zeroth order (i.e. linear) solution for the electric field, $u^{(0)}(z,t)$, as in Eq. (1), where the superscript (0) signifies "zeroth order". The first sum on the right-hand side of (1) represents the channel of interest, and the second sum represents the interfering channel. The symbols $a_k$ and $b_k$ represent the data carried by the $k$-th symbol of the two channels, respectively, $z$ and $t$ are the space and time coordinates, $\beta''$ is the dispersion coefficient and $T$ is the symbol duration. For simplicity of notation, and without loss of generality, we will assume throughout this section that $\beta''$ is negative and $\Omega$ positive. The fundamental pulse representing an individual symbol is $g^{(0)}(z,t) = U(z)g(0,t)$, where $g(0,t)$ is the input waveform and $U(z) = \exp\left(\frac{i}{2}\beta'' z\,\partial_t^2\right)$ (with $\partial_t$ denoting the time derivative operator) [22] is the propagation operator in the presence of chromatic dispersion. We assume that the waveform $g(0,t)$ is normalized to unit energy, whereas the actual energy of the transmitted symbols is accounted for by the coefficients $a_k$ and $b_k$.
In addition it is assumed that the input waveform $g(0,t)$ is orthogonal with respect to time shifts by an integer number of symbol durations, namely $\int_{-\infty}^{\infty} g^{*}(0,t-kT)\,g(0,t-k'T)\,dt = \delta_{k,k'}$. Owing to the unitarity of $U(z)$ this property of orthogonality is also preserved in the linearly propagated waveform $g^{(0)}(z,t)$.
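The preservation of orthogonality under dispersion can be checked numerically. The sketch below is only an illustration under assumptions of our own choosing: a periodic time grid, a Nyquist pulse with a flat spectrum over one symbol-rate bandwidth, and arbitrary units with $T = 1$. Dispersion is applied as a unit-modulus quadratic spectral phase, so the outcome does not depend on its sign convention.

```python
import numpy as np

T = 1.0      # symbol duration (arbitrary units)
sps = 8      # samples per symbol (assumed)
nsym = 64    # symbols in the periodic window (assumed)
N = sps * nsym
dt = T / sps
f = np.fft.fftfreq(N, d=dt)      # frequency grid, cycles per unit time
omega = 2 * np.pi * f

# Nyquist pulse: flat spectrum over exactly one symbol-rate bandwidth
G0 = ((f >= -0.5 / T) & (f < 0.5 / T)).astype(complex)

def pulse(k, beta2_z):
    """Unit-energy pulse delayed by k symbols, after accumulated dispersion beta2_z."""
    shift = np.exp(-1j * omega * k * T)                 # delay by k*T
    dispersion = np.exp(-0.5j * beta2_z * omega ** 2)   # unit-modulus (unitary) filter
    g = np.fft.ifft(G0 * shift * dispersion)
    return g / np.sqrt(np.sum(np.abs(g) ** 2) * dt)

def gram(beta2_z, ks=range(-3, 4)):
    """Matrix of inner products between pulses shifted by the symbol indices ks."""
    P = np.array([pulse(k, beta2_z) for k in ks])
    return (P.conj() @ P.T) * dt

print(np.round(np.abs(gram(0.0)), 6))     # before propagation
print(np.round(np.abs(gram(-50.0)), 6))   # after strong dispersion
```

Both Gram matrices come out as the identity to numerical precision, even though the dispersed pulses overlap heavily in time.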
The first order correction for the field, $u^{(1)}(z,t)$, is obtained by solving the nonlinear Schrödinger equation in which the nonlinear term is evaluated from the zeroth order approximation, Eq. (2), where $\gamma$ is the nonlinearity coefficient and the function $f(z)$ accounts for the loss/gain profile along the optical link [17]. It is equal to 1 in the case of perfectly uniform distributed amplification, whereas in the case of lumped amplifiers $f(z) = \exp(-\alpha z')$, where $\alpha$ is the loss coefficient and $z'$ is the distance between the point $z$ and the position of the last amplifier that precedes it. It is assumed that only terms that contribute to the channel of interest (i.e. in the vicinity of zero frequency) are retained in the nonlinear term in (2). The solution to Eq. (2) at $z = L$ is straightforward and it is given by
$$u^{(1)}(L,t) = i\gamma\int_0^L dz\, f(z)\, U(L-z)\,|u^{(0)}(z,t)|^2\, u^{(0)}(z,t). \tag{3}$$
We now focus, without loss of generality, on the detection of the zeroth data symbol $a_0$, which is obtained by passing the received field, $u(L,t) \simeq u^{(0)}(L,t) + u^{(1)}(L,t)$, through a filter matched to the waveform $g^{(0)}(L,t)$. The contribution of $u^{(0)}(L,t)$ to the output of the matched filter is $a_0$ itself, whereas the contribution of $u^{(1)}(L,t)$ is the estimation error $\Delta a_0$ resulting from NLIN. It is given by
$$\Delta a_0 = i\gamma\int_0^L dz\, f(z)\int_{-\infty}^{\infty} dt\; g^{(0)*}(z,t)\,|u^{(0)}(z,t)|^2\, u^{(0)}(z,t), \tag{4}$$
where we have used the identity $U(L-z)\,g^{(0)*}(L,t) = g^{(0)*}(z,t)$, which follows from the definition of the linear propagation operator. Substitution of the zeroth order field expression from Eq. (1) in Eq. (4) produces the result
$$\Delta a_0 = i\gamma\sum_{h,k,m}\left(a_h a_k^{*} a_m S_{h,k,m} + 2\,a_h b_k^{*} b_m X_{h,k,m}\right), \tag{5}$$
where
$$S_{h,k,m} = \int_0^L dz\, f(z)\int_{-\infty}^{\infty} dt\; g^{(0)*}(z,t)\, g^{(0)}(z,t-hT)\, g^{(0)*}(z,t-kT)\, g^{(0)}(z,t-mT) \tag{6}$$
is responsible for intra-channel interference effects, whereas
$$X_{h,k,m} = \int_0^L dz\, f(z)\int_{-\infty}^{\infty} dt\; g^{(0)*}(z,t)\, g^{(0)}(z,t-hT)\, g^{(0)*}(z,t-kT-\beta''\Omega z)\, g^{(0)}(z,t-mT-\beta''\Omega z) \tag{7}$$
accounts for (inter-channel) XPM-induced interference. Intra-channel interference involves only symbols transmitted in the channel of interest and it need not be considered as noise. It can be reduced by performing joint decoding of a large block of symbols, or eliminated by means of back-propagation or pre-distortion. We will hence ignore the terms proportional to $S_{h,k,m}$ in what follows and focus on the NLIN due to XPM.
Notice that given the injected pulse waveform $g(0,t)$, the symbol duration $T$, the channel spacing $\Omega$ and the parameters of the fiber, the value of $X_{h,k,m}$ can be found numerically. It can be seen to decrease monotonically with the walk-off between channels, where the relevant parameter is the ratio between the group velocity difference $\beta''\Omega$ and the symbol duration $T$. A very important feature of $X_{h,k,m}$ is that it is proportional to the overlap between four temporally shifted waveforms. It is therefore reasonable to expect based on Eq. (7) that the largest elements of $X_{h,k,m}$ are those for which $h = 0$ and $k = m$. That is because in this situation only two temporally shifted waveforms need to overlap. We write the contribution of these terms to $\Delta a_0$ as
$$\Delta a_{0p} = i a_0 \theta, \tag{8}$$
where we define $\theta = 2\gamma\sum_m |b_m|^2 X_{0,m,m}$. Notice that since $X_{0,m,m}$ is a real quantity according to Eq. (7), $\theta$ is a real quantity as well and it represents a nonlinear phase rotation. This was the inspiration for using the sub-index $p$ (as in "phase") in the symbol $\Delta a_{0p}$ [23]. The first and second moments of $\theta$ are given by
$$\langle\theta\rangle = 2\gamma\langle|b_0|^2\rangle\sum_m X_{0,m,m}, \qquad \langle\theta^2\rangle = 4\gamma^2\left[\left(\langle|b_0|^4\rangle - \langle|b_0|^2\rangle^2\right)\sum_m X_{0,m,m}^2 + \langle|b_0|^2\rangle^2\Big(\sum_m X_{0,m,m}\Big)^2\right],$$
and the variance of the phase rotation is
$$\langle\Delta\theta^2\rangle = \langle\theta^2\rangle - \langle\theta\rangle^2 = 4\gamma^2\left(\langle|b_0|^4\rangle - \langle|b_0|^2\rangle^2\right)\sum_m X_{0,m,m}^2, \tag{9}$$
where we have used the independence between different data symbols, $\langle|b_m|^2|b_{m'}|^2\rangle = \langle|b_m|^2\rangle\langle|b_{m'}|^2\rangle(1-\delta_{m,m'}) + \langle|b_m|^4\rangle\delta_{m,m'}$, as well as their stationarity, $\langle|b_m|^n\rangle = \langle|b_0|^n\rangle$. Equation (9) constitutes an extremely important result: the phase-noise variance grows with the variance of the squared amplitude of the information symbols, and it vanishes in the case of pure phase modulation, where $|b_0|$ is a constant (and hence $\langle|b_0|^4\rangle - \langle|b_0|^2\rangle^2 = 0$). This is a rather counter-intuitive result in view of the fact that upon propagation through a dispersive fiber, the intensity of the electric field appears to fluctuate randomly, independently of the way in which it is modulated ([6,12]; see also the discussion related to Fig. 2 in Sec. 4 of this paper).
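To make the modulation-format dependence of the phase-noise variance concrete, the factor $\langle|b_0|^4\rangle - \langle|b_0|^2\rangle^2$ can be evaluated for a few constellations. The sketch below assumes equiprobable constellation points normalized to unit mean power; the function name and the choice of formats are ours and serve only as an illustration.

```python
import numpy as np

def amplitude_variance_factor(constellation):
    """Return <|b|^4> - <|b|^2>^2 for equiprobable points scaled to unit mean power."""
    b = np.asarray(constellation, dtype=complex)
    b = b / np.sqrt(np.mean(np.abs(b) ** 2))   # normalize the mean power to 1
    p = np.abs(b) ** 2
    return np.mean(p ** 2) - np.mean(p) ** 2

qpsk = [1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j]
levels = [-3, -1, 1, 3]
qam16 = [x + 1j * y for x in levels for y in levels]

factor_qpsk = amplitude_variance_factor(qpsk)     # constant modulus
factor_qam16 = amplitude_variance_factor(qam16)
# For circular complex Gaussian symbols <|b|^4> = 2<|b|^2>^2, so the factor equals 1
factor_gauss = 1.0

print(factor_qpsk, factor_qam16, factor_gauss)
```

With this normalization the factor is 0 for QPSK, 0.32 for 16-QAM and 1 for Gaussian modulation, which orders the three formats by the strength of the phase noise they induce.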
Apart from the pure phase-noise that follows from XPM between WDM channels, there are additional noise contributions involving a single pulse from the channel of interest with a pair of pulses from the interfering channel. We refer to the NLIN due to these contributions as residual NLIN, so as to distinguish it from the phase NLIN that was described earlier. In general, since residual NLIN occurs in the process of temporal overlap between three or four distinct waveforms (see Eq. (7)), its magnitude in the presence of amplitude modulation (as in 16-QAM or larger QAM constellations) is expected to be notably smaller than that of phase noise, as we demonstrate numerically in Sec. 4.
A further simplification of the expression for the variance of phase-noise follows in the limit of large accumulated chromatic dispersion, which accurately characterizes the situation in most modern fiber-communications links that do not include inline dispersion compensation. In this situation the propagating waveform $g^{(0)}(z,t)$ quickly becomes proportional to its own Fourier transform [26], as expressed by Eq. (10). Equation (10) simply reflects the fact that dispersion causes different frequency components of the incident signal to propagate at different velocities, so that the frequency spectrum of the injected signal is mapped into time. In this limit the coefficients $X_{0,m,m}$ are given by Eq. (11), where we defined $\nu = t/\beta'' z$. In Eq. (11) we neglected the nonlinear distortion generated in the vicinity of the fiber input and defined $z_0 \sim T^2/|\beta''| \ll L$ as the distance after which the large dispersion approximation, Eq. (10), becomes valid. Using Eq. (11) we derive an approximate analytic expression for $\langle\Delta\theta^2\rangle$ in the case of perfectly distributed amplification. The approximation relies on the notion that the largest overlap between the two waveforms in the integrand of (11) occurs at a position $z = z_m = -mT/\beta''\Omega$. We replace the integral from $z_0$ to $L$ with an integral from $-\infty$ to $+\infty$ and approximate $f(z)$ with $f(z_m)$, which is set to 1 when $z_m \in [z_0, L]$ and to 0 otherwise. Physically this is equivalent to stating that all collisions whose centers are inside the region $[z_0, L]$ are counted as complete collisions in spite of the fact that in reality some of them (those that are centered close to the edges of the fiber) are partial. Multiplying the integrand by $z_m/z$ (which is close to unity when there is strong overlap between pulses), and changing the order of integration, we obtain
$$X_{0,m,m} \simeq \begin{cases} \dfrac{1}{|\beta''|\Omega}, & z_0 \le z_m \le L, \\ 0, & \text{otherwise.} \end{cases} \tag{12}$$
Substitution into Eq. (9) yields the result
$$\langle\Delta\theta^2\rangle = \frac{4\gamma^2 L}{T|\beta''|\Omega}\left(\langle|b_0|^4\rangle - \langle|b_0|^2\rangle^2\right). \tag{13}$$
The simplified expression for $X_{0,m,m}$, Eq. (12), also allows calculation of the temporal autocorrelation function of the phase noise, $R_\theta(l) = \langle\theta_n\theta_{n+l}\rangle - \langle\theta\rangle^2$, where we use the notation $\theta_n$ to denote the nonlinear phase rotation induced upon the $n$-th symbol in the channel of interest:
$$R_\theta(l) = \langle\Delta\theta^2\rangle\left[1 - \frac{|l|\,T}{|\beta''|\Omega L}\right]^{+}, \tag{14}$$
where $[a]^{+} = \max\{a, 0\}$. In the case of multiple WDM channels, Eq. (14) generalizes to
$$R_\theta(l) = 4\gamma^2\left(\langle|b_0|^4\rangle - \langle|b_0|^2\rangle^2\right)\frac{L}{T}\sum_s \frac{1}{|\beta''\Omega_s|}\left[1 - \frac{|l|\,T}{|\beta''\Omega_s| L}\right]^{+}, \tag{15}$$
where $\Omega_s$ is the frequency separation between the $s$-th WDM channel and the channel of interest and the summation is over all the interfering channels. Notice that in the limit of large accumulated dispersion, $|\beta''\Omega_s|L/T \gg 1$, the phase noise is characterized by a very long temporal correlation. This property allows cancelation of nonlinear phase-noise with available equalization technology [24,25] and contributes to the achievement of higher information capacity [19,20]. It also allows the extraction of phase noise from simulations, as we explain in Sec. 4.
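The long correlation time of the phase noise can be quantified with a short calculation. The sketch below assumes that every interfering channel contributes a triangular autocorrelation with weight $1/|\beta''\Omega_s|$ and a half-width of $|\beta''\Omega_s|L/T$ symbols, which is the form obtained by counting the collisions shared by two symbols of the channel of interest. The parameter values are those quoted for the simulated system in Sec. 4, with the symbol duration taken as $T = 10$ ps (an assumption corresponding to a 100 GHz Nyquist bandwidth); the function name is ours.

```python
import math

def normalized_phase_autocorr(lag, beta2, length, T, omegas):
    """R_theta(lag) / R_theta(0) for a sum of triangular per-channel contributions."""
    num = 0.0
    den = 0.0
    for om in omegas:
        walkoff = abs(beta2 * om)             # group-delay difference per km [s/km]
        n_collisions = walkoff * length / T   # symbols swept during propagation
        num += max(1.0 - abs(lag) / n_collisions, 0.0) / walkoff
        den += 1.0 / walkoff
    return num / den

beta2 = 21e-24                  # |beta''| in s^2/km (21 ps^2/km)
length = 500.0                  # link length in km
T = 10e-12                      # symbol duration in s (assumed, 100 Gbaud)
spacing = 2 * math.pi * 102e9   # channel spacing in rad/s
omegas = [k * spacing for k in (-2, -1, 1, 2)]   # four interferers around the center

r50 = normalized_phase_autocorr(50, beta2, length, T, omegas)
print(f"R(50)/R(0) = {r50:.4f}")
```

Under these assumptions the autocorrelation drops by roughly 6% over 50 symbols, and the nearest-neighbor contribution vanishes only beyond several hundred symbols.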

Frequency domain analysis
In this section we review the approach adopted in [6][7][8][9][10][11][12][13] of analyzing in the frequency domain the interaction leading to NLIN and relate it to the analysis in Sec. 2. Following [6], we assume that the transmitted symbols $a_n$ and $b_n$ are periodic with period $M$, so that $a_{n+M} = a_n$, $b_{n+M} = b_n$, and the propagating field $u^{(0)}(z,t)$, which is defined in Eq. (1), is periodic in time with a period $MT$. As pointed out in [6], for large enough $M$, the assumption of periodicity is immaterial from the physical standpoint, but it facilitates calculations by allowing the representation of the signal by means of discrete frequency tones, as in Eq. (16). The coefficients $\nu_n$ represent the spectrum of the channel of interest at frequency $\omega = 2\pi n/MT$, whereas $\xi_n$ represent the spectrum of the interfering channel at $\omega = \Omega + 2\pi n/MT$. Both $\nu_n$ and $\xi_n$ are zero-mean random variables whose statistics depend on the transmitted symbols in a way on which we elaborate in what follows. The complex amplitude of the NLIN that the interfering channel imposes on the channel of interest, $\Delta u(t)$, is the sum of all the nonlinear interactions between triplets of individual frequency tones, Eq. (17), where, consistently with [6], the terms with $m = n$, which only contribute a time-independent phase shift, were excluded from the summation. The factor of 2 in front of the sum in Eq. (17) is characteristic of XPM when the nonlinearly interacting channels are co-polarized. The coefficients $\rho_{lmn}$, which are proportional to $\gamma$, are given by Eq. (18) [7], where the WDM channel spacing is assumed to be $\Omega = 2\pi q/T$, $L_s$ is the length of a single amplified span and $N$ is the overall number of amplified spans in the system. The NLIN power is given by the mean square of $\Delta u(t)$, Eq. (19), where $m \neq n$ and $m' \neq n'$, and where we have made use of the fact that $\nu_l$ and $\xi_m$ are statistically independent for all $l$ and $m$, since they correspond to different WDM channels that transmit statistically independent data.
Lack of correlation between different frequency tones implies that $\langle\nu_l\nu_{l'}^{*}\rangle = \langle|\nu_l|^2\rangle\delta_{ll'}$, and the assumption of true statistical independence (which is key in obtaining the results of [6]) implies that $\langle\xi_m\xi_n^{*}\xi_{m'}^{*}\xi_{n'}\rangle = \langle|\xi_m|^2\rangle\langle|\xi_n|^2\rangle\delta_{mm'}\delta_{nn'}(1-\delta_{mn})$ (where the irrelevant cases with $m = n$ or $m' = n'$ were ignored for simplicity). Equation (19) then simplifies to Eq. (20), an expression that only depends on the mean power spectrum of the interacting channels and is totally independent of modulation format. As we now show for the case of single-carrier modulation, the above assumption of statistical independence is unjustified (even as an approximation) with most of the relevant modulation formats. We consider a generic interfering channel as in (1), $x(t) = \sum_k b_k g(t-kT)$, which is periodic as in [6] with $b_{k+M} = b_k$. The Fourier coefficients of $x(t)$ are given by Eq. (21), where $\tilde g(\omega) = \int g(t)\exp(i\omega t)\,dt$ is the Fourier transform of $g(t)$, $\omega_n = 2\pi n/MT$, and the final expression on the right-hand side follows from a straightforward, albeit slightly cumbersome, algebraic manipulation. The correlation relations between the various $\xi_n$ are obtained by averaging the product $\xi_n\xi_{n'}^{*}$ with respect to the transmitted data. In order to simplify the algebra we will assume Nyquist, sinc-shaped pulses $g(t) = \mathrm{sinc}(\pi t/T)$, in which case Eq. (22) holds [6,7]. The restriction to Nyquist pulses ensures wide-sense stationarity for $x(t)$ and avoids the appearance of correlations between frequency tones $\xi_n$ and $\xi_{n'}$ that are separated by an integer multiple of $M$ [6]. Assuming circularly symmetric complex modulation, the central limit theorem can be applied to Eq. (21), implying (as argued in [6,7]) that in the limit of large $M$, the coefficients $\xi_n$ are Gaussian distributed random variables. Yet, unlike the claim made in [6,7], the fact that the coefficients $\xi_n$ are Gaussian and uncorrelated does not imply their statistical independence.
That is because the coefficients $\xi_n$ are Gaussian individually, but not jointly, and hence their lack of correlation does not imply anything regarding the statistical dependence between them. In order to see the lack of joint Gaussianity, note that if all $\xi_n$ were jointly Gaussian then $x(t)$ (which can be expressed as their linear combination) would have to be Gaussian as well. Therefore, unless the data-carrying symbols $b_n$ are themselves Gaussian distributed, the Fourier coefficients $\xi_n$ cannot obey a jointly Gaussian distribution. We now write the fourth-order correlation, which is obtained from Eq. (21) (again, after some algebra and for the case of Nyquist pulses), as Eq. (23), with $P_{mnm'n'}$ defined in Eq. (24). The first term on the right-hand side of (23) is what would follow if the coefficients $\xi_n$ were indeed statistically independent, as assumed in [6], whereas the second term reflects the deviation from this assumption. Upon substitution into Eq. (19) we find that the noise variance can be written as Eq. (25), with the coefficients $\chi_1$ and $\chi_2$ given by Eqs. (26) and (27), where terms with $m = n$ or $m' = n'$ are excluded from the summation. The first term on the right-hand side of (25) is due to second-order correlations between the frequency tones and we will refer to it as the second-order noise (SON). This term coincides with the result of [6,12] (and can be obtained by substituting Eq. (22) in Eq. (20)). The second term is absent in the calculations of [6,12] and, since it results from fourth-order correlations between the frequency tones, we will refer to it as fourth-order noise (FON). Consistently, we will refer to $\chi_1$ and $\chi_2$ as the SON and FON coefficients, respectively. Due to the delta functions in the definition of $P_{mnm'n'}$ in Eq. Moreover, as we demonstrate numerically in Sec. 4, in the limit of distributed amplification the coefficient $\chi_2$ is almost identical to $\chi_1$, and they become practically indistinguishable when the frequency separation between the interfering channels grows (see Fig. 5).
Interestingly, in the special case of purely Gaussian modulation, when the symbols $b_k$ are circularly symmetric complex Gaussian variables, $\langle|b_0|^4\rangle - 2\langle|b_0|^2\rangle^2 = 0$ and the FON vanishes, in which case the NLIN spectrum found in [6] is exact. Consistently, we recall that this is also the only case in which $x(t)$ is truly Gaussian distributed and the lack of correlation between different frequency tones indeed implies their statistical independence. The last point that we address in this section is the assumption of Gaussianity in the context of NLIN in the limit of high chromatic dispersion. The argument against this assumption is similar to the argument made in the context of Gaussianity in the frequency domain, because in the limit of large dispersion the signal frequency spectrum is simply mapped to the time domain. Therefore, the field becomes Gaussian point-wise, but it does not form a Gaussian process. It is in fact a general principle that a linear, unitary, time-independent operation, such as chromatic dispersion, cannot transform a non-Gaussian process into a Gaussian one. In the absence of joint Gaussianity between all of the field samples, the power density spectrum does not sufficiently characterize the nature of NLIN.
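The statement that uncorrelated tones need not be independent can be checked with a small Monte Carlo experiment. The sketch below assumes that, for Nyquist pulses, the tones $\xi_n$ are proportional to the discrete Fourier transform of the periodic symbol block; the block length $M = 8$, the number of trials and the function names are illustrative choices of ours. For unit-power symbols with $\langle b^2\rangle = 0$, a short calculation suggests that the covariance between the powers of two distinct tones is $(\langle|b_0|^4\rangle - 2\langle|b_0|^2\rangle^2)/M$, that is, $-1/M$ for QPSK and 0 for Gaussian symbols.

```python
import numpy as np

def qpsk(rng, shape):
    """Unit-power QPSK symbols."""
    re = rng.choice([-1.0, 1.0], size=shape)
    im = rng.choice([-1.0, 1.0], size=shape)
    return (re + 1j * im) / np.sqrt(2)

def gaussian(rng, shape):
    """Unit-power circularly symmetric complex Gaussian symbols."""
    return (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)

def tone_statistics(symbols_fn, M=8, trials=100_000, seed=0):
    """Estimate <xi_1 xi_2^*> and cov(|xi_1|^2, |xi_2|^2) over random symbol blocks."""
    rng = np.random.default_rng(seed)
    b = symbols_fn(rng, (trials, M))
    xi = np.fft.fft(b, axis=1) / np.sqrt(M)    # tones of each periodic block
    p1 = np.abs(xi[:, 1]) ** 2
    p2 = np.abs(xi[:, 2]) ** 2
    corr = np.mean(xi[:, 1] * np.conj(xi[:, 2]))
    cov = np.mean(p1 * p2) - np.mean(p1) * np.mean(p2)
    return corr, cov

corr_q, cov_q = tone_statistics(qpsk)
corr_g, cov_g = tone_statistics(gaussian)
print("QPSK:     |corr| =", abs(corr_q), " power cov =", cov_q)
print("Gaussian: |corr| =", abs(corr_g), " power cov =", cov_g)
```

In both cases the second-order correlation is negligible, but only the Gaussian symbols also give a vanishing covariance between tone powers, in line with the sign of the FON factor for constant-modulus formats.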

Numerical validation
In order to validate the analytical results of the previous section, a set of simulations, all based on the standard split-step Fourier transform method, was performed. In order to demonstrate the principle and to be able to test the phase-noise variance predicted in Eq. (13), we perform all simulations for the case of perfectly distributed gain, namely where the loss coefficient $\alpha$ is set to 0. The simulations are performed for a 500 km system over a standard single mode fiber, whose dispersion coefficient is $\beta'' = 21$ ps²/km and whose nonlinearity coefficient is $\gamma = 1.3$ W⁻¹ km⁻¹. As we are only interested in characterizing the NLIN, we did not include ASE noise in any of the simulations. In all our simulations the symbol rate was 100 Gbaud, similarly to [27], and the channel spacing was set to 102 GHz. Nyquist pulses with a perfectly square optical spectrum (of 100 GHz width) were assumed. The number of simulated symbols in each run was 8192 and up to 500 runs (each with independent and random data symbols) were performed with each set of system parameters, so as to accumulate sufficient statistics. The data symbols of the various channels were generated independently of each other using Matlab's random number generator, whose periodicity is much larger than the collective number of symbols produced in our simulations. Use of very long sequences in every run is critical in such simulations so as to achieve acceptable accuracy in view of the long correlation time of NLIN, as well as to avoid artifacts related to the periodicity of the signals that is imposed by the use of the discrete Fourier transform. In all system simulations that we present, the number of WDM channels was five, with the central channel being the channel of interest. At the receiver the channel of interest was isolated with a matched optical filter and back-propagated so as to eliminate the effects of SPM and chromatic dispersion.

Modulation format dependence
In order to demonstrate the dependence of NLIN on the modulation format we plot in Fig. 1 the received signal constellations in six different cases. The figures in the left column represent the case in which the channel of interest undergoes QPSK modulation, whereas the right column refers to the case in which the modulation of the channel of interest is 16-QAM. The figures in the top panel correspond to the case in which the interfering channels are QPSK modulated, whereas the figures in the middle panel were produced with 16-QAM modulated interferers. The bottom two figures were produced in the case where the symbols of the interfering channels were Gaussian modulated. In the top panel, where the interfering channels undergo pure phase modulation, the NLIN is almost circular, although a small amount of phase-noise can still be observed. This small phase-noise is due to coefficients $X_{0,k,m}$ ($k \neq m$) that were neglected in Sec. 2 [23]. In the center and bottom panels, where the intensity of the interfering channels is modulated, the phase-noise nature of NLIN is very evident, and it is largest in the case of Gaussian modulation. The modulation format dependence that is predicted in [17] and summarized in Sec. 2 is of somewhat subtle origin and is fairly counter-intuitive. As was argued correctly in [6,12,13], the electric field of the strongly dispersed signal appears fairly random independently of the modulation format, as can be seen in Fig. 2. Moreover, as noted earlier, the point-wise distribution of the field is indeed Gaussian. Nonetheless the types of NLIN produced by the various modulations are very different, as can be clearly seen in Fig. 1.
We note that the phase-noise nature of NLIN was not evident in the simulation results reported in [7]. While the origin of the difference between the results cannot be determined unambiguously based on the simulation details provided in [7], it may result from certain differences in the simulated system. Most importantly, the simulations in [7] do not eliminate intra-channel effects through back-propagation, as we do here, but use adaptive equalization, which may leave some of the intra-channel interference uncompensated. Furthermore, it is possible that the phase-noise that we report (which is characterized by a very long temporal correlation) is inadvertently eliminated in the process of adaptive equalization. Additionally, some of the discrepancy could result from the fact that the system simulated in [7] assumed lumped amplification, as opposed to the distributed amplification that we assumed here. It is possible that these differences explain the agreement between the simulations reported in [7] and the analytical results of the GN model.

The variance of phase-noise and assessment of the residual NLIN
In this section we validate the analytical expression for the phase-noise variance in Eq. (13), and assess the residual noise. We recall that the residual noise is the part of the NLIN that does not manifest itself as phase-noise and hence remains after phase-noise cancelation. To this end, we define a procedure for extracting the phase noise from the results of the simulations. Denoting by $r_n$ the $n$-th sample of the received signal (in the channel of interest and after back-propagation and matched filtering), we have $r_n = a_n\exp(i\theta_n) + \Delta a_{n,r}$, where $\Delta a_{n,r}$ is the residual noise. We extract $\theta_n$ through a least-squares procedure by performing a sliding average of the quantity $a_n^{*} r_n$ over a moving window of $N = 50$ adjacent symbols. We then normalize the absolute value of the averaged quantity to 1, so as to ensure that we are only extracting phase noise. The residual noise $\Delta a_{n,r}$ is evaluated by subtracting $a_n\exp(i\hat\theta_n)$ (with $\hat\theta_n$ being the estimated phase) from the received sample $r_n$. The width of the sliding window needs to be narrow enough relative to the correlation time of $\theta_n$, but broad enough to ensure meaningful statistics. Using this procedure we computed the autocorrelation function of the nonlinear phase $\theta$, which is plotted in Fig. 3. The agreement between the analytical and numerical autocorrelation functions is self evident. Notice that over a block of 50 symbols the autocorrelation of $\theta_n$ drops only by 6% relative to its maximal value, thereby justifying the choice of $N = 50$ for the moving average window. Further considerations in optimizing the window size can be found in [19]. Figure 4 shows the normalized overall NLIN variance $\langle\Delta a_0^2\rangle/PT$, the phase-noise variance $\langle\Delta\theta^2\rangle$, and the normalized variance of the residual noise $\langle\Delta a_{0,r}^2\rangle/PT$, where $P$ is the average power in each of the interfering channels. The analytical expression for the phase-noise variance, Eq. (13), is also plotted by the dashed red curve. All the curves in Fig. 4 were obtained in the case of Gaussian modulation of the data symbols. The accuracy of the analytical result is self evident.
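The phase-noise extraction procedure described above can be sketched in a few lines. This is a minimal illustration rather than the code used for the reported simulations: the function name is ours, the sliding average is implemented as a zero-padded convolution (so the first and last few dozen estimates rely on fewer samples), and the transmitted symbols are assumed known at the receiver.

```python
import numpy as np

def extract_phase_noise(r, a, window=50):
    """Estimate the slowly varying phase theta_n and the residual noise.

    r : received samples after back-propagation and matched filtering
    a : transmitted symbols of the channel of interest
    """
    # Sliding average of conj(a_n) * r_n; its argument is the least-squares
    # estimate of a phase that is constant over the window
    prod = np.conj(a) * r
    avg = np.convolve(prod, np.ones(window) / window, mode="same")
    rotation = avg / np.abs(avg)          # force unit modulus: phase only
    theta_hat = np.angle(rotation)
    residual = r - a * rotation           # r_n - a_n * exp(i * theta_hat_n)
    return theta_hat, residual

# Synthetic usage example (all values are illustrative assumptions)
rng = np.random.default_rng(1)
n = 4000
a = (rng.choice([-1.0, 1.0], n) + 1j * rng.choice([-1.0, 1.0], n)) / np.sqrt(2)
theta = 0.1 * np.sin(2 * np.pi * np.arange(n) / 2000)            # slow phase drift
noise = 0.01 * (rng.standard_normal(n) + 1j * rng.standard_normal(n))
r = a * np.exp(1j * theta) + noise

theta_hat, residual = extract_phase_noise(r, a)
print("max phase error:", np.max(np.abs(theta_hat - theta)[100:-100]))
```

Dividing the averaged quantity by its modulus mirrors the normalization step in the text, so that amplitude fluctuations are left in the residual rather than absorbed into the phase estimate.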

The difference with respect to the NLIN power predicted by the GN model
In order to assess the error in the estimation of the NLIN power by the GN model, we compute the NLIN power as it is predicted by the GN model and as it is predicted by the theory in Sec. 3. In the case where $P$ is the average power used in each of the channels, these quantities are specified by Eq. (25) and given by Eqs. (28) and (29), where the summation index $s$ runs over all neighboring channels (which are spectrally separated by $\Omega_s$ from the channel of interest). The SON coefficient $\chi_1$ and the FON coefficient $\chi_2$ are plotted in Fig. 5a as a function of the frequency separation between the interacting channels, where the blue diamonds are used to represent $\chi_1$ and the red circles represent $\chi_2$. The two coefficients are seen to be very similar to each other, so that the difference between them, which is illustrated by the green squares, is significantly smaller than the coefficients themselves. The Monte-Carlo integration method [28] was employed in order to compute the sums in Eqs. (26) and (27) in the limit of $M \to \infty$, with the estimation error being always lower than 3%.
In Fig. 5b we show the NLIN power in our simulated 5-channel system in the cases of QPSK, 16-QAM, and Gaussian modulation. The thin solid curves show the theoretical result $\langle|\Delta u|^2\rangle_{\mathrm{Full}}$ of Eq. (29) and the circles represent the variance obtained in the full split-step simulation. The dashed green line represents the prediction of the GN model, $\langle|\Delta u|^2\rangle_{\mathrm{GN}}$, Eq. (28), which is correct only for Gaussian modulation. In the case of QPSK the actual NLIN power is lower by approximately 6.5 dB than the prediction of the GN model. Since the NLIN powers in Fig. 5b include the contribution of phase-noise, the relation to the error rate is not straightforward.

Discussion
Having reviewed the essential parts of the time-domain model and the frequency-domain GN model, we have pointed out that the difference between the models results from three unjustified assumptions in the frequency-domain approach. These are the assumption that NLIN can be described as an additive noise term that is statistically independent of the signal, the assumption that in the large dispersion limit the electric field of the signal and the noise forms a Gaussian process that is uniquely characterized in terms of its spectrum, and the claim of statistical independence between non-overlapping tones in the spectrum of the interfering signal. We have shown that by correctly accounting for fourth-order correlations in the signals' spectrum an extra term, the FON, arises. The FON (which can be positive or negative, depending on the modulation format) needs to be added to the noise power obtained in the GN model (the SON) in order to obtain the correct overall NLIN. The inclusion of the FON recovers the dependence of the NLIN power on modulation format, a property that is absent from the existing GN model, and reconciles the frequency-domain and the time-domain theories. We stress that the current GN model of [6][7][8][9][10][11][12][13], which does not contain the FON term, cannot be considered a valid approximation, since with standard modulation formats (e.g., QPSK, 16-QAM) the magnitude of the FON is comparable to that of the SON, which is the quantity calculated in [6,7]. The numerical validation of the theoretical results has been performed in the case of a five-channel WDM system with idealized distributed amplification. In this case the FON coefficient was almost identical to the SON coefficient, implying that the error in the NLIN power predicted by the GN model is very significant.
While the study presented in this paper focused on the single polarization case, the effect of polarization multiplexing can be anticipated by considering the relevant factors. The SON part of the NLIN variance changes in the presence of polarization multiplexing by a factor of 16/27 [7], whereas it can be shown that the FON part changes by 40/81. The small difference between these factors has practically no effect on the conclusions made in this paper regarding the importance of accounting for FON. The numerical study of polarization multiplexed transmission, as well as the effects of lumped amplification and the many other practical system parameters, is beyond the scope of this work and will be addressed in the future.
Finally, we note that when treating the NLIN as an additive, signal-independent noise process, its bandwidth appears to be comparable to that of the signal itself. Thus, one cannot take advantage of the fact that phase noise that dominates the variance of NLIN in many cases of interest is very narrow-band as we have shown here (see Eq. (15) and Fig. 3). The importance of this property of NLIN is immense as it allows cancelation of the phase-noise part of NLIN by means of available equalization technology [24,25], such that the residual NLIN (whose variance is much smaller than that of the NLIN as a whole) determines system performance. The system consequences of this reality have been addressed in [20].