Squeezing out the last few bits from band-limited channels with entropy loading

The past decade has witnessed the rapid development of advanced optical modulation formats and digital signal processing, which have been pushing optical transmission systems towards the capacity limit. Recent research has sought to squeeze out the last few bits from bandwidth-limited optical channels. One straightforward path is to expand the signal spectrum beyond the bandwidth limit while keeping single-carrier modulation, which inevitably induces severe inter-symbol interference. To cope with this penalty, sophisticated digital nonlinear equalization must be applied to the single-carrier signals to reduce the burden on the subsequent forward error correction (FEC). On the other hand, a more intuitive capacity-approaching method for bandwidth-deficient channels is the well-known water filling realized by multicarrier modulation. As its approximation, bit loading (BL) is a well-established algorithm to maximize the bit rate of a discrete multitone (DMT) channel with fixed-rate FEC. Built on probabilistic constellation shaping (PCS), multicarrier entropy loading (EL) goes beyond BL through continuous source entropy adaptation and has proven its superiority over its single-carrier PCS counterpart. In this paper, we reveal the advantage of EL over BL in both achievable information rate (AIR) and FEC, aiming to prove EL as the optimum capacity-approaching solution for bandwidth-limited channels with frequency-selective fading. In a 100G direct-detection system with a bandwidth-deficient directly modulated laser, EL improves the AIR by 5%-10% over BL using identical FEC overhead. EL will be critical for short-reach interconnects dominated by low-cost optical components to squeeze out the last few bits from the bandwidth-constrained system. © 2019 Optical Society of America under the terms of the OSA Open Access Publishing Agreement


Introduction
Over the past decade, the development of advanced optical modulation formats and digital signal processing (DSP) has pushed optical transmission close to the capacity limit. To approach the capacity of a band-limited channel in the linear regime, constellation shaping by either geometric [1,2] or probabilistic [3,4] methods is exploited to achieve the Gaussian shaping gain. While fiber channels are commonly treated as having a flat frequency response owing to the ultrawide bandwidth of both optical fibers and optical amplifiers, other bandwidth-limited devices inside the transmission system, such as the cascaded ROADMs in meshed optical networks, can give rise to a colored-SNR effect across the spectrum [5][6][7]. In particular, short-reach optical interconnects, dominated by intensity-modulation (IM) direct-detection (DD), usually suffer from frequency-selective fading due to transceiver bandwidth deficiency, laser frequency chirp and fiber chromatic dispersion. To squeeze out the last few bits from those IM-DD systems, the most straightforward approach is to maintain a conventional IM format like NRZ or PAM-4 while pushing the baud rate beyond the limited bandwidth (normally referred to as faster-than-Nyquist). Although in theory these single-carrier formats can approach the channel capacity under frequency-selective fading [7,8], this places stringent requirements on the FEC, including overhead and decoding complexity. One solution is to share the burden of FEC with digital nonlinear equalization [9], including pre-equalization such as Tomlinson-Harashima precoding [10,11] and post-equalization such as Volterra equalization [12,13] and maximum likelihood sequence estimation (MLSE) [13,14].
On the other hand, information theory suggests a capacity-approaching strategy for bandwidth-limited channels, namely, the well-known water filling supported by multicarrier (MC) modulation [15], which decomposes a high-speed signal into a series of low-symbol-rate subcarriers and adapts their modulations to the SNR response. The most popular algorithm that approximates water filling is bit loading (BL) [16][17][18], which assigns different bit levels to the subcarriers. Built on probabilistic constellation shaping (PCS) [3], entropy loading (EL) was recently proposed; it loads continuous entropies instead of discrete bit levels onto the subcarriers [7]. The advantage of MC-EL over single-carrier PCS was demonstrated in a filter-narrowing coherent transmission system with cascaded ROADMs. In DD systems, MC-EL has also shown a superior capability of maximizing the achievable information rate (AIR) under limited transceiver bandwidth [19,20]. Moreover, EL is expected to outperform BL under coarse frequency resolution (namely, with only a few subcarriers), because its continuous adjustment of entropy relaxes the requirement on frequency granularity. However, it is not clear whether EL maintains such superiority in a system with fine frequency resolution. This paper compares the AIR of EL and BL, aiming to prove EL as the global optimum capacity-approaching modulation for bandwidth-limited channels. The comparison is made in a 100-Gb/s IM-DD system using a directly modulated laser (DML) with a 3-dB bandwidth of 16 GHz.

Useful definitions
Conventional BL loads uniform QAM onto each subcarrier. The entropy of each subcarrier is then simply $m = \log_2 M$, where M is the QAM order. The AIR and the required FEC overhead (OH) are predicted from the pre-FEC BER of the QAM symbols. When the PCS-QAM applied in EL is taken into account, more systematic criteria should be adopted to make fair comparisons between uniform QAM and PCS-QAM. In this section, we briefly introduce some useful definitions used throughout the paper.
Instead of a uniform distribution among the constellation points, PCS usually assigns the Maxwell-Boltzmann distribution to form a Gaussian-like constellation [3]. We consider a generic channel with input $X^n = X_1 X_2 \ldots X_n$ and output $Y^n$. The inputs $X_i$ are independent and identically distributed (i.i.d.). Each input symbol $X_i$ is mapped onto the QAM alphabet $\chi = \{x_1, \ldots, x_M\}$ with probability mass function $P(x)$, $x \in \chi$. The source entropy (in bits) of the M-order QAM is defined as [21, Ch. 1]

$$H = -\sum_{x \in \chi} P(x) \log_2 P(x) \qquad (1)$$

Note that the entropy characterizes the information rate that can be sent by the transmitter and is determined only by the source distribution $P(x)$.
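As a rough illustration of the source entropy of a PCS constellation, the sketch below builds a Maxwell-Boltzmann distribution over a square QAM alphabet and tunes its shaping parameter by bisection until a target entropy is reached. The function names, the shaping parameter `lam` and the bisection bracket are assumptions made for this example and are not taken from the paper.

```python
import numpy as np

def mb_qam(M, lam):
    """Square M-QAM points with probabilities P(x) proportional to exp(-lam*|x|^2)."""
    side = int(round(np.sqrt(M)))
    amps = np.arange(-(side - 1), side, 2)        # e.g. -15, -13, ..., 15 for 256-QAM
    i, q = np.meshgrid(amps, amps)
    points = (i + 1j * q).ravel()
    p = np.exp(-lam * np.abs(points) ** 2)
    return points, p / p.sum()

def entropy_bits(p):
    """Source entropy H = -sum P(x) log2 P(x), skipping zero-probability points."""
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

def shape_to_entropy(M, h_target, iters=60):
    """Bisection on lam in [0, 1] (assumed bracket); entropy decreases as lam grows."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if entropy_bits(mb_qam(M, mid)[1]) > h_target:
            lo = mid                               # still too much entropy: shape harder
        else:
            hi = mid
    return 0.5 * (lo + hi)

# PCS 256-QAM carrying 6.1 bits, the format quoted in the loading example of the paper
lam = shape_to_entropy(256, 6.1)
points, p = mb_qam(256, lam)
print(f"lam = {lam:.5f}, H = {entropy_bits(p):.3f} bits")
```

With `lam = 0` the distribution is uniform and the entropy equals log2(M), which is the BL case.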
For the communication channel, the capacity characterizes the highest information rate the channel can convey and is independent of the modulation. For an additive white Gaussian noise (AWGN) channel (which is applicable to most fiber communication systems in the linear transmission regime), the capacity (in bits) is determined only by the channel SNR as [21, Ch. 9]

$$C = \log_2(1 + \mathrm{SNR}) \qquad (2)$$

Another critical concept is the mutual information (MI) between X and Y, namely, I(X;Y). I(X;Y) characterizes the channel AIR given both the source distribution and the channel condition, and cannot exceed the capacity. Namely,

$$I(X;Y) \le C \qquad (3)$$

$$I(X;Y) = \mathbb{E}\left[\log_2 \frac{q(Y|X)}{\sum_{x' \in \chi} P(x')\, q(Y|x')}\right] \qquad (4)$$

where q(y|x) is the conditional probability. For the AWGN channel, q(y|x) simply follows the circularly symmetric complex Gaussian distribution. As practical optical communication systems use bit-interleaved coded modulation (BICM) and bit-metric decoding (BMD), the generalized MI (GMI) under BMD offers a reliable estimate of the lower bound of the AIR, calculated via Eq. (4) with a bit-metric-defined q(y|x) (see Eq. (22) of Ref. [3]). GMI can also be estimated using the bit-wise log-likelihood ratios (see Eqs. (8)-(11) of Ref. [22]). The normalized GMI (NGMI), defined as the maximum number of information bits per transmitted bit, can be calculated as [23]

$$\mathrm{NGMI} = 1 - \frac{H - \mathrm{GMI}}{m} \qquad (5)$$

Note that the rate of a binary FEC code has exactly the same implication as NGMI. As a result, an ideal FEC with code rate $R_c$ guarantees error-free decoding if the channel satisfies $\mathrm{NGMI} > R_c$ [23]. Correspondingly, the ideal FEC OH can be inferred from NGMI as

$$\mathrm{OH} = \frac{1 - \mathrm{NGMI}}{\mathrm{NGMI}} \qquad (6)$$

In this paper, we report both NGMI and FEC OH under the assumption of ideal FEC.
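The quantities defined in this section reduce to a few one-line formulas. The sketch below assumes the AWGN capacity log2(1 + SNR), the NGMI definition 1 - (H - GMI)/m from Ref. [23], and the ideal-FEC relation OH = (1 - R_c)/R_c with R_c equal to the NGMI. The function names are hypothetical. The NGMI values in the loop are the ones quoted in the experiment of this paper.

```python
import math

def awgn_capacity(snr_db):
    """AWGN capacity in bits per complex symbol for an SNR given in dB."""
    return math.log2(1.0 + 10.0 ** (snr_db / 10.0))

def ngmi(h, gmi, m):
    """Normalized GMI for entropy h, GMI gmi and m bits per QAM symbol (assumed form)."""
    return 1.0 - (h - gmi) / m

def ideal_fec_overhead(ngmi_value):
    """Overhead of an ideal FEC whose code rate equals the NGMI."""
    return (1.0 - ngmi_value) / ngmi_value

for r in (0.97, 0.88, 0.83):
    print(f"NGMI {r:.2f} -> ideal FEC OH {100 * ideal_fec_overhead(r):.1f}%")

# PCS 256-QAM (m = 8) with H = 6.1 bits at an NGMI of 0.9 carries 6.1 - 0.8 = 5.3 bits,
# just below the 16-dB AWGN capacity.
print(f"C(16 dB) = {awgn_capacity(16.0):.3f} bits, NGMI = {ngmi(6.1, 5.3, 8):.2f}")
```

An NGMI of 0.88 maps to about 13.6% OH and 0.83 to about 20.5% OH, matching the overheads reported for EL and BL in the experiment.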
For a multicarrier system (like DMT) in which each subcarrier has its own H, C and GMI, the overall H, C and GMI are defined as their expectations ($\mathbb{E}$) over all the subcarriers. The capacity of each subcarrier in all the following figures is calculated with the SNR measured under constant power loading of all the subcarriers. The overall NGMI is defined as

$$\mathrm{NGMI} = 1 - \frac{\mathbb{E}[H] - \mathbb{E}[\mathrm{GMI}]}{\mathbb{E}[m]} \qquad (7)$$

which assumes the same code rate among all the subcarriers.
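A minimal sketch of the multicarrier averaging follows. It assumes that the overall NGMI takes the form 1 - (E[H] - E[GMI])/E[m], i.e. the single-carrier NGMI with every quantity replaced by its mean over the subcarriers. The three-subcarrier numbers are made up for illustration and are not measured values.

```python
import numpy as np

def overall_metrics(h, gmi, m, snr_db):
    """Average per-subcarrier entropy, GMI and capacity, then form the overall NGMI."""
    h, gmi, m, snr_db = (np.asarray(v, dtype=float) for v in (h, gmi, m, snr_db))
    capacity = np.log2(1.0 + 10.0 ** (snr_db / 10.0))   # per-subcarrier AWGN capacity
    h_avg, gmi_avg, m_avg = h.mean(), gmi.mean(), m.mean()
    return {
        "H": h_avg,
        "GMI": gmi_avg,
        "C": capacity.mean(),
        "NGMI": 1.0 - (h_avg - gmi_avg) / m_avg,         # assumed overall definition
    }

# Illustrative subcarriers: 256-, 64- and 16-QAM, each loaded to an NGMI of 0.9
m = np.array([8.0, 6.0, 4.0])
h = np.array([6.1, 5.0, 3.5])
gmi = h - 0.1 * m                                        # GMI = H - m * (1 - NGMI)
result = overall_metrics(h, gmi, m, snr_db=[16.0, 13.5, 10.0])
print(result)
```

When every subcarrier is loaded to the same NGMI, the overall NGMI under this definition equals that common value, which is consistent with the assumption of one code rate for all subcarriers.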

Loading algorithms with NGMI target
For conventional BL, a practical loading algorithm was proposed in [16]. Under a bit-rate or BER target, it assigns bit levels iteratively through a sub-optimum loop and then adjusts the power among subcarriers to reserve a constant SNR margin for every subcarrier. In general, such a constant margin guarantees that the FEC code provides even error protection across the spectrum. For a system that involves PCS, GMI and NGMI [22,23] are more straightforward metrics to evaluate the channel AIR and predict the FEC OH. Under the GMI/NGMI criteria, the SNR margin for BL/EL systems can be interpreted as a constant NGMI for all the subcarriers, a target adopted in [19]. Here we further provide a simple yet efficient loading algorithm with an NGMI target based on look-up tables. The look-up table, acquired by Monte-Carlo simulations with AWGN channels, stores the relation between NGMI and channel SNR for all the available modulation formats. For BL, the format is limited to discrete levels of QAM, while for EL, the format is specified by the square-QAM level M (with m bits/symbol) and the source entropy H after PCS. These tables are illustrated in Fig. 1. H is continuous for EL, as shown by the smooth surface in Fig. 1(a) (in the table the step size of H is 0.1 bit), and only takes discrete values for BL, as shown in Fig. 1(b). A probe signal with 4-QAM is sent in advance to acquire the SNR map across the spectrum. Note that, for simplicity, we assume constant-power water filling [7] before BL/EL. If the channel has extremely low SNR (e.g. 0 dB) at some frequencies, a power loading based on water filling may enhance the overall AIR [19].
Based on the SNR map, EL assigns continuous entropies to the subcarriers in one step: the [M,H] pair is determined by a 2-D table search based on SNR and NGMI. Figure 1(a) illustrates a 2-D search example with 16-dB SNR and 0.9 NGMI, which gives an H of 6.1 bits; namely, PCS 256-QAM with 6.1-bit entropy should be assigned to this particular subcarrier. In contrast, BL cannot provide a modulation that exactly meets the NGMI target at a specific SNR due to its discrete QAM levels, and consequently an additional power loading should follow to slightly adjust the SNR for a precise NGMI match, similar to the conventional BL [16]. The third approach makes a compromise between EL and BL: while it loads continuous entropies, it inherits the subcarrier groups of BL, namely, subcarriers with the same QAM level in BL are constrained to have the same [M,H] in EL. We abbreviate it as EL-BG (BG stands for bit-loading group). EL-BG is a useful intermediary to compare EL with BL; more importantly, it reduces the number of distribution matchers [3,24] for PCS from the total subcarrier number to the BL group number, offering a more practical EL implementation. EL-BG was demonstrated in [20], but the entropy of each subcarrier group was determined by the lowest SNR in that group and thus did not guarantee a constant margin across the spectrum. Under the NGMI target, EL-BG finds the [M,H] for each subcarrier group that matches the overall NGMI (defined by Eq. (7)) with the target, and then performs power loading for a fine NGMI match per subcarrier. The NGMI target above is equivalent to the BER target for BL [16]. Another popular BL target is the bit rate, equivalent to the AIR evaluated by GMI in this paper. Under a fixed channel condition, a higher source entropy helps the system approach the channel capacity at the cost of a larger FEC OH (i.e. lower NGMI). In other words, a lower NGMI results in a higher AIR. The AIR target can thus be decomposed into a series of NGMI targets: the NGMI target is lowered step by step, increasing the AIR until the AIR target is reached.
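The table search for one subcarrier can be sketched as follows, under several assumptions that go beyond the text: the NGMI-versus-SNR relation is computed by deterministic numerical integration rather than Monte-Carlo simulation, square QAM is treated as two independent Gray-labelled PAM signals with a Maxwell-Boltzmann distribution, NGMI is taken as 1 - (H - GMI)/m, and among all formats that meet the NGMI target the one with the largest GMI is chosen. The BL stand-in only considers uniform square QAM. All function names are hypothetical.

```python
import numpy as np

def mb_pam(levels, lam):
    amps = np.arange(-(levels - 1), levels, 2, dtype=float)
    p = np.exp(-lam * amps ** 2)
    return amps, p / p.sum()

def entropy_bits(p):
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

def pam_with_entropy(levels, h_target):
    """Bisection on the shaping parameter (assumed bracket [0, 2])."""
    lo, hi = 0.0, 2.0
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if entropy_bits(mb_pam(levels, mid)[1]) > h_target:
            lo = mid
        else:
            hi = mid
    return mb_pam(levels, 0.5 * (lo + hi))

def gmi_qam(M, h, snr_db):
    """GMI (bits/QAM symbol) under bit-metric decoding on AWGN, via one PAM dimension."""
    levels = int(round(np.sqrt(M)))
    amps, p = pam_with_entropy(levels, h / 2.0)
    sigma = np.sqrt((p * amps ** 2).sum() / 10.0 ** (snr_db / 10.0))
    step = sigma / 20.0
    y = np.arange(amps[0] - 8 * sigma, amps[-1] + 8 * sigma, step)
    joint = p[:, None] * np.exp(-(y[None, :] - amps[:, None]) ** 2 / (2 * sigma ** 2))
    joint /= np.sqrt(2 * np.pi) * sigma                  # joint density p(x, y)
    p_y = joint.sum(axis=0)
    gray = np.arange(levels) ^ (np.arange(levels) >> 1)  # Gray label of each level
    h_cond = 0.0                                         # sum over bits of H(B_i | Y)
    for i in range(levels.bit_length() - 1):
        bit = (gray >> i) & 1
        for b in (0, 1):
            p_yb = joint[bit == b].sum(axis=0)
            keep = p_yb > 0
            h_cond -= (p_yb[keep] * np.log2(p_yb[keep] / p_y[keep])).sum() * step
    return 2.0 * (entropy_bits(p) - h_cond)

def table_search(snr_db, ngmi_target, uniform_only=False, orders=(16, 64, 256, 1024)):
    """Return (M, H, GMI) with the largest GMI whose NGMI meets the target."""
    best = None
    for M in orders:
        m = M.bit_length() - 1
        h_grid = [float(m)] if uniform_only else [k / 10.0 for k in range(21, 10 * m + 1)]
        for h in h_grid:
            gmi = gmi_qam(M, h, snr_db)
            if 1.0 - (h - gmi) / m >= ngmi_target and (best is None or gmi > best[2]):
                best = (M, h, gmi)
    return best

snr_db, target = 16.0, 0.9
el = table_search(snr_db, target)
bl = table_search(snr_db, target, uniform_only=True, orders=(4, 16, 64, 256, 1024))
cap = np.log2(1.0 + 10.0 ** (snr_db / 10.0))
print("EL (M, H, GMI):", el)
print("BL (M, H, GMI):", bl)
print("capacity:", round(float(cap), 3))
```

Because the uniform formats are a subset of the EL grid in this sketch, the EL result can never fall below the BL result at the same SNR and NGMI target. A real implementation would precompute the table once over an SNR grid instead of integrating per subcarrier.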

The entropy loading advantages
We apply the three loading algorithms (EL, BL and EL-BG) to an IM-DD system with discrete multitone (DMT) modulation to reveal the EL advantages. For BL, the modulation formats are chosen from 4-QAM up to 1024-QAM, while for EL, the formats are PCS square QAM (i.e. 16/64/256/1024-QAM). While the detailed experimental setup and results are described in the next section, this section takes the SNR response measured in the experiment and performs a series of simulations to compare the loading algorithms. The SNR is measured over a 20-GHz bandwidth populated with 256 subcarriers, as shown in Fig. 2(a). Figure 2(b) shows the AIR (evaluated by GMI) as a function of NGMI for the three loading algorithms, where lower NGMI corresponds to higher GMI as explained at the end of Section 3. To explain the different behaviors of the loading algorithms in Fig. 2(b) in detail, we illustrate the channel capacity, source entropy and GMI versus frequency in Fig. 3. A straightforward observation is that both the entropy and the GMI follow the trend of the channel capacity very well for EL in Figs. 3(a) and 3(b), owing to its maximum flexibility to assign continuous entropies with fine frequency resolution. In contrast, when the groups of subcarriers are fixed for BL and EL-BG in Figs. 3(c)-3(f), the entropy curves are forced into staircase shapes. Consequently, the GMI curves also follow staircase trends to guarantee a fixed gap between entropy and GMI (i.e. H - GMI) within each group, because the NGMI has been fixed across the spectrum. Notably, in Fig. 2(b), the fine frequency resolution of EL gains only around 0.06 bits/symbol more AIR over EL-BG at 0.98 NGMI, and this gain shrinks to 0.02 bits/symbol at lower NGMI. In contrast, BL shows a much larger AIR penalty relative to both EL and EL-BG, as it lacks the precise adjustment of entropy. As a result, it is reasonable to compare BL with EL-BG to reveal the reason for this AIR gap.
This comparison is also more intuitive as both algorithms use the same loading groups. Figure 3 distinguishes two extreme NGMI values: in the left column, the NGMI is set to 0.98 for all the figures, corresponding to a system with small FEC OH; in the right column, the NGMI is selected within the capacity-approaching range, namely, the loading algorithms have maximized the AIR at that NGMI. When the NGMI is 0.98, EL-BG allows loading a higher entropy than BL onto each subcarrier, as seen by comparing Figs. 3(c) and 3(e). A higher source entropy is the prerequisite for achieving a larger AIR, manifested by the 0.6-bit/symbol GMI advantage of EL-BG over BL in Fig. 2(b). As the NGMI keeps decreasing, EL-BG maximizes the AIR at 0.88 NGMI and BL at 0.82 NGMI. Although the GMI gap shrinks in Fig. 2(b) with sufficiently high FEC OH, EL-BG retains a 0.4-bit/symbol GMI advantage over BL. This 7% AIR improvement is mainly contributed by the PCS shaping gain [7] originating from the Gaussian-like sources. Figures 3(g) and 3(h) show the GMI gaps of the three loading algorithms to the channel capacity. EL guarantees an approximately constant capacity gap for all the subcarriers; for a fine NGMI match of each subcarrier, EL-BG performs an extra power loading to adjust the SNR, and consequently its curve fluctuates around the EL curve and gives an averaged GMI similar to that of EL. In contrast, BL shows wider gaps than EL-BG across the entire spectrum, revealing that its uniform constellations cannot approach the capacity at any frequency even when it loads QAM levels with entropy well beyond the capacity. A seeming abnormality here (as well as in Fig. 5(b)) is that a few subcarriers achieve a GMI beyond the capacity. This is because the capacity density in both figures is calculated from the SNR with constant power loading across the spectrum, whereas the extra power loading of EL-BG and BL changes the original SNR response. This does not contradict the well-defined capacity of the overall DMT channel, though, as neither EL-BG nor BL achieves an overall GMI beyond it, as seen in Fig. 2(b).
Besides the difference in AIR, another remarkable phenomenon shown in Fig. 2(b) is that EL and EL-BG maximize the AIR at a higher NGMI, namely, a lower FEC OH, compared with BL. This is why we select an NGMI of 0.88 for EL (Fig. 3(b)) and EL-BG (Fig. 3(f)) but 0.82 for BL (Fig. 3(d)). Such an OH advantage is no doubt crucial for simplifying the design and reducing the computational complexity of FEC codes.

Experiment
We apply the NGMI-target EL and BL to a 100-Gb/s IM-DD system to verify the predictions from the simulation. More experimental details can be found in [25]. EL-BG is not demonstrated as it has similar performance to EL. The baseband signals are OFDM modulated with Hermitian symmetry to form a real-valued signal. The FFT size is 1024 with an oversampling rate of 0.5 and a cyclic prefix of 32. The digital signal drives an arbitrary waveform generator sampling at 80 GSa/s, resulting in a 20-GHz DMT signal. A 1549.44-nm distributed feedback (DFB) laser with a 3-dB bandwidth of 16 GHz is directly modulated. The signal is then transmitted through either a back-to-back (B2B) channel or 1-km standard single-mode fiber (SSMF). The receiver contains a 43-GHz PIN photodiode whose output is sampled by a 33-GHz real-time oscilloscope. The offline DSP performs frequency-domain channel equalization and then GMI and NGMI estimation. We measure the SNR versus subcarrier frequency for both the B2B and 1-km channels. Because of slightly different experimental conditions, the SNR response differs a little between the BL and EL measurements, as shown in Fig. 4(a). This change of conditions gives EL around 0.02 bits/symbol more capacity than BL, which is excluded below when we claim the AIR advantage of EL, for a fair comparison between the two loading algorithms. The SNR of the B2B EL system is used for the simulations throughout Section 4. In Fig. 4(a), as expected, the dispersion narrows the channel bandwidth and decreases the SNR at high frequencies after 1-km fiber. Figure 4(b) shows the GMI as a function of NGMI. The 1-km transmission decreases the GMI by around 0.3 bits/symbol for both BL and EL, corresponding to a 6-Gb/s rate penalty with 20-GHz bandwidth.
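The DMT framing described above can be sketched with a Hermitian-symmetric IFFT. The FFT size, cyclic prefix length, subcarrier count and sampling rate are taken from the text; placing the 256 data subcarriers on FFT bins 1 to 256 and leaving the DC bin empty are assumptions of this example, as are the function and variable names.

```python
import numpy as np

N_FFT, N_SC, N_CP = 1024, 256, 32     # FFT size, data subcarriers, cyclic prefix
FS = 80e9                             # AWG sampling rate in Sa/s

def dmt_symbol(qam, n_fft=N_FFT, n_cp=N_CP):
    """Map QAM symbols onto bins 1..len(qam) with Hermitian symmetry, then add the CP."""
    n_sc = len(qam)
    spectrum = np.zeros(n_fft, dtype=complex)
    spectrum[1:n_sc + 1] = qam                   # positive-frequency bins (assumed mapping)
    spectrum[-n_sc:] = np.conj(qam[::-1])        # mirrored bins: X[N-k] = conj(X[k])
    x = np.fft.ifft(spectrum)
    x = x.real                                   # imaginary part is numerically zero
    return np.concatenate([x[-n_cp:], x])        # prepend cyclic prefix

rng = np.random.default_rng(0)
qam = (rng.choice([-1, 1], N_SC) + 1j * rng.choice([-1, 1], N_SC)) / np.sqrt(2)
symbol = dmt_symbol(qam)

# Receiver side of an ideal channel: drop the CP and read the data bins back
recovered = np.fft.fft(symbol[N_CP:])[1:N_SC + 1]
print("samples per DMT symbol:", symbol.size)
print("subcarrier spacing (MHz):", FS / N_FFT / 1e6)
print("occupied bandwidth (GHz):", FS / N_FFT * N_SC / 1e9)
print("max recovery error:", np.max(np.abs(recovered - qam)))
```

With these parameters each subcarrier is 78.125 MHz wide, so 256 subcarriers fill the 20-GHz signal bandwidth, half of the 40-GHz Nyquist band of the 80-GSa/s generator.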
The experimental results shown in Fig. 4(b) agree well with the simulation in Fig. 2(b). When the NGMI is 0.97 (3% OH), the GMI is 4.92 bits/symbol for BL and 5.57 bits/symbol for EL, corresponding to a 13% AIR increase. As the NGMI keeps decreasing, the heavier FEC gradually outweighs the EL advantage of fine entropy granularity, leading to a narrower and narrower GMI gap between EL and BL. When both algorithms reach the maximum AIR, BL achieves a GMI of 5.61 bits/symbol while EL achieves 5.95 bits/symbol, corresponding to a 6% AIR increase. Figure 5 shows the corresponding capacity, entropy and GMI curves when both loading algorithms maximize the AIR. As expected, the GMI of EL almost overlaps with the capacity. In contrast, for BL, although a few subcarriers achieve a GMI beyond the capacity owing to the extra power loading that adjusts the subcarrier SNR, most others leave a gap to the capacity. Moreover, BL shows a staircase-like GMI following its entropy curve, indicating that the power loading has forced the NGMI to remain approximately constant across the spectrum. In terms of the FEC advantage, EL requires only 13.6% FEC OH (i.e. 0.88 NGMI) to maximize its AIR while BL requires 20.5% OH (i.e. 0.83 NGMI).
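The percentages quoted above follow directly from the measured GMI values. The short check below reproduces them; the helper names are hypothetical, and the conversion from bits/symbol to Gb/s simply multiplies by the 20-GHz signal bandwidth, as the text does for the 1-km penalty.

```python
def relative_gain_percent(gmi_el, gmi_bl):
    """AIR increase of EL over BL in percent."""
    return 100.0 * (gmi_el / gmi_bl - 1.0)

def rate_gbps(bits_per_symbol, bandwidth_ghz=20.0):
    """Rate change in Gb/s for a GMI change in bits/symbol over the DMT bandwidth."""
    return bits_per_symbol * bandwidth_ghz

print(f"gain at 0.97 NGMI: {relative_gain_percent(5.57, 4.92):.1f}%")
print(f"gain at maximum AIR: {relative_gain_percent(5.95, 5.61):.1f}%")
print(f"1-km penalty: {rate_gbps(0.3):.1f} Gb/s")
```

The two gains round to the 13% and 6% stated in the text, and the 0.3-bit/symbol penalty corresponds to 6 Gb/s.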

Summary
This paper reveals the superiority of entropy loading over bit loading in a 100-Gb/s bandwidth-constrained discrete multitone system: (i) under low forward error correction (FEC) overhead, it achieves a >10% improvement in the achievable information rate (AIR) through its capability of continuous source entropy adaptation; (ii) on approaching the capacity, it retains a 5-10% AIR gain owing to the Gaussian-like constellation shaping gain; (iii) it maximizes the AIR with much less FEC overhead. In summary, entropy loading offers the optimum solution to squeeze out the last few bits from a band-limited communication system with efficient forward error correction.