Optical Network Scaling: Roles of Spectral and Spatial Aggregation

As the bit rates of routed data streams exceed the throughput of single wavelength-division multiplexing channels, spectral and spatial traffic aggregation become essential for optical network scaling. These aggregation techniques reduce network routing complexity by increasing spectral efficiency to decrease the number of fibers, and by increasing switching granularity to decrease the number of switching components. Spectral aggregation yields a modest decrease in the number of fibers but a substantial decrease in the number of switching components. Spatial aggregation yields a substantial decrease in both the number of fibers and the number of switching components. To quantify routing complexity reduction, we analyze the number of multi-cast and wavelength-selective switches required in a colorless, directionless and contentionless reconfigurable optical add-drop multiplexer architecture. Traffic aggregation has two potential drawbacks: reduced routing power and increased switching component size.

Modern optical networks have mesh architectures formed by nodes that transmit, receive and route data, which are interconnected by links [2]. Links comprise one or more single-mode fibers (SMFs), each carrying numerous wavelength-division-multiplexed (WDM) channels. Nodes incorporate reconfigurable optical add-drop multiplexers (ROADMs) allowing individual WDM channels to be independently added, dropped or routed.
Optical networks have reached a point where the bit rates of individual routed data streams exceed the throughput of single WDM channels. The throughput per channel is determined by: (a) the symbol rate, which is limited by bandwidth and sampling-rate constraints in transmitter and receiver components [3]; (b) the modulation and coding scheme, which is limited by the achievable spectral efficiency [4]; and (c) the two polarization modes available for multiplexing in SMF. As the throughput per channel cannot be increased significantly in the foreseeable future, a solution is to increase the networking granularity from a single WDM channel to a group of WDM channels. A group of aggregated channels that are transmitted, routed and received as a unit is known as a superchannel [5].
The two dimensions available for forming superchannels are spectrum and space. A spectral superchannel is an aggregation of signals conveyed on adjacent carrier frequencies. Spectral superchannels have been studied widely [6-9], and their benefits for high-throughput transmission [9], system component integration [10], and reduction in the number of switching components [11] have been demonstrated. A spatial superchannel is an aggregation of signals conveyed in the orthogonal spatial dimensions of a multi-core fiber (MCF) or a multimode fiber (MMF). Spatial superchannels have been of interest recently [12,13], and their benefits for increasing per-fiber throughput [14] and enabling system component integration, such as replacing several single-mode amplifiers by one multi-mode amplifier [15], have been demonstrated. Beyond the link level, their potential for reducing switching complexity has been described [16], but not yet fully quantified.
In this paper, we study spectral and spatial superchannels in a unified perspective. We review the implementation and key attributes of both types of superchannels. Considering the link level, we show how these types of superchannels can reduce the number of fibers required for high-throughput transmission. Beyond the link level, we describe networking principles for these types of superchannels, and quantify how they can reduce the number of switching components required for high-throughput networking. We also describe potential drawbacks of spectral and spatial superchannels, such as reduced routing power.

Transmission of aggregated traffic
We define a channel as the modulated data stream corresponding to a single transceiver and a single optical carrier. In a standard WDM system, a channel encodes data in the two polarization modes of an SMF, occupies a bandwidth $\Delta\nu$, and is routed independently from other channels. Let $B$ denote the total optical bandwidth available for transmission. In a standard WDM system, the total number of channels available is $w_1 = \lfloor B/\Delta\nu \rfloor$, where $\lfloor\cdot\rfloor$ is the floor function.
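The channel count for a standard WDM system can be illustrated with a minimal Python sketch. The function name and the example values (a 4400 GHz band and 50 GHz channels) are assumptions chosen for illustration and do not come from the paper.

```python
def standard_wdm_channel_count(total_bandwidth_ghz: int, channel_bandwidth_ghz: int) -> int:
    """Whole channels that fit in the band: floor(B / channel bandwidth)."""
    if channel_bandwidth_ghz <= 0:
        raise ValueError("channel bandwidth must be positive")
    # Integer floor division avoids floating-point rounding at exact multiples.
    return total_bandwidth_ghz // channel_bandwidth_ghz


# Assumed example values: a 4400 GHz band carrying 50 GHz channels.
print(standard_wdm_channel_count(4400, 50))
```

Working in integer gigahertz keeps the floor operation exact; a partially filled slot at the band edge is simply discarded.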
In the following two subsections, we review transmission techniques that aggregate multiple channels in the spectral and spatial dimensions. Although we describe spectral and spatial aggregation techniques separately in these two subsections, we employ them together in the remainder of the paper.

Spectral aggregation
Spectral aggregation simplifies networking by routing groups of adjacent channels as a unit. We define a spectral aggregation factor $F$ as the number of channels aggregated into a spectral superchannel per spatial mode. Note that $F = 1$ corresponds to a standard WDM system. In elastic optical networking, spectral superchannels with variable or non-integer $F$ are considered [17,18]. In this paper, for simplicity, we assume $F$ is an integer and is constant over the entire optical band, although our major conclusions do not depend on this assumption.

In a non-overlapping spectral superchannel, the constituent channels may be encoded and decoded independently, and strict frequency and clock synchronization between constituent transmitters is not required. Constituent receivers employ an oversampling ratio $r_{os}$ in the range 1.25 to 2 [8,19-21], similar to standard WDM systems, and $r_{os}$ may be as low as 1, albeit with a performance penalty [22,23]. In waveband systems [24], constituent channels are equivalent to standard WDM channels, with the ratio of symbol rate to channel bandwidth, $R_s/\Delta\nu$, as high as about 0.85 [21]. In Nyquist WDM systems, constituent channels are strictly bandlimited by digital, analog or optical pulse shaping [25], allowing $R_s/\Delta\nu$ to increase to about 0.95 [26].

Overlapping spectral superchannels are described in Figs. 1(b) and 2(b). The spacing between constituent channels is equal to the symbol rate $R_s$, so the spectral efficiency can be increased by a factor up to $\Delta\nu/R_s$ as compared to a standard WDM system. No-guard-interval orthogonal frequency-division multiplexing (NGI OFDM) [27,28] can achieve this full spectral efficiency increase. In discrete Fourier transform-spread OFDM (DFT-spread OFDM), guard intervals are inserted between blocks to facilitate channel estimation [29,30]. The guard-interval overhead is typically about 0.1 [30]. Hence, the overall spectral efficiency of DFT-spread OFDM is similar to that of Nyquist WDM. Assuming a guard band is maintained between superchannels, the total number of overlapping superchannels that can be transmitted is the largest whole number of superchannel-plus-guard-band slots that fit within $B$.

In an overlapping spectral superchannel, maintaining orthogonality between constituent channels requires strict frequency and clock synchronization between constituent transmitters, e.g., by locking them to common frequency and clock references [31-34]. It also requires low relative phase noise between constituent carriers, which can be achieved using narrow-linewidth lasers [32,33]. Decoding a desired constituent channel while avoiding interference from overlapping channels requires a receiver bandwidth and sampling rate sufficient to capture the desired channel and relevant portions of the overlapping channels. In DFT-spread OFDM, each constituent channel has a nearly rectangular spectrum, and a receiver oversampling ratio $r_{os}$ of about 1.25 is sufficient [30]. In NGI OFDM, each constituent channel overlaps with several of its neighbors, and a receiver oversampling ratio $r_{os}$ of at least 4 is needed to minimize interchannel interference [8].
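The spectral-efficiency comparison in this subsection can be sketched numerically. The sketch below assumes that the quoted values of about 0.85 and 0.95 refer to the ratio of symbol rate to channel bandwidth, and that packing channels at a spacing equal to the symbol rate raises spectral efficiency by at most the inverse of that ratio. The function names and the 42.5 GBd and 50 GHz figures are illustrative assumptions, not values from the paper.

```python
def occupancy_ratio(symbol_rate_ghz: float, channel_bandwidth_ghz: float) -> float:
    """Fraction of the channel bandwidth occupied by the symbol rate."""
    return symbol_rate_ghz / channel_bandwidth_ghz


def max_overlap_gain(symbol_rate_ghz: float, channel_bandwidth_ghz: float) -> float:
    """Upper bound on the spectral-efficiency gain when the channel spacing
    shrinks from the channel bandwidth down to the symbol rate."""
    return channel_bandwidth_ghz / symbol_rate_ghz


# Assumed example: a 42.5 GBd channel in a 50 GHz slot (ratio 0.85).
ratio = occupancy_ratio(42.5, 50.0)
gain = max_overlap_gain(42.5, 50.0)
print(f"occupancy {ratio:.2f}, maximum gain {gain:.3f}")
```

Under these assumptions the bound shrinks toward 1 as pulse shaping tightens, which is why the benefit of overlapping superchannels over Nyquist WDM is modest.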

Spatial aggregation
Spatial aggregation simplifies networking by routing signals propagating in different spatial dimensions as a unit. We define a link spatial aggregation factor $S_L$ as the number of channels aggregated into a spatial superchannel (per carrier frequency) in a transmission fiber. $S_L$ should be distinguished from the node spatial aggregation factor $S_N$, which is the number of channels aggregated spatially at a node, as defined in Section 3.1. By analogy to spectral superchannels, we classify spatial superchannels into two categories: uncoupled and coupled spatial superchannels (see Table 2).

For an uncoupled-core MCF with $S_L$ cores, the spectral efficiency per MCF is $S_L$ times the spectral efficiency of an SMF. Such an MCF is analogous to a bundle of $S_L$ SMFs, but achieves a much higher spatial information density. Although it is desirable to pack the cores as closely as possible, coupling-induced crosstalk limits the achievable spatial information density. The tolerable crosstalk level depends on the modulation and coding scheme employed [35]. Typically, a tolerable crosstalk level can be achieved with a core separation of about 9-11 times the core radius [36-38], in conjunction with crosstalk-minimizing design features, such as trenches [39] or inhomogeneous cores [40]. The transmitter and receiver for an uncoupled spatial superchannel are equivalent to those for a collection of $S_L$ independent SMF systems. The ability to independently process signals received from different cores is an advantage of uncoupled spatial superchannels, but it requires that all components in the end-to-end network be designed to maintain low crosstalk between cores.
Coupled spatial superchannels, described in Figs. 3(b) and 4(b), are based on multiplexing signals in the orthogonal spatial modes of an MMF [41,42] or a coupled-core MCF [43], which has a smaller core spacing than an uncoupled-core MCF [44]. Because the modes overlap spatially despite being orthogonal, the spatial information density is higher than in SMF bundles or uncoupled-core MCFs. For an MMF or MCF with $S_L$ orthogonal spatial modes, the spectral efficiency per fiber in the linear regime increases in proportion to $S_L$, assuming mode-dependent loss is small [45,46].
In a coupled spatial superchannel transmitter, $S_L$ dual-polarization signals can be encoded independently or jointly [47,48] and launched into the $S_L$ orthogonal spatial modes, or into any near-unitary combination of these modes [49]. During propagation along the link, signals in different modes inevitably become coupled, to the degree that at the receiver it is necessary to use $2S_L \times 2S_L$ MIMO signal processing to equalize modal crosstalk and modal dispersion (MD) [50,51]. Typically, the complexity of this MIMO processing is larger than that required to compensate for chromatic dispersion (CD) [52,53]. Because mode coupling cannot be avoided, it is not necessary to design network components to minimize mode coupling, which is an advantage of coupled spatial superchannels. In fact, it may be desirable to induce strong mode coupling in order to minimize nonlinear effects [54,55], minimize modal delay spread to control MIMO signal processing complexity [50,52], and minimize mode-dependent loss to maximize spectral efficiency and minimize outage probability [45,56].
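To make the size of the receiver MIMO equalizer concrete, the following Python sketch computes its dimensions from the number of spatial modes, assuming two polarizations per mode as described above. The function names are hypothetical, and counting one filter per input-output pair is an assumption about a generic fully connected equalizer rather than a statement about any specific implementation.

```python
def mimo_dimensions(link_spatial_modes: int) -> tuple:
    """Equalizer dimensions for a coupled spatial superchannel:
    two polarizations per spatial mode on both input and output."""
    if link_spatial_modes < 1:
        raise ValueError("need at least one spatial mode")
    n = 2 * link_spatial_modes
    return (n, n)


def mimo_filter_count(link_spatial_modes: int) -> int:
    """Number of input-output filter paths in a fully connected equalizer
    (assumption: one filter per matrix entry)."""
    rows, cols = mimo_dimensions(link_spatial_modes)
    return rows * cols


# One spatial mode reduces to the familiar 2x2 polarization equalizer.
for modes in (1, 3, 6):
    print(modes, mimo_dimensions(modes), mimo_filter_count(modes))
```

The quadratic growth in filter paths is one way to see why MIMO processing for coupled spatial superchannels tends to dominate over CD compensation.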

Fibers per link
Thus far, we have described spectral and spatial aggregation separately. In the remainder of this paper, we consider using them in tandem, with spectral aggregation factor $F$ and spatial aggregation factors $S_L$ and $S_N$. The number of fibers required per link, $P_{F,S_L}$, is computed using the ceiling function $\lceil\cdot\rceil$ and the number of superchannels per fiber per bandwidth $B$ given in Sec. 2.1 for non-overlapping or overlapping frequency superchannels. $P_{1,1}$ denotes the required number of fibers for a standard WDM system. Figure 5 shows the number of fibers required per link.
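A rough Python sketch of a fiber count is given below. It is not the paper's expression, which is not reproduced in this text. It assumes that each fiber carries a number of channels equal to the product of superchannels per fiber, the spectral aggregation factor and the link spatial aggregation factor, and that the demand is expressed as a number of channels. All names and example values are hypothetical.

```python
def fibers_per_link(channels_demanded: int,
                    superchannels_per_fiber: int,
                    spectral_factor: int,
                    link_spatial_factor: int) -> int:
    """Ceiling of demanded channels over channels carried per fiber.

    Assumption: channels per fiber = superchannels per fiber
    x channels per superchannel per mode x spatial modes per fiber.
    """
    channels_per_fiber = superchannels_per_fiber * spectral_factor * link_spatial_factor
    if channels_per_fiber <= 0:
        raise ValueError("fiber capacity must be positive")
    # Integer ceiling division.
    return -(-channels_demanded // channels_per_fiber)


# Assumed demand of 1000 channels; 88 channels per band without aggregation,
# or 22 superchannels of 4 channels each on a 7-mode fiber.
print(fibers_per_link(1000, 88, 1, 1))
print(fibers_per_link(1000, 22, 4, 7))
```

In this simplified picture, spectral aggregation alone changes the count only through guard-band savings, while spatial aggregation divides it roughly by the number of modes per fiber.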

Networking of aggregated traffic
In this section, we study how spectral and spatial aggregation affect data traffic networking, in terms of switch implementation complexity and routing flexibility.

Superchannel networking principles
All spectral and spatial aggregation techniques increase switching granularity from the level of constituent channels to that of superchannels, decreasing optical switching hardware complexity, but potentially sacrificing routing flexibility, which is commonly quantified in terms of routing power [57,58].
In the strictest definition of a superchannel, all constituent spectral and spatial channels are transmitted, routed and received as a unit [5], as we assume in this paper. While not analyzed here, some schemes enable individual constituent channels to be dropped and added independently at intermediate nodes.
In non-overlapping spectral superchannels, independent dropping and adding of constituent channels is possible provided that there is sufficient guard band between the channels to accommodate the transition band of any wavelength-selective switching components. By contrast, in overlapping spectral superchannels, because of spectral overlap, dropping a constituent channel is typically possible but adding one is extremely difficult without regenerating the entire superchannel.
Similarly, in uncoupled spatial superchannels, independent dropping and adding of constituent channels in different cores is possible. In contrast, in coupled spatial superchannels, propagation induces frequency-dependent crosstalk between channels in different modes [50,59], which complicates independent dropping and adding of constituent channels.

Spatial aggregation along a link by a factor $S_L$ involves transmission in an MCF or MMF with $S_L$ cores or spatial modes, as described in Sec. 2.2 above. The case $S_L = 1$ corresponds to transmission in SMF. Additional spatial aggregation at a node can further reduce the routing hardware complexity of the node. This may be achieved at the ROADM input and output ports by coupling multiple fibers to/from the link into very short sections of uncoupled-core MCF and switching all its cores as a unit. These MCFs may have single-mode or multi-mode cores, depending on the type of fibers used in the link. We define a node spatial aggregation factor $S_N$ as $S_L$ times the number of fibers aggregated at the ROADM input and output ports. Additional spatial aggregation at nodes also changes the number of fibers required per link.

ROADM architecture
ROADMs at the nodes route superchannels between incoming and outgoing links and allow some superchannels to be dropped and added [60]. The two main functionalities of ROADMs are:
1. Express functionality routes superchannels from an incoming link to an outgoing link without modifying their content. An optical path between one port of the incoming link and one port of the outgoing link is required.
2. Add/drop functionality enables transceivers at the node to change the content of one or more superchannels at the node. Optical paths between one port of the incoming link and one drop port, and between one add port and one port of the outgoing link, are required.
The degree of a ROADM, $d$, is defined as the number of incoming or outgoing links, and typically ranges from 2 to 6.
Depending on the implementation, ROADMs may comprise components for wavelength switching, wavelength selection, (de)multiplexing, power splitting, optical amplification, and optical cross-connect. Current WDM systems often use ROADMs based on a "broadcast-and-switch" architecture [2]. This architecture uses power splitting on all incoming fiber ports, which reduces the signal-to-noise ratio. Moreover, as add/drop ports are wavelength-dependent, it is limited to colored operation.
To enable flexible networking in complicated mesh topologies, ROADMs should have three properties: they should be colorless (operating at any wavelength between any two ports), directionless (operating in any direction for any degree) and contentionless (enabling different routing paths simultaneously for different wavelengths without affecting other routing operations) [61,62].
In this paper, we focus on a "route-and-select" ROADM architecture, as shown in Fig. 6, which is colorless, directionless and contentionless [11]. In this architecture, wavelength-selective switches (WSSs) are used on both incoming and outgoing fibers for combining and selecting wavelengths. Some of the WSS ports are allocated for express functionality, and are directly connected to the corresponding ports of the desired link. The remaining WSS ports are allocated for add/drop functionality, and connected to the corresponding set of multi-cast switches (MCSs). MCSs are used for connections between add/drop ports, and are capable of dropping/adding any wavelength to/from any add/drop port to/from any direction, providing full flexibility to the architecture. Each incoming/outgoing fiber in a link corresponds to a set of MCSs. Optical amplifiers (not shown in Fig. 6) are used at MCSs and WSSs to compensate for their losses. Fiber shuffle panels (FSPs) are used as passive connectors to simplify cabling.

Switching components per node
The major challenge in scaling optical networks to high throughput is the limited number of ports available on WSSs and MCSs. By increasing switching granularity, the number of switching components required to route a given throughput can be reduced.
In our analysis, we assume full contentionless express and add/drop functionalities such that (i) there exists an optical path for routing any superchannel at any wavelength in any direction, and (ii) there exists an optical path and a corresponding add/drop transmitter/receiver pair for each superchannel in the optical band. We also assume that each spectral or spatial superchannel is routed as a single unit and no further aggregation is applied at intermediate nodes. We compute the minimum number of switching components required to support a given aggregate throughput.
Let $N$ and $M$ denote the number of selection ports and the number of splitting ports of the MCSs. Note that the number of splitting ports $M$ determines the maximum possible ROADM degree $d$. The number of $M \times N$ MCSs required for adding or dropping all spectral-spatial superchannels is $\lceil w_F/N \rceil$ per fiber per direction, where $w_F$ is the number of superchannels per fiber. The number of $1 \times N$ WSSs required is determined by the number of optical paths needed for express and add/drop functionalities. The number of ports required for express functionality is $d$ in both incoming and outgoing directions. The number of ports required for add functionality in the outgoing direction is given by the number of MCSs connected to transmitters. The number of ports required for drop functionality in the incoming direction is given by the number of MCSs connected to receivers. For a single-stage WSS per fiber, as shown in Fig. 6, a maximum of $N - d$ MCSs for add or drop can be supported. To provide a larger number of optical paths, cascaded WSSs should be used.

Spectral aggregation reduces the number of routable channels within the optical band, reducing the number of MCSs required to add/drop the entire band, and reducing the number of WSSs required to form connections with the MCSs. Spatial aggregation allows a given number of MCSs and WSSs to add/drop and route a higher aggregate throughput in proportion to $S_N$.
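The port-counting argument above can be sketched in Python. The sketch assumes the MCS count per fiber per direction is the ceiling of the number of superchannels per fiber divided by the MCS selection ports, and that a single-stage WSS reserves as many ports as the ROADM degree for express traffic. The function names and the example values (88 channels, 16 ports, degree 4, and 22 superchannels for an aggregation factor of 4 with guard bands ignored) are illustrative assumptions.

```python
def mcs_per_fiber_per_direction(superchannels_per_fiber: int, ports: int) -> int:
    """Ceiling of superchannels per fiber over MCS selection ports."""
    if ports <= 0:
        raise ValueError("port count must be positive")
    return -(-superchannels_per_fiber // ports)  # integer ceiling division


def max_single_stage_mcs(ports: int, degree: int) -> int:
    """MCSs a single-stage WSS can serve once express ports are reserved."""
    return max(ports - degree, 0)


def needs_cascaded_wss(superchannels_per_fiber: int, ports: int, degree: int) -> bool:
    """True when the MCS count exceeds what one WSS stage can connect."""
    required = mcs_per_fiber_per_direction(superchannels_per_fiber, ports)
    return required > max_single_stage_mcs(ports, degree)


# Assumed example: 88 channels (no aggregation) versus 22 superchannels.
for w in (88, 22):
    print(w, mcs_per_fiber_per_direction(w, 16), needs_cascaded_wss(w, 16, 4))
```

With these example numbers the MCS count per fiber falls from 6 to 2 under spectral aggregation, which illustrates why switching granularity matters more for component count than for fiber count.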

Routing power
A drawback of decreasing the number of ports is a reduction in the number of routable input and output port combinations. Following [57,58,60], we define the routing power as $R = \log(C_{\mathrm{ROADM}}) / \log(C_{\mathrm{network}})$, where $C_{\mathrm{ROADM}}$ is the number of connection states supported by a ROADM and $C_{\mathrm{network}}$ is the number of all possible states supported by the network. $R$ is a measure of node-to-node connectivity in a mesh optical network. The number of traffic demands that can be satisfied by a ROADM is found to be roughly linearly correlated with $R$ [60], and the wavelength-blocking probability of the network decreases with $R$ [58]. Note that $R$ is lower bounded by 0 (a ROADM with no routing functionality) and upper bounded by 1 (a ROADM with full routing functionality for the assumed channel granularity). To determine $C_{\mathrm{network}}$, we consider a standard WDM system with a spectral granularity of $\Delta\nu$ and assume that all-optical wavelength conversion is not available.
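The routing power definition can be evaluated with a short Python sketch, assuming it is the ratio of the logarithms of the two state counts as the stated bounds of 0 and 1 suggest. The function name and the state counts used below are placeholders, not values from the paper.

```python
import math


def routing_power(roadm_states: int, network_states: int) -> float:
    """Ratio of log connection states: ROADM over full network.

    A single supported state gives 0; supporting every network state gives 1.
    """
    if roadm_states < 1 or network_states < 2:
        raise ValueError("need roadm_states >= 1 and network_states >= 2")
    if roadm_states > network_states:
        raise ValueError("a ROADM cannot support more states than the network")
    # math.log accepts arbitrarily large Python integers.
    return math.log(roadm_states) / math.log(network_states)


# Placeholder counts chosen only to exercise the bounds.
print(routing_power(1, 10**6))
print(routing_power(10**3, 10**6))
print(routing_power(10**6, 10**6))
```

Because the measure is a ratio of logarithms, it is independent of the logarithm base, and halving the exponent of the state count halves the routing power.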
We first analyze the routing power reduction for the express functionality. In determining $C_{\mathrm{network}}$, $P_{1,1}$ fibers in both incoming and outgoing links are used. Each channel can be routed in any direction including back to the incoming direction, which yields $d!$ options. Considering all wavelength channels, the routing power $R$ is roughly inversely proportional to $F$ for large $F$ and to $S_N$ for large $S_N$.

We now analyze the routing power reduction for the add/drop functionality. We first consider the total number of drop options for an incoming direction. In the standard WDM system, these are counted using the binomial coefficient $\binom{n}{k}$, which represents the number of distinct $k$-combinations among $n$ elements. The number of add options is then counted, and the drop and add options together give the total number of add/drop options per direction. Similar to the express functionality, the routing power $R$ is roughly inversely proportional to $F$ for large $F$ and to $S_N$ for large $S_N$.

By increasing the network granularity from the channel level to the superchannel level, the major compromise is a reduction of the node-to-node connectivity, i.e., the number of paths available between any two nodes in the network. The routing power decreases with increases in both $F$ and $S_N$, which reduces the total number of traffic demands that may be satisfied by a ROADM. We discuss further implications of the routing power in Sec. 5.

Other considerations
In this section, we highlight additional considerations that are important for evaluating spectral and spatial aggregation techniques.

Switch implementation
Spectral and spatial superchannels pose special requirements for the MCSs and WSSs that comprise ROADMs.
Spectral superchannels require WSSs that can switch variable-width frequency bands with minimal spectral gaps in order to make efficient use of the available optical spectrum. These requirements are satisfied by WSSs using liquid crystal-on-silicon (LCoS) switching planes [28,63,64]. LCoS WSSs are now widely deployed in optical networks.
Spatial superchannels require switches that can accommodate multi-core or multi-mode signals while achieving port isolation and spectral selectivity (in a WSS) similar to those of single-mode switches. These requirements may be met by remapping multi-mode beams to/from a plurality of single-mode beams [65-67], but at the cost of significantly increasing the switch's internal complexity. An attractive alternative switch design strategy is to start with a single-mode switch and simply scale certain internal dimensions to accommodate the multi-mode or multi-core beam [68]. The ratio between the radii of the multi-mode or multi-core beam and the single-mode beam is represented by a scaling parameter. Port isolation and spectral selectivity can be maintained if various internal dimensions in the switch are scaled by different powers of the scaling parameter. Several different design strategies are possible. For example, to scale an LCoS WSS while maintaining the same number of ports, the LCoS pixel pitch may be scaled by the inverse of the scaling parameter, the LCoS switching plane size may be scaled in proportion to the scaling parameter, and the port spacing may be scaled by yet another power of the scaling parameter. The choice of switch scaling parameter depends on the type of spatial superchannel and the fiber design.
Uncoupled spatial superchannels require the largest values of the scaling parameter. When using identical uncoupled-core MCFs for transmission and switching, the scaling parameter is determined by the core spacing, which must be far larger than the individual core radii to minimize crosstalk in long-haul transmission. Assuming hexagonally packed cores with typical core separations of 9-11 times the core radius [36-38] yields a scaling parameter of about 10-12 for $S_L = S_N = 7$ and about 19-23 for $S_L = S_N = 19$, values that would result in excessively large switches. Whether using uncoupled-core MCFs or multiple SMFs for transmission, these may first be coupled into very short sections of uncoupled-core MCF at the switches, as explained in Sec. 3.1. Because these sections of MCF at the switches are very short, a smaller core spacing can maintain permissible crosstalk levels. Using the model in [40] for an MCF of 10-cm length, we estimate that a core separation of about 6.5-8 times the core radius with trenches [39] can yield tolerable crosstalk levels (less than -35 dB). Hence, coupling of multiple SMFs at ROADMs yields a scaling parameter of about 7.5-9 for $S_N = 7$ and about 14-17 for $S_N = 19$. In all cases of uncoupled spatial superchannels, the scaling parameter scales as $S_N^{0.5}$ for large $S_N$.
Coupled spatial superchannels permit smaller values of the switch scaling parameter. For coupled-core MCFs, the core spacing is chosen to optimize the transmission properties, and the optimized core separation is about 4.5-6 times the core radius [44]. Hence, the required scaling parameter is about 5.5-7 for $S_L = S_N = 7$ and about 10-13 for $S_L = S_N = 19$.
The smallest values of the scaling parameter are obtained using MMFs. Assuming graded-index MMFs and $S_L = S_N$, the required scaling parameter is about 1.2-1.3 for $S_L = S_N = 3$ and about 1.7-1.9 for $S_L = S_N = 15$. For large $S_N$, the scaling parameter scales approximately as $S_N^{0.2}$ to $S_N^{0.25}$ [68]. The small values of the scaling parameter in MMF make coupled spatial superchannels in MMF promising for scaling up switches while maintaining a simple switch design and a constant number of switch ports.
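As a quick numerical check of the MMF scaling trend, the Python sketch below evaluates power laws with exponents 0.2 and 0.25 in the node spatial aggregation factor. Treating the trend as an exact power law with unit prefactor is an assumption made only for illustration, since the text states it as an approximation for large aggregation factors. The function name is hypothetical.

```python
def mmf_scaling_range(node_spatial_factor: int) -> tuple:
    """Approximate lower and upper switch scaling parameter for MMF,
    assuming unit-prefactor power laws with exponents 0.2 and 0.25."""
    if node_spatial_factor < 1:
        raise ValueError("aggregation factor must be at least 1")
    return (node_spatial_factor ** 0.2, node_spatial_factor ** 0.25)


# Compare against the ranges quoted for 3 and 15 spatial modes.
for s in (3, 15):
    low, high = mmf_scaling_range(s)
    print(f"S = {s}: about {low:.2f} to {high:.2f}")
```

The computed values fall close to the quoted ranges for 3 and 15 modes, which suggests the power-law trend is already a reasonable guide at moderate mode counts, whereas the square-root trend for uncoupled cores grows much faster.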
In WSSs for coupled spatial superchannels, the frequency-dependent transfer function is affected by the complex field profiles of multi-mode beams [67,68]. Different pure modes have slightly different passband responses, and are coupled to other modes, particularly at frequencies near the passband edge. When multiple modes are present, interference between modes can alter the transmission and coupling coefficients, shifting the passband center frequency and changing its bandwidth [68]. If these phenomena are found to negatively impact system performance, they can be minimized by modestly increasing the scaling parameter [68]. Mode-coupling matrices may be used to compute the mixed modes having the narrowest or widest bandwidths, or having the largest center-frequency offsets. In a system with many cascaded WSSs and strong mode coupling, the end-to-end response per switch may be characterized by a mode-averaged transmission coefficient [68].

Optical amplification
Spectral superchannels pose requirements similar to standard WDM channels for optical amplifiers, requiring careful design of amplifiers to minimize wavelength-dependent gain, and typically requiring active control of wavelength-dependent gain [69]. Spatial superchannels may share optical amplifiers, provided they are designed to amplify signals in multiple spatial dimensions. This may reduce the number of amplifiers per link by a factor $1/S_L$, and may reduce the number of amplifiers within a node by a factor $1/S_N$. The sharing of control systems and management interfaces represents an opportunity to reduce the power consumption per spatial dimension [70].
A potential challenge for spatially multiplexed amplification is to provide equal gain for each spatial dimension, since gain differences reduce the achievable spectral efficiency and may cause outage in spatial superchannels [45,46]. Approximately equal gain per spatial dimension can be achieved in multi-core EDFAs [72] or in multi-mode EDFAs with optimized doping profiles [74]. Active control of spatially dependent gain may help minimize spectral efficiency loss and outage probability. Adaptive equalization of mode-dependent gain in multi-mode EDFAs using a spatial light modulator in the pump beam or at the amplifier output was proposed in [77].

Fiber nonlinearity
Distortion and crosstalk caused by the Kerr nonlinearity limit achievable spectral efficiency in fiber systems [4].
Spectral superchannels (excluding waveband superchannels) may be subject to more nonlinear degradation than standard WDM channels for several reasons. All the methods have a denser power spectrum than standard WDM channels. Nyquist WDM and DFT-spread OFDM have slightly higher peak-to-average power ratios [8] than standard WDM channels, while NGI OFDM has a significantly higher peak-to-average power ratio [6]. Nevertheless, in Nyquist WDM and NGI OFDM, nonlinear phase shifts are dispersed by CD, and the two methods are expected to be subject to similar nonlinear degradation [8,78], which is confirmed by numerical [79] and experimental [80] studies. In DFT-spread OFDM, the power per subcarrier is low and the subband bandwidth can be optimized to reduce nonlinear interactions, so less nonlinear signal degradation is expected than for NGI OFDM [78,81]. Numerical [81] and experimental [82] studies demonstrate about 1 dB better performance for DFT-spread OFDM than for NGI OFDM. Despite these various comparisons, a complete comparison of nonlinear performance among all spectral superchannel techniques does not exist, and is an important topic for future research.
Uncoupled spatial superchannels are expected to have an achievable spectral efficiency per spatial mode similar to single-mode systems, since propagation in uncoupled-core MCF and SMF is fundamentally similar [44].
Coupled spatial superchannels in coupled-core MCF or MMF are subject to complex nonlinear effects. The scaling of spectral efficiency per spatial mode depends on how intra- and inter-modal effective areas scale with the number of modes and how inter-modal interactions are affected by phase matching and envelope walkoff [54,55,83,84]. Strong mode coupling can mitigate nonlinear effects [54,55]. Coupled-core MCFs offer advantages of large effective areas and strong mode coupling [44]. The comparison of achievable spectral efficiency per mode in coupled spatial superchannels to single-mode systems is still unclear, and is an important topic for future research.

Transceiver integration
Both spectral and spatial superchannels have the potential to reduce transceiver cost per bit through electronic and photonic integration [5,16] and through sharing of control systems and management interfaces. Spatial superchannels may benefit from sharing a single laser for all spatial dimensions. The sharing is strictly required for coupled spatial superchannels (and is shown in Fig. 3(b)) but also possible for uncoupled spatial superchannels (although it is not shown in Fig. 3(a)).

Signal processing complexity
The various spectral and spatial aggregation methods have different implications for signal processing complexity, which is a major factor affecting transceiver power consumption.
Non-overlapping superchannels based on Nyquist WDM typically employ digital pulse shaping at the transmitter and digital matched filtering at the receiver, e.g., each using a root-raised-cosine filter. The required length of a time-domain filter (or its equivalent in a frequency-domain filter) scales inversely with the excess bandwidth factor [85]. Even with excess bandwidth factors as low as 0.05, the required filter length is only about 40 symbols [85]. This is far smaller than the filter length required to equalize CD, which is about 500 symbols for a 2000-km long-haul system [52]. Hence, digital pulse shaping for Nyquist WDM is expected to contribute modestly to overall signal processing complexity.
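As a rough illustration of this scaling argument, the following sketch assumes that the pulse-shaping filter length is exactly proportional to the inverse of the excess bandwidth factor, with the proportionality constant calibrated to the single point quoted above (about 40 symbols at a factor of 0.05). The function name and the calibration constant are illustrative assumptions and are not taken from [85].

```python
# Hypothetical helper: pulse-shaping filter length vs. excess bandwidth factor.
# Assumption: length = K_SYMBOLS / beta, with K_SYMBOLS fixed by the quoted
# point (about 40 symbols at beta = 0.05). This is a calibration, not a design rule.
K_SYMBOLS = 2.0

# CD equalizer length quoted in the text for a 2000-km long-haul system.
CD_FILTER_SYMBOLS = 500


def pulse_shaping_length(beta: float) -> float:
    """Approximate root-raised-cosine filter length, in symbols."""
    if beta <= 0:
        raise ValueError("excess bandwidth factor must be positive")
    return K_SYMBOLS / beta


for beta in (0.05, 0.1, 0.2):
    length = pulse_shaping_length(beta)
    share = length / CD_FILTER_SYMBOLS
    print(f"beta={beta}: ~{length:.0f} symbols, {share:.0%} of the CD filter length")
```

Under these assumptions, even the smallest excess bandwidth factor considered gives a pulse-shaping filter that is less than a tenth of the CD filter length, which is the sense in which its contribution is modest.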
Overlapping spectral superchannels based on DFT-spread OFDM can be designed to have small spectral overlap between constituent channels and can be implemented with low oversampling ratios, e.g., $r_{os} = 1.25$ [32]. In a straightforward design, DFT-spread OFDM requires one extra DFT and IDFT at the transmitter and receiver, respectively [82], which increase signal processing complexity substantially. Subband signal processing architectures can reduce the complexity of equalizing CD, polarization mode dispersion (PMD) or MD for DFT-spread OFDM [86,87]. In NGI OFDM, as mentioned above, each constituent channel overlaps with several neighbors, and a receiver oversampling ratio $r_{os}$ of at least 4 is needed to minimize interchannel interference [8,80]. In NGI OFDM, one analog-to-digital converter may be shared among two or more constituent channels, but at the cost of degraded performance [29] or a higher required sampling rate. Since the computational complexity per information bit for equalizing CD, PMD or MD scales at least in proportion to the oversampling ratio $r_{os}$ [52], the computational complexity for NGI OFDM is higher than for standard WDM systems and other spectral superchannel techniques. The signal processing complexity of overlapping spectral superchannels is increased if lasers with large phase noise or frequency offset are used, which necessitate interchannel interference cancellation by digital filtering [88].
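The oversampling argument can be made concrete with a small sketch. It assumes that the equalization cost per information bit is exactly proportional to the oversampling ratio, which the text states only as a lower bound, and it uses the two ratios quoted above (1.25 and 4). The dictionary and function names are illustrative.

```python
# Oversampling ratios quoted in the text; the dictionary name is illustrative.
OVERSAMPLING_RATIO = {"DFT-spread OFDM": 1.25, "NGI OFDM": 4.0}


def relative_equalization_cost(r_os: float, r_os_ref: float) -> float:
    """Relative cost per bit, assuming exactly linear scaling in the oversampling ratio.

    The text states that cost scales *at least* linearly, so this is a lower bound
    on the true ratio when r_os > r_os_ref.
    """
    if r_os <= 0 or r_os_ref <= 0:
        raise ValueError("oversampling ratios must be positive")
    return r_os / r_os_ref


ratio = relative_equalization_cost(
    OVERSAMPLING_RATIO["NGI OFDM"], OVERSAMPLING_RATIO["DFT-spread OFDM"]
)
print(f"NGI OFDM needs at least {ratio:.1f}x the equalization cost per bit")
```

This comparison covers only the equalization of CD, PMD or MD; it does not include the extra DFT and IDFT that a straightforward DFT-spread OFDM design requires.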
Uncoupled spatial superchannels have a signal processing complexity per bit equivalent to standard WDM systems. Moreover, it is possible to reduce the complexity by sharing common subsystems (e.g., phase/frequency recovery [89]) among different spatial dimensions.
Coupled spatial superchannels require $2S_L \times 2S_L$ MIMO equalization [50,51]. The computational complexity of MIMO equalization is an important issue because of the spatial dimensionality $S_L$ and the potentially large delay spread resulting from MD [50]. Since the MIMO channel may change on sub-millisecond time scales [90], the MIMO frequency-domain equalizer should be adaptive [53]. Using optimized frequency-domain MIMO equalization, the complexity per information bit scales sublinearly in the MD delay spread and in $S_L$ [52,53].
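To make the dimensionality concrete, the sketch below counts the equalizer coefficients per frequency bin for a coupled spatial superchannel with link spatial aggregation factor S_L, assuming two polarizations per spatial mode, so that the equalizer matrix has 2 S_L inputs and 2 S_L outputs. The function names are illustrative. The count describes the size of the equalizer matrix only; it is not the complexity per information bit discussed above.

```python
def mimo_dimension(s_l: int) -> int:
    """Number of MIMO inputs (and outputs): two polarizations per spatial mode."""
    if s_l < 1:
        raise ValueError("spatial aggregation factor must be at least 1")
    return 2 * s_l


def coefficients_per_bin(s_l: int) -> int:
    """Entries in the square equalizer matrix applied at one frequency bin."""
    n = mimo_dimension(s_l)
    return n * n


for s_l in (1, 3, 6):
    n = mimo_dimension(s_l)
    print(f"S_L={s_l}: {n}x{n} MIMO, {coefficients_per_bin(s_l)} coefficients per bin")
```

With S_L = 1 this reduces to the 2 × 2 polarization equalizer of a single-mode coherent receiver.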

Discussion
Spectral and spatial aggregation can reduce the number of fibers required for high-throughput transmission and reduce the number of components required for high-throughput switching, but at the cost of a reduced routing power. The reduced number of fibers and switching components is expected to reduce network capital costs considerably. The various spectral and spatial aggregation methods have different advantages and disadvantages.
Among options for non-overlapping spectral superchannels, Nyquist WDM is most promising because it yields the highest spectral efficiency despite a slightly higher DSP complexity. Among options for overlapping spectral superchannels, DFT-spread OFDM is most promising because it requires the lowest oversampling ratio and yields the best nonlinear performance. Nyquist WDM and DFT-spread OFDM achieve similar spectral efficiencies, and the choice between them is less obvious. DFT-spread OFDM has somewhat higher signal processing complexity and requires synchronization of all transmitters to frequency and clock references.
Uncoupled spatial superchannels require only $2 \times 2$ MIMO signal processing at the receiver, but require designing all the components in a network to ensure low end-to-end crosstalk, and require switching components with substantially increased complexity or size, owing to large values of the scaling parameter in MCFs. Transmission through uncoupled-core MCFs is scalable to only limited spatial aggregation factors, because MCFs become inflexible beyond a certain number of cores. This limitation can be overcome by transmitting through multiple SMFs and using uncoupled-core MCFs to form spatial superchannels only inside the ROADMs. Coupled spatial superchannels require higher-dimensionality MIMO signal processing, which may increase transceiver power consumption, and require control of end-to-end mode-dependent loss and gain in a network to prevent outage or a loss of spectral efficiency. But coupled spatial superchannels in MMF enable scaling switching components to large spatial aggregation factors with only a modest increase in size, owing to the small values of the scaling parameter in MMFs.
We have used deterministic routing power as a simple measure to quantify connectivity in a network. The random nature of traffic demands, although ignored in this paper, can be an important factor governing network performance. Assuming a fixed superchannel bit rate, if traffic demands require throughputs much larger than the superchannel bit rate with high probability, a reduced routing power has minimal impact on network efficiency. If those conditions are not satisfied, then employing variable spectral-spatial aggregation can improve network efficiency, but may increase network management complexity.
The various options enabling variable spectral-spatial aggregation offer different advantages and drawbacks. Combining variable non-overlapping spectral with variable uncoupled spatial aggregation provides the greatest rate flexibility, but does not minimize optical switching hardware complexity. Alternatively, combining variable non-overlapping spectral with fixed coupled spatial aggregation may provide sufficient rate flexibility while minimizing optical switching hardware complexity. The analysis and implementation of flexible spectral-spatial aggregation are important topics for future research.
We have assumed that aggregation is applied at the source node and that the corresponding disaggregation is applied at the destination node. Further aggregation and disaggregation at intermediate nodes, known as grooming [91,92], can further improve network efficiency, particularly when traffic demands are highly variable. It is desirable to perform grooming without requiring O-E-O regeneration at the intermediate nodes, and the various options enabling variable aggregation differ in this regard. Non-overlapping spectral and uncoupled spatial aggregation may allow components in individual frequency bands or spatial dimensions to be added or dropped optically at intermediate nodes (see Sec.

Conclusion
We have reviewed techniques for transmitting and switching traffic that is aggregated in spectrum and space. We have shown that spectral and spatial aggregation can reduce network routing complexity by reducing the required number of fibers per link and the required number of switching components per node. The number of fibers required per link is determined by the link throughput and the per-fiber transmission capacity. Spectral aggregation yields a modest decrease in the number of fibers, whereas spatial aggregation reduces the number of fibers in inverse proportion to the spatial aggregation factor.
The number of switching components required per node is determined by the node throughput, the ROADM architecture, the per-fiber transmission capacity and the number of ports per switch. Focusing on a colorless, directionless and contentionless ROADM architecture, we have demonstrated that combined spectral and spatial aggregation can significantly reduce the required number of switching components. A common drawback of spectral and spatial aggregation is a reduction in the routing power, which reduces the number of traffic demand variations that can be satisfied by the network.
We have discussed several issues that are relevant for various implementations of spectral and spatial superchannels. Among options for spectral superchannels, Nyquist WDM and DFT-spread OFDM are the two most promising. Among options for spatial superchannels, coupled spatial superchannels in MMF are by far the most favorable in terms of optical switch scaling.
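The inverse-proportional dependence on the spatial aggregation factor can be sketched numerically. The example below uses the parameter values quoted in the Fig. 5 caption (35 Gbaud, 50-GHz spacing, QPSK, 20% FEC overhead, full C band) and adds several assumptions that are not stated in the text: a usable C-band width of 4.4 THz, polarization multiplexing, a net rate equal to the gross rate divided by (1 + overhead), and no spectral aggregation gain (standard WDM spectral efficiency). The function names are illustrative, and this is not the paper's fiber-count expression.

```python
import math

SYMBOL_RATE_BAUD = 35e9      # R_s, from the Fig. 5 caption
CHANNEL_SPACING_HZ = 50e9    # channel spacing, from the Fig. 5 caption
BITS_PER_SYMBOL = 2          # QPSK
FEC_OVERHEAD = 0.20          # 20% FEC overhead
POLARIZATIONS = 2            # assumption: polarization multiplexing
C_BAND_HZ = 4.4e12           # assumption: usable C-band width


def per_fiber_capacity_bps(s_l: int = 1) -> float:
    """Net capacity of one fiber carrying s_l spatial dimensions (standard WDM grid)."""
    channels = math.floor(C_BAND_HZ / CHANNEL_SPACING_HZ)
    net_rate = SYMBOL_RATE_BAUD * BITS_PER_SYMBOL * POLARIZATIONS / (1 + FEC_OVERHEAD)
    return s_l * channels * net_rate


def fibers_per_link(throughput_bps: float, s_l: int = 1) -> int:
    """Smallest number of fibers whose combined capacity covers the link throughput."""
    return math.ceil(throughput_bps / per_fiber_capacity_bps(s_l))


for s_l in (1, 3, 7, 15):
    print(f"S_L={s_l}: {fibers_per_link(100e12, s_l)} fibers for 100 Tbit/s")
```

Under these assumptions a single-mode fiber carries roughly 10 Tbit/s, so the fiber count for a fixed throughput falls approximately as the inverse of the spatial aggregation factor, up to rounding to a whole number of fibers.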

Fig. 4. Spectral and spatial patterns for: (a) uncoupled spatial superchannels, and (b) coupled spatial superchannels. MCF: multi-core fiber, MMF: multi-mode fiber. Uncoupled spatial superchannels, described in Figs. 3(a) and 4(a), are based on multiplexing signals in the different cores of uncoupled-core MCFs. For an uncoupled-core MCF with $S_L$ cores, the spectral efficiency per MCF is $S_L$ times the spectral efficiency of an

are constant throughout a network. Accommodating a link throughput beyond the capacity of a single fiber requires multiple fibers per link. Spectral and spatial aggregation can minimize the number of fibers required. Let $R_{b,ag}$ denote the aggregate throughput per link and let $R_b$ denote the bit rate of a standard WDM channel. Using a spectral aggregation factor $F$ and a link spatial aggregation factor $S_L$, the number of fibers required per link is given by

Fig. 5. Number of fibers required per link $P_{F,S_L}$ vs. aggregate throughput per link for different spectral and link spatial aggregation factors $F$ and $S_L$. The figure assumes $R_s$ = 35 Gbaud, $\Delta\nu$ = 50 GHz, QPSK modulation, 20% FEC overhead and use of the entire C band. Curves labeled "overlapping" assume the full spectral efficiency increase achieved by NGI OFDM. Labels in the figure: $S_L$ = 1, 3, 7 and 15.
link, the total number of ROADM ports required for both incoming and outgoing directions is $d\,P_{F,S_N}$. Note that we are considering possible additional spatial aggregation at the nodes.

Fig. 6. Route-and-select ROADM of degree three (d = 3), showing some exemplary port connections for: (1) express functionality for same incoming and outgoing directions, (2) express functionality for different incoming and outgoing directions, (3) drop functionality, and (4) add functionality. For simplicity, a single-WSS-stage architecture is shown and optical amplifiers at the input and output ports of WSSs and MCSs are not shown. WSS: wavelength-selective switch, MCS: multi-cast switch, FSP: fiber shuffle panel.

Figure 7 shows the number of switching components required at a node for different spectral and node spatial aggregation factors $F$ and $S_N$, for four different values of aggregate throughput per direction. Here we assume d = 4 for the ROADM, M = 8 and N = 16 for the MCSs and N = 16 for the WSSs. Scaling nodes to very high throughputs while limiting growth in the number of switching components requires both spectral and spatial aggregation. For example, scaling from 11 Tbit/s with $(F, S_N) = (1, 1)$

Table 1. Spectral superchannel comparison.
). Non-overlapping spectral superchannels are described in Figs. 1(a) and 2(a). Constituent channels have a symbol rate $R_s$ and a spacing $\Delta\nu$ sufficient to ensure negligible mutual interference between them. Spectral efficiency is proportional to $R_s/\Delta\nu$.
NGI: not applicable
Spectral efficiency (linear regime)