Photonic integrated circuit optical buffer for packet-switched networks

A chip-scale optical buffer performs autonomous contention resolution for 40-byte packets with 99% packet recovery. The buffer consists of a fast, InP-based 2x2 optical switch and a silica-on-silicon low-loss delay loop. The buffer is demonstrated in recirculating operation, but may be reconfigured for feed-forward operation with longer packet lengths. The recirculating buffer provides packet storage in integer multiples of the loop delay of 12.86 ns, up to 64.3 ns, with 98% packet recovery. Multiple photonic chip optical buffers are used to resolve contention between two 40 Gb/s packet streams.

© 2009 Optical Society of America
OCIS codes: (060.1810) Buffers, couplers, routers, switches, and multiplexers; (250.5300) Photonic integrated circuits

References and links
1. D. Blumenthal, B.-E. Olsson, G. Rossi, T. Dimmick, L. Rau, M. Mašanović, O. Lavrova, R. Doshi, O. Jerphagnon, J. Bowers, V. Kaman, L. Coldren, and J. Barton, “All-optical label swapping networks and technologies,” IEEE J. Lightwave Technol. 18, 2058-2075 (2000).
2. G. I. Papadimitriou, C. Papazaoglou, and A. S. Pomportsis, “Optical switching: switch fabrics, techniques, and architectures,” IEEE J. Lightwave Technol. 21, 284-404 (2003).
3. E. F. Burmeister, D. J. Blumenthal, and J. E. Bowers, “A comparison of optical buffering technologies,” Optical Switching and Networking 5, 10-18 (2008).
4. R. Langenhorst, M. Eiselt, W. Pieper, G. Großkopf, R. Ludwig, L. Küller, E. Dietrich, and H. G. Weber, “Fiber loop optical buffer,” IEEE J. Lightwave Technol. 15, 324-335 (1996).
5. N. Chi, Z. Wang, and S. Yu, “A large variable delay, fast reconfigurable optical buffer based on multi-loop configuration and an optical crosspoint switch matrix,” Optical Fiber Com. Conf. OFC, OFO7 (2006).
6. B. A. Small, A. Shacham, and K. Bergman, “A modular, scalable, and transparent optical packet buffer,” IEEE J. Lightwave Technol. 25, 978-985 (2007).
7. D. F. Welch, F. A. Kish, R. Nagarajan, C. H. Joyner, R. P. Schneider, Jr., V. G. Dominic, M. L. Mitchell, S. G. Grubb, T.-K. Chiang, D. D. Perkins, and A. C. Nilsson, “The realization of large-scale photonic integrated circuits and the associated impact on fiber-optic communication systems,” IEEE J. Lightwave Technol. 24, 4674-4683 (2006).
8. P. C. Ku, F. Sedgwick, C. J. Chang-Hasnain, P. Palinginis, T. Li, H. Wang, S. W. Chang, and S. L. Chuang, “Slow light in semiconductor quantum wells,” Opt. Lett. 29, 2291-2293 (2004).
9. F. Morichetti, A. Melloni, C. Ferrari, and M. Martinelli, “Error-free continuously-tunable delay at 10 Gbit/s in a reconfigurable on-chip delay line,” Opt. Express 16, 8395-8405 (2008).
10. R. S. Tucker, P.-C. Ku, and C. J. Chang-Hasnain, “Slow-light optical buffers: capabilities and fundamental limitations,” IEEE J. Lightwave Technol. 23, 4046-4066 (2005).
11. E. F. Burmeister, J. P. Mack, H. N. Poulsen, J. Klamkin, L. A. Coldren, D. J. Blumenthal, and J. E. Bowers, “SOA gate array recirculating buffer with fiber delay loop,” Opt. Express 16, 8451-8456 (2008).
12. M. Enachescu, Y. Ganjali, A. Goel, N. McKeown, and T. Roughgarden, “Part III: routers with very small buffers,” SIGCOMM Comput. Commun. Rev. 35, 83-90 (2005).


Introduction
All-optical routers have the potential to offer greater bit-rate transparency and protocol flexibility than electrical routers while reducing power consumption [1]. One proposed solution toward realizing these goals is optical packet switching (OPS), which uses bandwidth efficiently by multiplexing data packets in time as well as wavelength and is the solution closest to the current Internet Protocol (IP) [2]. However, a remaining challenge in OPS is resolving contention between packets competing for router resources. In electrical routers, packets may be stored indefinitely in RAM. To avoid dropping packets or adding unnecessary latency in optical routers, optical memory, or buffering, is needed.
Previously, there has not been an optical buffering solution that is compact, scalable, and operates with data at high bit rates [3]. Optical memory technology must be able to buffer data from practical packet streams; this includes packet lengths of at least 40 bytes at bit rates of 40 Gb/s and higher, with guard bands no more than several nanoseconds in length. At the same time, additional considerations such as cost, power consumption, and footprint could limit the success of the technology. Delay line buffers dominate the proposed buffering approaches and have demonstrated successful buffering [4][5][6], but would offer more promise as a commercial memory element if implemented as an integrated technology. Integration has been shown to offer benefits such as lower cost, improved performance, and better reliability [7]. Slow light buffers are interesting in the push to reduce footprint [8][9], but have fundamental limitations, largely in the form of a bandwidth-delay limit [10].
In this paper, we demonstrate the first optical buffering of 40-byte packets of 40 Gb/s data using an on-chip optical buffer. The device consists of an InP-based, fast 2x2 switch and a silicon oxynitride waveguide delay. The buffer is flexible in implementation, but is demonstrated here in the recirculation configuration, which offers longer storage times in a smaller footprint. A schematic illustrating the use of the buffer elements for packet storage is shown in Fig. 1. A detailed description of the optical buffering device is provided in Section 2. The experimental results and discussion of the characterization and system demonstrations are presented in Section 3.

Buffer approach
The chip-scale buffering device presented here has been designed as a prototype toward a fully integrated device that offers flexibility in implementation. Two material systems are used to achieve both large gain (InP-based) and high transparency (silica-based).
The base buffer cell comprises a fast InP 2x2 switch with monolithic amplifiers that compensate for the propagation and coupling losses of a silica waveguide delay line. This base cell can be cascaded to allow simultaneous storage of more packets. The array may be used as FIFO (first-in-first-out) memory, or can allow packets to be re-ordered if prioritization is implemented. For the most common packet length of 40 bytes (acknowledgement packets in TCP/IP), the devices should be used in recirculating operation to minimize the number of cells needed. In recirculating operation the packet length is limited by the length of the delay line, but an array of cells can provide feed-forward operation (Fig. 1b) and unlimited packet length. With the addition of second read and write ports, the buffer can also be used for speed-up (Fig. 1c). In this report the devices are used as recirculating buffers, both as single base cells and arrayed.

InP 2x2 switch
The 2x2 switching device affords negligible power penalty at a data rate of 40 Gb/s, the ability to switch within packet guard bands (<2 ns), high extinction ratios (>40 dB) for cascadability, and enough gain to compensate for its own insertion loss as well as that of the delay loop. A semiconductor optical amplifier (SOA) gate matrix is used as the switching structure to guarantee low crosstalk and fast switching. Four gain amplifiers are monolithically integrated with the six switching amplifiers, and all are less than 650 µm long to reduce saturation effects. The schematic of the switch with the delay is shown in Fig. 2. Details of the switch operation and characterization were reported previously in [11].
Fig. 2. Schematic of an SOA gate matrix switch with a delay.

Silica recirculation loop
The recirculating loop is the other essential component of the buffer, important for providing nearly transparent delay. A silica-on-silicon buried ridge waveguide with core dimensions of 5.5 by 5.5 microns provided the necessary low propagation loss at the expense of large bend radii. The index contrast using silicon oxynitride was 0.76%, standard for the foundry, ANDevices. The waveguide design is conservatively limited to a minimum bend radius of 6 mm, but was spiralled on the chip to reduce space. The area needed for 2 m of delay is 6.4 cm². Passive measurements were taken over a wavelength range from 1525 nm to 1575 nm for the silica waveguides to verify that long lengths of delay are possible. Measurements show propagation losses of less than 0.04 dB/cm at 1550 nm, varying by less than 0.001 dB/cm over the 50 nm span. Polarization dependent loss for 200 cm of waveguide was approximately 1 dB and chromatic dispersion was approximately 130 ps/(nm·km).

Measurement setup
The InP devices were soldered and wirebonded to aluminum nitride submounts and cooled to approximately 20 °C. The silica delay chip was held using a stage with 6 degrees of freedom to align the two pairs of waveguides simultaneously. The optical signal (1560 nm) was modulated and analyzed using an SHF 50 Gb/s BERT with RZ 2^31-1 pseudo-random bit sequence (PRBS) data at 40 Gb/s. A variable attenuator and a polarization controller were placed in the setup before the device to maintain a TE-polarized input, since the amplifiers are polarization dependent. A 1.2-nm bandpass filter was placed before the receiver to reduce the amplified spontaneous emission (ASE).
Optical data packets at 40 Gb/s were generated to test multiple circulations. Layer 2 packet measurements used 40-byte packets that were analyzed with a PC, as the BERT cannot synchronize with data that contains long blank spaces. Each packet consists of a 32-bit idler, a 64-bit identifier, an 8-bit label, and 216 bits of repeated PRBS 2^7-1. The label strings allowed for packet reordering, and the identifiers were evaluated upon receipt for bit errors to determine whether the packet was recovered. The switch timing was synchronized with the packet arrival using a payload envelope detect circuit and a field programmable gate array (FPGA) based board.
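As a quick consistency check on the packet format described above, the field lengths can be summed and converted to a duration at the line rate. The following is a minimal sketch; the field names and helper functions are illustrative and are not taken from the experimental software.

```python
# Field lengths (bits) of the test packet as listed in the text.
# The dictionary keys are illustrative names, not identifiers from the experiment.
PACKET_FIELDS = {
    "idler": 32,
    "identifier": 64,
    "label": 8,
    "payload_prbs": 216,
}

BIT_RATE = 40e9  # line rate in bit/s (40 Gb/s)


def packet_length_bits(fields):
    """Return the total packet length in bits."""
    return sum(fields.values())


def packet_duration_ns(fields, bit_rate=BIT_RATE):
    """Return the time the packet occupies on the line, in nanoseconds."""
    return packet_length_bits(fields) / bit_rate * 1e9


total_bits = packet_length_bits(PACKET_FIELDS)
print(total_bits, "bits =", total_bits // 8, "bytes")
print(f"duration at 40 Gb/s: {packet_duration_ns(PACKET_FIELDS):.1f} ns")
```

The fields sum to 320 bits, or 40 bytes, which occupy 8 ns at 40 Gb/s. This is shorter than the 12.86 ns loop delay, leaving several nanoseconds per circulation for guard bands and switching.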

Device characterization
The photonic chip buffer comprises the InP switch and the silicon oxynitride waveguide delay, and achieved 64 ns of packet storage (5 circulations). The total loop loss without gain is estimated to be 30 dB, composed of 8 dB of silica propagation loss, 14 dB for two couplings between the chips, and the remainder from the InP circuit losses. The four amplifiers in the path provide slightly more gain than this for one circulation, but the buildup of amplified spontaneous emission (ASE) lessens the gain for greater numbers of circulations. To find the optimal input power and the dynamic range of the buffer, the power penalty at a BER of 10^-9 was measured for one circulation. Figure 3a shows that the dynamic range was approximately 15 dB, making the buffer practical for system use and allowing multiple devices to be cascaded. Negative power penalty was observed due to the reduction of noise between packets by the gating SOAs as well as slight pulse reshaping by the SOAs. Packet memory was then tested with the photonic chip buffer, and 5 circulations (64 ns) were reached with 98% packet recovery (98% of packet identifier strings having no incorrect bits) (Fig. 3b).
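The loss budget and storage time quoted above follow from simple arithmetic, sketched below. The sketch assumes the 2 m (200 cm) delay length and the 0.04 dB/cm upper bound on propagation loss reported for the silica waveguide, and the 12.86 ns delay per circulation; the constant and function names are illustrative.

```python
# Values taken from the text; names are illustrative.
PROP_LOSS_DB_PER_CM = 0.04  # upper bound on silica propagation loss at 1550 nm
LOOP_LENGTH_CM = 200        # assumed 2 m silica delay
COUPLING_LOSS_DB = 14       # two couplings between the InP and silica chips
TOTAL_LOOP_LOSS_DB = 30     # estimated total loop loss without gain
LOOP_DELAY_NS = 12.86       # delay per circulation


def silica_loss_db(loss_per_cm=PROP_LOSS_DB_PER_CM, length_cm=LOOP_LENGTH_CM):
    """Propagation loss of the silica delay line in dB."""
    return loss_per_cm * length_cm


def inp_loss_db(total=TOTAL_LOOP_LOSS_DB, coupling=COUPLING_LOSS_DB):
    """Remainder of the loop loss attributed to the InP circuit, in dB."""
    return total - silica_loss_db() - coupling


def storage_time_ns(circulations, loop_delay_ns=LOOP_DELAY_NS):
    """Storage time after an integer number of circulations, in ns."""
    return circulations * loop_delay_ns


print(f"silica propagation loss: {silica_loss_db():.1f} dB")
print(f"InP circuit loss (remainder): {inp_loss_db():.1f} dB")
for n in range(1, 6):
    print(f"{n} circulation(s): {storage_time_ns(n):.2f} ns")
```

Under these assumptions the silica line accounts for 8 dB, the remainder attributed to the InP circuit is 8 dB, and 5 circulations correspond to 64.3 ns of storage.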

Contention resolution
The remaining demonstrations used two on-chip buffers cooperating to resolve contention. The buffers ran autonomously using a payload envelope detect circuit to discern upcoming contention, an arbiter to make buffering decisions, and electronic channel processors to send signals to the buffer device. Buffer operation without pre-programmed switching was important to demonstrate that the buffers would be successful in a larger system, specifically a router linecard. To enable autonomous operation, half of the incoming data was split to payload envelope detect (PED) circuits that give the two electronic channel processors (ECP) knowledge of packet arrival (Fig. 4). Each ECP sends port requests to the arbiter board (ARB), which tracks the packets and performs logic. The arbiter sends signals to the ECPs, which send gating control signals to the optical buffers for read, write, or bypass state operation.
The first experiment used buffers on two separate channels to perform contention resolution under autonomous control. A stream of three packets was used to exercise several buffering states (Fig. 5). The buffers were required to delay a total of four of the six packets (three packets on each channel) in order to avoid temporal collisions at the output port when the two streams were combined (tunable delay lines (TDL) were set to make the distance to the buffer the same for both arms). The packet sensitivity measurements show that all packets had greater than 99.5% packet recovery (Fig. 6). Several of the packets show negative power penalty due to the gating performed by the switch, which decreases the ASE level, and also due to reshaping from the amplifiers, as seen in the device bit error rate measurements.
The second experiment demonstrated the use of two photonic chip buffers inline on one channel to provide contention resolution for an empty channel and, more importantly, to show simultaneous multiple packet storage. Inline, the two buffering devices represent one buffer with two memory cells. The firmware in the arbiter was changed to reflect the new positions and roles of the buffers, whereas all hardware remained the same except that the buffers were placed inline (Fig. 7). The same stream of three packets was sent through the first buffer, in which the third packet was buffered for one time slot. The stream then passed through the second buffer, which delayed the first and third packets for one time slot and the second packet for two time slots (Fig. 8). Greater than 99% packet recovery was measured for all packets (Fig. 9). The results for the concatenated buffers were slightly better than those for the parallel buffers, due to better alignment between the InP and silica chips.
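The kind of decision the arbiter makes for two channels sharing one output port can be illustrated with a small time-slot model. This is a hypothetical sketch, not the arbiter firmware used in the experiments: it assumes one time slot equals one loop circulation, serves packets in arrival order, and ignores the limit of one stored packet per buffer cell. The function name, the tie-breaking rule, and the example arrival pattern are all assumptions.

```python
MAX_CIRCULATIONS = 5  # longest storage demonstrated for a single cell


def resolve_contention(arrivals, max_circulations=MAX_CIRCULATIONS):
    """Assign each packet the earliest free output time slot.

    arrivals: list of (arrival_slot, channel) tuples.
    Returns a list of (channel, arrival_slot, output_slot, delay_slots);
    output_slot and delay_slots are None if the packet must be dropped.
    Simplification: per-cell occupancy is not modelled.
    """
    used_slots = set()
    schedule = []
    # Serve packets in arrival order; ties are broken by channel name.
    for arrival_slot, channel in sorted(arrivals):
        for delay in range(max_circulations + 1):
            slot = arrival_slot + delay
            if slot not in used_slots:
                # delay == 0 is the bypass state; delay > 0 means the
                # packet is written and read after `delay` circulations.
                used_slots.add(slot)
                schedule.append((channel, arrival_slot, slot, delay))
                break
        else:
            # No free slot within the maximum storage time.
            schedule.append((channel, arrival_slot, None, None))
    return schedule


# Hypothetical arrival pattern, not the one used in the experiments.
example = [(0, "A"), (0, "B"), (1, "A"), (3, "B")]
for entry in resolve_contention(example):
    print(entry)
```

In this example the packets on channels A and B that arrive together in slot 0 are separated by delaying the channel B packet for one slot, which in turn pushes the next channel A packet back by one slot; the last packet passes in the bypass state.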

Conclusions
The packet capacity and maximum storage time of this buffering approach can be increased to provide a realistic solution for optical routing. Recent work in buffer sizing shows that, with little sacrifice in performance, optical core routers can be equipped with much less memory capacity than is currently used in electrical routers. Enachescu et al. report that 10-20 packet buffers may be sufficient for 80-90% link utilization if there are many aggregate flows, which can serve to smooth burstiness [12]. Although more conclusive work is needed, optical buffering appears to be more practical than was initially assumed.
The optimization of the amplifier material platform and a decrease in loop losses are highly important. To lengthen the storage time, the output saturation power of the amplifiers should be increased and, even more importantly, the noise figure should be decreased. The key to a low noise figure is a decreased internal loss, especially for amplifiers with small confinement factors. In addition, a 15 dB decrease in total loop loss would increase the maximum number of circulations from 5 to 25 for a noise figure of 6, assuming an OSNR of 20 dB is required. This decrease in loop loss may be realized by reducing the coupling loss between the InP and silica waveguides using spot size converters.
The first on-chip optical buffer for 40-byte packet lengths and a bit rate of 40 Gb/s demonstrated successful contention resolution between two packet streams. A single memory element showed up to 5 circulations, or 64 ns of storage, with 98% packet recovery. Significant increases in storage time can be made with simple improvements to reduce the loop loss and the noise figure of the amplifiers. Autonomous contention resolution was performed with 99% packet recovery for a packet capacity of 2, as well as for 2 separate buffered channels. The results presented used the buffers in a recirculating configuration, but the buffer can also be used in feed-forward operation for longer packet lengths. These strengths show that optical buffering elements consisting of SOA-gated switches and silica waveguides offer a promising solution for populating buffers in optical routers. Multiple delay lines can be efficiently packed into the same area by interleaving waveguides in the same spiral. Consequently, up to 100 such buffers could be integrated onto a single die.

Fig. 3. (a) Dynamic range of input power for one circulation with 40 Gb/s RZ 2^31-1 PRBS. Insets show scope traces of the bit stream. (b) Packet recovery measurements showing 98% packet recovery for up to 5 circulations, or 64 ns of storage, using a silica delay chip.

Fig. 4. Schematic of the set-up for resolving contention between two buffered packet streams.

Fig. 5. Oscilloscope traces showing packets at the inputs to the buffers, after the buffer outputs, and the combined output.

Fig. 7. Schematic of the set-up for contention between a delay path and a packet stream.

Fig. 8. Oscilloscope traces showing packets in the empty contention path, at the input to the buffer, after the buffer output, and the combined output.

Fig. 9. Packet recovery measurements for the packet stream using two optical buffers.