Co-design of differential transimpedance amplifier and balanced photodetector for sub-pJ/bit silicon photonics receiver

This paper presents the design and implementation of a fully differential optical receiver aimed at short-reach intensity modulation and direct detection (IMDD) transceiver links. A Si-Ge balanced photodetector (PD) has been co-designed and packaged with a novel differential transimpedance amplifier (TIA). The TIA is realized in a standard 28nm CMOS process and operates from a standard digital supply (1V). Without using any equalization or DSP techniques, the proposed receiver can operate up to 54Gb/s with a BER below the KP4 limit (2.2×10⁻⁴) at an optical modulation amplitude (OMA) of -8.6dBm, while the power efficiency has been optimized to 0.55pJ/bit (0.98pJ/bit if the output buffer is included). © 2019 Optical Society of America under the terms of the OSA Open Access Publishing Agreement


Introduction
Global interest in developing silicon photonic interconnects to meet the growing demands for short reach communication has increased dramatically over the last decade [1]. On the receiver side, this has led to significant research effort in developing direct detection receivers, including the development of high bandwidth, high responsivity silicon photonics photodetectors, high-speed, low noise transimpedance amplifiers, and advanced digital signal processing (DSP) techniques for complex modulation formats within an IMDD transceiver. While there has been a great deal of research effort on these individual topics, consideration of the integration of these design aspects is limited. In this work, we present an integrated silicon photonics receiver where the PD and TIA are co-designed synergistically in terms of process selection, device packaging, power efficiency, bandwidth, noise figure, and detection scheme.
Generally, bandwidth and noise figure are the most critical parameters for the design of the TIA. In the past few years, many designs [2][3][4][5][6] using advanced BiCMOS technology have been reported. Using a 3.3V power supply, a 100Gb/s differential BiCMOS TIA was reported with 8pA/√Hz averaged input-referred noise current density in [5]. In [6], a differential optical receiver operating at 90 Gb/s with a sensitivity of -7.1 dBm OMA was reported by using a 55nm BiCMOS process with a 3.3V power supply. Compared to the use of the BiCMOS process, the inclusion of the TIA design in a standard bulk CMOS process offers the advantages of lower cost and power consumption. Moreover, adopting the same standard CMOS process for the TIA in the optical receiver as for the subsequent DSP chips eliminates the need for interconnection between the two modules.
In recent years, several CMOS-based TIA designs [7][8][9][10] have been proposed. In [8], a 112Gb/s PAM4 TIA built in a 28nm CMOS process, and supplied through an on-chip voltage regulator powered from a 2.5V supply, was reported with an optical sensitivity down to -5.1dBm. In [9], based on a 65nm CMOS process with a 1.2V power supply, a 40GHz TIA with 19.8pA/√Hz averaged input-referred noise current density was demonstrated. While these CMOS-based TIA designs have achieved excellent bandwidth together with a moderate noise figure, they require supply voltages above the standard digital level, which leads to a requirement for additional interfacing circuits (e.g. level shifters) between the analogue front-end circuits and the corresponding digital functions. To the best of the authors' knowledge, only the designs reported in [11,12] have adopted standard CMOS power supply levels (~1V). However, the design reported in [11] lacks integration information. The design presented in [12] does include integration with a PD, but this optical receiver can only cope with a Non-Return-to-Zero (NRZ) signal. The single-to-differential conversion scheme adopted in [12] introduces additional phase mismatch within the front-end circuit and may limit the option to use more complex modulation formats.
In this work, we present the design and characterization of an optical receiver that is composed of a 28nm CMOS TIA and a balanced Si-Ge photodetector. The electrical and optical units were co-designed and packaged, providing fully differential output signals. The proposed TIA circuit operates with the standard CMOS supply (~1.0V). Without using any equalization or DSP techniques, the proposed receiver can operate up to 54Gb/s NRZ with a BER less than the KP4 limit (2.2×10⁻⁴) with an OMA of -8.6dBm, whilst the power efficiency is 0.55pJ/bit (or 0.98pJ/bit including the output buffer).
This paper is organized as follows: Section 2 reviews existing optical receiver schemes for direct detection systems and outlines the advantages of the proposed differential optical receiver. The detailed circuit topologies and design considerations are presented in Section 3, together with the simulation results. The experimental setup and results are illustrated in Section 4, followed by a comparison with the current state-of-the-art direct-detection receivers. Section 5 discusses future work and concludes the paper.

Optical receiver architecture for direct detection
Differential circuit topologies together with truly differential I/O signals are essential for high-speed transceivers, not only because of their immunity to common-mode noise but also because the corresponding digital circuits (e.g. the analogue-to-digital converter) favour truly differential inputs. Thanks to the optical mixing of the input signal and the local oscillator laser, as shown in previous works [4,13], the generation of differential signals with a coherent optical receiver is relatively straightforward. However, within an IMDD link, since the PD is by nature a single-ended device, additional circuits have to be introduced to generate complementary signals. Regardless of the internal circuit topologies, the architectures of the front-end optical receiver within IMDD links can be categorized into four groups, which are shown in Fig 1. The most common solution is to incorporate dedicated electrical building blocks within the front-end TIA circuit, either with a dummy input to a replica of the TIA stages [10,14,15], as shown in Fig 1(a), or with a single-to-differential (S2D) amplifier [9,16], as shown in Fig 1(b), or with a mixture of the two [8]. However, both of these options suffer from transimpedance gain mismatch and bandwidth degradation. To overcome this limitation, the usual remedy is to add several stages of differential voltage amplification within the receiver chain, albeit at the cost of increased receiver design complexity and additional linearity challenges. As a result, most of these designs can only be used with NRZ signals. The design presented in [8] does demonstrate results using PAM-4, but this was achieved by including a dedicated variable gain amplifier (VGA). Another solution is to utilize the input node of the TIA as the negative input to the corresponding voltage amplifier [12], as shown in Fig 1(c).
This solution avoids the need for additional post voltage amplification; however, the phase mismatch between the input and output nodes of the TIA inherently limits its utilization with multi-level modulation formats. The solution presented in [6,17] appears to be the most elegant arrangement, as it resolves all of the issues mentioned above. As highlighted in Fig 1(d), the currents flowing at the cathode and anode of the PD are complementary in nature, which can be utilized to generate fully differential currents for the differential TIA. In addition, the fact that differential electrical signals are generated from a single optical signal source yields a √2 (3dB) improvement in the signal-to-noise ratio (SNR), which further adds to its attractiveness. However, because the PD is reverse biased, there is a considerable direct current (DC) voltage difference (>1V) between the two input nodes of the TIA. Therefore, on-chip DC blocks (C_dc = 8pF) have to be incorporated into this solution. According to [6], these 8pF on-chip DC blocks result in a 25fF parasitic shunt capacitance. With the 28nm CMOS process utilized in this work, this leads to a chip area of over 2300μm² with a total parasitic capacitance of 155fF. Therefore, this arrangement (Fig 1(d)) may be acceptable for a dedicated bespoke RF process but is fundamentally impractical for bulk CMOS processes, taking into account both technical and economic considerations. In contrast to the aforementioned solutions, our approach involves splitting the input optical signal into two identical parts, as shown in Fig 2. The resulting two optical signals are then fed into two identical PDs, where the cathode of PD1 is tied to a bias voltage Vbias1 (e.g. 2V) whereas the anode of PD2 is tied to Vbias2 (e.g. -1V). In this way, both the direct and the alternating currents from the two PDs are generated complementarily.
It is worth noting that this architecture is fundamentally different from the balanced photodetection scheme [18] within traditional differential phase-shift keying (DPSK) receivers, in which the differential signals are generated via a delay interferometer. With the proposed solution, we utilize the complementary current flow directions from each PD. The proposed solution requires two off-chip DC bias voltages for the two PDs, but this overhead is negligible when compared with the on-chip DC biasing system that is shown in [6].
The main drawback of the proposed approach is the relatively worse SNR performance, because the optical signal is split into two halves. Although the differential transimpedance gain is doubled, the uncorrelated noise sources in the two arms of the differential TIA lead to a √2 increase in the differential root-mean-square (RMS) output voltage noise, as described in [6]. As a result, the overall SNR of the proposed optical receiver is worsened by √2 (equivalent to a 3dB penalty) compared with the single-ended case with the same input optical power. However, as shown in the following sections, the differential photocurrent can benefit the TIA design in various aspects and lead to new considerations for the overall performance of an optical receiver.
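The SNR argument above can be checked with a short numerical sketch. It uses normalized amplitudes and assumes an ideal 50/50 split, identical arms, and fully uncorrelated arm noise; the function name and the normalization are illustrative choices rather than part of the design.

```python
import math


def snr_penalty_db(split_ratio: float = 0.5) -> float:
    """SNR penalty (dB, 20*log10 convention) of the split-PD differential
    receiver relative to a single-ended receiver at equal input power."""
    # Single-ended reference: full optical power on one PD, one TIA arm.
    signal_se = 1.0  # normalized output signal amplitude
    noise_se = 1.0   # normalized RMS output noise of one arm

    # Proposed scheme: each PD receives `split_ratio` of the power and the
    # two complementary arm outputs are subtracted, so the amplitudes add.
    signal_diff = 2.0 * split_ratio * signal_se
    # Uncorrelated arm noise adds in power, i.e. sqrt(2) in RMS amplitude.
    noise_diff = math.sqrt(2.0) * noise_se

    snr_ratio = (signal_diff / noise_diff) / (signal_se / noise_se)
    return -20.0 * math.log10(snr_ratio)


print(f"SNR penalty: {snr_penalty_db():.2f} dB")
```

With the default 50/50 split the result corresponds to the √2 degradation stated in the text.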

Circuit topology of the proposed transimpedance amplifier
Based on the differential architecture proposed in Fig 2, the TIA might be implemented using traditional TIA circuit topologies. However, in high-speed optical receivers, the parasitic effects introduced by device packaging (e.g. bonding wires) must be carefully analyzed ahead of circuit design.

Device packaging
The typical model of a shunt feedback resistor (SFR) TIA connected to the PD is depicted in Fig 3(a), from which the transimpedance gain can be derived as:
$$Z_T(s) = \frac{V_{out}}{I_{in}} = -\frac{A R_F}{A + 1 + s R_F C_{PD}} \qquad (1)$$
where I_in is the input photocurrent, V_out is the output voltage, A is the DC voltage gain of the amplifier, R_F is the resistance of the feedback resistor, and C_PD represents the junction capacitance of the PD. From equation (1), the 3-dB bandwidth can be determined as (A+1)/(2πR_F C_PD). To enhance the bandwidth, as shown in Fig 3(b), some designs [14] utilize the bonding wire between the PD and TIA as a series inductor L_b1, for which the corresponding transimpedance gain can be derived as:
$$Z_T(s) = -\frac{A R_F}{A + 1 + s R_F C_{PD} + s^2 (A + 1) L_{b1} C_{PD}} \qquad (2)$$
With proper control and modelling of the equivalent inductance of the bonding wire, the 3-dB bandwidth can be approximated as √2A/(2πR_F C_PD) [19]. However, in a packaged optical receiver, as shown in Fig 3(c), the biasing of the PD is often realized via another bonding wire L_b2 connected to a DC bias node, which leads to a revised system model:
$$Z_T(s) = -\frac{A R_F \left(1 + s^2 L_{b2} C_{PD}\right)}{A + 1 + s R_F C_{PD} + s^2 (A + 1)(L_{b1} + L_{b2}) C_{PD}} \qquad (3)$$
From the numerator of equation (3), it is clear that a pair of zeros is placed at the frequency 1/(2π√(L_b2 C_PD)), which creates a null in the transimpedance gain. To illustrate this more clearly, the Bode plots of the above three equations are shown in Fig 3, in which we assumed an ideal amplifier with a voltage gain of 5 over an infinitely wide bandwidth and assigned typical values to C_PD, L_b1, L_b2 and R_F. As expected, with 80fF of PD capacitance and 300pH of bond wire inductance, a null in the transimpedance gain can be observed at a frequency of 32GHz. Although in a real scenario the contact resistance within the junctions of the PD could lead to a higher damping factor, a sharp roll-off in the transimpedance gain at high frequency (>20GHz in Fig 3) always exists.
To solve this issue, other solutions have been proposed, such as butterfly-packaging [16] or the use of on-chip shunt capacitors [20], all of which introduce additional overhead for the device footprint and the system complexity.
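The effect of the bias bonding wire can be illustrated with a small lumped-element sketch. It evaluates three single-pole-amplifier models (direct connection, series bond wire, series plus bias bond wire) using the values quoted in the text (A = 5, C_PD = 80fF, 300pH per bond wire). The feedback resistance of 300Ω is a placeholder, since the text does not state R_F, and the third model assumes that L_b2 and C_PD form a series branch shunting the TIA input; both are assumptions of this sketch.

```python
import math

# Values quoted in the text.
A = 5.0         # ideal amplifier voltage gain
C_PD = 80e-15   # PD capacitance in farads
L_B1 = 300e-12  # bond wire between PD and TIA in henries
L_B2 = 300e-12  # bond wire to the PD bias node in henries
# Assumption: R_F is not given in the text; 300 ohm is a placeholder.
R_F = 300.0


def zt_direct(f: float) -> complex:
    """SFR TIA with the PD connected directly (no bond wires)."""
    s = 2j * math.pi * f
    return -A * R_F / (A + 1 + s * R_F * C_PD)


def zt_series_wire(f: float) -> complex:
    """SFR TIA with a series bond wire L_B1 between PD and TIA."""
    s = 2j * math.pi * f
    return -A * R_F / (A + 1 + s * R_F * C_PD + s**2 * (A + 1) * L_B1 * C_PD)


def zt_with_bias_wire(f: float) -> complex:
    """As above, with the PD biased through a second bond wire L_B2."""
    s = 2j * math.pi * f
    num = -A * R_F * (1 + s**2 * L_B2 * C_PD)
    den = A + 1 + s * R_F * C_PD + s**2 * (A + 1) * (L_B1 + L_B2) * C_PD
    return num / den


# Frequency of the transimpedance null created by L_B2 and C_PD.
f_null = 1.0 / (2.0 * math.pi * math.sqrt(L_B2 * C_PD))
# 3-dB bandwidth of the direct-connection model.
bw_direct = (A + 1) / (2.0 * math.pi * R_F * C_PD)

print(f"null frequency: {f_null / 1e9:.1f} GHz")
for f_ghz in (1, 10, 20, 30):
    gain_db = 20 * math.log10(abs(zt_with_bias_wire(f_ghz * 1e9)))
    print(f"{f_ghz:>3} GHz: {gain_db:6.1f} dBOhm")
```

The null frequency depends only on L_b2 and C_PD, so it is unaffected by the placeholder R_F; the shape of the gain curve below the null does depend on it.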
The discussion above highlights the critical importance of device packaging in the integration of optoelectronic devices. The excellent test results [7,11] obtained without considering device packaging (e.g. with RF probes and cables linking the PD and TIA during testing) may be elegant in some specific aspects but are unrealistic for practical implementations. Targeting operation with a standard digital power supply (~1V at the 28nm node), and thereby avoiding the previously mentioned interfacing requirements, makes the TIA circuit design considerably more challenging. The achievable gain-bandwidth product is limited, and the use of advanced TIA topologies that rely on stacking multiple transistors, such as the mirrored cascode (MC) or the regulated cascode (RGC), is not possible. To demonstrate our proposed approach, the standard inverter-based SFR TIA topology (Fig 4(a)) has been used, which is the most balanced choice under a standard digital power supply, as shown in [12].

TIA Circuit design
Since fully differential photocurrents have already been generated from the proposed balanced PD, the TIA front-end circuit has been designed with differential inputs, rather than the traditional two single-ended TIAs. As shown in Fig 4(

Published by
intrinsic feedback loop within the cross-coupled pairs accepts the differential photocurrents and forwards the inverting-amplified signals to the other side, thereby allowing the amplifier's transconductance to be reused. Therefore, these transistors (M_N1, M_P1) can be sized to a relatively small dimension, which in turn leads to better power efficiency. At high frequencies, similar to the mechanism within an inductor-capacitor (LC) oscillator, the equivalent inductance (L_b1) of the bonding wire and the corresponding load capacitance (C_load) create a resonant peak at the frequency 1/(2π√(L_b1 C_load)). As long as the values of C_load and L_b1 can be properly modelled and controlled, this resonant peak can be utilized for bandwidth enhancement.
The main challenges related to our approach are the overshoot of the resonant peak and the tendency of the cross-coupled inverters to latch. To resolve these two problems, a pair of resistive loads (R_load) was incorporated and tied to a properly biased point (~VDD/2 in cross-coupled inverters). In the actual circuit implementation, as shown in Fig 5, these resistive loads were realized by inserting a small inverter-based shunt feedback amplifier (M_P2, M_N2, R_F), which is naturally biased at about half of VDD. Furthermore, an NMOS transistor (M_N_RF) was connected in parallel with the feedback resistor (R_F') to provide voltage-gain tuning, so that the gain-bandwidth performance of the overall TIA circuit can be controlled for different applications.
Fig 4(b), the overall optical receiver could be designed. The photonics chip was designed based on the Interuniversity Microelectronics Centre (IMEC) iSiPP50G platform. As shown in Fig 5, the input optical signal was split into two identical parts by a 1×2 Multimode Interference (MMI) splitter. The resulting two optical signals were fed into two identical photodetectors (Ge-Si-PD), which have high responsivity (~1.1A/W) and medium bandwidth (~33GHz). During the design phase, one of the biggest issues with the PDK provided by this foundry was that these PDs were delivered as a black box, so the authors could not determine the exact polarity of the PD and hence the DC current flow directions. This makes it challenging to build proper DC-offset cancellation within the TIA circuit. Therefore, as illustrated in Fig 5, two preliminary solutions are adopted here. Firstly, the proposed cross-coupled inverters are designed with standard CMOS inverters, in which the transconductances (g_m) of the NMOS (M_N1) and PMOS (M_P1) are approximately equal. This arrangement ensures that the proposed TIA front-end can cope with either current flow direction.
Secondly, two pairs of resistor ladders (R_1 and R_2, R_3 and R_4) are incorporated within the voltage amplifier and output buffer, so that a proper DC bias point can be maintained over a relatively broad DC input current range. Meanwhile, this arrangement allows the shunt-series peaking inductors (L_2, L_3, L_4, L_5, L_6) to be added within the inverter chain, which benefits the bandwidth enhancement. The inductors used in the TIA design are fully custom designs, analyzed and modelled in Keysight-ADS®.

Simulation results
For the simulation, the four bonding wires (L_b1, L_b2) that connect the PD and TIA were modelled in Keysight-ADS®, assuming gold bonding wires with a diameter of 18μm that form bonding loops with a height of less than 100μm and a length of less than 400μm. The generated 8-port S-parameter file was then imported into Cadence-Spectre® for post-layout simulation. Following the PD model reported in [6], the junction resistance of the PD was set to 10Ω. According to the bandwidth performance (33GHz) reported in the iSiPP50G datasheet, the overall capacitance of this PD, including junction capacitance and pad capacitance, was estimated to be within the range 75-80fF.
It can be seen that both red curves in Fig 6(a) and (b) demonstrate similar features, including a resonant peak within the range of 26-28GHz and a sharp roll-off above 32GHz, both of which are predicted by the above theoretical analysis. Furthermore, in order to analyze the performance enhancement contributed by the proposed cross-coupled inverters, the simulation was repeated with the cross-coupled inverters disabled. The resulting transimpedance gain is plotted as the blue curves in Fig 6. It can be observed that the proposed cross-coupled inverters lead to considerable performance enhancement in each case. In high gain mode (V_CT = 1.0V), the cross-coupled inverters contribute a 9dB gain enhancement. In high bandwidth mode (V_CT = 1.1V), the resonant peaking created by the cross-coupled inverters contributes a 23% bandwidth extension, accompanied by a gain enhancement of up to 5.5dB. Note that, whilst the two blue curves of Fig 6 do not represent the best performance of a standard SFR TIA, the simulation clearly shows the difference in performance with and without the proposed cross-coupled inverters within the front-end of the TIA.
With the proposed approach, the gain and bandwidth performance can be enhanced simultaneously, whereas in traditional SFR TIA designs a trade-off between gain and bandwidth has to be made [21]. Obviously, the pre-conditions for this are coordination in the design of the photonics (PD) and electronics (TIA), accompanied by an accurate and repeatable device packaging process. Besides the high gain and high bandwidth modes discussed above, as shown in Fig 6(c), tuning the control voltage (V_CT) to a higher value can further lower the transimpedance gain at low frequencies, while the resonant peaking created by the cross-coupled inverters is retained.
According to this simulation, the single-ended SNR should be multiplied by 1.3 (≈2.28dB) to account for the overall SNR performance with differential outputs.
Fig 11 Comparison of noise spectral density between single-ended output and differential outputs when operating in high bandwidth mode: (a) simulated output noise spectral density; (b) simulated input-referred noise current density.
In summary, incorporating cross-coupled inverters can significantly enhance the gain and bandwidth performance while maintaining outstanding power efficiency. This approach is particularly useful under the limited digital power supply (~1V), where other advanced TIA circuit topologies are hard to implement. Due to the lack of a DC-offset cancellation module, some preliminary configurations have been made to ensure functionality. The side effect of this is a moderate noise performance, which could be significantly improved if DC-offset cancellation modules were included.

Experimental results
To validate the structure and circuit topologies of the proposed differential optical receiver, the photodetector has been fabricated in the IMEC iSiPP50G process, in which a pair of PDs, one MMI and one grating coupler have been carefully laid out with an overall chip area of 0.17mm². Furthermore, in order to quantify the optical loss of the grating coupler, a test structure containing a grating pair connected via a short waveguide has been included. The TIA design has been fabricated on a shared MPW run using the TSMC 28nm CMOS process with the back-end-of-line (BEOL) option 1P8M5X1Z1U. As shown in Fig 12, the design occupies a 0.5mm² chip area, which includes all of the DC and I/O pads.

The importance of device packaging has been fully analyzed in Section 3.1. Following the simulation model, the PD chip and TIA chip were carefully placed within the cavity of a fully custom-designed PCB. During the packaging process, the output pads of the PD and the input pads of the TIA were precisely aligned using an industry-standard flip-chip bonding machine. Through the use of a programmable ball-bonding machine, the length and height of the bonding loops were established at 350μm and 80μm respectively, as highlighted in Fig 12. Furthermore, in order to minimize current spikes on the power supply rails, double-wire bonding was applied to the DC power pads of the TIA. To fully quantify the performance enhancement contributed by the proposed cross-coupled inverters, the power supply of the cross-coupled pair was routed to a separate pad, which is marked as VDD2 in Fig 12.
Fig 13 illustrates the experimental setup for evaluating the performance of the co-designed receiver. An electrical 2⁷−1 Pseudo-Random-Bit-Sequence (PRBS) test signal was provided by an SHF Bit Pattern Generator (BPG SHF-12104A), amplified by a broadband amplifier (CENTELLAX OA4MVM3) and fed into a 40Gb/s LiNbO₃ intensity modulator (Mach-40™ 005) to modulate the light of a 1550-nm tunable laser. After passing through an EDFA, an optical switch and an optical filter, the modulated optical signal (TX) was analyzed using a Keysight Digital Communication Analyser (DCA 86100D). This recorded TX signal was then forwarded into the proposed optical receiver. One of the differential outputs from the proposed TIA was fed into the DCA and recorded as the "RX" signal, while the other output signal from the TIA was terminated with a standard AC-50Ω. A power and control module provided the necessary PD biasing and power supplies for the whole system.

Measurement results
Firstly, the optical loss of the grating coupler was characterized by using the grating-pair test structure on the PD chip. The resulting 6dB optical loss was taken into account in the TX signal that was obtained at each data rate.
Traditionally, S-parameter test results are used to derive the transimpedance gain and bandwidth of the TIA. With the co-designed and co-packaged PD and TIA demonstrated in this work, this would require a broadband lightwave component analyzer, which is not available in our lab. Therefore, we tested the TX and RX signals under different operation modes and calculated the equivalent transimpedance gain over a broad range of data rates. Fig 14(a) shows the input optical eye diagram at 28Gb/s, indicating an OMA of 535μW (-2.72dBm). Fig 14(b) shows the schematic of the optical and electrical circuit, highlighting the optical power of -8.72dBm and -11.72dBm before and after the 3dB optical splitter respectively, the 1.1A/W responsivity of the photodetector and the resulting 74.08μApp electrical current at the input of the TIA. Fig 14(c) and (d) show the single-ended electrical output eye diagrams from the TIA with the cross-coupled inverters enabled in high gain mode and high bandwidth mode respectively, whilst Fig 14(e) and (f) show the output eye diagrams with the cross-coupled inverters disabled for the two operating modes.
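The optical power budget quoted for Fig 14(b) can be reproduced with the short sketch below. It takes the 535μW transmitter OMA, the 6dB grating coupler loss, an ideal 3dB split and the 1.1A/W responsivity from the text; treating the splitter as lossless beyond the 3dB split is an assumption of this sketch, and the variable names are illustrative.

```python
import math


def mw_to_dbm(p_mw: float) -> float:
    """Convert optical power from mW to dBm."""
    return 10.0 * math.log10(p_mw)


def dbm_to_mw(p_dbm: float) -> float:
    """Convert optical power from dBm to mW."""
    return 10.0 ** (p_dbm / 10.0)


OMA_TX_MW = 0.535        # transmitter OMA from Fig 14(a), 535 uW
GRATING_LOSS_DB = 6.0    # measured grating coupler loss
SPLIT_LOSS_DB = 3.0      # assumed ideal 1x2 MMI split
RESPONSIVITY_A_W = 1.1   # PD responsivity

oma_tx_dbm = mw_to_dbm(OMA_TX_MW)
oma_before_split_dbm = oma_tx_dbm - GRATING_LOSS_DB
oma_per_pd_dbm = oma_before_split_dbm - SPLIT_LOSS_DB

# mW -> uW is a factor of 1e3; uW times A/W gives uA.
photocurrent_ua = dbm_to_mw(oma_per_pd_dbm) * 1e3 * RESPONSIVITY_A_W

print(f"TX OMA:            {oma_tx_dbm:.2f} dBm")
print(f"before splitter:   {oma_before_split_dbm:.2f} dBm")
print(f"per PD:            {oma_per_pd_dbm:.2f} dBm")
print(f"photocurrent (pp): {photocurrent_ua:.2f} uA")
```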
The transimpedance gain can then be calculated by dividing the measured eye amplitude by this photocurrent. By repeating this process for each data rate under different control voltages, the resulting transimpedance gain at different speeds is calculated as shown in Fig 15. During testing, a small adjustment was applied to the EDFA so that the variation of the OMA was kept within ±0.1dB. When the proposed optical receiver is tested at higher data rates, the bandwidth of the LiNbO₃ modulator is a major limiting factor. As shown in Fig 16, the quality of the TX eye is reasonably good at 40Gb/s but becomes considerably worse at 50Gb/s and above. Therefore, the proposed TIA has been tuned into the peaking mode, as explained for Fig 6(c), to compensate for the bandwidth drop at the transmitter side. The resulting single-ended electrical output signals at 52Gb/s and 54Gb/s are shown in Fig 16(c) and (d). When operating at 54Gb/s, the recorded power consumption is 53.1mW, of which the cross-coupled inverters consume 2.4mW. This is about 4% higher than the simulation results shown in Fig 9, which is reasonable considering typical process variations. Therefore, we estimate a 4% increase in power consumption (29.8mW) within the TIA and voltage amplifier stages, and the resulting power efficiency of the proposed design is 0.55pJ/bit (0.98pJ/bit including the output buffer) when operating at 54Gb/s.
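As a rough illustration of the two calculations in this paragraph, the sketch below derives an equivalent transimpedance gain from an eye amplitude and a peak-to-peak photocurrent, and converts the quoted power figures into energy per bit. The 20mV eye amplitude in the usage line is a hypothetical value chosen for illustration, not a measured result; the power and data-rate figures are those stated in the text.

```python
import math


def transimpedance_ohm(eye_amplitude_v: float, photocurrent_a: float) -> float:
    """Equivalent transimpedance gain: output eye amplitude over input current."""
    return eye_amplitude_v / photocurrent_a


def to_db_ohm(z_ohm: float) -> float:
    """Express a transimpedance in dBOhm."""
    return 20.0 * math.log10(z_ohm)


def energy_per_bit_pj(power_mw: float, rate_gbps: float) -> float:
    """Energy efficiency in pJ/bit; mW divided by Gb/s is numerically pJ/bit."""
    return power_mw / rate_gbps


# Hypothetical eye amplitude of 20 mV with the 74.08 uApp photocurrent.
z = transimpedance_ohm(20e-3, 74.08e-6)
print(f"Z_T = {z:.0f} Ohm ({to_db_ohm(z):.1f} dBOhm)")

# Power figures quoted in the text at 54 Gb/s.
print(f"TIA + voltage amplifier: {energy_per_bit_pj(29.8, 54):.2f} pJ/bit")
print(f"including output buffer: {energy_per_bit_pj(53.1, 54):.2f} pJ/bit")
```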

The noise performance of the proposed optical receiver is a major consideration in this work. However, the traditional noise measurement methods [9,11] used with standalone TIAs do not apply to the proposed co-packaged receiver, since the bonding wires between the PD and TIA play a dominant role in bandwidth enhancement. Therefore, as highlighted in all of the eye diagrams in Fig 14 and Fig 16, the SNR is recorded for every data rate under different operation modes and is plotted in Fig 17. It is clear that the proposed cross-coupled inverters can significantly enhance the SNR at lower data rates. The situation becomes more complex at higher operating speeds. In high gain mode, as shown in Fig 17(a), when operating at rates higher than 28Gb/s, the use of the cross-coupled inverters does not result in a better SNR, which is believed to be due to the limited bandwidth (≈13GHz) of the TIA. In high bandwidth mode, as plotted in Fig 17(b), the cross-coupled inverters only lead to a slight improvement. We believe the bandwidth limitation at the transmitter side to be the dominant factor that limits the overall SNR performance. Therefore, as shown by the purple curve in Fig 17(b), peaking mode is enabled, which demonstrates a considerable SNR improvement.
Note that the SNR results plotted in Fig 17 are measured from the single-ended output. Based on the analytical results presented in Fig 11, the differential SNR is obtained by multiplying the measured SNR shown in Fig 17 by 1.3. We can link the measured SNR to projected bit-error-rate (BER) values using the well-known equations below, which apply to Gaussian-noise-impaired signals [23]:
$$\mathrm{BER} = \frac{1}{2}\,\mathrm{erfc}\!\left(\frac{Q}{\sqrt{2}}\right) \approx \frac{\exp\left(-Q^2/2\right)}{Q\sqrt{2\pi}}$$
According to [23], the above approximation is accurate to within 10% for Q>3. We therefore calculated the BER only for differential SNRs of at least 3.6. The resulting BER values at an input OMA of -8.7dBm (±0.1dB) are plotted in Fig 18. It can be seen that the proposed optical receiver can operate error-free up to 32Gb/s and can achieve a BER below the KP4 level at 54Gb/s, while the overall power consumption is 29.8mW (53.1mW including the output buffer).
Table 1 summarises the performance of the fully integrated optical receivers that have been published in recent years. Within this table, only fully packaged optical receivers are listed. Certain challenges arose when comparing prior works: some only reported the BER at the KP4 threshold, whereas others stated error-free test results. Furthermore, the power consumption of the output buffer is another factor that may lead to confusion. To avoid any confusion, we have listed all of the operating conditions of our proposed design. It is clear that the proposed work operates with the lowest power supply voltage and achieves the best power efficiency. Even though our design can only detect a 54Gb/s signal at the KP4 limit, we believe this is mainly due to the limited bandwidth of the modulator used on the transmitter side. Overall, the excellent figure-of-merit (FoM) [24] demonstrates the superiority of the proposed design.
Table 1 Comparison of the RX with the state of the art
[6] IMEC [24] Hitachi [12] IBM [8,25] Intel [10] This work
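The SNR-to-BER projection described in this section can be sketched as follows. The code evaluates the Gaussian-noise BER expression and its large-Q approximation and applies the 1.3 single-ended-to-differential factor from the text; treating the differential SNR directly as the Q factor is an assumption of this sketch, and the function names are illustrative.

```python
import math

KP4_LIMIT = 2.2e-4  # pre-FEC BER threshold quoted in the text


def ber_exact(q: float) -> float:
    """BER of a binary signal impaired by Gaussian noise."""
    return 0.5 * math.erfc(q / math.sqrt(2.0))


def ber_approx(q: float) -> float:
    """Large-Q approximation of the same expression."""
    return math.exp(-q * q / 2.0) / (q * math.sqrt(2.0 * math.pi))


def differential_snr(single_ended_snr: float, factor: float = 1.3) -> float:
    """Scale a single-ended SNR to the differential case (factor from the text)."""
    return factor * single_ended_snr


for q in (3.0, 3.6, 5.0, 7.0):
    exact = ber_exact(q)
    approx = ber_approx(q)
    status = "below" if exact < KP4_LIMIT else "above"
    print(f"Q={q:.1f}: BER={exact:.2e} (approx {approx:.2e}), {status} KP4 limit")
```

In this sketch a differential SNR of 3.6 corresponds to a single-ended SNR of about 2.8 before the 1.3 factor is applied.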

Future work and conclusion
In this work, we have demonstrated the design and implementation of a fully differential optical receiver aimed at short-reach intensity modulation and direct detection (IMDD) transceiver links. A Si-Ge balanced photodetector (PD) has been co-designed and packaged with a novel differential transimpedance amplifier (TIA). Without using any equalization or DSP techniques, the proposed receiver can operate up to 54Gb/s NRZ with a BER less than the KP4 limit (2.2×10⁻⁴) with an OMA of -8.6dBm, whilst the power efficiency is 0.55pJ/bit (or 0.98pJ/bit including the output buffer). This work can be further optimized in various aspects. As indicated in Section 3, a properly designed DC-offset cancellation module is essential and can lead to considerable performance enhancement. However, we believe that the most valuable next step is to adopt flip-chip bonding between the PD and TIA. The obvious benefit is the bonding-wire induced inductance (L_b2) with the PD biasing