A 1.9 ps-rms Precision Time-to-Amplitude Converter With 782 fs LSB and 0.79%-rms DNL

Measuring time intervals in the nanosecond range has opened the way to 3-D imaging, where additional information such as object distance [light detection and ranging (LiDAR)] or lifetime decay [fluorescence-lifetime imaging (FLIM)] is added to the spatial coordinates. One of the key elements of these systems is the time measurement circuit, which encodes a time interval into a digital word. Nowadays, the most demanding applications, especially in the biological field, require time-conversion circuits with a challenging combination of performance, including sub-ps resolution, ps precision, several ns of measurement range, linearity better than a few percent of the bin width (especially when complex lifetime data caused by multiple factors have to be retrieved), and operating rates on the order of tens of Mcps. In this article, we present a time-to-amplitude converter (TAC) implemented in a 350 nm SiGe process featuring a resolution of 782 fs, a minimum timing jitter as low as 1.9 ps-rms, a differential nonlinearity (DNL) down to 0.79% LSB-rms, and a conversion rate as high as 12.3 Mcps. With an area occupation of 0.2 mm² [without PADs and digital-to-analog converter (DAC)], a full-scale range (FSR) up to 100 ns, and a power dissipation of 70 mW, the circuit is suitable to be the core element of a densely integrated, fast, and high-performance system.


I. INTRODUCTION
PICOSECOND timing of events such as single-photon detection has been subject to steadily increasing interest in recent years, leading to its wide exploitation in various applications and fields [1], [2], [3], [4], [5]. The most advanced analyses could be enabled by the combination of fast and high-precision timing systems. Recent advances in single-photon detector design have paved the way to a precision as low as a few picoseconds [6], [7], new techniques have been demonstrated to be effective in operating such detectors at tens of MHz [8], and detector technologies are mature enough to fabricate dense arrays with thousands of pixels [9], [10], [11], [12]. Nevertheless, the design of time measurement circuits able to combine high precision, high linearity, and high speed is still an open challenge. Indeed, best-in-class time-measurement circuits able to achieve picosecond timing jitter on a timescale of several nanoseconds are currently available only as commercial systems [13], [14]: made of discrete components, these solutions are bulky and have been limited to a few channels so far. Two architectures are mostly used to design timing circuits: time-to-digital converters (TDCs) and time-to-amplitude converters (TACs) followed by an analog-to-digital converter (ADC). On the one hand, TDCs can be more compact [15], [16], [17], [18], they can be implemented directly in FPGAs [19], [20], [21], and they have recently gained popularity also as part of digital phase-locked loops (PLLs) [22], [23], [24], although with requirements that differ from those of light detection and ranging (LiDAR) and biological applications [e.g., PLLs typically require sub-ns full-scale ranges (FSRs)]. On the other hand, TAC-based architectures have so far been the solution of choice in highly demanding applications, particularly in lifetime analysis, thanks to their superior linearity, which can be combined with picosecond timing precision over the whole nanosecond-scale FSR, and to their high scalability when implemented as application-specific integrated circuits (ASICs).
In this work, we present a single-channel TAC that has the same architecture presented in [25], but with every core element redesigned to reach unprecedented timing performance. We focused on a detailed noise analysis, which allowed us to completely redesign the critical blocks for jitter minimization. Preliminary results were anticipated in [26], where a former version of this chip was characterized in an unoptimized measurement setup; remarkably better performance and a detailed analysis of circuits and noise contributions are reported here for the first time. The new converter provides a timing jitter down to 1.9 ps-rms, combined with a differential nonlinearity (DNL) as low as 0.79%-rms of the LSB, a timing resolution down to 782 fs, and an FSR up to 100 ns.
The article is organized as follows: in Section II, the TAC operating principle is described; in Section III, the architectural transistor-level description of the circuit is provided; in Section IV, the extensive experimental characterization is reported; finally, conclusions are drawn in Section V.

II. OPERATING PHASES
A TAC is basically a capacitor that is charged by a constant current during a given time interval. As a result, an output voltage is produced that is proportional to the time difference between the start and stop signals

$$V_{out} = \frac{I_{conv}}{C_{conv}}\,(T_{stop} - T_{start}) + V_{out,0} \qquad (1)$$

where $I_{conv}$ is the constant conversion current, $C_{conv}$ is the conversion capacitor, $T_{stop}$ and $T_{start}$ are the times of arrival of the stop and the start signals, respectively, and $V_{out,0}$ is the initial value on the capacitor, which should ideally be zero. Clearly, to achieve high performance in terms of precision, linearity, operating rate, and stability against temperature, power supply, and potential disturbances, the actual structure of our TAC is much more complex. Before going into the details of the designed circuit, it is worth focusing on the main phases of the TAC operation. The finite state machine (FSM) of the designed converter is shown in Fig. 1. The standard operation of a TAC consists of four main phases: IDLE, CONVERSION, HOLD, and RESET. The role of the initial CALIBRATION phase will be clarified later in this section. Initially, the converter is in the IDLE phase until an Ext. Start is received. When this occurs, the CONVERSION phase begins and the conversion current starts to flow through the conversion capacitors. At this point, if an Ext. Stop is detected within the FSR limits, the TAC enters the HOLD phase, where the converted value is preserved until it is acquired by the following electronics. However, it is possible that the converter never receives an Ext. Stop signal following an Ext. Start. In this case, the conversion current would keep flowing into the conversion capacitors, causing the saturation of the stage. To avoid this issue, which could lead to a long recovery time, the TAC internal logic stops the current integration as soon as the conversion stage output exceeds its range. The occurrence of this scenario is marked by the assertion of the internal over-range (OR) signal, which brings the TAC directly into the RESET phase.
Overall, the RESET phase can either directly follow the CONVERSION phase, because of the OR signal, or follow the HOLD phase. In the second case, a Reset TAC signal is received, meaning that the converted value has been acquired by an external unit (e.g., an FPGA) and the TAC can be reset. At the end of the RESET phase, the TAC returns to the IDLE phase, where it is ready to start a new conversion. During the whole time interval between the start of a conversion and the end of the associated RESET phase, the TAC cannot accept any other start signal. For this reason, this time interval is called dead time, and it sets the maximum conversion rate of the converter.
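The phase sequence described above can be sketched as a small transition table. This is only an illustrative model of the FSM of Fig. 1: the event labels (`ext_start`, `calibration_done`, `reset_done`, and so on) are hypothetical names introduced here, and the assumption that CALIBRATION is followed by IDLE is inferred from its description as the initial phase.

```python
from enum import Enum, auto


class Phase(Enum):
    CALIBRATION = auto()
    IDLE = auto()
    CONVERSION = auto()
    HOLD = auto()
    RESET = auto()


# Event names are illustrative labels, not signal names taken from the chip.
TRANSITIONS = {
    (Phase.CALIBRATION, "calibration_done"): Phase.IDLE,  # assumed exit of the initial phase
    (Phase.IDLE, "ext_start"): Phase.CONVERSION,
    (Phase.CONVERSION, "ext_stop"): Phase.HOLD,
    (Phase.CONVERSION, "over_range"): Phase.RESET,        # no stop within the FSR
    (Phase.HOLD, "reset_tac"): Phase.RESET,               # value acquired externally
    (Phase.RESET, "reset_done"): Phase.IDLE,
}


def step(phase, event):
    """Return the next phase; events not listed for the current phase are ignored."""
    return TRANSITIONS.get((phase, event), phase)


def run(events, phase=Phase.IDLE):
    """Apply a sequence of events and return the final phase."""
    for event in events:
        phase = step(phase, event)
    return phase
```

In this model, a start arriving during CONVERSION, HOLD, or RESET leaves the phase unchanged, which mirrors the dead-time behavior described in the text.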
The analog output produced by a TAC typically requires an external ADC to ultimately convert the data into a digital word. However, the relatively poor linearity of commercial ADCs can easily impair the linearity of state-of-the-art TACs. To avoid this issue, the dithering technique can be exploited [27]. This solution has been implemented with a co-integrated digital-to-analog converter (DAC) producing analog values that are arbitrarily added to the TAC output and then digitally subtracted in post-processing. In this way, it is possible to average out the nonlinearity of the external ADC by converting multiple occurrences of the same start-stop interval in different parts of the ADC input range. This solution can be effectively applied to repetitive analyses such as time-correlated single-photon counting (TCSPC), which requires converting the same start-stop interval many times to build a histogram. In principle, the dithering could completely rely on the linearity of the DAC, providing a known conversion factor between the added analog value and the digitally subtracted one. However, the impact of DAC nonlinearity can be minimized during the CALIBRATION phase, which consists of acquiring all the possible DAC values and storing them in a lookup table that is used during the TAC operation. In this way, the main DAC nonlinearity contributions can be substantially reduced, so that they do not affect the effectiveness of the dithering.
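The averaging effect of dithering can be illustrated with a toy model. The sketch below is not the circuit described in this article: it assumes a hypothetical ADC whose bins alternate between 1.3 and 0.7 LSB, an ideal DAC whose steps are exactly one ADC LSB (so the lookup table reduces to the step index), and uniformly spaced inputs standing in for uncorrelated start-stop intervals.

```python
from collections import Counter


def adc(x):
    """Toy ADC (assumed defect): the threshold of every odd code is shifted by +0.3 LSB,
    so even codes are 1.3 LSB wide and odd codes are 0.7 LSB wide."""
    code = int(x // 1)
    if code % 2 == 1 and (x - code) < 0.3:
        code -= 1
    return code


def histogram(inputs, dac_steps):
    """Convert each input once per DAC step, subtract the step digitally, and histogram."""
    counts = Counter()
    for x in inputs:
        for k in range(dac_steps):
            counts[adc(x + k) - k] += 1
    return counts


def dnl_peak(counts, first_bin, last_bin):
    """Peak relative deviation of the bin counts from their mean over [first_bin, last_bin]."""
    values = [counts[b] for b in range(first_bin, last_bin + 1)]
    mean = sum(values) / len(values)
    return max(abs(v - mean) / mean for v in values)


# Uniformly spaced inputs between 0.05 and 99.95 LSB (10 samples per LSB).
inputs = [i / 10 + 0.05 for i in range(1000)]
no_dither = dnl_peak(histogram(inputs, 1), 10, 80)    # single conversion per input
with_dither = dnl_peak(histogram(inputs, 8), 10, 80)  # 8 DAC steps per input
print(f"peak DNL without dithering: {no_dither:.2f}, with dithering: {with_dither:.2f}")
```

Because every output bin collects counts from several different ADC bins, the alternating bin widths cancel in this idealized case. With a real DAC, the subtracted value would come from the calibrated lookup table rather than from the step index.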

III. TAC STRUCTURE
The converter architecture is depicted in Fig. 2. The time interval acquisition chain consists of three cascaded elements: the front-end logic, followed by the conversion stage and the output stage. These are co-integrated with the aforementioned DAC, which enables the dithering technique.

A. Conversion Stage
The core of the TAC is the conversion stage, where the actual conversion of a time interval into a voltage signal is performed. In this work, the conversion stage directly produces a differential signal to minimize the number of stages on the signal path, an aspect that is crucial to achieve extremely low noise. The schematic of the conversion stage is shown in Fig. 3(a). The conversion current is generated in two steps. A constant-gm stage produces an internally temperature-compensated and low-noise reference current thanks to the network constituted by the $Q_{1-2}$ BJTs and the $R_{1-3}$ resistors, combined with the double-cascaded MOSFET mirror ($M_{1-8}$), resulting in an overall positive but stable feedback.
The reference is fed to a set of controllable multibranch mirrors, which produce the conversion currents $I_{conv+}$ and $I_{conv-}$ from the nMOS and pMOS mirrors, respectively. Generating both conversion currents from the same reference strongly reduces the relative mismatch between the two output branches (a maximum mismatch of 0.8% has been observed in simulation). The designed converter features four selectable FSRs (100, 50, 25, or 12.5 ns), corresponding to four different values of the differential conversion current (100, 200, 400, or 800 µA) produced by the current mirror. $I_{conv+}$ and $I_{conv-}$ are steered by a series of switches controlled by the signals $ST/\overline{ST}$ and $SP/\overline{SP}$. These are produced by the front-end logic, which sets the switch configuration in each phase of the TAC operation.
Following the FSM of Fig. 1, in the CALIBRATION, IDLE, and RESET phases, the integration stage is in buffer configuration, i.e., the feedback switches controlled by $ST/\overline{ST}$ are all closed. In these phases, $I_{conv+}$ and $I_{conv-}$ are steered toward the integration stage, and the high aspect ratio of the feedback transistors ensures an almost zero initial value on the capacitors. During the CONVERSION phase, the feedback switches are open: in this way, the current can flow in the conversion capacitors, producing a differential voltage signal whose amplitude is proportional to time.
The differential integration occurs on two capacitors $C_{conv}$ of 10 pF each, leading to a differential output range from 0 to 2 V for every FSR, while the common mode is fixed to 1.9 V by the bandgap reference. Finally, in the HOLD phase, the capacitors are isolated by switching ($SP/\overline{SP}$) from 01 to 10 and redirecting the conversion currents to 3.3 V and gnd. Steering the conversion currents into the digital ground and the 3.3 V voltage domain avoids any abrupt current transition in the analog 5 V power supply, which could compromise the performance of the integrator while the conversion is ending.
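The relation between conversion current, capacitor value, and FSR quoted above can be checked with a few lines of arithmetic. The sketch below assumes that each 10 pF capacitor integrates the full selected current and swings by 1 V (half of the 0 to 2 V differential range); this reading is an assumption, chosen because it reproduces the four FSR values given in the text.

```python
C_CONV = 10e-12        # F, each conversion capacitor (from the text)
SWING_PER_SIDE = 1.0   # V, assumed swing of each capacitor (half of the 2 V differential range)


def tac_output(t_start, t_stop, i_conv, c_conv=C_CONV, v_out0=0.0):
    """Ideal TAC transfer function: voltage ramp proportional to the start-stop interval."""
    return i_conv / c_conv * (t_stop - t_start) + v_out0


def full_scale_range(i_conv, c_conv=C_CONV, swing=SWING_PER_SIDE):
    """Time needed to charge c_conv by `swing` volts with a constant current i_conv."""
    return c_conv * swing / i_conv


def ramp_slope(i_conv, c_conv=C_CONV):
    """Output slope in V/s during the CONVERSION phase."""
    return i_conv / c_conv


for i_ua in (100, 200, 400, 800):
    fsr = full_scale_range(i_ua * 1e-6)
    slope = ramp_slope(i_ua * 1e-6)
    print(f"{i_ua} uA -> FSR = {fsr * 1e9:.1f} ns, slope = {slope / 1e6:.0f} V/us")
```

On the shortest FSR the slope evaluates to 80 V/µs per side, which is the steepest ramp the amplifiers of the conversion stage have to follow.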
Since the amplifier of the conversion stage experiences different configurations during the various phases, we resorted to two single-ended operational transconductance amplifiers (se-OTAs) to produce a differential signal in this stage [Fig. 3(b)], thus avoiding potential instability. In particular, when entering the RESET phase, the charged conversion capacitors are suddenly shorted by the feedback switches. This transition should be as fast as possible to minimize the reset time, which directly affects the overall dead time of the converter. In an unpublished previous version of this TAC, we observed that this transition can easily cause instability of the conversion stage if a single fully differential OpAmp is used. When the charged capacitors are shorted at the beginning of the RESET phase, both input nodes of the fully differential OpAmp follow the outputs, resulting in a large signal applied to the input differential pair. The common-mode feedback (CMF) network cannot easily follow the fast reset transition. If the common-mode voltage at the end of the reset phase exceeds the input range of the CMF network, its proper functionality cannot be restored and rail-to-rail self-sustaining oscillations are triggered. Increasing the bandwidth of the CMF network could help mitigate this issue, but this approach would be demanding, as this network should be faster than the reset one. For this reason, we resorted to a different approach, based on two se-OTAs that generate a differential output.
High reliability and a strong disturbance rejection capability are required in this core stage of the TAC. To this aim, the symmetry of the designed structure has been maximized by interdigitating the two OTAs. This allowed us to reach a power supply rejection ratio (PSRR) of 120 dB, a common-mode rejection ratio (CMRR) of 90 dB, and a low differential input-referred offset (OS). Fig. 4(a)-(c) shows the simulated robustness of our architecture against process and mismatch variations. It is noteworthy that, while the OS is not an issue in a single-channel converter, a low value enables the exploitation of the calibration algorithm in a multichannel structure [30] based on this TAC. Both OTAs provide a slew rate greater than ±1 V/12.5 ns = ±80 V/µs [see Fig. 4(d)], to ensure the proper behavior of the conversion stage on the shortest FSR (12.5 ns), when the steepest output voltage rise occurs. Finally, to avoid any instability during the RESET phase while minimizing its duration, a wide gain-bandwidth product (GBWP) along with a large phase margin must be guaranteed against every process variation. In this stage, the GBWP is always greater than 270 MHz, with a minimum observed phase margin of 70°. Both values have been extracted considering the load capacitance (≃400 fF per terminal) resulting from the input terminal of the output stage, which is always connected to the conversion stage (see Fig. 2).
Fig. 4. Monte Carlo simulation on 1000 occurrences for process and mismatch variation. (a) and (b) PSRR and CMRR never lose more than 0.8% with respect to the expected values of 120 and 90 dB, respectively. (c) SR is always far greater than the value needed for the 12.5 ns FSR. (d) Differential OS is within ±6 mV and can be calibrated in a multichannel structure.

The integration process occurring in the conversion stage significantly contributes to the overall jitter of the converter.
Charging a capacitance $C_{conv}$ with a constant current for a time interval $T$ acts as a gated integrator. The rms noise $\sigma_{V,n}$ superimposed on the signal at the end of the conversion, expressed by (2), depends on three terms, referring to Fig. 3(a): $S_{n,v}$ is the voltage-noise power spectral density (PSD) deriving from the bandgap reference, while $S^{+}_{n,i}$ and $S^{-}_{n,i}$ are the current-noise PSDs associated with $I_{conv+}$ and $I_{conv-}$, respectively. In our design, $S_{n,v}$ does not add any significant noise contribution during conversion thanks to the high CMRR, because any variation of the reference node results in a common-mode disturbance, which is rejected. Moreover, a decoupling capacitance connected to the voltage reference acts as a low-pass filter, which further decreases the noise. Thus, $S^{+}_{n,i}$ and $S^{-}_{n,i}$ are the main contributions to the integration noise.
The $S^{+}_{n,i}$ PSD can be written as the sum of its dominant contributions: $S^{M_P}_{n,i}$ is the PSD deriving from the output transistors connected to the constant-gm stage, $S^{M_{1,2,7,8}}_{n,i}$ is the PSD of each transistor in the bandgap reference, $S^{R_B}_{n,i}$ is the PSD of $R_B$, and $\beta$ is the mirror factor, which changes according to the selected FSR. This expression can be simplified by keeping only its main contribution, i.e., the output MOS transistor that generates the conversion current, thus obtaining

$$S^{+}_{n,i} \simeq S^{M_P}_{n,i}$$

Similar considerations hold for the $S^{-}_{n,i}$ PSD, which can be approximated as $S^{-}_{n,i} \simeq S^{M_N}_{n,i}$. Overall, the differential PSD $S_{n,i}$ is constituted by the wideband-noise ($S_w$) and $1/f$-noise ($S_{1/f}$) terms of the output transistors. These terms depend on the coefficients $K_w$ and $K_{1/f}$, which are set by the process and the operating temperature, and on the conversion current and the transistor sizing, which can be chosen by the designer. Substituting these terms in (2), the integrated noise can be computed, with $\varepsilon$ being the lower bandwidth limit, equal to the inverse of the experiment duration. A full explanation of how this result is obtained can be found in the Appendix. The integration noise can be expressed as a temporal jitter by multiplying it by the conversion coefficient ($C_{conv}/I_{conv}$). It is worth noticing that, if the white noise contribution is dominant, the precision is proportional to [...]

The area of the current generator is 0.027 mm², corresponding to 53% of the overall area of the conversion stage, while the maximum power dissipation is 8 mW, corresponding to 11% of the whole converter budget. The large dimensions of both the current generator and the conversion capacitors also have a beneficial impact in reducing the effect of process variations on the very core of the circuit [28]. In this way, we achieved an excellent correspondence between simulated and measured performance, as will be underlined in Section IV.

B. Output Stage
The analog signal produced by the conversion stage must be fed to an external ADC to produce digital data that can be handled by an external processing unit. To this aim, we used a commercial ADC (LTM9011-14 by Analog Devices), which requires a differential input between ±1 V with a common mode of 0.9 V. Since the conversion stage produces a differential signal between 0 and 2 V, with a common mode of 1.9 V, a second stage is used to shift the TAC output within the input range of the chosen ADC. Moreover, this stage sums the differential signal coming from the DAC to the converted values, thus enabling the dithering technique.
A simplified schematic of the internal structure of the output stage is shown in Fig. 5(a). The two se-OTAs in this stage exploit the same architecture depicted in Fig. 3(b). The sole difference is the adoption of a class-AB OTA output stage instead of a class-A one. This is due to the large current that the output stage is required to supply to drive the ADC input capacitance. Moreover, while instability for abrupt transitions is not an issue here, this structure has been preferred to a fully differential one because it provides a high-impedance load to the conversion stage: since it does not draw any current under any operating condition, it benefits the linearity of the converter.
Considering initially $I_{DAC+} = I_{DAC-} = 0$ in Fig. 5(a) for simplicity, the differential and common-mode outputs of the output stage, $V_{out,diff}$ and $V_{out,cm}$, are determined by the differential and common-mode contributions of the conversion stage, by the downshifting current $I_{shift}$, and by the current $I_{cmf}$ provided by the CMF network shown in Fig. 5(b). To shift both the differential and common-mode dynamics down by 1 V, it is necessary to have $I_{shift} = 2 \cdot I_{cmf}$.
In this stage, one of the main contributions to the differential noise is given by $I_{shift}$. To limit its contribution, a filtering capacitor $C_F$ has been added in parallel to $R_F + R_{dith}$, providing a low-pass filtering action that does not affect the signal and thus its settling time. With $I_{shift}$ = 400 µA, $R_F + R_{dith}$ = 2.5 kΩ, and $C_F$ = 2 pF, we obtained an overall noise of the output stage as low as 166 µV-rms, corresponding to less than 1.1 ps-rms on the 12.5 ns FSR. To increase the temperature robustness of this stage, the temperature dependence of the $I_{shift}$ current has been compensated with an inverse dependence of the feedback resistors ($R_F + R_{dith}$) used to produce the downshifting voltage.
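The figures quoted for the output stage can be cross-checked numerically. The sketch below converts the output voltage noise into time using the slope of the 12.5 ns FSR over the 2 V differential range, and evaluates the downshifting voltage. The filter corner frequency is computed under the assumption that $C_F$ and $R_F + R_{dith}$ form a single-pole RC network; this value is not stated in the text.

```python
import math

FSR = 12.5e-9        # s, shortest full-scale range
V_DIFF_RANGE = 2.0   # V, differential output range
SIGMA_V = 166e-6     # V rms, output-stage noise quoted in the text
I_SHIFT = 400e-6     # A, downshifting current
R_TOTAL = 2.5e3      # ohm, R_F + R_dith
C_F = 2e-12          # F, filtering capacitor


def voltage_noise_to_jitter(sigma_v, fsr=FSR, v_range=V_DIFF_RANGE):
    """Refer a voltage noise to time through the conversion slope v_range / fsr."""
    return sigma_v * fsr / v_range


jitter = voltage_noise_to_jitter(SIGMA_V)       # about 1.04 ps rms
downshift = I_SHIFT * R_TOTAL                   # 1 V downshift
f_corner = 1 / (2 * math.pi * R_TOTAL * C_F)    # about 32 MHz (single-pole assumption)

print(f"jitter = {jitter * 1e12:.2f} ps rms")
print(f"downshift = {downshift:.2f} V")
print(f"filter corner = {f_corner / 1e6:.1f} MHz")
```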
A switch in series with $I_{shift}$ allows us to steer this current toward ground, thus removing its contribution from both the differential and the common-mode output. While the common mode is preserved by the CMF network, the differential output is significantly changed by this choice. This change is necessary to properly handle the DAC signal during the CALIBRATION phase. Before going into more detail, the overall effect of the DAC on this stage must be analyzed. Referring again to Fig. 5(a), the contribution of the DAC current signal to $V_{out,diff}$ is equal to $R_{dith} \cdot (I_{DAC+} - I_{DAC-})$. With a differential current up to 1 mA and $R_{dith}$ = 125 Ω, the DAC contribution is ±62.5 mV, corresponding to 1/16 of the differential output range. Such a contribution has been proven to be a fair trade-off between having a wide useful time range and significantly improving the linearity of the system [25]. The DAC common-mode contribution is easily compensated by the CMF current network. Hence, in the CALIBRATION phase, $V_{out,diff}$ = 0 V must be ensured to acquire the DAC signal alone and to build the lookup table for the dithering, as explained in Section II. If the downshifting current were not removed, half of the DAC signal would not be properly acquired, as it would fall outside the input range of the ADC. To avoid this issue, we used a switch to remove the $I_{shift}$ contribution, ensuring a proper acquisition of the whole DAC signal, placed at the center of the ADC input range.

C. Front-End Logic
The front-end logic (Fig. 6) is responsible both for converting the external start and stop low-voltage differential signals (LVDSs) into differential rail-to-rail digital signals, and for regulating the proper behavior of the TAC by producing the internal control signals.
The generation of $ST/\overline{ST}$ and $SP/\overline{SP}$ is based on a conversion chain equal to the one reported in [25]. Two input comparators exploit a bipolar transistor input pair to minimize the jitter contribution of this stage. Then, a flip-flop (FF) on each path is used to ensure that a stop signal is accepted by the converter only after a start signal has been received. This is an important feature, especially when the converter is used in reverse start-stop mode in TCSPC. In this scenario, the stop signal coincides with the periodic excitation laser, while a start signal occurs only when a photon is detected. Since the probability of detecting a photon in a classic pile-up-limited TCSPC experiment [29] is kept well below unity, several stops reach the converter before a start occurs, and they must all be discarded.
At the arrival of the stop signal, the front-end logic produces a strobe signal fed to the external electronics as a flag indicating the occurrence of a valid conversion. Indeed, the external ADC can operate in free running mode and with the strobe signal it is possible to select the ADC output actually carrying the converted value.
Once the output voltage has been successfully acquired, the Reset TAC signal triggers the RESET phase which forces a reset on both the FFs in the start and stop paths. The pulser block guarantees a minimum duration of ≃21 ns for the RESET phase, corresponding to the minimum time needed to reset the capacitors. During this phase, the converter is insensitive to any start or stop signal.
Finally, it can happen that a stop signal never occurs after a triggering start, resulting in a measurement exceeding the selected FSR. In this case, the OR detector circuit connected to the analog signal TAC+ identifies when the voltage exceeds the $V_{th,high}$ value and forces a reset of the converter. The hysteresis in the OR detector uses another threshold value, $V_{th,low}$, to maintain a high value at the output of this block for a time interval long enough to ensure a full reset of the TAC.
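The behavior of the OR detector with its two thresholds can be sketched as a comparator with hysteresis. The model below is purely behavioral, and the threshold values and the waveform are hypothetical numbers chosen for illustration, not values from the designed circuit.

```python
class OverRangeDetector:
    """Comparator with hysteresis: asserted above v_th_high, released below v_th_low."""

    def __init__(self, v_th_high, v_th_low):
        if v_th_low >= v_th_high:
            raise ValueError("v_th_low must be below v_th_high")
        self.v_th_high = v_th_high
        self.v_th_low = v_th_low
        self.over_range = False

    def update(self, v_tac):
        """Update the OR flag with a new sample of the monitored voltage."""
        if v_tac > self.v_th_high:
            self.over_range = True
        elif v_tac < self.v_th_low:
            self.over_range = False
        return self.over_range


# Hypothetical thresholds and waveform (volts, relative to the reset level):
# the ramp rises during CONVERSION and is discharged during RESET.
detector = OverRangeDetector(v_th_high=1.05, v_th_low=0.05)
ramp = [0.0, 0.5, 1.0, 1.1, 0.8, 0.3, 0.1, 0.02]
flags = [detector.update(v) for v in ramp]
print(flags)
```

The flag stays asserted while the capacitor is being discharged and is released only once the voltage has fallen below the lower threshold, which is the mechanism the text relies on to guarantee a full reset.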

D. Dithering DAC
The many advantages of using a TAC to measure a time interval with picosecond precision come along with the need of an external ADC to produce a digital output. This could seriously impair the linearity of the system, but the dithering technique is a powerful tool to minimize this issue. To this aim, we integrated the DAC shown in Fig. 7, whose output is summed to the converted value by the output stage and then digitally subtracted in post-processing. The DAC features a 10-bit current-steering segmented architecture consisting of a 7-bit binary-weighted DAC for the least significant bits (LSBs) and a 3-bit thermometer-coded DAC for the most significant bits (MSBs). The hybrid structure has allowed us to combine low-power dissipation (7 mW), low current noise (350 nA-rms), and a good linearity (≃0.15 LSB peak-to-peak). During the CALIBRATION phase, a lookup table of the DAC values is built, thus making its residual nonlinearity totally negligible.
The DAC operation is controlled by means of the Reset DAC and Clock DAC input signals. The former sets the initial condition of the converter, while the latter is used to increment the DAC output by one step, thus producing a differential staircase current output. The two currents ($I_{DAC+}$ and $I_{DAC-}$) flow into the feedback resistors $R_{dith}$ of the output stage [see Fig. 5(a)], giving the target contribution of ±62.5 mV. According to post-layout simulations, the DAC output following a Clock DAC signal reaches its final value (with an error <0.5 LSB) within 10 ns, which is well below the time needed to produce a valid TAC output.
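A minimal sketch of how a 10-bit code splits between the two segments of such a DAC is given below. The decoding is the textbook one for a segmented converter (seven equally weighted thermometer elements for the 3 MSBs, binary weights for the 7 LSBs) and is not taken from the actual implementation. The step size assumes that the 1024 codes evenly span the 125 mV differential window, which is an assumption.

```python
LSB_BITS = 7   # binary-weighted segment
MSB_BITS = 3   # thermometer-coded segment
R_DITH = 125.0 # ohm, feedback resistor carrying the DAC current


def segmented_dac(code, lsb_bits=LSB_BITS, msb_bits=MSB_BITS):
    """Return (thermometer_lines, binary_bits, total_in_lsb) for a segmented DAC code."""
    n_bits = lsb_bits + msb_bits
    if not 0 <= code < 2 ** n_bits:
        raise ValueError("code out of range")
    msb = code >> lsb_bits
    lsb = code & ((1 << lsb_bits) - 1)
    # Thermometer segment: the first `msb` unit elements are on, each worth 2**lsb_bits LSBs.
    thermometer = [1 if i < msb else 0 for i in range(2 ** msb_bits - 1)]
    # Binary segment: bit b is worth 2**b LSBs.
    binary = [(lsb >> b) & 1 for b in range(lsb_bits)]
    total = sum(thermometer) * 2 ** lsb_bits + sum(bit << b for b, bit in enumerate(binary))
    return thermometer, binary, total


# Differential window produced on R_dith by a differential current spanning +/-0.5 mA.
window_v = 2 * R_DITH * 0.5e-3              # 0.125 V, i.e. +/-62.5 mV
dac_step_v = window_v / 2 ** (LSB_BITS + MSB_BITS)   # assumed uniform step, about 122 uV

print(segmented_dac(300))
print(f"DAC step = {dac_step_v * 1e6:.1f} uV")
```

Under the same assumption, one DAC step is close to one LSB of a 14-bit ADC over a 2 V range (2 V / 16384 ≈ 122 µV).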

IV. EXPERIMENTAL RESULTS
The designed converter has been fabricated in a 350 nm SiGe technology and extensively tested. In Fig. 8, the IC micrograph is shown with the main blocks highlighted. The chip area is 0.33 mm² without PADs, with only 0.2 mm² due to the TAC and 0.13 mm² due to the DAC, which can be easily shared in a future multichannel version of the chip. The experimental characterization of the chip has been carried out at room temperature using a custom PCB, which provides a temperature-controlled environment of 45 °C for the chip.
The block diagram of the measurement setup is shown in Fig. 9. As will be detailed in Sections IV-B and IV-C, the start signal can come either from a pulse generator, after having been split to generate the stop signal, or from a photon detector module (PDM module from MPD) generating random signals. The origin of the start signal changes according to the performance to be characterized. After conversion, the FPGA (Kintex-7 from Xilinx) manages the acquisition of the converted value from the ADC and sends the control signals (e.g., Reset TAC) to the converter. Finally, the histogram data are saved in an on-board memory and downloaded by a PC via a USB 3.0 link. To avoid a redundant discussion, only the measurements corresponding to the FSR of 12.5 ns will be treated, but all the considerations can be extended to the other FSRs.

A. Timing Resolution
The resolution of the circuit corresponds to the width of its time bins. In principle, for a TAC it is equal to the nominal FSR divided by $2^n$, where $n$ is the number of bits of the external ADC. In our case, with a 14-bit ADC, the theoretical timing resolution is 12.5 ns/$2^{14}$ = 763 fs. However, due to process variation and other nonidealities, the measured timing resolution can be different. Starting from the minimum measurable start-stop time interval, we linearly increased a passive delay on the stop path by fixed steps of duration $D_s$. To exclude jitter from the resolution computation, several conversions are made for the same start-stop time interval, which leads to a Gaussian distribution for each interval, peaked at the converted value (Fig. 10). The standard deviation of every peak corresponds to the rms jitter for that start-stop time interval and will be discussed in the next section. The actual converter resolution is then computed as in (10), where $N_p$ is the number of detected peaks, $D_k$ is the introduced delay, ideally equal to $D_s$, and $N_{LSB,k}$ is the number of LSBs between two consecutive peaks. Computing the resolution in this way allows us to average out any nonuniformity of the passive delayer. The measured converter resolution is equal to 782 fs.
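The two resolution figures discussed above can be illustrated with a short sketch. The theoretical value follows directly from the FSR and the ADC bit count. For the measured value, the code averages the ratio between each delay step and the corresponding peak spacing in LSBs; this is only one plausible reading of the averaging procedure, and the delay steps and peak positions used here are synthetic numbers, not measurement data.

```python
def theoretical_lsb(fsr, adc_bits):
    """Nominal bin width: full-scale range divided by the number of ADC codes."""
    return fsr / 2 ** adc_bits


def measured_lsb(delays, peak_codes):
    """Average of D_k / N_LSB,k over consecutive peaks (assumed form of the average)."""
    ratios = []
    for d_k, (c0, c1) in zip(delays, zip(peak_codes, peak_codes[1:])):
        ratios.append(d_k / (c1 - c0))
    return sum(ratios) / len(ratios)


# Synthetic example: three nominal 1 ns delay steps and four hypothetical peak positions.
delays = [1e-9, 1e-9, 1e-9]
peak_codes = [100, 1379, 2658, 3936]

print(f"theoretical LSB = {theoretical_lsb(12.5e-9, 14) * 1e15:.0f} fs")
print(f"measured LSB    = {measured_lsb(delays, peak_codes) * 1e15:.1f} fs")
```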

B. Timing Precision
The timing precision measures the ability of the TAC to always provide the same output voltage in response to a given start-stop interval. The precision has been measured as follows: the output of a pulse generator is fed to an active splitter to generate both the start and the stop signal, to avoid any contribution of the generator to the overall measured jitter. Then, the adoption of the passive delayer on the stop path allowed us to evaluate the jitter as a function of the measured time interval duration.
It has been shown (Section III) that the timing jitter of the converter depends both on constant contributions (e.g., output stage noise) and on contributions that depend on the duration of the start-stop time interval (integration noise). Since the latter is the dominant factor in our design, we expect to measure a rising trend with time. In Fig. 11, the experimental rms timing jitter as a function of the start-stop interval is shown. For all curves, a rising trend can be observed, confirming the effectiveness of our noise analysis for both short and close-to-FSR time intervals.
In nominal operating conditions, i.e., by acquiring a single sample at the TAC output after receiving the strobe signal, a jitter as low as 2.6 ps-rms is observed, while a maximum of 3.4 ps-rms appears for long time intervals. Despite the quality of this result, the overall jitter can be further improved. Indeed, it is possible to reduce the impact of uncorrelated output noise by acquiring multiple samples of the same converted value and averaging them in post-processing. In Fig. 11, the jitter reduction obtained using two or eight samples is depicted. In the latter case, we were able to decrease the timing jitter to less than 2.9 ps-rms on the whole FSR. With a short time interval, a jitter as low as 1.9 ps-rms has been measured, the lowest reached to date on a wide FSR with an integrated time converter.
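The effect of averaging can be sketched with a simple two-term noise model, in which the noise stored on the capacitors is identical for all samples of the same conversion, while the output noise is independent from sample to sample and is therefore reduced by the square root of the number of samples. The model and the two input values below are illustrative placeholders, chosen only so that the results fall in the range reported above; they are not extracted noise contributions.

```python
import math


def averaged_jitter(sigma_corr, sigma_uncorr, n_samples):
    """Quadrature sum of a sample-independent term and a term reduced by averaging."""
    if n_samples < 1:
        raise ValueError("n_samples must be at least 1")
    return math.sqrt(sigma_corr ** 2 + sigma_uncorr ** 2 / n_samples)


SIGMA_CORR_PS = 1.8     # placeholder: noise frozen on the capacitors (ps rms)
SIGMA_UNCORR_PS = 1.9   # placeholder: per-sample output and acquisition noise (ps rms)

for n in (1, 2, 8):
    print(f"{n} sample(s): {averaged_jitter(SIGMA_CORR_PS, SIGMA_UNCORR_PS, n):.2f} ps rms")
```

The model also makes the limit of the technique explicit: for a large number of samples, the jitter tends to the correlated term, which averaging cannot remove.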

C. DNL and INL
Variations in the silicon process or in the layout create static differences and mismatches in the circuit, translating into a nonhomogeneous resolution along the bins of the discrete time-interval axis. Moreover, the linearity of the acquisition chain can be easily impaired by the nonlinearities of the following ADC. The DNL describes these deviations of the quantization steps from the ideal value of the average resolution computed in (10).
To measure the converter DNL, we used two uncorrelated signals as inputs, i.e., a pulse generator as stop and a single-photon detection module (PDM by Micro Photon Devices) as start. The photodetector within the PDM module is kept in a dark environment, thus producing an output only due to its dark count noise, which is uncorrelated to any other source. The response of an ideal system to such a stimulus would be a flat histogram. Thus, any deviation from such an output can be ascribed to a nonlinearity of the system.

TABLE I. PERFORMANCE SUMMARY AND COMPARISON TO THE STATE-OF-THE-ART
In Fig. 12, the DNL of the designed circuit is shown. The dithering technique substantially reduces the impact of the ADC on the DNL of the whole system. Using the dithering technique, the converter provides a DNL as low as 0.79% LSB-rms on the whole FSR. The peak-to-peak DNL is equal to 4% of the LSB, mainly due to the oscillating behavior for short time intervals; otherwise, it is <1.5% LSB peak-to-peak over most of the FSR.
The integral nonlinearity (INL) quantifies the discrepancy between the ideal converter characteristic and the measured one for every bin. It results from the accumulation of the resolution errors (DNL), because the sum of the individual mismatched bins bends the actual characteristic away from the ideal one. Fig. 13 shows the INL curves with and without the dithering signal applied. The INL also benefits from the dithering technique: a peak-to-peak INL below 2.5 LSB is observed, with a value of 1.12 LSB-rms.
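The accumulation that links DNL and INL can be sketched as a running sum. This is a minimal example under the assumption that the INL at a given bin is the sum of the DNL of all bins up to and including it; the function name and the input values are hypothetical, and other conventions (for instance, removing a best-fit line first) are also in use.

```python
from itertools import accumulate


def inl_from_dnl(dnl):
    """INL per bin, in LSB, as the cumulative sum of the per-bin DNL."""
    return list(accumulate(dnl))


# Hypothetical DNL values in LSB, for illustration only.
dnl = [0.01, -0.02, 0.01]
inl = inl_from_dnl(dnl)
print("INL (LSB):", inl)
print("INL peak-to-peak (LSB):", max(inl) - min(inl))
```

When the DNL is referred to the mean bin width, its sum over all bins is zero, so the INL computed in this way returns to zero at the last bin.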

D. Conversion Frequency
The conversion mechanism needs a minimum amount of time to perform the following operations: capacitor charging in the conversion phase ($T_{\text{st-sp}}$), complete settling of the output signals ($T_{\text{settling}}$), acquisition of the converted value ($T_{\text{acq}}$), and reset of the conversion capacitors ($T_{\text{reset}}$). The maximum operation rate of a single converter is upper limited by the inverse of the sum of all these contributions. Without the acquisition time, the overall maximum dead time of the converter would be $T_{\text{st-sp,max}}$ (FSR) + $T_{\text{settling,max}}$ (38.7 ns) + $T_{\text{reset}}$ (21 ns) = 73 ns, which corresponds to a maximum rate of 13.7 Mcps. In addition, some time is needed by the FPGA to sample the output signal: if we consider a single sample of the ADC, which has a conversion time of 8 ns, the overall dead time would be 81 ns, corresponding to a conversion rate as high as 12.3 Mcps. In our measurements, we verified that the settling time of the output is compatible with the simulated values. Moreover, the conversion rate can be further improved thanks to the Fast-TAC architecture proposed by Peronio et al. [30], potentially allowing a speedup of up to one order of magnitude.
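The rate figures above follow directly from the dead-time budget, as the short check below shows. It uses only the totals quoted in the text (73 ns before acquisition and 8 ns for one ADC sample); the function and variable names are illustrative.

```python
def conversion_rate_mcps(dead_time_ns):
    """Maximum conversion rate, in Mcps, for a given dead time in ns."""
    if dead_time_ns <= 0:
        raise ValueError("dead time must be positive")
    # 1 / (1 ns) = 1000 Mcps, hence the factor 1e3.
    return 1e3 / dead_time_ns


# Values quoted in the text.
DEAD_TIME_NO_ACQ_NS = 73.0  # conversion + settling + reset
ADC_CONVERSION_NS = 8.0     # single ADC sample

rate_no_acq = conversion_rate_mcps(DEAD_TIME_NO_ACQ_NS)
rate_with_acq = conversion_rate_mcps(DEAD_TIME_NO_ACQ_NS + ADC_CONVERSION_NS)
print(f"without acquisition: {rate_no_acq:.1f} Mcps")
print(f"with one ADC sample: {rate_with_acq:.1f} Mcps")
```

Each additional ADC sample used for averaging adds its conversion time to the dead time, so the jitter reduction obtained by multi-sample averaging comes at the cost of a lower maximum rate.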

V. CONCLUSION
We reported the extensive description and characterization of a new TAC fabricated in a 350 nm SiGe process, together with the comprehensive noise analysis that allowed us to design all crucial elements of the circuit. This work is intended to be one of the key elements for a new class of fast-TCSPC acquisition systems based on the results presented in [8] and [29], which require combining a conversion speed in the 10 Mcps range with sub-picosecond resolution on an FSR of tens of ns, picosecond precision, and high linearity. The performance of our work is summarized in Table I and compared with the state-of-the-art integrated timing circuits presented in the literature [16], [18], [25], [31], [32], [33], [34]. To the best of our knowledge, this work achieves the lowest jitter on a wide FSR, and it is competitive even against another TAC [34] designed for PLLs (with a range far shorter than that of the other works). At the same time, a close-to-ideal DNL of 0.79% LSB-rms is reached, i.e., one order of magnitude lower than state-of-the-art TDCs, with a measured resolution (LSB) as low as 782 fs.

APPENDIX
The integrated noise can be derived by considering the wideband and $1/f$ PSDs of (5) filtered by a gated integrator with an integration window equal to $T$. Let us analyze each component separately. The wideband noise transferred to the output is given by (12), which leads to an expression of $\sigma_{V_n,w}$ that depends on $\sqrt{T}$. The output noise due to the $1/f$ PSD, instead, diverges for $f \to 0$, which corresponds to an infinite observation time for our experiment. We shall therefore consider the lower bandwidth limit $\varepsilon$ set by the finite observation time of the experiment ($T_{\text{obs}}$), resulting in $\varepsilon \simeq 1/T_{\text{obs}}$. Thus, the $1/f$ output noise takes the form of (13), where $\mathrm{Ci}(\cdot)$ is the cosine-integral function, which can be expressed as follows:
$$\mathrm{Ci}(x) = -\int_{x}^{+\infty} \frac{\cos z}{z}\,dz = \gamma + \log x + \int_{0}^{x} \frac{\cos z - 1}{z}\,dz \tag{14}$$
where $\gamma$ is the Euler-Mascheroni constant. If we consider a long enough observation time, i.e., several different time-interval acquisitions, it is possible to consider $\varepsilon \ll 1/(2\pi T)$ and, hence, $x \to 0$. In these conditions, the expression in (15) is obtained for the $1/f$ noise component. Finally, the overall integration noise can be expressed by summing the quadratic contributions of (12) and (15), which results in the expression derived in (6).
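The small-argument limit used in the last step can be checked numerically. The sketch below evaluates the cosine integral through its standard power series, $\mathrm{Ci}(x) = \gamma + \log x + \sum_{k \ge 1} (-1)^k x^{2k} / (2k\,(2k)!)$, and compares it with the approximation $\gamma + \log x$. The function names are illustrative, and the truncated series is meant for small or moderate $x$ only.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant


def cosine_integral(x, terms=20):
    """Ci(x) for x > 0 from its power series (suitable for small/moderate x)."""
    if x <= 0:
        raise ValueError("x must be positive")
    total = EULER_GAMMA + math.log(x)
    for k in range(1, terms + 1):
        total += (-1) ** k * x ** (2 * k) / (2 * k * math.factorial(2 * k))
    return total


def cosine_integral_small_x(x):
    """Leading-order approximation of Ci(x) for x -> 0."""
    return EULER_GAMMA + math.log(x)


for x in (1e-1, 1e-2, 1e-3):
    error = cosine_integral(x) - cosine_integral_small_x(x)
    print(f"x = {x:g}: Ci(x) - (gamma + log x) = {error:.3e}")
```

The residual is dominated by the first series term, $-x^2/4$, which is why the integral term of (14) can be dropped when $x \to 0$.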