A multichannel time-to-digital converter ASIC with better than 3 ps RMS time resolution

A new multichannel, fine-time resolution time-to-digital converter (TDC) ASIC is currently under development at CERN. A prototype TDC has been designed, fabricated and successfully verified, with demonstrated time resolutions of better than 3 ps-rms. Least-significant-bit (LSB) sizes as small as 5 ps with a differential non-linearity (DNL) of better than ±0.9 LSB and an integral non-linearity (INL) of better than ±1.3 LSB have been achieved. The contribution describes the implemented architecture and presents measurement results of a prototype ASIC implemented in a commercial 130 nm technology.


Introduction
In high energy physics, time measurements play a crucial role in particle tracking and identification. This usually requires the implementation of hundreds or thousands of measurement channels. Recently, fine-time resolution measurements in the sub-10 ps domain have received increasing attention in the high energy physics community. Novel sensor designs have emerged that reach the sub-10 ps resolution domain [1]. To fully exploit the potential of such sensors, time measurements with ps-level resolution have become necessary. To allow a large range of different experiments and applications to profit from such a development, a high degree of flexibility with respect to time resolution, power consumption, readout interface and dynamic range is required.
Today, device mismatches as well as jitter due to thermal noise and power supply disturbances have become the limiting factors in state-of-the-art TDC designs. To overcome these limitations, fast signal edges are required to limit the effect of thermal noise and power supply disturbances, and calibration techniques on a per-bin basis are required to reduce the effect of device mismatches. In a multichannel system in particular, calibration on a per-bin basis can represent a considerable design overhead and a time-intensive task during production. A multipurpose TDC ASIC offering fine-time resolution measurement capabilities at a reduced calibration effort would therefore be of high value to the community.
In this contribution, a fine-time resolution TDC is presented that is matched to the requirements of upcoming experiments and applications needing a large number of channels integrated on a single ASIC. The proposed architecture follows a global timing generator approach where only one instance of the timing generator is implemented per ASIC. To compensate for device mismatches, a calibration mechanism at the level of the global interpolator is presented, which avoids the need to calibrate each channel separately. To reduce the effect of thermal noise as well as power supply disturbances, short propagation delays and fast signal edges are sustained throughout the complete design.

Architecture
The proposed architecture, as shown in figure 1, employs a multistage concept using a DLL in the 1st stage and a passive interpolation concept [2] in the 2nd stage. A reference clock, nominally running at 1.5625 GHz, serves as the time reference of the TDC. The DLL of the interpolator employs 32 delay elements to generate bin sizes as small as 20 ps, a value limited by the gate delay of the technology. To overcome this intrinsic technology limitation, a passive interpolation further divides the bin size by a factor of four, yielding LSB sizes as small as 5 ps. Both the first and the second stage are fully self-tracking and defined only by the reference clock period. Process, voltage and temperature variations are compensated by the loop down to the finest resolution level of the TDC. The auto-tracking behavior of the loop allows the end user to trade off resolution against power consumption solely by adjusting the reference clock frequency. To extend the dynamic range of the DLL, a counter is added to the design. Depending on the detailed implementation of the channels, a counter synchronization mechanism might be necessary to prevent an invalid code from being latched by the time capture registers (TCRs). This issue is well known in the literature [3] and will not be discussed any further in this contribution.
The generated time code (interpolator + counter) is then distributed to the respective channels of the TDC, here referred to as the channel matrix. To sustain fast signal edges across the channel matrix, several channels are grouped into segments and served by so-called distribution buffers. A per-bin adjustment feature is implemented inside the distribution buffers to calibrate out device mismatch introduced by the interpolator structure as well as by the distribution buffers themselves. No local interpolation is required on a channel-by-channel basis, which avoids the need to calibrate each channel separately. However, in this case the time capture registers must not contribute significantly to the device mismatches of the TDC.
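The relation between reference clock and LSB size described above can be illustrated with a short calculation. This is a minimal sketch: the function and constant names are illustrative and not taken from the design, and the only inputs are the 32 DLL elements, the interpolation factor of four and the two reference clock frequencies quoted in this contribution.

```python
DLL_ELEMENTS = 32          # delay elements in the first-stage DLL (from the text)
INTERPOLATION_FACTOR = 4   # the passive stage divides each DLL bin by four


def lsb_size_ps(f_ref_hz: float) -> float:
    """Nominal LSB size in picoseconds for a given reference clock frequency.

    Both stages track the reference clock, so the LSB is simply the clock
    period divided by the total number of fine-time bins (32 * 4 = 128).
    """
    period_ps = 1e12 / f_ref_hz
    return period_ps / (DLL_ELEMENTS * INTERPOLATION_FACTOR)


if __name__ == "__main__":
    for f_ref in (1.5625e9, 781.25e6):
        print(f"{f_ref / 1e6:8.2f} MHz -> {lsb_size_ps(f_ref):.2f} ps LSB")
```

Halving the reference clock frequency doubles the LSB size, which is the resolution versus power trade-off the self-tracking loop makes available.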

Fine-time interpolator
The fine-time interpolator (figure 2) is based on a DLL employing a modified version of the Maneatis delay cell [4] to achieve short propagation delays at an early interpolation stage. A detailed description of the employed cell can be found in [5]. Under nominal conditions, delays as small as 16 ps have been achieved in a commercial 130 nm technology when the design is operated at a supply voltage of 1.2 V. To decrease the impact of thermal noise and supply noise disturbances, the DLL is operated at high reference clock frequencies. This forces steep signal edges to be propagated across the loop and decreases the extent to which device mismatches are converted into timing errors. Slow voltage and temperature variations are compensated by the loop, and this compensation is propagated to the finest resolution level of the TDC. To keep the influence of the charge pump noise current low and to suppress disturbances introduced on the control voltage, the loop is designed with low bandwidth. However, a clean reference clock needs to be provided so as not to degrade the precision of the TDC. Whereas the loop is built fully differentially to increase power supply robustness, the fine-time generation and distribution are single-ended to reduce power consumption. This is a feasible approach, as the propagation delay introduced by the fine-time generation and distribution network is small compared to the propagation delay introduced by the loop.
To further divide the generated DLL bins by a factor of four, a passive interpolation circuit is employed in a succeeding stage of the interpolator. A resistor ladder is connected between the outputs of the DLL, forming a voltage divider circuit. In total, a set of 128 signals is generated by the interpolator, here referred to as the fine-time code. To compensate for the capacitive loading of the passive interpolator structure, a non-linearly scaled resistive divider is employed to achieve uniform bin sizes. This passive interpolation circuit not only yields low power consumption but also has a positive effect on device mismatches, as several transitions of the interpolator act on a single transition, effectively averaging device mismatches across several bins. At the output of the interpolator, a buffering stage is employed to sustain steep signal edges. From Monte Carlo simulations, the interpolator's timing uncertainty across all 128 bins has been estimated to be 1.5 ps-rms when operated with 5 ps LSB sizes.
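The principle of passive interpolation can be sketched numerically. The model below is an idealisation and not the implemented circuit: it assumes perfectly linear edges, an unloaded and linearly scaled divider, and a rise time of 60 ps, all of which are assumptions made for illustration. The real design uses a non-linearly scaled ladder precisely because capacitive loading breaks these assumptions.

```python
def ramp(t_ps: float, start_ps: float, rise_ps: float) -> float:
    """Ideal linear edge from 0 to 1 that starts at start_ps and lasts rise_ps."""
    return min(1.0, max(0.0, (t_ps - start_ps) / rise_ps))


def crossing_time_ps(weight: float, delta_ps: float = 20.0,
                     rise_ps: float = 60.0, threshold: float = 0.5) -> float:
    """Time at which the tap voltage (1 - weight) * A + weight * B crosses threshold.

    A is the earlier DLL output and B the next one, delayed by delta_ps.
    rise_ps is a hypothetical edge duration chosen so that both edges overlap.
    """
    def tap(t_ps: float) -> float:
        return ((1.0 - weight) * ramp(t_ps, 0.0, rise_ps)
                + weight * ramp(t_ps, delta_ps, rise_ps))

    lo, hi = 0.0, rise_ps + delta_ps  # tap(lo) = 0 and tap(hi) = 1
    for _ in range(60):               # bisection on a monotonic waveform
        mid = 0.5 * (lo + hi)
        if tap(mid) < threshold:
            lo = mid
        else:
            hi = mid
    return hi


if __name__ == "__main__":
    taps = [crossing_time_ps(k / 4) for k in range(4)]
    steps = [later - earlier for earlier, later in zip(taps, taps[1:])]
    print("tap crossing times [ps]:", [round(t, 3) for t in taps])
    print("step between taps  [ps]:", [round(s, 3) for s in steps])
```

In this idealised model the four taps between two adjacent 20 ps DLL outputs cross the threshold in equal steps of a quarter of the DLL bin, which is how a 20 ps bin is divided into 5 ps LSBs.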

Channel matrix
Both the fine-time code of the interpolator and the counter code are distributed to the channels of the TDC. To allow a high number of channels to be integrated on a single ASIC, several channels are grouped into segments, and each segment is served by dedicated distribution buffers. The schematic diagram of the buffers, together with their calibration feature, is depicted in figure 3. A 5 bit delay adjustment, integrated in each buffer, allows the propagation delay of the buffer to be fine-tuned in 1 ps steps. This makes it possible to compensate not only for device mismatches but also for INL errors of up to 6.4 LSBs.
The structure of a single channel is depicted in figure 4. The fine-time code and counter state are connected to the D inputs of the time capture registers (TCRs), whereas the event signal is connected to the clock input. On the arrival of an event signal (rising edge), the current state of the D inputs is sampled into the TCR. The time of arrival can be inferred from the sampled state, as illustrated in figure 5.
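The adjustment range quoted above follows directly from the number of trim bits and the step size. The sketch below reproduces that arithmetic and shows one hypothetical way a requested delay correction could be quantised into a trim code. The mapping from correction to code (zero-based, saturating, nearest step) is an assumption for illustration; the actual register encoding is not described in this contribution.

```python
TRIM_BITS = 5        # width of the delay adjustment in each distribution buffer
TRIM_STEP_PS = 1.0   # nominal delay change per trim step


def trim_range_lsb(lsb_ps: float) -> float:
    """Adjustment range expressed in LSBs: 2**5 steps of 1 ps each."""
    return (2 ** TRIM_BITS) * TRIM_STEP_PS / lsb_ps


def trim_code(extra_delay_ps: float) -> int:
    """Hypothetical mapping of a requested extra delay to the nearest trim code.

    The result saturates at the ends of the 5 bit range.
    """
    code = round(extra_delay_ps / TRIM_STEP_PS)
    return max(0, min(2 ** TRIM_BITS - 1, code))


if __name__ == "__main__":
    print("range at 5 ps LSB:", trim_range_lsb(5.0), "LSB")
    for request_ps in (3.4, 12.8, 40.0):
        print(f"request {request_ps:5.1f} ps -> code {trim_code(request_ps)}")
```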
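How a time of arrival might be reconstructed from the sampled register contents can be sketched as follows. The encoding used here is an assumption: each of the 128 fine-time signals is modelled as a clock phase that is high for half a period after its rising edge, so the sampled word is a circular thermometer-like code. The actual encoding, and the counter synchronization mentioned earlier, are not covered by this sketch.

```python
N_BINS = 128  # fine-time signals produced by the interpolator


def sample_fine_code(bin_index: int) -> list:
    """Hypothetical sampled pattern for an event that falls into bin_index.

    Phase k rises at the start of bin k and stays high for N_BINS // 2 bins.
    """
    return [1 if (bin_index - k) % N_BINS < N_BINS // 2 else 0
            for k in range(N_BINS)]


def decode_fine_code(bits: list) -> int:
    """Return the index of the most recently risen phase (a 1 followed by a 0)."""
    for k in range(N_BINS):
        if bits[k] == 1 and bits[(k + 1) % N_BINS] == 0:
            return k
    raise ValueError("no 1 -> 0 boundary found in fine-time code")


def timestamp_ps(counter: int, bits: list, lsb_ps: float = 5.0) -> float:
    """Combine the coarse counter and the decoded fine bin into a time in ps."""
    return (counter * N_BINS + decode_fine_code(bits)) * lsb_ps


if __name__ == "__main__":
    word = sample_fine_code(17)
    print("decoded bin:", decode_fine_code(word))
    print("timestamp [ps]:", timestamp_ps(counter=3, bits=word))
```

With 128 bins per clock period, the counter supplies the coarse part of the measurement and the fine-time code resolves the position within one period.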

Time capture registers
The TCRs represent a crucial building block in fine-time resolution TDC designs. Poorly designed TCRs can lead to unnecessarily high power consumption or introduce large device mismatches that degrade the TDC's timing precision. Low capacitive loading on the D input and low power consumption make a traditional master-slave flip-flop, as depicted in figure 6, the ideal candidate for a TCR in a multichannel TDC.
If the event signal at the input of the TCR remains constant, no power is consumed by the register itself. However, the power consumption of the distribution buffers driving the clock input of the registers scales with the capacitive loading of the clock input. The cell has therefore been carefully laid out to keep this loading at a minimum. Two differently sized versions of the proposed TCR have been realized, here referred to as the strong and the weak version. The strong TCR has been designed to intrinsically provide sufficient matching for a 5 ps LSB design, whereas the weak implementation has been optimized for a 10 ps LSB design. For the weak TCR version, the matching performance is decreased by approximately 50% while consuming half the power.

Experimental results
To assess the performance of the proposed architecture, the fine-time interpolator together with one complete channel segment comprising a total of 8 channels has been realized in a commercial 130 nm technology. A microphotograph of the ASIC wire-bonded to the test board is depicted in figure 7. Different channel configurations have been implemented in the demonstrator to experimentally verify the timing precision of the weak and strong versions of the TCR, as previously described in section 2.3.

TDC transfer function
To extract the transfer characteristics of the TDC, a sequence of 1 million random events, uniformly distributed in time, is issued to the TDC. From the number of events collected by each bin, the actual LSB size can be determined. As an example, the LSB size of a single channel for the weak as well as the strong version of the TCR is depicted in figure 8. To account for the decreased performance of the weak TCR version, the DLL is operated at a 781.25 MHz reference clock for the evaluation of the weaker matching channel. In either case, a DNL of better than ±0.9 LSB and an INL of better than ±1.3 LSB have been achieved. From the linearity measurement, the expected single-shot precision due to quantization noise and non-linearities of the TDC can be estimated by $\sigma_{\mathrm{TDC}} = \sqrt{\sigma_{\mathrm{qDNL}}^{2} + \sigma_{\mathrm{wINL}}^{2}}$, where $\sigma_{\mathrm{qDNL}}$ and $\sigma_{\mathrm{wINL}}$ have been calculated as described in [6] and are listed in figure 8.
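The code density evaluation described above can be sketched in a few lines. The DNL and INL definitions used here are the common textbook ones (bin population relative to the mean, and its running sum). The quantization term is the exact rms error for uniformly distributed hits in bins of unequal width and is an assumption about how such a term can be computed; reference [6] may define the two contributions differently. All function names are illustrative.

```python
import math


def code_density(counts: list) -> tuple:
    """Return (dnl, inl) in LSB from per-bin hit counts of a flat-input test."""
    mean = sum(counts) / len(counts)
    dnl = [c / mean - 1.0 for c in counts]
    inl, running = [], 0.0
    for d in dnl:
        running += d          # INL taken as the running sum of the DNL
        inl.append(running)
    return dnl, inl


def quantization_sigma_ps(dnl: list, lsb_ps: float) -> float:
    """RMS quantization error for uniform hits in bins of width (1 + DNL) * LSB.

    Assumed definition: sqrt(sum(w**3) / (12 * sum(w))). For ideal bins this
    reduces to LSB / sqrt(12).
    """
    widths = [(1.0 + d) * lsb_ps for d in dnl]
    return math.sqrt(sum(w ** 3 for w in widths) / (12.0 * sum(widths)))


def combined_sigma_ps(sigma_q_ps: float, sigma_inl_ps: float) -> float:
    """Quadrature sum of the quantization and INL contributions."""
    return math.hypot(sigma_q_ps, sigma_inl_ps)


if __name__ == "__main__":
    dnl, inl = code_density([100, 150, 50, 100])   # toy counts, not measured data
    print("DNL [LSB]:", dnl)
    print("INL [LSB]:", inl)
    print("ideal 5 ps quantization sigma [ps]:",
          round(quantization_sigma_ps([0.0] * 128, 5.0), 3))
```

The toy counts are only there to exercise the functions; they do not represent measurements from the prototype.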

Timing precision
To experimentally extract the time resolution of the TDC, a flat sequence of uncorrelated events is sent to two distinct channels. For one of the channels a known cable delay is added, generating a fixed phase difference between the two channels. Different cable lengths are employed to generate time differences where the second edge arrives a) within the same DLL clock cycle, b) in the succeeding cycle and c) several cycles later. From the recorded measurements a histogram is created, from which the precision of the TDC can be extracted. No obvious dependence on the cable delay could be identified. As an example, the histogram for a 490 ps cable delay is shown in figure 9. A single-shot precision of better than 2.5 ps-rms for the strong and 5 ps-rms for the weak version of the TCR has been achieved.
2014 JINST 9 C01060
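The histogram evaluation can be sketched as the mean and rms spread of the measured time differences between the two channels. The helper names are illustrative. The second function divides by the square root of two, which holds only if both channels contribute equally and independently; whether the precision figures quoted in this contribution were derived that way is not stated, so it is included purely as an assumption.

```python
import math
import statistics


def difference_stats_ps(t_ref_ps: list, t_delayed_ps: list) -> tuple:
    """Mean and rms spread of the time differences between two channels."""
    diffs = [late - early for early, late in zip(t_ref_ps, t_delayed_ps)]
    return statistics.mean(diffs), statistics.pstdev(diffs)


def per_channel_sigma_ps(sigma_diff_ps: float) -> float:
    """Per-channel precision, ASSUMING two equal and uncorrelated channels."""
    return sigma_diff_ps / math.sqrt(2.0)


if __name__ == "__main__":
    # Toy timestamps in ps, not measured data.
    reference = [0.0, 0.0, 0.0, 0.0]
    delayed = [490.0, 485.0, 495.0, 490.0]
    mean_ps, sigma_ps = difference_stats_ps(reference, delayed)
    print(f"mean delay {mean_ps:.1f} ps, spread {sigma_ps:.3f} ps-rms")
    print(f"per channel (assumed): {per_channel_sigma_ps(sigma_ps):.3f} ps-rms")
```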

Power consumption
The power consumption of the full ASIC is depicted in figure 10. Over the full DLL clock frequency range, the average power consumed per channel ranges from 43 mW/channel for a 5 ps LSB size setting down to 18 mW/channel for a 20 ps LSB size setting.
If more channels are to be integrated on a single ASIC, the power consumption contribution from shared resources (e.g. interpolator, I/O) can further be reduced. As the contribution of each channel to the total power consumption cannot be directly measured by the test setup, the proportional power consumption contribution of each channel is estimated from simulation. As listed in table 1, the channel configuration utilizing the weak version of the TCR achieves a 37% smaller power consumption per channel (at approximately half the time resolution).

Summary
In this contribution the architecture, design trade-offs and measurement results of a fine-time resolution multichannel TDC have been presented. The presented architecture is fully self-tracking and allows time resolution to be traded off against power consumption, so that the performance of the TDC can be matched to the end user's requirements. Stable performance across PVT variations, down to the finest interpolation level, is guaranteed by the locking mechanism of the DLL. An on-chip calibration scheme to calibrate out device mismatches on a per-bin basis has been described. As this calibration feature is common to a larger set of channels, a per-channel calibration is not required by this architecture. The performance of two different channel configurations has been evaluated experimentally. In both cases the DNL and INL have been measured to be better than ±0.9 LSB and ±1.3 LSB respectively. For the stronger TCR, time resolutions of better than 2.5 ps-rms at a power consumption of 24.1 mW/channel have been demonstrated. With the weaker TCR, time resolutions of 5 ps-rms at a 37% lower power consumption have been achieved.