The VeloPix ASIC

VeloPix, a 130 nm CMOS chip with data-driven, zero-suppressed readout, will be used as the readout chip for the hybrid pixel system of the LHCb Vertex Locator (VELO) upgrade. The upgrade, scheduled for LHC Run-3, will enable the experiment to be read out at 40 MHz in trigger-less mode, with event selection performed in the CPU farm. The highest occupancy ASICs will experience rates of more than 900 Mhits/s, and the closest pixels are 5.1 mm from the LHC beams. This paper presents the VeloPix ASIC along with the first test results, obtained without a sensor.


Introduction
This paper presents the VeloPix application-specific integrated circuit (ASIC) and the first measurements of the fabricated device. The Large Hadron Collider beauty (LHCb) Vertex Locator (VELO) will be upgraded to a trigger-less system with full detector readout at 40 MHz. The VELO will be a hybrid pixel system, featuring silicon pixel sensors with a 55 × 55 µm² pitch, read out by the VeloPix ASIC. Sensors and chips will approach the interaction point to within 5.1 mm and be exposed to a radiation dose of up to 370 Mrad, or a fluence of 8 × 10¹⁵ 1 MeV n_eq cm⁻². The highest occupancy chips have hit rates of more than 900 Mhits/s/chip and produce data rates of over 15 Gbit/s. VeloPix has many features in common with the Timepix3 ASIC [1]; however, VeloPix is optimised for speed and radiation hardness. Each ASIC reads out an array of 256 × 256 pixels with a 55 × 55 µm² square pitch. A total of 624 ASICs are needed for the full VELO readout. Table 1 shows the main characteristics of VeloPix. The readout architecture is data driven and zero suppressed, and uses no trigger. Because cooling of the ASICs is limited in the final application, the power is limited to < 1.5 W cm⁻², or 3 W per ASIC. The chip is optimised for electron collection.
The ASIC is equipped with single-event upset (SEU) protection because of the severe radiation environment. To meet the data output rate requirement while keeping the power consumption within budget, a dedicated 5.12 Gbit/s output serialiser, the Gigabit Wireline Transmitter (GWT) [2], has been implemented. The VeloPix ASIC was submitted in May 2016, the first wafers were received back at CERN in August, and this paper presents the first measurements, obtained during September 2016.

Chip architecture
Figure 1 shows the architecture of VeloPix. The chip uses a double column layout in which common super pixel logic is shared between 2 × 4 pixels. Bonding pads (using top metal) are situated mostly over the digital area, and are shielded from digital noise by lower metal layers. A 10-bit, 40 MHz Gray-encoded time stamp is distributed across the full pixel matrix to identify events in time. The chip sends out only 9 bits of Time of Arrival (ToA) information; the 10th bit is used for internal latency monitoring. When there is a valid hit at the front-end, a data packet is created and a 40 MHz ToA is added to the packet inside the super pixel. If more than one front-end has a hit at the same time, all hits are included in the same packet. This packet must then propagate through all super pixels between the source of the packet and the End of Column (EoC). The EoC pushes the packet into a data fabric, and finally the packet travels through a fabric central node to a router. This router is used at the periphery to forward packets to active GWT outputs. Because data rates in the upgraded VELO are non-uniform, not all four links need to be active in every chip. More details on the readout architecture, especially at the column level, can be found in [3].
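As an illustration of the Gray-encoded time stamp mentioned in the architecture description, the following Python sketch converts a 10-bit counter to reflected binary Gray code and back. Only the 10-bit width and the 40 MHz clock come from the text; the function names are hypothetical, and the text does not say which Gray code variant the chip uses, so the standard reflected code is assumed here.

```python
TIMESTAMP_BITS = 10        # from the text: 10-bit time stamp
CLOCK_PERIOD_NS = 25.0     # one period of the 40 MHz clock


def to_gray(value: int) -> int:
    """Convert a binary counter value to reflected binary Gray code."""
    return value ^ (value >> 1)


def from_gray(gray: int) -> int:
    """Convert a reflected binary Gray code word back to binary."""
    value = 0
    while gray:
        value ^= gray
        gray >>= 1
    return value


def bit_distance(a: int, b: int) -> int:
    """Number of bit positions in which two words differ."""
    return bin(a ^ b).count("1")


n_codes = 1 << TIMESTAMP_BITS

# Distance between each time stamp and its successor, including the
# wrap-around from the last code back to zero.
distances = [
    bit_distance(to_gray(t), to_gray((t + 1) % n_codes))
    for t in range(n_codes)
]

# Time after which the 10-bit counter repeats.
wrap_time_us = n_codes * CLOCK_PERIOD_NS / 1000.0
```

Every entry of `distances` is 1, so a time stamp latched while the counter is changing can only resolve to the old or the new value. With 10 bits at 40 MHz the counter repeats every 25.6 µs.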
The chip sends out all data packets in frames of four packets. Each frame contains a header 0xA for frame alignment, and a parity bit for each packet to detect transmission errors.
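The frame structure can be sketched in Python as follows. The text only states that a frame carries four packets, a 0xA header and one parity bit per packet. The 30-bit packet width, the field order (header, parity bits, packets) and the use of even parity are assumptions made for this illustration, chosen so that the frame is 128 bits long. All names are hypothetical.

```python
HEADER = 0xA              # from the text: frame alignment header
HEADER_BITS = 4
PACKETS_PER_FRAME = 4     # from the text
PACKET_BITS = 30          # ASSUMPTION: packet width is not given in the text
PACKET_MASK = (1 << PACKET_BITS) - 1


def parity(word: int) -> int:
    """Even-parity bit (ASSUMPTION): 1 if the word has an odd number of ones."""
    return bin(word).count("1") & 1


def build_frame(packets):
    """Pack four packets into one frame: header, parity bits, then packets."""
    if len(packets) != PACKETS_PER_FRAME:
        raise ValueError("a frame carries exactly four packets")
    frame = HEADER
    for p in packets:
        if p < 0 or p > PACKET_MASK:
            raise ValueError("packet does not fit in PACKET_BITS")
        frame = (frame << 1) | parity(p)
    for p in packets:
        frame = (frame << PACKET_BITS) | p
    return frame


def check_frame(frame):
    """Return (header_ok, packets, parity_ok_per_packet) for a received frame."""
    payload_bits = PACKETS_PER_FRAME * PACKET_BITS
    header = frame >> (payload_bits + PACKETS_PER_FRAME)
    parities = (frame >> payload_bits) & ((1 << PACKETS_PER_FRAME) - 1)
    packets, parity_ok = [], []
    for i in range(PACKETS_PER_FRAME):
        shift = PACKET_BITS * (PACKETS_PER_FRAME - 1 - i)
        p = (frame >> shift) & PACKET_MASK
        expected = (parities >> (PACKETS_PER_FRAME - 1 - i)) & 1
        packets.append(p)
        parity_ok.append(parity(p) == expected)
    return header == HEADER, packets, parity_ok


sent = [0x12345678 & PACKET_MASK, 0x0, 0x3FFFFFFF, 0x1]
frame = build_frame(sent)

# Flip one payload bit inside the third packet to emulate a transmission error.
corrupted = frame ^ (1 << (PACKET_BITS + 5))
```

A single parity bit per packet flags any odd number of flipped bits in that packet; an even number of flips in the same packet would go undetected.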

Pixel front-end
A pixel front-end connected to a super pixel is shown in figure 2. The analog front-end has a single-ended preamplifier, a feedback capacitor (4 fF), leakage current compensation and a test capacitor (5 fF) for test pulse injection. The discriminator threshold is tunable via a global 14-bit digital-to-analog converter (DAC) and a 4-bit local pixel DAC. The global DAC consists of a 10-bit DAC, which defines the DAC LSB, and a 4-bit DAC, which defines the global voltage range. This architecture produces a non-monotonic, sawtooth voltage output, which is used for global offset calibration. The 4-bit local pixel DAC compensates for mismatch between pixels. An on-pixel mask bit is used to disable noisy pixels. The analog front-end layout was created in a full-custom manner. All NMOS transistors in the analog front-end are enclosed layout transistors (ELTs), which are used to mitigate leakage effects in the transistors.
The digital front-end synchronises the discriminator output to a 40 MHz clock. A time-over-threshold (ToT) processor sends a valid_event pulse to the super pixel when the linear feedback shift register (LFSR) value reaches a configurable count of 0 to 3. Data acquisition is disabled when the global shutter signal is 0. The super pixel writes a data packet into its first-in first-out (FIFO) buffer whenever there is a valid event from any of its 8 pixels, and adds a ToA to it. The super pixel then requests access to a data node, and the data nodes propagate data down the column to the EoC. The compiled digital logic was placed and routed using digital implementation tools, and the timing was verified using static timing analysis tools. Measures were taken in the digital implementation flow to reduce systematic differences between pixels. An inverter was placed manually close to the discriminator output to minimise the output capacitance of the discriminator and to make the capacitance identical in all pixels. The pixel configuration latches on the digital side of the layout were placed manually close to the analog front-ends, because these cells are static during data acquisition. This reduces the coupling of digital signals to the analog front-end.
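The grouping of simultaneous hits into one super pixel packet can be sketched in Python as follows. This is a behavioural illustration only, not the chip logic. It assumes that a super pixel spans 2 columns by 4 rows, that the hit map bit index is `(col % 2) * 4 + (row % 4)`, and that the 9 ToA bits sent out are the low-order bits of the time stamp; none of these details are specified in the text, and all names are hypothetical.

```python
from collections import defaultdict

SP_COLS = 2        # ASSUMPTION: super pixel is 2 columns wide ...
SP_ROWS = 4        # ... and 4 rows tall (text only says 2 x 4 pixels)
TOA_MASK = 0x1FF   # 9 ToA bits are sent; ASSUMPTION: the low-order bits


def build_packets(hits):
    """Group hits into super pixel packets.

    hits: iterable of (col, row, timestamp) tuples.
    Returns a dict mapping (sp_col, sp_row, toa) to an 8-bit hit map.
    Hits in the same super pixel with the same time stamp share one packet.
    """
    packets = defaultdict(int)
    for col, row, timestamp in hits:
        key = (col // SP_COLS, row // SP_ROWS, timestamp & TOA_MASK)
        bit = (col % SP_COLS) * SP_ROWS + (row % SP_ROWS)
        packets[key] |= 1 << bit
    return dict(packets)


# Two pixels of the same super pixel fire at time stamp 100,
# and one of them fires again at time stamp 101.
example_hits = [(10, 20, 100), (11, 21, 100), (10, 20, 101)]
example_packets = build_packets(example_hits)
```

In this example the two hits at time stamp 100 share one packet with two bits set in the hit map, and the later hit produces a second packet from the same super pixel.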

Measurements
The measurements presented here were taken from the very first chip tested at CERN, without a sensor bonded to the chip. Full characterisation and production testing will be done during the last quarter of 2016. Measurements were taken using the Speedy PIxel Detector Readout (SPIDR) system developed at Nikhef [4]. This system can read out a single VeloPix at the full speed of 20.48 Gbit/s, reached when all four GWT links are active.
The analog power consumption of the chip was measured to be 387 mW. The digital power was 374 mW after chip power-up, increasing to 694 mW when pixel matrix clocking was enabled. Enabling ToA for all columns increases the digital power by 24 mW, to a total of 718 mW. An increase of a few hundred mW is expected due to hit activity in the pixel matrix. These numbers show that the design meets its target of < 1.5 W cm⁻².
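The power budget check can be reproduced with a few lines of Python. The sketch below assumes that the relevant area is the 256 × 256 pixel matrix alone, ignoring the chip periphery; since the real die is larger, this gives a slightly pessimistic power density. The variable names are hypothetical.

```python
PIXELS_PER_SIDE = 256
PITCH_CM = 55e-4            # 55 um pixel pitch
ANALOG_MW = 387.0           # measured analog power
DIGITAL_MW = 718.0          # measured digital power with clocking and ToA enabled
BUDGET_W_PER_CM2 = 1.5      # power density limit from the text

# ASSUMPTION: only the pixel matrix area is counted, not the periphery.
matrix_area_cm2 = (PIXELS_PER_SIDE * PITCH_CM) ** 2

total_w = (ANALOG_MW + DIGITAL_MW) / 1000.0
density_w_per_cm2 = total_w / matrix_area_cm2

# Power still available for hit activity before the budget is reached.
headroom_w = BUDGET_W_PER_CM2 * matrix_area_cm2 - total_w
```

Under this assumption the matrix area is about 1.98 cm², so the 1.5 W cm⁻² limit corresponds to about 3 W per ASIC, in line with the figure quoted in the introduction. The measured 1.105 W gives about 0.56 W cm⁻² and leaves roughly 1.87 W of headroom for hit activity.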
The left side of figure 3 shows the threshold scan (s-curve) of a single pixel. The x-axis shows the global threshold DAC code, and the y-axis the number of pulses counted by the pixel. For all these measurements it is assumed that the on-pixel test capacitance (see C_test in figure 2) is 5 fF for all pixels, which is the value obtained from pixel parasitic extraction. Each curve shows the pixel response for a different injected charge, from 0 e⁻ to 5656 e⁻. For each point, 25 analog test pulses are injected into the pixel, and the global threshold value is scanned downwards starting from DAC code 6100. As expected, the earliest response is seen for the highest charge, when the threshold is also highest. The noise floor around DAC code 5680 is the same for all charges.
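The shape of such a threshold scan can be illustrated with a simple model. The Python sketch below assumes Gaussian noise, so that the expected count follows an error function, and extracts the 50% point of the s-curve by linear interpolation. The midpoint of 6046.3 DAC codes and the width of 4 DAC codes are synthetic values chosen for the example, not measured results, and the model does not include the noise floor. Function names are hypothetical.

```python
import math

N_PULSES = 25   # from the text: 25 test pulses per scan point


def scurve_counts(code, midpoint, sigma, n_pulses=N_PULSES):
    """Expected counts for threshold `code`, assuming Gaussian noise.

    Counts rise from 0 to n_pulses as the threshold code is lowered
    past the pulse amplitude (`midpoint`).
    """
    arg = (code - midpoint) / (math.sqrt(2.0) * sigma)
    return n_pulses * 0.5 * (1.0 - math.erf(arg))


def find_midpoint(codes, counts, n_pulses=N_PULSES):
    """Interpolate the DAC code at which the counts cross n_pulses / 2.

    `codes` must be ordered as scanned, from high threshold to low.
    """
    half = n_pulses / 2.0
    for i in range(len(codes) - 1):
        n0, n1 = counts[i], counts[i + 1]
        if n0 < half <= n1:
            c0, c1 = codes[i], codes[i + 1]
            return c0 + (half - n0) * (c1 - c0) / (n1 - n0)
    raise ValueError("scan never crosses the 50% point")


# Synthetic scan from DAC code 6100 downwards (ASSUMED midpoint and width).
codes = list(range(6100, 5990, -1))
counts = [scurve_counts(c, midpoint=6046.3, sigma=4.0) for c in codes]
midpoint_estimate = find_midpoint(codes, counts)
```

In a real scan the counts are integers with statistical fluctuations, so a fit of the full error function to all points would usually be preferred over two-point interpolation; the interpolation is used here only to keep the sketch free of external dependencies.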
On the right side of figure 3, the DAC code corresponding to the midpoint of the s-curve is plotted for each injected charge. The plot shows that the response is linear with respect to the global DAC code. From this plot a gain of 24.6 mV/ke⁻ is extracted. Figure 4 shows the spread of the equivalent noise charge (ENC) in the pixels. The mean noise is 62.9 e⁻, assuming a gain of 25 mV/ke⁻, and the noise spread is 4.1 e⁻ rms. The right side of figure 4 shows that no systematic effects are observed during a threshold scan over the noise floor. Figure 5 shows the threshold spread in the pixel matrix. The threshold spread has been measured with the local pixel DAC set to 0x0 (mean 1378) and to 0xF (mean 1513). The mean values correspond to a linearised global DAC range with an LSB of 0.38 mV. Finally, after calculating the equalisation, the pixel-to-pixel threshold mismatch is 40.3 e⁻ rms.
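The conversion between linearised DAC codes and electrons follows directly from the numbers above. The Python sketch below uses the extracted gain of 24.6 mV/ke⁻ and the 0.38 mV LSB. The per-step value for the local pixel DAC additionally assumes that the local DAC is linear over its 15 steps, which the text does not state. Variable names are hypothetical.

```python
GAIN_MV_PER_KE = 24.6     # gain extracted from the s-curve midpoints
LSB_MV = 0.38             # LSB of the linearised global DAC
MEAN_CODE_LOCAL_0 = 1378  # mean threshold code with local DAC = 0x0
MEAN_CODE_LOCAL_F = 1513  # mean threshold code with local DAC = 0xF
LOCAL_DAC_STEPS = 15      # a 4-bit DAC has 15 steps between 0x0 and 0xF


def codes_to_electrons(n_codes: float) -> float:
    """Convert a span of linearised global DAC codes to electrons."""
    return n_codes * LSB_MV / GAIN_MV_PER_KE * 1000.0


electrons_per_lsb = codes_to_electrons(1)

# Tuning range of the local pixel DAC, expressed in several units.
local_span_codes = MEAN_CODE_LOCAL_F - MEAN_CODE_LOCAL_0
local_span_mv = local_span_codes * LSB_MV
local_span_e = codes_to_electrons(local_span_codes)

# ASSUMPTION: the local DAC is linear, so each step shifts the threshold equally.
local_step_e = local_span_e / LOCAL_DAC_STEPS
```

With these inputs, one global LSB corresponds to about 15.4 e⁻, and the local DAC covers 135 codes, or about 51 mV and 2.1 ke⁻. Under the linearity assumption this is roughly 139 e⁻ per local DAC step.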
The right side of figure 5 shows no systematic effects in the pixel matrix. Table 2 shows a summary of the measurements. These first results show that VeloPix is working as expected. Production testing will be done later in 2016, and the first test beams are also foreseen in 2016.