Development of a digital readout board for the ATLAS Tile Calorimeter upgrade demonstrator

During the LHC shutdown in 2013/14, one of the on-detector modules of the ATLAS scintillating Tile Calorimeter (TileCal) will be replaced with a compatible hybrid demonstrator system. The demonstrator is built to fulfill all requirements of the complete upgrade of the TileCal electronics in 2022, but augmented to remain compatible with the present system. We report on the hybrid system's FPGA-based communication module, which is responsible for receiving and unpacking commands over a 4.8 Gbps downlink and for driving a high-bandwidth data uplink. The report covers key points such as multi-gigabit transmission, clock distribution, and programming and operation of the hardware. We also report on a firmware skeleton implementing all these key points and demonstrate how timing, trigger, control and data transmission can be achieved in the demonstrator.


Introduction
The Large Hadron Collider (LHC) upgrade plans include a 14 TeV center-of-mass energy (7+7 TeV) and close to one order of magnitude higher luminosity by 2022 (upgrade Phase-II). This means increased radiation levels and reduced accessibility, since activation must now be taken into consideration when replacing parts. Together with the aging of the present readout electronics for the scintillating tile hadronic calorimeter [1] of ATLAS [2], this implies that an electronics replacement is necessary. Obsolescence of the old components and the requirement to improve the efficiency of the first-level trigger suggest a complete redesign in which all data are read out to the counting room. This allows more efficient signal processing to reduce pile-up effects, as well as more advanced trigger algorithms based on finer-granularity data. State-of-the-art optical links are needed to achieve the required bandwidth, and radiation-tolerant FPGAs provide flexibility and redundancy.
To gain experience with the new design, a demonstrator project was initiated in which one module with the new electronics system would be inserted into the detector at the end of the first long shutdown period in 2013/2014 (Phase 0). However, in order not to disturb current data taking, the demonstrator must be kept compatible with the present system. If successful, the aim is to insert three additional demonstrators in the following yearly shutdowns and to combine data with a similar electromagnetic calorimeter demonstrator.

The ATLAS Tile Calorimeter
The Tile Calorimeter (TileCal) is the hadronic calorimeter of ATLAS, designed to measure the energies of jets and hadrons. Analog data from TileCal are also merged into trigger-tower data used by the first-level trigger of ATLAS. TileCal consists of four partitions: two extended barrels on the outside and two central barrels in the middle. Each partition is divided into 64 slices. The detector electronics are placed at the base of each slice in retractable "superdrawers". In total there are 256 superdrawers in TileCal, and each superdrawer digitizes and processes signals from up to 48 photomultiplier tubes (PMTs). At present each superdrawer consists of two drawers, but the aim is a further division into four largely independent minidrawers.

The present readout electronics
The present readout electronics consist of four types of boards (figure 1): the 3-in-1 Front-End card, the 3-in-1 mainboard, the digitizer board and the link board. The 3-in-1 card is a multipurpose board for amplifying and shaping PMT pulses, producing calibration pulses and integrating the PMT current in response to a circulating Cs source. It is controlled by the 3-in-1 mainboard. The digitizer board samples two versions of the shaped PMT signals, with high and low gain. Low-gain signals from the 3-in-1 cards are also merged into projective towers pointing at the interaction point, to be transmitted to the trigger. The detailed data are temporarily stored in pipeline memories while waiting for validation by the first-level trigger. For triggered events the relevant data are transferred to derandomizing buffers and read out. The radiation environment around the TileCal electronics is relatively benign, allowing the use of FPGAs: the ionizing dose where the electronics are situated is below 1 Gy/y.

The future readout electronics
While the current design stores digitized data on the detector until a level-1 accept signal from the first-level trigger is received, the next generation will read out all the data continuously, thus moving the pipeline and derandomizing memories to the counting room. Other differences are a changed modularity to be compatible with the minidrawer concept, and new Front-End boards (FEB). Three different types of Front-End cards are being developed: a modified 3-in-1 [6], the custom FE-ASIC and the QIE [7]. Only the modified 3-in-1 can produce the analog trigger signals necessary for compatibility with the present system, and it is therefore the candidate for the demonstrator. The final system choice, which does not need this compatibility, will be decided based on test beam results.
The main components of the new on-detector readout system are the Front-End boards, the MainBoard for data acquisition and the link DaughterBoard for control and off-detector communication. The demonstrator MainBoard, adapted to the 3-in-1 FEB, controls the Front-End boards and digitizes their outputs (high gain, low gain and integrator output). The DaughterBoard, developed at Stockholm University, is the key component responsible for the multi-gigabit data communication with the off-detector systems as well as for controlling and monitoring all the on-detector electronics. To reach the demonstrator goal, hardware and firmware must be thoroughly tested, verified and later proven to be sufficiently radiation tolerant.

The second prototype DaughterBoard
The development of the future readout electronics has been divided into several steps, each with more functionality and improved performance. The board itself is equipped with two XC7K160T Kintex-7 FPGAs from XILINX [9], a large 400-pin SEAF-40-06.5-10-A connector with a custom pin layout on the bottom side, a 100-pin MEG-ARRAY connector holding an AFBR-775BEPZ PPOD module from AVAGO, and a QSFP+ connector for further high-speed communication modules. Figure 2 shows a picture of a fully mounted version of the second prototype DaughterBoard.
For the DaughterBoard, the major aspects to be developed were radiation hardness, high-speed data readout and redundancy. The approach chosen for redundancy was to create two completely independent sides on one board, which can both run in parallel and perform the same functions. Radiation hardness was ensured by using components that were either already radiation tested or likely to be radiation tolerant. Furthermore, the board was designed for high-speed data transmission at up to 10 Gbps using no external circuitry for clock synthesis and cleaning. These features require a carefully designed power distribution network with adequate filtering.
Power distribution network. Each side, and therefore each FPGA, has its own power distribution network. The main power supply to the board is 10 V, derived from a custom-made Low Voltage Power Supply (LVPS) [8] and delivered through the MainBoard via the 400-pin connector. This voltage drives local point-of-load switching DC-DC regulators that supply all components on the designated side. For high performance, all voltages are additionally filtered using a passive LC filter network. This was found to be the best practice because measurements showed that noise on the internal supply voltage of the FPGA could couple into the high-speed serial transceiver (GTX) data stream and distort the high-speed serial signal. By reducing this noise source, clock and GTX signal integrity were improved significantly.
Serial high-speed communication. The DaughterBoard provides two kinds of interfaces for high-speed serial communication. The first, with a MEG-ARRAY connector for pluggable parallel optics devices (PPOD) from AVAGO, is capable of transmitting 5 Gbps over twelve parallel lanes. The second is a standard QSFP+ connector for transmitting and receiving data at up to 10 Gbps on four parallel lanes. Normally, the GTX should be driven by a high-quality clock signal provided by an external source. To avoid adding additional components in a radiation environment, these transceivers are instead driven by an internal clock signal. As a fall-back solution, additional clock circuitry was placed on the board that allows a high-quality clock signal to be synthesized if needed.
Connectivity. The communication between DaughterBoard and MainBoard goes through a 400-pin SEARAY connector from SAMTEC with a custom pin layout divided into two identical areas on each side of the board. The general connector structure consists of four parts: differential signaling, 3.3 V single-ended communication, voltage monitoring and JTAG configuration. The connector pinout allocates 96 LVDS lines, 24 single-ended LVCMOS or LVTTL lines, 16 lines for voltage monitoring and two JTAG chains, divided in the middle of the connector. Additionally, the board provides SMA connectors for clock input and GTX transceiver communication as well as GPIO lines accessible through two 14-pin header connectors.

Hardware characterization
Various measurements were performed to characterize the system electrically in detail and to find issues that can be solved in a later revision. Of main importance was the performance of the power distribution network, the internal clock nets and the serial high-speed communication under two different resource utilization conditions. As references, the best and worst cases were evaluated: a minimal design with only about 1% logic utilization, and an extensive design with about 50% of all flip-flops and 30% of all look-up tables utilized. The switching frequency of the instantiated logic was 41.667 MHz, which is in the same range as the 40.08 MHz global LHC clock, letting us predict whether the system will perform satisfactorily under normal operating conditions in TileCal. Additionally, the crosstalk performance of the LVDS lines at 640 Mbps was characterized.
Power distribution network. All DC-DC regulator outputs on the board are filtered using passive LC filters with a 2.2 µH inductor and two 100 µF ceramic capacitors. For decoupling purposes, several capacitors with various values and footprints are also distributed over the whole board.
Of special importance is the spot noise on the 1.0 V core voltage of the FPGA, which was measured at two distinct frequencies: 285 kHz, the switching frequency of the DC-DC regulator, and 41.667 MHz, the switching frequency of the logic. The switching noise emitted from the DC-DC converter was reduced by the filter network from 4.5 mV to 30 µV. When measuring the contribution at 41.667 MHz, no significant difference between minimal and extensive utilization could be observed.
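As a cross-check, the nominal behaviour of the LC filter can be estimated from its component values. The sketch below (Python; the 40 dB/decade roll-off assumes an ideal second-order filter, which real capacitor ESR and layout parasitics will degrade) computes the filter corner frequency and compares the ideal attenuation at the 285 kHz switching frequency with the measured reduction from 4.5 mV to 30 µV.

```python
import math

L = 2.2e-6          # filter inductance [H]
C = 2 * 100e-6      # two 100 uF ceramic capacitors in parallel [F]

# Corner frequency of the ideal second-order LC low-pass filter
f_c = 1 / (2 * math.pi * math.sqrt(L * C))          # ~7.6 kHz

# Ideal attenuation at the 285 kHz DC-DC switching frequency
# (40 dB/decade above the corner for a second-order filter)
f_sw = 285e3
ideal_att_db = 40 * math.log10(f_sw / f_c)          # ~63 dB

# Attenuation actually measured: 4.5 mV reduced to 30 uV
measured_att_db = 20 * math.log10(4.5e-3 / 30e-6)   # ~43.5 dB

print(f"corner frequency: {f_c/1e3:.1f} kHz")
print(f"ideal attenuation at 285 kHz: {ideal_att_db:.0f} dB")
print(f"measured attenuation: {measured_att_db:.1f} dB")
```

The measured 43.5 dB falls short of the ideal 63 dB, as expected when capacitor ESR sets a floor on the filter impedance, but it is consistent with the observed improvement in clock and GTX signal integrity.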
Clock performance. Two 100 MHz oscillators on the board are connected to global clock input pins of the FPGA and used for clock synthesis. The generated 240 MHz clock signal was measured with a differential probe soldered to an LVDS pin pair of the 400-pin connector. Figure 3 shows the random and bounded-uncorrelated jitter spectrum and track for low and high utilization. When half of the FPGA is utilized, the deterministic jitter on the clock net increases. The main frequencies contributing to this increase are 40 MHz and 80 MHz, as a result of the increased power demand on the supply voltage. The clock finally used for driving the GTX-PLL is slightly better, because it never enters the global clock net used for this measurement; what is shown is therefore a worst-case scenario. Comparing this clock with reference signals available on off-the-shelf development boards leads us to conclude that stable data communication is possible even at 10 Gbps.
Serial high-speed communication. The DaughterBoard design utilizes all eight available GTX units within one FPGA. Five of these are connected to the MEG-ARRAY connector, two to the QSFP+ connector and one to the SMA connector next to the FPGA. To evaluate the GTX performance, a differential probe was soldered to one pin pair of the MEG-ARRAY connector. The eye diagrams shown in figure 4 were generated with a PLL driven directly by an internally synthesized 250 MHz clock signal, comparing minimal and extensive utilization. With extensive utilization the deterministic jitter increased by 4 ps and the phase jitter by 6 ps due to the usage of the FPGA clock net. Nevertheless, the eye is still wide open and the calculated bit error rates were of the same order of magnitude.
LVDS crosstalk. To ensure data integrity on the LVDS lines, which will be driven at 640 Mbps, crosstalk tests were performed. For this test all LVDS lines situated on the same layer were driven with pseudo-random data at 640 Mbps. The resulting noise was measured with a differential probe on one unused LVDS line situated in the middle of the connector. The noise contribution due to crosstalk was found to be negligible; most of the noise appeared to originate from current drawn on the supply power plane rather than from crosstalk. In total, the overall peak-to-peak noise increased by about 5 mV when all LVDS lines were driven.

Firmware implementation
The above-mentioned tests, as well as the implementation of the complete communication, were done using the same firmware elements. The design was only slightly modified to fit each test. Otherwise, the clocking scheme, the instantiation of all IO signals and the high-speed communication implementation were all based on the same basic design that will be used in the final version of the firmware. Because the final version of the upgraded readout hardware will be based on the current version of the DaughterBoard but may use another MainBoard solution, special care was taken to keep every firmware component as generic as possible.
The IO block components were written in a way that allows easy adaptation to another IO configuration. Switching between input and output can be realized by setting generic values of the corresponding IO bus component. This holds for the LVDS IO bus as well as for the LVCMOS signals. In the current firmware version the GBT protocol [10] is used to transmit commands and data on- and off-detector. In the final version, however, the GBT protocol will not necessarily be used for transmitting data off the detector.
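The GBT frame format also explains the 4.8 Gbps downlink rate quoted earlier: one 120-bit frame is transmitted per LHC bunch-crossing clock cycle. A minimal sketch of this link budget (the header, payload and forward-error-correction field sizes follow the published GBT frame format; the exact field usage in the demonstrator firmware is not detailed here):

```python
# GBT link budget: one frame per LHC clock cycle
lhc_clock_hz = 40e6          # LHC bunch-crossing clock (nominally 40.08 MHz)

frame_bits   = 120           # total GBT frame length
header_bits  = 4             # frame header
fec_bits     = 32            # Reed-Solomon forward error correction
payload_bits = frame_bits - header_bits - fec_bits   # 84 bits
user_bits    = 80            # payload excluding slow-control fields

line_rate_gbps = frame_bits * lhc_clock_hz / 1e9     # 4.8 Gbps
user_rate_gbps = user_bits * lhc_clock_hz / 1e9      # 3.2 Gbps

print(f"line rate: {line_rate_gbps:.1f} Gbps, user bandwidth: {user_rate_gbps:.1f} Gbps")
```

The FEC overhead is what makes GBT attractive on-detector: single-event upsets in the serial stream can be corrected rather than merely detected.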

DaughterBoard test system
A test system was set up utilizing a DaughterBoard, a MainBoard, a QSFP+ mezzanine adapter board from HiTechGlobal, and a KC705 [12] and an ML507 [13] evaluation platform from Xilinx (figure 5).
The KC705 was used to generate a 5 Gbps data stream which was sent to the DaughterBoard. There, a 250 MHz clock signal was recovered from the data stream and used directly to start data transmission back to the KC705 on two data lanes at 10 Gbps. The data were encoded with the GBT protocol in order to detect bit errors in the data stream. For voltage compatibility reasons it was not possible to use the high-quality clock on the QSFP+ mezzanine, so a 125 MHz clock signal from the ML507 was used instead as reference. Communication between KC705 and DaughterBoard was stable over a test period of 5 days with 3 bit errors, corresponding to a bit error rate of approximately 1 · 10^-15, which is an acceptable level for TileCal. Communication between DaughterBoard and MainBoard was tested by initially checking the communication with the MainBoard FPGAs and the data reception from the 12-bit 40 MSps ADCs; no design issues have been discovered so far.
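The quoted bit error rate follows directly from the error count and test duration. A quick check (assuming an aggregate uplink throughput of 10 Gbps over the two lanes, as described above):

```python
throughput_bps = 10e9            # aggregate uplink rate over the two lanes [bit/s]
duration_s     = 5 * 24 * 3600   # 5-day test period [s]
errors         = 3               # bit errors observed

bits_transmitted = throughput_bps * duration_s   # 4.32e15 bits
ber = errors / bits_transmitted                  # ~7e-16, i.e. of order 1e-15
print(f"BER ~ {ber:.1e}")
```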
Under development. Now that communication between the various boards is established, the next step is to refine the firmware such that dedicated commands can be sent to the MainBoard and the 3-in-1 Front-End boards from a PC connected to the KC705. IPbus [11] will be used as the protocol to establish this communication and to transmit data and commands between PC and KC705. Furthermore, a protocol for sending the sampled data off the DaughterBoard is under development, allowing configuration data to be separated from the sampled PMT data.

Conclusion
The latest DaughterBoard is a large step towards a working demonstrator. With this version it was possible to successfully verify some of the key components necessary for the implementation of a TileCal readout demonstrator. However, one more version of the DaughterBoard is already under construction. This board will implement the currently missing remote programmability and provide additional connectivity, for example to the hardware that controls the high-voltage supply to the PMTs. If the new revision performs satisfactorily, it will be the one installed in TileCal in the middle of 2014.