The Versatile Link Demo Board (VLDB)

The Versatile Link Demonstrator Board (VLDB) is the evaluation kit for the radiation-hard Optical Link ecosystem, which provides a 4.8 Gbps data transfer link for communication between the front-end (FE) and back-end (BE) of High Energy Physics experiments. It gathers the main radiation-hard custom Application-Specific Integrated Circuits (ASICs) and modules of the Versatile Link: the GBTx, the GBT-SCA and the VTRx/VTTx, plus the FeastMP, a radiation-hard DC-DC converter designed in-house. This board is the first design allowing system-level tests of the Link with a complete interconnection of its constitutive components, enabling data acquisition as well as control and monitoring of FE devices through the GBTx/GBT-SCA pair.


Introduction
The radiation-hard Optical Link project for the LHC is a common development for all LHC experiments and other interested collaborations. The objective of the project is to supply experiments with a radiation-hard, magnetic-field-tolerant 4.8 Gbps data transfer link for all communication between detector front-ends and counting rooms. Based on point-to-point optical links, it can handle timing signal broadcast with low jitter, trigger transmission with deterministic latency, slow control, status monitoring and readout data. In this context, the CERN ESE group designs, develops and qualifies (in collaboration with several international partners) all ASICs, optoelectronic devices and ancillary electronics necessary to operate the link. Within this project several new devices were developed for use by the experiments and are currently being produced in quantities reaching 70,000 pieces. These components are the GBTx [1] (GigaBit Transceiver, an ASIC dedicated to serialization, deserialization and data and clock recovery), the GBT-SCA [2] (Slow Control Adapter, an ASIC dedicated to slow control and status monitoring) and the VTRx/VTTx [3] (Versatile optical transceiver/dual transmitter, equipped with custom-designed ASICs and qualified optical sub-assemblies). All of them are specifically designed to withstand the very high radiation levels met within the LHC detectors. Each of them has been carefully tested and qualified against the experiments' requirements; however, the community lacked a simple way of testing and characterizing the complete system with all of them interconnected.
The Versatile Link Demonstrator Board (VLDB) is the evaluation kit for the radiation-hard Optical Link ecosystem. It gathers the three aforementioned radiation-hard ASICs and modules composing the main elements of the Link, plus the FeastMP [4], a radiation-hard in-house designed DC-DC converter made of a custom-designed ASIC and qualified components. This paper introduces the VLDB and its components in section 2 before presenting several interconnection schemes reproducing typical use cases. Section 3 is dedicated to tests performed with the VLDB, characterizing the high-speed link in terms of jitter, temperature and radiation. Finally, section 4 focuses on GBT-SCA tests done using the VLDB.

The VLDB: main components and interconnection schemes
The VLDB (figure 1) is composed of the following radiation-hard ASICs and modules.
The GBTx is a radiation-hard ASIC which implements a high-speed link (3.2 to 4.48 Gbps of payload depending on the operational mode). The ASIC gathers in a pair of optical links Data Acquisition (DAQ), Timing, Trigger and Control (TTC) and Slow Control (SC) information. Up to 40 front-end modules can be connected to the GBTx using its electrical link (e-link) ports, which offer line rates from 80 to 320 Mbps. To achieve this, the GBTx uses a 120-bit frame, which includes Slow Control (SC) and data fields together with an optional Forward Error Correction (FEC) field. The GBTx also serves as an LHC bunch clock (40 MHz) distributor, as it recovers the clock from the incoming serial data and transmits it to every connected front-end, as well as to 8 phase-adjustable clock outputs. The GBTx serializer/deserializer uses a scrambled Reed-Solomon encoding scheme in order to cope with the radiation and Single Event Upsets (SEUs) that may happen in the radiation environments for which it is designed. Fully configurable, the GBTx features more than 400 registers accessible over its serial link or via an I2C interface available on the VLDB.
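The rate figures quoted above can be cross-checked with a short sketch. The field split below (4-bit header, 4-bit slow control, 80-bit data, 32-bit FEC, with the FEC field reused for data in wide-bus mode) follows the standard GBT frame description and should be read as an illustration rather than a normative definition:

```python
# Bit budget of the 120-bit GBT frame, transmitted at the LHC bunch-clock
# rate of 40 MHz. The exact field split is an assumption of this sketch.
FRAME_BITS = 120
FRAME_RATE_HZ = 40e6

fields_gbt_mode = {"header": 4, "slow_control": 4, "data": 80, "fec": 32}
assert sum(fields_gbt_mode.values()) == FRAME_BITS

line_rate = FRAME_BITS * FRAME_RATE_HZ                  # serial line rate
payload_gbt = fields_gbt_mode["data"] * FRAME_RATE_HZ   # payload with FEC on
payload_widebus = (80 + 32) * FRAME_RATE_HZ             # FEC field reused for data

print(line_rate / 1e9, payload_gbt / 1e9, payload_widebus / 1e9)
# 4.8 Gbps line rate; 3.2 to 4.48 Gbps payload, matching the text
```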
Along with the GBTx, the GBT-SCA is dedicated to monitoring and controlling front-end modules. It can handle up to 16 I2C, 8 SPI and 1 JTAG chains, plus 32 GPIO lines, 31 ADC analog inputs, a temperature sensor and 4 DAC channels. The GBT-SCA is controlled via the GBTx using the previously mentioned SC field. The VTRx is an optical transceiver which meets the radiation tolerance and high-speed requirements of the FE interfaces. The VTRx package is compatible (with small adjustments) with commercial SFP+ modules, so the VLDB can host either one or the other for functional tests in a radiation-free environment.
Finally, the VLDB is equipped with 2 FeastMP DC-DC converters supplying 1.5 V and 2.5 V/3.3 V to both on-board ASICs and the VTRx/SFP+. These voltage converters are also radiation tolerant and accept an input voltage from 5 to 12 V.
Designed for flexibility, the VLDB is the first design allowing system-level tests of the full GBT chipset family with a complete interconnection of the constitutive components. It is also delivered with a USB-to-I2C dongle and control software [5] to help users with the GBTx configuration. HDMI connectors allow testing communication with experiments' front-end electronics and trying multiple interconnection schemes between several VLDBs, playing all the roles foreseen by the various use cases.
For a first contact with the radiation-hard Optical Link ecosystem, a basic setup could consist of one VLDB, an FPGA-based KC705 board [6] (acting as back-end board), an optical link and 8 e-link connections through HDMI cables and an e-link FPGA Mezzanine Card (FMC). Using the setup depicted in figure 2, data can be serialized by the GBT-FPGA core [7] in the Kintex-7, transmitted over the optical link, recovered by the GBTx, distributed on the e-links and checked back in the FPGA.
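Loopback tests of this kind typically compare a transmitted pseudo-random bit sequence against the received one. The sketch below is purely illustrative (the GBT-FPGA core implements its own pattern generation and checking in firmware, and the choice of PRBS-7 here is ours): a software PRBS-7 generator and a trivial comparison step.

```python
def prbs7(n, seed=0x7F):
    """Generate n bits of a PRBS-7 sequence (x^7 + x^6 + 1), period 127."""
    state = seed & 0x7F
    bits = []
    for _ in range(n):
        newbit = ((state >> 6) ^ (state >> 5)) & 1   # taps at x^7 and x^6
        bits.append(state & 1)
        state = ((state << 1) | newbit) & 0x7F
    return bits

tx = prbs7(1000)
# an error-free loopback returns the same bits; the checker just compares
rx = list(tx)                      # stand-in for the received e-link stream
errors = sum(a != b for a, b in zip(tx, rx))
print("bit errors:", errors)
```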
A second type of setup can be built using the GBT-SCA to control a set of devices that would be located in the front-end. This system needs an FPGA (a KC705 board in this case), a VLDB and the devices to control through I2C, SPI, JTAG, GPIOs, etc. In the system presented in figure 3 the FPGA is controlled by a PC via Ethernet and provides the data for the SCA commands. These commands travel via the optical link to the GBTx, where they are deserialized and transmitted to the SCA, which in turn communicates with the different connected boards. This setup allows control over a myriad of devices, as each I2C link can typically control up to 127 devices, while there are also 8 SPI control lines.
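The device-count claim is simple arithmetic; a minimal sketch, where the one-device-per-SPI-line assumption is ours:

```python
# Rough upper bound on devices one GBT-SCA can address, using the channel
# counts quoted in the text (7-bit I2C addressing => up to 127 per bus).
I2C_MASTERS, I2C_ADDR_PER_BUS = 16, 127
SPI_LINES = 8            # assume one device per SPI control line

max_i2c = I2C_MASTERS * I2C_ADDR_PER_BUS
print(max_i2c, "I2C devices;", max_i2c + SPI_LINES, "with SPI")
```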
The GBTx can act as a clock distributor, creating a hierarchy of devices. This kind of setup is shown in figure 4. Using a GBT-SCA to program a slave GBTx is also possible, and these slave GBTx chips can in turn have their own SCAs controlling front-end devices. As usual, a KC705 FPGA is used to control the master GBTx, while the slaves can be fully dedicated to data acquisition. In this manner, up to 8 slave GBTx chips can receive the clock from one master.

JINST 12 C02020
The configuration presented in figure 2 was used to characterize the jitter of the GBT chipset and to test the full ecosystem in temperature and under radiation. The system presented in figure 3 is currently used to evaluate the slow control of FE devices over the full link. The results are presented in the coming sections.

Clock quality and Radiation tests
A very important feature of the GBTx is that it guarantees low jitter of the recovered clock and constant latency of clock and data, both during a run and between resets or power cycles. This is mandatory to distribute Timing, Trigger and Control signals to front-end modules. As temperature may vary both in back-end crates and in front-end locations, the jitter and the phase of the clock recovered by the GBTx have to be characterized, ensuring a continuous and limited variation with temperature or between power cycles. The setup presented in figure 2 was used both for a detailed jitter characterization of the recovered clock at room temperature and for a study of phase and jitter variation over temperature and power cycles. For both studies, the GBT-FPGA transceiver of the KC705 used a clean reference clock provided by a CG635 generator from Stanford Research Systems.

Jitter characterization
The GBTx has two main ways of redistributing the received clock, which is recovered from the serial link when the transceiver or receive modes are selected. Firstly, it has up to forty e-links which can output both data and clocks. Secondly, it also features eight clock outputs whose phase can be adjusted using configurable phase shifters. The frequency of all the clock outputs can be set at 40, 80, 160 or 320 MHz. In the VLDB, twenty of these e-link ports are available on the mini-HDMI connectors.
A phase noise analysis was performed at room temperature both on the e-link and on the phase-shifted clock outputs with the GBTx in transceiver mode. The results, presented in figure 5, show better phase noise performance for the e-link clocks than for the phase-shifted ones: the integration of the phase noise over a 1 Hz to 10 MHz range reveals a random jitter of 12 ps rms for the e-link, to be compared to about 15 ps rms for the phase-shifted clocks. This slight excess of jitter for the second type of clock outputs is explained by the presence of a phase noise peak around 1 MHz due to the delay chain that these clocks have to go through in order to be phase-variable.
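For reference, the conversion from an integrated single-sideband phase noise profile to rms jitter can be sketched as follows. The phase noise profile used here is synthetic (a flat floor), not measured GBTx data; only the integration formula is standard.

```python
import numpy as np

def rms_jitter(freq_hz, L_dBc_per_Hz, carrier_hz):
    """Integrate single-sideband phase noise L(f) into rms jitter:
    sigma_t = sqrt(2 * integral 10^(L/10) df) / (2*pi*f0),
    with the factor 2 accounting for both sidebands."""
    f = np.asarray(freq_hz, dtype=float)
    L = 10.0 ** (np.asarray(L_dBc_per_Hz, dtype=float) / 10.0)
    # trapezoidal integration over the offset-frequency range
    phi2 = 2.0 * np.sum(0.5 * (L[1:] + L[:-1]) * np.diff(f))  # rad^2
    return np.sqrt(phi2) / (2.0 * np.pi * carrier_hz)         # seconds

# Illustrative flat -120 dBc/Hz floor, 1 Hz to 10 MHz, on a 40 MHz clock
# (synthetic numbers in the same ballpark as the measurements above):
f = np.logspace(0, 7, 500)
j = rms_jitter(f, np.full_like(f, -120.0), 40e6)
print(round(j * 1e12, 1), "ps rms")
```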

Temperature tests
The main objective of these tests is to demonstrate to what extent the clock phase is altered by controlled variations of temperature and to ensure that the phase deviation is continuous, with no jumps.
The setup uses a climate chamber to modify the ambient temperature in which the GBTx is placed, and the chip is reset every 30 seconds. The phase of the different recovered clocks (phase-shifted clocks, e-link clocks and internal clocks) has been tracked with respect to the clean reference clock provided by a CG635 generator feeding the GBT-FPGA transceiver. The results (figure 6) show a phase drifting linearly with temperature by typically a few hundred picoseconds from 5 to 65 °C. No phase jump was detected during 10,000 repetitive reset cycles.
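The linear drift reported above corresponds to a slope on the order of 5 ps/°C (a few hundred ps over the 5 to 65 °C span). A minimal sketch of the linear fit used to extract such a coefficient, on idealized synthetic data:

```python
import numpy as np

# Synthetic phase-vs-temperature points with the order of magnitude quoted
# in the text (~300 ps over 60 degC => ~5 ps/degC); not measured data.
temps = np.arange(5, 66, 5, dtype=float)       # degC
phase_ps = 5.0 * temps                         # idealized linear drift

slope, offset = np.polyfit(temps, phase_ps, 1) # least-squares linear fit
print(f"drift ~ {slope:.1f} ps/degC, total {slope * 60:.0f} ps over 60 degC")
```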

Radiation tests
An exhaustive radiation campaign was performed in 2014 and 2015, both with X-rays for Total Ionizing Dose (TID, up to 100 MRad) and with heavy ions for SEUs. The Bit Error Rate (BER) was monitored and the clock jitter was also measured, showing just a 7.7% increase after irradiation. SEU tests were also carried out at the Heavy Ion Irradiation Facility (HIF) in Louvain-la-Neuve, Belgium, with results showing that the dominant SEU effect on the GBTx is lock errors [8].
A total dose radiation test was done at the IRRAD proton facility at CERN [9]. The aim of this test was to evaluate the TID that the full system could sustain, aiming to reach higher levels than in previous tests. Two VLDBs were set in the centre of the beam trajectory, with the GBTx directly exposed to the 24 GeV/c protons from the PS accelerator. The beam was centered on the GBTx in a 12 × 12 mm square. A schematic and a picture of the setup are shown in figures 7 and 8, respectively. The registers in the GBTx were read continuously to check the corrections made internally by the SEU correction mechanism. The setup withstood up to 400 MRad in working conditions, although the optical link showed lock errors triggered by every proton spill, due to the very high intensity of each spill (approximately 3.4 × 10¹¹ protons per spill).

Slow control studies
The setup presented in figure 3 is currently used to study software and firmware solutions for slow control of front-end devices over the full link. Three of them, mostly developed by experiments and either fully based on Hardware Description Language (HDL) or mainly software-based, are currently under evaluation. As the systems presented have different types of instructions and interfaces, a common naming is used in this document. The interface between the PC that controls the setup and the FPGA is the control interface, and the commands travelling through it are firmware commands, or transactions when they do not contain any payload. Between the SOL40-SCA and the SCA the commands are called SCA commands, while between the SCA and the front-end (FE) devices they are called FE instructions. The FE instructions pass through an SCA channel, which can be one of the I2C interfaces, the SPI, the JTAG, etc.
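The naming convention above can be summarized with a small data model. Class and field names here are our own illustration, not identifiers from any of the three systems:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class FEInstruction:
    channel: str        # SCA channel: "I2C0", "SPI", "JTAG", "GPIO", ...
    payload: bytes

@dataclass
class SCACommand:
    fe_instruction: FEInstruction

@dataclass
class FirmwareCommand:
    # None models a pure "transaction" carrying no payload
    sca_command: Optional[SCACommand]

cmd = FirmwareCommand(SCACommand(FEInstruction("I2C0", b"\x42")))
transaction = FirmwareCommand(None)
print(cmd.sca_command.fe_instruction.channel, transaction.sca_command)
```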

Different SCA systems
The systems presented in figures 9a and 9b are mostly based on HDL firmware in the back-end FPGA. They make use of the LHCb SOL40-SCA firmware core (section 4.2) to process the SCA commands directed to the SCA, encoding them into HDLC frames [10]. The firmware commands, and therefore the SCA commands, come in both cases from a computer, but the control interfaces and the FPGAs used are different.
In the first system (figure 9a) the firmware commands are sent through Ethernet using the TCP/IP protocol. The firmware consists of a MicroBlaze system that receives them, the SOL40-SCA core and the GBT-FPGA module. In the second setup (figure 9b) the Ethernet-TCP/IP control interface is replaced with a PCIe one. This improves the system performance, as PCIe achieves a lower latency. This is the approach used by the LHCb experiment.

Finally, the third system uses a software approach. In this setup most of the work to compose the HDLC frame and to address the different interfaces is done in software written in the TCL or C languages. The firmware thus only receives the already composed frames via the JTAG control interface. The frame is then serialized in order to be sent using the EC channel. This type of system (represented in figure 9c) is in development by the ATLAS experiment.
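As an illustration of the software approach, composing a generic HDLC-style frame (flag, payload, 16-bit frame check sequence, flag) can be sketched as below. Bit stuffing and the exact SCA frame fields are deliberately omitted; only the CRC-16/X.25 frame check sequence is standard HDLC, and this is not the actual ATLAS or LHCb code.

```python
def crc16_x25(data: bytes) -> int:
    """CRC-16/X.25 (reflected 0x1021, init/xorout 0xFFFF), the HDLC FCS."""
    crc = 0xFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            crc = (crc >> 1) ^ 0x8408 if crc & 1 else crc >> 1
    return crc ^ 0xFFFF

def hdlc_frame(payload: bytes) -> bytes:
    """Wrap a payload as flag | payload | FCS (little-endian) | flag.
    Byte/bit stuffing is omitted for brevity."""
    fcs = crc16_x25(payload)
    return b"\x7e" + payload + bytes([fcs & 0xFF, fcs >> 8]) + b"\x7e"

frame = hdlc_frame(b"123456789")
print(hex(crc16_x25(b"123456789")))   # standard CRC-16/X.25 check value
```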

SOL40-SCA IP
The LHCb experiment at CERN is advancing towards a triggerless architecture, for which an upgrade of the Experiment Control System is being developed. This system uses the GBT ecosystem, and in particular the SCA, to interface with the FE using the different SCA channels available. In this framework, LHCb has developed a firmware core fundamental to both HDL-based systems presented in this publication: the SOL40-SCA [11]. The SOL40-SCA is divided into 4 layers, each of them having a different role.
The first one is the interface layer. It implements a set of registers which are accessed via an address map. The registers can be dedicated to sending commands, receiving the reply, or control. A set of command or reply registers, including both header and data ones, forms an ECS (Experiment Control System) packet. The whole SCA register structure (several registers per channel) can be accessed using this reduced set of registers. Under normal conditions, if the registers are not already populated with the required values, several registers must be written in the SOL40-SCA (each one using a firmware command) to generate one SCA command.
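The write sequence described above can be modelled as follows. The register names are hypothetical, invented for this sketch; only the one-firmware-command-per-register-write cost is taken from the text.

```python
# Illustrative model of the interface layer: issuing one SCA command costs
# several register writes, each of which is one firmware command.
def sca_command_writes(channel, header_words, data_words):
    writes = []
    for i, w in enumerate(header_words):
        writes.append((f"CMD_HEADER_{i}", w))   # hypothetical register names
    for i, w in enumerate(data_words):
        writes.append((f"CMD_DATA_{i}", w))
    writes.append(("CMD_GO", channel))          # final write fires the command
    return writes

# e.g. 2 header words + 4 data words + trigger => 7 firmware commands
writes = sca_command_writes(channel=3, header_words=[0xA0, 0x01],
                            data_words=[1, 2, 3, 4])
print(len(writes), "firmware commands for one SCA command")
```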
The buffer layer comes after the interface layer. It has two different purposes, depending on the direction of the operation. For a write, it acts as a FIFO memory, making it possible to continuously send firmware commands to the SOL40-SCA. For a reply packet coming from the SCA, the buffer layer has a memory with one position per channel, each position storing a full set of reply registers. In this way a user can get a reply from one SCA channel, do some operations with another SCA channel, and afterwards come back to read the replies from both of them.
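A minimal software model of this buffer layer, with an illustrative channel count (the actual SCA channel enumeration is not reproduced here):

```python
from collections import deque

class BufferLayer:
    """Sketch: a FIFO on the command path so firmware commands can stream
    in continuously, plus one reply slot per SCA channel so replies from
    different channels can be read back in any order."""
    def __init__(self, n_channels=21):            # channel count is illustrative
        self.cmd_fifo = deque()
        self.reply_slots = {ch: None for ch in range(n_channels)}

    def push_command(self, cmd):
        self.cmd_fifo.append(cmd)

    def pop_command(self):
        return self.cmd_fifo.popleft()

    def store_reply(self, channel, reply):
        self.reply_slots[channel] = reply          # one slot per channel

    def read_reply(self, channel):
        return self.reply_slots[channel]

buf = BufferLayer()
buf.store_reply(2, "i2c-ack")
buf.store_reply(5, "spi-data")
print(buf.read_reply(2), buf.read_reply(5))        # out-of-order reads are fine
```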
The protocol layer converts the ECS packet structure into the frame format expected by the SCA for each of the protocols (I2C, SPI, JTAG, etc.). It also implements a queue of commands going to and coming from the SCA, managing each e-link in case several of them connect to different MAC layers.
Finally, the MAC layer creates the HDLC frame with the data provided by the protocol layer, or extracts the data from an SCA packet. It encodes the frame provided by the protocol layer into HDLC and ensures serialization and deserialization.
Each instance of a SOL40-SCA is associated with a GBT link, where it is able to manage from 1 up to 40+1 GBT-SCAs in a front-end [11]. It does so by replicating some internal layers, which impacts its resource usage. In this publication the SOL40-SCA always manages one SCA.

FPGA resources usage
The SOL40-SCA IP resource usage in the HDL systems is compared to that of the alternative software-based approach, which is expected to be much lighter on FPGA resources. A Xilinx XC7K325T device (on a KC705 board) has been used to implement both the HDL system and the software one. The results are presented as percentages in table 1. The table reports the resource usage of the whole system made of a MicroBlaze-Ethernet Lite subsystem (not detailed), the GBT-FPGA and the SOL40-SCA with its sublayers. In the second part of the table the software system resources are given (the JTAG control interface is not detailed), disaggregating the External Control (EC) serializer and the GBT-FPGA, which logically consumes the same resources as in the first system.
It can be seen that the SOL40-SCA IP uses a non-negligible amount of resources, ranging from 9.64% of the FPGA slices to just 0.66% of the block RAM, while the EC serializer makes use of only 0.39% of the slices in the device. This may be important for systems where several links have to be implemented in the same device and resources are limited.

Access times and interface comparison
The most important difference between the HDL systems is the control interface used to communicate with the FPGA: one system uses TCP/IP, the other PCIe. Both setups use the SOL40-SCA IP, which requires several firmware commands to send an FE instruction over an SCA channel. For example, in order to perform a read from the I2C SCA channel, 13 firmware commands have to be issued to the SOL40-SCA. This number varies across the different SCA channels, 15 being needed in the SPI case and only 8 commands for a GPIO read. When the TCP/IP control interface is added on top of this, the number of transactions is tripled (because of the acknowledgements), which considerably slows down the access to the FE devices (shown in table 2). Shorter access times are reached using the PCIe control interface, which is better tailored for this type of application as it only has a 50 ms latency for the same I2C instruction.

The HDL-PCIe setup was used to program an Artix-7 XC7A35T device over JTAG via the SCA. Unlike the HDL-TCP/IP system, which uses a TCP socket to control the setup from a PC, this one uses PCI Express. As previously mentioned, this makes a notable difference in the time required to do thousands of operations, as is the case when programming an FPGA. The SOL40-SCA IP requires at least 9 SCA commands to send 128 bits (the size of the JTAG buffer in the SCA) via the JTAG channel of the SCA. The frequency of the JTAG clock stops having an effect on the total configuration time of an FPGA above 10 MHz, where the overhead imposed by the SCA architecture acts as a bottleneck. Figure 10 shows the time needed to program 2 types of Xilinx FPGAs (Artix-7 and Kintex-7), which can be up to 10 minutes. Using the HDL-TCP/IP system, the time needed to perform the same actions would be about one hour.
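A back-of-envelope model of the JTAG configuration time is given below. The bitstream length and the per-command latencies are assumptions chosen to land in the ballpark of the reported figures (about 10 minutes over PCIe, about one hour over TCP/IP); only the 9-commands-per-128-bit-block overhead comes from the text.

```python
import math

def config_time_s(bitstream_bits, per_cmd_latency_s,
                  cmds_per_block=9, block_bits=128):
    """Estimate FPGA configuration time through the SCA JTAG channel:
    each 128-bit block costs at least 9 SCA commands."""
    blocks = math.ceil(bitstream_bits / block_bits)
    return blocks * cmds_per_block * per_cmd_latency_s

ARTIX7_35T_BITS = 17_536_096           # assumed XC7A35T bitstream length
pcie = config_time_s(ARTIX7_35T_BITS, 0.45e-3)   # assumed PCIe cmd latency
tcpip = config_time_s(ARTIX7_35T_BITS, 2.7e-3)   # assumed TCP/IP cmd latency

print(f"PCIe ~{pcie / 60:.0f} min, TCP/IP ~{tcpip / 60:.0f} min")
```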

Conclusions
The VLDB is extensively used by the High Energy Physics community. Sixty boards have been produced and all of them were distributed to users or used for an extensive characterization campaign. Many VLDB boards, together with the USB-to-I2C dongle and the software to configure the GBTx, were used as a starter kit for GBTx configuration and as a reference for the design of new front-end boards. Lately, the interest of the community has focused on SCA control, and the VLDB is widely used to understand and qualify FE configuration and monitoring. The first studies presented in this paper highlighted the need for optimized software and firmware solutions to minimize the access time to the slow control devices in the FE. Fifty new VLDBs are currently being produced to meet the increasing needs of the users.