The DAQ and control system for the CMS Phase-1 pixel detector upgrade

In 2017 a new pixel detector was installed in the CMS detector. This so-called Phase-1 pixel detector features four barrel layers in the central region and three disks per end in the forward regions. The upgraded pixel detector requires an upgraded data acquisition (DAQ) system to accept a new data format and larger event sizes. A new DAQ and control system has been developed based on a combination of custom and commercial microTCA parts. Custom mezzanine cards on standard carrier cards provide a front-end driver for readout, and two types of front-end controller for configuration and the distribution of clock and trigger signals. Before the installation of the detector the DAQ system underwent a series of integration tests, including readout of the pilot pixel detector, which was constructed with prototype Phase-1 electronics and operated in CMS from 2015 to 2016, quality assurance of the CMS Phase-1 detector during its assembly, and testing with the CMS Central DAQ. This paper describes the Phase-1 pixel DAQ and control system, along with the integration tests and results. A description of the operational experience and performance in data taking is included.


Keywords: Data acquisition concepts; Detector control systems (detector and experiment monitoring and slow-control systems, architecture, hardware, algorithms, databases); Modular electronics; Optical detector readout concepts

Introduction
The CMS collaboration has adopted the approach of relying on a highly granular pixel detector as a key element for the reconstruction of charged particle tracks and interaction vertices. A detailed description of the CMS detector, together with a definition of the coordinate system used and the relevant kinematic variables, can be found in ref. [1]. The description of track reconstruction with the CMS tracker can be found in ref. [2].
The original CMS pixel detector [3] featured three barrel layers and two forward disks on each end. It was operated during LHC Run 1 (2010–2012) and the first part of Run 2 (2015–2016), and was designed to record efficiently and with high precision the first three space-points of a charged particle track near the interaction region up to an instantaneous luminosity of 1.0 × 10³⁴ cm⁻²s⁻¹, with colliding bunch crossings (BX) at a spacing of 25 ns. The original pixel detector would not have sustained a satisfactory performance given the luminosity conditions expected in LHC running after 2017, due to inefficiencies in the front-end readout chip (ROC) and because the maximum throughput rate for the data links of the innermost layer would have been exceeded.
The goal of the Phase-1 pixel detector upgrade project [4] was to perform an evolutionary upgrade with minimal disruption of data taking and reconstruction by keeping the pixel size, sensor, and readout architecture the same, while improving the performance through a higher rate capability of the ROCs and a larger data transmission rate, more robust track reconstruction through the addition of a fourth barrel layer and a third disk per endcap, as well as a reduced material budget. The Phase-1 pixel detector was designed to maintain a high tracking performance at luminosities up to 2.5 × 10³⁴ cm⁻²s⁻¹, corresponding to an average of 80 simultaneous inelastic interactions per 25 ns spaced BX (these interactions are referred to as 'pileup'). The Phase-1 pixel detector, with a modified data acquisition (DAQ) and control system, was installed during an extended year-end technical stop at the beginning of 2017. It is expected to deliver high quality data in the high luminosity environment of the LHC up to Long Shutdown (LS) 3, which is scheduled to start in 2024. Roughly 350 fb⁻¹ of data will be collected until LS 3.
The Phase-1 pixel DAQ and control system has been developed using a combination of custom and commercial microTCA parts. Custom mezzanine cards on CMS-developed carrier cards provide a Front-End Driver (FED) for readout, as well as a Pixel Front-End Controller (FEC) for configuration, the distribution of clock, fast commands, and trigger signals, and a Tracker FEC for programming auxiliary electronics. The Tracker and Pixel FECs use the same hardware and differ in the firmware.
This paper describes the Phase-1 pixel detector DAQ and control system. Section 2 gives a system overview, section 3 describes the front-end ASICs, section 4 outlines the optical components and section 5 explains the back-end implementation. Sections 6 and 7 describe the Phase-1 pixel pilot system and laboratory tests, respectively. Section 8 explains the software used for the pixel detector operation. Section 9 provides an overview of the performance during operation.

System overview
The CMS Phase-1 pixel detector has three disks on both ends of the forward regions (FPIX) and four barrel layers in the central region (BPIX). An overview of the Phase-1 pixel detector DAQ and control system architecture, including auxiliary components required to interface with the central CMS services, is shown in figure 1. The CMS Phase-1 pixel detector contains one type of sensor, bump bonded to 16 ROCs [5]. The active area of the module is 16.2 × 64.8 mm². The pixel size has remained the same as in the original detector, 100 × 150 µm². The same n⁺-in-n technology as for the original detector is used for the silicon sensors. A high density interconnect (HDI) is glued on top of the sensor. The HDI provides signal and power distribution for the ROCs, and it carries the token-bit manager chip (TBM) and decoupling capacitors. The TBM chips are glued onto and wire-bonded to the HDI; they orchestrate the transmission of the data from the ROCs to the back-end electronics. The TBM chip is described in detail in section 3.2. The Phase-1 pixel detector features a fully digital readout system including new back-end electronics. The new ROCs with digital readout operate on a 40 MHz clock¹ [6] and have a 160 Mb/s serial output data stream. This stream is encoded and multiplexed by the TBM using a 4b/5b encoding scheme, to reduce the impact of bit-errors during transmission [7] and for DC balancing. The TBM outputs one or two 400 Mb/s data streams. A dedicated ROC was designed for the sensor modules of the innermost layer of BPIX (layer 1) to cope with the higher hit rates. Layer 1 sensor modules require two TBMs to manage the higher data rates, while all other modules have one TBM. The sensor modules are connected to the on-detector auxiliary electronics (portcards) via flex (FPIX) or twisted pair (BPIX) cables.
The portcards are located in the 3 m long service cylinders, which also serve as mechanical support for all detector services (power, cooling, and optical links) routed to and from the outside of the 6 m long pixel support tube. There are two different types of optical hybrids on the portcards: the Pixel-Opto-Hybrid (POH) and the Digital-Opto-Hybrid (DOH), which facilitate communication with the back-end electronics via optical links.

¹The LHC frequency and clocks derived from the LHC frequency are referred to as 40 MHz (and multiples of 40 MHz) in this paper.

JINST 14 P10017
The POH converts the electrical signal from the TBM to an optical signal and delivers it to the FED. The FED handles decoding and deserialization, and builds event fragments, which are sent to the Central DAQ Front-End Readout Optical Link-40 (FEROL40) card [8], the first stage of the CMS Central DAQ chain, via a small form-factor pluggable (SFP+) 10 Gb/s S-Link Express transceiver (Tx). There are 24 input channels per FED card; two receivers (Rx) with twelve channels each receive the data from the sensor modules. The FED receives clock, trigger, and fast commands (called TTC [9] for timing, trigger, and control) from the CMS Trigger Control and Distribution System (TCDS) [10] via a CMS-custom module called AMC13 [11] and the microTCA backplane. The clock runs at the LHC frequency of 40 MHz. The FED also provides a trigger throttle system (TTS) [9] signal, which is a 4-bit status word as defined in ref. [12], to the AMC13. The AMC13 forwards the TTS signals from all the FEDs in a crate to TCDS. The TTS signal indicates whether FEDs are ready to accept triggers or not, and whether the event counters are still synchronized. The overall TTS state depends on the status of each FED. At a given moment a FED should either accept or block CMS level-1 triggers (L1A) [13]. The pixel detector DAQ is able to maintain event synchronization across all FEDs with this back-pressure system. A total of 108 microTCA Pixel FEDs are required to read out the Phase-1 pixel detector.
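The back-pressure logic described above can be illustrated with a short sketch: the crate-level TTS state is simply the most severe of the per-FED states. The 4-bit encodings and the priority ordering below are illustrative assumptions, not the exact definitions of ref. [12].

```python
# Hypothetical sketch: aggregating per-FED TTS states into one crate-level
# state. State codes and severity ordering are assumptions for illustration.
DISCONNECTED, OVERFLOW_WARNING, OUT_OF_SYNC, BUSY, READY, ERROR = (
    0x0, 0x1, 0x2, 0x4, 0x8, 0xC)

# Most severe state first: a single FED in a bad state dominates the crate.
_PRIORITY = [ERROR, OUT_OF_SYNC, BUSY, OVERFLOW_WARNING, DISCONNECTED, READY]

def aggregate_tts(fed_states):
    """Return the overall TTS state for a list of per-FED 4-bit states."""
    for state in _PRIORITY:
        if state in fed_states:
            return state
    return DISCONNECTED  # no FEDs reporting

# Example: one busy FED throttles the whole crate.
assert aggregate_tts([READY, READY, BUSY]) == BUSY
```

With this ordering, a single FED reporting Busy is enough to block L1As for the whole crate, which is how event synchronization is preserved.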
The AMC13 also propagates the received signals to the Pixel FECs, which distribute them to the sensor modules via the portcards. On the portcards these signals are decoded, after the opto-electrical conversion in the DOHs, by the Tracker Phase Locked Loop (TPLL) [14] and Quartz Phase Locked Loop (QPLL) [15] chips. The signals are then forwarded to the sensor modules on dedicated lines passing through Delay25 chips [16], which provide the functionality to delay trigger signals, as well as sent and received clock and data signals, with a granularity of 0.5 ns. Each sensor module connected to a pixel-control link is identified by a unique, hardwired 5-bit hub address. The Pixel FEC is also responsible for programming the TBM and the digital-to-analog-converter (DAC) registers of the ROCs. A total of 16 microTCA Pixel FECs are required to operate the Phase-1 pixel detector.
Registers on the portcards, including Delay25 chips, and DC-DC converters [17], used for powering, are programmed by the Tracker FEC via the Inter-Integrated Circuit (I²C) interface and Parallel Interface Adapter (PIA) port, respectively, of a Control & Communication Unit (CCU) [18]. Several CCUs are arranged in a ring-like topology (referred to as control or token ring) via semi-redundant connections that carry clock and data signals. A total of 3 microTCA Tracker FECs and 12 token rings are required to control the Phase-1 pixel detector auxiliary electronics.
The number of optical readout links has increased with respect to the original detector from 448 to 672 for FPIX and from 1152 to 1696 for BPIX, resulting in a total of 2368 readout links. The first and second layer of BPIX use four and two links per sensor module, respectively, to cope with the higher occupancy and data rate. The third and fourth layer of BPIX, as well as the FPIX disks, use one link per sensor module.

Readout chips
The ROC used in the original CMS pixel detector, PSI46 [19], was designed for hit rates of a few tens of MHz/cm², encountered at BPIX layer 1 for an LHC instantaneous luminosity of 1.0 × 10³⁴ cm⁻²s⁻¹ with a 25 ns bunch spacing. This readout chip performed well during the data taking periods from 2008 to 2016. However, it showed expected inefficiencies when operated at higher data rates when the LHC started operating at instantaneous luminosities above the design value. In addition, the innermost layer was moved closer to the beam line. Therefore, new pixel ROCs had to be designed: an evolutionary update of the original PSI46, the PSI46dig, used in FPIX and BPIX layers 2, 3, and 4, and a dedicated ROC for BPIX layer 1, the PROC600, to cope with the exceptionally high rates of up to 600 MHz/cm².
The new readout chip evolved from the PSI46 ROC, keeping most of its characteristics: pulse-height readout, and 52 × 80 pixels organized in 26 double-columns of 2 × 80 pixels with common data transfer to latency buffers in the periphery outside the active pixel region. The digital Phase-1 pixel ROC (PSI46dig) is manufactured in the same 0.25 µm CMOS technology as the PSI46, and the overall layout and many building blocks remained unchanged. The two main improvements needed for the upgrade were larger data buffers and higher readout speed.
The double-column buffer sizes have been increased from 32 to 80 cells for the hits and from 12 to 24 cells for the time-stamps. Unlike the analog PSI46 ROC, the PSI46dig ROC outputs digital data for which an analog-to-digital converter (ADC) has been implemented in the chip. It is an 8-bit successive approximation register ADC running at 80 MHz. Digitized data are stored in a 64 × 23 bit First In First Out (FIFO), which is read out serially at 160 Mb/s. The 80 and 160 MHz clocks needed for the ROC operation are generated from the external LHC clock using a Phase Locked Loop (PLL) circuit.
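The successive-approximation principle used by the on-chip ADC can be sketched in a few lines. This is a generic model of an 8-bit SAR conversion, not the actual circuit implementation in the PSI46dig.

```python
def sar_adc(sample, vref=1.0, bits=8):
    """Digitize `sample` (0..vref) by successive approximation:
    one bit is tested per step, from MSB down to LSB."""
    code = 0
    for bit in reversed(range(bits)):
        trial = code | (1 << bit)
        # Keep the bit if the DAC value for `trial` does not exceed the sample.
        if (trial / (1 << bits)) * vref <= sample:
            code = trial
    return code

# Mid-scale input lands on mid-scale code after 8 comparisons.
assert sar_adc(0.5) == 128
```

An n-bit SAR conversion needs exactly n comparator decisions, which is why the 8-bit converter can keep up with the 80 MHz digitization clock using modest analog circuitry.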
During the trigger latency of the CMS experiment, currently 4.15 µs, the pixel hit data must be stored inside the ROC, and only data corresponding to triggered events are read out through the serial links. The internal transfer and buffer capacities of the ROC were designed to cope with rates up to 200 MHz/cm². The rates of data loss have been measured with high-flux X-ray tubes for pixel hit rates of up to 300 MHz/cm², and were found to be in excellent agreement with expectations based on detailed architecture simulations [20].
In addition to the higher rate capacity of the ROC, several other improvements with respect to the PSI46 have been implemented. An additional metal layer for power distribution was added, which allows a better decoupling of the power lines from the signal lines, resulting in an improved pixel response uniformity as well as lower noise and cross-talk. An optimized comparator reduces the time-walk from about 35 ns [5] to 15 ns [19], resulting in a reduction of the difference between the in-time threshold (within a time window of one clock cycle) and the time-walk independent absolute threshold from about 800 to 150 electrons. The above improvements reduce the effective operational threshold of the ROC from 3400 electrons in the original detector to 1700 electrons for the upgraded one. This is important when the amount of charge per hit starts to decrease after radiation damage to the sensors: a highly irradiated detector will slowly degrade in signal-induced charge. With a lower threshold, the charge sharing among neighboring pixels can be exploited for position interpolation up to a higher integrated luminosity, leading to a higher resolution.
Based on operational experience with the PSI46 ROC and irradiation tests, further optimizations of the internal biasing were made that extend the range of ionizing dose tolerated by the PSI46dig ROC, reducing the need to re-adjust DAC settings with increasing accumulated dose. The PSI46dig ROC performed well and without significant performance degradation after irradiation up to 120 Mrad (4 × 10¹⁴ cm⁻² 24 MeV protons, at the irradiation facility in Karlsruhe), which is the maximum dose expected during LHC operations for FPIX and BPIX layers 2, 3, and 4. A detailed study on the radiation tolerance of the PSI46dig ROC can be found in ref. [21].
Despite the improved performance of the PSI46dig, its architecture would lead to unacceptable data loss rates for the innermost BPIX layer, where pixel hit rates up to 600 MHz/cm² may be encountered. A dedicated chip (PROC600) was designed for layer 1, with a complete re-design of the double-column periphery. The PROC600 features a four times higher hit transfer rate of pixels to the end-of-column buffers, and dead-time-free buffer management. The former is achieved by changing from single pixel to 2 × 2 pixel cluster transfers and the implementation of a simpler, handshake-free protocol. A faster and more power efficient analog bus was developed for the pulse height transfers. The data buffer was modified considerably: the PROC600 has a ring buffer with 56 buffer units, each containing a cluster base address plus four analog storage cells for the charge pulse heights. The readout is zero-suppressed in order to remove pixels in the cluster with zero measured signal amplitude. Only hits which are validated by a L1A are read out, without stopping the acquisition of new hits into the buffer. This avoids an interruption of the data acquisition process in the double-column or overwriting of data, as is the case in the PSI46dig ROC.
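The zero-suppressed readout of one 2 × 2 cluster buffer unit (base address plus four stored pulse heights) can be sketched as follows. The pixel ordering within a cluster is an assumption made for illustration.

```python
def read_out_cluster(base_col, base_row, pulse_heights):
    """Expand a 2x2 cluster (base address + four stored pulse heights) into
    per-pixel hits, suppressing pixels with zero measured amplitude."""
    offsets = [(0, 0), (1, 0), (0, 1), (1, 1)]  # assumed pixel ordering
    return [(base_col + dc, base_row + dr, ph)
            for (dc, dr), ph in zip(offsets, pulse_heights)
            if ph > 0]

# Example: only two of the four pixels in the cluster fired.
hits = read_out_cluster(10, 40, [87, 0, 0, 23])
assert hits == [(10, 40, 87), (11, 41, 23)]
```

Storing one base address per cluster instead of one address per pixel is what makes the four-fold faster column transfer affordable; the zero-amplitude entries are simply dropped at readout.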
Both ROCs have performed well in the 2017 and 2018 data taking. For the PSI46dig ROC all targeted improvements, i.e. low noise, lower threshold, and lower inefficiency at high rates, have been confirmed during data-taking. Some shortcomings have been observed for the PROC600, such as a higher than expected noise hit rate and the rare loss of data synchronization in double-columns. These issues could partially be mitigated by operational procedures, for example by increasing the in-time charge threshold for layer 1 to 3500 electrons, compared to the 1700 electrons used for the other layers. They have also been addressed in a revised design of the PROC600, which will be used in the planned replacement of the innermost BPIX layer in 2020 during LS2.
Details on the operational performance of both ROCs are shown in section 9.2.

Token-Bit Manager chip
The Phase-1 pixel detector TBM is a radiation-tolerant integrated circuit that controls the readout of groups of ROCs. It replaces the original TBM [22]. The TBM chip is mounted as a bare die, wire bonded to the HDI that is glued on the sensor modules. The principal functions of the TBM include the distribution of clock, L1As, and fast commands, as well as configuration data from the Pixel FECs to the ROCs. The TBM passes a token around the group of ROCs it controls to orchestrate the readout of data associated to a given L1A. If the token has not returned before the next L1A arrives, the TBM keeps each arriving L1A on a 32-deep stack while waiting for the token to return. The TBM adds a header and a trailer to the data stream on each token pass.
The TBM has one or two cores that output serial data at 160 Mb/s. The output data streams are encoded with a 4b/5b scheme, non-return-to-zero-inverted (NRZI), and multiplexed by a block called the DataKeeper into a 400 Mb/s stream which is transferred to the FED. There are three versions of the Phase-1 pixel detector TBM (summarized in table 1). The TBM08, used in FPIX disks and BPIX layers 3 and 4, combines two groups of ROC data, while the TBM09 and TBM10, used in BPIX layer 2 and layer 1, respectively, combine the output of four groups of ROCs into two 400 Mb/s data streams. The TBM09 and TBM10 differ in their timing settings, which are optimized to match the PSI46dig and PROC600, respectively. The data format for Phase-1 sensor modules is as follows: TBM Header, followed by ROC Headers and pixel-level event information, followed by TBM Trailer. The event number and stack count are included in the TBM Header, each ROC Header is followed by the column and row addresses of the pixels with hits and the hit amplitudes, and the TBM Trailer includes the error information.
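The logical ordering of a readout frame can be sketched as below. The field names and tuple layout are purely illustrative; they do not reproduce the on-wire bit format of the TBM.

```python
def build_tbm_frame(event_number, stack_count, roc_data, error_flags=0):
    """Assemble the logical event structure of a Phase-1 readout frame:
    TBM Header, then per-ROC header + hits, then TBM Trailer.
    `roc_data` is a list of (roc_id, [(col, row, amplitude), ...]) pairs."""
    frame = [("TBM_HEADER", event_number & 0xFF, stack_count)]
    for roc_id, hits in roc_data:
        frame.append(("ROC_HEADER", roc_id))
        for col, row, amplitude in hits:
            frame.append(("HIT", col, row, amplitude))
    frame.append(("TBM_TRAILER", error_flags))
    return frame
```

Note the event number wraps at 8 bits, matching the 8-bit counter carried in the real TBM Header; a ROC with no hits still contributes its header, which is what lets the FED count ROCs and flag missing ones.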

Optical components
The optical readout link starts at the electro-optic POH interface inside the detector and ends at the opto-electric receiver module interface on the FED. The data coming from TBMs are sent by the POH at a rate of 400 Mb/s. The control optical link system is based on the same components as used in the original pixel system: a DOH communicating bi-directionally with a FEC, which uses standard SFP transceivers. The readout and control links are shown in figure 1.

Pixel Opto Hybrid (POH)
The POH is a printed circuit board (PCB) mounted on the detector service cylinders. Figure 2 shows the POH4 (left) used in BPIX and the POH7 (right) used in FPIX. The optical characteristics of the two variants are the same. The overall system requires 424 POH4 and 96 POH7.
The design of the POHs uses the Transmitter Optical Sub-Assembly (TOSA) component provided by the Versatile Link project [23]. The POH receives electrical signals from the TBM and converts them into optical signals to be transmitted to the back-end receiver on the FED installed in the counting room, about 65 m away from the detector. Each POH houses single-mode Fabry-Perot laser TOSAs operating at 1310 nm, as well as Digital Level Translators (DLT) and Linear Laser Drivers (LLD) [24]. The DLT chips convert the signals received from the TBM to levels compatible with the LLD and introduce a gain and an offset to the input signal. The LLD chips drive the laser TOSAs. They pre-bias the lasers at their working point and modulate them with a current proportional to the input signal. The modulation gain and pre-bias currents at the LLD are controlled through an I²C interface. The POHs are used to transmit balanced digital signals at a maximum bit rate of 400 Mb/s. A typical output optical eye diagram is shown in figure 3.

Digital receiver
The digital receiver module used on the upgraded microTCA FEDs is a commercial component. Since the lasers mounted on the POHs emit light at a wavelength of 1310 nm, it was critical to identify a receiver module based on an InGaAs photodiode. Typically, high-density multi-channel receivers are based on GaAs photodiodes that operate with light at a wavelength of around 850 nm and are not sensitive to a wavelength of 1310 nm. One manufacturer was identified that is able to produce fully qualified receiver modules [25] with 12-way arrays of InGaAs photodiodes. These are integrated in pairs on a Field Programmable Gate Array (FPGA) Mezzanine Card (FMC) board to be mounted on the FEDs. The receiver modules have a diagnostic feature that allows the DC photocurrent to be measured on each input channel individually. This was used during initial detector checkout to spot problematic fiber connections. Figure 4 (left) shows a picture of a Receiver-FMC (Rx-FMC), with an SFP+ transceiver attached to it for the Central DAQ link.

Control
The optical link system used to control the Phase-1 pixel detector uses the same components as the previous detector system [26] at the front-end. The DOHs, located in the service cylinders, transmit the control signals between the Pixel and Tracker FECs and the detector front-end. The back-end components that are housed in the FECs are standard single-mode SFP modules rated for 1-2 Gb/s data rates. These SFPs plug into custom-designed FMC boards, shown in figure 4 (right), that are mounted on the FECs.

Back-end implementation
The design of the back-end electronics for the Phase-1 pixel detector is based on microTCA modular electronics [27]. A microTCA carrier hub (MCH) card is used as communication interface between the microTCA electronics and the local area network (LAN). The microTCA backplane is used to distribute clock, trigger and fast commands that are received from the TCDS via the AMC13.
The FC7 microTCA FMC carrier [28, 29] was selected as the platform for the new digital FED and the Pixel and Tracker FECs. As shown in figure 5, the FC7 is a full-size, double-width Advanced Mezzanine Card (AMC) holding a Xilinx Kintex 7 FPGA [30] and offering two low-pin-count compatible (LPCC) FMC slots.

Phase-1 Tracker FEC
The Tracker FEC is responsible for programming the auxiliary detector electronics, like the DC-DC converters, which is independent from the control of the sensor modules. Each Tracker FEC controls a control/token ring via semi-redundant connections that carry clock and data signals. The control is done via a token-ring protocol. For the CMS Phase-1 pixel detector there are four control rings each for FPIX and BPIX.
The Tracker FEC firmware is designed to implement four control ring firmware blocks (CTRL_RING) independently from each other. Each CCU control ring is addressed by one control ring firmware block. The firmware is link compliant with the CCU communication protocol specified for the original detector [18, 32] and access compliant with the control software of the original detector. The firmware can be controlled and monitored over a 1 Gb/s Ethernet/IPBus [33] link via the AMC backplane and, unlike the other parts of the back-end electronics, the firmware does not need to be synchronized with the LHC clock. Four signals are used in every ring: two for data transmission from the FMC via the DOH to the CCU ring, transmitting clock and data, and two for data reception of the FMC via the DOH from the CCU ring, returning clock and data. Two DOHs are available per ring. Four SFPs and eight optical fibers need to be plugged into the FMC in order to connect a CCU ring.
An example topology with all the connections is shown in figure 6, which considers a CCU ring composed of two DOHs and five CCUs. The last CCU is a spare/dummy, which is needed in order to close the redundant path to output B of the control ring.
The commands are transmitted from the control ring firmware block (the master) via the TX line (A or B) to the appropriate CCU of the control ring (the slave). The CCUs are distinguishable by their own defined addresses. The ring-type topology is configured like a standard computer LAN, connecting the control ring firmware block to the CCUs and the CCUs among themselves. Two types of commands can be executed from the control ring firmware block: register write commands and register read commands. The CCU executes the I²C transactions addressed to the appropriate device from the initial command received.
By default, an idle pattern is sent to the ring on the TX line by the control ring firmware block. The control ring firmware block also verifies that the ring is well initialized at startup and just before transmitting a command, by injecting a token frame to the ring. The ring is well established if the returned token frame matches the token frame injected. In any case, a status register is updated so that the control software (section 8.1) knows the status of the ring in real-time.
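The ring-initialization check can be modeled simply: a token frame is injected and the ring is considered well established only if the identical frame returns. The byte values and the `ring_transfer` callback below are illustrative assumptions, not the actual frame format.

```python
def verify_ring(inject, ring_transfer):
    """Inject a token frame into the ring and compare what comes back.
    `ring_transfer` models the full traversal through the DOHs and CCUs."""
    returned = ring_transfer(inject)
    return returned == inject

# A healthy ring returns the token unchanged; a broken one corrupts or drops it.
healthy = lambda frame: frame
broken = lambda frame: frame[:-1] + b"\x00"
assert verify_ring(b"\xa5\x5a", healthy)
assert not verify_ring(b"\xa5\x5a", broken)
```

In the firmware the result of this comparison is what updates the status register polled by the control software (section 8.1), so a failed check can trigger the switch to the redundant Ring B path.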
Figure 6. An example topology with two DOHs and five CCUs (CCU5 is a spare/dummy). Ring A is the primary ring, used by default. In case of a failure of either DOH_A or any single CCU, the device can be bypassed by switching to Ring B.

Phase-1 Pixel FEC
The Pixel FEC is responsible for distributing clock, trigger, and fast commands to the sensor modules, as well as for programming the DAC registers of the ROCs and registers of the TBM chips on the sensor modules. A total of eight DOHs are connected to a Pixel FEC.
A block diagram of the Pixel FEC at the board level is shown in figure 7. The firmware was designed to provide 1-Gb/s Ethernet/IPBus services via the AMC backplane. Pixel FEC registers and channel input and output FIFOs are interfaced via Ethernet through the IPBus.

The LHC 40 MHz input clock (fabric clock) is sent to a PLL to produce a TTC Clock at the same frequency, as well as 80 MHz and 200 MHz clocks for various other subsystems. TTC information is received through IDELAY and IDDR logic blocks and then processed in a Hamming decoder block. The outputs of the TTC decoder block are the L1A and fast commands on an 8-bit bus. Pixel-related fast commands include ROC reset, TBM reset, and other commands, e.g. to reset event counters or clear buffers. Registers in the Pixel FEC register space count how many Pixel-related fast commands are decoded, and a FIFO can capture all the TTC events. A trigger finite state machine (FSM) receives the fast commands and encodes the appropriate bit pattern into the TTC clock for transmission to the SFP as the module clock. The L1A, ROC reset, and TBM reset signals can also be encoded in the TTC clock by setting appropriate bits in the Pixel FEC register space.
Eight Pixel FEC channels are instantiated in the FC7's Kintex 7 FPGA. Programming data are loaded into a 16 kB transmit FIFO to be used by the transmit FSM. Either setting a Send Data bit in the Pixel FEC register space or issuing the TTC Send Data command makes the transmit FSM undergo a transition using the configuration data stored in the transmit FIFO. An 8b/10b encoded data stream is generated and transmitted to the SFP. The TBM's hub and port addresses, along with the number of bytes transmitted in the command, are stored in the Pixel FEC register space.
During lab tests and during the initial phase of the detector operation, commands to program the sensor modules were composed by fetching configuration data stored on a remote server, and were loaded into the transmit FIFO, to be sent sequentially for all sensor modules included in the detector configuration. This procedure was relatively time consuming, and not practical in normal operation. Since spring 2018 a new way of programming the sensor modules has been implemented, using the feature of storing the configuration data in the FC7 DDR3 SDRAM. This has two benefits. First, it allows the configuration data to be stored locally on the FC7 cards, so during re-configuration there is no need to form the commands again by fetching detector configuration data from the remote server. Secondly, it allows configuration commands to be sent in parallel, reducing the total time to program the sensor modules by the Pixel FECs from 30 seconds to 2 seconds.
The DDR3 memory is partitioned into segments for each of the Pixel FEC channels. One segment is for general calibration purposes, and groups of 4 segments, each used for 28 sensor modules, are used to store TBM settings, two sets of DAC settings for individual ROCs, and settings to trim and mask individual pixels. Each memory segment is assigned a bit used to steer which memory segments are addressed to transmit their commands during a send command.
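The steering-bit mechanism can be sketched as a bitmask selection over the memory segments of one channel. Here five segments per channel are assumed (one calibration segment plus one group of four), which is an illustrative simplification; the bit-to-segment assignment is likewise an assumption.

```python
def segments_to_send(steering_mask, n_segments=5):
    """Return the indices of DDR3 memory segments whose steering bit is set;
    only those segments transmit their stored commands on a send command."""
    return [i for i in range(n_segments) if steering_mask & (1 << i)]

# Example: send from the calibration segment (bit 0) and segment 3 only.
assert segments_to_send(0b01001) == [0, 3]
```

Selecting segments by bitmask lets a single send command address any subset of the stored configurations without re-fetching data from the remote server.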
The data stream returning from the sensor module is parsed by the receive FSM. The clock for the receive FSM is the returned clock from the sensor module. Data reception begins with a start condition ('1's for eight clock cycles).
Data to the same hub/port address can be continuously transmitted until the data are exhausted. Once transmission to a hub/port address is complete the transmit FSM waits for the receive state machine to confirm reception of the command before proceeding to the next hub/port command.
Because the exact optical fiber lengths are unknown to the Pixel FEC, and fiber lengths can add delays of several hundred nanoseconds between the data and clock leaving the Pixel FEC and the received data and clock at the sensor modules, a simple handshake between the transmit FSM and the receive FSM is implemented to prevent metastability issues that might arise if the data and clock lines are not synchronized. The start of transmission is indicated to the receive FSM so that a timeout can be detected if no response arrives. The (data and clock) transmission paths between the Pixel FEC and the sensor modules are synchronized by cycling through Delay25 phases of the sent and received data at the portcard and plotting the successful transmissions, as shown in figure 8. The center of the resulting area is used as the calibrated delay for the sent and returned data. The Pixel FEC has been shown to have ±7.5 ns of phase margin between sent clock and sent data, and ±6 ns of phase margin between returned clock and returned data signals.
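The delay calibration described above amounts to a two-dimensional scan with a centroid calculation. A minimal sketch, assuming the scan result is available as a pass/fail map over the Delay25 settings:

```python
def scan_center(passes):
    """Given a dict {(sent_phase, returned_phase): bool} from cycling the
    Delay25 settings, return the centroid of the passing region, which is
    taken as the calibrated delay."""
    good = [p for p, ok in passes.items() if ok]
    if not good:
        raise RuntimeError("no working delay settings found")
    cx = sum(x for x, _ in good) / len(good)
    cy = sum(y for _, y in good) / len(good)
    return round(cx), round(cy)

# Toy scan: a 3x3 block of working settings centred at (5, 6).
passes = {(x, y): (4 <= x <= 6 and 5 <= y <= 7)
          for x in range(10) for y in range(10)}
assert scan_center(passes) == (5, 6)
```

Taking the center of the passing region, rather than any single passing point, maximizes the margin against drifts in either phase, which is what the quoted ±7.5 ns and ±6 ns margins reflect.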

Phase-1 Pixel FED
The Phase-1 Pixel FED consists of an FC7 board with a Rx-FMC. The Rx-FMC is a mezzanine containing two 12-channel optical receivers that collect signals from the sensor modules, and one SFP+ for data transmission to the CMS Central DAQ, as shown in figure 4 (left). One FED can read out 24 data streams of 400 Mb/s, and transmit output data at 10 Gb/s. The FED can also emulate and transmit data and run without a detector.
The FED firmware consists of two parts. The first part (DECODE) handles decoding of the incoming data. The second part (BUILD) builds pixel events and sends them to the CMS Central DAQ.

DECODE Pixel FED firmware
The main task of the DECODE part of the Phase-1 pixel FED firmware is to decode the NRZI and 4b/5b encoded 400 Mb/s input signals, and to split the multiplexed TBM channels into two data streams. The decoding converts 24 multiplexed data streams of 400 Mb/s into 48 TBM core data streams of 160 Mb/s. A TBM core data stream starts with a TBM Header, followed by an 8-bit event number. This is followed by ROC Headers, which indicate the beginning of pixel data. A TBM Trailer followed by 16 bits of status information terminates the data stream.
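The two decoding steps can be sketched in Python as follows. The 4b/5b table below is the standard code used in FDDI/100BASE-X; whether the TBM link uses exactly this table, and the NRZI convention (a level transition encodes a '1'), are assumptions of this sketch rather than details confirmed by the text.

```python
# Sketch of NRZI decoding followed by 4b/5b symbol lookup.
# Standard FDDI/100BASE-X 4b/5b data code groups (assumed here).
FOUR_FIVE = {
    "11110": 0x0, "01001": 0x1, "10100": 0x2, "10101": 0x3,
    "01010": 0x4, "01011": 0x5, "01110": 0x6, "01111": 0x7,
    "10010": 0x8, "10011": 0x9, "10110": 0xA, "10111": 0xB,
    "11010": 0xC, "11011": 0xD, "11100": 0xE, "11101": 0xF,
}

def nrzi_decode(levels, initial=0):
    """A transition between consecutive line levels decodes to '1'."""
    bits, prev = [], initial
    for lvl in levels:
        bits.append(1 if lvl != prev else 0)
        prev = lvl
    return bits

def decode_stream(levels):
    """NRZI-decode, group into 5-bit symbols, look up 4-bit data words.
    Symbols not in the table (e.g. control symbols) are skipped here."""
    bits = nrzi_decode(levels)
    symbols = ["".join(map(str, bits[i:i + 5])) for i in range(0, len(bits) - 4, 5)]
    return [FOUR_FIVE[s] for s in symbols if s in FOUR_FIVE]
```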
The DECODE part of the Phase-1 pixel FED firmware ( figure 9) was designed to automatically find the best sampling point for the incoming 400 Mb/s signal and to do continuous sampling phase finding without disturbing data integrity. The optical receiver output, which carries the 400 Mb/s data stream, drives a differential input buffer of the FPGA. The negative output of this buffer is used as a copy of the incoming data stream to perform sampling phase finding and phase correction calculations.
The DECODE firmware detects TBM Header, TBM Trailer, and ROC Headers and writes pixel data in so-called TBM FIFOs. Several checks are included to keep data integrity as high as possible. Therefore, not only does the TBM Header marker have to be identified to start a data packet, but also the beginning of the next marker (ROC Header marker or TBM Trailer marker depending on the data packet) is included to validate the start sequence. ROC Headers are only allowed within the expected delay after the arrival of a TBM Header. The number of ROCs is counted and an error is reported if the count does not match the expected number from the TBM type. Furthermore, veto conditions for Header and Trailer arrival times and controllers that check the sequence of Headers, Trailer, and pixel data are added to avoid corrupted data packets. At this stage the TBM 4-bit words, which are the outcome of NRZI decoding, are combined with a 4-bit qualifier marker. This allows the following stage to identify these words as Header, Trailer or pixel information.
In order to keep the event throughput roughly constant, the DECODE firmware truncates the data volume by terminating the TBM core data stream when the TBM FIFOs fill to a programmable level or when overly long payloads are sent from layer 1 modules. In this case the data stream from the DECODE firmware block to the BUILD firmware block contains an overflow error type.
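The truncation rule can be sketched as follows, with an invented FIFO threshold and function names; the real firmware operates on hardware FIFOs and flags the overflow in the stream sent to the BUILD block.

```python
# Sketch of the truncation rule: stop filling a TBM FIFO at a
# programmable threshold and flag the event with an overflow error type.

OVERFLOW_ERROR = "overflow"

def write_event(fifo, hits, threshold=255):
    """Append hits to the FIFO, truncating at the threshold.
    Returns the error type, or None if the full payload fits."""
    for hit in hits:
        if len(fifo) >= threshold:
            return OVERFLOW_ERROR  # stream terminated, error flagged
        fifo.append(hit)
    return None

fifo = []
err = write_event(fifo, range(300), threshold=255)
# The stream is terminated at 255 hits and the overflow error type is set.
```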
Individual bits of the event data stream may be altered in transmission, which causes errors in the FED. The FED has multiple error counters, which count independently for each channel.
The data streams are forwarded to the BUILD firmware block using a 36-bit wide interface with the possibility of clocking data out at 40, 80, or 160 MHz. The default frequency for clocking out data is 160 MHz.
For debugging purposes, the DECODE part of the firmware has fiber-specific spy FIFOs which store the incoming 5-bit symbols and the decoded 4-bit data words. To monitor the data transfer to the BUILD firmware part, additional spy FIFOs for every TBM channel are implemented.

BUILD Pixel FED firmware
The BUILD FED firmware was designed to handle the data readout of the 48 data streams coming from the DECODE FED firmware, to transmit data to the CMS Central DAQ via the S-Link Express interface, and to communicate with the TCDS system through the TTC and TTS interfaces for synchronization. The major challenges are the high data rate, the asynchronous readout with large payload variations between streams and from event to event, maintaining synchronization, and exception and error handling. Exceptions can occur due to data corrupted by an SEU or sensor modules not sending coherent data. The 1-Gb/s Ethernet/IPBus communication is used to read out information about these exceptions during physics data taking. The block diagram of the BUILD FED firmware is shown in figure 10. The READOUT firmware block encodes the data coming from the DECODE block, merges all the data fragments, detects and marks exceptions, and transmits the merged data to the CMS Central DAQ. The firmware block uses separate FIFOs for L1As and pixel data and, in order to increase the data throughput, drains the pixel data FIFOs in parallel. The parallel draining structure allowed almost the complete potential bandwidth of the hardware to be exploited, as discussed in section 7.2, and ensured safe running for the conditions expected during LHC Run 2 and Run 3. The READOUT firmware block also computes and transmits its own TTS state, based on the filling level of the L1A and pixel data FIFOs, synchronization loss detection, and received TTC commands.
The TTS states used in the READOUT firmware block in the BUILD firmware are ready (RDY), busy (BSY), and out-of-sync (OOS). The other TTS states defined in [12] are not used. The TTS firmware block handles the transitions of the TTS states. The TTS states and transitions are shown in figure 11. After the system is configured, the TTS state is RDY. When FIFOs are almost full, back-pressure is applied to avoid an overflow and a subsequent loss of synchronization. The goal of the BSY state is to rapidly veto the arrival of new triggers. The veto is not instantaneous because of the non-negligible propagation time; new triggers are still accepted before the back-pressure becomes effective. An OOS condition, due to either consecutive timeouts (a timeout occurs when a FED channel does not receive data for a programmable time) or consecutive event number mismatches, can be triggered at any moment in any TTS state. In order to reestablish synchronization and empty the FIFOs, a resynchronization command is propagated from the TCDS to all FEDs. The TCDS command is interpreted as a resynchronization sequence (RESYNC). The firmware was written to accept two types of RESYNC commands: global or private. In the global case, which is CMS-wide, both the front-end and back-end electronics receive the same command. In the private case, only the back-end electronics receive the command, without the event number reset (EC0).
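The state transitions described above can be sketched as a small model. The thresholds and the condition for entering BSY are illustrative; the actual firmware computes the state from FIFO fill levels, synchronization-loss detection, and received TTC commands.

```python
# Minimal sketch of the three TTS states used by the READOUT block:
# RDY -> BSY when FIFOs are almost full, any state -> OOS after
# consecutive timeouts or event-number mismatches, and OOS -> RDY only
# after a RESYNC sequence. Thresholds are illustrative.

RDY, BSY, OOS = "RDY", "BSY", "OOS"

class TTSModel:
    def __init__(self, almost_full=0.9, oos_limit=3):
        self.state = RDY
        self.almost_full = almost_full
        self.oos_limit = oos_limit
        self.consecutive_errors = 0

    def update(self, fifo_fill, timeout=False, evnum_mismatch=False):
        self.consecutive_errors = (
            self.consecutive_errors + 1 if (timeout or evnum_mismatch) else 0
        )
        if self.consecutive_errors >= self.oos_limit:
            self.state = OOS          # sync loss can occur in any state
        elif self.state != OOS:       # OOS is only left via resync()
            self.state = BSY if fifo_fill >= self.almost_full else RDY
        return self.state

    def resync(self):
        self.consecutive_errors = 0   # FIFOs are emptied by the RESYNC
        self.state = RDY

tts = TTSModel()
tts.update(0.95)                      # almost-full FIFO -> BSY
for _ in range(3):
    tts.update(0.2, timeout=True)     # consecutive timeouts -> OOS
tts.resync()                          # RESYNC restores RDY
```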

Pixel FED data payload
The sensor modules transfer zero-suppressed data to the Pixel Detector DAQ system via 2368 optical fibers. The average number of hits in a sensor module decreases with its radial distance from the interaction point. Figure 12 (left) shows the average number of pixel hits per event for all channels. The distribution is uniform for the outer layers in BPIX and in FPIX, while the average number of received hits per event has a large spread for the innermost layer. The twelve channels of each receiver have been chosen from sensor modules of different layers and at different z-coordinates in order to balance the data processing load on the FEDs. Most FEDs take two of these fiber bundles as inputs. Figure 12 (

System tests in the CMS Detector - Phase-1 Pixel Detector Pilot System
In order to be well prepared for a short commissioning period during the extended year-end technical stop at the beginning of 2017 and to take advantage of the lengthy access to the original detector possible during LS1, a pilot system [34] was built with eight prototype Phase-1 sensor modules.

JINST 14 P10017
The pilot system was installed in 2014 in the available space in the original FPIX half cylinders (figure 13). A prototype microTCA FED system was used to read out the pilot system. The motivation for installing the pilot system was to learn how the readout, control, and offline systems perform in the CMS environment and to start integration with the CMS DAQ. The pilot system was commissioned before installation at CERN using a test stand running standalone test software. Calibration procedures for the pilot detector implemented in the online software were validated after the installation in CMS. During the pilot system tests before installation it was observed that the prototype FED was not able to correctly decode the data at high trigger rates. The issue was traced back to two separate sources: an asymmetric eye diagram due to the TBM design, and jitter on the Phase-1 portcard. While an asymmetric eye diagram could be accepted for the pilot system, new versions of the TBM were designed for the final Phase-1 sensor modules that were built in 2015 and 2016. In order to address the jitter on the portcard, an external QPLL chip had to be added between the TPLL and Delay25 chips on the pilot portcard. Figure 14 shows the asymmetric eye diagrams for one of the pilot modules before and after the QPLL installation. For the final Phase-1 portcards, the design incorporated the QPLL chip directly on the PCB.
After the installation in CMS, the pilot system was used to validate the FED firmware. Both the DECODE and BUILD firmware blocks were iteratively improved during pilot system operations within CMS and the DECODE firmware block was finalized.
Six out of eight pilot sensor modules were successfully used for data taking. Clusters are formed from the signals of individual pixels by the reconstruction software, and the hit position is determined from their barycenter. Figure 15 (left) shows the measured cluster positions projected onto the transverse plane. Figure 15 (right) shows the expected hits, which are derived from extrapolated tracks that are reconstructed in the FPIX detector (without pilot detector hits). As the clusters from the pilot system were not included in the track reconstruction, there are no expected hits close to the center. This effect is visible in figure 15 (right). In order to discard the fringes of ROCs, where the uncertainty in the track extrapolation is large, fiducial regions are defined. These are visible as rectangular shapes in figure 15 (right).
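The barycenter position estimate can be sketched as a charge-weighted mean of the pixel positions; this illustrates the idea only and is not the CMS reconstruction code.

```python
# Sketch of the barycenter hit-position estimate: the cluster position
# is the charge-weighted mean of the positions of its pixels.

def cluster_barycenter(pixels):
    """pixels: list of (x, y, charge) tuples. Returns (x, y) barycenter."""
    total = sum(q for _, _, q in pixels)
    x = sum(px * q for px, _, q in pixels) / total
    y = sum(py * q for _, py, q in pixels) / total
    return x, y

# A two-pixel cluster with charge sharing 3:1 along x.
pos = cluster_barycenter([(10.0, 5.0, 30.0), (11.0, 5.0, 10.0)])  # -> (10.25, 5.0)
```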
Operating the pixel pilot system during the years 2015-16 within CMS, reading out events synchronously with other CMS sub-detectors and reconstructing tracks, provided valuable experience and enabled an early start for the modifications that were required for the integration of the Phase-1 pixel DAQ.

System tests in the laboratory
Small scale systems were used for development and testing of final detector parts, which advanced development and uncovered errors and issues well ahead of the final system installation. There were three integration centers using microTCA back-ends: Fermilab, the University of Zurich (UZH), and CERN. In addition, there were test stands at HEPHY in Vienna, IPHC in Strasbourg, and Cornell University for firmware and software development and testing. At Fermilab, qualification of the FPIX detector was performed [35] before shipping it to CERN for installation in CMS. At UZH [36] the focus was on testing the optical components and electronics on the BPIX service cylinders, and on the integration tests for the BPIX detector. At CERN, emphasis was on DAQ hardware testing and integration and on testing firmware before deployment for the Phase-1 pixel detector [37]. Functionality tests were also performed on detector components upon arrival at CERN. A so-called "soak test" facility was set up at CERN to validate all DAQ back-end components before installation in the CMS service cavern. A rack layout identical to the final setup in the cavern, containing all of the production parts, power modules, AC-DC converters, crates, service boards, as well as FEDs and FECs, was operated for several weeks before installation. The soak test included regular firmware uploads and power cycling of FEDs and FECs.
To facilitate the development and the validation of the pixel FED firmware a dedicated tool (FED tester) and a setup for high data rate tests were developed. These are described in the next two sections.

FED tester setup
The FED tester, a data emulator, was designed for the Phase-1 pixel detector upgrade. It is based on gigabit link interface boards (GLIBs) [38] combined with the same FMC as used on the FECs to transmit the emulated data. The FED tester enables consistent tests of the FED firmware from version to version. Custom firmware and software were developed to emulate the data bit stream from sensor modules with different TBM and ROC types. Once a trigger is received, emulated data streams are generated and sent: a TBM Header, ROC Headers, pixel hit data, a TBM Trailer, and 16 bits of status information in every data stream. The software can generate data patterns in the FED tester framework and validate the output of the FED to which the data are sent.
The GLIB firmware is able to independently emulate 16 TBM core data streams. Three GLIB boards are used to completely fill one FED, and optical splitters can be added in order to feed multiple FEDs in parallel. Each group of two TBM core data streams is then multiplexed, and NRZI and 4b/5b encoded before being transmitted.
The contents of the FED error counters are read out in the FED tester framework to confirm that the count for each error type is accurate. Event generation can be done with a fixed data size, where every event is identical, or in SRAM mode, where the event size and pixel locations can be programmed via software.
The SRAM of the GLIBs is a software-loadable memory that can be accessed by the event readout framework. There are two separate memory locations that each hold approximately 8.4 MB of data. The first SRAM holds the distributions of hits per ROC; the second holds the emulated pixel locations for each hit. The reading of the SRAM memory is driven by a 160 MHz clock. Since each GLIB can emulate 16 independent channels, there are 32 different processes which must occur to emulate an event. The first SRAM only needs to be read once per event, while the second SRAM needs to be read every time pixel hit information is sent. The SRAM readout can be performed at the trigger rates expected in CMS.
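A sketch of SRAM-mode event generation: the hits-per-ROC distribution is read once per event, while the pixel-address memory is read once per hit. The data layout and token names here are invented for illustration.

```python
# Sketch of SRAM-mode event generation in the FED tester: the first
# memory provides hits per ROC (one read per event), the second the
# emulated pixel addresses (one read per hit).

def generate_event(hits_per_roc, pixel_addresses):
    """Build one emulated TBM core data stream as a list of tokens."""
    stream, cursor = ["TBM_HEADER"], 0
    for roc, nhits in enumerate(hits_per_roc):
        stream.append(f"ROC_HEADER_{roc}")
        for _ in range(nhits):
            stream.append(pixel_addresses[cursor])  # one read per hit
            cursor += 1
    stream.append("TBM_TRAILER")
    return stream

event = generate_event([2, 0, 1], ["px_a", "px_b", "px_c"])
# TBM header, ROC 0 with two hits, ROC 1 empty, ROC 2 with one hit, trailer
```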

The DAQ setup for high data rate tests
Prior to installation the DAQ system was qualified at small scale, but with a complete chain of DAQ hardware, for the highest expected data rates from the sensor modules. A microTCA crate with five FEDs was connected to the CMS Central DAQ, with the FEDs emulating data patterns. A FED tester was also installed in the crate, and six optical splitters were used to feed the FEDs with the FED tester output. The FED tester output and internally emulated FED data were used at high trigger rates to qualify the pixel DAQ system and the interface to the CMS Central DAQ. Clock and trigger signals were supplied by the TCDS system, as in the production DAQ system. The DAQ setup was used to develop configurations to interface with the TCDS system, and to study the robustness of the TTS state transitions and the time spent in the TTS states BSY and OOS under different conditions. It was also used to optimize the AMC13 configuration for the pixel use case, and to study the propagation of TTC commands from the TCDS via the AMC13 to the FEDs. It continues to be used as a test bench for new FED firmware releases before their deployment in the production system.
It is possible to check the data sizes through the S-Link Express link of the FED using the high rate test setup. When the event sizes are too large, trigger throttling limits the maximum data throughput. The throughput is tested using both fixed-size data and SRAM data, the latter allowing for more realistic conditions. The maximum pileup during LHC Run 2 was approximately 60 (less than 16 hits/TBM core data stream for an average BPIX FED). For up to 48 hits/TBM core data stream the FED can run at 100 kHz. Throttling of triggers starts at 56 hits/TBM core data stream, where the data throughput reaches approximately 7.5 Gb/s (at a trigger rate of 72 kHz). This shows that the data throughput was not a bottleneck during LHC Run 2 and is not expected to be a bottleneck during LHC Run 3. Figure 16 shows the throughput and trigger rates the FED can handle when different numbers of hits/TBM core data stream are generated by the FED tester.

Pixel online software
The Pixel Online Software (POS) is a collection of applications that control the front-end and back-end hardware of the CMS pixel detector. The software collection is written in C++ and is based on the CMS online software framework XDAQ [39].

Hardware access and supervisors
For each type of back-end electronics board in the pixel system a corresponding controller class exists. The controller provides the software interface to the hardware and allows access to the hardware functionality within the POS. It uses the CACTUS framework [40], which provides a hardware abstraction layer (HAL) for microTCA hardware. For the actual communication with the hardware the IPBus protocol is used, transported over Ethernet via IP. The hierarchical structure of the POS is shown in figure 17. In order to prevent conflicts due to potential concurrent hardware accesses, the connection is established via a so-called control hub, a service daemon running on a separate computer that queues incoming requests from different applications and distributes them to the actual hardware. CMS adopted a state-machine-based approach for the control of its DAQ systems. The software must always reflect the current hardware state and must be able to perform well-defined transitions between these states when instructed from a higher control level. For this reason, an additional application layer is built on top of the controller layer: the so-called hardware supervisors. All supervisors implement a common FSM and define interfaces to the other supervisors for state changes. The cross-communication between the supervisors in POS is realized using the SOAP protocol [41], with XML as the message format. Supervisors also provide a graphical user interface using a simple web server. A single supervisor can hold a set of instances of the controller classes, allowing control of several hardware boards at the same time.
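The control-hub idea, serializing concurrent hardware requests through a single queue, can be sketched in-process as follows. The real control hub is a standalone daemon speaking IPBus; the `ControlHub` class and its interface here are purely illustrative.

```python
# Sketch of the control-hub idea: requests from several applications are
# pushed onto one queue and dispatched to the hardware one at a time, so
# concurrent accesses never interleave.

import queue
import threading

class ControlHub:
    def __init__(self, hardware):
        self.requests = queue.Queue()
        self.hardware = hardware          # callable standing in for IPBus access
        self.results = []
        self.worker = threading.Thread(target=self._dispatch, daemon=True)
        self.worker.start()

    def submit(self, request):
        self.requests.put(request)        # callers never touch hardware directly

    def _dispatch(self):
        while True:
            req = self.requests.get()
            if req is None:               # shutdown sentinel
                break
            self.results.append(self.hardware(req))  # serialized access

    def stop(self):
        self.requests.put(None)
        self.worker.join()

hub = ControlHub(hardware=lambda r: f"done:{r}")
for r in ["read reg0", "write reg1"]:
    hub.submit(r)
hub.stop()
# hub.results == ["done:read reg0", "done:write reg1"]
```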
A second type of supervisor exists (service supervisors), which does not control hardware, but establishes the connection to other services, like the detector control system (DCS) which is used for controlling and monitoring the detector power distribution. This interconnection between the DAQ and DCS system is described in section 8.3.

In order to operate the POS, there must always be a main (service) supervisor, which orchestrates all the other hardware and service supervisors. This supervisor processes all commands received from the CMS Central Run Control System during global data taking and provides the main user interface during local detector calibrations.

Distributed software architecture
One advantage of the described software infrastructure is that it is scalable and can be distributed on many different computing nodes. The overall software infrastructure is defined in one common XML configuration file, such that each running process is aware of all the other existing processes in its environment. These software instances are distributed over twelve worker nodes featuring 20 cores and 32 GB RAM each. The number of computers has been chosen in order to follow the organization of the pixel detector back-end hardware in twelve microTCA crates. In addition to the twelve worker nodes, twelve additional computers act as control hubs, defining the gateways to the individual microTCA crates.

Interface to the Detector Control System
The front-end needs to be configured differently depending on the power status of the detector. For example, a sensor module becomes noisy when no external bias voltage is applied. For this reason the different PixelFECSupervisors need to be informed of any state change of the pixel detector power system in the DCS. The PixelDCSFSMInterface subscribes to the states of the individual power supply channels in the DCS and evaluates the power state of the group of power supplies that power the parts of the detector controlled by one PixelFECSupervisor. The summary power state is based on a voting in which a single power supply in a different state is enough to change the summary power state. The new summary state is transmitted to the corresponding supervisors and is taken into account in the next front-end configuration.
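The summary power-state evaluation can be sketched as follows, where a single deviating channel changes the summary; the state names are illustrative, not the actual DCS state names.

```python
# Sketch of the summary power-state voting: the nominal summary state is
# reported only if every power supply channel in the group is in it.

def summary_power_state(channel_states, nominal="ON"):
    """Return the nominal state only if all channels are in it."""
    return nominal if all(s == nominal for s in channel_states) else "NOT_ON"

# A single deviating channel flips the summary state reported to the
# corresponding PixelFECSupervisor.
state = summary_power_state(["ON", "OFF", "ON"])  # -> "NOT_ON"
```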

Operation performance
The CMS Phase-1 pixel detector has collected data in 2017 and 2018 with 95.5% and 94.4% functional channels, respectively. The non-functional fraction was due to infrastructure issues, caused for example by connector failures, and damaged sensor modules. Software recovery mechanisms and periodic ROC resets are implemented to reduce dead-time, ensure smooth running and maintain a high level of hit efficiency for BPIX layer 1.

Software recovery mechanisms
One important aspect of an online control and readout system is the ability to react to unexpected hardware states and guarantee the best performance of the hardware. While many of the simple problems, such as an excessively high trigger rate, are handled by the FEDs themselves, more subtle problems are easier to analyze and handle in software. Within the POS framework several higher level problem recovery systems are implemented, three of which are discussed as examples here: the recovery from an SEU in the TBM, a non-responsive TBM, and a non-responsive portcard.
For the recovery of an SEU in a TBM the affected sensor module must be reprogrammed. This is handled by the Pixel FECs. The Pixel FED interface corresponding to a group of Pixel FEDs keeps a list of channels that do not send data, and if the number of channels in the list reaches a programmable value, it reports this to the FEDSupervisor. In order to have a full overview of the system, the information is then sent from the FEDSupervisor to the PixelSupervisor. In the PixelSupervisor a thread periodically requests the SEU status count from all FEDSupervisors. When a programmable threshold is reached, which can differ between parts of the detector according to their impact on the data quality, a request to stop the triggers is sent to the CMS Central Run Control System. When the triggers are paused, the Pixel FECs are notified to reprogram all the TBM settings. When the TBM is in a controlled state, the ROC settings are reprogrammed. In order to use the time of the paused triggers effectively, almost all settings are reprogrammed for the whole detector. Only the trim and mask settings are programmed specifically for the sensor modules affected by the SEU. The PixelSupervisor waits until the Pixel FECs have finished this operation and then signals to the CMS Central Run Control System to restart triggers. This procedure takes approximately five seconds.
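The recovery sequence can be sketched as one polling iteration of the PixelSupervisor. All callables and names here stand in for the real XDAQ/SOAP interactions and are invented for illustration.

```python
# Sketch of one SEU-recovery polling iteration: check the dead-channel
# counts reported by the FEDSupervisors, and once the threshold is
# crossed, pause triggers, reprogram the front end, and resume.

def seu_recovery_cycle(fed_counts, threshold, run_control, fec, affected):
    """One polling iteration; returns True if a recovery was performed."""
    if sum(fed_counts) < threshold:
        return False
    run_control("pause_triggers")
    fec("reprogram_tbm_all")                  # TBM settings, whole detector
    fec("reprogram_roc_all")                  # ROC settings, whole detector
    for module in affected:
        fec(f"reprogram_trim_mask:{module}")  # only for affected modules
    run_control("resume_triggers")            # whole sequence takes ~5 s
    return True

log = []
recovered = seu_recovery_cycle(
    fed_counts=[2, 1, 0], threshold=3,
    run_control=log.append, fec=log.append, affected=["mod_17"],
)
```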
One problem of the current version of the TBMs is that some SEUs result in a state where the TBM no longer processes triggers. The mechanism is understood and has been solved in the revised version of the TBM that will be used for the replacement of the innermost BPIX layer during LS2. For the TBMs currently in the detector, the only way to revive the TBM is a power-on reset, which means that the low voltage supply of the TBM needs to be switched off and on again. Because of the design of the pixel detector this can be done by disabling the corresponding DC-DC converter; alternatively, if a complete low voltage channel is disabled, between 14 and 22 sensor modules are turned off. Using the DC-DC converters reduces the number of sensor modules being power cycled to between one and four, depending on their position in the detector. After the sensor modules are turned on, the same procedure as described above is followed to program the TBM and ROC settings.
In rare cases channels connected to a portcard stop sending data. Because of the load balancing among FEDs, the readout of the modules served by one portcard is distributed over several FEDs. This makes the detection of a missing portcard possible only in the PixelSupervisor, where the information from all FEDs is combined. The previously described report chain is used, and if a complete portcard is found to have stopped sending data, the portcard is reprogrammed by the Tracker FEC. This is followed by the programming of all the affected sensor modules using the Pixel FECs. If after this recovery the channels still do not send data to the FEDs, the corresponding channels are masked in the FEDs.
Figure 18 shows the number of ROCs that do not send data as a function of time in BPIX layer 1 during data taking for an LHC fill in 2017. More ROCs become inactive over time due to SEUs in the TBM (0.7%/100 fb^-1). After a programmable threshold is reached, the SEU recovery mechanism is activated as described above, during which triggers are paused. Once the triggers are resumed, the number of inactive ROCs returns to the baseline value (roughly 1% of BPIX layer 1 ROCs are not functional).

Figure 18. The number of inactive ROCs as a function of time in BPIX layer 1 during a typical LHC fill in 2017. The number of inactive ROCs increases until a programmable threshold is reached, at which point the SEU recovery mechanism is activated and the ROCs are recovered. The SEU recovery mechanism can be activated several times during an LHC fill. The SEU rate depends on the instantaneous luminosity, which decreases over the course of the fill. In the fill used for this plot, the peak luminosity was around 1.5 × 10^34 cm^-2 s^-1. The typical rate for inactive ROCs is one in five minutes at an instantaneous luminosity of 1.0 × 10^34 cm^-2 s^-1 for BPIX layer 1 modules.

Periodic ROC resets
As discussed in section 3.1, the PROC600, the readout chip for BPIX layer 1, has rare data synchronization losses in double-columns that lead to lower hit efficiencies. Both at low and high trigger rates, inefficiency is caused by a timing error in the time-stamp buffer of a double-column.
Here a coincidence between a new hit and an expiring hit, i.e. a recorded hit exceeding the maximum allowed latency, can generate a spurious column drain and therefore the loss of synchronization of the double-column. This desynchronizes the readout mechanism, and the next hits are not assigned to the correct event. This effect is more probable at high trigger rates. At low trigger rates another timing error can generate a spurious buffer-full signal. This happens when a buffer is empty, a hit is registered, no other hit arrives within the trigger latency, and two hits are registered at exactly the trigger latency in two consecutive clock cycles. The spurious buffer-full signal by itself would not be a problem, but in combination with the problem described above it can lead to a loss of synchronization. In both cases the synchronization is restored by a reset. Both problems have been fixed in the new version of the PROC600. In order to address these data synchronization losses, periodic ROC resets at 70 Hz are issued by the TCDS. Figure 19 shows the BPIX layer 1 hit efficiency versus instantaneous luminosity with and without periodic ROC resets. Issuing resets for the ROCs recovers the hit efficiencies at both low and high instantaneous luminosities. Without periodic resets, the sharp efficiency drop above an instantaneous luminosity of 1.3 × 10^34 cm^-2 s^-1 would drastically affect the layer 1 hit efficiency.

Figure 19. BPIX layer 1 hit efficiency with (green) and without (red) periodic ROC resets at 70 Hz versus instantaneous luminosity.

Conclusion
The CMS Phase-1 pixel detector DAQ system was developed based on a combination of custom and standard microTCA parts to satisfy the higher bandwidth requirement of the new pixel detector and to interface correctly to the upgraded front-end electronics and optical links. The DAQ system underwent a series of integration tests, including readout of the pilot pixel detector, quality assurance of the Phase-1 detector during its assembly, and testing with the CMS Central DAQ. It was tested with realistic data streams at the high trigger rates (up to 100 kHz) expected during LHC running. The Phase-1 pilot detector system proved to be valuable, leading to new designs for the TBM and the portcard to address an asymmetric eye diagram and excessive clock jitter, and helping with FED firmware development. The CMS Phase-1 pixel detector achieved the required performance improvements compared to the original pixel detector, and the pixel DAQ system performed well during the 2017-2018 running, consistently delivering high-quality data with low dead-time for CMS, without failure of any back-end parts.