Trigger algorithms and electronics for the ATLAS muon new small wheel upgrade

The New Small Wheel Upgrade for the ATLAS experiment will replace the innermost station of the Muon Spectrometer in the forward region in order to maintain its current performance during high luminosity data-taking after the LHC Phase-I upgrade. The New Small Wheel, comprising Micromegas and small-strip Thin Gap Chambers, will reduce the rate of fake triggers coming from backgrounds in the forward region and significantly improve the Level-1 muon trigger selectivity by providing precise on-line segment measurements with ∼1 mrad angular resolution. Such demanding precision, together with the short time (∼1 µs) available to prepare trigger data and perform on-line reconstruction, imposes very stringent requirements on the design of the trigger system and trigger electronics. This paper presents an overview of the design of the New Small Wheel trigger system, trigger algorithms and processor hardware.


E-mail: lguan@cern.ch

Introduction
The Large Hadron Collider (LHC) will be upgraded over the next decade in several phases towards a High Luminosity-LHC to enhance its discovery potential [1]. The accelerator is expected to raise the proton-proton center-of-mass energy to 14 TeV during the second run period (Run 2), which started in 2015. The luminosity is expected to reach or exceed its original design value of 1×10³⁴ cm⁻² s⁻¹ by the end of Run 2. A major upgrade to the accelerator complex during the second long shutdown (LS2), starting around 2019, will double the instantaneous luminosity with respect to its nominal value, allowing delivery of ∼300 fb⁻¹ of data to the experiments during the following period (Run 3). The upgrade around 2024 will yield an instantaneous luminosity of 5-7×10³⁴ cm⁻² s⁻¹. In order to take advantage of the higher statistics delivered by the LHC upgrades, the ATLAS experiment [2] will upgrade its detector on the same schedule to handle the higher event rate and integrated dose while maintaining its good detector performance.
A three-level trigger system [3] is used by the ATLAS experiment to select and record events on-line. The Level-1 trigger [4] pre-selects candidate events at a rate of 75 kHz (100 kHz for Run 2 and Run 3) from the initial 40 MHz of collisions. This pre-selection is done on-line within 2.5 µs using custom Calorimeter and Muon Spectrometer electronics. The primary concern of the ATLAS Phase-I upgrade after LS2 is to improve the Level-1 trigger system so that good single lepton (e and µ) selectivity is retained with the existing transverse momentum (p_T) thresholds in a much higher rate environment than that of Run 1. The particular focus for the Muon Spectrometer is to improve the rejection of fake Level-1 triggers in the forward region. As illustrated in the quadrant y−z view of the ATLAS Muon Spectrometer shown in figure 1, muons traveling in the forward direction are bent radially toward or away from the beam axis during their passage through the End-cap toroid magnets. High p_T muons are selected as tracks with small deflection angles. The Level-1 muon trigger in the forward region during Run 1 relied on segment measurements at the middle muon station (Big Wheel). Studies using the 2012 collision data showed that more than 90% of the Level-1 triggered muons with p_T > 10 GeV in the forward region could not be matched with off-line muons. In most cases, triggers were fired by particles generated between the innermost station (Small Wheel) and the Big Wheel from interactions of slow protons with the detector material.

Figure 1. A cut-away view of the ATLAS Muon Spectrometer and the general concept of introducing the NSW to improve the Level-1 muon trigger in the forward region for high luminosity runs. The NSW will provide segment measurements with ∼1 mrad angular resolution, to be combined with segments measured at the Big Wheel, to eliminate fake triggers from background tracks that do not originate from the interaction point (IP).
Requiring an associated hit at the Small Wheel for each segment measured at the Big Wheel would reduce fake muons by removing most of the hits not originating from the interaction point (IP). This motivated ATLAS to use the present Thin Gap Chambers (TGCs) at the Small Wheel in the Level-1 muon trigger during Run 2. Further improvements, to substantially reduce the rate of fake muon triggers in the forward region and to sharpen the muon trigger turn-on curve, are needed to maintain detector performance in Run 3 and beyond. This will be achieved by replacing the present Muon Small Wheels with a New Small Wheel (NSW) [5] in the Phase-I upgrade. The improved segment resolution of the Big Wheel trigger proposed for Phase-II will then provide two sets of independent segment measurements for enhanced p_T resolution.
The NSW upgrade will improve the Level-1 muon trigger in the forward region as well as maintain excellent muon tracking in a high rate environment. In this paper, we focus on the NSW muon trigger capabilities and leave its tracking performance to be discussed elsewhere [5][6][7]. The paper is organized as follows. Section 2 presents an overview of the NSW upgrade project, the requirements for Level-1 triggering and the general NSW trigger scheme. Section 3 introduces the trigger algorithms for reconstructing candidate muon track segments at the NSW and the hardware platform for implementing these algorithms. Section 4 discusses the development of the trigger Front-end electronics for the two sub-detector systems.

NSW upgrade and its readout scheme
The NSW upgrade introduces two types of high-rate capable large-area gaseous detectors, the small-strip Thin Gap Chamber (sTGC) [8,9] and the Micro-Mesh Gaseous detector (MM) [10,11], for precise reconstruction of muon segments both on-line and off-line in an environment with hit rates of up to ∼15 kHz/cm² (hottest region with maximum luminosity of 7×10³⁴ cm⁻² s⁻¹ after the Phase-II upgrade). These detectors are arranged in 16 trapezoidal sectors per end, with each sector comprising eight layers of MM sandwiched between two sTGC quadruplets. Both detector technologies provide trigger and tracking capability, as both can discriminate the 25 ns bunch crossings as well as determine track hit positions with an accuracy of ∼100 µm per detector plane in the bending direction. sTGCs utilize 3.2 mm-pitch readout strips and ∼8 cm wide segmented pads on opposite sides of cathode planes. Anode wires perpendicular to the strips are also read out for non-bending coordinate measurements. MMs have ∼0.4 mm fine-pitch readout strips for each detector plane. MM strips are configured with 4 layers in the azimuthal direction and 4 layers inclined at small angles with respect to the azimuthal strips. In total, these detectors cover an area of approximately 2000 m² and have about 2.5 million readout channels.
The main requirement for the NSW trigger at Level-1 is to reconstruct on-line segments with an accuracy of ∼1 mrad in the range 1.3 < |η| < 2.4 for corroboration with the Big Wheel segments to discriminate against backgrounds. The reconstruction efficiency needs to be more than 95% across the entire covered η range. In addition, NSW trigger primitives have to be prepared within ∼1 µs for coincidence with the Big Wheel trigger primitives [12].
The NSW trigger and readout data flow is shown in figure 2. There are three data paths shown, one highlighted in light salmon (for sTGC triggers) and a second in blue (for MM triggers). These two data paths collect raw trigger data, such as charges or addresses of the fired strips, from on-detector electronics and send them to either sTGC or MM trigger processors located off-detector in the underground service cavern. The two sub-detector systems operate independently to find candidate track segments, which will eventually be merged before being sent to the succeeding trigger processing unit. Data from on-detector electronics require about 500 ns to be transmitted off detectors through optical fibers, leaving about 500 ns for trigger processing Front-end electronics.
The third data path at the bottom of figure 2 with components shown in white boxes is for the common Level-1 readout and slow control chain. Hit charge and time information from NSW detectors are held at the Readout Controller (ROC) ASIC [14] until the Level-1 acceptance and then data are shifted out to the Front-end link exchange (FELIX) unit [13] where they are recognized and passed to the succeeding Readout Driver (ROD) for event building. This data path also includes bidirectional links via the FELIX interface. It allows the detector slow control, event monitoring and calibrations to be handled by commercial off-the-shelf network devices and PCs. Advantages of using the FELIX interface for readout routing and details of the communication among NSW readout electronics are discussed in [15].

NSW trigger algorithms and trigger processor
Each track segment reconstructed by the NSW trigger processors will be presented as 24-bit data with the format shown in table 1. Two bits will be used to indicate the segment quality for each sub-detector system (sTGC or MM), i.e. whether it is also found by the other sub-detector system. Five bits will be used to transmit ∆θ, the angular deviation of the NSW segment with respect to an infinite momentum track from the IP to the segment's radial position in the NSW. The resolution for ∆θ is 1 mrad. An eight-bit R-index and a six-bit φ-index will represent the radial and azimuthal projections of the NSW segment on the Big Wheel trigger RoI. The corresponding granularity is 0.005 × 20 mrad in η × φ. Up to 8 track segments (4 each from sTGC and MM) will be reconstructed for each NSW sector per bunch crossing. They will be sent through two optical fiber links running at 6.4 Gbps to new Sector Logic (SL) Boards for trigger matching with segments found by the Big Wheel. Due to substantial differences in detector characteristics, geometry and electronics, sTGC and MM employ different trigger processors with specifically tailored trigger algorithms for segment reconstruction.
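As an illustration of the segment word described above, the following minimal Python sketch packs the quoted fields into 24 bits. The field widths (2+2 quality bits, 5-bit ∆θ, 6-bit φ-index, 8-bit R-index) come from the text; the field order, the single spare bit that fills the word to 24 bits, and the offset-binary ∆θ encoding are assumptions made only for this example and are not taken from table 1.

```python
# Field widths are quoted in the text; the ORDER of the fields and the spare
# bit are assumptions made for illustration only.
FIELDS = (                      # (name, width in bits), most significant first
    ("spare", 1),               # assumed padding to reach 24 bits
    ("stgc_quality", 2),
    ("mm_quality", 2),
    ("dtheta", 5),
    ("phi_index", 6),
    ("r_index", 8),
)
WORD_BITS = sum(width for _, width in FIELDS)  # 24


def encode_dtheta(dtheta_mrad):
    """Map -15..+15 mrad (1 mrad steps) to 0..30; offset-binary is an assumption."""
    code = int(round(dtheta_mrad)) + 15
    if not 0 <= code <= 30:
        raise ValueError("dtheta outside the +-15 mrad acceptance")
    return code


def pack_segment(**values):
    """Pack named field values into one integer segment word."""
    word = 0
    for name, width in FIELDS:
        value = values.get(name, 0)
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name}={value} does not fit in {width} bits")
        word = (word << width) | value
    return word


def unpack_segment(word):
    """Inverse of pack_segment: return a dict of field values."""
    fields = {}
    for name, width in reversed(FIELDS):
        fields[name] = word & ((1 << width) - 1)
        word >>= width
    return fields
```

A round trip such as `unpack_segment(pack_segment(dtheta=encode_dtheta(-3), phi_index=12, r_index=200))` returns the same field values, which is a convenient self-check when experimenting with alternative layouts.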

sTGC trigger scheme and trigger algorithm
The structure of a single sTGC detector plane is shown in figure 3(a). As illustrated in figure 2, each sTGC strip or pad will be connected to one channel of a VMM [16], a 64-channel Amplifier-Shaper-Discriminator ASIC. Timing pulses of fired pads output from the VMM will be sent to the pad Trigger Data Serializer ASIC (pad-TDS) [17]. Pad-TDS ASICs in all sTGC planes will then transmit the binary pad firing information, together with the bunch crossing number (BCID), to the Pad Trigger Logic Board, where three out of four coincidences are made per quadruplet to form pad-trigger roads. As shown in figure 3(b), sTGC pads from different detector layers are staggered by 1/2 pad width in the η direction. A pre-selected pad-trigger road therefore covers only about 13 strips per detector plane. Upon the formation of a pad-trigger road, its radial and azimuthal information is sent to the strip Trigger Data Serializer (strip-TDS). The strip-TDS stores charge data from VMMs connected to sTGC strips and performs the pad-strip data matching. The output of a strip-TDS consists of the charge data from those strips in the strip band selected by the pad-trigger road. Charge data, together with the strip band ID and the pad-trigger road φ-ID, are passed from the strip-TDS to the sTGC data packet Router [18], which aggregates inputs from all strip-TDSs of a detector plane in a sector and transmits data to the sTGC trigger processor via optical fibers. This trigger scheme, in which "pad pre-trigger roads" select a small number of strips for data transmission, drastically reduces the bandwidth needed to move strip data off detector.
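The pad coincidence described above can be sketched in a few lines of Python. This is a simplified model under stated assumptions: each quadruplet is reduced to four booleans for one pad tower, and the staggered pad geometry, BCID alignment and road numbering are ignored. The function names are hypothetical.

```python
def quadruplet_coincidence(layer_hits, min_layers=3):
    """Return True if at least `min_layers` of the four pad layers fired.

    layer_hits: four truthy/falsy values, one per sTGC layer of a quadruplet,
    all belonging to the same bunch crossing (assumed already aligned).
    """
    if len(layer_hits) != 4:
        raise ValueError("a quadruplet has exactly four layers")
    return sum(bool(hit) for hit in layer_hits) >= min_layers


def pad_trigger_road(pivot_hits, confirm_hits):
    """A road is formed when both quadruplets pass the 3-out-of-4 requirement."""
    return quadruplet_coincidence(pivot_hits) and quadruplet_coincidence(confirm_hits)
```

For example, `pad_trigger_road([1, 1, 0, 1], [1, 1, 1, 1])` forms a road, whereas a quadruplet with only two fired layers vetoes it.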
The basic principle of the sTGC trigger algorithm is illustrated in figure 3(c). Strip charge data are first routed to their corresponding layer centroid finder. Charge thresholds are applied to select up to five strips within a strip band for the layer centroid calculation. This implementation takes into consideration that real muons usually leave hit clusters of 3-5 strips. A layer centroid in the local reference frame is calculated using Digital Signal Processing (DSP) blocks as

$$ \bar{S} = \frac{\sum_i S_i Q_i}{\sum_i Q_i}, $$

where S_i and Q_i are the strip number and the strip charge, respectively. The layer centroid in the global NSW reference frame is then determined by adding to the local centroid its global offset in the radial direction. This offset is converted from the strip band ID and the cluster position within the strip band using look-up-tables (LUTs). Quadruplet centroids are calculated in the next step as averages of their corresponding layer centroids. Poor layer centroids can arise from delta-ray or neutron hits with large (> 5 strips) clusters or from noise hits with extra narrow (single strip) clusters. The algorithm is adjustable to exclude those poor layer centroids from the averaging and prevent them from spoiling the quadruplet hit position. For calculating ∆θ of the segment, a LUT provides each centroid of the pivot quadruplet (the quadruplet close to the IP) with a range of acceptable centroids in the confirmation quadruplet. Centroids from the confirmation quadruplet are converted into ∆θ, with those found out of range (|∆θ| > 15 mrad) being rejected as non-pointing segments. Similarly, the R-index for the segment is determined by another set of LUTs based on the pivot quadruplet centroid and the segment pointing. The φ-index is determined from the pad-trigger road φ-ID input from the trigger processor.
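The centroid and pointing steps can be illustrated with a short Python sketch. It assumes that the layer centroid is the charge-weighted mean strip number, that "up to five strips" means the five highest-charge strips above threshold, and that ∆θ is computed directly from two (r, z) points with the IP at the origin rather than through LUTs as in the hardware. All function names, default thresholds and the sign convention are hypothetical.

```python
import math


def layer_centroid(strips, charges, threshold=0, max_strips=5):
    """Charge-weighted centroid (in strip units) and cluster size for one layer.

    The cluster size counts all strips above threshold, so that oversized
    clusters can still be recognised after the selection of `max_strips` strips.
    """
    above = [(s, q) for s, q in zip(strips, charges) if q > threshold]
    if not above:
        return None, 0
    # Assumption: keep the strips carrying the largest charges.
    used = sorted(above, key=lambda sq: sq[1], reverse=True)[:max_strips]
    total = sum(q for _, q in used)
    return sum(s * q for s, q in used) / total, len(above)


def quadruplet_centroid(layer_results, min_size=2, max_size=5):
    """Average the layer centroids, skipping single-strip and oversized clusters."""
    good = [c for c, size in layer_results
            if c is not None and min_size <= size <= max_size]
    return sum(good) / len(good) if good else None


def delta_theta_mrad(r_pivot, z_pivot, r_confirm, z_confirm):
    """Segment angle minus the angle of a straight line from the IP (origin)."""
    segment = math.atan2(r_confirm - r_pivot, z_confirm - z_pivot)
    pointing = math.atan2(r_pivot, z_pivot)
    return 1000.0 * (segment - pointing)


def is_pointing(dtheta_mrad, max_mrad=15.0):
    """Apply the |dtheta| <= 15 mrad acceptance quoted in the text."""
    return abs(dtheta_mrad) <= max_mrad
```

Returning the cluster size alongside the centroid keeps the exclusion of poor clusters a separate, adjustable step, mirroring the adjustable behaviour described in the text.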

MM trigger scheme and trigger algorithm
The layout of a single MM detector plane is depicted in figure 4(a). Unlike the sTGC, the ∼0.4 mm pitch of a MM strip plane allows the hit position to be determined from a single strip with sub-mm resolution, without the need to calculate a hit centroid. Strips in a MM quadruplet are arranged as shown in figure 4(b), so that two horizontal strip planes are parallel to the base of the NSW sector trapezoid whereas strips in the other two planes are inclined at small angles (1.5°) with respect to the horizontal strips. For the MM trigger, the VMM ASIC connected to the strips makes use of a mode called "Address in Real Time (ART)". In this mode, the address of the first threshold-crossing strip in an event is encoded in the VMM. An ART ASIC then collects these strip address data from 32 VMMs and chooses up to 8 hits to be shifted out. Finally, an ART Data Driver Card (ADDC), hosting two ART ASICs per card, transmits the first-hit strip addresses over optical fibers to the MM trigger processor of a specific NSW sector.
The basic principle of the MM trigger algorithm for segment reconstruction can be explained with figure 4(c). Hit strip addresses from a given detector plane are first translated, using LUTs, into slopes of infinite momentum tracks from the IP to those hit positions. The entire η range covered by the NSW is divided into a number (N) of slope roads. Each slope road represents a range of acceptable slopes for straight tracks coming from the IP to that subdivided η range. Slopes converted from strip addresses are then stored in a ring buffer of size N (slope roads) × 8 (strip planes) × T (bunch crossings). The buffer is checked every bunch crossing and a segment candidate is defined as a multiple-layer coincidence within a slope road. The algorithm makes the trigger resilient to backgrounds originating far from the IP. Once a segment candidate is found, slopes from individual planes are sent to the "slope fitter" logic, where the so-called global slopes and local slopes are calculated. The global slope is derived from the mean position of all hits in a strip plane category (horizontal or stereo) and the location of the IP. The local slope, only calculated for horizontal planes, is the least-squares fit of all hits in horizontal planes to a single line. It represents the segment pointing within the NSW and is independent of the IP location. Finally, taking the geometric relationships among the MM detectors, the IP and the Big Wheel RoIs into consideration, ∆θ and the projection on the Big Wheel RoI are calculated using the global and local slopes. Calculations and fitting solutions are pre-stored in LUTs and pre-loaded into the processing FPGAs for fast segment reconstruction.
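The road coincidence and the two slope definitions can be sketched in floating-point Python. This is only an illustration: the hardware uses LUTs and fixed-point logic, the coincidence threshold `min_planes` is a hypothetical parameter (the text does not quote one), the bunch-crossing dimension of the ring buffer is omitted, the IP is assumed to sit at the origin, and ∆θ is approximated as the angle difference between local and global slopes.

```python
import math
from collections import defaultdict


def find_road_candidates(hits, slope_min, slope_max, n_roads, min_planes=6):
    """Return indices of slope roads hit in at least `min_planes` distinct planes.

    hits: iterable of (plane_index, slope) pairs collected in one time window.
    """
    width = (slope_max - slope_min) / n_roads
    planes_in_road = defaultdict(set)
    for plane, slope in hits:
        if slope_min <= slope < slope_max:
            road = min(int((slope - slope_min) / width), n_roads - 1)
            planes_in_road[road].add(plane)
    return sorted(r for r, planes in planes_in_road.items()
                  if len(planes) >= min_planes)


def global_slope(points):
    """Slope from the IP (assumed at the origin) to the mean hit position."""
    mean_z = sum(z for z, _ in points) / len(points)
    mean_y = sum(y for _, y in points) / len(points)
    return mean_y / mean_z


def local_slope(points):
    """Least-squares slope of y versus z; independent of the IP location."""
    mean_z = sum(z for z, _ in points) / len(points)
    mean_y = sum(y for _, y in points) / len(points)
    s_zz = sum((z - mean_z) ** 2 for z, _ in points)
    s_zy = sum((z - mean_z) * (y - mean_y) for z, y in points)
    return s_zy / s_zz


def delta_theta_mrad(local, global_):
    """Angle between the local segment and the IP-pointing line, in mrad."""
    return 1000.0 * (math.atan(local) - math.atan(global_))
```

For a track that points back to the origin the local and global slopes coincide and ∆θ vanishes, while a particle produced far from the IP yields different slopes plane by plane and tends to fail the road coincidence.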

Trigger processor hardware platform
Each of the sTGC and MM trigger processors per NSW sector will be implemented in one of two FPGAs (Xilinx Virtex-7 XC7VX690T) [19] on an ATCA-based Mezzanine Card [20]. Two ATCA mezzanine cards will be hosted by a single ATCA carrier card and serve a NSW octant. Each ATCA mezzanine card has high speed optical receivers to accept 64 fiber inputs from 16 detector planes of a NSW sector. Due to the mismatch between NSW and Big Wheel trigger segmentations, multiple high density optical transceivers are included on each mezzanine card to broadcast reconstructed NSW segments to up to 7 new SL boards. Additionally, fast and low latency inter-FPGA LVDS pairs are implemented for cross communication between sTGC and MM trigger processors to merge segments.

Trigger Front-end electronics
NSW trigger Front-end electronics include radiation tolerant ASICs mounted on boards along the sides of the detectors and FPGAs on boards sitting at the rim of the NSW, where the radiation environment is less harsh. All ASICs are based on IBM 130 nm CMOS technology, which is expected to withstand a much higher radiation dose [21] than that expected at the NSW. Customized logic in these electronics will be optimized to meet low latency requirements and to mitigate single event effects.

Front-end electronics for sTGC system
Trigger Data Serializer (TDS) ASIC: the TDS is a 128-channel, low latency, low power ASIC which prepares trigger data for sTGC strips and pads, performs pad-strip matching and transmits data with a high speed serializer. The TDS can be operated in two modes: strip or pad mode. The pad-TDS mode assigns a BCID to timing pulses from VMMs connected to sTGC pads and outputs the BCID followed by binary hit information from up to 104 pads in that bunch crossing. The pad-TDS provides a programmable delay for each pad input, in 3.125 ns steps, to compensate for signal arrival time differences due to the variable pad location within a detector plane. The strip-TDS mode decodes 6-bit hit charge data from strips, assigns each hit a BCID and stores the hits in a ring buffer. Upon receiving a pad-trigger road, the strip-TDS searches for hits that both lie within the pre-trigger selected strip band, found using a LUT, and carry a BCID matching the trigger road. Charge data from up to 14 matched strips, appended with the BCID, strip band ID and trigger road φ-ID, are framed, scrambled and sent out via the TDS serializer. The serializer operates at 4.8 Gbps, sending 120 bits per bunch crossing in both strip and pad modes for fixed latency triggers. A photo of the first TDS prototype, packaged in a 400-pin BGA, is shown in figure 5(a). The core logic of the TDS has been verified with the first prototype. The measured eye diagram for the serializer is shown at the bottom of figure 5(a), demonstrating its good performance.
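The strip-TDS matching step can be modelled behaviourally as follows. This is a hedged sketch, not the ASIC logic: the class name, the buffer depth, the band LUT contents and the saturation of charges at the 6-bit maximum are all assumptions, and framing and scrambling are omitted. The constant at the top simply reproduces the 120 bits per bunch crossing quoted in the text from the 4.8 Gbps line rate and the 25 ns bunch spacing.

```python
from collections import deque

BUNCH_CROSSING_NS = 25.0
SERIALIZER_GBPS = 4.8
BITS_PER_BC = round(SERIALIZER_GBPS * BUNCH_CROSSING_NS)  # Gbit/s * ns = bits


class StripTDSModel:
    """Toy model of the strip-TDS ring buffer and pad-strip matching."""

    def __init__(self, band_lut, depth=64, max_strips=14):
        self.band_lut = band_lut            # hypothetical: band_id -> (first, last) strip
        self.buffer = deque(maxlen=depth)   # oldest hits drop out automatically
        self.max_strips = max_strips

    def store_hit(self, bcid, strip, charge):
        """Store one hit; the charge is saturated to 6 bits (an assumption)."""
        self.buffer.append((bcid, strip, max(0, min(charge, 63))))

    def match(self, road_bcid, band_id, phi_id):
        """Select buffered hits in the road's strip band with a matching BCID."""
        first, last = self.band_lut[band_id]
        strips = sorted((s, q) for b, s, q in self.buffer
                        if b == road_bcid and first <= s <= last)
        return {"bcid": road_bcid, "band_id": band_id, "phi_id": phi_id,
                "strips": strips[:self.max_strips]}
```

A bounded `deque` is used here because it gives ring-buffer behaviour for free; in hardware the buffer depth would be set by the pad-trigger latency.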
Pad Trigger Logic board: the Pad Trigger Logic board is based on a Xilinx Kintex-7 FPGA and will be located on the NSW rim to collect hit data from the pad-TDSs in the 8 sTGC planes within a NSW sector. The four layers of pad hits in a quadruplet will be examined for three out of four coincidences each bunch crossing. A further coincidence between the two quadruplets will form a pad-trigger road. Up to three pad-trigger roads per sector will be sent to strip-TDSs per bunch crossing, as simulation suggests that the probability of having more than three muons is less than 0.3%. Trigger road information with a 12-bit BCID, an 8-bit Band-ID and a 5-bit φ-ID will be sent over two differential pairs together with a 160 MHz differential clock and a single-ended frame line which indicates valid trigger data.
sTGC router board: the router board will also sit on the NSW rim. Each board deals with a single sTGC plane in the NSW sector. The board will accept up to 10 strip-TDS inputs from a detector plane and send up to four active inputs per bunch crossing through its optical transceivers. Commercial repeater chips will be used at the receiving end of the Router to recondition the high speed signals and suppress transmission errors. Data transmitted from the strip-TDSs are packaged into 30-bit packets with unique headers for self-identification as data or Null packets. To minimize latency, logic implemented in the Router board FPGA will recognize an active input before receiving its entire data frame and set up routing for it. A picture of the first prototype is shown in figure 5(b) and test results can be found in [18].

Front-end electronics for MM system
Address in Real Time (ART) ASIC: the ART ASIC acts as a MM trigger data concentrator which accepts inputs from 32 VMMs and selects up to 8 active channel inputs to transmit per bunch crossing. An active input from a 64-channel VMM to an ART ASIC consists of a flag signal followed by the address of the first above-threshold strip encoded in six bits. Input signals to the ART ASIC will be deskewed with programmable delays for correct sampling. The ART ASIC will start the processing by looking for the presence of ART flags from all VMM inputs in each bunch crossing. A valid flag from any VMM will register a hit in the hit list. The first 8 hits with non-zero flags will be selected from the hit list using a cascaded priority selector circuit. Strip addresses are deserialized after the ART flags are detected. Finally, a data formatter circuit will organize the data to be sent to the GBTx ASIC [22]. Including eight 6-bit strip addresses, eight 5-bit VMM IDs and a BCID, there will be 112 bits transmitted per bunch crossing from the ART ASIC to the GBTx using 14 differential pairs running at 320 Mbps in Double Data Rate (DDR) mode.
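The hit selection performed by the ART ASIC can be mimicked in software as below. This is a behavioural sketch only: it assumes that priority follows increasing VMM ID, which the text does not specify, and it models neither the deserialization timing nor the output data format. The function name is hypothetical.

```python
def select_art_hits(flags, addresses, max_hits=8):
    """Pick the first `max_hits` flagged VMM inputs in one bunch crossing.

    flags:     32 truthy/falsy ART flags, indexed by VMM ID (5-bit ID).
    addresses: 32 strip addresses, one per VMM; only the low 6 bits are kept,
               matching the 64 channels of a VMM.
    Returns a list of (vmm_id, strip_address) pairs.
    """
    if len(flags) != 32 or len(addresses) != 32:
        raise ValueError("the ART ASIC model expects inputs from 32 VMMs")
    hits = []
    for vmm_id, (flag, address) in enumerate(zip(flags, addresses)):
        if flag:
            hits.append((vmm_id, address & 0x3F))
            if len(hits) == max_hits:
                break  # lower-priority hits are dropped
    return hits
```

With more than eight flagged VMMs in a bunch crossing, the hits from the lowest-priority inputs are discarded, which is the behaviour a priority selector implies.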
ART Data Driver Card (ADDC): each ADDC hosts two ART ASICs, two GBTx ASICs and the VTTx, which is a radiation tolerant dual-channel transmitter [23]. Each GBTx, linked with an ART ASIC, transmits data through a serial link at 4.8 Gbps (4.48 Gbps bandwidth for data). The VTTx converts the two electrical links into optical signals and sends MM trigger data to a MM trigger processor. Each MM detector plane has four ADDCs, and the 32 cards per NSW sector are connected to a MM trigger processor.
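The link numbers quoted for the MM trigger path are mutually consistent, as the following short check shows. It uses only values stated in the text (14 differential pairs at 320 Mbps, 25 ns bunch crossings, four ADDCs per plane, eight MM planes per sector); the function names are illustrative.

```python
BUNCH_CROSSING_NS = 25   # LHC bunch spacing quoted in the text
MM_PLANES_PER_SECTOR = 8


def bits_per_bunch_crossing(n_pairs=14, rate_mbps=320):
    """Bits moved from the ART ASIC to the GBTx in one bunch crossing."""
    # Mbit/s * ns = 1e6 * 1e-9 bits = 1e-3 bits, hence the division by 1000.
    return n_pairs * rate_mbps * BUNCH_CROSSING_NS // 1000


def payload_rate_gbps(bits_per_bc):
    """Data rate needed to ship `bits_per_bc` bits every bunch crossing."""
    return bits_per_bc / BUNCH_CROSSING_NS  # bits per ns equals Gbit/s


def addcs_per_sector(addcs_per_plane=4):
    """Number of ADDCs feeding one MM trigger processor."""
    return addcs_per_plane * MM_PLANES_PER_SECTOR
```

Evaluating these gives 112 bits per bunch crossing, a 4.48 Gbps payload on the 4.8 Gbps GBTx link, and 32 ADDCs per sector.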

Conclusions
The ATLAS Muon NSW upgrade will replace the innermost muon station in the forward region with high-rate capable precision trigger and tracking detectors, in order to improve the Level-1 muon trigger in the End-cap region and to maintain excellent muon tracking capability in the high rate environment foreseen for LHC high luminosity runs after the Phase-I upgrade. By providing segment measurements with 1 mrad angular resolution, the participation of the NSW in the Level-1 muon trigger will be a powerful means of discriminating muons against backgrounds with high track density. Good angular resolution at the NSW will also substantially improve the p_T resolution of the muon trigger once a similar resolution improvement to the Big Wheel segment measurement is implemented in the Phase-II upgrade.
A complex NSW trigger and readout electronics system is under intensive development. The requirement that NSW segments be reconstructed with ∼1 mrad accuracy within about 1 µs imposes great challenges on the design of the NSW trigger electronics. First prototypes of the NSW trigger ASICs and Front-end boards are being developed and tested. Separate trigger algorithms for the two NSW sub-detector systems (sTGC and MM) have been developed. These two independent sub-systems provide redundancy in the high track density environment of the future LHC. The trigger algorithms will be implemented in FPGAs on ATCA-based mezzanine cards with high density optical links and interconnections. A vertical slice is presently under preparation to integrate and exercise all trigger electronics and trigger algorithms before the final NSW commissioning.