Fast convolutional neural networks for identifying long-lived particles in a high-granularity calorimeter

We present a first proof of concept for directly using neural-network-based pattern recognition to trigger on distinct calorimeter signatures from displaced particles, such as those that arise from the decays of exotic long-lived particles. The study is performed for a high-granularity forward calorimeter similar to the planned high-granularity calorimeter for the high-luminosity upgrade of the CMS detector at the CERN Large Hadron Collider. Without assuming a particular model that predicts long-lived particles, we show that a simple convolutional neural network, which could in principle be deployed on dedicated fast hardware, can efficiently identify showers from displaced particles down to low energies while providing a low trigger rate.


Introduction
Particles with long lifetimes are an important possibility in the search for new phenomena, and often appear in beyond the standard model theories, notably in models that describe the elementary particle nature of dark matter. When produced at the LHC, these long-lived particles (LLPs) have a distinct experimental signature: they can decay far from the primary proton-proton (pp) interaction but within a detector such as ATLAS or CMS, or even completely pass through the detector before decaying. For example, neutral LLPs could travel a significant distance through the detector before decaying into displaced leptons, photons, or jets [1][2][3][4][5].
The data at the ATLAS and CMS experiments are collected using triggers, which select events in real time, reducing the event rate from the 40 MHz bunch crossing rate down to about 1 kHz that can be written to disk. Most triggers assume that the particles originate from the pp interaction vertex and are not displaced. Thus, dedicated triggers for displaced particles are necessary to maximize the chances of catching new phenomena at the LHC, in particular for its future data-taking runs.
The trigger system of the LHC experiments is usually organized in stages. In the CMS experiment, events of interest are selected using a two-level trigger system [6]. The first level (L1), composed of custom hardware processors, uses information from the subdetectors and will reduce the data rate to 750 kHz in CMS at the High-Luminosity LHC (HL-LHC) [7], which is planned to start taking data in 2027. In this phase, the upgraded L1 trigger will also feature inputs from the silicon tracker, allowing for real-time track fitting and highly efficient particle-flow reconstruction [8] of objects at the trigger level. The logic will be implemented in field-programmable gate arrays (FPGAs).
Deep neural networks (DNNs) of limited size can be deployed on FPGAs using dedicated tools such as HLS4ML [9], and can therefore now be included directly in the L1 trigger. Given the recent success of DNNs in high energy physics, in particular for complex pattern recognition problems such as b jet identification or heavy flavour jet identification, anomaly detection, as well as shower reconstruction in highly granular calorimeters and particle flow [10][11][12][13][14][15][16][17][18][19][20][21][22][23], this opens up new possibilities for triggers with simultaneously high computing and physics performance.
arXiv:2004.10744v1 [hep-ex] 22 Apr 2020
Due to the higher occupancy with up to 200 pp interactions per bunch crossing, in particular in the forward region, a new endcap calorimeter, the high-granularity calorimeter (HGCal), will be installed in CMS for the HL-LHC [24]. The HGCal detector layers, interleaved within the absorber structure, will feature a high-granularity electromagnetic section using 28 layers of silicon sensors with pad segmentation, and a hadronic section of 22 layers that uses the same technology in its innermost layers and a less segmented scintillator tile section at higher radii. The high granularity of this system will allow for the measurement of particle showers in five parameters: three space dimensions, time, and energy. The HGCal will be the first imaging calorimeter in a running experiment at a high-energy collider, which opens many new opportunities, such as using it for a pattern-recognition-based trigger for displaced particles.
An example of an LLP signature that produces such a displaced, forward signature in the form of jets is the so-called "emerging jets" [25,26]. Emerging jets contain electrically charged standard model (SM) particles that are consistent with having been created in the decays of new neutral LLPs produced in a parton-shower process by dark quantum chromodynamics (QCD). Dark QCD is a new strong dynamics, similar to SM QCD, but in a separate dark sector. Dark QCD is proposed in order to explain the origin of dark matter [25].
This note presents the first proof of concept of using pattern recognition based on convolutional neural networks (CNNs) [28] to trigger on calorimeter signatures, as opposed to an energy over threshold. It has been shown in Ref. [27] that two-dimensional calorimeter images can be used to detect a variety of displaced signatures using CNNs. The study presented here is made model independent by investigating the identification of electromagnetic showers that, in general, do not point to the primary pp interaction vertex. The angle between the projection direction and the particle momentum (angle to projection axis) is referred to as α in the following. This trigger improvement will allow us to extend LLP searches with the HL-LHC to low mass and large displacements.
The study is performed using a toy calorimeter, similar to the HGCal, described in Section 2 together with the generated data set. The architecture and training of the DNN are presented in Section 3 and the results are presented in Section 4.

Detector and data sample description
The endcap calorimeter is built using Geant4 [29] and is placed at a distance of z = 3 m from the interaction point. It covers the pseudorapidity (η) range between 1.5 and 3.0, has a depth of 34 cm, and consists of 14 equidistant layers. Each layer comprises a 10.4 mm lead absorber and 300 µm silicon sensors. The sensors are placed in 30 rings in η, each containing 120 segments in φ, leading to 50 400 sensors in total, each with a size of approximately 0.05 in η and φ. This configuration corresponds to approximately 60 radiation lengths, and therefore covers electromagnetic showers only. The number of layers and the cell size approximate the granularity of the planned HGCal at first trigger level. Charged particles are subject to a magnetic field of 1 T in the z direction.
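The segmentation quoted above can be cross-checked with a few lines of arithmetic. The following is a minimal sketch, assuming that the 30 rings have equal width in η and that the radial extent is evaluated at the front face (z = 3 m); the function and variable names are illustrative and not taken from the simulation code.

```python
import math

# Numbers quoted in the detector description.
N_LAYERS = 14
N_ETA_RINGS = 30
N_PHI_SEGMENTS = 120
ETA_MIN, ETA_MAX = 1.5, 3.0
Z_FRONT_M = 3.0  # distance of the calorimeter from the interaction point


def radius_at(eta: float, z: float) -> float:
    """Transverse radius reached at longitudinal distance z for a given eta."""
    theta = 2.0 * math.atan(math.exp(-eta))  # from eta = -ln tan(theta / 2)
    return z * math.tan(theta)


# Total sensor count and cell size (assumption: rings of equal width in eta).
n_sensors = N_LAYERS * N_ETA_RINGS * N_PHI_SEGMENTS
d_eta = (ETA_MAX - ETA_MIN) / N_ETA_RINGS
d_phi = 2.0 * math.pi / N_PHI_SEGMENTS

print(f"sensors in total: {n_sensors}")
print(f"cell size: {d_eta:.3f} in eta, {d_phi:.4f} rad in phi")
print(
    "radial coverage at the front face: "
    f"{radius_at(ETA_MAX, Z_FRONT_M):.2f} m to {radius_at(ETA_MIN, Z_FRONT_M):.2f} m"
)
```

The product of layers, rings, and segments reproduces the 50 400 sensors, and both cell dimensions come out close to 0.05, in line with the text.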
The signal data set is produced by generating photons at z = 299 cm with a flat energy spectrum between 10 and 200 GeV. The angle with respect to the projection axis is uniformly sampled between 0 and π/3. The position is randomly set to be within a radius of 20 to 60 cm with respect to the beam axis. The rotation with respect to the projection axis is also randomly sampled, but constrained such that at least the first and the last layer of the calorimeter are hit. We consider in total 780,000 signal events for training, 8,800 for validation, and 14,400 for evaluating the performance of the proposed algorithm (testing).
To estimate the rate and the effect of multiple interactions per bunch crossing, minimum bias events are produced using Pythia8 [30]. We generate two independent samples: 15.3 M events for training and 4 M for testing and validation. The energy deposits of 200 randomly chosen minimum bias events are added to build a background event and to estimate the effect of the contribution of extraneous pp collisions to the signal. For training and validation, the ratio of signal to background events is 1:1. For testing, 70 background events are generated for each signal event. The rate is calculated by normalising the minimum bias events by the LHC revolution frequency of 11 246 Hz and the number of bunches of 2760 [31].
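The rate normalisation can be illustrated as follows. This is a sketch of one possible reading of the procedure, assuming that each background event (200 overlaid minimum bias events) represents one bunch crossing, so that the trigger rate is the accepted fraction of background events multiplied by the bunch crossing rate. The helper names are hypothetical.

```python
LHC_REVOLUTION_FREQUENCY_HZ = 11_246
N_BUNCHES = 2_760

# Bunch crossings per second implied by the two numbers above.
BUNCH_CROSSING_RATE_HZ = LHC_REVOLUTION_FREQUENCY_HZ * N_BUNCHES


def trigger_rate_hz(n_accepted: int, n_total: int) -> float:
    """Rate for n_accepted out of n_total background events (assumed normalisation)."""
    if n_total <= 0:
        raise ValueError("n_total must be positive")
    return n_accepted / n_total * BUNCH_CROSSING_RATE_HZ


def accepted_fraction_for(rate_hz: float) -> float:
    """Fraction of background events that may pass to stay at a given rate."""
    return rate_hz / BUNCH_CROSSING_RATE_HZ


print(f"bunch crossing rate: {BUNCH_CROSSING_RATE_HZ / 1e6:.2f} MHz")
print(f"accepted fraction at 15 kHz: {accepted_fraction_for(15e3):.2e}")
```

With this normalisation, a rate in the range of 10 to 15 kHz corresponds to accepting only a few background events in ten thousand, which is why a large background sample is needed for testing.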

Neural network and training
To distinguish between events with and without a displaced photon, we use a CNN architecture, developed for pattern recognition in images or other data that can be described by a regular grid structure. The detector geometry is unrolled to a two-dimensional image in η and φ with 14 color dimensions, one for each layer. The first 8 columns, from φ = 0 to φ = 0.4, are repeated at φ = 2π to account for particles that enter the calorimeter at φ ≈ 0. An example of a displaced photon signature after this preprocessing is shown in Figure 1. In this projection, the displaced photon forms a line, while the other particles coming from the primary interaction form points. Moreover, the trajectory of the displaced shower through the layers is distinct from the other particles by a clearly visible color gradient. The neural network needs to be designed such that it provides a compromise between performance and resource requirements. The latter are particularly stringent if this method is to be implemented in dedicated hardware in the first stages of the trigger. While we do not include dedicated studies of the resource requirements on such hardware in this note, the architecture is nevertheless chosen such that it could be adapted to such a setting, e.g. through HLS4ML.
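The repetition of the first φ columns amounts to a cyclic padding of the unrolled image. A minimal sketch is given below, assuming the image is stored as a nested list indexed as image[eta][phi], where each entry stands for the 14 layer energies of one cell; this layout and the function name are illustrative choices, not the preprocessing code used for the study.

```python
N_ETA, N_PHI, N_WRAP = 30, 120, 8


def wrap_phi(image, n_wrap=N_WRAP):
    """Append the first n_wrap phi columns of every eta row to the end of that row."""
    return [row + row[:n_wrap] for row in image]


# Toy image: every cell is labelled by its (eta, phi) index instead of energies.
image = [[(i_eta, i_phi) for i_phi in range(N_PHI)] for i_eta in range(N_ETA)]
padded = wrap_phi(image)

print(len(padded), len(padded[0]))  # number of eta rows, number of phi columns
```

After the padding, a shower that crosses the φ = 0 boundary appears as one connected pattern at the high-φ edge of the image, at the cost of 8 additional columns.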
For each pixel, the 14 color dimensions are reduced to 4 by sequentially applying 3 dense neural network layers. The first two layers have 16 nodes each, and the third has 4. The resulting image embeds the depth information in these 4 features, as opposed to Ref. [27], where only a two-dimensional representation of the calorimeter deposits is used. The image containing the encoded depth information is fed through 4 CNN blocks, each containing a CNN layer with a kernel size of 3 × 3 pixels, max pooling, and batch normalisation [32]. No padding is applied in the neural network. The CNN layers in the first and second blocks contain 8 filters each, and max pooling is applied with a kernel of 2 × 2 pixels. The last two blocks have 12 and 16 filters, and max pooling is only applied on two pixels in the φ direction. The output of the convolutional blocks is flattened and fed through one dense layer with 32 nodes before the final classifier is calculated using a sigmoid activation. In the other layers, we employ ReLU activations [33]. The network contains 10,405 trainable parameters.
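The layer description can be followed through with simple shape and parameter bookkeeping. The sketch below is not the training code. It assumes a 30 × 128 input image (30 rings in η, 120 φ segments plus the 8 repeated columns), unpadded convolutions with stride 1, pooling that drops any remainder, bias terms in all dense and convolutional layers, and two trainable parameters per channel in each batch normalisation. Under these assumptions the count comes to 10,377, which is 28 short of the 10,405 quoted above; the difference would be consistent with, for example, a batch normalisation acting on the 14 input channels, but this is a guess and not something stated in the text.

```python
def conv_out(size: int, kernel: int = 3) -> int:
    """Output length of an unpadded convolution with stride 1."""
    return size - kernel + 1


def pool_out(size: int, pool: int) -> int:
    """Output length of non-overlapping max pooling (remainder dropped)."""
    return size // pool


def dense_params(n_in: int, n_out: int) -> int:
    """Weights plus biases of a dense layer."""
    return (n_in + 1) * n_out


def conv_params(c_in: int, c_out: int, kernel: int = 3) -> int:
    """Weights plus biases of a 2D convolution with a square kernel."""
    return (kernel * kernel * c_in + 1) * c_out


def batchnorm_params(channels: int) -> int:
    """Trainable scale and offset of a batch normalisation."""
    return 2 * channels


# Per-pixel dense layers: 14 -> 16 -> 16 -> 4.
n_params = dense_params(14, 16) + dense_params(16, 16) + dense_params(16, 4)

# Assumed input image size after phi wrapping, with 4 features per pixel.
height, width, channels = 30, 128, 4

# (filters, pooling in eta, pooling in phi) for the four CNN blocks.
blocks = [(8, 2, 2), (8, 2, 2), (12, 1, 2), (16, 1, 2)]
for filters, pool_eta, pool_phi in blocks:
    n_params += conv_params(channels, filters) + batchnorm_params(filters)
    height = pool_out(conv_out(height), pool_eta)
    width = pool_out(conv_out(width), pool_phi)
    channels = filters

# Flatten, dense layer with 32 nodes, single sigmoid output.
n_flat = height * width * channels
n_params += dense_params(n_flat, 32) + dense_params(32, 1)

print(f"flattened features: {n_flat}")
print(f"trainable parameters under these assumptions: {n_params}")
```

The bookkeeping also shows that more than half of the parameters sit in the dense layer after the flattening, so the pooling steps are what keep the network small.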
The training is performed using TensorFlow [34] and Keras [35] within the DeepJetCore framework [36] using the Adam [37] optimiser. The first epoch is trained with a batch size of 50 and a learning rate of 0.0001. The batch size is then increased to 500 for another 30 epochs of training with a learning rate of 0.0003.

Results
We study the efficiency as a function of rate, for different photon energies and angles α with respect to the projection axis. As described in Section 2, both variables are sampled from a uniform distribution. This way of presenting the results is model-independent, whereas any choice of displacement would be inherently model-dependent. As shown in Figure 2, the efficiency rapidly increases with the photon energy for a fixed rate, and reaches values above 60% for a rate of 10 kHz already for energies larger than 30 GeV.
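The relation between a target rate and the signal efficiency can be sketched as follows: a cut on the classifier output is chosen such that the accepted background stays within the target rate, and the efficiency is the fraction of signal events above that cut. The function names and the toy scores are hypothetical; in the study the efficiency is evaluated separately in bins of photon energy and α.

```python
import math

# Bunch crossing rate from the revolution frequency and the number of bunches.
BUNCH_CROSSING_RATE_HZ = 11_246 * 2_760


def threshold_for_rate(background_scores, target_rate_hz):
    """Score cut such that at most the allowed number of background events pass."""
    n_total = len(background_scores)
    n_allowed = math.floor(target_rate_hz / BUNCH_CROSSING_RATE_HZ * n_total)
    if n_allowed >= n_total:
        return -math.inf  # target rate is not restrictive: accept everything
    ranked = sorted(background_scores, reverse=True)
    # Events pass if score > threshold, so cutting at the (n_allowed + 1)-th
    # highest score keeps at most n_allowed background events, also with ties.
    return ranked[n_allowed]


def efficiency(signal_scores, threshold):
    """Fraction of signal events with a score above the threshold."""
    return sum(score > threshold for score in signal_scores) / len(signal_scores)


# Toy example: allow a quarter of all bunch crossings (far above a real budget).
background = [0.1, 0.9, 0.2, 0.8, 0.3, 0.05, 0.4, 0.6, 0.15, 0.7]
signal = [0.95, 0.75, 0.5, 0.85]
cut = threshold_for_rate(background, 0.25 * BUNCH_CROSSING_RATE_HZ)
print(cut, efficiency(signal, cut))
```

Repeating this for a range of target rates gives one efficiency-versus-rate curve per energy or α bin.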
As opposed to a trigger that is based on energy thresholds only, the proposed DNN trigger depends critically on both the energy and the angle α. The trigger efficiency as a function of the energy for a trigger rate of 15 kHz is shown in Figure 3 (left). Particles entering the calorimeter with angles of α > 0.2 provide a sufficiently distinct signature to be detected already at relatively low energies, while for smaller angles, the efficiency remains moderate up to high energies. The dependence of the trigger efficiency on α, shown in Figure 3 (right), does not follow the same pattern. Here, the efficiency increases with α for all energies, but decreases slightly beyond approximately α = 0.5. This behavior depends on the DNN architecture and the detector geometry. Starting from a certain angle, the cells hit by a particle are no longer adjacent pixels, but leave a sparse image that can only be resolved by a DNN with sufficient complexity and a larger receptive field.

Summary
The first proof of concept of using pattern recognition with fast convolutional neural networks to trigger on displaced calorimeter signatures is presented. In particular, displaced signatures in a forward calorimeter can be identified with good efficiency and low false positive rate. For a target trigger rate of 15 kHz, individual particles with angles with respect to the projection axis greater than 0.2 can be detected with good efficiency at low particle energy. This study indicates a potential increase in sensitivity to low mass, forward-moving long-lived particles.