Search for dark QCD with emerging jets in proton-proton collisions at $\sqrt{s}$ = 13 TeV

A search for ``emerging jets'' produced in proton-proton collisions at a center-of-mass energy of 13 TeV is performed using data collected by the CMS experiment corresponding to an integrated luminosity of 138 fb$^{-1}$. This search examines a hypothetical dark quantum chromodynamics (QCD) sector that couples to the standard model (SM) through a scalar mediator. The scalar mediator decays into an SM quark and a dark sector quark. As the dark sector quark showers and hadronizes, it produces long-lived dark mesons that subsequently decay into SM particles, resulting in a jet, known as an emerging jet, with multiple displaced vertices. This search looks for pair production of the scalar mediator at the LHC, which yields events with two SM jets and two emerging jets at leading order. The results are interpreted using two dark sector models with different flavor structures, and exclude mediator masses up to 1950 (1950) GeV for an unflavored (flavor-aligned) dark QCD model. The unflavored results surpass a previous search for emerging jets by setting the most stringent mediator mass exclusion limits to date, while the flavor-aligned results provide the first direct mediator mass exclusion limits to date.


Introduction
Although there is a preponderance of evidence from astronomical and cosmological observations [1][2][3][4][5] for the existence of dark matter (DM), it has not yet been detected in laboratories, suggesting that its origin may be associated with as-of-yet unobserved physics processes beyond the standard model (SM).As experimental searches have excluded a large portion of the phase space of DM models with weakly interacting massive particles, alternative theoretical models have been developed with a hidden gauge sector, some of which are similar to quantum chromodynamics (QCD), which can result in strongly self-interacting DM particles [6][7][8][9].Dark matter of this type could interact with SM particles through so-called mediator particles and potentially be produced at colliders, generating signatures such as semivisible jets [10] or emerging jets (EJs) [11].
The search described in this paper is motivated by the models proposed in Refs.[11][12][13].A composite dark sector with a QCD-like non-Abelian gauge symmetry SU(N dark color ), where N dark color is the number of dark colors, is added to the SM gauge group.We consider the case where fermions in the dark sector (dark quarks Q dark ) interact with the SM quarks through a scalar mediator X dark .The scalar mediator is charged under both SM QCD and dark QCD and couples to a quark and a dark quark via Yukawa interactions with coupling strength κ αi , where the subscript α (i) denotes flavors of dark (SM) quarks.The mediator can be pair produced through gluon splitting at the LHC, similar to the pair production of a single type of squark in supersymmetry [14].In fact, the production cross section is the same as for pair production of right-handed top squarks [15,16] multiplied by N dark color .Each mediator decays into a quark and a dark quark, as shown in Fig. 1.The dark quarks hadronize into dark hadrons.Generally, of the dark hadrons in the composite QCD-like dark sector described above, either the lightest dark baryon or dark meson is stable and thus provides a natural DM candidate.The first version of this model that we consider, referred to as "unflavored", has a simplified flavor structure in the coupling of the dark sector to SM particles, such that only the Yukawa coupling to the d quark is nonnegligible [11].Equation (1), taken from Ref. [11], gives the proper decay length of dark pions in this model: where κ is the coupling between dark quarks and the SM down quark, f π dark is the dark pion decay constant, m d is the mass of the SM down quark, and m π dark is the dark pion mass.The numbers with units show the expected typical scale for each variable.
If instead multiple Yukawa couplings κ αi have nonnegligible values, the proper decay length for dark mesons composed of dark quarks of flavor indices α and β is given by Eq. ( 2) from Ref. [13]: where m X dark is the mediator particle mass, N c is the SM color factor, and m i , m j are the masses of the SM quarks with flavor indices i, j, respectively.Within this model, we focus on the "flavoraligned" scenario from Ref. [13], which has three dark quark flavors that couple to the SM down-type quarks (d, s, b) via a diagonal matrix κ αi = κ 0 δ αi .Because of spin considerations, dark hadron decays to heavier SM particles are favored, typically resulting in a large number of b quarks in the decays when kinematically allowed.We characterize the flavor-aligned model in terms of the maximum proper lifetime for any dark pion, cτ max π dark .Following the benchmark models from Refs.[11,13], where the lightest dark baryon is stable, and given that dark baryon production is suppressed by 1/N dark color [17], we consider only dark meson production and require all dark mesons to decay finally into SM particles.When the lifetimes are long enough to give macroscopic decay distances, the resulting signature from either model is two SM jets and two EJs containing multiple displaced vertices.
Previous experimental searches for strongly self-interacting DM have been made using protonproton (pp) collision data collected at the CERN LHC at √ s = 13 TeV and corresponding to about 140 fb −1 .This includes a search for semivisible jets by the CMS Collaboration [18], a search for semivisible jets by the ATLAS Collaboration [19], and a search for dark quarks in a dijet final state by ATLAS [20].In addition, a previous search for EJs interpreted with an unflavored dark sector model was performed by the CMS Collaboration using a data sample collected in 2016, corresponding to an integrated luminosity of 16.1 fb −1 [21].
In this paper, we present a search for the EJ signatures of the unflavored and the flavor-aligned dark sector models, focusing on the displacement information of the tracks in the EJ, using the data set collected by the CMS Collaboration in 2016-2018 using pp collisions at a centerof-mass energy of 13 TeV and corresponding to 138 fb −1 .The search strategy first identifies EJs by exploiting their topological differences relative to SM jets using selection criteria optimized separately for each model.Then, the probability for an SM jet to be misidentified as an EJ is measured, and the background is estimated using control samples in data.The main background comes from SM multijet production where SM jets are misidentified as EJs.
Compared to the previous analysis [21], we have extended the unflavored EJ model search to a wider parameter space.This study also includes the first dedicated search for a flavored dark QCD sector.We have implemented a number of important changes that considerably increase the sensitivity of the search, including the incorporation of a machine learning (ML) technique aimed at enhancing the EJ identification (tagging) performance.The tabulated results are provided as a HEPData record [22].This paper is organized as follows.In Section 2 we give an introduction of the CMS detector, followed in Section 3 by a detailed description of the simulated data used in this search.The event reconstruction and triggering algorithms are discussed in Section 4. In Section 5 we present the analysis strategy and two independent EJ tagging methods.Section 6 describes the background estimation method.The treatment of uncertainties is detailed in Section 7. The results are presented in Section 8 and summarized in Section 9.

The CMS detector
The central feature of the CMS apparatus is a superconducting solenoid of 6 m internal diameter, providing a magnetic field of 3.8 T. Located within the solenoid volume are a silicon pixel and strip tracker, a lead tungstate crystal electromagnetic calorimeter (ECAL), and a brass and scintillator hadron calorimeter (HCAL), each composed of a barrel and two endcap sections.Forward calorimeters extend the pseudorapidity (η) coverage provided by the barrel and endcap detectors.Muons are measured in gas-ionization detectors embedded in the steel flux-return yoke outside the solenoid.
Physics events of interest are selected using a two-tiered trigger system.The first level, called the level-1 trigger, is composed of custom hardware processors and uses information from the calorimeters and muon detectors to select events at a rate of around 100 kHz within a fixed latency of about 4 µs [26].The second level, known as the high-level trigger, consists of a farm of processors running a version of the full event reconstruction software optimized for fast processing, and reduces the event rate to around 1 kHz before data storage [27].A more detailed description of the CMS detector, together with a definition of the coordinate system used and the relevant kinematic variables, can be found in Ref. [28].

Event simulation
Monte Carlo (MC) events are used to evaluate the signal acceptance, optimize selection criteria, and test the closure of the background estimation methods.
The signal process is generated using the Hidden Valley module [29,30] in PYTHIA version 8.240 [31], based on Ref. [11].In the unflavored scenario, we choose the number of dark quark flavors to be N dark flavor = 7, following Ref.[11].The running of the dark coupling constant with Q 2 , where Q is the momentum transfer, is faster for smaller N dark flavor , and the resulting showers have fewer dark mesons.In the flavor-aligned scenario, N dark flavor = 3 is used, and κ αi is set to be diagonal, with all diagonal elements having the value κ 0 .For this scenario, the PYTHIA Hidden Valley module is modified to produce the different dark hadron species at the desired occurrence frequencies based on the dark quark flavors.For both cases, we consider a representation similar to QCD with N dark color = 3.The dark quark masses are degenerate and equal to the confinement scale Λ dark .The dark pion mass is set to be half of Λ dark , and the dark ρ meson mass four times the dark pion mass.These parameter choices are motivated by the assumptions made concerning dark matter energy density and energy scales in Ref. [11].The natural width of X dark is set to 10 GeV, a relatively small value compared to the detector resolution.Under these assumptions, the free parameters of the model are limited to the mediator mass m X dark , the dark pion mass m π dark , and the dark pion lifetime cτ π dark .Tables 1 and 2 summarize the signal model parameters used in this search.The signal cross section, described in Section 1, is computed at next-to-leading order (NLO), with the resummation of soft gluon emission included at next-to-leading-logarithmic accuracy.We simulate SM QCD multijet events and γ+jets events using the MADGRAPH5 aMC@NLO 2.6.5 event generator [32] at leading order with the MLM matching procedure [33].
In all cases parton showering and hadronization is performed using PYTHIA8 with the CP5 underlying event tune [34] and the NNPDF3.1 next-to-NLO parton distribution functions (PDFs) [35].The response of the CMS detector is modeled using GEANT4 [36], and corrections are applied to the simulated samples to account for the differences in resolutions and efficiencies between the data and the simulation.

Event reconstruction and triggering
A global "particle-flow" (PF) algorithm [37] aims to reconstruct all individual particles in an event, combining information provided by the tracker, calorimeters, and muon system.The reconstructed information from the different subsystems is used to build physics objects such as photons, leptons, jets, and missing transverse momentum [38][39][40].
The pp interaction vertices are reconstructed by clustering tracks on the basis of their z coordinates along the beamline at their points of closest approach to the beam axis using a deterministic annealing algorithm [41].The position of each vertex is estimated with an adaptive vertex fit [42].
Multiple vertex candidates can be reconstructed because of additional pp interactions in a bunch crossing (pileup).The primary vertex (PV) is taken to be the vertex corresponding to the hardest scattering in the event, evaluated using tracking information alone, as described in Section 9.4.1 of Ref. [43].The contribution from charged-particle tracks from pileup interactions is reduced by rejecting those associated with other vertices [44].The PV is required to be within 15 cm of the CMS detector center in the z direction to ensure optimal reconstruction efficiency.
Jets are reconstructed from the PF particles using the anti-k T algorithm [45,46] with a distance parameter of 0.4.The jet momentum (energy) is calculated as the vectorial (scalar) sum of the momenta (energies) of all clustered particles.Corrections derived from data and simulation are applied to the jet energy.Jets that are consistent with the fragmentation of b quarks (b jets) are identified using the DEEPJET discriminator [47][48][49].Both the medium and loose working points are used; the medium (loose) working point has an 85% (90%) probability of correctly identifying b jets with p T > 90 GeV and a 1% (10%) probability of misidentifying light-flavor jets as b jets.The scalar p T sum of all jets within an event (H T ) is used to select energetic events.
The analysis considers data collected using triggers based on jet p T and on H T calculated from the summed p T of online-reconstructed jets.The trigger used for the data collected in 2016 requires at least one jet with p T > 450 GeV or H T > 900 GeV, while for 2017-2018, the H T threshold is increased to 1050 GeV and there is no p T requirement.The inclusion of a jet p T trigger requirement for 2016 was necessary because, in part of the 2016 data taking period, some jets reaching the saturation energy for the level-1 trigger were mistakenly dropped from the H T sum, resulting in a significant loss of efficiency.This loss was recovered by including the singlejet trigger requirement.The H T threshold was increased in 2017 to compensate for the higher instantaneous luminosity.The efficiencies for an event to pass any of these trigger conditions in both data and simulation are measured from data sets collected with an independent trigger that requires a muon with p T > 50 GeV.In both cases, the efficiencies are close to unity above the signal selection H T thresholds, and the difference between the efficiencies as measured in simulation and data is applied as a correction to the simulation.

Analysis
The offline analysis selects energetic events with at least four jets.Jets are classified as EJ candidates using information from tracks associated with the jets.An overview of the event selection strategy is given in Section 5.1.Descriptions of the "candidate" EJ identification criteria ("tagging") is given in Section 5.2.The algorithms used to optimize the selection criteria to maximize signal sensitivity, and the resulting event selection requirements are described in Section 5.3.

Event and physics object selection
While some SM hadrons, such as those containing b quarks, have macroscopic decay lengths, tracks associated with SM jets mainly come from particles produced promptly (close to the collision primary vertex).In EJs, when the lifetime of one or more of the dark mesons is long, a substantial fraction of the tracks can emerge from displaced vertices.In addition, in the flavor-aligned scenario, most of the dark pion decay products are b quarks, which results in displaced tracks even when the dark pion decays immediately.We use as our displacement measure the d xy (d z ), measured from the PV to the point of closest approach on the track trajectory, as this gives better sensitivity than reconstructing individual decay vertices.Jets used in this analysis are required to have p T > 100 GeV, as we are looking for EJs originating from a heavy-mediator decay, and |η| < 2, to ensure that they are well contained in the tracker.Jet candidates are also required to pass a set of quality criteria to reject spurious jets from instrumental sources [50].To reduce the probability of displaced tracks being incorrectly assigned to jets by the PF algorithm, high-purity tracks, as defined in Ref. [23], with p T > 1 GeV are associated with jets by requiring the angular separation between the jet direction and the track direction, ∆R = √ (∆η) 2 + (∆ϕ) 2 < R max , where ∆η is the η separation between the jet axis and the track, and ∆ϕ is the separation in the azimuthal direction.We also reject tracks with |d z | > d max z to suppress jets from pileup interactions.The values of R max and d max z are optimized for each signal model.If a track can be assigned to multiple jets, it is assigned to the jet with the smallest ∆R separation.Jets are required to have at least one associated track so that the jet displacement measure can be calculated.To suppress events with a poorly reconstructed PV, we require that at least 10% of associated tracks have a longitudinal displacement from the PV less than 0.01 cm.
Candidate signal events are required to have high H T and at least four jets passing the criteria above.At least two of these four jets must be tagged as EJs.If more than four jets are found, the four jets with the largest p T are used.The H T and EJ criteria are optimized for each signal model.

EJ tagging
Two EJ tagging strategies are used in this analysis.The first selects candidate EJs via requirements on jet-level variables using track displacement measures ("model-agnostic selection").The second uses a graph neural network trained on specific signal models to determine an EJ tagging score ("machine learning or ML-based selection").The model-agnostic method allows for a simpler reinterpretation of the results for alternate theoretical models not considered in this paper, while the ML-based approach achieves the best possible sensitivity for the specific models studied here.
The model-agnostic EJ tagger uses different input features for the unflavored and flavoraligned dark sector models, while the ML-based EJ tagger is trained separately for each class of model.

Model-agnostic EJ tagging
For the EJ tagging that targets the unflavored dark sector models, R max = 0.4 is used.The following variables, which were also used in Ref. [21], are used for the EJ tagging: • d xy : this is the median d xy of tracks associated with a jet.
• α 3D : this is defined as which is the ratio between the scalar p T sum of the associated tracks with a pseudosignificance D N smaller than the selection value D max N and the scalar p T sum of all the associated tracks.The pseudo-significance D N is defined as: where σ(d xy ) is the d xy uncertainty calculated from the covariance matrix of the fitted track trajectory.The d z significance is based on the PV z resolution.
The distributions of these variables are presented in Fig. 2 for data events and for simulated signal and multijet background events.The data and simulated events shown require H T > 1200 GeV and at least four jets with p T > 100 GeV and |η| < 2, which is much less restrictive than the final signal selection requirements.• Jet girth: this is defined as the p T -weighted ∆R separation of tracks from the jet direction: where the index i runs over all tracks associated with the jet of interest.This variable exploits the feature that particles in EJs tend to have a wider angular separation than SM jets because of the large mass of the dark mesons.
Figure 3 shows the distributions of the tagging variables used in the flavor-aligned analysis for data, simulated signal, and multijet background events.The data and simulated events follow the same requirements as those appearing in Fig. 2.
As the jet-track association radius R max = 0.8 is larger than the R = 0.4 value used to perform jet clustering, jets that have nearby soft jets can be misidentified as EJs.To remove these candidates, a modified N-subjettiness variable τ n is calculated as follows: 1.For each of the four candidate high-p T jets, we consider all jets within ∆R < 0.8 of the selected jets that have p T > 30 GeV in decreasing order of jet p T .These will be used as the "subjet" collection to calculate τ n .Using this definition, all candidate jets will have at least one subjet: the candidate jet itself.
2. After determining the subjet collection assigned to each candidate jet, the computation of τ n is then carried out up to the leading n subjets: similar to the definition of the original N-subjettiness [51].The index i runs over the tracks associated with the candidate jet, and j runs from 1 up to n for the subjet collection assigned to the selected jet.
The requirement that τ 2/1 = τ 2 /τ 1 > 0.5 is applied to all EJ candidates and is found to reliably suppress the misidentification of SM jets with nearby soft jets as EJs.

The ML-based EJ tagging
The ML-based tagger uses a graph neural network (GNN) based on PARTICLENET [52] to directly incorporate the track information.Two separate GNNs are trained to classify EJs: one for the unflavored model (uGNN) and the other for the flavor-aligned model (aGNN).Each is trained, validated, and tested using all of the EJs from the signal samples and an equal number of background jets from background samples, weighted equally.The training, validation, and testing data subsets make up 60%, 15%, and 25% of the full data set, respectively.The output of each GNN is a score ranging from 0 to 1, which is a measure of the probability that the jet is an EJ.Tracks are associated with a jet using R max = 0.8.To maximize the available information for the ML training, the tracks are not required to satisfy a d max z requirement.Each network is trained on all jets originating from a dark quark in the signal models of interest, and an equal number of jets taken from SM QCD simulation as the background sample.
The ∆η and ∆ϕ between each track and its associated jet are used as the track coordinate variables in the jet space.Each track in the jet is represented by a 5-feature vector containing: • ∆R(track, jet), as particles in EJs tend to have a wider angular separation than in the SM jets because of the heaviness of the dark mesons.
• ln(p track T /1 GeV), ln(p track T / ∑ i p i T ), as the combination of the dark shower and the decay of the mesons back to the SM sector causes the p T of tracks to be smaller on average for EJs than for SM jets.
• T(d xy ), T(d z ).The transformation function T(x) is applied to the track displacement variables, to reduce the range of values input to the GNN while preserving the variables' sign and continuity.It is defined as: This transformation was found to give comparable or better performance than a standard scaling.
The impact parameter d xy is the most influential feature.Figure 4 shows the output score distributions for signal and background, demonstrating good separation.5.
The uncertainties in the SM multijet simulation are too small to be visible.The systematic uncertainties in the simulated signal distributions are small and have been omitted for reasons of clarity.The sums of the entries are normalized to unity.

Determining the EJ tagging and event selection criteria
Searches for each of the signal models listed in Tables 1 and 2 are conducted.An optimization procedure is performed for each signal model to determine the best threshold on the event H T , jet p T , and the EJ tagging variables by maximizing σ opt , defined as: where S (B) is the number of simulated signal (background multijet) events passing the selection thresholds and β is the estimated relative systematic uncertainty in the background, taken to be 10%.To reduce the number of sets of selection criteria, similar selection sets are grouped, and a representative set that gives at least 90% of the original best significance value is used for all the models in the group.Additional selection sets with higher background selection efficiencies are used to validate the background estimation calculations.Their selection criteria are detailed in Section 6.
The selection criteria used for EJ tagging are shown in Tables 3-5, and those for the event selection in Table 6.In the model-agnostic approach, longer signal model lifetimes in the unflavored scenario tend to favor selection criteria with larger d xy selection thresholds.On the other hand, in the flavor-aligned scenario, longer signal model lifetimes favor selection criteria with higher d min xy thresholds.In the ML-based approach, GNN EJ taggers generally exhibit consistent performance across varying signal model lifetimes, and the signal models with larger mediator particle masses generally are targeted by selection criteria with higher H T threshold.The specific selection criteria employed for each EJ model, along with the signal selection efficiency for each model under various selection criteria, are documented in the HEPdata record [22].
For the unflavored model, at cτ π dark = 45 mm, the model-agnostic taggers yield a maximum signal selection efficiency of ≈40%, while the ML-based taggers yield a maximum signal selection efficiency of ≈60%.The signal selection efficiency of the model-agnostic taggers drops to a few percent for very short-lived (≈1 mm) dark pions, and the efficiency of both taggers drops to a few percent for very long-lived (≈1000 mm) dark pions.For the flavor-aligned model, at cτ max π dark = 45 mm, the model-agnostic taggers yield a maximum signal selection efficiency of ≈25%, while the ML-based taggers yield a maximum signal selection efficiency of ≈40%.This efficiency remains fairly stable along the full maximum proper lifetime range for both taggers.

Background estimation
The signal region (SR) for this analysis is constructed from events with two or more tagged EJs.The main source of background for this analysis is the production of four SM jets, where two or more of these jets are misidentified as EJs.We estimate the number of SM events passing each selection criteria in the SR by constructing a control region (CR) with identical event and jet kinematic requirements, but where exactly one jet is tagged as an EJ.We estimate the fraction of signal events in the CR to be no more than 10 −5 .The probability of an SM jet being misidentified as an EJ (mistag rate) is heavily dependent on the underlying jet flavor, as SM jets containing b hadrons also have displaced tracks.We therefore estimate the misidentification probability ϵ( f , p T ) as a function of the underlying jet flavor f , where f is b for b jets and q for other jet flavors, and the jet p T .
When the probability of a jet being misidentified as an EJ is characterized and parameterized using p T , flavor fraction, and jet type, the number of background events in the SR is estimated using counts in the CR and the mistag rate according to the following formula: where ϵ i is an abbreviation of ϵ( f i , p i T ), the estimated mistag rate of jet i in each event.The jet indices in the summations and products run over all jets that are not EJ tagged.The denominator calculates the fraction of events that have exactly one EJ-tagged jet, while the three terms in the numerator calculate the fractions of events with two, three, and four EJ-tagged jets, respec-tively.The common factor for the estimated mistag rate for the EJ-tagged jet exists in both the numerator and denominator, and therefore does not appear in the expression.The factorials in the numerator are needed to correct for repeated counting in the unordered sum over the jet indices.For example, the first term in the numerator needs a factor 2! to correctly handle events where the two highest p T jets are tagged as EJs and the next two jets are not tagged; without this factor, the unordered sum would count these events twice.A detailed derivation of the equation is given in Ref. [53].Because the underlying flavor of each jet is difficult to determine, we approximate ϵ i in Eq.( 4) using a flavor-fraction-averaged misidentification rate ϵ avg (p T ) and the estimated b jet fraction F CR b of all non-EJ-tagged jets in the CR: The value of F CR b is estimated by fitting the CMS DEEPJET discriminator [47] spectrum of the non-EJ-tagged jets in the CR to two template distributions obtained from SM multijet MC events.One template contains the discriminator value for jets identified using generator-level information as containing a b quark, and the complement template is used for the other jets.The template distributions are varied within the measured uncertainties [48,49].The normalization factor of the b quark template, F CR b , is the only free parameter.The fit is performed for each of the selection criteria listed in Table 6. Figure 5 gives an example of the fit performance for the "u-set validation" and the "uGNN validation" selection criteria, defined in Table 6.
138 fb 1 (13 TeV)  The evaluation of the mistag rate is performed in a signal-free region (FR) that consists of events passing a high-p T photon trigger and containing an isolated photon in the ECAL barrel with p T > 200 GeV.The FR events are required to have either one or two jets passing the jet selection criteria described in Section 5.1.To obtain the misidentification probability for different jet flavors, we further divide the FR into a b-enriched region (ER) and a b-depleted region (DR) by imposing b tagging requirements on an additional jet with p T > 50 GeV and |η| < 2.4 that passes a set of noise-rejection criteria [50].As a significant fraction of heavy-flavored jets originate from gluon splitting in this event sample, b tagging requirements on this extra jet will change the b jet fraction of the selected jets.Events are classified as ER if the additional jet passes DEEPJET b tagging at the medium working point, and are classified as DR if the additional jet fails DEEPJET b tagging at the loose working point.Assuming that the ER and DR have the same mistag probabilities for the selected jets, and the only difference between these regions is the overall b jet fraction, the misidentification rate for all the selected jets in a specific region XR, ϵ XR (p T ), can be expressed as a linear combination of ϵ(b, p T ) and ϵ(q, p T ):

CMS
where F XR b (p T ) is the estimated b jet fraction of region XR in bins of jet p T .The b jet fraction is obtained by fitting the DEEPJET b discriminator distribution to templates obtained using jets from a simulated γ+jets sample.On average, the selected jets in the ER and DR have b jet fractions of 15% and 4%, respectively.The linear relation in Eq. ( 6) can then be inverted to obtain the mistag rates for different underlying jet flavors ϵ( f , p T ), which is used in Eq. ( 5) to estimate the misidentification rate of jets in the CR. Figure 6 gives examples of the estimated EJ tagger misidentification probabilities for "u-tag 1" and "uGNN tag 1" taggers, defined in Tables 3 and 5.
Because the Phase 1 upgrade of the pixel detector [24] improved the track reconstruction performance, the evaluation of the expected background is performed separately for the pixel Phase 0 geometry (2016) and Phase 1 geometry (2017-2018).

Validation tests
To validate the background estimation by checking the soundness of the calculations, closure tests are performed in validation regions (VRs) that use selection criteria that are orthogonal to the SRs.The VRs are defined using signal-like selection criteria that require at least two jets passing a validation EJ tagger.The observed number of events in the VR is compared to the number predicted using our background estimation technique.For the model-agnostic EJ tagging VR, the H T requirement of the validation event selection is inverted to ensure orthogonality with the SRs and small signal contamination.The EJ identification requirements for the validation EJ tagger are based on the u-tag 1 and a-tag 1 requirements for unflavored and flavor-aligned models, respectively, with one jet-variable selection requirement relaxed to further reduce the effect of any small signal contamination.The VR for the ML-based EJ tagging uses a validation tagger that selects jets with GNN scores in a region disjoint from the SRs.To further increase the number of events that pass the ML-based VR, we relax the H T and jet p T requirements.The full list of the selection criteria used for the VRs is given in Tables 3-6.The signal contamination in these VRs is less than 1% for all surveyed signal models.
The number of events passing the VR selection criteria with at least two jets passing the validation jet tag, as well as the predicted number using the estimation described in Eq. ( 4), is given in Table 7.No significant deviation between the observed and the estimation results is observed, indicating that the background estimation calculation is robust.The EJ tagger misidentification probability for b quark jets (red, orange) and light jets (light blue, dark blue) as a function of jet p T for the model-agnostic tagger "u-tag 1" (left) and the ML-based tagger "uGNN tag 1" (right), as defined in Tables 3 and 5, evaluated using data (red, dark blue) and generator-level flavor information from simulated samples (orange, light blue) in events containing a high-p T photon.The lower panel shows the pull, defined as the difference between the mistag rate calculated in simulation and mistag rate measured in data, scaled down by the uncertainty measured in data.The error bars indicate the uncertainties in the mistag rates measured in simulation scaled by the uncertainties measured in data.

Background uncertainties
The main sources of systematic uncertainty in this search arise from the background estimation method based on control samples in data.The kinematic differences between the SM jets in the CR photon-triggered data used for the misidentification rate calculation and the SM jets in the H T -triggered data used in the SR may lead to slightly different misidentification rates.We estimate the corresponding uncertainties by applying the background prediction method in Eq. ( 4) to simulated SM multijets events, using either the mistag rate from the SM multijet simulation or from the γ+jets simulation.The uncertainty is taken as the difference in the background predictions obtained with the two mistag rates.There is also an uncertainty in the flavor composition used in the misidentification rate estimations.This uncertainty is estimated from simulated background events by comparing the flavor-decomposition estimate described in Section 6 with one derived from generator-level flavor information.Finally, there is the uncertainty associated with the choice of variables used to parameterize the mistag rate.This uncertainty is estimated by comparing the estimation results when parameterizing the mistag rate as a function of jet p T versus track multiplicity, and as a function of coarse versus fine p T binning, as the mistag rate has a significant dependence on the parameter chosen and binning scheme used.The uncertainties in the flavor composition and mistag rate parameterization are larger on average than the uncertainty due to CR and SR differences.The modelagnostic method is more affected by changes in the chosen mistag rate parameter because the jet-level variables used depend on track multiplicity.The GNN method is more sensitive to Table 7: The observed yield of events in data satisfying the validation selection criteria with at least two jets passing the corresponding validation tag, and the estimation based on the misidentification rate calculated using validation events with exactly one jet passing the validation tagger scaled by the factor given in Eq. ( 4).The statistical and systematic uncertainties are reported for the estimated yields.the mistag rate binning scheme because of statistical fluctuations due to the small number of jets in some of the bins.The contribution of all the background uncertainty sources for various signal model search regions is summarized in Table 8.These three uncertainty sources are considered to be uncorrelated and summed in quadrature.The resulting uncertainties are given in Tables 7 and 10.The difference between the observed and estimated yields in the a-set validation, shown in Table 7, was investigated.No anomaly was found in the data, and thus it is assumed to arise from a statistical fluctuation.

Signal uncertainties
Various sources of uncertainty affecting the signal yields are also considered.The integrated luminosity uncertainties for the 2016, 2017, and 2018 data-taking periods are 1.2, 2.3, and 2.5%, respectively [54][55][56].There is a small difference in the trigger efficiency between data and simulation, with a ratio varying between 0.95 and 1.00.A correction factor compensates for this difference, and its statistical uncertainty is propagated to an uncertainty in the signal acceptance.The evaluation of jet energy correction uncertainties is performed as a function of jet p T and η, propagated to all jet-related kinematic variables, and then to an uncertainty in the signal acceptance.Uncertainty sources are treated as fully correlated across the years for the jet energy scale (JES), and uncorrelated for the jet energy resolution (JER).While the pileup distribution in the simulation is reweighted to match the data, there is a 4.6% uncertainty in the total inelastic cross section [57], which is treated as fully correlated across the years.The impact parameter distribution is slightly broader in the data CRs than in simulation.A smearing function that corrects this is applied to tracks in the simulated samples, and the resulting changes to the signal acceptances are used to estimate the modeling uncertainty.The corrected values are also used to recompute the GNN score.The uncertainty in the acceptance from the uncertainties in the PDFs is evaluated by reweighting events with different PDF variations [35,58].As recon-struction efficiency differences between data and simulation for displaced tracks are difficult to evaluate, we varied the reconstruction efficiency for tracks with d xy > 4 cm (approximately the distance to the second layer of the pixel detector) by 10% and found an ≈1% overall change in signal acceptance.As the measured difference in reconstruction efficiency in data and MC is expected to be much smaller [59,60], no uncertainty is assigned.The uncertainty from the factorization scale (µ F ) and renormalization scale (µ R ) choices is estimated by independently varying µ F and µ R by factors of 2 and 0.5 [61,62].A summary of the estimated uncertainties for the various signal model acceptances is given in Table 9.

Results
After the full event selection, the observed and expected event yields corresponding to each selection set are shown in Table 10.No statistically significant deviation from the SM background prediction is observed.The observation is used to set upper limits on the various signal parameter models considered using the CL s criterion [63,64].The test statistic is defined as the likelihood ratio employed in Higgs boson analyses in ATLAS and CMS, as elaborated in Ref. [65].Upper limits at the 95% confidence level (CL) on the production cross section in the 2-dimensional plane defined by the signal parameters are shown in Figs.7 (m π dark = 10 GeV) and 8 (m π dark = 20 GeV) for both the unflavored and the flavor-aligned scenarios, and in Fig. 9 (m π dark = 6 GeV) for the flavor-aligned scenario.These excluded cross sections are compared to the theoretical prediction to derive the exclusion regions shown as red and black curves for the expected and observed 95% CL limit, respectively.The red curves also have an associated red band to indicate the 68% CL variation on the expected exclusion limit.
In the unflavored coupling model, the key tagging variable for the model-agnostic method is the mean d xy of the associated tracks in the jet.In the case where the dark pion lifetime is too short, the displaced tracks from the dark pion decay products will be very similar to prompt SM tracks.In the opposite case, where the dark pion lifetime is very large, the dark pions increasingly decay outside the tracker.Thus, this search method is less sensitive to these two cases, giving reduced performance, as shown in the upper-left plots of Figs. 7 and 8.In contrast, the GNN method performs well even at low dark pion lifetimes, as the GNN leverages information from track-level relationships within EJs to keep the signal acceptance high.However, signal sensitivity is still limited for long dark pion lifetimes, as shown in the upper-right plots of Figs. 7 and 8.In the flavor-aligned scenario, the differentiating power for the model-agnostic method comes mainly from the multiplicity of associated tracks with large displacements.Unlike d xy , the multiplicity distribution is more uniform for different dark pion lifetimes (as we only consider cτ max π dark above the typical b hadron lifetime) leading to less dependence on cτ max π dark as seen in the lower-left plots of Figs. 7 and 8. Similarly, the GNN sensitivity is less dependent on changes in lifetime, resulting in a stable signal acceptance across the cτ max π dark range.In the flavor-aligned models where m π dark = 6 GeV, the dark pions decay predominantly to light SM quarks because of mass constraints, resulting in fewer displaced tracks in the events.This reduces the selection efficiencies for these signal models, leading to the model-agnostic method having less sensitivity compared to models with larger m π dark , as shown in Fig. 9.
The ML-based EJ tagging methods yield more stringent limits on the surveyed models.In the unflavored model, X dark masses up to 1900 and 1950 GeV are excluded at 95% CL for m π dark of 10 and 20 GeV, respectively, closely matching the expected limits.For the flavor-aligned models, the X dark mass exclusion at 95% CL increases to 1950, 1850, and 1900 GeV for m π dark of 6, 10, and 20 GeV, respectively, again matching well the expected limits.Relative to the modelagnostic tagger, the greatest increase in sensitivity is for the unflavored signal models with short lifetimes, where the dark showers generate nearly prompt tracks that are difficult to tag using simple requirements on the track displacement variables.

95% CL upper limit on cross section [fb]
Figure 7: The 95% CL upper limits on the production cross section for various signal models in the unflavored scenario (upper plots) and the flavor-aligned scenario (lower plots) with m π dark = 10 GeV using the model-agnostic (GNN) EJ tagging method, on the left (right).The red curve is the expected exclusion limit, with the band representing its 68% CL variation.The black curve is the observed limit.The dark blue dotted curves in the upper plots are the expected and observed limits previously obtained by CMS [21].
For the flavor-aligned scenario in the model-agnostic approach, the exclusion range on m X dark initially decreases as cτ max π dark increases.This is due to the decline in signal selection efficiency caused by the loss in track reconstruction efficiency, particularly for more displaced tracks.As cτ max π dark continues to increase, the displacement of the dark pion with the next longest lifetime, calculated from Eq. (2), becomes significant for models with m π dark = 10 GeV, resulting in a recovery in signal selection efficiency.However, for models with m π dark = 6 GeV, the decay of dark pions to b b is constrained due to mass limitations.Consequently, the fraction of longlived dark pions is higher compared to models with m π dark = 10 GeV, leading to decreasing signal efficiency even with longer lifetimes.
The GNN-based limits for the flavor-aligned models that utilize the aGNN set 2 selection have a  For that selection set, 0.3 background events are predicted, which implies that the most frequent observation will be exactly zero events if the background-only hypothesis is correct.Because the likelihood function is constructed from Poisson probabilities, this leads to the backgroundonly test statistic distribution exhibiting discrete, narrow peaks.With a free-floating signal strength, the test statistic distribution for the signal plus background hypothesis shares similar features.As a result, only a small change in the signal strength is required to obtain a CL s value of 0.05 when going from 95 to 68% CL to the expected median of the background expectation.The resulting expected limit bands in this region are narrow and asymmetrical.The 95% CL upper limits on the production cross section for various signal models in the flavor-aligned scenario with m π dark = 6 GeV using the model-agnostic (GNN) EJ tagging method, on the left (right).The red curve is the expected exclusion limit, with the band representing its 68% CL variation.The black curve is the observed limit.

Summary
A search for emerging jet signatures arising from a strongly interacting dark sector produced in proton-proton collisions has been presented, using data corresponding to an integrated luminosity of 138 fb −1 at √ s = 13 TeV.The signal model contains a family of dark quarks that couple to the standard model (SM) quarks via a scalar mediator X dark .Dark pions (π dark ) with a significant lifetime (cτ π dark ) are produced by the hadronization of the dark quarks; these then decay to SM particles at vertices displaced from the proton-proton interaction point.As the scalar mediator is assumed to be produced in pairs, and each decays to an SM quark and a dark quark, the signature of this process is two SM jets plus two jets of particles with constituents emerging from displaced vertices.
Both unflavored and flavor-aligned couplings between the SM quarks and the dark quarks are examined in the search.Events are selected using either a traditional cut-based approach or a graph neural network to identify emerging jets, in combination with other event-level selection criteria.The overall selection requirements are optimized for each coupling scenario and for different combinations of the mediator particle mass, dark pion mass, and dark pion lifetime.No excess of events beyond the SM expectations is found, and the observed 95% confidence level exclusion limits agree with the expected limits.For the unflavored model, dark mediator masses m X dark < 1950 GeV are excluded for cτ π dark ≈ 100 mm and m π dark = 10 GeV, while the flavor-aligned model result excludes m X dark < 1850 GeV at cτ max π dark ≈ 500 mm for m π dark = 10 GeV.This result surpasses the previous search for emerging jets in the unflavored scenario, increasing the experimental limit of the dark mediator particle by ≈500 GeV to set the most stringent limits to date, and provides the first direct exclusion of the flavor-aligned scenario.

Figure 1 :
Figure 1: Feynman diagrams for pair production of dark mediator particles via gluon-gluon fusion (left) and quark-antiquark annihilation (right), with each mediator decaying to an SM quark and a dark quark.

Figure 2 :
Figure 2: Distributions of the jet variables d xy (left) and α 3D with D max N = 4 (right) used for the model-agnostic EJ tagging that targets the unflavored dark sector models are shown for data (points), SM multijet simulation (gray line), and signal jets in simulation (colored lines).The sums of the entries are normalized to unity.The flavor-aligned dark sector model has three different dark meson lifetime ranges: long-lived dark mesons (up to 50 cm for the κ 0 parameters considered in this search), dark mesons with a b-hadron-like displacement from prompt dark meson decays into b quarks, and dark mesons that decay promptly into light SM quarks.In addition, since at least one dark quark in the

Figure 3 :
Figure 3: Distributions of the jet variables used for the model-agnostic EJ tagging targeting flavor-aligned dark sector models for jets obtained in data (points), SM multijet simulation (gray line), and simulated signal jets (colored lines).The distribution of the number of tracks with d xy > 10 −2.2 cm (jet girth) is shown on the left (right).The sums of the entries are normalized to unity.

Figure 4 :
Figure 4: Distributions of the output score of the uGNN (left) and aGNN (right) for the data (points with error bars), SM multijet simulation (dark gray line), and signal simulation (colored lines).The signal distributions in the left (right) plot are generated from the unflavored (flavoraligned) model.Bins are chosen to correspond to the jet selection criteria defined in Table5.The uncertainties in the SM multijet simulation are too small to be visible.The systematic uncertainties in the simulated signal distributions are small and have been omitted for reasons of clarity.The sums of the entries are normalized to unity.

Figure 5 :
Figure 5: Template fit of the DEEPJET discriminator used to determine the b jet fraction of the non-EJ tagged jets for data events that pass the "u-set validation" (uGNN validation) selection criteria shown on the left (right), except with the requirement on the number of EJ-tagged jets changed from 2 to 1.The lower panels show the ratio of the number of jets in the data compared to the sum of the fitted template distributions.

Figure 6 :
Figure6: The EJ tagger misidentification probability for b quark jets (red, orange) and light jets (light blue, dark blue) as a function of jet p T for the model-agnostic tagger "u-tag 1" (left) and the ML-based tagger "uGNN tag 1" (right), as defined in Tables3 and 5, evaluated using data (red, dark blue) and generator-level flavor information from simulated samples (orange, light blue) in events containing a high-p T photon.The lower panel shows the pull, defined as the difference between the mistag rate calculated in simulation and mistag rate measured in data, scaled down by the uncertainty measured in data.The error bars indicate the uncertainties in the mistag rates measured in simulation scaled by the uncertainties measured in data.
limit on cross section[fb]

Figure 8 :
Figure8: The 95% CL upper limits on the production cross section for various signal models in the unflavored scenario (upper plots) and the flavor-aligned scenario (lower plots) with m π dark = 20 GeV using the model-agnostic (GNN) EJ tagging method, on the left (right).The red curve is the expected exclusion limit, with the band representing its 68% CL variation.The black curve is the observed limit.narrow expected band, as shown in the lower right plots of Figs.7-8and the right plot of Fig.9.For that selection set, 0.3 background events are predicted, which implies that the most frequent observation will be exactly zero events if the background-only hypothesis is correct.Because the likelihood function is constructed from Poisson probabilities, this leads to the backgroundonly test statistic distribution exhibiting discrete, narrow peaks.With a free-floating signal strength, the test statistic distribution for the signal plus background hypothesis shares similar features.As a result, only a small change in the signal strength is required to obtain a CL s value of 0.05 when going from 95 to 68% CL to the expected median of the background expectation.The resulting expected limit bands in this region are narrow and asymmetrical.
limit on cross section[fb]

Figure 9 :
Figure9: The 95% CL upper limits on the production cross section for various signal models in the flavor-aligned scenario with m π dark = 6 GeV using the model-agnostic (GNN) EJ tagging method, on the left (right).The red curve is the expected exclusion limit, with the band representing its 68% CL variation.The black curve is the observed limit.
ment Project, and the National Science, Research and Innovation Fund via the Program Management Unit for Human Resources & Institutional Development, Research and Innovation, grant B37G660013 (Thailand); the Kavli Foundation; the Nvidia Corporation; the SuperMicro Corporation; the Welch Foundation, contract C-1845; and the Weston Havens Foundation (USA).

Table 1 :
Model parameters for the unflavored model.

Table 2 :
Parameters used for the flavor-aligned model.In order to probe a range of lifetimes, the values of κ 0 listed in columns 3-7 are tuned to give the desired cτ max π dark values of 5, 25, 45, 100, and 500 mm.In addition, samples were made with fixed κ 0 = 1, with a resultant value of cτ max π dark that depends on the other model parameters.
The signal distributions are similar despite the wide range of dark meson cτ values.

Table 3 :
Emerging jet selection criteria for the model-agnostic analysis designed for the unflavored scenario.The validation regions are discussed in Section 6.The symbols in parentheses indicate a minimum (>) or maximum (<) requirement.

Table 4 :
Emerging jet selection criteria for the model-agnostic analysis designed for the flavoraligned scenario.The validation tag is described in Section 6.The symbols in parentheses indicate a minimum (>) or maximum (<) requirement.

Table 5 :
The GNN score range used to identify a jet as an EJ.The uGNN (aGNN) tag indicates that the tagger uses the output score of the GNN trained on the unflavored (flavor-aligned) simulated signal samples.The validation tags are described in Section 6.

Table 6 :
Event selection criteria used for the analysis.The validation selection criteria are described in Section 6.

Table 8 :
Mean and standard deviation (std.) of the relative uncertainty calculated on the background estimations, by source, in percent.

Table 9 :
Mean and standard deviation (std.) of the relative uncertainty calculated on the unflavored and flavor-aligned samples, by source, in percent.

Table 10 :
The estimated number of events from the background prediction based on control samples in data and the observed event yields.Statistical and systematic uncertainties in the estimated background are provided.