Search for long-lived particles with displaced vertices in multijet events in proton-proton collisions at √ s = 13 TeV

Results are reported from a search for long-lived particles in proton-proton collisions at √ s = 13 TeV delivered by the CERN LHC and collected by the CMS experiment. The data sample, which was recorded during 2015 and 2016, corresponds to an integrated luminosity of 38.5 fb − 1 . This search uses benchmark signal models in which long-lived particles are pair-produced and each decays into two or more quarks, leading to a signal with multiple jets and two displaced vertices composed of many tracks. No events with two well-separated high-track-multiplicity vertices are observed. Upper limits are placed on models of R -parity violating supersymmetry in which the long-lived particles are neutralinos or gluinos decaying solely into multijet ﬁnal states or top squarks decaying solely into dijet ﬁnal states. For neutralino, gluino, or top squark masses between 800 and 2600 GeV and mean proper decay lengths between 1 and 40 mm, the analysis excludes cross sections above 0.3 fb at 95% conﬁdence level. Gluino and top squark masses are excluded below 2200 and 1400 GeV, respectively, for mean proper decay lengths between 0.6 and 80 mm. A method is provided for extending the results to other models with pair-produced long-lived particles.


Introduction
Many theories for physics beyond the standard model (SM) predict the pair production of long-lived particles decaying to final states with two or more jets. Some examples include Rparity violating (RPV) supersymmetry (SUSY) [1], split SUSY [2], hidden valley models [3], and weakly interacting massive particle baryogenesis [4]. Searches for long-lived particles significantly expand the parameter space of physics beyond the SM probed by the experiments at the CERN LHC.
This analysis is sensitive to models of new physics in which pairs of long-lived particles decay to final states with multiple charged particles. We present results for two benchmark signal models, as well as a method for applying the results more generally. The "multijet" benchmark signal is motivated by a minimal flavor violating model of RPV SUSY [5] in which the lightest SUSY particle is a neutralino or gluino, either of which is produced in pairs. The neutralino or gluino is long-lived and decays into a top antiquark and a virtual top squark, and the virtual top squark decays into strange and bottom antiquarks, resulting in a final state with many jets. The "dijet" benchmark signal corresponds to an RPV phenomenological model in which pairproduced long-lived top squarks each decay into two down antiquarks [6]. The diagrams for the multijet and dijet signal models are shown in Fig. 1.  Figure 1: Diagrams for the multijet (left) and dijet (right) benchmark signal models used in this analysis. In the multijet signal model, long-lived neutralinos ( χ 0 ) or gluinos ( g) decay into top, bottom, and strange antiquarks, via a virtual top squark ( t). In the dijet signal model, longlived top squarks decay into two down antiquarks. The charge conjugate processes are also considered.
The experimental signature of long-lived exotic particle pairs is two displaced vertices, each consisting of multiple charged-particle trajectories intersecting at a single point. In this analysis, a custom vertex reconstruction algorithm identifies displaced vertices in the CMS detector. We focus on signals with intermediate lifetimes, corresponding to mean proper decay lengths cτ from 0.1 to 100 mm, by identifying vertices that are displaced from the beam axis but within the radius of the beam pipe. The signal is distinguished from the SM background based on the separation between the vertices: signal events have two well-separated vertices, while background events are dominated by events with only one displaced vertex, usually close to the beam axis.
The CMS Collaboration searched for displaced vertices in proton-proton (pp) collisions at a center-of-mass energy of √ s = 8 TeV in 2012 [7]. This analysis is an updated version of the search, using pp collisions collected at √ s = 13 TeV. It improves upon the previous analysis, because of better background suppression along with a refined procedure for estimating the background and the associated systematic uncertainties. A similar analysis was performed by the ATLAS Collaboration [8]. The CMS and ATLAS Collaborations have also searched for displaced jets [9-12], displaced leptons [13,14], displaced photons [15], and displaced lepton jets [16]. The analysis reported here is sensitive to shorter lifetimes than those probed by previous analyses.

The CMS detector
The central feature of the CMS detector is a superconducting solenoid providing a magnetic field of 3.8 T aligned with the proton beam direction. Contained within the field volume of the solenoid are a silicon pixel and strip tracker, a lead tungstate electromagnetic calorimeter, and a brass and scintillator hadron calorimeter. Muon tracking chambers are embedded in the steel flux-return yoke that surrounds the solenoid. A more detailed description of the CMS detector, together with a definition of the coordinate system used and the relevant kinematic variables, can be found in Ref. [17].
The silicon tracker, which is particularly relevant to this analysis, measures the trajectories of charged particles in the range of pseudorapidity, η, up to |η| < 2.5. For nonisolated particles with transverse momentum, p T , of 1 to 10 GeV and |η| < 1.4, the track resolutions are typically 1.5% in p T , 25-90 µm in the impact parameter in the transverse plane, and 45-150 µm in the impact parameter in the longitudinal direction [18]. Jets are reconstructed from particle-flow [19] candidates using the anti-k T algorithm [20,21] with a distance parameter of 0.4.
Events of interest are selected using a two-tiered trigger system [22]. The first level is composed of custom hardware processors, and the second level consists of a farm of processors running a version of the full event reconstruction software optimized for fast processing.

Event samples
The data sample used in this analysis corresponds to a total integrated luminosity of 38.5 fb −1 , collected in pp collisions at √ s = 13 TeV in 2015 and 2016. Events are selected using a trigger initially requiring H T > 800 GeV, where H T is the scalar sum of the p T of jets in the event with p T > 40 GeV. In the last data-taking period of 2016, corresponding to 22% of the total integrated luminosity, the higher instantaneous luminosity required the H T threshold to be raised to 900 GeV.
Simulated events are used to model the signal processes. In the multijet and dijet signal models, long-lived particles are produced in pairs; the "multijet" and "dijet" refer to the decay of each long-lived particle. For the multijet signals, the long-lived particle is a neutralino that undergoes a three-body decay into top, bottom, and strange quarks. In this analysis, the final results are the same if the neutralinos are replaced with gluinos. For the dijet signals, the long-lived particle is a top squark that decays into two down antiquarks. Signal samples with various neutralino or top squark masses m (300 ≤ m ≤ 2600 GeV) and lifetimes τ (0.1 ≤ cτ ≤ 100 mm) are produced using PYTHIA 8.212 [23] with the NNPDF2.3QED parton distribution functions [24].
Backgrounds arising from SM processes are dominated by multijet and top quark pair production (tt) events. The multijet processes include b quark pair production events. The multijet and tt events are simulated using MADGRAPH 5.2.2.2 [25] with the NNPDF3.0 parton distribution functions [26], at leading order for the multijet events and at next-to-leading order for the tt events.
For all samples, hadronization, showering, and R-hadron physics are simulated using PYTHIA 8.212. The underlying event tunes used are CUETP8M1 [27] for the signal samples and the multijet background samples, and CUETP8M2T4 [28] for the tt samples. The detector response for all simulated samples is modeled using a GEANT4-based simulation [29] of the CMS detector. The effects of additional pp interactions within the same or nearby bunch crossings ("pileup") are included by overlaying additional simulated minimum-bias events, such that the resulting distribution of the number of interactions per bunch crossing matches that observed in the experiment.

Event preselection
For an event to be selected for further analysis, it must have at least four jets, each with p T > 20 GeV and |η| < 2.5. Since the final states for the signal models considered all have at least four quarks, this requirement has little impact on signal events but is beneficial in suppressing background.
To ensure that the efficiency of the H T trigger is well understood, a stricter requirement of H T > 1000 GeV is applied offline, where H T is the scalar sum of the p T of jets with p T > 40 GeV, to match the trigger jet definition. For events with at least four jets and H T > 1000 GeV, the trigger efficiency, determined using events satisfying a trigger requiring the presence of at least one muon, is (99 ± 1)%.

Vertex reconstruction and selection
Displaced vertices are reconstructed from tracks in the silicon tracker. These tracks are required to have p T > 1 GeV; measurements in at least two layers of the pixel detector, including one in the innermost layer; measurements in at least six layers of the strip detector if |η| < 2, or in at least seven layers if |η| ≥ 2; and significance of the impact parameter with respect to the beam axis measured in the x-y plane (the magnitude of the impact parameter divided by its uncertainty, referred to as |d xy |/σ d xy ) of at least 4. The first three criteria are track quality requirements, imposed in order to select tracks with small impact parameter uncertainties. The requirement on track |d xy |/σ d xy favors vertices that are displaced from the beam axis.
The vertex reconstruction algorithm forms seed vertices from all pairs of tracks satisfying the track selection criteria, and then merges them iteratively until no track is used more than once. A set of tracks is considered to be a vertex if a fit with the Kalman filter approach [30] has a χ 2 per degree of freedom (χ 2 /dof) that is less than 5. Subsequently, for each pair of vertices that shares a track, the vertices are merged if the three-dimensional distance between the vertices is less than 4 times the uncertainty in that distance and the fit has χ 2 /dof < 5. Otherwise, the shared track is assigned to one of the vertices depending on the value of its three-dimensional impact parameter significance with respect to each of the vertices: if both values are less than 1.5, the shared track is assigned to the vertex that has more tracks already; if either value is greater than 5, the shared track is dropped from that vertex; otherwise, the shared track is assigned to the vertex with respect to which it has a smaller impact parameter significance. If a track is removed from a vertex, that vertex is refit, and if the fit satisfies the requirement of χ 2 /dof < 5, the old vertex is replaced with the new one; otherwise it is dropped entirely.
This procedure produces multiple vertices per event, only some of which are signal-like. In order to select vertices with high quality, we impose additional requirements: each vertex is required to have at least five tracks; a distance from the detector origin measured in the x-y plane of less than 20 mm, to avoid vertices from interactions in the beam pipe or detector material; a distance from the beam axis measured in the x-y plane, defined as d BV , of at least 0.1 mm, to suppress displaced primary vertices; and an uncertainty in d BV of less than 25 µm, to select only well-reconstructed vertices. The requirement on the uncertainty in d BV also suppresses displaced vertices from single b jets, which are composed of tracks that are mostly aligned with the vertex displacement from the beam axis and have small opening angles between the tracks.
Since signal events contain a pair of long-lived particles, we require events to have two or more vertices satisfying the above requirements. The signal region is composed of these two-vertex events. Simulation predicts there is on the order of 1 background event in the signal region for 38.5 fb −1 of data. However, establishing the possible presence of a signal relies on an accurate determination of the background, and for this we rely on data.
The vertex selection requires each vertex to have five or more tracks, but events with vertices with three or four tracks provide valuable control samples. These control samples, which are used to test the background prediction, have a factor of 10-100 more background events than in the signal region and negligible potential signal contamination. Simulation studies show that events containing 3-track, 4-track, and ≥5-track vertices have similar distributions of event variables, such as H T , number of jets, and quark flavor composition, as well as vertex variables, such as d BV , uncertainty in d BV , and angular separation between tracks.

Search strategy
The signal is discriminated from the SM background using the distance between the two vertices measured in the x-y plane, which is defined as d VV . In signal events, the two long-lived particles are emitted approximately back-to-back, leading to large separations. If an event has more than two vertices, the two vertices with the highest number of tracks are selected for the d VV calculation. In the case in which two vertices have the same number of tracks, the vertex with the higher mass is chosen, where the mass is reconstructed using the momenta of the tracks associated with the vertex, assuming that the particles associated with the tracks have the mass of a charged pion.
We fit the distribution of d VV to extract the signal, using templates to represent the d VV distributions for signal and background. The signal d VV templates are taken directly from simulation, with a distinct template for each signal mass and lifetime. The background template is constructed from events in data that have exactly one vertex, as described in Section 7. Figure 2 shows examples of the d VV distribution for simulated multijet signals with m = 800 GeV and production cross section 1 fb, with the background template overlaid. The distributions depend primarily on the signal lifetime; those for other signal masses and for the dijet signals are similar. The small peaks at low values of d VV are associated with events for which the two vertices are reconstructed from the same long-lived particle, with the effect being larger for the multijet signals.
In the signal extraction procedure, the d VV distribution is broken into three bins: 0-0.4 mm, 0.4-0.7 mm, and 0.7-40 mm. The two bins with d VV > 0.4 mm have low background. This division maximizes the signal significance for scenarios with intermediate and long lifetimes. Figure 3 shows the signal efficiency as a function of signal mass and lifetime in the region d VV > 0.4 mm. The signal efficiency increases with increasing mass because the events are more likely to satisfy the H T trigger requirement. As lifetime increases, the signal efficiency initially increases because of better separation from the beam axis, but then starts to decrease when the lifetime is so long that decays occur more often beyond the fiducial limit at the beam pipe. The efficiency is above 10% for cτ > 0.4 mm and m > 800 GeV.

7 Background template
Displaced vertices in background events arise from one or more misreconstructed tracks overlapping with other tracks. These events are dominated by multijet and tt processes. Background events with two vertices are primarily random coincidences of independently misreconstructed vertices. Accordingly, we construct the two-vertex background template, denoted by d C VV , by combining information from events in data that have exactly one vertex. There are approximately 1000 times more events with only one vertex than there are with two or more vertices, consistently for 3-track, 4-track, and ≥5-track vertices. Table 1 lists the number of events in each of the event categories. Table 1: Event yields in data. The "one-vertex" events have exactly one vertex with the specified number of tracks, and the "two-vertex" events have two or more vertices each with the specified number of tracks. The control samples are composed of the events with 3-track and 4-track vertices, the background template is constructed using the ≥5-track one-vertex events, and the signal region consists of the ≥5-track two-vertex events. Each entry in the d C VV template is calculated from two values of d BV and a value of ∆φ VV , where d BV is the distance measured in the x-y plane from the beam axis to one vertex, and ∆φ VV is the azimuthal angle between the two vertices. The template also includes corrections for the merging of nearby vertices in the vertex reconstruction algorithm and for possible correlations between individual vertices in background events with pairs of b quarks. The following paragraphs describe each of the inputs to the d C VV template construction method. The d BV values are sampled from the distribution shown in Fig. 4 for the ≥5-track one-vertex events in data. The distribution starts at 0.1 mm because of the fiducial requirement imposed to avoid primary vertices, and falls off exponentially. Signal contamination in the one-vertex sample is negligible for values of the signal cross section that have not been excluded by the previous similar analysis [7].
The statistical uncertainty in the d C VV template, taken as the root-mean-square of yields in an ensemble of simulated pseudo-data sets, depends on the number of entries in the parent d BV distribution. To ensure sufficient sampling of the tail of this distribution, the number of entries in the d C VV template is 20 times the number of one-vertex events. Values of ∆φ VV are approximated by sampling the distribution of jets in data. Since background vertices arise from misreconstructed tracks in jets, their position vectors tend to be correlated with jet momentum vectors. The angle between vertex positions can therefore be modeled using the observed distribution of azimuthal angles between pairs of jets, denoted as ∆φ JJ . The ∆φ JJ distribution used for the d C VV construction is taken from the 3-track one-vertex sample, which has a greater number of events than the 4-track and ≥5-track one-vertex samples. There are no significant differences in the ∆φ JJ distribution among these three samples.
To emulate the behavior of the vertex reconstruction algorithm in merging overlapping vertices, the d C VV template is corrected by the survival probability of pairs of vertices as a function of d VV . This efficiency is estimated by counting the number of remaining vertex pairs at each iteration of the vertex reconstruction algorithm. The efficiency correction suppresses small d C VV values, resulting in a yield in the first d C VV bin that is lower by a factor of approximately 2. Pair production of b quarks introduces d BV correlations in two-vertex events that are not ac- counted for when pairing single vertices at random. This is because the tracks from b quark decays are more likely to satisfy the track |d xy |/σ d xy requirement and therefore produce vertices. In simulation, the mean d BV in events with b quarks is higher than in events without b quarks by 47 ± 1 µm for 3-track vertices, by 52 ± 3 µm for 4-track vertices, and by 50 ± 6 µm for ≥5-track vertices. The fractions of events with b quarks are consistent across the 3-track, 4-track, and ≥5-track vertex samples: approximately 50% in one-vertex events and approximately 78% in two-vertex events. We determine corrections to the d C VV template for these d BV correlations by constructing d C VV separately for simulated background events with and without generated b quarks, combining the distributions in the ratio of two-vertex events with and without b quarks, and then dividing the resulting distribution by the nominal d C VV template. The b quark correction enhances larger d C VV values, resulting in a yield in the last d C VV bin that is higher by a factor of 1.6 ± 0.4.
Evidence that the background template construction method is valid is presented in the upper left, upper right, and lower left plots in Fig. 5, where d C VV is compared to the observed twovertex d VV distributions in the low-track-multiplicity control samples in data. There is good agreement between the relative d C VV and d VV populations in each of the three bins of the final fit. For example, in the 3-track control sample, where this agreement is most stringently tested, the ratios d C VV /d VV are 0.93 ± 0.06 in the 0-0.4 mm bin, 0.97 ± 0.07 in the 0.4-0.7 mm bin, and 1.44 ± 0.20 in the 0.7-40 mm bin.
The background template for the signal region is shown in the lower right plot in Fig. 5.

Systematic uncertainties
The signal yield is extracted from a fit of the signal and background templates to the observed d VV distribution. The free parameters are the normalizations of the signal and background templates, subject to the constraint that their combined yield matches the data. The result of the fit relies on the relative yields in the three bins of the templates, but is insensitive to the fine  details of the template distributions. This section describes the systematic uncertainties in the background template. It also addresses the systematic uncertainties in the signal efficiencies and templates.

Systematic uncertainties in signal efficiencies and templates
The signal d VV templates are taken directly from simulation of benchmark models with clearly specified parameters, so the systematic uncertainties arise from biases in the detector simulation and reconstruction. The dominant source of uncertainty is due to the vertex reconstruction efficiency. Smaller effects arise from track resolution, pileup, jet energy scale and resolution, integrated luminosity, and trigger efficiency.
The effect due to the vertex reconstruction efficiency is evaluated by comparing the efficiency in data and simulation to reconstruct signal-like vertices created by displacing tracks artificially. In events passing the preselection requirements (Section 4), we choose some number of light parton and b quark jets that have p T > 50 GeV, |η| < 2.5, and at least four particle-flow candidates. We then artificially displace the tracks associated with those jets as described below.
The magnitude of the displacement vector is sampled from an exponential distribution with scale parameter cτ = 10 mm, restricted to values between 0.3 and 20 mm. The direction of the displacement vector is calculated from the vector sum of the momentum of the jets. This direction is smeared to emulate the difference between the vertex displacement direction and jet momentum direction in signal events due to mismeasurements from tracking inefficiency and missing neutral particles.
The track selection requirements and vertex reconstruction algorithm are applied to the resulting set of tracks. We then evaluate the fraction of events in which a vertex satisfying all vertex selection requirements is reconstructed near the artificial displacement position. This fraction is evaluated for different numbers of displaced light parton or b quark jets. The largest disagreement between data and simulation gives an 11.5% uncertainty per vertex. For two-vertex events, the uncertainty is 23%. Varying the scale parameter of the exponential distribution or the amount that the direction is smeared within reasonable values has negligible effect on the difference between data and simulation.
The difference in vertex reconstruction efficiency between data and simulation could also depend on the magnitude of the artificial displacement. This dependence is found to be small, and the resulting difference in the signal d VV templates has a negligible effect on the signal yield extracted from the fit.
The selection of the tracks used in the vertex reconstruction requires that each track has a value of |d xy |/σ d xy of at least 4. The efficiency of this requirement is sensitive to the impact parameter resolution of the tracks. The mean impact parameter uncertainty is 2% larger in data than in simulation. The magnitude of this effect is quantified by tightening the requirement on the transverse impact parameter significance by 2% and evaluating the change in the signal efficiency. The maximum effect on the various signal masses and lifetimes, 5%, is taken to be the systematic uncertainty in the signal efficiency. This effect is corrected for in the vertex resolution study discussed earlier.
The uncertainties in the jet energy scale and resolution [31] could affect the total jet energy and change the probability that events satisfy the H T selection. Varying the jet energy scale by one standard deviation results in a change in the signal efficiency of 5% or less for all signal samples, and varying the jet energy resolution by one standard deviation changes the efficiency by 2% or less. We therefore assign these as the corresponding systematic uncertainties in the signal efficiency.
The uncertainty in the integrated luminosity is 2.3% for 2015 [32] and 2.5% for 2016 [33]. The uncertainty in the signal efficiency due to pileup is 2%. The uncertainty in the trigger efficiency is 1%. Table 2 summarizes the systematic uncertainties in the signal efficiency. We assume there are no correlations among them, and add them in quadrature to obtain the overall uncertainty.

Systematic uncertainties in the background template
The d C VV background template is constructed from the large sample of events in data with exactly one vertex. Systematic uncertainties in the background template arise from effects that could cause differences between the constructed d C VV distribution and the true d VV distribution of two-vertex background events. The 3-track control sample is used to evaluate the scale of these differences. The deviation from unity of the ratio of the predicted yield in each bin of the d C VV template to the observed yield in the same bin, which is referred to as the closure, is a measure of the systematic uncertainty. Additional uncertainties arise from effects that could compromise the validity of applying the 3-track control sample to the ≥5-track sample.
We check the assumption that closure of the d C VV construction method in 3-track vertices implies closure in ≥5-track vertices by varying the inputs to the template construction procedure and evaluating the resulting shifts in the d C VV template. Constructing d C VV involves sampling two values of d BV and an angle between vertices ∆φ VV , the efficiency to keep pairs of vertices as a function of d VV , and the b quark correction factors. Therefore, the main effects are related to these distributions. We include additional systematic uncertainties to account for possible differences in d C VV predictions due to variations in these distributions from 3-track vertices to ≥5-track vertices.
In background template construction, the ∆φ VV distribution is modeled using the ∆φ JJ distribution in 3-track one-vertex events. The ∆φ JJ distribution in ≥5-track one-vertex events is indistinguishable from that in 3-track one-vertex events. Potential bias could arise if the distribution of angles between jets and vertices differ for 3-track and ≥5-track vertices. Indeed, the correlation between vertex displacement directions and jet directions is smaller for ≥5-track vertices than for 3-track vertices. To probe the impact, we construct d C VV using a variation of the ∆φ VV input in which we assume that the displacement directions are uncorrelated with the jet momentum directions and draw ∆φ VV from a uniform distribution. We then assign the fractional change in the d C VV prediction in each bin as the systematic uncertainty. The template also depends on the probability that pairs of nearby vertices will both survive the vertex reconstruction algorithm as a function of their separation d VV . The efficiency to merge pairs of vertices is determined from the vertex reconstruction algorithm. To assess the uncertainty due to variations in this efficiency, we use a variation of the algorithm in which the seed vertices are composed of five tracks, rather than the usual two. We then construct a variation of d C VV using the resulting efficiency curve and take the fractional change in the d C VV prediction in each bin as the systematic uncertainty.
The corrections to the d C VV template that account for d BV correlations due to the pair production of b quarks are derived using the fraction of simulated 3-track two-vertex events with b quarks. This fraction could differ for ≥5-track two-vertex events. To assess the related systematic uncertainty, we recompute the b quark corrections using the extreme case in which all two-vertex events contain b quarks, and determine the fractional shifts in the d C VV yields in each bin. The statistical uncertainties in the b quark corrections are also taken as systematic uncertainties in the template.
The systematic uncertainty in the background template, d C VV , is estimated using a combination of the closure of the construction method in the control sample of 3-track vertices and the difference in effects from 3-track vertices to ≥5-track vertices. Table 3 lists the shifts arising from these components for each of the three d VV bins, along with their statistical uncertainties. The statistical uncertainties in the shifts take into account the correlation between the default template and the variation. In assessing the overall systematic uncertainty in the background template, we add in quadrature the shifts and their uncertainties, assuming no correlations.  −7 ± 6 −3 ± 7 +44 ± 20 Difference from 3-track to ≥5-track vertices: Modeling of ∆φ VV +4 ± 0 −5 ± 1 −2 ± 3 Modeling of vertex survival efficiency +20 ± 1 −19 ± 2 −26 ± 7 Modeling of b quark correction −11 ± 1 +9 ± 2 +18 ± 9 b quark correction statistical uncertainty ±3 ±9 ±36 Overall systematic uncertainty 25 25 69

Signal extraction and statistical interpretation
To determine the signal yield, we perform binned shape fits of the signal and background templates to the d VV distribution using an extended likelihood method [34].
The background template is constructed from the one-vertex events in data, while the signal templates are produced directly using the d VV distributions from simulation. There is one signal template for each signal model, mass, and lifetime.
The lower right plot in Fig. 5 compares the d C VV and d VV distributions in the signal region. The observed number of events in each bin, along with the predictions from the background-only fit and from example signal models, are listed in Table 4. The background-only fit normalizes the prediction from the d C VV background template to the observed number of two-vertex events. For the signal-plus-background fits, the signal yield is constrained to be nonnegative. Since there is only one two-vertex event in the data, falling in the 0-0.4 mm d VV bin, the fits to the observed distribution prefer zero signal yield. Table 4: For each d VV bin in ≥5-track two-vertex events: the predicted background yield from the background-only fit, the observed yield, and the predicted signal yields for simulated multijet signals with m = 2000 GeV, production cross section 1 fb, and cτ = 0.3, 1.0, and 10 mm. The systematic uncertainties in the predicted background yields reflect the fractional systematic uncertainties given in Table 3, and the uncertainties in the predicted signal yields reflect the fractional systematic uncertainty given in Table 2.
Predicted multijet signal yields d VV range Fitted background yield Observed 0.3 mm 1.0 mm 10 mm 0-0.4 mm 0.51 ± 0.01 (stat) ± 0.13 (syst) 1 2.8 ± 0.7 3.5 ± 0.8 1.0 ± 0.2 0.4-0.7 mm 0.37 ± 0.02 (stat) ± 0.09 (syst) 0 2.0 ± 0.5 3.7 ± 0.9 0.5 ± 0.1 0.7-40 mm 0.12 ± 0.02 (stat) ± 0.08 (syst) 0 1.1 ± 0.3 11 ± 3 31 ± 7 Upper limits on the signal cross section are set using a Bayesian technique [35]. A uniform prior is taken for positive values of the signal cross section. The signal efficiency is constrained by a log-normal prior with a width of 24%, reflecting the overall uncertainty in the signal efficiency ( Table 2). The only assumed uncertainty in the shape of the signal templates is that due to the finite number of events in the simulation; this uncertainty can be as large as 20% for the lower lifetime and mass samples that have small efficiencies. For the uncertainty in the background, log-normal priors are taken for the yield in each bin, with widths given by the fractional uncertainties listed in Table 3. Figure 6 shows, as a function of lifetime and mass, the observed 95% confidence level (CL) upper limits on the product of the signal pair production cross section and the square of the branching fraction for its decay (σB 2 ) for both the multijet and dijet signals. The expected limits are similar. Exclusion curves are overlaid, assuming gluino and top squark pair production cross sections [36] and 100% branching fraction, for both the observed and expected 95% CL upper limits. The upper limits reflect the signal efficiencies shown in Fig. 3, initially improving as lifetime increases, but worsening at approximately 40 mm due to the fiducial limit at the beam pipe. As an example, for a neutralino with mass of 800 GeV and cτ = 1 mm, the observed 95% CL upper limit on σB 2 is 0.3 fb. For mean proper decay lengths between 0.6 and 80 mm, gluino masses are excluded below 2200 GeV, and top squark masses are excluded below 1400 GeV. Figure 7 shows the upper limits as a function of mass for several values of cτ, and Fig. 8 shows the upper limits as a function of cτ for several values of the mass.
In Fig. 8, the narrowing of the expected limit bands above cτ = 2 mm is due to the correlation between the signal lifetime and the relative signal yields in the three d VV bins. The low background yield causes the discrete nature of the Poisson distribution to have an effect: the pseudo-data sets used to calculate the distribution of expected limits have a limited number of combinations of yields in each bin. For example, for a simulated multijet signal with m = 1600 GeV and cτ = 4 mm, the signal is concentrated almost entirely (>90%) in the last bin. The majority of pseudo-data sets that are different in only the first two bins then have nearly the same expected limit value. The bands widen above cτ = 20 mm with the reappearance of signal in the first bin due to the effect described in Section 6 in which two vertices are reconstructed from the same long-lived particle, an effect that is larger for the multijet signals. 38.5 fb Figure 6: Observed 95% CL upper limits on σB 2 for the multijet (left) and dijet (right) signals as a function of mass and mean proper decay length. The upper plots span cτ from 1 to 100 mm, and the lower plots span cτ from 0.1 to 1 mm. The overlaid mass exclusion curves assume gluino pair production cross sections for the multijet signals and top squark pair production cross sections for the dijet signals, and 100% branching fraction.

Extending the search to other signal models
This search for displaced vertices applies to other types of long-lived particles decaying to multiple jets. Here we present a generator-level selection that can be used to reinterpret the results of our analysis. For signal models in which there are two long-lived particles, this generator-level selection approximately replicates the reconstruction-level efficiency. The selection is based on the number and momenta of generated jets in the event, the displacements of the long-lived particles, and the momenta of their daughter particles. The generated jets are those clustered from all final-state particles except neutrinos, using the anti-k T algorithm with a distance parameter of 0.4, but are rejected if the fraction of energy from electrons is greater than 0.9 or if the fraction of energy from muons is greater than 0.8. The daughter particles are the u, d, s, c, and b quarks, electrons, muons, and tau leptons from the decay of the long-lived particle, and we consider those with an impact parameter with respect to the origin measured in the x-y plane of at least 0.1 mm. The generated jets and daughter particles are required to satisfy p T > 20 GeV and |η| < 2.5.
The criteria of the generator-level selection are as follows: at least four generated jets; H T > 1000 GeV, where H T is the scalar sum of the p T of generated jets with p T > 40 GeV; for each long-lived particle, a distance of the decay point from the origin measured in the x-y plane of between 0.1 and 20 mm, and a value of Σp T of the daughter particles of at least 350 GeV; and a distance between the decay points of the long-lived particles measured in the x-y plane of at least 0.4 mm. In calculating the Σp T of the daughter particles, we multiply the p T of b quark daughter particles by a factor of 0.65. This accounts for the lower reconstruction-level efficiency due to the lifetime of heavy flavor particles, which can impede the association of their decay products with the reconstructed vertices.
This generator-level selection replicates the reconstruction-level efficiency with a typical accuracy of 20% for a variety of models for which the signal efficiency is high (>10%). In the region with d VV > 0.4 mm, there are no observed events.

Summary
A search for long-lived particles decaying into multijet final states has been performed using proton-proton collision events collected with the CMS detector at a center-of-mass-energy of 13 TeV in 2015 and 2016. The data sample corresponds to an integrated luminosity of 38.5 fb −1 . No excess yield above the prediction from standard model processes is observed. At 95% confidence level, upper limits are placed for models of R-parity violating supersymmetry in which the long-lived particles are neutralinos or gluinos decaying solely into multijet final states or top squarks decaying solely into dijet final states. The data exclude cross sections above approximately 0.3 fb for particles with masses between 800 and 2600 GeV and mean proper decay lengths between 1 and 40 mm. For mean proper decay lengths between 0.6 and 80 mm, gluino masses below 2200 GeV and top squark masses below 1400 GeV are excluded. While the search specifically addresses two models of R-parity violating supersymmetry, the results are relevant to other models in which long-lived particles decay to final states with multiple tracks, and a method to extend the search to other signal models is provided. For the models considered, the results provide the most restrictive bounds to date on the production and decay of pairs of long-lived particles with mean proper decay lengths between 0.1 and 100 mm.

Acknowledgments
We congratulate our colleagues in the CERN accelerator departments for the excellent performance of the LHC and thank the technical and administrative staffs at CERN and at other CMS institutes for their contributions to the success of the CMS effort. In addition, we gratefully acknowledge the computing centers and personnel of the Worldwide LHC Computing Grid for delivering so effectively the computing infrastructure essential to our analyses. Finally, we acknowledge the enduring support for the construction and operation of the LHC and the CMS detector provided by the following funding agencies: BMWFW and FWF (Aus   [10] ATLAS Collaboration, "Search for pair-produced long-lived neutral particles decaying in the ATLAS hadronic calorimeter in pp collisions at √ s = 8 TeV", Phys. Lett. B 743 (2015) 15, doi:10.1016/j.physletb.2015.02.015, arXiv:1501.04020.