A search for new resonances in multiple final states with a high transverse momentum $Z$ boson in $\sqrt{s}=13$ TeV $pp$ collisions with the ATLAS detector

A generic search for resonances is performed with events containing a $Z$ boson with transverse momentum greater than 100 GeV, decaying into $e^+e^-$ or $\mu^+\mu^-$. The analysed data collected with the ATLAS detector in proton-proton collisions at a centre-of-mass energy of 13 TeV at the Large Hadron Collider correspond to an integrated luminosity of 139 fb$^{-1}$. Two invariant mass distributions are examined for a localised excess relative to the expected Standard Model background in six independent event categories (and their inclusive sum) to increase the sensitivity. No significant excess is observed. Exclusion limits at 95% confidence level are derived for two cases: a model-independent interpretation of Gaussian-shaped resonances with the mass width between 3% and 10% of the resonance mass, and a specific heavy vector triplet model with the decay mode $W'\to ZW \to \ell\ell qq$.


Introduction
One of the key components of the Large Hadron Collider (LHC) physics programme is the search for new resonances predicted by hypotheses for physics beyond the Standard Model (BSM) [1][2][3][4][5][6][7][8][9]. A large number of searches have been performed in proton-proton ( ) collisions at the LHC. Most searches are optimised for some specific final states motivated by a particular BSM model, or a class of models, and are therefore model-dependent, e.g. those associated with a boson . No new physics has been found so far by these model-specific searches. To expand the search program, generic searches can be sensitive to models that do not have the same characteristics of those already searched for and give more ability for reinterpretation for theories yet to be proposed. This article presents a generic resonance search with both model-independent and model-dependent interpretations in events with a leptonically decaying boson with high transverse momentum, T . It is generic since the search is performed in multiple final states and it is sensitive to new particles in various production and decay modes. It is also model-independent since minimal features of BSM physics are assumed. Specifically, BSM processes are assumed to produce a resonance decaying into high-transversemomentum objects, in association with, or including, a boson, whereas the SM background is assumed to be smoothly falling over the observables of interest. Two generic cases are considered, as shown schematically in Figure 1: , where the new resonance, represented as , decays to a final state of SM particles, and → → , where the resonance appears in production as an intermediate state that subsequently decays into a boson and SM particles, possibly via another intermediate resonance .
To probe a variety of possible BSM processes in a single analysis, events are separated into six exclusive categories according to the identity of the leading-T object in . The leading-T object can be an electron, a muon, a photon, a small-jet reconstructed with a radius parameter of = 0.4 containing a -hadron, a small-jet without a -hadron, or a large-jet with = 1. Invariant mass distributions of all the visible objects in the final state, either excluding the leptons from the boson decay ( ) or including the leptons from the decay ( ) in each of the six event categories, are defined as observables of interest. The distributions are examined with the BumpHunter algorithm [32, 33] for local excesses above a data-driven estimate of the smoothly falling SM background. Depending on the event categories and mass observables, the search is performed so as to be sensitive to a large mass range from around 200 GeV up to 6 TeV. where is a new resonance to be searched for while can be a new resonance or SM particle.
Generic searches with minimal beyond-the-SM signal assumptions were performed by the D0 Collaboration [34][35][36][37] and the CDF Collaboration [32,38] at the Tevatron, and by the H1 Collaboration [39,40] at HERA. At the LHC, a general search was performed by each of the ATLAS [41] and CMS [42] collaborations, using 3.2 fb −1 and 35.9 fb −1 of data at a centre-of-mass energy ( √ ) of 13 TeV, respectively. ATLAS also performed searches in final states containing two-leptons as well as in three-and four-leptons [43,44]. However, final states including a resonance were not explicitly considered in general searches although decays through weak bosons are often possible in BSM models. Requiring a high-T boson may reduce the background enough to reveal signals which are not visible in an inclusive sample and also explores a signature not explicitly considered in previous general searches. Furthermore, a leptonically decaying boson is a fully reconstructable final state with straightforward ways to reduce the multĳet background which is challenging to model. The analysis uses the full data sample from √ = 13 TeV collisions recorded by the ATLAS detector in Run 2 at the LHC, corresponding to an integrated luminosity of 139 fb −1 . The results are interpreted in a model-independent way in terms of Gaussian-shaped signals with relative mass widths of 3%, 5%, and 10% for all the event categories. Furthermore, the results from the dominant leading large--jet event category are interpreted in a Heavy Vector Triplet (HVT) model [45,46] in order to demonstrate the sensitivity of the analysis.

ATLAS detector
The ATLAS detector [47] covers nearly the entire solid angle around the collision point. 1 It consists of an inner tracking detector (ID) surrounded by a thin superconducting solenoid providing a 2 T axial magnetic field, electromagnetic (EM) and hadronic calorimeters, and a muon spectrometer (MS).
The ID provides charged-particle tracking in the range | | < 2.5. It consists of silicon pixel, silicon microstrip, and transition radiation tracking detectors. The high-granularity silicon pixel detector covers the vertex region and typically provides four measurements per track, the first hit normally being in the insertable -layer [48,49]. It is followed by the silicon microstrip tracker, which usually provides eight measurements per track. These silicon detectors are complemented by the transition radiation tracker, which enables radially extended track reconstruction up to | | = 2.0 and contributes to electron identification.
Lead/liquid-argon (LAr) sampling calorimeters provide EM energy measurements with high granularity. A steel/scintillator-tile hadronic calorimeter covers the central pseudorapidity range (| | < 1.7). The end-cap and forward regions are instrumented with LAr calorimeters for both the EM and hadronic energy measurements up to | | = 4.9. For | | < 2.5, the EM calorimeter is divided into three layers longitudinal in shower depth, which are finely segmented in and , particularly in the first layer. This segmentation allows measurements of the lateral and longitudinal shower profile, and the calculation of shower shapes [50] used for particle identification and background rejection. The longitudinal segmentation of the EM calorimeter is also exploited to calibrate the energy response to electron and photon candidates [50].
The MS comprises separate trigger and high-precision tracking chambers measuring the deflection of muons in a magnetic field generated by superconducting air-core toroids. The precision chamber system covers the region | | < 2.7 with three layers of monitored drift tubes, complemented by cathode-strip chambers in the forward region, where the background is highest. The muon trigger system covers the range | | < 2.4 with resistive-plate chambers in the barrel, and thin-gap chambers in the end-cap regions.
A two-level trigger system [51] was used during the 2015-2018 data-taking period. The first-level trigger (L1) is implemented in hardware and uses a subset of the detector information. This is followed by a software-based high-level trigger, which runs algorithms similar to those in the offline reconstruction software, reducing the event rate to approximately 1 kHz from the maximum L1 rate of 100 kHz.
An extensive software suite [52] is used in data simulation, in the reconstruction and analysis of real and simulated data, in detector operations, and in the trigger and data acquisition systems of the experiment.

Data and simulated event samples
The analysed data comprise the full ATLAS collisions at √ = 13 TeV collected from 2015 to 2018, corresponding to an integrated luminosity of 139 fb −1 after data quality requirements [53] were imposed to ensure that all the subdetectors were operating normally. 1 The ATLAS experiment uses a right-handed coordinate system with its origin at the nominal interaction point (IP) in the centre of the detector and the -axis along the beam pipe. The -axis points from the IP to the centre of the LHC ring, and the -axis points upwards. Cylindrical coordinates ( , ) are used in the transverse plane. The polar angle ( ) is measured from the positive -axis and the azimuthal angle ( ) is measured from the positive -axis in the transverse plane. The pseudorapidity is defined as = − ln tan( /2). Angular distance is measured in units of Δ ≡ √︁ (Δ ) 2 + (Δ ) 2 . ranging from 500 GeV to 6 TeV. The reconstructed resonance mass can be modelled by a double-sided Crystal Ball function [74,75]. Acceptance times efficiency, A × , values of up to 50% are found for these signal models in the dominant leading large--jet category.
All background and signal samples were processed through a G 4-based simulation [76,77] of the ATLAS detector. The effect of multiple interactions in the same and neighbouring bunch crossings (pile-up) was modelled by overlaying the simulated hard-scattering event with inelastic collision events generated with P 8.186 using the NNPDF2.3 PDF set and the A3 tune [78]. The MC simulated events are weighted to match the distribution of the number of reconstructed pile-up vertices observed in different data-taking periods. Scale factors are applied to the simulated events to account for any remaining differences between observed and simulated data.

Object selection
Selected events are required to have at least one primary vertex [79] with at least two associated tracks, each with T > 500 MeV. In each event, only the primary vertex with the highest sum of squared transverse momenta of contributing tracks, hereafter denoted by 2 T , is considered. Muons are reconstructed either by combining ID and MS tracks that have consistent trajectories and curvatures or as stand-alone MS tracks [80,81]. The muon candidates used in this analysis are required to have T > 25 GeV and | | < 2.7. They are also required to pass the 'Medium' selection for T > 25 GeV and the 'HighPt' selection for T > 300 GeV. The efficiency for identifying a single muon is about 99% for T < 300 GeV and around 80% for T > 300 GeV.
Electrons are reconstructed from clusters of energy deposits in the EM calorimeter that match a track reconstructed in the ID. Reconstructed electrons are identified using likelihood identification criteria [50,82]. The electron candidates are required to have T > 25 GeV. They are also required to pass the 'MediumLH' selection and be within | | < 2.47, excluding the transition region between the barrel and end-caps in the LAr calorimeter (1.37 < | | < 1.52).
To ensure that lepton candidates originate from the interaction point, the likelihood-based electrons (muons) must satisfy | 0 |/ 0 < 5 (3), and both lepton types must satisfy | 0 sin | < 0.5 mm. Here, 0 is the transverse impact parameter of the lepton measured relative to the primary vertex, 0 is the uncertainty in 0 , and 0 is the longitudinal distance between the position of the primary vertex and the position of the track at the point of closest approach. In addition, electron and muon candidates are required to be isolated from other tracks and calorimetric activity by applying T -and -dependent isolation criteria. For both the electron and muon candidates, the calorimeter isolation is based on the sum of the transverse energies of topological clusters [83] within a cone of size Δ = 0.2 around each lepton. The track isolation is computed by summing the transverse momenta of selected tracks within a cone of ℓ T -dependent size around the lepton track. The efficiency of the isolation requirements is 95% for muons from bosons with T > 100 GeV. For electrons, it is 90% but decreases for bosons with T > 450 GeV as the electron signatures begin to overlap.
If a → + − decay has a sufficiently large Lorentz boost, the + − angular separation becomes small enough for the + − pair to merge and fail the electron isolation requirements. The merged + − pairs are then reconstructed as jets, and a boosted decision tree (BDT) algorithm [84,85] is used to separate merged-+ − candidates from hadronic jets. This BDT identification is only applied when there is no boson candidate from either an + − pair or a + − pair in the event. The input variables for the BDT include the jet's calorimeter-and ID-track-related properties and kinematic information. More details are given in Appendix A.1. The combined efficiency of the resolved and merged identification for an + − pair forming a boson candidate is around 80% with weak T dependence.
Photon candidates are reconstructed from dynamic, variable-size topological clusters of cells with significant energy in the EM calorimeter and, possibly, matching tracks reconstructed in the inner detector [86]. The photon candidates are classified as 'converted' if two tracks forming a conversion vertex, or one track with the signature of an electron track but without hits in the innermost pixel layer, are matched to the cluster; otherwise, they are labelled as 'unconverted'. They are identified by using variables that characterise the lateral and longitudinal shower development in the EM calorimeter and the energy fraction leaking into the hadronic calorimeter. Photons used in this analysis are required to pass the 'Tight' selection [50,53] for transverse energy T > 25 GeV and be within | | < 2.37, excluding the same transition region between the barrel and end-caps in the LAr calorimeter as for electrons. The selection has an efficiency of 82%-92% (75%-92%) for unconverted (converted) photon candidates depending on T and . The fractions of the unconverted and converted photon candidates are about 68% and 32%, respectively. The photon isolation requirement is based on the amount of transverse energy inside a cone of size Δ = 0.4. The efficiency of the isolation requirement is 75%-90% (∼70%) for unconverted (converted) photon candidates.
Small-jets are reconstructed using the particle-flow algorithm [87], which considers both tracker and calorimeter information, using the anti-algorithm [88] with a radius parameter of = 0.4 implemented in the FastJet package [89]. The four-momenta of these small-jets are calculated as the sum of the four-momenta of their constituents. Jets are corrected using T -and -dependent scale factors to account for the calorimeter's non-compensating response, signal losses due to noise-threshold effects, energy lost in passive material, and contributions from pile-up interactions, as described in Ref. [90].
The jets are required to have T > 30 GeV and | | < 2.4. To suppress jets that originate from pile-up, a jet-vertex tagger using the 'Tight' working point [91] is applied to jets with T < 60 GeV and | | < 2.4. To avoid double counting, jets of any transverse momentum are discarded if they are within a cone of size Δ = 0.2 around an electron or a photon, or if they have fewer than three associated tracks and are within a cone of size Δ = 0.2 around a muon. However, jets discarded due to an overlap with a reconstructed electron are still considered when testing for a merged-+ − candidate. If a remaining jet with three or more associated tracks is within a cone of size Δ = 0.4 around a muon candidate, or the separation between an electron or a photon and any jet is within 0.2 < Δ < 0.4, the corresponding muon, electron, or photon candidate is rejected.
Small-jets containing a -hadron are identified using the DL1r -tagging algorithm [92] at the 77% -tagging efficiency operating point. The misidentification rate for jets which originate from a light quark or a gluon is less than 1%, while it is approximately 20% for -jets, as measured in simulated¯events.
Large-jets are reconstructed from topological clusters [83] in the calorimeter, using the anti-algorithm with a radius parameter = 1.0. To reduce contributions to the jet's transverse momentum and mass that arise from pile-up, a trimming procedure [93] is applied to remove = 0.2 subjets with less than 5% of the original T of the jet. Corrections are applied to restore the jet's energy to that of jets reconstructed at the particle-level energy scale [94]. Jet candidates are required to have T above 200 GeV and | | < 2.0. The energy resolution for large-jets with T between 200 GeV and 1 TeV is ∼7% [94]. The jet's mass is computed as a weighted combination of calorimetric mass and track-assisted mass [95]. To avoid double counting, the small-jets are discarded if they are within a cone of size Δ = 1.0 around the large-jets. The large-jets are discarded if they are within a cone of size Δ = 1.0 around an electron or a photon.

Event selection and classification
The boson candidates are selected by requiring two oppositely charged, same-flavour leptons ( + − or + − ). Both leptons must satisfy the minimal quality criteria discussed in Section 4. If no boson candidate can be formed from a pair of two isolated leptons, merged-+ − candidates are checked. The boson candidates are required to have an invariant mass between 66 and 116 GeV. If there are multiple boson candidates, the one with the mass closest to the boson mass is chosen. To suppress the large SM background contributions at low T and to be more sensitive to high T final states expected from new heavy particles, the selected boson candidate is also required to have T > 100 GeV. The choice of this T threshold is a compromise between suppressing the SM background contribution and keeping the data event yields high enough for data-driven background estimations in all search categories.
For a given event, all objects satisfying the selections described in Section 4, except the lepton pair from the boson decay, are assumed to belong to the recoil system ( ). The objects in are ordered in T . The leading-T object type is used to assign an event to one of six independent event categories: a small-(non-) jet (LeadJ), a small--jet (LeadB), a large-jet (LeadFatJ), a photon (LeadP), an electron (LeadE) or a muon (LeadM). An inclusive category is formed by summing the six exclusive categories. This category can be useful when the signal is distributed over several considered event categories. The data yields in the six categories are shown in Table 1. The invariant mass of the recoil system, , and the invariant mass of the recoil system and the boson, , are the observables of the search. The SM background is modelled by a fit to the binned distribution of or after combining the + − and + − channels in each event category. MC samples are used in selecting the varying bin size for each mass distribution, M ( or ). The bin widths for a mass spectrum are chosen to approximate the mass resolution, which broadens with increasing mass. Two functional forms are considered for the estimation of the background and are defined as: (1) (2), 0 is a free normalisation factor and are a number of other free parameters controlling the shape of a mass distribution, and M min and M max are the lower and upper fit boundaries of the distribution. For a given mass bin, the integral of the function is fit to the selected number of events in the bin. The number of free parameters depends on the mass spectrum and event category. Mass spectra that cover a larger range usually have higher yields and need more parameters than those with a smaller range.
The choice of functional form, number of free parameters, and fit range is based on two criteria for the fits initially performed on the simulated SM mass spectrum. The statistics of the MC samples is category dependent and varies typically from a few times up to an order of magnitude larger than that of the data sample. The global 2 -value of the background-only fit has to be greater than 0.05 for the modelling to be good enough over the considered mass range, and the significance of the spurious-signal yields over the fit range must typically be below 20%-50% to ensure a discovery will not be claimed falsely. The spurious signal is evaluated by fitting the simulated background's mass spectrum with signal-plus-background models. The statistical uncertainty of the mass spectrum corresponds to that of data. The signal significance is defined as the ratio of the absolute value of the signal yield to its fitted uncertainty. The upper fit boundary is dictated by the number of events in the high mass tail of the mass spectrum. In most cases, the lower fit boundary is a direct consequence of the event selection. However, in some cases, it has to be increased slightly to satisfy the requirements of a small spurious-signal significance and a global 2 -value larger than 0.05 for the background fit. The spurious signal for the model-independent interpretation is evaluated by fitting the simulated background mass spectrum with a Gaussian-shaped signal function together with the selected background functional form. The detector resolution, with relative values typically around 5% for the mass observables, is also taken into account for the Gaussian-shaped signals with intrinsic relative width values of 3%, 5%, and 10%. It widens the reconstructed peak from the intrinsic width. For a model-dependent interpretation with an HVT signal, the spurious signal is evaluated with a double-sided Crystal Ball function whose parameter values are determined from the HVT MC sample.

Search strategy
The strategy applied in this search is diagrammed in Figure 2. The mass spectrum in data in each category is fitted with a functional form within the fit range. Both the functional form and the fit range are chosen for each mass spectrum, as discussed in Section 6.1. To search for local excesses over the background, the BumpHunter (BH) algorithm is then applied to the mass spectrum within a signal-sensitive mass range, which lies within the fit range (see below).
The BH algorithm calculates the significance of any excess found in continuous mass intervals of varying size at various locations in the spectrum. The size of the search window varies from a minimum of two mass bins up to half of the bins in the sensitive mass range. For each interval in the scan, the BH algorithm computes the significance of the difference between the data and the background estimate. The interval where the data deviates most significantly from the background estimate is defined as the set of bins that have the smallest probability of arising from a Poisson background fluctuation. The probability of random fluctuations in the background-only hypothesis to create an excess at least as significant as the one observed anywhere in the spectrum defines the BH -value. It is determined by performing a series of pseudo-experiments drawn from the background estimate, with the look-elsewhere effect considered.
The signal-sensitive mass range for each signal depends on the assumed shape and width. The sensitive range is derived using the procedures described in Appendix A.2. A signal at a given mass point within the fit range is injected on top of a set of 100 pseudo-experiments generated by fluctuating the fitted background mass spectrum within a Poisson distribution. The injected number of signal events has a signal significance of 5 , calculated over the full mass spectrum using a definition recommended in Ref. [96]. A mass point is considered to be sensitive if the signal is correctly located with a BH -value below 0.01 in more than 50% of the pseudo-experiments. In general, the procedure is more sensitive for narrow signals, so the signal-sensitive range corresponding to the 3% width value is used in the BH search and it is wider than the signal-sensitive mass ranges for other width values. For Gaussian-shaped signals, the largest considered width is 10% because wider signals are often absorbed into the estimated background.
To ensure that the global fitting method is sensitive to potential signal resonances and the background functional form is appropriate, an alternative signal injection test has been performed and it was found that the injected signal can be recovered by the procedure.
The search strategy depends on the global 2 -value and the BH -value. If the global 2 -value is above the threshold of 0.05 and the BH -value is above the threshold of 0.01, it indicates that this functional form can model the SM background for the examined mass spectrum and there is no significant excess above the background. In this case, upper limits are set on the signal models within a limit-sensitive mass range which is derived for each signal using a procedure, described in Appendix A.3, based on the coverage of the expected exclusion limit. If the global 2 -value is below 0.05 and the BH -value is above 0.01, it indicates that the initial functional form validated on the simulated SM mass spectrum does not fit the corresponding data spectrum.
Alternative functional forms are then tested against the nominal one. Two alternative functional forms are considered: the same functional form as the nominal one but with more free parameters and the other functional form defined in Eq. (1) or (2). In the first case, the test is performed using an -test [97]: nom and 2 alt are the 2 values for fits to the spectrum using, respectively, the nominal or alternative model, nom and alt are the number of free parameters in each model, and is the number of bins used for the 2 computation. The value of is larger if the alternative model provides a significantly better fit to the spectrum than the nominal model. The alternative model is retained if the -value of the -test satisfies ( ) > 0.05. If the test is not successful, the other functional form is considered with up to six (eight) free parameters for 2 ( ) ( 1 ( )). If both functional forms fail to describe the background, the lower fit boundary is raised until the global 2 -value is above the threshold of 0.05.
If the BH -value is below 0.01 for any global 2 -value, a new global fit is performed while excluding the BH interval corresponding to the most significant excess relative to the background estimated by the fit in the full fit range. This gives the sideband 2 -value. The BH algorithm is then applied using the newly fitted background estimate to get a new BH -value. If the new BH interval is at the same mass location as the initial one, the new BH -value is expected to be smaller than the initial one because more events from the excess may be absorbed in the background fit in the initial step. If the new BH interval is at a different location, the procedure is repeated while excluding both intervals. If the global 2 -value is above 0.05, the excess is to be further quantified by performing a signal-plus-background fit to the mass spectrum. On the other hand, if the global 2 -value is smaller than 0.05, the global fit is to be improved using alternative models described above and the procedure is then repeated.

BumpHunter search results
The functional forms given in Section 6.1 are applied to mass spectra in data for the six exclusive event categories and the inclusive one to get the initial data-driven background estimate. The upper fit boundaries for some of the event categories have to be lowered from their initial MC-based values because there are fewer high-mass data events than expected. The lower fit boundaries are driven by the kinematic selection except for the spectra in the leading large--jet and inclusive categories, where the search strategy requires them to be raised a little. The fit range and nominal functional form for each mass spectrum are shown in Table 2.
Using the fitted background estimate, the BH algorithm is applied to all the spectra within the signalsensitive mass ranges. The signal-sensitive mass ranges and the resulting BH -values are given in Table 3. The BH results using the backgrounds estimated by the fit in the full fitting range are also shown graphically in Figures 3 and 4 for the and spectra, respectively, in the six exclusive event categories. Figure 5 shows the mass spectra of the inclusive event category. Gaussian-shaped signals with three different widths at mass values around 1, 2 and 3 TeV in the and spectra are shown as examples in the leading small--jet category. The signal yields are normalised to correspond to a cross-section value of 1 fb. An HVT signal at 3 TeV is also shown in the relevant mass spectrum of the leading large--jet category. The largest excesses are at around 280 GeV in the spectrum of the leading muon category and at around 1.6 TeV in the spectrum of the leading large--jet category with a BH -value of 0.48 and 0.46, respectively. With updated background estimates after excluding the initial BH interval, the corresponding new BH -values are 0.08 and 0.10, respectively. They are still above the threshold value of 0.01, meaning that there is no significant excess in any of the mass spectra and event categories.    Table 3. in the inclusive category. The vertical lines indicate the most discrepant interval identified by the BH test from the initial step and the -values correspond to those shown in the fourth column in Table 3.

Model-independent and model-dependent interpretations
Since no significant excesses are observed in any of the mass spectra, exclusion upper limits are derived at 95% confidence level (CL) for model-independent results based on Gaussian-shaped signals with relative width values of 3%, 5% and 10%, as well as for HVT signal interpretations.

Systematic uncertainty and statistical methodology
The statistical uncertainty of the fit due to the limited size of the data sample is considered as a systematic uncertainty affecting the data-driven background determination.
For signals, the spurious-signal uncertainty corresponds to an envelope of absolute spurious-signal yields over the mass spectrum under consideration. The envelope is obtained from a smooth parameterisation passing through the local maxima of the spurious-signal yields. However, given the differences in the shape and normalisation of the mass spectra between data and simulation, the implementation uses the absolute uncertainty scaled to keep the relative uncertainty unchanged.
The spurious signal's systematic uncertainty and the statistical uncertainty of the background function fit are the dominant sources of uncertainty. Additional systematic uncertainties for the HVT signal modelling include experimental, theoretical and luminosity uncertainties. The large--jet-related uncertainties are the dominant source of experimental uncertainty. The theoretical uncertainties include contributions due to missing higher-order corrections, PDF uncertainties, and the uncertainty in the strong coupling constant s . The effect of missing higher-order corrections is estimated by varying the renormalisation and factorisation scales by factors of 0.5 and 2. Similarly, the PDF and s are varied to estimate the effect of their uncertainties. The uncertainty in the combined 2015-2018 integrated luminosity is 1.7% [98], obtained using the LUCID-2 detector [99] for the primary luminosity measurement.
For each mass spectrum, a likelihood function is constructed as a product of Poisson distributions corresponding to different mass bins. Upper limits on the signal strength in terms of the cross-section times acceptance times branching ratio, × B × A, are extracted at 95% CL using the CLs method [100] with a binned profile likelihood ratio as the test statistic. The free parameters of the background functional forms described in Section 6 are treated as nuisance parameters, which are optmised for each resonance mass value by the likelihood function without constraints during the fit [101]. The systematic uncertainties are also included as nuisance parameters, each with a Gaussian constraint except for the integrated luminosity, which has a log-normal constraint. Generally, each of the two dominant uncertainties mentioned above degrades the expected upper limits by around 10−20% but can reach up to 50% at low mass regions.

Constraints on a generic Gaussian-shaped signal for a model-independent interpretation
For a model-independent interpretation, Gaussian-shaped signals with relative width values of 3%, 5% and 10% are studied. The cross section value of the signals is normalised to 1 fb. The exclusion upper limits are obtained by adjusting the normalisation of the signal strength s of a template representing the Gaussian signal in question until a CL s -value of 0.05 is reached. The obtained s value is thus the 95% CL upper limit on the event yield of the Gaussian-shaped portion of a signal in an appropriately chosen BSM model. This s value is then divided by the integrated luminosity L to obtain the upper limit on × B × A. For the Gaussian-shaped signals, the B × A value cannot be specified as those must be determined in the context of a specific BSM model.
The resulting expected and observed limits at 95% CL on the Gaussian-shaped signal with a relative width value of 3% are shown as a function of in Figure 6 and in Figure 7 for the six exclusive event categories and in Figure 8 for the inclusive event category. Similar limits are also derived for width values of 5% and 10%. Comparisons between the observed limits for the three width values are presented in values of 300 GeV and 6 TeV, respectively. The limits are calculated with the asymptotic approximation to the distribution of the test statistic [102]. The results are compared with the alternative approach using pseudo-experiments and the difference is found to be well below 15%, much smaller than the other dominant uncertainties. The limits are not corrected for efficiencies. Typical values of acceptance times efficiency can be found in [103].
These limits should be used when long low-mass off-shell tails from PDFs and non-perturbative effects on the narrow resonance signal shape can be safely truncated or neglected and, after applying the selection described in Sections 4 and 5 (the main selection being 66 < ℓℓ < 116 GeV and T > 100 GeV), the reconstructed mass distribution approximates a Gaussian distribution. All reconstructed objects passing the object selection including initial-state radiation (ISR) must be included in and . ISR broadens the reconstructed mass, reducing sensitivity as outlined for the HVT model in the next section. A detailed description of how to use these limits is given in Ref. [104].           Figure 11: Comparison of observed upper limits at 95% CL on the cross section times branching fraction times acceptance for a Gaussian-shaped signals with relative width values of 3%, 5% and 10% as a function of (a) and (b) in the inclusive category.

Constraint on a → signal in an HVT model
Although this generic analysis is not optimised for a particular model, it can also be used to constrain a specific model, for example the → → ℓℓ signal in the HVT model. The leptonic final state of the boson decays is not used -the presence of the neutrino causes the reconstructed mass to be significantly shifted from the generated value and does not improve the analysis sensitivity. The acceptance times efficiency values as a function of in the dominant leading large--jet category are shown in Figure 12. The derived observed and expected upper limits at 95% CL on the cross section times the branching fraction of the → → ℓℓ decay for the HVT signal are shown in Figure 13. The limits exclude an HVT signal lighter than 1.2 TeV for both models A and B. The two models differ in their coupling parameter to the SM gauge bosons, with a value of 1 and 3, respectively [46]. These limits are weaker than those in Ref.
[12], where boson tagging was applied to increase the signal sensitivity, illustrating the trade-off between generality and sensitivity.
When constructing , all reconstructed objects passing the object selection are included, even those arising from ISR. For the HVT signal, the expected cross section limits from reinterpreting the limits on a Gaussian-shaped signal are about 20% stronger if ISR objects are excluded from the invariant mass calculation using generator-level information.

A.1 Merged-+ − identification
The electrons from a boosted boson corresponding to the merged-+ − topology appear as a reconstructed small-jet in the EM calorimeter, which differs, however, from a hadronic jet. Therefore, the key idea of the BDT analysis is to separate the merged-+ − system from hadronic jets. The following variables are selected as inputs for the BDT analysis (the requirements, where indicated, are applied prior to BDT training): • jet T : the transverse momentum of a jet (requiring jet T > 450 GeV), • jet : the pseudorapidity of a jet (requiring | jet | < 2.47), • jet : the invariant mass of a jet (requiring the expected jet mass for a → + − decay to be around the boson mass: 60 < jet < 120 GeV), • EM jet : the jet energy fraction deposited in the EM calorimeter, satisfying a T -dependent selection, • max( layer / jet ): the maximum jet energy fraction deposited in a calorimeter layer, • track jet : the number of tracks, with T > 500 MeV, associated to a jet using the ghost-association [106] (requiring 1 ≤ track jet ≤ 7), In this study, signal events are based on (→ + − ) + jets MC samples requiring the merged-+ − candidate to match the true boson within Δ < 0.1. The SM background events correspond to combined MC samples of + jets, (→ ) + jets and (→ ) + . Background processes containing only hadronic jets, e.g. the dĳet background, are efficiently suppressed to a negligible level by the selection listed above, in particular by a T -dependent EM jet requirement. The raw number of selected signal and background jets is 102 593 and 8572, respectively. Events are required to fire either jet triggers or electron triggers. Events with two oppositely charged same-flavour leptons are vetoed since those events are analysed using the standard likelihood-based electron identification. The jet energy fraction in the EM calorimeter must satisfy a T -dependent selection, and at least one but no more than seven tracks must be found in the jet cone. The candidate must match with at least one GSF track, and satisfy PFO − GSF tracks < 2 to suppress + jets background, where PFO is the number of particle-flow tracks with T > 1 GeV. At least one / cluster [50] must also be matched with the jet. The cluster-based mass, cls , different from jet , is defined using the matched / cluster information. It is constructed with the leading-T / cluster in the jet, to show the two-body decay property of the → + − process. The cluster-based mass is defined as where is the ratio of leading-cluster energy to jet energy. The jet must satisfy cls > 60 GeV.
Using a T -dependent BDT-score selection, the merged-+ − identification efficiency as a function of T is determined using the (→ + − ) + jets MC samples and compared with that of the standard electron identification in Figure 14 (left). The decrease in the efficiency of the standard electron identification at high T is recovered with the BDT-based merged-+ − identification, reaching a combined efficiency of 80%. The background rejection factor for three types of processes surviving the analysis preselection is shown in Figure 14 (right), illustrating the better performance of the BDT method in comparison with a cut-based method using EM jet at the same signal efficiency.

A.2 Signal-sensitive mass range
To obtain a mass range in a given event category in which the search is able to detect a signal, a set of 100 pseudo-experiments are generated using the fitted background functional form with Poisson fluctuations. Gaussian-shaped signals with intrinsic relative width values varying between 3% and 10% are injected, after taking into account detector resolution effects, on top of each pseudo-experiment. The injected signals correspond to a signal strength of 5 and are spaced every 250 GeV in and . Each of the 100 resulting mass spectra is treated as data and is studied following the search strategy described in Section 6.2 in two steps.
As an example, the sensitive range of the spectrum in the leading small--jet category was found as follows. A Gaussian-shaped signal with a width value of 3% is injected centred at 2.5 TeV with the number of signal events corresponding to a signal significance equal to 5 . In 97 of the 100 pseudoexperiments, the BH algorithm correctly identifies the location of the injected signal. However, only 41 pseudo-experiments yield a BH -value below the threshold of 0.01 since the signal is partially absorbed in the background-only fit which is performed in the full fitting range. But if the BH interval which is derived using the background-only fit in the full fitting range is excluded from the fit range, the number of pseudo-experiments with a BH -value below 0.01 increases to 87. For signal width values of 5% and 10%, this number is 76 and 57 respectively, showing the lower sensitivity for signals with larger widths. The results for a signal width value of 3% are shown in Figure 15. The study is also performed at other mass points for both the and spectra in all event categories. The fractions from the right panel of Figure 15 for the leading small--jet category are presented in Figure 16. It shows that the sensitivity of the BH algorithm is best in the intermediate mass range and decreases at low and high mass values. The signal-sensitive mass range in this analysis is defined by requiring at least 50% of the pseudo-experiments to have a correctly identified BH interval and a BH -value below 0.01.  in the leading small--jet category for a 5 injected Gaussian-shaped signal with a relative width of 3%. The blue dashed horizontal line at 0.5 is used to define the signal-sensitive mass range.

A.3 Limit-sensitive mass range
For a given event category, a background mass spectrum based on MC simulations can be used to derive an expected exclusion upper limit at 95% CL for a Gaussian-shaped signal of a certain given width in a model-independent way. This is referred to as nominal exclusion limit in the following. A set of 1000 pseudo-experiments are generated from the MC estimates with both the background and a signal with a signal strength set to the nominal exclusion limit value. For each pseudo-experiment, a signal+background fit is performed and an exclusion upper limit at 95% CL is extracted. The distribution of the cross-section exclusion upper limits from the 1000 pseudo-experiments is compared with the nominal limit derived directly from the MC mass spectrum. The fraction of pseudo-experiments that have a higher upper limit is calculated. If it is larger than 95%, the background modelling is sensible for the mass point under study and the corresponding exclusion upper limit is conservative. Given the limited size of each pseudo-experiment, a fraction larger than 90% is considered acceptable. One example is shown in Figure 17 for mass points at the lower and upper boundaries of the limit-sensitive mass range for the mass spectrum in the leading small--jet category for Gaussian-shaped signals with relative width values of 3%, 5% and 10%. The results for all the mass spectra and event categories are summarised in Table 4. As expected, the limit-sensitive mass range is largest for signals with small widths. Most of these mass ranges can be applied directly to data when setting limits; an adjustment is only needed when the limit-sensitive mass range extends beyond the corresponding fit range in data.