1 Introduction

Supersymmetric models predict the existence of scalar partners (squarks and sleptons) of the Standard Model fermions. In particular, the stops and the left-handed sbottom need to be light in order to solve the hierarchy problem of the Higgs boson mass. Consequently, searches for stops and sbottoms are at the core of the ongoing LHC program. However, despite intense efforts, see e.g. [1–39], they remain elusive, resulting in stringent limits from dedicated searches performed by both ATLAS [40–47] and CMS [48–55].

The interpretation of a new physics search requires a model hypothesis against which a measurement can be tested. Lacking evidence for superpartners and clear guidance from theory beyond naturalness considerations, it makes sense to employ search strategies that make as few model assumptions as possible. For most reconstruction strategies a trade-off has to be made between achieving a good statistical significance in separating signal from background and applicability to large regions of the model's parameter space. Experimental searches are therefore generally tailored to achieve the best possible sensitivity for a specific particle or decay, leaving other degrees of freedom of interest unconsidered. This approach can lead to poor performance for complex models with many physical states and couplings, e.g. the MSSM and its extensions. Hence, a reconstruction that retains sensitivity over wide regions of phase space, thereby allowing one to probe large parameter regions of complex UV models, is crucial during the current and upcoming LHC runs. Moreover, since the number of possible realizations of high-scale models exceeds the number of analyses available at the LHC, tools like ATOM [56–58], CheckMATE [59], MadAnalysis [60], FastLim [58], and SModels [61] have been developed in recent years to recast existing limits from searches for new physics. A method with a flat reconstruction efficiency, despite kinematic edges and the population of exclusive phase space regions, would be particularly powerful for setting limits on complex models and would thus allow for broad parameter scans.

In this paper we develop a reconstruction strategy for third-generation squarks that accumulates sensitivity from a wide range of different phase space regions and for a variety of signal processes. The proposed reconstruction is therefore a first step toward a general interpretation of the data, i.e. recasting. As a proof of concept, we study stop and sbottom pair production, followed by a direct decay into a hadronically decaying top or a bottom quark and a neutralino or chargino; see Fig. 1. In our analysis we use simplified topologies, including only sbottoms and stops as intermediate SUSY particles, and we focus on jets and missing energy as the final state signature. While this example might be oversimplified, e.g. it might not capture long decay chains that arise if the mass spectrum is more elaborate, the setup is motivated by naturalness [64–66], i.e. it resembles minimal, natural spectra with light stops, higgsinos, and gluinos where all other SUSY particles are decoupled [67].

Fig. 1

Generic Feynman diagram for stop production and decay. The initially produced squarks in our setup can be any of the two stops or the lighter sbottom. They subsequently decay into a top or a bottom quark and a higgsino, such that electric charge is conserved

To be more specific, within this simplified setup the shape of the event depends strongly on the mass difference between the initially produced squark and the nearly mass-degenerate higgsinos, measured in units of the top quark mass, \(\mathcal {Q}=(m_{\tilde{q}}-m_{\tilde{h}})/m_t\). We can identify three regions in the physical parameter space leading to distinctly different topologies.

  1.

    \(\mathcal {Q}<1\): The only accessible two-body decay of the produced squark is the decay into a bottom quark and a charged higgsino. Possible three-body decays contribute only in small areas of the parameter space.

  2.

    \(\mathcal {Q}\gtrsim 1\): The decay into a top quark and a higgsino can become the main decay channel of the produced squark, depending on the squark and the parameter point. The top quark from this decay receives little or no boost, so its decay products might not be captured by a single fat jet. However, the intermediate W boson can lead to a two-prong fat jet that can be identified by the BDRSTagger [68].

  3.

    \(\mathcal {Q}\gg 1\): If the squark decays into a top quark, the latter will be very boosted and its decay products can no longer be resolved as ordinary jets. Yet they can be captured by one fat jet and subsequently identified as a decaying top by the HEPTopTagger [69, 70]. The HEPTopTagger was designed to reconstruct mildly to highly boosted top quarks in final states with many jets, as expected in the processes at hand. However, other taggers with good reconstruction efficiencies and low fake rates in the kinematic region \(\mathcal {Q} \gg 1\) (see e.g. [71–73] and references therein) can give similar results.

Because the value of \(\mathcal {Q}\) is unknown and the event topology crucially depends on it, a generic reconstruction algorithm that is insensitive to details of the model needs to be scale invariant, i.e. independent of \(\mathcal {Q}\). Hence, it needs to be able to reconstruct individual particles from the unboosted to the very boosted regime.
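The mass-scale classification above can be summarized in a short sketch. This is a hypothetical helper, not part of the analysis chain; in particular, the boundary between the \(\mathcal {Q}\gtrsim 1\) and \(\mathcal {Q}\gg 1\) regimes (here 2.0) is an illustrative assumption.

```python
# Toy classifier for the three decay topologies discussed above.
# The Q >> 1 boundary (Q_BOOSTED = 2.0) is an illustrative assumption,
# not a value taken from the analysis.
M_TOP = 173.0      # GeV, approximate top quark mass
Q_BOOSTED = 2.0    # assumed onset of the fully boosted regime

def boost_regime(m_squark, m_higgsino):
    """Classify the event topology via Q = (m_squark - m_higgsino) / m_top."""
    q = (m_squark - m_higgsino) / M_TOP
    if q < 1.0:
        return "b + chargino"    # two-body top channel kinematically closed
    if q < Q_BOOSTED:
        return "unboosted top"   # W substructure resolvable (BDRS-type tag)
    return "boosted top"         # fully merged top (HEPTopTagger-type tag)
```

A reconstruction that is blind to \(\mathcal {Q}\) must perform well in all three branches of this classification at once.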

Apart from covering a large region of the parameter space, such an analysis has the advantage that it captures the final state particles from the three possible intermediate squark states \(\tilde{t}_{1,2}\) and \(\tilde{b}_1\), even if they have different masses. The effective signal cross section is therefore increased compared to a search strategy that is only sensitive to specific processes within a narrow mass range. In order to preserve this advantage we furthermore only apply cuts on variables that are independent of \(\mathcal {Q}\) and of the masses of the involved particles.

The remainder of this paper is organized as follows. In Sect. 2 we give details of the parameter space that we target and the signal and background event generation. Section 3 contains a thorough description of the reconstruction of the top quark candidates and the proposed cuts as well as the results of the analysis. We conclude in Sect. 4.

2 Event generation

2.1 Signal sample and parameter space

As explained in the previous section, not only \(\tilde{t}_1\) but also \(\tilde{t}_2\) and \(\tilde{b}_1\) production contributes to the signal. For all three production channels we consider the decay into a higgsino \(\widetilde{\chi }_1^\pm \), \(\widetilde{\chi }_{1,2}^0\) and a top or bottom quark. Since in our simplified topology setup we assume the higgsinos to be mass degenerate, we generate only the decays into the lightest neutralino, \(\tilde{q}\rightarrow q+\widetilde{\chi }_1^0\), and the chargino, \(\tilde{q} \rightarrow q' + \widetilde{\chi }_1^\pm \), where q and \(q'\) stand for t or b. The decays of the second lightest neutralino, \(\widetilde{\chi }_2^0\rightarrow \widetilde{\chi }_1^{0,\pm } + X\), and of the chargino to one of the neutralinos, \(\widetilde{\chi }_1^\pm \rightarrow \widetilde{\chi }_{1,2}^0+X\), do not leave any trace in the detector since the emitted particles X are extremely soft. Thus, the event topologies for \(\tilde{t}_{1} \rightarrow t+\widetilde{\chi }_{1,2}^{0}\) are the same, and the cross section for this topology can be obtained by rescaling with the appropriate branching ratios.

We consider the following points in the MSSM parameter space. At fixed \(A_t=200\,\,\text {GeV}\) and \(\tan \beta =10\) we scan in steps of \(50\,\text {GeV}\) over a grid defined by \(\mu \le m_{Q_3}, m_{u_3} \le 1\,400\,\text {GeV}\) for the two values \(\mu =150, 300\,\,\text {GeV}\). The gaugino masses as well as the other squark mass parameters are set to \(5\,\,\text {TeV}\), while the remaining trilinear couplings are set to zero. For each grid point we calculate the spectrum and the branching ratios with SUSY-HIT [74]. Despite these specific parameter choices, our results are quite generic. An increased \(A_t\) would enhance the mixing between the left- and right-handed stops and thus render the branching ratios of the physical states into top and bottom quarks more equal. However, since the reconstruction efficiencies for the two decay channels are similar, the final results would hardly change. The effect of different masses of the physical states can be estimated from our final results. Similarly, a different choice of \(\mu \) only shifts the allowed region of the parameter space and the area where the decay into a top quark opens up, but does not affect the efficiencies.
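The scan grid described above can be sketched as follows. This is a minimal reproduction of the scan bookkeeping only; the spectrum and branching-ratio calculation with SUSY-HIT is represented by nothing more than a comment.

```python
# Build the scan grid: mu <= m_Q3, m_u3 <= 1400 GeV in 50 GeV steps,
# for the two higgsino mass parameters mu = 150 and 300 GeV.
def scan_grid(mu, m_max=1400, step=50):
    masses = range(mu, m_max + 1, step)
    return [(m_q3, m_u3) for m_q3 in masses for m_u3 in masses]

grids = {mu: scan_grid(mu) for mu in (150, 300)}
# Each point (m_Q3, m_u3) would then be passed to the spectrum calculator.
```

For \(\mu =150\,\text {GeV}\) this gives a \(26\times 26\) grid, for \(\mu =300\,\text {GeV}\) a \(23\times 23\) grid.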

Since the squark production cross section depends only on the squark mass and the branching ratios are known, we can now determine which event topologies are dominant. In the left column of Fig. 2 we show for \(\mu =300\,\,\text {GeV}\) the relative contributions to the total SUSY cross section, defined as the sum of the squark pair production cross sections \(\sigma_{\mathrm{SUSY}} \equiv \sum _{S=\tilde{t}_1, \tilde{t}_2, \tilde{b}_1}\sigma _{S\bar{S}}\). In the right panels we show in color code the coverage, defined as the sum of these relative contributions. The larger the coverage, the more of the signal cross section can be captured by looking into these channels. Clearly, considering only the decay of both squarks to top quarks and higgsinos is not enough, as the parameter space with \(\mathcal {Q}<1\) is kinematically not accessible in this channel. Moreover, in the \(m_{Q_3}>m_{u_3}\) half of the space this final state misses large parts of the signal since the lighter stop decays dominantly into bottom quarks and charginos.

In the case where all three final states are taken into account, the coverage is nearly 100 % throughout the parameter space, except along the line \(m_{\tilde{t}_1}\approx m_t+m_{\tilde{h}}\) where the top decay channel opens up. There it drops to 70–80 %, because in this narrow region the decay \(\tilde{t}_1\rightarrow W+b+\widetilde{\chi }^0\) also has a significant branching ratio.
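The coverage defined above is simply the considered channels' share of \(\sigma_{\mathrm{SUSY}}\). A minimal sketch, with placeholder cross sections (the toy numbers are illustrative, not values read off Fig. 2):

```python
# Coverage = sum of relative contributions of the considered channels
# to the total squark pair production cross section sigma_SUSY.
def coverage(channel_xs, considered):
    """channel_xs: dict of channel -> sigma x BR (pb); considered: channels
    the search is sensitive to. Returns the covered fraction in [0, 1]."""
    total = sum(channel_xs.values())
    return sum(channel_xs[c] for c in considered) / total

# Toy numbers, purely illustrative:
xs = {"tt": 0.02, "bt": 0.05, "bb": 0.03}
coverage(xs, ["tt"])                # a top-only search misses most of the signal
coverage(xs, ["tt", "bt", "bb"])   # all three final states: full coverage
```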

Fig. 2

Left panels The relative contributions of the different processes to \(\sigma_{\mathrm{SUSY}}\), i.e. in red \(100\cdot \sigma _{T_1q_1q_2}/\sigma _{\mathrm{SUSY}}\) as a function of \(m_{Q_3}\) and \(m_{u_3}\), where \(T_1 q_1 q_2\) refers to \(\tilde{t}_1\) pair production followed by a decay into a higgsino plus \(q_1\) and \(q_2\), respectively. Right panels Coverage, i.e. the sum of the relative contributions of the considered processes to \(\sigma_{\mathrm{SUSY}}\). From top to bottom the considered channels are missing energy plus only tt final states, tt and bt final states, and tt, bt, and bb final states. Note the different color scale of the lower right plot. The areas enclosed by the gray dashed lines show the points that are already excluded, as determined with FastLim. Red dashed lines indicate the mass of the lightest stop in \(\,\text {GeV}\)

For all parameter points we generate events for each of the up to nine signal processes using MadGraph5_aMC@NLO, version 2.1.1 [75], at a center-of-mass energy of \(\sqrt{s}=13\,\,\text {TeV}\). No cuts are applied at the generator level. The matching with up to two jets is done with the MLM method in the shower-\(k_T\) scheme [76, 77] with PYTHIA version 6.426 [78]. We set the matching and the matrix element cutoff scale to the same value of \(m_{S}/6\), where \(m_S\) is the mass of the produced squark, and checked that the differential jet rate distributions [79] are smooth with this scale choice. The cross sections for the signal processes are eventually rescaled by the NLO QCD and NLL K-factors obtained from NLL-fast, version 3.0 [80–82].

2.2 Background sample

In our analysis we use top tagging methods based on jet substructure techniques. We therefore focus on the decay of the squarks into a neutralino and a bottom quark or a hadronically decaying top quark. The latter generates one to three distinct jets, while the neutralino generates missing energy. Our final state therefore consists of missing energy and up to six jets. As background we consider the following four processes, all generated with MadGraph5_aMC@NLO version 2.1.1 [75] and showered with PYTHIA version 6.426 [78].

  • \(\varvec{Wj}\) : \(p p \rightarrow W_\ell + (2+X) jets \), where we merge up to four jets in the five flavor scheme and require that the W decays into leptons (including taus), such that the neutrino accounts for the missing energy.

  • \(\varvec{Zj}\) : \(p p \rightarrow Z_\nu + (2+X) jets \), where we merge up to four jets in the five flavor scheme and the Z decays into two neutrinos and hence generates missing energy. In both channels Wj and Zj we require missing transverse energy of at least \(70\,\,\text {GeV}\) at the generator level.

  • \(\varvec{Zt\bar{t}}\) : \(p p \rightarrow Z_\nu + t + \bar{t}\), where both top quarks decay hadronically, faking the top quarks from the squark decay and the Z decays again into two neutrinos to generate missing energy. This cross section is known at NLO QCD [83] and rescaled by the corresponding K-factor.

  • \(\varvec{t\bar{t}}\) : \(p p \rightarrow t\bar{t} + jets \), where one top decays hadronically and the other one leptonically to emit a neutrino, which accounts for missing energy. The NNLO+NNLL QCD K-factor is obtained from Top++ version 2.0 [84] and multiplied with the cross section.

3 Analysis

3.1 Reconstruction

For the reconstruction of the events we use ATOM [56], based on Rivet [85]. Electrons and muons are reconstructed if their transverse momentum is greater than \(10\,\,\text {GeV}\) and their pseudo-rapidity is within \(\left| \eta \right| <2.47\) for electrons and \(\left| \eta \right| <2.4\) for muons. Jets for the basic reconstruction are clustered with FastJet version 3.1.0 [86] using the anti-\(k_t\) algorithm [87] with a jet radius of 0.4. Only jets with \(p_T>20\,\,\text {GeV}\) and \(\left| \eta \right| <2.5\) are kept. For the overlap removal we first reject jets within \(\Delta R=0.2\) of a reconstructed electron and then all leptons within \(\Delta R=0.4\) of one of the remaining jets. All constituents of the clustered jets are used as input for the re-clustering described below.

The underlying idea of the reconstruction described in the following is to cover a large range of possible boosts of the top quark. We therefore gradually increase the cluster radius and successively employ both the HEPTop- and the BDRSTagger. This allows us to reduce the background significantly while maintaining a high signal efficiency.

A flowchart for the reconstruction of the top candidates with the HEPTopTagger is shown in Fig. 3. First the cluster radius is set to \(R=0.5\) and the constituents of the initial anti-\(k_t\) jets are re-clustered with the Cambridge–Aachen (C/A) algorithm [88, 89]. For each of the resulting jets we then check whether its transverse momentum is greater than \(200\,\,\text {GeV}\) and whether the HEPTopTagger tags it as a top. If so, we save it as a candidate for a signal final state and remove its constituents from the event before moving on to the next jet. Once all jets are analyzed, we increase the cluster radius by 0.1 and start over, re-clustering the remaining constituents of the event. This loop continues until we exceed the maximal clustering radius of \(R_{\mathrm{max}}=1.5\).
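The control flow of this loop can be sketched as follows. The clustering and the tagger are passed in as callables (`cluster_ca` and `is_top_tagged` are hypothetical stand-ins for the FastJet and HEPTopTagger calls); only the radius loop and the constituent removal mirror the procedure of Fig. 3.

```python
# Iterative top reconstruction: re-cluster with a growing C/A radius,
# tag tops above the pT threshold, and remove tagged constituents.
def find_top_candidates(constituents, cluster_ca, is_top_tagged,
                        r_start=0.5, r_max=1.5, r_step=0.1, pt_min=200.0):
    candidates = []
    r = r_start
    while r <= r_max + 1e-9:                  # radii 0.5, 0.6, ..., 1.5
        for jet in cluster_ca(constituents, r):
            if jet.pt > pt_min and is_top_tagged(jet):
                candidates.append(jet)
                # drop the tagged jet's constituents before the next pass
                tagged = set(id(c) for c in jet.constituents)
                constituents = [c for c in constituents
                                if id(c) not in tagged]
        r += r_step
    return candidates, constituents
```

The leftover constituents returned here are exactly what is handed to the BDRS stage described next.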

Fig. 3

Flowchart of the top reconstruction with the HEPTopTagger

After the reconstruction with the HEPTopTagger is finished we continue the reconstruction of the top candidates with the BDRSTagger, as sketched in Fig. 4. We choose an initial cluster radius of \(R=0.6\) and cluster the remaining constituents of the event with the Cambridge–Aachen algorithm. Since we now only expect to find W candidates with the BDRSTagger and need to combine them with a b-jet to form top candidates, the order in which we analyze the jets is no longer arbitrary. Starting with the hardest C/A jet, we check whether its transverse momentum exceeds \(200\,\,\text {GeV}\), its invariant mass is within \(10\,\,\text {GeV}\) of the W mass, and the BDRSTagger recognizes a mass drop. If one of these requirements fails, we proceed with the next hardest C/A jet until either one jet fulfills them or all jets fail. In the latter case we increase the cluster radius by 0.1 and repeat the C/A clustering and jet analysis until the radius exceeds \(R_{\mathrm{max}}=1.5\). Once a jet fulfills all the above criteria we need to find a b-jet to form a top candidate. To do so, we recluster the constituents of the event that are not part of the given jet with the anti-\(k_t\) algorithm with a cone radius of 0.4 and pass them on to the b-tagger. Starting with the hardest b-jet, we check whether the combined invariant mass of the W candidate and the b-jet is within \(25\,\,\text {GeV}\) of the top quark mass. If such a combination is found, it is saved as a candidate and its constituents are removed from the event; the remaining constituents are then reclustered with the C/A algorithm and the procedure repeats. Alternatively, if all b-jets fail to produce a suitable top candidate, the next C/A jet is analyzed. Once the C/A cluster radius exceeds \(R_{\mathrm{max}}\), the remaining constituents of the event are clustered with the anti-\(k_t\) algorithm with radius 0.4 and passed on to the b-tagger. Those that get b-tagged are saved as candidates of the signal final state as well.
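The two mass windows used in this pairing can be written compactly. This sketch uses standard mass values and omits the tagger and clustering calls entirely.

```python
# Mass-window tests for the BDRS-based reconstruction: a W candidate must lie
# within 10 GeV of m_W, and the W candidate + b-jet pair within 25 GeV of m_t.
M_W, M_TOP = 80.4, 173.0   # GeV

def in_w_window(m_jet, width=10.0):
    """True if the fat-jet mass is compatible with a W boson."""
    return abs(m_jet - M_W) < width

def in_top_window(m_wb, width=25.0):
    """m_wb: invariant mass of the W candidate combined with a b-jet."""
    return abs(m_wb - M_TOP) < width
```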

Fig. 4

Flowchart of the top reconstruction with the BDRSTagger

3.2 Analysis cuts

Having reconstructed the candidates for the hadronic final states of the signal, i.e. top candidates and b-tagged anti-\(k_t\) jets, we proceed with the analysis cuts. Since our premise is a scale invariant analysis, we must avoid introducing scales through the cuts. We propose the following cuts and show the respective distributions before each cut in Fig. 5.

  1.

    Zero leptons: Any leptons or other particles emitted by the decaying chargino or second lightest neutralino are too soft to be seen by the detector. Moreover, since we focus on the hadronic decay modes, no leptons should be present in the signal events. In the backgrounds, however, leptons are produced in the leptonic decays that are necessary to generate missing energy. We therefore require zero reconstructed electrons or muons.

  2.

    Exactly two candidates: The visible part of the signal process consists of two hadronic final states as defined above. An event is rejected if it contains more than these two candidates (a rare case) or, in particular, fewer. This means that no b-jets beyond the possible b candidates are allowed.

  3.

    \(\Delta \phi \) between the summed candidate momenta and the missing energy: Since we cannot determine the two neutralino momenta individually, it is impossible to reconstruct the momenta of the initial squarks. Yet, we can make use of the total event shape. In the signal, the missing transverse energy is the combination of the two neutralino momenta and therefore balances the transverse momenta of the two candidates. Consequently, the vectorial sum of the candidates' transverse momenta \({\vec {p}}_{T,c_i}\) has to point in the direction opposite to the missing energy.

  4.

    Normalized momentum balance: This cut is based on the same reasoning as the previous one. The magnitude of the vectorial sum of the candidates' transverse momenta and the missing transverse momentum needs to be small. In order to maintain scale invariance we normalize the result.

  5.

    \(\Delta \phi \) between the missing energy and the harder candidate: By this cut we require that the missing transverse energy and the transverse momentum of the harder of the two candidates are not back-to-back. Since the two produced squarks are of the same type and the higgsinos are mass degenerate, the recoil of the top or bottom quark against the respective higgsino is the same in the squark rest frame. Therefore, the two neutralinos should contribute about equally to the missing energy and spoil the back-to-back orientation that is present for each top–neutralino pair individually. We thus reject events where one top candidate recoils against an invisible particle and the second candidate does not. Moreover, we can thereby reject events where the missing energy comes from a mismeasurement of a jet momentum.

  6.

    \(\Delta \phi \) between the missing energy and the softer candidate: This cut exploits the same reasoning as the previous one.
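Cuts 3 and 4 can be sketched in code. Since the exact observables did not survive in this version of the text, the azimuthal-angle variable and in particular the normalization below are reconstructions from the verbal description and should be treated as assumptions:

```python
import math

def delta_phi(phi1, phi2):
    """Azimuthal difference wrapped into [0, pi]."""
    d = abs(phi1 - phi2) % (2.0 * math.pi)
    return min(d, 2.0 * math.pi - d)

def candidates_met_dphi(cand_pts, met):
    """Angle between the vector sum of the candidates' transverse momenta
    and the missing transverse momentum. cand_pts, met: (px, py) tuples."""
    sx = sum(p[0] for p in cand_pts)
    sy = sum(p[1] for p in cand_pts)
    return delta_phi(math.atan2(sy, sx), math.atan2(met[1], met[0]))

def normalized_balance(cand_pts, met):
    """|sum p_T,c + p_T^miss| divided by |p_T^miss| (the denominator is an
    assumption here); dimensionless, hence scale invariant."""
    sx = sum(p[0] for p in cand_pts) + met[0]
    sy = sum(p[1] for p in cand_pts) + met[1]
    return math.hypot(sx, sy) / math.hypot(met[0], met[1])
```

For a perfectly balanced signal-like event, `candidates_met_dphi` returns \(\pi\) and `normalized_balance` returns 0, independently of the overall momentum scale.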

Fig. 5

Normalized distributions of the background and signal processes before the cut on the respective observable. The numbers in brackets give the soft mass parameters \(m_{Q_3}\) and \(m_{u_3}\) in GeV, and the sums in c and d are understood to be vectorial. In all plotted samples \(\mu =300\,\,\text {GeV}\)

Fig. 6

Normalized distributions of the candidate tag types and of \(m_{T2}\) after the last cut

Fig. 7

Efficiency of the cuts as a function of the squark mass for all samples with \(\mu =300\,\,\text {GeV}\)

Table 1 Cut flow for the background processes. The numbers give the cross section in picobarns after the respective cuts
Table 2 Cut flow for the signal processes. The cross section is given in picobarns after each cut. In the first line the parameter points are specified as \((m_{Q_3},m_{u_3})\), both in GeV
Fig. 8

Total efficiency of all cuts combined in the \(m_{Q_3}\)–\(m_{u_3}\) plane. The red dashed lines show the mass of the lighter stop in GeV

Fig. 9

Stacked distribution of \(m_{T2}\) after all cuts. The plots are based on the samples with \(\mu =300\,\,\text {GeV}\) and the numbers in the brackets refer to the soft mass parameters \(m_{Q_3}\) and \(m_{u_3}\) in GeV

Fig. 10

Exclusion limits in the \(m_{Q_3}\)–\(m_{u_3}\) plane. A Monte Carlo error and a systematic error of 15 % on the background normalization are assumed. In all three plots \(\mu =150\,\,\text {GeV}\), which corresponds roughly to the mass of the higgsinos

Fig. 11

Exclusion limits in the \(m_{Q_3}\)–\(m_{u_3}\) plane. A Monte Carlo error and a systematic error of 15 % on the background normalization are assumed, as detailed in the main text. In all three plots \(\mu =300\,\,\text {GeV}\), which corresponds roughly to the mass of the higgsinos

In the left plot of Fig. 6 we show the relative contributions of the types of the two final state candidates after all cuts. In the samples with heavy stops and sbottoms the HEPTopTagger contributes 30–40 % of the candidates. In the samples with lighter squarks, and thus less boosted objects, the HEPTopTagger finds fewer candidates and the BDRSTagger contributes up to about 15 % of the top quark candidates. However, most candidates besides those from the HEPTopTagger are b-jets.

In Fig. 7 we show the efficiency of each cut for the three possible final states as a function of the squark mass. As anticipated they show only a mild mass dependence. This is also reflected in the total efficiency which is very flat over the whole parameter space as can be seen in Fig. 8.

The cut flows for background and signal, together with the signal-over-background ratios S/B, are given in Tables 1 and 2, respectively. The values of S/B range from about 0.3 in the samples where the stops have a mass of only about \(350\,\text {GeV}\) down to \(10^{-3}\) in the sample with heavy stops of about \(1\,400\,\text {GeV}\).

3.3 Results

To proceed, we consider the \(m_{T2}\) [90, 91] distribution, shown normalized in the right plot of Fig. 6 and stacked in Fig. 9. \(m_{T2}\) is designed to give a lower bound on the mass of the decaying particle. This is reflected in the plotted distributions, where the upper edge of the signal distribution lies just at the actual squark mass. For the calculation of \(m_{T2}\) we assume zero neutralino mass and use a code described in [92] and provided by its authors.
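For illustration, \(m_{T2}\) with massless invisible particles can be evaluated by a brute-force scan over splittings of the missing transverse momentum. The analysis itself uses the dedicated code of Ref. [92]; the grid scan below is only a slow, approximate sketch of the definition.

```python
import math

def m_t2(vis1, vis2, met, n=60):
    """Approximate m_T2 for massless invisible particles.
    vis1, vis2: (m, px, py) of the two visible systems; met: (px, py).
    Minimizes max(m_T^2(1), m_T^2(2)) over trial splittings q, met - q."""
    def mt_sq(m, px, py, qx, qy):        # transverse mass squared
        et_vis = math.hypot(math.hypot(px, py), m)
        return m * m + 2.0 * (et_vis * math.hypot(qx, qy) - px * qx - py * qy)
    scale = math.hypot(met[0], met[1]) + 1.0
    best = float("inf")
    for i in range(-n, n + 1):
        for j in range(-n, n + 1):
            qx, qy = scale * i / n, scale * j / n
            m1 = mt_sq(vis1[0], vis1[1], vis1[2], qx, qy)
            m2 = mt_sq(vis2[0], vis2[1], vis2[2], met[0] - qx, met[1] - qy)
            best = min(best, max(m1, m2))
    return math.sqrt(max(best, 0.0))
```

Since each transverse mass is bounded from below by the visible mass, \(m_{T2}\) never falls below the mass of the heavier visible system, which is the origin of the signal edge at the squark mass.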

Instead of imposing an explicit cut on \(m_{T2}\) to improve S/B, mainly to the benefit of the processes involving heavy squarks, we evaluate the statistical significance with a binned likelihood analysis using the \(CL_s\) technique described in [93, 94]. For the calculation we employ the code MCLimit [95]. We assume an uncertainty of 15 % on the background cross section and also include an error stemming from the finite size of the Monte Carlo sample. For the latter we need to combine the pure statistical uncertainty with the knowledge that the background distribution is steeply falling. To do this we determine for each background process and each bin the statistical uncertainty \(\omega \sqrt{N}\), where \(\omega \) is the weight of one event and N is the number of events in the given bin. Conservatively, we assign \(N=1\) to bins that do not contain any events of the given process. In the high-\(m_{T2}\) region, where no background events appear, this method clearly overestimates the error on the steeply falling background. We therefore additionally fit the slopes of the \(m_{T2}\) distributions with an exponential function and use this function to extrapolate the background distribution to the high-\(m_{T2}\) region. As the shape uncertainty for a given background process we then take in each bin the minimum of \(\omega \sqrt{N}\) and three times the fitted function. This way the error in the low-\(m_{T2}\) range is determined by the statistical uncertainty, while that in the high-\(m_{T2}\) range comes from the extrapolation. The combined error on the background in each bin is obtained by summing the squared errors of the individual processes and taking the square root.
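The per-bin background uncertainty described above amounts to a simple recipe, sketched here (the exponential-fit value for each bin enters as an input; the fit itself is not reproduced):

```python
import math

def bin_uncertainty(weight, n_events, fit_value):
    """MC statistical error w*sqrt(N) (with N = 1 for empty bins), capped at
    three times the exponential-fit extrapolation of the falling m_T2 tail."""
    n_eff = n_events if n_events > 0 else 1   # conservative for empty bins
    return min(weight * math.sqrt(n_eff), 3.0 * fit_value)

def combined_uncertainty(per_process):
    """Quadrature sum over background processes for one bin.
    per_process: list of (weight, n_events, fit_value) tuples."""
    return math.sqrt(sum(bin_uncertainty(*p) ** 2 for p in per_process))
```

In populated low-\(m_{T2}\) bins the statistical term dominates; in empty high-\(m_{T2}\) bins the fit cap takes over, as intended.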

The results are shown in Fig. 10 (Fig. 11) for \(\mu =150\, (300) \,\,\text {GeV}\) and integrated luminosities of 100, 300, and \(1\,000\,\text {fb}^{-1}\). In Fig. 12 we show for \((m_{Q_3},m_{u_3})=(1500,1500)\,\text {GeV}\) the \(CL_s\) exclusion limit as a function of the integrated luminosity. Even for this parameter point, which lies close to the predicted sensitivity reach of the LHC, our approach yields a \(95\,\%\) CL exclusion with \(600~\mathrm {fb}^{-1}\).

Fig. 12

p values as a function of integrated luminosity for the parameter point with \((m_{Q_3},m_{u_3})=(1500\,\,\text {GeV},1500\,\,\text {GeV})\) and \(\mu =300\,\,\text {GeV}\)

4 Final remarks

The main idea behind this analysis was a scale invariant setup. In the first step we achieved this by employing the HEPTop- and BDRSTaggers with varying radii, which allowed us to capture the minimal content of a hadronically decaying top quark over a large range of top momenta. In the second step we avoided introducing scales in the cuts and exploited only event properties that are independent of the mass spectrum. After this proof of concept it will be interesting to apply this principle to other searches in which top quarks with various boosts appear in the final state, for example in little Higgs models with T-parity [96–98].