Discovery potential of Higgs boson pair production through final states at a 100 TeV collider

We explore the discovery potential of Higgs pair production at a 100 TeV collider via the fully leptonic mode. The same mode can be explored at the LHC when Higgs pair production is enhanced by new physics. We examine two types of fully leptonic final states and propose a partial reconstruction method which can reconstruct some useful kinematic observables. It is found that the mT2 variable determined by this reconstruction method and the reconstructed visible Higgs mass are crucial to discriminate signal from background events. It is also noticed that a new variable, denoted as Δm and defined as the mass difference of the two possible combinations, is a very useful discriminant. To examine detector effects, we consider seven detector setups for a 100 TeV collider and investigate the changes in the sensitivity; we find that lepton isolation and the minimal lepton p_T cut are crucial for reducing the integrated luminosity required.


Introduction
The discovery of the Higgs boson at the LHC has motivated the high energy community to think about the next generation of pp colliders. A 100 TeV collider offers huge potential to probe various forms of new physics [1]. For example, new vector bosons W′ and Z′ could be discovered up to 25−35 TeV [2]. Heavy Higgs bosons of the two Higgs doublet model could be probed up to 20 TeV or so via single associated production [3]. In simplified models, the superpartners of the top quark and the gluino could be probed up to 5 TeV and 10 TeV, respectively [4], which would provide decisive evidence on the fate of electroweak supersymmetry models. Dark matter candidates could be probed up to 10 TeV or higher [5][6][7][8]. A 100 TeV collider could also perform high precision measurements of Higgs properties [9], top quark properties, EW physics, and so on. The next generation of pp colliders would be complementary to the Higgs factories, whether CEPC, FCC-ee, or the ILC, where the mass generation mechanism of the standard model (SM), i.e. the couplings of the Higgs boson to the rest of the SM particles, can be precisely measured [10].
Among the various new physics candidates, the shape of the Higgs potential plays a very special role. As we know, the shape of the Higgs potential is determined by the Higgs fields and the Higgs self-couplings. New Higgs fields and new self-couplings are common in most extensions of the SM. Therefore, it is well-known [11][12][13][14][15][16] that probing Higgs self-couplings at colliders can offer us a way to understand the nature of Higgs bosons, to reconstruct the shape of the Higgs potential, and to understand the mystery of electroweak symmetry breaking. These couplings could play crucial parts in EW baryogenesis scenarios [17,18]. For example, they are crucial in determining whether CP violation is strong enough to produce a large enough matter anti-matter asymmetry in our universe, as required by one of the three Sakharov criteria. In the two Higgs doublet model, it is possible to introduce meaningful complex Higgs self-couplings to induce a large enough CP violation, which is needed for EW baryogenesis scenarios. These couplings are also important in determining whether a strong first-order phase transition could occur for a realistic EW baryogenesis scenario, and they can affect gravitational wave radiation in the process of bubble collisions [19][20][21], where the gravitational wave can induce a B mode detectable in the cosmic microwave background [22][23][24]. So there is strong motivation to explore the shape of the Higgs potential.
Compared to the Standard Model (SM), new physics could modify the effective trilinear (or cubic) couplings, the quartic couplings, or both, either by 10%−20% via loop corrections [25] or by 100%−300% or more via tree-level corrections (e.g. adding a dimension-6 operator [26][27][28] or many higher dimensional operators [29]). The Lorentz structure of the triple Higgs boson vertex can even be modified in the Higgs-Gravity model [30,31], which could lead to energetic Higgs bosons in the final states [32]. Using the discovered Higgs boson as a probe to measure its self-couplings could help us to further understand the nature of the Higgs boson and to extract information on the shape of the Higgs potential, which encodes electroweak symmetry breaking. Therefore, measuring trilinear and quartic couplings [33][34][35][36] will be of great importance and could be one of the prime targets for both the LHC high luminosity runs and future collider projects.
The study of di-Higgs boson final states in the SM and new physics models at hadron colliders has been a hot topic recently. Various production processes and final states have been explored in the literature, including bbγγ [32,37,38], bbbb [39][40][41], WW bb [42], WW γγ [43], bbττ [37], the rare decay final states 3ℓ + 2j [44] and others [45]. Both the ATLAS and CMS collaborations have performed realistic simulation and analysis of di-Higgs boson final states at the LHC [46,47].
In this work, we extend our study in Ref. [44] to the pure leptonic mode, i.e. pp → hh → 4ℓ + E̸_T, at a 100 TeV collider. To the best of our knowledge, this mode has not been carefully studied in the literature due to its tiny production rate in the SM at the collision energy of the LHC. But for some new physics models, the production rate of di-Higgs can be enhanced by a factor of 10 to 100, and then this mode could be accessible even at the LHC. At a 100 TeV collider, the production rate of this mode in the SM itself is large enough to be accessible. Meanwhile, since it gives pure leptonic final states, this mode can be searched for by experimental groups relatively easily. Therefore, it is meaningful to perform a careful analysis of this mode either for the LHC runs or for a future 100 TeV collider project.
In order to determine the Higgs self-couplings at future hadron colliders, a precision measurement of the top Yukawa coupling at the LHC and future colliders is crucial. The top quark Yukawa coupling plays a remarkable role in probing the properties of the Higgs boson. It is the strongest Yukawa coupling, and almost saturates the perturbativity bounds; it can affect the vacuum stability [48] much more seriously than any other Yukawa coupling in the SM; it determines multi-Higgs production at hadron colliders and affects the decay of the Higgs boson to gluon pair, di-photon and Zγ final states much more than the other fermions of the SM; and it affects the Higgs self-coupling measurements at hadron colliders, for both the trilinear and quartic couplings.
Therefore, in this work, to examine how the top quark Yukawa coupling can affect the measurement of the trilinear Higgs coupling, we take the following effective Lagrangian:

L ⊃ − (Y_t/√2) h t̄ (a + i b γ5) t − λ3 λ_SM v h^3,    (1)

where Y_t = √2 m_t/v is the Yukawa coupling of the top quark in the SM, and both a and b are dimensionless parameters. The parameter b is related to CP violation. In the SM, a = 1 and b = 0. In the two Higgs doublet model (2HDM) with no CP violation, a = cot β and b = 0. If there is CP violation in the 2HDM, the CP-even and CP-odd neutral scalars can mix, which leads to a non-vanishing b. Early efforts to probe this coupling at hadron colliders can be found in [49]. The study of the measurement of these couplings at linear colliders can be found in [50]. A recent study of how to measure these two free parameters at the LHC and at future hadron colliders can be found in [51,52], where a different but equivalent parametrisation was used. The theoretical calculation of loop corrections from tth can be found in [53]. A recent study on the CP properties of the 2HDM can be found in [54]. For a systematic analysis of the constraints from the Higgs precision measurements on either a or b, interested readers can refer to [55][56][57].
The term λ_SM = m_h^2/(2v^2) ≈ 0.13, while λ3 is a free dimensionless parameter. In various new physics models, this coupling can vary over a large range. For example, in the framework of an effective operator, a strong first-order electroweak phase transition has been explored in [28], where λ3 can lie in the range [5/3, 3]. In the model with a singlet added to the SM, the trilinear coupling could be larger than the SM value by 20% up to 200% [58], with the deviation depending on the mass of the singlet. Such an enhancement can also occur in the Minimal Dilaton model [59] and in the NMSSM [60].
There are three main aims of this work: 1) at the hadronic level, we explore the sensitivity of the pure leptonic mode pp → hh → 4ℓ + E̸_T at a 100 TeV collider; 2) we examine the detector effects on the sensitivity; 3) we examine the correlation between the direct measurement of tth and the direct measurement of λ3 at future colliders by using a numerical approach.
The new findings of this work are as follows. 1) By using our proposed reconstruction method, the pure leptonic mode of Higgs pair production is accessible at a 100 TeV collider, and the hadronic-level analysis indicates that it could have a rather high significance. 2) When detector effects are taken into account, the significance is reduced by a factor of 2 to 4 due to the loss of signal events, which means 4 to 16 times more data are needed in order to achieve the same significance as the hadronic-level analysis. One of the crucial points is to include soft leptons (say 10 GeV < p_T(ℓ) < 20 GeV) in the analysis, which can enhance the significance.
3) The top quark Yukawa coupling can introduce a large uncertainty in the determination of the Higgs trilinear coupling due to the fourth-power dependence of Higgs pair production on the top quark Yukawa coupling, and a 100 TeV collider can help us to determine the trilinear coupling to the 5% level from the uncertainty of tth. The coupling of the Higgs to weak vector bosons can also significantly affect the Higgs pair production rate, as in the processes pp → hh+X and e+e− → hh+X, and can thereby affect the determination of λ3.
The work is organised as follows. In Section 2, we study the cross section of the process gg → hh. In Section 3, we examine the characteristic features of the fully leptonic final states. In Section 4, we explore the sensitivity of two types of the fully leptonic final states gg → hh → 4ℓ + E̸_T at a 100 TeV collider. In Section 5, we investigate the detector effects on the sensitivity of the fully leptonic mode of Higgs pair production. In Section 6, we examine how the tth measurement can affect the determination of the trilinear Higgs coupling. In Section 7, by using the effective Lagrangian, we also study how the couplings of the Higgs boson to weak vector bosons and the trilinear Higgs couplings can affect the main production rates at both hadron colliders and electron-positron colliders. We end this work with a conclusion and some discussion. We provide an appendix describing the quasi-Monte Carlo method implemented in our code "wat", which has been used in this work to evaluate cross sections and to generate unweighted signal events.

Cross section of Higgs pair production at hadron colliders
Here we briefly describe the numerical methods that we used in this work.
To study the cross sections of Higgs pair production, we implemented the effective Lagrangian described in Eq. (1) as a new model file by modifying the one-loop SM model file in MadGraph5/aMC@NLO [61]. We then interface the loop matrix elements output by MadGraph5/aMC@NLO with our code "wat", based on quasi-Monte Carlo (QMC) integration, to perform the phase-space integration and event generation. The main algorithms of "wat" are presented in Appendix A. The generated signal events are further showered by PYTHIA6 [62]. The leptons, jets, and missing transverse energy in the final states are processed using the package FASTJET [63]. The signal events are then used to perform the physical analysis at the hadronic level. To study the detector effects, we use the fast detector simulation package DELPHES [64]. All background events are generated with MadGraph5 and processed in the same way as the signal events. For the multivariate analysis in this work, we call the subroutines of the TMVA package built into ROOT.

Below we describe in some detail how the effective Lagrangian in Eq. (1) is implemented. We add the parameters a, b, and λ3 and the corresponding tree-level vertices by following the UFO protocol given in [65]. We use MadGraph5 to compute the one-loop amplitudes, where the OPP method [66] has been implemented. Based on the unitarity cuts of one-loop scattering amplitudes [67], the OPP method is a systematic algorithm that reduces a one-loop amplitude to scalar integrals, schematically

A = ∫ d^4 q N(4, q, p1, ..., pn) / (D0 D1 ... Dn−1) = Σ_{i<j<k<l} d_ijkl D0^(ijkl) + Σ_{i<j<k} c_ijk C0^(ijk) + Σ_{i<j} b_ij B0^(ij) + Σ_i a_i A0^(i) + R,    (2)

where D_i = (q + p_i)^2 − m_i^2, D0, C0, B0 and A0 denote the scalar box, triangle, bubble and tadpole integrals, and N(4, q, p1, ..., pn) is the numerator of the loop integrand in 4 dimensions. The main idea of this method is to choose several values of the loop momentum q smartly and evaluate the numerator N numerically; the coefficients d_ijkl, c_ijk, b_ij, a_i can then be solved for easily. By construction, the method can only deal with a 4-dimensional numerator, as indicated in Eq. (2).
The difference between a D-dimensional numerator and a 4-dimensional numerator gives an extra rational term, R2, after the loop momentum integration is carried out:

R2 = 1/(2π)^4 ∫ d^D q̄ Ñ(q̃, q, ε) / (D̄0 D̄1 ... D̄n−1),    (3)

where Ñ collects the difference between the D-dimensional and 4-dimensional numerators. It has been proven that this rational term R2 is related to the ultraviolet behaviour of the amplitude [68,69], and hence needs to be calculated only once. All the R2 terms for the SM have been obtained in Refs. [70,71] and in Ref. [72]; furthermore, an automatic tool to obtain the R2 terms for any renormalizable model has been presented. For the effective Lagrangian in Eq. (1), the R2 terms related to Higgs pair production are the two-gluon effective vertices, which we implement together with the tree-level vertices in a UFO model file [65]; here a1, a2 are the colour indices, μ1, μ2 are the Lorentz indices, and g_s is the QCD coupling constant. We then interface the loop matrix element produced by MadGraph5/aMC@NLO [61] with our integration and event generation code to obtain the leading-order cross section and unweighted signal events.
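The quasi-Monte Carlo method used by "wat" is detailed in Appendix A. As a self-contained illustration of the basic idea only (a Halton low-discrepancy sequence in place of pseudo-random points; this is not the actual algorithm of "wat"), a unit-hypercube integral can be estimated as follows:

```python
def halton(i, base):
    """i-th element (i >= 1) of the van der Corput sequence in the given base."""
    f, r = 1.0, 0.0
    while i > 0:
        f /= base
        r += f * (i % base)
        i //= base
    return r

def qmc_integrate(f, dim, n):
    """Estimate the integral of f over [0,1]^dim with a Halton point set."""
    primes = [2, 3, 5, 7, 11, 13][:dim]
    total = 0.0
    for i in range(1, n + 1):
        total += f([halton(i, b) for b in primes])
    return total / n

# Toy integrand x*y over the unit square; the exact integral is 1/4
estimate = qmc_integrate(lambda x: x[0] * x[1], 2, 4096)
```

For smooth integrands, low-discrepancy sequences typically converge faster than the 1/√N of plain Monte Carlo, which is the motivation for using QMC in the phase-space integration.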
At leading order, the cross section at hadron colliders can be parametrised as a function of the theoretical free parameters a, b, λ3 in the following form:

σ(a, b, λ3) = G1 a^4 + G2 a^2 b^2 + G3 b^4 + G4 a^2 λ3^2 + G5 b^2 λ3^2 + G6 a^3 λ3 + G7 a b^2 λ3.    (6)

It is noticed that this cross section is sensitive to the signs of a and λ3, respectively, but is insensitive to the sign of b. There are two types of diagrams contributing to the process gg → hh: box diagrams and triangle diagrams. The terms independent of λ3 come from the squared amplitudes of the box diagrams, the terms proportional to λ3^2 come from the squared amplitudes of the triangle diagrams, and the terms linear in λ3 come from the interference between the box and triangle diagrams.
We use a numerical approach to fit the coefficients G1−G7 for the LHC at 14 TeV and 33 TeV and for a 100 TeV collider. We generate more than 100 points in (a, b, λ3) space for each collision energy. The coefficients of the cross sections at 14 TeV, 33 TeV and 100 TeV are presented in Table 1. The triangle coefficients G4^100 and G5^100 are enhanced with respect to G4^14 and G5^14; this enhancement factor is smaller than that of the box diagrams due to the s-channel suppression for energetic gluon fluxes. The interference coefficients G6^100 and G7^100 are also 40 times larger than G6^14 and G7^14.
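Fitting the coefficients G1−G7 from the sampled cross sections is a linear least-squares problem. A minimal sketch of the procedure, assuming the seven-monomial form discussed in the text and using purely synthetic coefficients (not the values of Table 1):

```python
import numpy as np

# Monomial basis of the parametrisation:
# sigma = G1*a^4 + G2*a^2 b^2 + G3*b^4 + G4*a^2 l^2 + G5*b^2 l^2 + G6*a^3 l + G7*a b^2 l
def basis(a, b, l):
    return np.array([a**4, a**2 * b**2, b**4,
                     a**2 * l**2, b**2 * l**2,
                     a**3 * l, a * b**2 * l])

rng = np.random.default_rng(1)
G_true = np.array([36.8, 60.1, 263.4, 4.0, 3.3, -22.4, -18.0])  # toy values only

# >100 sample points in (a, b, lambda_3) space, as in the text
points = rng.uniform(-2.0, 2.0, size=(120, 3))
A = np.array([basis(*p) for p in points])
sigma = A @ G_true              # synthetic "cross sections"

# Solve the overdetermined linear system for the seven coefficients
G_fit, *_ = np.linalg.lstsq(A, sigma, rcond=None)
```

With more sample points than coefficients, the fit also provides a consistency check of the assumed functional form through the residuals.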
Another interesting observation is that the coefficient G3 is 7 times larger than the coefficient G1. When b is much smaller than 1, the contribution of the G3 term cannot be large; but if b is of order one, the contribution of G3 can be sizeable. In Refs. [57,73,74], more operators have been taken into account. We have compared our results with those presented there and found agreement.

Full leptonic modes of signal and background events
There are quite a few advantages of the fully leptonic mode gg → hh → 4ℓ + E̸_T at a 100 TeV collider. First, it is relatively efficient for experimental searches. The targeted objects in the final state are leptons and missing energy, which can be reconstructed efficiently by the sub-detector systems of LHC-like detectors. The lepton reconstruction efficiency for p_T > 5 GeV is more than 90%, and particle identification can be performed at the detector level. Second, the signal is relatively clean and robust against contamination from pileup and underlying events, since the primary collision vertices of the signal events can be reconstructed.
We use MadGraph5 to generate background events at a collision energy of √s = 100 TeV and use PYTHIA6 to perform showering and decay simulations. Background processes without Z bosons in the final states are generated using the on-shell approximation for the top quark, W boson and Higgs boson. For background processes with a single Z boson decaying into two leptons, such as the ttZ, ZW+W− and Zh background processes, we have included the effects of the off-shell Z and γ and their interference. For both signal and background, higher-order corrections are taken into account by normalizing the total cross section to the (N)NLO results, as indicated by the K factor (K = σ_(N)NLO/σ_LO) in Table 3. We adopt the NNLO result for the signal from Ref. [75], and the K factors for the processes hZ, ttZ, ZW+W−, tth, and hW+W− are obtained with MadGraph5/aMC@NLO [61] under the on-shell approximation. The K factors for tttt and ttW+W− at a 100 TeV collider are unknown, and we adopt the value K = 1.3, which is the K factor for tttt at the 14 TeV LHC. The K factor for the process pp → W+W−W+W− is also unknown; we use the K factor for ZW+W−, since both have the same initial states at hadron colliders.
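The K-factor normalisation amounts to rescaling per-event weights so that the sample integrates to K·σ_LO; a minimal sketch (function name and numbers are illustrative, not taken from our analysis chain):

```python
def normalise_weights(weights, sigma_lo, k_factor):
    """Rescale event weights so their sum equals K * sigma_LO,
    i.e. the (N)NLO-normalised total cross section."""
    target = k_factor * sigma_lo
    total = sum(weights)
    return [w * target / total for w in weights]

# 1000 unit-weight events, a toy 10 fb LO cross section, K = 1.3
weights = normalise_weights([1.0] * 1000, 10.0, 1.3)
```

This keeps the kinematic shapes at leading order while reproducing the higher-order total rate.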

The background processes can be roughly divided into the following three types: • 1) Single Z processes, including pp → Zh, pp → ZW + W − , and pp → Ztt. The last process has a cross section around 35 times larger than the former two processes due to its QCD nature, while Zh and ZW + W − have similar cross sections.
• 2) Top pair processes, including pp → tth, pp → tttt, and pp → ttW + W − . We notice that htt is the dominant background in this category.
• 3) Pure electroweak processes, including pp → hW + W − and pp → W + W − W + W − . Each of their cross sections is smaller than that of the signal, but their sum is comparable to that of the signal after taking into account the K factors.
The cross sections of these processes are listed in Table 3. It is worth remarking that the single-Higgs associated processes in the categories above are the main backgrounds, and it turns out that tth is the dominant background for the fully leptonic mode, which can greatly affect the significance.
In order to select the most relevant events, we introduce the following preselection cuts: • For each event, there must be four leptons. The leading two leptons should have transverse momenta larger than 30 GeV and 15 GeV, respectively. The third and fourth leptons should have transverse momenta larger than 10 GeV and 8 GeV respectively. In Fig. 1, we show the transverse momenta of the four leptons in the signal events.
• We demand that the future detector have a good coverage of η(ℓ), namely |η(ℓ)| < 4 for identified leptons. Owing to the good spatial resolution and fine granularity of the ECAL detector, we require the minimal angular separation between two leptons to be ΔR_min(ℓ,ℓ) ≥ 0.15. In Fig. 2, we show the distributions of the maximal pseudorapidity η_max(ℓ) and the minimal angular separation of any lepton pair, ΔR_min(ℓ,ℓ). When these two cuts on leptons are applied, more than 90% of signal events are accepted. We will discuss how the η coverage can affect our results in the discussion section.
• Considering the fact that tt processes contribute substantially to the background, we introduce a "b-jet veto" to reject events with a tagged b-jet with p_T(b) > 40 GeV and |η(b)| < 5.
We assume that the b-tagging efficiency is 60% and simply multiply these types of background processes by a factor of 0.16, since each of the two b jets escapes tagging with probability 1 − 0.6 = 0.4, giving 0.4 × 0.4 = 0.16 per event.
• For the decay mode M3, in order to suppress background events from meson decays and Z boson decays, we demand that the invariant mass of each lepton pair fall into one of two windows: either 15 GeV ≤ m_ℓ+ℓ− ≤ 80 GeV or m_ℓ+ℓ− ≥ 100 GeV (a low-energy hadron veto and a Z mass veto, respectively). These preselection cuts are summarised in Table 4.
From the results given in Table 3, the background from pp → ZZ is huge, about 2000 times larger than the rest. Furthermore, due to the off-shell Z and γ*, the cut on the invariant masses of the two lepton pairs can suppress this background by a factor of 100 at most. Our previous experience in studying the 3ℓ + 2j mode [44] reveals that background events from Z/γ*-exchange processes are difficult to suppress. Therefore, the first two modes are challenging in the SM and will be neglected in this analysis. When Higgs pair production is significantly enhanced by new physics, these two modes should be reconsidered.
In this work, we will focus on the third and fourth modes (labelled "M3" and "M4", respectively) and perform a detailed analysis of them. Although the fourth mode has a smaller production rate than the third, it suffers from fewer background contributions in the SM.

Kinematic features of signal and background events in the M3 and M4 cases
In this section, we explore the kinematic features of the signal and background events. Due to the 4 neutrinos in the final state, it is impossible to fully reconstruct all final-state particles.
The physical observables can be divided into two types. The first type consists of global and topological event-shape observables for each event; the second type is obtained from the partial reconstruction method introduced later. The first type includes the following observables.
• o1) The missing transverse momentum spectrum. Since there are 4 neutrinos in the final state, we expect a large missing transverse momentum. For the signal, as shown in Fig. 3(a), the distribution peaks near 60-80 GeV.
• o2) The invariant mass of the four leptons. This quantity is expected to capture the mass scale of the mother particles. For the signal, as shown in Fig. 3(b), the distribution peaks near 100-150 GeV. Since the four neutrinos take away about half the energy of the Higgs pair, this quantity is expected to be close to the mass of one Higgs boson.
• o3) The transverse mass of each event, constructed from the sum of the lepton 4-momenta (denoted P_4ℓ, with components (E_4ℓ, P^x_4ℓ, P^y_4ℓ, P^z_4ℓ)) and the missing transverse momentum E̸_T, with components (E̸_x, E̸_y). To construct this observable, we boost the 4-momentum of the leptons such that P^z_4ℓ = 0, i.e. (Ẽ_4ℓ, P^x_4ℓ, P^y_4ℓ, 0). Then we construct the observable as

m_T^2 = m_4ℓ^2 + 2 ( Ẽ_4ℓ |E̸_T| − |P̃_T,4ℓ| |E̸_T| cos φ ),

where φ denotes the azimuthal angle between P̃_4ℓ and E̸_T. For the signal, as shown in Fig. 3(c), the distribution peaks near 150-250 GeV.
• o4) The invariant mass of the visible objects, defined from the momentum sum of the four leptons and the jets. For the signal, as demonstrated in Fig. 3(d), the distribution peaks in the region 100-300 GeV.
• o5) The number of jets in each event with p_T(j) > 40 GeV. As demonstrated in Fig. 3(e), we notice that when we demand n_j ≤ 2, around 90% of signal events are retained, though a considerable number of background events from tt processes remain.
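The observables o1)−o3) can be computed directly from the lepton four-momenta and the missing transverse momentum; a minimal sketch of the construction (input conventions are ours, with four-vectors as (E, px, py, pz)):

```python
import math

def event_observables(leptons, met):
    """leptons: list of (E, px, py, pz); met: (metx, mety).
    Returns (missing E_T, four-lepton invariant mass, transverse mass o3)."""
    E, px, py, pz = (sum(l[k] for l in leptons) for k in range(4))
    met_t = math.hypot(met[0], met[1])
    m4l_sq = max(E * E - px * px - py * py - pz * pz, 0.0)
    # boost the four-lepton system to P_z = 0: Etilde^2 = m_4l^2 + p_T^2
    e_tilde = math.sqrt(m4l_sq + px * px + py * py)
    # m_T^2 = m_4l^2 + 2 (Etilde |MET| - pT_4l . MET)
    mt_sq = m4l_sq + 2.0 * (e_tilde * met_t - px * met[0] - py * met[1])
    return met_t, math.sqrt(m4l_sq), math.sqrt(max(mt_sq, 0.0))
```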
Below we explain how to construct the second type of observables.
Obviously, the most important information about the signal events is the mass of the Higgs boson, so it is crucial to extract an observable related to it. Since the four neutrinos cannot be reconstructed, we can only determine the visible mass of each Higgs boson from the identified leptons.
In determining the visible mass of each Higgs boson, we encounter a minor combinatorial issue: there are two possible combinations in each event. To determine which one is correct, we follow the minimal mass method introduced in [44], which yields the correct combination in up to 94% of events here, as we have checked using the parton-level data sample. The method evaluates, for each combination, the sum of the two visible Higgs boson masses, and picks the combination with the smaller sum. For comparison, we also examined a method based on the angular separation of the leptons, which yields a correct assignment in at most 85% of events.
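The minimal mass method can be sketched as follows; for simplicity this toy version scans all three pairings of four leptons, whereas in practice charge and flavour constraints leave only the two combinations mentioned above:

```python
import math

def inv_mass(p1, p2):
    """Invariant mass of the sum of two four-vectors (E, px, py, pz)."""
    E, px, py, pz = (p1[k] + p2[k] for k in range(4))
    return math.sqrt(max(E * E - px * px - py * py - pz * pz, 0.0))

def minimal_mass_pairing(leptons):
    """Return the pairing of four leptons into two dilepton systems
    whose summed invariant mass is smallest."""
    l0, l1, l2, l3 = leptons
    pairings = [((l0, l1), (l2, l3)),
                ((l0, l2), (l1, l3)),
                ((l0, l3), (l1, l2))]
    return min(pairings, key=lambda p: inv_mass(*p[0]) + inv_mass(*p[1]))
```

For signal events, each correct dilepton pair comes from one h → WW* → ℓνℓν decay, so the correct pairing tends to have the smaller summed visible mass.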
Having found the visible masses of the Higgs bosons in the signal, we use the standard M_T2 method to split the missing transverse momentum into two parts and exploit the kinematics of pair production to reconstruct the transverse mass of each Higgs boson. We can then construct the second type of observables, listed below.
• o6) The observable Δm, defined as the difference between the summed reconstructed Higgs boson masses of the two possible combinations. The distribution, shown in Fig. 3(f), exhibits a large shape difference between the signal and background processes.
• o7) The first reconstructed partial mass of the Higgs boson, built from the two leptons of the same flavour in the M3 case, labelled h1. In the M4 case, the candidate containing the hardest lepton is labelled h1. The distribution of this quantity is shown in Fig. 3(g) and Fig. 5(g). In Fig. 3(g), the effect of the Z mass window cut is clearly visible for the single Z processes.
• o8) The second reconstructed partial mass of the Higgs boson, built from the two leptons of different flavour in the M3 case, labelled h2. In the M4 case, the candidate not containing the hardest lepton is labelled h2. The distribution of this quantity is shown in Fig. 3(h) and Fig. 5(h).
• o9) The transverse mass of the Higgs bosons reconstructed by using the M T2 method, which is shown in Fig. 4(a) and Fig. 6(a).
All of these reconstructed observables are crucial for both the M3 and M4 cases. For signal events, there exist strong correlations among these observables, since they are all related to the mass of the Higgs boson, while for the background they are not. This fact can be exploited to separate signal from background, which is the spirit of multivariate analysis methods.
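The M_T2 construction above can be illustrated with a brute-force numerical evaluation: minimise, over all splittings of the missing transverse momentum into two hypothetical invisible momenta, the larger of the two transverse masses. This coarse grid scan is for illustration only; dedicated M_T2 algorithms are far more efficient:

```python
import math

def transverse_mass(vis, q):
    """Transverse mass of a visible system vis = (m, px, py) paired with
    a massless invisible system of transverse momentum q = (qx, qy)."""
    m, px, py = vis
    et = math.sqrt(m * m + px * px + py * py)
    qt = math.hypot(q[0], q[1])
    return math.sqrt(max(m * m + 2.0 * (et * qt - px * q[0] - py * q[1]), 0.0))

def mt2(vis1, vis2, met, n=201, scale=200.0):
    """Brute-force MT2: scan splittings met = q1 + q2 on a grid and
    minimise max(mT(vis1, q1), mT(vis2, q2))."""
    best = float("inf")
    for i in range(n):
        for j in range(n):
            q1 = (-scale + 2.0 * scale * i / (n - 1),
                  -scale + 2.0 * scale * j / (n - 1))
            q2 = (met[0] - q1[0], met[1] - q1[1])
            best = min(best, max(transverse_mass(vis1, q1),
                                 transverse_mass(vis2, q2)))
    return best
```

For a symmetric test event with no genuine missing momentum, the minimum is reached at q1 = q2 = 0 and M_T2 reduces to the visible mass.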

The M3 case
In the M3 case, the dominant background processes are the single Z associated processes, as clearly demonstrated in Table 5 for the cut-based analysis. From Fig. 3(g), it is clear that the mass window cut can strongly suppress the single Z processes.
In the cut-based analysis, we choose the following 4 cuts: 1) a cut on the number of jets, n_j ≤ 2, which suppresses background processes associated with a top quark pair; 2) a cut on the variable m_T2, demanding m_T2 < 110 GeV, which greatly suppresses three-body and four-body production backgrounds; 3) a cut on the reconstructed visible Higgs boson masses, m(h^vis_1,2) ≤ 60 GeV, which suppresses background events from the Zh and htt processes; 4) a cut on the mass difference, Δm > 50 GeV. With these cuts, we achieve an S/B of 0.25 and a significance of 4.2.
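The four cuts can be expressed as a boolean selection mask over arrays of per-event observables; a minimal sketch (array names are illustrative, and the S/√(S+B) significance definition below is our assumption, not necessarily the one used for Table 5):

```python
import numpy as np

def m3_selection(ev):
    """ev: dict of equal-length arrays of per-event observables.
    Returns the boolean mask of the four M3 cuts."""
    return ((ev["n_jets"] <= 2)
            & (ev["m_t2"] < 110.0)
            & (ev["m_h1"] <= 60.0) & (ev["m_h2"] <= 60.0)
            & (ev["delta_m"] > 50.0))

def significance(s, b):
    """A simple S / sqrt(S + B) significance estimate."""
    return s / np.sqrt(s + b)

# three toy events: only the first passes all four cuts
ev = {"n_jets":  np.array([2, 3, 1]),
      "m_t2":    np.array([100.0, 100.0, 120.0]),
      "m_h1":    np.array([50.0, 50.0, 50.0]),
      "m_h2":    np.array([40.0, 40.0, 40.0]),
      "delta_m": np.array([60.0, 60.0, 60.0])}
mask = m3_selection(ev)
```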

According to the results of the cut-based method, the processes associated with a top quark pair are the main background for the "M3" mode. In order to suppress this type of background, we introduce a few top-veto observables in our BDT analysis. For example, the visible invariant mass of the whole event, as demonstrated in Fig. 3(d), can help to separate background events with and without a top quark pair. Furthermore, the number of jets in each event can help to suppress background events with a top quark pair, as demonstrated in Fig. 3(e). These two observables are correlated with each other for the background processes.
In Table 5, both a cut-based analysis and a BDT analysis are presented for comparison. As demonstrated in Fig. 4(b), after taking these top-rejection observables into account, we can further suppress the background and improve the S/B and significance to 0.38 and 6.1, respectively, as shown in Table 5. As discussed in [76], if an integrated luminosity of 20-30 ab−1 can be achieved at a 100 TeV collider, a significance of 20.0 is expected.

The M4 case
In the M4 case, similar to the M3 case, we can reconstruct the visible Higgs boson masses h1 and h2, where the subscripts 1 and 2 are assigned according to a simple rule: the candidate which includes the most energetic lepton is h1 and the other is h2. These two masses peak near 30-50 GeV and 20-40 GeV, respectively, and both have an edge near 60 GeV.
In contrast to the M3 case, the main task in the M4 mode is to suppress the background from the htt final state, while the single Z processes are negligible and are omitted in Table 6. The htt final state is the dominant background; our reconstruction method successfully reconstructs the Higgs boson in htt events as well, as demonstrated in Fig. 5(h), where the shapes of background and signal look similar. Therefore, for the third cut we impose a cut on the visible mass m(h^vis_1) < 60 GeV only, instead of on both masses as in the M3 case. A cut-based analysis and a BDT analysis are presented in Table 6. As demonstrated in Fig. 6(b), after taking into account the observables that reject top quark final states, we can further suppress the main background and reach an S/B and significance of 1.6 and 6.8, respectively, as shown in Table 6. After using the same methods to veto the top pair associated background in the MVA analysis, a better result is obtained: the S/B and significance reach 1.9 and 9.2, respectively. If an integrated luminosity of 20-30 ab−1 is assumed, a significance of 30.0 is expected.
Comparing the results given in Table 5 and Table 6, we notice that the S/B of the M4 case is much better than that of the M3 case, owing to the absence of the huge background processes present in the M3 case. Furthermore, the analysis for the M4 case is simpler than for the M3 case, since the background from single Z processes is efficiently suppressed.

Detector effects
Here we examine how the detector effects change our results given in Tables 5 and 6. We focus on the results of the BDT analysis.
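When comparing setups below, a drop in significance is translated into extra integrated luminosity through the statistical scaling Z ∝ √L; a one-line sketch of this rule:

```python
def luminosity_for_target(lumi0, z0, z_target):
    """Integrated luminosity needed to reach significance z_target,
    given significance z0 at luminosity lumi0 (Z scales like sqrt(L))."""
    return lumi0 * (z_target / z0) ** 2

# a factor-2 loss in significance costs a factor 4 in luminosity
needed = luminosity_for_target(3000.0, 3.0, 6.0)  # -> 12000.0 fb^-1
```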
The key issue at the detector level is lepton acceptance. We show the acceptance efficiencies for the signal events in Table 7. If the η coverage for leptons reaches 4, then 90% of the signal events can be accepted; if it only reaches 2, less than half of the signal events are accepted. A detector covering a larger η region (say |η(ℓ)| < 4) is therefore favourable for Higgs pair signals. We examine the detector effects by comparing seven detector setups: 1) an ATLAS-like detector; 2) an ATLAS-like detector with a larger η(ℓ) coverage, up to 4; 3) an ATLAS-like detector with η(ℓ) < 4 and η(b) < 4; 4) an ATLAS-like detector with η(ℓ) < 4, η(b) < 4, and a smaller ΔR for lepton isolation; 5) the default FCC detector provided in DELPHES version 3.4.2; 6) an improved FCC detector with a lower jet p_T threshold; 7) an improved FCC detector with a further lowered lepton p_T threshold. The key parameters of these detectors are tabulated in Table 8; the first two are determined at the hardware level, while the remaining ones can be implemented at the trigger and analysis levels. In these detector setups, apart from the spatial coverage, we assume that all sub-detectors of the ATLAS-like setups have the same performance; likewise, all sub-detectors of the FCC-like setups are assumed to have the same performance.
The results for M3 and M4 fully leptonic modes are shown in Table 9. A few comments on the results are in order.
1) For the ATLAS-like detector, due to the small η_max, around half of the signal events are lost, as can be seen from Table 7. The requirement of a larger ΔR(ℓ,ℓ) causes a further loss of signal, owing to the spin correlation of the two leptons from a Higgs decay and the moderate boost of the Higgs boson. The lepton reconstruction efficiency causes further signal loss. Although background events also suffer losses, a worse significance results. Compared with the hadronic-level analysis, the significance is reduced by a factor of 3.6 for the M3 mode and 4.2 for the M4 mode, which means an order of magnitude more integrated luminosity is needed in order to reach the same significance.
2) The improved ATLAS-like detectors offer a larger η_max coverage, better b-jet rejection, and better ∆R(ℓℓ) resolution. With the detector setup A3, the significance can reach 2.9 for the M3 mode and 3.8 for the M4 mode, assuming an integrated luminosity of 3000 fb⁻¹. Compared with the setup A2, the main gain comes from lowering the minimal Pt(ℓ), which almost doubles the number of signal events.
3) The FCC default detector setup has a larger η_max coverage, but due to a larger Pt(ℓ) requirement, fewer signal events are selected. Meanwhile, due to a larger Pt(b) threshold, the rejection of tt-type background events via b jets is worse. These two facts lead to a large loss in significance compared with the hadron-level analysis.
4) The improved FCC detectors mainly concern Pt(b) and Pt(ℓ). The significance is slightly better with the improved Pt(b). But, as the F2 detector setup shows, Pt(ℓ) is more crucial for our signal: a smaller Pt(ℓ) threshold (Pt(ℓ) = 10 GeV) can double the number of signal events. It is also noticed that the F2 detector setup can be slightly better than the A3 detector setup. Compared with the hadron-level analysis, the significance drops by at least a factor of 2, which means 4 times larger integrated luminosity is needed to reach the same significance.

Correlation between the processes gg → hh and pp → tth

Events from the tth final state are the dominant background for the M4 case, which is also true for the pp → hh → 3ℓ + 2j + MET mode explored in [44] at hadron colliders. Therefore, the measurement of the top Yukawa coupling can significantly affect the detection of Higgs pair production. Below, we examine how the measurement of the tth coupling can affect the determination of λ3. The correlation between the measurement of tth and the measurement of Higgs pair production can be investigated through the cross section of gg → hh given in Eq. (6). This issue has not been addressed in the literature.
Before showing more results, we would like to explain the sensitive regions shown in Figs. 7(a)-9(b). The sensitive regions in these figures are obtained from the cross section of gg → hh at hadron colliders; the lines separating the sensitive and insensitive regions are determined by a cross section of around 150 fb, for both the LHC and the 100 TeV cases. For example, in Fig. 7(a), in the region with large a (say a > 2) and large λ3 (say λ3 > 6), the cross sections can be larger than 150 fb, so this region can be probed even at the LHC. In contrast, in Fig. 7(b), in the region with small a (say a < 0.2) and λ3 around 2, the cross sections are smaller than 150 fb, so this region is difficult to probe even at a 100 TeV collider. All these features are captured by the fitted cross section given in Eq. (6).
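The way these boundary lines are drawn can be sketched numerically. The snippet below scans the (a, λ3) plane and flags points where a fitted parametrisation of σ(gg → hh) exceeds the roughly 150 fb threshold. Since Eq. (6) is not reproduced here, the polynomial form and the coefficients `c` below are placeholder assumptions for illustration, not the fitted values of the text.

```python
import numpy as np

# Sketch of how the sensitive region is delimited, assuming a fitted
# parametrisation of sigma(gg->hh) of a schematic polynomial form.
# The coefficients c are PLACEHOLDERS, not the fitted values of Eq. (6).
def sigma_hh_fb(a, lam3, c=(30.0, -20.0, 5.0)):
    """Toy parametrisation: sigma = c0*a^4 + c1*a^3*lam3 + c2*a^2*lam3^2 (fb)."""
    c0, c1, c2 = c
    return c0 * a**4 + c1 * a**3 * lam3 + c2 * a**2 * lam3**2

# Mark points of the (a, lam3) plane as "sensitive" when the rate
# exceeds the ~150 fb threshold quoted in the text.
a_grid = np.linspace(0.0, 3.0, 61)
lam3_grid = np.linspace(-6.0, 8.0, 141)
A, L = np.meshgrid(a_grid, lam3_grid)
sensitive = sigma_hh_fb(A, L) > 150.0
```

With these placeholder coefficients, the qualitative pattern of the text is reproduced: large a and large λ3 fall in the sensitive region, while small a with λ3 near 2 does not.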
In Fig. 7, we demonstrate the correlation between the determination of tth at the LHC and at a future 100 TeV collider, where the bounds are estimated from the measurement of the 3ℓ + 2j + MET mode. These bounds have not been optimised for each value of a and λ3; we simply use the cross-section information in this projection. When detector effects are taken into account, somewhat worse bounds are expected; conversely, as demonstrated in [44], better bounds can be expected when the discrimination of signal and background is optimised.
For the LHC with a luminosity of 3 ab⁻¹, we assume that the tth coupling can be determined to 20% (a 10% precision is estimated by using boosted techniques for tth with h → bb [77,78]). We deliberately take a larger value, since only statistics are taken into account in that estimate. This assumption is denoted by the two solid lines in Fig. 7(a), giving the upper and lower bounds from the tth measurement.
For a 100 TeV collider with a luminosity of 3 ab⁻¹, we assume that a 5% precision can be achieved, denoted by the two solid lines in Fig. 7(b) as the upper and lower bounds from the tth measurement. In Ref. [79], where the production ratio tth/ttZ and boosted Higgs bosons and top quarks are used, it is argued that this coupling can be measured to a precision of 1% or so when only statistics are considered. In reality, more background processes and detector effects must be considered for each tt decay final state, so we take 5% as a relatively conservative estimate.
Comparing Fig. 7(a) and Fig. 7(b), it is noticed that a 100 TeV collider can greatly shrink the uncertainty in determining the λ3 and a parameters. Due to the a⁴ dependence of the cross section σ(pp → hh), a 5% uncertainty δa induces an uncertainty in λ3 of up to 20% or so. If the coupling a can be determined to a precision of 1%, that will be crucial to pinpoint the value of λ3 down to 5%. It is worth mentioning that the two-fold ambiguity in a with the same cross section can be removed by checking the differential distributions of leptons in the final state, as demonstrated in [44]; this method is robust against the contamination of underlying events and pileup effects.

In the effective Lagrangian given in Eq. (1), the parameter b is related to the strength of CP violation. We explore the determination of the b parameter from Higgs pair production in Fig. 8(a). From Fig. 8(b), the potential of a 100 TeV collider to probe the parameter space spanned by b and λ3 is clearly better than that of the LHC.
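The a⁴ scaling argument quoted above can be checked with a one-line linearised propagation; the function name and the numbers are purely illustrative.

```python
# Check of the a^4 scaling argument: for a rate sigma(pp->hh) ~ a^4 * f(lam3),
# a small relative uncertainty da/a on the coupling a propagates linearly into
# a ~4*(da/a) relative uncertainty on the rate, which must then be absorbed
# by the lam3 determination. Illustrative sketch only.
def rate_uncertainty_from_a(delta_a_rel: float) -> float:
    """Linearised propagation: d(sigma)/sigma = 4 * (da/a) for sigma ~ a^4."""
    return 4.0 * delta_a_rel

# delta_a = 5% -> ~20% shift in the hh rate (hence ~20% on lam3, as in the text)
# delta_a = 1% -> ~4% shift, allowing lam3 to be pinned down to the ~5% level
```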
Higgs pair production at a 100 TeV collider can bound b down to 0.4; in contrast, the LHC runs can only constrain this parameter to 1.2 or so. In quoting these numbers, we have not taken into account the uncertainty in a. Obviously, the direct measurement from tth could impose a better constraint on the value of b, both at the LHC and at a 100 TeV collider.
The correlation of a and b in Higgs pair production is shown in Fig. 9(a), where we also plot the dependence of the tth cross section on a and b. From Fig. 9(b), it is noticed that the LHC can be sensitive to the region where a < 0 and b is sizeable, which corresponds to a large production rate of gg → hh due to constructive interference. To include the bounds from the tth measurement in Fig. 9(a) and Fig. 9(b), we have parametrised the cross section of tth at the LHC 14 TeV and at a 100 TeV collider as

σ(pp → tth) = t1 a² + t2 b²,
where the values of t1 and t2 for 14 TeV and 100 TeV are obtained by a fit and are provided in Table 10. Here the NLO correction has been taken into account. For a 100 TeV collider, there is a 3-fold ambiguity when we combine the measurements of tth and hh. To remove this ambiguity, the differential distributions of the tth final state should be carefully analysed, as demonstrated in [51], where quite a few differential observables are proposed.
It is interesting to observe that both single Higgs production and Higgs pair production can indirectly help to determine the interaction between the top quark and the Higgs boson at a 100 TeV collider. Higgs pair production can also help to disentangle the nature of ultraviolet contributions to the Higgs coupling to two gluons [80].
It is also interesting to study the correlations between the cross sections σ(pp → tth) and σ(pp → hh) when the uncertainty from parton distribution functions (PDFs) is considered. We use the CT10 [81] dataset to examine the PDF uncertainty. A recent analysis of the CT14 NNLO PDF can be found in [82].
We consider the correlations between the processes gg → hh and pp → tth in three cases: the first case is (a, b, λ3) = (1, 0, 1), the second case is (a, b, λ3) = (1, 0, −1), and the third case is (a, b, λ3) = (0, 1, 1). In these three cases, the central values of the cross sections of gg → hh and pp → tth are so different that the PDF uncertainty cannot lead to an overlap among them. In order to examine the correlation caused by the PDF uncertainty, we normalise the cross sections to the central values in each case and plot the 90% confidence level contours in Fig. 10.
In the first case, the PDF uncertainty at the LHC 14 TeV leads to at most a 5% uncertainty in the cross section of pp → tth and 4% or so in that of gg → hh. The uncertainty of gg → hh is almost the same at 100 TeV, while that of pp → tth shrinks to 3%.
At the LHC 14 TeV, comparing the second case with the first, we notice that the uncertainty of pp → tth is enhanced because the cross section becomes smaller. The correlation angle of the second case differs from that of the first, because the form factors proportional to b⁴ and b² are different from those proportional to a⁴, a³, and a². When the collision energy is 100 TeV, the correlation becomes stronger.
In the third case, the uncertainty of pp → tth is smaller after normalisation compared with the first case. Meanwhile, the correlation becomes weaker due to the changes in interference and form factors (the contour of the third case is broader than that of the first case) in both the 14 TeV and 100 TeV collisions.

To determine a and b, it might also be useful to consider the process pp → tthh. Here we examine the cross section of the process tthh, which can be parametrised similarly to Eq. (6), but with a different set of free parameters. The values of these free parameters are evaluated numerically and are tabulated in Table 11. All these parameters are positive, and interference can occur when a and λ3 have opposite signs. The interference terms proportional to T4 and T5 have larger enhancement factors than the other terms. The feasibility of this process at hadron colliders has been studied in [83,84]. Table 11. Fitting coefficients of pp → tthh for the LHC 14 TeV and a 100 TeV collider. R33 (R100) is defined as

Discussion and conclusions
In this work, we examined the discovery potential of the full leptonic final state of Higgs pair production at a 100 TeV collider and proposed observables in order to distinguish signal and background events. We also examined the correlation between the experimental bounds on the Yukawa couplings a and b and the bounds on λ 3 at both the LHC and a 100 TeV collider and found that a 100 TeV collider can offer us a powerful way to probe these parameters.
Due to the small production rate of the fully leptonic final state in the SM, it is challenging to discover Higgs pair production via the two modes explored in this work. But if there is new physics that enhances Higgs pair production by one or two orders of magnitude, as at some benchmark points allowed in the 2HDM [85], these modes could be considered. It is obvious that if we can combine all final states of Higgs pair production, like bbγγ, WW*γγ, bbττ, etc., we can achieve a better bound on the triple Higgs coupling λ3.
To examine the detector effects, we compared seven fast-simulation detector setups. It was noticed that, compared with the hadron-level analysis, in order to achieve the same significance, different detector setups might require datasets larger by a factor of 4 to 10. One crucial observation is that if, at the data analysis level, we could include the less energetic leptons (say 10 GeV < Pt(ℓ) < 20 GeV) without introducing additional background, the significance might be improved by a factor of 2.
It is interesting to compare with the precision on λ3 that can be achieved at a Higgs factory or the ILC. According to theoretical calculations, the CEPC Higgs factory can determine the ZZh vertex to a precision of about 0.3%, which can be turned into a precision on the trilinear Higgs coupling of δλ3 ∼ 30%-40% [86]. Such an uncertainty might be reduced if the tth vertex can be measured precisely [87], especially when the collision energy is higher (say above 240 GeV). The ILC can measure the cubic Higgs coupling to a precision of 20%-40% via e+e− → Zhh at a collision energy √s = 500 GeV in the H20 operation mode [88][89][90][91]. When the collision energy is 1 TeV, the VBF processes e+e− → hhX can be more sensitive to λ3 than the process e+e− → Zhh, reaching a precision of 10%-13% with an assumed integrated luminosity of 4 ab⁻¹ [88][89][90][91]. At a 100 TeV collider with an assumed integrated luminosity of 30 ab⁻¹, precisions of 5%, 10%, and 5% could be achieved from the hh → bbγγ mode [32], the hh → 3ℓ + 2j + MET mode [44], and the hh → 4ℓ + MET mode (especially the M4 mode in this work), respectively. When the luminosity is assumed to be 3 ab⁻¹, the corresponding precisions are 15%, 20%, and 15%, respectively. Obviously, the combination of all these modes can further improve the precision.
We have omitted the background events in which leptons from b decays fake prompt leptons, mainly from tt → 2b2ℓ + MET, whose cross section is huge. We used 16 million events to estimate the suppression factor; after using the F2 detector setup and applying the preselection cuts, we found this factor to be 10⁻⁷ or so, which makes the cross section of this type of background the same order of magnitude as that of our signal. However, in our fast detector simulation, we could not use the information of secondary vertices. In realistic data analysis, the information of secondary vertices can efficiently reject leptons from b decays. Furthermore, when the reconstructed observables are used, it is noticed that all these background events in our data sample can be safely suppressed. Due to our limited computing resources, we could not estimate the fluctuations in our data sample, but we believe that this type of background is controllable.
We have checked the detector effects on the missing transverse energy distributions for both signal and background events and found that detector effects do not change our results significantly. However, processes with four leptons and forward jets, where a large missing transverse energy can arise from the forward jets, like pp → ZW+W−j, could contribute sizeably and increase the ZW+W− background by 30%. Other potentially harmful background events might arise from pileup collisions with leptons in the final state, where forward jets are missed due to the limited η coverage of the detector (say |η| < 6) and thus fake the missing transverse energy. Analysing such background events needs a more realistic and careful treatment of the pileup effects, the detector setup, etc., which can be done in future work but is beyond the scope of the current study.
We would like to thank Junping Tian, Fapeng Huang and Sichun Sun for useful discussions.
3. The average of these m integral estimates is finally taken as the estimate of the integral I(f); it can be proved to be unbiased [92].
4. An unbiased estimate of the mean-square error Qn,m(f) can also be obtained by

Qn,m(f) = 1/(m(m−1)) × Σ_{i=1}^{m} [I_i(f) − Ī(f)]²,

where I_i(f) is the estimate from the i-th shift and Ī(f) is their average.

This algorithm has been realised and validated in Feynman diagram evaluation by using the sector decomposition method [93]. For more details about QMC and this algorithm, such as the construction of the generating vector and the convergence rate, interested readers can refer to [92,94,95].
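The randomly shifted lattice rule described above can be sketched in a few lines: m random shifts give m estimates, their average is the unbiased estimate of I(f), and their spread gives the mean-square-error estimate. The generating vector `z` below is a simple Fibonacci-type choice for illustration, not an optimised one.

```python
import numpy as np

# Minimal sketch of a randomly shifted rank-1 lattice rule (QMC).
def shifted_lattice_estimate(f, z, n, m, d, rng):
    k = np.arange(n).reshape(-1, 1)        # lattice index
    base = (k * np.asarray(z)) / n         # rank-1 lattice points (before mod 1)
    estimates = []
    for _ in range(m):
        shift = rng.random(d)              # one random shift per replica
        pts = np.mod(base + shift, 1.0)    # shifted lattice, in [0,1)^d
        estimates.append(np.mean(f(pts)))
    estimates = np.asarray(estimates)
    I_hat = estimates.mean()               # unbiased estimate of I(f)
    # unbiased estimate of the mean-square error of I_hat
    Q = np.sum((estimates - I_hat) ** 2) / (m * (m - 1))
    return I_hat, Q

# Example: integrate f(x) = x1*x2 over [0,1)^2, exact value 1/4,
# using a Fibonacci lattice with n = 610 points and m = 8 shifts.
rng = np.random.default_rng(0)
I_hat, Q = shifted_lattice_estimate(
    lambda x: np.prod(x, axis=1), z=(1, 233), n=610, m=8, d=2, rng=rng)
```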
Besides performing numerical integration to compute the cross section, our code can also generate unweighted events. The algorithm for generating unweighted events is given below.
For a multi-dimensional non-negative function f(x), the most popular approach is the rejection algorithm, which proceeds as follows: 1. Construct a non-negative function g(x), called the sampling function.

2. Find a value M satisfying f(x) ≤ M g(x) for all x.
3. Generate a point x from the probability density function g(x)/∫g(x)dx, and a random number r uniformly distributed in [0, 1). If r < f(x)/(M g(x)), this point is accepted; otherwise it is rejected. This step is repeated until the desired number of points is obtained.
The efficiency of this algorithm, defined as the ratio of the number of accepted points to the total number of points generated in the last step, is determined by the sampling function g(x). This sampling function can also be used in MC integration via importance sampling.
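A minimal sketch of the rejection algorithm above, with the simplest choice g(x) = 1 (uniform), might look as follows; the function name and interface are ours, not those of the code described in the text.

```python
import numpy as np

# Rejection sampling with a uniform sampling function g(x) = 1 on [0,1)^d.
def rejection_sample(f, d, M, n_target, rng):
    accepted = []
    while len(accepted) < n_target:
        x = rng.random(d)          # step 3: draw from g (uniform here)
        r = rng.random()           # uniform random number in [0, 1)
        if r < f(x) / M:           # accept with probability f(x)/(M*g(x))
            accepted.append(x)
    return np.asarray(accepted)

# Example: sample a 1-d density proportional to f(x) = 2x on [0,1),
# bounded by M = 2 (so f(x) <= M*g(x) everywhere); the mean should be ~2/3.
rng = np.random.default_rng(1)
pts = rejection_sample(lambda x: 2.0 * x[0], d=1, M=2.0, n_target=2000, rng=rng)
```

With g = 1, the average acceptance rate is ∫f / M, which is why a sampling function shaped like the integrand is crucial for efficiency.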
The most widely used algorithm for constructing the sampling function in high energy physics is the VEGAS algorithm proposed by Lepage [96], which assumes a factorisable structure

g(x) = g1(x1) g2(x2) · · · gd(xd),

where the gi(xi) are step functions. This algorithm starts with g(x) = 1, samples a number of points based on g(x), and optimises g(x) based on the sampled results. The sampling and optimisation procedure can be iterated several times to further optimise g(x). Although such a g(x) allows us to deal with integrands with large peaks, a general integrand is not factorisable, and the efficiency of this algorithm is limited by this. Another disadvantage of this algorithm is that the optimisation is based on information gained via MC sampling, which suffers from fluctuations.
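A toy version of such a factorised sampling function can be sketched as follows: each gi is a step function built from a binned projection of f, and g is their product. This is a one-iteration illustration of the sample-then-optimise idea, not the full VEGAS grid refinement; all names are ours.

```python
import numpy as np

# Build a VEGAS-like factorised sampling function g(x) = prod_i g_i(x_i),
# where each g_i is a step function obtained from a binned projection of f.
def build_factorised_g(f, d, n_bins, n_samples, rng):
    pts = rng.random((n_samples, d))
    weights = f(pts)
    g_bins = []
    for i in range(d):
        # project f onto dimension i and normalise to a step-function pdf
        hist, _ = np.histogram(pts[:, i], bins=n_bins, range=(0.0, 1.0),
                               weights=weights)
        g_bins.append(hist / hist.sum() * n_bins)   # density on [0,1)
    def g(x):
        val = np.ones(len(x))
        for i in range(d):
            idx = np.minimum((x[:, i] * n_bins).astype(int), n_bins - 1)
            val *= g_bins[i][idx]                   # factorised product
        return val
    return g

# Example with a factorisable integrand, where this construction works well.
rng = np.random.default_rng(2)
f = lambda x: x[:, 0] * (1.0 - x[:, 1])
g = build_factorised_g(f, d=2, n_bins=10, n_samples=20000, rng=rng)
```

For a non-factorisable integrand (say a peak along x1 = x2), the same construction washes out the correlation, which is exactly the limitation noted above.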
To overcome these issues of the VEGAS algorithm, we observe that the QMC method gives a better estimate of the integral. Based on the structure of the R1LR, we propose a new g(x). The sampling function is constructed by using QMC points, so the fluctuations are well under control. In addition, the method does not assume a factorisable structure for g(x), so the efficiency of this algorithm can be high.
Below we describe the sampling function in this new algorithm: 1. Any vector vij = xi − xj is a period of the lattice, and we can choose d linearly independent vectors from them. Since there are multiple choices, we choose the shortest ones and mark them as ei (i = 1, 2, . . . , d).
2. With these linearly independent vectors and an origin x_k, we can build an affine coordinate system, marked as S_k.
3. The region {y | 0 ≤ yi ≤ 1, i = 1, 2, . . . , d} is called the k-th unit cell, where the yi are the coordinates in the affine coordinate system S_k. It can be shown that the volume of one unit cell is 1/n, so the original domain of integration is divided into n unit cells, and any point belongs to exactly one unit cell.

4. If a point x belongs to the k-th unit cell, then g(x) is defined by multilinear interpolation over that cell,

g(x) = Σ_{(a1,...,ad) ∈ {0,1}^d} h(a1, . . . , ad) Π_{i=1}^{d} [ai yi + (1 − ai)(1 − yi)],

where y are the coordinates in S_k corresponding to x in the original coordinate system, and h(a1, a2, . . . , ad) are the values of the function f(x) at the vertices of this unit cell, h(a1, . . . , ad) = f(x_k + Σ_i ai ei); these values have already been calculated in the previous QMC integration.
An important fact is that the integral of g(x) can be easily obtained: within each unit cell it is simply the average of the vertex values times the cell volume 1/n. The g(x) defined above depends only on the lattice structure, and since random shifts preserve this structure, it can be used with the randomly shifted lattice rule. In detail, if we use an m-shift lattice rule and denote the interpolation function for each shift as gi(x), i = 1, 2, . . . , m, then the following probability density function can be used:

p(x) = (1/m) Σ_{i=1}^{m} gi(x)/I(gi).  (A13)

An important part of this algorithm is to generate random points according to the probability function in Eq. (A13), which can be achieved via the following steps: 4. then (y1, y2, . . . , yd) is the random point satisfying the requested distribution in the affine coordinate system S_k; the corresponding coordinates in the original coordinate system are obtained by taking only the fractional part, as before in Eq. (A3).
In Table A1, we demonstrate the efficiency of this algorithm in comparison with the VEGAS algorithm. The test function is the differential cross section of gluon fusion into a Higgs pair at the 14 TeV LHC, where the dimension of the integration is 4. We adopt the R1LR algorithm and perform an integration with a lattice containing 1024 points, repeating the computation with 4 shifts; after that, we test the reweighting efficiency. For the VEGAS algorithm, we compute 1024 points in each of 4 iterations, which corresponds to the same number of points as in the R1LR algorithm. It is observed that the QMC-based algorithm is better than VEGAS, both in the speed of convergence and in the reweighting efficiency.

Appendix B Performance of sub-detectors for detector effect study

To investigate the detector effects, we use the DELPHES package version 3.3.2 [64]. Since our signals include leptons and missing energy, we show the tracker efficiency for electrons and muons in both the ATLAS and FCC detector setups in Table B1 and Table B2. These two tables tell us that, at fixed lepton Pt, the barrel regions reconstruct leptons with a better efficiency than the endcap regions. Comparing the ATLAS-like and FCC detectors, the FCC detector has a larger acceptance region and a better efficiency to find a lepton.
In Table B3, we show the identification efficiencies for electrons and muons, which are mainly determined by the inner tracker sub-detector. Comparing electrons and muons, the efficiency for muons is better than that for electrons.

Table B1. The tracker efficiency for e±. The |η_max| coverage is 2.5 for the first (ATLAS-like) detector, 4.0 for the improved ATLAS detector setups (A1, A2, A3), and 6.0 for the FCC detector.
Pt > 10 GeV:
  ATLAS-like   |η| < 1.5           95%
               1.5 < |η| < η_max   85%
  FCC          |η| < 1.5           95%
               1.5 < |η| < 4.0     90%
               4.0 < |η| < 6.0     85%

In order to reject the main background processes from tt final states, as revealed in the hadron-level analysis, b-tagging is crucial. For the first and second detector setups given in Table 8, b-tagging is only performed on jets with |η| < 2.5, while for the third and fourth detectors, the vertex and tracker detectors are assumed to cover the range up to |η| < 4.0, so b-tagging is performed on all jets with |η| < 4.0. The b-tagging (mis-tagging) efficiencies are listed in Table B4; these are the values used in the original card of the DELPHES package, based on [97], where the mis-tagging rates of light jets and c jets are also listed. By using the results given in Table B4, the b-tagging efficiency is more realistically simulated. Thus, for both the ATLAS-like and FCC detector setups, we adopt the results tabulated in Table B4.
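For illustration, the tracker efficiencies quoted in Table B1 (Pt > 10 GeV) can be encoded as a simple lookup; the function name and interface are ours, showing how a per-lepton efficiency would enter a fast acceptance estimate.

```python
# Lookup sketch of the electron tracker efficiencies of Table B1 (Pt > 10 GeV).
# eta_max is 2.5 / 4.0 / 6.0 depending on the detector setup, as in the caption.
def electron_tracker_eff(eta, detector="FCC", eta_max=6.0):
    abs_eta = abs(eta)
    if abs_eta > eta_max:
        return 0.0                 # outside the tracker acceptance
    if detector == "ATLAS-like":
        return 0.95 if abs_eta < 1.5 else 0.85
    # FCC setup: three eta bands with decreasing efficiency
    if abs_eta < 1.5:
        return 0.95
    if abs_eta < 4.0:
        return 0.90
    return 0.85
```

Multiplying such per-lepton factors over the four leptons of the M4 mode gives a quick estimate of the reconstruction-efficiency penalty discussed in the main text.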