Search for heavy neutral CP-even Higgs within Type-IV 2HDM at a future linear collider

In this paper, the production process $e^- e^+ \rightarrow A H$ is analyzed in the context of the type IV 2HDM and the question of observability of a neutral CP-even Higgs boson $H$ at a linear collider operating at $\sqrt{s}=1$ TeV is addressed. The CP-odd Higgs is assumed to experience a gauge-Higgs decay as $A\rightarrow ZH$ with hadronic decay of $Z$ boson as the signature of signal events. The production chain is thus $e^+e^- \rightarrow AH \rightarrow ZHH \rightarrow jj\ell\ell\ell\ell$ where $\ell$ is a $\tau$ or $\mu$. Four benchmark points with different mass hypotheses are assumed for the analysis. The Higgs mass $m_H$ is assumed to vary within the range 150-300 GeV in increments of 50 GeV. The anti-$k_t$ algorithm is used to perform the jet reconstruction. Results indicate that the neutral CP-even Higgs $H$ is observable through this production mechanism using the di-muon invariant mass distribution with possibility of mass measurement. The corresponding signal significances exceed $5\sigma$ at integrated luminosity of 3000 $fb^{-1}$.


Introduction
The standard model (SM) of elementary particles has been confirmed by a large number of experimental tests, and its success has drawn attention to possible extensions of the theory which may pave the way for a solution to the present serious and challenging problems in physics. The discovery of the first elementary scalar particle, the Higgs boson [1,2], which was a prediction of the Higgs mechanism [3][4][5][6][7][8], has strengthened the idea that multiple Higgs bosons may exist in nature and that the discovered SM Higgs may be only one of them. In the SM, the simplest possible scalar structure is assumed. However, supersymmetry [9], axion models [10], the inability of the SM to explain the baryon asymmetry of the universe [11], etc., have been important motivations for extending the SM by adding another SU(2) Higgs doublet. Assuming two SU(2) Higgs doublets instead of a single doublet forms one of the simplest extensions of the SM, i.e., the two-Higgs-doublet model (2HDM) [12][13][14][15][16][17][18].
Depending on how the different types of fermions couple to the Higgs doublets, the 2HDM can be divided into five types: four with natural flavour conservation and a fifth with flavour-changing neutral currents (FCNCs). Since no evidence for FCNCs has been observed in nature, the types with flavour conservation have attracted more interest. In comparison with the SM, probing the entire parameter space of 2HDMs takes much longer because of the large number of free parameters in these models. 2HDMs predict more than one Higgs particle, one of which is expected to be the same as the SM Higgs. Besides the SM-like Higgs boson $h$, two other neutral Higgs bosons $A$ and $H$ and two charged Higgs bosons $H^\pm$ are predicted in 2HDMs. In this work the observability of the scalar CP-even neutral Higgs $H$ is studied.
The production process $e^+e^- \rightarrow AH \rightarrow ZHH$ followed by the decays $Z \rightarrow jj$ and $HH \rightarrow \tau\tau\mu\mu$ or $\mu\mu\tau\tau$ is assumed in this work to take advantage of the clear signal that leptonic decays can provide at linear colliders. Of the four types of the 2HDM which conserve flavour, the type IV is chosen as the theoretical framework to enhance the Higgs leptonic decay at high $\tan\beta$ by utilizing the key role $\tan\beta$ plays [19]. This enhancement is caused by the appearance of $\cot\beta$ and $\tan\beta$ factors in the Higgs-quark and Higgs-lepton couplings respectively in the Yukawa lagrangian of the type IV 2HDM. The $\tau$-jets in the final decay products are identified by applying a $\tau$-tagging algorithm, and it is expected that the di-muon invariant mass distribution will provide an observable signal on top of the background.
In contrast to the Minimal Supersymmetric Standard Model (MSSM) [9][20][21][22], which constrains the Higgs masses, the Higgs masses of the 2HDM are allowed to have arbitrary values in general. Therefore, 2HDMs provide a wider mass parameter space. In this work, four benchmark points in the Higgs mass parameter space are defined and a simulation is performed for each one to assess the observability of the neutral Higgs $H$. It will be shown that the signal is observable and the $H$ mass reconstruction is possible for all four assumed benchmark points. In [23] a similar signal in the context of the type IV 2HDM with a multi-tau-lepton signature has also been studied at the LHC with promising results.

Two-Higgs-doublet model
In contrast to the SM, where diagonalizing the mass matrix automatically diagonalizes the Yukawa interactions, in a general 2HDM the mass matrix and the Yukawa couplings cannot be diagonalized simultaneously, and thus the 2HDM contains FCNCs in general. The Paschos-Glashow-Weinberg theorem [14,24] states that FCNCs are removed if all fermions with the same quantum numbers couple to exactly one of the two Higgs doublets, which avoids the severe difficulties arising from FCNCs. Following this theorem and assuming that the Higgs-fermion couplings follow from table 1, the four types of 2HDM with natural flavour conservation are produced.
[Table 1: for each type of 2HDM, the Higgs doublet to which up-type quarks, down-type quarks and leptons couple.]
In the Yukawa lagrangian, eq. (2.1), $h$, $H$, $A$ and $H^+$ are the SM-like Higgs, scalar CP-even neutral Higgs, pseudoscalar neutral Higgs and charged Higgs fields, $u$, $d$ and $\ell$ are the up-type quark, down-type quark and lepton fields, $P_{L/R}$ are the projection operators for left-/right-handed fermions, and the factors $\xi$, presented in table 2 for the different types, are expressed in terms of trigonometric functions of the parameters $\alpha$ and $\beta$. As seen, different types use different couplings and therefore exhibit different behaviours and collider phenomenology [18]. The types III and IV are also called "flipped" (or "type Y") and "lepton-specific" (or "type X") respectively. Taking the scalar neutral Higgs field $h$ as the SM-like Higgs field, the Yukawa interactions of the Higgs boson $h$ in the 2HDM reduce to those of the SM when $\sin(\beta-\alpha) = 1$ is assumed [12]. Under this assumption, the neutral Higgs part of the Yukawa lagrangian takes the form given in [25], with the $\rho$ factors of the different types of 2HDM presented in table 3. According to table 3, the type I is well-suited to low $\tan\beta$ studies because of the $\cot\beta$ factor appearing in its couplings. A study in the context of this type shows that the production of the pseudoscalar Higgs $A$ decaying into $ZH$ can be observed at the LHC for certain $H$ decay modes [26].
The type IV (first discussed in [27,28]) also provides an interesting environment for studying leptonic decays of the neutral Higgs bosons $H$ and $A$, since the corresponding couplings are enhanced by $\tan\beta$ while the Higgs-quark couplings are suppressed at high $\tan\beta$, as seen in table 3. That is the reason for searching for the $H$ leptonic decay through the di-muon invariant mass distribution in this work. Figure 1 shows the branching ratios of the different $H$ decay channels in the type IV 2HDM. As seen, the $\tau$ pair channel has a larger branching ratio than the di-muon channel because of the larger $\tau$ lepton mass.
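The mass dependence of the leptonic branching ratios can be checked with a rough estimate. The sketch below assumes the tree-level scaling $\Gamma(H \rightarrow \ell\ell) \propto m_\ell^2$ and neglects phase-space factors ($m_\ell \ll m_H$). The lepton masses are standard reference values and are not taken from this paper. The result is compared with the ratio of the branching ratios quoted later in the text (0.99 and 0.0035).

```python
# Rough check of the tau/muon branching-ratio hierarchy.
# Assumption: Gamma(H -> ll) scales as m_l^2 (tree level, phase space neglected).
M_TAU = 1.77686    # GeV, reference value (not from this paper)
M_MU = 0.1056584   # GeV, reference value (not from this paper)


def leptonic_width_ratio(m_heavy, m_light):
    """Ratio Gamma(H -> heavy pair) / Gamma(H -> light pair) under m^2 scaling."""
    return (m_heavy / m_light) ** 2


ratio_from_masses = leptonic_width_ratio(M_TAU, M_MU)
ratio_from_text = 0.99 / 0.0035  # BR(H->tautau) / BR(H->mumu) as quoted in the text

print(f"mass scaling estimate : {ratio_from_masses:.1f}")
print(f"quoted BR ratio       : {ratio_from_text:.1f}")
```

Under these assumptions the two numbers agree to well below one percent, which supports the statement that the hierarchy between the two channels is driven by the lepton masses.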

Signal and background processes
The type IV 2HDM is chosen as the theoretical framework in this work. Based on the features of this model, the process $e^+e^- \rightarrow AH \rightarrow ZHH \rightarrow jj\tau\tau\mu\mu$ or $jj\mu\mu\tau\tau$ is defined as the signal process and a linear collider operating at an energy of 1 TeV is assumed. The leptonic decay is chosen as the decay mode of the neutral Higgs $H$ so that the analysis benefits from the enhanced $H$ decay due to the $\tan\beta$ factor in the Higgs-lepton coupling. The leptonic decay mode is also beneficial for the reconstruction efficiency: since the lepton reconstruction efficiency at linear colliders is relatively high, the di-lepton (di-muon in this case) invariant mass is expected to provide a clear signal. Although the muon pair is used to reconstruct the $H$ mass, one of the two $H$ bosons is required to decay via $H \rightarrow \tau\tau$. The reason is that the branching ratio of the decay to a tau pair is so high ($\mathrm{BR}(H \rightarrow \tau\tau) = 0.99$) that the decay to a muon pair is suppressed to the level of a few per mille, and the signal cross section would be very small if the muonic decay mode were chosen for both $H$ bosons.
As [29] shows, a leptophilic neutral Higgs boson lighter than 140 GeV can be observed at 30 $fb^{-1}$ at the LHC. In the present work, we focus on the moderate- and high-mass region and assume four benchmark points (BPs) with different mass hypotheses. Table 4 presents the parameters of the selected benchmark points. As seen in table 4, $m_h$, $\tan\beta$ and $\sin(\beta-\alpha)$ are assumed to be 125 GeV, 10 and 1 respectively for all of the selected points, and the Higgs mass $m_H$ varies from 150 GeV for BP1 to 300 GeV for BP4 in increments of 50 GeV.
The benchmark points are all checked to satisfy the constraints on the parameter $\rho = m_W^2 (m_Z \cos\theta_W)^{-2}$, which may deviate from its SM value in extended models like 2HDMs. The measurement performed at LEP [30], which is in excellent agreement with SM predictions, constrains the $\rho$ parameter in the 2HDM [31,32]. It has been observed that degenerate Higgs boson masses produce negligible deviations of the $\rho$ parameter from the corresponding SM value [33]. Defining $\Delta\rho$ as the non-SM part of $\rho$, the pseudoscalar and charged Higgs masses in the benchmark points are chosen to be the same so that $\Delta\rho$ is reduced to allowed values consistent with the provided constraints.
The selected benchmark points are also checked to make sure that the resulting vacuum configurations are stable. The stability of a vacuum configuration is ensured by the positivity of the Higgs potential for asymptotically large values of the fields [34]. Moreover, the selected points satisfy the constraints imposed by requiring perturbativity as well as tree-level unitarity for the scattering of Higgs bosons and electroweak gauge bosons [35][36][37][38]. All of these requirements are checked using 2HDMC 1.6.3 [39,40] and are satisfied.
The current experimental limits on the pseudoscalar and charged Higgs masses are $m_{H^\pm} \geq 78.6$ GeV and $m_A \geq 93.4$ GeV, as shown in [41][42][43][44]. These constraints are obtained in the MSSM and cannot be applied to the type IV 2HDM, so in general we are not required to respect them. However, since the moderate- and high-mass region is the target of this work, the selected points are already consistent with these experimental limits.
Searches for heavy neutral Higgs bosons at the LHC have recently excluded the mass range $m_{A/H} = 200 - 400$ GeV for $\tan\beta \geq 5$ [45,46]. However, this exclusion is also based on the MSSM and imposes no restriction on the mass spectrum considered in this work, since the MSSM Higgs-fermion couplings are different from those of the type IV 2HDM. Moreover, the Higgs mass parameters in the MSSM are not all free parameters, unlike those of the 2HDM, and thus the mass spectra of these models are different.
Flavour physics data also put the lower limit $m_{H^\pm} > 480$ GeV on the charged Higgs mass in the context of the types II and III [47]. This constraint results from the dependence of the charged Higgs-quark coupling (relevant to many flavour observables) on the $\tan\beta$ factor in these types. However, the corresponding couplings in the types I and IV depend on $\cot\beta$ instead of $\tan\beta$, and thus the behaviour of these types is different from that of the types II and III. Therefore, the charged Higgs mass limits in the types I and IV are much weaker and the selected benchmark points are safe. Although the small cross section in the region of heavier Higgs masses gives rise to a small number of signal events, the tail of the final background distribution is small in that mass region, which partially compensates for the small number of signal events and to a considerable extent makes searching for heavier Higgs bosons possible.
Considering the nature of the signal process, the most important background processes include top quark pair production, $W^\pm$ gauge boson pair production, $Z$ gauge boson pair production and finally $Z/\gamma$ production. PYTHIA 8.2.15 [48] is used for event generation and further processing including multi-particle interactions, decays, final state showering, etc. The signal generation is performed for each benchmark point independently. The background event generation is also performed using PYTHIA 8.2.15 for all four background processes.
The jet reconstruction step is performed using FASTJET 3.1.0 [49,50], which includes a variety of sequential recombination clustering algorithms. According to the nature of the signal process, the anti-$k_t$ algorithm [51] is chosen as the jet reconstruction algorithm and is expected to give reasonable results. The algorithm uses the standard jet cone size $\Delta R = \sqrt{(\Delta\eta)^2 + (\Delta\phi)^2} = 0.4$, where $\eta = -\ln\tan(\theta/2)$ is the pseudorapidity and $\phi$ and $\theta$ are the azimuthal and polar angles with respect to the beam axis respectively.
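The angular quantities used above can be illustrated with a short sketch. The function names are hypothetical and the wrapping of $\Delta\phi$ into $(-\pi, \pi]$ is a standard convention assumed here; it is not a description of the internals of FASTJET.

```python
import math


def pseudorapidity(theta):
    """eta = -ln tan(theta/2), with theta the polar angle (radians) from the beam axis."""
    return -math.log(math.tan(theta / 2.0))


def delta_phi(phi1, phi2):
    """Azimuthal difference wrapped into (-pi, pi]."""
    dphi = (phi1 - phi2) % (2.0 * math.pi)
    if dphi > math.pi:
        dphi -= 2.0 * math.pi
    return dphi


def delta_r(eta1, phi1, eta2, phi2):
    """Angular distance Delta R = sqrt(Delta eta^2 + Delta phi^2)."""
    return math.hypot(eta1 - eta2, delta_phi(phi1, phi2))


# A particle at 90 degrees to the beam has eta = 0; forward particles have large |eta|.
print(pseudorapidity(math.pi / 2.0))
# Two directions on either side of the phi = 0 boundary are close, not 2*pi apart.
print(delta_r(0.0, 0.1, 0.0, 2.0 * math.pi - 0.1))
```

The wrapping step matters in practice: without it, two particles on opposite sides of the $\phi = 0$ boundary would be assigned a spuriously large $\Delta R$.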
Jet energy smearing is also applied to jets according to the energy resolution $\sigma/E = 3.5\%$ [52]. All jets found by the anti-$k_t$ algorithm are then required to pass the condition $p_T \geq 10$ GeV, which sets a lower limit on their transverse momenta. Another kinematic requirement applied to the resulting jets is $|\eta| \leq 2$, which selects only central jets.
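A minimal sketch of the smearing and kinematic selection is given below. It assumes that the smearing is Gaussian, that it rescales the jet energy and transverse momentum by a common factor while leaving the direction unchanged, and that jets are represented as plain dictionaries; none of these implementation details are stated in the text.

```python
import random


def smear_jet(jet, rng, resolution=0.035):
    """Rescale E and pT by a Gaussian factor with sigma/E = resolution (assumed model)."""
    scale = rng.gauss(1.0, resolution)
    return {"e": jet["e"] * scale, "pt": jet["pt"] * scale,
            "eta": jet["eta"], "phi": jet["phi"]}


def passes_jet_cuts(jet, pt_min=10.0, abs_eta_max=2.0):
    """Kinematic selection: pT >= 10 GeV and |eta| <= 2."""
    return jet["pt"] >= pt_min and abs(jet["eta"]) <= abs_eta_max


def select_jets(jets, rng):
    """Smear every jet, then keep those passing the kinematic cuts."""
    smeared = (smear_jet(jet, rng) for jet in jets)
    return [jet for jet in smeared if passes_jet_cuts(jet)]


# Hypothetical jets, for illustration only.
rng = random.Random(12345)
jets = [
    {"e": 45.0, "pt": 40.0, "eta": 0.5, "phi": 0.3},   # central and hard: kept
    {"e": 3.0, "pt": 3.0, "eta": 0.0, "phi": 1.0},     # too soft: rejected
    {"e": 400.0, "pt": 50.0, "eta": 2.8, "phi": 2.0},  # too forward: rejected
]
selected = select_jets(jets, rng)
print(len(selected))
```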
The transverse momentum threshold for muons is set to $p_T > 5$ GeV. Since the muon production probability in Higgs decays is low, the threshold applied here is lower than that for jets, to compensate for the very small branching ratio of the Higgs decay to muons ($\mathrm{BR}(H \rightarrow \mu\mu) = 0.0035$).
Apart from the di-muon signature of the signal process, which is crucial to the present analysis, the di-taus produced via the decay channel H → τ τ play a significant role in distinguishing the signal events. The corresponding branching ratio of 0.99 which indicates the dominance of this decay channel over others makes this role even more remarkable. This fact requires the analysis to be equipped with a suitable tau-tagging algorithm by which the tau leptons can be well distinguished.
Utilizing the single charged pion signature of the tau decay modes $\tau \rightarrow \pi^+ \nu_\tau$ and $\tau \rightarrow \pi^+ \pi^0 \pi^0 \nu_\tau$, the tau-tagging algorithm first searches for the hardest (highest-$p_T$) charged particle in the vicinity of the jet center (defined by $\Delta R < 0.1$) and identifies it as the charged pion $\pi^+$ candidate if it satisfies the transverse momentum condition $p_T > 10$ GeV. The narrowness of the tau decay is another feature by which tau jets can be identified. Because of the approximate collinearity of the tau decay products, the algorithm performs a search in the immediate vicinity of the charged pion candidate (the signal cone, defined by $\Delta R < 0.07$). The number of particles found in the signal cone is required to be 1 or 3, according to the mentioned tau decay modes. To take full advantage of the narrowness of tau jets, a further restriction is applied by requiring that there be no particle with $p_T > 1$ GeV in an annulus around the charged pion candidate defined by $0.07 \leq \Delta R \leq 0.4$. Any jet satisfying the mentioned criteria is finally identified as a tau jet.
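The tagging steps described above can be sketched as follows. This is an illustration under stated assumptions and not the code used in the analysis: particles and jets are plain dictionaries with hypothetical keys, only visible particles are assumed to be passed in, and the signal-cone count is assumed to include the charged pion candidate itself.

```python
import math


def delta_r(a, b):
    """Angular distance between two objects carrying 'eta' and 'phi'."""
    dphi = (a["phi"] - b["phi"] + math.pi) % (2.0 * math.pi) - math.pi
    return math.hypot(a["eta"] - b["eta"], dphi)


def is_tau_jet(jet, particles):
    """Apply the seed, signal-cone and isolation requirements to one jet."""
    # Step 1: hardest charged particle within Delta R < 0.1 of the jet axis.
    seeds = [p for p in particles if p["charge"] != 0 and delta_r(p, jet) < 0.1]
    if not seeds:
        return False
    pion = max(seeds, key=lambda p: p["pt"])
    if pion["pt"] <= 10.0:
        return False
    # Step 2: 1 or 3 particles in the signal cone (Delta R < 0.07) around the
    # candidate; the candidate itself is counted (assumption).
    n_signal = sum(1 for p in particles if delta_r(p, pion) < 0.07)
    if n_signal not in (1, 3):
        return False
    # Step 3: no particle with pT > 1 GeV in the annulus 0.07 <= Delta R <= 0.4.
    for p in particles:
        if 0.07 <= delta_r(p, pion) <= 0.4 and p["pt"] > 1.0:
            return False
    return True


# Hypothetical one-prong candidate: a single hard charged pion near the jet axis.
jet = {"eta": 0.0, "phi": 0.0}
one_prong = [{"pt": 30.0, "eta": 0.01, "phi": 0.0, "charge": 1}]
print(is_tau_jet(jet, one_prong))
```

Adding a particle with $p_T > 1$ GeV at, for example, $\Delta R \approx 0.25$ from the pion candidate makes the same jet fail the isolation requirement, which is how ordinary quark and gluon jets are expected to be rejected.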
To achieve a better understanding of the behaviour of the assumed theoretical model, the four hypothesized benchmark points are simulated and tested independently by identical analyses in the present study. Having covered the preliminaries, the analysis procedure is discussed in what follows.
The analysis begins by identifying muons using generator-level information. The identified muons are required to meet the transverse momentum threshold $p_T > 5$ GeV. A cut on the number of muons is then applied to make sure the needed di-muons exist in the events. In the next step, a search for standard jets is performed using the anti-$k_t$ jet reconstruction algorithm and the resulting jets are examined to see whether they satisfy the kinematic criteria described above. Counting the jets passing these criteria, the plot of figure 3 is obtained for the multiplicity of jets in signal and background events. As seen in figure 3, for all the background processes except $t\bar{t}$ the jet multiplicity distribution follows a significantly different pattern from that of the signal. Based on this difference, the cut number of jets $\geq 4$ is defined and applied to refine the selected events.
A b-tagging algorithm is then applied to the jets of the surviving events to find the number of b-jets in each event. The b-tagging algorithm searches for adjacent b or c quarks using generator-level information for each selected jet. A jet is identified as a b-jet with 60% (10%) probability if it is near a b (c) quark. Figure 4 shows the resulting b-jet multiplicity distributions. As seen in figure 4, the b-jet multiplicity distributions of all the background processes except the $WW$ process follow a different pattern from that of the signal. To take advantage of this contrast, the cut number of b-jets $\leq 1$ (4.5) is applied. As is obvious from figure 4, the majority of signal events include no b-jet or, with lower probability, only one b-jet. This is because the main source of b quarks in the signal process is the $Z$ boson decay to a $b\bar{b}$ pair, whose branching ratio is relatively small.
Considering the signal process, the jets in the signal events originate mainly from the products of the decay channels $H \rightarrow \tau\tau$ and $Z \rightarrow q\bar{q}$. Hence the $\tau$-tagging algorithm is now applied to the jets to distinguish $\tau$-jets from the others. Figure 5 shows the resulting $\tau$-jet multiplicity of the various processes. According to figure 5, the average number of $\tau$-jets in the signal events is greater than in the background events, and a cut on the $\tau$-jet multiplicity (4.6) is therefore applied to the events selected in the previous step. Figure 6 shows the jet multiplicity after applying the cut 4.6 and excluding $\tau$-jets.
The jets remaining when $\tau$-jets are excluded are candidates for the decay products of the $Z$ boson. Therefore, the invariant mass of each possible pair of them is calculated and tested against the mass window cut $70 < M_{inv} < 110$ GeV (4.7) to assess its origin. Having tested all the possible pairs, the $Z$ multiplicity is obtained as shown in figure 7.
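The pairing step for the $Z$ candidates can be sketched as below. Jets are represented as hypothetical $(E, p_x, p_y, p_z)$ tuples in GeV, and every pair inside the window is counted as one candidate, which is an assumption about how the multiplicity is defined.

```python
import math
from itertools import combinations


def invariant_mass(p1, p2):
    """Invariant mass of two four-vectors given as (E, px, py, pz) in GeV."""
    e = p1[0] + p2[0]
    px = p1[1] + p2[1]
    py = p1[2] + p2[2]
    pz = p1[3] + p2[3]
    m2 = e * e - px * px - py * py - pz * pz
    return math.sqrt(max(m2, 0.0))  # guard against tiny negative rounding errors


def count_z_candidates(jets, low=70.0, high=110.0):
    """Number of jet pairs with low < M_inv < high (mass window of eq. 4.7)."""
    return sum(1 for a, b in combinations(jets, 2)
               if low < invariant_mass(a, b) < high)


# Hypothetical non-tau jets: two back-to-back jets from a Z at rest plus a soft jet.
jets = [
    (45.6, 45.6, 0.0, 0.0),
    (45.6, -45.6, 0.0, 0.0),
    (5.0, 0.0, 5.0, 0.0),
]
print(invariant_mass(jets[0], jets[1]))
print(count_z_candidates(jets))
```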
Since the signal and background distributions of figures 6 and 7 follow similar patterns, no cut is applied on the number of Z bosons. Applying all cuts to signal and background events, relative and total efficiencies are obtained as shown in table 7 for signal events and table 8 for different background processes.
The selected di-muons in signal events come from the Higgs boson $H$ and therefore their invariant mass must in principle be equal to the $H$ mass. However, jet energy resolution, mis-identification of jets and errors in the energies and flight directions of particles result in an invariant mass distribution with a peak almost at the generated $H$ mass. Figure 8 shows the di-muon invariant mass distribution in signal events for the four assumed benchmark points. The total number of expected events used for normalization is obtained from $\sigma \times \epsilon \times L$, where $\sigma$ is the signal cross section (from table 5), $\epsilon$ is the total efficiency (from table 7) and $L$ is the integrated luminosity, which is assumed to be 3000 $fb^{-1}$. Figures 9-12 show the signal distribution on top of the background distributions for the different benchmark points. As the figures show, the signal events can be seen as a relatively small excess of events on top of the background events. This is due to the fact that the background processes have larger cross sections than the signal processes. Apart from the apparent peak centered almost at the generated Higgs mass, all the distributions show a small peak around the $Z$ boson mass, which is due to decaying $Z$ bosons resulting mainly from the $ZZ$ background events.
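The normalization $\sigma \times \epsilon \times L$ can be illustrated with a short sketch. The cross section, efficiency and sample size used below are hypothetical placeholders and are not the values of tables 5 and 7; the per-event weight is the usual way of scaling a simulated histogram to a given luminosity and is an assumption about the procedure.

```python
def expected_events(cross_section_fb, efficiency, luminosity_fb_inv):
    """Expected event yield N = sigma * epsilon * L (sigma in fb, L in fb^-1)."""
    return cross_section_fb * efficiency * luminosity_fb_inv


def mc_event_weight(cross_section_fb, luminosity_fb_inv, n_generated):
    """Weight per simulated event so that the sample is normalized to sigma * L."""
    return cross_section_fb * luminosity_fb_inv / n_generated


# Hypothetical inputs, for illustration only.
sigma_fb = 0.5
n_generated = 100000
n_selected = 10000
efficiency = n_selected / n_generated
luminosity = 3000.0  # fb^-1, as assumed in the text

n_expected = expected_events(sigma_fb, efficiency, luminosity)
weight = mc_event_weight(sigma_fb, luminosity, n_generated)
print(n_expected, n_selected * weight)
```

Both routes give the same yield by construction: filling each selected simulated event with the weight $\sigma L / N_{gen}$ is equivalent to multiplying $\sigma L$ by the selection efficiency.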
The Higgs candidate mass distributions are now used for the Higgs mass reconstruction. Using ROOT 5.34 [53], we construct two functions which best fit the signal plus background and the background distributions. The Higgs boson mass is read from the appropriate fit parameter. The fit function corresponding to the signal plus background distribution is a combination of a polynomial function and two Gaussian functions. The two Gaussian functions cover the Higgs and $Z$ peaks, and the mean of the Gaussian covering the Higgs peak gives the reconstructed Higgs mass. For the background distribution, a combination of a polynomial and a Gaussian function is used as the fit function. Figures 13-16 show the fit results corresponding to the four distributions in figures 9-12.
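The shape of the fit functions can be written down explicitly. The sketch below only evaluates the model and does not perform the fit, which in the analysis is done with ROOT. The polynomial order, the parameter values and the assumption that the single Gaussian of the background model describes the $Z$ peak are illustrative choices that are not stated in the text.

```python
import math


def gaussian(x, amplitude, mean, sigma):
    """Unnormalized Gaussian peak."""
    return amplitude * math.exp(-0.5 * ((x - mean) / sigma) ** 2)


def polynomial(x, coeffs):
    """Polynomial c0 + c1*x + c2*x^2 + ... evaluated with Horner's scheme."""
    result = 0.0
    for c in reversed(coeffs):
        result = result * x + c
    return result


def background_model(x, poly_coeffs, z_peak):
    """Polynomial continuum plus one Gaussian (assumed to describe the Z peak)."""
    return polynomial(x, poly_coeffs) + gaussian(x, *z_peak)


def signal_plus_background_model(x, poly_coeffs, z_peak, higgs_peak):
    """Background model plus a second Gaussian for the Higgs peak."""
    return background_model(x, poly_coeffs, z_peak) + gaussian(x, *higgs_peak)


# Hypothetical parameters: (amplitude, mean [GeV], sigma [GeV]).
poly = [20.0, -0.05]
z_peak = (5.0, 91.2, 3.0)
higgs_peak = (10.0, 150.0, 2.0)
excess = (signal_plus_background_model(150.0, poly, z_peak, higgs_peak)
          - background_model(150.0, poly, z_peak))
print(excess)
```

In such a model the "mean" parameter of the Higgs Gaussian is the quantity that is compared with the generated mass.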
As seen in figures 13-16, the signal is well distinguished from the background for all of the assumed benchmark points. The value of the "mean" parameter of the Gaussian fit function is close to the generated Higgs mass. However, there is a small offset, which is shown in figure 17. This difference can be due to the jet reconstruction algorithm and the uncertainties arising from it. The jet reconstruction parameters can be tuned so that the algorithm works as well as possible. A thorough comparison of the jet reconstruction results with the generated particles using MC truth matching tools can clarify the error sources. Aside from the mentioned error sources, a real experiment can be affected by other errors resulting from underlying events, pile-up, electronic noise, etc. Hence an accurate correction to the reconstructed jets and their properties is impossible unless all the corrections concerning underlying events, pile-up, jet energy scale uncertainties, data/MC calibration, etc., are taken into account. Since studying these effects lies beyond the scope of this analysis, a simple offset correction is applied to make the reconstructed and generated Higgs masses match as well as possible. This correction is done by first fitting a flat function to the plot of figure 17 to find the average difference between the reconstructed and generated masses, and then shifting the reconstructed masses by this average value. As seen in figure 17, the average difference is $-0.33$ GeV, which is used to perform the offset correction. The corrected masses are shown in table 9 including fit uncertainties. Figure 18 shows the difference between the reconstructed and generated Higgs masses after correction.
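The offset correction can be sketched as follows. A least-squares fit of a constant is equivalent to a (weighted) mean, which is what the code computes. The difference is assumed to be defined as reconstructed minus generated mass, and the masses and uncertainties below are hypothetical; they are not the values of figure 17 or table 9.

```python
def flat_fit(values, errors=None):
    """Best-fit constant: plain mean, or inverse-variance weighted mean if errors are given."""
    if errors is None:
        return sum(values) / len(values)
    weights = [1.0 / (err * err) for err in errors]
    return sum(w * v for w, v in zip(weights, values)) / sum(weights)


def offset_correct(reconstructed, generated, errors=None):
    """Return the fitted offset (reco - gen, assumed convention) and the corrected masses."""
    differences = [r - g for r, g in zip(reconstructed, generated)]
    offset = flat_fit(differences, errors)
    corrected = [r - offset for r in reconstructed]
    return offset, corrected


# Hypothetical reconstructed masses for four benchmark points (GeV).
generated = [150.0, 200.0, 250.0, 300.0]
reconstructed = [149.5, 199.8, 249.6, 299.8]
offset, corrected = offset_correct(reconstructed, generated)
print(offset)
print(corrected)
```

With this sign convention a negative offset means that the reconstructed masses are, on average, too low, so the correction moves them upwards.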

As seen in table 9 and figure 18, the Higgs mass can be measured using the di-muon invariant mass distribution for all of the assumed benchmark points with an uncertainty of a few GeV, which is purely statistical. However, in a real experiment there are considerable sources of systematic error, such as the particle momentum resolution, the jet energy scale and resolution, the b-tagging uncertainty and the uncertainty arising from the fit function used to obtain the probability density function (p.d.f.) of the distributions. The background modeling must be treated with special care for a reliable observation of the signal. A thorough comparison of the distributions of different background samples obtained from real data and MC is needed to achieve a reasonable p.d.f. for the background.

Signal significance
The signal observability is quantified through the signal significance calculation. Using the distributions of figures 9-12, a mass window cut is determined for each benchmark point independently. Applying the mass window cuts, the final total efficiency, the numbers of signal and background events, the signal to background ratio and the signal significance are obtained as shown in table 10.
Table 10. Higgs mass window cut, signal total efficiency, number of signal and background events after all selection cuts and mass window cut, signal to background ratio and signal significance.
According to table 10, the signal significance decreases as the Higgs mass m H increases. This is not a surprising result though, as it is a consequence of the fact that the signal cross section decreases as the Higgs mass gets larger. As a result, the higher the Higgs mass, the harder the observation.
Considering the results shown in table 10, it is concluded that an observable signal (exceeding $5\sigma$) can be extracted from the di-muon invariant mass distribution for each of the four benchmark points at 3000 $fb^{-1}$.
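The quantities of table 10 can be computed from the event counts inside the mass window as sketched below. The significance is assumed here to be the simple estimator $S/\sqrt{B}$, since the definition used for table 10 is not spelled out in the text. The mass values, window and event counts are hypothetical.

```python
import math


def count_in_window(masses, low, high, weight=1.0):
    """Weighted number of events with low <= M_inv <= high."""
    return weight * sum(1 for m in masses if low <= m <= high)


def signal_to_background(n_signal, n_background):
    """Signal to background ratio S/B."""
    return n_signal / n_background


def significance(n_signal, n_background):
    """Signal significance, assumed here to be S / sqrt(B)."""
    return n_signal / math.sqrt(n_background)


# Hypothetical di-muon masses (GeV) and a hypothetical window around 150 GeV.
masses = [148.0, 150.0, 152.0, 170.0]
print(count_in_window(masses, 145.0, 155.0))

# Hypothetical normalized yields after all cuts and the mass window.
n_signal, n_background = 60.0, 144.0
print(signal_to_background(n_signal, n_background))
print(significance(n_signal, n_background))
```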

Conclusions
The observability of a neutral CP-even Higgs boson $H$ at a linear collider operating at $\sqrt{s} = 1$ TeV was studied in the framework of a type IV 2HDM. The signal process was assumed to be $e^-e^+ \rightarrow AH \rightarrow ZHH$ followed by the hadronic (leptonic) decay of the $Z$ ($H$) bosons. Four benchmark points were hypothesized and the simulation was performed for each one independently. The Higgs mass $m_H$ under study ranged from 150 GeV to 300 GeV in increments of 50 GeV. Although the branching ratio of the Higgs boson $H$ decay to a pair of muons is very small, the ability to accurately identify the muons compensates for the small branching ratio and plays a significant role in this study. Taking advantage of the kinematic differences between signal and background events, appropriate selection cuts were applied and the Higgs boson candidate mass distribution was obtained. As the Higgs mass $m_H$ increases, the signal cross section decreases, which could have been a major obstacle to observing the Higgs boson. However, the small background tail partially compensated for the decrease in the signal cross section and helped the observation of the Higgs boson $H$ with masses up to 300 GeV. These results indicate that for all four assumed benchmark points, an observable signal can be extracted from the SM background, with the possibility of a mass measurement, at 3000 $fb^{-1}$.