Search for an exotic decay of the Higgs boson to a pair of light pseudoscalars in the ﬁnal state with two b quarks and two τ leptons in proton–proton collisions at

A search for an exotic decay of the Higgs boson to a pair of light pseudoscalar bosons is performed for the ﬁrst time in the ﬁnal state with two b quarks and two τ leptons. The search is motivated in the context of models of physics beyond the standard model (SM), such as two Higgs doublet models extended with a complex scalar singlet (2HDM + S), which include the next-to-minimal supersymmetric SM (NMSSM). The results are based on a data set of proton–proton collisions corresponding to an integrated luminosity of 35 . 9 fb − 1 , accumulated by the CMS experiment at the LHC in 2016 at a center-of-mass energy of 13 TeV. Masses of the pseudoscalar boson between 15 and 60 GeV are probed, and no excess of events above the SM expectation is observed. Upper limits between 3 and 12% are set on the branching fraction B ( h → aa → 2 τ 2b ) assuming the SM production of the Higgs boson. Upper limits are also set on the branching fraction of the Higgs boson to two light pseudoscalar bosons in different 2HDM + S scenarios. Assuming the SM production cross section for the Higgs boson, the upper limit on this quantity is as low as 20% for a mass of the pseudoscalar of 40 GeV in the NMSSM.


Introduction
Within the standard model (SM), the Brout-Englert-Higgs mechanism [1][2][3][4][5][6] is responsible for electroweak symmetry breaking and predicts the existence of a scalar particle-the Higgs boson. A particle compatible with the Higgs boson was discovered by the ATLAS and CMS collaborations at the CERN LHC [7][8][9]. Measurements of the couplings and properties of this particle leave room for exotic decays to beyond-the-SM particles, with a limit of 34% on this branching fraction at 95% confidence level (CL), using data collected at center-of-mass energies of 7 and 8 TeV [10].
The possible existence of exotic decays of the Higgs boson is well motivated [11][12][13][14][15][16]. The decay width of the Higgs boson in the SM is so narrow that a small coupling to a light state could lead to branching fractions of the Higgs boson to that light state of the order of several percent. Additionally, the scalar sector can theoretically serve as a portal that allows matter from a hidden sector to interact with SM particles [17]. In general, exotic decays of the Higgs boson are allowed in many models that are consistent with all LHC measurements published so far. E-mail address: cms -publication -committee -chair @cern .ch.
An interesting class of such processes consists of decays to a pair of light pseudoscalar particles, which then decay to pairs of SM particles. This type of process is allowed in various models, including two Higgs doublet models augmented by a scalar singlet (2HDM + S). Seven scalar and pseudoscalar particles are predicted in 2HDM + S. One of them, h, is a scalar that can be compatible with the discovered particle with a mass of 125 GeV, and another, the pseudoscalar a, can be light enough so that h → aa decays are allowed.
Four types of 2HDM, and by extension 2HDM + S, forbid flavorchanging neutral currents at tree level [18]. In type I, all SM particles couple to the first doublet. In type II, up-type quarks couple to the first doublet, whereas leptons and down-type quarks couple to the second doublet. The next-to-minimal supersymmetric SM (NMSSM) is a particular case of 2HDM + S of type II that brings a solution to the μ problem [19]. In type III, quarks couple to the first doublet, and leptons to the second one. Finally, in type IV, leptons and up-type quarks couple to the first doublet, while down-type quarks couple to the second doublet [15]. The branching fractions of the light pseudoscalars to pairs of SM particles depend on the type of 2HDM + S, on the pseudoscalar mass m a , and on tan β, defined   ing fraction B(aa → bbτ τ ) is slightly above 10% in the NMSSM-or more generally in 2HDM + S type II-for tan β > 1, and can reach up to about 50% in 2HDM + S type III with tan β ∼ 2, as shown in Fig. 1.
Several searches for exotic decays of the Higgs boson to a pair of light short-lived pseudoscalar bosons have been performed by the CMS Collaboration with data collected at a center-of-mass energy of 8 TeV in different final states: 2μ2b for 25.0 < m a < 62.5 GeV [20], 2μ2τ for 15.0 < m a < 62.5 GeV [20], 4τ for 4 < m a < 8 GeV [21] and 5 < m a < 15 GeV [20], and 4μ for 0.25 < m a < 3.50 GeV [22]. The CMS Collaboration also studied the 2μ2τ final state for 15.0 < m a < 62.5 GeV at a center-of-mass energy of 13 TeV [23]. The ATLAS Collaboration reported results for the following final states at a center-of-mass energy of 8 TeV: 4μ, 4e, and 2e2μ for 15 < m a < 60 GeV [24]; 4γ for 10 < m a < 62 GeV [25]; and 2μ2τ for 3.7 < m a < 50.0 GeV [26]. At a center-of-mass energy of 13 TeV, the ATLAS Collaboration published results for the 4b decay channel for 20 < m a < 60 GeV [27], and 4μ, 4e, and 2e2μ for 1 < m a < 60 GeV [28]. The 2b2τ final state has never been probed so far. This final state benefits from large branching fractions in most models because of the large masses of τ leptons and b quarks with respect to other leptons and quarks. The presence of light leptons originating from the τ decays allows events to be triggered in the dominant gluon fusion production mode.
This Letter reports on the first search with the CMS experiment for exotic decays of the Higgs boson to a pair of light pseudoscalar bosons, in the final state with two τ leptons and two b quarks. The search focuses on the mass range between 15 and 60 GeV. For low m a values, between the bb threshold and 15 GeV, the decay products of each of the pseudoscalar bosons become collimated, which would necessitate the use of special reconstruction techniques.
The search is based on proton-proton (pp) collision data collected at a center-of-mass energy of 13 TeV and corresponding to an integrated luminosity of 35.9 fb −1 . Throughout this Letter, the term τ h denotes τ leptons decaying hadronically. The τ τ final states studied in this search are eμ, eτ h , and μτ h . Despite its large branching fraction, the τ h τ h final state is not considered because the signal acceptance is negligible with the transverse momentum (p T ) thresholds available for the τ h τ h triggers. The ee and μμ final states for the τ τ pair are not considered either, because they have a low branching fraction and suffer from a large contribution of Drell-Yan background events.

The CMS detector
The central feature of the CMS apparatus is a superconducting solenoid of 6 m internal diameter, providing a magnetic field of 3.8 T. Within the solenoid volume, there are a silicon pixel and strip tracker, a lead tungstate crystal electromagnetic calorimeter (ECAL), and a brass and scintillator hadron calorimeter, each composed of a barrel and two endcap sections. Forward calorimeters extend the pseudorapidity coverage provided by the barrel and endcap detectors. Muons are detected in gas-ionization chambers embedded in the steel flux-return yoke outside the solenoid. Events of interest are selected using a two-tiered trigger system [29]. A more detailed description of the CMS detector, together with a definition of the coordinate system used and the relevant kinematic variables, can be found in Ref. [30].

Simulated samples and event reconstruction
The signal and some of the background processes are modeled with samples of simulated events. The MadGraph5_amc@nlo [31] 2.3.2 generator is used for the h → aa → 2τ 2b signal process, in gluon fusion (ggh), vector boson fusion (VBF), or associated vector boson (Wh, Zh) processes. These samples are simulated at leading order (LO) in perturbative quantum chromodynamics (QCD) with the MLM jet matching and merging [32]. The Z + jets and W + jets processes are also generated with the MadGraph5_amc@nlo generator at LO with the MLM jet matching and merging. The Z + jets simulation contains non-resonant Drell-Yan production, with a minimum dilepton mass threshold of 10 GeV. The FxFx merging scheme [33] is used to generate diboson background with the MadGraph5_amc@nlo generator at next-to-LO (NLO). The tt and single top quark processes are generated at NLO with the powheg 2.0 and 1.0 generator [34][35][36][37][38][39]. Backgrounds from SM Higgs boson production are generated at NLO with the powheg 2.0 generator [40], and the minlo hvj [41] extension of powheg 2.0 is used for the Wh and Zh simulated samples. The generators are interfaced with pythia 8.212 [42] to model the parton showering and fragmentation, as well as the decay of the τ leptons. The CUETP8M1 tune [43] is chosen for the pythia parameters that affect the description of the underlying event. The set of parton distribution functions (PDFs) is NLO NNPDF3.0 for NLO samples, and LO NNPDF3.0 for LO samples [44]. The full next-to-NLO (NNLO) plus next-to-next-to-leading logarithmic (NNLL) order calculation [45][46][47][48][49][50], performed with the Top++ 2.0 program [51], is used to compute a tt production cross section equal to 832 +20 −29 (scale) ± 35 (PDF + α S ) pb setting the top quark mass to 172.5 GeV. This cross section is used to normalize the tt background simulated with powheg.
All simulated samples include additional proton-proton interactions per bunch crossing, referred to as "pileup", obtained by generating concurrent minimum bias collision events using pythia. The simulated events are reweighted in such a way to have the same pileup distribution as data. Generated events are processed through a simulation of the CMS detector based on Geant4 [52].
The reconstruction of events relies on the particle-flow (PF) algorithm [53], which combines information from the CMS subdetectors to identify and reconstruct the particles emerging from pp collisions: charged and neutral hadrons, photons, muons, and electrons. Combinations of these PF objects are used to reconstruct higher-level objects such as jets, τ h candidates, and missing transverse momentum.
Electrons are reconstructed by matching ECAL clusters to tracks in the tracker. They are then identified with a multivariate analysis (MVA) discriminant that makes use of variables related to the energy deposits in the ECAL, the quality of the track, and the compatibility between the properties of the ECAL clusters and the track that have been matched [54]. The MVA working point chosen in this search has an efficiency of about 80%. The reconstruction of muon candidates is performed combining the information of the tracker and the muon systems. Muons are then identified on the basis of the track reconstruction quality and on the number of measurements in the tracker and the muon systems [55]. The relative isolation of electrons and muons is defined as: In this formula, charged p T is the scalar sum of the transverse momenta of the charged particles, excluding the lepton itself, associated with the primary vertex and in a cone around the lepton direction, with size R = √ ( η) 2 + ( φ) 2 = 0.3 for electrons, or 0.4 for muons. The sum neutral p T represents a similar quantity for neutral particles. The last term corresponds to the p T of neutral particles from pileup vertices, which, as estimated from simulation, is equal to approximately half of that of charged hadrons associated with pileup vertices, denoted by charged,PU p T . The p T of the lepton is denoted p T . The azimuthal angle, φ, is measured in radians.
Jets are reconstructed from PF objects with the anti-k T clustering algorithm implemented in the FastJet library [56,57], using a distance parameter of 0.4. Corrections to the jet energy are applied as a function of the p T and η of the jet [58]. The jets in this search are required to be separated from the selected electrons, muons, or τ h , by R ≥ 0.5. Jets that originate from b quarks, called b jets, are identified with the combined secondary vertex (CSVv2) algorithm [59]. The CSVv2 algorithm builds a discriminant from variables related to secondary vertices associated with the jet if any, and from track-based lifetime information. The working point chosen in this search provides an efficiency for b quark jets of approximately 70%, and a misidentification rate for light-flavor and c quark jets of approximately 1 and 10%, respectively.
Hadronically decaying τ leptons are reconstructed with the hadrons-plus-strips (HPS) algorithm [60,61] as a combination of tracks and energy deposits in strips of the ECAL. The tracks are the signature of the charged hadrons, and the strips that of the neutral pions, which decay to a pair of photons with potential electron-positron conversion. The reconstructed τ h decay modes are one track, one track plus at least one strip, and three tracks.
The rate for jets to be misidentified as τ h is reduced by applying an MVA discriminator that uses isolation as well as lifetime variables. Its working point has been chosen to have an efficiency of approximately 45% for a misidentification rate of light-flavor jets of the order of 0.1%. Additionally, discriminators that reduce the rates with which electrons and muons are misidentified as τ h are applied. Loose working points with an efficiency above 90% are chosen because the Z → ee/μμ background does not contribute much in the region where the signal is expected.
To account for the contribution of undetected particles, such as the neutrinos, the missing transverse momentum, p miss T , is defined as the negative vectorial sum of the transverse momenta of all PF objects reconstructed in the event. The magnitude of this vector is denoted p miss T . The reconstructed vertex with the largest value of summed physics-object p 2 T is taken to be the primary pp interaction vertex. The physics objects are the objects constructed by a jet finding algorithm [56,62] applied to all charged tracks associated with the vertex, and the corresponding associated missing transverse momentum. Table 1 Baseline selection criteria for objects required in various final states. The numbers given for the p T thresholds of the electron and muon in the eμ final state correspond to the leading and subleading particles. The p T threshold for the τ h candidates is the result of an optimization of the expected exclusion limits of the signal.

Event selection
Events are selected in three different τ τ final states: eμ, eτ h , and μτ h . They are additionally required to contain at least one b-tagged jet. The dominant backgrounds with these objects in the final state are tt and Z → τ τ production. Another large background consists of events with jets misidentified as τ h , such as W + jets events, the background from SM events composed uniquely of jets produced through the strong interaction, referred to as QCD multijet events, or semileptonic tt events.
All events are required to have at least one b-tagged jet with  Table 1 details the p T , η, isolation, and identification criteria for the various objects, in the different final states.
To increase the sensitivity of the analysis, events in each final state are separated into four categories with different signal-tobackground ratios. The categories are defined on the basis of the invariant mass of the visible decay products of the τ leptons and the b-tagged jet with the highest p T , denoted by m vis bτ τ . This variable is typically low for signal events because the three objects originate from a 125 GeV Higgs boson, but it is on average much larger for background events, where the three objects do not originate from a decay of a resonance, as shown in Fig. 2 for the μτ h final state. The thresholds that define the categories depend on the τ τ final state: they are lower in the eμ final state because there are more neutrinos not included in the mass calculation, and they are higher in the eτ h final state to keep enough events despite the lower signal acceptance related to the electron p T thresholds. The "jet → τ h " contribution includes all events with a jet misidentified as a τ h candidate, whereas the rest of background contributions only include events where the reconstructed τ h corresponds to a τ h , a muon, or an electron, at the generator level. The "Other" contribution includes events from single top quark, diboson, and SM Higgs boson processes. The signal histogram corresponds to 10 times the SM production cross section for ggh, VBF, and Vh processes, and assumes B(h → aa → 2τ 2b) = 100%.
(For interpretation of the colors in the figures, the reader is referred to the web version of this article.) Signal events with m a 25 GeV contribute mostly to the first two categories, whereas those with m a 25 GeV are concentrated in the second and third categories. This can be explained by the fact that the missing b jet in the mass calculation would be closer to the reconstructed b jet for a signal with lower m a because of the boost of the pseudoscalar bosons, leading to a larger reconstructed mass. The last category has large background yields; it is useful to constrain the various backgrounds and provides some additional sensitivity for low-m a signal samples. The results of the search are extracted from a fit of the visible τ τ mass (m vis τ τ ) distributions in each of the categories, because this is a resonant distribution for signal events.
Selection criteria are applied to optimize the expected limits on the product of the signal cross section and branching fraction. The same thresholds would be obtained with an optimization based on the discovery potential. One such criterion is based on the transverse mass of p miss T and each of the leptons. The transverse mass m T between a lepton and p miss T is defined as (2) where p T is the transverse momentum of the lepton , and φ is the azimuthal angle between the lepton momentum and p miss T . Selecting events with low m T strongly reduces the backgrounds from W + jets and tt events, which are characterized by a larger p miss T .
In addition, for W + jets events in which the selected lepton comes from the W boson decay, m T has a Jacobian peak near the W boson mass. The distributions of m T (μ, p miss T ) and m T (τ h , p miss T ) in the μτ h final state before the m vis bτ τ -based categorization are shown in Fig. 3 (top and center).
Another selection criterion is based on the variable D ζ , which is defined as where p ζ is the component of p miss  Table 2.

Background estimation
The Z → background is estimated from simulation. The distributions of the p T of the dilepton system and the visible invariant mass between the leptons and the leading b jet are reweighted with corrections derived using data from a region enriched in Z → μμ + ≥ 1 b events. The simulation is separated between Z → τ τ , where the reconstructed τ candidates correspond to τ leptons at generator level, and Z → ee/μμ decays, where at least one electron or muon is misidentified as a τ h candidate.
Backgrounds with a jet misidentified as a τ h candidate are estimated from data. They consist mostly of W + jets and QCD multijet events, as well as the fraction of tt, diboson, and single top quark production where the reconstructed τ h candidate comes from a jet. The probabilities for jets to be misidentified as τ h candidates, denoted f , are estimated from Z → μμ + jets events in data. They are parameterized with Landau distributions as a function of the . Such events typically have a genuine lepton coming from the W boson decay and a jet misidentified as the other lepton. The QCD multijet background, which also contains jets misidentified as leptons, is estimated from data. Its normalization corresponds to the difference between the data and the sum of all the other backgrounds in a so-called same-sign region where the τ candidates have the same sign. A smooth distribution is obtained by additionally relaxing the isolation conditions of both leptons. A correction that is extracted from data is applied to extrapolate the normalization obtained in the same-sign region to the signal region. in the μτ h final state before the m vis bτ τ -based categorization. The "jet → τ h " contribution includes all events with a jet misidentified as a τ h candidate, whereas the rest of background contributions only include events where the reconstructed τ h corresponds to a τ h , a muon, or an electron, at the generator level. The "Other" contribution includes events from single top quark, diboson, and SM Higgs boson processes. The signal histogram corresponds to 10 times the SM production cross section for ggh, VBF, and Vh processes, and assumes B(h → aa → 2τ 2b) = 100%.  In the eτ h and μτ h final states, where all backgrounds with a jet misidentified as a τ h candidate are estimated from data, simulated events with a reconstructed τ h that is not matched to an electron, a muon, or a τ h at the generator level are discarded to avoid double counting. Approximately 30% of simulated tt events after the selection have a reconstructed τ h that is not matched to an electron, a muon, or a τ h at the generator level.

Fit method and systematic uncertainties
The search for an excess of signal events over the expected background involves a global binned maximum likelihood fit based on the m vis τ τ distributions in the different channels and categories. The statistical uncertainty largely dominates over systematic uncertainties in this search. The systematic uncertainties are represented by nuisance parameters that are varied in the fit according to their probability density functions. A log-normal probability density function is assumed for the nuisance parameters that affect the event yields of the various background and signal contributions, whereas systematic uncertainties that affect the distributions are represented by nuisance parameters whose variation results in a continuous perturbation of the spectrum [65] and which are assumed to have a Gaussian probability density function.
To take into account the limited size of simulated samples and of data in the control regions used to estimate some of the background processes, statistical uncertainties in individual bins of the m vis τ τ distributions are considered as Poissonian nuisance parameters. The uncertainty can be as large as 40% for some bins in the low-m vis bτ τ categories. The combined effect of all these uncertainties is the dominant systematic uncertainty in this search.
The uncertainties in the jet energy scale [58] affect the overall yields of processes estimated from simulation, as well as their is evaluated event-by-event, and is also considered as a shape uncertainty.
Corrections for the efficiency of the identification of electrons, muons, and τ h candidates are derived from data using tag-andprobe methods [67], and are applied to simulated events as a function of the lepton p T and η. Uncertainties related to these corrections amount to 2% for electrons, 2% for muons, and 5% for τ h candidates. Additionally, events with an electron or muon misidentified as a τ h candidate have a yield uncertainty of 5%. Trigger scale factors are also estimated with tag-and-probe methods and their corresponding uncertainties in the yields of simulated processes are 1%.
The energy scale of τ h candidates is corrected for each reconstructed decay mode, and the uncertainty of 1.2% for each single decay mode is considered as a shape uncertainty. Uncertainties in the energy scales of electrons and muons are also included as shape uncertainties.
Corrections to the efficiency for identifying a b quark jet as a b jet, as well as for mistagging a jet originating from a different fla-vor, are applied to simulated events on the basis of the generated flavor of the jets. The uncertainties in the scale factors depend on the p T of the jet and are therefore considered as shape uncertainties. They amount to 1.5% for a jet originating from a b quark, 5% from a c quark, and 10% from a light-flavor parton [59].
The uncertainty in the yield of the backgrounds with jets misidentified as τ h candidates accounts for possibly different misidentification rates in Z + jets events (where the misidentification rates are measured), and in W + jets and QCD multijet events (which dominate the constitution of the reducible background in the signal region), and for differences between data and predicted backgrounds observed in a region enriched in reducible background events by inverting the charge requirement on the τ candidates and removing the m T and D ζ selection criteria. This uncertainty amounts to 20%, and is constrained to about 7% after the maximum likelihood fit because of the large number of events contributing to the last m vis bτ τ category. Uncertainties in the parameterization of the misidentification probability of jets as a function of p T result in shape uncertainties for the backgrounds with jets misidentified as τ h candidates.
The uncertainty in the yield of the QCD multijet background in the eμ final state is 20%; the value comes from the uncertainty in the extrapolation factor from the same-sign region to the oppositesign region. The uncertainty in the W + jets background in this channel also amounts to 20%, and accounts for a potential mismod- The uncertainty in the correction of the dilepton p T distribution for Drell-Yan events is equal to 10% of the size of the correction itself. The uncertainty in the correction of the m vis bτ τ distribution is equal to the correction itself, and considered as a shape uncertainty. Uncertainties in the production cross sections and branching fractions for SM Higgs boson processes are taken from Ref. [71]. The uncertainty in the integrated luminosity amounts to 2.5% [72].

Results
The m vis τ τ distributions in the different channels and categories are shown in Figs. 4-6. The binning corresponds to the bins used in the likelihood fit.
No excess is observed relatively to the SM background prediction. Upper limits at 95% CL are set on (σ (h)/σ SM )B(h → aa → 2τ 2b) using the modified frequentist construction CL s in the asymptotic approximation [73][74][75][76][77], for pseudoscalar masses between 15 and 60 GeV. In this expression, σ SM denotes the SM production cross section of the Higgs boson, whereas σ (h) is the h production cross section. The limits per channel and for the combination of the three channels are shown in Fig. 7. The most sensitive final state is μτ h . The sensitivity of the eτ h and eμ channels is approximately equivalent; the first channel suffers from higher trigger thresholds and lower object identification efficiency than μτ h , and the second one suffers from a lower branching fraction than μτ h . At low m a , the eμ final state has a higher signal ac-  This improves by more than one order of magnitude previous limits on B(h → aa) obtained in the 2μ2τ final state by CMS for 15 < m a < 25 GeV [20,23], and by up to a factor five those obtained in the 2μ2b final state by CMS for 25 < m a < 60 GeV [20]. In the scenario with the highest branching fraction, 2HDM + S type III with tan β = 2, the expected limit is as low as 6% at intermediate m a .  Fig. 8 shows the observed limits at 95% CL on (σ (h)/σ SM )B(h → aa) as a function of m a and tan β for type III and type IV 2HDM + S, for which there is a strong dependence with tan β. Fig. 9 shows the observed limits at 95% CL on (σ (h)/σ SM )B(h → aa) for a few scenarios of 2HDM + S, assuming the branching fractions of the light pseudoscalar to SM particles computed using Refs. [15,78]. The limit shown for type II 2HDM + S is approximately valid for any value of tan β > 1, and that for type I 2HDM + S does not depend on tan β. In the m a range considered in the analysis, the branching fraction B(aa → bbτ τ ) ranges between 0.10 and 0.11 in type I 2HDM + S, between 0.11 and 0.13 for tan β = 2 in type II 2HDM + S, between 0.44 and 0.46 for tan β = 2 in type III 2HDM + S, and between 0.16 and 0.21 for tan β = 0.5 in type IV 2HDM + S.

Summary
The first search for exotic decays of the Higgs boson to pairs of light bosons with two b quark jets and two τ leptons in the final state has been performed with 35.9 fb −1 of data collected at 13 TeV center-of-mass energy in 2016. This decay channel has a large branching fraction in many models where the couplings to fermions are proportional to the fermion mass, and can be triggered with high efficiency in the dominant gluon fusion production mode because of the presence of light leptons from leptonic

Acknowledgements
We congratulate our colleagues in the CERN accelerator departments for the excellent performance of the LHC and thank the technical and administrative staffs at CERN and at other CMS institutes for their contributions to the success of the CMS effort. In addition, we gratefully acknowledge the computing centers and personnel of the Worldwide LHC Computing Grid for delivering so effectively the computing infrastructure essential to our analyses.