Search and measurement of the B → μμ rare processes with LHC Run I data

The decays $B^0_s \to \mu^+\mu^-$ and $B^0 \to \mu^+\mu^-$ are highly suppressed in the standard model, which makes them sensitive probes of new physics. We present the results of a search for these rare decays in proton-proton collisions using the full data sample collected by the CMS experiment during LHC Run I. Through a fit to the dimuon invariant-mass spectrum, an excess of events with respect to the background is observed, compatible with the $B^0_s$ ($B^0$) signal with a significance of 4.3 (2.0) standard deviations ($\sigma$). The measured branching fractions are $\mathcal{B}(B^0_s \to \mu^+\mu^-) = (3.0^{+1.0}_{-0.9}) \times 10^{-9}$ and $\mathcal{B}(B^0 \to \mu^+\mu^-) = (3.5^{+2.1}_{-1.8}) \times 10^{-10}$. The combination of the CMS and LHCb results, using their full Run I data samples, yields an event excess compatible with the $B^0_s$ ($B^0$) signal with a significance in excess of $6\sigma$ ($3\sigma$). The ongoing and projected accelerator and detector upgrades will make it possible to establish both rare decays, to carry out precision measurements of them, and to explore novel observables with further sensitivity to new-physics effects.


Introduction and motivation
The rare process $B^0_s \to \mu^+\mu^-$ is one of the milestones of the LHC physics program. The expected probability for it to occur is very small and is precisely predicted in the standard model (SM). Any observed deviation would therefore have the potential to rule out the SM. Indeed, numerous scenarios beyond the SM can induce sizable modifications. The observation and measurement of the decay can be used to exclude and constrain classes of new physics scenarios. The process $B^0 \to \mu^+\mu^-$ is predicted to be even rarer and is more challenging experimentally. Sensitivity to the $B^0_s$ decay has been attained for the first time with Run I of CERN's LHC, while sensitivity to the $B^0$ decay will be achieved in the coming LHC runs.
In this article we present the results of a simultaneous search for the $B^0_{(s)} \to \mu^+\mu^-$ decays at CMS. The outcome of the Run I LHC combination, involving the analysis of the CMS and LHCb data, is also reported, along with prospects for the future LHC runs at higher luminosity and energy.

Theory expectations
The $B^0_s$ meson plays a key role in testing the quark-flavor sector of the SM. The exploration of its weak decays, after pioneering work at the Tevatron, has shifted to the LHC. Among the properties that render the $B^0_s$ system particularly interesting and sensitive to new phenomena are its extremely high-rate flavor oscillations between particle and antiparticle [1,2] and its extremely low-rate decays to dileptons [3].
The exclusive decays of the neutral B mesons into muon pairs are highly suppressed in the standard model. These processes involve effective flavor-changing neutral current (FCNC) $b \to s(d)$ transitions that can proceed only through higher-order diagrams in the SM, namely box and penguin topologies. They are suppressed further by the required helicities of the final-state leptons in the decay of the spin-zero B mesons. The predicted SM values of the decay-time-integrated branching fractions are given in Refs. [3,4]; their uncertainties are dominated by nonperturbative QCD effects, which are determined through lattice studies.
Several scenarios of physics beyond the standard model (BSM) predict significant modifications of the decay rates. Together with the precise SM expectations, this renders the study of these processes an excellent probe of BSM physics. This is the case, in particular, for theories with extended Higgs sectors. For example, in the context of the minimal supersymmetric SM, an enhancement of the branching fractions proportional to $\tan^6\beta$ is predicted [5,6,7], where $\tan\beta$ is the ratio of the vacuum expectation values of the two Higgs fields. For large values of $\tan\beta$, this search is among the most sensitive probes of BSM physics that can be performed at collider experiments.
Prior to the start of the LHC, the most stringent limits were obtained by the Tevatron experiments. In the last couple of years, CDF at the Tevatron [13] and LHCb at the LHC [18] have both reported double-sided intervals for $\mathcal{B}(B^0_s \to \mu^+\mu^-)$. Nevertheless, sufficient statistical sensitivity to the $B^0_s$ decay had to wait for the full LHC Run I dataset, as reported herein, while for the $B^0$ decay it will be attained only with the coming LHC runs.

Analysis strategy and results
The signal is searched for in the dimuon invariant-mass spectrum, in a window around the B-meson masses. Owing to their long lifetime, B mesons tend to travel a measurable distance inside the detector before decaying. The experimental signature therefore consists of two isolated muons from the exclusive B decay, originating from a common displaced vertex, with the dimuon momentum aligned with the flight direction between the primary and B-decay vertices.

Event selection
A dedicated trigger algorithm has been deployed to collect the data samples. It is based on the presence of two muons, with transverse momentum $p_{T,\mu} = 3$-$4$ GeV and invariant mass $4.8 < m_{\mu\mu} < 6.0$ GeV, forming a displaced vertex.
Vertex and isolation properties are further used offline to discriminate against combinatorial backgrounds. These arise from combinations of muons from uncorrelated sources, most importantly from two separate semileptonic $b \to c\mu\nu$ decays or from one such decay in combination with a misidentified hadron. Figure 2 illustrates the discriminating power of some of the variables used in the selection.
Instead of applying single thresholds to each of the selection variables, they are used to build boosted decision tree (BDT) multivariate discriminators. The BDTs are trained using signal events drawn from $B^0_s \to \mu^+\mu^-$ Monte Carlo (MC) simulation and background events drawn from the dimuon mass sidebands in data.
The combinatorial background described above has a monotonic dependence on the dimuon invariant mass, which allows it to be extrapolated to the signal region from the mass sidebands. There are, however, physics background sources for which this is not the case. Rare single-B decays contribute, in the signal search window, a non-peaking component (e.g. $\Lambda_b \to p\mu\nu$) and a peaking component from two-body hadronic decays (e.g. $B^0_s \to K^+K^-$). These are estimated from MC simulation of the relevant processes, as indicated in Fig. 3.
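The sideband extrapolation described above can be illustrated with a minimal sketch. It is not the fit used in the analysis (which is an unbinned maximum likelihood fit): it assumes a straight-line background shape, a binned unweighted least-squares fit, and hypothetical window boundaries chosen only for illustration.

```python
import numpy as np

def extrapolate_sideband(masses, signal_window=(5.2, 5.45),
                         fit_range=(4.9, 5.9), n_bins=20):
    """Estimate the combinatorial yield inside the signal window.

    A straight line is fitted to the histogram bins lying outside the
    signal window (the sidebands) and integrated across the window.
    The window, range and binning are illustrative assumptions; they
    are chosen so that the window edges coincide with bin edges.
    """
    masses = np.asarray(masses, dtype=float)
    counts, edges = np.histogram(masses, bins=n_bins, range=fit_range)
    centers = 0.5 * (edges[:-1] + edges[1:])
    width = edges[1] - edges[0]
    lo, hi = signal_window

    # Use only sideband bins, so that a possible signal does not bias the fit.
    in_sideband = (centers < lo) | (centers > hi)
    slope, intercept = np.polyfit(centers[in_sideband], counts[in_sideband], 1)

    # Integrate the fitted line (events per bin) over the signal window.
    area = 0.5 * slope * (hi**2 - lo**2) + intercept * (hi - lo)
    return area / width
```

For a flat background of 100 events per 50 MeV bin, the function returns about 500 events in the 250 MeV window. Peaking and non-peaking physics backgrounds cannot be estimated this way, which is why they are taken from simulation.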
These physics backgrounds arise from the misidentification of charged hadrons as muons. A dedicated muon selection algorithm is employed to minimize the hadron-to-muon misidentification probability. It is based on a BDT that exploits kinematic information together with combined silicon-tracker and muon-chamber track information, and aims at discriminating genuine muons from those arising from hadron decays in flight or detector punch-through. A misidentification rate at the per-mille level, between $0.4 \times 10^{-3}$ and $2.2 \times 10^{-3}$, is achieved.

Likelihood fit and significance
The analysis is optimized separately in four channels, defined by the dataset ($\sqrt{s} = 7$ TeV or $\sqrt{s} = 8$ TeV) and by whether both muons lie within the region $|\eta_\mu| < 1.4$ (barrel) or not (endcap). Two methodologies are employed to determine the final results. (1) In the 1D-BDT method, a single threshold on the selection-BDT output $b$ is optimized, per channel, for best significance. (2) In the categorized-BDT method, the discriminant $b$ is used to define 12 event categories, with different signal-to-background ratios but similar expected signal yields. The results are extracted via a simultaneous unbinned maximum likelihood fit to the resulting mass distributions. The fit projections are illustrated in Fig. 4 for both approaches.
An excess of events in the signal region is observed above the background predictions. The excess is found to be compatible with the $B^0_s \to \mu^+\mu^-$ signal with an observed (expected median) significance of 4.3 (4.8) standard deviations ($\sigma$) in the categorized-BDT approach, and $4.8\sigma$ ($4.7\sigma$) in the 1D-BDT approach. The excess in the $B^0$ region has a significance of $2.0\sigma$. The $B^0_s$ ($B^0$) significance is determined by evaluating the ratio of the likelihood values obtained under the hypothesis of no $B^0_s$ ($B^0$) signal and when the $B^0_s$ ($B^0$) signal is allowed to float, while the $B^0$ ($B^0_s$) contribution is treated as a nuisance parameter that floats in the fit.
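As a rough illustration of how a likelihood ratio translates into a significance, the sketch below applies the asymptotic (Wilks) approximation, $Z = \sqrt{2\,\Delta(-\ln L)}$, to a single-bin Poisson counting experiment. This is a simplifying assumption made here for illustration only: the actual analysis uses an unbinned fit to the mass distributions with profiled nuisance parameters, and the function names are hypothetical.

```python
import math

def significance_from_nll(nll_null, nll_free):
    """Asymptotic significance from two negative log-likelihood values.

    nll_null: minimum -ln L with the signal fixed to zero.
    nll_free: minimum -ln L with the signal floating (signal >= 0).
    """
    q0 = 2.0 * (nll_null - nll_free)
    return math.sqrt(max(q0, 0.0))

def poisson_nll(n_obs, expected):
    """-ln L of a Poisson count, dropping the constant ln(n_obs!) term."""
    return expected - n_obs * math.log(expected)

def counting_significance(n_obs, background):
    """Toy example: one bin, known background, best-fit signal n_obs - background."""
    s_hat = max(n_obs - background, 0.0)
    return significance_from_nll(
        poisson_nll(n_obs, background),
        poisson_nll(n_obs, background + s_hat),
    )
```

For instance, 20 observed events over an expected background of 10 give a significance of about 2.8 standard deviations in this toy setting.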

Branching fractions
Under the hypothesis that the observed excess in the data corresponds to a genuine signal, the decay branching fractions may be extracted from the fit to the data. The fit likelihood is built with the branching fractions as the parameters of interest. To minimize uncertainties associated with the B production cross section and reconstruction efficiency, and to constrain the normalization of the physics backgrounds, a sample of $B^+ \to J/\psi K^+$ (with $J/\psi \to \mu^+\mu^-$) decays, obtained with the same dimuon selection as for the signal, is employed. The branching fractions are measured using
$$\mathcal{B}(B^0_s \to \mu^+\mu^-) = \frac{N_{B^0_s}}{N_{B^+}} \, \frac{f_u}{f_s} \, \frac{\varepsilon_{B^+}}{\varepsilon_{B^0_s}} \, \mathcal{B}(B^+)$$
(and analogously for the $B^0$ channel), where $N$ denotes the extracted yields of the signal and normalization channels, $\varepsilon$ the corresponding total trigger and reconstruction efficiencies, $\mathcal{B}(B^+) = (6.0 \pm 0.2) \times 10^{-5}$ [19] the branching fraction for $B^+ \to J/\psi K^+ \to \mu^+\mu^- K^+$, and $f_s/f_u = 0.256 \pm 0.020$ [20] the ratio of the $B^0_s$ and $B^+$ fragmentation fractions. The branching fractions, determined from the categorized-BDT fit, are
$$\mathcal{B}(B^0_s \to \mu^+\mu^-) = (3.0^{+1.0}_{-0.9}) \times 10^{-9}, \qquad \mathcal{B}(B^0 \to \mu^+\mu^-) = (3.5^{+2.1}_{-1.8}) \times 10^{-10}. \tag{2}$$
The uncertainties are dominated by the statistical component. The main systematic uncertainties arise from $f_s/f_u$ and from the description of the physics backgrounds.
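The normalization procedure amounts to a simple ratio, sketched below. The sketch takes the quoted value 0.256 as the $B^0_s$-to-$B^+$ fragmentation ratio $f_s/f_u$, so that the formula divides by it. The yields and efficiencies passed in the usage note are placeholders, not numbers from the analysis, and the function name is hypothetical.

```python
def branching_fraction(n_sig, n_norm, eff_sig, eff_norm,
                       bf_norm=6.0e-5, fs_over_fu=0.256):
    """Signal branching fraction relative to the B+ -> J/psi K+ channel.

    n_sig, n_norm:     fitted yields of the signal and normalization channels
    eff_sig, eff_norm: total trigger and reconstruction efficiencies
    bf_norm:           B(B+ -> J/psi K+ -> mu mu K+), value quoted in the text
    fs_over_fu:        fragmentation-fraction ratio; 0.256 is taken here as
                       f_s/f_u (an assumption of this sketch). Set it to 1.0
                       for the B0 channel if f_d = f_u is assumed.
    """
    yield_ratio = n_sig / n_norm
    efficiency_ratio = eff_norm / eff_sig
    return yield_ratio * efficiency_ratio * bf_norm / fs_over_fu
```

With placeholder inputs such as `branching_fraction(10, 1.0e5, 0.002, 0.002)`, the yield ratio of $10^{-4}$ is scaled by $6.0 \times 10^{-5} / 0.256$. Because only ratios of yields and efficiencies enter, uncertainties common to the signal and normalization channels largely cancel.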
The results in Eq. 2 are compatible with the SM expectations. To verify and further quantify the level of agreement of the measurements with the SM expectations, the likelihood is expressed in terms of the signal-strength parameters $\mu_s$ and $\lambda_{ds}$. The scan of the joint likelihood for $\mu_s$ and $\lambda_{ds}$ is shown in Fig. 5. The compatibility with the SM point, $\mu_s = \lambda_{ds} = 1$, is further illustrated therein.

LHC combination
The CMS and LHCb collaborations at the LHC have both reported measurements of the branching fractions $\mathcal{B}(B^0_{(s)} \to \mu^+\mu^-)$ based on the full Run I dataset. The ATLAS results based on the full Run I dataset have not yet been reported. A combination of the CMS and LHCb measurements is presented here.

Combined CMS and LHCb results
The results reported above in Sect. 2 are based on the analysis of pp-collision data corresponding to 5 fb$^{-1}$ at $\sqrt{s} = 7$ TeV and 20 fb$^{-1}$ at $\sqrt{s} = 8$ TeV. LHCb has reported [22], simultaneously with CMS, the results of a search based on their full LHC Run I dataset, corresponding to 1 fb$^{-1}$ at $\sqrt{s} = 7$ TeV collected in 2011 and 2 fb$^{-1}$ at $\sqrt{s} = 8$ TeV collected in 2012. LHCb likewise reports an excess of events over the background expectation, with a significance of $4.0\sigma$ for $B^0_s \to \mu^+\mu^-$ and $2.0\sigma$ for $B^0 \to \mu^+\mu^-$. The CMS and LHCb measurements are compatible with each other. To increase the precision of the measured branching fractions, their combination is pursued. In the combination, known correlations must be accounted for and common external inputs must be harmonized. A correlated systematic uncertainty arises, for example, from the imperfect knowledge of the ratio of fragmentation fractions $f_u/f_s$, which enters because the decay $B^+ \to J/\psi K^+$ is used as the normalization channel.
In a preliminary combination of the results from both experiments [21], the asymmetric uncertainties are incorporated by using ensembles of simplified pseudoexperiments. The results are illustrated in Fig. 6. As for the individual analyses, and despite small upward fluctuations observed in both experiments for $\mathcal{B}(B^0 \to \mu^+\mu^-)$, the combined results are compatible with the SM expectations.
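One simple way to propagate asymmetric uncertainties through a combination with pseudoexperiments is sketched below. It is an illustrative assumption, not the procedure of Ref. [21]: each measurement is modeled as a two-piece (split) normal distribution, the measurements are treated as uncorrelated, and each pseudoexperiment is combined with fixed inverse-variance weights. All function names are hypothetical.

```python
import numpy as np

def sample_split_normal(rng, mode, sigma_lo, sigma_hi, size):
    """Draw from a two-piece normal with different widths below and above the mode."""
    # The lower half is chosen with probability sigma_lo / (sigma_lo + sigma_hi),
    # which keeps the density continuous at the mode.
    below = rng.random(size) < sigma_lo / (sigma_lo + sigma_hi)
    z = np.abs(rng.standard_normal(size))
    return np.where(below, mode - sigma_lo * z, mode + sigma_hi * z)

def combine(measurements, n_toys=200_000, seed=1):
    """Combine (value, sigma_lo, sigma_hi) tuples with pseudoexperiments.

    Returns the median of the ensemble of weighted averages and its
    lower and upper 68% half-widths. Correlations are ignored.
    """
    rng = np.random.default_rng(seed)
    # Fixed weights from the symmetrized uncertainty of each input.
    weights = np.array([1.0 / (0.5 * (lo + hi)) ** 2 for _, lo, hi in measurements])
    draws = np.stack([sample_split_normal(rng, m, lo, hi, n_toys)
                      for m, lo, hi in measurements])
    averages = weights @ draws / weights.sum()
    q16, q50, q84 = np.quantile(averages, [0.16, 0.50, 0.84])
    return q50, q50 - q16, q84 - q50
```

A call such as `combine([(3.0, 0.9, 1.0), (2.5, 1.0, 1.2)])` would combine the CMS $B^0_s$ result quoted above (in units of $10^{-9}$) with a second, purely hypothetical measurement. A full treatment would also account for the correlated inputs, such as the fragmentation-fraction ratio.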
A more thorough approach explores the combination of the likelihood functions, through a simultaneous fit to the two datasets [23]. This allows, in particular, a more reliable evaluation of the statistical significance of the combined analyses. The observed excess of events over background is compatible with the $B^0_s \to \mu^+\mu^-$ ($B^0 \to \mu^+\mu^-$) signal at a significance level in excess of $6\sigma$ ($3\sigma$).

Prospects
Following the very successful LHC Run I, both the accelerator and the detectors are undergoing major upgrades, and will undergo further ones in the coming years. These will result in dramatic increases in sensitivity for rare-decay searches such as those discussed in this article.

Further measurements
With the $B^0_s \to \mu^+\mu^-$ decay detected for the first time during LHC Run I, as reported herein, the focus of the coming LHC runs will be on pursuing the search for the $B^0 \to \mu^+\mu^-$ decay, and on further establishing and carrying out precision measurements of both channels.
The analysis reported in Sect. 2 was optimized primarily for the search for the $B^0_s \to \mu^+\mu^-$ decay. Further analysis refinements may be pursued for each channel, for example at the level of candidate selection, fitting, and normalization channels.
It will be possible to explore additional observables that are also theoretically clean and provide complementary sensitivity to BSM effects. A precise determination of the ratio of branching fractions allows non-minimal flavor violation scenarios to be probed. $B^0_s$ mixing-induced observables, namely the effective lifetime [4] and CP asymmetries, are also particularly interesting.

Accelerator and detector upgrades
In the coming years the LHC will increase the center-of-mass collision energy, to 14 TeV, as well as the delivered luminosity. From 2023, a major upgrade is projected (HL-LHC); the instantaneous luminosity for CMS will be leveled at a maximum of $5 \times 10^{34}$ cm$^{-2}$ s$^{-1}$, with the goal of delivering a total integrated luminosity of 3 ab$^{-1}$. The (staged) increase in instantaneous luminosity, leading up to an average of 140 primary interactions per collision (pileup), demands in turn equally dramatic upgrades of the detectors.
Starting with Run II (2015-2017) of the LHC, the CMS detector will have improved muon trigger capabilities, providing sharper momentum resolution and the ability to cope with increased pileup conditions. With Run III (2019-2021), it will have an improved pixel detector, with an additional layer, improving the vertex resolution by 50%. For the HL-LHC phase, from 2023 onwards, the detector will have been refurbished with an enhanced muon system and a redesigned inner tracker. The new tracker is expected to have hardware-level trigger capabilities. The smaller pitch of the silicon sensors, combined with the material budget reduced by a factor of 2, will improve the mass resolution by about a factor of 1.5 in the barrel region ($|\eta| < 1.4$). This will help separate the $B^0$ signal from the tail of the $B^0_s$ signal, which now becomes a background to the $B^0$ search and measurement.

Sensitivity projections
Projections are made by scaling the results obtained in Run I, and by parameterizing, under estimated assumptions [24], the expected effect of the detector improvements and of the harsher collision environment. The same analysis and trigger performance as in Run I are assumed, along with a 35% loss in detection efficiency. The SM ratio of the two branching fractions is also assumed. Projected reconstruction yields and resolutions are illustrated in Fig. 7. Branching fraction sensitivities are summarized in Table 1. The scenarios considered for these projections include LHC Run III and HL-LHC, with the corresponding CMS upgrades [25].
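The leading effect in such a projection is the growth of the data sample. A minimal sketch of this scaling is given below, assuming that the uncertainty is purely statistical and scales with the inverse square root of the effective integrated luminosity. It is not the parameterization of Ref. [24]: it neglects systematic uncertainties, resolution improvements, and the change of the production cross section with collision energy. The function name and the reference values in the usage note are assumptions.

```python
import math

def projected_relative_uncertainty(rel_unc_ref, lumi_ref, lumi_new,
                                   efficiency_factor=1.0):
    """Scale a statistics-dominated relative uncertainty to a new luminosity.

    rel_unc_ref:       relative uncertainty measured with lumi_ref
    lumi_ref:          reference integrated luminosity (e.g. in fb^-1)
    lumi_new:          projected integrated luminosity (same units)
    efficiency_factor: fraction of the reference detection efficiency
                       retained (0.65 for a 35% loss)
    """
    effective_lumi = lumi_new * efficiency_factor
    return rel_unc_ref * math.sqrt(lumi_ref / effective_lumi)
```

For example, taking a Run I relative uncertainty of roughly 0.32 on $\mathcal{B}(B^0_s \to \mu^+\mu^-)$ (from the result quoted above) and an assumed reference luminosity, `projected_relative_uncertainty(0.32, lumi_ref, 3000.0, efficiency_factor=0.65)` gives the purely statistical expectation for 3 ab$^{-1}$ with a 35% efficiency loss.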

Summary
The study of the decays $B^0_s \to \mu^+\mu^-$ and $B^0 \to \mu^+\mu^-$ is among the highest priorities in heavy-flavor physics, owing to their exceptional sensitivity to sources of physics beyond the standard model. Based on the analysis of the full LHC Run I dataset collected by CMS, a first observation of the $B^0_s \to \mu^+\mu^-$ decay with a significance of more than 4 standard deviations is reported. The combination of the CMS and LHCb results yields the first observation of the $B^0_s \to \mu^+\mu^-$ channel with more than $6\sigma$ and the first evidence of the $B^0 \to \mu^+\mu^-$ channel with $3\sigma$. The exploration of the $B^0_{(s)} \to \mu^+\mu^-$ rare decays remains a top priority of the heavy-flavor physics program in the coming LHC runs.

Figure 1: Summary of searches over time, illustrating the steady increase in sensitivity achieved.

Figure 2: Subset of selection variables: (a) flight-distance significance; (b) pointing angle between dimuon momentum and flight direction; (c) number of tracks close to the B vertex; (d) single-muon isolation. Signal distributions are from $B^0_s \to \mu^+\mu^-$ MC simulation and background distributions from the mass sidebands in data.

Figure 3: Processes contributing to the (left) non-peaking and (right) peaking physics backgrounds, which arise from hadron misidentification. The contributions are estimated from MC simulation.

Figure 4: Illustration of the simultaneous fit to the mass spectrum, weighted over categories, for the (left) 1D-BDT and (right) categorized-BDT methods.

Figure 5: Scan of the ratio of the joint likelihood for the parameters $\mu_s$ and $\lambda_{ds}$ defined in the text, for the categorized-BDT approach. The insets show the likelihood ratio scan for each of the parameters when the other is profiled together with other nuisance parameters; the significance at which the hypothesis wherein the parameter is zero is rejected is also shown.