Study of the rare $B_s^0$ and $B^0$ decays into the $\pi^+\pi^-\mu^+\mu^-$ final state

A search for the rare decays $B_s^0 \to \pi^+\pi^-\mu^+\mu^-$ and $B^0 \to \pi^+\pi^-\mu^+\mu^-$ is performed in a data set corresponding to an integrated luminosity of 3.0 fb$^{-1}$ collected by the LHCb detector in proton-proton collisions at centre-of-mass energies of 7 and 8 TeV. Decay candidates with pion pairs that have invariant mass in the range 0.5-1.3 GeV/$c^2$ and with muon pairs that do not originate from a resonance are considered. The first observation of the decay $B_s^0 \to \pi^+\pi^-\mu^+\mu^-$ and the first evidence of the decay $B^0 \to \pi^+\pi^-\mu^+\mu^-$ are obtained and the branching fractions, restricted to the dipion-mass range considered, are measured to be $\mathcal{B}(B_s^0 \to \pi^+\pi^-\mu^+\mu^-)=(8.6\pm 1.5\,({\rm stat}) \pm 0.7\,({\rm syst})\pm 0.7\,({\rm norm}))\times 10^{-8}$ and $\mathcal{B}(B^0 \to \pi^+\pi^-\mu^+\mu^-)=(2.11\pm 0.51\,({\rm stat}) \pm 0.15\,({\rm syst})\pm 0.16\,({\rm norm}) )\times 10^{-8}$, where the third uncertainty is due to the branching fraction of the decay $B^0\to J/\psi(\to \mu^+\mu^-)K^*(890)^0(\to K^+\pi^-)$, used as a normalisation.

In this Letter, a search for the B 0 (s) → π + π − μ + μ − decays is reported. The analysis is restricted to events with muons that do not originate from φ, J /ψ, and ψ(2S) resonances, and with pion pairs with invariant mass in the range 0.5-1.3 GeV/c 2 . This mass range is set to include both f 0 (980) and ρ(770) 0 resonances, which overlap because of their large widths [17]. Other resonances, as well as non-resonant pions, might contribute [1,2]. However, due to the limited size of the data sample, an amplitude analysis of the π + π − mass spectrum is not attempted. The analysis is performed in a data set corresponding to an integrated luminosity of 3.0 fb −1 , collected by the LHCb detector in protonproton (pp) collisions. The first 1.0 fb −1 of data was collected in 2011 with collisions at the centre-of-mass energy of 7 TeV; the remaining 2.0 fb −1 in 2012 at 8 TeV. The signal yields are obtained from a fit to the unbinned π + π − μ + μ − mass distribution of the decay candidates. The fit modelling and the methods for the background estimation are validated on data, by fitting while the branching fractions of B 0 (s) → π + π − μ + μ − decays are normalised using B 0 → J /ψ K * (892) 0 decays reconstructed in the same data set.

Detector and simulation
The LHCb detector [18] is a single-arm forward spectrometer covering the pseudorapidity range 2 < η < 5 detector surrounding the pp interaction region [19], a large-area silicon-strip detector located upstream of a dipole magnet with a bending power of about 4 Tm, and three stations of siliconstrip detectors and straw drift tubes [20] placed downstream of the magnet. The tracking system provides a measurement of momentum with a relative uncertainty that varies from 0.4% at low momentum to 0.6% at 100 GeV/c. The minimum distance of a track to a primary vertex (PV), the impact parameter (IP), is measured with a resolution of 20 μm for charged particles with high transverse momentum (p T ). Different types of charged hadrons are distinguished using information from two ring-imaging Cherenkov detectors (RICH) [21]. Photon, electron and hadron candidates are identified by a calorimeter system consisting of scintillating-pad and preshower detectors, an electromagnetic calorimeter and a hadronic calorimeter. Muons are identified by a system composed of alternating layers of iron and multiwire proportional chambers [22]. Samples of simulated events are used to determine the efficiency of selecting B 0 (s) → π + π − μ + μ − and B 0 → J /ψ K * (892) 0 decays, and to study backgrounds. In the simulation, pp collisions are generated using Pythia [23,24] with a specific LHCb configuration [25]. Decays of hadronic particles are described by Evt-Gen [26], in which final-state radiation is generated using Photos [27]. The model of Refs. [12,28,29] is used to describe B 0 (s) → π + π − μ + μ − decays. The interaction of the generated particles with the detector and its response are implemented using the Geant4 toolkit [30,31] as described in Ref. [32].

Event selection
The online event-selection (trigger) consists of a hardware stage, based on information from the calorimeter and muon systems, followed by a software stage, which applies a full event reconstruction [33]. For this analysis, the hardware trigger requires at least one muon with p T > 1.48 (1.76) GeV/c, or two muons with  [35] are negligible. The muon candidates are combined with a pair of oppositely charged pions with invariant mass in the range 0.5-1.3 GeV/c 2 to form B 0 mass is required to be in the range 2.796-3.216 GeV/c 2 , and the invariant mass of the pion and kaon system in the range 0.826-0.966 GeV/c 2 . The four tracks are required to originate from the same B 0 (s) decay vertex. The B 0 (s) momentum vector is required to be within 14 mrad of the vector that joins the PV with the B 0 (s) decay vertex (flight distance vector).
The information from the RICH, the calorimeters, and the muon systems is used for particle identification (PID), i.e., to define a likelihood for each track to be associated with a certain particle hypothesis. Requirements on the muon-identification likelihood are applied to reduce to O(10 −2 ) the rate of misidentified muon candidates, mainly pions, whilst preserving 95% signal efficiency. In the case of B 0 → J /ψ K * (892) 0 decays, PID requirements on kaon candidates are applied to suppress any contributions from B 0 (s) → J /ψ π + π − decays with pions misidentified as kaons. In the case of B 0 (s) → π + π − μ + μ − decays, a requirement on the PID of pion candidates is applied to reduce the contami- misidentified as pions; this background peaks around 5.25 GeV/c 2 in the π + π − μ + μ − mass spectrum. A large data set of B 0 → J /ψ π + π − decays is used to optimise the PID requirement of pion candidates, assuming that the proportion between misidenti- The requirement retains about 55% of the signal candidates. Simulations show that additional contributions from B 0 ble kaon-pion misidentification are negligible. A requirement on the proton-identification likelihood of pion candidates suppresses the contamination from decays with protons misidentified as pions, with a 95% signal efficiency. After this selection, simulations show that contributions from where both the proton and the kaon are misidentified as pions.
In addition to the above requirements, a multivariate selection based on a boosted decision tree (BDT) [36,37] is used to suppress the large background from random combinations of tracks (combinatorial background) present in the π + π − μ + μ − sample.
The BDT is trained using simulated B 0 s → π + π − μ + μ − events to model the signal, and data candidates with π + π − μ + μ − mass in the range 5.5-5.8 GeV/c 2 for the background. The training is performed separately for the 2011 and 2012 data, and using simulations that reproduce the specific operational conditions of each year. The variables used in the BDT are the significance of the displacement from the PV of pion and muon tracks, the fit χ 2 of the B 0 (s) decay vertex, the angle between the B 0 (s) momentum vector and the flight distance vector, the p T of the B 0 (s) candidate, the sum and the difference of the transverse momenta of pions, the difference of the transverse momenta of muons, the B 0 (s) decay time, and the minimum p T of the pions. The resulting BDT output is independent of the π + π − μ + μ − mass and PID variables.
A requirement on the BDT output value is chosen to maximise [38], where ε is the signal efficiency; N b is the number of background events that pass the selection and have a mass within 30 MeV/c 2 of the known value of the B 0 s mass [17]; α represents the desired significance of the signal, expressed in terms of number of standard deviations. The value of α is set to 3 (5) for the 2011 (2012) data set. The resulting selection has around 85% efficiency to select signal candidates. The The efficiencies of all selection requirements are estimated with simulations, except for the efficiency of the PID selection for hadrons. The latter is determined in data using large and low-background samples of D * + → D 0 (→ K − π + )π + decays; the efficiencies are evaluated after reweighting the calibration samples to match simultaneously the momentum and pseudorapidity distributions of the final-state particles of B 0 Table 1 Selection efficiencies of the 2011 and 2012 data sets; ε s for the B 0 9.33 ± 0.05 (stat) ± 0.35 (syst) 9.74 ± 0.08 (stat) ± 0.27 (syst)  Table 1. The statistical uncertainties are due to the size of the calibration and simulation samples; systematic uncertainties are described in what follows. The total efficiency varies by approximately 15% in the π + π − mass range considered and it is parametrised with a second-order polynomial.
The signal candidates are weighted in order to have a constant efficiency as a function of the π + π − mass spectrum.
Systematic uncertainties of the efficiencies are dominated by the limited information about the signal decay-models; the main contribution comes from the unknown angular distributions of B 0 (s) → π + π − μ + μ − decay products. To estimate this uncertainty, the difference in efficiencies between decays generated according to a phase-space model and to the model of Refs. [12,28,29] is considered. The resulting relative uncertainty is 5.4%. A relative uncertainty of 3.7% (2.8%) for 2011 (2012) data is estimated by considering the difference of the efficiencies evaluated in the simulation and in data for B 0 → J /ψ K * (892) 0 decays. The same relative uncertainty is assigned to the efficiency associated with B 0 (s) → π + π − μ + μ − decays, as the cancellation of this uncertainty in the ratio of the efficiencies of signal and normalisation decays may not be exact. This is due to the fact that the p T distributions of the final-state particles are different between the decay modes.
An additional 1.6% relative uncertainty is assigned to ε s , due to the unknown mixture of B 0 s mass eigenstates in B 0 s → π + π − μ + μ − decays, which results in a B 0 s effective lifetime that could differ from the value used in the simulations [39].

Determination of the signal yields
The ratio of the branching fractions is the quantity being measured; it is used to express the observed yields of B 0 (s) → π + π − μ + μ − decays as follows: where N n is the B 0 → J /ψ K * (892) 0 yield, f s / f d is the ratio of the fragmentation probabilities for B 0 s and B 0 mesons [40], ε q is the selection efficiency of B 0 s → π + π − μ + μ − (B 0 → π + π − μ + μ − ) decays, and ε n the one of B 0 → J /ψ K * (892) 0 decays.
The number of events N n in Eq. (1) is obtained from an extended maximum likelihood fit to the unbinned μ + μ − K + π − mass distribution of the B 0 → J /ψ K * (892) 0 candidates in the range 4.97-5.77 GeV/c 2 . The μ + μ − K + π − mass distribution is shown in Fig. 1 with fit projections overlaid. A sum of two Gaussian functions, with a power-law tail on either side derived from simulations, is used to describe the dominant B 0 → J /ψ K * (892) 0 peak and the small B 0 s → J /ψ K * (892) 0 contribution. All function parameters are in common between the B 0 and B 0 s signal functions, except for the mass; the mass difference between B 0 s and B 0 mesons is fixed to the known value [17]. An exponential function is used to model the combinatorial background. A small contamination of B + → J /ψ K + decays combined with an additional charged pion is modelled with an ARGUS function [41]. Partially reconstructed B 0 decays at masses lower than the B 0 signal are described with another ARGUS function. The fitted yields of B 0 → J /ψ K * (892) 0 decays are corrected by subtracting a (6.4 ± 1.0)% contribution of B 0 → J /ψ K + π − decays [42], where the K + π − pair is in a S-wave state and does not originate from the decay of a K * (892) 0 resonance. The numbers of B 0 → J /ψ K * (892) 0 decays are 9821 ± 110 (stat) ± 134 (syst) ± 97 (S wave) and 23 521 ± 175 (stat) ± 172 (syst) ± 243 (S wave) in the 2011 and 2012 data sets, respectively, where the third uncertainty is due to the S-wave subtraction. The systematic uncertainty accounts for the uncertainties in the parameters fixed in the fit to the values determined in simulations, and are calculated with the method described at the end of this section.
The ratios R s and R d are measured from an extended maximum likelihood fit to the unbinned π + π − μ + μ − mass distribution, where the signal yields are parametrised using Eq. (1), and all other inputs are fixed. The different centre-of-mass energies result in different bb production cross sections and selection efficiencies in the 2011 and 2012 data samples. Therefore, the two samples are fitted simultaneously with different likelihood functions, but with the parameters R s and R d in common. We also fit simultaneously the B 0 (s) → π + π − μ + μ − and B 0 (s) → J /ψ π + π − samples. The latter are selected with the B 0 (s) → π + π − μ + μ − requirements, except for the dimuon mass, which is restricted to the 2.796-3.216 GeV/c 2 range. The B 0 (s) → J /ψ π + π − fit serves as a consistency check of the fit modelling, since the B 0 (s) → π + π − μ + μ − and B 0 (s) → J /ψ π + π − mass distributions are expected to be similar. In both samples, the fit range is 2 GeV/c 2 wide and starts from 5.19 GeV/c 2 . This limit is set to remove partially reconstructed decays of the B 0 mesons with an unreconstructed π 0 . The stability of the fit results is checked against the extension of the fit range in the lower mass region of the B 0 (s) → π + π − μ + μ − and B 0 (s) → J /ψ π + π − mass distributions, where an additional component is needed in the fit to describe the partially reconstructed B 0 decays below 5.19 GeV/c 2 .  The B 0 (s) → π + π − μ + μ − and B 0 (s) → J /ψ π + π − signals are described by a model similar to that used for the B 0 → J /ψ K * (892) 0 signal in the fit of the μ + μ − K + π − mass distribution. The B 0 peak position is a common parameter for the B 0 and B 0 (s) → J /ψ π + π − fits, as well as the signal resolutions; the difference between the B 0 and the B 0 s masses is fixed to the known value. The B 0 (s) → π + π − μ + μ − signal widths are multiplied by scale factors, derived from simulations, which accounts for the different momentum spectra between non-resonant muons and muons from J /ψ meson decays. In both fits, the combinatorial background is modelled with an exponential function.
decays, where kaons are misidentified as pions, are estimated using control samples of these decays reconstructed in data. They are selected as B 0 (s) → π + π − μ + μ − (B 0 (s) → J /ψ π + π − ) candidates, except for different requirements on the PID variables of the kaon and pion candidates, as for the normalisation decay mode. To obtain the yields and the shapes of the mass distribution of the misidentified decays, the kaon candidates are assigned the pion mass, and the resulting π + π − μ + μ − mass distribution is reweighted to reproduce the PID selection of the B 0 (s) → π + π − μ + μ − sample. In the final fit, the yields of the two backgrounds are constrained using Gaussian functions with means fixed to the values obtained with this method, and widths that account for a relative uncertainty in the 2011 (2012) data sample of 15% (10%) for B 0 → K * (892) 0 μ + μ − decays, and of 2% (1%) for B 0 → J /ψ K * (892) 0 decays. The shape of the B 0 → K * (892) 0 μ + μ − background is modelled with a Gaussian function with a power-law tail on the low-mass side; the shape of the B 0 → J /ψ K * (892) 0 background is modelled with a sum of two Gaussian functions with different means. All parameters of these functions are fixed from the values obtained in the fit to the control samples. The background from B 0 s → J /ψ K * (892) 0 decays is expected to be less than 0.5% [17] of the B 0 → J /ψ K * (892) 0 yield and is neglected. Similarly, the background from B 0 Backgrounds from decays B 0 s → φ(→ π + π − π 0 )μ + μ − with an unreconstructed π 0 , B 0 s → η (→ π + π − γ )μ + μ − with an unreconstructed γ , and B + → K + μ + μ − or B + → π + μ + μ − combined with an additional charged pion, are estimated from simulations. The mass distributions of these backgrounds are modelled with ARGUS functions with parameters fixed from fits to simulated events. Backgrounds from similar decay modes, where the muons come from the J /ψ meson, are described in the B 0 (s) → J /ψ π + π − fit using the same methods. An additional contribution is given by where a pion is not reconstructed. This background is modelled with a sum of two Gaussian functions, one of which has a power-law tail on the low-mass side. Backgrounds from give a negligible contribution at π + π − μ + μ − mass greater than 5.19 GeV/c 2 .
icance by the factor 1/ 1 + (σ (syst) /σ (stat) ) 2 , where σ (stat) is the statistical uncertainty, and σ (syst) is the sum in quadrature of the contributions in Table 2, except for the uncertainty on Fig. 3 compares the π + π − mass spectra of B 0 (s) → π + π − μ + μ − and B 0 (s) → J /ψ π + π − candidates, separately for the B 0 s and the B 0 decays. The background is subtracted using the sPlot technique [44] with the π + π − μ + μ − mass as the discriminating variable. The data show the dominance of the f 0 (980) resonance in the case of B 0 s → J /ψ π + π − decays, and of the ρ(770) 0 resonance in the case of B 0 → J /ψ π + π − decays, as expected from previous LHCb analyses [1,2]. The B 0 (s) → π + π − μ + μ − data show indications of a similar composition of the π + π − mass spectrum, although the size of the sample is not sufficient to draw a definite conclusion.
Several systematic uncertainties on R s and R d are considered, as summarised in Table 2. The contribution due to the uncertainties on parameters that are fixed in the fit, and on the efficiencies and the yields of B 0 → J /ψ K * (892) 0 decays that are fixed in Eq. (1), is obtained by repeating the fit, each time with the relevant parameters or inputs fixed to alternate values. These are sampled from Gaussian distributions centred at the nominal value, and whose widths correspond to the uncertainties on the fixed parameters and inputs. Known correlations between fixed parameters are taken into account. The r.m.s. spreads of the resulting R s and R d values are taken as the systematic uncertainties. The uncertainties associated with efficiencies are the sums in quadrature of their statistical and systematic uncertainties, reported in Table 1. The uncertainty on the B 0 → J /ψ K * (892) 0 yield is the sum in quadrature of the statistical uncertainty, the systematic uncertainty, and the uncertainty due to the S-wave subtraction. A systematic uncertainty is assigned on the estimation of the combinatorial background with the following method; pseudo experiments are generate in an extended mass range from 4.97 GeV/c 2 , where an additional peaking component is also added to simulate the partially reconstructed B 0 decays, and the pseudo data are fitted in the nominal range from 5.19 GeV/c 2 . The shifts between the average fitted values and the input values of R s and R d are taken as the systematic uncertainties. The contribution to the systematic uncertainty of R s due to the uncertainty on the values of f s / f d is also included. The final systematic uncertainties are the sums in quadrature of all contributions and correspond to 45% and 28% of the statistical uncertainties of R s and R d , respectively.