Observation of the decay B+ to psi(2S) phi(1020) K+ in pp collisions at sqrt(s) = 8 TeV

The decay B+ to psi(2S) phi(1020) K+ is observed for the first time using data collected from pp collisions at sqrt(s) = 8 TeV by the CMS experiment at the LHC, corresponding to an integrated luminosity of 19.6 inverse femtobarns. The branching fraction of this decay is measured, using the mode B+ to psi(2S) K+ as normalization, to be (4.0 +/- 0.4 (stat) +/- 0.6 (syst) +/- 0.2 (B)) x 10^-6, where the third uncertainty is from the measured branching fraction of the normalization channel.


Introduction
The large cross section for b quark production at the CERN LHC and the high luminosity of the accelerator provide the possibility to study rare B meson decays. Recently, several experiments have reported the likely presence of structures in the J/ψφ(1020) mass spectrum from B ± → J/ψφ(1020)K ± decays [1][2][3][4][5][6][7]. A natural extension of these results is to study the ψ(2S)φ(1020)K ± and the ψ(2S)φ(1020) mass spectra. As part of that investigation, we report the first observation of the decay B ± → ψ(2S)φ(1020)K ± , with ψ(2S) → µ + µ − and φ(1020) → K + K − . We measure the corresponding branching fraction using data collected at the LHC with the CMS detector in proton-proton (pp) collisions at √ s = 8 TeV, corresponding to an integrated luminosity of 19.6 fb −1 . Possible contributions from nonresonant K + K − and f 0 (980) states in the signal are also studied, and an upper limit is set on the fraction of events that do not correspond to φ(1020) → K + K − in the B ± → ψ(2S)K ± K ∓ K ± channel. In what follows, φ is used to represent the φ(1020) meson, and all results are combined in the investigation of the two charge-conjugate states.

The CMS detector
The central feature of the CMS apparatus is a 13 m long superconducting solenoid of 6 m internal diameter, providing a magnetic field of 3.8 T. A silicon pixel and strip tracker, a lead tungstate crystal electromagnetic calorimeter, and a brass and scintillator hadron calorimeter, each composed of a barrel and two endcap sections, reside within the volume of the solenoid. Muons are measured in gas-ionization detectors embedded in the steel flux-return yoke outside the solenoid. Extensive forward calorimetry complements the coverage provided by the barrel and endcap detectors.
The main subdetectors used in the present analysis are the silicon tracker and the muon detection system. Muons are identified within the pseudorapidity range |η| < 2.4, using detection planes based on three technologies: drift tubes, cathode-strip chambers, and resistive-plate chambers. The silicon tracker measures charged particles within the range |η| < 2.5. It consists of 66 million 100×150 µm 2 silicon pixels and more than 9 million silicon strips. For reconstructed particles with transverse momenta 1 < p T < 10 GeV and |η| < 1.4, the track resolutions are typically 1.5% in p T , and the transverse and longitudinal impact parameters are in the respective ranges of 25-90 and 45-150 µm [8].

Data and event selection
priate for the data analyzed, including the effects of alignment, efficiency, and average number of additional pp collisions and their multiple reconstructed vertices per beam crossing (pileup).
The events in the analysis were collected with a trigger based on the invariant mass and p T of the dimuon system. The following criteria are applied in the HLT: (i) the dimuon p T is required to be greater than 4.9 GeV, (ii) the two muons must be oppositely charged, (iii) the dimuon invariant mass is required to be in the range of 3.35-4.05 GeV, and (iv) the dimuon tracks must form a three-dimensional (3D) vertex with a χ 2 probability greater than 0.5%.
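The four HLT criteria listed above can be expressed as a single predicate. The following is a minimal sketch and not the CMS trigger code: the function name, the argument names, and the inclusive treatment of the mass-window edges are assumptions made for illustration.

```python
def passes_dimuon_trigger(dimuon_pt, dimuon_mass, charge_1, charge_2, vertex_prob):
    """Return True if a dimuon candidate satisfies the four HLT criteria.

    Momenta and masses are in GeV. The names and the inclusive mass-window
    edges are illustrative assumptions, not taken from the trigger software.
    """
    return (
        dimuon_pt > 4.9                    # (i) dimuon pT above 4.9 GeV
        and charge_1 * charge_2 < 0        # (ii) oppositely charged muons
        and 3.35 <= dimuon_mass <= 4.05    # (iii) dimuon mass window
        and vertex_prob > 0.005            # (iv) 3D vertex chi2 probability > 0.5%
    )
```

For example, a hypothetical candidate with p T = 6 GeV, mass 3.686 GeV, opposite charges, and a vertex probability of 10% passes, while the same candidate with p T = 4 GeV does not.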
The two muons from the triggered event are required to be within 5 σ varies from 23 MeV for |η| < 0.6 to 45 MeV for |η| > 1.8. The B + → ψ(2S)φK + candidates are reconstructed by combining three additional charged particle tracks consistent with originating from the ψ(2S) vertex, and have a total charge of 1. These tracks are assigned the kaon mass. The B + decay vertex is reconstructed using a kinematic fit to a common 3D vertex constraining the invariant mass of the two muons to the nominal ψ(2S) mass. For multiple candidates, the one with the highest B + vertex probability is retained. The overall efficiencies in selecting the correct candidate obtained from MC studies are 96.8% and 99.4% for the B + → ψ(2S)φK + and B + → ψ(2S)K + events, respectively. The p T of each kaon track is required to be greater than 1 GeV. Only tracks passing the standard CMS high-purity requirements [14] are used. There are two K + K − combinations for the three charged kaon tracks, and the combination with invariant mass closest to the nominal φ meson mass [13] is used as the φ candidate. This selection yields the correct K + K − pair (94 ± 1)% of the time, as determined from simulation. The mass of the φ candidate is not constrained to its nominal value because the experimental K + K − mass resolution (1.3 MeV, obtained from our MC simulation) is less than the natural width of the φ meson (4.3 MeV) [13].
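The choice of the φ candidate among the K + K − combinations can be sketched as follows. This is an illustration under stated assumptions, not the analysis code: tracks are represented as hypothetical (charge, (px, py, pz)) tuples in GeV, every track is assigned the kaon mass as in the text, and the mass constants are approximate world-average values.

```python
import math
from itertools import combinations

KAON_MASS = 0.493677  # GeV, approximate charged-kaon mass (assumed constant)
PHI_MASS = 1.019461   # GeV, approximate phi(1020) mass (assumed constant)


def pair_mass(p1, p2, mass=KAON_MASS):
    """Invariant mass of two tracks with momenta p1, p2, both given the kaon mass."""
    e1 = math.sqrt(mass * mass + sum(c * c for c in p1))
    e2 = math.sqrt(mass * mass + sum(c * c for c in p2))
    p_sum_sq = sum((a + b) ** 2 for a, b in zip(p1, p2))
    return math.sqrt(max((e1 + e2) ** 2 - p_sum_sq, 0.0))


def select_phi_candidate(tracks):
    """Pick the opposite-sign pair whose mass is closest to the phi mass.

    tracks: three (charge, (px, py, pz)) tuples with total charge +1 or -1,
    which gives exactly two opposite-sign combinations.
    """
    pairs = [(a, b) for a, b in combinations(tracks, 2) if a[0] * b[0] < 0]
    return min(pairs, key=lambda ab: abs(pair_mass(ab[0][1], ab[1][1]) - PHI_MASS))
```

Because the pair is chosen by proximity to the φ mass, any nonresonant K + K − contribution is pulled toward the peak by construction, which is why the text later studies this choice as a source of bias.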
Additional requirements are placed on the resulting sample to optimize the sensitivity to the signal mode. The signal region is defined to lie within ±5 σ B M of the nominal B + mass [13], where σ B M is determined to be 3 MeV in a fit to simulated signal events using a single Gaussian function. Five quantities are chosen to optimize a 5 standard deviation discovery Punzi figure of merit (FOM), defined as N S /(5/2 + √ N B ) [15], where N S is the number of B + candidates in the simulated sample, and N B is the number of background candidates within ±5 σ B M of the B + mass peak. The background contribution is obtained from a fit to the sideband events in the ψ(2S)φK + invariant mass spectrum, where the lower and upper sidebands are defined as 5.220-5.264 and 5.294-5.330 GeV, respectively. The five quantities used to optimize the FOM are as follows: (i) the B + vertex probability; (ii) the significance of the transverse displacement, defined as the ratio of the transverse distance L xy of the B + secondary vertex relative to the center of the beam spot and its uncertainty σ L xy , with the latter being the sum in quadrature of the uncertainty in the transverse position of the secondary vertex and the transverse size of the beam spot; (iii) the cosine of the pointing angle θ, defined as the angle between the reconstructed B + momentum vector and its flight direction, as determined from the vector connecting the primary vertex [8] to the B + secondary vertex, where the primary vertex is chosen so that this angle is closest to zero; (iv) the p T of the dimuon system; and (v) the φ mass window, defined as the difference between the invariant mass of the K + K − system and the mass of the φ meson [13]. The selection criteria derived from the optimization procedure are shown in Table 1. The overall efficiency of the offline signal selection is (1.91 ± 0.01) × 10 −3 .
Table 1: The selection criteria derived from the optimization procedure.
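The figure of merit N S /(5/2 + √ N B ) used in the optimization is simple to evaluate for a set of trial requirements. The sketch below assumes a hypothetical scan format of (cut value, N S , N B ) rows; the numbers in the usage note are invented for illustration and are not taken from the analysis.

```python
import math


def punzi_fom(n_signal, n_background, n_sigma=5.0):
    """Punzi figure of merit N_S / (n_sigma/2 + sqrt(N_B)), with n_sigma = 5 as in the text."""
    if n_background < 0:
        raise ValueError("n_background must be non-negative")
    return n_signal / (n_sigma / 2.0 + math.sqrt(n_background))


def best_cut(scan):
    """Return the (cut_value, n_signal, n_background) row with the largest FOM.

    The row layout is a hypothetical convention chosen for this sketch.
    """
    return max(scan, key=lambda row: punzi_fom(row[1], row[2]))
```

With invented scan rows (0.05, 100, 400), (0.10, 90, 100), and (0.20, 60, 25), the FOM values are 100/22.5, 90/12.5, and 60/7.5, so the tightest requirement is preferred even though it keeps the fewest signal candidates.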
The invariant mass spectrum of the selected ψ(2S)φK + candidates is shown in Fig. 1. An extended unbinned maximum-likelihood estimator from RooFit [16] is used to perform the fit to the data, using two Gaussian functions for the signal and a first-order polynomial for the background. The two Gaussian functions share a common mean fixed to the nominal B + mass [13], while their widths and relative fractions are fixed to the values obtained in the MC simulation. The goodness of fit is checked using a χ 2 test, which returns a χ 2 per degree of freedom (dof) of 23.0/24, with a corresponding probability of 52%. The fit gives a B + yield of 140 ± 15 events, where the uncertainty is statistical.
Possible contamination from the decays of the f 0 (980) meson and nonresonant K + K − is studied through a simultaneous fit of the K + K − invariant mass distributions for the combinations closest to the nominal φ mass inside and outside of an 18 MeV mass window centered around the nominal B + mass, not using the ±8 MeV φ mass window listed in Table 1. The distributions of the nonresonant K + K − and f 0 background invariant mass contributions are obtained from dedicated B + → ψ(2S)K + K − K + and B + → ψ(2S)f 0 K + MC simulations generated using EVTGEN [11], which models the f 0 distribution as a coupled-channel Breit-Wigner function [17]. Both the nonresonant K + K − and f 0 contributions are distorted through the selection of the K + K − pair closest to the nominal φ mass. We parametrize these forms using Gaussian functions that are very similar for the two components. Their correlation coefficients show that the  Fig. 2(a) thereby provides the non-B + background. Both distributions share the same non-B + background function. We find (not displayed) that the non-B + contribution within the B + mass window is 194 ± 14 events, obtained from a fit to the ψ(2S)φK + invariant mass spectrum. 
In the simultaneous fit, we therefore fix this number to 194, while the number of non-B + events contributing in the sidebands is allowed to vary.
The φ signal component is parametrized by a P-wave relativistic Breit-Wigner function, convolved with a Gaussian resolution function. The standard deviation of the Gaussian function is fixed to 1.3 MeV. The mass and width of the φ reflect their nominal values [13]. Since there is a φ signal in the non-B + events, the non-B + φ contribution in Fig. 2(a) is parametrized by the sum of a Crystal Ball function [18] and the above-mentioned function that represents the φ component.
The data in Fig. 2(a) are fitted using the non-B + function, and simultaneously in Fig. 2(b) using the above three functions. The fit returns a yield of 2 ± 20 events for the non-φ signal contribution, which is too small to be seen in Fig. 2(b). The systematic uncertainty in this yield is negligible. The fit quality in Fig. 2(b) is checked using a χ 2 test, which returns χ 2 /dof = 21.6/16. We set an upper limit on the fraction of the non-φ component in B + → ψ(2S)K + K − K + decays, obtained with the CL s method [19,20] using an asymptotic approximation [21], of 0.26 at the 95% confidence level.
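The χ 2 probabilities quoted for the fits can be cross-checked with a short calculation. For an even number of degrees of freedom the upper-tail probability has a closed form, exp(−χ 2 /2) Σ (χ 2 /2)^i / i! with i running from 0 to dof/2 − 1. The sketch below uses that identity with the standard library only; the function name is an assumption, and the fits in the analysis were performed with RooFit, not with this code.

```python
import math


def chi2_probability_even_dof(chi2, dof):
    """Upper-tail chi2 probability, valid only for even, positive dof.

    Uses the closed form exp(-x/2) * sum_{i=0}^{dof/2 - 1} (x/2)^i / i!.
    """
    if dof <= 0 or dof % 2 != 0:
        raise ValueError("closed form requires an even, positive dof")
    half = chi2 / 2.0
    term = 1.0   # i = 0 term of the series
    total = 1.0
    for i in range(1, dof // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total


p_mass_fit = chi2_probability_even_dof(23.0, 24)  # chi2/dof quoted for the mass fit
p_kk_fit = chi2_probability_even_dof(21.6, 16)    # chi2/dof quoted for Fig. 2(b)
```

The first value reproduces the 52% probability quoted for the ψ(2S)φK + mass fit to within rounding. The second is shown only as a usage example, since the text does not quote a probability for the Fig. 2(b) fit.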

B + → ψ(2S)K + decay
The B + → ψ(2S)K + decay is chosen as the normalization channel because its absolute branching fraction is well measured, it is recorded with the same trigger as the signal channel, and it is topologically similar to the signal, so that many systematic uncertainties cancel or are reduced. All applicable selection requirements are kept the same as those for the signal channel. The ψ(2S)K + invariant mass distribution is shown in Fig. 3. A binned maximum-likelihood fit is used to determine the number of events in this channel. Again, two Gaussian functions are used to model the B + signal, and a first-order polynomial to model the background. The large number of events allows a fit with all parameters free to vary and the yield is found to be 87264 ± 363 (stat). The goodness of the fit is checked using a χ 2 test, which returns χ 2 /dof = 363/253.
Estimates of the contributions to the systematic uncertainty in B(B + → ψ(2S)φK + ) are summarized in Table 2, and described below. The uncertainty from modeling the shape of the B + invariant mass distribution is estimated to be 8.6% by allowing the widths of the two Gaussian functions to vary in the fit, with the background function fixed to a first order polynomial. Systematic uncertainties from sources such as muon identification, trigger efficiency, and track reconstruction efficiency for the three common tracks (two muons and a kaon) almost cancel in the measurement of the signal branching fraction. The uncertainty in the charged particle track reconstruction efficiency, obtained in an independent study by comparing two-body and four-body D 0 decays in data and simulated events [8], gives an uncertainty of 3.9% per track and a total uncertainty of 7.8% for the two additional kaon tracks. A mismatch in the p T distribution between B + mesons in MC simulations and in data can lead to an incorrect efficiency. We therefore reweight the signal and normalization events using a weighting function derived from the normalization channel. The ratio of efficiencies from the reweighted MC events is compared to the nominal value to extract a systematic uncertainty of 5.3%.
The choice of the K + K − candidate closest to the nominal φ mass introduces a bias. To estimate any systematic contamination of the K + K − mass peak from non-φ backgrounds, the analysis is repeated without the requirement that the K + K − mass be the one closest to the mass of the φ, and the branching fraction is remeasured keeping both K + K − combinations in each event. The resulting B + → ψ(2S)K + K − K + invariant mass distribution is shown in Fig. 4. The signal in Fig. 4 is clear, but there is more background relative to the signal mass distribution shown in Fig. 1. There are 165 ± 18 B + signal events with two K + K − combinations for each event. The efficiency for the B + → ψ(2S)φK + signal after removing the choice of φ candidate is (2.14 ± 0.02) × 10 −3 , and the redetermined B(B + → ψ(2S)φK + ) is (4.2 ± 0.4 (stat)) × 10 −6 . The 5.0% difference between this and the nominal branching fraction is used as the systematic uncertainty from possible non-φ backgrounds.
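The size of this shift can be checked from the quoted yields and efficiencies. The sketch below assumes that the normalization-channel inputs are unchanged between the two determinations, so that the branching fraction scales as the ratio of yield to efficiency; the function name is an assumption, and the inputs are the rounded values from the text.

```python
def scaled_branching_fraction(bf_nominal, n_nominal, eff_nominal, n_alt, eff_alt):
    """Rescale a branching fraction by the change in yield/efficiency.

    Assumes everything else in the measurement (normalization yield,
    normalization efficiency, external branching fractions) is unchanged.
    """
    return bf_nominal * (n_alt / eff_alt) / (n_nominal / eff_nominal)


# Nominal selection: 140 events, efficiency 1.91e-3, B = 4.0e-6.
# Without the closest-to-phi choice: 165 events, efficiency 2.14e-3.
bf_alt = scaled_branching_fraction(4.0e-6, 140, 1.91e-3, 165, 2.14e-3)
relative_shift = bf_alt / 4.0e-6 - 1.0
```

With these rounded inputs the rescaled value rounds to 4.2 × 10 −6 and the relative shift is about 5%, in line with the numbers quoted above; small differences are expected because the published inputs are rounded.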
The uncertainties in modeling the B + → ψ(2S)φK + and the normalization channel backgrounds are estimated to be 2.9% and 2.2%, respectively, by adding polynomials of higher order in the fit to describe the background. The uncertainty from the angular distribution of the K + K − system is estimated to be 1.9%, based on the changes induced in the B + reconstruction efficiency by weighting the simulated events with different helicity angle distributions. The uncertainty in the B + mass shape for the normalization channel is estimated to be 1.0% by adding a third Gaussian function with a common mean and a varying width to the fit, with the background again modeled by a linear function. The uncertainty in B(φ → K + K − ) is 1% [13].
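As a consistency check, the individual relative uncertainties quoted in this section can be combined in quadrature and compared with the systematic uncertainty on the branching fraction. The sketch assumes that the sources are uncorrelated and that the list below is complete; Table 2 may contain entries not described in the text, and the dictionary keys are labels chosen for this illustration.

```python
import math

# Relative systematic uncertainties in percent, as quoted in the text.
sources = {
    "signal B+ mass shape": 8.6,
    "tracking, two additional kaons": 7.8,
    "B+ pT reweighting": 5.3,
    "non-phi background (phi choice)": 5.0,
    "signal background model": 2.9,
    "normalization background model": 2.2,
    "K+K- angular distribution": 1.9,
    "normalization B+ mass shape": 1.0,
    "B(phi -> K+K-)": 1.0,
}

# Quadrature sum, assuming the sources are uncorrelated.
total_percent = math.sqrt(sum(v * v for v in sources.values()))

# Absolute systematic uncertainty on B = 4.0e-6, in units of 1e-6.
absolute_syst = 4.0 * total_percent / 100.0
```

The quadrature sum is about 14.4%, which corresponds to roughly 0.6 × 10 −6 on the measured branching fraction, consistent with the systematic uncertainty quoted with the result. The 7.8% tracking entry is itself a linear sum of 3.9% per track, reflecting a fully correlated treatment of the two additional kaon tracks.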
Possible systematic uncertainties introduced by different trigger and pileup conditions and analysis selections have been investigated by dividing the data into subsets and evaluating the statistical consistency [13] of the independent samples; the resulting variations are found to be within the expected uncertainties.