Measurement of the CP-violating phase $\phi_s$ and the $B_s^0$ meson decay width difference with $B_s^0 \to J/\psi\phi$ decays in ATLAS

A measurement of the $B_s^0$ decay parameters in the $B_s^0 \to J/\psi\phi$ channel using an integrated luminosity of 14.3 fb$^{-1}$ collected by the ATLAS detector from 8 TeV $pp$ collisions at the LHC is presented. The measured parameters include the CP-violating phase $\phi_s$, the decay width $\Gamma_s$ and the width difference between the mass eigenstates $\Delta\Gamma_s$. The values measured for the physical parameters are statistically combined with those from 4.9 fb$^{-1}$ of 7 TeV data. In the analysis the parameter $\Delta\Gamma_s$ is constrained to be positive. Results for $\phi_s$ and $\Delta\Gamma_s$ are also presented as 68% and 95% likelihood contours in the $\phi_s$–$\Delta\Gamma_s$ plane. Also measured in this decay channel are the transversity amplitudes and corresponding strong phases. All measurements are in agreement with the Standard Model predictions.


Introduction
New phenomena beyond the predictions of the Standard Model (SM) may alter CP violation in $b$-hadron decays. A channel that is expected to be sensitive to new physics contributions is the decay $B_s^0 \to J/\psi\phi$. CP violation in the $B_s^0 \to J/\psi\phi$ decay occurs due to interference between direct decays and decays with $B_s^0$-$\bar{B}_s^0$ mixing. The oscillation frequency of $B_s^0$ meson mixing is characterized by the mass difference $\Delta m_s$ of the heavy ($B_H$) and light ($B_L$) mass eigenstates. The CP-violating phase $\phi_s$ is defined as the weak phase difference between the $B_s^0$-$\bar{B}_s^0$ mixing amplitude and the $b \to c\bar{c}s$ decay amplitude. In the absence of CP violation, the $B_H$ state would correspond to the CP-odd state and the $B_L$ to the CP-even state. In the SM the phase $\phi_s$ is small and can be related to Cabibbo-Kobayashi-Maskawa (CKM) quark mixing matrix elements via the relation $\phi_s \simeq -2\beta_s$, with $\beta_s = \arg[-(V_{ts}V_{tb}^*)/(V_{cs}V_{cb}^*)]$; assuming no physics beyond the SM contributions to $B_s^0$ mixing and decays, a value of $-2\beta_s = -0.0363^{+0.0016}_{-0.0015}$ rad can be predicted by combining beauty and kaon physics observables [1].
Other physical quantities involved in $B_s^0$-$\bar{B}_s^0$ mixing are the decay width $\Gamma_s = (\Gamma_L + \Gamma_H)/2$ and the width difference $\Delta\Gamma_s = \Gamma_L - \Gamma_H$, where $\Gamma_L$ and $\Gamma_H$ are the decay widths of the light and heavy mass eigenstates. The width difference is predicted to be $\Delta\Gamma_s = 0.087 \pm 0.021$ ps$^{-1}$ [2]. Physics beyond the SM is not expected to affect $\Delta\Gamma_s$ as significantly as $\phi_s$ [3]. However, extracting $\Delta\Gamma_s$ from data is interesting as it allows theoretical predictions to be tested [3]. Previous measurements of these quantities have been reported by the DØ, CDF, LHCb, ATLAS and CMS collaborations [4,5,6,7,8,9].
The decay of the pseudoscalar $B_s^0$ to the vector-vector $J/\psi(\mu^+\mu^-)\phi(K^+K^-)$ final state results in an admixture of CP-odd and CP-even states, with orbital angular momentum $L = 0$, 1 or 2. The final states with orbital angular momentum $L = 0$ or 2 are CP-even, while the state with $L = 1$ is CP-odd. The same final state can also be produced with $K^+K^-$ pairs in an S-wave configuration [10]. This S-wave final state is CP-odd. The CP states are separated statistically using an angular analysis of the final-state particles. Flavour tagging is used to distinguish between the initial $B_s^0$ and $\bar{B}_s^0$ states. The analysis presented here provides a measurement of the $B_s^0 \to J/\psi\phi$ decay parameters using 14.3 fb$^{-1}$ of LHC $pp$ data collected by the ATLAS detector during 2012 at a centre-of-mass energy of 8 TeV. This is an update of the previous flavour-tagged time-dependent angular analysis of $B_s^0 \to J/\psi\phi$ [8] that was performed using 4.9 fb$^{-1}$ of data collected at 7 TeV. Electrons are now included, in addition to final-state muons, for the flavour tagging using leptons.

ATLAS detector and Monte Carlo simulation
The ATLAS detector [11] is a multi-purpose particle physics detector with a forward-backward symmetric cylindrical geometry and nearly 4π coverage in solid angle. The inner tracking detector (ID) consists of a silicon pixel detector, a silicon microstrip detector and a transition radiation tracker. The ID is surrounded by a thin superconducting solenoid providing a 2 T axial magnetic field, and by a high-granularity liquid-argon (LAr) sampling electromagnetic calorimeter. A steel/scintillator tile calorimeter provides hadronic coverage in the central rapidity range. The end-cap and forward regions are instrumented with LAr calorimeters for electromagnetic and hadronic measurements. The muon spectrometer (MS) surrounds the calorimeters and consists of three large superconducting toroids with eight coils each, a system of tracking chambers, and detectors for triggering.
The muon and tracking systems are of particular importance in the reconstruction of B meson candidates. Only data collected when both these systems were operating correctly and when the LHC beams were declared to be stable are used in the analysis. The data were collected during a period of rising instantaneous luminosity, and the trigger conditions varied over this time. The triggers used to select events for this analysis are based on identification of a J/ψ → µ + µ − decay, with transverse momentum (p T ) thresholds of either 4 GeV or 6 GeV for the muons. The measurement uses 14.3 fb −1 of pp collision data collected with the ATLAS detector at a centre-of-mass energy of 8 TeV. Data collected at the beginning of the 8 TeV data-taking period are not included in the analysis due to a problem with the trigger tracking algorithm. The trigger was subsequently changed to use a different tracking algorithm that did not have this problem.
To study the detector response, estimate backgrounds and model systematic effects, 12 million Monte Carlo (MC) simulated B 0 s → J/ψφ events were generated using Pythia 8 [12,13] tuned with ATLAS data [14]. No p T cuts were applied at the generator level. The detector response was simulated using the ATLAS simulation framework based on GEANT4 [15,16]. In order to take into account the varying number of proton-proton interactions per bunch crossing (pile-up) and trigger configurations during data-taking, the MC events were weighted to reproduce the same pile-up and trigger conditions in data. Additional samples of the background decay B 0 d → J/ψK 0 * , as well as the more general bb → J/ψX and pp → J/ψX backgrounds were also simulated using Pythia 8.

Reconstruction and candidate selection
Events must pass the trigger selections described in Section 2. In addition, each event must contain at least one reconstructed primary vertex, formed from at least four ID tracks, and at least one pair of oppositely charged muon candidates that are reconstructed using information from the MS and the ID [17]. A muon identified using a combination of MS and ID track parameters is referred to as a combined-muon. A muon formed from a MS track segment that is not associated with a MS track but is matched to an ID track extrapolated to the MS is referred to as a segment-tagged muon. The muon track parameters are determined from the ID measurement alone, since the precision of the measured track parameters is dominated by the ID track reconstruction in the p T range of interest for this analysis. Pairs of oppositely charged muon tracks are refitted to a common vertex and the pair is accepted for further consideration if the quality of the fit meets the requirement χ 2 /d.o.f. < 10. The invariant mass of the muon pair is calculated from the refitted track parameters. In order to account for varying mass resolution in different parts of the detector, the J/ψ candidates are divided into three subsets according to the pseudorapidity η of the muons. A maximum-likelihood fit is used to extract the J/ψ mass and the corresponding mass resolution for these three subsets. When both muons have |η| < 1.05, the dimuon invariant mass must fall in the range 2.959-3.229 GeV to be accepted as a J/ψ candidate. When one muon has 1.05 < |η| < 2.5 and the other muon |η| < 1.05, the corresponding signal region is 2.913-3.273 GeV. For the third subset, where both muons have 1.05 < |η| < 2.5, the signal region is 2.852-3.332 GeV. In each case the signal region is defined so as to retain 99.8% of the J/ψ candidates identified in the fits.
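The pseudorapidity-dependent J/ψ selection described above can be sketched as a small lookup. This is only an illustration: the function and constant names are hypothetical, and assigning a muon at exactly |η| = 1.05 to the central category is an assumption, since the text does not specify the boundary treatment.

```python
# Mass windows in GeV, taken from the text and keyed by the number of
# muons with 1.05 < |eta| < 2.5 (0, 1 or 2).
BARREL_ETA_MAX = 1.05
ETA_MAX = 2.5
JPSI_WINDOWS_GEV = {
    0: (2.959, 3.229),  # both muons with |eta| < 1.05
    1: (2.913, 3.273),  # one muon central, one with 1.05 < |eta| < 2.5
    2: (2.852, 3.332),  # both muons with 1.05 < |eta| < 2.5
}


def is_jpsi_candidate(mass_gev, eta1, eta2, vertex_chi2_ndof):
    """Return True if a refitted dimuon pair passes the J/psi selection."""
    # Vertex-fit quality requirement quoted in the text.
    if vertex_chi2_ndof >= 10.0:
        return False
    # Muons outside the quoted pseudorapidity coverage are rejected.
    if abs(eta1) >= ETA_MAX or abs(eta2) >= ETA_MAX:
        return False
    # Assumption: |eta| == 1.05 exactly is counted as central.
    n_forward = sum(abs(eta) > BARREL_ETA_MAX for eta in (eta1, eta2))
    low, high = JPSI_WINDOWS_GEV[n_forward]
    return low <= mass_gev <= high
```

For example, a dimuon mass of 3.25 GeV passes when one muon is forward but fails when both muons are central, reflecting the wider window used where the mass resolution is poorer.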
The candidates for the decay $\phi \to K^+K^-$ are reconstructed from all pairs of oppositely charged particles with $p_T > 1$ GeV and $|\eta| < 2.5$ that are not identified as muons. Candidate events for $B_s^0 \to J/\psi(\mu^+\mu^-)\phi(K^+K^-)$ decays are selected by fitting the tracks for each combination of $J/\psi \to \mu^+\mu^-$ and $\phi \to K^+K^-$ to a common vertex. Each of the four tracks is required to have at least one hit in the pixel detector and at least four hits in the silicon microstrip detector. The fit is further constrained by fixing the invariant mass calculated from the two muon tracks to the J/ψ mass [18]. A quadruplet of tracks is accepted for further analysis if the vertex fit has a $\chi^2$/d.o.f. < 3, the fitted $p_T$ of each track from $\phi \to K^+K^-$ is greater than 1 GeV and the invariant mass of the track pairs (assuming that they are kaons) falls within the interval 1.0085 GeV < $m(K^+K^-)$ < 1.0305 GeV. If there is more than one accepted candidate in the event, the candidate with the lowest $\chi^2$/d.o.f. is selected. In total, 375,987 $B_s^0$ candidates are collected within a mass range of 5.150-5.650 GeV. For each $B_s^0$ meson candidate the proper decay time $t$ is estimated using the expression:
\[ t = \frac{L_{xy}\, m_B}{p_{T_B}}, \]
where $p_{T_B}$ is the reconstructed transverse momentum of the $B_s^0$ meson candidate and $m_B$ denotes the mass of the $B_s^0$ meson, taken from [18]. The transverse decay length, $L_{xy}$, is the displacement in the transverse plane of the $B_s^0$ meson decay vertex with respect to the primary vertex, projected onto the direction of the $B_s^0$ transverse momentum. The position of the primary vertex used to calculate this quantity is determined from a refit following the removal of the tracks used to reconstruct the $B_s^0$ meson candidate.
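As a worked illustration of the proper decay time estimate, the sketch below assumes the standard form $t = L_{xy}\,m_B/p_{T_B}$ in natural units and makes the factor of $c$ explicit to convert millimetres to picoseconds. The function names, the choice of units and the example numbers are assumptions made for illustration, not values from the analysis.

```python
import math

C_MM_PER_PS = 0.299792458  # speed of light in mm/ps


def transverse_decay_length_mm(pv_xy, sv_xy, pt_xy):
    """Project the PV -> decay-vertex displacement onto the transverse momentum."""
    dx = sv_xy[0] - pv_xy[0]
    dy = sv_xy[1] - pv_xy[1]
    pt = math.hypot(pt_xy[0], pt_xy[1])
    return (dx * pt_xy[0] + dy * pt_xy[1]) / pt


def proper_decay_time_ps(lxy_mm, mass_gev, pt_gev):
    """t = L_xy * m_B / (c * p_T), with L_xy in mm and m_B, p_T in GeV."""
    return lxy_mm * mass_gev / (C_MM_PER_PS * pt_gev)


# Hypothetical candidate: 0.5 mm transverse flight, p_T = 10 GeV.
lxy = transverse_decay_length_mm((0.0, 0.0), (0.3, 0.4), (6.0, 8.0))
t_ps = proper_decay_time_ps(lxy, mass_gev=5.366, pt_gev=10.0)
```

Because $L_{xy}$ is a signed projection, a candidate whose vertex is reconstructed behind the primary vertex gives a negative $t$, which is why the decay time model has to allow for negative values.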
For the selected events the average number of pile-up proton-proton interactions is 21, necessitating a choice of the best candidate for the primary vertex at which the B 0 s meson is produced. The variable used is the three-dimensional impact parameter d 0 , which is calculated as the distance between the line extrapolated from the reconstructed B 0 s meson vertex in the direction of the B 0 s momentum, and each primary vertex candidate. The chosen primary vertex is the one with the smallest d 0 .
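The primary-vertex choice described above amounts to a point-to-line distance minimisation. A minimal sketch follows, assuming vertices and momenta are available as Cartesian triples; the function names are hypothetical.

```python
import math


def point_to_line_distance(point, line_point, direction):
    """Distance from `point` to the line through `line_point` along `direction`."""
    rx, ry, rz = (point[i] - line_point[i] for i in range(3))
    dx, dy, dz = direction
    # |r x d| / |d| is the perpendicular distance to the line.
    cx = ry * dz - rz * dy
    cy = rz * dx - rx * dz
    cz = rx * dy - ry * dx
    return math.sqrt(cx * cx + cy * cy + cz * cz) / math.sqrt(dx * dx + dy * dy + dz * dz)


def choose_primary_vertex(pv_candidates, b_vertex, b_momentum):
    """Return the index of the PV candidate closest to the extrapolated B line."""
    return min(
        range(len(pv_candidates)),
        key=lambda i: point_to_line_distance(pv_candidates[i], b_vertex, b_momentum),
    )
```

With around 21 pile-up interactions per event, this selection is evaluated over every reconstructed primary vertex candidate for each signal candidate.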
A study [19] made using a MC simulated dataset has shown that the precision of the reconstructed B 0 s proper decay time remains stable over the range of pile-up encountered during 2012 data-taking. No B 0 s meson decay-time cut is applied in this analysis.

Flavour tagging
The initial flavour of a neutral B meson can be inferred using information from the opposite-side B meson that contains the other pair-produced b-quark in the event [20,21]. This is referred to as opposite-side tagging (OST).
To study and calibrate the OST methods, events containing B ± → J/ψK ± decays are used, where the flavour of the B ± -meson is provided by the kaon charge. A sample of B ± → J/ψK ± candidates is selected from the entire 2012 dataset satisfying the data-quality selection described in Section 2.
Since the OST calibration is not affected by the trigger problem at the start of the 8 TeV data-taking period, the tagging measurement uses 19.5 fb$^{-1}$ of integrated luminosity of $pp$ collision data.
Figure 1: The invariant mass distribution for $B^\pm \to J/\psi K^\pm$ candidates satisfying the selection criteria, used to study the flavour tagging. Data are shown as points, and the overall result of the fit is given by the blue curve. The contribution from the combinatorial background component is indicated by the red dotted line, partially reconstructed B decays by the green shaded area, and decays of $B^\pm \to J/\psi\pi^\pm$, where the pion is mis-assigned a kaon mass, by the purple dashed line.

B ± → J/ψK ± event selection
In order to select candidate B ± → J/ψK ± decays, firstly J/ψ candidates are selected from pairs of oppositely charged combined-muons forming a good vertex, following the criteria described in Section 3. Each muon is required to have a transverse momentum of at least 4 GeV and pseudorapidity within |η| < 2.5. The invariant mass of the dimuon candidate is required to satisfy 2.8 GeV < m(µ + µ − ) < 3.4 GeV. To form the B candidate, an additional track, satisfying the same quality requirements described for tracks in Section 3, is combined with the dimuon candidate using the charged kaon mass hypothesis, and a vertex fit is performed with the mass of the dimuon pair constrained to the known value of the J/ψ mass. To reduce the prompt component of the combinatorial background, a requirement is applied to the transverse decay length of the B candidate of L xy > 0.1 mm.
A sideband subtraction method is used in order to study parameter distributions corresponding to the $B^\pm$ signal processes with the background component subtracted. Events are divided into subsets according to five intervals in the pseudorapidity of the B candidate and three mass regions. The mass regions are defined as a signal region around the fitted peak signal mass position, $\mu \pm 2\sigma$, and two sideband regions, $[\mu - 5\sigma, \mu - 3\sigma]$ and $[\mu + 3\sigma, \mu + 5\sigma]$, where $\mu$ and $\sigma$ are the mean and width of the Gaussian function describing the B signal mass. Separate binned extended maximum-likelihood fits are performed to the invariant mass distribution in each region of pseudorapidity.
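The mass regions above can be turned into per-event weights. The sketch below assumes a background that is approximately linear in mass across the window, so that the two sidebands (total width 4σ) estimate the background under the signal region (width 4σ) with a weight of −1; the text does not state the weights actually used, and the function names are hypothetical.

```python
import bisect


def sideband_weight(mass, mu, sigma):
    """+1 in the signal region, -1 in either sideband, 0 elsewhere."""
    pull = abs(mass - mu) / sigma
    if pull <= 2.0:
        return 1.0
    if 3.0 <= pull <= 5.0:
        return -1.0
    return 0.0


def sideband_subtracted_histogram(values, masses, mu, sigma, edges):
    """Histogram `values` with sideband weights derived from `masses`."""
    counts = [0.0] * (len(edges) - 1)
    for value, mass in zip(values, masses):
        weight = sideband_weight(mass, mu, sigma)
        index = bisect.bisect_right(edges, value) - 1
        if weight != 0.0 and 0 <= index < len(counts):
            counts[index] += weight
    return counts
```

In the analysis this would be done separately in each of the five pseudorapidity intervals, each with its own fitted $\mu$ and $\sigma$.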
An exponential function is used to model the combinatorial background and a hyperbolic tangent function to parameterize the low-mass contribution from incorrectly or partially reconstructed B decays. A Gaussian function is used to model the B ± → J/ψπ ± contribution. The contribution from non-combinatorial background is found to have a negligible effect on the tagging procedure. Figure 1 shows the invariant mass distribution of B candidates for all rapidity regions overlaid with the fit result for the combined data.

Flavour tagging methods
Several methods that differ in efficiency and discriminating power are available to infer the flavour of the opposite-side b-quark. The measured charge of a muon or electron from a semileptonic decay of the B meson provides strong separation power; however, the $b \to \ell$ transitions are diluted through neutral B meson oscillations, as well as by cascade decays $b \to c \to \ell$, which can alter the charge of the lepton relative to those from direct $b \to \ell$ decays. The separation power of lepton tagging is enhanced by considering a weighted sum of the charge of the tracks in a cone around the lepton, where the weighting function is determined separately for each tagging method by optimizing the tagging performance. If no lepton is present, a weighted sum of the charge of tracks in a jet associated with the opposite-side B meson decay provides some separation. The flavour tagging methods are described in detail below.
For muon-based tagging, an additional muon is required in the event, with p T > 2.5 GeV, |η| < 2.5 and with |∆z| < 5 mm from the primary vertex. Muons are classified according to their reconstruction class, combined or segment-tagged, and subsequently treated as distinct flavour tagging methods. In the case of multiple muons, the muon with the highest transverse momentum is selected.
A muon cone charge variable is constructed, defined as
\[ Q_\mu = \frac{\sum_i^{N\,\mathrm{tracks}} q_i \cdot (p_{T_i})^{\kappa}}{\sum_i^{N\,\mathrm{tracks}} (p_{T_i})^{\kappa}}, \]
where $q_i$ is the charge of the track, $\kappa = 1.1$ and the sum is performed over the reconstructed ID tracks within a cone, $\Delta R = \sqrt{(\Delta\phi)^2 + (\Delta\eta)^2} < 0.5$, around the muon direction. The reconstructed ID tracks must have $p_T > 0.5$ GeV and $|\eta| < 2.5$. Tracks associated with the $B^\pm$ signal decay are excluded from the sum. In Figure 2 the opposite-side muon cone charge distributions are shown for candidates from $B^\pm$ signal decays.
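A cone charge of this kind can be sketched as a momentum-weighted charge average, assuming the form $Q = \sum_i q_i\,p_{T_i}^{\kappa} / \sum_i p_{T_i}^{\kappa}$ over tracks inside the cone. The track representation as `(charge, pt, eta, phi)` tuples and the function names are hypothetical, and the removal of signal-decay tracks is assumed to have been done beforehand.

```python
import math


def delta_r(eta1, phi1, eta2, phi2):
    """Angular distance, with the azimuthal difference wrapped into [-pi, pi)."""
    dphi = (phi1 - phi2 + math.pi) % (2.0 * math.pi) - math.pi
    return math.hypot(eta1 - eta2, dphi)


def cone_charge(tracks, axis_eta, axis_phi, kappa=1.1, cone=0.5,
                min_pt=0.5, max_abs_eta=2.5):
    """Return the pT^kappa-weighted mean charge of tracks in the cone, or None."""
    numerator = 0.0
    denominator = 0.0
    for charge, pt, eta, phi in tracks:
        # Track quality requirements quoted in the text.
        if pt <= min_pt or abs(eta) >= max_abs_eta:
            continue
        if delta_r(eta, phi, axis_eta, axis_phi) >= cone:
            continue
        weight = pt ** kappa
        numerator += charge * weight
        denominator += weight
    if denominator == 0.0:
        return None  # no usable track: this tagger gives no response
    return numerator / denominator
```

By construction the result lies in [−1, +1], and a cone containing a single track returns exactly ±1, which is the origin of the discrete spikes in the tag-probability distributions.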
For electron-based tagging, an electron is identified using information from the inner detector and calorimeter and is required to satisfy the tight electron quality criteria [22]. The inner detector track associated with the electron is required to have $p_T > 0.5$ GeV and $|\eta| < 2.5$. It is required to pass within $|\Delta z| < 5$ mm of the primary vertex to remove electrons from non-signal interactions. To exclude electrons associated with the signal-side of the decay, electrons are rejected that have momenta within a cone of size $\Delta R = 0.4$ around the signal B candidate direction in the laboratory frame and opening angle between the B candidate and electron momenta, $\zeta_b$, of $\cos(\zeta_b) > 0.98$. In the case of more than one electron passing the selection, the electron with the highest transverse momentum is chosen. As in the case of muon tagging, additional tracks within a cone of size $\Delta R = 0.5$ are used to form the electron cone charge $Q_e$ with $\kappa = 1.0$. If there are no additional tracks within the cone, the charge of the electron is used. The resulting opposite-side electron cone charge distribution is shown in Figure 3 for $B^+$ and $B^-$ signal events.
In the absence of a muon or electron, b-tagged jets (i.e. jets that are the product of a b-quark) are identified using a multivariate tagging algorithm [23], which is a combination of several b-tagging algorithms using an artificial neural network and outputs a b-tag weight classifier. Jets are selected that exceed a b-tag weight of 0.7. This value is optimized to maximize the tagging power of the calibration sample. Jets are reconstructed from track information using the anti-$k_t$ algorithm [24] with a radius parameter $R = 0.8$. In the case of multiple jets, the jet with the highest value of the b-tag weight is used.
The jet charge is defined as
\[ Q_{\mathrm{jet}} = \frac{\sum_i^{N\,\mathrm{tracks}} q_i \cdot (p_{T_i})^{\kappa}}{\sum_i^{N\,\mathrm{tracks}} (p_{T_i})^{\kappa}}, \]
where $\kappa = 1.1$ and the sum is over the tracks associated with the jet, excluding those tracks associated with a primary vertex other than that of the signal decay and tracks from the signal candidate. Figure 4 shows the distribution of the opposite-side jet-charge for $B^\pm$ signal candidates.
The efficiency, $\epsilon$, of an individual tagging method is defined as the ratio of the number of events tagged by that method to the total number of candidates. A probability $P(B|Q)$ ($P(\bar{B}|Q)$) that a specific event has a signal decay containing a $\bar{b}$-quark ($b$-quark) given the value of the discriminating variable is constructed from the calibration samples for each of the $B^+$ and $B^-$ samples, which defines $P(Q|B^+)$ and $P(Q|B^-)$, respectively. The probability to tag a signal event as containing a $\bar{b}$-quark is therefore $P(B|Q) = P(Q|B^+)/(P(Q|B^+) + P(Q|B^-))$, and correspondingly $P(\bar{B}|Q) = 1 - P(B|Q)$. It is possible to define a quantity called the dilution $D = P(B|Q) - P(\bar{B}|Q) = 2P(B|Q) - 1$, which represents the strength of a particular flavour tagging method. The tagging power of a particular tagging method is defined as
\[ T = \epsilon D^2 = \sum_i \epsilon_i \cdot (2P_i(B|Q_i) - 1)^2, \]
where the sum is over the bins of the probability distribution as a function of the charge variable. An effective dilution, $D = \sqrt{T/\epsilon}$, is calculated from the measured tagging power and efficiency.
Table 1: Summary of tagging performance for the different flavour tagging methods described in the text. Uncertainties shown are statistical only. The efficiency and tagging power are each determined by summing over the individual bins of the charge distribution. The effective dilution is obtained from the measured efficiency and tagging power. For the efficiency, dilution, and tagging power, the corresponding uncertainty is determined by combining the appropriate uncertainties in the individual bins of each charge distribution.
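The relation between efficiency, dilution and tagging power can be made concrete with a binned calculation. The sketch below assumes $T = \sum_i \epsilon_i (2P_i(B|Q_i) - 1)^2$ summed over bins of the charge variable and an effective dilution $\sqrt{T/\epsilon}$; the input layout (per-bin counts of tagged $B^+$ and $B^-$ calibration candidates, plus sample totals that include untagged candidates) and the function name are hypothetical.

```python
import math


def tagging_performance(n_plus, n_minus, n_total_plus, n_total_minus):
    """Return (efficiency, tagging power, effective dilution) from binned counts."""
    n_total = n_total_plus + n_total_minus
    efficiency = 0.0
    power = 0.0
    for count_plus, count_minus in zip(n_plus, n_minus):
        if count_plus + count_minus == 0:
            continue  # empty bin: contributes nothing
        p_q_given_plus = count_plus / n_total_plus     # P(Q|B+)
        p_q_given_minus = count_minus / n_total_minus  # P(Q|B-)
        p_b = p_q_given_plus / (p_q_given_plus + p_q_given_minus)  # P(B|Q)
        eps_bin = (count_plus + count_minus) / n_total
        efficiency += eps_bin
        power += eps_bin * (2.0 * p_b - 1.0) ** 2
    dilution = math.sqrt(power / efficiency) if efficiency > 0.0 else 0.0
    return efficiency, power, dilution
```

For instance, two bins holding (10, 30) and (30, 10) tagged $B^+$ and $B^-$ candidates out of 100 of each flavour give $\epsilon = 0.4$, $T = 0.1$ and an effective dilution of 0.5.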
The flavour tagging method applied to each $B_s^0$ candidate event is taken from the information contained in a given event. By definition there is no overlap between lepton-tagged and jet-charge-tagged events. The overlap between muon- and electron-tagged events, corresponding to 0.4% of all tagged events, is negligibly small. In the case of doubly tagged events, the tagger with the highest tagging power is selected; however, the choice of hierarchy between muon- and electron-tagged events is shown to have negligible impact on the final fit results. If it is not possible to provide a tagging response for the event, then a probability of 0.5 is assigned. A summary of the tagging performance is given in Table 1.

Using tag information in the B 0 s fit
The tag-probability for each B 0 s candidate is determined from calibrations derived from a sample of B ± → J/ψK ± candidates, as described in Section 4.2. The distributions of tag-probabilities for the signal and background are different and since the background cannot be factorized out, additional probability terms, P s (P(B|Q)) and P b (P(B|Q)) for signal and background, respectively, are included in the fit. The distributions of tag-probabilities for the B 0 s candidates consist of continuous and discrete parts (events with a tag charge of ±1); these are treated separately as described below.
To describe the continuous part, a fit is first performed to the sideband data, i.e., 5.150 GeV < m(B 0 s ) < 5.317 GeV or 5.417 GeV < m(B 0 s ) < 5.650 GeV, where m(B 0 s ) is the mass of the B 0 s candidate. Different functions are used for the different tagging methods. For the combined-muon tagging method, the function has the form of the sum of a fourth-order polynomial and two exponential functions. A second-order polynomial and two exponential functions are applied for the electron tagging algorithm. A sum of three Gaussian functions is used for the segment-tagged muons. For the jet-charge tagging algorithm an eighth-order polynomial is used. In all four cases unbinned maximum-likelihood fits to data are used. In the next step, the same function as applied to the sidebands is used to describe the distributions for events in the signal region: the background parameters are fixed to the values obtained from the fits to the sidebands while the signal parameters are free in this step. The ratio of background to signal (obtained from a simultaneous mass-lifetime fit) is fixed as well. The results of the fits projected onto histograms of B 0 s tag-probability for the different tagging methods are shown in Figure 5.
To account for possible deviations between data and the selected fit models a number of alternative fit functions are used to determine systematic uncertainties in the B 0 s fit. These fit variations are described in Section 7.
The discrete components of the tag-probability distribution originate from cases where the tag is derived from a single track, giving a tag charge of exactly +1 or −1. The fractions of events f +1 and f −1 with charges +1 and −1, respectively, are determined separately for signal and background using events from the same B 0 s mass signal and sideband regions. Positive and negative charges are equally probable for background candidates formed from a random combination of a J/ψ and a pair of tracks, but this is not the case for background candidates formed from a partially reconstructed b-hadron. For signal and background contributions, similar fractions of events that are tagged with +1 or −1 tagging charge are observed for each of the tagging methods. The remaining fraction of events, 1 − f +1 − f −1 , constitute the continuous part of the distributions. Table 2 summarizes the fractions f +1 and f −1 obtained for signal and background events and for the different tag methods.

Maximum likelihood fit
An unbinned maximum-likelihood fit is performed on the selected events to extract the parameter values of the $B_s^0 \to J/\psi(\mu^+\mu^-)\phi(K^+K^-)$ decay. The fit uses information about the reconstructed mass $m$, the measured proper decay time $t$, the measured proper decay time uncertainty $\sigma_t$, the tagging probability, and the transversity angles $\Omega$ of each $B_s^0 \to J/\psi\phi$ decay candidate. The measured proper decay time uncertainty $\sigma_t$ is calculated from the covariance matrix associated with the vertex fit of each candidate event. The transversity angles $\Omega = (\theta_T, \psi_T, \phi_T)$ are defined in Section 5.1. The likelihood is independent of the $K^+K^-$ mass distribution. The likelihood function is defined as a combination of the signal and background probability density functions as follows:
\[ \ln\mathcal{L} = \sum_{i=1}^{N} \Big\{ w_i \cdot \ln\big( f_s \cdot F_s(m_i, t_i, \sigma_{t_i}, \Omega_i, P(B|Q), p_{T_i}) + f_s \cdot f_{B^0} \cdot F_{B^0}(m_i, t_i, \sigma_{t_i}, \Omega_i, P(B|Q), p_{T_i}) + f_s \cdot f_{\Lambda_b} \cdot F_{\Lambda_b}(m_i, t_i, \sigma_{t_i}, \Omega_i, P(B|Q), p_{T_i}) + (1 - f_s \cdot (1 + f_{B^0} + f_{\Lambda_b})) \cdot F_{\mathrm{bkg}}(m_i, t_i, \sigma_{t_i}, \Omega_i, P(B|Q), p_{T_i}) \big) \Big\}, \qquad (1) \]
where $N$ is the number of selected candidates, $w_i$ is a weighting factor to account for the trigger efficiency (described in Section 5.3), and $f_s$ is the fraction of signal candidates. The background fractions $f_{B^0}$ and $f_{\Lambda_b}$ are the fractions of $B^0$ mesons and $\Lambda_b$ baryons mis-identified as $B_s^0$ candidates calculated relative to the number of signal events; these parameters are fixed to their MC values and varied as part of the systematic uncertainties. The mass $m_i$, the proper decay time $t_i$ and the decay angles $\Omega_i$ are the values measured from the data for each event $i$. $F_s$, $F_{B^0}$, $F_{\Lambda_b}$ and $F_{\mathrm{bkg}}$ are the probability density functions (PDF) modelling the signal, $B^0$ background, $\Lambda_b$ background, and the other background distributions, respectively. A detailed description of the signal PDF terms in Equation (1) is given in Section 5.1. The three background functions are described in Section 5.2.
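The structure of the weighted mixture likelihood can be sketched as follows, assuming that $f_{B^0}$ and $f_{\Lambda_b}$ are expressed relative to the signal fraction (as stated in the text) so that the remaining background fraction is $1 - f_s(1 + f_{B^0} + f_{\Lambda_b})$. The per-event PDF values are taken as precomputed inputs; the dictionary keys and the function name are hypothetical.

```python
import math


def log_likelihood(events, f_s, f_b0, f_lb):
    """Weighted log-likelihood for a four-component mixture.

    Each event is a dict with the trigger weight "w" and the PDF values
    "F_s", "F_B0", "F_Lb" and "F_bkg" evaluated for that event.
    """
    # Fraction left for the remaining (combinatorial) background.
    f_bkg = 1.0 - f_s * (1.0 + f_b0 + f_lb)
    total = 0.0
    for event in events:
        density = (
            f_s * event["F_s"]
            + f_s * f_b0 * event["F_B0"]
            + f_s * f_lb * event["F_Lb"]
            + f_bkg * event["F_bkg"]
        )
        total += event["w"] * math.log(density)
    return total
```

In a real fit this function would be maximised over the physics and nuisance parameters that enter the PDF values; here the PDFs are treated as fixed numbers purely to show how the fractions and weights combine.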

Signal PDF
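The PDF used to describe the signal events, $F_s$, has the following composition:
\[ F_s(m_i, t_i, \sigma_{t_i}, \Omega_i, P(B|Q), p_{T_i}) = P_s(m_i) \cdot P_s(\Omega_i, t_i, P(B|Q), \sigma_{t_i}) \cdot P_s(\sigma_{t_i}) \cdot P_s(P(B|Q)) \cdot A(\Omega_i, p_{T_i}) \cdot P_s(p_{T_i}). \]
The mass function $P_s(m_i)$ is modelled by a sum of three Gaussian distributions. The probability terms $P_s(\sigma_{t_i})$ and $P_s(p_{T_i})$ are described by gamma functions and are unchanged from the analysis described in Ref. [25]. The tagging probability term for signal $P_s(P(B|Q))$ is described in Section 4.3.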
The term $P_s(\Omega_i, t_i, P(B|Q), \sigma_{t_i})$ is a joint PDF for the decay time $t$ and the transversity angles $\Omega$ for the $B_s^0 \to J/\psi(\mu^+\mu^-)\phi(K^+K^-)$ decay. Ignoring detector effects, the distribution for the time $t$ and the angles $\Omega$ is given by the differential decay rate [26]:
\[ \frac{d^4\Gamma}{dt\, d\Omega} = \sum_{k=1}^{10} O^{(k)}(t)\, g^{(k)}(\theta_T, \psi_T, \phi_T), \]
where $O^{(k)}(t)$ are the time-dependent functions corresponding to the contributions of the four different amplitudes ($A_0$, $A_\parallel$, $A_\perp$, and $A_S$) and their interference terms, and $g^{(k)}(\theta_T, \psi_T, \phi_T)$ are the angular functions. Table 4 shows these time-dependent functions and the angular functions of the transversity angles. The formulae for the time-dependent functions have the same structure for $B_s^0$ and $\bar{B}_s^0$ but with a sign reversal in the terms containing $\Delta m_s$. In Table 4, the parameter $A_\perp(t)$ is the time-dependent amplitude for the CP-odd final-state configuration while $A_0(t)$ and $A_\parallel(t)$ correspond to CP-even final-state configurations. The amplitude $A_S(t)$ gives the contribution from the CP-odd non-resonant $B_s^0 \to J/\psi K^+K^-$ S-wave state (which includes the $f_0$). The corresponding functions are given in the last four lines of Table 4 ($k$ = 7-10). The amplitudes are parameterized by $|A_i|e^{i\delta_i}$, where $i = \{0, \parallel, \perp, S\}$, with $\delta_0 = 0$, and are normalized such that $|A_0(0)|^2 + |A_\perp(0)|^2 + |A_\parallel(0)|^2 = 1$. $|A_\perp(0)|$ is determined according to this condition, while the remaining three amplitudes are parameters of the fit. The formalism used throughout this analysis assumes no direct CP violation.
The angles ($\theta_T$, $\psi_T$, $\phi_T$) are defined in the rest frames of the final-state particles. The x-axis is determined by the direction of the φ meson in the J/ψ rest frame, and the $K^+K^-$ system defines the x-y plane, where $p_y(K^+) > 0$. The three angles are defined as:
• $\theta_T$, the angle between $\vec{p}(\mu^+)$ and the normal to the x-y plane, in the J/ψ meson rest frame,
• $\phi_T$, the angle between the x-axis and $\vec{p}_{xy}(\mu^+)$, the projection of the $\mu^+$ momentum in the x-y plane, in the J/ψ meson rest frame,
• $\psi_T$, the angle between $\vec{p}(K^+)$ and $-\vec{p}(J/\psi)$ in the φ meson rest frame.
The PDF term $P_s(\Omega_i, t_i, P(B|Q), \sigma_{t_i})$ takes into account the lifetime resolution, so each time element in Table 4 is smeared with a Gaussian function. This smearing is performed numerically on an event-by-event basis where the width of the Gaussian function is the proper decay time uncertainty, measured for each event, multiplied by a scale factor to account for any mis-measurements. The proper decay time uncertainty distribution for data, including the fits to the background and the signal contributions, is shown in Figure 6. The average value of this uncertainty for signal events is 97 fs.
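To illustrate what Gaussian smearing does to a decay-time term, the sketch below uses the closed-form convolution of a single normalised exponential with a Gaussian resolution function. This is a simplified stand-in: the analysis performs the smearing numerically and the full time-dependent functions also contain hyperbolic and oscillating terms, which are not covered here. The function names and the example lifetime are assumptions.

```python
import math


def event_resolution(sigma_t, scale_factor):
    """Per-event Gaussian width: measured uncertainty times a scale factor."""
    return scale_factor * sigma_t


def smeared_exponential(t, tau, sigma):
    """exp(-t'/tau)/tau for t' >= 0, convolved with a Gaussian of width sigma.

    Closed form: exp(sigma^2/(2 tau^2) - t/tau) * erfc(x) / (2 tau),
    with x = sigma/(sqrt(2) tau) - t/(sqrt(2) sigma).
    """
    x = sigma / (math.sqrt(2.0) * tau) - t / (math.sqrt(2.0) * sigma)
    return math.exp(sigma * sigma / (2.0 * tau * tau) - t / tau) * math.erfc(x) / (2.0 * tau)


# Example with a 97 fs uncertainty (the signal average quoted in the text),
# a unit scale factor and a hypothetical lifetime of 1.5 ps.
sigma = event_resolution(0.097, 1.0)
density_at_zero = smeared_exponential(0.0, 1.5, sigma)
```

Far from $t = 0$ the smeared curve follows the unsmeared exponential up to a constant factor, while near $t = 0$ the sharp turn-on is rounded off and a small tail extends to negative decay times.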
The angular acceptance of the detector and kinematic cuts on the angular distributions are included in the likelihood function through $A(\Omega_i, p_{T_i})$. This is calculated using a 4D binned acceptance method, applying an event-by-event efficiency according to the transversity angles ($\theta_T$, $\psi_T$, $\phi_T$) and the $p_T$ of the candidate. The $p_T$ binning is necessary, because the angular acceptance is influenced by the $p_T$ of the $B_s^0$ candidate. The acceptance is calculated from the $B_s^0 \to J/\psi\phi$ MC events. Taking the small discrepancies between data and MC events into account has a negligible effect on the fit results. In the likelihood function, the acceptance is treated as an angular acceptance PDF, which is multiplied with the time- and angle-dependent PDF describing the $B_s^0 \to J/\psi(\mu^+\mu^-)\phi(K^+K^-)$ decays. As both the acceptance and the time- and angle-dependent decay PDFs depend on the transversity angles they must be normalized together. This normalization is done numerically during the likelihood fit. The PDF is normalized over the entire $B_s^0$ mass range 5.150-5.650 GeV.

Background PDF
The background PDF has the following composition:
\[ F_{\mathrm{bkg}}(m_i, t_i, \sigma_{t_i}, \Omega_i, P(B|Q), p_{T_i}) = P_b(m_i) \cdot P_b(t_i|\sigma_{t_i}) \cdot P_b(P(B|Q)) \cdot P_b(\Omega_i) \cdot P_b(\sigma_{t_i}) \cdot P_b(p_{T_i}). \]
The proper decay time function $P_b(t_i|\sigma_{t_i})$ is parameterized as a prompt peak modelled by a Gaussian distribution, two positive exponential functions and a negative exponential function. These functions are smeared with the same resolution function as the signal decay time-dependence. The prompt peak models the combinatorial background events, which are expected to have reconstructed lifetimes distributed around zero. The two positive exponential functions represent a fraction of longer-lived backgrounds with non-prompt J/ψ, combined with hadrons from the primary vertex or from a B/D meson in the same event. The negative exponential function takes into account events with poor vertex resolution. The probability terms $P_b(\sigma_{t_i})$ and $P_b(p_{T_i})$ are described by gamma functions. They are unchanged from the analysis described in Ref. [25] and explained in detail there. The tagging probability term for background $P_b(P(B|Q))$ is described in Section 4.3.
The shape of the background angular distribution, $P_b(\Omega_i)$, arises primarily from detector and kinematic acceptance effects. These are described by Legendre polynomial functions, where the coefficients $a_{k,l,m}$ are adjusted to give the best fit to the angular distributions for events in the $B_s^0$ mass sidebands. These parameters are then fixed in the main fit. The $B_s^0$ mass interval used for the background fit is between 5.150 and 5.650 GeV excluding the signal mass region $|m(B_s^0) - 5.366\ \mathrm{GeV}| < 0.110$ GeV. The background mass model, $P_b(m_i)$, is an exponential function with a constant term added.
Contamination from $B_d^0 \to J/\psi K^{*0}$ and $\Lambda_b \to J/\psi pK^-$ events mis-reconstructed as $B_s^0 \to J/\psi\phi$ is accounted for in the fit through the $F_{B^0}$ and $F_{\Lambda_b}$ terms in the PDF described in Equation (1). The fractions of these contributions, $f_{B^0} = (3.3 \pm 0.5)\%$ and $f_{\Lambda_b} = (1.8 \pm 0.6)\%$, are evaluated from MC simulation using production and branching fractions from Refs. [18,27,28,29,30,31]. MC simulated events are also used to determine the shape of the mass and transversity angle distributions. The 3D angular distributions of $B_d^0 \to J/\psi K^{*0}$ and of the conjugate decay are modelled using input from Ref. [32], while angular distributions for $\Lambda_b \to J/\psi pK^-$ and the conjugate decay are modelled as flat. These distributions are sculpted for detector acceptance effects and then described by Legendre polynomial functions, Equation (4), as in the case of the background described by Equation (3). These shapes are fixed in the fit. The $B_d^0$ and $\Lambda_b$ lifetimes are accounted for in the fit by adding additional exponential terms, scaled by the ratio of $B_d^0$/$B_s^0$ or $\Lambda_b$/$B_s^0$ masses as appropriate, where the lifetimes and masses are taken from Ref. [18]. Systematic uncertainties due to the background from $B_d^0 \to J/\psi K^{*0}$ and $\Lambda_b \to J/\psi pK^-$ decays are described in Section 7. The contribution of $B_d^0 \to J/\psi K\pi$ events as well as their interference with $B_d^0 \to J/\psi K^{*0}$ events is not included in the fit and is instead assigned as a systematic uncertainty.
To account for possible deviations between data and the selected fit models a number of alternative fit functions and mass selection criteria are used to determine systematic uncertainties in the B 0 s fit. These fit variations are described in Section 7.

Muon trigger proper time-dependent efficiency
It was observed that the muon trigger biases the transverse impact parameter of muons, resulting in a minor inefficiency at large values of the proper decay time. This inefficiency is measured using MC simulated events, by comparing the B 0 s proper decay time distribution of an unbiased sample with the distribution obtained including the trigger. To account for this inefficiency in the fit, the events are re-weighted by a factor w: where p 0 , p 1 , p 2 and p 3 are parameters determined in the fit to MC events. No significant bias or inefficiency due to off-line track reconstruction, vertex reconstruction, or track quality selection criteria is observed.

Results
The full simultaneous unbinned maximum-likelihood fit contains nine physical parameters: ∆Γ s , φ s , Γ s , |A 0 (0)| 2 , |A || (0)| 2 , δ || , δ ⊥ , |A S (0)| 2 and δ S . The other parameters in the likelihood function are the B 0 s signal fraction f s , parameters describing the J/ψφ mass distribution, parameters describing the B 0 s meson decay time plus angular distributions of background events, parameters used to describe the estimated decay time uncertainty distributions for signal and background events, and scale factors between the estimated decay time uncertainties and their true uncertainties. In addition, there are 353 nuisance parameters describing the background and acceptance functions that are fixed at the time of the fit. The fit model is tested using pseudo-experiments as described in Section 7. These tests show no significant bias, as well as no systematic underestimation of the statistical errors reported from the fit to data. Multiplying the total number of events supplied to the fit by the extracted signal fraction and its statistical uncertainty provides an estimate for the total number of B 0 s meson candidates of 74900 ± 400. The results and correlations of the physics parameters obtained from the fit are given in Tables 5 and 6. Fit projections of the mass, proper decay time and angles are given in Figures 7 and 8, respectively.

Systematic uncertainties
Systematic uncertainties are assigned by considering effects that are not accounted for in the likelihood fit. These are described below.
• Flavour tagging: There are two contributions to the uncertainties in the fit parameters due to the flavour tagging procedure: the statistical and systematic components. The statistical uncertainty due to the size of the sample of B ± → J/ψK ± decays is included in the overall statistical error. The systematic uncertainty arising from the precision of the tagging calibration is estimated by changing the model used to parameterize the probability distribution, P(B|Q), as a function of tag charge from the third-order polynomial function used by default to one of several alternative functions. The alternatives used are: a linear function; a fifth-order polynomial; or two third-order polynomials describing the positive and negative regions that share the constant and linear terms but have independent quadratic and cubic terms. For the combined-muon tagging, an additional model consisting of two third-order polynomials sharing the constant term but with independent linear, quadratic and cubic terms is also used. The B 0 s fit is repeated using the alternative models and the largest difference is assigned as the systematic uncertainty.
• Angular acceptance method: The angular acceptance (from the detector and kinematic effects mentioned in Section 5.1) is calculated from a binned fit to MC simulated data. In order to estimate the size of the systematic uncertainty introduced from the choice of binning, different acceptance functions are calculated using different bin widths and central values. These effects are found to be negligible.
• Inner detector alignment: Residual misalignments of the ID affect the impact parameter, d 0 , distribution with respect to the primary vertex. The effect of a radial expansion on the measured d 0 is determined from data collected at 8 TeV, with a trigger requirement of at least one muon with a transverse momentum greater than or equal to 4 GeV. The radial expansion uncertainties determined in this way are 0.14% for |η| < 1.5 and 0.55% for 1.5 < |η| < 2.5. These values are used to estimate the effect on the fitted B 0 s parameter values. Small deviations are seen in some parameters, and these are included as systematic uncertainties.
• Trigger efficiency: To correct for the trigger lifetime bias the events are re-weighted according to Equation (5). The uncertainties of the parameters p 0 , p 1 , p 2 and p 3 are used to estimate the systematic uncertainty due to the time efficiency correction. These uncertainties originate from the following sources: the limited size of the MC simulated dataset, the choice of bin-size for
• Background angles model, choice of p T bins: The shape of the background angular distribution, P b (θ T , ϕ T , ψ T ), is described by the Legendre polynomial functions given in Equation (4). The shapes arise primarily from detector and kinematic acceptance effects and are sensitive to the p T of the B 0 s meson candidate. For this reason, the parameterization using the Legendre polynomial functions is performed in four p T intervals: 0-13 GeV, 13-18 GeV, 18-25 GeV and >25 GeV. The systematic uncertainties due to the choice of p T intervals are estimated by repeating the fit, varying these intervals. The biggest deviations observed in the fit results are taken to represent the systematic uncertainties.
• B d contribution: The contamination from B d → J/ψK 0 * events mis-reconstructed as B 0 s → J/ψφ is accounted for in the final fit. Studies are performed to evaluate the effect of the uncertainties in the B d → J/ψK 0 * fraction, and the shapes of the mass and transversity angle distributions. In the MC events the angular distribution of the B d → J/ψK 0 * decay is modelled using parameters taken from Ref. [32]. The uncertainties of these parameters are taken into account in the estimation of the systematic uncertainty. After applying the B 0 s signal selection cuts, the angular distributions are fitted using Legendre polynomial functions. The uncertainties of this fit are included in the systematic tests. The impact of all these uncertainties on the B 0 s fit results is found to be negligible.
The contribution of B d → J/ψKπ events as well as their interference with B d → J/ψK 0 * events is not included in the fit and is instead assigned as a systematic uncertainty. To evaluate this uncertainty, the MC background events are modelled using both the P-wave B d → J/ψK 0 * and S-wave B d → J/ψKπ decays and their interference, using the input parameters taken from Ref. [32]. The B 0 s fit using this input was compared to the default fit, and differences are included in Table 7.
• Λ b contribution: The contamination from Λ b → J/ψpK − events mis-reconstructed as B 0 s → J/ψφ is accounted for in the final fit. Studies are performed to evaluate the effect of the uncertainties in the Λ b → J/ψpK − fraction f Λ b , and the shapes of the mass, transversity angles, and lifetime distributions. Additional studies are performed to determine the effect of the uncertainties in the Λ b → J/ψΛ * branching ratios used to reweight the generated MC. These uncertainties are included in Table 7.
• Fit model variations: To estimate the systematic uncertainties due to the fit model, variations of the model are tested in pseudo-experiments. A set of ≈2500 pseudo-experiments is generated for each variation considered, and fitted with the default model. The systematic error quoted for each effect is the mean shift of the fitted value of each parameter from its input value, evaluated over the pseudo-experiments altered for that source of systematic uncertainty.
In the first variation tested, the signal mass is generated using the fitted B 0 s mass convolved with a Gaussian function using the measured per-candidate mass errors. In another test, the background mass is generated from an exponential function with the addition of a first-degree polynomial function instead of an exponential function plus a constant term. The time resolution model is varied by using two different scale factors to generate the lifetime uncertainty, instead of the single scale factor used in the default model. The non-negligible uncertainties derived from these tests are included in the systematic uncertainties shown in Table 7. To determine the possible systematic effects of mis-modelling of the background events by the fitted background model, as seen in the low mass side-band region (5.150-5.210 GeV) of Figure 7, left, alternative mass selection cuts are used with the default fit model. The effect of these changes on the fit results is found to be negligible.
• Default fit model: Due to its complexity, the fit model is less sensitive to some nuisance parameters. This limited sensitivity could potentially lead to a bias in the measured physics parameters, even when the model perfectly describes the fitted data. To estimate the systematic uncertainty due to the choice of default fit model, a set of pseudo-experiments is conducted using the default model in both the generation and the fit. The systematic uncertainties are determined from the mean of the pull distributions of the pseudo-experiments, scaled by the statistical error of that parameter in the fit to data. These tests show no significant bias in the fit model, and no systematic underestimation of the statistical errors reported from the fit to data.
The systematic uncertainties are listed in Table 7. For each parameter, the total systematic error is obtained by adding all of the contributions in quadrature.
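The arithmetic behind the pseudo-experiment based uncertainties and the quadrature sum can be sketched in a few lines of Python. This is an illustrative sketch only: the function names and all numerical inputs are hypothetical placeholders, not values or code from the analysis.

```python
import math
import statistics


def mean_shift(fitted_values, input_value):
    """Mean shift of the fitted parameter from its generated (input) value
    over a set of pseudo-experiments."""
    return statistics.fmean(fitted_values) - input_value


def pull_bias(fitted_values, fitted_errors, input_value, stat_error_data):
    """Mean of the pull distribution, scaled by the statistical error of
    the parameter in the fit to data."""
    pulls = [(v - input_value) / e
             for v, e in zip(fitted_values, fitted_errors)]
    return statistics.fmean(pulls) * stat_error_data


def total_systematic(contributions):
    """Add individual systematic contributions in quadrature."""
    return math.sqrt(sum(c * c for c in contributions))


# Hypothetical placeholder numbers, chosen only to exercise the functions.
shift = mean_shift([1.0, 1.2, 1.4], input_value=1.0)
bias = pull_bias([1.1, 0.9], [0.1, 0.1], input_value=1.0,
                 stat_error_data=0.05)
total = total_systematic([3.0, 4.0])
```

In practice each list would hold one entry per pseudo-experiment (≈2500 per variation in the text), and `total_systematic` would be applied per parameter to the contributions listed in Table 7.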

Discussion
The PDF describing the B 0 s → J/ψφ decay is invariant under the following simultaneous transformations: Since ∆Γ s was determined to be positive [33], there is a unique solution. Figure 9 shows the 1D log-likelihood scans of φ s , ∆Γ s and of the three measured strong phases δ || , δ ⊥ and δ ⊥ −δ S . The variable on the vertical axis, 2∆ln(L) ≡ 2(ln(L G ) − ln(L i )), is the difference between the log-likelihood value of the default fit, (L G ), and that of the fit in which the physical parameter is fixed to a value shown on the horizontal axis, (L i ). 2∆ln(L) = 1 corresponds to the estimated 1σ confidence level. There are small asymmetries in the likelihood curves; however, at the level of one statistical σ these are small compared to the corresponding statistical uncertainties of the physical variables for which the scan is done. Therefore symmetric statistical uncertainties are quoted. Figure 10 shows the likelihood contours in the φ s -∆Γ s plane. The region predicted by the Standard Model is also shown.
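The reading of a 1D likelihood scan can be illustrated with a toy Python sketch. It assumes a single-parameter Gaussian likelihood with known width, so no other parameters are refitted at each scan point, unlike in the full fit; the data values, the width and the scan grid are hypothetical. The sketch evaluates 2∆ln(L) on a grid and locates the points where it crosses 1 by linear interpolation.

```python
import math

# Hypothetical toy sample and known resolution; not analysis data.
data = [0.1, -0.2, 0.05, 0.3, -0.15]
sigma = 0.2


def nll(mu):
    """Negative log-likelihood (up to a constant) of a Gaussian mean."""
    return sum((x - mu) ** 2 for x in data) / (2.0 * sigma ** 2)


def crossings(xs, ys, level=1.0):
    """Linearly interpolated x values at which ys crosses `level`."""
    out = []
    for (x0, y0), (x1, y1) in zip(zip(xs, ys), zip(xs[1:], ys[1:])):
        if (y0 - level) * (y1 - level) < 0:
            out.append(x0 + (level - y0) * (x1 - x0) / (y1 - y0))
    return out


best = sum(data) / len(data)            # minimum of the toy likelihood
xs = [-0.2 + 0.001 * i for i in range(441)]
# 2*Delta ln(L) = 2 (ln L_G - ln L_i) = 2 (NLL_i - NLL_G)
scan = [2.0 * (nll(x) - nll(best)) for x in xs]
lo, hi = crossings(xs, scan, level=1.0)
```

For this toy the interval is symmetric by construction, with half-width sigma/sqrt(n); asymmetries of the kind mentioned in the text would show up as unequal distances of `lo` and `hi` from `best`.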

Combination of 7 TeV and 8 TeV results
The measured values are consistent with those obtained in a previous analysis [8], using ATLAS data collected in 2011 at a centre-of-mass energy of 7 TeV. This consistency is also clear from a comparison of the likelihood contours in the φ s -∆Γ s projection shown in Figure 11. A Best Linear Unbiased Estimate (BLUE) combination [34] is used to combine the 7 TeV and 8 TeV measurements to give an overall result for Run 1. In Ref. [8] the strong phases δ || and δ ⊥ −δ S were given as 1σ confidence intervals. These are not considered in the combination and the 8 TeV result is taken as the Run 1 result.
The BLUE combination requires the measured values and uncertainties of the parameters in question as well as the correlations between them. These are provided by the fits separately in the 7 TeV and 8 TeV measurements. The statistical correlation between these two measurements is zero as the events are different. The correlations of the systematic uncertainties between the two measurements are estimated by splitting the uncertainty into several categories.
The trigger efficiency is included as a systematic uncertainty only in the 7 TeV measurement, so there is no correlation with the 8 TeV measurement. Similarly, the systematic uncertainties arising from the Λ b → J/ψpK − background, and the choice of p T bins and mass sidebands in the modelling of background angles, are included as systematic uncertainties only in the 8 TeV measurement, so there is no correlation with the 7 TeV measurement. In both the 7 TeV and 8 TeV results, a systematic uncertainty is assigned to the inner detector alignment and B d contribution. The inner detector alignment systematic uncertainties are highly correlated and small. The assumed correlation between these systematics made no difference to the final combined result and was set to 100%. For the B d contribution, while the systematic uncertainty tests are different, they are both performed to account for an imprecise knowledge of the B d contribution and are therefore assumed to be 100% correlated.  systematic uncertainty is small and therefore regardless of what value of ρ acc is chosen the combination stays the same. For the 8 TeV measurement, electron tagging is added, therefore the systematic uncertainty is not 100% correlated. For ρ tag = 0.25, 0.5, 0.75 there is negligible difference between the results. The fit model was changed between the 7 TeV and 8 TeV measurements; the most significant change is that the mass uncertainty modelling was removed and the event-by-event Gaussian error distribution was replaced with a sum of three Gaussian distributions. It would be incorrect to estimate the correlation as 100%, and there is negligible difference between the results for ρ mod = 0.25, 0.5, 0.75.
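The role of the assumed correlations can be illustrated with a minimal Python sketch of a BLUE combination of two measurements of a single parameter. This is a simplification: the combination described above treats several parameters and their mutual correlations simultaneously. The function name, the category labels and all numerical inputs are hypothetical placeholders.

```python
import math


def blue_two(x1, x2, stat1, stat2, syst1, syst2, rho):
    """BLUE combination of two measurements of one parameter.

    syst1, syst2: dicts mapping a systematic category to its uncertainty.
    rho: dict mapping a category to the assumed correlation between the
    two measurements. Categories present in only one measurement, and the
    statistical uncertainties, are treated as uncorrelated.
    """
    var1 = stat1 ** 2 + sum(s ** 2 for s in syst1.values())
    var2 = stat2 ** 2 + sum(s ** 2 for s in syst2.values())
    shared = syst1.keys() & syst2.keys()
    cov = sum(rho.get(k, 0.0) * syst1[k] * syst2[k] for k in shared)
    denom = var1 + var2 - 2.0 * cov
    w1 = (var2 - cov) / denom           # weights minimise the variance
    w2 = 1.0 - w1
    value = w1 * x1 + w2 * x2
    variance = (var1 * var2 - cov ** 2) / denom
    return value, math.sqrt(variance), (w1, w2)


# Hypothetical inputs: scan the assumed correlation of one shared category.
results = {
    r: blue_two(1.0, 2.0, 0.5, 0.5,
                {"model": 0.1, "trigger": 0.05},
                {"model": 0.2, "lambda_b": 0.05},
                {"model": r})
    for r in (0.25, 0.5, 0.75)
}
```

Comparing the entries of `results` shows how sensitive the combined value and uncertainty are to the assumed correlation, which is the kind of check described above for ρ tag and ρ mod.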
The combined results for the fit parameters and their uncertainties for Run 1 are given in Table 8. Due to the negative correlation between Γ s and ∆Γ s , and the change in the value of ∆Γ s between the 7 TeV and 8 TeV results, the combined value of Γ s is less than either individual result. The Run 1 likelihood contours in the φ s -∆Γ s plane are shown in Figure 11. They agree with the Standard Model predictions.