Amplitude analysis and branching fraction measurement of B + → D ∗− D + s π + decays

The decays of the B + meson to the final state D ∗− D + s π + are studied in proton-proton collision data collected with the LHCb detector at centre-of-mass energies of 7, 8, and 13 TeV, corresponding to a total integrated luminosity of 9 fb − 1 . The ratio of branching fractions of the B + → D ∗− D + s π + and B 0 → D ∗− D + s decays is measured to be 0 . 173 ± 0 . 006 ± 0 . 010, where the first uncertainty is statistical and the second is systematic. Using partially reconstructed D ∗ + s → D + s γ and D + s π 0 decays, the ratio of branching fractions between the B + → D ∗− D ∗ + s π + and B + → D ∗− D + s π + decays is determined as 1 . 31 ± 0 . 07 ± 0 . 14. An amplitude analysis of the B + → D ∗− D + s π + decay is performed for the first time, revealing dominant contributions from known excited charm resonances decaying to the D ∗− π + final state. No significant evidence of exotic contributions in the D + s π + or D ∗− D + s channels is found. The fit fraction of the scalar state T ∗ cs 0 (2900) ++ observed in the B + → D − D + s π + decay is determined to be less than 2.3% at a 90% confidence level.


Introduction
Recent studies of B-meson decays involving pairs of open-charm hadrons have yielded a number of intriguing results in the field of open-charm and charmonium spectroscopy.The LHCb collaboration has conducted a number of these studies, including the analysis of B + → D + D − K + decays 1 that led to the discovery of tetraquark states in the D − K + system [1,2], the observation of near-threshold D + s D − s structures in the analysis of B + → D + s D − s K + decays [3,4], and the first observation of a doubly charged charm tetraquark and its neutral partner in B + → D − D + s π + and B 0 → D 0 D + s π − decays [5,6].In addition to the primary objective of studying doubly charmed decays to improve the understanding of the strong interaction, the analyses of doubly charmed final states can provide valuable information for the studies of semileptonic decays used to search for phenomena beyond the Standard Model.Semileptonic decays involving higher excitations of D mesons decaying to D * − π final state, dominated by the first orbital excitations denoted as D * * , constitute a substantial background to B 0 → D * − ℓ + ν ℓ decays.Thus, the knowledge of the spectrum of these excitations is essential for the measurements of such quantities as the lepton flavour universality ratio R(D * ) (see, e.g.[7,8]).The properties of excited charm mesons decaying to D * + π − final state have been recently studied in the analysis of B − → D * + π − π − decays [9].However, a reliable prediction of the B → D * * form factor in semileptonic decays requires additional information from other hadronic decays, such as B → D * * D + s and B → D * * D * + s processes [10].Decays of B mesons to double-charm final states, themselves, constitute an essential background to the semileptonic B decays.For instance, processes in which one of the charm mesons decays to a final state with a muon are a background to B 0 → D * − µ + ν µ decays, and to B 0 → D * − τ + ν τ decays where the τ + lepton is reconstructed in the τ + → µ + ν µ ν τ final state.Double-charm decays with one of the charm mesons (typically, the D + s meson) decaying to π + π − π + are a background to the B 0 → D * − τ + ν τ processes with the subsequent τ + → π + π − π + (π 0 )ν τ transition.
The dominant weak transition leading to the double-charm final states is b → ccs, with either internal W emission topology and intermediate cc resonances decaying further to DD state, or external W emission with intermediate D + s or D resonances.While the DDK final states have been studied by both B-factories and LHCb [11][12][13][14][15], experimental information on the B → D ( * ) D ( * )+ s decays is still scarce.Notably, evidence of the B + → D * * 0 D ( * )+ s decay was reported by the CLEO collaboration [16].In addition to the amplitude analyses of the B + → D − D + s π + and the B 0 → D 0 D + s π − decays [5,6] mentioned above, an angular analysis of the B 0 → D * − D * + s decay has been performed by LHCb [17].
This paper presents the observation and measurement of the branching fraction of the decay B + → D * − D + s π + , along with an analysis of its amplitude structure.A measurement of the branching fraction of the B + → D * − D * + s π + decay is also reported using the B 0 → D * − D + s decay as a normalisation channel.The analysis is based on the proton-proton (pp) collision data collected with the LHCb detector at centre-of-mass energies of 7 and 8 TeV (Run 1), and 13 TeV (Run 2), corresponding to a total luminosity of approximately 9 fb −1 .

Amplitude analysis formalism
The process B + → D * − D + s π + is a three-body decay of a pseudoscalar particle to two pseudoscalar and one vector final-state particles.The differential decay rate can be written as dΓ = A(m 2 (D * − π + ), m 2 (D + s π + ), θ D , ϕ D ) 2 dm 2 (D * − π + ) dm 2 (D + s π + ) d cos θ D dϕ D , (1) where A is the decay amplitude, m 2 (D * − π + ) and m 2 (D + s π + ) are the squared invariant masses of the D * − π + and D + s π + combinations, respectively, and θ D and ϕ D are the polar and azimuthal angles of the D 0 meson in the rest frame of the D * − particle.

Amplitude model
Within the isobar model used in this analysis, the decay amplitude A is expressed as a sum of quasi-two-body amplitudes with potential resonant and nonresonant intermediate states in the D * − π + , D + s π + and D * − D + s channels: A = A (D * π) + A (Dsπ) + A (D * Ds) . ( Each of the amplitudes is represented as a sum over the resonant or nonresonant components, which in turn are the products of the line shape R n and the angular H n terms, where n denotes the index of the component in each quasi-two-body decay channel (ch = D * − π + , D + s π + , D * − D + s ): The angular terms are functions of the helicity angles, defined for each channel in Fig. 1 (see, e.g., Ref. [18] for the formalism used): ( Here λ is the helicity of the D * − meson, J n is the spin of the intermediate resonance, d J λ,λ ′ (θ) are the reduced Wigner functions, and h n,λ are complex couplings for each helicity amplitude.The angles θ D and ϕ D in these expressions are the same as those used as the phase-space variables in Eq. 1, while the other angles are functions of the four phase-space variables.The angles θ (cs,ccs) D and ϕ (cs,ccs) D are defined in the D * − rest frame, while θ D * , θ ccs D * and θ cs Ds are defined in the rest frames of the D * * 0 , T 0 ccs , and T ++ cs states, respectively.The amplitudes of the two-body decay A → BC can be expressed in terms of LS couplings c L,S , where L and S denote the orbital angular momentum and the total spin of the BC combination.The relations between the helicity couplings and the LS couplings are given by Wigner 3-j symbols, Combining this with Eq. 4, the angular terms of the amplitude given in Table 1 are obtained.The expressions for the h while in the T ++ cs → D + s π + decay the angular momentum is fixed by the spin J, and parity conservation requires that the parity P of the T ++ cs state equals (−1) J .The explicit expressions for the angular dependency in this case are given in Table 2.
The resonant states are parametrised by relativistic Breit-Wigner (BW) line shapes of the form  . with the mass-dependent width, Γ(m), given by where m is the invariant mass of the resonance decay products, m 0 and Γ 0 are, respectively, the mass and the width of the intermediate resonance, L B is the orbital momentum of the B decay, L R is the orbital momentum of the decay of the resonance, while p and q are, respectively, the momentum of the resonance in the B + rest frame and the momentum of resonance decay products in its rest frame.The Blatt-Weisskopf form factors [20] for the intermediate resonance, F R (m, L R ), and for the B + meson, F B (m, L B ), are parametrised as where z(m) = p(m)d, z 0 = p 0 d, q 0 and p 0 are the momenta evaluated at the nominal resonance mass, and d is the radial parameter set to 4.5 GeV −1 [9] for all resonances. 3onresonant contributions are parametrised using an exponential function, where α is a shape parameter that is extracted from the fit.The value of m 0 in this expression is arbitrary and only affects the numerical values of the couplings.When adding nonresonant D * − π + or D + s π + amplitudes, m 0 is set to 2.15 GeV, while for nonresonant D * − D + s amplitudes, m 0 = 4 GeV, which roughly corresponds to kinematic thresholds of the respective channels.

Amplitude fit procedure
The probability density function (PDF) used in the amplitude analysis consists of signal and background contributions and is a function of the phase-space variables The amplitude fit minimises the unbinned negative logarithmic likelihood (NLL) where N is the total number of candidates and P tot (x i ) is the total PDF for candidate i, including the efficiency term and the background component Explicit parametrisations are used to describe the efficiency ϵ, and the background density P bkg as functions of x (see Sects. 6 and 7).The parameter f bkg is the fraction of background events in the B + signal region and is fixed to the value obtained from the results of the fits to the invariant-mass distributions described in Sect. 5.The signal N sig and background N bkg normalisation integrals are calculated numerically using a sample of 10 6 events distributed uniformly over the four-dimensional phase space of the decay.At each minimisation step, the PDF is evaluated on the normalisation sample, and the integrals are calculated by summing the density for each normalisation event.The signal decay amplitude A in Eq. 2 is constructed using the AmpliTF package [21].The amplitude fit is implemented in TFA2 [22], a fitting package based on TensorFlow [23], interfaced with the iminuit minimisation library [24].The fit fraction F i due to a resonant or nonresonant contribution i is computed from the fitted parameters, and is defined as If all components correspond to partial waves with different quantum numbers, the sum of the fit fractions adds up to 100%; otherwise, the sum may differ from 100% due to interference effects.

Detector and simulation
The LHCb detector [25,26] is a single-arm forward spectrometer covering the pseudorapidity range 2 < η < 5, designed for the study of particles containing b or c quarks.The detector includes a high-precision tracking system consisting of a siliconstrip vertex detector surrounding the pp interaction region, a large-area silicon-strip detector located upstream of a dipole magnet with a bending power of about 4 Tm, and three stations of silicon-strip detectors and straw drift tubes placed downstream of the magnet.The tracking system provides a measurement of the momentum, p, of charged particles with a relative uncertainty that varies from 0.5% at low momentum to 1.0% at 200 GeV.The impact parameter (IP), defined as the minimum distance of a track to a primary pp collision vertex (PV), is measured with a resolution of (15 + 29/p T ) µm, where p T is the component of the momentum transverse to the beam, in GeV.Different types of charged hadrons are distinguished using information from two ring-imaging Cherenkov (RICH) detectors.Photons, electrons and hadrons are identified by a calorimeter system consisting of scintillating-pad and preshower detectors, an electromagnetic and a hadronic calorimeter.Muons are identified by a system composed of alternating layers of iron and multiwire proportional chambers.The online event selection is performed by a trigger [27,28], which consists of a hardware stage, based on information from the calorimeter and muon systems, followed by a software stage, which applies a full event reconstruction.At the hardware trigger stage, events are required to have a muon with high p T or a hadron, photon or electron with high transverse energy in the calorimeters.The software trigger requires a two-, three-or four-track secondary vertex with a significant displacement from any primary pp interaction vertex.At least one charged particle must have a high transverse momentum p T and be inconsistent with originating from a PV.A multivariate algorithm [29,30] is used for the identification of secondary vertices consistent with the decay of a b hadron.
Simulation is required to model the effects of the detector acceptance and the imposed selection requirements.In the simulation, pp collisions are generated using Pythia [31] with a specific LHCb configuration [32].Decays of unstable particles are described by EvtGen [33], in which final-state radiation is generated using Photos [34].The interaction of the generated particles with the detector, and its response, are implemented using the Geant4 toolkit [35,36] as described in Ref. [37].
A data-driven approach is employed to correct the simulated particle identification (PID) information based on large samples of D * + → D 0 π + , D 0 → K − π + events.For each track PID response, unbinned four-dimensional probability density functions are extracted for data, p data (x| p T , η, N tr ), and simulated samples, p sim (x| p T , η, N tr ) based on a kernel density estimation technique [38], where x is the PID response, p T and η are the transverse momentum and pseudorapidity of the track, and N tr is the track multiplicity of the event.The cumulative distribution functions for data, P data (x| p T , η, N tr ) and for the simulated samples P sim (x| p T , η, N tr ) are determined, and the corrected PID response x corr is evaluated by transforming the simulated response x sim as x corr = P −1 data (P sim (x sim | p T , η, N tr )| p T , η, N tr ) [39].

Signal selection
Two decay chains are reconstructed: the B + → D * − D + s π + decays are selected for the amplitude analysis and branching fraction measurement, while the B 0 → D * − D + s decays are used as the normalisation sample for the B + → D * − D + s π + branching fraction measurement.The D 0 candidates are reconstructed in the K + π − final state, D * − candidates are formed from the D 0 π − combinations, and D + s candidates are reconstructed in the K + K − π + final state.Pion and kaon candidate tracks used to create combinations are selected based on loose requirements for track-fit quality, momentum p, and transverse momentum p T .They are required to lie within the RICH detector acceptance and be positively identified as either a pion or a kaon, respectively.To suppress the combinatorial background from the tracks originating from the PV, all tracks are required to be displaced from any PV in the event.The separation from the PV is characterised by the quantity χ 2 IP , defined as the difference in the vertex-fit χ 2 of a given PV reconstructed with and without the track under consideration.
The D + s and D 0 candidates are required to form good-quality vertices well separated from any PV.Their invariant masses must be within 20 MeV from the known masses of the respective particles [40].The D * − candidates are formed from the combinations of the D 0 candidate and a pion track creating a good-quality vertex and having the difference of the invariant masses of D * − and D 0 candidates less than 150 MeV.When constructing the B + → D * − D + s π + and B 0 → D * − D + s decay candidates, a kinematic fit is performed [41], which constrains the B and D (s) meson decay products to originate from the corresponding vertices and the masses of the D 0 and D + s candidates to match their known values [40].The vertices of the D 0 and D + s candidates are required to be downstream of the fitted B meson vertex.In the calculation of the phase-space variables m 2 (D * − π + ), m 2 (D + s π + ), cos θ D and ϕ D , the mass of the D * − D + s π + combination is further constrained in the kinematic fit to be equal to the known B + meson mass m B + [40].
In the selection, trigger signals are associated with reconstructed particles.Selection requirements are made on whether the trigger decision was due to the signal candidate (trigger-on-signal, or TOS, category), other particles produced in the pp collision (trigger independent of signal, or TIS, category), or both.The candidates in the analysis are categorised as TOS and NotTOS (which comprises the candidates selected as TIS, but not as TOS) based on the hardware trigger decision.Subsequent selection requirements, including software triggers, do not induce further splits in the analysed data sample.
A boosted decision tree (BDT) classifier [42,43] is used to separate signal candidates from the background arising from random combinations of final-state particles not originating from the same b hadron (combinatorial background).The BDT algorithm is implemented with the TMVA toolkit [44].For the B + → D * − D + s π + process, the selections are trained using simulated events, where all the decays are distributed uniformly across the phase space, as the signal training sample s and D 0 vertices from the B meson vertex parallel to the beam pipe, and the corrected PID information of the final-state kaon and pion candidates.The decision on the BDT response is made based on retaining a high fraction of signal events while maintaining a relatively pure sample with a purity of 90%.The estimated purity is calculated as S/(S + B), where S and B are the expected signal and background yields in the ±30 MeV range around the nominal B meson mass.
Backgrounds, where one of the final-state tracks is misidentified (misID backgrounds), are suppressed with stricter requirements on the PID variables and the application of vetoes.To mitigate contributions from other double-charm b-hadron decays (specifically, the , the PID requirements for the D + s final-state tracks are tightened for candidates with invariant mass within ±20 MeV around the D + or Λ + c mass under the m(K − π + π + ) or m(K − pπ + ) hypotheses, respectively.Additional misID backgrounds arise from , where double K → π misidentification occurs.Candidates under alternative mass hypotheses lying within a ±30 MeV interval around the B meson masses are subsequently rejected.
A significant source of background is due to B + and B 0 decays that proceed without the production of the D + s meson (non-D + s background).Its fraction is calculated by extracting the number of candidates in the invariant-mass peak near the B mass from the D + s sidebands and scaling this number by a factor equal to the ratio between the sizes of the D + s signal window and the D + s sideband region.The contribution of this background is suppressed by imposing a requirement on the significance of the D + s flight distance along the beam direction, denoted as ∆z/σ(∆z).To avoid a significant loss of efficiency, a loose cut, ∆z/σ(∆z) > 1, is applied, which reduces the fraction of non-D + s decays to 4-6% (depending on data-taking period) for the B + → D * − D + s π + sample, and to approximately 2% for the B 0 → D * − D + s sample.The remaining non-D + s background is explicitly considered in both the amplitude analysis and the ratio of branching fractions measurement.
After applying the full selection, around 2% (1%) of the events include more than one reconstructed All candidates are retained for subsequent analysis.The extra candidates correspond to the combinatorial background and are treated as such in the fits to extract signal yields.The effect of keeping multiple candidates in the amplitude fit is evaluated by performing an alternative fit keeping one random candidate per event, and is found to be negligible.The invariant-mass distributions of the B + → D * − D + s π + and B 0 → D * − D + s candidates contain a small admixture from the non-D + s background and are referred to as the fully reconstructed B meson decays.These distributions are parametrised by a sum of a Gaussian function and a double-sided Crystal Ball (DCB) function [45] for each category.The distribution of non-D + s events is broader than the signal distribution due to the correlation between the B and D + s candidate invariant masses, which is induced by the D + s mass constraint.It is represented by the convolution of the signal distribution with the rectangular function of a width equal to the D + s invariant mass range of ±20 MeV.In the fit to the data, the peak positions (shared between the Gaussian and DCB functions), width, and relative magnitude of the Gaussian peak are allowed to float.The parameters describing the tails of the DCB function are fixed to the values obtained from fits to the simulation samples.The D * − D + s invariant-mass spectrum includes a small contribution from the B 0 s decays [17], with the distribution fixed to be the same as for the B 0 decays, but shifted by the known B 0 s -B 0 mass difference [40].The invariant-mass range below the B meson mass is populated with candidates from B meson decays with D * + s subsequently decaying as D * + s → D + s γ/π 0 , where the photon or the π 0 from the decay is not reconstructed.This structure is referred to as the partially reconstructed B meson decay component.The distribution for the B 0 → D * − D * + s decays is obtained from the corresponding simulated sample and is described by a non-parametric kernel density estimator [46].The shape of the partially reconstructed B + → D * − D * + s π + decays depends on the unmeasured structure of the decay amplitude and is thus parametrised by an empirical shape, the sum of two Gaussian peaks.
The combinatorial background is parametrised by an exponential distribution.Its slope is floated in the fit to data.
The invariant-mass distributions of the D * − D + s and D * − D + s π + combinations in data and the results of the fits are shown in Fig. 2 for all categories combined.The yields of various fit components are given in Table 3 for the D * − D + s and in Table 4 for the D * − D + s π + combinations.The reported yields are extracted by performing independent fits to the four different categories.For the B + → D * − D + s π + mode, the signal and background yields in the range |m(D * − D + s π + ) − m B + | < 30 MeV ("signal box") are also reported.This range is used to select the candidates for the amplitude fit.
Two-dimensional projections of the Dalitz-plot and angular variables for the

Efficiency
The efficiency variations across the four-dimensional phase space are obtained from the simulated signal samples.These samples are generated uniformly in the decay phase space and are analysed using the same reconstruction and selection procedure as for data.The efficiency profiles are computed separately for Run 1 and Run 2 conditions and further split into TOS and NotTOS categories.Prior to performing the density estimation, each simulated event is assigned a weight, derived from control samples in data, to correct for known differences in track reconstruction [47] and hardware trigger [48] efficiency between data and simulation.
For each category, the efficiency as a function of the phase-space variables is parametrised using an artificial neural network (ANN) density estimator [49].In this approach, the weights and biases of a fully connected feed-forward ANN are treated as free parameters in a maximum-likelihood fit to the unbinned simulated data.The first layer of neurons (input layer) is provided with the phase-space variables (m 2 (D * − π + ), m 2 (D + s π + ), cos θ D , ϕ D ).To ensure both continuity and periodicity of the estimated density as a function of the ϕ D angle, the ANN estimator takes both cos ϕ D and sin ϕ D as inputs instead of the angle itself.The network contains three hidden layers of 40, 80, and 20 neurons, respectively.The output is a single neuron that returns the estimated density.
The optimisation is performed using TensorFlow [23] with the Adam algorithm [50].An L2 regularisation term, calculated as the sum of squared weights multiplied by a tunable parameter λ 2 , is added to the loss function.This term controls overfitting and ensures smoothness in the density function by penalising large neuron weights.The choice of the λ 2 parameter is driven by the compromise between overfitting for low λ 2 (which manifests itself as large fluctuations of the fitted density) and systematic bias for high λ 2 values.The range of valid λ 2 values is chosen by visually inspecting the projections of the fitted density and their comparison with data, and the middle of this range (λ 2 = 0.3) is taken as the λ 2 parameter for the baseline fit.The upper and lower values of the λ 2 range are used for the systematic uncertainty evaluation (see Sect. 8.4).
The efficiency profile used in the fits is determined as the average of the TOS and NotTOS profiles, weighted according to the ratio of yields of the two categories of events in data.The projections of the resulting efficiency profile onto the Dalitz plot variables m 2 (D * − π + ), m 2 (D + s π + ) and the angular variables cos θ D , ϕ D are shown in Fig. 4 for Run 1 and Run 2 separately.

Background distributions
The density of the combinatorial background as a function of the phase space variables is derived using candidates in the B + upper-mass sideband defined as 5.31 < m(D * − D + s π + ) < 5.60 GeV.The low-mass sideband is excluded as it is dominated by partially reconstructed decays with distinct amplitude structures.
A five-dimensional probability density, P bkg (x, m(D * − D + s π + )), is estimated using an ANN density estimator.The inclusion of m(D * − D + s π + ) accounts for any potential effect arising from applying the B + mass constraint when computing the phase-space variables, x.The resulting P bkg (x, m(D * − D + s π + )) parametrisation is then used to extrapolate the combinatorial background into the signal region, P bkg (x) ≡ P bkg (x, m B + ) [49].The density estimation is performed on the combined Run 1 and Run 2 B + data upper-mass sideband.The architecture of the ANN estimator aligns with that of the efficiency density estimation, except that the input layer incorporates an additional neuron to capture the m(D * − D + s π + ) variable.The λ 2 regularisation parameter is chosen similarly to the case of the efficiency shape (Sect.  of the non-D + s background: (15) Here the P non-D + s (x) represents the density function of the non-D + s background component, also parametrised using an ANN estimator with N non-D + s as the normalisation term, and f non-D + s denotes its estimated fractional contribution.The shape of the non-D + s background is obtained from the combination of Run 1 and Run 2 events in the D + s sidebands and requiring ∆z/σ(∆z) < 2. The distributions and the results of the density estimation are presented in Fig. 6.The same ANN architecture as in the efficiency density estimation is used, with the λ 2 regularisation parameter set to 0.18, following the same approach as described in Sect.6.

Amplitude analysis
Various amplitude models are employed to fit the B + → D * − D + s π + data.The Run 1 and Run 2 samples are fitted simultaneously.The baseline fit exclusively incorporates resonance activity in the D * − π + channel.Alternative fits, exploring exotic contributions in the D + s π + and D * − D + s channels, are also examined.This is motivated by the data and the amplitude analysis of the B + → D − D + s π + decays, where an exotic state in the D + s π + channel has been identified [5].

Baseline fit
In the baseline model, the parametrisation of excited charm-meson resonances decaying to the D * − π + final state employs Breit-Wigner line shapes.The considered states are listed in Table 5, with the default inclusion of D 1 (2420), D 1 (2430), and D * 2 (2460) states.The fits with all possible combinations including or not the remaining D 0 (2550), D * 1 (2600), D 2 (2740), and D * 3 (2750) states are performed.Masses and widths are fixed to the values reported in Ref. [40].The data are fitted with the complex LS couplings for each amplitude component treated as floating parameters, except for the D 1 (2420) D-wave amplitude, which is taken as a reference and set to unity.The decision to include additional resonances is contingent on the variation of the fit likelihood and the corresponding fit fraction.
Considering all the resonant compositions of the fit model, the inclusion of the D 0 (2550), D * 1 (2600), and D 2 (2740) states improves the fit likelihood, yielding fit fractions larger than 1%.Conversely, the D * 3 (2750) fit fraction is below 1%, and the NLL difference stands at approximately three units for four additional degrees of freedom.Consequently, the D * 3 (2750) state is excluded from the baseline model.Estimations of the significances for the remaining three resonances are derived from the NLL difference for fits with and The results of the baseline fit, along with the alternative fits discussed in Sect.8.2, are presented in Appendix A. The invariant-mass and angular projections of the baseline fit result are presented in Fig. 7. To assess the fit quality, a χ 2 /ndof value is computed from the two-dimensional distribution (m 2 (D * − π + ), m 2 (D + s π + )).Adaptive binning is constructed such that each bin contains a minimum of 25 entries.The resulting fit quality is χ 2 /ndof = 58.6/46,where the effective number of degrees of freedom is obtained from simulated pseudoexperiments.Figure 8 (a) illustrates the pulls in bins of the Dalitz plot, using the same binning employed for the χ 2 /ndof calculation.
The m(D + s π + ) projection reveals a certain excess of data over the baseline fit around 2.9 GeV.The pull distribution indicates that this enhancement is particularly significant for invariant masses m 2 (D * − π + ) < 6 GeV 2 , inconsistent with attributing it to an additional resonant spin-zero state in the D + s π + channel with the mass around 2.9 GeV.Nevertheless, additional studies are conducted in an attempt to improve the description of the data, considering exotic D + s π + contributions.Results from those fits are discussed in Sect.8.3.Ensembles of pseudoexperiments, where the baseline model is used both to generate and to fit samples of the same size as in the data, are used to evaluate the statistical uncertainties on the fit fractions and check for systematic biases due to the fitting procedure as discussed in Sect.8.4.
Table 6 provides the obtained fit fractions for the components of the baseline amplitude model and the phase differences between the amplitude components and the reference D 1 (2420) D-wave amplitude.The reported values include both statistical and systematic uncertainties.The assessment of systematic uncertainties is detailed in Sect.8.4.Notably, all components of the baseline model, except for the 1 + resonances D 1 (2420) and D 1 (2430), have different quantum numbers.Therefore, only the interference terms between these two components are different from zero after integration over the phase space.
Since the B + → D * − D + s π + decay amplitude is dominated by the favoured b → c transition, CP -violating effects in it are anticipated to be minimal.Independent fits of the baseline model to the B + and B − samples are performed and exhibit statistical agreement.Accounting for the correlation of the fit parameters, the p-value of the agreement between them is 19%, equivalent to a difference of 1.3 standard deviations.
The consistency between the Run 1 and Run 2 datasets is tested by conducting separate fits to the data.The fit utilises the selection efficiencies obtained separately for Run 1 and Run 2 simulated samples, while the background parametrisation is taken to be common, identical to the one used in the baseline fit.The parameters of the fit exhibit good statistical agreement for the two independent samples.

Alternative parametrisations of the D * − π + amplitude
The use of the Breit-Wigner line shape may be less suitable for amplitudes involving broad resonances or overlapping contributions with the same quantum numbers, such as the case of the broad D 1 (2430) state overlapping with D 1 (2420) in the 1 + S-wave of this analysis.Consequently, an alternative strategy is explored, involving a quasi-model-independent description (QMI) of the broad 1 + S-wave amplitude.In this approach, the amplitude is represented by a complex-valued cubic spline, while all other amplitudes are parametrised by Breit-Wigner functions with masses and widths fixed to the values given in Table 5.
A spline with six knots is employed, with the first and last knots set to the kinematic limits (m D * − + m π + ) ≃ 2.15 GeV and (m B + − m D + s ) ≃ 3.31 GeV, respectively.Internal knots are fixed at the masses 2.2, 2.3, 2.5, and 3.0 GeV, and the corresponding six complex coefficients are floated in the fit.
Comparing the QMI approach with the Breit-Wigner description of the D 1 (2430) state, the NLL difference is only −∆ ln L = −4.0(see Table 8 in Appendix A under the column "QMI"), while the number of floating parameters is N par = 24, compared to N par = 16 in the Breit-Wigner model.Thus, it is concluded that the model-independent description does not yield a significant improvement relative to the Breit-Wigner description given the available statistics.
As an additional test, the mass and width of the Breit-Wigner line shape characterising the D 1 (2430) state are allowed to vary in the amplitude fit (the model that appears as "Floated D 1 (2430)" in Table 8).However, this alternative fit yields only a marginal improvement in the NLL compared to the fit where the D 1 (2430) parameters are fixed to the values given in Ref. [40].The resulting values for the D 1 (2430) mass and width are 2.378 ± 0.025 GeV and 0.24 ± 0.06 GeV, respectively, consistent with the values reported in Ref. [40].
Non-D + s backgr.Comb.backgr.In the baseline model, the S-and D-wave decays of the 1 + D 1 (2420) and D 1 (2430) states are considered independent.Therefore, the mixing of the two 1 + states is implicitly accounted for.As an alternative parametrisation of the 1 + D * − π + amplitude, the model with the mixing angle ω and the phase ψ, identical to the one used in the B + → D * − π + π + amplitude analysis [9] is used.The mixing parameters are determined to be ω = −0.054± 0.018 and ψ = 1.23 ± 0.72, where the uncertainties are statistical only, consistent with those obtained in Ref. [9].However, since the phases of all the amplitude components in this model are strongly correlated with the phase ψ, and given that ψ is not determined precisely, this model is not included in the list of variations to assess the model uncertainty.
Other tests involve the inclusion of additional broad nonresonant contributions in different D * − π + partial waves.The line shapes are parametrised with the exponential functions multiplied by the orbital barrier factors for the amplitudes with orbital momentum L > 0. For the spin-parity combinations where two partial waves are possible, both partial waves are allowed in this fit with floated couplings, but the exponential slope is constrained to be equal for both waves.
The only partial wave where the addition of an exponential nonresonant amplitude results in a significant improvement of the fit is the J P = 1 − wave.Results of this fit are presented in Table 8 under the "NR D * π 1 − " column.Nevertheless, the addition of the nonresonant 1 − component does not result in a substantial change in any of the couplings or the resonance fit fractions, except for the D * 1 (2600) state which has the same quantum numbers, 1 − , as the added nonresonant amplitude.The fit fraction for this state reduces from (4.9 ± 1.1)% to (3.0 ± 1.3)%, where the uncertainties are statistical only.

Fits with exotic contributions
Various fits incorporating D + s π + amplitudes are examined, prompted by the discovery of the T * cs0 (2900) ++,0 states in B + → D − D + s π + and B 0 → D 0 D + s π − decays [5,6].All of them include the same resonances as in the baseline model, with one or two additional resonant (Breit-Wigner) or nonresonant (exponential) amplitudes.The fits do not reveal evidence of a scalar D + s π + state with the parameters fixed to those of the T * cs0 (2900) ++ state [5].A fit with the scalar state and floated parameters results in a very narrow state and only a marginal improvement in the fit likelihood.Introducing a vector state improves the fit but yields a relatively broad amplitude with Γ ≃ 0.4 GeV.A tensor state with floating parameters produces a very broad peak with the width reaching the upper limit of 1 GeV with smaller significance than for the vector one.
Repeating similar fits but with exponential nonresonant amplitudes instead of the Breit-Wigner line shape, the vector and tensor nonresonant amplitudes result in an improved fit likelihood.However, the vector nonresonant model, which yields the best fit, generates a rising amplitude with the exponential parameter α < 0, which is physically implausible.Incorporating a scalar D + s π + state with the mass and width fixed to that of the T * cs0 (2900) ++ state into the amplitude with the additional vector D + s π + contribution leads to further improvement of the fit.Floating the mass and width of the scalar state yields the parameters statistically consistent with those reported in Ref. [5].
Among all the models explored with the additional exotic D + s π + amplitude, the one featuring the nonresonant vector and resonant scalar state, with the parameters fixed to those of the T * cs0 (2900) ++ state from the B + → D − D + s π + analysis, is selected for further evaluation.The values of the fitted parameters are reported in Table 8 under the column "NR, T * cs0 (2900) ++ ". Figure 9 illustrates the difference between the baseline model and the model with D + s π + components, emphasised by the requirement m(D * − π + ) > 2.5 GeV applied to all projections except for m(D * − π + ) itself, to eliminate the dominant D * * states.The fit quality is χ 2 /ndof = 51.3/46employing the same binning scheme used to evaluate the fit quality of the baseline model.Figure 8 (b) displays the pulls in bins of the Dalitz plot using this model.
The significance and fit fraction of the doubly charged scalar state in the D + s π + channel depend significantly on the model.In the fit where this state is added to the baseline model, its fitted contribution is negligible, with a fit fraction of approximately 0.1%.However, when incorporated into the model with the 1 − D + s π + amplitude, the fit fraction of this state increases to 1.2 ± 0.8%, and its statistical significance reaches 2.6σ.While the conclusion suggests no compelling evidence for the contribution of this state in the D + s π + channel, an upper limit on its fit fraction is established.Using the fitted value of 1.2%, along with a statistical uncertainty of 0.8% obtained from pseudoexperiments, and a systematic uncertainty of 0.5%, the upper limit is set at 2.3% (at 90% CL), or 2.7% (at 95% CL) assuming both the statistical and systematic uncertainties to be Gaussiandistributed.It is consistent with the fit fraction for the T * cs0 (2900) ++ → D + s π + contribution of (2.25 ± 0.67 ± 0.77)% measured in the analysis of B + → D − D + s π + decays [6].For completeness, fits incorporating exotic contributions in the D * − D + s channel are conducted using the baseline model, extended with various amplitudes.None of the models with a single D * − D + s amplitude shows a substantial improvement over fits with additional nonresonant states in the D * − π + or D + s π + channels.The best fit is achieved with J P = 2 + amplitudes, resulting in a very narrow state near the upper kinematic boundary for resonant amplitudes or an amplitude rising with D * − D + s mass for nonresonant ones.When incorporating combinations of nonresonant shapes in the D * − D + s and D * − π + channels, the best fits are achieved by models featuring two tensors or two vectors of opposite parity, both resulting in an exotic amplitude rising with the D * − D + s mass.It is concluded that none of these fits provides a physical description of the amplitude, and therefore are not considered as part of the model uncertainty.

Systematic uncertainties
The analysis considers various sources of systematic uncertainties affecting the fit parameters, including LS amplitudes, phases, and fit fractions.Tables 10 and 11 in Appendix B present the systematic uncertainties for the baseline amplitude model.
Systematic uncertainties associated with the efficiency and geometric acceptance description are due to the finite size of simulation samples used for their parametrisation, as well as due to the calibration applied to correct the simulation PID responses and hardware trigger efficiencies.Sample size uncertainty is evaluated by fitting the data employing different efficiency (acceptance) profiles obtained by bootstrapping the original sample, with the standard deviation assigned as the associated uncertainty.The PID correction uncertainty is estimated using an alternative efficiency shape after regenerating the PID responses in simulation with a modified kernel width used for density estimation [39].The trigger efficiency correction uncertainty is evaluated by computing efficiencies from simulation samples without trigger efficiency correction.In both instances, systematic uncertainties are established by comparing the results to those of the baseline fit.
Additional systematic uncertainties arise from the choice of the structure of hidden ANN layers and the regularisation parameter λ 2 in the efficiency (acceptance) density estimation.Alternative parametrisations are obtained with various configurations of ANN hidden layers and by altering the λ 2 regularisation parameter within the range 0.20-0.45.Fits to the data are performed for each alternative efficiency shape, and the largest difference observed compared to the baseline fit is taken as the corresponding uncertainty.
Similar sources of systematic uncertainty arise from the description of the combinatorial (non-D + s ) background: the finite size of the combinatorial (non-D + s ) background sample, the choice of the hidden ANN layer structure, and the regularisation parameters used in density estimation.These are assessed employing the same strategy as for the efficiency.Additionally, the uncertainty related to the fraction of combinatorial (non-D + s ) background in the signal region is evaluated by varying the estimated contribution within its statistical uncertainties.
A sample of 10 6 events distributed uniformly over the four-dimensional phase space of the decay is used to normalise the PDFs in the amplitude fit.To evaluate the uncertainty due to the precision of PDF normalisation, alternative normalisation samples are generated by bootstrapping the original one.Fits to the data are then carried out using these samples, and the standard deviation is taken as the associated uncertainty.
Fixed parameters in the baseline model are the masses and widths of Breit-Wigner amplitudes, along with the radial parameter in the Blatt-Weisskopf form factors F R and F B set to d = 4.5 GeV.The associated systematic uncertainty linked to the Blatt-Weisskopf radius is evaluated by varying the radial parameters between 3 and 6 GeV.The deviation from the baseline result is then considered as the associated systematic uncertainty.The uncertainty on resonance parameters is evaluated by varying each parameter by its total uncertainty.Positive and negative variations are averaged and added in quadrature to estimate the overall uncertainty.
The inherent biases of the fitting procedure are assessed by fitting ensembles of pseudoexperiments.The same ensembles as those used for the calculation of the statistical uncertainties on the fit fraction are considered.For each parameter, the difference between the baseline fit results and the central values obtained from the pseudoexperiments is considered as the associated systematic uncertainty.
For the assessment of model uncertainty, four alternative models are considered, detailed in Sects.8.2 and 8.3, with the parameters reported in Appendix A. Asymmetric uncertainties on couplings and fit fractions are assigned as the maximum positive and negative deviations from the baseline model fit.

Branching fraction measurement
The branching fraction of the B + → D * − D + s π + decay is measured relative to the branching fraction of the B 0 → D * − D + s decay as where (after subtracting the non-D + s contributions) in each category, i, are related through R according to Here, ϵ i B +(0) denotes the total reconstruction and selection efficiency for the B + (B 0 ) decays in category i.These efficiencies are determined using simulated samples and corrected to account for known differences in track reconstruction, hardware trigger efficiency, and PID response between data and simulation.
The efficiency of the three-body decay B + → D * − D + s π + depends on the phase space variables, x.Since the simulated signal decay density is generated uniformly over x, the results from the amplitude analysis are used to account for this dependency by correcting the B + efficiencies with the factor where A(x) is the fitted decay amplitude, ϵ(x) is the four-dimensional efficiency map parametrised using ANN, and integration is performed over the full phase space of the decay.
The analysis also measures the ratio of B + → D * − D * + s π + and B + → D * − D + s π + branching fractions, denoted as R * .This is achieved by linking the N i B + →D * − D * + s π + and N i B + →D * − D + s π + yields in the simultaneous fit as where s π + denotes the ratio of the total reconstruction and selection efficiencies of the B + → D * − D + s π + and B + → D * − D * + s π + modes, determined from simulation.
The ratios of branching fractions measured from the fit are determined to be R = 0.173 ± 0.006, and R * = 1.32 ± 0.07, where the uncertainties are statistical only.
The consistency between the Run 1, Run 2 and TOS, NotTOS datasets is tested by performing separate fits to the data in the four categories, showing good agreement within the statistical uncertainties on the relative branching fractions.
A number of systematic uncertainties affecting the ratios of branching fractions are considered.Table 7 presents the systematic uncertainties evaluated in per cent relative to the values of the ratios of branching fractions.
The yields obtained from the fit depend on the model chosen to describe the signal and background invariant-mass distributions.To assess these effects, alternative models are employed, and the difference between the alternative and the baseline fits is taken as the systematic uncertainty.The fully reconstructed signal model uncertainty is evaluated using a DCB shape instead of the baseline sum of the Gaussian and DCB distributions.An additional uncertainty is associated with the values of the DCB tail parameters and the fraction between the Gaussian and DCB distributions.This contribution is estimated by propagating the corresponding uncertainties on the values obtained from the simulation and is found to be negligible.
The uncertainty due to the combinatorial background model is obtained by replacing the baseline exponential distribution with a linear function.For the partially reconstructed B + → D * − D * + s π + contribution, a bifurcated Gaussian shape is considered while the uncertainty due to the partially reconstructed shape in the B 0 → D * − D + s sample is considered to be negligible.
The uncertainty due to the estimated fractions of non-D + s contamination in the fully reconstructed signal peaks is obtained by varying them within their uncertainties while fitting the data, and taking the RMS deviation as the systematic uncertainty.
The uncertainties in the efficiency ratios account for simulation sample size as well as uncertainties related to PID response, trigger, and nonuniformity corrections.The uncertainty due to limited simulation sample size is accounted for by varying the efficiencies obtained from the simulated samples in the data fit and taking the RMS of the fitted R ( * ) values as the uncertainty.The PID correction uncertainty is assessed by extracting efficiencies after transforming the PID responses in simulation with a modified kernel width used for density estimation.The trigger efficiency correction uncertainty is evaluated by computing efficiencies from simulation samples without any trigger efficiency correction.Systematic uncertainties are determined by comparing these results to the baseline.
Non-uniformity correction factor uncertainties are driven by the limited size of simulation samples and uncertainties in the four-dimensional efficiency profile parametrisation.These uncertainties are propagated to the uncertainty in the calculation of the R ( * ) values.For the B + → D * − D * + s π + mode, where the correction is not applied, the uncertainty is conservatively assigned to be the maximum absolute value of the correction for the B + → D * − D + s π + mode (4.8%).The total systematic uncertainty is taken as the sum in quadrature of each of the contributions described above.Including them, the measured values for the ratios of branching fractions are R = 0.173 ± 0.006 ± 0.010, R * = 1.32 ± 0.07 ± 0.14, where the first uncertainty is statistical, and the second is total systematic.

(
R→D * Ds) n,λ couplings in the B + → T 0 ccs (→ D * − D + s )π + decay chain are identical, with the substitution of the angles θ D * , θ D , and ϕ D by θ ccs D * , θ ccs D , and ϕ ccs D , respectively.In the decay chain B + → D * − T ++ cs (→ D + s π + ), the helicities of the D * − and T ++ cs states are equal, their total spin is zero, and the range of angular momenta L B depends on the spin J of the T ++ cs state: |J − 1| ≤ L B ≤ J + 1.The expression for the helicity coupling takes the form

2 sin θ cs Ds cos θ cs Ds sin θ cs D sin ϕ cs D 3 √ 15 10 (− 2
sin θ cs Ds cos θ cs Ds sin θ cs D cos ϕ cs D + 3 sin 2 θ cs Ds cos θ cs D − 2 cos θ cs D ) . The background training sample comprises wrong-sign D * − D − s π + and D * − D + s π − combinations in data.In the case of the normalisation channel B 0 → D * − D + s , the simulation sample is used for signal, while D * − D + s combinations with invariant masses greater than 5.4 GeV are used as the background in the BDT training.The classifier takes as inputs the vertex χ 2 for the B, D + s and D 0 meson candidates, the χ 2 IP of the B, D + s , D 0 candidates and the final-state tracks, the signed significances of the separation of the D +

5
Fits of D * − D + s and D * − D + s π + invariant-mass distributions The yields of the signal and normalisation decays are obtained by fitting the D * − D + s π + and D * − D + s invariant-mass distributions split into four categories, according to the data-taking period (Run 1 or Run 2) and trigger category (TOS or NotTOS).

Figure 2 :
Figure 2: Invariant-mass distributions of (a) D * − D + s and (b) D * − D + s π + combinations and the results of the fits used to obtain the yields of the B 0 → D * − D + s , B + → D * − D + s π + and B + → D * − D * + s π + decays.The inset in the plot (a) shows a zoomed region with the contribution of the B 0 s → D * − D + s decay component.
6) and set to 0.2.An extra L2 regularisation term with λ 2 = 30 is introduced for the neuron weights of the first hidden layer corresponding to the m(D * − D + s π + ) input to ensure smooth dependence of the distribution in phase-space variables on the m(D * − D + s π + ) invariant mass.Projections of the estimated five-dimensional background density and the distribution of combinatorial background candidates in the upper mass sideband are shown in Fig. 5.The projections of the phase-space variable distribution extrapolated to the signal region, m(D * − D + s π + ) = m B + , are also presented.After the final selection, the B + → D * − D + s π + sample is still contaminated by non-D + s background (see Sect.4), which has to be explicitly included in the amplitude fit.This is done by modifying the total PDF as defined in Eq. 13 to incorporate the parametrisationD * − π + ) [GeV 2 ]

Figure 5 :
Figure 5: (a-e) One-dimensional projections of the distribution of B + → D * − D + s π + candidates in the invariant mass range 5.31 < m(D * − D + s π + ) < 5.60 GeV (upper B + invariant-mass sideband), results of its density estimated with ANN, and, except for (a), results of the extrapolation of its density to the signal region with m(D * − D + s π + ) = m B + .(f-i) Two-dimensional projections of the estimated density of B + → D * − D + s π + candidates in the upper B + mass sideband.

Figure 6 :
Figure 6: (a-d) One-dimensional projections of the distribution of B + → D * − D + s π + candidates in the D + s sideband region and the results of its density estimation using ANN.(e, f) Twodimensional projections of the estimated non-D + s background density.

Figure 7 :
Figure 7: Results of the fit of the B + → D * − D + s π + distribution with the baseline model.Figures (a) and (b) show the m(D * − π + ) projection, with (a) zoomed in to illustrate the contributions from all the resonances while (b) shows the projection near the D 1 (2420) resonance.The m(D + s π + ), m(D * − D + s ), cos θ D and ϕ D projections are shown in (c), (d), (e) and (f), respectively.

Figure 8 :
Figure 8: Pulls of the B + → D * − D + s π + Dalitz-plot distribution for the fits with (a) the baseline model and (b) the model including D + s π + components.

Figure 9 :
Figure 9: Comparison of the fit results of the B + → D * − D + s π + distribution with the baseline model and with the model that includes D + s π + components.The m(D * − π + ), m(D + s π + ), m(D * − D + s ), cos θ D and ϕ D projections are shown in (a), (b), (c), (d) and (e), respectively.The requirement m(D * − π + ) > 2.5 GeV is applied to the distributions shown in plots (b)-(e).

Table 3 :
Yields of signal and background components for the D * − D + s invariant-mass fit in range 5.15-5.60GeV.

Table 4 :
Yields of signal and background components for the D * − D + s π + invariant-mass fit in range 4.80-5.60GeV and in the signal box |m(D * − D + s π + ) − m B + | < 30 MeV.

Table 5 :
Resonances considered in the baseline amplitude model and their parameters.

Table 6 :
Fit fractions (in %) for the components of the B + → D * − D + s π + amplitude.The first uncertainty is statistical, the second is systematic, and the third is the uncertainty related to the amplitude model.

Table 7 :
Relative systematic uncertainties on the R ( * ) measurements.

Table 6 .
No strong evidence of exotic contributions in the D * − D + s or D + s π + channels is observed.The fit fraction of the scalar state T * cs0 (2900) ++ in the D + s π + channel observed in the B + → D − D + s π + analysis [5, 6] is found to be less than 2.3% at 90% CL. acknowledge the computing resources that are provided by CERN, IN2P3 (France), KIT and DESY (Germany), INFN (Italy), SURF (Netherlands), PIC (Spain), GridPP (United Kingdom), CSCS (Switzerland), IFIN-HH (Romania), CBPF (Brazil), and Polish WLCG (Poland).We are indebted to the communities behind the multiple open-source software packages on which we depend.Individual groups or members have received support from ARC and ARDC (Australia); Key Research Program of Frontier Sciences of CAS, CAS PIFI, CAS CCEPP, Fundamental Research Funds for the Central Universities, and Sci.

Table 8 :
Fit results for the baseline and selected alternative B + → D * − D + s π + amplitude models.

Table 9 :
Fitted complex values f i of the 1 + S-wave amplitudes at the spline knots of the QMI model.

Table 10 :
Absolute systematic uncertainties on the fit parameters and fit fractions for the baseline model