1 Introduction

Within the Standard Model of particle physics (SM), the weak interaction is rather peculiar as it allows flavour-changing currents described by the Cabibbo–Kobayashi–Maskawa (CKM) matrix. Its elements are not predicted and have to be extracted from experimental data combined with theoretical inputs. The magnitudes of the matrix elements \(|V_{cb}|\) and \(|V_{ub}|\) are obtained from semileptonic \(b\rightarrow c\) and \(b\rightarrow u\) transitions, respectively. Two types of decays are studied, depending on whether the final state is fully reconstructed (exclusive decays) or not (inclusive decays). For both \(|V_{ub}|\) and \(|V_{cb}|\), the extractions from inclusive and exclusive decays continue to show puzzling tensions, although recent developments have reduced them (see e.g. [1]). While exclusive decays rely mainly on lattice QCD calculations for the hadronic form factors, inclusive decays are fully based on the heavy-quark expansion with its non-perturbative input determined from data.

In this review, we summarize the current status of the inclusive decays and give a brief outlook for the future. We start with a short introduction to the Heavy-Quark Expansion, then discuss \(B\rightarrow X_c \ell {\bar{\nu }}_\ell\) decays and the extraction of \(|V_{cb}|\) including an outlook for future studies. In Sect. 4, we discuss \(B\rightarrow X_u \ell {\bar{\nu }}_\ell\) and the extraction of \(|V_{ub}|\). We discuss the most widely used theoretical approaches. We focus on the hybrid Monte-Carlo approach used to obtain the experimental determinations rather than on specific measurements. Finally, we discuss the weak-annihilation contributions and present an outlook for the future, briefly discussing recent measurements of ratios of inclusive \(B\rightarrow X_u\) and \(B\rightarrow X_c\) semileptonic decays.

2 Heavy-Quark expansion

The Heavy-Quark Expansion (HQE) has become a well-established tool in the study of inclusive B meson decays, allowing observables to be expressed as a double expansion in \(\alpha _s\) and \(1/m_b\). We start from the inclusive decay of the \({\bar{B}}\) meson

$$\begin{aligned} {\bar{B}}(p_B)\rightarrow X(p_X) \ell (p_\ell ) {\bar{\nu }}_\ell (p_\nu ) \ , \end{aligned}$$
(1)

mediated through the weak effective Hamiltonian

$$\begin{aligned} H_{\textrm{eff}} = \frac{G_F}{\sqrt{2}} V_{qb} J_H^\mu J_{\ell \mu }\ , \end{aligned}$$
(2)

where \(q=u,c\), with \(V_{qb}\) the CKM element of the transition, the hadronic current \(J_H^\mu = {\bar{q}} \gamma ^{\mu }(1-\gamma _5) b\) and the leptonic current \(J_{\ell }^{\mu } = {{\bar{\ell }}} \gamma ^{\mu }(1-\gamma _5) \nu _\ell\).

The triple differential decay rate is then given by

$$\begin{aligned} \frac{\text {d}\Gamma }{\text {d}E_{\ell } \text {d}q^2 \text {d}E_{\nu }}&= \frac{G_{F}^2 |V_{qb}|^2}{16 \pi ^3} L_{\mu \nu } W^{\mu \nu } , \end{aligned}$$
(3)

where \(E_{\ell (\nu )}\) is the lepton (neutrino) energy and \(q^2 = (p_\ell + p_\nu )^2\) is the dilepton invariant mass. Here \(L_{\mu \nu }\) is the leptonic tensor and \(W^{\mu \nu }\) is the hadronic tensor:

$$\begin{aligned} W^{\mu \nu }&= \frac{1}{4} \sum _{X} \frac{1}{2 m_{B}} (2\pi )^3 \langle {\bar{B}}|J_{H}^{\dagger \mu } |X\rangle \langle X| J_{H}^{\nu } |{\bar{B}}\rangle \delta ^{(4)} (p_{B} -q - p_{X}) \ , \end{aligned}$$
(4)

where \(p_{X}\) is the total momentum of the hadronic system X. Decomposing (4) into Lorentz scalars gives

$$\begin{aligned} W^{\mu \nu }&= -g ^{\mu \nu } W_{1} + v^{\mu } v^\nu W_{2} - i \epsilon ^{\mu \nu \rho \sigma } v_{\rho } q_{\sigma } W_3 + q^\mu q^\nu W_4 + (q^\mu v^\nu + v^\mu q^\nu ) W_5, \end{aligned}$$
(5)

leading to

$$\begin{aligned} \frac{\text {d}\Gamma }{\text {d}E_{\ell } \text {d}q^2 \text {d}E_{\nu }}&= \frac{G_{F}^2 |V_{qb}|^2}{2 \pi ^3} \left[ q^2 W_{1} + \left( 2 E_{\ell } E_\nu - \frac{q^2}{2}\right) W_2 + q^2 (E_{\ell } - E_{\nu }) W_3\right. \nonumber \\ &\quad \left. + \frac{1}{2} m_{\ell }^2 \left( -2 W_{1} + W_2 -2 (E_\nu + E_\ell )W_{3} + q^2 W_4 + 4 E_\nu W_5 \right) - \frac{1}{2} m_{\ell }^4 W_4 \right] . \end{aligned}$$
(6)

We have omitted explicit Heaviside \(\theta\)-functions. When considering \(\ell = e, \mu\), we set \(m_\ell \rightarrow 0\), such that \(W_{4,5}\) do not contribute. The hadronic scalars \(W_i\) are now obtained using the optical theorem. Schematically, we have

$$W^{\mu \nu } \propto \int d^4y \, e^{-iqy}\langle {\bar{B}}|J_{H}^{\dagger \mu }(y) J_{H}^{\nu }(0)|{\bar{B}}\rangle = 2\; \textrm{Im}\int d^4y \, e^{-iqy}\langle {\bar{B}}| T\left\{ J_{H}^{\dagger \mu }(y) J_{H}^{\nu }(0)\right\} |{\bar{B}}\rangle .$$
(7)

To set up the HQE, we introduce a time-like vector v, which we use to split the momentum \(p_b\) of the bottom quark as \(p_b = m_b v + k\) where \(v^2=1\) and k is a small residual momentum satisfying \(k \ll m_B\). Finally, we perform a re-definition of the heavy quark field

$$\begin{aligned} b(y) = e^{-im_b (v\cdot y)} b_v(y) \ , \end{aligned}$$
(8)

in the time-ordered product in (7). We obtain

$$\begin{aligned} W^{\mu \nu } = 2\; \textrm{Im}\int d^4y e^{iy(m_b v-q)}\langle {\bar{B}}|&T\left\{ \tilde{J}_{H}^{\dagger \mu }(y) \tilde{J}_{H}^{\nu }(0)\right\} |{\bar{B}}\rangle , \end{aligned}$$
(9)

where \(\tilde{J}\) contains the re-phased field \(b_v\). This time-ordered product forms the basis of a local operator product expansion (OPE), which can be set up order by order by expanding in the residual momentum \(k \sim {\mathcal {O}}(\Lambda _{\text {QCD}})\). For \(b\rightarrow c\ell {\bar{\nu }}\) decays, the OPE is usually set up by treating the c quark as a heavy degree of freedom assuming \(m_b \sim m_c \gg \Lambda _{\textrm{QCD}}\). In this case, we obtain the standard \(1/m_b\) expansion for the rate expressed in terms of local hadronic matrix elements

$$d\Gamma = d\Gamma _0 + \left( \frac{\Lambda _{\textrm{QCD}}}{m_b}\right) ^2 d\Gamma _2 + \left( \frac{\Lambda _{\textrm{QCD}}}{m_b}\right) ^3 d\Gamma _3 + \left( \frac{\Lambda _{\textrm{QCD}}}{m_b}\right) ^4 d\Gamma _4$$
(10)
$$\begin{aligned}&\quad + \left[ a_0\left( \frac{\Lambda _{\textrm{QCD}}}{m_b}\right) ^5 + a_1 \left( \frac{\Lambda _{\textrm{QCD}}}{m_b}\right) ^3\left( \frac{\Lambda _\textrm{QCD}}{m_c}\right) ^2\right] d\Gamma _5 + \ldots \ . \end{aligned}$$
(11)

At each order, we have

$$\begin{aligned} d\Gamma _i = \sum _n C_i^{(n)}\langle {\bar{B}}| {\mathcal {O}}_i^{(n)} |{\bar{B}}\rangle \ , \end{aligned}$$
(12)

which separates the short-distance physics \(C_i^{(n)}\) from non-perturbative forward matrix elements containing chains of covariant derivatives (see e.g. [2, 3]). Here, \({\mathcal {O}}\) are operators with mass dimension \(i+3\) and the index n runs over all operators in the basis at a given order in \(1/m_b\).

We note that, for \(b\rightarrow c\) transitions, the treatment of \(m_c\) as a heavy degree of freedom introduces infrared sensitivities to the charm mass. At \(O\left( 1/m_b^3\right)\), these enter as \(\ln m_c^2\), while at \(O\left( 1/m_b^5\right)\) explicit power-like infrared sensitivities to \(m_c\) enter. This is shown in (10) and (11).
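To get a feeling for the power counting in (10) and (11), the following sketch evaluates the expansion parameters numerically. The inputs \(\Lambda _{\textrm{QCD}} = 0.5\) GeV, \(m_b = 4.6\) GeV and \(m_c = 1.2\) GeV are assumptions chosen only for illustration; they are not fit results, and the function name `power` is hypothetical.

```python
# Illustrative inputs in GeV (assumed values, not taken from any fit).
LAMBDA_QCD = 0.5
M_B_QUARK = 4.6
M_C_QUARK = 1.2


def power(n_b, n_c=0, lam=LAMBDA_QCD, mb=M_B_QUARK, mc=M_C_QUARK):
    """Return (lam/mb)**n_b * (lam/mc)**n_c, the size of a term in (10)-(11)."""
    return (lam / mb) ** n_b * (lam / mc) ** n_c


if __name__ == "__main__":
    # Pure 1/m_b powers appearing in (10) and (11).
    for n in range(2, 6):
        print(f"(Lambda/m_b)^{n} = {power(n):.2e}")
    # Mixed term with power-like charm-mass sensitivity from (11).
    print(f"(Lambda/m_b)^3 (Lambda/m_c)^2 = {power(3, 2):.2e}")
```

With these assumed inputs, the mixed term is numerically larger than \((\Lambda _{\textrm{QCD}}/m_b)^4\), which suggests that the charm-mass-enhanced contributions should not be dismissed on the basis of their formal \(1/m_b^5\) counting alone.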

In Eq. (10), \(d\Gamma _0\) is the partonic rate and \(d\Gamma _1\) vanishes due to the equations of motion. At order \(1/m_b^2\), \(d\Gamma _2\) is given in terms of two hadronic matrix elements:

$$\begin{aligned} 2m_B \mu _\pi ^2 & = {} -\langle B| {\bar{b}}_v (i D)^2 b_v |B \rangle \, \nonumber \\ 2m_B \mu _G^2 &= \langle B| {\bar{b}}_v (i D_\alpha ) (i D_\beta ) (-i \sigma ^{\alpha \beta } ) b_v |B\rangle \, \end{aligned}$$
(13)

while \(d\Gamma _3\) contains

$$\begin{aligned} 2 m_B \rho _{LS}^3 & = {} \frac{1}{2}\langle B| {\bar{b}}_v \left\{ (iD_\alpha ) \,, \, \left[ i vD, \, (i D_\beta ) \right] \right\} (-i \sigma ^{\alpha \beta } ) b_v |B \rangle \, \nonumber \\ 2 m_B \rho _D^3 & = {} \frac{1}{2}\langle B|{\bar{b}}_v \left[ (iD_\mu ) \,, \, \left[ \left( i vD \right) , \, (i D^\mu ) \right] \right] b_v |B\rangle . \end{aligned}$$
(14)

We note that there are several definitions of these matrix elements in use; with or without commutators or defined through only the spatial component of the covariant derivative. The difference between these definitions is of higher order in the \(1/m_b\) expansion and becomes important when including also \(1/m_b^4\) terms. See also [4] for the conversion between the different bases. At \(1/m_b^4\) and \(1/m_b^5\), the number of matrix elements starts to proliferate [5, 6]. In Sect. 3, we discuss the current status of the QCD higher order corrections to the partonic differential rate and power suppressed terms.

For \(b\rightarrow u \ell {\bar{\nu }}\) transitions, the power counting is different and follows \(m_u \ll \Lambda _{\textrm{QCD}} \ll m_b\). This is equivalent to setting up the OPE assuming \(m_b \gg m_q \sim \Lambda _{\textrm{QCD}}\) and then taking \(m_q\) to zero. Unlike in the \(b\rightarrow c\) case, we now treat the light quark q as a dynamical degree of freedom that cannot be integrated out. Therefore, the OPE contains four-quark operators that also involve the light-quark field q. It is important to note that the infrared sensitivity to the light degrees of freedom thus enters through additional non-perturbative parameters, which first appear at \({\mathcal {O}}(1/m_b^3)\). In the case of semileptonic decays, these are the non-perturbative weak annihilation terms, which we discuss in Sect. 4.5. We discuss the setup of inclusive \(B\rightarrow X_u\) decays in detail in Sect. 4.

Finally, we note that the HQE parameters also enter the theoretical predictions for inclusive \(B\rightarrow X_{s,d} \ell \ell\) decays [7,8,9]. At high \(q^2\), they represent the dominant uncertainty in the theoretical predictions (see [8, 9] for the most recent theoretical predictions).

2.1 Reparametrization invariance

When setting up the HQE, we introduced a unit vector v, which we associated with the velocity of the B meson. The final result should, however, be reparametrization invariant (RPI) and not depend on the choice of v. The reparametrization transformation \(\delta _{\textrm{RP}}\) on v then gives

$$\begin{aligned} \delta _{\textrm{RP}}\; v_\mu = \delta v_\mu \ , \end{aligned}$$
(15)

corresponding to an infinitesimal change in \(v_\mu \rightarrow v_\mu + \delta v_\mu\), where \(v \cdot \delta v = 0\). RPI observables R, like the total rate and those involving the dilepton invariant mass \(q^2\), thus satisfy \(\delta _{\textrm{RP}}\; R = 0\). In order to achieve that, reparametrization invariance requires certain cancellations between different orders in the \(1/m_b\) expansion [6]. This can be understood because the fields and operators carry information about the velocity v. For the b field, which is re-defined through (8), we have

$$\begin{aligned} (iD_\mu )b(x) = e^{-i m_b (v\cdot x)} (iD_\mu + m_b v_\mu ) b_v(x). \end{aligned}$$
(16)

Because the full QCD fields and operators on the left are RPI, also the redefined fields on the right must be RPI. From this, and using a Taylor expansion in the small \(\delta v_\mu\), we find

$$\begin{aligned} \delta _{\textrm{RP}}\; b_v(x) = i m_b (x\cdot \delta v) b_v(x), \end{aligned}$$
(17)

and for the covariant derivative

$$\begin{aligned} \delta _{\textrm{RP}}\; iD_\mu = - m_b \delta v_\mu. \end{aligned}$$
(18)
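The steps leading to (17) and (18) can be made explicit. The following short derivation is a sketch that uses only (8) and (16); the notation \(b_{v+\delta v}\) for the field defined with the shifted velocity is introduced here for illustration. Since the full QCD field \(b(x)\) does not depend on the choice of v, we have

$$\begin{aligned} b(x)&= e^{-i m_b (v\cdot x)}\, b_v(x) = e^{-i m_b (v+\delta v)\cdot x}\, b_{v+\delta v}(x) \nonumber \\ \Rightarrow \quad b_{v+\delta v}(x)&= e^{i m_b (x\cdot \delta v)}\, b_v(x) = \left[ 1 + i m_b (x\cdot \delta v)\right] b_v(x) + {\mathcal {O}}(\delta v^2) , \end{aligned}$$

so that \(\delta _{\textrm{RP}}\; b_v(x) = i m_b (x\cdot \delta v)\, b_v(x)\). In the same way, the left-hand side of (16) is independent of v, which schematically requires the combination \(iD_\mu + m_b v_\mu\) to be invariant:

$$\begin{aligned} \delta _{\textrm{RP}} \left( iD_\mu + m_b v_\mu \right) = 0 \quad \Rightarrow \quad \delta _{\textrm{RP}}\; iD_\mu = - m_b\, \delta v_\mu . \end{aligned}$$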

Because the HQE matrix elements contain strings of covariant derivatives, an RPI transformation (18) links some elements at order \(1/m_b^n\) to those at lower order \(1/m_b^{n-1}\). The relation between the coefficients in the HQE expansion at different orders is given by (3.8) in [6]. As such, RPI fixes the coefficients of towers of operators that are related by reparametrization [6, 10,11,12,13].

These relations become especially powerful when going to higher orders in the \(1/m_b\) expansion [6, 14]. It is known that the total rate does not depend on \(\rho _{LS}^3\) due to RPI. In addition, the total rate up to \(1/m_b^4\) depends only on a restricted set of eight independent parameters, instead of 13 in the more general case. The eight independent parameters are then given by fixed linear combinations of the matrix elements defined for the general case. At \(1/m_b^5\), the RPI operators were recently studied, and only 10 additional operators arise compared to 18 in the non-RPI case [14]. We discuss the consequences of this for RPI observables further in Sect. 3.3.

3 Inclusive \(B\rightarrow X_c\) and the extraction of \(|V_{cb}|\)

The current strategy in the extraction of inclusive \(|V_{cb}|\) is based on global fits of various kinematic moments of semileptonic decays. The idea is to extract the values of the HQE parameters from experimental data on normalized kinematic distributions (independent of the value of \(|V_{cb}|\)) and then insert their values into the formula for the total decay rate of \(B \rightarrow X_c \ell {\bar{\nu }}_\ell\) (see footnote 1). The comparison of the HQE prediction with the experimentally measured branching ratio allows one to extract \(|V_{cb}|\).

Within the HQE, it is possible to make predictions for the various differential rates w.r.t. the leptonic invariant mass \(q^2\), the charged-lepton energy \(E_\ell\) or the hadronic invariant mass \(M_X^2\) in terms of the heavy quark masses \(m_b\) and \(m_c\) and the HQE parameters in Eqs. (13) and (14). However, the predictions for the differential rates cannot be compared point by point with data. On the one hand, the phase space region allowed at the parton level is smaller than the physical one. For instance, the leptonic invariant mass spectrum has an endpoint given in terms of meson masses by \(0 \le q^2 \le (M_B-M_D)^2\), while in the OPE, which depends only on \(m_b\) and \(m_c\), the endpoint is \((m_b-m_c)^2\). On the other hand, power corrections become large or even singular close to the endpoint. As an example, let us look at the HQE prediction for the \(q^2\) spectrum at tree level:

$$\begin{aligned}\frac{1}{\Gamma _0}\frac{d\Gamma }{d {{\hat{q}}}^2} & = 2|\vec p_X| \Bigg (1-\frac{\mu _\pi ^2}{2m_b^2} \Bigg ) \Big [ 1 + {{\hat{q}}}^2- 2 {{\hat{q}}}^4 + ({{\hat{q}}}^2-2) \rho ^2 + \rho ^4 \Big ] +\frac{\mu _G^2}{m_b^2} \Bigg \{ |\vec p_X| \Big [5 (1 + {{\hat{q}}}^2- 2 {{\hat{q}}}^4) + ( 5 {{\hat{q}}}^2-18) \rho ^2 + 5 \rho ^4 \Big ] \nonumber \\&\quad -4\frac{1 - {{\hat{q}}}^2 + ( 6 {{\hat{q}}}^2-5) \rho ^2 + (3 {{\hat{q}}}^2 +7) \rho ^4 - 3 \rho ^6}{|\vec p_X|} \Bigg \} +O\left( \frac{1}{m_b^3} \right) \end{aligned}$$
(19)

where \(\Gamma _0 = m_b^5 G_F^2 |V_{cb}|^2/(192 \pi ^3)\), \(\rho = m_c/m_b\), \(|\vec p_X|^2 = 1 - 2 {{\hat{q}}}^2 + {{\hat{q}}}^4 - 2 \rho ^2 - 2 {{\hat{q}}}^2 \rho ^2 + \rho ^4\) defines the spatial momentum of the \(X_c\) system and \({{\hat{q}}}^2 = q^2/m_b^2\) is the leptonic invariant mass normalized to the bottom mass. At the endpoint of the \(q^2\) spectrum, the \(X_c\) system is produced essentially at rest in the B rest frame and, therefore, \(|\vec p_X| \rightarrow 0\). This leads to the appearance of an integrable singularity in the coefficient of \(\mu _G^2\), resulting in a correction much larger than the leading contribution of order \(1/m_b^0\) (see also Fig. 1). At higher orders in \(1/m_b\), one encounters even more singular terms in the form of contributions proportional to a \(\delta\) distribution. While the HQE prediction for the differential rate can be compared to data far from the maximum value of \(q^2\), near the endpoint \(q^2 \simeq (m_b-m_c)^2\) the decay is dominated by one or a few resonances (D and \(D^*\)), such that the decay can no longer be considered “inclusive” and the OPE breaks down, which is also signalled by the prediction of a negative or divergent decay rate.

Fig. 1

The \(q^2\) spectrum of \(B \rightarrow X_c \ell {{\bar{\nu }}}_\ell\) at tree level in the free quark approximation (blue curve) and including terms up to \(1/m_b^2\) (orange curve) with pole masses \(m_b = 4.78\) GeV, \(m_c=1.67\) GeV, and HQE parameters \(\mu _\pi ^2 = 0.43\,\hbox {GeV}^2\) and \(\mu _G^2 = 0.36\,\hbox {GeV}^2\)
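The endpoint behaviour of (19) can be checked numerically. The sketch below transcribes (19) through order \(1/m_b^2\) and evaluates it with the inputs quoted in the caption of Fig. 1; the function names are hypothetical, \(\rho = m_c/m_b\) is used, and the evaluation points (fractions of the partonic endpoint \({{\hat{q}}}^2_{\max } = (1-\rho )^2\)) are an arbitrary choice made here.

```python
import math


def p_x(q2h, rho):
    """|p_X| in units of m_b, from the expression for |p_X|^2 given with (19)."""
    lam = 1 - 2 * q2h + q2h**2 - 2 * rho**2 - 2 * q2h * rho**2 + rho**4
    return math.sqrt(max(lam, 0.0))


def spectrum_lo(q2h, rho):
    """Free-quark (1/m_b^0) term of (19), in units of Gamma_0."""
    return 2 * p_x(q2h, rho) * (1 + q2h - 2 * q2h**2 + (q2h - 2) * rho**2 + rho**4)


def spectrum_with_power_corrections(q2h, rho, mu_pi2, mu_g2, mb):
    """Eq. (19) including the mu_pi^2 and mu_G^2 terms, in units of Gamma_0."""
    p = p_x(q2h, rho)
    kinetic = (1 - mu_pi2 / (2 * mb**2)) * spectrum_lo(q2h, rho)
    regular = p * (5 * (1 + q2h - 2 * q2h**2) + (5 * q2h - 18) * rho**2 + 5 * rho**4)
    # This term carries the 1/|p_X| endpoint singularity.
    singular = -4 * (1 - q2h + (6 * q2h - 5) * rho**2
                     + (3 * q2h + 7) * rho**4 - 3 * rho**6) / p
    return kinetic + mu_g2 / mb**2 * (regular + singular)


# Inputs from the caption of Fig. 1 (pole masses in GeV, HQE parameters in GeV^2).
MB, MC, MU_PI2, MU_G2 = 4.78, 1.67, 0.43, 0.36
RHO = MC / MB
Q2H_MAX = (1 - RHO) ** 2  # partonic endpoint in units of m_b^2

if __name__ == "__main__":
    for fraction in (0.5, 0.9, 0.99, 0.999):
        q2h = fraction * Q2H_MAX
        lo = spectrum_lo(q2h, RHO)
        full = spectrum_with_power_corrections(q2h, RHO, MU_PI2, MU_G2, MB)
        print(f"q2/q2_max = {fraction}: LO = {lo:.4f}, LO + 1/m_b^2 = {full:.4f}")
```

Close to the endpoint, the \(1/|\vec p_X|\) term drives the power-corrected result negative while the free-quark term stays positive, in line with the breakdown of the OPE described above.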

Therefore, the study of inclusive \(b \rightarrow c\) decays has to rely on moments of the spectra, which are integrated quantities whose \(1/m_b\) expansion is much better behaved. The moments of the differential distribution of some observable O, where \(O = E_\ell , q^2, M_X^2\), are defined by

$$\begin{aligned} \langle (O)^n \rangle _{{\textrm{cut}}} = \int _{\textrm{cut}} (O)^n \frac{\textrm{d} \Gamma }{\textrm{d} O} \, \textrm{d} O \Bigg / \int _{\textrm{cut}} \frac{\textrm{d} \Gamma }{\textrm{d} O} \, \textrm{d} O. \end{aligned}$$
(20)

The subscript “cut” generically denotes some restriction in the lower integration limit. From the theoretical side, the dependence of the moments on a lower cut yields additional information on the HQE parameters and thus provides a better handle for their extraction via global fits. From the experimental side, the spectrum is usually not measurable entirely due to detector acceptance. For example at the B-factories a lower cut on the charged lepton energy, \(E_\ell \ge E_{\textrm{cut}}\) with \(E_{\textrm{cut}} \simeq 0.5\) GeV, is applied to suppress the background. For higher moments (\(n \ge 2\)), one considers centralized moments which are less correlated among each other and more sensitive to the power suppressed terms in the HQE.
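As an illustration of the definition (20), the following sketch computes normalized moments and centralized moments with a lower cut by direct numerical integration. The spectrum used, \(2y^2(3-2y)\) on \(0\le y\le 1\), is a toy shape assumed here purely for illustration, and the function names are hypothetical; a realistic analysis would use the HQE prediction for the spectrum instead.

```python
def integrate(f, lo, hi, n=4000):
    """Midpoint-rule approximation of the integral of f over [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(f(lo + (k + 0.5) * h) for k in range(n))


def toy_spectrum(y):
    """Toy spectrum dGamma/dy = 2 y^2 (3 - 2 y) on 0 <= y <= 1 (assumed shape)."""
    return 2 * y**2 * (3 - 2 * y)


def moment(spectrum, n, cut, upper=1.0):
    """Normalized moment <y^n> with lower cut, following Eq. (20)."""
    numerator = integrate(lambda y: y**n * spectrum(y), cut, upper)
    denominator = integrate(spectrum, cut, upper)
    return numerator / denominator


def central_moment(spectrum, n, cut, upper=1.0):
    """Centralized moment <(y - <y>)^n> with the same lower cut."""
    mean = moment(spectrum, 1, cut, upper)
    numerator = integrate(lambda y: (y - mean) ** n * spectrum(y), cut, upper)
    denominator = integrate(spectrum, cut, upper)
    return numerator / denominator


if __name__ == "__main__":
    for cut in (0.0, 0.2, 0.4, 0.6):
        print(cut, moment(toy_spectrum, 1, cut), central_moment(toy_spectrum, 2, cut))
```

Scanning the cut in this way shows how the same spectrum yields a family of cut-dependent moments, which is the additional information exploited in the global fits.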

B factories have measured centralized moments of the charged lepton energy spectrum

$$\ell _1 (E_{\textrm{cut}}) = \langle E_\ell \rangle _{E_\ell \ge E_{\textrm{cut}}}, \qquad \ell _n (E_{\textrm{cut}}) = \Big \langle (E_\ell - \langle E_\ell \rangle )^n \Big \rangle _{E_\ell \ge E_{\textrm{cut}}} \text { for } n\ge 2,$$
(21)

and the hadronic invariant mass

$$h_1(E_{\textrm{cut}}) = \langle M_X^2 \rangle _{E_\ell \ge E_{\textrm{cut}}}, \qquad h_n (E_{\textrm{cut}}) = \Big \langle (M_X^2 - \langle M_X^2 \rangle )^n \Big \rangle _{E_\ell \ge E_{\textrm{cut}}} \text { for } n\ge 2,$$

where \(E_{\textrm{cut}}\) denotes the minimum energy required for the lepton. The first moments with \(n=1\) correspond to the mean values of observables over the considered integration domain. The second centralized moments are the variance of the distributions and even higher moments are often extracted, with \(n=3\) and 4. Ref. [4] suggested also the study of the moments of the \(q^2\) spectrum, defined by

$$\begin{aligned} q_1(q^2_{\textrm{cut}})&= \langle q^2 \rangle _{q^2 \ge q^2_{\textrm{cut}}},&q_n (q^2_{\textrm{cut}})&= \Big \langle (q^2 - \langle q^2 \rangle )^n \Big \rangle _{q^2 \ge q^2_{\textrm{cut}}} \text { for } n\ge 2. \end{aligned}$$

These observables are invariant under reparametrization and therefore they depend on a reduced set of HQE parameters like the total rate (see Sect. 2.1). Reparametrization invariance would be broken if a cut on \(E_\ell\) is introduced in the \(q^2\) moment definition. For this reason, it is more advantageous to consider instead a lower cut on the \(q^2\) which preserves the invariance under reparametrization. Note that a lower cut on \(q^2\) also imposes an indirect cut on the charged lepton energy through

$$\begin{aligned} E_\ell \ge \frac{M_B^2+q^2_{\textrm{cut}}-M_D^2 - \lambda ^{1/2}(M_B^2,q^2_{\textrm{cut}}, M_D^2)}{2M_B}, \end{aligned}$$
(22)

where \(\lambda (a,b,c) = a^2+b^2+c^2-2ab-2ac-2bc\) is the Källén function.

For illustration, we present in Fig. 2 the HQE predictions for the first three centralized moments of \(q^2\), \(E_\ell\) and \(M_X^2\) in inclusive \(b \rightarrow c\) decays. For the \(q^2\) moments, we show the dependence of the moments on the lower cut \(q^2_{\textrm{cut}}\), while the moments of \(E_\ell\) and \(M_X^2\) are plotted as a function of the charged lepton energy cut \(E_{\textrm{cut}}\). The plots show the prediction at leading order in the free quark approximation (black dashed) and the power corrections originating from terms of order \(1/m_b^2\) (blue) and \(1/m_b^3\) (red). We also include in green the contribution of the NLO QCD corrections to the \(1/m_b^0\) coefficients, which will be discussed in more detail in the next section. The sum of all contributions is shown by the solid black line.

Several interesting features can be observed. To a first approximation, the first moments are well described by the \(1/m_b^0\) tree-level prediction. Power-suppressed terms and perturbative QCD corrections do not exceed 5–10% of the leading-order result, becoming larger with increasing cuts. Higher centralized moments, on the contrary, are more sensitive to the \(1/m_b\) corrections, with \(q^2\) moments in general receiving a larger shift than the \(E_\ell\) moments. For the third moments, one can see that the \(1/m_b^2\) and \(1/m_b^3\) terms can give an O(1) contribution relative to the \(1/m_b^0\) term for large values of the cuts.

One should also notice that for higher centralized moments of the \(M_X\) spectrum the \(1/m_b^0\) tree-level contribution is practically zero, with the HQE prediction dominated by the power corrections and, to a smaller extent, by the perturbative QCD corrections. In fact, at tree level \(M_X^2\) is equal to the charm mass squared \(m_c^2\); therefore, the difference \(M_X^2 - \langle M_X^2 \rangle\) always vanishes at tree level. A genuine contribution to \(h_2\) and \(h_3\) arises either from perturbative NLO corrections, where the emission of a real gluon leads to values of \(M_X\) greater than \(m_c\), or from the endpoint singularity appearing in power-suppressed terms.

Fig. 2

Dependence of the centralized moments on the different orders in the \(1/m_b\) expansion and the NLO QCD corrections. First, second and third rows show the prediction for the \(q^2\), \(E_\ell\) and \(M_X^2\) moments, respectively. For the \(q^2\) moments, the values for the HQE parameters are taken from the fit in Ref. [15] while for \(E_\ell\) and \(M_X^2\), we use the fit results from Ref. [16]

It is also possible to consider partial decay widths with a constraint on \(E_\ell\) or \(q^2\),

$$\begin{aligned} \Gamma _\textrm{sl} (E_{\textrm{cut}})&= \int _{E_\ell \ge E_{\textrm{cut}}} \frac{\textrm{d} \Gamma }{ \textrm{d} E_\ell }\, \textrm{d} E_\ell , \nonumber \\ \Gamma _{\textrm{sl}} (q^2_{\textrm{cut}})&= \int _{q^2 \ge q^2_{\textrm{cut}}} \frac{\textrm{d} \Gamma }{ \textrm{d} q^2 }\, \textrm{d} q^2, \end{aligned}$$
(23)

as well as the ratio \(R^*\) between the rate with and without a cut

$$\begin{aligned} R^*(E_{\textrm{cut}})&= \int _{E_\ell \ge E_{\textrm{cut}}} \frac{\textrm{d} \Gamma }{ \textrm{d} E_\ell }\, \textrm{d} E_\ell \Bigg / \int _{E_\ell \ge 0} \frac{\textrm{d} \Gamma }{ \textrm{d} E_\ell }\, \textrm{d} E_\ell , \nonumber \\ R^* (q^2_{\textrm{cut}})&= \int _{q^2 \ge q^2_{\textrm{cut}}} \frac{\textrm{d} \Gamma }{ \textrm{d} q^2 }\, \textrm{d} q^2 \Bigg / \int _{q^2 \ge 0} \frac{\textrm{d} \Gamma }{ \textrm{d} q^2 }\, \textrm{d} q^2 . \end{aligned}$$
(24)
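The quantities in (23) and (24) are simple to evaluate once a spectrum is available. The sketch below does so with Simpson's rule for a toy spectrum \(2y^2(3-2y)\) on \(0\le y\le 1\); this shape, the normalization of the variable y and the function names are assumptions made here for illustration and do not correspond to an HQE prediction.

```python
def simpson(f, lo, hi, n=2000):
    """Composite Simpson rule for the integral of f over [lo, hi]."""
    if n % 2:
        n += 1  # Simpson's rule needs an even number of intervals
    h = (hi - lo) / n
    total = f(lo) + f(hi)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(lo + k * h)
    return total * h / 3


def toy_spectrum(y):
    """Toy spectrum dGamma/dy = 2 y^2 (3 - 2 y) on 0 <= y <= 1 (assumed shape)."""
    return 2 * y**2 * (3 - 2 * y)


def partial_rate(spectrum, cut, upper=1.0):
    """Partial width with a lower cut, as in Eq. (23)."""
    return simpson(spectrum, cut, upper)


def r_star(spectrum, cut, upper=1.0):
    """Ratio of the rate with a cut to the rate without a cut, as in Eq. (24)."""
    return partial_rate(spectrum, cut, upper) / partial_rate(spectrum, 0.0, upper)


if __name__ == "__main__":
    for cut in (0.0, 0.25, 0.5, 0.75):
        print(cut, r_star(toy_spectrum, cut))
```

Because the normalization of the spectrum cancels in the ratio, \(R^*\) does not depend on \(|V_{cb}|\), just like the moments in (20).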

In addition, it is interesting to consider also the forward–backward asymmetry [17] (see also Ref. [18] for a recent work). The asymmetry is defined as

$$\begin{aligned} \displaystyle A_{FB} = \frac{\int _{-1}^0 \textrm{d}z\, \frac{\textrm{d}\Gamma }{\textrm{d}z} - \int _{0}^1 \textrm{d}z\, \frac{\textrm{d}\Gamma }{\textrm{d}z}}{ \int _{-1}^0 \textrm{d}z\, \frac{\textrm{d}\Gamma }{\textrm{d}z} + \int _{0}^1 \textrm{d}z\, \frac{\textrm{d}\Gamma }{\textrm{d}z} }, \end{aligned}$$
(25)

where

$$\begin{aligned} z = \cos \theta = \frac{v \cdot p_\nu - v \cdot p_\ell }{\sqrt{(v\cdot q)^2 - q^2}}, \end{aligned}$$
(26)

\(v = p_B/M_B\) and \(\theta\) is the angle between the spatial momenta of the lepton and the B meson in the rest frame of the dilepton pair (see Fig. 3). The asymmetry was first discussed in [17], including a cut on \(E_\ell\). This leads to a cusp in the differential spectrum in the variable z, which can be problematic in experimental analyses. To circumvent this issue, Ref. [18] proposed to study \(A_{FB}\) with a minimum cut on \(q^2\) instead.

Fig. 3

(Left) The decay \(B \rightarrow X_c \ell {{\bar{\nu }}}_\ell\) in the lepton-neutrino rest frame. The angle \(\theta\) between the flight direction of the lepton and the B meson is used to define the forward-backward asymmetry. (Right) The differential decay rate w.r.t. \(z = \cos \theta\) at different orders in the HQE for \(q^2_{\textrm{cut}} = 3\) \(\hbox {GeV}^2\). Figures from Ref. [18]

3.1 Kinetic mass and perturbative corrections

The total semileptonic rate \(\Gamma _{\textrm{sl}}\) and the centralized moments are expressed as a double expansion in \(1/m_b\) and \(\alpha _s\). The expansion of the total rate can be written as

$$\begin{aligned} \Gamma _{\textrm{sl}}&= \frac{G_F^2 m_b^5A_\textrm{ew}}{192\pi ^3} |V_{cb}|^2 \nonumber \\&\quad \times \Bigg [ \left( 1-\frac{\mu _\pi ^2}{2m_b^2} \right) \left( X_0(\rho ) +\frac{\alpha _s}{\pi } X_1(\rho ) +\left( \frac{\alpha _s}{\pi }\right) ^2 X_2(\rho ) +\left( \frac{\alpha _s}{\pi }\right) ^3 X_3(\rho ) +\dots \right) \nonumber \\&\quad +\left( \frac{\mu _G^2}{m_b^2} -\frac{\rho _D^3}{m_b^3} \right) \left( g_0(\rho ) +\frac{\alpha _s}{\pi } g_1(\rho ) +\dots \right) +\frac{\rho _D^3}{m_b^3} \left( d_0(\rho ) +\frac{\alpha _s}{\pi } d_1(\rho ) +\dots \right) +O\left( \frac{1}{m_b^4} \right) \Bigg ], \end{aligned}$$
(27)

where \(\rho =m_c/m_b\) and \(\alpha _s \equiv \alpha _s^{(n_f)}(\mu _s)\) is the strong coupling constant taken with \(n_f\) active quarks and at the renormalization scale \(\mu _s\). The factor \(A_\textrm{ew} = 1.01435\) stems from short-distance radiative corrections at the electroweak scale [19]. The functions \(X_i\), \(g_i\) and \(d_i\) depend on the mass ratio \(\rho\) and in general also on \(q^2_{\textrm{cut}}\) or \(E_{\textrm{cut}}\) if a phase-space cut is applied.

The first row of Eq. (27) corresponds to the prediction of a free bottom quark decay obtained in perturbative QCD. The next-to-leading order (NLO) corrections have been known for more than 30 years [20]. At next-to-next-to-leading order (NNLO), the coefficient \(X_2(\rho )\) is known as an expansion around \(\rho \rightarrow 0\), which corresponds to the limit \(m_c \ll m_b\). The result covers both cases \(b \rightarrow c \ell {{\bar{\nu }}}_\ell\) and \(b \rightarrow u \ell {{\bar{\nu }}}_\ell\) [21, 22]. Recently, analytic expressions for the NNLO corrections written in terms of iterated integrals were presented in [23]. The asymptotic expansion around \(\rho = 0\) is quite involved and it has not yet been extended to next-to-next-to-next-to-leading order (N3LO). At NNLO, the opposite case has also been studied [24], i.e. the case of a heavy charm with \(m_c \simeq m_b\). It has been shown that the asymptotic expansion in the limit \(\delta = 1-\rho \rightarrow 0\) is much simpler and leads to fast convergence of the series even at the physical value of \(\delta = 1-m_c/m_b \simeq 0.7\). At N3LO, the coefficient \(X_3(\rho )\) has been calculated as an expansion around \(m_c \simeq m_b\) in Ref. [25] (see also Ref. [26], where the Abelian limit has been cross-checked). The same asymptotic expansion allows one to estimate the third-order correction for charmless \(b \rightarrow u\) decays with an uncertainty on \(X_3\) of about 10%. Recently, \(X_3(\rho =0)\) has been computed analytically in the leading-color approximation [27]. Accurate numerical predictions in full color for the fermionic contributions are also available from Ref. [28], while the calculation of the bosonic corrections is ongoing.

The perturbative QCD corrections to the coefficient of the power-suppressed term \(\mu _\pi ^2\) are linked to those at order \(1/m_b^0\) because of reparametrization invariance. The NLO correction \(g_1(\rho )\) to the coefficient of \(\mu _G^2\) in the total rate was obtained numerically from phase-space integration of the differential rate in Ref. [29] and later given in analytic form in Ref. [30]. Results for the NLO correction to \(\rho ^3_D\) are presented in Ref. [31] (note that the earlier calculation in [32] is superseded due to overlooked renormalization terms, later fixed in [31]).

The double expansion for the moments of an observable O can be written in a similar way as in Eq. (27); however, the functions in front of the HQE parameters \(\mu _G^2\) and \(\rho _{LS}^3\), and also \(\mu _\pi ^2\), are in general different since reparametrization invariance is not preserved for \(E_\ell\) and \(M_X^2\) moments. Centralized moments are derived by using the binomial formula

$$\begin{aligned} \Big \langle (O - a)^n \Big \rangle = \sum _{i=0}^n \left( {\begin{array}{c}n\\ i\end{array}}\right) \left\langle O^i \right\rangle (-a)^{n-i}, \end{aligned}$$
(28)

which follows from the linearity property of integrals, and subsequently re-expanding the expression in \(\alpha _s\) and \(1/m_b\) to the relevant order. The power corrections at tree level are known up to \(O(1/m_b^5)\), see Refs. [4, 5, 14, 33].
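The binomial formula (28) translates directly into code. The following minimal sketch builds \(\langle (O-a)^n\rangle\) from a list of raw moments \(\langle O^i\rangle\); the function name and the convention that the list starts with \(\langle O^0\rangle = 1\) are choices made here for illustration, and the re-expansion in \(\alpha _s\) and \(1/m_b\) mentioned above is not attempted.

```python
from math import comb


def central_from_raw(raw, n, a=None):
    """Evaluate <(O - a)^n> from raw moments using the binomial formula (28).

    raw[i] must hold <O^i>, with raw[0] = 1. If a is None, the mean raw[1]
    is used, which yields the centralized moment.
    """
    if a is None:
        a = raw[1]
    return sum(comb(n, i) * raw[i] * (-a) ** (n - i) for i in range(n + 1))


if __name__ == "__main__":
    # Raw moments <O^i> = 1/(i+1) of a flat distribution on [0, 1] (toy input).
    raw_moments = [1.0, 1.0 / 2.0, 1.0 / 3.0, 1.0 / 4.0]
    print("variance:", central_from_raw(raw_moments, 2))
    print("third centralized moment:", central_from_raw(raw_moments, 3))
```

For the flat toy distribution the variance comes out as 1/12 and the third centralized moment vanishes, as expected for a symmetric distribution.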

The evaluation of higher-order corrections to the moments is further complicated by the presence of the cut, whose value enters as an additional scale in the calculation besides \(\rho\). The NLO corrections at order \(1/m_b^0\) can be computed from the knowledge of the NLO triple differential rate [34, 35]. The NNLO corrections to the hadronic invariant mass and charged-lepton energy moments have been calculated numerically in [36, 37], for fixed values of the cut and the mass ratio \(\rho\). Extrapolation to different cut values or heavy quark masses can be done via fits, see e.g. [38]. Analytic expressions for the NNLO corrections to the \(q^2\) spectrum and its moments with cuts have been presented only recently [39]. At N3LO, expressions for the moments without cuts are given in Ref. [40].

For the triple differential rate, the NLO corrections to the power-suppressed terms of order \(1/m_b^2\) were computed in [29, 41, 42], while the corrections at \({\mathcal {O}}(1/m_b^3)\) are available only for the \(q^2\) moments [31] and have so far not been extended to the other two types of moments.

In order to make accurate predictions, the formulas for the total rate and the moments must be expressed in terms of a short-distance mass scheme. This ensures the cancellation of the leading renormalon divergence, which arises if the observables are written in terms of the pole mass of the heavy quark [43, 44]. The fits in [15, 45] make use of the so-called kinetic scheme [46]. In this scheme, the pole mass of the bottom quark is replaced by the kinetic mass in the following way:

$$\begin{aligned} m_b^\textrm{pole} = m_b^\textrm{kin}(\mu ) + [{{\overline{\Lambda }}} (\mu )]_\textrm{pert} +\frac{[\mu _\pi ^2(\mu )]_\textrm{pert}}{2 m_b^\textrm{kin}(\mu )} +O\left( \frac{1}{m_b^2} \right) , \end{aligned}$$
(29)

where the scale \(\mu\), with \(\Lambda _{\textrm{QCD}} \ll \mu \ll m_b\), is the Wilsonian cutoff scale, usually taken to be of the order of 1 GeV. The last two terms in Eq. (29) are the perturbative versions of the HQE parameters as determined from the Small Velocity sum rules [47]. Their explicit expressions up to order \(\alpha _s^3\) are given in [48]. At the same time, the HQE parameters entering the expansion in Eq. (27) must be redefined in the kinetic scheme by subtracting their perturbative contribution:

$$\begin{aligned} \mu _\pi ^2(0)&= \mu _\pi ^2(\mu ) - [\mu _\pi ^2(\mu )]_\textrm{pert},&\rho _D^3(0)&= \rho _D^3(\mu ) - [\rho _D^3(\mu )]_\textrm{pert} . \end{aligned}$$
(30)

These expressions actually refer to the value of \(\mu _\pi ^2\) and \(\rho _D^3\) in the infinite \(m_b\) limit while the fits [15, 45] employ definitions of the HQE parameters at finite \(m_b\). In both cases, the setup neglects (unknown) terms of \(O(\mu ^3)\) in \([\mu _\pi ^2(\mu )]_\textrm{pert}\) and \([\mu _G^2(\mu )]_\textrm{pert}\). Since the operator bases in Ref. [45] and [15] differ in particular for the definition of \(\mu _G^2\), a mismatch of order \(\alpha _s \times \mu ^3\) appears when comparing the two frameworks.
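To illustrate the size of the terms in (29), the sketch below evaluates the relation at one loop only. The one-loop expressions \([{{\overline{\Lambda }}}(\mu )]_\textrm{pert} = \tfrac{4}{3} C_F \tfrac{\alpha _s}{\pi }\mu\) and \([\mu _\pi ^2(\mu )]_\textrm{pert} = C_F \tfrac{\alpha _s}{\pi }\mu ^2\) are the standard leading-order results of the kinetic scheme and are quoted here as an assumption of this example rather than from the text above; the value \(\alpha _s = 0.22\) is likewise an illustrative assumption, and the function names are hypothetical. The results through \(\alpha _s^3\) of [48] are not reproduced.

```python
import math

C_F = 4.0 / 3.0  # SU(3) colour factor


def lambda_bar_pert(alpha_s, mu):
    """One-loop [Lambda_bar(mu)]_pert = (4/3) C_F (alpha_s/pi) mu (assumed form)."""
    return 4.0 / 3.0 * C_F * alpha_s / math.pi * mu


def mu_pi2_pert(alpha_s, mu):
    """One-loop [mu_pi^2(mu)]_pert = C_F (alpha_s/pi) mu^2 (assumed form)."""
    return C_F * alpha_s / math.pi * mu**2


def pole_from_kinetic(m_kin, alpha_s, mu):
    """Right-hand side of Eq. (29), truncated at one loop and at order 1/m_b."""
    return m_kin + lambda_bar_pert(alpha_s, mu) + mu_pi2_pert(alpha_s, mu) / (2.0 * m_kin)


if __name__ == "__main__":
    m_kin = 4.565   # GeV, kinetic mass at mu = 1 GeV quoted in Sect. 3.2
    alpha_s = 0.22  # assumed value, for illustration only
    mu = 1.0        # GeV, Wilsonian cutoff
    shift = pole_from_kinetic(m_kin, alpha_s, mu) - m_kin
    print(f"one-loop shift m_pole - m_kin = {shift:.4f} GeV")
```

The one-loop shift is only indicative: the perturbative series relating the two masses is exactly where the renormalon problem of the pole mass shows up, so higher orders are not negligible.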

3.2 Analysis of \({\bar{B}} \rightarrow X_c \ell {\bar{\nu }}_\ell\) decays

The semileptonic decay width is calculated in the OPE framework outlined above. The resulting expansion depends on a number of non-perturbative parameters which can be extracted using the spectral moments of the inclusive \({\bar{B}} \rightarrow X_c \ell {\bar{\nu }}_\ell\) decay rate.

These spectral moments, in particular \(\langle E_\ell ^n\rangle\), \(\langle M_X^n\rangle\), and \(\langle q^{2n}\rangle\), have been measured by the Babar [49], Belle [50,51,52], Belle II [53], CDF [54], CLEO [55], and DELPHI [56] collaborations at various orders n. The \(\langle E_\ell ^n\rangle\) and \(\langle M_X^n\rangle\) moments have been measured with different cut-offs \(E_{\textrm{cut}}\) in the \(E_\ell\) spectrum, whereas the \(\langle q^{2n}\rangle\) moments have been measured with different cut-offs \(q_{\textrm{cut}}\) in the \(q^2\) spectrum. The measured spectral moments at different cut-offs within one spectrum are highly correlated, which makes it necessary to discard some of the data points to retain an invertible covariance matrix of the measurements for the analysis. The choice of the included data points is ad hoc; nevertheless, the result has to be stable when different data points are selected. The correlation between measured moments of different spectra has been neglected in the past, but can be sizeable, in the range of \(30\%\) to \(80\%\). A more stringent analysis of the data would benefit from experimental correlations quoted across different moments.
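To illustrate why strongly correlated data points have to be discarded, consider the simplest case of two measurements with correlation coefficient \(\rho\). The eigenvalues of their correlation matrix are \(1\pm \rho\), so the condition number grows like \(1/(1-|\rho |)\) and the inversion becomes numerically unstable as \(|\rho | \rightarrow 1\). The sketch below is a toy example with a hypothetical helper function, not the selection procedure of any of the cited analyses.

```python
def condition_number_2x2(rho):
    """Condition number of the correlation matrix [[1, rho], [rho, 1]].

    The eigenvalues are 1 + rho and 1 - rho, so the ratio of the largest
    to the smallest eigenvalue is (1 + |rho|) / (1 - |rho|).
    """
    if not -1.0 < rho < 1.0:
        raise ValueError("correlation must lie strictly between -1 and 1")
    return (1.0 + abs(rho)) / (1.0 - abs(rho))


# Neighbouring cut-offs in the same spectrum share most of their events,
# so correlations close to one are typical (values here are illustrative).
for rho in (0.5, 0.9, 0.99, 0.999):
    print(f"rho = {rho}: condition number = {condition_number_2x2(rho):.0f}")
```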

The fit to the spectral moments is only sensitive to a linear combination of the bottom- and charm quark masses \(m_{b,c}\) [57]. However, precise determinations of the heavy quark masses are available from the lattice and can be included in the fits. In the following, we will briefly review the two different analyses, which are either based on the lepton energy \(E_\ell\) moments and the hadronic invariant mass \(M_X\) moments, or based on the momentum transfer squared \(q^2\) moments.

These two analyses use the kinetic mass scheme for the bottom mass \(m_b\). From \({\overline{m}}_b({\overline{m}}_b) = 4.198 (12)\)  GeV [58] and \({\overline{m}}_c(3\) GeV\()=0.988(7)\)  GeV [59], one finds [48, 60]

$$\begin{aligned} m_b^{\textrm{kin}}(1\; {\textrm{GeV}}) = 4.565 \pm 0.015 \pm 0.013\; {\textrm{GeV}} \ . \end{aligned}$$
(31)

The first error is the theoretical uncertainty of the scheme conversion [25] and the second stems from the \({\overline{m}}_b({\overline{m}}_b)\) error. The \(\overline{\textrm{MS}}\) scheme is used for the charm mass \(m_c\), whose value has been determined precisely in lattice QCD [59] and QCD sum rules [61, 62] computations. Its mass at the correct scale

$$\begin{aligned} {\overline{m}}_c(2 \text { GeV}) = 1.093 \pm 0.008\; {\textrm{GeV}} \end{aligned}$$
(32)

can be obtained using RunDec [63, 64] with 4-loop accuracy.

The chromomagnetic expectation value \(\mu _G^2\) is related to B hyperfine splitting by

$$\begin{aligned} \frac{3}{4}(m_{B^*}^2-m_B^2) = C_{cm}(\mu ) \mu _G^2(\mu ), \end{aligned}$$
(33)

where \(C_{cm}(m_b) \simeq 1.26\) is the Wilson coefficient of the chromomagnetic operator, known up to \(O(\alpha _s^3)\) [65]. The formula is valid only in the infinite \(m_b\) limit, so in general there are power corrections of order \(\Lambda _\textrm{QCD}/m_b\), and only a loose bound on \(\mu _G^2\) can be set (with \(C_{cm} = 1\)) [66]:

$$\begin{aligned} \mu _G^2 = (0.36\pm 0.07)\, \textrm{GeV}^2. \end{aligned}$$
(34)
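The central value in Eq. (34) can be checked directly from Eq. (33) with \(C_{cm} = 1\). The short sketch below does this arithmetic; the meson masses are approximate values inserted here as assumptions, and the function name is hypothetical.

```python
# Approximate meson masses in GeV (assumed inputs, not taken from the text).
M_B_STAR = 5.32471
M_B = 5.27966


def mu_g2_from_hyperfine(m_b_star, m_b_meson, c_cm=1.0):
    """Solve Eq. (33) for mu_G^2 (GeV^2), neglecting power corrections."""
    return 0.75 * (m_b_star**2 - m_b_meson**2) / c_cm


mu_g2 = mu_g2_from_hyperfine(M_B_STAR, M_B)
print(f"mu_G^2 = {mu_g2:.3f} GeV^2")
```

The result reproduces the central value of about \(0.36\, \textrm{GeV}^2\); the uncertainty in Eq. (34) reflects the neglected \(\Lambda _\textrm{QCD}/m_b\) corrections and cannot be obtained from this estimate.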

Since the contributions proportional to \(\rho _{LS}^3\) in the moments are numerically suppressed and the fit is only marginally sensitive to its size, Refs. [45, 57, 67] include also the constraint \(\rho _{LS}^3 = -(0.15 \pm 0.10)\, \textrm{GeV}^3\), which was estimated in Ref. [68], loosely based on exact inequalities of the heavy quark theory in the limit \(m_b \rightarrow \infty\) [69].

The theoretical uncertainties are estimated by varying the scale and the HQE parameters. Their correlations among the predicted spectral moments are an important component in the fits to the spectral moments. The implemented strategies for the theory uncertainties and correlations are discussed below. Besides the two analyses in the kinetic scheme, there are determinations using the 1S scheme for \(m_b\) [70] using inputs from [71] (see Ref. [72] for a recent analysis). These are, however, not at the same level as the fits discussed below in terms of the included higher-order power and perturbative corrections and, therefore, we do not discuss them in more detail. Recently, Ref. [73] performed an extraction of \(|V_{cb}|\) using estimates for the HQE parameters up to \(1/m_b^2\) in the so-called dual-space-renormalon-subtraction method [74], which allows one to separate and subtract the order \(\Lambda _\textrm{QCD}^2/m_b^2\) infrared renormalon in the total width and to perform the analysis in the \(\overline{\textrm{MS}}\) mass scheme. The result \(|V_{cb}| = 41.5^{+1.0}_{-1.2} \times 10^{-3}\) is consistent with [15, 45] based on the kinetic scheme discussed below.

3.2.1 \(E_\ell\) and \(M_X\) moments analysis

This analysis uses two sets of inclusive observables and has been performed in Refs. [45, 57, 67]: \(\langle M_X^n\rangle\) moments of order \(n=2,4,6\) and \(\langle E_\ell ^n\rangle\) moments of order \(n=0, 1, 2, 3\), where the 0th moment corresponds to the partial branching fraction. Each moment is measured with different cut-offs \(E_{\textrm{cut}}\) in the lepton energy spectrum. The analysis includes constraints on the charm and bottom mass and the HQE parameters \(\mu _G^2\) and \(\rho _{LS}^3\) discussed above. In the nominal fit of the most recent analysis [45], HQE parameters up to \(1/m_b^3\) are included. In addition, the recent N3LO calculations of the decay rate in Ref. [25] were included. For the moments, \(\alpha _s^2\) corrections to the partonic rate are included, as well as \(\alpha _s\) corrections to \(\mu _G^2\). The theory uncertainty is modelled through variations and included in the fit. The HQE parameters are varied by \(\pm 7\%\) for \(\mu _\pi ^2\) and \(\mu _G^2\) and by \(\pm 20\%\) for \(\rho _D^3\) and \(\rho _{LS}^3\). An irreducible uncertainty of \(4\,\textrm{MeV}\) is applied to the quark masses \(m_{c,b}\), and \(\alpha _s(m_b)\) is varied by \(\pm 0.018\). The theoretical correlations between different central moments are taken to be zero. The correlation of the theoretical predictions at different cut-off values of the same central moment is modelled according to the proximity of the cuts. Details on the treatment of the theoretical uncertainties can be found in Ref. [57]. The resulting \(|V_{cb}|\) is [45]

$$\begin{aligned} |V_{cb}|= (42.16\,\pm 0.30|_{\textrm{th}}\pm 0.32|_{\textrm{exp}} \pm 0.25|_\Gamma ) \cdot 10^{-3} = (42.16 \pm 0.51) \cdot 10^{-3} , \end{aligned}$$
(35)

where the first uncertainty originates from the variations of the theory parameters, the second uncertainty from the experimentally measured moments, and the third uncertainty from the predicted decay rate. The \(\chi ^2_{\textrm{min}} / \textrm{ndf} = 0.47\) indicates a good fit to the data.

3.2.2 \(q^2\) moment analysis

Recently, also the measurements of the \(q^2\) moments have been used to extract \(|V_{cb}|\) and the HQE parameters [15]. As these are RPI quantities, they depend on a reduced set of HQE operators opening in principle the way to extract higher order \(1/m_b^4\) terms [4]. The goal of [15] was to perform a first analysis of terms up to \(1/m_b^4\), to show from data if the HQE indeed converges. We discuss this in more detail in Sect. 3.3.

For the analysis, \(\langle q^{2n}\rangle\) moments of order \(n=1, 2, 3, 4\) are included [15]. Each moment is measured with different \(q^2\) cut-offs \(q^2_{\textrm{cut}}\). The analysis includes Gaussian constraints on the bottom mass (31) and the charm mass (32). In order to extract \(|V_{cb}|\), information on the total branching ratio without a lepton energy cut is necessary (see Sect. 3.3). The authors of Ref. [15] perform an average of available measurements and find

$$\begin{aligned} {\mathcal {B}}(B\rightarrow X_c\ell {\bar{\nu }}_\ell ) = (10.48 \pm 0.13)\% \ , \end{aligned}$$
(36)

which differs slightly from the inclusive branching fraction \({\mathcal {B}}(B\rightarrow X_c\ell {\bar{\nu }}_\ell ) = (10.65 \pm 0.16)\%\) as determined by HFLAV [72] using a fit to the lepton energy and \(M_X\) moments. In addition to the constraint on \(\mu _G^2\) in (34), a conservative constraint of \(\mu _\pi ^2 = (0.43 \pm 0.24)\, \mathrm {GeV^2}\) is added to the fit. This is necessary because \(\mu _\pi ^2\) drops out to first order in the \(\langle q^2\rangle\) moments and only enters quadratically. As in the \(E_\ell\) and \(M_X\) analysis, theoretical uncertainties enter because of the truncation of the OPE. To account for missing higher-order terms in the HQE, \(\mu _G^2\) is varied by \(\pm 20\%\) and \(\rho _D^3\) by \(\pm 30\%\). To account for missing NNLO corrections, the scale \(\mu _s\) at which \(\alpha _s\) is evaluated is varied in the range \(m_b^{\textrm{kin}}/2< \mu _s < 2 m_b^{\textrm{kin}}\). We note that the \(\alpha _s^2\) corrections to the moments with a \(q^2\) cut are currently not available; as a result, the theoretical uncertainty of this analysis is slightly higher than for the lepton energy and \(M_X\) moments.

The a-priori unknown correlations between the theoretical predictions at different \(q^2\) cut-off values are modelled by a function depending on the proximity of the cuts, with an additional coefficient to tune the correlation strength. The correlation between moments of different orders m and n is modelled with an additional coefficient modified by the distance \(|m-n|\), which allows the correlation to be tuned. Both correlation coefficients are then extracted from the fit. Details on the treatment of the theoretical uncertainties and their correlations can be found in Ref. [15]. This approach allows one to quantify the ad-hoc assumptions in a data-driven way. The resulting \(|V_{cb}|\) with the N3LO calculations of the decay rate in Ref. [25] is (see footnote 2)

$$\begin{aligned} |V_{cb}|= (41.69 \, \pm 0.27|_{\mathcal {B}} \pm 0.31|_\Gamma \pm 0.18|_{\textrm{exp}}\pm 0.17|_{\textrm{th}} \pm 0.34|_{\mathrm{Constr.}} )\cdot 10^{-3} = (41.69 \pm 0.59) \cdot 10^{-3}\,, \end{aligned}$$
(37)

where the first uncertainty originates from the measured inclusive branching fraction, the second uncertainty from the predicted rate, the third uncertainty from the experimentally measured moments, the fourth uncertainty from the variations of the theory parameters, and the fifth uncertainty from the external constraints in the fit. The \(\chi ^2_{\textrm{min}} / \textrm{ndf} = 0.15\) indicates a good fit to the data.
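The two-coefficient model for the theory correlations described above can be sketched as follows. The power-law form used here, \(\rho _\textrm{cut}^{x}\) with \(x = |q^2_A - q^2_B|/(0.5\, \textrm{GeV}^2)\) for different cuts and \(\rho _\textrm{mom}^{|m-n|}\) for different orders, is an assumption made for illustration, as are the function names and the step size; the exact parametrization should be taken from Ref. [15].

```python
def theory_correlation(order_m, cut_a, order_n, cut_b,
                       rho_cut, rho_mom, cut_step=0.5):
    """Assumed correlation between moment (m, cut_a) and moment (n, cut_b).

    rho_cut  : correlation between neighbouring cuts (spaced by cut_step)
    rho_mom  : correlation between moments of adjacent order
    cut_step : reference spacing of the q^2 cuts in GeV^2 (assumed value)
    """
    x = abs(cut_a - cut_b) / cut_step
    return (rho_cut ** x) * (rho_mom ** abs(order_m - order_n))


def correlation_matrix(points, rho_cut, rho_mom):
    """Build the full matrix for a list of (order, q2_cut) pairs."""
    return [
        [theory_correlation(m, qa, n, qb, rho_cut, rho_mom)
         for (n, qb) in points]
        for (m, qa) in points
    ]


# Toy usage: first and second moment at two q^2 cuts (illustrative values).
points = [(1, 3.0), (1, 4.0), (2, 3.0), (2, 4.0)]
matrix = correlation_matrix(points, rho_cut=0.9, rho_mom=0.5)
for row in matrix:
    print(["%.3f" % entry for entry in row])
```

In a fit, `rho_cut` and `rho_mom` would be treated as nuisance parameters, which is what allows the correlation assumptions to be tested against the data.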

3.2.3 Comparison and combination

The global analyses discussed here find compatible values for \(|V_{cb}|\), but one intricacy is that the values of \(\rho _D^3\) determined in the two analyses are in tension. Some caution is needed when comparing these HQE parameters, since the analyses use different operator bases. In particular, the two bases are related up to \(1/m_b^3\) by the transformation

$$\begin{aligned} (\mu _G^2)^\perp&= \mu _G^2 + \frac{\rho _D^3+\rho _{LS}^3}{m_b},&(\mu _\pi ^2)^{\perp }&= \mu _\pi ^2, \end{aligned}$$
(38)
$$\begin{aligned} (\rho _D^3)^\perp&= \rho _D^3,&(\rho _{LS}^3)^\perp&= \rho _{LS}^3, \end{aligned}$$
(39)

where the HQE parameters denoted with “\(\perp\)” are those employed in [16] and defined with spatial covariant derivatives \(D^\perp _\mu = (g_{\mu \nu } - v_\mu v_\nu ) D^\nu\) instead of full derivatives. Respectively, the analyses of the \(E_\ell , M_X\) moments and the \(q^2\) moments yield

$$\begin{aligned} \rho _D^3 (1 \text { GeV})&= 0.185 \pm 0.031 \, \text {GeV}^3 \text { [16]},&\rho _D^3 (1 \text { GeV})&= 0.03 \pm 0.02 \, \text {GeV}^3 \text { [15]}. \end{aligned}$$
(40)

There might be several sources for the difference in the extraction of \(\rho _D^3\). As discussed below Eq. (30), these analyses employ the kinetic scheme but they do not redefine \(\mu^2 _G\). Effectively, this means that there is a scheme difference in \(\mu^2 _G\) of the order of \((\alpha _s/\pi ) (\mu /m_b)^3 \simeq 0.1\%\), which might affect the extraction of \(\rho _D^3\). It is also worth noticing that the theoretical setup of Ref. [15] does not include the (currently unknown) QCD NNLO corrections or NLO corrections to the power-suppressed terms for \(q^2\) moments, while the analysis [16] includes such higher-order corrections (see footnote 3). The inclusion of NNLO corrections to \(q^2\) moments can modify the fit results, in particular if large corrections appear for the moments with a rather large value of \(q^2_{\textrm{cut}}\).

Another important difference which could explain the discrepancy is the treatment of theoretical correlations. Ref. [15] showed that the value of \(|V_{cb}|\) is rather stable when different correlation scenarios are used, while the variation of HQE parameter values is more pronounced when changing assumptions for theoretical correlations. This issue needs further investigation and it is currently under study [76].

Finally, we may perform an average of the two inclusive \(|V_{cb}|\) values in Eqs. (35) and (37). To do so, we assume the same relative contribution of the uncertainty from the branching fraction in both the \(\langle E_\ell \rangle\) and \(\langle M_X\rangle\) analysis and the \(\langle q^2\rangle\) analysis. We then fully correlate this component. The leftover uncertainty originates from the experimental precision of the moments, which are independent measurements. As such, we keep this part uncorrelated. We treat the uncertainty from the theory prediction of the moments as uncorrelated and the uncertainty from the theory prediction of the rate fully correlated. We find the average

$$\begin{aligned} |V_{cb}| = (42.00 \pm 0.47) \times 10^{-3}\,. \end{aligned}$$
(41)
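The averaging procedure described above can be sketched as a two-measurement best-linear-unbiased estimate in which selected uncertainty components are fully correlated. The decomposition below is our reading of the text and should be regarded as an assumption: the branching-fraction component of the \(E_\ell\), \(M_X\) analysis is taken to have the same relative size as in Eq. (37) and is subtracted in quadrature from the experimental uncertainty in Eq. (35). Function and key names are hypothetical.

```python
import math


def combine_two(x1, parts1, x2, parts2, correlated):
    """Average two results; components named in `correlated` are taken as
    100% correlated between the results, all others as uncorrelated."""
    var1 = sum(s**2 for s in parts1.values())
    var2 = sum(s**2 for s in parts2.values())
    cov = sum(parts1[k] * parts2[k] for k in correlated)
    denom = var1 + var2 - 2.0 * cov
    w1 = (var2 - cov) / denom          # weight of the first result
    mean = w1 * x1 + (1.0 - w1) * x2
    sigma = math.sqrt((var1 * var2 - cov**2) / denom)
    return mean, sigma


# q^2 analysis, Eq. (37); all numbers in units of 10^-3.
v2 = 41.69
parts2 = {"BR": 0.27, "Gamma": 0.31, "moments": 0.18, "th": 0.17, "constr": 0.34}

# E_l, M_X analysis, Eq. (35): split the experimental uncertainty (0.32) into
# an assumed branching-fraction part and a leftover part from the moments.
v1 = 42.16
br1 = parts2["BR"] * v1 / v2
parts1 = {"BR": br1, "moments": math.sqrt(0.32**2 - br1**2),
          "th": 0.30, "Gamma": 0.25}

mean, sigma = combine_two(v1, parts1, v2, parts2, correlated=("BR", "Gamma"))
print(f"|V_cb| = ({mean:.2f} +- {sigma:.2f}) x 10^-3")
```

Under these assumptions the sketch gives a value close to Eq. (41); small differences with respect to the quoted average could arise from rounding of the inputs or from details of the treatment that are not spelled out in the text.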

The two \(|V_{cb}|\) determinations and our average are shown in Fig. 4. We also compare to the very recent analysis in [75], where the first combined fit to the \(\langle E_\ell \rangle\), \(\langle M_X\rangle\), and \(\langle q^2\rangle\) moments (including newly computed \(\beta _0 \alpha _s^2\) corrections) was performed. We see excellent agreement between the two determinations. Below, we discuss in more detail the future outlook for such global fits.

Fig. 4

Summary of the two inclusive \(|V_{cb}|\) determinations using two subsets of the available kinematic moments of the spectrum described in the text. In addition, we show our average of these two determinations, and the recent global fit to all kinematic moments from [75]

3.3 Higher order terms

Above, we focused our discussion on global analyses including HQE terms up to \(1/m_b^3\). At higher order, the number of parameters starts to proliferate, making their extraction from data challenging. A rough estimate of the size of these elements can be obtained using the Lowest Lying State Approximation (LLSA) [5, 77]. This approximation starts by representing the matrix elements as a sum over the full set of intermediate hadronic states and then assumes that the lowest lying heavy-meson state saturates this sum. The degree of saturation by this lowest lying state determines the quality of the approximation, making its accuracy hard to quantify. A toy study in [77] estimated the uncertainty to be around \(50\%\). Nevertheless, the LLSA may be used to set the scale for the higher-order elements, as done in [78]. In this analysis, the effect of higher-order terms up to \(1/m_b^5\) on the global fits was studied in detail. In an iterative approach, 9 \(1/m_b^4\) and 17 \(1/m_b^5\) HQE parameters, together with the lower-dimensional parameters, were fitted to the lepton energy and \(M_X\) moments, starting from their LLSA values including a generous uncertainty. The authors of Ref. [78] conclude from this fit that most of the higher-dimensional parameters do not change much from their initial LLSA values, indicating that there is low sensitivity to these parameters. In addition, the extracted value of \(|V_{cb}|\) changes very little even when repeating the analysis with a larger uncertainty for the higher-dimensional operators. They report a \(-0.25\%\) shift in \(|V_{cb}|\). This analysis therefore shows no breakdown of the HQE at higher orders and strengthens the theoretical basis of the \(|V_{cb}|\) determinations.

More recently, the higher-order terms up to \(1/m_b^4\) were studied for the first time using the \(q^2\) moments [15]. The benefit of these moments is that, like the total rate, they are RPI quantities, sensitive only to a limited set of HQE parameters. On the other hand, the \(\langle E_\ell \rangle\) and \(\langle M_X\rangle\) moments are not RPI quantities, as they are defined by choosing a specific frame or direction of the velocity v. Up to \(1/m_b^4\), the latter depend on the full set of 13 parameters, while for RPI quantities only 8 parameters contribute. In [15], two HQE parameters, \(r_E^4\) and \(r_G^4\), were extracted from the data, resulting in small values compatible with zero. As previously found, these results exclude large values for these parameters. On the other hand, large correlations between these two parameters and the \(\rho _D^3\) parameter were found, which deserve further investigation. We note that including QCD corrections introduces two additional operators at \(1/m_b^4\) [13].

Finally, as mentioned in (10), \(d\Gamma _5\) includes both \(1/m_b^5\) terms and \(1/(m_b^3 m_c^2)\) terms. The latter, the “intrinsic charm” (IC) contributions, are numerically expected to contribute at the same level as the \(1/m_b^4\) terms. Very recently, a study of the RPI \(1/m_b^5\) terms and the numerical size of these corrections appeared [14]. The authors conclude that there may be cancellations between these effects and the genuine \(1/m_b^5\) terms and thus recommend a combined analysis of these terms, as was done in [78] for the non-RPI moments.

3.4 Inclusive unitarity tests

In the above analyses, either only decays to electrons were used or a combination of the muon and electron final states. However, in the \(q^2\) analysis of Belle [52], also the compatibility of the electron and muon \(q^2\) moments at each order was calculated, leading to p-values close to one. For the total rate, the Belle II collaboration recently reported the most precise test of electron-muon universality in semileptonic B decays [79]

$$\begin{aligned} R_{e/\mu }(X_c)|_{\textrm{exp}}\equiv \frac{\Gamma (B\rightarrow X_c e {\bar{\nu }}_e)}{\Gamma (B\rightarrow X_c \mu {\bar{\nu }}_\mu )} = 1.007 \pm 0.009 ({\textrm{stat}}) \pm 0.019 (\textrm{syst}) \ , \end{aligned}$$
(42)

which agrees with the exclusive measurement in \(B\rightarrow D^*\ell \nu\) [80]. The measurement is also in agreement with the SM prediction [81]

$$\begin{aligned} R_{e/\mu }(X_c)|_{\textrm{theo}} = 1.006 \pm 0.001\ , \end{aligned}$$
(43)

which updated previous predictions in [82], see also [83]. The numerical value is obtained using the HQE elements from [45]. It includes NLO QCD corrections [84]. Very recently, Belle II also reported the ratio with \(\tau\) leptons [85]

$$\begin{aligned} R_{\tau /\ell }(X)|_{\textrm{exp}} = 0.228 \pm 0.016 ({\textrm{stat}}) \pm 0.036 (\textrm{syst}) \ , \end{aligned}$$
(44)

which agrees with the SM prediction of \(R_{\tau /\ell }(X)|_{\textrm{theo}} = 0.220 \pm 0.004\) [81] (see footnote 4). We note that this prediction does not include the NLO QCD corrections to the semitauonic decays up to \(1/m_b^3\), which were recently computed [86].
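The level of agreement quoted above can be made quantitative with a simple pull, defined here as the difference between measurement and prediction divided by all uncertainties added in quadrature. This is a minimal sketch with a hypothetical helper function; it assumes Gaussian, uncorrelated uncertainties.

```python
import math


def pull(exp_value, exp_errors, theo_value, theo_error):
    """(experiment - theory) in units of the combined standard deviation."""
    sigma = math.sqrt(sum(e**2 for e in exp_errors) + theo_error**2)
    return (exp_value - theo_value) / sigma


# Electron-muon universality ratio, Eqs. (42) and (43).
pull_e_mu = pull(1.007, (0.009, 0.019), 1.006, 0.001)
# Tau to light-lepton ratio, Eq. (44) and the SM prediction quoted in the text.
pull_tau = pull(0.228, (0.016, 0.036), 0.220, 0.004)

print(f"pull R(e/mu)  = {pull_e_mu:.2f} sigma")
print(f"pull R(tau/l) = {pull_tau:.2f} sigma")
```

Both pulls are well below one standard deviation, dominated in each case by the experimental systematic uncertainty.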

3.5 Outlook

The impressive precision reached on the inclusive \(|V_{cb}|\) determination shows the tremendous progress made on both the theoretical and experimental side. On the side of the perturbative corrections, the theory prediction has reached \(\alpha _s^3\) for the partonic total rate and the moments without cuts, and even first determinations of the \(\alpha _s\) contribution to the \(\rho _D^3\) parameter are available. The new \(q^2\)-moment analyses by both Belle and Belle II add new and valuable insights, as they have different sensitivities to the HQE parameters. Thanks to their RPI nature, they depend on a reduced set of HQE parameters, which allows the \(1/m_b^4\) terms to be probed purely from data; the results indicate no breakdown of the HQE. While the HQE parameters may depend on the specific theory correlations considered, \(|V_{cb}|\) is very robust with respect to these assumptions.

Very recently, a full analysis including all the measured moments (lepton energy, \(M_X\) and \(q^2\)) appeared [75], where the BLM \(\alpha _s^2\) corrections were also calculated. These were found to be large. A full calculation of all the \(\alpha _s^2\) corrections with a kinematic cut would be necessary to determine the impact of such corrections; this is currently in progress [39]. We do not discuss the result in [75] here in detail, as it is based on the same setup as the previous fits in [45] discussed above. We note, however, that the extracted value of \(\rho _D^3\) found in [75] matches the one previously obtained in [45]. Including the full \(\alpha _s^2\) results in a global analysis should confirm these results and will hopefully also clarify the puzzling differences in the \(\rho _D^3\) parameter, which has a large effect on the calculation of lifetimes [87] and other inclusive decays like \(B\rightarrow X_{s,d}\ell \ell\) [8, 9].

On the experimental side, we are expecting updates of the branching ratio, which is the dominant input for the \(|V_{cb}|\) extraction, where measurements with a \(q^2\) cut (instead of a lepton energy cut) are also under consideration. New determinations of the \(M_X\) moments are also highly anticipated. The first preliminary measurement of inclusive \(B\rightarrow X_c \tau {\bar{\nu }}_\tau\) decays also opens an interesting new window to explore. We stress that a similar analysis for the \(B_s^0\) would be highly desirable, but is not foreseen at the moment.

From the theory side, first progress has been made to obtain the \(\alpha _s^2\) corrections to the \(q^2\) moments with a cut on \(q^2\) in an analytic way. The methodology employed can in principle be extended to the electron energy spectrum and to a new evaluation of the \(E_\ell\) and \(M_X\) moments with a lower cut on the electron energy, thus extending the results from [37]. Note also that the master integrals calculated at NNLO for the partonic decay rate can also be employed in the evaluation of the higher-order QCD corrections to the \(1/m_b\) terms, as was shown for \(q^2\) moments at NLO in [86]. The method employed in [39] can also be applied to calculate the N3LO corrections to moments with cuts, if the reduction to master integrals is possible and the system of differential equations can be established. In addition, it is desirable to study the corrections of order \(1/m_b^3\) in the relation between the bottom quark pole mass and kinetic mass. Up to \(1/m_b^2\), the perturbative versions of \(\mu _\pi ^2\) and \(\rho _D^3\) entering Eq. (29) are given in terms of small velocity (SV) sum rules. At order \(1/m_b^3\), non-local operators also appear (denoted in Ref. [47] by \(\rho _{\pi \pi }, \rho _{\pi G}\), etc.). Their perturbative version, however, is not given in terms of SV sum rules, and a dedicated formulation of their definition in the kinetic scheme is necessary. Overall, the inclusion of higher-order corrections may benefit the global fit analysis by reducing not only the theory uncertainties but also the sensitivity to the theoretical correlation assumptions, in case the theory uncertainty becomes subdominant compared to the experimental one. As mentioned, such theory correlations may play a role in the extraction of the HQE parameters, although they affect \(|V_{cb}|\) only marginally [15].

Also the possibility of new physics (NP) entering in the light-lepton inclusive modes has been considered using the total rate [88] and recently extended to include also the moments [89]. An analysis of the moments including NP effects using an EFT approach is in progress [76]. Given the current level of precision, also QED effects should be revisited as pointed out recently by [90].

Finally, there has been interesting progress in lattice QCD calculations to obtain information on inclusive decays. Recently, the first comprehensive investigation of inclusive semileptonic B-meson decays was performed using the method developed in [91] and using ensembles from the ETM and JLQCD collaborations. These results do not yet include, for example, continuum and infinite-volume limits, but they show that this is certainly a promising direction to explore further. Inclusive \(B_s\rightarrow X_c \ell \nu\) decays may also be accessible; a pilot lattice computation was recently presented [92].

4 Inclusive \(B\rightarrow X_u\) decays and extraction of \(|V_{ub}|\)

The extraction of \(|V_{ub}|\) from inclusive \(B\rightarrow X_u \ell {\bar{\nu }}_\ell\) decays is challenging due to the large \(B\rightarrow X_c\) backgrounds. In order to remove these, the experiments apply a series of cuts that destroy the convergence of the local OPE. The local OPE would (simply) be the \(m_c\rightarrow 0\) limit of the \(B\rightarrow X_c\) case, which in addition introduces four-quark operators (see also [93] and the discussion below). The applied cuts introduce a sensitivity to the effects of the Fermi motion of the heavy quark inside the B meson, described by a nonlocal OPE containing (non-perturbative) distribution functions called shape functions (SFs) [94,95,96].

The \(B\rightarrow X_u \ell {\bar{\nu }}_\ell\) can be described in terms of the hadronic variables (see e.g. [97])

$$\begin{aligned} P_\ell = M_B - 2E_\ell, \quad \quad P_- = E_X + |\vec {P}_X|, \quad \quad P_+ = E_X - |\vec {P}_X|, \end{aligned}$$
(45)

where \(P_\pm\) are light-cone components of the hadronic momentum, with \(P_+P_-=M_X^2\), \(E_\ell\) is the lepton energy, \(E_X\) is the hadronic energy and \(\vec {P}_X\) the hadronic momentum. In order to avoid the large \(b\rightarrow c\) backgrounds, the measurements are restricted to \(P_+P_-<M_D^2\), which can be achieved using experimental cuts on the lepton energy \(E_\ell > (M_B^2-M_D^2)/(2M_B)\), on the hadronic invariant mass squared \(s_H<M_D^2\), and/or on the dilepton invariant mass squared \(q^2>(M_B-M_D)^2\).
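To give a feeling for the numbers involved, the following minimal sketch evaluates the light-cone variables of Eq. (45) and the charm-free thresholds quoted above. The meson masses are approximate values inserted as assumptions, and the function names are hypothetical.

```python
# Approximate meson masses in GeV (assumed inputs, not taken from the text).
M_B = 5.27966
M_D = 1.86484


def light_cone_components(e_x, p_x_abs):
    """Return (P_plus, P_minus) for hadronic energy e_x and momentum |p_x|."""
    return e_x - p_x_abs, e_x + p_x_abs


def charm_free_thresholds(m_b_meson=M_B, m_d_meson=M_D):
    """Kinematic boundaries beyond which b -> c decays cannot contribute."""
    return {
        "E_lepton_min": (m_b_meson**2 - m_d_meson**2) / (2.0 * m_b_meson),  # GeV
        "s_H_max": m_d_meson**2,                                            # GeV^2
        "q2_min": (m_b_meson - m_d_meson) ** 2,                             # GeV^2
    }


p_plus, p_minus = light_cone_components(e_x=2.0, p_x_abs=1.2)
print(f"P+ = {p_plus:.2f} GeV, P- = {p_minus:.2f} GeV, M_X^2 = {p_plus * p_minus:.2f} GeV^2")
for name, value in charm_free_thresholds().items():
    print(f"{name} = {value:.2f}")
```

The lepton-energy threshold comes out at roughly 2.3 GeV, which shows how small the charm-free region of the lepton spectrum is and why the shape-function effects become important there.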

In the appropriate region of phase space, the differential decay rate factorizes into perturbatively calculable hard functions H and jet functions J and a universal soft shape function S. Schematically, at leading order, we have

$$\begin{aligned} d\Gamma \sim H\cdot J \otimes S \ , \end{aligned}$$
(46)

where H incorporates physics at the high scale \(\mu _h \sim m_b\), J that at scales \(\mu _i\sim \sqrt{m_b \Lambda _{\textrm{QCD}}}\), and S describes the effects below the intermediate scale \(\mu _i\). In the infinite mass limit, there is only one SF, while power corrections in \(1/m_b\) introduce several subleading SFs. The moments of the shape function are related to the HQE parameters that describe the local OPE discussed in Sect. 3 and can be obtained from the global \(B\rightarrow X_c\ell \nu\) fits (see [12] for relations up to the fifth moment). For the subleading SFs, only the first three moments are known [98]. The leading-order SF can in principle be extracted from the \(B\rightarrow X_s \gamma\) spectrum [94, 95]. However, at subleading order, non-perturbative resolved photon contributions also play a role, making the use of data from these decays more challenging [99,100,101].

4.1 Theoretical approaches

There are several theoretical approaches available to describe the differential rate, largely based on the factorization in (46). The current strategy of HFLAV [72] is to use several of these theoretical calculations to extract \(|V_{ub}|\). In the following, we briefly discuss the most commonly used frameworks.


GGOU Gambino, Giordano, Ossola and Uraltsev [102] compute the triple differential decay rates of \(B \rightarrow X_u \ell \nu\) based on the OPE with a hard Wilson cutoff \(\mu\). The calculation consistently includes perturbative corrections up to \(\alpha _s^2 \beta _0\) and non-perturbative effects up to \(1/m_b^3\). This approach does not require the introduction of subleading SFs, because the Fermi motion is parameterized by three \(q^2\)-dependent SFs, with moments fixed by the \(B\rightarrow X_c\) analyses. The uncertainty due to the functional form of the SFs is estimated using a large number of different forms and accounts for about half of the total uncertainty [72].


DGE The Dressed Gluon Exponentiation (DGE) approach [103] provides a perturbative model for the leading SF. The approach uses an on-shell b-quark calculation as a first approximation to the meson decay spectrum using Sudakov resummation. The approach is specifically designed to obtain the triple differential rate in the small \(P^+\) region.


ADFR The model by Aglietti, Di Lodovico, Ferrera and Ricciardi [104] describes the \(B\rightarrow X_u\) spectra including non-perturbative soft-gluon effects through an effective time-like QCD coupling. In this work, the authors only consider analyses with a large cut on the lepton energy, \(E_\ell >2.3\) GeV, to avoid the charm background.


BLNP The approach by Bosch, Lange, Neubert and Paz (BLNP) [97, 105, 106] aims to smoothly interpolate between the local OPE region (i.e. where the HQE works in terms of the local operators) and the non-perturbative shape-function region. The approach is, however, optimized for the SF region, to the extent that when integrating over larger regions of phase space it reduces to the OPE results only up to corrections of \({\mathcal {O}}(\alpha _s^2)\). The current implementation includes all contributions known in 2005; an urgently needed update of this approach is in progress [107]. This update will include the \(\alpha _s^2\) corrections to the hard function H [106,107,110] and two-loop corrections to the jet function J [111]. In addition, subleading jet functions have been studied [112]. At the partonic level, \(\alpha _s^2\) [113, 114] and even \(\alpha _s^3\) [27, 28] corrections are now also available. The original BLNP setup uses the shape-function scheme for \(m_b\) [105], which depends on the scale at which the SF is normalized and can be related to other short-distance mass schemes. In most experimental analyses, the input parameters (\(m_b\) and \(\mu _\pi ^2\)) are taken from the \(B\rightarrow X_c\ell \nu\) analysis in the kinetic scheme discussed above and converted to the shape-function scheme using [115, 116]. In addition, while BLNP considers a large range of models for the (subleading) SFs, in most analyses only the exponential model is used.

The above approaches mainly differ in their treatment of the (subleading) SFs, which represents the largest source of uncertainty in the theoretical predictions. Alternatively, the experimental data can be used to constrain the SFs beyond the constraints from the OPE. As discussed, for \(B\rightarrow X_s\gamma\) this gives additional complications, but the \(B\rightarrow X_u\ell \nu\) differential spectra themselves could also give insights into the SFs.

The SIMBA approach [117] introduces a new framework to treat the shape function, which consistently incorporates its renormalization group evolution and all constraints on its shape and moments in any short-distance mass scheme. It aims to perform a combined analysis of \(B\rightarrow X_s \gamma\) and \(B\rightarrow X_u \ell \nu\) to determine \(|V_{ub}|\), the leading SF and even the \(C_7\) contribution to the radiative rate [118, 119].

The NNVub approach [120] uses artificial neural networks to parameterize the shape function. It allows for unbiased estimates of the SF form and a straightforward implementation of the experimental data. The method was combined with the GGOU approach to extract \(|V_{ub}|\), leading to reasonable agreement with previously obtained results.

4.2 Experimental hybrid Monte-Carlo approach

We discuss here the simulation of the inclusive \(B\rightarrow X_u \ell {\bar{\nu }}_\ell\) decays, used at Belle (II), Babar and CLEO.

To generate a Monte-Carlo (MC) mixture of resonant and non-resonant final states, the inclusive rate can be combined with exclusive final states (like \(B\rightarrow \pi , \rho ,\eta , \omega , \eta '\)) in a so-called ‘hybrid’ approach, which was originally suggested in Ref. [121]. This currently relies mainly on the inclusive theory predictions by De Fazio and Neubert (DFN) [122], based on the perturbative calculation including a smearing function to account for the Fermi motion. The inclusive and exclusive predictions are then combined such that, in each bin of the triple differential rate, the partial branching fractions of the combined exclusive prediction (\({\mathcal {B}}_{ijk}^\textrm{excl}\)) and the reweighted inclusive prediction together reproduce the inclusive value (\({\mathcal {B}}_{ijk}^\textrm{incl}\)). This is achieved by assigning weights \(w_{ijk}\) to the inclusive contributions such that

$$\begin{aligned} {\mathcal {B}}_{ijk}^\textrm{incl} = {\mathcal {B}}_{ijk}^\textrm{excl} + w_{ijk} {\mathcal {B}}_{ijk}^\textrm{incl} \,, \end{aligned}$$
(47)

with ijk denoting the corresponding bin in the three parameters \(q^2\), \(E_\ell ^B\), and \(M_X\). Typical bins of \(q^2\), \(E_\ell ^B\), and \(M_X\) are

$$\begin{aligned} \begin{aligned} q^2&= [0, 2.5, 5, 7.5, 10, 12.5, 15, 20, 25]\, \textrm{GeV}^2 \,, \\ E_\ell ^B&= [0, 0.5, 1, 1.25, 1.5, 1.75, 2, 2.25, 3]\, \textrm{GeV} \,, \\ M_X&= [0, 1.4, 1.6, 1.8, 2, 2.5, 3, 3.5]\, \textrm{GeV} \,. \end{aligned} \end{aligned}$$
(48)

In practical applications, such as MC generation, the prediction of the inclusive rate is evaluated point-wise by the event generator EvtGen [123] and the hadronic system is passed on to Pythia [124] for fragmentation. Using the hybrid weights to inject the resonant contributions introduces artificial, nonphysical flanks into the spectrum (see for example Refs. [125, 126]). Although this approach works well in practical applications at the current experimental precision, these flanks in the distribution will become visible as the precision increases, causing discrepancies between recorded and simulated data.
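The bin-wise weighting can be illustrated with a minimal sketch. It reads Eq. (47) as the requirement that the exclusive contribution plus the weighted inclusive contribution reproduces the inclusive value in each bin, so that \(w_{ijk} = ({\mathcal {B}}_{ijk}^\textrm{incl} - {\mathcal {B}}_{ijk}^\textrm{excl})/{\mathcal {B}}_{ijk}^\textrm{incl}\). The function names and the clipping of the weights to the interval [0, 1] are our own illustrative assumptions; they are not taken from EvtGen or from any experimental software.

```python
from bisect import bisect_right

# Bin edges as quoted in Eq. (48).
Q2_EDGES = [0, 2.5, 5, 7.5, 10, 12.5, 15, 20, 25]    # GeV^2
EL_EDGES = [0, 0.5, 1, 1.25, 1.5, 1.75, 2, 2.25, 3]  # GeV
MX_EDGES = [0, 1.4, 1.6, 1.8, 2, 2.5, 3, 3.5]        # GeV


def bin_index(value, edges):
    """Return the index of the bin containing value, or None if out of range."""
    if value < edges[0] or value >= edges[-1]:
        return None
    return bisect_right(edges, value) - 1


def hybrid_weight(b_incl, b_excl):
    """Solve B_incl = B_excl + w * B_incl for w in a single bin.

    Assumption of this sketch: the weight is clipped to [0, 1], so that a
    bin in which the exclusive prediction exceeds the inclusive one simply
    receives no inclusive contribution.
    """
    if b_incl <= 0.0:
        return 0.0
    w = (b_incl - b_excl) / b_incl
    return min(1.0, max(0.0, w))


def event_weight(q2, e_lepton, m_x, weights):
    """Look up the weight of an inclusive event from a dict keyed by (i, j, k)."""
    key = (
        bin_index(q2, Q2_EDGES),
        bin_index(e_lepton, EL_EDGES),
        bin_index(m_x, MX_EDGES),
    )
    if None in key:
        return 1.0  # assumption: events outside the binned range are left unweighted
    return weights.get(key, 1.0)
```

Because the weights are constant within a bin and change discontinuously at the bin edges, a sketch of this kind also makes the origin of the flanks mentioned above apparent.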

4.3 Determinations of \(|V_{ub}|\)

In the current HFLAV \(|V_{ub}|\) determinations, measurements with different phase space cuts from Babar [127,128,129], Belle [126, 130,131,132] and CLEO [133] are included.

We want to highlight here the most recent determination of inclusive \(|V_{ub}|\) by the Belle Collaboration [126], where cuts on \(E_\ell\), \(q^2\) and/or \(M_X\) were applied. They extract values for \(|V_{ub}|\) using the four theoretical models described above and quote as a final result the average of these:

$$\begin{aligned} |V_{ub}| \,= (4.10\pm 0.09\pm 0.22\pm 0.15)\cdot 10^{-3}\ , \end{aligned}$$
(49)

where the uncertainties are statistical, systematic and theoretical. All four determinations give compatible results. This determination is in good agreement with the HFLAV [72] averages for the BLNP and GGOU determinations,

$$|V_{ub}|_{\textrm{BLNP}} \,= (4.28 \pm 0.13\pm 0.20)\cdot 10^{-3} \, , \quad |V_{ub}|_{{\textrm{GGOU}}} = (4.19 \pm 0.12\pm 0.12)\cdot 10^{-3} \, ,$$
(50)

where the errors correspond to the experimental and theoretical uncertainties.
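As a rough numerical cross-check of this agreement, the quoted uncertainties can be added in quadrature and the central values compared. The sketch below uses only the numbers of Eqs. (49) and (50). It assumes that all uncertainty components are uncorrelated and ignores the fact that the Belle measurement and the HFLAV averages share inputs, so the resulting numbers are only indicative.

```python
from math import sqrt


def total_uncertainty(*components):
    """Add uncertainty components in quadrature (assumes no correlations)."""
    return sqrt(sum(c * c for c in components))


def naive_pull(a, b):
    """Difference of two (value, uncertainty) pairs in units of the combined
    uncertainty, ignoring any correlation between them."""
    return (a[0] - b[0]) / total_uncertainty(a[1], b[1])


# Values in units of 10^-3, taken from Eqs. (49) and (50).
belle = (4.10, total_uncertainty(0.09, 0.22, 0.15))
blnp = (4.28, total_uncertainty(0.13, 0.20))
ggou = (4.19, total_uncertainty(0.12, 0.12))

for name, average in (("BLNP", blnp), ("GGOU", ggou)):
    print(f"Belle vs {name}: naive pull = {naive_pull(belle, average):+.2f}")
```

Under these assumptions both differences are well below one combined standard deviation.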

In addition, Belle presented the first measurements of the differential branching fractions using the full Belle data set [134] (using the same collision events as [126]). As already stressed, these measurements are very important as they can be used in future studies of the non-perturbative decay dynamics (for example for the NNVub and SIMBA approaches). The measurements are, depending on the region of phase space, statistically or systematically limited, and show fair agreement with hybrid and inclusive predictions.

4.4 Ratio of exclusive over inclusive \(|V_{ub}|\)

Given the tensions between exclusive and inclusive determinations of \(|V_{ub}|\), it is interesting to directly measure the ratio of both. Recently, the Belle collaboration has performed such a simultaneous analysis of inclusive \(B\rightarrow X_u \ell {\bar{\nu }}_\ell\) and exclusive \(B\rightarrow \pi \ell {\bar{\nu }}_\ell\) decays [135]. This approach has the advantage that the experimental technique is shared for both determinations, removing potentially unknown experimental biases present in only one of the measurements. It further allows the correlations between the two resulting \(|V_{ub}|\) values to be determined. However, because two physical quantities have to be optimized in the simultaneous analysis, the determined values of \(|V_{ub}|\) are less precise than their counterparts from individual inclusive and exclusive determinations. The \(|V_{ub}|\) values resulting from the inclusive and exclusive distributions are compatible with each other. The extracted exclusive-to-inclusive ratio is [135]

$$\begin{aligned} |V_{ub}^\mathrm {excl.}| / |V_{ub}^\mathrm {incl.}| \,= 0.97 \pm 0.12 \,. \end{aligned}$$
(51)

Further, the result is compatible with the inclusive \(|V_{ub}|\) determined with the same collision data in Ref. [126], and with the current world average provided by HFLAV [72]. The result is less precise but stable if only inputs from LQCD and no additional external experimental data are included to constrain the form factors of \(B\rightarrow \pi \ell {\bar{\nu }}_\ell\).

4.5 Weak annihilation

Weak annihilation (WA) effects arise in the local OPE at order \(1/m_b^3\), but are enhanced by a large prefactor [136]. Because they arise from spectator effects, they can cause a sizeable difference between \(B^0\) and \(B^+\) decays, and they are mainly concentrated at the endpoint of the spectrum. Interestingly, the size of WA can be estimated from inclusive semileptonic charm decays, where its effect is expected to be enhanced. The OPE for the \(b\rightarrow u\) transition is set up in much the same way as one would set it up for the \(c\rightarrow s\) or \(c\rightarrow d\) transitions. Whether or not the HQE can be applied to charm decays was already discussed years ago [137, 138] and was recently revisited in [139]. Clearly, both expansion parameters, \(\alpha _s(m_c)\) and \(\Lambda _{\textrm{QCD}}/m_c\), are much larger than in bottom decays. At the same time, this makes charm decays more sensitive to the HQE parameters and to the four-quark (weak annihilation) operators. In the end, a thorough experimental study of the relevant HQE hadronic matrix elements has to be performed to determine whether the HQE for charm converges well enough. The challenge in doing so lies in finding an appropriate renormalon-free mass definition for the charm quark. In principle, one could define a kinetic mass for the charm, but this introduces terms \(\sim \mu /m_c\), which give large corrections even for \(\mu =0.5\) GeV. Alternatively, one could replace the mass by observable quantities which are renormalon free [140, 141].

A study of inclusive charm decays was performed in [142], where CLEO-c data [143] for the charm semileptonic decay spectrum were analysed to determine the weak annihilation matrix elements. Using the kinetic scheme for the charm mass with the IR cutoff scale \(\mu _{\textrm{kin}}=0.5\) GeV and \(\alpha _s = \alpha _s(m_c)\), the authors find a satisfactory description of the data, even in the absence of weak annihilation effects. They quote a conservative determination of [142]

$$\begin{aligned} B_{\textrm{WA}}^s = -0.0003(25) {\textrm{GeV}}^3 \ . \end{aligned}$$
(52)

This can be converted to the B meson using

$$\begin{aligned} B_{\textrm{WA}}^{bq}(\mu _{\textrm{WA}}) = \frac{m_B f_B^2}{m_D f_D^2} B_\textrm{WA}^{cq}(\mu _{\textrm{WA}}) \ , \end{aligned}$$
(53)

which leads to

$$\begin{aligned} |B_{\textrm{WA}}^{b}(\mu _{\textrm{WA}} = 0.8 {\textrm{GeV}})| < 0.006 \;{\textrm{GeV}}^3 \ , \end{aligned}$$
(54)

corresponding to a maximum contribution of \(2\%\) from WA to the total rate of \(B\rightarrow X_u \ell \nu\). This study shows that the weak annihilation effects are not unexpectedly large. Also in [144], the effects of the WA operators were found to be small, which gives confidence in the validity of the HQE for charm decays.
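To get a feeling for the size of the rescaling in Eq. (53), the prefactor \(m_B f_B^2/(m_D f_D^2)\) can be evaluated numerically. The meson masses and decay constants in the sketch below are approximate values that we assume purely for illustration; they are not quoted in the text and are not necessarily the inputs used in [142]. The sketch is therefore only an order-of-magnitude check against Eq. (54), not a reproduction of that analysis.

```python
# Approximate inputs in GeV, assumed for illustration only (not from the text).
M_B, F_B = 5.28, 0.190  # B meson mass and decay constant
M_D, F_D = 1.87, 0.212  # D meson mass and decay constant


def wa_conversion_factor(m_heavy, f_heavy, m_light, f_light):
    """Prefactor of Eq. (53): (m_B f_B^2) / (m_D f_D^2)."""
    return (m_heavy * f_heavy**2) / (m_light * f_light**2)


factor = wa_conversion_factor(M_B, F_B, M_D, F_D)

# Uncertainty of the charm WA matrix element quoted in Eq. (52), in GeV^3.
b_wa_charm_unc = 0.0025
b_wa_bottom_scale = factor * b_wa_charm_unc

print(f"conversion factor           ~ {factor:.2f}")
print(f"rescaled charm uncertainty  ~ {b_wa_bottom_scale:.4f} GeV^3")
```

With these assumed inputs the prefactor is slightly larger than two, so the rescaled charm uncertainty is of the same order as the bound in Eq. (54).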

We stress that, besides extracting the WA contributions, a full extraction of the HQE parameters in charm would be interesting. This could also open the way to inclusive \(|V_{cs}|\) and \(|V_{cd}|\) extractions (see e.g. [139]). In this respect, BESIII in particular can play a big role. Recently, BESIII measured the absolute branching fraction of the inclusive semielectronic \(D_s^+\) decay [145]. Measurements of moments of the spectrum, such as \(q^2\) or lepton-energy moments, are highly anticipated in order to perform such an HQE analysis.

4.6 Outlook

The recent Belle measurements of the differential distributions open the road to direct comparisons with theoretical predictions and to obtaining information on the shape functions directly from data. New measurements of differential spectra in \(q^2\), \(M^2_X\), \(P_\pm\) and \(E_\ell\) with improved precision are thus highly anticipated. Separating these into \(B^0\) and \(B^+\) decays would also be useful. To improve our knowledge of \(|V_{ub}|\), it will be important to provide unfolded spectra of these measurements. With this increased precision, the Monte-Carlo methods will play an even more dominant role, and checking the assumptions entering the hybrid model will be crucial. This requires a close collaboration between the experimental and theoretical communities. Improving the DFN theoretical predictions that currently enter the simulation could be one of the possibilities. For example, in [130], the hybrid Monte-Carlo framework was corrected to account for the GGOU model.

With precise Monte-Carlo predictions, it could be possible to move away from using kinematic cuts on the experimental side because the \(B\rightarrow X_c\) background could be subtracted based on MC data. This would then allow the use of the local OPE. Alternatively, the full \(B\rightarrow X\ell {\bar{\nu }}\) events could be used (without subtracting the \(b\rightarrow u\) or \(b\rightarrow c\) contribution). In this way, the ratio \(|V_{ub}/V_{cb}|\) would also enter in the local OPE (see also [93]). It remains to be seen if this contribution could be resolved in a global fit of moments of the spectrum. Such measurements are planned at Belle (II).

On the theoretical side, the long-awaited update of the BLNP framework, adding all the available \(\alpha _s^2\) corrections, is in progress.

Finally, the first measurement of the ratio of inclusive \(B\rightarrow X_u\) and \(B\rightarrow X_c\) decays was very recently reported by the Belle collaboration [146]. Using the BLNP and GGOU methods to determine the \(B\rightarrow X_u\) rates, they obtain values for the ratio \(|V_{ub}|/|V_{cb}|\) in excellent agreement with the current inclusive and exclusive world averages of the rates. Specifically [146]:

$$\frac{|V_{ub}|^{\textrm{GGOU}}}{|V_{cb}|}= 0.0996( 1\,\pm \,4.2\%_{\textrm{stat}}\pm 3.9\%_{\textrm{syst}}\, \pm \, 2.3\%_{(B\rightarrow X_u) \textrm{theo}}\,\pm \, 2.0\%_{(B\rightarrow X_c)\textrm{theo}}) .$$
(55)

This first direct measurement of the inclusive ratio is interesting because a potential bias due to mismodeling of the \(B\rightarrow X_u\) component is reduced. In addition, on the theoretical side, the \(B\rightarrow X_c\) and \(B\rightarrow X_u\) predictions share common inputs (the HQE parameters). A dedicated theoretical study could thus reduce the theoretical uncertainty.
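The relative size of the individual uncertainties in Eq. (55) can be made explicit with a short calculation. The sketch below adds the four quoted relative uncertainties in quadrature, which assumes that they are uncorrelated, and computes the share of the total variance that is due to the two theoretical sources. The variable and function names are ours.

```python
from math import sqrt

# Central value and relative uncertainties (in percent) from Eq. (55).
ratio = 0.0996
rel_unc_percent = {
    "stat": 4.2,
    "syst": 3.9,
    "theo_Xu": 2.3,
    "theo_Xc": 2.0,
}


def quadrature_sum(values):
    """Add uncertainties in quadrature (assumes uncorrelated sources)."""
    return sqrt(sum(v * v for v in values))


total_rel = quadrature_sum(rel_unc_percent.values())  # in percent
total_abs = ratio * total_rel / 100.0

# Fraction of the total variance that comes from the two theory sources.
theory_var = rel_unc_percent["theo_Xu"] ** 2 + rel_unc_percent["theo_Xc"] ** 2
theory_share = theory_var / total_rel**2

print(f"total relative uncertainty ~ {total_rel:.1f}%")
print(f"total absolute uncertainty ~ {total_abs:.4f}")
print(f"theory share of variance   ~ {theory_share:.0%}")
```

Under this assumption the theoretical sources account for roughly a fifth of the total variance, so a reduction of the theoretical uncertainty becomes increasingly relevant as the experimental precision improves.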