The persistent nonperturbative charm enigma

The question of the existence and possible magnitude of nonperturbative (often called"intrinsic") charm in the proton has long confounded attempts to cleanly isolate such a contribution in global analyses of high-energy experiments. In this letter, we show that the available (non)perturbative QCD theory and hadronic data have still not developed to a sufficient level to clearly resolve this problem. We highlight a number of challenging aspects that must be confronted in extracting nonperturbative charm in PDF fits, and in so doing, present an updated next-to-next-to-leading order CT analysis of fitted charm, CT18 FC, which we also compare to recent studies. We outline the theory developments and future data needed to make progress on this subject.

1. Introduction. The possibility that the nucleon might harbor a small but nonzero nonperturbative charm component [1] was identified soon after the establishment of QCD as the microscopic theory of the strong interaction. This notion, which has variously been called "intrinsic," "nonperturbative," or "fitted" charm, has been challenging to formalize in an unambiguous fashion based on rigorous QCD. Despite this, multiple efforts have attempted to isolate a nonperturbative charm PDF through a global QCD analysis of the available hadronic data, a class of approaches we designate fitted charm (FC). These works often assume specific nonperturbative shapes for the FC PDF at the evolution starting-scale, Q 0 , based on QCD-inspired intrinsic charm (IC) models, producing inconclusive results regarding the possible magnitude. The overall magnitude and c,c asymmetry induced by nonperturbative charm is often quantified in the moments of the charm PDF [2,3,4], ⟨x n ⟩ c ± (Q) ≡ 1 0 dx x n (c ±c)[x, Q], where the n = 1 moment of the "+" combination yields the total proton momentum carried by the charm (anti)quark.
On the basis of the momentum fraction or related quantities, various definite but conflicting claims have been made in the literature with respect to nonperturbative charm. In this Letter, we highlight lingering challenges by reviewing the current status of FC on the basis of recent QCD theory, and by presenting a global QCD analysis within the next-to-next-to-leading order (NNLO) CTEQ-TEA (CT) framework that updates in-depth studies of the underlying theory and phenomenology published a few years ago [3,4]. After a discussion of open issues in the status, definition, and implications of nonperturbative charm from the perspective of QCD theory (Sec. 2), in Sec. 3 we turn to the posited sensitivity of several experimental measurements to nonperturbative charm. Following this, in Sec. 4 we introduce a new family of PDFs with FC, CT18 FC, and compare their behavior and agreement with data against analogous findings from other recent studies. We point out that nonperturbative QCD effects may lead to a difference between charm quark and antiquark PDFs at the low-energy scale, as we illustrate through the example of a particular pair of IC models. In Sec. 5, we contrast against the recent NNPDF analysis [5], which reported a 3σ-level extraction of "intrinsic charm" based on an x-dependent deviation of a FC PDF from a purely radiativelygenerated charm scenario. We assess the significance of the suggested evidence in light of subtleties in the analysis methodologies used in FC studies. Finally, we conclude (Sec. 6) with recommendations for future studies on nonperturbative charm at the Large Hadron Collider (LHC) and Electron-Ion Collider (EIC). Additional, detailed quantitative results are provided in the Supplementary Discussion (SD) section.
2. Nonperturbative charm in QCD. The argument for a finite nonperturbative charm component in the proton is grounded in low-energy QCD. Open questions nonetheless remain concerning the rigorous definition, potential process (in)dependence, and actual magnitude of this contribution [4]. An enduring challenge in studies of IC has been the absence of a universal definition derived formally from QCD and free of ambiguities related to the interpretation. Traditionally, IC has been argued [1,6] to arise from the production of charm quark-antiquark pairs in long-distance QCD interactions. Despite its small magnitude, IC may in principle still be discernible from the perturbative ("extrinsic") production of charmed final states through radiation in independent scatterings off initial-state gluons and light (anti)quarks. As IC processes are thought to generate a nonzero charm distribution at Q ∼ m c , they act over sufficiently soft momenta as to be inherently nonperturbative; as such, the associated dynamics has been formulated through models which typically couple the proton to 5-quark intermediate states [1,7,8] explicitly containing c,c. From this simplified picture, various elaborations are possible, involving, e.g., production of intermediate hadronic (meson-baryon) states [9,10] or multi-quark virtual states with different spin structure [8].
Any IC PDF is ultimately a scheme-dependent function, analogous to the MS charm mass. It is not a physical scattering contribution that can be directly measured and thereby unambiguously discovered. Borrowing an analogy to nucleon strangeness, it is tempting to freely parametrize and fit xc + at or near the charm threshold and identify the resulting FC PDF with IC. We again stress that the FC PDFs extracted in recent examples of this approach, including CT14 IC [4], NNPDF [5,11,12], and Ref. [3], are actually approximations of IC, due to the possibility that the post-fit parametrization may absorb contributions unrelated to IC. This is reinforced by the fact that there is no unambiguous mapping between the IC PDFs predicted in nonperturbative models and FC PDFs that might be extracted in global fits based on QCD factorization; this ambiguity includes the fact that IC models should apply at an energy scale Q that is indeterminate. Furthermore, compared to perturbatively-generated charm, the magnitude of FC 2 quantified in this way depends on various theoretical parameters in the PDF analysis, such as the value of the charm-quark mass, m c ; QCD coupling strength, α s ; and auxiliary energy scales introduced in massivequark factorization schemes. In fact, the FC PDF itself is analogous to the fitted-charm mass, whose best-fit value may be affected by corrections leading to deviations from the charm mass in the QCD Lagrangian. In addition, it is not clear that indications of IC are process-independent, as would be necessary to claim that the IC is a universal component of the proton wave function. Ref. [4] provides a systematic ordering of leading-power and power-suppressed (higher-twist) contributions in DIS to understand the IC component in the framework of QCD factorization, and it illustrates the IC dependence on theory parameters at NNLO. Without theory developments to connect FC extracted in PDF analyses to the IC from traditional models, it is impossible to guarantee that the fitted "IC" is in fact a universal contribution to the proton wave function, rather than a process-dependent, non-leading twist, or other effect that has been spuriously absorbed. Still, many attempts have been made to constrain IC in QCD global fits, recently including CT14 IC [4], NNPDF [5], and Ref. [3]. While these studies have elucidated many aspects of the IC, inconclusive hints from experiments have not yet reached discovery-level, particularly given the subtleties discussed above.
3. Experimental signatures of nonperturbative charm. Unraveling a possible nonperturbative charm PDF at high x is empirically challenging, owing both to its small net momentum fraction, [⟨x⟩ FC ≡ ⟨x⟩ c + (Q 0 ) ≲ 10 −2 ], and the difficulty of performing the necessary flavor separation. Apart from the classical search channel in large-x DIS charm production [13,14], few hadronic data sets have been identified in previous QCD global fits as having clear 'smoking gun' sensitivity to nonperturbative charm, though it has been suggested [5] that the bulk of recent pp data from the LHC provide sufficient sensitivity to unravel IC. Many of these hadronic data sets have been independently investigated by the CT and other fitting groups and jointly explored in the recent PDF4LHC'21 exercise [15]; for these sets, there is no clear preference for FC in CTEQ-TEA fits; we reconfirm this point in the updated CT refit of FC shown in Sec. 4. In the SD section, we provide several plots illustrating the generally marginal pull of the latest LHC data on the FC PDF; these pulls can in fact become intermixed with the light-quark sea, obfuscating the relationship with nonperturbative charm.
In terms of measurements with more direct access, pp → Z + c has been suggested [16,17] as having elevated sensitivity to the high-x charm PDF and possible IC scenarios. Fully leveraging these data requires that they be consistently treated at the current standard in PDF analyses for high-energy physics (HEP) -NNLO. While NNLO QCD predictions have been published for Z+b-jet and W+c-jet production at the LHC recently [18,19], and for Z + c production [20] after the initial release of this article, these have not been practically incorporated into PDF fits. In the Z + c process, final-state parton showering and hadronization introduce a large correction at NLO that dampens the excess at high p T typically induced by large-x IC; see a quantitative investigation of the impact of parton showers on the sensitivity to various IC models in Sec. 6 of Ref. [4], as well as the very latest work in [20]. In Fig. 2 of the SD, we show theory predictions for the 2022 LHCb σ(Zc)/σ(Zj) ratios [21] based on MCFM at NLO and several PDF sets with and without FC; these calculations show that inclusion of FC does little to enhance agreement with the experimental data, while significant uncertainties exist in current theory predictions for this process. Additionally, in the available Z + c measurements, the anti-k T jet algorithm with charm tagging applied at the experimental level differs from those needed for QCD calculations to be infrared-safe (flavor-k T [22] or flavored antik T [23]). This enhances the sensitivity to the details of treatment of charm quarks with low p T in the sample, e.g., to the specific choice of the jet cone size and p c T cut of a few GeV. In the b-jet measurement, a related uncertainty is estimated to be about 10% [23], and in the c-jet case at LHCb such uncertainties may obscure discrimination among the FC models. Pending better control of QCD uncertainties, we do not include the presently available Z + c experimental data into our analysis.
In the meantime, available deep-inelastic scattering measurements have limited sensitivity to IC. While the EMC data [13] on the charm structure function, F c 2 , have been suggested [14,24] as hinting at the existence of IC, Refs. [3,4] ran into difficulties fitting these data under a variety of physics scenarios. The CT14 IC analysis [4] examined the EMC data set in depth and did not include it in the published fit because these data did not follow modern standards in characterizing experimental systematics and were analyzed at LO. This study [4] found χ 2 /N pt ≈ 2.  Figure 1. A comparison among the four FC models used in this study, CT18 and CT18X NNLO BHPS3 as well as CT18 NNLO MBMC/MBME. We show the fitted PDFs based on these for xc + (x, Q) at Q = Q 0 = 1.27 GeV (left) and Q = 100 GeV (center). We also plot the asymmetric ratio, c − /c + , for MBMC and MBME scenarios at both scales, as well as for CT18(X) extrinsic scenarios at Q = 100 GeV augmented by a factor of ten (right).
in Ref. [3]). In the current CT18 framework, we continue to find comparably high χ 2 for the EMC F c 2 , as well as strong dependence on the prescription for the systematic uncertainty, which is not under adequate control for this data set. The CT18 theory predictions overshoot the EMC data points over the whole x range, likely reflecting the magnitude of the large-x gluon PDF, except for the two highest-Q points, for which inclusion of the FC leads to a few-unit improvement in χ 2 . In turn, the values of the gluon PDF at these x and Q reflect constraints primarily from well-fitted LHC and Tevatron jet experiments. Hence, the CT18 FC PDFs do not include the EMC F c 2 data. SLAC DIS measurements on both the proton and deuteron imposed stringent limits on any allowed FC in Ref. [3], which may be modified nevertheless by considering a wider range of PDF and higher-twist parametrizations. (Note also that IC is nominally an NNLO effect in α s [4].) In contrast, Ref. [5] discussed in Sec. 5 obtained χ 2 /N pt ≳ 1 for the EMC data. Unlike CT, Ref. [5] modified the nominal data reported by EMC. These include a larger overall correlated systematic uncertainty (15%, vs. 11% as given in the original EMC publication); a shift in the central values by a factor of 0.82 based on a reduced branching ratio; and additional nuclear uncertainties. These adjustments reduce the χ 2 /N pt for NNPDF, as does a different behavior of the high-x gluon relative to CT, which lowers the theoretical predictions over the whole range independently of the inclusion of FC. The CT default prescription is to fit the EMC data set based on the reported experimental uncertainties; an in-depth look into the CT18 and NNPDF fits may shed light on the implications of the EMC F c 2 data for FC.

CT18 Fitted Charm
PDFs. The CT18 FC family of PDFs presented in this article is obtained by repeating the CT18 NNLO and CT18X NNLO analyses using three parametrizations of the charm and anticharm PDFs at the initial scale, Q 0 = 1.265 GeV (slightly below the charm pole mass, m c = 1.27 GeV, where the transition from three to four active quark flavors in these fits takes place; we note that this was m c = 1.3 GeV for the CT18 and CT18X NNLO fits). The CT18 FC analysis indicates that the constraining power of available data remains insufficient for establishing the shape and non-zero normalization of the FC PDFs derived from high-x IC models. This conclusion is consistent with the findings of the earlier CT14 IC study as well as Ref. [3]. The CT18 FC study is based on the up-to-date NNLO QCD theory and data selections of the CT18 [25] global analysis as well as dedicated follow-up studies, including CT18As Lat, which explored an s ̸ =s initial parametrization with the inclusion of lattice data [26]. Collectively, the published CT18 FC ensemble encompasses 12 PDF sets: 4 variations of the analysis and underlying IC model -specifically, the BHPS3 [27,4] and meson-baryon model (MBM) [10] -for each of which we release 3 PDF sets corresponding to the central (best) fit and fits at intervals of ∆χ 2 = 10 and 30. The latter ∆χ 2 value approximates the standard 68% C.L. CT tolerance, while the former represents a more restrictive scenario compatible with 4 the MSHT20 tolerance [28]. These PDFs thus estimate uncertainties in FC at high x in accord with the common tolerance criteria. As discussed in Sec. 2, nonperturbative charm can be envisioned as an effective 5-quark Fock state [1,9,7,10], with the resulting charm-anticharm configuration largely co-moving with the proton. The resulting IC PDF thus possesses a valence-like shape at Q 0 ∼ m c , with an enhancement in c(x, Q) at x > 0.1 which, for sufficiently large normalizations, survives above the perturbatively-generated charm PDF to electroweak scales, Q 2 ≫ m 2 c -see the illustration in Fig. 4 of Ref. [4], as well as Fig. 1 here. The BHPS3 fits assume the BHPS3 IC model discussed in Ref. [4] under two variants, namely, using the CT18 NNLO baseline and the alternative CT18X NNLO [25] scenario in which DIS data are fitted using a Bjorken x-dependent factorization scale to model small-x saturation. For the MBM, we consider two realizations based on confining (MBMC) and effective-mass (MBME) quark models as discussed in detail in Ref. [10]. The two MBM models predict more significant differences in c andc PDFs at Q 0 , in contrast to the other scenarios in which c ̸ =c is generated only by higher-order perturbative corrections.  GeV, for each of the FC scenarios considered in this work: the BHPS3 model [27,4] within CT18 NNLO and CT18X NNLO and the meson-baryon models (MBMs) implemented in CT18 NNLO which allow c ̸ =c. (Right) The associated values of the available first and second moments of the c ± = c±c PDF combinations for the normalizations preferred by each fit are shown on the left. The intervals correspond to the increase in ∆χ 2 < 10 from the respective best-fit values. Of these moments, we note that ⟨x⟩ c + and ⟨x 2 ⟩ c − are calculable in lattice QCD.
Critically, these are the central fitted PDFs: the uncertainty of the full normalization remains considerable, as found earlier [4]. Fig. 2 (left) quantifies a part of this uncertainty by the χ 2 dependence on ⟨x⟩ c + . A very mild preference for ⟨x⟩ FC ≡ ⟨x⟩ c + (Q 0 ) ∼ 0.5% is seen in all considered CT18 FC fits, marked by a ∆χ 2 ≈ 10-unit difference with respect to the "no FC" scenario. There is some disagreement among the 5 fitted data sets regarding the FC magnitude, with the BCDMS and combined HERA DIS data exerting an upward pull on the total charm fraction, while the E866 pp cross sections and 7, 8 TeV LHCb W/Z data prefer small FC. Fig. A.1 of the SD section dissects these pulls further via Lagrange Multiplier (LM) scans over ⟨x⟩ FC . The HERA charm-tagged DIS cross sections, σ c r , while influential in constraining the low-x charm and gluon, have apparently weaker pulls with respect to high-x FC scenarios, and therefore do not appear in the plots of that figure.
Charm-anticharm asymmetry. In addition to predicting a somewhat different high-x behavior for the FC PDF relative to BHPS, the MBM involves hadronic interactions that break the c =c assumption at Q 0 , unlike the BHPS model. As an illustration, we demonstrate this using the MBM as a specific model case. We stress that the detailed x dependence we obtain for c−c should not be taken as a strict prediction, but rather as an example of the asymmetric charm-anticharm scenarios to which future precise data might be sensitive. As discussed in Sec. III of Ref. [10], MBMs typically involve the appearance of nonperturbative (anti)charm in virtual (meson) baryon states into which the proton is allowed to dissociate; the anticharm quark naturally carries a greater share of its parent meson's momentum relative to charm in the corresponding intermediate baryon, producing a harder high-x distribution forc(x) relative to c(x). When normalized to c +c, the high-x c−c asymmetry shown in Fig. 1 (right) consequently approaches −1 at Q = Q 0 and remains negative after evolving to higher energy scales. Definite observation of a significant charm-anticharm asymmetry in either phenomenology or lattice calculations would be a substantial confirmation of the presence of IC. If nonzero, considerably higher precision in both data and theory will be necessary to determine the sign, shape, and magnitude of xc − (x) in future QCD fits, much as we argue for the size and shape of xc + (x).
The magnitudes of the FC and charm-anticharm asymmetry can be alternatively quantified by the symmetric (c + ) and asymmetric (c − ) first and second Mellin moments ⟨x 1,2 ⟩, which we plot in Fig. 2 (right) as central values and uncertainties corresponding to the ∆χ 2 ≤ 10 intervals with respect to the χ 2 minima in the left panel. The allowed ranges for each moment are quite substantial, being practically consistent with zero even based on the more restrictive ∆χ 2 = 10 criterion. As for the approximate CT tolerance of ∆χ 2 = 30, we obtain ⟨x⟩ FC ≲ 0.013 in all scenarios, with the precise uncertainties quoted in Eq. (A.1) of the SD section. This represents a moderate reduction in the allowed upper value of ⟨x⟩ FC ≲ 0.02 obtained based on the BHPS3 model fitted in the CT14 IC study [4]. The negative c − of the MBM is reflected in the negative values we obtain for the asymmetric moments, ⟨x 1,2 ⟩ c − , as also shown in Fig. 2. We point out that only some moments in Fig. 2 can be computed by lattice QCD techniques: specifically, ⟨x⟩ c + and ⟨x 2 ⟩ c − . Although lattice QCD calculations remain at an early stage for quantities related to charm, precise information on either of these moments, or complementary calculations of the high-x charm PDF, would be very useful for constraining possible FC scenarios.
One might reason that the nonperturbative symmetry-breaking mechanism(s) that produce c −c ̸ = 0 should have some analogue in the strange sector; we have considered such a possibility by performing alternative fits of MBMC(E) starting from a variant fit (CT18As2) from Ref. [26] as a baseline, which allowed s ̸ =s. While we see indications of mild correlations between the strange and charm PDFs in these fits, with the inclusion of FC according to MBMC(E) causing small reductions in the high-x strange (and gluon) PDF, s−s remains largely unaffected; hence, even in fits with a strange-antistrange asymmetry, we obtain similar results for the FC PDFs themselves. We include several plots illustrating the PDFs obtained in these simultaneous fits with s ̸ =s and c ̸ =c in the SD.
The improvement of χ 2 by no more than 25 units for ⟨x⟩ FC ≈ 0.5% in all considered FC scenarios in Fig. 2 is milder than in the CT14 IC analysis, where ⟨x⟩ FC ≈ 0.8 − 1% corresponded to a reduction in χ 2 of up to ≈ 40 units for the BHPS models, cf. Fig. 5 in Ref. [4]. Ultimately, the very shallow preferences for FC in Fig. 2 comply with the findings of other PDF fitting efforts, including the observation by MSHT that, when PDFs are fitted at partial N 3 LO ′ [29], there can be an enhancement in the perturbatively-generated charm PDF at high x, a feature which reduces the parameter space available for nonperturbative charm. We also note that the constraints we obtain on large-x FC, while already shallow, depend only mildly on m pole c -the parameter whose best-fit value compensates in part for the missing N 3 LO charm-quark scattering contribution to DIS cross sections [30] -and the form of the gluon parametrization [4]. When coupled with formal ambiguities in the relation of IC to FC PDFs, the fairly weak and incoherent pulls of the few experiments suggest a clear need for better experimental constraints. 6 5. NNPDF 2022 IC analysis. In contrast, the NNPDF group has recently claimed [5] robust, 3σ evidence for "intrinsic charm" -technically, FC by the definition above -using an x-dependent deviation of their FC PDF from a "no-FC" scenario, up to a PDF uncertainty assessed using the default NNPDF framework. That this local criterion can serve as a robust hypothesis test for the presence of FC is not necessarily obvious, as it may be subject to The crucial distinction with the CT18 FC analysis is that NNPDF's substantially narrower nominal PDF uncertainties seem to disfavor a "no-FC" PDF at x > 0.2. A recent publication [31] critically assessed these uncertainties using the publicly-released NNPDF4.0 fitting code to conclude that the NNPDF4.0 uncertainties on FC are likely underestimated. Specifically, figures in Sec. 3.E of Ref. [31] make evident that the NNPDF4.0 analysis would actually allow solutions with (nearly) zero FC with high probability under a more comprehensive sampling. The publication discusses the reasons why the effective prior introduced by the NNPDF replica training may spuriously omit these well-behaved and therefore acceptable solutions, following the usual practice in CT PDF analyses. Furthermore, Ref. [31] finds that the large-x FC and s−s PDFs are correlated, introducing an additional ambiguity in the region of x ∼ 0.4 where separation from rapidly fallingū,d, s, ands PDFs is particularly challenging. When the extra small-FC solutions are included, the MC PDF uncertainty alone washes out the evidence for a non-zero FC at large x. This uncertainty is further increased by accounting for the possible differences in approximating the experimental systematic uncertainties.
Another statistical indicator of FC, based on the charm PDF's first moment, ⟨x⟩ FC , at Q = m c = 1.51 GeV, is affected by a large missing higher-order uncertainty (MHOU) evident at x < 0.4 in Fig. 1 of [5]. At x < 0.1, the central NNPDF4.0 PDF predicts a large and negative FC that is difficult to reconcile with the valence-like shape of nonperturbative IC models. The negative low-x behavior of the central FC spotlights the tenuous connection of NNPDF's FC solutions to the nonperturbative models that do not favor this. On the other hand, the MHOU remains large at x < 0.1: including the N 3 LO matching coefficients in the PDF evolution, without consistently including N 3 LO terms in the coefficient functions, does not genuinely reduce the scale uncertainty. Ref. [5] determines the first moment to be ⟨x⟩ FC = 0.62 ± 0.28% based on the PDF uncertainty (PDFU) only and 0.62 ± 0.61% after adding the MHOU. The latter, in essence, makes ⟨x⟩ FC consistent with zero at 1σ.
For the interpretation of these results, it is worth noting that CT18 and NNPDF4.0 employ very similar NNLO theoretical frameworks. Not only CT and NNPDF use respectively the SACOT-χ and FONLL-C factorization schemes that are perturbatively equivalent up to NNLO, these schemes also produce numerically close predictions for key cross sections, as has been demonstrated repeatedly in joint benchmarking exercises such as the recent one in Ref. [15]. As one of the schemes of the ACOT family [32], the SACOT-χ scheme achieves the same level of accuracy as other ACOT approaches, while using simpler approximations. The derivation of the SACOT-χ scheme to NNLO [33] and its application to the FC scenario [4] demonstrate that, when including a power-suppressed FC PDF of the kind encountered for the nucleon, predictions of the original ("full") ACOT and SACOT-χ schemes agree up to terms of order Λ 2 /Q 2 , i.e., to an accuracy exceeding the validity of the factorization theorem. Therefore, the SACOT-χ and FONLL-C schemes, being perturbatively equivalent to the full ACOT scheme, are of the same accuracy. On the other hand, it has been established that the NNPDF3.1 and, even more so, NNPDF4.0 methodologies may produce smaller PDF error bands compared to either CT18 or MSHT20, even when fitting to a similar data set [15]; in this case, the larger PDF error estimates of the CT or MSHT groups may decrease the statistical significance of the observed signal.
Regarding the setup relevant to the FC study, we point out that the CT14/CT18 and NNPDF4.0 procedures for introducing FC and evolving it over mass thresholds are equivalent, with the only distinction 7 being that CT parametrizes FC at a scale Q 0 slightly below m c in the N f = 3 scheme and evolves it to higher Q by matching to the N f = 4 scheme at Q = m c . The NNPDF4.0 analysis, on the other hand, parametrizes FC at Q 0 (= 1.65 GeV) above m c in the N f = 4 scheme and evolves it backwards to Q = m c = 1.51 GeV, where the resulting FC is converted into the N f = 3 scheme and presented. Thus, the CT18 and NNPDF4.0 FC parametrizations at a scale Q ≤ m c can be directly compared at NNLO. By admitting the enlarged PDFU+MHOU uncertainty of the NNPDF4.0 analysis, we arrive at a general consensus between the latest CT18 and NNPDF findings with regard to FC. Namely, FC may possibly improve χ 2 or a related figure of merit, although with low confidence that does not rise to the evidence level. In the same vein, the observed behavior of the NNPDF's FC would be easier to reconcile with nonperturbative IC models once these larger uncertainties are considered. We already pointed out that the negative central FC at x < 0.1 and valence-like FC shape of the models can be reconciled by accounting for a large higherorder uncertainty. On the other hand, at x > 0.1, the central NNPDF FC PDF, xc + (x, Q = 1.51 GeV), peaks at x ≳ 0.4; this is considerably harder than the shapes that many nonperturbative models naturally produce: e.g., the MBMs of Ref. [10] generally peak in the region of 0.3 ≲ x ≲ 0.4 or below, as does the BHPS model [1], assuming conventional, O(100 MeV) constituent quark masses. (We point out that the slightly higher scale, Q = 1.51 > 1. 27 GeV, at which NNPDF compute their FC PDFs, make the hard shape even more striking.) While the neural network approach notably entails a highly flexible parametrization beyond the individual model-based forms in CT18FC, we have explored a range of high-x shapes and peak locations for FC as shown in Fig. 1 (left), with no indication of significant χ 2 dependence associated with these variations in Fig. 2 (left). These findings echo the conclusion in Ref. [4], wherein the left panel of Fig. 6 illustrated the very weak dependence of χ 2 on the charm mass, which controls the position of the peak of FC for BHPS-like models. Moreover, although IC models might be fine-tuned to produce harder shapes beyond those obtained with natural parameter choices, such nominal differences again highlight the formal theory development needed to relate FC to IC as argued in Sec. 2. These discrepancies are also relieved by assuming larger uncertainties. The only guaranteed way to improve the perturbative uncertainty is to fully implement radiative contributions of the same order in α s . For DIS cross sections, the current uncertainties can be reduced by fully implementing the N 3 LO contributions, if massive-quark terms are included. For the LHCb (Z + c)/(Z + jets) ratio, the currently formidable uncertainty in predictions that we examine in the SD must be better controlled through a combination of computations beyond NLO+PS and advances in flavored jet algorithms.
6. Conclusions. The new CT18 NNLO FC global analysis concludes that evidence for nonperturbative charm continues to be elusive, counter to the recent finding in Ref. [5]. As such, the subject remains open and in need of further theory and experimental data. Specifically, a number of complementary developments are required if future PDF analyses are to be capable of discriminating nonzero IC with high statistical confidence: (1) theoretical progress relating nonperturbative correlation functions in factorization-based calculations to wave functions formulated in terms of nonperturbative charm; this includes separating contributions from twist-4 and beyond as discussed in Ref. [4]; (2) clean and sensitive experimental data at hadron-hadron and lepton-hadron colliders to test the "universality" of the IC PDF; (3) theoretical calculations at NNLO, and possibly including parton-showering effects, for relevant data like Z +c production at the LHC, to correctly extract both the central value and uncertainty of the FC PDFs; and (4) faithful estimates of PDF errors in global analyses. (See Refs. [15,31].) This more comprehensive uncertainty quantification entails improved understanding of PDF correlations. This will be indispensable, as the size of IC PDF at high x is strongly correlated with the high-x gluon due to the momentum sum rule. Since NNPDF's high-x gluon is smaller than that of CT and MSHT, NNPDF is more likely to obtain a comparative enhancement in their FC PDF at larger values of x. Joint PDF benchmarking exercises like those in Ref. [15] could include FC parametrizations to understand these issues.
Regarding Point (2), direct EIC data similar to the EMC measurements of the charm structure function, F c 2 , at high x and over a range of Q 2 values, would be invaluable for revisiting the possible largex excess suggested by CT14 BHPS, NNPDF, and the traditional nonperturbative charm models [34,8]. Moreover, manifestations of IC might affect the physics program at the proposed CERN Forward Physics Facility [35,36]. Knowledge of the Q 2 dependence of this quantity would allow tests to unravel the nature of 8 nonperturbative charm against power-suppressed contributions arising in a twist expansion. Additional discriminating inputs include possible lattice QCD calculations of ⟨x⟩ c + and/or ⟨x 2 ⟩ c − ; these would be highly informative from the perspective of having an independent determination of the total charm magnitude in the first case. In the second case, a measure of the possible c,c asymmetry would be of great interest given the fact that this asymmetry, if non-negligible, would principally originate from nonperturbative dynamics [10,37]. These inputs might be augmented by x-dependent lattice information from the quasi-or pseudo-PDF methods [38,39,40]. We will make available 12 grids for the CT18 FC NNLO PDFs described above as a part of the LHAPDF library (https://lhapdf.hepforge.org/) and at the CTEQ-TEA website (https://ct.hepforge.org/). In Fig. A.1, we illustrate the χ 2 dependence on the total momentum fraction, ⟨x⟩ FC ≡ ⟨x⟩ c + (Q = Q 0 ), carried by fitted charm (FC) for each of the scenarios examined in the main paper; for this, we use the Lagrange multiplier (LM) method adopted in the CT18 NNLO analysis [25] and related publications. The LM scans identify the most constraining experiments whose χ 2 depends strongly on ⟨x⟩ FC . In all cases, we observe that the preferred range of ⟨x⟩ FC is determined by trade-offs between the pulls of the largest NC DIS data sets from the combined HERA and BCDMS measurements, which prefer larger ⟨x⟩ FC , and vector-boson production cross sections from E866 and LHCb pp scattering, which prefer lower ⟨x⟩ FC . For the meson-baryon models (MBMs), we also observe more pronounced downward pulls from the CC DIS (F 3 ) measurements by CCFR and CDHSW, as well as some pulls from the CMS 8 TeV jet production and NMC d/p DIS ratio. The structure function F 3 depends on (s−s) and (c−c) already at leading order; for this reason, measurements of this quantity may possess more nominal sensitivity to the c ̸ =c asymmetry which tracks ⟨x⟩ FC in the MBM as reflected by the CCFR and CDHSW curves in the lower panels of Fig. A.1.
The strengths of the pulls depend on the settings used in the associated analysis, as can be seen, e.g., from the comparison of the pulls in the CT18 and CT18X NNLO fits with the BHPS3 model in the upper row of Fig. A.1. For all examined models, ⟨x⟩ FC ≈ 0.5% is preferred even more mildly than in the CT14 where the first and second uncertainties represent the ∆χ 2 = 10 and 30 intervals, respectively.
PDF uncertainties for the LHCb 13 TeV σ(Zc)/σ(Zj) measurement LHCb has recently measured Z + c production at 13 TeV, with data normalized to the corresponding Z + jet cross sections [21]. As noted in the main text, the associated production of a Z boson and charm quark, Z + c, has been suggested [16] as providing a direct probe of the charm PDF, due to the leading contribution from gc → Zc scattering at parton level. In this work, we find that any FC contribution to Z + c production in the central region cannot be distinguished from the non-FC PDF uncertainty. In contrast, forward Z + c production may in principle possess more direct sensitivity due to the Born-level relation between the measured rapidity and probed momentum fraction, x ∼ (Q/ √ s)e y , such that PDFs in the large-x region, where typical IC models predict the greatest impact, might be constrained.
In Fig. A.2, in addition to the LHCb Zc/Zj ratio data, we show several predictions to illustrate the underlying physics issues.
1. The NLO calculation using MCFM [41] (indicated by corresponding open symbols), is presented in both panels of Fig. A.2 for an array of recent PDF sets without or with the FC component. Error bars on these predictions indicate the PDF uncertainty to gauge its magnitude compared to the other contributing uncertainties. 2. In the left panel only, we also include predictions made with POWHEG-BOX interfaced to Pythia8 [16] (indicated by filled symbols) for the indicated PDF sets. The NLO+PS calculation has a significant theoretical uncertainty. To discriminate among the FC models, the theoretical uncertainty of predictions must be smaller than the FC dependence.
Now turning to the details, in Fig. A.2 (left) we include predictions based on several PDFs: PDF4LHC15 (no FC) [42]; NNPDF3.0 with FC allowed [12]; and CT14 with the BHPS3 IC model [4]. Here, we notice 2 that parton showering significantly enhances the ratio σ(Zc)/σ(Zj), reflects the important contribution from higher-order splittings g → cc that are not included at fixed NLO. This enhancement has a sizeable PDF dependence that cannot be captured by a universal K-factor approach applied to all regions of phase space. The uncertainty of the CT14+BHPS3 PDF fixed-order prediction is larger than that of the respective NLO+PS prediction, simply because the latter prediction, taken from Ref. [16], does not include the PDF uncertainty. Figure A.2 (right) shows the MCFM fixed-order NLO calculations with a number of more recent PDF ensembles: CT18 [25]; CT18 with the BHPS3 FC model presented in the main paper; NNPDF4.0 [43]; NNPDF4.0 additionally fitted to the EMC and LHCb data [5]; and the enlarged hopscotch-sampled set of Ref. [31]. Here, we wish to gauge the effect of FC before parton showering is included. Again, we see the enhancement of the σ(Zc)/σ(Zj) ratio from the inclusion of FC when comparing the CT18 and CT18+BHPS3 predictions. The (nominal) NNPDF4.0 predictions give a slightly larger result than CT18, with a smaller PDF uncertainty. Upon incorporating the LHCb data (along with the EMC F c 2 measurements), the NNPDF4.0 error band slightly shrinks, with the central value largely unchanged. As pointed out in the hopscotch scan study of Ref. [31], the NNPDF PDF uncertainty can be undersampled. Hence, we also compare to a prediction which includes an additional contribution to the NNPDF4.0 uncertainty from the hopscotch-sampled PDF solutions. As expected, this enlarges the PDF uncertainty and makes it closer to that of CT18 BHPS3 prediction.
As a final remark, we observe that the NLO and NLO+PS calculations presented here are not yet sufficient for the extraction of the FC from the LHC data. For the fiducial central region of CMS [44], Ref.
[4] examined multi-particle final-state effects using the matrix-element plus parton shower merging (MEPS) approach. It found that the sensitivity to FC could be diluted due to additional contributions from Z + non-c partonic subprocesses involving g → cc splitting in the final state. These dilution effects can be even more severe in the forward region measured by the LHCb experiment, as a result of the larger showering effects expected for the associated rapidities. Indeed, in Fig. 5 of [20], the two NLO+PS predictions based on Pythia 8 and Herwig 7 bracket the NNLO predictions, and both have twice as large uncertainties as at NNLO. Correctly combining the final-state gluon splitting (g → cc) contribution, determined via parton showers, with the hard part of the NNLO calculation is expected to be challenging. Furthermore, multiparton interactions are present in the actual LHCb measurement and modify the NLO+PS predictions in a pattern which can mimic and potentially complicate the extraction of the FC signature (Appendix B in [20]).
Correlation between PDFs and LHCb 13 TeV Z + c, Z + jet cross sections In Fig. A.3, we plot the correlation angles between the 13 TeV LHCb Z + c and Z + j cross-section ratio, R c j = σ(Zc)/σ(Zj), and the NNLO PDFs of CT18 [25], MSHT20 [28], and NNPDF4.0 [43]. The correlation angles for Hessian error sets and Monte-Carlo replicas are defined in Ref. [45]. As we see, Z + c production is strongly correlated with the charm PDF, which is reflected in all three groups. In the CT18 and MSHT20 global analyses, the charm is perturbatively generated through gluon splitting, g → cc. As a consequence, correlations with the charm PDF closely follow those of the gluon. However, this pattern does not hold for NNPDF, since the freely-fitted charm PDF they include by default is no longer directly dependent on the gluon distribution. Meanwhile, we also find sizable correlations between the Z + c data and the light-quark sea PDFs in CT18 and MSHT20, which originates from the momentum sum rule. Upon closer inspection, we see that the strange PDF correlations in CT18 roughly follow those ofū,d, whereas the MSHT s-PDF shows less correlation, presumably due to the additional parameters introduced for nucleon strangeness in the MSHT20 fit [28]. Compared to Z + c, we see in the second row of Fig. A.3 that the Z + j cross section is most strongly (anti-)correlated with the gluon PDF, particularly in the low-x region. We also point out that the correlations with the sea-quark PDFs exhibit a sharp turnover near x ≳ 0.8 for both CT18 and MSHT20; this feature is attributable to a sign-change in the extrapolated region.
In the third row of Fig. A.3, we plot the usual PDF correlations with the measured cross-section ratio, R c j = σ(Zc)/σ(Zj); complementary to this, the fourth and final row gives the analogous correlations between the LHCb R c j ratio data and the strange-and charm-suppression ratios, R s(c) ≡ [s(c) +s(c)]/[ū +d]. The behavior of the R c j correlations is mainly induced by the Z + c absolute cross section, as shown in the corresponding panels of the first row. ratios (third and fourth rows). The three curves of the same color correspond to the three LHCb rapidity bins [21], where the respective Z rapidities increase from left to right.
In conclusion, we observe strong correlations between the LHCb σ(Zc)/σ(Zj) measurements and the charm PDF, which suggest this measurement may potentially shed some light on FC, as discussed in the main Letter. However, we also find that the correlations between the various PDF flavors depend on the various parametrizations and methodologies adopted by different PDF-fitting groups. To account for these differences, theory developments at NNLO, with parton shower effects included as discussed in the main Letter, will be essential for ensuring a clean interpretation of future Z + c results.
Alternative fits with non-vanishing strangeness and charm asymmetries In Fig. A.4, we compare a number of additional fit variants to illustrate the interplay between fitted strange and FC mentioned in the main text. Of these, CT18A assumes s =s at the Q 0 scale, while all others allow s ̸ =s. Moreover, both CT18A and CT18As2 [26] assume c =c = 0 at the Q 0 scale, while the other two fits (MBMC and MBME) involve non-vanishing strangeness and charm asymmetries. In particular, these plots demonstrate that the inclusion of FC according to the MBM with c ̸ =c leads to a modest suppression of s + (x) at very high x relative to the CT18As2 fit which allowed s ̸ =s. Meanwhile, the strange asymmetry is effectively unchanged by the incorporation of FC with a charm-anticharm asymmetry.

Additional plots of PDFs
For completeness, we include several more figures further illustrating the behavior of the CT18 FC PDFs. Figure A.5 compares the charm-suppression ratio R c (x, Q) ≡ c + (x, Q)/ ū(x, Q) +d(x, Q) in the CT18 NNLO and CT18 FC models at Q = 1.27 and 100 GeV. These plots suggest that, given sufficient magnitude, FC can enhance the total charm PDF relative to the light-quark sea in a fashion that persists to high scales in the high-x region. Figure A.6 illustrates DGLAP evolution of xc + (x, Q) in the BHPS3 model between the scales Q = 1.27 and 100 GeV. Our result for xc + (x, Q) at Q 0 = 1.27 GeV reflects in part the rate of backward DGLAP evolution from higher scales Q > Q 0 where the data are fitted. As such, backward DGLAP evolution tends to increase (suppress) xc + (x, Q) at x ≳ 0.1 (x ≲ 0.1). Given that IC is an NNLO effect, control of higherorder uncertainties beyond NNLO in DGLAP evolution is necessary for accurate determination of xc + (x, Q) at Q 0 .