Simultaneous reweighting of Transverse Momentum Dependent distributions

The Bayesian reweighting procedure is extended to the case of multiple independent extractions of transverse momentum dependent parton distributions (TMDs). By exploiting the data on transverse single spin asymmetries, $A_N$, for inclusive pion production in polarized proton-proton collisions measured at RHIC, we perform a simultaneous reweighting of the quark Sivers, transversity and Collins TMD functions extracted from semi-inclusive deep inelastic scattering (SIDIS) and $e^+ e^-$ annihilation into hadron pairs. The impact of the implementation of the Soffer bound, as well as the differences between older and newer $A_N$ data, are investigated. The agreement with $A_N$ data at large-$x_F$ values, a kinematical region complementary to those explored in SIDIS measurements, is enhanced, improving the knowledge of the polarized quark TMDs in the large-$x$ region.

By extending our previous study [69], in this paper we will attempt, for the first time, a simultaneous analysis of the available experimental data for spin asymmetries in SIDIS, e + e − scattering, and proton-proton collisions, assuming factorized expressions in terms of TMDs for all those processes.We will exploit two models for the TMD description of A N : the usual Generalized Parton Model (GPM) [35][36][37] which assumes that all TMDs are universal, and the Color Gauge Invariant GPM (CGI-GPM) [70][71][72][73] that takes into account the process dependence of TMDs due to the direction of gauge links in their corresponding operator definitions.The study will be performed by extending the Bayesian reweighting technique [74][75][76][77][78][79] to simultaneously reweight the results of new and updated global extractions of the transversity and Sivers distribution functions [80,81] and of the Collins fragmentation functions (FFs) [82], using presently available data on A N in proton-proton collisions.
The paper is organized as follows: in Section 2 we recall the TMD formalism, within the GPM and CGI-GPM, while in Section 3 we summarize the basics of the reweighting procedure.A suitable method to treat Monte Carlo sets is discussed in Sec-tion 4. The new independent fits to SIDIS and e + e − data are presented in Section 5, while the results of our analysis are discussed in Section 6. Conclusions and final remarks are gathered in Section 7.

Formalism
In this Section we summarize the formalism which will guide us throughout our phenomenological analysis.
Starting with the SIDIS processes, ℓp ↑ → ℓ ′ hX, the two azimuthal asymmetries we are interested in are related to the Sivers and the Collins effects, properly defined within a TMD factorization theorem.For the Sivers asymmetry we have [83] A sin(ϕ h −ϕ S ) where F UU ∼ f q 1 ⊗ D q 1 is the TMD unpolarized structure function, and F sin(ϕ h −ϕ S ) UT ∼ f ⊥q 1T ⊗ D q 1 [3,8,84] is the azimuthal modulation originating from the correlation between the nucleon spin and the intrinsic transverse momentum of the unpolarized quark.This effect is encoded in the Sivers function.
For the Collins asymmetry, which involves both transversity and Collins functions, one has where y is the fractional energy loss of the incident lepton, and [3,8,84] is the polarized structure function of the SIDIS cross section, given as a convolution of the TMD transversity distribution, h q 1 , and the Collins FF, H ⊥q 1 .To access this TMD fragmentation function, information from another complementary process, namely e + e − → h 1 h 2 X, is necessary.Here the transverse momentum imbalance of the two hadron, produced in opposite hemispheres, is measured.In this configuration, still within a TMD factorization scheme, a convolution of two Collins FFs appears via a cos(2ϕ 0 ) modulation [7]: Experimental measurements of this process were conducted at approximately √ s ≃ 10.6 GeV by the Belle [17] and BABAR [19] collaborations, as well as by the BESIII [28] collaboration, at a lower energy of √ s ≃ 3.65 GeV.For inclusive hadron production in pp collisions, the SSA is defined as where dσ ↑(↓) ≡ E h dσ ↑(↓) /d 3 P h stands for the single-polarized cross section, in which one of the initial-state protons is transversely polarized (↑(↓)) with respect to the production plane.
Here, we adopt the GPM, a phenomenological model where a factorized formulation in terms of TMDs is assumed as the starting point, and in which one includes spin and transverse momentum correlation effects.For completeness, we will also consider an extension of this approach, the CGI-GPM, where initial-and final-state interactions are properly included in a one-gluon-exchange approximation.Note that this model allows for the well-known process dependence of the Sivers function expected when comparing SIDIS and Drell-Yan processes [85].
As discussed in Refs.[36,37], in the region of moderate and forward rapidity in p ↑ p → hX processes only two effects survive the integration over the intrinsic transverse momenta and their relative azimuthal phases: the Sivers and Collins effects.In the first case, formally, also a contribution from gluons could appear.Nonetheless, in the same kinematical region, the gluon Sivers effect can safely be ignored, as shown in Refs.[73,86].
It is important to stress that in inclusive processes these two TMD effects cannot be separated.Therefore the numerator of A N will be d∆σ ≃ d∆σ Siv + d∆σ Col . ( Starting with the Sivers effect, within the CGI-GPM, the numerator of the asymmetry can be schematically written as [70] where f b/p (x b , k ⊥b ) is the TMD distribution for an unpolarized parton b inside the unpolarized proton.Moreover, H Inc ab→cd are the perturbatively calculable hard scattering functions.In particular, the H Inc ab→cd functions where a is a quark or an antiquark can be found in Ref. [70].The GPM result can be obtained from Eq. ( 6) by simply replacing H Inc ab→cd with the standard treelevel unpolarized partonic cross sections, H U ab→cd .Finally, the unpolarized cross section, dσ, appearing in the denominator of Eq. ( 4), can be obtained by replacing the Sivers function and its phase in the GPM expression with the corresponding unpolarized TMD-PDF for parton a.
Focusing now on the Collins contribution, we recall that all FFs (T-even as well as T-odd ones) are process independent, and are not modified by the direction of the gauge links [87,88].Thus, the Collins contribution to A N is assumed to be the same in the GPM and in the CGI-GPM, and reads where d∆σ a ↑ b→c ↑ d ≡ dσ a ↑ b→c ↑ d − dσ a ↑ b→c ↓ d is the transverse spin transfer at the partonic level.

Simultaneous reweighting
We now illustrate the method we have developed for a simultaneous Bayesian reweighting of functions initially extracted from fits to independent datasets.For simplicity, we will focus on the case of two functions, although this approach can be easily generalized to n independent extractions.
Let us consider two statistically independent functions, f (a) and g(b) depending, respectively, on n a -and n b -dimensional sets of parameters a = a 1 , . . ., a n a and b = {b 1 , . . ., b n b }.The value of these parameters is determined by performing two distinct fits to independent datasets E a and E b .For each of these fits, a χ2 , defined as 1,2 : is minimized and the best fit a 0 and b 0 , corresponding to the minima χ 2 0,a and χ 2 0,b , are determined.In the equations above, T i [a] ≡ T i ( f (a)) are the theoretical estimates corresponding to the experimental data points E a i , and C a i j is the covariance matrix for the fit a (and similarly for the fit b).The fit uncertainties can then be computed via a Hessian method or with a suitable Monte Carlo (MC) procedure.Using the latter method, the probability density functions π(a) and π(b) are reconstructed by generating N a set sets a k and N b set sets b l respectively.Notice that these distributions are statistically independent from each other.Then, expectation values and variances for any quantity O depending on one of the parameter sets (e.g.a) can be computed as Let us now suppose that a new set of data E (with an associated covariance matrix C) is measured, and that these data can be described by a linear combination of f (a) and g(b) (e.g. , where α and β are real constants).Then, we can compute the χ 2 corresponding to these new data as Since f and g come from statistically independent fits, the uncertainty bands for the theoretical predictions T i [a, b] have to be built by taking all possible (N a set × N b set ) combinations of the MC parameter sets a k and b l .Thus, the χ 2 on the new data will depend on the k-th and l-th MC sets: leading to (N a set × N b set ) values of χ 2 .By using Bayes theorem, we can then evaluate the impact of these new data on our prior distributions π(a) and π(b).Since 1 If (e.g. for the fit a) only uncorrelated uncertainties σ a i are given, the new these distributions are a priori independent, we can build a factorized prior π(a, b) = π(a)π(b) and apply Bayes theorem to compute the posterior densities: where L(E|a, b) is the likelihood, and Z = P(E) is the evidence, that ensures a normalized posterior density.Various choices for the likelihood have been discussed in the literature [74,75].Here we adopt the likelihood definition as obtained by taking L(E|a, b) dE as the probability to find the new data confined in a differential volume dE around E. Following Ref. [89], we define the weights w kl as where ∆χ 2 is the tolerance at a given confidence level (CL) for n a + n b parameters.Notice that the weights coincide with those defined in the original work by Giele and Keller [74], with rescaled exponent: χ 2 kl → χ 2 kl /∆χ 2 .We will use a value of ∆χ 2 defined according to Wilks' theorem [90].For a (1 − α) CL, we have where X 2 D is a chi-squared probability density for D degrees of freedom (i.e. the number of free parameters) and F 2 X D its associated cumulative function.
The weights are at the core of the reweighting procedure.Using Eq. ( 13), one can obtain the expectation value and variance of the reweighted quantity O as If this quantity depends only on f (a) (or g(b)), one has to evaluate the corresponding weights w k (or w l ) and use again the weighted sums in Eq. ( 15) with the new, updated weights w k (or w l ) for the corresponding O(a k ) (or O(b l )).
By doing so, one is able to evaluate the impact of the new data on the two independent prior distributions π(a) and π(b) and on any quantity depending on the parameter sets a and/or b.As the weights defined in Eq. ( 13) are normalized to one, w k and w l are automatically normalized to one too.
For a generic extraction with N set MC sets and weights w k , the rescaled version is equivalent to the Hessian reweighting [89], and allows us to retain a larger effective number of sets, N eff , defined as N eff is related to the number of sets carrying a non-negligible weight, reflecting the method's efficiency.If N eff ≪ N set , the method is considered no longer reliable, signaling that either the new data require a full refitting, or that they are inconsistent with the old ones [75].
In general, the introduction of new data may lead to correlations between fits that were originally statistically independent.For example, this scenario could arise with A N , which incorporates contributions from both Sivers and Collins effects.Such correlations are encoded in the (N a set × N b set ) combinations, and are duly considered when evaluating a reweighted quantity.

Compressing the MC sets
Following the procedure illustrated in Appendix A of Ref. [91], after the initial fitting stage we generate (e.g. for the fit a) N a set MC sets a k , each with a corresponding . Again, χ 2 0,a is the minimum found by the fit and ∆χ 2 a is the tolerance that depends on the number of parameters and is given at a certain CL.These sets allow us to reliably reconstruct the parameter distribution π(a), provided a sufficiently large number of sets is generated.For instance, in Refs.[69,79,92], the number of sets needed was up to O(10 6 ).In the case we consider here, involving two independent functions, this implies up to O(10 12 ) combinations in Eq. (11).In order to reduce the computational cost, we will use a compression procedure, which we describe in what follows.
Starting from the full sample of parameters a k , we select a random sample a In other words, if the sample a ′ k ′ renders a statistically equivalent distribution to that of a k , the corresponding distributions of any quantity will not differ significantly.Thus, this procedure would help in decreasing the number of combinations to be computed for the simultaneous reweighting.
The question now is how to check that the sampled distribution is statistically equivalent to the full one.As discussed in Section C.2 of Ref. [93], we adopt as an indicator the Welch's t-statistic, defined as which quantifies the difference of the arithmetical means of two samples (in our case, the full sample a k and the compressed random sample a ′ k ′ ) with unequal variances and sizes.One has to verify that |t| is such that the corresponding p-values are ≳ 0.1.Provided this condition holds, one can conclude that the sampled distribution and the original distribution are statistically equivalent.Notice that, at variance with Ref. [93], we do not sample in the observable space (i.e.A N in our case), but rather in the parameter space.Moreover, since the Welch's t-statistic implicitly assumes underlying Gaussian distributions we also verify the compatibility between the medians and asymmetric uncertainty intervals of the samples a ′ k ′ and a k .This allows us to correctly sample asymmetric distributions.To check how this compression algorithm works, we detail below an explicit example.

An explicit example
Let us consider the reweighting performed for the quark Sivers function using STAR A N jet data [69].Here, we will reperform the reweighting using the rescaled weights as defined in Eq. ( 13).
In the original work, N a set = 2 • 10 5 MC sets were generated and used to represent the uncertainty on the up-and down-quark Sivers functions.Here, we will select a random sample a ′ k ′ of the parameter sets a k , with N a ′ set ≪ N a set , checking that their corresponding distribution π(a ′ k ′ ) is statistically equivalent to π(a k ) and re-perform the reweighting using only the reduced sample of sets.By applying the compression algorithm presented above, we select only 1% of the initial sample, i.e.N a ′ set = 2   Fig. 1 shows the comparison between reweighted curves for the GPM (upper panels) and the CGI-GPM (lower panels), together with STAR data.As in Ref. [69], the central values are the median values, and the asymmetric uncertainty bands are at 2σ CL.The plot clearly shows that median values and uncertainties of the full sample (gray bands) are correctly and satisfactorily reproduced by the reduced sample of MC sets (hatched bands).We verified that the same happens for the unweighted curves, not shown here.For completeness, and to make this comparison more explicit, in Fig. 2 we show the reweighted uncertainties (normalized to their central value) for the Sivers first moment for up-and down-quark in the GPM (left panels) and CGI-GPM (right panels).Reweighted curves from the full sample are shown in gray with black dashed borders, while the ones for the reduced sample are shown in hatched colors.These results allow us to validate the compression procedure, that we will use in what follows for simultaneously reweighting the Sivers, transversity and Collins functions.

Priors from SIDIS and e + e − data
In this Section we briefly describe the new fits to SIDIS and e + e − data for the extraction of the Sivers, transversity and Collins functions.These will represent the priors for the simultaneous reweighting procedure.In all cases we employ updated SIDIS datasets, by including the most recent data from COM-PASS [94], HERMES [95] and JLab [20].
For the up-and down-quark Sivers functions, we adopt the parametrization of Ref. [92], that consists in factorized x and k ⊥ dependences (the latter being Gaussian-like and flavor independent): where q = u, d, M p is the proton mass, and where ∆ N f (1)  q/p ↑ (x) is the Sivers first k ⊥ -moment [83]: This model depends on five parameters: N u , N d , β u , β d , and ⟨k 2 ⊥ ⟩ S .Following Section 3.1 of Ref. [92], in the computation of F UU (see Eq. ( 1)), we consistently employ the same collinear PDFs and FFs as for the unpolarized TMDs, and the corresponding Gaussian widths extracted from HERMES and COM-PASS multiplicities [96].
For the transversity and Collins functions we make use of the parametrization of Refs.[79,100].The transversity function is parametrized as where the Gaussian width is assumed to be the same as for the unpolarized TMD-PDFs.As in Refs.[79,[100][101][102], the xdependent part of the TMD transversity is parametrized at the initial scale Q 2 0 in terms of the Soffer bound [103]: where with the same α and β parameters for the valence u v and d v transversity functions, for a total of four parameters for h q 1 .We emphasize that we do not enforce the automatic fulfillment of the Soffer bound (|N T q | ≤ 1), but we apply such a constraint a posteriori on the generated MC sets.As shown in Ref. [79], this choice allows us to avoid a bias in the fitting procedure and to properly estimate the uncertainty on the transversity functions.
The Collins function is parametrized as in Refs.[79,[100][101][102]: where q = fav, unf (favored/unfavored), m h is the produced hadron mass, and where M C is a free parameter with mass dimension.D h/q (z, p 2 ⊥ ) is again the unpolarized TMD fragmentation function, while the N C q (z) factors are given by for a total of eight free parameters for the h q 1 and H ⊥q 1 extraction.To build the Soffer bound, we adopt the DSSV set [104] for the collinear helicity distributions, g 1L (x).By using an appropriately modified version [105,106] of the HOPPET code [107], a transversity DGLAP kernel is employed to evolve h 1 (x) up to higher values of Q 2 .We set Q 2 0 = 0.81 GeV 2 as the input scale in Eq. ( 22), with α S (M Z ) ≃ 0.118.
For the collinear part of the Collins function, we also adopt a DGLAP evolution.In principle scale evolution should be taken into account in a more rigorous way.In this case, in particular, the appropriate formalism would be that of TMD factorization leading to TMD evolution equations.There is, however, a general consensus based on experimental evidences that scale evolution effects appear to be mild when it comes to azimuthal or single-spin asymmetries.In fact, asymmetries are defined as ratios of cross sections, where evolution and higher order effects tend to cancel out [108].Although our parametrization does not incorporate the complete features of TMD evolution, phenomenological results based on DGLAP evolution are compatible with full TMD evolution at higher logarithmic accuracy [108,109] (see also Fig. 14 of Ref. [110]) in the kinematic region we are interested in.
Note that the updated extractions turn out to be compatible with those of Refs.[79,92,96], although the new HER-MES data induce slightly larger TMD distributions, as already observed in Ref. [68].For the two independent extractions of TMDs from the Sivers and Collins asymmetries we generate O(10 5 ) sets using a Markov chain MC that employs a Metropolis-Hastings algorithm with an auto-regressive generating density [111]), and apply the compression algorithm discussed in Section 4 to select 2 • 10 3 MC sets for each extraction.This amounts to 4 • 10 6 combinations to be computed for each of the A N bins for the simultaneous reweighting.
As mentioned above, these updated analyses will represent the priors of the reweighting procedure, which will be described in the following Section.

Simultaneous reweighting with A N data for inclusive pion production
We start by illustrating the comparison between our predictions for the A N asymmetry in inclusive production of charged and neutral pions in the GPM and CGI-GPM with the experimental data used for the simultaneous reweighting.We consider as new evidence the preliminary data for A N measured by BRAHMS for π ± production at √ s = 200 GeV [45], the data from STAR for π 0 production at √ s = 200 GeV [43,46,49] and the latest STAR data for non-isolated π 0 production from Ref. [54] at √ s = 200 GeV and √ s = 500 GeV.
In our computation of A N , the transverse momentum of the final state pion, P T , is the hard scale of the process.For the (CGI-)GPM to be applicable, we then select only data points with P T > 1 GeV.
In what follows we adopt the median as central value, and the uncertainties are estimated by determining 2σ-confidence regions.Since we have a total of 13 free parameters (5 for f ⊥ 1T , 4 for h 1 and 4 for H ⊥ 1 ), according to Eq. ( 14), we get ∆χ 2 = 22.69 entering Eq. ( 13).
Hereafter, we present the unweighted predictions, based on the information from SIDIS and e + e − asymmetries only, in gray.The reweighted curves in the GPM and the CGI-GPM are shown respectively with red and green bands.Data points corresponding to P T < 1.5 GeV are depicted in gray, to highlight the kinematic regions where the perturbative approach may be questioned, especially as far as scale uncertainties are concerned (see e.g.Ref. [112] for a recent discussion on this issue).
Let us start from the results for charged pion production at BRAHMS.Before entering into our main discussion, few comments are in order.While it is well known that π 0 data are mostly sensitive to the relative contribution of up-and downquark TMD distributions (Sivers or transversity, depending on the effect considered), π ± data allow for a more direct flavor separation, giving a larger discriminating power to any phenomenological study.Therefore, in view of their relevance, we have included the charged pion datasets in our analysis, although yet unpublished and covering a limited kinematical range.Charged pion A N measurements at future facilities, like the EIC [22,23], the JLab 22 program [113], AMBER [27] and the proposed fixed-target program at the LHC [29], will indeed help in improving future TMD analyses.N data [45] in the GPM (left panels) and the CGI-GPM (right panels) are presented.Data points in gray correspond to P T < 1.5 GeV.
In Fig. 3 we show the unweighted and reweighted bands in the GPM and the CGI-GPM, compared to A N data from BRAHMS for π + (full bullet points) and π − (empty bullet  [43,46,49] in the GPM (upper panels) and the CGI-GPM (lower panels) are presented.Data points in gray correspond to P T < 1.5 GeV.
points).As expected, the reweighted curves present reduced uncertainties.The GPM describes these data better than the CGI-GPM, and the quality of the description increases if one does not consider the aforementioned data points with P T < 1.5 GeV.A somehow larger discrepancy between our computation and the data is seen for π − in the CGI-GPM.
The comparison with the older STAR data [43,46,49], collected without separating isolated and non-isolated pion samples, is shown in Fig. 4 for the GPM (upper panels) and the CGI-GPM (lower panels), in four different ranges of pseudorapidity.Notice that, in the two kinematical configurations with largest ⟨η⟩ (right plots in the two panels), the first two data points at lower x F values correspond to P T < 1.5 GeV.Both GPM and CGI-GPM estimates are in qualitative agreement with the data.The reweighted bands are able to describe the data at moderate x F , and more interestingly, they present a shape that better represents the steady increase of the asymmetry at large-x F values, where the agreement is enhanced with respect to older analyses [114,115].
We finally move to the latest STAR data [54] for non-isolated π 0 s.The kinematics of this dataset aligns more closely to that of our initial fits in SIDIS and e + e − , as it mainly involves pions with moderate momentum fractions z, excluding those with z ∼ 1 [54].Furthermore, the A N data for non-isolated π 0 differ from the corresponding overall π 0 inclusive data sample, and from older A N measurements in similar kinematical regions [43,46,49], as they do not show the usual pronounced steady increase at large x F (see also Figs. 6, 7 and 8 of Ref. [54] for a more exhaustive comparison).We will present the outcomes of the reweighting procedure, specifically addressing this STAR π 0 dataset, at the end of Section 6.2 (omitting figures for brevity).
In Fig. 5 we show our estimates and compare them against STAR results for non-isolated pions.Both GPM and CGI-GPM describe the data rather well within uncertainties at the two different energies of 200 and 500 GeV.As the reweighting includes information from all the aforementioned datasets, we observe a steady increase at large x F .However, when the reweighting is limited to the new STAR data alone, the shape of the reweighted bands appears flatter, mirroring the trend of the non-isolated pion data.In Section 6.2, we will also discuss the uncertainties affecting the TMDs and the corresponding N eff obtained from reweighting in this specific case.

Impact of A N data on Sivers, transversity and Collins functions
We now examine the role played by A N data in the extraction of the Sivers, transversity and Collins functions.As a general feature, we anticipate that these data impact mostly on the TMD-PDFs, namely the Sivers and the transversity functions.
We start by examining the Sivers case.In Fig. 6 we compare the unweighted and reweighted first moment of the quark Sivers functions, Eq.( 21), in the GPM (left panels) and CGI-GPM (right panels).As a general trend, the reweighted curves present reduced uncertainties.This reduction is more pronounced for the d-quark than for the u-quark Sivers function.The relative reduction of uncertainty of the reweighted Sivers first moments is about 20 − 30% for f ⊥u 1T and 40 − 90% for f ⊥d 1T .The effective number of sets (see Eq. ( 17)) surviving after reweighting is N eff = 547 (706) in the GPM (CGI-GPM) case.Fig. 7 shows that, in both approaches, the parameters for the u-quark Sivers function and the Gaussian Sivers width do not change much, while the GPM appears to favor a smaller overall absolute value of the normalization for the d-quark Sivers function, with a slower decrease at large x (smaller β d parameter), while the CGI seem to prefer a larger N d (in size), but with a faster decrease at large x.Considering the transversity and Collins case, we empha-size that, though the Collins contribution to A N is formally the same in the GPM and CGI-GPM, the results for the reweighted curves for h q 1 and H ⊥q 1 are slightly different.This reflects the different role of the Sivers contribution to A N in the two approaches.
In Fig. 8 we present the comparison between unweighted and reweighted u v and d v transversity functions, along with their corresponding Soffer bound, in the GPM (left panels) and CGI-GPM (right panels) at Q 2 = 4 GeV 2 .Note that, compared to the unweighted results, A N data favor on average a slightly smaller h u v 1 in the region x ≲ 0.3 and a slightly larger h u v 1 in the large-x region.The inclusion of A N data sizeably reduces the uncertainty band in the region of x ≳ 0.3.As for h d v 1 , we observe that a larger absolute value is preferred by the data on A N .This is induced by the A π 0 N data at large x F (which are related to large x values of the functions probed upon integration), that tend to favor sets yielding large asymmetries.The uncertainty reduction is about 20 − 30% at smaller values of x, extending up to 80 − 90% at larger x values for h u v 1 , both in the GPM and in the CGI-GPM, while for h d v 1 the reduction is 30 − 40% (60%) in the GPM (CGI-GPM) at small x and up to 80 − 90% at large x in both cases.Here, the effective number of sets after the reweighting is N eff = 285 (GPM) and N eff = 110 (CGI-GPM).This might be due to the poor description of π − data (see Fig. 3).In Fig. 9 we show the unweighted and reweighted Collins first moments in the two approaches, at Q 2 = 4 GeV 2 , a typical SIDIS scale.This quantity is defined as [116] H ⊥(1) q where the last line is obtained adopting the parametrization in Eq. (25).For these functions, the impact of A N data is less strong, but it allows for a reduction of the uncertainties (in both approaches) of about 5-10% for the favored Collins function and about 15% for the unfavored.The previously mentioned slower decrease of the transversity function for increasing values of x becomes evident when examining Fig. 10, which compares unweighted (hatched gray histograms) and reweighted distributions of the fit parameters in both the GPM (red histograms) and CGI-GPM (green histograms).As noted above, A N data mainly affect the transversity function.This is clearly represented in the four top panels of Fig. 10: reweighted values of N T u v tend to be smaller while the negative N T d v values are larger in size, approaching the limiting value of the Soffer bound (|N T q | ≤ 1).At large x, h q 1 tends to decrease following the Soffer bound rather closely (see the corresponding histogram of the β parameter distribution, where the reweighted average values move close to zero).On the other hand, although the Collins parameter distributions show less sizeable variations, the uncertainties of the reweighted Collins functions are still slightly reduced.These considerations point towards the observation that the dominant contribution to A N is given by the Collins effect.This is consistent with some recent results obtained within the twist-3 approach [67,68], where it was found that the main contribution to A N comes from the fragmentation mechanism.
Let us now briefly revisit the induced correlations that emerge from the simultaneous reweighting.We verified that the correlation matrix for the unweighted parameters factorizes into two submatrices (one for the f ⊥ 1T and one for the h q 1 and H ⊥q 1 parameters).Conversely, as expected, some correlations are introduced by the reweighting procedure, as mentioned in our discussion on simultaneous reweighting.Specifically, we observe weak correlations between the Sivers and transversity normalizations and the β parameters, both within the GPM and the CGI-GPM.
We finally provide a few remarks on the results we obtained for the reweighting procedure using solely the new STAR data for non-isolated π 0 s.Interestingly, we note a more modest reduction in uncertainties of the reweighted TMDs, particularly for the Sivers and Collins functions, while for h q 1 at higher x values the reduction is more sizeable.Furthermore, we observe a higher effective number of retained sets, specifically N eff = 1807 (1961) for the Sivers fit within GPM (CGI-GPM), and N eff = 1877 and 1514 for the transversity and Collins extractions within the GPM and CGI-GPM, respectively.These results appear to suggest a better compatibility between these latest STAR data and measurements from SIDIS and e + e − experiments.

Tensor charges
We conclude our analysis by reporting the corresponding values obtained for the nucleon tensor charges, defined as: ).These values are slightly larger as compared to older analyses (see Table 1 of Ref. [79], "using SB" case), due to the new HERMES data on proton, which render larger asymmetries and hence larger fitted functions, as previously observed, for instance, in Ref. [68].
A detailed comparison of our results and various estimates of the tensor charges from phenomenological analyses is presented in Fig. 11.We note that our current analysis and the majority of the previous studies of Refs.[79,102,[117][118][119][120][121][122] yield consistent values for g T , δu, and δd.This corroborates the consistency of different extractions of transversity within different approaches exploiting a variety of experimental data.

Conclusions and outlook
In this paper we have investigated the Bayesian reweighting procedure, extending it to the case of multiple, independent fits.For the first time, we have employed this technique to simultaneously reweight two independent extractions of quark TMD parton densities.Specifically, we have focused on the Sivers function and the TMD transversity and Collins functions.To this aim we have considered transverse single spin asymmetries data for inclusive pion production in polarized pp collisions at RHIC.
The simultaneous reweighting involves two statistically independent fits, each with a substantially large MC sample (O(10 5 ) sets) reflecting their corresponding uncertainty.Since these fits contribute additively to A N , computing all possible combinations of both MC sets would in principle be necessary.However, to expedite the numerical computation and make them more efficient, we have developed and extended a compression technique for MC set samples.This innovative approach enabled us to exploit only 1% of the sets in the reweighting pro-cess, without sacrificing any statistical information on parameter distributions.Such optimization not only enhances computational efficiency but also offers enough flexibility for further application to studies involving large sample parameter distributions.
Our phenomenological study, due to its peculiarities, has required an educated selection of the experimental data to be used for the reweighting procedure.The latest A π 0 N data from STAR Collaboration differ from previous measurements at RHIC, as they are provided separating non-isolated from isolated pions.The non-isolated dataset turned out to be more compatible with SIDIS and e + e − measurements, for which TMD factorization holds and from which we extract the TMDs, i.e. our priors.Note however that, in our comprehensive analysis, we have included all available A N data for charged and neutral pions, obtaining a satisfactory global description.
The adopted dataset is dominated by π 0 production data, while only a few data points from BRAHMS for charged pions are available.As A π ± N data are more sensitive to flavor separation, they could help in disentangling the issue of the predicted Sivers sign change.The description of these data seems to favor the GPM, where all TMDs are assumed to be universal and in which, contrary to the CGI-GPM, the expected Sivers sign change is not naturally recovered.The inclusion of data from future experiments, like COMPASS/AMBER [25][26][27], JLab [21], the EIC [22,23] and the fixed-target programs at Tevatron at SpinQuest [30], and the LHC [29] would indeed be crucial in shedding light on this fundamental issue.
Our estimates exhibit improved agreement with data compared to previous analyses, owing partly to the careful incorporation of the Soffer bound in the TMD transversity distribution fit.This theoretical constraint, applied a posteriori, renders larger asymmetries at large x F , thereby favoring the dominance of the Collins mechanism, as observed in recent analyses within the collinear twist-3 formalism [67,68].
Consistently with our previous work [69], the reweighted TMD distributions present reduced uncertainties at large x, confirming once again the complementarity of A N data with SIDIS measurements.The reduction in uncertainty is about 40% (90%) for the u-(d-)quark Sivers function, and about 80% to 90% for the u v and d v TMD transversity functions.The uncertainty reduction for the Collins functions is smaller (about 10% for the favored and 20% for the unfavored Collins TMDs), confirming that e + e − data provide the strongest constraints on this polarized TMD fragmentation function.
By performing the reweighting solely on non-isolated pion data, the reduction in uncertainties is smaller for the Sivers and Collins functions, and similar for the TMD transversity at large x.The retained N eff sets in the GPM (CGI-GPM) case is ∼ 90% (∼ 95%) for the Sivers fit and ∼ 90% (∼ 75%) for the transversity and Collins extraction.This confirms an enhanced compatibility of SIDIS and e + e − data with the new A N measurements from STAR for non-isolated pions.
This work is a natural extension of our previous study [69], and a proof of concept for upcoming TMD analyses.Future studies will explore different TMD parametrizations and incorporate new data from COMPASS/AMBER [26], JLab [21], and from the future Electron-Ion Collider [22,23].Additionally, planned investigations into inclusive jet or pion-in-jet production data in pp collisions, where Sivers and Collins effects can be accessed individually, will further contribute to a more comprehensive understanding of TMD dynamics, universality and factorization breaking effects.

Figure 1 :
Figure 1: Comparison between reweighted curves for the full (gray bands) and reduced (hatched bands) samples for the reweighting analysis of Ref. [69].

Figure 2 :
Figure2: Comparison between reweighted first moments of up-(upper panels) and down-quark (lower panels) Sivers functions, normalized to their central value, using full (gray bands) and reduced (hatched bands) samples for the reweighting analysis of Ref.[69].

Figure 3 :
Figure 3: Results for the simultaneous reweighting of the Sivers, transversity and Collins functions: unweighted and reweighted predictions for BRAHMS A π ±N data[45] in the GPM (left panels) and the CGI-GPM (right panels) are presented.Data points in gray correspond to P T < 1.5 GeV.

xF 3 . 3 Figure 4 :
Figure 4: Results for the simultaneous reweighting of the Sivers, transversity and Collins functions: reweighted curves for STAR data[43,46,49] in the GPM (upper panels) and the CGI-GPM (lower panels) are presented.Data points in gray correspond to P T < 1.5 GeV.

ANFigure 5 :
Figure5: Results for the simultaneous reweighting of the Sivers, transversity and Collins functions: unweighted and reweighted predictions of STAR A N data for non isolated neutral pions[54].Comparisons of the asymmetries computed in the GPM (left panels) and in the CGI-GPM (right panels) with experimental data at √ s = 200 GeV (upper panels) and √ s = 500 GeV (lower panels) are presented.Here all data points correspond to P T > 1.5 GeV.

1 xFigure 6 :
Figure 6: Comparison of unweighted and reweighted first moments of up-(upper panels) and down-quark (lower panels) Sivers functions in the GPM (left panels) and in the CGI-GPM (right panels).The relative reduction of uncertainty is shown in the bottom panels.

Figure 7 :
Figure 7: Comparison of unweighted (gray) and reweighted distributions of parameters for the quark Sivers functions in the GPM (red) and in the CGI-GPM (green).

Figure 8 :
Figure 8: Comparison of unweighted and reweighted u v and d v transversity functions in the GPM (left panels) and in the CGI-GPM (right panels).The corresponding Soffer bounds and the relative reduction of uncertainty (same color coding in the bottom panels) are also shown.

Figure 9 :Figure 10 :
Figure 9: Comparison of unweighted and reweighted favored (upper panels) and unfavored (lower panels) first moments of the Collins functions in the GPM (left panels) and in the CGI-GPM (right panels) at Q 2 = 4 GeV 2 .The relative reduction of uncertainties is shown in the bottom plots.

Figure 11 : 11 −
Figure 11: Comparison of our results for the u and d tensor charges (left panel) and the iso-vector combination g T (right panel) with phenomenological estimates from Refs.[79, 102, 117-122] at Q 2 = 4 GeV 2 .CGI-GPM (green) and GPM (red) reweighted central values almost coincide, having also similar uncertainties.