First measurement of the Sivers asymmetry for gluons from SIDIS data

The Sivers function describes the correlation between the transverse spin of a nucleon and the transverse motion of its partons. It was extracted from measurements of the azimuthal asymmetry of hadrons produced in semi-inclusive deep inelastic scattering of leptons off transversely polarised nucleon targets, and it turned out to be non-zero for quarks. In this letter the evaluation of the Sivers asymmetry for gluons in the same process is presented. The analysis method is based on a Monte Carlo simulation that includes three hard processes: photon-gluon fusion, QCD Compton scattering and leading-order virtual-photon absorption process. The Sivers asymmetries of the three processes are simultaneously extracted using the LEPTO event generator and a neural network approach. The method is applied to samples of events containing at least two hadrons with large transverse momentum from the COMPASS data taken with a 160 GeV/$c$ muon beam scattered off transversely polarised deuterons and protons. With a significance of more than two standard deviations a negative value is obtained for the gluon Sivers asymmetry. The result of a similar analysis for a Collins-like asymmetry for gluons is consistent with zero.


Introduction
An interesting and recently examined property of the quark distribution in a nucleon that is polarised transversely to its momentum is the fact that it is not left-right symmetric with respect to the plane defined by the directions of nucleon spin and momentum. This asymmetry of the distribution function is called the Sivers effect and was first suggested [1] as an explanation for the large left-right single transverse spin asymmetries observed for pions produced in the reaction p ↑ p → πX [2][3][4]. On the basis of T-invariance arguments the existence of such an asymmetric distribution, known as Sivers distribution function, was originally excluded [5]. Ten years later it was recognised however that it was indeed possible [6]. At that time it was also predicted that the Sivers function in semi-inclusive measurements of hadron production in DIS (SIDIS) and in the Drell-Yan process have opposite sign [7], a property referred to as "restricted universality". A few years later the Sivers effect was experimentally observed in SIDIS experiments on transversely polarised proton targets, first by the HERMES Collaboration [8] and then by the COMPASS Collaboration [9]. Using the first HERMES data and the early COMPASS data taken with a transversely polarised deuteron target [10], a combined analysis soon allowed for first extractions of the Sivers function for u and d-quarks [11][12][13]. More precise measurements of the Sivers effect were performed since by the HERMES [14] and COMPASS [15][16][17] Collaborations, and new measurements with a transversely polarised 3 He target were also carried out at JLab [18,19]. More information can be found in recent reviews [20][21][22].
At this point the question arises whether the gluon distribution in a transversely polarised nucleon is leftright symmetric or exhibits a Sivers effect similar to the quark distributions. Recently, the issue has been discussed repeatedly in the literature and the properties of the gluon Sivers distributions have been studied in great detail [23,24]. While it was found that a non-zero Sivers function implies motion of partons in the nucleon, presently the connection between the Sivers function and the parton orbital angular momentum in the nucleon can only be described in a model-dependent way [25]. The correspondence between the Sivers effect and the transverse motion of partons has been originally proposed by M. Burkardt [26][27][28]. Hence it is of great interest to know whether there exists a gluon Sivers effect or not.
Presently, little is known on the gluon Sivers function. An important theoretical constraint comes from the so-called Burkardt sum rule [29]. It states, based on the presence of QCD colour-gauge links, that the total transverse momentum of all partons inside a transversely polarised proton should vanish. Fits to the Sivers asymmetry using SIDIS data [13] almost fulfil, within uncertainties, the Burkardt sum rule, leaving little space for a gluon contribution. From the null result of the COMPASS experiment for the Sivers asymmetry of positive and negative hadrons produced on a transversely polarised deuteron target [10], together with additional theoretical considerations, Brodsky and Gardner [30] stated that the gluon contribution to the parton orbital angular momentum should be negligible, and consequently that the gluon Sivers effect should be small. Also, using the so-called transverse momentum dependent (TMD) generalised parton model and the most recent phenomenological information on the quark Sivers distributions coming from SIDIS data, interesting constraints on our knowledge of the gluon Sivers function were derived [31] from the recent precise data on the transverse single spin asymmetry A N (p ↑ p → π 0 X) measured at central rapidity by the PHENIX Collaboration at RHIC [32].
In DIS, the leading-order virtual-photon absorption process (LP) does not provide direct access to the gluon distribution since the virtual-photon does not couple to the gluon, so that higher-order processes have to be studied, i.e. QCD Compton scattering (QCDC) and Photon-Gluon Fusion (PGF). It is well known that in lepton-proton scattering one of the most promising processes to directly probe the gluon is open charm production, p ↑ → ccX. This is the channel that the COMPASS Collaboration has investigated at length in order to measure ∆g/g , the gluon polarisation in a longitudinally polarised nucleon [33]. Tagging the charm quark by identifying D-mesons in the final state has the advantage that in the lowest order of the strong coupling constant there are no other contributions to the cross section and one becomes essentially sensitive to the gluon distribution function. An alternative method to tag the gluon in DIS, which has the advantage of higher statistics, has also been developed and used by COMPASS, i.e. the production of high-p T hadrons [34,35]. In the LP, the hadron transverse momentum p T with respect to the virtual photon direction (in the frame where the nucleon momentum is parallel to this direction) originates from the intrinsic transverse momentum k T of the struck quark in the nucleon and its fragmentation, which both lead to a small transverse component. On the contrary, both the QCDC and PGF hard processes can provide hadrons with high transverse momentum. Therefore, tagging events with hadrons of high transverse momentum p T enhances the contribution of higher-order processes. Nevertheless, although in the high-p T sample the PGF fraction is enriched, in order to single out the contribution of the PGF process to the measured asymmetry the contributions from LP and QCDC have to be subtracted [36].
In this letter, the gluon Sivers effect is investigated using COMPASS data collected by scattering a 160 GeV/c muon beam off transversely polarised deuterons and protons. The experimental set-up and the data selection are described in Section 2. In Section 3 the measurement is described. The details of the analysis are given in Section 4. The procedure of neural network (NN) training with a Monte Carlo data sample is shown in Section 5. Section 6 contains the overview of the systematic studies. In Section 7 the results are presented. Summary and conclusions are given in Section 8.

Experimental set-up and data samples
The COMPASS experiment uses a fixed target set-up and a polarised muon beam delivered by the M2 beam line of the CERN SPS. The transversely polarised deuteron target used for the 2003 and 2004 data taking consisted of two oppositely polarised cylindrical cells situated along the beam, each 60 cm long with a 10 cm gap in between. In 2010 the transversely polarised proton target consisted of three cells: 30 cm, 60 cm and 30 cm long with the central cell oppositely polarised to the downstream and upstream cell and 5 cm gaps between the cells. During all data taking periods the polarisation was reversed once per week, in this way systematic effects due to acceptance are cancelled. For the deuteron runs the target was filled with 6 LiD. The 6 Li nucleus can be regarded as one quasi-free deuteron and a 4 He core. The average dilution factor f d , defined as the ratio of the DIS cross section on polarisable nucleons in the target to the cross section on all target nucleons, amounts to 0.36 and includes also electromagnetic radiative corrections. The average polarisation of the deuteron was 0.50. For the asymmetry measurements on the proton, NH 3 was used as a target. Its average dilution factor f p amounts to 0.15 and the proton polarisation to 0.80. In both cases, the naturally polarised muon beam of 160 GeV/c was used. The basic features of the COMPASS spectrometer, as described in Ref. [37], are the same for 2003-4 and 2010 data taking. Several upgrades were performed in 2005, the main one being the installation of a new target magnet, which allowed to increase the polar angle acceptance from 70 mrad to 180 mrad.
A crucial point of this analysis is the search for an observable that is strongly correlated with the gluon azimuthal angle φ g . In the LEPTO generator [38], gluons are accessed via PGF with a quark-antiquark pair in the final state and the fragmentation process is described by the Lund model [39]. As a result of MC studies, the best correlation is found between φ g and φ P , where the latter denotes the azimuthal angle of the vector sum P of the two hadron momenta. For the present analysis, two charged hadrons for each event are selected. If more than two charged hadrons are reconstructed in an event, only the hadron with the largest transverse momentum, p T 1 , and the one with the second-largest transverse momentum, p T 2 , are taken into account. In order to enhance the PGF fraction in the sample and at the same time the correlation between φ g and φ P , a further requirement is applied to the transverse momenta of the two hadrons: p T 1 > 0.7 GeV/c and p T 2 > 0.4 GeV/c. Moreover, the fractional energies of the two hadrons must fulfil the following conditions: z i > 0.1 (i = 1, 2) and z 1 + z 2 < 0.9, where the last requirement rejects events from diffractive vector meson production. Hadron pairs are selected with no charge constraint. With this choice the correlation coefficient is 0.54. The Sivers asymmetry is then obtained as the sine modulation in the Sivers angle, φ Xiv = φ P − φ S . Here φ S is the azimuthal angle of the nucleon spin vector.
The same kinematic data selection is used for both deuteron and proton data. The requirement on photon virtuality, Q 2 > 1 (GeV/c) 2 , selects events in the perturbative region and the one on the mass of the hadronic final state, W > 5 GeV/c 2 , removes the region of the exclusive nucleon resonance production. The Bjorken-x variable covers the range 0.003 < x B j < 0.7. For the fractional energy of the virtual photon, y, the limit y > 0.1 removes a region sensitive to experimental biases and the requirement y < 0.9 rejects events with large electromagnetic radiative corrections.

Sivers asymmetry in two hadron production
In order to extract the gluon Sivers asymmetry, µ + N → µ + 2h + X events are selected as described in Section 2. By labelling with the symbol ↑ the cross section associated to a target cell polarised upwards in the laboratory and by ↓ the cross section of a target cell polarised downwards in the laboratory, the Sivers asymmetry can be written as All cross sections are integrated over the two azimuthal angles φ S and φ R , where φ R is the azimuthal angle of the relative momentum of the two hadrons, R = P 1 − P 2 . The number of events in a φ Siv bin is given by Here f is the dilution factor, P T the target polarisation and α = anΦσ 0 an acceptance-dependent factor, where a is the total spectrometer acceptance, n the density of scattering centres, Φ the beam flux and σ 0 the spin-averaged part of the cross section. From here on, the Sivers asymmetry A 2h T ( x, φ Siv ) is factorised into the azimuth-independent amplitude A Siv ( x) and the modulation sin φ Siv . In order to extract the Sivers asymmetry of the gluon, the amplitude of the sin φ Siv modulation is extracted from data. The general expression for the cross section of SIDIS production with at least one hadron in the final state is well known [40]. It contains eight azimuthal modulations, which are functions of the single-hadron azimuthal angle and φ S . In the absence of correlations possibly introduced by experimental effects, they are all orthogonal, so that the Sivers asymmetry can either be extracted as the amplitude of the sin φ Siv modulation or one can perform a simultaneous fit of all eight amplitudes. For the case of heavy-quark pair and dijet production in lepton-nucleon collisions, all azimuthal asymmetries associated to the gluon distribution function have been recently worked out in Ref. 41. There, the Sivers asymmetry is defined as the amplitude of the sin (φ T − φ S ) modulation, where φ T is the azimuthal angle of the transverse-momentum vector of the quark-antiquark pair, q T . In our analysis, φ T is replaced by φ P , due to its correlation with the gluon azimuthal angle φ g , and the Sivers asymmetry is extracted taking into account only the sin (φ P − φ S ) modulation in the cross section. It has been verified that including in the cross section the same eight transverse-spin modulations as in SIDIS single-hadron production [40] and extracting simultaneously all asymmetries gives the same result on the gluon Sivers asymmetry.
In order to determine the Sivers asymmetry for gluons from two-hadron production in SIDIS, it is necessary to assume that the main contributors to muon-nucleon DIS are the three processes ( Fig. 1) as presented in Ref. [38]. This model is successful in describing the unpolarised data. At COMPASS kinematics, the leading process appears at zero-order QCD in the total DIS cross section and it is the dominant process, while the other two processes, photon-gluon fusion and QCD Compton, are first-order QCD processes and hence suppressed. However, their contribution can be enhanced by constraining the transverse momentum of the produced hadrons, as mentioned above.
Introducing the process fractions R j = σ j /σ ( j ∈ {PGF, QCDC, LP}), the amplitude of the Sivers asym- metry can be expressed in terms of the amplitudes of the three contributing processes: with σ = ∑ j σ j , ∆σ = ∑ j ∆σ j and f P T A Siv j sin φ Siv = ∆σ j /σ j . The determination of R j is done on an event-by-event basis by using the neural networks (NN) trained on Monte Carlo data as described in Section 5.

Asymmetry extraction using the methods of weights
The method adopted in the present analysis was already applied to extract the gluon polarisation from the longitudinal double-spin asymmetry in the SIDIS measurement of single-hadron production [36]. Both for the deuteron data (two target cells) and the proton data (three target cells), four target configurations can be introduced. In the case of the two-cell target: 1 -upstream, 2 -downstream, 3upstream , 4 -downstream . In the case of the three-cell target: 1 -(upstream+downstream), 2 -centre, 3 -(upstream +downstream ), 4 -centre . Here upstream , centre and downstream denote the cells after the polarisation reversal and configuration 1 has the polarisation pointing upwards in the laboratory frame. Decomposing the Sivers asymmetry into the asymmetries of the contributing processes (Eq. (3)) and introducing the Sivers which is specific for process j, one can rewrite Eq. (2): where t = 1, 2, 3, 4 denotes the target configuration.
In order to minimise statistical uncertainties for each process, a weighting factor is introduced. It is known [42] that the choice ω j = β j for the weight optimises the statistical uncertainty but variations of the target polarisation P T in time may introduce a bias to the final result. Therefore, the weighting factor ω j ≡ β j /P T is used instead. Each of the four equations (4) is weighted three times with ω j depending on the process j ∈ {PGF, QCDC, LP} and integrated over φ Siv and x, yielding twelve observed quantities q t j : whereα t j is the ω j -weighted acceptance-dependent factor. The quantities {β t i } ω j and {A Siv i } β t i ω j are weighted averages, where the weight factor is denoted in the subscript. 1 The acceptance factorsα t j cancel when for asymmetry extraction one uses the double ratio as the data taking was performed such thatα 1 jα 4 j /α 2 jα 3 j = 1. If this condition is not fulfilled, false asymmetries may occur. It is checked that this is not the case (see Section 6).
In the analysis, the quantities q j and {β t i } ω j are approximated as follows: The latter approximation holds for small observed raw asymmetries, i.e. ωA 1. In order to avoid numerical inconsistencies in Eq. (8) due to a zero-pole when integrating over the full range of φ Siv , 2 two bins in φ Siv ([0; π], [π; 2π]) are introduced. In the aforementioned three double ratios given in Eq. (6) only asymmetries are unknown. However, in order to solve the system of equations one needs to assume that the weighted asymmetry for a given process i is the same for the three different weights ω j β i , i.e.
This means that the values of ω j and A i must be uncorrelated. For example, since ω j is proportional to R j , which strongly depends on the hadron transverse momentum, one has to use a kinematic region where the asymmetries A i are expected to be independent of p T . Under these assumptions, the number of unknown weighted asymmetries is three, which exactly corresponds to the number of equations of type (6). These equations are solved by a χ 2 fit that includes simultaneously both bins in φ Siv .
Assuming that A i can be approximated by a linear function of x i and that x i is not correlated with ω j , results in This approximation allows to interpret the obtained results as an asymmetry value measured at the weighted value of x i . For each process, the weighted value of x i is obtained from MC using the rela- Aαβ ωd xdφ Siv αβ ωd xdφ Siv 2 Note that ω k j , which contains sin φ Siv , is integrated in the region 0 to 2π.
Here, N i is the number of events of type i in MC data. The assumption that the values of x i are not correlated with ω j , which allows us to consider only {x i } ω i β i , was verified using MC data. The details of the analysis are given in [43].

Monte Carlo optimisation and Neural Network training
The present analysis is very similar to the one used for the ∆g/g extraction from high-p T hadron pairs [35] and single hadrons [36]. For the NN training with custom input, output and target vector the package NetMaker [44] is used. The NN is trained with a Monte Carlo sample with process identification. As input vector the following six kinematic variables are chosen: The latter two are the longitudinal components of the hadron momenta. The trained neural network is applied to the data by taking the vector of the aforementioned six variables, and its output is interpreted as probabilities that the given event is a result of one of the three contributing processes. Hence the simulated distributions of these variables need to be in agreement with the corresponding distributions in the data samples.
Using the LEPTO generator (version 6.5) [38], two separate MC data samples were produced to simulate the deuteron and proton data. The generator is tuned to the COMPASS data sample obtained with the high-p T hadron-pair selection as described in Ref. [35]. The MSTW08 parameterisation of input PDFs [45] was chosen as it gives a good description of the F 2 structure function in the COMPASS kinematic range and is valid down to Q 2 = 1 (GeV/c) 2 . Electromagnetic radiative corrections [46] were applied as a weighting factor to the MC distributions shown in Figures 2 and 3 but not in the MC samples used in NN training. This difference was studied and it was estimated to be negligible.
The generated events were processed by COMGEANT, the COMPASS detector simulation program based on GEANT3. The MC samples for the proton and deuteron data differ in the target material and in the spectrometer set-up. The FLUKA package [47] is used in order to simulate secondary interactions. As the next step, the COMPASS reconstruction program CORAL was applied. The same data selection as for real events was used for MC data. Figures 2 and 3 show the comparison between experimental and MC data for the case of the deuteron and proton data, respectively.
The main goal of the NN parameterisation is the estimation of the process fractions R j . In the typical case of signal and background separation, the expected NN output would be set to one for the signal and zero for the background. The output value returned by the NN would then correspond to the fraction of signal events in the sample in the given phase space point of the input parameter vector. In the present analysis, the process fractions were estimated simultaneously. In order to have a closure relation on the process probabilities, the sum of them must add up to one, hence only two independent output variables from the NN are needed. The estimation of the process fractions R j from the NN output is accomplished by assigning to each event the probabilities P PGF NN , P QCDC NN and P LP NN . The distribution of the NN output after training is shown in Fig. 4 on the "Mandelstam representation", i.e. as points in an equilateral triangle with unit height. Points outside of the triangle refer to one estimator being negative, which is possible because in the training the estimators are not bound to be positive. The direct separation of the PGF process using this distribution is statistically less efficient than weighting each event by the three probabilities obtained from the NN output values. These probabilities are used as values of the process fractions (R PGF , R QCDC and R LP ) in the data analysis described in Section 4.
For the validation of the NN training, a statistically independent MC sample is used to check how the NN Here P NN is the fraction of the process given by the NN and P MC is the true fraction of each process from MC in a given P NN bin. Bottom panels: Difference P MC − P NN per bin.
works on a sample different from the one used for the training. In each bin of P NN (the value assigned to every MC event by the trained NN), the true fraction obtained from LEPTO based on the process ID, P MC , is calculated. The results for the NN trained with a MC sample for the proton data are presented in Fig. 5. Altogether, the agreement between P NN and P MC for all three processes is satisfactory. However, for the two last bins of PGF, the two last bins of QCDC and the last bin of LP the neural network output does not coincide with the true fraction of the given process. This discrepancy concerns a small part of the event sample and is included as a part of the MC-dependent systematic uncertainty. Because of the good agreement between MC and real data, it is assumed that the fractions in Eq. 3 can be taken from the trained NN, which means that on average R j = P j NN .

Systematic uncertainties
The main source of systematic uncertainties is the dependence of the final results on the Monte Carlo settings and tuning. In order to estimate this uncertainty, different MC settings were used in the process of neural network training. Different combinations of fragmentation parameters were used, the default LEPTO tuning or the COMPASS tuning for the high-p T selected sample. The event generation was done with and without the 'Parton Shower' [48]. Two PDF sets were used (MSTW08 or CTEQ5L [49]). Two different parameterisations of the longitudinal structure function F L are used, either from LEPTO or the R = σ L /σ T parameterisation of Ref. [50]. For secondary interactions, either the FLUKA or the GHEISHA [51] package were used. Figure 6 shows the results for the gluon Sivers asymmetry when using eight different MC productions of the deuteron and proton data. The final result is presented on the top. These two MC productions, using FLUKA and Parton Shower, yield the best comparison between experimental and MC data, which is shown in Fig. 2 and 3. The systematic uncertainty originating from different MC tunings is calculated as (A PGF max − A PGF min )/2. The systematic uncertainty due to false asymmetries was studied by extracting the asymmetries between the two parts of the same target cell. The results are found to be compatible with zero. Furthermore it was checked how a small artificial false Sivers asymmetry influences the final result. When a false asymmetry of 1% is introduced, for both proton and deuteron data the final result changes by 25% of the statistical error. No systematic uncertainty is assigned to account for false asymmetries.
The final state of the photon-gluon-fusion process is a quark-antiquark pair. Thus most of the hadron pairs produced from this subprocess should have opposite charge. Although a selection q 1 q 2 = −1 slightly increases the (φ g , φ P ) correlation, it also reduces the statistics. The results with and without this requirement are statistically consistent. The requirement of opposite charges of the two hadrons is hence not included in the data selection.
Radiative corrections were not included in the MC production that is used in the main analysis of this letter. In order to estimate the systematic uncertainty introduced by this omission, a separate MC sample is used that was produced for the 2006 COMPASS set-up including radiative corrections based on RADGEN [52]. The difference in the final value for the gluon Sivers asymmetry for the proton is only 0.018, which corresponds to 21% of the statistical uncertainty. A corresponding systematic uncertainty is assigned due to the fact that radiative corrections are not included in the MC simulations and hence in the NN training.
Our results are obtained in only one x g bin for A Siv PGF , one x C bin for A Siv QCDC and one x B j bin for A Siv LP . As the asymmetries are strongly correlated binning in x B j affects the values of A Siv PGF and A Siv QCDC , which are extracted in a single bin as before. The A Siv PGF result changes by 0.07 for deuteron data and 0.011 for proton data when two x B j bins are introduced, and these values are taken as an estimate of the related systematic uncertainty (see Table 1).
The asymmetries A Siv j of Eq. (4) were also extracted using the unbinned maximum likelihood method that, as expected, yields the same results of A PGF as the above described analysis. Concerning the orthogonality of different modulations of the cross section, it was checked by what amount the Sivers asymmetry changes, when also the other seven asymmetries were included in the fit (see above). The change in the final result of the PGF asymmetry is negligible for both deuteron and proton data.
The systematic uncertainties on target polarisation and dilution factor are multiplicative and estimated to be about 5% and 2% of the statistical uncertainty, respectively. The final systematic uncertainty is obtained by summing all components in quadrature. All above mentioned contributions and the final systematic uncertainty are listed in Table 1.

Results
The method presented in Section 4 with the use of trained neural networks was applied to the two data sets described in Section 2. The gluon Sivers asymmetry as extracted from lepton nucleon DIS, in which at least two high-p T hadrons are detected, is shown in Fig. 7 and presented in Table 2 together with the contribution of the two other hard processes, i.e. QCD Compton and leading process. The result of the analysis of the deuteron data is A Siv,d PGF = −0.14 ± 0.15(stat.) ± 0.10(syst.) measured at x g = 0.13. The proton result, A Siv,p PGF = −0.26 ± 0.09(stat.) ± 0.06(syst.) obtained at x g = 0.15, is consistent with the deuteron result within less than one standard deviation of the combined statistical uncertainty. The two results are expected to be consistent, as presumably the transverse motion of gluons is the same in neutron and proton. Combining the proton and deuteron results, the measured effect is negative, A Siv PGF = −0.23 ± 0.08(stat) ± 0.05(syst), which is away from zero by more than two standard deviations of the quadratically combined uncertainty. This result is particularly interesting in view of the gluon contribution to the proton spin. A non-zero gluon Sivers effect is a signature of gluon transverse motion in the proton [25]. The recent analysis of the PHENIX data [31] gives a gluon Sivers effect for protons, which is compatible with zero. The COMPASS result for the proton target is negative and more than two standard deviations below zero, but it should be noted that the two results are obtained for different centre of mass energy and x g values. Even more important, one has to recall that the existence of colour gauge links complicates the picture, as they lead to two different universal gluon Sivers functions, which in the different processes combine with process-dependent calculable factors [53]. As a result, the gluon Sivers function that appears in one process can be different from the one appearing in a different process, and assessment of compatibility requires a deeper theoretical analysis. For the asymmetry of the leading process, the high-p T sample of the COMPASS proton data has provided a positive value (see Fig. 7 right-bottom panel). It can be compared with the COMPASS results on the Sivers asymmetry for charged hadrons produced in SIDIS p → h ± X single-hadron production [17], which for negative hadrons was found to be about zero and for positive hadrons different from zero and positive, so that for the two-hadron final state a positive value may indeed be expected.
The same analysis method was also applied to extract the Collins-like asymmetry for charged hadrons, i.e. the cross section dependence on the sine of the Collins angle (φ P + φ S − π). To this purpose, the asymmetries A were determined for the same COMPASS highp T deuteron and proton data samples. The results are shown in Fig. 8. The amplitude of the Collins modulation for gluons is found to be consistent with zero, in agreement with the naive expectation that is based on the fact that there is no gluon transversity distribution [54]. Recently it was suggested that a transversity-like TMD gluon distribution h g 1 could generate a sin (φ S + φ T ) modulation in leptoproduction of two jets or heavy quarks [41]. In this case the results shown in Fig. 8 provide a bound to the size of h g 1 . The results given in the present letter can also be interpreted such that no false systematic asymmetry is introduced by the rather complex analysis method used, and that the result obtained for the gluon Sivers asymmetry, which is definitely different from zero, is strengthened. It should also be noted that the Collins-like asymmetry of the leading process for the proton is found to be consistent with zero for high-p T hadron pairs, in qualitative agreement with the measurement of the Collins asymmetry in singlehadron SIDIS measurement [55], where opposite values of about equal size were observed for positive

Summary and conclusions
The Sivers asymmetry for gluons is extracted from the measurement of high-p T hadron pairs in SIDIS at COMPASS off transversely polarised deuterons and protons. The analysis is very similar to the one already used by the COMPASS collaboration in order to measure ∆g/g , the gluon polarisation in a longitudinally polarised nucleon. The large kinematic acceptance and the high energy of the muon beam make the sample containing two high-p T hadrons sufficiently large for the present analysis, which is limited to a small part of the accessible phase space. The criteria applied to select hadron pairs allow to enhance the contribution of the photon-gluon fusion process with respect to the leading-order virtual-photon absorption process. The Sivers asymmetry was then obtained as the amplitude of the sine modulation in the Sivers angle, φ Siv = φ P − φ S .
In spite of the enrichment of the PGF fraction in the high-p T hadron pair sample, in order to single out the contribution of the PGF process to the asymmetry it is necessary to subtract the contributions from the other two processes, LP and QCDC. In this analysis, the fractions of the three processes were determined from MC algorithms, and the three corresponding asymmetries were extracted from the data using a NN technique. Therefore, the analysis requires a precise MC description of the data, so that these quantities can be calculated reliably. Since the results derived from a NN approach strongly depend on the Monte Carlo sample on which the network was trained, much effort was devoted to obtain a good description of the experimental data by MC simulations, and the analysis was repeated using eight different MCs and the (small) differences in the results were included in the systematic uncertainties.
Averaging results obtained from the deuteron and proton data, the measured gluon Sivers asymmetry turns out to be −0.23 ± 0.08(stat.) ± 0.05(syst.), which is away from zero by more than two standard deviations of the total experimental uncertainty. This result supports the existence of a transverse motion of gluons in a transversely polarised nucleon, although the quantification of this motion is model-dependent.
In addition, another result obtained in this work from the same data is the extraction of the Collins-like gluon asymmetry, i.e. the amplitude of the sine modulation of the Collins angle φ Col = φ P + φ S − π.
Recent developments have hypothesised a non-zero Collins-like gluon asymmetry that however is not related to transversity. Our result on the Collins-like asymmetry, which is obtained from the same hadronpair data that we used to extract the non-zero result on the gluon Sivers asymmetry, is found to be compatible with zero.