The topological susceptibility in the large-N limit of SU(N) Yang-Mills theory

We compute the topological susceptibility of the SU(N) Yang-Mills theory in the large-N limit with a percent level accuracy. This is achieved by measuring the gradient-flow definition of the susceptibility at three values of the lattice spacing for N=3,4,5,6. Thanks to this coverage of parameter space, we can extrapolate the results to the large-N and continuum limits with confidence. Open boundary conditions are instrumental to make simulations feasible on the finer lattices at the larger N.


Introduction
The limit of large number of colors N has proved to be a fruitful tool in the study of SU(N ) Yang-Mills theories [1]. One example is the Witten-Veneziano formula explaining the large value of the mass of the η meson in the chiral limit [2,3] lim N →∞ where F π is the pion decay constant, N f the number of massless quark flavors, and q = 1 32π 2 µνρσ tr F µν F ρσ the topological charge density. This formula can be given a precise meaning in quantum field theory by properly defining the topological susceptibility χ in QCD and in the Yang-Mills theory [4][5][6][7]. The value of χ YM found in SU(3) Yang-Mills theory [8] is large enough to solve the U(1) A problem in QCD, a fact which makes it extremely interesting to study its value in the large-N limit.
Exploratory computations with cooling techniques at large N have a long tradition on the lattice [9][10][11], with quoted errors for the topological susceptibility at the 10% level. These results, however, reflect the short-comings of the techniques available at the time. In particular, a theoretically sound definition of the topological susceptibility with a welldefined and universal continuum limit had not been used. Only Ref. [12] opted for the theoretically clean but expensive definition via the index of a chiral Dirac operator, and was therefore limited to a very coarse lattice spacing and small statistics.
The second problem affecting all simulations concerned with topological quantities is the quickly freezing topological charge as the continuum limit is approached. At large values of N this makes it exceedingly hard to perform reliable simulations at small lattice spacings, since the number of updates needed rises dramatically with the inverse lattice spacing [10,13]. This comes on top of the increase of the cost of the updates growing with N 3 , such that it cannot be overcome by a brute force approach.
Taking advantage of the conceptual, algorithmic and technical developments of the last decade, we are in the position to improve significantly over these results. The exceptional slowing down of the topological modes can be avoided by using open boundary conditions in time [14]. With the introduction of the gradient flow, a theoretically clean and numerically cheap definition of the topological charge has become available [15,16]. In the continuum limit the corresponding topological susceptibility satisfies the singlet chiral Ward identities when fermions are included, and is the proper quantity to be inserted in the Witten-Veneziano formula [17].
The aim of this Letter is to compute the topological susceptibility in the large-N and continuum limits with percent accuracy. We measure χ YM for the groups SU(4), SU(5) and SU (6), and combine the results with previous ones for SU(3) [17]. Since leading corrections are expected to be O(N −2 ), this gives us a factor of four in their size. For each group the three lattice spacings simulated range from 0.096 fm to 0.065 fm with leading O(a 2 ) discretization effects decreasing by more than a factor 2 in size. This coverage of parameter space allows for a robust extrapolation of the results to the large-N and continuum limits.
This Letter starts with giving the continuum definitions of the observables in Section 2 followed by the details of the lattice setup in Section 3. The extrapolations to the continuum and large-N limit, giving the final results, are presented in Section 4 before some concluding remarks.

Observables
The Yang-Mills gradient flow has proved to be a very versatile tool to define a variety of observables with a smooth continuum limit [15,18]. It evolves, in the continuum, the gauge field B µ as a function of the flow time t ≥ 0 solving the initial value problem [15] and thus providing a Gaussian smoothing of the gauge fields with a radius √ 8t. We are interested in the energy density e t and the topological charge density q t at flow time t, which are defined as (2. 3) The power of the flow resides in the fact that at t > 0 operators made up of evolved fields, such as e t (x) and q t (x), are finite as they stand once inserted in correlation functions, i.e. no ultraviolet renormalization is required. Moreover, short-distance singularities cannot arise, and integrated correlators are well defined. These properties carry over to the discretized theory, where (integrated) correlators have a finite and universal continuum limit as they stand.
Thanks to the topological nature of q t , continuous deformations of the gauge field induced by the gradient flow do not affect the cumulants of the topological charge, which, in the continuum theory, are constant along the flow [15,17].

Definition of the reference scale t 0
In order to relate results in theories with different N , we need to define a reference scale in terms of which the observables are expressed. While different choices are logically possible, it is desirable to choose a quantity which is a (non-zero) constant at leading order in 1/N , and that can be computed with high numerical precision. We opt for generalizing t 0 proposed for N = 3 in Ref. [15] to arbitrary values of N , by requiring such that the right hand side attains the canonical value of 0.3 for SU(3). At small t, perturbation theory gives where λ t (q) = N g 2 (q) at the scale q = (8t) − 1/2 is the renormalized 't Hooft coupling, and c 1 = 1 16π 2 ( 11 3 γ E + 52 9 − 3 ln 3). The sub-leading term on the r.h.s. of Eq. (2.4) has been included following the indication of the perturbative expression.
Since SU(N ) Yang-Mills theory is not realized in Nature, any conversion of this result to physical units is a matter of convention. For the sake of clarity in the presentation,  Table 1: Parameters of the simulation. For each of the three gauge groups SU(N ) we give the inverse coupling β, the inverse of the 't Hooft coupling λ 0 = g 2 0 N to four significant digits, the dimensions of the lattice, the approximate lattice spacing using √ t 0 = 0.166 fm followed by the number of measurements and their separation in Cabibbo-Marinari updates of the lattice.
however, it is useful to assign a physical value to t 0 , which we choose to be √ t 0 = 0.166 fm for all values of N . This is motivated by the fact that in the SU(3) theory √ 8t 0 /r 0 = 0.941(7) [17], together with a value of the Sommer scale r 0 = 0.5 fm [19]. We will use this value of t 0 to express the lattice sizes and lattice spacings in physical units, but not to convert the final results, which instead will be expressed always in units of t 0 .
For completeness, it is useful to remember that in the SU(3) theory √ t 0 Λ MS = 0.200 (16) [20], where Λ MS is the lambda parameter of the theory. It would be desirable in the future to compute this quantity at higher N , and eventually take the N → ∞ limit.

Lattice details
The standard discretization of SU(N ) Yang-Mills theory on four-dimensional lattices of size T ×L 3 and lattice spacing a is used throughout this study. We use the Wilson plaquette action where U P is the ordered product of links around the plaquette P , λ 0 is the bare 't Hooft coupling, and w P = 1 everywhere except for the space-like plaquettes on the time slices 0 and T − a where w P = 1/2 [21]. This because we opted for open boundary conditions in time as implemented in Ref. [14], while spatial directions are periodic. The parameters of the simulation are collected in Table 1: for each of the three gauge groups SU(4), SU(5) and SU (6), three values of β are chosen such as to give approximately the same t 0 /a 2 . Using √ t 0 = 0.166 fm, they correspond to lattice spacings of approximately 0.096, 0.078 and 0.065 fm. The size of the boxes have been scaled such that L ≈ 1.5 fm, while the temporal extent is chosen to be T = 4L, so that a sufficiently large bulk region with negligible boundary effects is available for the measurements.

Wilson flow observables
We employ the standard discretization of the Wilson flow. It is integrated with the thirdorder Runge-Kutta integrator defined in [15] with an integration step size such that the integration error is well below the statistical accuracy of the observables. The two primary observables e t (x) and q t (x) are measured with a 0.04a 2 resolution in the flow time t and interpolated quadratically from the neighboring points to get the observables at arbitrary values of t.
Following again Ref. [15], the discretized e t (x) and q t (x) are defined through the standard "clover" field strength tensor and immediately summed over the spatial directions Because of the open boundary conditions, time translation invariance is broken and some care must be taken when averaging over the x 0 coordinate. A plateau range needs to be determined, where boundary effects can be neglected. To this end, for each observable we first perform a fit to the symmetrized data using the contribution of one excited state f (x 0 ) = A + Be −x 0 m in a region where this ansatz describes the data well.
With this result, we determine the minimal distance of the plateau fit from the boundary requiring that |f (d) − A| < σ/4, with σ being the average error of the measurement for x 0 > d. Using this criterion, the choice of d = 9.5 √ t 0 guarantees that boundary effects inē t (x 0 ) at t = t 0 are negligible with our statistics, and therefore we define

Topological susceptibility
For the topological susceptibility, we use the approach of Ref. [22]. The topological charge correlator is to be averaged over the bulk region given by a minimal distance d from the boundariesC Again we determine d such that for all values of ∆ boundary effects are negligible. Using the same strategy as for the energy density above, d = 7.5 √ t 0 turns out to be a conservative choice for all ensembles.
An estimator of the topological susceptibility is then obtained by truncating the sum over ∆ with a cut-off r where r has to be chosen such that the contribution of the neglected tail is insignificant compared to the statistical accuracy of the result. Such an r can always be found, because the correlator converges exponentially to zero for large separations ∆. Unfortunately, the combination of the smoothing by the gradient flow and the numerical errors obscure this behavior in the actual data.
Due to the smoothing, the correlation functionC t (∆) is positive for small values of ∆ and would be expected to turn negative before exponentially converging to zero for ∆ √ 8t 0 . We cannot resolve this latter feature due to the numerical uncertainties of our data, the correlator being zero within errors typically from ∆ = 5 √ t 0 on. In order to get a better handle on the contribution of the tail, we use high precision data in SU(3) [23]. Assuming that the relative contribution of the tail does not change drastically with N , and given the accuracy of our data, cutting the summation over ∆ at r = 7 √ t 0 is a conservative choice, leading to a negligible systematic error.

Finite volume
By their nature, lattice simulations are done in a finite volume, which can distort the results. For a large enough lattice dimension, these systematic effects are exponentially suppressed, but we need to verify that they are negligible given the target accuracy.
The lattices employed in this study are slightly larger than the ones used for SU(3) in Ref. [17]. Despite significantly smaller statistical errors, no significant finite size effects could be detected in the SU(3) study. In order to avoid relying only on the independence of these finite volume effects on N , we also generated lattices with L = 1.1 fm and 2.3 fm for SU(4) and SU(5) at the smallest values of β. These lattices bracket the L = 1.5 fm used in our analysis. With a numerical accuracy matching our target, no significant differences between the three sizes are found, such that we conclude that also this systematics is under control.

Autocorrelations
Simulations like the one presented here are known to be challenging due to a rapid rise of the autocorrelation times τ int , in particular of topological observables. Numerical evidence suggests they increase with a very high power or even exponentially with 1/a and N when periodic boundary conditions are implemented [10].
In our study, the gauge field is updated with the Cabibbo-Marinari scheme [24]: one update consists of a heat bath sweep of the full lattice followed by n ov ∝ a −1 overrelaxation sweeps. Both the heat bath and the overrelaxation sweeps update all the N (N − 1)/2 SU(2) subgroups of a given SU(N ) link. The number of these updates between measurements is given in Table 1 and chosen such that for all our observables autocorrelations are hardly detectable. We take them into account in our analysis.
To study the effect of the open boundaries, we have computed τ int for the coarser lattices A(4) 1 , A(5) 1 and A(6) 1 with dedicated runs in the presence of periodic and open boundaries, putting fewer updates between measurements for increased sensitivity.
In units of updates with n ov = L/(2a) , τ int of χ YM for the periodic lattices is 16(2), 54(6) and 187 (19) for N = 4, 5 and 6, respectively. With open boundaries the corresponding values are 12(1), 46(6) and 111 (10). For all values of N we observe a reduction in τ int for open compared to periodic boundary conditions. It is most significant at N = 6 and hardly statistically significant for the other values of N .
Going to finer lattices, this advantage is expected to be more pronounced. The exponential scaling observed in Ref. [10] with periodic boundary conditions would suggest a value of τ int one or two orders of magnitude larger than the one we observe with open boundaries. Therefore the finer lattice spacings would not have been feasible with our computer resources.

Results
The results for the observables for the three gauge groups are listed in Table 2. At finite lattice spacing, we reach accuracies on the percent level for χ YM and below the permille level for t 0 . The values for the dimensionless product t 2 0 χ YM are displayed in the left plot of Fig. 1, where we also add the SU(3) results from Ref. [17]. It is clear that, both, the effects of finite N and finite cut-off a are roughly at the level of our statistical errors. (7) 6.61 (6) 0.1603(4) A(4) 2 4.5207(8) 6.54(5) 0.1599(3) A(4) 3 6.4849(16) 6.68 (7) 0.1607 (4) A(5) 1 3.0636 (7) 6.47 (7) 0.1595(4) A(5) 2 4.6751(8) 6.73 (7) 0.1611(4) A(5) 3 6.8151(17) 6.62(8) 0.1604 (5) A(6) 1 3.0824(4) 6.57(6) 0.1601(4) A(6) 2 4.8239(9) 6.81(8) 0.1615(5) A(6) 3 6.9463(13) 6.80 (7) 0.1615(4) In order to extrapolate the raw data to the continuum and the N → ∞ limit, we use the functional form which takes into account the leading corrections dictated by the Symanzik and the large-N expansion. This is motivated by the observation that both corrections are small, given the statistical accuracy of our data. The fact that the N -dependence of the O(a 2 ) term can be neglected within our precision is further supported by the observation that discretization effects in the ratio χ t YM /χ t 0 YM , which can be captured to exceedingly high accuracy, turn out to be independent of N .
Our main result is obtained by fitting Eq. (4.1) to the two finer points of SU(4), SU(5) and SU(6) data together with the two finer data points for SU (3), where the latter is only used to constrain the coefficient c 2 . Discarding the coarser lattice points and the smallest N reduces the assumptions made on the scaling region of our results. This fit renders i.e. a 2% accuracy is reached. The fit quality is excellent with a χ 2 /dof = 0.94. In the continuum limit the fit gives t 2 0 χ YM (1/N, 0) = 6.68(12) · 10 −4 , 6.81(11) · 10 −4 and 6.87(11) · 10 −4 for N = 4, 5 and 6 respectively, see right plot of Fig. 1. To get a better handle on possible systematic effects of this result, many other fits to the data have been tried, all of them leading to similar results. Among them the most obvious modification is to include also the third finest point of the SU(3) data determining the discretization effects. This changes the result to t 2 0 χ YM (0, 0) = 7.13(10) · 10 −4 with χ 2 /dof = 1.1, compatible with the above number. If the three finest SU(3) points are globally fitted with the two finer points of the other groups, the results is t 2 0 χ YM (0, 0) = 7.09(7) · 10 −4 with an excellent value of χ 2 /dof = 1.0. A global fit of Eq. (4.1) to all data, including the three finer SU(3) ones, adding an a 2 /N 2 term to Eq. (4.1), gives t 2 0 χ YM (0, 0) = 7.02(13) · 10 −4 with a χ 2 /dof = 1.7. Performing the continuum limit group-by-group and applying the large-N extrapolation only in a second step also gives a compatible result.
From these analyses we conclude that the systematic effects coming from the continuum and large-N extrapolations are under control within the errors quoted.

Conclusions
This is the first investigation of the large-N behavior of the topological susceptibility in pure Yang-Mills theory using a theoretically sound definition of χ YM , and small lattice spacings which allow for control over the continuum limit. As a final result we quote for N → ∞ t 2 0 χ YM (0, 0) = 7.03(13) · 10 −4 . This result proves that the leading anomalous contribution to the η mass is large enough to solve the U(1) A problem in QCD. The bulk of the mass of the pseudoscalar singlet meson is generated by the anomaly through the Witten-Veneziano mechanism. The 1/N 2 corrections that we have found in t 2 0 χ YM (0, 0) are at most of the expected size (even a bit smaller), with no large prefactor in the expansion. This explains why the N = 3 result, t 2 0 χ YM = 6.67(7) · 10 −4 , in Ref. [17] is already large enough to explain the large value of the η mass in Nature. The difference with the N → ∞ value is barely visible within errors, despite their high accuracy.
In the Yang-Mills theory, it will be challenging to improve significantly on these results by brute force. Discretization effects and large-N effects are roughly of the same level. The much higher accuracy needed to resolve higher order effects in the large-N expansions will therefore require significantly smaller lattice spacings. These are still computationally very expensive, even with the open boundary conditions, which make those used in the present study possible.
The accuracy presented here is certainly sufficient for the completion of the proof of the Witten-Veneziano relation in Eq. (1.1). It will need to be matched by the one on the hadronic quantities entering the relation to be computed in the large-N limit of QCD.