Evaluating topological charge density with symmetric multi-probing method

We evaluated the topological charge density of SU(3) gauge fields on lattice by calculating the trace of overlap Dirac matrix employing symmetric multi-probing(SMP) method with 3 modes. Since the topological charge $Q$ for a given lattice configuration must be an integer number, it's easy to estimate the systematic error (the deviation of $Q$ to nearest integer). The results showed high efficiency and accuracy in calculating the trace of the inverse of a large sparse matrix with locality by using SMP sources, compared with that using point sources. We also showed the correlation between the errors and probing scheme parameter $r_{\mathrm{min}}$ as well as lattice volume $N_{L}$ and lattice spacing $a$. It was found that the computing time of calculating the trace by employing SMP sources is less dependent on $N_{L}$ than that by using point sources. Therefore the SMP method is very suitable for calculations on large lattices.


introduction
Topological charge Q and density q (x) are important quantities to the investigation of the structure of quantum chromodynamics(QCD) vacuum, and the susceptibility of Q is related to η ′ mass by Witten-Veneziano relation [1,2]. Lattice QCD is one of the best non-perturbative framework to study QCD properties, in which space-time is discretized on lattice. Q on lattice can be calculated by gauge field tensor F µν [3]: where the trace is over color indexes, and ǫ µνρσ is a totally antisymmetric tensor. However, the topological charge Q calculated in this way is always not an integer unlike that in the continuum, unless multiplied by a renormalization factor or calculated on smoothed configurations whose renormalization factor tends to 1. Smoothing methods or other cooling-like methods will change the topological density, and also introduce ambiguous smoothing parameter like smoothing time n sm which is always set empirically. Another way to get q (x) is evaluating the trace of chiral Dirac fermion operator in Dirac and color space [4]: where the trace is over color and Dirac indexes, γ 5 is Dirac gamma matrix, and D chiral is a chiral Dirac fermion operator on lattice. In this definition, Q will be always an integer due to Atiya-Singer index theorem [5], but hard to be exactly calculated as evaluating the trace of D chiral is a very expensive task. As one of the chiral Dirac fermions, overlap fermion operator [6,7] is widely used in the studies of chiral properties of lattice QCD. Massless overlap Dirac operator D ov is constructed by a kernal operator, usually using Wilson Dirac operator D W as the kernal: where the hopping parameter κ in the kernal D W is only a free parameter not related to bare quark mass, and κ ∈ (κ c , 0.25) with κ c refer to critical hopping parameter in Wilson fermion. We took κ = 0.21 in this work. ǫ (γ 5 D W ) is the sign function here: in which Zolotarev series expansion [8] is introduced. In our work 14th-order Zolotarev expansion was applied [9].
To extract the trace of D ov , we need to calculate the trace of D −2 W , while D W is a large sparse matrix. Its rank is 12N L , where N L is the number of lattice sites, empirically in a range of 10 5 ∼ 10 8 depending on the lattice size. So the trace of D −2 W is too expensive to calculate exactly. Fortunately D ov is a local operator [10], and there are methods to approximately calculate the trace of it by making use of this property. Multi-probing method is one of these methods [11,12] that can be used to estimating the trace of the inverse of a large sparse matrix when the inverse exhibits locality. A. Stathopoulos et al. developed the hierarchical multi-probing method and tested it in lattice QCD Monte Carlo calculation [13]. E. Endress et al. used multi-probing method to tackle the noise-to-signal problems when calculating all-to-all propagator in lattice QCD [14,15] by applying the Greedy Multicoloring Algorithm to construct the multi-probing source vectors. Their results have showed the validity and efficiency of probing method.
In this work we employed symmetric multi-probing(SMP) method to evaluating the topological charge density by calculating tr (γ 5 D ov ), and showed its efficiency comparing to point source method. We will introduce the topological charge density as well as the trace calculation with point sources and SMP sources in next section. After that, calculation results will be presented and discussed. We will summarize our work in the last section.

Topological charge density and trace calculation with point source
Overlap Dirac operator D ov (x, y) αβ,ab on a lattice L(n x , n y , n t , n z ) is a large sparse matrix, where n µ denotes the number of sites in µ direction and µ = x, y, z, t, α β are Dirac indexes and a b are color indexes. The topological charge density q (x) can be obtained by calculating the trace of massless overlap Dirac operator: where the trace is over Dirac and color indexes, |ψ(x, α, a) is a normalized point source vector which has N L ×N d ×N c components where the lattice volume N L = n x n y n z n t , N d = 4, N c = 3. Only one specific (x, α, a) component takes nonzero value in the point source. Substituting Eq. (3) and Eq. (4) into Eq. (5), the most expensive step to calculate the trace of D ov is computing the multiplication: which is equivalent to solve the following linear equation: where This equation is expensive to solve for a large matrix M i , and the huge number of point sources corresponding to sites on the whole lattice makes things even worse. The computing time to solve Eq. (7) with Conjugated Gradient(CG) algorithm for one source is more than O (N L ) (in consideration of that M i is a large sparse matrix, and its conditional number grows as N L grows), and there are 12N L point sources. So the cost of evaluating Q = x q (x) by calculating the trace of D ov with point sources will be larger than O (N 2 L ), which is too expensive especially for large N L [16,17], although it's the exact measurement of topological charge density on lattice without any approximation.

Trace calculation with multi-probing source
We know from above that calculation of the trace of D ov with point source is a quite expensive task. Introduce a new source vector φ by adding two point sources: then Eq. (5) can be rewritten as: whereD ov = 1 2 γ 5 D ov , and the second term: are space-time off-diagonal elements of D ov . Considering the space-time locality of D ov , it has upper bound [10,16]: where C and γ are constants irrelevant to gauge field configurations, and |x − y| is defined as the distance between site x and site y. Note that γ 5 only exchanges the elements of D ov in Dirac space, soD ov also satisfies this inequation. 'off-diagonal' will refer to 'space-time off-diagonal' in the following unless otherwise stated. We checked this relation on several lattice configurations, as Fig. 1 shown. γ in the inequation (12) is around 1.076 by [16] where |x − y| is defined as Euclid distance, and around 0.529 as taxi driver distance. The result means that the off-diagonal elements of D ov are exponentially suppressed as |x − y| increases, and the inequation (12) holds when the distance is not too small (larger than 3) for two definitions of distance. We can also see that γ for Euclid distance is larger which means stronger locality, so we adopted this definition in our work. So the approximate topological charge density q (x) can be written as: q approx (y) = α,a ψ(y, α, a)D ov φ(x, y, α, a).
We can see that the approximate density of both sites can be easily calculated fromD ov φ(x, y, α, a), so only one linear equation like Eq. (7) need to be solved to obtain q (x) and q (y) as long as |x − y| is sufficiently large. This method is called as multi-probing method or probing method. We firstly mark all sites in the lattice with a color label, called coloring algorithm. The sites with the same color form a subset of the whole lattice sites set. Then one multi-probing source can be constructed by a subset as follow: choose a color and Dirac index, construct point sources on sites in the subset and add all these point source vectors.
In order to reduce the calculation cost, we would like to reduce the number of MP sources by combining more point sources in one MP source. However, the errors introduced by MP source will exponentially depend on the  distance between the sites in the subset, as Eq. (12) shown. We should include the sites that as far away as possible from each other in one MP source to control the systematic errors. Therefore appropriate coloring algorithm is important to optimize the balance of the systematic errors and calculation cost. We chose a symmetric coloring algorithm and Euclid distance definition in this work, which marks the sites with same color in each direction with fixed distance. There are other algorithms like GCA(Greedy coloring Algorithm) [12] who used taxi driver distance definition. It's convenient to choose the same gap r in each direction, then the minimal distance of each two sites in the subsets r min = r. We call this type of MP sources by Normal Mode, as we show some examples in left panel of Fig. 2 and Fig. 3.
Furthermore, we can divided a subset into two subsets by marking half of sites with a new color, while each site has different color with its neighbors, see examples in Fig. 2 and Fig. 3 on 2-D and 3-D lattices. In this Split Mode, r ′ min = √ 2r for the two new subsets. We can also combine two subsets into a new subset which include two sites with the space-time offset equals (r/2, r/2, r/2, r/2), then the minimal distance remains unchanged, that is r ′ min = r for the new subset. We call this mode by Combine Mode, while number of sites doubles and errors are increased due to more nearest neighbors are introduced. Note that this combined mode only works on 4-D lattices , and r should be even. Fig. 4 and Fig. 5 show examples of Combine Mode in 2D and 3D situations, while r min = r/ √ 2 and r min = √ 3r/2 respectively after combination.
Considering this on a lattice L(n x , n y , n z , n t ). Firstly choose a symmetric coloring scheme P (m x, m y , m z , m t , mode), which means m µ sites are marked with same color in µ direction of the lattice, and the mode label can be 0, 1, 2 which respectively means Normal, Split and Combine Mode. Then pick a site X(  denotes its coordinate on the lattice, we can construct a SMP source φ P in scheme P (m x, m y , m z , m t , mode): where S (X, P ) denotes the site subset that including all sites with the same color of site X by coloring scheme P (m x, m y , m z , m t , mode). The topological charge density on any site x ∈ S (X, P ) can be calculated by this combined source: the second term sums the corresponding off-diagonal components of D ov , which satisfy Eq. (12). Now the whole lattice sites can be divided into N SMP site sets, with m x m y m z m t sites in each set for mode = 0 or 1 2 m x m y m z m t sites for mode = 1, 2×m x m y m z m t sites for mode = 2. Each site on the lattice should belong to and only belong to one set in a specific scheme P . SMP sources are constructed from these symmetric site sets that only distinct from each other with some units of coordinate shift. In this case all SMP sources include the same number of sites for a specific P , which leads to less sources needed compared with Greedy Multicoloring Algorithm for the same minimal Euclid distance |x − y|. Summing Eq. (16) over the lattice volume, the total introduced systematic error of Q will be: colored in normal symmetric coloring scheme P (2, 2, 2, 0), and in the right panel they are combined into one note by red stars in combined mode scheme P (2, 2, 2, 2). Note that the minimal distance changes from 2 to √ 3.
where Eq. (12) is used and the sum over color and Dirac indexes results a factor of 12. In practice the errors are far smaller than its upper bound. Define the minimal distance r P min for a SMP scheme P , which is the minimal Euclid distance of any two sites in a site set S (X, P ) based on arbitrary seed site X: we believe that |x − y| = r P min terms will dominate in Eq. (17). For the case of a spatial symmetric lattice L (n s , n s , n s , n t ), the SMP scheme can be chosen as P ns r , ns r , ns r , nt r , mode so that Therefore the choice of r min should be one of the common factors of n s and n t , or multiplied by √ 2 when mode = 1, such as r min = 2 in the left panel of Fig. 2, and r min = 2 √ 2 in the right panel. We chose appropriate r min and SMP schemes for different lattices in this way.

Lattice setup and results
We calculated topological charge density and the total topological charge for pure gauge lattice configurations with Iwasaki gauge action [18]: where β is the bare coupling constant, is the real part of the trace over all Wilson loop W C with closed contour C, while subscripts p and r refer to plaquette and 2 × 1 rectangle respectively. c 1 = 3.648, r β = −0.09073465 in Iwasaki action which is an O (a 2 ) improved gauge field action.
Firstly we calculated topological charge density exactly with point sources on a 16 4 lattice and compared it to that using SMP sources with r min = 4 √ 2. The results of topological charge density in X − Y plane with t = 1 and z = 1 are shown in Fig. 6. It's obvious that there are quite similar peaks and valleys between the results by using point sources and SMP sources, which indicates the topological structure of the QCD vacuum is well reserved in the SMP method.
A series of gauge field configurations were then generated with different lattice spacing a and different lattice volume. Topological charge density for each configuration was evaluated by SMP sources in different schemes, see Table 1, where the SMP scheme P (m, m, m, m, mode) is written as (m, mode) for short, and the lattice L (n, n, n, n) is written as n 4 . The number of point sources N PT and SMP sources N SMP are presented in Table 2, which are also the number of linear Eq. (7) that we had to solve by CG algorithm. It's shown that the cost of calculating the topological charge density was sharply reduced for smaller r min and larger lattices.    Since Q P (topological charge calculated in scheme P ) should approach an integer and the deviation of Q P to its exact value will be always smaller than 0.1 as long as r min is large enough, we extracted the topological charge Q and absolute value of systematic error |Q P − Q| from the series of results with different r min for each configuration, like an example in Fig. 7. The average errors are shown in Fig. 8, which decrease sharply as r min increases. In general, the average absolute value of systematic error of Q P and its variance became small enough when r min ≥ 4. One may notices that the error with r min = 3 √ 2 on 24 4 lattice was larger than that of r min = 4, which may come from more nearest neighbor terms (satisfying |x − y| = r min in Eq. (17)) in mode = 1 case than mode = 0. There are 24 nearest neighbors for each site in the former mode while only 8 nearest neighbors in the latter mode.
Considering the same lattice with different lattice spacing a, Fig. 9 shows that finer lattice spacing a leads to smaller errors. It could also be concluded from Fig. 10, which compares two lattices with the same physical volume. We finally present the errors versus efficiency in Fig. 11, which shows that for larger lattices, the calculation of trace by SMP method is more efficient to achieve sufficient accuracy.
All results of |Q P − Q| are presented in Table 3. We can see for example, when N SMP = 3072 the system error increases slowly as the lattice becomes larger. This could be understood since the number of non-zero off-diagonal elements in SMP sources is proportional to N L , while N SMP = 3072 is fixed. We can expect to apply this method on larger lattice with fixed N SMP , while N PT is proportional to the lattice volumes N L .
There is another method to calculate topological charge by calculating a few low modes of chiral Dirac operator,  counting the number of zero modes and applying the index theorem. However, to estimate the topological charge density q (x) one must calculate more low modes by applying low mode expansion [19]. We roughly compared the cost of SMP method and the calculation of low modes of the overlap operator on 12 4 lattices. We considered N SMP = 3072 with scheme P (3, 3, 3, 3, 0), r min = 4 in SMP method, and calculated the low modes of overlap operator by Arnordi algorithm with the number of low modes N mode = 100, 200. The results showed that when N mode = 100 , low modes calculation costs about 18% less than SMP method, while N mode = 200 it costs 28% more than SMP method. However, since the number of zero modes is always an integer, the systematic errors of q (x) from low mode expansion could hardly be estimated, while in SMP method the systematic errors of Q and q (x) are related. The comparison of q (x) from SMP method and low mode expansion will be a interesting topic in our further studies.

Conclusion
Topological charge Q and density q (x) were calculated from the trace of overlap Dirac operator employing symmetric multi-probing sources on pure gauge configurations. The systematic error were suppressed when r min increased as expected, due to the locality of overlap Dirac operator. Thus the estimated topological charge were close enough to an integer while r min ≥ 4 or r min ≥ 4 √ 2, in which case the number of required sources N SMP = 12r 4 min or 24r 4 min comparing to that with point sources N PT = 12N L . Notice that N SMP only depends on the chosen r min and is independent of N L . Furthermore, we found that the introduced systematic errors of SMP method for the same r min grew slowly with larger N L , which meant the time to estimate the trace is much less dependent on the lattice volumes N L . Thus it will be much more economical to employ SMP method on larger lattices. Since the introduced error of the trace comes from the off-diagonal elements of the matrix, we can expect better performance of this method by employing some techniques to suppress the off-diagonal elements of D ov in future work. This method is also supposed to work well when dealing with ultra-local matrix such as Wilson Dirac operator D W , and TrD −1 W is  very important and quite expensive to determinate exactly in lattice QCD. As mentioned above, we developed symmetric scheme with 3 modes in coloring algorithm and evaluated the topological charge with SMP method. Results of average absolute systematic error are presented, as well as the efficiency of SMP method compared to point source method. Potential applications of SMP method are expected in calculating the trace of the inverse of any large sparse matrix with locality.