Absence of $CP$ violation in the strong interactions

We derive correlation functions for massive fermions with a complex mass in the presence of a general vacuum angle. For this purpose, we first build the Green's functions in the one-instanton background and then sum over the configurations of background instantons. The quantization of topological sectors follows for saddle points of finite Euclidean action in an infinite spacetime volume and the fluctuations about these. For the resulting correlation functions, we therefore take the infinite-volume limit before summing over topological sectors. In contrast to the opposite order of limits, the chiral phases from the mass terms and from the instanton effects are then aligned so that, in the absence of additional phases, they do not give rise to observables violating charge-parity symmetry. This result is confirmed when constraining the correlations at coincident points by using the index theorem instead of instanton calculus.


Introduction
The theoretical formulation of the strong interactions in general allows for a Lagrangian term $\frac{1}{16\pi^2}\,\theta\,\mathrm{tr}\,F\tilde{F}$ (1) that is odd (i.e. it changes sign) under charge-parity (CP) conjugation. Here, $F$ is the gauge field strength tensor and $\tilde{F}$ is its Hodge dual, with electric and magnetic components being interchanged. One may expect in general that this term also leads to phenomena that violate CP. Conceivable in particular is a permanent electric dipole moment of the neutron [1,2], which, together with other potential indications of strong CP violation, has not been observed to date. Since there is a priori no reason to prefer θ = 0 (or an integer multiple of π), it is argued that the absence of such signals constitutes a shortcoming of the theory, referred to as the strong CP problem, and that it requires an extension of the Standard Model of particle physics. Theoretical research in this direction is extensive, and there are a number of experiments hunting for a proposed particle, the axion, that arises in many of these extensions [3].
From the Lagrangian, the action follows by integration over the spacetime. Since the CP-odd term (1) turns out to be a total derivative, the corresponding contribution to the action is determined by the boundary conditions on the gauge fields. Taking these to be vanishing physical fields, i.e. pure gauge configurations, at the boundary of spacetime, the integrals over the CP-odd term yield θ times integer values ∆n, to be referred to as winding number or topological charge. These integers correspond to so-called homotopy classes that categorize maps of a three-dimensional sphere onto itself, where maps in different classes cannot be continuously transformed into one another [4,5].
This topological quantization is of central relevance when evaluating the effects from the term (1). One implication is, for example, that if the predictions of the theory depend on θ, they must be periodic in this parameter. This is because in the quantized theory, the action enters the path integral as a phase. The theory is therefore invariant under replacements θ → θ + 2πn, where n ∈ Z. Therefore, θ is sometimes referred to as the vacuum angle. Further, topological quantization implies that observables are to be calculated from an interference of amplitudes from different topological sectors, i.e. from path integrals for a given ∆n or homotopy class, in the infinite spacetime.
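The periodicity in θ can be made concrete with a small numerical sketch. It assumes, as stated above, that each sector enters with the phase $e^{i\theta\Delta n}$; the function names and the sample amplitudes are hypothetical and serve only as an illustration.

```python
import cmath
import math


def sector_weight(theta: float, delta_n: float) -> complex:
    """Phase with which a sector of winding number delta_n enters the path integral."""
    return cmath.exp(1j * theta * delta_n)


def interfere(theta: float, amplitudes: dict) -> complex:
    """Sum hypothetical sector amplitudes {delta_n: amplitude} with their theta phases."""
    return sum(a * sector_weight(theta, dn) for dn, a in amplitudes.items())


if __name__ == "__main__":
    # Illustrative amplitudes only; they are not taken from the text.
    amps = {-2: 0.1, -1: 0.5, 0: 1.0, 1: 0.5, 2: 0.1}
    theta = 0.4
    print(interfere(theta, amps))
    print(interfere(theta + 2.0 * math.pi, amps))  # same value for integer delta_n
    # A non-integer winding number would spoil the periodicity:
    print(sector_weight(theta, 0.5), sector_weight(theta + 2.0 * math.pi, 0.5))
```

The invariance under θ → θ + 2πn holds only because every ∆n in the sum is an integer, which is why topological quantization and periodicity in θ go together.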
To state a principle leading to vanishing physical boundary conditions and therefore to topological quantization, we note that the nonvanishing contributions to the Euclidean path integral arise from saddle points of finite action and fluctuations around these. Saddle points correspond to solutions to the Euclidean equations of motion, and for these to exist in the infinite spacetime volume, the physical boundary conditions must vanish. As a consequence, the path integrals for the different topological sectors must then be evaluated in infinite spacetime volumes first. Otherwise, there would be no reason to assume topological quantization. In a second step, amplitudes from the different topological sectors are then to be interfered.
On the other hand, for boundary conditions imposed on finite spacetime volumes, saddle points and solutions to the equations of motion exist for nonvanishing physical fields at the boundaries as well. Moreover, the ground state configuration, that should determine the boundary conditions on finite spacetime volumes, is neither a field eigenstate nor a pure gauge configuration, i.e. it does not correspond to vanishing physical fields. In contrast, the Euclidean path integral in infinite volumes automatically projects the pure gauge field eigenstates on the corresponding accessible ground states. Nonetheless, if there were a principle that would lead to topological quantization for boundary conditions imposed on some finite surface, one could interfere the topological sectors prior to taking the spacetime volume to infinity.
Here, we show that the material consequence of the order of the limits is as follows: When taking the spacetime volume to infinity before interfering the topological sectors, CP-violating phenomena are absent in the strong interactions without extending the theory or setting the CP-odd term to zero. On the other hand, interfering the topological sectors before taking the spacetime volume to infinity, one concludes that correlation functions exhibit CP violation that cannot be removed by field redefinitions [6].
The question of whether there is CP violation in general in the strong interactions of massive quarks should not be a matter of choice but be a prediction of the theory. Appended to this letter is therefore extensive supplementary material that addresses many aspects of the limiting procedure as well as pertaining matters such as the principle of cluster decomposition.
Technically, we arrive at our conclusions by computing the correlation functions for massive fermions, where we keep θ as well as the phase of the determinant of the matrix of quark masses general. As one of the methods, we use the leading approximation to a dilute gas of instantons so that the spacetime dependence of the correlations can be recovered. As an alternative route, using arguments based on factorization properties of path integrals and the Atiyah-Singer index theorem [7], we confirm that the coincident limit of the fermion correlations does not exhibit CP violation, provided the interference of the topological sectors takes place among infinite spacetime volumes. Hence, the main results of this work hold beyond the perturbative expansion about instanton configurations. They crucially rely on how topological quantization emerges in spacetimes of infinite volume and the order in which the pertaining limits are carried out.

Topological charge, massive quarks, and charge-parity violation
In electrodynamics, the topological term (1) is immaterial because its volume integral can be traded for a surface integral over the boundaries of spacetime, where it can be shown that finite-action configurations have fields decaying fast enough for the integral to vanish. This is not true for the strong interactions, where, due to the self-interactions, extended field configurations with finite action, so-called instantons, exist while the surface term no longer vanishes [8]. For this reason, it has been proposed that values of θ ≠ πm (m ∈ Z) may imply CP violation [1,2,4,5].
While the topological term is local in the first place, and while in singular gauges the topological flux can be constrained to infinitesimal surfaces about the centres of the instantons [9], Eq. (1) is nonetheless equivalent to a surface term at the boundary of the spacetime at infinite distance. It is therefore an essential point whether it affects local observables in quantum field theory. The standard view is that this is the case because of a change in the local vacuum structure imposed by the boundary term. On the other hand, as illustrated in Figure 1, one can approximate observables by including the fluctuations in a subvolume of the spacetime with all possible boundary conditions on its surface. One may expect, and it is possible to show, that the theory in the subvolume is then independent of the boundary conditions at infinite distance so that these have no material impact.
Intricately related with the topological term are CP-odd contributions to quark masses that can be expressed through $\bar\psi_j\, m_j e^{i\alpha_j\gamma^5} \psi_j$, where $j = 1, \ldots, N_f$ and $N_f$ is the number of quark flavours. The quark fields are denoted by the spinors $\psi_j$, $\gamma^5$ is a matrix in spinor space and the phases $\alpha_j$ are CP-odd. The phases $\alpha_j$ can in principle be removed by redefinitions of the quark fields. However, since the so-called chiral symmetry of the quark fields is anomalous [10,11], the quark phases are tied to the vacuum angle θ. In particular, $\bar\theta = \theta + \bar\alpha$, where $\bar\alpha = \sum_{j=1}^{N_f} \alpha_j$, is a phase that remains invariant under field redefinitions.
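The invariance of the combined phase under field redefinitions can be illustrated with a few lines of arithmetic. The sketch below assumes the convention that a chiral rephasing $\psi_j \to e^{i\beta_j\gamma^5}\psi_j$ shifts $\alpha_j \to \alpha_j + 2\beta_j$ and, through the anomaly, $\theta \to \theta - 2\sum_j \beta_j$; the sign convention, the function names and the sample values are assumptions chosen so that the stated invariance holds.

```python
import math


def rephase(theta, alphas, betas):
    """Apply chiral rephasings with angles betas (assumed sign convention).

    Each mass phase shifts by 2*beta_j; the anomaly shifts theta by -2*sum(betas).
    """
    new_alphas = [a + 2.0 * b for a, b in zip(alphas, betas)]
    new_theta = theta - 2.0 * sum(betas)
    return new_theta, new_alphas


def theta_bar(theta, alphas):
    """Rephasing-invariant combination theta + sum_j alpha_j."""
    return theta + sum(alphas)


if __name__ == "__main__":
    theta, alphas = 0.3, [0.1, -0.4, 0.25]
    # Arbitrary rephasing: theta and the alphas change, theta_bar does not.
    t1, a1 = rephase(theta, alphas, [0.7, -0.2, 0.05])
    print(theta_bar(theta, alphas), theta_bar(t1, a1))
    # Rotating all mass phases away moves the full phase into theta.
    t2, a2 = rephase(theta, alphas, [-a / 2.0 for a in alphas])
    print(t2, a2)
```

Whichever way the phases are distributed between θ and the mass terms, only the sum survives as a candidate physical parameter.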
In order to calculate the most important CP-violating effects from the topological term, one derives effective fermion interactions caused by the instantons (Eq. (2)).

Figure 1: For local quantum field theory, an observer is expected to be only sensitive to fluctuations in a local subvolume $\Omega_1 \subset \Omega$ in the limit of an infinite volume of the spacetime Ω. The θ-parameter influences the conditions at the boundary ∂Ω. It can be shown that these do not affect the fluctuations in the subvolume. Fluctuations corresponding to instantons and anti-instantons are depicted as blue and orange circles, respectively.

The interaction (2) implies that there is no chiral symmetry with an overall U(1) phase. In the effective chiral Lagrangian for low energies, where quantum chromodynamics (QCD) confines, there is thus the corresponding term, the operator (3), where $f_\pi$ is the pion decay constant, U is a field of the form of a unitary matrix describing the mesons and λ is a coefficient within the effective theory. The aforementioned invariance of $\bar\theta$ under field redefinitions leaves two possibilities for the phase ξ compatible with the chiral anomaly (assuming that ξ is a function of α and θ, and that the effective action is periodic in these parameters):
• ξ = θ, i.e. in general misaligned with mass terms such that there is CP violation,
• ξ = −ᾱ, i.e. aligned with mass terms such that there is no CP violation.
The restriction to the above choices can be understood in terms of a spurious chiral symmetry under which θ transforms or simply by demanding that the relative phases between the interactions of Eq. (2) and the tree-level mass terms remain invariant under field redefinitions. Based on the topological quantization of the path integral and the ensuing order of limits, we derive here the effective operator (2) and show that the second possibility, ξ = −ᾱ, is the one that is realized, which implies that there is no CP violation in the strong interactions.
When relating these remarks to the literature, we note that the possibility ξ = θ is implied in most of the papers without dismissing ξ = −ᾱ. The early papers, as well as literature following these, on phenomenological CP violation in the strong interactions make use of the freedom of chiral field redefinitions in order to set θ = 0 and attribute the CP-odd phases to the quark masses [1,2]. In the context of the present discussion, this corresponds to setting ξ = θ = 0 while ᾱ ≠ 0 in general. The case of ξ = −ᾱ is apparently not pursued. More recent discussions of the coefficients of the operator (3), e.g. Ref. [14], do not consider this latter possibility either.
Reference [6] appears to contain the only direct calculation leading to ξ = θ, making use of the dilute instanton gas approximation. As we point out in the present work, this conclusion relies on computing the interference among topological sectors in finite spacetime volumes and taking these to infinity afterwards.
Reversing this order of limits, as is indicated when topological quantization emerges from the requirement of finite-action saddle points in infinite spacetimes, we show in the present work that one is led to conclude that ξ = −ᾱ instead.

Fermion correlations in a dilute instanton gas
In this section we show that ξ = −ᾱ by computing the quark correlation function in the approximation of a dilute instanton gas. In order to simplify notation, we set $N_f = 1$ and drop the index for the quark flavour. One should keep in mind that for a single quark flavour, the instanton effects amount to an addition to the quark mass. However, the generalization to the cases with $N_f > 1$ relevant for the potentially CP-violating phenomenology follows along the lines of the simplified analysis.
To compute the correlation function, we use the Green's function of Eq. (4) in the background of n instantons (with topological charge +1) and $\bar n$ anti-instantons (with topological charge −1) located at $x_{0,\nu}$ and $x_{0,\bar\nu}$, respectively. This approximation is valid for a dilute instanton gas and quark masses such that m is small compared to $1/\rho$, where $\rho$ is the radius of the instantons (which is not fixed). The spinors $\varphi_{0L,R}$ are the analytic continuation of the zero modes of the Euclidean Dirac operator in the (anti-)instanton background, which determines the equation of motion for the quark fields, in the massless limit. The function $S_{0\mathrm{inst}}$ of Eq. (5) is the solution in a background without instantons and is approximately valid at large distances from the individual locations, i.e. in between the instantons and anti-instantons.
Further, we already assume the Minkowski metric here. The approximation (4) has been used e.g. in Ref. [16] for α = 0. The generalization to α ≠ 0 may appear obvious, but there are some complications when transforming the spectrum of the Dirac operator from Euclidean to Minkowski spacetime. Yet, these can be addressed in detail, thus confirming the form of the propagator (4) (Section S2). Note that the Green's function (4) is independent of θ because the topological term has not yet entered the derivation. However, it needs to be taken into account when summing configurations corresponding to different homotopy classes in the path-integral expression for the correlation function.
Here, we use the saddle point approximation to the path integral, where we sum over all instanton and anti-instanton numbers n and $\bar n$ and integrate over the locations of instantons and anti-instantons as well as over the remaining collective coordinates such as the radii and gauge orientations (which are independent for each instanton and anti-instanton). The question of whether ξ = −α or ξ = θ is decided by the treatment of the summation over n and $\bar n$ in conjunction with how boundary conditions are imposed on the path integral. Let Ω denote the volume of spacetime and first consider Minkowski space such that Ω is infinite. The case of finite Ω is discussed below. Boundary conditions on the path integral are fixed by requiring that the physical gauge fields (as well as all other fields) vanish on the boundary ∂Ω, such that the action takes finite values at its saddle points [17] (Section S3.2). For the gauge field, this leaves open the possibility of pure gauge configurations.
These remarks apply to field configurations that are regular in Ω. In calculations aiming for interactions beyond the dilute instanton gas [16,18], it can be advantageous to use the singular gauge [9] so that one avoids working with integrands that are not manifestly square integrable. The price to pay for this is that there are singularities at the centres of the instantons or their approximate deformations. While spacetime needs to be punctured at these singularities, there are no apparent problems in constructing saddle point approximations to the path integral. Since the singular contributions at the centres of the instantons are pure gauges, the topological flux through an infinitesimal ball around such a point is again quantized. Then, for infinite Ω the fields still must vanish on ∂Ω but there is no topological flux through ∂Ω, in contrast to the regular gauge. In effect, topological quantization results from the requirement of finite saddle point configurations in infinite spacetime volumes also in the singular gauge. In contrast, when restricting spacetime to finite Ω, there are finite saddle points for arbitrary nonsingular boundary conditions on ∂Ω. Hence, some different principle would again be necessary to impose topological quantization for finite boundaries.
Both ∂Ω and SU(2) ⊂ SU(3) (i.e. the subgroup of the group of gauge symmetries of the strong interactions) are homeomorphic to the three-dimensional sphere $S^3$ such that the gauge configurations fall into classes according to the third homotopy group. These characterize the number of times ∆n a three-dimensional hypersurface can be wrapped around $S^3$. In the context of strong interactions, the class of configurations with boundary conditions corresponding to a certain ∆n is sometimes referred to as a topological sector.
This property is of relevance for the present case because in the saddle point approximation $\Delta n = n - \bar n$. Furthermore, it is possible to define vacuum states $|n_{\rm CS}\rangle$ with a certain integer Chern-Simons number $n_{\rm CS}$. Taking the matrix element characterized by $\langle m_{\rm CS}|$ and $|n_{\rm CS}\rangle$ corresponds to fixing the topological sector $\Delta n = m_{\rm CS} - n_{\rm CS}$. We also note that the states $|n_{\rm CS}\rangle$ are not gauge invariant as the Chern-Simons number (defined on a spatial hypersurface) can change by all possible integer values through so-called large gauge transformations that are not continuously connected to the identity component. Thus, the true vacuum state should be constructed as a superposition of all Chern-Simons numbers of equal weight, but there may be relative phases proportional to $n_{\rm CS}$. These phases are effectively equivalent to the topological term in the action when calculating expectation values using the path integral approach. Since different topological sectors are distinguished by the boundary conditions which are taken at infinity, contributions to the path integral within a fixed topological sector must be evaluated for infinite spacetime volumes Ω. Note that this reasoning also applies to spacetime manifolds with compact spatial hypersurfaces yet with an infinite time direction. The possibility of restricting the integration to finite subvolumes of spacetime is discussed below.
The fermion correlator should therefore be evaluated as in Eq. (6), where Z(N, Ω) and $Z_{\Delta n}(\Omega)$ are the partition function summed over all sectors |∆n| ≤ N and that for a single topological sector, respectively. The dependence on N and Ω needs to be kept before taking these parameters to infinity. The order of the two limits in Eq. (6) determines whether one arrives at ξ = −α or ξ = θ, as we discuss next.
Now we need to consider the fermion correlator in a fixed topological sector, which for a single flavour is given in Eq. (7). In this expression, $\bar h(x, x')$ is a spinor correlation that remains after the integration of the instanton and anti-instanton locations as well as the collective coordinates, and κ includes the exponential suppression by the instanton action (as these configurations correspond to tunneling processes) as well as extra factors that appear when evaluating the path integral to one-loop accuracy (Section S3.1). Finally, $I_\alpha(x)$ is the modified Bessel function of order α.
Figure 2: The shaded blobs represent some subdiagram. On the left, there is a piece induced by a six-point fermion Green's function in the background of an instanton (corresponding to $\bar h$ in the two-point case). On the right, an interaction of the same chiral structure is induced by the fermion mass terms $m_{1,2,3}$ (corresponding to $iS_{0\mathrm{inst}}$ in the two-point case). When integrating over the subvolumes indicated by the thin grey boxes only, the left piece would acquire a relatively misaligned phase θ + α (α being here the sum of the quark mass phases) compared to the right piece because the phases come from the topological term and the fermion determinants. When instead correctly computing the path integral over the full spacetime volume (represented by the thick grey boundaries), the phase for both pieces is aligned and given by ∆n(θ + α). For infinite spacetime volumes, the interferences between the different sectors ∆n moreover are immaterial.
It is clear that the dilute instanton gas approximation does not apply directly to QCD. Rather, one could think of a nonabelian gauge theory whose particle content is chosen such that the running coupling remains perturbative in the infrared and there is asymptotic freedom in the ultraviolet. In such a model, the scale invariance is broken radiatively such that there is no dilatational modulus and instead a preferred instanton size. That the symmetry properties with respect to CP of such a theory in principle also apply to QCD should therefore be taken as a plausible assumption rather than an established fact. In Section 4, we thus also present a derivation of the coincident fermion correlations that does not rely on the dilute instanton gas approximation.
The volume factors Ω in Eq. (7) result from the integration of the instanton locations over the entire spacetime. These appear in the same form even when taking these volumes to be finite for a given topological sector before interfering between these [6,16,18]. It is then understood that Ω, which is taken to infinity after interfering the topological sectors, is much larger than other scales that appear in the dilute instanton gas. This includes the mean separation between instantons and anti-instantons as well as the typical size of these. In fact, restricting Ω to small volumes given by some physical length scale so that these only contain few instantons would substantially alter the results of e.g. Refs. [6,16,18] that do not impose such truncations on the path integral. The fact that the instanton locations are to be integrated over the entire spacetime is tied to translational invariance and mathematically derives from trading the translational moduli for collective coordinates [13,19]. It can also be seen in analogy with the calculation of the partition function for a classical ideal gas, where the individual positions of the particles are integrated over the entire configuration space. Beyond the dilute gas approximation, the spacetime integrations should be modified to account for the overlap of instantons and anti-instantons due to their finite size, while the individual locations are still to be integrated over infinite volumes [16]. For a theory to which the dilute gas approximation applies, omitting such corrections only amounts to a controllable error.
From Eq. (7), we see explicitly that in a fixed topological sector and large spacetime volumes Ω, the modulus of the coefficients of the left and right chiral contributions tends to the same value. In particular, for $x \to \infty$ and $|\arg(x)| < \pi/2$, $I_\alpha(x) \sim e^{x}/\sqrt{2\pi x}$, i.e. these functions become independent of their index. As a consequence, for Ω → ∞, all topological sectors contribute in precisely the same way. Moreover, the chiral phases from the mass term contained in $S_{0\mathrm{inst}}$ (see Eq. (5)) and those induced by instanton effects are aligned, as a consequence of these phases (that originate from the fermion determinants and the topological term) being fixed by the boundary conditions on the topological sector ∆n, as we illustrate in Figure 2. When normalizing by the partition function, the modified Bessel functions as well as the phase proportional to ∆n cancel and we obtain Eq. (8), in which the explicit phase can be identified with ξ = −α. In contrast, if we reversed the order of limits in Eq. (6), we would sum over two independent exponential series for n and $\bar n$ and find θ rather than −α in Eq. (8) so that ξ = θ (Section S3.3).
Taking the correct order of limits, i.e. Ω → ∞ before summing over topological sectors therefore explains the absence of CP violation in the strong interactions. This result can be generalized to an arbitrary number of fermion flavours (Section S3.4).
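The dependence on the order of limits can be reproduced in a toy version of the dilute gas. The sketch below is not the calculation of the text: it assumes that a configuration with n instantons and $\bar n$ anti-instantons carries the weight $(\kappa\Omega)^{n+\bar n}/(n!\,\bar n!)\,e^{i(n-\bar n)(\theta+\alpha)}$, and that each instanton contributes an additional chiral factor $e^{-i\alpha}$ to the correlator. Both assumptions, all function names and the parameter values are illustrative. The quantity compared is the weighted mean instanton number, computed once in a fixed sector at large κΩ and once with all sectors summed at finite κΩ.

```python
import cmath
import math


def scaled_bessel_i(order: int, x: float, steps: int = 20000) -> float:
    """exp(-x) * I_order(x) for integer order, via the integral representation
    I_n(x) = (1/pi) * int_0^pi exp(x cos t) cos(n t) dt (trapezoidal rule)."""
    h = math.pi / steps
    total = 0.0
    for k in range(steps + 1):
        t = k * h
        w = 0.5 if k in (0, steps) else 1.0
        total += w * math.exp(x * (math.cos(t) - 1.0)) * math.cos(order * t)
    return total * h / math.pi


def mean_n_fixed_sector(delta_n: int, kappa_omega: float) -> float:
    """Mean instanton number for fixed n - nbar = delta_n.
    The sector phase cancels; the sums over n reduce to Bessel functions."""
    x = 2.0 * kappa_omega
    return kappa_omega * scaled_bessel_i(delta_n - 1, x) / scaled_bessel_i(delta_n, x)


def mean_n_all_sectors(kappa_omega: float, phi: float, n_max: int = 60) -> complex:
    """Mean instanton number when n and nbar are summed independently at
    finite volume, with phase exp(i (n - nbar) phi), phi = theta + alpha."""
    num, den = 0j, 0j
    for n in range(n_max):
        for nbar in range(n_max):
            w = (kappa_omega ** (n + nbar)
                 / (math.factorial(n) * math.factorial(nbar))
                 * cmath.exp(1j * (n - nbar) * phi))
            num += n * w
            den += w
    return num / den


if __name__ == "__main__":
    theta, alpha = 0.7, 0.3
    fixed = mean_n_fixed_sector(1, 100.0)           # large volume, fixed sector
    summed = mean_n_all_sectors(3.0, theta + alpha)  # finite volume, all sectors
    # Phase of the instanton-induced term after including the factor exp(-i alpha):
    print("volume first :", cmath.phase(complex(fixed)) - alpha)   # equals -alpha
    print("sectors first:", cmath.phase(summed) - alpha)           # equals theta
```

In the fixed sector the mean number is real and approaches κΩ for every ∆n, so only the assumed per-instanton factor $e^{-i\alpha}$ remains; with the sectors summed first, the two exponential series leave an extra factor $e^{i(\theta+\alpha)}$.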

Chiral correlations from the index theorem
In this section we provide an alternative derivation of the previous results without using instantons. The starting point is the factorization of the path integration when the full spacetime volume Ω is divided into subvolumes $\Omega_1$ and $\Omega_2$. Following standard textbook arguments used in the context of cluster decomposition [20], the fact that the topological charge ∆n is a surface flux allows one to write the partition function of the full spacetime volume $\Omega = \Omega_1 \cup \Omega_2$ in the factorized form of Eq. (9). For convenience, in this section we work in Euclidean space, as this simplifies the tracking of the complex phases. First we can extract the θ-dependent phase, $Z_{\Delta n}(\Omega) \propto e^{i\Delta n\theta}$. Any additional complex phases can only come from the integration over fermionic fluctuations. To leading order in a loop expansion around saddle points, these integrations have the form of determinants of the Dirac operator in each saddle-point background. Here we make no approximation of the saddle points in terms of a dilute instanton gas. Parity transformations relate pairs of eigenfunctions of the massive Dirac operator with mutually conjugate eigenvalues, except for those eigenfunctions that, being zero modes of the massless operator, have eigenvalues given by the complex fermion masses, resulting in opposite phases for right-handed and left-handed modes (Section S2.2). Hence the phase of the full determinant within a topological class characterized by ∆n is determined by the difference between the number of right-handed and left-handed zero modes of the massless Dirac operator, which according to the Atiyah-Singer index theorem coincides with ∆n [7]. This gives a phase of $e^{i\Delta n\bar\alpha}$ for the product of all fermion determinants. As a consequence, we may write $Z_{\Delta n}(\Omega) = e^{i\Delta n\bar\theta}\,\tilde g_{\Delta n}(\Omega)$ (10) with real $\tilde g_{\Delta n}(\Omega)$. Equation (9) then gives corresponding relations among the $\tilde g_{\Delta n}$. Setting $\Omega_i = 0$ in these relations can be seen to imply that $\tilde g_{\Delta n}(0) = \delta_{\Delta n,0}$. We next note that parity transformations relate ∆n with −∆n.
As the $\tilde g_{\Delta n}$ are real and not sensitive to parity-violating effects from the complex fermion masses, one has $\tilde g_{-\Delta n}(\Omega) = \tilde g_{\Delta n}(\Omega)$. These results motivate an Ansatz for $\tilde g_{\Delta n}(\Omega)$. Remarkably, assuming analyticity in Ω (and as shown in Section S4), there is a unique solution which, upon substitution in Eq. (10), gives the result (13), where β depends on the parameters of the theory and is not determined at the present level of generality. This has the same form as the result for the partition function in the dilute gas approximation (Section S3.1). Finally we note that since all dependence on the phases of the complex fermion masses $m_j e^{i\alpha_j}$ is included in $\bar\theta$, β can only depend on their moduli, i.e. on the products of the complex masses with their complex conjugates. In order to obtain fermion correlators, it suffices to note that the complex masses and their conjugates can be seen as sources for integrated two-point functions. Within a fixed topological sector ∆n, the volume averages of the fermionic correlators can thus be obtained as derivatives of the partition function with respect to these sources. Using Eq. (13) and summing over topological sectors after taking the limit Ω → ∞ as before gives correlators whose phases are aligned with the tree-level masses, leading to no CP violation. By taking additional derivatives with respect to the complex masses and their conjugates, the results can be extended to correlation functions involving more fermion fields.
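The structural constraints on $\tilde g_{\Delta n}(\Omega)$ can be checked numerically for a candidate solution. The sketch below assumes, as the comparison with the dilute-gas result suggests, that the solution is a modified Bessel function, $\tilde g_{\Delta n}(\Omega) \propto I_{\Delta n}(2\beta\Omega)$, and that the factorization takes the convolution form $\tilde g_{\Delta n}(\Omega_1 + \Omega_2) = \sum_{\Delta n_1} \tilde g_{\Delta n_1}(\Omega_1)\,\tilde g_{\Delta n - \Delta n_1}(\Omega_2)$. Both forms, the function names and the parameter values are assumptions made for illustration.

```python
import math


def scaled_bessel_i(order: int, x: float, steps: int = 4000) -> float:
    """exp(-x) * I_order(x) for integer order, via the integral representation
    I_n(x) = (1/pi) * int_0^pi exp(x cos t) cos(n t) dt (trapezoidal rule)."""
    h = math.pi / steps
    total = 0.0
    for k in range(steps + 1):
        t = k * h
        w = 0.5 if k in (0, steps) else 1.0
        total += w * math.exp(x * (math.cos(t) - 1.0)) * math.cos(order * t)
    return total * h / math.pi


def g(delta_n: int, beta_omega: float) -> float:
    """Candidate real sector weight, exp(-2 beta Omega) * I_{delta_n}(2 beta Omega).
    The exponential prefactor is a sector-independent normalization."""
    return scaled_bessel_i(delta_n, 2.0 * beta_omega)


def convolved_g(delta_n: int, bo1: float, bo2: float, cutoff: int = 30) -> float:
    """Sum over the winding number in the first subvolume (truncated at |k| <= cutoff)."""
    return sum(g(k, bo1) * g(delta_n - k, bo2) for k in range(-cutoff, cutoff + 1))


if __name__ == "__main__":
    print(g(0, 0.0), g(3, 0.0))                       # Kronecker delta at zero volume
    print(g(2, 1.0), g(-2, 1.0))                      # symmetric under delta_n -> -delta_n
    print(convolved_g(2, 0.8, 1.3), g(2, 0.8 + 1.3))  # factorization over subvolumes
```

The three printed pairs correspond to the condition at vanishing volume, the parity relation and the factorization property; that a Bessel function satisfies all of them is the standard addition theorem for $I_n$.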

Finite subvolumes, periodic boundary conditions and fixed topological sectors
To view this result from additional angles, we discuss what one would obtain for fixed topological sectors or for finite spacetime volumes. Taking the order of the limits as in Eq. (6), we have seen that the modified Bessel functions in Eq. (7) tend to a common limit. This can be seen as a consequence of ∆n/Ω → 0. Taking Ω → ∞ before summing over different topological sectors may therefore be viewed as equivalent to setting ∆n = 0 from the outset. This explains why taking limits as in Eq. (6) leads to the alignment between the various chiral phases. We note that a relevant example for finite Ω and fixed ∆n is given by boundary conditions that are periodic in all four dimensions. This setup is the usual choice in lattice simulations, where ∆n freezes in the continuum limit.
In the approximation of the dilute instanton gas, it can be shown that fixing ∆n in an infinite spacetime volume is compliant with the principle of cluster decomposition (Section S5.1). In finite spacetime volumes Ω, corrections to the asymptotic form of correlators required by the cluster decomposition principle then vanish, provided Ω is chosen large enough to meet a given precision (Section S5.2). This observation has also been made in Refs. [21,22] through different calculational methods. We therefore conclude that it is possible to describe the strong interactions in a fixed sector with finite ∆n, provided Ω is large enough or infinite, and that there are no CP -violating effects in this theory.
With the above observation and working in a single topological sector with fixed ∆n, we can evaluate the path integral in a finite subvolume Ω 1 ⊂ Ω according to Figure 1, no matter whether the full spacetime volume is finite or infinite. For such a setup, we need to sum or integrate over boundary conditions of a certain winding number ∆n 1 (which is not necessarily integer because instantons can be located at the boundary). The full winding number ∆n is however fixed by the boundary conditions on ∂Ω. In particular, let Ω 2 = Ω \ Ω 1 and ∆n 2 be the winding number within Ω 2 . Then, ∆n = ∆n 1 + ∆n 2 remains fixed such that the total phase proportional to ∆n separates just like in Eq. (7) and cancels within observables. One can then obtain expectation values from a path integration restricted to Ω 1 in which the θ dependence is absent, and once more the result (8) is recovered (Sections S5.1 and S5.2).
We emphasize that the fermion correlations evaluated according to Eq. (6) are compatible with the enhanced mass of the η′ meson compared to those mesons associated with spontaneously broken symmetries that are not anomalous (Section S3.6). This can be explained in more detail when observing that the chiral susceptibility evaluated in finite subvolumes of spacetime agrees with known results from the dilute instanton gas approximation and moreover when noting that even within a fixed topological sector, there is an η′ meson with enhanced mass (Section S5.4). Then one can also show that under reasonable assumptions the mass of the η′ is proportional to the topological susceptibility of the pure gauge theory evaluated in finite subvolumes, which generalizes classic results derived for large numbers of colours in Refs. [23,24]. Finally, we note that arguments linking the topological susceptibility with CP violation [25] rely on assuming analyticity in θ of the partition function for the full volume, which does not apply when the infinite volume limit is taken before summing over the topological sectors (Sections S3.6 and S5.4).

Conclusions
In this work, we have derived fermion correlations in instanton backgrounds, investigated the cases of finite and infinite spacetime volumes and checked the compliance with cluster decomposition. If there were a valid principle that would allow the limit of infinite spacetime volume to be taken after the summation over topological sectors, we would recover CP-violating correlations proportional to the rephasing-invariant parameter $\bar\theta$. However, based on the reasoning that the quantization of the topological sectors comes from the fact that the path integral receives its nonvanishing contributions from saddle points of finite action and fluctuations about these, boundary conditions in Euclidean space should be imposed at infinity before the summation over topological sectors. The conclusion then is that the theory of strong interactions with massive fermions does not predict CP-violating phenomena, irrespective of the value of $\bar\theta$.

Supplementary material

S1 Outline
We present here the technical details that corroborate the statements made in the main text.
The anomalous violation of chiral fermion number through instanton and sphaleron transitions is a characteristic feature of the strong interactions, and for the weak interactions, it is likely to be of key importance for the generation of the baryon asymmetry of the Universe [8, 10-13, 26, 27]. Upon the discovery of the Belavin-Polyakov-Schwartz-Tyupkin (BPST) instanton [8], it was soon realized by 't Hooft that these instanton solutions can also solve the axial U(1) problem [28], which asks why there is no pseudo-Goldstone boson associated with flavour-diagonal chiral rephasings: the η′ is much heavier than the mesons in the octet. Although the Adler-Bell-Jackiw (ABJ) anomaly [10,11] implies that the axial U(1) current is not conserved, it was believed for a while that the anomalous term vanishes when integrated over the whole spacetime because it is a total derivative. However, for the BPST instanton, the anomaly turns out to be nonvanishing globally, thus providing extra breaking of the axial U(1) symmetry and giving rise to the splitting of the η′ from the meson octet. The violation of chiral fermion number induced by instantons is typically suppressed by the tunneling exponent. At finite temperature, it is however possible to have thermal transitions instead of tunneling. These are described by the sphaleron, i.e. an unstable saddle point of the energy functional for the gauge fields [26].
In the context of thermal field theory and since the instanton corresponds to a Euclidean saddle point solution, calculations are typically carried out using imaginary time. Nonetheless, some of the main phenomenological applications are within scattering theory or kinetic theory such that it is necessary to transfer the results to the real time of Minkowski space. This is generally possible through the analytic continuation of Green's functions. Nonetheless, it remains of interest to achieve a formulation directly in Minkowski spacetime because it would allow for a first-principle derivation of kinetic theory involving instantons, e.g. in the Schwinger-Keldysh formalism [29,30], or a more systematic treatment of fermions that are not of the Dirac type, e.g. in chiral gauge theories. A real-time approach would also serve as a check for the correct interpretation of the analytically continued quantities. In view of this, we also discuss in this paper some details on the correlation functions in Minkowski spacetime.
Real-time calculations are typically only feasible when expanding about a saddle point of the action. However, there is no saddle for the action in Minkowski spacetime that would correspond to an instanton configuration. The saddle is recovered when extending the path integral over the degrees of freedom of the bosonic fields into the complex plane and deforming the integration contour. Convergent integration contours that go through the saddle of interest can be found using Picard-Lefschetz theory [31], which has led to a number of applications and further developments, for instance, in Refs. [32-37]. Effects from the chiral anomaly for real background fields in Minkowski space are calculated e.g. in Refs. [38-40].
It is advantageous to derive the Green's function for fermions from a spectral sum because the contribution of the modes that account for the chiral anomaly, i.e. the zero modes in the massless limit, is then readily isolated [6,12,13,41]. Given the spectrum of the massless Dirac operator in the instanton background, this construction is straightforward for the case of a real mass term in Euclidean space. Assuming the mass acts as a perturbation to the eigenspectrum, it is also obvious how to insert a complex mass into the zero-mode contribution to the Green's function. In Section S2.1, we therefore note this result along with some well-known generalities about analytic continuation of the problem. We focus for simplicity on setups with Dirac fermions in the fundamental representation of the gauge group, as in quantum chromodynamics (QCD). It is less clear how to construct the spectral sum in Euclidean space in the presence of a complex mass that cannot be treated as a small perturbation. This is because of the occurrence of γ_5: the complex mass term is not proportional to an identity matrix. In Section S2.2, we show that the spectral sum can be built in terms of the eigenfunctions of the massless Dirac operator after an additional orthogonal transformation among the pairs of modes with opposite eigenvalues. As for the eigenmodes, there is a complication in the analytic continuation because the improperly normalizable Euclidean continuum modes will in general not be normalizable when evaluated in real time [35]. In Section S2.3, we therefore discuss in detail how the spectral decomposition of the Green's function can be continued from Euclidean to Minkowski space by rotating the temporal coordinate axis by an angle ϑ. This requires a particular procedure for the continuation of the dual eigenvectors that we refer to as ϑ-conjugation, and in Section S6, we exemplify this on the Green's function for a Dirac fermion in the homogeneous and isotropic background spacetime.
As a result, in Section S2.4, we then show how the spectral sum can be understood in terms of the eigenmodes of the Dirac operator directly in Minkowski spacetime, which requires discussion because this operator is non-Hermitian since the analytically continued gauge field configuration of the instanton is complex.
Having reported the results for Green's functions of fermions with complex masses (i.e. nonzero chiral phase) in (anti-)instanton backgrounds, we proceed in Section S3 to derive correlation functions, starting with two-point functions in a theory with a single fermion. The correlation functions do not trivially coincide with the Green's functions because in the path integral, the sum over the number of individual instantons as well as the integral over their locations are yet to be carried out. We observe that for a given number of instantons with positive and negative winding numbers, chiral phases from the fermion determinant as well as from the θ-vacuum of the gauge theory multiply all structures (left and right chiral contributions as well as pieces corresponding to the homogeneous background between instantons) by the same factor (see Eq. (7)). The boundary conditions on the path integral must be chosen such that there are saddle points of finite action [17]; some comments concerning this point in the context of the present work are made in Section S3.2. Therefore, the physical fields must vanish at infinity, which allows pure gauge configurations of the gluon field. This implies the topological quantization of the winding number. As a consequence, the integration over the infinite spacetime volume must first be done for configurations with fixed total winding number, as we carry out in Section S3.1. The summation over the different winding numbers is then performed subsequently, as shown in Section S3.3. After the summations and integrations, the chiral phase of the mass term is aligned with the phase associated with the effects from the instantons breaking chiral symmetry. While for simplicity the derivations are carried out in detail for the case of a single fermion flavour, we subsequently discuss the generalization to the realistic case of several flavours in Section S3.4.
We also show in Section S3.5 how to calculate higher-point correlation functions in theories with several flavours and complex mass terms and demonstrate that again, the θ-angle drops out of the final result. A consequence of these findings is that there are no CP -violating effects in the strong interactions. To clarify this, in Section S3.6, we eventually discuss how the chiral phases that we have computed for the fermion correlation function determine couplings in the effective Lagrangian that governs the strong interactions at low energies.
As an alternative to computing the correlation functions from the fluctuations about the ensemble of instantons and anti-instantons, in Section S4 we constrain the dependence of the partitions Z ∆n on the spacetime volume and the fermion phases using cluster decomposition and the index theorem. Again, we verify the phase alignment between terms from topological effects and from fermion masses.
Evaluating the contributions from the single topological sectors in the infinite-volume limit may be viewed as equivalent to fixing the winding number altogether. In Section S5, we therefore verify that fixing the topological sector in large spacetime volumes does not violate the principle of cluster decomposition. We do so by deriving the expectation values from a path integral restricted to a subvolume of the full spacetime. In addition, we discuss observables such as the density of winding number and the topological susceptibility for finite or infinite spacetimes with free and fixed topological sectors.
While with Section S2, we devote a large part of this material to the discussion on the analytic continuation between Euclidean and Minkowskian Green's functions and fermionic functional determinants, we note that all of the main conclusions are equally reached when working entirely in Euclidean space. A reader not concerned with the analytic continuation may take the Green's function (S61) and the ratio of functional determinants (S64) as a starting point. Their Euclidean counterparts can be rederived straightforwardly.

S2 Green's function for fermions in a one-instanton background in Minkowski space

S2.1 Analytic continuation of the instanton solutions and fermion fluctuations between Euclidean and Minkowski space
We discuss here some generalities of the continuation of the instanton solution, the Dirac operator and its Green's function between Euclidean and Minkowski spacetime. For definiteness, we consider Dirac fermions in the fundamental representation in the background of SU(2) BPST (anti-)instantons. We construct the fermion Green's function by regulating the divergence from the fermion zero-mode by a mass term with a nonzero chiral phase. While such a phase can straightforwardly be inserted into the well-known results for the Green's function e.g. from Ref. [41], the explicit discussion of this matter serves us to introduce the general context as well as some notation.
The strong interactions are described by QCD, where the Euclidean action reads (leaving aside the topological term for the moment) (S1) The super-and subscripts "E" indicate that a quantity is defined in Euclidean space. Our conventions for the Euclidean coordinates are such that  [10,11] and the strong CP problem, it is convenient to employ the Weyl basis for the Dirac matrices: where σ E m = ( τ , i1 2 ) andσ E m = ( τ , −i1 2 ) with 1 2 being the unit 2 × 2 matrix and τ i the Pauli matrices. The covariant derivative takes the form when ψ E i lives in the fundamental representation of the gauge group, and when ψ E i lives in the adjoint representation. Here T a ≡ τ a /2 are the generators of the gauge group and satisfy [T a , T b ] = if abc T c and trT a T b = δ ab /2. When restricting to the subgroup SU(2), the structure constants are f abc ≡ ε abc . The subscript i on the fermions is the flavor index, to be distinguished from the spatial vector index.
In four-dimensional Euclidean space the BPST instanton with the collective coordinates corresponding to the location set to zero and with winding number η = −1 is given in terms of the vector potential where the 't Hooft symbols η_{amn} are defined as [13] η_{amn} = The expression for the anti-instanton with η = +1, which is the parity conjugate of Eq. (S6), is obtained when replacing η_{amn} → η̄_{amn}, where the η̄_{amn} differ from η_{amn} by a change in the sign of δ.
The continuation of Euclidean time to an arbitrarily rotated time contour is parametrized as (cf. Ref. [35]) where t is a real parameter. Then, for ϑ = π/2, t is just Euclidean time whereas for ϑ = 0 + , it corresponds to Minkowskian time. Here the infinitesimal 0 + that regulates the continuation of the instanton configuration to Minkowski spacetime can be understood as a prescription to ensure that the path integration captures the transition amplitude from the true vacuum state onto itself [35]. We simply take 0 + to be zero whenever it does not play a role. For a fixed value of ϑ characterizing a choice of time contour, we label the real coordinates of the (time-rotated) spacetime as where Greek indices run from 0 to 3. With this parameterization, all equations of motion as well as their solutions do in general depend on ϑ. The ϑ-dependent instanton solutions for the gauge fields can be simply obtained by performing the substitution of Eq. (S8) into Eq. (S6) or the corresponding Euclidean solution for η = +1. In particular, the solutions in Minkowski spacetime are obtained when taking ϑ = 0 + . In the following, we clarify when necessary whether we are referring to quantities for general ϑ or for a particular choice. For the remainder of this section we consider the continuation from Euclidean into Minkowski spacetime, maintaining a superscript "E" for Euclidean quantities, and omitting labels for their Minkowskian counterparts. First, one should note that when recasting expressions in terms of Minkowskian metric tensors (e.g. −δ mn → η µν ≡ diag(1, −1, −1, −1)) and Dirac matrices, it is natural to define the components A µ of the Minkowski gauge field as: When expressing A µ = (τ a /2)A a µ , this implies however that the components A a µ when evaluated for the η = −1 instanton solution (Eq. (S6)) continued to ϑ = 0 + are in general complex. 
Since the physical fields A a µ are however real, a deformation of the integration contour of the path integral is required in order to capture the analytically continued solution, which constitutes then a complex saddle point from which appropriate complex integration contours that lead to well-behaved integrands can be obtained by means of steepest-descent flows [31,32]. In Ref. [35] it is derived how to evaluate the path integration of bosonic fluctuations on the deformed contours using Picard-Lefschetz theory, which would have to be applied here in order to deal with the fluctuations of the gauge field. The saddle point for the fermion field is still given by the vanishing field configuration, and the path integral of the Graßmannian fermion fluctuations can be carried out as usual.
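The rotation of the time contour described above can be illustrated with a few lines of Python. This is only a sketch: it assumes the parametrization x_4 = i e^{−iϑ} t, which is inferred from the relations between x_4, k_4 and x^0, k^0 quoted in Section S2.3 rather than taken from Eq. (S8) itself, and the function names and numerical values are hypothetical.

```python
import cmath
import math


def euclidean_time(t, theta):
    """Assumed form of the contour: x_4 = i * exp(-i*theta) * t, with t real."""
    return 1j * cmath.exp(-1j * theta) * t


def euclidean_k4(k0, theta):
    """Companion rotation of the asymptotic parameter: k_4 = -i * exp(i*theta) * k0."""
    return -1j * cmath.exp(1j * theta) * k0


t, k0 = 1.7, 0.4  # arbitrary illustrative values

x4_euclid = euclidean_time(t, math.pi / 2)  # theta = pi/2: t is Euclidean time
x4_mink = euclidean_time(t, 0.0)            # theta = 0: x_4 = i*t (Minkowskian time)

# The plane-wave phase k_4 * x_4 = k0 * t does not depend on theta.
products = [euclidean_k4(k0, th) * euclidean_time(t, th)
            for th in (0.0, 0.3, math.pi / 2)]
```

Under this assumption the product k_4 x_4 is the same on every contour, which is why a simultaneous rotation of the asymptotic parameter can preserve the plane-wave behaviour of the continuum modes.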
In chiral representation the Dirac matrices for Minkowski spacetime are given by Note that the form of γ 5 is the same for Euclidean and Minkowski space, and it is defined as The Minkowskian Dirac operator is then obtained from the Euclidean one by performing the analytic continuation of Eq. (S8) to ϑ = 0 + : where γ · ∇ ≡ i γ i ∂ i and accordingly for γ · A. We can generalize this continuation such as to include a complex mass me iα ≡ m R + im I , resulting in On the right-hand side, we recover the standard Dirac operator for a massive fermion in Minkowski spacetime. It is a non-Hermitian operator leading to a Lagrangian term that is however Hermitian when sandwiched between ψ = ψ † γ 0 and ψ and when A a µ is real. As noted above, the latter condition is not met for the complex saddle corresponding to the instanton.
When including a complex fermion mass, the Euclidean Green's function The most straightforward way of constructing it is from the spectral sum in the massless limit. It is constituted by the solutions to the eigenvalue problem as Since the Euclidean Dirac operator / D E is anti-Hermitian, its eigenfunctions can readily be assumed to be orthonormal and Eq. (S14) be immediately verified. Yet, Eq. (S16) is ill-defined because of the fermionic zero mode λ E = 0 in the instanton background. The Euclidean index theorem relates the winding number to the difference between the number of right-handed and left-handed zero modes. This gives one left-handed zero-mode for a η = −1 background, and a right-handed zero mode for η = 1. The former is given by and u is a 2×2 antisymmetric matrix with a Weyl index α and an index b labelling the fundamental representation of SU(2), i.e. u αb = ε αb , with ε 12 = 1. As anticipated the mode is left chiral, i.e.
are the chiral projectors. The solution ψ E 0R in the η = +1 instanton background can be obtained by switching the chiral block in Eq. (S17).
A small complex mass term can serve as a regulator of the zero-mode contribution to Eq. (S16) because, for fermions in the fundamental representation of the gauge group in the η = −1 instanton background, one obtains at first order in perturbation theory [41] From Eq. (S13), it then follows that we may analytically continue this solution as where the dependence on x ( ) is understood to refer to the components x ( )0 and x ( ) of the corresponding four-vector x ( )µ as in Eq. (S9). This Minkowski-space Green's function approximately solves the equation The above equation can be obtained from an analytic continuation of Eq. (S14), with the continuation of the delta function giving (For example, one can start with the representation of δ(x) in terms of its Fourier-transform and analytically continue x away from the real line.) On the other hand, taking Eqs. (S14) and (S20) as the definitions of the Euclidean and Minkowskian Green's functions, respectively, one can infer from the path integral the following correspondence between the Green's functions and the fermion propagators in the one-instanton background: Recalling that the mapping between Euclidean and Minkowskian fermion fields goes as , one can confirm that the Euclidean and Minkowskian Green's functions are indeed related by the analytic continuation of Eq. (S19). (The present notation differs from that used in Ref. [42] where ψ E † ( x, x 4 = ix 0 ) = iψ(x 0 , x).) Note however that, as it is elaborated upon in Section S2.4, it is not straightforward to show that this analytic continuation has a well-defined spectral representation in terms of (im)properly normalizable eigenfunctions of the Dirac operator in Minkowski spacetime [35]. Equations (S18) and (S19) show that a mass term with a complex phase can thus be perturbatively included in the leading contribution to the Green's function that corresponds to the Euclidean zero modes in the massless limit. 
Nonetheless, since the Euclidean Dirac operator for a massive fermion with a general chiral phase is not of definite Hermiticity, it remains of interest whether such a spectral sum in terms of orthonormal eigenfunctions is also possible for a complex mass term without resorting to perturbation theory around the massless configuration, which is what we discuss in the following section.

S2.2 Complex fermion mass in Euclidean space
In this section we focus on the Euclidean operator in Eq. (S13). The operator / D E + me iαγ 5 = / D E + m R + iγ 5 m I has the following properties in certain simplified cases. For m = 0, it is anti-Hermitian, while for m I = 0, it is "γ 5 -Hermitian", i.e.
When using the eigenmodes ψ̂^E_λ from the massless problem (S15) in the presence of a real mass, these still lead to eigenmodes with the eigenvalues λ^E + m_R. Hence, since the real mass term is proportional to the identity matrix in spinor space, a spectral sum can be computed in terms of the same basis vectors as for the massless case. Moreover, ψ̂^E_λ and γ_5ψ̂^E_λ are orthogonal for λ^E ≠ 0 because they correspond to different eigenvalues of the anti-Hermitian operator D̸^E.
For a complex mass term, where in addition m_I ≠ 0, it is less obvious that a spectral sum can be constructed in terms of the massless eigenmodes because the mass term is no longer simply proportional to an identity matrix in spinor space. Nonetheless, this can still be accomplished with an additional basis transformation among the pairs ψ̂^E_λ and γ_5ψ̂^E_λ. To see this, we note that for a given pair of massless eigenmodes ψ̂^E_λ and γ_5ψ̂^E_λ (λ^E ≠ 0), the Dirac operator takes the matrix form The eigenvalues of this matrix are and the normalized eigenvectors are The spinors ψ^E_ξ± are pairwise orthogonal, which can be checked explicitly when making use of the fact that D̸^E is anti-Hermitian such that λ^E is purely imaginary. Since the zero mode is chiral, it is still an eigenfunction ψ^E_0 ≡ ψ̂^E_0 for the Dirac operator when a complex mass is added. Altogether, we still have an orthonormal system such that the Green's function in the η = −1 instanton background is given by (S27) In addition, we note that (λ^E)² − m_I² < 0 (λ^E is purely imaginary because of the anti-Hermiticity of D̸^E), such that the coefficients of ψ̂^E_λ and γ_5ψ̂^E_λ in Eq. (S26) have the same phase. The basis transformation is thus orthogonal, up to an arbitrary overall phase. Hence, ψ^E_ξ± are also eigenvectors of the Hermitian conjugate operator with eigenvalues (ξ^E_±)* because the above operator acts on the pair ψ̂^E_λ and γ_5ψ̂^E_λ as the complex conjugate of the operator in Eq. (S24). (If the coefficients of ψ̂^E_λ and γ_5ψ̂^E_λ did not have the same phase, the coefficients would have to be complex conjugated in order to obtain the eigenvectors of the complex conjugate matrix.) The anomalous divergence of the chiral current can now be straightforwardly verified. We first note that and that the corresponding relation also holds for the zero mode ψ^E_0(x^E).
The trace is understood to run over the spinor indices, and we have substituted the eigenvalues of the massive Dirac operator and its Hermitian conjugate as discussed above. Substituting this into Eq. (S27), we indeed obtain We note that the second term on the right-hand side vanishes because the trace of γ 5 over the nonzero modes is not anomalous. The first term on the right gives the usual anomaly upon integration over spacetime and accounting for the unit norm of the zero modes: For a η = ±1 background with a right (left)-handed zero mode, one gets a change of chirality by ±2 units. The last term in Eq. (S30) reproduces the classical divergence of the current.
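The pairwise diagonalization used earlier in this section for a complex mass can be checked numerically. The sketch below assumes that, on a pair (ψ̂^E_λ, γ_5ψ̂^E_λ) with λ^E = iκ and real κ, the operator D̸^E + m_R + iγ_5 m_I is represented by the matrix [[λ^E + m_R, i m_I], [i m_I, −λ^E + m_R]]. This follows from D̸^E γ_5ψ̂^E_λ = −λ^E γ_5ψ̂^E_λ, but the basis ordering, the function names and the numerical values are our own illustrative choices and need not coincide with Eqs. (S24)-(S26).

```python
import math


def block_matrix(kappa, m_r, m_i):
    """Assumed 2x2 form of D + m_R + i*gamma5*m_I on (psi, gamma5 psi), lambda = i*kappa."""
    lam = 1j * kappa
    return [[lam + m_r, 1j * m_i],
            [1j * m_i, -lam + m_r]]


def eigen_system(kappa, m_r, m_i):
    """Closed-form eigenvalues and real, normalized eigenvectors (requires m_i != 0)."""
    root = math.sqrt(kappa**2 + m_i**2)   # sqrt(lambda^2 - m_I^2) = i * root
    norm = math.hypot(m_i, root - kappa)
    v_plus = (m_i / norm, (root - kappa) / norm)
    v_minus = ((kappa - root) / norm, m_i / norm)
    return (m_r + 1j * root, v_plus), (m_r - 1j * root, v_minus)


def apply(matrix, vec):
    """Multiply a 2x2 matrix with a 2-component vector."""
    return tuple(sum(matrix[i][j] * vec[j] for j in range(2)) for i in range(2))


kappa, m_r, m_i = 1.3, 0.4, 0.25  # arbitrary illustrative values
M = block_matrix(kappa, m_r, m_i)
(xi_plus, v_plus), (xi_minus, v_minus) = eigen_system(kappa, m_r, m_i)

# Residuals of the eigenvalue equations M v = xi v
residual_plus = max(abs(a - xi_plus * b) for a, b in zip(apply(M, v_plus), v_plus))
residual_minus = max(abs(a - xi_minus * b) for a, b in zip(apply(M, v_minus), v_minus))

overlap = v_plus[0] * v_minus[0] + v_plus[1] * v_minus[1]  # orthogonality of the real vectors
pair_product = xi_plus * xi_minus                          # real: m_R^2 + m_I^2 + kappa^2
```

Within these assumptions the eigenvectors are real, so the change of basis is an orthogonal rotation, and each pair of nonzero modes contributes the real factor m² + κ² to a product of eigenvalues.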
From the spectral decomposition we can also observe that the phase of the determinant of the operator As a consequence, we can write One can use the fact that the instanton and anti-instanton backgrounds are simply related by parity conjugation to prove that the determinants in both backgrounds are related by the substitution α → −α. This is consistent with the phases in Eqs. (S31) and (S32). Moreover, according to Eq. (S31), |det(− / D E −me iαγ5 )| is independent of α, and thus it is identical for both backgrounds. A similar analysis can be done for the In this case, since the gauge-field background is trivial with zero winding number, according to the Atiyah-Singer index theorem the number of left-handed zero modes for / ∂ E must equal to the number of right-handed zero modes, ending up with a vanishing chiral phase in the determinant: In preparation for the extension of the spectral decomposition of the propagator (S27) to arbitrary rotations of the time contour, we consider separately the Euclidean eigenfunctions belonging to the discrete and continuum spectrum and introduce associated notation and properties. The normalizable eigenfunctions belonging to the discrete spectrum are denoted as ψ E n and their eigenvalues as ξ E n . These modes have a finite norm and are mutually orthogonal under the usual scalar product, In regards to the continuum spectrum, involving improperly normalizable eigenfunctions, it can be constructed from solutions which approach plane waves at x 4 → −∞, characterized by asymptotic momenta k m , m = 1, . . . , 4. We will thus denote the eigenfunctions as ψ E . A difference with the work of Ref. [35], which focuses on differential operators in backgrounds invariant under spatial translations like a planar domain-wall, is that the continuum modes will not be given by a single plane wave for all x 4 , due to the spatial inhomogeneity of the BPST instanton background. 
However, one can always choose a basis of modes approaching a single plane wave at x_4 → −∞ and given by a superposition of plane waves at x_4 → ∞. Indeed, from the results in this section it follows that generic Euclidean modes ψ^E_ξ with eigenvalues ξ^E satisfy which gives Therefore the Euclidean eigenvalue problem implies For a solution going asymptotically as a plane wave in the infinite Euclidean past (thus being improperly normalizable and belonging to the continuum spectrum) one has and the Euclidean eigenvalues satisfy (using the fact that the instanton background A^E_m goes to zero at infinity) As the background also goes to zero for x_4 → +∞, the solutions will tend to a superposition of plane waves with the same value of k² = k_m k_m, fixed in terms of |ξ^E_{k_m}|² as above. In this sense, the eigenvalue equation is analogous to a wave-mechanical scattering problem. We expect that we can form a basis for the continuum spectrum by considering all possible plane waves at x_4 → −∞. As the solutions are eigenfunctions of a Hermitian operator, the ψ^E_{k_m} are orthogonal, and they can be normalized so that the norm is a delta function in k-space: In the massless limit, as discussed above, the continuum eigenvalues must become purely imaginary. Denoting these massless eigenvalues as λ^E_{k_m} and using Eq. (S39) in the massless limit, it follows that Then, the results of Eq. (S25) imply that the continuum Euclidean eigenvalues for a general complex mass have the form
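The statements made earlier in this section about the determinant, namely that its modulus is independent of α and that the two backgrounds are related by α → −α, can be illustrated on a truncated spectrum. The sketch below multiplies the eigenvalue m e^{±iα} of one chiral zero mode with the contributions of a finite, made-up list of nonzero-mode pairs λ^E = ±iκ. It ignores regularization and the overall sign convention of the operator, and all names and numbers are hypothetical.

```python
import cmath
import math


def truncated_determinant(m, alpha, kappas, zero_mode_chirality):
    """Eigenvalue product of D + m*exp(i*alpha*gamma5) over one zero mode
    (chirality -1: left-handed, +1: right-handed) and pairs lambda = +-i*kappa."""
    m_r, m_i = m * math.cos(alpha), m * math.sin(alpha)
    det = m * cmath.exp(1j * zero_mode_chirality * alpha)  # zero-mode eigenvalue
    for kappa in kappas:
        root = math.sqrt(kappa**2 + m_i**2)
        det *= (m_r + 1j * root) * (m_r - 1j * root)       # real factor m^2 + kappa^2
    return det


kappas = [0.5, 1.3, 2.1]  # made-up nonzero-mode spectrum
m, alpha = 0.2, 0.7

det_left = truncated_determinant(m, alpha, kappas, -1)
det_right = truncated_determinant(m, alpha, kappas, +1)
phase_left = cmath.phase(det_left)    # equals -alpha
phase_right = cmath.phase(det_right)  # equals +alpha
modulus_reference = abs(truncated_determinant(m, 0.0, kappas, -1))
```

In this toy setting the whole chiral phase comes from the zero mode, while the paired nonzero modes only contribute to the modulus.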

S2.3 Complex fermion mass for an arbitrary rotation of the time contour
In this section we generalize the spectral decomposition of the Euclidean propagator to the case of arbitrary rotations of the time contour, using the methods of Ref. [35], adapted from bosonic fields in planar backgrounds to complex fermion fields in generic backgrounds. We use superscripts "ϑ" for objects defined for a general time contour. Under the analytic continuation of Eq. (S8), the fermionic kinetic term of the Lagrangian involves the operator with the following γ-matrices and gauge field components: Recall that x^0 is meant to be real, parameterizing the rotated time contour; one also has ∂_µ = ∂/∂x^µ with x^µ the components of the four-vector in Eq. (S9). The matrices γ^{ϑµ}, which have been defined in terms of their Minkowskian counterparts γ^µ, satisfy a Clifford algebra {γ^{ϑµ}, γ^{ϑν}} = 2g^{ϑµν}, with the metric g^{ϑµν} = diag{e^{2iϑ}, −1, −1, −1}. The latter coincides with the effective metric appearing in the kinetic terms for scalar fields for arbitrary ϑ in Ref. [35]. Note that here we are looking at the analytic continuation between the two operators in Eq. (S43). When taking ϑ = π/2, the γ^{ϑµ} do not yield the Euclidean γ-matrices but differ from these by a factor of i. This is due to the signature (+, −, −, −) used in Minkowski spacetime, as opposed to the positive signature in Euclidean spacetime.
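The Clifford algebra with the ϑ-dependent metric quoted above can be verified numerically. The following sketch assumes γ^{ϑ0} = e^{iϑ}γ^0 and γ^{ϑi} = γ^i in the chiral representation. This choice reproduces g^{ϑµν} = diag{e^{2iϑ}, −1, −1, −1} but is our assumption and not necessarily the definition used in the equations of this section, and the helper names are hypothetical.

```python
import cmath
import math

I2 = [[1, 0], [0, 1]]
Z2 = [[0, 0], [0, 0]]
SIGMA = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]  # Pauli matrices


def neg(m):
    return [[-x for x in row] for row in m]


def block(a, b, c, d):
    """Assemble a 4x4 matrix from the 2x2 blocks [[a, b], [c, d]]."""
    return [a[0] + b[0], a[1] + b[1], c[0] + d[0], c[1] + d[1]]


def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


# Minkowskian gamma matrices in the chiral representation
GAMMA = [block(Z2, I2, I2, Z2)] + [block(Z2, s, neg(s), Z2) for s in SIGMA]


def rotated_gammas(theta):
    """Assumed rotated matrices: only gamma^0 picks up the phase exp(i*theta)."""
    phase = cmath.exp(1j * theta)
    return [[[phase * x for x in row] for row in GAMMA[0]]] + GAMMA[1:]


def clifford_defect(theta):
    """Largest deviation of {gamma^mu, gamma^nu} from 2 * g^{mu nu} * identity."""
    g = rotated_gammas(theta)
    metric = [cmath.exp(2j * theta), -1, -1, -1]
    worst = 0.0
    for mu in range(4):
        for nu in range(4):
            ab, ba = matmul(g[mu], g[nu]), matmul(g[nu], g[mu])
            for i in range(4):
                for j in range(4):
                    expected = 2 * metric[mu] if (mu == nu and i == j) else 0
                    worst = max(worst, abs(ab[i][j] + ba[i][j] - expected))
    return worst


defects = [clifford_defect(th) for th in (0.0, 0.3, math.pi / 2)]
```

At ϑ = π/2 the metric in this sketch becomes diag{−1, −1, −1, −1}, in line with the remark that the rotated matrices differ from the Euclidean ones by a factor of i.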
As in Ref. [35], one can construct (im)properly normalizable eigenfunctions for the differential operator for arbitrary ϑ by analytic continuation of the corresponding Euclidean eigenfunctions in the time variable and, for the continuum spectrum, additionally in the asymptotic parameter k_4. In order to obtain eigenfunctions ψ^ϑ_n in the discrete spectrum it suffices to perform the usual analytic continuation, for which one obtains the same eigenvalues as in Euclidean space, save for the minus sign that follows from Eq. (S43) and the fact that the Euclidean eigenvalues were defined as corresponding to the operator D̸^E + me^{iαγ_5}: The factor of √(ie^{−iϑ}) is taken to lie in the principal branch and is necessary to guarantee a unit norm, defined with an inner product that will be described below. For the continuum spectrum, in order to preserve the plane-wave behaviour at t → −∞, one needs to rotate the asymptotic parameter k_4, and as a result the continuum eigenvalues in Minkowski are ϑ-dependent: In the following we denote a generic eigenfunction with eigenvalue ξ^ϑ, either in the discrete or continuum spectrum, as ψ^ϑ_ξ. It turns out that the eigenfunctions constructed as above are orthogonal and complete with respect to the following inner product, with ψ̃^ϑ defined as We refer to this operation indicated by a tilde and to the associated inner product in Eq. (S47) as ϑ-adjoint and ϑ-adjoint inner product, respectively. In Eq. (S48), the dagger operation is to be understood assuming that the corresponding coordinates and asymptotic parameters are treated as real, i.e. ψ^E_{k,k_4}(x, x_4)† should be calculated assuming k_m, x_m are real, and the same goes for k^0, k, x^0, x when evaluating ψ^ϑ_n(x)†. The last equalities in both lines of Eq. (S48) follow from the fact that the transformations x^0 → −e^{−2iϑ}x^0, k^0 → −e^{2iϑ}k^0 undo the complex conjugation of the combinations ie^{−iϑ}x^0, −ie^{iϑ}k^0 corresponding to the Euclidean variables x_4, k_4.
A consequence of the above definition is that both ψ^ϑ and ψ̃^ϑ are holomorphic functions of x^0 and k^0. Then one can prove orthogonality and completeness of the ϑ eigenfunctions constructed as above by relating all integrals over the parameters x^0, k^0 to their Euclidean counterparts x_4, k_4 using the Cauchy theorem [35]. In particular, the discrete modes have the normalization where, as advertised earlier, the prefactors √(ie^{−iϑ}) in Eqs. (S45) and (S48) cancel the Jacobian from the rotation of the contour to the Euclidean time. On the other hand, for the eigenfunctions in the continuum one has where in this case the Jacobian from the rotation to Euclidean time is cancelled by the one arising from the analytic continuation of the Euclidean delta function of the asymptotic momenta. Proceeding along these lines, and as explained in detail in Ref. [35], the orthogonality and completeness of the basis of eigenfunctions for arbitrary ϑ follow from the analogous properties of the Euclidean spectrum. The former implies that one can resolve the operator iD̸^ϑ − me^{iαγ_5} in terms of orthogonal projectors, and thus its inverse, i.e. the propagator, is given by The above propagator is nothing but the analytic continuation of its Euclidean counterpart, up to an overall constant: The overall minus in Eq. (S53) arises as a result of Eq. (S43) (or equivalently from the minus signs in the relations between rotated and Euclidean eigenvalues in Eqs. (S45) and (S46)). The constant ie^{−iϑ} appears in the contribution from the discrete spectrum due to the different normalization of the modes, see Eqs. (S45) and (S48), while for the continuum spectrum the same factor arises when relating the integral over the rotated k^0 to its Euclidean counterpart k_4 = −ie^{iϑ}k^0. Note that for ϑ = π/2 one recovers the Euclidean result up to a minus sign, arising because the propagator S^{ϑ=π/2} is the inverse of iD̸^{ϑ=π/2} − me^{iαγ_5} = −D̸^E − me^{iαγ_5}. For ϑ = 0^+, one recovers the relation (S19).
As an explicit application of the previous construction for ϑ = 0, in Section S6 we use a spectral sum involving the ϑ-adjoint inner product to derive the free Minkowskian propagator for a fermion with a complex mass term.

S2.4 Complex fermion mass in Minkowski spacetime
The results of the previous section can be applied to Minkowski spacetime by taking the limit ϑ → 0 + . Throughout this section, unless specified otherwise all objects are assumed to be defined in Minkowski spacetime. The relevant differential operator, is Hermitian when evaluated in a background of real A a µ and multiplied by γ 0 . This may suggest that for such real backgrounds one could define an inner product involving Dirac adjoint spinors rather than the inner product of Eq. (S47) defined in terms of the ϑ-adjoint spinors introduced in Eq. (S48). For the Dirac adjoint inner product the operator i / D − m R − iγ 5 m I would remain Hermitian, and one would naively expect orthogonal eigenvectors with real eigenvalues, giving a spectral decomposition of the propagator in terms of projectors of the form ψ ξψξ . However, this is not the case because the Dirac adjoint inner product is not positive definite, and thus the ψ ξψξ operators do not behave as projectors. This is best illustrated by considering the case of the free Minkowskian propagator, which is studied in Section S6; as shown there, when using the Dirac adjoint inner product the eigenfunctions have zero norm and are not orthogonal, while using the ϑ-adjoint inner product one recovers normalizability, orthogonality and completeness, and the usual propagator is recovered from the spectral sum of the tilde projectors. Finally, one could think of defining a propagator from the Hermitian operator γ 0 (i / D − m R − iγ 5 m I ), but this plays no role for S-matrix elements, which are constructed from Green's functions involving products of spinors ψ,ψ and thus defined in terms of the inverse of the operator in Eq. (S54). In any case, in the Minkowskian instanton background the background fields A a µ are not real, so that Hermiticity cannot be a guiding principle for the choice of operator or inner product.
From the results of the previous sections we therefore infer a spectral decomposition for the Minkowskian Dirac operator and its associated propagator, cf. Eq. (S55). An explicit discussion of the analytic continuation of the continuum spectrum of fermionic and bosonic excitations about instantons would be of interest in the future. Here, we only comment on the fermion zero mode, which is normalizable in the proper sense and accounts for the effects of the chiral anomaly. By "zero mode" we refer to eigenstates with zero eigenvalue of the massless Dirac operator. As these modes have well-defined chirality, they are also eigenstates of the general Dirac operator with a complex mass, with eigenvalue $\xi_{0R} = -m e^{i\alpha}$ for right-handed modes and $\xi_{0L} = -m e^{-i\alpha}$ for left-handed ones. As follows from the results of the previous section, these discrete zero modes are obtained by analytically continuing the corresponding Euclidean solutions. Then, as in Euclidean spacetime, this gives one right-handed zero mode for an $\eta = 1$ background and one left-handed zero mode for $\eta = -1$. Applying Eq. (S45) to the Euclidean expression of Eq. (S17) for the zero mode in the $\eta = -1$ background gives Eq. (S57), where $u$ is defined below Eq. (S17). The zero mode satisfies the property of Eq. (S58), as follows from the definition of the $\vartheta$-adjoint operation in Eq. (S48) and the invariance of $\phi^\dagger_{0L}(x)$ under time reflections, as can be readily seen from Eq. (S57).
Hence the spectral decomposition of the propagator in Eq. (S55) features a contribution involving $\phi_{0L} (\phi_{0L})^\dagger$. Note that this structure indicates anomalous violation of chirality, as it should, which would not be the case if the spectral decomposition were constructed with the Dirac adjoint inner product. Such a construction, which was discarded in the previous section, would involve terms of the form $\phi_{0L} \bar\phi_{0L}$.
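The statement that zero modes of definite chirality are eigenstates of the complex mass term, with eigenvalues $-m e^{\pm i\alpha}$, can be checked with a few lines of arithmetic. The following is a minimal sketch, assuming a chiral basis in which $\gamma_5 = \mathrm{diag}(-1,-1,+1,+1)$ (a convention chosen here for illustration); the function names and numerical values are hypothetical and not taken from the text. Since the massless Dirac operator annihilates a zero mode, only the mass term $-(m_R + i\gamma_5 m_I)$ acts on it.

```python
import cmath
import math

# Assumed convention: chiral basis, gamma5 = diag(-1, -1, +1, +1),
# so the lower two components are right-handed.
GAMMA5_DIAG = (-1, -1, 1, 1)


def apply_mass_term(spinor, m, alpha):
    """Apply -(m_R + i*gamma5*m_I) with m_R = m cos(alpha), m_I = m sin(alpha)."""
    m_r = m * math.cos(alpha)
    m_i = m * math.sin(alpha)
    return [-(m_r + 1j * g5 * m_i) * c for g5, c in zip(GAMMA5_DIAG, spinor)]


def eigenvalue_on(spinor, m, alpha):
    """Return xi if the mass term maps spinor to xi*spinor, else None."""
    image = apply_mass_term(spinor, m, alpha)
    ratios = [w / v for v, w in zip(spinor, image) if abs(v) > 1e-14]
    xi = ratios[0]
    if all(abs(r - xi) < 1e-12 for r in ratios):
        return xi
    return None


m, alpha = 0.7, 0.3                      # illustrative values only
right_handed = [0, 0, 1.0, 0.5j]         # gamma5 eigenvalue +1
left_handed = [1.0, -2.0, 0, 0]          # gamma5 eigenvalue -1

xi_R = eigenvalue_on(right_handed, m, alpha)   # expected: -m * exp(+i alpha)
xi_L = eigenvalue_on(left_handed, m, alpha)    # expected: -m * exp(-i alpha)
mixed = eigenvalue_on([1.0, 0, 1.0, 0], m, alpha)  # no definite chirality
```

A spinor without definite chirality is not an eigenvector for $\alpha \neq 0$, which is why `mixed` comes out as `None`.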
Assuming that the zero mode dominates the contributions to the Green's function in the $\eta = -1$ instanton background close to its centre $x_0$, we thus arrive at the approximation of Eq. (S59), which captures the dominant contributions both close to the centre and far away from it. Here, $iS_{\rm cont}(x, x')$ is the contribution from the continuum spectrum and $iS_{0\rm inst}(x, x')$, given in Eq. (S60), is the propagator in the trivial background with vanishing gauge fields, whose derivation from a spectral decomposition involving the $\vartheta$-adjoint inner product is presented in Section S6. Furthermore, we have explicitly inserted the dependence on the translational coordinates $x_0$ of the instanton. The last equality in Eq. (S59) follows from noting that $iS_{0\rm inst}$ has a spectral decomposition purely in terms of continuum modes and that $iS_{0\rm inst}(x, x') \approx iS(x, x')$ for $|x^2|, |x'^2| \gg \rho^2$ is an approximation to the Green's function in the instanton background that is valid at large distances from the centre of the instanton. In Eq. (S60), we have chosen the $\epsilon$-prescription corresponding to the Feynman propagator, although other boundary conditions are of course also of interest, e.g. in view of applications within the Schwinger-Keldysh formalism. The Fourier integral can be evaluated straightforwardly, but the explicit result is not needed here.
The propagator in the $\eta = +1$ instanton background follows from the $\eta = -1$ case by switching the chiral block of the zero mode in Eq. (S57), using the resulting right-handed zero mode $\phi_{0R}$ in place of $\phi_{0L}$ in Eq. (S59), and replacing $\alpha \to -\alpha$. For a background consisting of a dilute gas of $n$ instantons and $\bar n$ anti-instantons with centres $x_{0,\nu}$, $x_{0,\bar\nu}$, the propagator can again be approximated by the ordinary contribution plus a sum over the zero-mode contributions of the instantons and anti-instantons, Eq. (S61). To end this section, we note that, using the results of Ref. [35], the determinant of the Minkowski-space operator $i\slashed{D} - m_R - i\gamma_5 m_I$ can be obtained from the Euclidean result of Eq. (S32) by analytic continuation of the time interval $T_E \to iT$ (with $T_E$ and $T$ referring to the Euclidean and Minkowskian time intervals of the spacetime volumes $VT_E$ and $VT$, respectively), Eq. (S62). In physical quantities, it is actually the ratio $\det(i\slashed{D} - m e^{i\alpha\gamma_5})/\det(i\slashed{\partial} - m e^{i\alpha\gamma_5})$ (and the corresponding one in Euclidean space) that enters, and for such ratios the $T$-dependence cancels. It is shown in Ref. [35] that the $T$-dependence appears only in the integral over the collective time coordinate of the instanton, which in our case originates from the time-translational zero mode of the gauge-field fluctuations (see Eqs. (S71), (S72) below). Therefore we simply have Eq. (S63). This means, in particular, that the only dependence on the chiral phase $\alpha$ again comes from the zero modes of $\slashed{D}_E$ alone. We therefore define the quantity in Eq. (S64), where $\Theta$ is a positive real number. As follows from the discussion in Section S2.2, $\Theta$ is the same for both instantons and anti-instantons, hence the omission of a label indicating $\eta$.

S3.1 Path integral in fixed topological sectors
In this section we consider correlation functions for massive fermions with chiral phases, working directly in Minkowski spacetime. We first derive the two-point correlator in a theory with a single fermion and then generalize the result to the cases of multiple fermions and higher-order correlators. For fluctuations about a given classical background (or about a saddle point on a certain complexified contour of path integration), the Green's function can be identified with the leading-order approximation to the two-point correlation function. In the case of the vacuum of a non-Abelian gauge theory, the correlation function is to be computed by summing over contributions coming from fluctuations around backgrounds from different topological sectors, i.e. of different winding number. In a dilute instanton gas approximation, such backgrounds are described by configurations with all possible numbers of (anti-)instantons, with arbitrary locations in spacetime. The required summation can be carried out along the lines of Ref. [16], though here we will track explicitly the factors of spacetime volume, rather than using instanton densities (which may be phenomenologically more accurate). In a theory with a single massive Dirac fermion, the two-point correlation function is given by Eq. (S65), where $S$ is the Minkowskian action and $Z$ the partition function. In order to relate this to the previously obtained Green's functions in a one-(anti-)instanton background, we denote the numbers of $\eta = -1$ and $\eta = 1$ instantons in the spacetime volume $VT$ under consideration by $\bar n$ and $n$, respectively. Requiring that the saddle points of the action take finite values implies vanishing physical fields at the spacetime boundary at infinity [17]. For the field $A$, this still allows pure gauge configurations, while the winding number $\Delta n = n - \bar n$ is topologically restricted to integer values. 
Consequently, because the topological term is a total divergence, configurations with different values of ∆n have different boundary conditions for the gauge field configuration.
These therefore lead to separate contributions to the path integral. In order to add up these pieces to obtain the partition function or an observable, we need to take into account the fact that the vacuum state is a superposition of configurations with all Chern-Simons numbers, i.e. (up to an irrelevant normalization factor) [4,5] $|\mathrm{vac}\rangle = \sum_{n_{\rm CS}} |n_{\rm CS}\rangle$. (S66) Here, $|n_{\rm CS}\rangle$ is a state with a fixed Chern-Simons number. The states are generally also characterized by a vacuum angle, as reviewed in Section S5.5. The vacuum angle $\theta$ does not explicitly appear here since we choose to absorb it in the topological Lagrangian term $\theta\, \mathrm{tr} F \tilde F/(16\pi^2)$, where $F$ denotes the field strength tensor of the gauge field, $\tilde F$ its dual and $\theta$ is the vacuum angle of the gauge theory under consideration. It is easy to see that the following arguments do not rely on whether the phase is attributed to the state $|\mathrm{vac}\rangle$ or to the Lagrangian. We choose the latter option in order to simplify notation. There are then distinct path integrals with different boundary conditions for each winding number $\Delta n = n - \bar n$ contained in the spacetime volume. This is because in regular gauge, the integral over the topological term is determined by the configuration of the gauge field at infinity, where the boundary conditions are imposed. It also implies that the individual contributions must be evaluated in the limit $VT \to \infty$, which turns out to be of substantial consequence. We therefore consider these pieces separately.
First we have to specify the determinant of the Dirac operator in a general background with winding number $\Delta n = n - \bar n$. Naively one may write it as in Eq. (S67). However, this would lead to an overcounting of the vacuum fluctuations from the domains of spacetime far away from instantons or anti-instantons; recall that e.g. the propagator reduces to its vacuum form in those regions, cf. Eq. (S61). In order to count these fluctuations for the trivial background once and only once, the correct contribution is, instead of Eq. (S67), the one given in Eq. (S68), which can be seen to follow formally from Eq. (S61) and where we have used Eqs. (S33), (S62), (S64) and the fact that $\Theta$ is independent of the winding number $\eta$. Similarly, for the functional determinant of the gauge and ghost fields, we have Eq. (S69), where $\bar A$ denotes the background gauge-field configuration and a prime on the determinant indicates that factors from zero eigenvalues have been deleted. Here $\det_A$ represents the functional determinant of the gauge and ghost fields in the one-instanton backgrounds. We have used that the determinants for $\eta = 1$ and $\eta = -1$ are identical, as follows from the fact that the instanton and anti-instanton backgrounds are related by parity conjugation.
For notational convenience, we introduce the definition in Eq. (S70). Then for a two-point fermionic correlation function, we have to evaluate the contributions in Eq. (S71). Here, $|n\rangle_{\rm in/out}$ are Heisenberg states at times $\mp T/2$ with well-defined Chern-Simons number, $\mathcal{D}A_{\bar n, n}$ stands for the restriction of the path integrals to fluctuations about the configuration with $\bar n$ instantons with $\eta = -1$ and $n$ with $\eta = +1$, and the classical Euclidean action is $S_E = 8\pi^2/g^2$ (before adding the topological term). Note that the classical action for the $\vartheta$-dependent instanton solution is nevertheless $\vartheta$-independent, cf. Ref. [35]. This is also assumed for the topological contribution to the action. The collective coordinates corresponding to dilatational and gauge-orientation zero modes are integrated through $d\Omega_{\bar\nu,\nu}$, and $J_{\bar\nu,\nu}$ are the Jacobians that arise when trading the zero modes for collective coordinates, which are derived for Euclidean space in Refs. [13,19]. For the path integral in Minkowski spacetime, the Jacobians are purely imaginary because of the analytic continuation of the collective coordinate corresponding to time translations [35]. Furthermore, all determinants are understood to be renormalized. Regarding the bosonic fluctuations, one can use here the results of Ref. [35], which show how the integral over the bosonic fluctuations on a thimble (i.e. an appropriately chosen contour for the bosonic path integral) about an analytically continued complex saddle, when the zero modes are separated, is related to the functional determinant evaluated at the corresponding Minkowskian saddle. The combinatorial factor $1/(n!\,\bar n!)$ is due to the fact that exchanging any two locations $x_{0,\bar\nu}$ or $x_{0,\nu}$ results in the same configuration. 
We note that when integrating over fluctuations about all the dilute instanton backgrounds with finite action, we admit contributions from fluctuations that asymptotically take the form of plane waves which, despite having infinite action, do not contribute to the integral of the topological term in the Gaussian approximation. Thus this integral remains proportional to an integer and is given by $\theta (n - \bar n)$, as it appears in the result (S71). The contribution $Z_{\Delta n}$ from the configurations with $\Delta n$ to the partition function, which is necessary for normalization, is computed as in Eq. (S71), just with the factor $\psi(x) \bar\psi(x')$ deleted from the integrand; this gives Eq. (S72). Here, we have carried out the spacetime integrals over the instanton locations, resulting in powers of the spacetime volume. Since we are considering real time here, $\Delta n$ can be interpreted as the net change in Chern-Simons number over the time $T$, i.e. each path integral associated with $\Delta n$ corresponds to a transition between states with Chern-Simons number $m$ and $m + \Delta n$, as suggested by the notation in the first line of Eq. (S72). The factors $|\det(-\slashed{\partial}_E - m e^{i\alpha\gamma_5})|_{T_E \to iT}$ and $(\det_{\bar A = 0})^{-1/2}$ are common to all $Z_{\Delta n}$ and to the correlation functions in backgrounds with any fixed $\Delta n$. They are thus overall factors that cancel in any physical quantity. To simplify the notation, we drop these factors below.
In order to evaluate the fermion correlation (S71), we first notice that for dilute instantons in a fixed configuration, as discussed around Eq. (S61), the correlation agrees with its form in the zero-instanton background almost everywhere, except near the locations of the anti-instantons and instantons. Now for fixed $x$ and $x'$, each spacetime integral $dx_{0,\bar\nu}$ and $dx_{0,\nu}$ sweeps over the point $(x + x')/2$ once, thus leading to $\bar n$ contributions with $\eta = -1$ and $n$ with $\eta = +1$. For a single one of these integrals, e.g. for the location of an $\eta = -1$ instanton, this yields anomalous terms of the type given in Eq. (S73), where the dots represent the contributions to the propagator from the zero modes of the (anti-)instantons whose centres were not integrated over (see Eq. (S61)), and $h(x, x')$ is defined as a block-diagonal matrix (with two identical blocks) satisfying Eq. (S74). Unfortunately, we do not find an analytic expression for this matrix-valued function, which depends on the invariant distance $(x - x')^2$ only. Note though that this function is independent of $VT$ as we take this spacetime volume to infinity. The overlap integral $h(x, x')$ as defined above depends on other collective coordinates of the instanton, e.g. the scale $\rho$. As such, insertions of $h(x, x')$ do not factor out of the integration over the collective coordinates. We therefore choose to approximate $h(x, x')$ by its average $\bar h(x, x')$ over the collective coordinates, defined in Eq. (S75). This approximation allows us to carry out all spacetime integrals over the instanton locations and collective coordinates. Neglecting contributions for which two or more of these locations coincide, the result is Eq. (S76), with $\kappa$ given in Eq. (S77), and where $I_\alpha(x)$ is the modified Bessel function. Recall that the Jacobian $J$ contains an imaginary factor $i$ and that $\Theta$ is a positive real number, so that $\kappa$ is defined to be a positive number as well. 
Correspondingly, the contributions to the partition function are found to be those of Eq. (S78). Notice that all terms appearing in the fermion correlation (S76) as well as in the partition function (S78) are multiplied by the same global phase $\exp(i \Delta n (\alpha + \theta))$. This is illustrated in Figure 2 and can be attributed to the fact that the fermion determinants and topological phases multiply all operators computed in the path integral, no matter whether these are fermionic or not or whether they are induced by instantons.
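The appearance of modified Bessel functions in the sector contributions can be made plausible numerically. The sketch below assumes that, after the volume integrals, a sector of fixed $\Delta n$ is weighted by the sum of $w^{n+\bar n}/(n!\,\bar n!)$ over $n - \bar n = \Delta n$ (the combinatorial factor and volume powers described in the text), and uses a real positive weight $w$ as a Euclidean-like stand-in for the actual complex combination; the global phase is left out. It compares this sum with an independent evaluation of $I_{\Delta n}(2w)$ through the standard integral representation $I_k(x) = \frac{1}{\pi}\int_0^\pi e^{x\cos t}\cos(kt)\,dt$ for integer $k$. Function names and values are illustrative.

```python
import math


def sector_sum(delta_n, w, n_max=60):
    """Sum w**(n + nbar) / (n! nbar!) over n, nbar >= 0 with n - nbar = delta_n."""
    total = 0.0
    for nbar in range(n_max):
        n = nbar + delta_n
        if n < 0:
            continue
        total += w ** (n + nbar) / (math.factorial(n) * math.factorial(nbar))
    return total


def bessel_i_integral(order, x, steps=400):
    """I_order(x) for integer order from its integral representation.

    Uses the trapezoid rule on [0, pi], which converges very quickly here
    because the integrand extends to a smooth even periodic function.
    """
    h = math.pi / steps
    total = 0.5 * (math.exp(x) + math.exp(-x) * math.cos(order * math.pi))
    for j in range(1, steps):
        t = j * h
        total += math.exp(x * math.cos(t)) * math.cos(order * t)
    return total * h / math.pi


w = 1.5  # illustrative stand-in for kappa * VT (real and positive here)
checks = {
    dn: (sector_sum(dn, w), bessel_i_integral(dn, 2.0 * w))
    for dn in (-2, 0, 3)
}
```

Each pair in `checks` should agree to high precision, reflecting that the fixed-$\Delta n$ sum is the power series of $I_{\Delta n}(2w)$.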

S3.2 Boundary conditions in the saddle point expansion
For an infinite spacetime volume, in order to have saddle points with finite action, one should impose on the path integral boundary conditions with vanishing physical fields at infinity [17]. This is a standard procedure. As we will see in Section S3.3, its correct implementation turns out however to be decisive for the calculation of correlation functions, in particular when α + θ = 0. We therefore consider here some aspects of boundary conditions in the present context of the saddle point expansion around instantons.
In order to appreciate the necessity of using boundary conditions at infinity, we consider in contrast how we would need to proceed when restricting the path integral to a finite region of spacetime. Fixing the boundary conditions up to gauge transformations on its finite surface, which is homeomorphic to a three sphere, leads to the quantization of topological sectors. However, while this is the procedure that would lead to the conclusion that CP violation is present in the strong interactions (see Section S3.3), there is no physical principle that would allow for only considering a single boundary configuration, for example the vanishing configuration, for the physical fields around a finite spacetime volume. This matter is reviewed in Section S5.5, where we argue instead that one should sample over a weighted range of boundary conditions that can theoretically be obtained by projecting the Schrödinger wave functional on field eigenstates. Practically, in the presence of interactions, the wave functional is not known in most of its details but yet some essential properties may be inferred. In the present case, this feature is the vacuum angle θ. Now, when sampling over a range of boundary conditions, the winding number is no longer restricted to integer values, such that it may not be clear why θ should behave as an angular variable. However, since the wave functional is not known in detail, the only practicable way of evaluating the path integral again goes via taking the boundaries to infinity. Then, even for a continuum of boundary conditions weighted by the wave functional, the only saddle points of the Euclidean action occur for vanishing physical field configurations, for which the winding number again takes integer values. We note that taking the volume of Euclidean spacetime to infinity projects on the lowest accessible energy eigenstate. 
(After all, given that only vanishing physical fields on the boundary at infinity lead to saddle points, it should be clear that in this approximation we cannot calculate correlation functions for different states, so that this projection does not amount to an additional restriction.) Nonetheless, the configurations of fixed ∆n do not change the vacuum angle θ such that this quantum number of the vacuum state remains conserved also in the limit of an infinite spacetime volume.
Another point of view on the boundary conditions on finite spacetime volumes is to leave these open and to integrate out the fluctuations over the remainder of spacetime. We carry this out in Sections S4 and S5, where it is seen that CP phases from the two partitions of spacetime cancel. This means that, after all, the boundary condition imposed on the remainder at infinity remains relevant.
From these considerations, we conclude that when using the saddle point expansion to compute path integrals in the presence of gauge theory instantons (there appear to be no viable alternatives as far as analytical approximations are concerned), the boundary conditions should be given by vanishing physical fields, and for consistency these should be imposed at spacetime infinity. In the following subsection, we show that this has material consequences for the question of CP violation in the strong interactions.

S3.3 Summation over topological sectors
The total partition function, i.e. the transition amplitude from the vacuum $|\mathrm{vac}\rangle$ onto itself, is given by Eq. (S79). There, we have regulated the sum over topological sectors by introducing a cutoff $N$. While we occasionally suppress the arguments of the partition functions, we have made explicit there that $Z_{\Delta n}$ is a function of $VT$ as per Eq. (S72). Eventually, $N$ and $VT$ are to be taken to infinity. This can lead to singular behaviour when considering $Z$ in isolation, as we discuss at the end of Section S3.4, but can be carried out for normalized correlation functions. In particular, the fermion correlator in the vacuum (S66) is given by Eq. (S80). We emphasize that the order of the limits follows from the fact that integer winding numbers $\Delta n$ are a consequence of the requirement of finite saddle-point actions in infinite spacetime volumes $VT \to \infty$ [17], as explained in Section S3.2. To meet this requirement, the physical gauge fields must vanish at infinity, leaving the possibility of pure gauge configurations on the boundary. Topologically, the boundary of the spacetime is homeomorphic to $S^3$. The maps between the latter and the gauge group can be characterized by equivalence classes corresponding to the winding numbers $\Delta n$. Discrete sectors $\Delta n$ may also arise when imposing instead boundary conditions on some finite subvolume, but then there is no reason (such as finite action at the saddle points) to fix these constraints, which are hence unphysical. (Instead, one should sum over all possible boundary conditions for the subvolume, as done in Section S5.1.) Therefore, only the restriction to field configurations with vanishing physical boundary conditions in infinite spacetime volumes leads to integer $\Delta n$.
For more details on the derivation of Eq. (S80), we note that the connected correlator is constructed as the limit of infinite $N$ for a series of normalized expectation values. The latter are given by the ratios of corresponding partial sums for given $N$ and $VT \to \infty$. (This is stated for definiteness; one may choose to use alternative decompositions of the sums.) In effect, we take the limit of $N$ going to infinity simultaneously in the numerator and the denominator in order to make sense of the quantities whose limit in isolation is ill-defined. Equation (S80) with its prescription of a common limit also follows more formally when introducing fermionic currents in the partition function for fixed $N$ and defining the connected fermion correlators in terms of the derivatives of its logarithm with respect to these sources, Eq. (S81). Regarding the limit of infinite spacetime volume, in Eq. (S80) we have used the result $\lim_{x \to \infty} I_{\Delta n}(i x\, e^{-i0^+})/I_{\Delta n'}(i x\, e^{-i0^+}) = 1$. The factor $e^{-i0^+}$ is due to the rotation $T_E \to i e^{-i0^+} T$, so that the Jacobian $J$ actually contains a factor of $i e^{-i0^+}$. The limit however also holds for real positive arguments in the modified Bessel functions, such that the steps presented here can also be applied in Euclidean space. With the leading asymptotic behaviour of the modified Bessel functions then being independent of $\Delta n$, the remaining terms with exponents of $\Delta n$ in Eqs. (S76) and (S78) lead to a geometric series that cancels between numerator and denominator. For some values where $(\alpha + \theta) = 2\pi q$, with $q$ a rational number, there occur partial sums in the numerator and denominator where the geometric series evaluate to zero. In this case it is not appropriate to proceed by going beyond the leading asymptotic behaviour of the modified Bessel functions to the next term, which is suppressed by an extra power of $\kappa VT$ (cf. Eq. (S157)). 
For each sector ∆n, this corresponds to an arbitrarily small (as V T → ∞) correction to the leading contribution and therefore cannot be reliably calculated in any approximation. Rather, the problematic partial sums should be dealt with by taking V T → ∞ first for general (α + θ) and then taking the limit of the problematic rational value for the ratio of the partial sums. Around that rational value, there is always a neighbourhood where the partial sum is well defined using the leading asymptotic behaviour of the modified Bessel functions, such that the limit for the ratio of the partial sums indeed exists. Another point to note is that the fermion determinant included in κ contains a leading factor m that cancels with the explicit occurrence of m −1 in the final expression in Eq. (S80), leading to a finite nonperturbative correction even in the massless limit.
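The dependence on the order of limits can be illustrated with a toy ratio of partial sums. The sketch below is not the full correlator: it assumes that the anomalous term in sector $\Delta n$ carries a Bessel function of order shifted by one relative to the partition function (as results from one extra factor of $\bar n$ in the sum), uses real positive arguments (the Euclidean-like case mentioned above), and takes illustrative values for the cutoff, the argument and $\phi = \alpha + \theta$. The helper names are hypothetical.

```python
import cmath


def bessel_i(order, x, rel_tol=1e-17):
    """Modified Bessel function I_order(x), integer order, real x > 0 (power series)."""
    k = abs(order)                 # I_{-k}(x) = I_k(x) for integer k
    half = 0.5 * x
    term = 1.0
    for j in range(1, k + 1):      # first term: (x/2)^k / k!
        term *= half / j
    total = term
    j = 0
    while True:
        j += 1
        term *= half * half / (j * (j + k))   # ratio of consecutive series terms
        total += term
        if term < rel_tol * total:
            return total


def ratio_of_partial_sums(n_cut, x, phi):
    """Toy normalized anomalous term: sectors |delta_n| <= n_cut, phase exp(i*delta_n*phi)."""
    sectors = range(-n_cut, n_cut + 1)
    num = sum(cmath.exp(1j * d * phi) * bessel_i(d + 1, x) for d in sectors)
    den = sum(cmath.exp(1j * d * phi) * bessel_i(d, x) for d in sectors)
    return num / den


phi = 1.0  # illustrative value of alpha + theta

# Large argument (volume) at fixed cutoff: the Bessel ratios approach 1,
# the geometric sums cancel, and no extra phase survives.
volume_first = ratio_of_partial_sums(2, 400.0, phi)

# Sum over (effectively) all sectors at fixed argument: an extra phase
# exp(-i*phi) appears, which would turn exp(i*alpha) into exp(-i*theta).
sectors_first = ratio_of_partial_sums(40, 5.0, phi)
```

With these inputs, `volume_first` lies close to 1 (with corrections that shrink as the argument grows), while `sectors_first` reproduces $e^{-i\phi}$ to numerical precision, which is the mismatch of phases discussed in the text.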
We next consider what would happen if we summed over the topological sectors in a finite spacetime volume first and took the latter to infinity in the last step. As discussed in Section S3.2, on finite surfaces one should integrate over a set of physical boundary conditions given by the wave functional. When instead using vanishing physical fields as boundary conditions, one is forced to impose these at infinity. If the order of the limits were nonetheless immaterial, one could still consider taking the subvolume to infinity after the summation over $\Delta n$. However, the order is of crucial relevance for the form of the final result, because if we were not taking $VT \to \infty$ first, we would instead obtain
$$\sum_{n,\bar n \geq 0} \frac{1}{n!\,\bar n!} \Big[ \bar h(x, x') \big( \bar n\, m^{-1} e^{i\alpha} P_L + n\, m^{-1} e^{-i\alpha} P_R \big) (VT)^{\bar n + n - 1} + iS_{0\rm inst}(x, x')\, (VT)^{\bar n + n} \Big] (-i\kappa)^{\bar n + n}\, e^{i \Delta n (\alpha + \theta)} = \Big[ -\big( e^{-i\theta} P_L + e^{i\theta} P_R \big) \frac{i\kappa}{m}\, \bar h(x, x') + iS_{0\rm inst}(x, x') \Big] e^{-2i\kappa VT \cos(\alpha + \theta)}. \quad \text{(S82)}$$
Analogously, taking the $VT \to \infty$ limit in the end, the total partition function would be
$$Z \to \sum_{n,\bar n} \frac{1}{n!\,\bar n!} (-i\kappa VT)^{\bar n + n}\, e^{-i(\bar n - n)(\alpha + \theta)} = e^{-2i\kappa VT \cos(\alpha + \theta)}. \quad \text{(S83)}$$
(From this expression, one may read off the $\theta$-dependence of the vacuum energy density as $E(\theta)/V = 2\kappa \cos\bar\theta$, where $\kappa > 0$. For pure gauge theories without fermions the energy of the $\theta$-vacuum is $E(\theta)/V = -2\kappa \cos\theta$ with $\kappa > 0$. The relative minus sign is due to the minus sign attached to the Dirac mass term, see Eqs. (S32) and (S64). The sign can be removed by shifting either $\theta$ or $\alpha$ by a value of $\pi$ in their definitions.) For the two-point function, we see that different phases multiply the left and right anomalous terms when compared to Eq. (S80). One may notice here that in the limit $|\Delta n| \ll n + \bar n$, which gives the dominant contributions to the binomial distribution for $VT \to \infty$ [16], there are no relative chiral phases between the anomalous terms involving $\bar h$ and the term containing $iS_{0\rm inst}(x, x')$. This would indicate that any CP-violating contribution from a background with $|\Delta n| \ll n + \bar n$, which can e.g. be measured by an observer in the same background, is suppressed by the volume. The fact that in Eq. (S82) the CP violation is enhanced follows from a cancellation of phases that is a consequence of the exchange of limits in Eq. (S80). We comment on the relevance of the different phases appearing in Eqs. (S80) and (S82) in the following.
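The closed forms obtained when the sectors are summed at finite volume can be checked by brute force. The following sketch truncates the double sum over $n, \bar n$ and compares it with the exponential on the right-hand side of Eq. (S83), and it extracts the phases that multiply the left- and right-handed anomalous terms in Eq. (S82). It assumes the weights $\bar n\, e^{i\alpha}$ and $n\, e^{-i\alpha}$ for the two chiralities and the sector phase $e^{i(n - \bar n)(\alpha + \theta)}$; the numerical values of $\kappa VT$, $\alpha$ and $\theta$ are arbitrary illustrations, and the function name is hypothetical.

```python
import cmath
import math


def double_sum(z, phi, weight, n_max=60):
    """Sum over n, nbar >= 0 of weight(n, nbar) * z**(n+nbar)/(n! nbar!) * exp(i*(n-nbar)*phi)."""
    total = 0j
    for n in range(n_max):
        for nbar in range(n_max):
            coeff = z ** (n + nbar) / (math.factorial(n) * math.factorial(nbar))
            total += weight(n, nbar) * coeff * cmath.exp(1j * (n - nbar) * phi)
    return total


kappa_vt, alpha, theta = 2.0, 0.4, 0.9   # illustrative values only
phi = alpha + theta
z = -1j * kappa_vt                        # the combination -i * kappa * VT

# Partition function: should equal exp(-2i * kappa*VT * cos(alpha + theta)).
z_total = double_sum(z, phi, lambda n, nbar: 1)
z_closed = cmath.exp(-2j * kappa_vt * math.cos(phi))

# Phases multiplying the anomalous terms, relative to (-i*kappa/m) * Z.
phase_left = cmath.exp(1j * alpha) * double_sum(z, phi, lambda n, nbar: nbar) / (z * z_total)
phase_right = cmath.exp(-1j * alpha) * double_sum(z, phi, lambda n, nbar: n) / (z * z_total)
```

In this order of limits, `phase_left` and `phase_right` come out as $e^{-i\theta}$ and $e^{i\theta}$: the mass phase $\alpha$ has been traded for $-\theta$, which is the misalignment discussed in the text.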
We observe that in Eq. (S80) the chiral phase multiplying the anomalous term proportional to $\bar h$ is the same as the one that appears together with $iS_{0\rm inst}$ (see Eq. (S60)). Furthermore, the anomalous term has the expected exponential suppression compared to the contributions corresponding to regions that are not influenced by the instantons. As a consequence, this correlation function does not exhibit CP violation. The instanton effects are often approximated in terms of an effective operator [12,13], which in our case, based on Eq. (S80), is given by Eq. (S84), where at leading order in a gradient expansion $\Gamma$ is a real number that can in principle be inferred from Eq. (S80), in particular after an appropriate treatment of the dilatations, where the symmetry is broken radiatively. This corresponds to an effective mass with a chiral phase that is aligned with the one in the Dirac operator (S13). This alignment then leads to the absence of relative CP-odd phases in different contributions to an amplitude, as illustrated in Figure 2. Further, when using the operator (S84) together with the Dirac mass in order to build an effective theory valid below the scale of chiral symmetry breaking, as we outline in Section S3.6, there is only one CP-odd phase, which can be removed by a field redefinition. Consequently, the theory explains the absence of CP-violating observables, such as the vanishing permanent electric dipole moment of the neutron [1,2] or the nonobservation of the decay of an $\eta'$-meson into two pions. This is to be compared with what one would infer from Eq. (S82), namely the operator in Eq. (S85). Here, the difference between the phase $-\theta$ and the phase $\alpha$ from a perturbative insertion of the mass $m$ in a fermion line would indicate a CP-odd phase that cannot be removed by a field redefinition. We emphasize that for Eqs. (S84) and (S85), no assumptions about the values of $\theta$ and $\alpha$ are made, which of course transform under chiral rotations of the fermion fields while leaving the sum $\alpha + \theta$ invariant. 
It should be noted that the phase in the operator in Eq. (S84) is compatible with the following selection rule implied by the anomalous Ward identity: the theory should be invariant under a chiral transformation supplemented with changes in $\alpha$, $\theta$ as given in Eq. (S86), where $\beta$ is the parameter of the transformation. This selection rule is usually invoked as a justification of an effective operator involving the $\theta$ parameter as in Eq. (S85); however, this is not the only possibility, and the result of (S84) is equally compliant with the selection rule. We stress again that, given our results for the fermionic fluctuation determinants, our expressions capture the full dependence on the chiral angle $\alpha$. It can also be observed that while Eq. (S80) shows that the breaking of the axial U(1) symmetry due to the fermion mass is enhanced by the effect of the instantons in a way that is independent of the absolute value of the mass, this still leaves open the question of how the correlations and the low-energy effective theory behave in the massless limit.
We emphasize that the results obtained by taking the limit of infinite spacetime volume before the sum over topological sectors, as is in order for spacetimes that have boundaries at infinity, apply both for finite and infinite spatial volumes. Thus, our results hold not only for Minkowski spacetime, but also for spacetimes like $\mathbb{R} \times S^3$ in which space is the compact hypersphere. Crucially, the results are valid as long as there is an infinite extent of time, as required when considering the $\theta$-vacua as in and out states.
Nonetheless, topological sectors with fixed winding number $\Delta n$ are well-defined within finite spacetimes with periodic boundary conditions [43], i.e. without boundaries. The periodicity and finiteness remove the necessity of specifying vacuum boundary conditions of a certain Chern-Simons number. This precludes the interpretation of the path integral within a sector of fixed $\Delta n$ as a transition amplitude between vacua with Chern-Simons numbers differing by $\Delta n$. Because of this, there is no principle (save for some correspondence with the infinite-volume limit) that requires certain weights for the contributions to the path integral from different $\Delta n$. We may note that the quantum equations of motion (which may also apply to an observer) are separately independent for each sector $\Delta n$ in the periodic spacetime, i.e. $\Delta n$ will appear fixed within each such sector. For an observer that can be understood as a local excitation of fields, one expects a unique value of $\Delta n$, which could be measured e.g. through the correlation function (S76). This leads for example to a permanent electric dipole moment of the neutron that depends on $\Delta n$ but is independent of $\bar\theta$, of whether there are additional topological sectors, and of whatever happens in these. Interferences between the different sectors will therefore not be seen by observers defined through local quantum fields; rather, they would require "super-observers" with access to all topological sectors. From the absence of material effects from interference it follows that the predictions for the $\theta$-vacuum in a spacetime that has boundaries at infinity, where the limit $VT \to \infty$ is to be taken before the summation over $\Delta n$, coincide with what is seen in a large but finite periodic spacetime with $\Delta n = 0$. In contrast to common expectations, the path integration over a finite spacetime within a single topological sector still complies with cluster decomposition up to volume-suppressed effects, as we discuss in detail in Section S5.2. 
Under the assumption of a finite volume of a spacetime without boundaries, the condition ∆n = 0 should be imposed in agreement with the observation that there is no spontaneous CP -violation, i.e. ∆n = 0 in the vacuum for any subvolume of physical spacetime.

S3.4 Several fermion flavours
The previous conclusions can be extended to correlation functions in theories with more fermion flavours. In a theory with $N_f$ Dirac fermions $\psi_j$, $j = 1, \ldots, N_f$, in the fundamental representation of the gauge group and with complex masses $m_j e^{i\alpha_j \gamma_5}$, one can consider correlation functions of the form where $\sigma = \{\sigma(1), \ldots, \sigma(N)\}$ is a set containing $N$ flavour indices (e.g. the list of all indices, a subset thereof or other variants, in which case some indices may be repeated), and we have not specified spacetime indices or the different possible Lorentz contractions in order to simplify the notation. As before, we construct the correlation function by summing over contributions from topological sectors with fixed winding number $\Delta n$: where $\bar\alpha$ denotes the argument of the determinant of the fermionic mass matrix, where $\Theta_j$ is defined for each flavour in analogy with Eq. (S64). As before, we have dropped factors involving the determinants of the free fermionic and bosonic fluctuation operators since they appear in both the (unnormalized) correlators and partition functions. Note that $\bar\Theta$ is also a positive real number. The partition functions $Z_{\Delta n}$, on the other hand, are now given by where we partly abbreviate the factors in the round bracket by $i\kappa_{N_f}$. Using propagators of the form of Eq. (S61) and approximating nontrivial integrals over the translational coordinates $x_{0,\bar\nu}$, $x_{0,\nu}$ by their averages over the remaining collective coordinates, as in Eqs. (S74), (S75), we have the following types of contributions:
• terms with only propagators as in the zero-instanton background,
• "diagonal" terms, which are obtained by summing over terms in which all zero modes correspond to a common (anti-)instanton,
• "off-diagonal" contributions, which mix zero modes from different (anti-)instantons.
For contributions with only propagators as in the zero-instanton background, the integrals over the centres are trivial and simply lead to Z ∆n j iS σ(j),0inst , so that the ensuing contributions to the full correlator are simply given by products of these propagators. The "diagonal" contributions involve overlap integrals over varying numbers of zero-modes of a single (anti-)instanton. When summing over (anti-)instantons, one always gets a factor of n (n), exactly as in the two-point function case analyzed before, resulting in contributions that go schematically as (for the case of instantons) In this equation σ p/q = {σ p/q (1), . . . , σ p/q (p/q)} are subsets of the set σ defined above, with p + q = N , σ p ∪ σ q = σ. P Rσq(j) are right-handed projectors for the flavour σ q (j), whileh q denotes a generalized tensorvalued overlap integral constructed from a product of q instanton zero-mode projectors, averaged over the collective coordinates of the instanton. As before, when computing contributions to the fermion correlation by taking the infinite volume limit, summing over ∆n and dividing by the partition function, the phases proportional toᾱ + θ drop out, and one ends up with contributions to the correlator of the form As in the single-flavour case, all the phases of the correlators are determined by the chiral phases in the mass matrices, and similar results hold for the diagonal anti-instanton contributions. The contributions to the correlators can be captured by effective operators whose α j -dependent phases are in accordance with the generalization of the selection rule of Eq. (S86) for N f flavours, which reads In particular, the 't Hooft interactions with N f flavours induced by (anti-)instantons correspond to diagonal contributions to correlators with N = N f pairs of fermions, p = 0 and q = N f , with the resulting effective vertices having the form where at leading order in a gradient expansion the Γ N f are constant. 
Note how the dependence on the chiral phases is such that all of these can be removed by the same redefinitions that get rid of the phases in the treelevel mass terms. Once again, had we done the summation over ∆n before taking the infinite volume limit, we would have obtained different phases, withᾱ replaced by −θ. For these 't Hooft interactions, the q = N f factors of m −1 σq(j) in Eq. (S93) are canceled with the factor of N f j=1 m j associated with the fermionic zero modes implicit in κ N f ∝Θ. Diagonal correlators with p = 0 but N < N f yield additional interaction vertices with fewer fermions, higher powers of m i and phases compatible again with the selection rule, confirming the symmetry arguments put forth for example in the context of SU(2) instantons in Ref. [44]. Finally, the off-diagonal terms involve contributions to the fermionic propagators coming from different instantons. These can be classified according to the number of different (anti-)instantons involved and the number of propagators corresponding to each (anti-)instanton. Each class has an associated combinatorial factor for the number of terms in the class contained in the product of fermion propagators of the form of Eq. (S61). For example, as we have seen, the diagonal class of single-(anti-)instanton contributions has an associated combinatorial factor of n(n). Now for the off-diagonal term, suppose we consider a class where m different instantons are involved. This amounts to m combinations from a set of size n and gives a combinatorial factor n!/(m!(n − m)!). In this case the integrals over the translational collective coordinates give now contributions proportional to n,n≥0 n−n=∆n Since κ N f ∝ e −SE , we see that these contributions have a higher suppression factor and are expected to be subdominant. Nevertheless, taking the limit of V T → ∞ before summing over ∆n and dividing by the partition function, the dependence on θ drops from the corresponding contribution to the correlator. 
Analogous results hold for other contributions involving anti-instantons, or mixed instantons and anti-instantons: In general one obtains modified Bessel functions multiplied by extra factors of κ N f and inverse powers of V T . This makes the terms subleading but also in such a way that the θ-dependence disappears from the final contributions to the correlators. Our closing remark for this section is that the partition function with Z ∆n given in Eq. (S91), exhibits a non-analytic behaviour in θ. Indeed, as pointed out above for V T → ∞, With θ ≡ᾱ + θ + N f π, the dependence on θ becomes proportional to a periodic delta function maximized at 0 and 2π. The corresponding Euclidean partition function is thus maximized at θ = 0, which complies with the general expectations argued for in Ref. [45], yet with a nonanalytic dependence on θ . This is to be contrasted with the standard result which, while having an Euclidean counterpart maximized at θ = 0, retains analyticity in θ. As discussed in Section S5.4, several standard results in the literature linking θ with CP violation through the topological susceptibility χ Ω ≡ ∆n 2 /Ω, where Ω denotes a spacetime volume, have been derived assuming an analytic dependence on θ in the partition function and thus do not necessarily apply given the ordering of limits proposed in this paper. Let us remark that the partition function itself is not observable, and the singular limit is not pathological because the resulting correlation functions have a well defined limit. The possibility of a non-analytic dependence on θ has been succinctly considered in Ref. [23], where nonanalyticity was linked to periodicity in θ. We emphasize again that the latter is related to the quantization of the topological charge, which as argued in Section S3.2 is required for an infinite spacetime.
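The contrast between an analytic and a non-analytic dependence on the vacuum angle can be illustrated numerically. The following is a minimal sketch under stated assumptions, not the calculation of this paper: `standard_ratio` uses the generating-function identity $\sum_{\Delta n} I_{\Delta n}(x)\,e^{i\Delta n\theta'} = e^{x\cos\theta'}$ for a real argument `x` (a stand-in for the volume-dependent argument), while `limit_first_ratio` assumes that all sectors have become equally weighted and truncates the sum at a hypothetical cutoff `n_max`. All function names and parameter values are illustrative.

```python
import math

def standard_ratio(theta, x):
    """Z(theta')/Z(0) when sectors are summed at finite x: exp(x (cos theta' - 1))."""
    return math.exp(x * (math.cos(theta) - 1.0))

def limit_first_ratio(theta, n_max):
    """Equal-weight sum of exp(i dn theta') over |dn| <= n_max, normalised at theta' = 0.

    This is a normalised Dirichlet kernel; it approaches a periodic
    delta function as the (hypothetical) cutoff n_max grows.
    """
    total = sum(math.cos(dn * theta) for dn in range(-n_max, n_max + 1))
    return total / (2 * n_max + 1)

for theta in (0.0, 0.5, math.pi, 2.0 * math.pi):
    print(f"theta'={theta:.3f}  "
          f"standard={standard_ratio(theta, 5.0):.3e}  "
          f"limit-first={limit_first_ratio(theta, 200):+.3e}")
```

In this sketch both expressions are maximal at $\theta' = 0$ and $2\pi$, but only the first one varies smoothly in between; the second collapses towards zero away from the maxima as the cutoff is raised.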

S3.5 More general correlation functions
The 2N f -point functions discussed in Section S3 correspond to expectation values of observables, up to their gauge-covariant nature and the fact that they transform under redefinitions of the fermion fields, in particular under chiral rotations of these. A gauge-invariant observable can be obtained e.g. by taking the trace of the gauge indices. We have obtained these correlations to tree-level accuracy in an expansion around multi-instanton backgrounds (based on additional approximations spelled out in Section S3). Here, we comment on how to obtain any correlation function in addition to the 2N f -point fermion correlation, where also loop corrections may be included. As an example for such a general correlation function, one may take the stress-energy tensor. It is the source of gravitational fields and should therefore be used to infer the vacuum energy, in contrast to the logarithm of the partition function Z that is sometimes used for this purpose in the context of θ-vacua.
For simplicity, we consider again the case of a single fermion flavour. To compute the expectation value of an observable O to some approximation, Eq. (S71) generalizes to (dropping as before the contributions from the free determinants) × e −SE(n+n) (n+n) (−Θ)n +n d 4 z 1 · · · d 4 z u F (z 1 , . . . , z u ; z 1 , . . . , z t ) e i∆n(α+θ) = n,n≥0 n−n=∆n 1 n!n! d 4 z 1 · · · d 4 z u nḠ1 (z 1 , . . . , z u ; z 1 , . . . , z t ) + nḠ 1 (z 1 , . . . , z u ; z 1 , . . . , z t ) (V T )n +n−1 + G 0inst (z 1 , . . . , z u ; z 1 , . . . , z t ) (V T )n +n (iκ)n +n (−1) n+n e i∆n(α+θ) = d 4 z 1 · · · d 4 z u I ∆n+1 (2iκV T )Ḡ1 + I ∆n−1 (2iκV T )Ḡ 1 iκ + I ∆n (2iκV T ) G 0inst (−1) ∆n e i∆n(α+θ) . (S100) The function F can be represented by a sum of Feynman diagrams, i.e. as a sum of products of two-point Green's functions and their derivatives in the multi-instanton background. For the fermions, these Green's functions may be approximated by iS n,n given in Eq. (S61), but other species, e.g. gauge bosons, can contribute as well. For these additional fields we assume that, in analogy to the fermionic propagators, their two-point functions can be approximated by the free contribution plus a sum over contributions peaking at the centres of each (anti-)instanton. Each of the two-point functions is evaluated at a given pair of the spacetime arguments of F. (For gauge bosons and self-interacting scalars, these arguments may coincide according to the Feynman rules.) The integrations over the coordinates z i correspond to loop integrals. In the second step, we have carried out the integrations over the collective coordinates in analogy with Eq. (S76). Organized in powers of V T , this defines the contributions G 0inst from the bulk of the spacetime volume where there are no instantons, as well asḠ 1 andḠ1 which are obtained for one instanton sweeping over F. 
This gives $\bar G_1$ and $\bar G_{\bar 1}$ as generalized overlap integrals averaged over the collective coordinates $\Omega$, involving products of free propagators times one or more contributions to two-point functions (either fermionic or bosonic) arising from a single (anti-)instanton.
Contributions of lower order in V T , corresponding to more than one instanton sweeping over F at a time, are suppressed exponentially and have thus been omitted in Eq. (S100). In the third step, we have carried out the summation and suppressed the spacetime arguments of G 0inst ,Ḡ 1 andḠ1. Note also that O in general has a spinor structure. In contrast to Eq. (S76), for which this structure is presented, we do not explicitly show the chiral phases, which are e ±iα for left and right-chiral contributions, respectively, because the only phases in G 0inst ,Ḡ 1 andḠ1 can originate from the mass term. In order to evaluate the expectation value by first taking V T → ∞ and then summing over the topological sectors ∆n, we note that the volume-dependence of the loop integrand can be isolated as for |x| → ∞ and |arg(x)| < π 2 , (S101) and we can apply the same arguments as in Section S3. (Recall that the time interval is to be understood as T e −i0 + , so that we can apply the asymptotic expansion of Eq. (S101).) Taking limits in this order thus leads to Again, we observe that interferences from contributions from different topological sectors ∆n cancel when normalizing with the partition function. The only phases for the terms in square bracket are e ±iα , and they appear in accordance with the breaking of chiral symmetry by the mass term. However, unless additional CPodd phases are introduced in the theory, the phase α can be removed by field redefinitions and is unobservable. This cancellation does not hold when taking the limit V T → ∞ after the summation over the topological sectors ∆n, which leads to (S103) Here, in addition the phase θ + α appears which is independent of field redefinitions, according to Eq. (S86), and generally leads to CP -violating observables. Under certain conditions, these general correlation functions can be obtained using the effective operators from Eqs. (S84) and (S85). 
To see this explicitly, we assume that the loop integrals are not ultraviolet sensitive in the sense that only contributions with |(z i − z j ) 2 | 2 are relevant. We can then assume the arguments of the Green's functions to be sufficiently separated such that we do not have to account for contributions where two of the Green's functions iS n,n are to be evaluated close to the same instanton. Recalling that F depends on the two-point fermionic Green's functions, we denote F = F({iS (i) }, ...) where the dots represents all other arguments. Here the superscripts (i) (and (j) below) are used to denote the different two-point functions appearing in expansions of F. Furthermore, close to an (anti-)instanton we only collect the contributions from the corresponding fermionic zero-mode. In this case, within Eq. (S100), we can identify whereh is given in Eq. (S75). Then, following Section S3, all anomalous contributions can be approximated to linear order in κ as The term with G 0 is just the contribution that would arise in a background without instantons, while the term with G 1 represents the leading instanton-effects. When taking V T → ∞ first, we are led to substitute while, when summing over ∆n first, we take for i = j and iS (i) = iS 0inst for i = j .
(S107) Now, we can indeed observe that the result (S105) can be obtained by using effective operators of the form (S84) or, respectively, of (S85) to linear order. The effective operators cannot be used for higher-order calculations in κ. For example, the effective operator (S85) would imply that the chiral phases e ±iθγ 5 are additive in quantities in higher power of κ. However in case of summing over ∆n before taking V T → ∞ and when replacing more than one fermion line with the interaction induced by the same (anti-)instanton, one obtains only one phase factor exp (±iθγ 5 ). Only in case a contribution to the correlation function involves the effect of more than one (anti-)instanton, the phases are additive. When aiming to go beyond linear order in κ, one should note however that any explicit dependence on θ can only enter via the global phases ∆n(α + θ) that appear for the path integrals over the individual topological sectors. (We reemphasize that these global phases are immaterial when taking V T → ∞ before summing over ∆n. Then, no misalignment of chiral phases occurs, which also holds beyond the linear order in κ.) Beyond linear order, one should therefore go back to Eq. (S100) as a starting point. The same applies when considering values of z i that are not well separated, such that one cannot neglect contributions in which more than one Green's function are evaluated close to the same instanton. Similarly, using the 't Hooft vertices of Eqs. (S95) in diagrams with ordinary propagators would only capture a restricted set of contributions of order κ in which N f fermion propagators are evaluated close to the instanton. For higher-order in κ or for capturing contributions with more propagators close to the instanton, the use of the effective vertex cannot be justified.
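The role of the order of limits in the sector sum of Eq. (S100) can be illustrated with a toy version of that sum. The sketch below rests on several simplifying assumptions: a real positive argument `x` replaces $2i\kappa VT$, the factors $(-1)^{\Delta n}$ and $i\kappa$ are dropped, `phi` stands for $\alpha+\theta$, and `g0`, `g_plus`, `g_minus` are hypothetical real numbers standing for the coefficients of $I_{\Delta n}$, $I_{\Delta n+1}$ and $I_{\Delta n-1}$. The Bessel functions are evaluated from their power series so that the example needs only the standard library.

```python
import cmath
import math

def bessel_i(n, x):
    """Modified Bessel function I_n(x) for integer n and x >= 0 (power series)."""
    n = abs(n)  # I_{-n}(x) = I_n(x) for integer order
    half = x / 2.0
    term = half ** n / math.factorial(n)
    total = 0.0
    k = 0
    while True:
        total += term
        k += 1
        term *= half * half / (k * (k + n))
        if term <= 1e-17 * total:
            return total

def sum_first(g0, g_plus, g_minus, phi, x, cutoff=80):
    """Sum numerator and denominator over sectors at finite x, then divide."""
    num = 0j
    den = 0j
    for dn in range(-cutoff, cutoff + 1):
        phase = cmath.exp(1j * dn * phi)
        num += phase * (bessel_i(dn + 1, x) * g_plus
                        + bessel_i(dn - 1, x) * g_minus
                        + bessel_i(dn, x) * g0)
        den += phase * bessel_i(dn, x)
    return num / den

def limit_first(g0, g_plus, g_minus, dn, x):
    """Ratio within one fixed sector dn; the sector phase cancels identically."""
    return (bessel_i(dn + 1, x) * g_plus
            + bessel_i(dn - 1, x) * g_minus
            + bessel_i(dn, x) * g0) / bessel_i(dn, x)

g0, g_plus, g_minus, phi = 1.0, 0.5, 0.5, 0.3  # hypothetical inputs
print("sum first  :", sum_first(g0, g_plus, g_minus, phi, 20.0))
print("limit first:", limit_first(g0, g_plus, g_minus, 2, 500.0))
```

With these assumptions, summing first reproduces $g_0 + g_+ e^{-i\phi} + g_- e^{i\phi}$, so the phase $\phi$ survives, whereas the fixed-sector ratio tends to $g_0 + g_+ + g_-$ for large `x`, independently of $\phi$ and of the sector.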

S3.6 Chiral Lagrangian, the θ-angle and the η′-mass
At low energies, QCD becomes fully nonperturbative and confines. The physical degrees of freedom are mesons and baryons. Their dynamics can be captured by an effective theory defined by the chiral Lagrangian; see e.g. Refs. [46,47] for reviews. The lightest mesons can be embedded into a matrix-valued field $U$, with the matrix indices associated with the light quark flavours $u$, $d$, $s$, which can be written as in Eq. (S108), where $f_\pi$ is the pion decay constant. Here we are neglecting mixing effects among the mesons $\eta_{8,1}$ and readily approximate these with $\eta$ and $\eta'$. The lowest-order terms in the chiral Lagrangian are given by Eq. (S109). It is understood here that $U_0$ is a unitary field expectation value and $\Phi$ describes the meson excitations about $U_0$, so that $U = U_0$ for vanishing $\Phi$. Above, $M$ is a diagonal matrix with $\mathrm{diag}\,M = \{m_u e^{i\alpha_u}, m_d e^{i\alpha_d}, m_s e^{i\alpha_s}\}$ being fixed by the light quark masses. The parameter $B_0$ in Eq. (S109) is real, while the matrices $U$ and $M$ inherit the transformations of Eq. (S110) under the selection rule of Eq. (S94), where the relation to the quark fields of the ultraviolet theory is given by $\arg U = \arg\langle\bar\psi(x) P_R \psi(x)\rangle$. The effective Lagrangian (S109) is thus constructed in such a way that its properties under chiral transformations are determined by the quark mass terms and the 't Hooft operator from Eq. (2). It does not appear to be widely acknowledged that invariance of the chiral Lagrangian under the former transformations can be achieved for two choices of the phase $\xi$: the standard choice $\xi = \theta$, which leads to CP-violating effects [1,2], and $\xi = -\bar\alpha = -\alpha_u - \alpha_d - \alpha_s$, as required by the topological quantization of the winding number in infinite spacetime volumes. (Recall that $\theta$ and $\bar\alpha$ transform as given in Eq. (S94).) The latter result for $\xi$ leads to no CP-violating effects, since in this case all the phases can be removed by a field redefinition.
Thus, our results for the 't Hooft vertices in QCD, which imply $\xi = -\bar\alpha$, lead to no CP violation and no electric dipole moment for the neutron. It should be pointed out that in some of the literature assuming the standard choice $\xi = \theta$, the anomalous term is not written in terms of $\det U$ and $\det U^\dagger$ but rather by keeping the first terms in an expansion of the determinants of Eq. (S109) in terms of $\mathrm{tr}\log U$ and $\mathrm{tr}\log U^\dagger$. Such expressions are obtained after integrating out an auxiliary field related to the topological charge density [48].
Let us further comment that the 't Hooft operator (S95) does not vanish in the limit of massless quarks. This is due to the cancellation of powers of the fermion masses in the fluctuation determinants and the zero-mode contributions to the propagators, as discussed in connection with Eq. (S80), and it is a property shared with the chiral Lagrangian above. The absence of CP violation inherent in the result $\xi = -\bar\alpha$ is therefore not in conflict with the observed enhancement in the mass of the $\eta'$ boson. From the Lagrangian (S109), one still gets a nonzero mass for the $\eta'$ even for massless quarks, going as $m_{\eta'}^2 = 8|\lambda| f_\pi^2$. Furthermore, as discussed in Section S5.4, our results are not in conflict with the relation between the mass of the $\eta'$ boson and the infrared-regulated topological susceptibility $\chi_\Omega = \langle\Delta n^2\rangle/\Omega$ in the pure gauge theory, which is found in the limit of a large number of colours in Refs. [23,48]. Matching the coefficient $|\lambda|$ in Eq. (S109) to the instanton calculations, we do find proportionality between the mass of the $\eta'$ meson and the topological susceptibility defined in finite subvolumes of the pure gauge theory for an arbitrary number of colours. Crucially, a nonzero value of $\chi_\Omega$ for finite subvolumes does not imply θ-dependence of the partition function in the full volume. This is due to the nonanalytic dependence of $Z$ on θ, as discussed at the end of Section S3.4.
Given the potential from Eq. (S109), the field U and accordingly Φ acquire vacuum expectation values. Depending on whether ξ = −ᾱ or ξ = θ, the determinant of U is aligned with the determinant of the mass matrix M or not. Just as for the ultraviolet theory, this implies the absence or respectively the presence of CP -violating effects.
To see this in more detail, we note that M being diagonal implies that U is diagonal as well. We therefore parametrize the expectation values as (S111) The minimization of the potential in Eq. (S109) leads to the system of equations for i = u, d, s, which implies that m u sin ϕ u = m d sin ϕ d = m s sin ϕ s . Assuming that |λ| B 0 f 2 π m i , a leading order solution is obtained when setting ξ + α u + α d + α s − ϕ u − ϕ d − ϕ s = 0. Neglecting the contributions from the strange quark, one finds the solution [14,15] m u,d sin ϕ u,d = sin(ξ + α u + α d ) 1/m 2 u + 1/m 2 d + 2 cos(ξ + α u + α d )/(m u m d ) . (S113) For three flavours, provided ξ + α u + α d + α s 1, one can approximate m u ϕ u = m d ϕ d = m s ϕ s , what leads to [1] This result serves in Refs. [1,2] as the input to calculate CP -violating observables using current algebra techniques. Here, we discuss the consequences in the framework of effective chiral perturbation theory, which is presented in view of CP -violation in the strong interactions in Ref. [14]. Substituting U with these angles ϕ i into the term involving B 0 in Eq. (S109) yields the CP -odd effective interactions These lead to CP -violating decays η → 2π. The main observable of interest is the permanent electric dipole moment of the neutron [1,2]. Considering the transformation of the nucleon multiplet N , the quark mass matrix M and the meson fields U under chiral field redefinitions leads to an effective theory of the interactions of mesons and nucleons that is consistent with these symmetries and the way in which these are broken. Given the quark mass matrix M and the expectation value U 0 , it can be shown that this includes a CP -odd operator [14] where c + is a parameter of the effective theory. This includes an interaction between the neutrons, protons and charged pions that leads to a permanent electric dipole moment of the neutron. Setting ξ = θ in Eq. 
(S114) and consequently in the interactions (S115) and (S116), these results are proportional to $\bar\theta$ when identifying $\bar\alpha = \alpha_u + \alpha_d + \alpha_s$. In general, CP-violating effects would then follow. Taking instead $\xi = -\bar\alpha$, as indicated by taking the spacetime volume to infinity before summing over topological sectors, these interactions vanish in the effective theory, so that there are no CP-violating effects.
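The two-flavour solution (S113) can be checked numerically. The sketch below assumes only what is stated in the text: the condition $m_u\sin\varphi_u = m_d\sin\varphi_d$ together with the leading-order constraint $\varphi_u + \varphi_d = \xi + \alpha_u + \alpha_d$ (strange quark neglected). The combination $\xi + \alpha_u + \alpha_d$ is passed as the single argument `xi_bar`; the closed-form angles in `vacuum_angles` and the mass values are illustrative choices, not taken from the paper.

```python
import math

def vacuum_angles(m_u, m_d, xi_bar):
    """Angles with phi_u + phi_d = xi_bar and m_u sin(phi_u) = m_d sin(phi_d)."""
    phi_u = math.atan2(m_d * math.sin(xi_bar), m_u + m_d * math.cos(xi_bar))
    phi_d = math.atan2(m_u * math.sin(xi_bar), m_d + m_u * math.cos(xi_bar))
    return phi_u, phi_d

def s113_rhs(m_u, m_d, xi_bar):
    """Right-hand side of Eq. (S113)."""
    return math.sin(xi_bar) / math.sqrt(
        1.0 / m_u**2 + 1.0 / m_d**2 + 2.0 * math.cos(xi_bar) / (m_u * m_d)
    )

m_u, m_d = 2.2, 4.7  # hypothetical masses in arbitrary units
for xi_bar in (0.0, 0.4):
    phi_u, phi_d = vacuum_angles(m_u, m_d, xi_bar)
    print(f"xi_bar={xi_bar}: m_u sin(phi_u)={m_u * math.sin(phi_u):.6f}, "
          f"m_d sin(phi_d)={m_d * math.sin(phi_d):.6f}, "
          f"(S113)={s113_rhs(m_u, m_d, xi_bar):.6f}")
```

For `xi_bar = 0`, which corresponds to $\xi = -\bar\alpha$ in the two-flavour case, both angles vanish, in line with the statement that the CP-odd interactions are then absent.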

S4 General correlation functions from cluster decomposition
In this section it is shown that the above conclusions regarding the phases of general correlation functions can be derived without resorting to the dilute instanton approximation. Rather, working in Euclidean space for simplicity, in a theory with N f flavours in the fundamental representation, we constrain the form of the partition functions Z ∆n from arguments based on cluster decomposition, the index theorem and parity. From this one can derive integrated fermionic correlation functions, which are sensitive to the constant phases of the correlators discussed in the previous sections. Again, when the infinite volume limit is taken before the sum over topological sectors, no CP -violating relative phases remain.

S4.1 The θ-angle and cluster decomposition
We start by recalling that, as an alternative to a coupling constant in the Lagrangian, the θ-angle can be understood in terms of relative weights $f(\Delta n)$ among the contributions from the different topological sectors to the path integral, constrained by the requirement of cluster decomposition [20].
Let us consider the expectation value of an operator O in an infinite spacetime volume Ω = V T , and interfere different topological sectors ∆n as . (S117) Here, the action S Ω is defined to arise from integrating the Lagrangian over Ω. The Lagrangian does not include the topological term θtrFF /(16π) which, as it turns out, can be attributed to the weights f (∆n) of the individual topological sectors. Further, writing ∆n under the integration symbol, we specify the topologically conserved winding number which can be imposed on vanishing physical gauge fields at the boundary of Ω at infinity. Next, we divide the volume of the spacetime as Ω = Ω 1 ∪ Ω 2 according to Figure 1. Accordingly, the winding numbers behave additively ∆n(Ω) = ∆n 1 (Ω 1 ) + ∆n 2 (Ω 2 ). Now suppose we consider a local operator O 1 whose spacetime arguments are restricted to lie within Ω 1 . Then, we can separate off the contribution from Ω 2 as . (S118) We note that for this partition, ∆n 1,2 cannot strictly be assumed to be integers because the field configurations at the boundary between the two subvolumes do not give rise to topologically conserved winding numbers within each of Ω 1,2 . We proceed nonetheless, taking above expression as a suitable approximation for sparse populations of instantons. Independence of O 1 Ω from the fluctuations in Ω 2 is achieved if both the numerator and the denominator factorize into contributions that separately depend on ∆n 1 , Ω 1 and ∆n 2 , Ω 2 , respectively. Then the fluctuations within the volume Ω 2 cancel. For this to occur, without using particular properties of the path integral factors to this end, f (∆n) needs to satisfy the following functional relation: (S119) We have attributed here the phase θ such that the topological term in the action is indeed recovered. 
In Section S5, we present more aspects of cluster decomposition based on the correlation and partition functions that are derived in Section S3 using the dilute instanton gas approximation or, alternatively, from the constraints derived in the present section.

S4.2 Constraining the partition functions from cluster decomposition, the index theorem and parity
From the denominator of Eq. (S117) one recovers the partition function in the volume Ω = V T as (S120) The above factorization assumption gives and the property (S119) of the weight factors simply leads to In the following we use Eq. (S122), which can be thought of as a formulation of the cluster decomposition principle at the level of the partition function, to constrain g ∆n (Ω). First, we isolate the possible complex phases in g ∆n (Ω). The latter can be understood in terms of fluctuation determinants of gauge fields and fermions about a gauge field background with topological charge ∆n, where we do not assume here a construction from a dilute gas of instantons and anti-instantons. The Euclidean gauge determinants are real, while the fermion fluctuation determinants pick up a phase from unpaired right-handed and left-handed zero modes of the massless Dirac operator. Indeed, as discussed in Section S2.2, eigenvalues of the massive Dirac operator come in pairs with mutually conjugate eigenvalues (see Eq. (S2.2)) whose product is always real; this applies to arbitrary backgrounds, not just for ∆n = ±1). The Atiyah-Singer index theorem relates the topological charge ∆n to the difference in the number of left-handed and right-handed zero modes. This implies that the product of the fermionic fluctuation determinants for the flavours j = 1, . . . , N f with complex masses m j e iαj γ5 = m j e iαj P R + m j e −iαj P L ≡ mP R + m * P L (S124) in a sector with fixed ∆n carries a phase given by ∆nᾱ, withᾱ defined as in Eq. (S89). Therefore we can write g ∆n (Ω) = e i∆nᾱg ∆n (Ω),g ∆n (Ω) ∈ R. (S125) The requirement of cluster decomposition as in Eq. (S122), together with Eq. (S123) and Eq. (S125) implies nowg Next, we assume that, as in standard instanton calculations, parity relates the sectors of opposite charges ±∆n. The functionsg ∆n (Ω) capture the fluctuation determinants for real fermion masses, since when α i = 0 one has g ∆n (Ω)| αi→0 =g ∆n (Ω) (see (S125)). 
But for real fermion masses parity is conserved, so that g −∆n (Ω) =g ∆n (Ω). (S127) In order to find a solution forg ∆n (Ω) that complies with cluster decomposition through satisfying Eq. (S126), we consider first the limiting case Ω 1 = Ω 2 = 0 that implies This brings us to the following ansatz: The dependence on |∆n| follows from the parity constraint (S127), while factoring out Ω |∆n| guarantees that condition (S128) is met. The latter condition implies now The Ansatz further yields Taking the derivative of the cluster decomposition relation (S126) with respect to Ω 1 gives Setting now Ω 1 = 0 givesg (S133) Applying Eq. (S131) one getsg ∆n (Ω 2 ) = f 1 (0)(g ∆n+1 (Ω 2 ) +g ∆n−1 (Ω 2 )). (S134) This allows to solve recursively for all the higher order derivatives ofg ∆n (Ω 2 ) in terms of functions without derivatives. Renaming Ω 2 → Ω, one has d n dΩ ng ∆n (Ω) = (f 1 (0)) n n m=0 n m g ∆n−n+2m (Ω). (S135) In particular, setting Ω = 0 and using Eq. (S128) gives (S136) Using the analyticity ofg ∆n (Ω), knowing all the derivatives at the origin allows to recover the function from its Taylor expansion around Ω = 0:g With Eq. (S136) implying nonzero derivatives only for n = |∆n| + 2k, k = 0, 1, 2, . . . , one has We can thus verify that, while we did not impose the parity property beyond using it to derive Eq. (S131), the solution we find satisfies the parity property (S127) indeed. Renaming f 1 (0) ≡ β, we can rewrite the solution as:g Thus, we recover the modified Bessel functions of the first kind that have been found in the computations with the dilute instanton gas. The partition function for a fixed topological sector is then Z ∆n = e iθ∆n g ∆n (Ω) = e i(θ+ᾱ)∆ng ∆n (Ω) = I ∆n (2βΩ) e i(θ+ᾱ)∆n . (S140) This matches the result of Eq. (S91) from the dilute instanton gas approximation, up to a redefinition θ → θ + N f π. Note also that the solutions forg ∆n satisfy indeed the desired property under parity transformations as I ∆n (x) = I |∆n| (x) for integer ∆n.
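The Taylor reconstruction described above can be reproduced numerically. The sketch below assumes that the derivatives at the origin follow from Eq. (S135) with $\tilde g_{\Delta n}(0) = \delta_{\Delta n,0}$, so that the $n$-th derivative equals $\beta^n \binom{n}{(n-\Delta n)/2}$ when $n - \Delta n$ is even and $n \ge |\Delta n|$, and vanishes otherwise. It then compares the resulting series with the power series of $I_{\Delta n}(2\beta\Omega)$. The values of `beta` and `omega` are hypothetical.

```python
import math

def bessel_i(n, x):
    """Modified Bessel function I_n(x) for integer n and x >= 0 (power series)."""
    n = abs(n)  # I_{-n}(x) = I_n(x) for integer order
    half = x / 2.0
    term = half ** n / math.factorial(n)
    total = 0.0
    k = 0
    while True:
        total += term
        k += 1
        term *= half * half / (k * (k + n))
        if term <= 1e-17 * total:
            return total

def taylor_solution(dn, beta, omega, max_order=80):
    """Taylor series of g_dn(omega) built from the derivatives at omega = 0."""
    total = 0.0
    for order in range(max_order + 1):
        if order < abs(dn) or (order - dn) % 2:
            continue  # derivative vanishes at the origin
        m = (order - dn) // 2
        deriv_at_zero = beta ** order * math.comb(order, m)
        total += deriv_at_zero * omega ** order / math.factorial(order)
    return total

beta, omega = 0.7, 2.0  # hypothetical values
for dn in (-3, -1, 0, 2, 5):
    print(dn, taylor_solution(dn, beta, omega), bessel_i(dn, 2.0 * beta * omega))
```

The agreement for positive and negative `dn` also illustrates that the parity property (S127) comes out of the recursion without being imposed again.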
Since in the derivation we have used the trick of setting Ω 1 = 0, it may remain to check that the modified Bessel functions satisfy the full requirement of the cluster decomposition principle. But the Bessel functions are readily known to satisfy the required identity (S141) This relation can be proven e.g. using analyticity and the following two properties of the Bessel functions: ). Using the former, it can be seen that Eq. (S141) and the identities obtained by taking derivatives with respect to Ω 1 to arbitrary order are always satisfied at Ω 1 = 0. Analyticity then implies that eq. (S141) holds for arbitrary Ω 1 .
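The cluster decomposition property of the Bessel functions can also be verified numerically. The sketch below assumes that the required identity is the standard addition theorem $I_{\Delta n}(x_1 + x_2) = \sum_{\Delta n_1} I_{\Delta n_1}(x_1)\, I_{\Delta n - \Delta n_1}(x_2)$ with $x_i = 2\beta\Omega_i$, and it truncates the sum at a hypothetical cutoff. The values of `beta`, `omega1` and `omega2` are illustrative.

```python
import math

def bessel_i(n, x):
    """Modified Bessel function I_n(x) for integer n and x >= 0 (power series)."""
    n = abs(n)  # I_{-n}(x) = I_n(x) for integer order
    half = x / 2.0
    term = half ** n / math.factorial(n)
    total = 0.0
    k = 0
    while True:
        total += term
        k += 1
        term *= half * half / (k * (k + n))
        if term <= 1e-17 * total:
            return total

def cluster_sum(dn, x1, x2, cutoff=60):
    """Truncated sum over the winding number dn1 assigned to the first subvolume."""
    return sum(bessel_i(dn1, x1) * bessel_i(dn - dn1, x2)
               for dn1 in range(-cutoff, cutoff + 1))

beta, omega1, omega2 = 0.7, 1.5, 2.5  # hypothetical values
for dn in (0, 1, -2, 5):
    lhs = bessel_i(dn, 2.0 * beta * (omega1 + omega2))
    rhs = cluster_sum(dn, 2.0 * beta * omega1, 2.0 * beta * omega2)
    print(dn, lhs, rhs)
```

The check uses generic, nonzero subvolumes, so it probes the relation away from the point $\Omega_1 = 0$ that was used in the derivation.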
To derive the fermion correlation functions, we note that β can still depend on the masses of the quarks. As discussed earlierg ∆n is real, and corresponds to the powers of m k = m k m * k coming from the zero modes plus contributions from nonzero modes of the massless Dirac operator to the product of fermion determinants. As discussed earlier, these nonzero modes come in pairs with mutually conjugate eigenvalues, whose product only depends on m 2 k = m k m * k . Then, it follows that β in Eqs. (S139), (S140) can only be a function of m k m * k , β = β(m k m * k ). That is, Z ∆n (Ω) = e i∆n(θ+ᾱ) I ∆n (2β(m k m * k ) Ω). (S142) Noting thatᾱ depends on the masses asᾱ = −i/2 k log(m k /m * k ), one can write Z ∆n (Ω) = e i∆n(θ−i/2 k log(m k /m * k )) I ∆n (2β(m k m * k ) Ω). (S143) Given the mass terms in the Euclidean Lagrangian, one can interpret m i as a "current" pertaining to the correlatorψ i P R ψ i , and m * i toψ i P L ψ i , where the correlations are evaluated at coincident points. From the Euclidean path integral formulation it follows that, within a topological sector ∆n, Applying this to Eq. (S143), noting thatᾱ depends on the masses asᾱ = −i i log( m i /m * i ), gives (S146) Using the identities d dz I ∆n (z) = 1 2 (I ∆n+1 (z) + I ∆n−1 (z)), ∆nI ∆n (z) = − z 2 (I ∆n+1 (z) − I ∆n−1 (z)), (S147) and dividing by Ω = V T , we get the following spacetime averages of the fermionic correlators: (S148) Note that the correlators carry the correct amount of ±2 units of spurious chiral charge, where the latter is defined from the transformation rules of Eq. (S94), which imply m j → e −2iβ m j , m * j → e 2iβ m * j . From the previous expressions it is not clear how to exactly recover our previous results found in the dilute instanton gas approximation-e.g. identify the free piece vs. instanton-like corrections-but it should be kept in mind that β is meant to capture perturbative as well as non-perturbative results. 
Furthermore, we are computing integrated and coincident correlators, rather than the correlators evaluated at some arbitrary $x$, $x'$.
The final values of the spacetime averaged correlators are: Taking the limit Ω → ∞ before the sum over ∆n, as corresponds to an infinite flat spacetime with topological sectors defined by the boundary conditions at infinity that are required to have a finite action, we have that I ∆n (2βΩ) = I 0 (2βΩ)(1 + O(1/Ω)). Then the contributions to the correlators proportional to β vanish, while the ones proportional to the derivatives of β survive, giving (S150) Thus, the spacetime average of the full coincident correlator has no θ-dependent phases in the infinite volume limit. The constant phases of the correlators (i.e. insensitive to the integration over spacetime) are all set by the tree-level masses, and therefore there is no CP violation. By taking higher derivatives of Z ∆n , the previous results can be generalized to products of spacetime-averages of coincident two-point correlation functions for different flavours. These correspond to particular spacetime averages of arbitrary correlation functions, and should display the same constant phases as the full spacetime dependent correlators. The relations (S147) allow to link derivatives of the I ∆n to the I ∆n themselves, which in the infinite volume limit all match I 0 asymptotically. Analogously, contributions going as ∆n m I ∆n arising from derivatives with respect toᾱ can be traded for linear combinations of Bessel functions without ∆n factors. In the infinite volume limit, again one has that the ∆n m I ∆n are zero or proportional to I 0 . With all contributions proportional to I 0 , the interferences of the θ-dependent phases disappear again and one ends up with θ-independent correlators.
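The Bessel-function relations used in this argument can be checked numerically. The sketch below verifies the two identities of Eq. (S147) at a sample point, using a central finite difference for the derivative, and then tabulates the ratio $I_{\Delta n}(x)/I_0(x)$ for growing real `x` to illustrate the statement $I_{\Delta n} = I_0\,(1 + O(1/\Omega))$. The sample point, the step `h` and the values of `x` are arbitrary illustrative choices.

```python
import math

def bessel_i(n, x):
    """Modified Bessel function I_n(x) for integer n and x >= 0 (power series)."""
    n = abs(n)  # I_{-n}(x) = I_n(x) for integer order
    half = x / 2.0
    term = half ** n / math.factorial(n)
    total = 0.0
    k = 0
    while True:
        total += term
        k += 1
        term *= half * half / (k * (k + n))
        if term <= 1e-17 * total:
            return total

def d_bessel_i(n, z, h=1e-5):
    """Central finite-difference approximation to dI_n/dz."""
    return (bessel_i(n, z + h) - bessel_i(n, z - h)) / (2.0 * h)

def ratio(dn, x):
    """I_dn(x) / I_0(x), which approaches 1 for fixed dn as x grows."""
    return bessel_i(dn, x) / bessel_i(0, x)

z = 3.0
for dn in (0, 1, 4):
    deriv_gap = d_bessel_i(dn, z) - 0.5 * (bessel_i(dn + 1, z) + bessel_i(dn - 1, z))
    index_gap = dn * bessel_i(dn, z) + 0.5 * z * (bessel_i(dn + 1, z) - bessel_i(dn - 1, z))
    print(f"dn={dn}: derivative identity gap={deriv_gap:.1e}, index identity gap={index_gap:.1e}")

for x in (5.0, 50.0, 500.0):
    print(x, [round(ratio(dn, x), 5) for dn in (1, 2, 3)])
```

For fixed `dn` the deviation of the ratio from one decreases roughly like $\Delta n^2/(2x)$, which is the sense in which all sectors become equally weighted when the volume is taken to infinity first.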
The fact that the phase of the correlation functions is aligned with the phase of the quark masses when the infinite-volume limit is taken before the sum over topological sectors has also been noted in Ref. [49]. The argument given in that work does not rely on the dilute instanton gas approximation either. However, it is restricted to the case of real masses, such that the phases can only be multiples of $\pi$. In Ref. [49] that result is nevertheless discarded, since it is further assumed there that the correlation function should be aligned with $\theta$ instead. The latter, however, is derived there from a partition function of the form (S83) that, according to the calculations in this work, is valid only if the spacetime volume is taken to infinity last. The argument of Ref. [49] in favour of the order of limits opposite to the one proposed here is therefore circular.

S5 Cluster decomposition with or without summation over topological sectors
In this section we show that the factorization properties of the path integration imply that one can recover the results for the correlators in an infinite volume $\Omega$ by carrying out path integrals in a finite volume $\Omega_1 \subset \Omega$. While the topological term can be cast into a boundary term, we need to track its effects carefully in order to check whether these are physical or not. To do so, one cannot simply calculate the path integral in $\Omega_1$ and ignore the complement $\Omega \setminus \Omega_1$: if boundary terms are important for $\Omega_1$, this should be the case for the larger volume $\Omega$ as well. That is, one must integrate out the fluctuations in the complement in order to arrive at an effective path integral for the subvolume (cf. Figure 1). In this effective path integral over $\Omega_1$ that enters the correlation functions, the usual chiral phases from the fermion determinants and the $\theta$-angle turn out to cancel, and therefore there are again no $CP$-violating effects, regardless of whether or not one sums over the topological sectors in the infinite volume $\Omega$. Moreover, it will be seen that the previous results can be generalized to a large (i.e. $\kappa\Omega \gg 1$) but finite volume $\Omega \supset \Omega_1$, as long as the path integration in $\Omega$ is restricted to a single topological sector. Such a restriction is motivated by the considerations on local observers made at the end of Section S3.3, and the results are in keeping with the expectation that the observables should not depend on whether the calculation is carried out in an infinite or a very large but finite spacetime volume. In particular, the cluster-decomposition property is maintained in such a setup with large $\Omega$.
For simplicity, the calculations in this section are also carried out in Euclidean spacetime.

S5.1 Cluster decomposition within an infinite volume
Equation (S118) can be rewritten as Eq. (S151). The contributions from the volume $\Omega_2$ are given by the Euclidean version of the corresponding partition functions (S91), which yields an expression in terms of modified Bessel functions, Eq. (S152). Note that we have made explicit here the phase factors from the fermion determinant that have not been absorbed in $\kappa$. Since we assume $\Omega_1$ to be finite and $\Omega \to \infty$, we need to take $\Omega_2 \to \infty$ here. As indicated, we again take the limit of infinite $\Omega_2$ before summing over $\Delta n$. Then, the Bessel functions with an argument proportional to $\Omega_2$ tend to a common limit, independent of $\Delta n_1$. As a result, the sum over $\Delta n$ factorizes out and one is left with an expression for $\langle O_1 \rangle$ in terms of a path integration within the volume $\Omega_1$, given in Eq. (S153). We note that in this expression the $\theta$-angle as well as the phases from the fermion determinant proportional to $\Delta n$ have disappeared. Additionally, when carrying out the remaining piece of path integration, the phase factors $(-1)^{-N_f \Delta n_1} e^{-\mathrm{i}\bar\alpha \Delta n_1}$ will exactly cancel the phases from the fermion determinants in $\Omega_1$ within each topological sector. Thus, no $\Delta n_1$-dependent phases remain, and there is no interference between topological sectors. The cancellation of global phases in Eq. (S153) can be understood as follows. As has been shown before, the global phases are fixed by the topological sectors of the total spacetime; however, for an infinite spacetime the different topological sectors effectively do not interfere, and the global phases thus drop out of observables. It follows that when calculating the observable in the subvolume $\Omega_1$ there is no $CP$ violation, which is consistent with the calculation in the full, infinite volume. In particular, applying Eq. (S153), one can redo the calculation of the fermionic correlation functions in the finite volume $\Omega_1$ and in the dilute instanton approximation, accounting for the lack of $\theta$-dependence and the insertions of the extra phases, and recover exactly the same result as in the underlying case of infinite volume: the phases of the instanton contributions are aligned with the tree-level phases from the fermion mass terms.
As a clarification, we recall that the justification of taking the limit of infinite Ω 2 before the sum over ∆n is related to the fact that the classification into topological sectors labelled by integers is only necessarily enforced for infinite volume, in order for the action of the saddle points to remain finite. As it has been mentioned before, for the volume Ω 1 the quantity ∆n 1 is not enforced to be an integer, and the restriction to integer values should be understood as an approximation in which one neglects contributions coming from particular fluctuations near the boundary of Ω 1 (e.g. instantons with centres close to the boundary of Ω 1 ). We expect the approximation to be accurate whenever Ω 1 is large but finite and embedded into a much larger volume, such that the region in which the additional fluctuations are ignored has a small relative weight.
In spite of the periodic boundary conditions, lattice simulations sample over topological sectors within their volume unless the continuum limit is approached. In the present context, we can interpret such sampling of sectors in a finite volume as an evaluation of the path integral where the contributions from $\Omega_2$ to the action are discarded; see Refs. [50,51] for such lattice calculations of $CP$ violation in finite volumes. Crucially, this includes here the phases $f(\Delta n_2) \exp(\mathrm{i}\bar\alpha \Delta n_2)$. If it were possible to circumvent the sign problem on the lattice associated with $CP$ phases, one would therefore find $CP$-odd expectation values. However, one should then keep in mind that the result would not be based on the action integral over the full spacetime but only on an arbitrary subset of it. Appropriately integrating out the contributions from $\Omega_2$, one arrives instead at Eq. (S153). To evaluate that expression on the lattice, one should set $\bar\theta = 0$. This automatically accounts for the phases from $\Omega_2$, irrespective of the value of $\bar\theta$ that follows from the Lagrangian of the theory.
Finally, we note that the result of Eq. (S153) can equally be recovered when keeping the infinite-volume limit but removing the sum over topological sectors, i.e. for an infinite spacetime in a fixed topological sector, as would correspond to an observer arising from localized field excitations in a particular sector. In this sense, the property of cluster decomposition (i.e. that the expectation values of local operators only depend on local fluctuations) does not require a summation over the topological sectors of the total spacetime volume $\Omega$. In the next section it will be seen that this property also holds for finite spacetimes, up to parametrically small corrections.

S5.2 Cluster decomposition within a finite volume
It is apparent from Eq. (S153) that for local observables in the volume $\Omega_1$ the information about the boundary of $\Omega$ at infinity, including the possible effect of the $\theta$-angle and the associated $CP$-violating phenomena, is lost, regardless of whether or not one sums over topological sectors in the full volume $\Omega$, cf. Figure 1. Hence, in contrast to the usual argument given in Section S4.1, the cluster decomposition principle does not strictly require summing over topological sectors when spacetime is infinite. The same can in fact be argued for a finite $\Omega$, as long as $\Omega_2 \gg \Omega_1$ and the path integration on $\Omega_1 \cup \Omega_2$ is restricted to a single topological sector $\Delta n$, which can be realized for periodic boundary conditions as in spacetimes with the topology of a torus. In principle, one can also apply the arguments to bounded spacetimes with discrete topological sectors, but one should keep in mind that there is no first principle that would require vanishing fields at a finite boundary of $\Omega$.
To see that cluster decomposition indeed also holds in large (i.e. $\kappa\Omega \gg 1$) but finite volumes, one can follow the steps in the previous section with the summation over $\Delta n$ omitted, which leads to Eq. (S154). In that expression it is not readily apparent that the terms that depend on $\Omega_2$ can be factored out, which would lead again to an expression of $\langle O \rangle^{\Delta n}_{\Omega}$ in terms of a path integration in $\Omega_1$, in accordance with the expectations of the cluster decomposition principle. It turns out, however, that this factorization works up to corrections suppressed by inverse powers of $\Omega_2$. The idea is that the $\Delta n_1$ dependence of the Bessel functions $I_{\Delta n - \Delta n_1}(2\kappa\Omega_2)$ is only relevant for very large $\Delta n_1$, for which the path integration in $\Omega_1$ becomes exponentially suppressed. Hence, in the dominant contributions to $\langle O \rangle^{\Delta n}_{\Omega}$ one can indeed factorize out the $\Omega_2$ dependence and cluster decomposition is recovered. To show this in more formal detail, we first note that the factor in the numerator from the integration over $\Omega_1$ can be written as in Eq. (S155), where the $B_r$ are coefficients (that may have a tensor structure) and the $m_r$ depend on the chiral fermionic contributions that appear within $O_1$, cf. Eqs. (S76) and (S92). Analogously, the $\Omega_1$ integration in the denominator of (S154) gives the Euclidean generalization of Eq. (S91), with the $\theta$-dependence omitted, Eq. (S156). Note how the $\Delta n_1$-dependent phases in Eqs. (S155) and (S156) are exactly cancelled by those present in Eq. (S154). Next, we note the asymptotic expansion of the modified Bessel functions for large argument, Eq. (S157), from which Eq. (S158) follows. This means that for finite $\Delta n - \Delta n_1$ it is always possible to take $\Omega_2$ large enough such that Eq. (S159) holds. However, there is no bound on $\Delta n_1$. We therefore need to show that, as anticipated earlier, the contributions from large $\Delta n_1$ can be neglected. This can be accomplished by considering the asymptotic expansion for large index, Eq. (S160), which implies an exponential suppression of the factor (S155) from the integration over $\Omega_1$ for large $\Delta n_1$.
To obtain an upper bound on the magnitude of the contributions that arise for large $\Delta n_1$, note the inequalities (S161), where we neglect corrections due to $m_r$, assuming $|m_r| \ll \Delta n_1$. With the help of these relations, we see that the sum over $\Delta n_1$ in the numerator of Eq. (S154) can be split into a piece that is independent of $\Delta n$ and a remainder that depends on $\Delta n$ but goes to zero as $\Omega_2 \to \infty$, which is a prerequisite for factorization; see Eq. (S162). To estimate the size of the first of the remainder terms, we have used Eq. (S158), and for the second one Eq. (S161) together with the relation $0 \le I_0(2\kappa\Omega_2) - I_{\Delta n - \Delta n_1}(2\kappa\Omega_2) \le I_0(2\kappa\Omega_2)$. The terms in the denominator of Eq. (S154) can be rearranged in an analogous way.
With this information, we can put upper bounds on the difference between $\langle O \rangle^{\Delta n}_{\Omega}$ for a truncation $N$ of the sum over topological sectors and finite $\Omega_2$, and the limit that arises for $\Omega_2 \to \infty$ while keeping $N$ finite; see Eq. (S163). As the value of $\Delta n_1$ can be chosen freely, we can now show that the remainders go to zero as $\Omega_2 \to \infty$. The exponentially suppressed corrections in Eq. (S163) are under control when choosing $\Delta n_1 = A\kappa\Omega_1$ with $A \gg 1$. In case $\Delta n_1 > |\Delta n|$, the correction due to ignoring the index of the Bessel function arising from $\Omega_2$ is then of order $A^2 \kappa \Omega_1^2 / \Omega_2$, which should also be much smaller than one. Both corrections are then suppressed simultaneously when $\Omega_2 / (\kappa \Omega_1^2) \gg 1$. When $\Delta n_1 \le |\Delta n|$, one simply takes $\Omega_2 \gg |\Delta n| / \kappa$ in order to keep the correction from ignoring the index of the Bessel function arbitrarily small. Therefore, as advertised before, $\Omega_2$ can always be chosen large enough such that the path integral restricted to $\Omega_2$ is independent of $\Delta n_1$. Neglecting the remainder of higher order in Eq. (S163), the result is given in Eq. (S164). In the last step, we used Eqs. (S155) and (S156) and recovered the factorization result of the previous section, Eq. (S153). Hence, cluster decomposition and the absence of $CP$ violation also hold in finite volumes, as long as the path integration is restricted to a single topological sector.
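The size of the neglected index dependence can be estimated from the leading large-argument behaviour of the modified Bessel functions, which is a standard result. The following is only a sketch of the scaling, not the bound of Eq. (S163) itself; the symbols $k$ and $x$ are introduced here for illustration, with $k = \Delta n - \Delta n_1$ and $x = 2\kappa\Omega_2$.

```latex
I_k(x) = \frac{e^{x}}{\sqrt{2\pi x}}
         \left[ 1 - \frac{4k^2 - 1}{8x} + O\!\left(\frac{k^4}{x^2}\right) \right]
\quad\Longrightarrow\quad
\frac{I_k(x)}{I_0(x)} = 1 - \frac{k^2}{2x} + O\!\left(\frac{k^4}{x^2}\right),
\qquad
\left. \frac{k^2}{2x} \right|_{|k| \sim A\kappa\Omega_1,\; x = 2\kappa\Omega_2}
  \sim \frac{A^2 \kappa\, \Omega_1^2}{4\, \Omega_2} .
```

Up to the numerical factor, this reproduces the order of magnitude $A^2\kappa\Omega_1^2/\Omega_2$ quoted above for the correction from ignoring the index of the Bessel function.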

S5.3 Comparing with quantum mechanical systems with degenerate vacua
The results (S153) and (S164) may be used to relate the present discussion to one-dimensional periodic potentials in quantum mechanics (see e.g. Ref. [17]). For these, one considers the evolution on a time interval $I_1 \subset I$, where $I$ is the time axis. It is useful to introduce basis states $|i\rangle$, where $i = -\infty, \ldots, \infty$ labels the $i$th minimum of the periodic potential. Locally, such a state $|i\rangle$ approximately takes the form of a ground state about the $i$th minimum.
Note that due to the exponentially suppressed instanton transitions between the minima, the states $|i\rangle$ are not time independent. Energy eigenstates are those whose transition amplitudes have an exponential dependence on $T_1$, the length of $I_1$. These states can be parametrized by a phase $\theta_I$ and therefore correspond to the $\theta$-vacua, i.e. they have the form $|\theta_I\rangle = \sum_i \exp(\mathrm{i}\,\theta_I\, i)\,|i\rangle$, and their energy depends on $\theta_I$ through a term proportional to $-\cos\theta_I$. For gauge theory, this is often stated as a reason for constructing the $\theta$-vacua $|\theta\rangle = \sum_{n_{\rm CS}} \exp(\mathrm{i}\,\theta\, n_{\rm CS})\,|n_{\rm CS}\rangle$, besides their gauge invariance and their cluster decomposition properties. (For the sake of the discussion of the relation with periodic potentials in quantum mechanics, we attribute here $\theta$ to the vacuum state rather than the topological term in the Lagrangian or the weight factors $f(\Delta n)$.) Now in the dilute gas approximation, one can compute the amplitude for the transition from $|i\rangle$ to $|i + \Delta i\rangle$ for some integer $\Delta i$ within a time $T_1$. Due to the above energy dependence, for large $T_1$ this amplitude is dominated by the contribution from the energy eigenstate with $\theta_I = 0$, see Eq. (S165). Returning to gauge-theory instantons, we note that in the dilute gas approximation the dependence on the spacetime volume of the amplitude for evolving from a state $|n_{\rm CS}\rangle$ to a state $|n_{\rm CS} + \Delta n\rangle$ coincides (up to an overall factor) with that of the amplitude computed for any linear combination of states $|n_{\rm CS}\rangle$ (i.e. in particular also for $\theta$-vacua) in a fixed topological sector $\Delta n$. For large but finite spacetime volumes, it can therefore be argued that the partition function $Z_{\Delta n}$ for fixed $\Delta n$ yields correlation functions that agree with those obtained when interfering the different topological sectors in finite spacetime volumes for $\theta = 0$ [21]. This provides an alternative explanation of the results (S153) and (S164) when applied to fixed topological sectors and finite spacetime volumes $\Omega$.
Nonetheless, we emphasize that the evolution in a fixed topological sector over large or infinite spacetime volumes $\Omega$ does not project a given state $|\theta \neq 0\rangle$ onto $|\theta = 0\rangle$ (as one may suspect because of the spacetime dependence of the amplitude), in contrast with what we have just stated about periodic potentials, simply because the time evolution changes the state only by an overall factor as $\Delta n$ is fixed. Thus, considering a fixed topological sector in an infinite spacetime volume is not at odds with $\theta$ being a good quantum number.
As for the energy, in periodic potentials it can be inferred from the logarithm of the partition function on I 1 . For gauge theory, the corresponding quantity in the present case is therefore the partition function in the finite subvolume Ω 1 , i.e. the denominator in Eq. (S153), from which we can infer the approximate energy density for fixed topological sectors in large volumes. In that expression, as discussed above, the phases incurred by fermion determinants cancel the explicit phases. In addition, the θ-dependent phases from f (∆n) have been cancelled as well. The reason is that the total phases are fixed by ∆n, i.e. the boundary conditions imposed on the spacetime volume Ω, such that the phases in Ω 1 and Ω \ Ω 1 are not independent. It is therefore crucial that for periodic potentials one can obtain observables by just considering a finite interval I 1 , while in gauge theory for finite Ω 1 one must not ignore phases incurred in the complement Ω \ Ω 1 .
This is explained by the relevant difference between the quantum-mechanical case and gauge theory: the parameter $\theta_I$ in the quantum-mechanical example, which is related to the crystal momentum of a particle, is not exactly conserved. This is because the finite size of the crystal necessarily breaks periodicity. States with a given $\theta_I \neq 0$ therefore have a finite lifetime. For a circular crystal, i.e. a system with periodic minima and periodic boundary conditions, there are states with conserved angular momentum. Then, however, there is only a finite number of sectors $\Delta i$, given by the number of degenerate minima, such that there are no material consequences when changing the order of their sum with the integral over $I$.
Another case of interest is a quantum-mechanical potential with a finite number of degenerate minima. While, for the above reasons, it is not possible to find exact eigenstates of crystal momentum or $\theta_I$, closely analogous are energy eigenstates corresponding to standing-wave configurations. The archetypical example of such a potential is the double well. In contrast to gauge theory, instead of an infinite number of equivalence classes of boundary conditions, here there are effectively only two, corresponding to trajectories that either start and end in the same well or do so in different wells. We label these two classes by $=$ and $\times$, respectively. The partition functions for these sectors are then given by Eq. (S166) [17], where $\omega$ denotes the oscillator frequency to quadratic order around the classical minima and $\kappa$ arises from the functional determinant, with an analogous parameter appearing in the calculation for gauge-theory instantons. Note that, in contrast to the partition function (S78) for instantons in a fixed sector $\Delta n$, the configurations sum here to exponentials rather than modified Bessel functions. In analogy to the $CP$ violation we look for in the fermion correlation function, we may consider here the expectation value for parity $P$. The possible states now correspond to even and odd wave functions, for which we obtain Eq. (S167) (when summing over the path integrals in the sectors $=$ and $\times$ with coefficients set by projection on the wave functions of the even and odd states at the minima of the well). Corresponding to the procedure in Section S3.3, we may evaluate the ratios of partial sums in the limit $T \to \infty$.
The main difference is that in the present case the sum is finite, so that no question arises about the point at which $T$ is to be taken to infinity. In contrast to the possible nonconservation of $\theta_I$ in the quantum-mechanical case, for gauge theory $\theta$ is a good quantum number protected by a superselection rule due to gauge invariance. The latter requires boundary conditions on spatial hypersurfaces that lead to the conservation of $\theta$, whereas such strict conservation does not apply to the quantum-mechanical system. In a crystal, states with $\theta_I \neq 0$ are therefore observable after finite $T_1$ through their spontaneous decay or through some measurement that does not conserve $\theta_I$. It may also be possible to determine $\theta_I$ by a measurement in a finite spacetime volume, e.g. when switching a measurement device on and off. Finally, for systems with a finite number of degenerate minima, only a finite number of boundary conditions needs to be considered, such that the question of the correct order of limits does not arise in the first place. None of these possibilities is viable in gauge theory: the parameter $\theta$, once chosen, is a constant of the theory; it is not possible to switch on the gauge couplings of quarks in a finite spacetime volume only and off outside of it; and gauge invariance requires summing over an infinite number of boundary conditions.
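The statement that the double-well sectors sum to exponentials can be made concrete with a short numerical sketch. It assumes the textbook dilute-gas form of Ref. [17], in which a path with $n$ instanton or anti-instanton insertions contributes $(\kappa T)^n / n!$, with even $n$ for the sector $=$ and odd $n$ for the sector $\times$, and in which the common harmonic-oscillator prefactor cancels in ratios. The function name `sector_sums` is our own, and the code is not taken from the original work.

```python
import math

def sector_sums(kappa_t, n_max=200):
    """Dilute-gas sums of (kappa*T)^n / n! over even n (same well, '=')
    and odd n (different wells, 'x'), accumulated term by term."""
    same, cross, term = 0.0, 0.0, 1.0
    for n in range(n_max):
        if n % 2 == 0:
            same += term
        else:
            cross += term
        term *= kappa_t / (n + 1)  # next term (kappa*T)^(n+1) / (n+1)!
    return same, cross

for kappa_t in (0.5, 3.0, 20.0):
    same, cross = sector_sums(kappa_t)
    # The sums reproduce cosh and sinh, so their ratio is tanh(kappa*T).
    print(kappa_t, cross / same, math.tanh(kappa_t))
```

Since only two sectors are involved, the combinations `same + cross` and `same - cross` are pure exponentials $e^{\pm\kappa T}$ for any $T$, and the ratio of the two sectors simply tends to one at large $T$; no ordering ambiguity between a sum over sectors and the limit $T \to \infty$ can arise.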

S5.4 Topological susceptibility, instanton density and average topological charge
Moments and cumulants of the topological charge density, such as the topological susceptibility, are important quantities that allow one to relate results for the regime of nonperturbative couplings to physical observables. It is therefore important to cross-check the correct qualitative behaviour of these quantities in the dilute instanton gas. In terms of the partition function $Z$, the topological susceptibility can be expressed as in Eq. (S168). The requirement $\bar\theta = \tfrac{1}{2}(1 - (-1)^{N_f})\pi \equiv \theta_0$ in that expression is imposed to ensure that, when performing the sum over topological sectors for a finite spacetime volume, the vacuum energy is minimal and $\chi$ remains positive; $q$ stands for the topological charge density, Eq. (S169). Now obviously, for a fixed topological sector $\chi_\Omega$ becomes arbitrarily small as $\Omega \to \infty$. For the case where we sum over the topological sectors after taking the spacetime volume to infinity, this also implies a globally vanishing topological susceptibility. This argument applies regardless of the inclusion of fermion fluctuations. One may see this as a problem given the relation between the topological susceptibility in the pure gauge theory and the mass of the $\eta'$ meson that is derived in the limit of a large number of colours in Refs. [23,24]. However, as we have seen in Section S3.6, the chiral Lagrangian matched to the 't Hooft vertices of Eq. (S95) leads to a nonzero mass of the $\eta'$ that is equivalent to the standard result from using $\theta$-dependent phases in the chiral Lagrangian. Hence there is in principle no conflict between the vanishing of $\chi_\Omega$ for infinite $\Omega$ and a nonzero value of the $\eta'$ mass. The apparent contradiction with the results of Refs. [23,24] can be resolved by noting that the topological susceptibility considered in e.g. Ref. [23] is constructed with an infrared regulator in momentum space, which acts as a large length cutoff. Hence, the observable should correspond to an operator of the form of Eq. (S168) but defined in a finite subvolume $\Omega_1 \subset \Omega$.
While $\Delta n$ is conserved for the full volume $\Omega$ within each topological sector, this is not the case for the subvolume $\Omega_1$, across whose boundaries topological currents may flow freely. As a consequence, a nonzero value of $\chi_{\Omega_1}$ can arise. Indeed, when considering the expectation value of the square of the integral of the topological charge density $q$ over a finite subvolume $\Omega_1$, we can apply Eq. (S153) to obtain Eq. (S170). Here, to simplify notation, we write $\kappa_{N_f}$ as $\kappa$. In the second line of that equation we have used the fact that the insertions of $(-1)^{-N_f \Delta n_1} e^{-\mathrm{i}\bar\alpha \Delta n_1}$ cancel the phases of the fermion determinants. The result precisely coincides with the one obtained when applying the differential relation from Eq. (S168) to the standard result $Z = \exp\big((-1)^{N_f}\, 2\kappa\Omega \cos\bar\theta\big)$ that is obtained when interfering the different topological sectors before taking the spacetime volume to infinity. Therefore both variants of the calculation give rise to the same locally observed topological susceptibility. Note that the nonzero result for $\chi_{\Omega_1}$ does not contradict the vanishing of $\chi_\Omega$ when the infinite-volume limit is taken before the sum over topological sectors. The reason is that the two quantities correspond to averages of different operators: an operator defined in a local volume for $\chi_{\Omega_1}$, and a global operator corresponding to a topologically conserved quantity requiring integration over an infinite volume for $\chi_\Omega$.
Only the first operator corresponds to the infrared-regulated topological susceptibility considered in Ref. [23]. While the relation between the topological susceptibility in the pure gauge theory and the $\eta'$ mass obtained in Refs. [23,24] is derived in the limit of a large number of colours $N_c$, our results suggest that the proportionality holds for arbitrary $N_c$. Indeed, the result of Eq. (S170) relates the topological susceptibility to the factor $\kappa$, which for a theory with $N_f$ flavours in Euclidean space is given by Eq. (S171). That equation follows from the definitions in Eqs. (S77), (S91), plus the fact that the ratios of determinants defined in Eqs. (S64) and (S70), including $\Theta_j$, are equal in Minkowski and Euclidean space, and the property that the Jacobian $J$ of zero modes in Minkowski space is related to the real Euclidean Jacobian $J_E$ as $J = \mathrm{i} J_E$. It is reasonable to expect that the Dirac operator in the instanton background has a single discrete zero mode. On the other hand, the continuum spectrum should match that of the free Dirac operator. In this case, given the definition of Eq. (S64), one can make the approximation of Eq. (S172), where the $m_i$ are moduli of the fermion masses. Hence we can write Eq. (S173). In this equation we have isolated the contributions $\kappa^{\rm gauge}$ for the pure gauge theory (i.e. omitting the fluctuation determinants of the fermions) and related these to the corresponding finite-volume topological susceptibility $\chi^{\rm gauge}_{\Omega_1}$ (i.e. the topological susceptibility for pure gauge theory) as in Eq. (S170). Finally, we note that matching the $\det U$, $\det U^\dagger$ terms in the chiral Lagrangian in Eq. (S109) with the 't Hooft vertices of Eq. (S95) (which reproduce the correlators with $p = 0$, $q = N_f$ in Eq. (S93)) gives Eq. (S174), where we have also suppressed the subscript $N_f$ for $\Gamma$. In that equation, the factor of $\kappa$ and the inverse powers of fermion masses can be read off from the correlator of Eq. (S93) (identifying $\kappa = \kappa_{N_f}$). This result and the relation $m_{\eta'}^2 = 8|\lambda| f_\pi^2$ from Section S3.6 imply

$$m_{\eta'}^2 \propto \chi^{\rm gauge}_{\Omega_1}. \tag{S175}$$

This extends the fully nonperturbative results of Refs. [23,24] for large $N_c$ to arbitrary $N_c$ in the dilute instanton gas approximation. By Eq. (S164), the result (S170) for the topological susceptibility in a subvolume holds for a fixed topological sector just as well. This implies that also in a finite volume (large enough to suppress artefacts) with periodic boundary conditions and fixed topology of the gauge field, there is a massive $\eta'$ meson according to relation (S175).

| | $\Delta n$ free, $\Omega_\infty$ first; or $\Delta n$ fixed, $\Omega_\infty$ | $\Delta n$ free, $\Omega_\infty$ last; or $\Delta n$ free, $\Omega_{\rm fin}$ | $\Delta n$ free, $\Omega_1 \subset \Omega_\infty$, $\Omega_\infty$ first; or $\Delta n$ fixed, $\Omega_1 \subset \Omega_{\rm fin}, \Omega_\infty$ | $\Delta n$ fixed, $\Omega_{\rm fin}$ |
|---|---|---|---|---|
| $\chi$ | $0$ | $2\kappa$ | $2\kappa$ | $\Delta n^2/\Omega$ |
| $\langle n \rangle/\Omega$ | $\kappa$ | $(-1)^{N_f} \kappa\, e^{\mathrm{i}\bar\theta}$ | $\kappa$ | $\kappa\, I_{\Delta n-1}(2\kappa\Omega)/I_{\Delta n}(2\kappa\Omega)$ |
| $\langle \Delta n \rangle/\Omega$ | $0$ | $2\mathrm{i}(-1)^{N_f} \kappa \sin\bar\theta$ | $0$ | $\Delta n/\Omega$ |

Table 1: Values of the topological susceptibility $\chi$, average instanton density $\langle n \rangle/\Omega$ and average topological charge $\langle \Delta n \rangle/\Omega$ for different choices of free (i.e. summed over topological sectors) vs. fixed topological charge, finite or infinite volume, as well as for different orderings of the limit of infinite volume and the sum over the topological sectors. The symbols $\Omega_\infty$ and $\Omega_{\rm fin}$ denote an infinite and finite total volume $\Omega$, respectively, while $\Omega_1$ is assumed to be finite. "$\Omega_\infty$ first/last" refers to the infinite-volume limit being taken before/after the sum over topological sectors.
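The coincidence between the subvolume result $\chi_{\Omega_1} = 2\kappa$ and the susceptibility obtained from the standard partition function $Z = \exp\big((-1)^{N_f} 2\kappa\Omega\cos\bar\theta\big)$ can be made explicit. The short derivation below assumes that the differential relation of Eq. (S168) has the usual form $\chi = -\Omega^{-1}\, \partial^2 \log Z / \partial\bar\theta^2$ evaluated at $\bar\theta = \theta_0$; this is our reading of that relation and is not quoted from it.

```latex
\log Z = (-1)^{N_f}\, 2\kappa\Omega \cos\bar\theta
\quad\Longrightarrow\quad
\chi = -\frac{1}{\Omega}
       \frac{\partial^2 \log Z}{\partial \bar\theta^2}\bigg|_{\bar\theta = \theta_0}
     = (-1)^{N_f}\, 2\kappa \cos\theta_0
     = 2\kappa ,
\qquad
\theta_0 = \tfrac{1}{2}\bigl(1 - (-1)^{N_f}\bigr)\pi .
```

For even $N_f$ one has $\theta_0 = 0$ and both sign factors equal $+1$; for odd $N_f$ one has $\theta_0 = \pi$ and both equal $-1$, so that $\chi = 2\kappa > 0$ in either case.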
In the dilute instanton gas approximation, it can also make sense to calculate the instanton number density $\langle n \rangle/\Omega$. Restricting the path integration for the full volume $\Omega$ to a fixed topological sector with charge $\Delta m$, we find (from inserting $n$ in Eq. (S91))

$$\frac{\langle n \rangle_{\Delta m}}{\Omega} = \kappa\, \frac{I_{\Delta m-1}(2\kappa\Omega)}{I_{\Delta m}(2\kappa\Omega)} \sim \kappa, \tag{S176}$$

while for the relative fluctuation one has

$$\frac{\sqrt{\langle (n - \langle n \rangle)^2 \rangle_{\Delta m}}}{\langle n \rangle_{\Delta m}} = \sqrt{\frac{I_{\Delta m}(2\kappa\Omega)^2}{I_{\Delta m-1}(2\kappa\Omega)^2} + \frac{\Delta m\, I_{\Delta m}(2\kappa\Omega)}{\kappa\Omega\, I_{\Delta m-1}(2\kappa\Omega)} - 1} \sim \frac{1}{\sqrt{2\kappa\Omega}}, \tag{S177}$$

where we have indicated the asymptotic behaviour for $\Omega \to \infty$. The value of $\langle n \rangle/\Omega$ in Eq. (S176) coincides with the result obtained when the path integral in $\Omega$ includes a sum over topological sectors after taking the infinite-volume limit. When performing instead the sum over topological sectors for a finite volume, one finds

$$\frac{\langle n \rangle}{\Omega} = (-1)^{N_f} \kappa\, e^{\mathrm{i}\bar\theta}. \tag{S178}$$

One can proceed along the previous lines to estimate the topological susceptibility, the instanton number density $\langle n \rangle/\Omega$ and the average topological charge $\langle \Delta n \rangle/\Omega$ for different choices of infinite or finite volume and order of taking limits. We collect the results in Table 1. The topological susceptibility $\chi$ as well as the charge $\langle \Delta n \rangle$ are quantities that are well-defined independently of the dilute gas approximation, because these are given in terms of integrals over functions of the topological term. We note that when summing over topological sectors for a finite volume or before taking the infinite-volume limit, one finds the following result for $\langle \Delta n^2 \rangle$ for arbitrary $\bar\theta$:

$$\frac{\langle \Delta n^2 \rangle}{\Omega} = 2\kappa \left[ \cos(N_f \pi + \bar\theta) - 2\kappa\Omega \sin^2(N_f \pi + \bar\theta) \right]. \tag{S179}$$

Due to the dependence on the spacetime volume it is again not clear how to interpret this result. The term that depends on the spacetime volume vanishes when considering instead the susceptibility (S168), which however for $\bar\theta \neq \theta_0 = \tfrac{1}{2}(1 - (-1)^{N_f})\pi$ cannot be interpreted as $\langle \Delta n^2 \rangle/\Omega$. For finite volume, the results in Table 1 are compatible with the property

$$\frac{\langle \Delta n \rangle}{\Omega} = \mathrm{i}\,(\bar\theta - \theta_0)\, \frac{\langle \Delta n^2 \rangle}{\Omega}\bigg|_{\theta_0} + O\big((\bar\theta - \theta_0)^2\big) \tag{S180}$$

used in Ref. [25].
This property can be derived by performing an expansion in $\theta$ within the path integral, which requires the $\theta$ dependence to remain analytic. While this is true for finite spacetimes (and thus provides a consistency check of the results in Table 1), as discussed in Section S3.5 the analyticity is not retained when taking the infinite-volume limit before the sum over topological sectors. In Ref. [25], there is also a nonzero estimate for the topological susceptibility $\langle \Delta n^2 \rangle/\Omega\,|_{\theta_0}$ based on current algebra theorems. As in Ref. [23], the topological susceptibility is defined there with an infrared regulator, and thus should correspond to a finite-volume observable. It is in that sense in agreement with our result (S170) for $\chi_{\Omega_1}$ in finite subvolumes. Our argument deviates from the literature in that, in Ref. [25], the presence of $CP$ violation is concluded on the basis of relation (S180). As discussed before, this is not valid for an infinite spacetime $\Omega$. When considering an infinite $\Omega$ and observables with support in a finite subvolume $\Omega_1 \subset \Omega$, as in Section S5.1, one can obtain expectation values in terms of a path integral restricted to $\Omega_1$, according to Eq. (S153). From these observables, $\theta$ is absent, and thus there is no relation analogous to Eq. (S180) and no $CP$ violation, even though $\chi_{\Omega_1} \neq 0$.
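The fixed-sector entries for the instanton density can be cross-checked by brute-force summation over instanton and anti-instanton numbers. The sketch below assumes dilute-gas weights $(\kappa\Omega)^{n+\bar n}/(n!\,\bar n!)$ at fixed $\Delta m = n - \bar n$, which sum to $I_{\Delta m}(2\kappa\Omega)$; the function names and cutoffs are our own choices, and the large-volume estimate $1/(2\kappa\Omega)$ for the relative variance is the leading term obtained from the large-argument expansion of the Bessel functions.

```python
import math

def log_bessel_i(nu, y, k_cut=800):
    """log of I_nu(2*y) for integer nu, from the series
    I_nu(2y) = sum_k y^(2k+nu) / (k! (k+nu)!), summed in log space."""
    nu = abs(nu)
    logs = [(2 * k + nu) * math.log(y) - math.lgamma(k + 1) - math.lgamma(k + nu + 1)
            for k in range(k_cut)]
    shift = max(logs)
    return shift + math.log(sum(math.exp(v - shift) for v in logs))

def sector_statistics(kappa_omega, dm, n_cut=800):
    """Mean and variance of the instanton number n at fixed dm = n - nbar,
    with dilute-gas weights (kappa*Omega)^(n+nbar) / (n! nbar!)."""
    ns = list(range(max(dm, 0), n_cut))
    logs = [(2 * n - dm) * math.log(kappa_omega)
            - math.lgamma(n + 1) - math.lgamma(n - dm + 1) for n in ns]
    shift = max(logs)  # subtract the largest log-weight to avoid overflow
    weights = [math.exp(v - shift) for v in logs]
    norm = sum(weights)
    mean = sum(n * w for n, w in zip(ns, weights)) / norm
    var = sum((n - mean) ** 2 * w for n, w in zip(ns, weights)) / norm
    return mean, var

kappa_omega, dm = 200.0, 2
mean, var = sector_statistics(kappa_omega, dm)
ratio = math.exp(log_bessel_i(dm - 1, kappa_omega) - log_bessel_i(dm, kappa_omega))
# Instanton density in units of kappa, compared with the Bessel-function ratio.
print(mean / kappa_omega, ratio)
# Relative variance, compared with the leading large-volume estimate.
print(var / mean ** 2, 1.0 / (2.0 * kappa_omega))
```

The cutoffs must exceed $\kappa\Omega$ by many standard deviations of $n$; the defaults are chosen with $\kappa\Omega$ of a few hundred in mind.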

S5.5 Schrödinger picture
While the boundary conditions for the path integral are chosen to correspond to vanishing physical fields, even in the ground state there are quantum fluctuations that lead to nontrivial correlators. These are recovered when evaluating the path integral sufficiently far away from the boundaries. In the Schrödinger picture, the ground state corresponds to a wave functional $\Psi(\vec A^a)$, where the procedure of canonical quantization can be carried out in a gauge with $A^{a,0} = 0$ [52]. Working in Minkowski spacetime and noting that $g \vec E^a = -\partial \vec A^a/\partial t$ and $g \vec B^a = \nabla \times \vec A^a - \tfrac{1}{2} f^{abc} \vec A^b \times \vec A^c$, the canonical momentum conjugate to $\vec A^a$ is given by Eq. (S181). The corresponding quantized operator must observe the commutation relations (S182). Here, $i, j, \ldots$ are indices of three-dimensional space and $a, b, \ldots$ those of the adjoint representation of the gauge group. These commutators hold for the representation (S183), where we are free to choose the parameter $\alpha$. The Hamiltonian density is then given by Eq. (S184). Due to the freedom of choosing $\alpha$, the parameter $\theta$ turns out to be irrelevant for the form of the Schrödinger equation. For example, the requirement that the operator $\mathrm{i}\delta/\delta \vec A^a$ corresponds to the field $\vec E^a$ is met by the choice $\alpha = \theta$. Note that changing $\alpha$ in Eq. (S183) does not have an impact on the extra constraint from the Gauß law that should be imposed, $\vec D^{ab} \cdot \vec E^b = 0$, where $\vec D$ is the covariant derivative, because $\vec D^{ab} \cdot \vec B^b = 0$. Hence, we end up with a Hamiltonian and constraint that are $\theta$-independent, as will be the spectrum and the corresponding eigenstates. Nonetheless, one should be aware that the choice of $\alpha$ determines the coefficient of the topological term when constructing the action that appears in the path integral starting from the canonically quantized theory. Now let $G_n$ be a large gauge transformation that changes the Chern-Simons number by $n$ units.
Since this operator commutes with the Hamiltonian, it is possible to find states that satisfy $H\Psi = E\Psi$ and $G_n \Psi = \mathrm{e}^{\mathrm{i} n \theta'} \Psi$ (S185), where the eigenvalue of $G_n$ must be a pure phase in order to comply with gauge invariance. States with this property constitute subspaces that are invariant under the action of the Hamiltonian, i.e. $\theta'$ is protected by a superselection rule. The question of whether the spectra of these subspaces are identical is crucial for the physical relevance of the vacuum angle. That is, even when starting with a Hamiltonian and constraint with no explicit dependence on a vacuum angle, if the spectrum of the Hamiltonian changed between sectors with different values of $\theta'$, one would conclude that the latter can be observable in some way. The only viable analytical approach to this question appears to be the saddle point expansion, where we find that the angle is irrelevant and therefore all of these subspaces should have the same spectra. Given states with the property of Eq. (S185) that evolve under the Hamiltonian of Eq. (S184) with $\alpha = \theta$ (so that there is no explicit dependence on the vacuum angle in $H$), we consider new states $\Psi'$ defined in Eq. (S186). These satisfy $G_n \Psi' = \Psi'$ as well as $H'\Psi' = E\Psi'$ for a transformed Hamiltonian $H'$. Hence, for states that transform according to Eq. (S185), the operator $H$ has the same spectrum as $H'$ for states with $G_n \Psi' = \Psi'$. Note that $H'$ corresponds to the Hamiltonian of Eq. (S184) with $\alpha = \theta - \theta'$. While $H' \neq H$ in general, they nonetheless lead to the same predictions for the observables. In order for $H'$ to correspond to the energy density, we should identify $\mathbf{E}$ with $-\mathrm{i} g\, \delta/\delta \mathbf{A} + g^2/(8\pi^2)\, \theta' \mathbf{B}$ such that after all $H'(\mathbf{E}) = H(\mathbf{E})$. Note that the latter identity also holds when adding interaction terms to the Hamiltonian through which one can observe the field $\mathbf{E}$. Therefore, the predicted observables in the subspace of the states $\Psi'$ are identical to those in the subspace of the states $\Psi$. Now, while the phase has been removed from the relation $G_n \Psi' = \Psi'$, there is the parameter $\theta'$ appearing in $H'$.
Constructing the path integral from this Hamiltonian then leads to a topological term $\frac{1}{16\pi^2}\theta\, \mathrm{tr}\, F\tilde{F}$ in the Lagrangian. We make use of this freedom of redefinition in Section S3.1 when projecting on the vacuum states given by the wave functional. This is possible even though the wave functional is not known exactly because, in the limit of infinite spacetime volumes, only configurations that reduce to vanishing physical fields at infinity lead to saddle points, as discussed in Section S3.2. As a final remark, we note that the freedom of choosing $\alpha$ in Eq. (S183) can be understood in terms of quantum canonical transformations that preserve the commutation relations and do not affect the physics. Different choices of $\alpha$ lead to different sets of eigenfunctions, which are related by unitary transformations that preserve the spectrum and the inner product. Equation (S186), for example, corresponds to the mapping between eigenstates of Hamiltonians related by a canonical transformation corresponding to $\delta\alpha = -\theta'$.
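
The trade between a phase in the transformation law of the states and a parameter in the Hamiltonian can be illustrated with a toy model that is not part of the text: a particle on a discretized circle, where a shift of the angle by $2\pi$ plays the role of $G_1$ and the angle divided by $2\pi$ plays the role of the Chern-Simons number. The lattice discretization, the function names and the numerical values below are assumptions of this sketch. It only checks that a diagonal phase rotation maps the Hamiltonian acting on twisted states to one acting on periodic states, with the same energy for corresponding eigenvectors. It says nothing about whether sectors with different angles share a spectrum in the gauge theory.

```python
import cmath
import math


def lattice_hamiltonian(n_sites, link_phase, seam_phase):
    """Lattice version of -(1/2) d^2/dphi^2 on a circle with n_sites >= 3.

    link_phase multiplies every forward hop; seam_phase multiplies only the
    hop from the last site back to the first one.
    """
    a = 2 * math.pi / n_sites
    hop = -1.0 / (2 * a * a)
    h = [[0j] * n_sites for _ in range(n_sites)]
    for j in range(n_sites):
        k = (j + 1) % n_sites
        phase = link_phase * (seam_phase if k == 0 else 1.0)
        h[j][j] = 1.0 / (a * a)
        h[j][k] += hop * phase
        h[k][j] += hop * phase.conjugate()  # keeps the matrix Hermitian
    return h


def conjugate_by_phase(h, theta_p):
    # U h U^dagger with U = diag(exp(-i theta' phi_j / (2 pi))), phi_j = 2 pi j / n.
    n = len(h)
    u = [cmath.exp(-1j * theta_p * j / n) for j in range(n)]
    return [[u[j] * h[j][k] * u[k].conjugate() for k in range(n)] for j in range(n)]


def max_abs_difference(m1, m2):
    return max(abs(x - y) for r1, r2 in zip(m1, m2) for x, y in zip(r1, r2))


def eigen_residual(h, vec, energy):
    # Largest component of h vec - energy vec.
    n = len(h)
    return max(
        abs(sum(h[j][k] * vec[k] for k in range(n)) - energy * vec[j])
        for j in range(n)
    )


n_sites, theta_p, mode = 16, 0.7, 2  # hypothetical toy values

# H: no phase on the links, states obey psi(phi + 2 pi) = exp(i theta') psi(phi).
h = lattice_hamiltonian(n_sites, 1 + 0j, cmath.exp(1j * theta_p))
# H': strictly periodic states, theta' spread evenly over all links.
h_prime = lattice_hamiltonian(n_sites, cmath.exp(1j * theta_p / n_sites), 1 + 0j)

mismatch = max_abs_difference(conjugate_by_phase(h, theta_p), h_prime)

# Corresponding eigenvectors of H and H' and their common lattice energy.
q = (2 * math.pi * mode + theta_p) / n_sites
energy = (1 - math.cos(q)) / (2 * math.pi / n_sites) ** 2
psi = [cmath.exp(1j * q * j) for j in range(n_sites)]
psi_prime = [cmath.exp(2j * math.pi * mode * j / n_sites) for j in range(n_sites)]
residual_twisted = eigen_residual(h, psi, energy)
residual_periodic = eigen_residual(h_prime, psi_prime, energy)

print(mismatch, residual_twisted, residual_periodic)
```

All three printed numbers should vanish up to rounding errors: the first confirms that the phase rotation maps one lattice Hamiltonian to the other, and the other two confirm that the twisted and the periodic plane waves are eigenvectors with the same energy.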

S6 Spectral decomposition of the free fermionic propagator in Minkowski spacetime
To illustrate the spectral decomposition and the ϑ-adjoint defined in Eq. (S48), we use here the techniques of Section S2.3 to derive the free Minkowski propagator in Eq. (S60). Throughout this section, all objects are assumed to be defined in Minkowski spacetime. The free propagator is the inverse of the operator