Optimization for factorized quantities in perturbative QCD

Perturbative calculations of factorized physical quantities, such as moments of structure functions, suffer from renormalization- and factorization-scheme dependence. The application of the principle of minimal sensitivity to "optimize" the scheme choices is reconsidered, correcting deficiencies in the earlier literature. The proper scheme variables, RG equations, and invariants are identified. Earlier results of Nakkagawa and Niégawa are recovered, even though their starting point is, at best, unnecessarily complicated. In particular, the optimized coefficients of the coefficient function C are shown to vanish, so that C^opt = 1. The resulting simplifications mean that the optimization procedure is as simple as that for purely-perturbative physical quantities.


Introduction
The application of the principle of minimal sensitivity [1] to the problem of factorization-scheme dependence has had a rather unfortunate history. The present author shares some of the blame, and this paper aims to make amends. The pioneering work by Politzer [2], which showed the way, was marred by a trivial algebraic error, seemingly showing that the optimization equations had no solution. The error was belatedly corrected in Ref. [3]. However, Ref. [3] is, in retrospect, insufficiently general beyond second order. The formulation of Nakkagawa and Niégawa (NN) in a series of papers [4]–[7] is, at best, unnecessarily complicated and creates spurious difficulties.
However, NN's optimization equations are actually equivalent to those we derive below. We discuss their work in Appendix A. Note that in Refs. [2]–[7] "b" has the opposite sign to ours.¹

The prototypical factorization problem is deep-inelastic leptoproduction, where a high-energy lepton collides with a proton, or other hadron, exchanging a virtual photon of large virtuality Q². Neglecting power-suppressed terms, the nth moment factorizes into the product F = O C of Eq. (1.1). The scheme-dependent coefficients c_2, … can be regarded as RS labels [1,8].

¹ Our notation follows Ref. [8], except that we now omit tildes on Λ and ρ_j, which had merely emphasized a difference in definition from previous conventions. Tildes will be needed here for another purpose.

The coefficient function C can be calculated as a perturbation series: C(Q, M) = 1 + r_1 ã + r_2 ã² + …, (1.5) where ã is the couplant of some other arbitrary RS - which can be different from the RS used to define a. It can have a different renormalization scale M̃, and different RS labels c̃_2, ….
(In the latter respect we differ from Ref. [3].) Perhaps the easiest way to understand that the RS's for a and ã can be distinct, without inconsistency, is to imagine that first both O and C are calculated in the same RS and then a substitution ã = a(1 + v_1 a + v_2 a² + …), with arbitrary v_1, v_2, …, is made in the result for C. In terms of renormalization constants, the Z_O constant needed for the renormalization of the operator O (which is genuinely an infinite change of normalization) must be consistent between the calculations of C and γ_O, but the reparametrization step - the substitution of a = Z_a a_bare and ã = Z̃_a a_bare in the bare forms of γ_O and C, respectively - can involve distinct Z_a and Z̃_a renormalization constants.
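The reparametrization step can be checked with a short symbolic computation. The sketch below is ours, not from the paper: it substitutes ã = a(1 + v_1 a + v_2 a²) into the series for C, truncated at second order for illustration, and re-expands in a.

```python
import sympy as sp

a, r1, r2, v1, v2 = sp.symbols('a r_1 r_2 v_1 v_2')

# reparametrized couplant, as in the text: a~ = a(1 + v1 a + v2 a^2 + ...)
a_tilde = a*(1 + v1*a + v2*a**2)

# coefficient function C = 1 + r1 a~ + r2 a~^2 + ..., truncated for illustration
C = 1 + r1*a_tilde + r2*a_tilde**2

# re-expand in the original couplant a and read off the reshuffled coefficients
C_in_a = sp.expand(C)
new_coeffs = [sp.expand(C_in_a.coeff(a, n)) for n in (1, 2, 3)]
print(new_coeffs)
```

The O(a²) coefficient becomes r_2 + r_1 v_1, showing how the arbitrary v_i merely reshuffle the series for C between orders without touching the operator renormalization.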
Thus, what we shall call "RS/FS dependence" involves a choice of factorization scheme (FS), parametrized by g_1, g_2, …, and two, independent, choices of RS for a and ã that are labelled, respectively, by τ, c_2, c_3, … and by τ̃, c̃_2, c̃_3, …, where τ ≡ b ln(M/Λ) and τ̃ ≡ b ln(M̃/Λ). In the integral representation of O, Eq. (1.8), as with the definition of Λ, the lower limit of x → 0 in each integral produces a divergence that cancels between the two integrals. The normalization constant A is not calculable from perturbation theory, but is RS/FS invariant, as shown in Ref. [3].

Second-order approximation
We first discuss second order, where all authors are in agreement. A second-order approximation corresponds to truncating the series for γ_O, C, and β after two terms. The integrals in Eq. (1.8) can then be evaluated explicitly.
Substituting in Eq. (1.1), one obtains the second-order approximation to F. This approximant depends on the RS/FS choices through three variables, τ, τ̃, and g_1. Self-consistency of perturbation theory requires the resulting variations to be of order a². Noting that ã = a(1 + O(a)), we see that r_1 has the form of Eq. (2.8). Eliminating ã between these last two equations gives us the optimal g_1 in terms of a. Also, at second order, the integrated β-function ("int-β") equation (see Appendix B) determines τ in terms of a. Substituting for τ and for g_1 in Eq. (2.8) and equating to zero, since r_1^opt = 0, we find the optimization condition. Note that the condition r_1^opt = 0 means that C^opt = 1, so that all perturbative corrections are effectively exponentiated and re-absorbed into the anomalous dimension by the optimization procedure. As we shall see later, this property holds at any order, as first noted by NN [5].
Also note that while the value of ã (and hence τ̃) is determined, it is not needed to obtain the result for F^(2)_opt.
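At second order the int-β equation can be solved numerically for the couplant at given τ = b ln(M/Λ). The sketch below assumes the common convention τ = 1/a + c ln[c a/(1 + c a)] for the second-order int-β equation (our assumption for the form used in Ref. [8]); the numerical values of τ and c are purely illustrative.

```python
import math

def a_from_tau(tau, c, tol=1e-12):
    """Solve the assumed second-order int-beta equation
       tau = 1/a + c*ln(c*a/(1 + c*a))
    for the couplant a, by bisection. The left-hand side minus tau is
    monotonically decreasing in a, so bisection is safe for tau > 0."""
    f = lambda a: 1.0/a + c*math.log(c*a/(1.0 + c*a)) - tau
    lo, hi = 1e-9, 10.0   # f(lo) > 0 > f(hi) for moderate positive tau
    while hi - lo > tol:
        mid = 0.5*(lo + hi)
        if f(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5*(lo + hi)

# illustrative inputs, not taken from the paper
a_opt = a_from_tau(10.0, 1.26)
print(a_opt)
```

In a full optimization this solve would be one step of the iterative algorithm mentioned in the Conclusions, alternating between the int-β equation and the invariant constraints.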

RG equations
As discussed above, the RS/FS variables are τ, c_j, τ̃, c̃_j, and the g_i coefficients. We now write down the RG equations expressing the fact that the physical quantity F is independent of all these variables. Symbolically, we have ∂F/∂X = 0, where X stands for any of the set of variables {τ, c_j, τ̃, c̃_j, g_j}.
Recalling the factorized form F = O C of Eq. (1.1), and noting that O is manifestly independent of M̃, we see that O, which depends only on a and its RS variables τ, c_j, is manifestly independent of ã and its RS variables τ̃, c̃_j. Thus, the first two RG equations have the familiar forms, Eqs. (3.3) and (3.4). The other RG equations all take the same generic two-term form, where X is any of the variables τ, c_j or g_j. The first term only involves dependence via the r_i coefficients - indeed we are tempted to add "|ã" (meaning "with ã held constant") to the notation, to match Eqs. (3.3), (3.4), but it is unnecessary since ã is manifestly independent of τ, c_j and g_j. The second term can be evaluated as follows. In the case X → τ, we may simply differentiate directly and then use Eq. (1.8) to obtain Eq. (3.8). Although we return to this form later, for the present we follow NN and re-write it in an alternative form. The equivalence to Eq. (3.8) can be shown by integrating by parts and then using the differential equation satisfied by the β_j functions (see Appendix B). Finally, for X → g_j we find the corresponding form from Eq. (1.8). As usual, the RG equations determine how the coefficients r_i must depend on the RS/FS variables. We now re-write the RG equations to facilitate finding these dependences. First, we use the series for γ_O and C, with r_0 ≡ g_0 ≡ 1. Second, we convert the β, β_j functions to the B, B_j functions of Appendix B (whose series begin 1 + …). A third simplification concerns the lower limits of the i summations. The i summations of the ∂r_i/∂X terms inherently begin with i = 1, but in the c_j and g_j equations, where the second term starts only at order a^j, it is immediately evident that r_i cannot depend on c_j or g_j for i < j. Thus, we may begin those i summations at i = j. For the c̃_j equation a stronger result holds, since ∂r_i/∂c̃_j must vanish for i = j as well as for i < j. This observation is crucial for the "exponentiation theorem" proved in Sect. 5.
In (k + 1)-th order all the sums would go up to i = k only and the equations would only be satisfied, in an arbitrary RS/FS, up to remainder terms of order a k+1 . The vanishing of all terms up to and including a k fixes the RS/FS dependence of the r i coefficients, and leads us to identify a set of invariants, σ j , as discussed in the next section.

The scheme dependences of r_1 were already found in Eq. (2.7) and led us to the first invariant, σ_1(Q). It is Q-dependent because r_1, when calculated from Feynman diagrams, will contain a term −bg ln(Q/M). One can view σ_1(Q) as b ln(Q/Λ_F), where Λ_F is a scale specific to the quantity F, but related, in an exactly calculable way, to the Λ of some universal, reference RS. The earlier literature used an "invariant" κ_1. It is true that κ_1 is invariant under changes of FS and renormalization scale, with the explicit g_1 and M dependences cancelling the implicit g_1 and M dependences of r_1. Where κ_1 fails to be invariant is under a change of RS that leaves the renormalization scale M unchanged, but changes the renormalization prescription, so that a′ = a(1 + v_1 a + …) with v_1 ≠ 0. (In σ_1 the corresponding change cancels with the change from Λ to Λ′, by the Celmaster-Gonsalves [9] relation.)
The higher invariants, σ_2, σ_3, …, can be defined to be Q-independent. As with the ρ_j invariants, it is convenient to define the σ_j's so that they reduce to the β-function coefficients c_j in "effective charge" schemes, defined by the RS/FS choices g_j = 0, r_i = 0. The invariants, so defined, depend on τ and τ̃ only via the difference τ̃ − τ and have no dependence on Q or Λ.
To find the invariants we will need the conversion between ã and a; either ã = a(1 + Ṽ_1 a + Ṽ_2 a² + …) or its inverse. The Ṽ_i coefficients can most easily be found from the relation between the β functions: β̃(ã) = (dã/da) β(a). (In fact, the calculation mirrors that for the ρ_i invariants in Ref. [8].) The coefficients of the inverse relationship are obtained by exchanging all plain and tilde variables.
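The statement that the conversion coefficients follow from β̃(ã) = (dã/da) β(a) can be verified symbolically. Assuming the conventional normalization β(a) = −b a²(1 + c a + c_2 a² + …) (our assumption; the paper's Appendix B forms were not preserved), the following SymPy sketch extracts the induced relation between c̃_2 and the conversion coefficients V_1, V_2.

```python
import sympy as sp

a, b, c, c2, c2t, V1, V2 = sp.symbols('a b c c_2 c2t V_1 V_2')

# beta function in the plain scheme (conventional normalization assumed)
beta = -b*a**2*(1 + c*a + c2*a**2)

# couplant conversion a~ = a(1 + V1 a + V2 a^2 + ...)
a_tilde = a*(1 + V1*a + V2*a**2)

# tilde-scheme beta function with unknown coefficient c2t
beta_tilde = -b*a_tilde**2*(1 + c*a_tilde + c2t*a_tilde**2)

# self-consistency: beta~(a~) = (da~/da) beta(a), order by order in a
diff = sp.expand(sp.diff(a_tilde, a)*beta - beta_tilde)
assert diff.coeff(a, 2) == 0 and diff.coeff(a, 3) == 0  # low orders match identically

c2t_sol = sp.solve(diff.coeff(a, 4), c2t)[0]
print(c2t_sol)  # c2t = c_2 + V_2 - V_1**2 - c*V_1
```

The result shows that c̃_2 − c_2 is fixed entirely by V_1 and V_2; exchanging plain and tilde variables inverts the relation, as stated in the text.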
We now turn to a calculation of the invariant σ_2. Expanding Eqs. (3.15)-(3.19) in powers of a and ã and using the above result for Ṽ_1, we can extract the self-consistency conditions. From the lowest-order terms we recover Eqs. (2.7) for r_1's derivatives, plus confirmation that r_1 does not depend on the other RS/FS variables (c_2, c̃_2, g_2). From the next-order terms we find the derivatives of r_2. Integrating each of these equations individually is easy, but combining the results consistently is a little tricky. However, it is straightforward to check our result for the form of r_2, in which the integration constant is independent of all the RS/FS variables. The constant can be conveniently written as g_2 σ_2, thereby defining the invariant σ_2. An easier and more systematic way to calculate the σ_i invariants is to find them as the ρ_i invariants associated with the physical quantity D of Eq. (4.9). The perturbation series for D can be found in terms of the C and γ_O series in various ways.
Perhaps the simplest is the following. First, note that all the Q dependence of F resides in the r_i coefficients of C. For dimensional reasons such Q dependence can come only via the ratios Q/M and Q/M̃. The M dependence of C must cancel out with that of O in the product F = O C, while C as a whole is independent of M̃. Thus, D is, in a sense, a "physicalized" version of γ_O.
Substituting in the above formula we find Eq. (4.14). We could now expand out in terms of ã, converting a to ã using Eq. (4.4). Alternatively, we can eliminate ã and find the series expansion in terms of a. The results are more compact in the a scheme. Note that these coefficients are independent of the FS and independent of the tilde RS variables, with the explicit g_i and τ̃, c̃_j dependences exactly cancelling with the implicit dependences from the r_i coefficients; see Eqs. (2.7), (4.6). Thus, the r_i^D coefficients only depend, in the usual way, on the RS variables τ, c_j associated with a.
As usual, we can construct the ρ_j invariants for the quantity D, and these coincide with the σ's. Indeed, it is easy to see that the "effective-charge-type" RS/FS used in the definition of the σ's corresponds to the usual effective-charge scheme for D, so the equivalence of ρ_j^D to σ_j is true for all j. The calculation can be straightforwardly extended to higher orders, yielding the first three invariants explicitly.
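As a cross-check on such invariants, one can verify symbolically that the familiar ρ_2 combination of a physical quantity R = a(1 + r_1 a + r_2 a² + …) is scheme independent. We assume the forms ρ_2 = c_2 + r_2 − c r_1 − r_1² and β(a) = −b a²(1 + c a + c_2 a²); these are our assumed conventions for Ref. [8]'s definitions, not reproduced from the paper.

```python
import sympy as sp

a, ap, c, c2, r1, r2, v1, v2 = sp.symbols('a a_p c c_2 r_1 r_2 v_1 v_2')

# inverse of the scheme change a' = a(1 + v1 a + v2 a^2 + ...), to third order
a_of_ap = ap*(1 - v1*ap + (2*v1**2 - v2)*ap**2)

# physical quantity R = a(1 + r1 a + r2 a^2 + ...) re-expanded in a'
R = a*(1 + r1*a + r2*a**2)
R_p = sp.expand(R.subs(a, a_of_ap))
r1_p = R_p.coeff(ap, 2)   # transformed r_1 (= r_1 - v_1)
r2_p = R_p.coeff(ap, 3)   # transformed r_2

# transformed second beta coefficient under the same change (standard result, assumed)
c2_p = c2 + v2 - v1**2 - c*v1

rho2   = c2   + r2   - c*r1   - r1**2
rho2_p = c2_p + r2_p - c*r1_p - r1_p**2
print(sp.simplify(rho2_p - rho2))   # 0: the combination is scheme invariant
```

Since the optimal-scheme σ_j's take the "same form as the usual ρ_j invariants with g_i's in place of r_i's" (Sect. 6), the identical check applies to them.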

The exponentiation theorem
The (k + 1)-th order approximation is defined by truncating the series for C, γ_O, B, and B̃.
The resulting approximant, in general, will have a residual RS/FS dependence that is formally of higher order. Theorem (Nakkagawa and Niégawa [5]): The solution to the τ̃ and c̃_j optimization equations is r_1 = r_2 = … = r_k = 0. Thus, C = 1 in the optimal scheme, so that all perturbative corrections are effectively exponentiated and re-absorbed into the anomalous dimension γ_O.
Proof: The c̃_j optimization equation follows from Eq. (3.16). Recall that all terms up to and including ã^k must cancel in any RS, thus determining ∂r_i/∂c̃_j. By starting the sum at i = j + 1 we have already used the fact that ∂r_i/∂c̃_j must vanish for i < j and for i = j, as noted at the end of Sect. 3.
We begin by considering the case j = k. The first term vanishes, as there are no terms in the sum, so we find that in the optimal scheme dC/dã = 0. (5.5) Next, consider the case j = k − 1. In any scheme, cancellation of the ã^k terms requires a condition whose left-hand side must vanish in the optimal scheme, since dC/dã vanishes by the optimization equation (5.4). Thus, in the optimal scheme, r_1 = 0. Proceeding to the case j = k − 2 we can find ∂r_k/∂c̃_{k−2} as a sum of terms involving r_1 and r_2. In the optimal scheme this must vanish, and since we already have r_1 = 0, we now find that r_2 = 0, too. We may then proceed to successively lower j cases to see that the other r_i's vanish. Finally, we reach j = 1, where we are dealing with the τ̃ equation, which gives us r_{k−1} = 0. Substituting back into dC/dã = Σ_{i=1}^{k} i r_i ã^{i−1} = 0 then shows that r_k = 0.

The optimization equations
The fact that C = 1 in the optimal scheme allows us to simplify the remaining optimization equations, which follow from the RG equations of Sect. 3. Also note that the g_j optimization equations involve a related set of integrals. In each of these equations the first term is a polynomial in ã that must precisely cancel out the terms up to and including ã^k present in the second term, if it were expanded out in a power series in ã. In Ref. [8] we used the notation T_n[G(a)] to mean "truncate the series for G(a) = G_0 + G_1 a + … immediately after the a^n term" (i.e., T_n[G(a)] ≡ G_0 + G_1 a + … + G_n a^n).
Here we will need T̃_n as the equivalent operation in the expansion parameter ã. Thus, we may re-write the equations accordingly (swapping the order of the two terms and dividing out a g factor). However, note that the arguments of the T̃_k's are all functions of a, rather than ã, so it is best to think of the T̃_k[G] operation in three stages: (i) expand G as a series in a up to a^k, (ii) convert a to ã using Eq. (4.4), and (iii) re-expand as a series in ã, and truncate after the ã^k term.
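The three-stage T̃_n operation can be sketched as follows. This is our illustrative SymPy implementation; the conversion series ã = a(1 + V_1 a + V_2 a²) with symbolic V_i stands in for Eq. (4.4).

```python
import sympy as sp

a, at, V1, V2 = sp.symbols('a a_t V_1 V_2')

def T(G, x, n):
    """T_n of Ref. [8]: truncate the series of G in x just after the x**n term."""
    return sp.expand(sp.series(G, x, 0, n + 1).removeO())

def invert_series(expr, x, y, n):
    """Given y = expr(x) = x + O(x**2), return x as a series in y up to y**n,
    by fixed-point iteration (each pass gains one order)."""
    inv = y
    for _ in range(n):
        inv = T(y - (expr.subs(x, inv) - inv), y, n)
    return sp.expand(inv)

def T_tilde(G, n, conv):
    """The T~_n operation of the text, in three stages:
    (i) expand G as a series in a up to a**n,
    (ii) convert a to a_t using the inverse of a_t = conv(a),
    (iii) re-expand in a_t and truncate after the a_t**n term."""
    stage1 = T(G, a, n)                       # stage (i)
    a_of_at = invert_series(conv, a, at, n)   # stage (ii)
    return T(stage1.subs(a, a_of_at), at, n)  # stage (iii)

conv = a*(1 + V1*a + V2*a**2)  # illustrative conversion, in place of Eq. (4.4)
print(T_tilde(a, 2, conv))
print(T_tilde(1/(1 - a), 2, conv))
```

For example, T̃_2[a] = ã − V_1 ã², which is just the inverse conversion series truncated at second order.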
A further simplification results from the realization that, since C = 1, we do not need to know the optimized value of ã; nor do we need to know the c̃_j's or τ̃: they do not enter into the optimized result for F, which just involves evaluating O in the optimal scheme. Thus, what we need to do is to take combinations of the optimization equations in which ã and the Ṽ_i's cancel out. From the resulting equation combinations we can solve for the g_j coefficients in terms of the "principal variables" a, c_2, …, c_k. (Note that the I and J integrals are functions of these principal variables.) Finally, we can use the invariants, σ_i and σ_1(Q), and the int-β equation to determine the optimized result. Note that when r_i = 0 the σ_j's have exactly the same form as the usual ρ_j invariants with g_i's in place of r_i's.
In the next section we illustrate the above observations in the case of third order.

Third-order approximation
In third order (k = 2) we have four remaining optimization equations, in the variables τ, c_2, g_1, and g_2, following from Eqs. (6.8)-(6.10). Taking the g_1 equation minus the τ equation cancels the ã terms and, not coincidentally, the Ṽ_1 terms. An ã² term remains, but we can substitute from the g_2 equation to eliminate it, obtaining Eq. (7.6). Taking the g_2 equation minus the c_2 equation cancels the ã² terms, giving J_3 − (I_{2,0} + g_1 I_{2,1} + g_2 I_{2,2}) = 0. (7.7) We may solve these last two equations for g_1, g_2 in terms of the principal variables a, c_2.
From the four original equations we have extracted just two equations that give us the g_1, g_2 coefficients that we need. There are effectively two other equations that we can just ignore; they would determine ã and Ṽ_1 (which gives τ̃ and, combined with the int-β equation of the tilde scheme, would then fix c̃_2), but we have no need to obtain values for these variables.
To relate the principal variables to Q and the invariants, we substitute the optimal-scheme quantities into the expressions for σ_2 and σ_1(Q), combining the latter with the int-β equation to eliminate τ. In the optimal scheme, since r_i = 0, the formula for σ_2 reduces to the familiar form of a ρ_2 invariant, but with g_i's as the coefficients. Similarly, in the optimal scheme σ_1(Q) takes a simplified form involving K^(3)(a), the third-order approximation to the K(a) function of the int-β equation.
In fact, there is a simpler approach that allows us to get directly to the equations determining the optimal g_i's. Consider the physical quantity D defined in Eq. (4.9), which we showed is given by Eq. (4.13), so that D = γ_O when C = 1. That suggests that we consider F in the form of Eq. (8.4), where "[0]" is a shorthand for the same "lower limit of 0 with subtraction of the suitable infinite scheme-independent constant," as in Eq. (1.8). Formally, this expression for F is valid quite generally, and is independent of the RS used, so it satisfies RG equations saying that the total dependences on τ and c_j all vanish. What we are doing in RS/FS optimization is equivalent to a normal RS optimization applied to F, except that the approximants being optimized are not truncations of the perturbation series for F, but are approximants formed by truncating the perturbation series for D and β. That is, the (k+1)-th approximant to F is given by substituting these truncated series into Eq. (8.4).
Substituting the series form for D(a) leads to Eqs. (8.5) and (8.6), where I_{j,i}(a) ≡ a^{i+1} I_j(a) − I_{i+j+1}(a) arises from the first and third terms of Eq. (8.4).
The derivatives ∂r_i^D/∂τ and ∂r_i^D/∂c_j are the usual RS dependences of perturbative coefficients [1,8], and can be quickly found from the expressions for the ρ_i^D invariants. Using these results, and recalling that in the FS/RS optimal scheme the optimized r_i^D's equal the optimized g_i's, the reader can quickly check that at 3rd order (k = 2) Eqs. (8.5) and (8.6) lead directly to Eqs. (7.6) and (7.7).
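Treating r_1 and r_2 as functions of the scheme variables, the derivatives quoted here can be obtained by demanding invariance of the ρ's. The sketch below assumes the standard forms ρ_1 = τ − r_1 and ρ_2 = c_2 + r_2 − c r_1 − r_1² (our assumed conventions for Ref. [8], applied here to a generic physical quantity such as D).

```python
import sympy as sp

tau, c, c2 = sp.symbols('tau c c_2')
r1 = sp.Function('r1')(tau, c2)
r2 = sp.Function('r2')(tau, c2)

# assumed invariant combinations (conventions of Ref. [8], our reading)
rho1 = tau - r1
rho2 = c2 + r2 - c*r1 - r1**2

# invariance: all partial derivatives of rho1 and rho2 must vanish;
# solve the resulting linear system for the scheme derivatives of r1, r2
unknowns = [sp.diff(r1, tau), sp.diff(r2, tau), sp.diff(r1, c2), sp.diff(r2, c2)]
eqs = [sp.diff(rho1, tau), sp.diff(rho2, tau), sp.diff(rho1, c2), sp.diff(rho2, c2)]
sol = sp.solve(eqs, unknowns, dict=True)[0]
print(sol)   # dr1/dtau = 1, dr2/dtau = c + 2 r1, dr1/dc2 = 0, dr2/dc2 = -1
```

These reproduce the familiar statements that r_1 runs linearly with τ while only r_2 (and higher coefficients) absorbs the c_2 dependence.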

Conclusions and outlook
The optimization approach to the problem of RS/FS dependence is now, we believe, on a firm footing. It is far less daunting than it might appear at first sight. There are 3k scheme variables at (k + 1)-th order and k coefficients, r_i. However, k of the optimization equations lead to r_1 = … = r_k = 0, so that C = 1; another k variables (τ̃, c̃_2, …, c̃_k) then need not be solved for.
That leaves k combinations of optimization equations that can be solved for g 1 , . . . , g k in terms of the "principal variables" a, c 2 , . . . , c k . In fact, these equations can be obtained more directly by the approach in the last section. By substituting in the expressions for the invariants, one can then solve for all the needed quantities. The last step will require an iterative algorithm, as in ordinary optimization [8].
Our results have applications to various quantities, such as charmonium decays to hadrons, B decays to charmonium, or Higgs-boson decay to hadrons: these quantities have a factorized form involving the wavefunction at the origin or, in the last case, the quark masses. For applications involving parton distribution functions and fragmentation functions there is more work to be done. We have only considered the non-singlet case; the flavour-singlet case involves matrices describing quark-gluon mixing. Also, our analysis has used the language of structure-function moments, which is convenient theoretically since it reduces a convolution integral to a simple product.

Appendix A: Discussion of the work of NN

In this appendix we critique the work of Nakkagawa and Niégawa (NN) [4]–[7] and outline why, nevertheless, their optimization equations are equivalent to ours. Note that their "µ" corresponds to our M̃ (and their "b" is the opposite sign to ours). Their ã is the same as ours, but their a is somehow supposed to explicitly depend on both M and M̃. They write a = a(µ, ξ) where ξ = M/µ. It is never clear quite how this object is defined. Because of its supposed dependence on two scales, NN associate it with two β functions, whose coefficients are supposed to depend on ξ. We find this rather odd; it might not be wrong, but it certainly creates difficulties without gaining any generality. In our approach the couplant a is a normal couplant, with a renormalization scale M, in a RS labelled by τ ≡ b ln(M/Λ), c_2, c_3, …. This RS is distinct from, and independent of, the tilde RS used for ã, whose scale is M̃ and whose scheme labels are τ̃, c̃_2, c̃_3, …. Along with the FS labels g_1, g_2, … these form the complete set of RS/FS labels, and variation of any one label, in a partial derivative, is made holding the other labels constant. Thus, there is no question of c_j's "depending" on M or M̃ or their ratio.
For NN the integration of their two β-function equations for "a(µ, ξ)" is problematic [5,6], because of a dependence on the integration path. Later [7] they claimed to have resolved this problem, and made the ξ dependence of their c_j's go away. In our view, this dependence and the integration-path problem should never have been there in the first place! NN's analysis involves a somewhat mysterious variable Φ, which it seems must actually be, in their notation, b ln(M/µ). In our notation that means Φ = −b ln(M/M̃) = τ̃ − τ. Provided that we make this identification, we find that their equations (Eqs. (18a-e) of Ref. [5]) are equivalent to ours. Apart from straightforward conversion of notation we need to recognize that they work with the variables µ and Φ, etc., while we work with M̃ = µ and M (related to τ̃ and τ, respectively).
Hence, their optimization equation associated with µ is a sum of ourτ and τ optimization equations.
Notwithstanding our criticisms, NN deserve praise for arriving at the correct optimization equations, and they were correct to criticize Refs. [2,3]'s formulation as insufficiently general.
The applications of their results, pursued with Yokota [10], are valid and important. In particular, they show how optimization naturally resolves the issue that, in a naïvely fixed scheme, the perturbative coefficients for the nth moment would grow like ln² n.
Appendix B: β(a) and β_j(a) functions

For the reader's convenience we list here some key formulas from Refs. [1,8]. The integrated form of the β-function equation is referred to as the "int-β" equation; in these formulas the prime indicates differentiation with respect to a, regarding the coefficients c_j as fixed.