Effective finite generation for [IA_n,IA_n] and the Johnson kernel

Let $IA_n$ denote the group of $IA$-automorphisms of a free group of rank $n$, and let $\mathcal I_n^b$ denote the Torelli subgroup of the mapping class group of an orientable surface of genus $n$ with $b$ boundary components, $b=0,1$. In 1935 Magnus proved that $IA_n$ is finitely generated for all $n$, and in 1983 Johnson proved that $\mathcal I_n^b$ is finitely generated for $n\geq 3$. It was recently shown that for each $k\in\mathbb N$, the $k^{\rm th}$ terms of the lower central series $\gamma_k IA_n$ and $\gamma_k\mathcal I_n^b$ are finitely generated when $n\gg k$; however, no information about finite generating sets was known for $k>1$. The main goal of this paper is to construct an explicit finite generating set for $\gamma_2 IA_n = [IA_n,IA_n]$ and almost explicit finite generating sets for $\gamma_2\mathcal I_n^b$ and the Johnson kernel, which contains $\gamma_2\mathcal I_n^b$ as a finite index subgroup.

consisting of elements acting trivially on $H_1(\Sigma_n^b,\mathbb Z)$. In this paper we will only consider the cases $b=0,1$. Given $n\in\mathbb N$, let $F_n$ denote a free group on $n$ generators, and let $IA_n$ be the subgroup of ${\rm Aut}(F_n)$ consisting of automorphisms acting trivially on the abelianization of $F_n$. The group $IA_n$ is often called the Torelli subgroup of ${\rm Aut}(F_n)$ and is known to behave similarly to $\mathcal I_n^1$ in many ways. In 1935, Magnus [Ma] proved that $IA_n$ is finitely generated for all $n\geq 2$; in fact, he found an explicit and simple-to-describe generating set of smallest possible cardinality $n\binom{n}{2}$. In 1983, Johnson [Jo2] proved that $\mathcal I_n^b$ is finitely generated for $n\geq 3$; his generating set is also explicit and also of optimal size for $n=3$, but of considerably larger size in general (growing exponentially in $n$). More recently, Putman [Pu1] found a smaller generating set whose size grows cubically with $n$, which is known to be asymptotically optimal.$^1$
It was a very interesting question whether the commutator subgroup $[G,G]$ is finitely generated for $G=IA_n$ or $\mathcal I_n^b$. In both cases $G$ has large abelianization, so there was no a priori reason to expect $[G,G]$ to be finitely generated. On the other hand, $G$ possesses a generating set in which many pairs of generators commute, which can be seen as positive evidence for finite generation of $[G,G]$. An additional motivation for the finite generation question in the mapping class group case is given by the fact that $[\mathcal I_n^b,\mathcal I_n^b]$ is a finite index subgroup of the Johnson kernel $\mathcal K_n^b$, a group of major interest in topology.
$^1$ The smallest size of a generating set for $IA_n$ is indeed $n\binom{n}{2}$ since its abelianization $IA_n^{ab}$ is free of rank $n\binom{n}{2}$ (see § 3 for details). Likewise, the asymptotic optimality of the generating set from Putman [Pu1] follows from the fact that the torsion-free rank of $(\mathcal I_n^b)^{ab}$ is cubic in $n$, which was proved by Johnson [Jo1,Jo3] (see § 4 for details).
In [EH] it was proved that the commutator subgroups of the Torelli groups are indeed finitely generated in sufficiently large rank: $[IA_n,IA_n]$ for $n\geq 4$ and $[\mathcal I_n^b,\mathcal I_n^b]$ for $n\geq 12$. In [CEP], $[\mathcal I_n^b,\mathcal I_n^b]$ was shown to be finitely generated for all $n\geq 4$, and finite generation was also extended to some higher terms of the lower central series, namely $\gamma_k IA_n$ for $n\geq 4k-3$ and $\gamma_k\mathcal I_n^b$ for $n\geq 2k+1$ (see § 1.4 for an additional discussion). However, neither [EH] nor [CEP] dealt directly with the finite generation question, instead reducing the problem to an analysis of BNS invariants.
Given a finitely generated group $G$, its character sphere $S(G)$ is the set of nonzero homomorphisms from $G$ to $(\mathbb R,+)$ modulo the equivalence given by multiplication by positive scalars. The BNS invariant of $G$, introduced by Bieri, Neumann and Strebel in [BNS] and denoted by $\Sigma(G)$, is a subset of $S(G)$ which determines which subgroups of $G$ containing $[G,G]$ are finitely generated (see Theorem 2.1, often called the BNS criterion). In particular, $[G,G]$ itself is finitely generated if and only if $\Sigma(G)=S(G)$, and it is the latter equality that was established in [EH] and [CEP] for $G=IA_n$ and $\mathcal I_n^b$ for $n\geq 4$. Finite generation of higher terms of the lower central series was established in [CEP] by an inductive application of the BNS criterion.
Since the proof of the BNS criterion is not effective, [EH] and [CEP] did not yield an actual construction of finite generating sets for $[G,G]$ (or higher terms) for $G=IA_n$ and $\mathcal I_n^b$. The main goal of the present paper is to give an effective proof of finite generation for $[G,G]$ when $n\geq 8$. In § 1.4 we will discuss the main obstacle to extending this method to $\gamma_k G$ for $k>2$.
The proof of finite generation that we will provide does not make a formal reference to the BNS invariant; however, it relies on the proof of the BNS criterion in a substantial way. Essentially, we follow the proof of the BNS criterion given in [Str] (which is considerably simpler than the original argument from [BNS]), replace all the non-effective steps with explicit constructions and make some simplifications which are not possible in general. In addition to providing explicit generating sets for $[G,G]$, the proof in this paper is algorithmic in the following sense: given a sufficiently nice generating set $S$ of $G$ and an element $g\in[G,G]$ expressed in terms of $S$, our proof yields a procedure for writing $g$ in terms of a finite generating set for $[G,G]$ (which is explicitly derived from $S$). Our general method for proving effective finite generation of $[G,G]$, which will be developed in § 2, is sufficiently flexible and could be applicable to other groups.
1.2. Main results. We proceed with stating our main result for $IA_n$. Throughout the paper, for a group $G$ and elements $x,y\in G$, we set $x^y=y^{-1}xy$ and $[x,y]=x^{-1}y^{-1}xy$.
Theorem 1.1. Let $n\geq 8$. Let $N=n\binom{n}{2}$, and let $S=\{s_1,\dots,s_N\}$ be the standard generating set for $IA_n$ constructed by Magnus (see the beginning of § 3). Then $[IA_n,IA_n]$ is generated by elements of the form
$$[s_i,s_j]^{s_i^{a_i}s_{i+1}^{a_{i+1}}\cdots s_N^{a_N}}$$
where $1\leq i<j\leq N$ and $0\leq|a_m|<5\cdot 10^{12}$ for each $m$.
In particular, the minimal number of generators of $[IA_n,IA_n]$ is at most $n\binom{n}{2}\cdot(10^{13})^{n\binom{n}{2}}$.
Remark.
(1) It was previously known [To] that if $G$ is any group generated by a finite set $\{x_1,\dots,x_k\}$, then $[G,G]$ is generated by elements of the form $[x_i,x_j]^{x_i^{a_i}x_{i+1}^{a_{i+1}}\cdots x_k^{a_k}}$ with $1\leq i<j\leq k$ and $a_m\in\mathbb Z$. Thus, the novel part of Theorem 1.1 is that in the case $G=IA_n$ it suffices to take only such elements where the $a_m$ are bounded by an explicit constant independent of $n$.
(2) In § 3 we will show that the number of generators of $[IA_n,IA_n]$ is actually bounded by a function of the form $C^{n^2}$ (see Theorem 3.5), which is slightly better than a bound of the form $C^{n^3}$ given by Theorem 1.1.
(3) Our method of proof is in principle applicable to all $n\geq 4$, but would yield a constant larger than $5\cdot 10^{12}$ for $n=6,7$ and a much larger constant for $n=4,5$.
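The counting bound stated after Theorem 1.1 can be checked with exact integer arithmetic. The sketch below is only an illustration: it assumes the generators have the shape described in Remark (1), so that for a fixed $i$ there are $N-i$ choices of $j$ and $N-i+1$ exponents $a_i,\dots,a_N$, each taking one of the $2\cdot 5\cdot 10^{12}-1$ integer values with $|a_m|<5\cdot 10^{12}$. The function name `count_bound` is ours and does not come from the paper.

```python
def count_bound(n: int, bound: int = 5 * 10**12) -> tuple[int, int]:
    """Return (number of formal expressions, closed-form bound N * (10^13)^N)."""
    N = n * (n * (n - 1) // 2)      # N = n * C(n, 2), the size of Magnus's generating set
    values = 2 * bound - 1          # number of integers a with |a| < bound
    # For a fixed i: N - i choices of j > i and N - i + 1 exponents a_i, ..., a_N.
    expressions = sum((N - i) * values ** (N - i + 1) for i in range(1, N))
    closed_form = N * (10**13) ** N
    return expressions, closed_form


expressions, closed_form = count_bound(8)
print(expressions <= closed_form)
```

This counts formal expressions rather than distinct group elements, so it only confirms that the stated bound dominates the number of candidate generators.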
We now turn to the mapping class groups. In this case we will describe generating sets for two different subgroups of I
For simplicity we state Theorems 1.2 and 1.3 below for $b=1$. It is well known that there is a natural surjective map ${\rm Mod}_n^1\to{\rm Mod}_n^0$ which sends $\mathcal I_n^1$ to $\mathcal I_n^0$ and $\mathcal K_n^1$ to $\mathcal K_n^0$, so any finite generating set of $\mathcal K_n^1$ yields the corresponding finite generating set for $\mathcal K_n^0$.
Remark. We will construct explicit generating sets for $[\mathcal I_n^b,\mathcal I_n^b]$ and $\mathcal K_n^b$ only for $n\geq 8$. Theorem 1.2 is still valid for all $n\geq 4$ since $[\mathcal I_n^b,\mathcal I_n^b]$ and $\mathcal K_n^b$ are known to be finitely generated for all $n\geq 4$ by [CEP] and we are not making any assertions about the constant $R$ above.
The problem of explicitly estimating $R$, at least for $n\geq 8$, reduces to a certain computation in the Torelli group $\mathcal I_5^1$. We did not compute a precise upper bound, but we believe that this can be achieved by carefully examining the proofs of Johnson [Jo2] and Stylianakis [Sty] (see the end of § 4.8 for a detailed discussion).
1.3. Generating the Johnson kernel by finitely many Dehn twists. One drawback of the generating sets from Theorem 1.2 is that they do not seem to have any natural geometric or topological interpretation. We will now address this issue in the case of the Johnson kernel $\mathcal K_n^1$. Recall that $\mathcal K_n^1$ is generated by the Dehn twists about separating curves. Thus, one way to produce an explicit geometrically meaningful finite generating set for $\mathcal K_n^1$ is to show that $\mathcal K_n^1$ is generated by the Dehn twists about separating curves of explicitly bounded word length.
Let us now make our task more precise. Fix a point $p_0$ on the boundary of $\Sigma=\Sigma_n^1$. The fundamental group $\pi_1(\Sigma,p_0)$ is free of rank $2n$ and admits a basis $\alpha_1,\beta_1,\dots,\alpha_n,\beta_n$ such that $\prod_{i=1}^n[\alpha_i,\beta_i]$ is represented by $\partial\Sigma$; below we will refer to such a basis as natural. Fix a natural basis $\omega$ of $\pi_1(\Sigma,p_0)$. Given $m\in\mathbb N$, let $SC(m)$ be the set of all elements of $\pi_1(\Sigma,p_0)$ which have word length at most $m$ with respect to $\omega$ and which are represented by a separating simple curve on $\Sigma$. Let $T_{sc}(m)\subset{\rm Mod}(\Sigma_n^1)$ be the set of Dehn twists about the elements of $SC(m)$. As we will explain in § 4, any two natural bases of $\pi_1(\Sigma,p_0)$ lie in the same orbit under the action of ${\rm Mod}(\Sigma,p_0)$ on $\pi_1(\Sigma,p_0)$ (see the remark after Theorem 4.1). This easily implies that $T_{sc}(m)$ is independent of $\omega$ up to conjugation; in particular, the smallest $m$ for which $T_{sc}(m)$ generates $\mathcal K_n^1$ does not depend on the choice of $\omega$.
We can now formulate our theorem describing an explicit finite generating set for $\mathcal K_n^1$ consisting of Dehn twists:
Theorem 1.3. Assume that $n\geq 4$. There exists an absolute constant $D$ such that $\mathcal K_n^1$ is generated by the set $T_{sc}(D^{n^3})$ defined above.
Theorem 1.3 will be obtained as a relatively easy consequence of Theorem 1.2 and some auxiliary results established in § 4. The constant $D$ in Theorem 1.3 can be expressed in terms of the constant $R$ from Theorem 1.2 and two other absolute constants which we believe can be estimated explicitly for $n\geq 8$.
1.4. Some questions and remarks. As we already stated in § 1.1, finite generation results from [EH] were extended to higher terms of the lower central series in [CEP], where it was shown that $\gamma_k IA_n$ is finitely generated whenever $n\geq 4k-3$ and $\gamma_k\mathcal I_n^b$ is finitely generated whenever $n\geq 2k+1$. Thus it is natural to ask if the method of the current paper can also provide explicit generating sets for higher terms. We did not succeed in doing this.
The proof in [CEP] was ineffective for two reasons. First, similarly to [EH], it exploited the BNS invariant. In addition, a combinatorial calculation from [EH] was replaced by an ineffective Zariski density argument in [CEP]. The latter is not a real obstacle to constructing explicit generating sets, and one can show that algebraic geometry can be eliminated from the proof in [CEP] at the expense of increasing the lower bound on $n$ in terms of $k$ (for which we are claiming that $\gamma_k G$ is finitely generated), with the new bound being quadratic in $k$. What does cause a problem is the fact that for $k\geq 2$ and $G=IA_n$ or $\mathcal I_n^b$, very little seems to be known about the torsion in $\gamma_k G/\gamma_{k+1}G$ or the presentation of $\gamma_k G/\gamma_{k+1}G$ by generators and relations (as an abelian group).
We conclude this section with some speculations on the asymptotic growth of the number of generators of $[IA_n,IA_n]$ and $\mathcal K_n^b$ as $n\to\infty$. Below, for a group $\Gamma$ we will denote by $d(\Gamma)$ the minimal number of generators of $\Gamma$. The following inequalities are obvious:
It is known that $\dim H_1(\mathcal K_n^b,\mathbb Q)$ grows polynomially with $n$, and in fact a precise formula for this dimension for $n\geq 6$ can be immediately extracted from Theorem 1.4 in a recent paper of Morita, Sakasai and Suzuki [MSS] which, in turn, makes essential use of an earlier work of Dimca, Hain and Papadima [DHP]. This provides at least some evidence that d(K
Acknowledgments. We are extremely grateful to Andrew Putman for explaining to us the proof of the BNS criterion given in [Str] and to the anonymous referee who made a number of suggestions that helped improve the exposition. We also thank Thomas Church, Thomas Koberda and Andrew Putman for useful discussions related to the subject of this paper. After this paper was completed, the authors learned that results similar to those in this paper were independently obtained by Church and Putman (unpublished).

BNS invariant and effective finite generation
We begin this section with some basic terminology. Let $G$ be a group and $S$ a subset of $G$.
Cayley graphs. The Cayley graph of $G$ with respect to $S$, denoted by ${\rm Cay}(G,S)$, is the graph whose vertex set is $G$ and where $g,h\in G$ are connected by an edge if and only if $h=gs^{\pm1}$ for some $s\in S$. It is clear that ${\rm Cay}(G,S)$ is connected if and only if $S$ generates $G$.
S-words. By an $S$-word, we will mean a formal expression $s_1\dots s_k$ with $s_i\in S\cup S^{-1}$. Thus each $S$-word naturally represents an element of $G$, and every element of $G$ is represented by some $S$-word if and only if $S$ generates $G$. For each $g\in G$, there is a natural bijection between $S$-words representing $g$ and paths in ${\rm Cay}(G,S)$ from $1$ to $g$.
Prefixes. If $w=s_1\dots s_k$ is an $S$-word representing $g\in G$, by a prefix of $w$ we will mean a subword of the form $s_1\dots s_l$ with $l\leq k$. Thus, geometrically, prefixes of $w$ correspond to initial segments of the corresponding path in ${\rm Cay}(G,S)$ from $1$ to $g$.
Word length. If $S$ generates $G$, for each $g\in G$ we denote by $\|g\|_S$ the word length of $g$ with respect to $S$, that is, the smallest $k\in\mathbb Z_{\geq 0}$ such that $g$ is represented by an $S$-word $s_1\dots s_k$. Geometrically, $\|g\|_S$ is the distance from $1$ to $g$ in ${\rm Cay}(G,S)$.
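To make the geometric description of word length concrete, here is a minimal sketch that computes $\|g\|_S$ by breadth-first search in ${\rm Cay}(G,S)$. It is an illustration only: the group is taken to be the symmetric group on three letters with two adjacent transpositions as $S$ (our choice, not an example from the paper), and permutations are stored as tuples of images.

```python
from collections import deque


def compose(p, q):
    """Product pq of permutations given as tuples: apply p first, then q."""
    return tuple(q[p[i]] for i in range(len(p)))


def inverse(p):
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)


def word_lengths(generators):
    """Breadth-first search in Cay(G, S) from the identity; edges join g and g s^{+-1}."""
    identity = tuple(range(len(generators[0])))
    letters = list(generators) + [inverse(s) for s in generators]
    dist = {identity: 0}
    queue = deque([identity])
    while queue:
        g = queue.popleft()
        for s in letters:
            h = compose(g, s)
            if h not in dist:
                dist[h] = dist[g] + 1   # first visit gives the shortest S-word
                queue.append(h)
    return dist


S = [(1, 0, 2), (0, 2, 1)]              # the transpositions (0 1) and (1 2)
lengths = word_lengths(S)
print(len(lengths), max(lengths.values()))
```

Since the search reaches every element exactly when $S$ generates $G$, the size of the returned dictionary also tests connectedness of the Cayley graph.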

2.1. Review of the BNS invariant. We start by recalling the definition of the BNS invariant. By a character of a group $G$ we will mean a homomorphism from $G$ to the additive group of $\mathbb R$. Two characters $\chi$ and $\chi'$ will be considered equivalent if they are positive multiples of each other, and the equivalence class of a character $\chi$ will be denoted by $[\chi]$. The character sphere $S(G)$ is the set of equivalence classes of nonzero characters of $G$.
Assume now that $G$ is generated by a finite set $S$. Given a character $\chi$ of $G$, denote by ${\rm Cay}(G,S)_\chi$ the full subgraph of ${\rm Cay}(G,S)$ with vertex set $\{g\in G:\chi(g)\geq 0\}$. Note that ${\rm Cay}(G,S)_\chi$ is completely determined by the equivalence class of $\chi$. The BNS invariant of $G$, denoted by $\Sigma(G)$, is defined by
$$\Sigma(G)=\{[\chi]\in S(G) : {\rm Cay}(G,S)_\chi \text{ is connected}\}.$$
It is not hard to show that $\Sigma(G)$ does not depend on the choice of $S$, although this is not obvious from the definition.
The following remarkable result was proved by Bieri, Neumann and Strebel in [BNS]:
The original proof of Theorem 2.1 given in [BNS] was quite involved. A much simpler and more transparent proof appears in an unpublished manuscript of Strebel [Str], who attributes the argument to Bieri. While still ineffective, the proof in [Str] is almost entirely algorithmic apart from one step, as we will explain later in this section.
2.2. On the proof of the BNS criterion. In this subsection we will give a brief outline of the proof of the "if" part of Theorem 2.1 from [Str]. With the exception of Lemma 2.3 below, the results discussed in this subsection will not be used in the rest of the paper, and the main purpose of providing this outline is to help the reader follow the proofs later in this section where we will establish an effective version of (the "if" part of) Theorem 2.1 under some additional hypotheses.
The following theorem (Theorem 2.2) must be well known, although we are not aware of a reference in the literature where it is stated exactly in this form. We are grateful to Andrew Putman for pointing out the formulation below.
Theorem 2.2. Let $G$ be a group generated by a finite set $S$, let $K$ be a subgroup of $G$ (not necessarily normal), and let $\theta:G\to G/K$ be the natural projection. Then $K$ is finitely generated if and only if there is a finite subset $A$ of $G/K$ such that $\theta^{-1}(A)$ is connected in ${\rm Cay}(G,S)$.
The "only if" part (which is not essential for our purposes) is a straightforward exercise. The "if" part of Theorem 2.2 is an immediate consequence of [Str, Theorem A4.7]. Later in this section we will prove Theorem 2.14, which is an effective version of the "if" part of Theorem 2.2.
We now begin a sketch of the proof of the "if" direction of Theorem 2.1. Keeping all the notation from Theorem 2.2, suppose now that $G/K$ is abelian and $\Sigma(G)\supseteq S(G/K)$. We wish to show that $K$ is finitely generated. Since $G$ is finitely generated, after replacing $K$ by a finite index overgroup (which does not affect finite generation), we can assume that $G/K$ is torsion-free. In addition, we want to impose an extra condition on the generating set $S$ given by (2.1) below.
Lemma 2.3. Let $G$ be a finitely generated group and let $K$ be a normal subgroup of $G$ such that $G/K$ is abelian and torsion-free. Let $\theta:G\to G/K$ be the natural projection, and choose a basis $E$ of $G/K$. Then $G$ has a generating set $S$ such that
$$\theta(S)=E \quad\text{or}\quad \theta(S)=E\cup\{0\}. \tag{2.1}$$
Moreover, if $S_0=\{s_1,\dots,s_n\}$ is any finite generating set of $G$, one can obtain a generating set $S$ satisfying (2.1) from $S_0$ by applying a sequence of right Nielsen transformations, that is, transformations of the form $(g_1,\dots,g_i,\dots,g_n)\mapsto(g_1,\dots,g_ig_j^{\pm1},\dots,g_n)$ for some $i\neq j$, and possibly inverting one of the generators.
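The reduction in Lemma 2.3 can be illustrated on the level of the images $\theta(s_i)\in G/K\cong\mathbb Z^m$. The sketch below is our own illustration, not the argument of the paper: it assumes $E$ is identified with the standard basis of $\mathbb Z^m$ and that the input vectors generate $\mathbb Z^m$, and it uses a Euclidean-algorithm style column reduction in which every step is a move $v_i\mapsto v_i\pm v_j$ ($i\neq j$) or $v_i\mapsto -v_i$, mirroring right Nielsen transformations and inversions. The name `nielsen_reduce` is hypothetical.

```python
def nielsen_reduce(vectors):
    """Reduce integer vectors generating Z^m to standard basis vectors and zeros.

    Returns the reduced vectors and the list of moves (name, i, j) that were applied.
    """
    vs = [list(v) for v in vectors]
    m = len(vs[0])
    moves = []

    def add_multiple(i, j, c):
        """v_i <- v_i + c * v_j, realised as |c| moves of the form v_i <- v_i +- v_j."""
        sign = 1 if c > 0 else -1
        for _ in range(abs(c)):
            vs[i] = [a + sign * b for a, b in zip(vs[i], vs[j])]
            moves.append(("add" if sign > 0 else "sub", i, j))

    pivots = []                          # pivots[r] = index of the vector that becomes e_r
    for r in range(m):
        free = [i for i in range(len(vs)) if i not in pivots]
        while True:
            nonzero = [i for i in free if vs[i][r] != 0]
            if len(nonzero) <= 1:
                break
            p = min(nonzero, key=lambda i: abs(vs[i][r]))
            for i in nonzero:
                if i != p:               # Euclidean step on the r-th coordinate
                    add_multiple(i, p, -(vs[i][r] // vs[p][r]))
        if not nonzero or abs(vs[nonzero[0]][r]) != 1:
            raise ValueError("the vectors do not generate Z^m")
        p = nonzero[0]
        if vs[p][r] == -1:               # inversion of a generator
            vs[p] = [-a for a in vs[p]]
            moves.append(("invert", p, p))
        pivots.append(p)
    for k in range(m):                   # clear the coordinates below each pivot entry
        for r in range(k + 1, m):
            add_multiple(pivots[k], pivots[r], -vs[pivots[k]][r])
    return [tuple(v) for v in vs], moves


reduced, moves = nielsen_reduce([(2, 1), (3, 2), (1, 1)])
print(reduced)
```

Applying the recorded moves to the generators themselves (multiplying $g_i$ by $g_j^{\pm1}$ on the right, or inverting $g_i$) would produce a generating set whose image under $\theta$ is $E$ or $E\cup\{0\}$, under the assumptions stated above.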
From now on assume that $S$ satisfies the conclusion of Lemma 2.3 (with respect to some fixed basis $E$). Choose an isomorphism $G/K\cong\mathbb Z^m$ which maps $E$ onto the standard basis of $\mathbb Z^m$. Let $\|\cdot\|$ denote the corresponding $l^2$-norm on $G/K$, and let $B(R)$ denote the $l^2$-ball of radius $R$ with respect to this norm (centered at $0$). The goal now is to show that $\theta^{-1}(B(R))$ is connected for sufficiently large $R$ (this would imply that $K$ is finitely generated by Theorem 2.2). To do this, one chooses an arbitrary path $p$ in ${\rm Cay}(G,S)$ whose end vertices $a$ and $b$ lie in $\theta^{-1}(B(R))$ and then applies a (finite) sequence of modifications to $p$, so that the resulting path lies entirely in $\theta^{-1}(B(R))$.
At each step the modification is as follows. Choose a vertex $g$ on the current path such that $\|\theta(g)\|$ is maximal. If $\|\theta(g)\|\leq R$, there is nothing to do, so assume that $\|\theta(g)\|>R$.
Since $\chi_g$ vanishes on $K$ and we assume that $\Sigma(G)\supseteq S(G/K)$, there exist paths $p_{t,y_2,g}$ from $t$ to $y_2t$ and $q_{t,y_1,g}$ from $y_1t$ to $t$ such that $\chi_g$ is positive on any vertex of those paths. Now we replace the segment $(gy_1,g,gy_2)$ of the current path by a new subpath passing through $gy_1$, $gy_1t$, $gt$, $gy_2t$, $gy_2$, where one moves from $gy_1t$ to $gt$ using the path $g\cdot q_{t,y_1,g}$ and from $gt$ to $gy_2t$ using the path $g\cdot p_{t,y_2,g}$ (see Figure 1).
A key step of the proof is a compactness argument which shows that there is a finite set of paths $\Omega$ and $\varepsilon>0$ such that for any character $\chi_g$ arising above we can find desired paths $p_{t,y_2,g}$ and $q_{t,y_1,g}$ in $\Omega$ and moreover $\chi_g(v)\geq\varepsilon\|\theta(g)\|$ for any vertex $v$ of $p_{t,y_2,g}$ or $q_{t,y_1,g}$. Now let $r$ be the maximum of $\|\theta(v)\|$ where $v$ ranges over the vertices of all paths from $\Omega$. The direct computation below shows that if $R\geq\frac{r^2}{2\varepsilon}$, then $\|\theta(z)\|<\|\theta(g)\|$ for any vertex $z$ on the newly added segment. This concludes the (sketch of) proof of the "if" direction of Theorem 2.1.
Indeed, any new vertex $z$ has the form $gv$ where $v$ lies on $q_{t,y_1,g}$ or $p_{t,y_2,g}$. Therefore,
As the above outline suggests, in order to turn this proof into an actual algorithm (for a specific group), one needs to have an explicit procedure for constructing the paths $p_{t,y_2,g}$ and $q_{t,y_1,g}$. We will now discuss some additional conditions which make this possible.
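For the reader's convenience, the norm estimate used in the outline above can be sketched as follows. This is a reconstruction under an assumption that is not visible in the text as reproduced here: we take $\chi_g(x)=-\langle\theta(g),\theta(x)\rangle$, where $\langle\cdot,\cdot\rangle$ is the standard inner product on $G/K\cong\mathbb Z^m$. For a new vertex $z=gv$ with $\|\theta(v)\|\leq r$ and $\chi_g(v)\geq\varepsilon\|\theta(g)\|$ one would then get:

```latex
\begin{align*}
\|\theta(z)\|^2 &= \|\theta(g)+\theta(v)\|^2 \\
  &= \|\theta(g)\|^2 + 2\langle\theta(g),\theta(v)\rangle + \|\theta(v)\|^2 \\
  &\le \|\theta(g)\|^2 - 2\varepsilon\|\theta(g)\| + r^2 \\
  &< \|\theta(g)\|^2 .
\end{align*}
```

The last inequality uses $\|\theta(g)\|>R\geq\frac{r^2}{2\varepsilon}$, which gives $2\varepsilon\|\theta(g)\|>r^2$. With a differently normalized $\chi_g$ the constants would change accordingly.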
2.3. Explicitly finding non-negative forms. The following notion of a non-negative form of a group element provides a convenient way to reformulate the definition of $\Sigma(G)$.
Definition 2.4. Let $S$ be a generating set of a group $G$, let $\chi$ be a nonzero character of $G$ and assume that $\chi(g)\geq 0$ for some $g\in G$. By a $(\chi,S)$-non-negative form of $g$ we will mean an $S$-word $w$ which represents $g$ such that $\chi(v)\geq 0$ for every prefix $v$ of $w$.
It is clear that a given $g\in G$ with $\chi(g)\geq 0$ admits a $(\chi,S)$-non-negative form if and only if there is a path in ${\rm Cay}(G,S)_\chi$ connecting $1$ with $g$ (recall that ${\rm Cay}(G,S)_\chi$ is the full subgraph of ${\rm Cay}(G,S)$ with the vertex set $\{x:\chi(x)\geq 0\}$). Thus, $[\chi]\in\Sigma(G)$ if and only if every $g$ with $\chi(g)\geq 0$ admits a $(\chi,S)$-non-negative form.
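The correspondence between non-negative forms and paths in ${\rm Cay}(G,S)_\chi$ can be tried out on a toy example. The sketch below assumes $G=\mathbb Z^2$ with $S=\{x,y\}$ the standard basis (our choice, not an example from the paper) and searches for a path along which $\chi\geq 0$. Since the graph is infinite, the search is confined to a finite window, which is an additional assumption; the function name `nonnegative_form` is ours.

```python
from collections import deque

# Letters of S and their inverses (X, Y denote the inverses of x, y).
LETTERS = {"x": (1, 0), "X": (-1, 0), "y": (0, 1), "Y": (0, -1)}


def nonnegative_form(target, chi, box=20):
    """Search Cay(Z^2, {x, y}) for a path from 0 to target with chi >= 0 at every vertex.

    Only vertices with |coordinates| <= box are explored; returns a word or None.
    """
    start = (0, 0)
    parent = {start: None}
    queue = deque([start])
    while queue:
        v = queue.popleft()
        if v == target:
            word = []
            while parent[v] is not None:
                v, letter = parent[v]
                word.append(letter)
            return "".join(reversed(word))
        for letter, (dx, dy) in LETTERS.items():
            w = (v[0] + dx, v[1] + dy)
            if w not in parent and max(abs(w[0]), abs(w[1])) <= box and chi(w) >= 0:
                parent[w] = (v, letter)
                queue.append(w)
    return None


def chi(v):
    """A sample character on Z^2: chi(a, b) = a - 2b."""
    return v[0] - 2 * v[1]


print(nonnegative_form((2, 1), chi), nonnegative_form((-1, -1), chi))
```

In an abelian group every nonzero character admits such forms, so this example only illustrates the definition; the interesting cases are non-abelian groups such as $IA_n$, where connectivity of ${\rm Cay}(G,S)_\chi$ is a genuine restriction.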
There is a well-known sufficient condition for a character to lie in the BNS invariant:
Lemma 2.5. Suppose $G$ is generated by $S=\{s_1,\dots,s_n\}$ and $\chi$ is a character of $G$ such that (i) $\chi(s_1)>0$; (ii) for every $i\geq 2$ there exists $j<i$ such that $[s_j,s_i]=1$ and $\chi(s_j)\neq 0$. Then $[\chi]\in\Sigma(G)$.
Lemma 2.5 was proved in [KMM, Lemma 1.9], although indirectly it appeared already in [MeVW]; see also [EH, Lemma 2.4] for a generalization. Although the condition in Lemma 2.5 may appear very special, there are many important classes of groups $G$ for which every character in $\Sigma(G)$ does satisfy this condition for some $S$. For instance, this is the case for right-angled Artin groups [MeVW], groups of pure symmetric automorphisms of free groups [OK], pure braid groups [KMM], $IA_n$ for $n\geq 5$ and $\mathcal I_n^b$ for $n\geq 5$, $b=0,1$ (see [CEP, EH]).
If a generating set $S$ and a character $\chi$ satisfy the hypotheses of Lemma 2.5, it is not difficult to describe an algorithm which computes a $(\chi,S)$-non-negative form for a given $g\in G$ with $\chi(g)\geq 0$. In fact, such an algorithm implicitly appears in the proof of [MeVW, Theorem 4.1].
In this paper it will be more convenient to work with a slightly more restrictive condition, which still holds for $IA_n$ and $\mathcal I_n^b$ for sufficiently large $n$ and leads to a very simple formula for a $(\chi,S)$-non-negative form.
We will need some technical definitions.
Definition 2.6. Let $S$ be a finite generating set for a group $G$ and $Z$ a subset of $S$.
(a) We will say that the pair $(S,Z)$ is chain-centralizing if for every $s\in S$ and $z\in Z$ there exists $z'\in Z$ which commutes with both $s$ and $z$.
(b) If $\chi$ is a character of $G$, we will say that $\chi$ is regular for $(S,Z)$ if $\chi(z)\neq 0$ for all $z\in Z$.
Remark. If $\chi$ is regular for a chain-centralizing pair $(S,Z)$, it is easy to show that the pair $(\chi,S)$ satisfies the hypothesis of Lemma 2.5 (for a suitable ordering of $S$), but we will not use this fact in the proofs.
The name chain-centralizing is motivated by the following property, which is an obvious consequence of the definition:
Observation 2.7. Suppose that $(S,Z)$ is chain-centralizing. Then for any finite sequence $s_1,\dots,s_k\in S$ there exists a sequence $z_1,\dots,z_k\in Z$ such that (i) $z_i$ commutes with $s_i$ for each $1\leq i\leq k$; (ii) $z_i$ commutes with $z_{i+1}$ for each $1\leq i\leq k-1$.
The following lemma shows that if $\chi$ is regular for a chain-centralizing pair $(S,Z)$, it is very easy to construct $(\chi,S)$-non-negative forms:
Lemma 2.8. Suppose that $(S,Z)$ is chain-centralizing and $\chi$ is regular for $(S,Z)$. Let $g\in G$ with $\chi(g)\geq 0$, and write $g=s_1\dots s_k$ with $s_i\in S^{\pm1}$. Choose $z_1,\dots,z_k$ satisfying conditions (i) and (ii) of Observation 2.7, and choose
Proof. The word $w_\chi$ represents $g$ by conditions (i) and (ii) of Observation 2.7: we first move $z_1^{-n_1}$ past $s_1$ and $z_2^{n_2}$ and cancel it with $z_1^{n_1}$, then we move $z_2^{-n_2}$ past $s_2$ and $z_3^{n_3}$ and cancel it with $z_2^{n_2}$, etc.
Let us now prove that $\chi(v)\geq 0$ for every $S$-prefix $v$ of $w_\chi$. Without loss of generality, we can assume that $\chi(z_i)>0$ for all $i$, in which case $n_i\geq 0$ by (a). If $v$ does not end with $z_i^{-n_i}$ or $s_i$ for some $i$, we can produce another $S$-prefix $v'$ of $w_\chi$ with $\chi(v')\leq\chi(v)$ by either removing the last letter of $v$ or adding the next letter of $w_\chi$ to the end of $v$. Thus, it suffices to prove that $\chi(v)\geq 0$ when $v$ ends with $z_i^{-n_i}$ or $s_i$.
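The construction behind Lemma 2.8 can be simulated numerically, because checking non-negativity of prefixes only requires the values of $\chi$ on the letters. The sketch below rests on assumptions of ours, since the precise word and the conditions on the exponents are not visible in the text as reproduced here: we take the word $z_1^{n_1}s_1z_2^{n_2}z_1^{-n_1}s_2z_3^{n_3}z_2^{-n_2}\cdots s_kz_k^{-n_k}$, which is consistent with the cancellation pattern described in the proof, and we choose each $n_i$ smallest in absolute value with $n_i\chi(z_i)\geq\max(0,-\chi(s_1\cdots s_{i-1}),-\chi(s_1\cdots s_i))$. The function names are hypothetical.

```python
def lemma_style_word(s_values, z_values):
    """Letters of z_1^{n_1} s_1 z_2^{n_2} z_1^{-n_1} s_2 ... s_k z_k^{-n_k}.

    s_values[i] = chi(s_{i+1}) and z_values[i] = chi(z_{i+1}) != 0 (integers or Fractions).
    Returns a list of (name, chi_value) pairs, one per letter.
    """
    k = len(s_values)
    partial = [0]
    for x in s_values:
        partial.append(partial[-1] + x)          # partial[i] = chi(s_1 ... s_i)
    if partial[-1] < 0:
        raise ValueError("chi(g) must be non-negative")
    n = []
    for i in range(k):
        need = max(0, -partial[i], -partial[i + 1])
        size = -(-need // abs(z_values[i]))      # exact ceiling of need / |chi(z_i)|
        n.append(size if z_values[i] > 0 else -size)

    def power(i, exponent):
        step = 1 if exponent > 0 else -1
        return [(f"z{i + 1}^{step:+d}", step * z_values[i])] * abs(exponent)

    word = power(0, n[0])
    for i in range(k):
        word.append((f"s{i + 1}", s_values[i]))
        if i + 1 < k:
            word += power(i + 1, n[i + 1])
        word += power(i, -n[i])
    return word


def prefix_values(word):
    """chi evaluated on every nonempty prefix of the word."""
    total, values = 0, []
    for _, value in word:
        total += value
        values.append(total)
    return values


word = lemma_style_word([-3, 1, 4], [2, -1, 5])
print(min(prefix_values(word)), prefix_values(word)[-1])
```

The code does not verify that the word represents $g$; that part relies on the commutation relations (i) and (ii) of Observation 2.7, which cannot be seen from the $\chi$-values alone.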
2.4. Extra hypothesis. In this subsection we introduce the additional condition that will allow us to turn the proof of the BNS criterion into an algorithm. As before, we will assume that $G$, $K$, $E$, $\theta$ and $S$ satisfy the hypotheses and conclusion of Lemma 2.3:
In order to make use of Lemma 2.8, we need to know that every nonzero character is regular with respect to some chain-centralizing pair. Note that a single chain-centralizing pair would rarely work for all the characters (apart from rather trivial examples). Also observe that if we have one chain-centralizing pair $(S,Z)$, then for any $\varphi\in{\rm Aut}(G)$, the pair $(\varphi(S),\varphi(Z))$ is also chain-centralizing.
This motivates our new hypothesis. We would like to assume that there is a finite subset $\Phi\subseteq{\rm Aut}(G)$ with the following property: for every nonzero character $\chi$ of $G$ which vanishes on $K$, there is some $Z\subseteq S$ and $\varphi\in\Phi$ such that $(S,Z)$ is chain-centralizing and $\chi$ is regular for $(\varphi(S),\varphi(Z))$. In fact, we will need to assume a bit more (see the Regularity Hypothesis below), but first we will introduce some additional notation involving automorphisms of $G$.
Constants $A$ and $B$. Let $\varphi\in{\rm Aut}(G)$. Define $B(\varphi,S)=\max\{\|\theta(\varphi(s))\|_1 : s\in S\}$ (where $\|\cdot\|_1$ denotes the $l^1$-norm with respect to $E$). Now define $A=A(\varphi,S)$ to be the smallest integer with the following property: for every $s\in S$, there is an $S$-word $w_{\varphi,s}$ representing $\varphi(s)$ such that $\|\theta(v)\|_1\leq A$ for every $S$-prefix $v$ of $w_{\varphi,s}$. Since $w_{\varphi,s}$ is its own $S$-prefix, we have the obvious inequality $B(\varphi,S)\leq A(\varphi,S)$.
If $\Phi$ is a finite subset of ${\rm Aut}(G)$, we define
When $S$ is fixed or clear from the context, we will usually suppress it from the notation and write $A(\varphi)$ for $A(\varphi,S)$, etc.
Constant M .For each character χ define We are now ready to state our additional hypothesis.Let Aut(G, K) denote the subgroup of Aut(G) consisting of automorphisms which leave K invariant.
Regularity Hypothesis.There exist a finite subset Φ ⊆ Aut(G, K) and a constant C > 0 with the following property: C for all z ∈ Z.
Before proceeding, we establish a few simple inequalities involving the constants A, B and M .
Observation 2.9. Let $g\in G$. Then $\|\theta(g)\|_1$ is the smallest integer $m$ for which there exists $h\in G$ with $\|h\|_S=m$ and $\theta(h)=\theta(g)$.
Proof. This immediately follows from the assumption that $\theta(S)=E$ or $E\cup\{0\}$.
Proof. Fix $s\in S^{\pm1}$. By definition of the constant $A$ there exists an $S$-word $x_1\dots x_k$ (with
Fix such an index $l$. By Observation 2.9 and the choice of the word $x_1\dots x_k$, there exists an $S$-word $v=y_1\dots y_m$ with $m\leq A$ such that $\theta(\prod_{j=1}^l x_j)=\theta(v)$ and hence $\prod_{j=1}^l x_j=vz$ with $z\in K$.
Then $\varphi(\prod_{j=1}^l x_j)=\varphi(v)\varphi(z)$, and since
2.5. The main result. In this subsection we will prove the main result of this section, Theorem 2.13. We start with a key proposition which shows that under the Regularity Hypothesis, we can control the "size" of $(\chi,S)$-non-negative forms with respect to any character $\chi$ of $G/K$.
Proposition 2.12. Let $G$, $K$ and $S$ be as in Lemma 2.3, and assume that the Regularity Hypothesis holds for some $\Phi\subseteq{\rm Aut}(G,K)$ and a constant $C>0$.
and write $g=s_1\dots s_r$ with $s_i\in S^{\pm1}$. Then there exists a $(\chi,S)$-non-negative form $w$ of $g$ such that for any prefix $v$ of $w$ we have $\|\theta(v)\|_1\leq(2BC+1)(AB+A+r)+2A$.
Proof. The basic idea is very simple. Choose $\varphi\in\Phi$ such that (***) in the Regularity Hypothesis holds for $\chi$, express $g$ as a $\varphi(S)$-word and then use Lemma 2.8 to construct a $(\chi,\varphi(S))$-non-negative form of $g$. This almost works; the issue is that when we rewrite the obtained $(\chi,\varphi(S))$-non-negative form as an $S$-word, prefixes of this $S$-word may have negative $\chi$-values. However, we have a lower bound for those $\chi$-values: $\chi(v)\geq -MA$ for every such prefix $v$, where $M=M(\chi)$.
To resolve this problem we choose $t\in S^{\pm1}$ with $\chi(t)=M$ and apply the same argument
We now present the full argument. For convenience, we break the construction into three steps.
Step 1: Choose $t\in S^{\pm1}$ with $\chi(t)=M$. Recall that $s_1\dots s_r$ is an $S$-word representing $g$ and hence $p=t^{-A}s_1\dots s_rt^{A}$ is an $S$-word representing $t^{-A}gt^{A}$. Replacing each factor $s\in\{s_1,\dots,s_r,t^{\pm1}\}$ in the word $p$ by the corresponding $\varphi(S)$-word representing it, given by Claim 2.11, we obtain a $\varphi(S)$-word $\tilde p$ representing $t^{-A}gt^{A}$.
Let v be an arbitrary ϕ(S)-prefix of p .We claim that (1) For convenience let us write s i , and there exists some −A ≤ j < r + A and a ϕ(S)-prefix w of The first summand is bounded above by A+ r (this upper bound may occur for j = r since for larger j we start getting cancellations of θ(t) with θ(t −1 )).Also θ(w) 1 ≤ AB by Claim 2.11, so we proved (1).Inequality (2) immediately follows from (1) and Claim 2.10.
Step 2: Next we use Lemma 2.8 to construct a $(\chi,\varphi(S))$-non-negative form of $\tilde p$, call it $q$ (the elements $z_i$ in Lemma 2.8 will be chosen from $\varphi(Z)$). For each $i$ we shall choose $n_i$ to be smallest in absolute value satisfying the inequality in Lemma 2.8. Since $|\chi(z_i)|\geq\frac{M}{C}$ by the Regularity Hypothesis and $|\chi(v)|\leq(AB+A+r)M$ for every prefix $v$ of $\tilde p$, we have $|n_i|\leq C(AB+A+r)$.
Now we need to establish the analogue of (1) for prefixes of $q$. Let $v$ be a $\varphi(S)$-prefix of $q$. It is clear from the construction that there is a $\varphi(S)$-prefix $w$ of $\tilde p$ such that $\theta(v)$ is equal to $\theta(w)$ or $\theta(wz_i^m)$ with $|m|\leq|n_i|$ for some $i$ or $\theta(wz_i^{n_i}z_{i+1}^m)$ with $|m|\leq|n_{i+1}|$ for some $i$. In any case we have $\|\theta(v)\|_1\leq\|\theta(w)\|_1+|n_i|\,\|\theta(z_i)\|_1+|n_{i+1}|\,\|\theta(z_{i+1})\|_1$. Since $\|\theta(w)\|_1\leq AB+A+r$ by (1), $\|\theta(z_j)\|_1\leq B$ for all $j$ by definition of $B$ and $|n_j|\leq C(AB+A+r)$ for all $j$ as established above, we get
(3) $\|\theta(v)\|_1\leq(2BC+1)(AB+A+r)$.
Step 3: Next we rewrite $q$ as an $S$-word, call it $\tilde q$. Let $v$ be an $S$-prefix of $\tilde q$. We claim that
(4) $\|\theta(v)\|_1\leq(2BC+1)(AB+A+r)+A$;
(5) $\chi(v)\geq -MA$.
Recall now that $\tilde q$ represents $t^{-A}gt^{A}$ in $G$ and hence $t^{A}\tilde qt^{-A}$ represents $g$. We claim that $t^{A}\tilde qt^{-A}$ is the desired $(\chi,S)$-non-negative form. Take any $S$-prefix $w$ of $t^{A}\tilde qt^{-A}$. We need to show that
(6) $\|\theta(w)\|_1\leq(2BC+1)(AB+A+r)+2A$;
(7) $\chi(w)\geq 0$.
Clearly, there are 3 cases:
Case 1: $w=t^m$ for some $0\leq m\leq A$. In this case both (6) and (7) are obvious.
Case 2: $w=t^{A}v$ for some prefix $v$ of $\tilde q$. In this case $\|\theta(v)\|_1\leq(2BC+1)(AB+A+r)+A$ by (4), so (6) holds, and (7) follows from (5) and the fact that $\chi(t^{A})=MA$ (by the choice of $t$).
Case 3: w = t A qt −m for some 0 ≤ m ≤ A. In this case (6) again follows from (4).Finally, χ(t We can now construct an explicit finite subset of G/K whose preimage in Cay(G, S) is connected.We will show that the l ∞ -ball of a certain radius has this property: Theorem 2.13.We keep all the hypotheses and notations from Proposition 2.12 and let R = 16(BC + 1) To simplify terminology, in the proof of Theorem 2.13 we will occasionally talk about the l 2 -norm or l ∞ -norm for an element g ∈ G, by which we mean the corresponding norm of θ(g).

Proof.
Recall that $E$ is a fixed basis of $G/K$ such that $\theta(S)=E$ or $E\cup\{0\}$. As before, we choose an isomorphism $G/K\to\mathbb Z^m$ which maps $E$ to the standard basis of $\mathbb Z^m$. We will generally follow the outline of the proof of Theorem 2.1 given earlier in this section. Here is the summary of the new ingredients that we will use:
(i) The compactness argument will be replaced by a reference to Proposition 2.12, which enables us to get an explicit formula for $R$.
(ii) In the proof we will use not just the $l^2$-norm, but also the $l^\infty$-norm and the $l^1$-norm (where all the norms are taken with respect to the chosen identification of $G/K$ with $\mathbb Z^m$).
(iii) When modifying the path at each step we will make slightly more complicated "detours".
Note that modification (i) is essential, while (ii) and (iii) will only be used to obtain a better estimate.
So take any $a,b\in\theta^{-1}(B_\infty(R))$. Our goal is to show that there is a path from $a$ to $b$ in ${\rm Cay}(G,S)$ whose projection lies in $B_\infty(R)$. We start by choosing some path $p$ from $a$ to $b$ in ${\rm Cay}(G,S)$ and then apply a sequence of modifications, eventually pushing it inside $\theta^{-1}(B_\infty(R))$.
If $\theta(p)$ lies inside $B_\infty(R)$, we are done, so assume that $\theta(p)$ has at least one vertex outside $B_\infty(R)$. Among all the vertices of $p$ outside $B_\infty(R)$ we choose one with the largest $l^2$-norm, call it $g$. Our goal is to replace $p$ by another path $p'$ from $a$ to $b$ which does not have any vertices with $l^2$-norm larger than $\|g\|_2$ and has fewer vertices with $l^2$-norm equal to $\|g\|_2$ than $p$. Clearly, after applying such a modification finitely many times, we will obtain a path inside $\theta^{-1}(B_\infty(R))$, as desired. Note that the maximal $l^\infty$-norm may increase during some initial steps; this is not a problem.
To prove the claim recall that by assumption $g$ has the largest $l^2$-norm among the vertices of $p$ which lie outside of
Thus, we proved that ||θ(gy
Let $\theta(g)_i$ denote the $i$-th coordinate of $\theta(g)$, and choose $i$ such that $\theta(g)_i$ is maximal in absolute value. Since by assumption $\theta(g)\notin B_\infty(R)$, we have $|\theta(g)_i|>R$. Choose $t\in S^{\pm1}$ with $\theta(t)=\pm e_i$ where the sign is chosen to be the same as the sign of $\theta(g)_i$.
We proceed with the construction of $p'$ (see Figure 2 for an illustration). The path $p'$ will coincide with $p$ prior to $gy_1$ and after $gy_2$, but we will connect $gy_1$ and $gy_2$ by a new subpath which passes through the vertices $gy_1t^{-r}$, $gt^{-r}$ and $gy_2t^{-r}$, in this order, where $r$, satisfying $1\leq r\leq R$, will be chosen later. We connect $gy_1$ with $gy_1t^{-r}$ in the natural way (multiplying $r$ times by $t^{-1}$) and similarly $gy_2t^{-r}$ with $gy_2$. To connect $gt^{-r}$ with $gy_2t^{-r}$, we write $gy_2t^{-r}=(gt^{-r})(t^ry_2t^{-r})$ and then replace the suffix $t^ry_2t^{-r}$ by its $(\chi,S)$-non-negative form constructed in Proposition 2.12 (the latter is applicable since $\chi(t^ry_2t^{-r})=\chi(y_2)\geq 0$). Similarly we connect $gy_1t^{-r}$ with $gt^{-r}$.
We now need to show that if $v$ is any vertex on this new subpath different from the end vertices $gy_1$ and $gy_2$, then $\|\theta(v)\|_2 < \|\theta(g)\|_2$. Recall that $\theta(t) = \pm e_i$, so all the vertices on the segment from $gy_1$ to $gy_1t^{-r}$ differ only in the $i$th coordinate. Moreover, by assumption $|\theta(gy_1)_i| \geq |\theta(g)_i| - 1 \geq R$ and $\theta(t)_i$ and $\theta(gy_1)_i$ have the same sign. Thus, if we require that $r \leq R$, then $|\theta(gy_1t^{-j})_i| = |\theta(gy_1)_i| - j$ for $0 \leq j \leq r$, so the $l_2$-norm strictly goes down as we move from $gy_1$ to $gy_1t^{-r}$. Similarly, the $l_2$-norm strictly goes down when we move from $gy_2$ to $gy_2t^{-r}$.
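The coordinate estimate above can be checked on a small example. The following is a minimal sketch, assuming hypothetical helper names `l2_norm_sq` and `step_back` (they are not taken from the paper) and illustrative values of $R$ and of the vector; it moves a vector of $\mathbb{Z}^m$ back along the $i$th coordinate $r$ times, as multiplication by $t^{-1}$ does to $\theta(gy_1)$, and records the squared $l_2$-norms along the way.

```python
def l2_norm_sq(v):
    """Squared l_2-norm of an integer vector."""
    return sum(x * x for x in v)


def step_back(v, i, r):
    """Return the vectors v - j*sign(v_i)*e_i for j = 0, ..., r.

    This models the projections of g*y1*t^{-j}, where theta(t) = +-e_i
    carries the same sign as the i-th coordinate of v (assumed nonzero).
    """
    sign = 1 if v[i] > 0 else -1
    path = []
    for j in range(r + 1):
        w = list(v)
        w[i] = v[i] - sign * j
        path.append(tuple(w))
    return path


# Illustrative values only: R = 5 and a vector whose coordinate 1 has |v_1| >= R.
R = 5
v = (2, -7, 3)
path = step_back(v, 1, R)
norms = [l2_norm_sq(w) for w in path]
```

As long as $r \leq |v_i|$, the absolute value of the $i$th coordinate drops by one at each step, so the list `norms` is strictly decreasing.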
2.6. An explicit generating set for $K$. To finish the constructive proof of finite generation given in this section, we need to establish an effective version of Theorem 2.2.
For a technical reason, in the following two results it will be convenient to work with left Cayley graphs (note that earlier in this section we worked with the commonly used right Cayley graphs). The left Cayley graph of a group $G$ with respect to $S$, denoted by $\mathrm{Cay}_{\mathrm{left}}(G,S)$, is defined in the same way as $\mathrm{Cay}(G,S)$ except that edges have the form $(g, s^{\pm 1}g)$ with $s \in S$. Note that $\mathrm{Cay}_{\mathrm{left}}(G,S)$ and $\mathrm{Cay}(G,S)$ are isomorphic as graphs via the inversion map $g \mapsto g^{-1}$.
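The isomorphism between the two Cayley graphs can be illustrated on a small finite group. The sketch below is only an illustration under our own choices: the group is the symmetric group on three letters, the generating set consists of one transposition and one 3-cycle, and the helper names `compose` and `inverse` are hypothetical. It builds both edge sets and applies the inversion map to the right Cayley graph.

```python
from itertools import permutations


def compose(p, q):
    """Return the permutation p∘q (apply q first, then p)."""
    return tuple(p[q[x]] for x in range(len(q)))


def inverse(p):
    """Return the inverse permutation of p."""
    inv = [0] * len(p)
    for x, y in enumerate(p):
        inv[y] = x
    return tuple(inv)


G = list(permutations(range(3)))   # all six elements of the group
S = [(1, 0, 2), (1, 2, 0)]         # a transposition and a 3-cycle

# Undirected edges {g, gs} of the right graph and {g, sg} of the left graph.
right_edges = {frozenset((g, compose(g, s))) for g in G for s in S}
left_edges = {frozenset((g, compose(s, g))) for g in G for s in S}

# Image of the right Cayley graph under g -> g^{-1}.
inverted = {frozenset(inverse(v) for v in e) for e in right_edges}
```

In this example `inverted` coincides with `left_edges`, reflecting the identity $(gs)^{-1} = s^{-1}g^{-1}$.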
Theorem 2.14 below is a variation of the Reidemeister-Schreier rewriting process. This result is undoubtedly well known, but we are not aware of a specific reference in the literature, so we will provide a proof.
Recall that if K is a subgroup of a group G, a left transversal of K in G is a subset T of G which contains exactly one element from each left coset of K.
Theorem 2.14. Let $G$ be a group generated by a set $S$, let $K$ be a (not necessarily normal) subgroup of $G$, and let θ : Then $K$ is generated by the set where for every $g \in G$ by $\overline{g} \in T$ we denote the unique element of $T$ such that $\theta(\overline{g}) = \theta(g)$.
In particular, if F and S are finite, then K can be generated by (at most) |F||S| elements.
Remark. If we take $F = G/K$, then $\theta^{-1}(F) = G$ is automatically connected. In this case $S_K$ is the usual Reidemeister-Schreier generating set for $K$.
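The special case in the preceding remark can be tried out on a toy example. The following sketch assumes that the elements of $S_K$ have the form $\overline{st}^{-1}st$ with $s \in S$ and $t \in T$ (this is the shape used in the proof of Theorem 2.16 below); the group (permutations of three letters), the subgroup (even permutations), the transversal and all helper names are our own illustrative choices, not taken from the paper.

```python
from itertools import permutations


def compose(p, q):
    """Return the permutation p∘q (apply q first, then p)."""
    return tuple(p[q[x]] for x in range(len(q)))


def inverse(p):
    """Return the inverse permutation of p."""
    inv = [0] * len(p)
    for x, y in enumerate(p):
        inv[y] = x
    return tuple(inv)


def is_even(p):
    """True if p has an even number of inversions."""
    n = len(p)
    return sum(p[i] > p[j] for i in range(n) for j in range(i + 1, n)) % 2 == 0


def closure(gens, identity):
    """Subgroup of a finite group generated by gens (left multiplication)."""
    seen, frontier = {identity}, [identity]
    while frontier:
        g = frontier.pop()
        for s in gens:
            h = compose(s, g)
            if h not in seen:
                seen.add(h)
                frontier.append(h)
    return seen


identity, swap, cycle = (0, 1, 2), (1, 0, 2), (1, 2, 0)
S = [swap, cycle]        # generators of G
T = [identity, swap]     # transversal of K (the even permutations) in G


def rep(g):
    """The element of T lying in the same coset of K as g."""
    return identity if is_even(g) else swap


# Schreier-type generators rep(st)^{-1} * s * t.
S_K = {compose(inverse(rep(compose(s, t))), compose(s, t)) for s in S for t in T}
K = {p for p in permutations(range(3)) if is_even(p)}
generated = closure(S_K, identity)
```

Here `S_K` has at most `len(T) * len(S)` elements, all lying in `K`, and the subgroup they generate is all of `K`.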
Proof. Let $K'$ be the subgroup generated by $S_K$. Clearly $S_K \subseteq K$ (since $gK = \overline{g}K$ for all $g \in G$ by definition of $\overline{g}$), so $K' \subseteq K$. Let us now prove that $K \subseteq K'$. Take any $k \in K$. Since $K = \theta^{-1}(1) \subseteq \theta^{-1}(F)$ and $\theta^{-1}(F)$ is connected in $\mathrm{Cay}_{\mathrm{left}}(G,S)$, we can find a path $1 = y_0, y_1, \dots, y_m = k$ in $\mathrm{Cay}_{\mathrm{left}}(G,S)$ with $\theta(y_i) \in F$ for each $i$. Since $T$ is a transversal for $K$, for each $0 \leq i \leq m$ we can uniquely write $y_i = t_ik_i$ where $t_i \in T$ and $k_i \in K$. Note that $k_0 = 1$ and $k_m = k$. Thus, to prove that $k \in K'$ it suffices to show that By assumption $y_i$ and $y_{i+1}$ are connected by an edge in $\mathrm{Cay}_{\mathrm{left}}(G,S)$, so there exists $s \in S$ such that $y_{i+1} = sy_i$ or $y_i = sy_{i+1}$.
First consider the case $y_{i+1} = sy_i$. We have θ( In the case $y_i = sy_{i+1}$ we can repeat the above argument swapping $i$ and $i+1$ in every expression and observe in the end that

Making the generating set for $K$ more explicit. Let us now consider the special case where $G/K$ is abelian and torsion-free. Let $\theta : G \to G/K$ be the natural projection. Choose an ordered basis $E = \{e_1, \dots, e_n\}$ of $G/K$, and use it to identify $G/K$ with $\mathbb{Z}^n$. Recall that by Lemma 2.3, $G$ has a finite generating set $S$ such that $\theta(S) = E$ or $\theta(S) = E \cup \{0\}$. Note that $\theta$ restricted to $S$ need not be injective.
We will show that if F satisfies a certain technical condition (see the definition of a Schreier set below), one can obtain an even more explicit finite generating set for K (by slightly modifying the set S K from Theorem 2.14).
Choose $s_1, \dots, s_n \in S$ with $\theta(s_i) = e_i$, and let $S_1 = \{s_1, \dots, s_n\}$. Let $S_2$ be the elements of $S \setminus S_1$ which lie outside of $K$, and let $S_3$ be the elements of $S \setminus S_1$ which lie in $K$. Thus $S = S_1 \sqcup S_2 \sqcup S_3$. For each $s \in S_2$ let $d(s)$ be the unique integer such that $\theta(s) = e_{d(s)}$.
Remark. It is clear that for any $p \in [1, \infty]$, the $l_p$-ball centered at $0$ is a Schreier set.
Recall that for group elements $x, y$ we set $[x, y] = x^{-1}y^{-1}xy$ and $x^y = y^{-1}xy$.
Theorem 2.16. Assume that $G$, $K$ and $S$ satisfy the above conditions. Let $F \subseteq G/K$ be a Schreier subset of $G/K$ such that $\theta^{-1}(F)$ is connected. Then $K$ is generated by the following three types of elements:

Proof. Clearly all elements in (a)-(c) above lie in $K$. Let $T = \{s_1^{a_1} \cdots s_n^{a_n} : a_i \in \mathbb{Z} \text{ for all } i\}$. By construction $T$ is a transversal for $K$ in $G$; let $S_K$ be the corresponding set from Theorem 2.14. We need to show that every element of $S_K$ can be expressed in terms of elements of type (a)-(c). Let us denote the subgroup generated by those elements by $K'$. So take any $s \in S$, $t \in T$ with $\theta(t) \in F$. Thus $t = \prod_{j=1}^{n} s_j^{a_j}$ for some $(a_1, \dots, a_n) \in F$. We are also allowed to assume that $\theta(st) \in F$, but this extra condition will not be needed for the argument. We will consider 3 cases depending on which of the subsets $S_1$, $S_2$ and $S_3$ the generator $s$ lies in.
Case 2: $s \in S_2$. Let $j = d(s)$. As in Case 1 we write $t = uv$ where $u = \prod_{i=1}^{j} s_i^{a_i}$, so $\overline{st} = us_jv$. Hence $\overline{st}^{-1}st = (us_jv)^{-1}suv$ and therefore The last expression in (2.4) is an element of type (b). Since $\overline{s_jt}^{-1}s_jt$ lies in $K'$ by Case 1, it follows from (2.4) that $\overline{st}^{-1}st \in K'$ as well.
Case 3: $s \in S_3$. In this case $s \in K$, so $\overline{st} = t$ and hence $\overline{st}^{-1}st = t^{-1}st = s^t$, which is an element of type (c).

Effective finite generation of
In this section we will prove Theorem 1.1. Throughout this section we fix an integer $n \geq 2$ and let $[n] = \{1, 2, \dots, n\}$.
Magnus [Ma] proved that $IA_n$ is generated by the elements $K_{ij}$ with $i \neq j \in [n]$ and $K_{ijk}$ with $i, j, k \in [n]$, $i, j, k$ distinct, defined by $x_l \mapsto x_l$ for $l \neq i$.
Clearly $K_{ikj} = K_{ijk}^{-1}$, so $IA_n$ is generated by the set $\{K_{ij}\} \cup \{K_{ijk} : j < k\}$. Throughout this section we set $S = \{K_{ij}\} \cup \{K_{ijk} : j < k\}$ and will refer to $S$ as the Magnus generating set for $IA_n$.$^5$ An easy computation shows that $|S| = n\binom{n}{2}$. The following commutation relations between the Magnus generators of $IA_n$ are straightforward to check:

Lemma 3.1. The following hold: , m} and k ∈ {i, j}. In particular, two Magnus generators commute if their sets of indices are disjoint.
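The count $|S| = n\binom{n}{2}$ can be confirmed by direct enumeration. The sketch below is a minimal illustration; the function name `magnus_generators` and the encoding of $K_{ij}$ and $K_{ijk}$ as index tuples are our own conventions, not notation from the paper.

```python
from itertools import combinations
from math import comb


def magnus_generators(n):
    """Index tuples for the Magnus generators K_ij (i != j) and K_ijk (j < k)."""
    indices = range(1, n + 1)
    pairs = [(i, j) for i in indices for j in indices if i != j]
    # combinations() yields (j, k) with j < k; i must differ from both.
    triples = [(i, j, k) for i in indices
               for j, k in combinations(indices, 2) if i not in (j, k)]
    return pairs + triples


# Sizes of the generating set for a few small ranks.
sizes = {n: len(magnus_generators(n)) for n in range(2, 9)}
```

The two families contribute $n(n-1)$ and $n\binom{n-1}{2}$ elements respectively, and these add up to $n^2(n-1)/2 = n\binom{n}{2}$.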
has the natural structure of a $GL_n(\mathbb{Z})$-module. As a $GL_n(\mathbb{Z})$-module, $IA_n^{ab}$ is canonically isomorphic to $V^* \otimes (V \wedge V)$ where $V = \mathbb{Z}^n$, considered as a standard $GL_n(\mathbb{Z})$-module, and $V^* = \mathrm{Hom}(V, \mathbb{Z})$ is the dual module. This isomorphism was first established by Formanek [Fo], but there are several alternative proofs in the literature (e.g. see [DP]).
Let $e_1, \dots, e_n$ be the standard basis of $V$, and let $e_1^*, \dots, e_n^*$ be the dual basis. Given

$^5$ New, more geometric, proofs of the fact that $S$ generates $IA_n$ were given in [BBM] and [DP]. The proof given in [Ma] has two parts: one first shows that $S$ generates $IA_n$ as a normal subgroup of $\mathrm{Aut}(F_n)$ and then shows that the subgroup generated by $S$ is normal in $\mathrm{Aut}(F_n)$. Both [BBM] and [DP] gave very different proofs for the first part, but followed the original argument of Magnus for the second part.
From now on we will identify $IA_n^{ab}$ with $V^* \otimes (V \wedge V)$ via the map (3.1). Let $N = n\binom{n}{2} = \frac{n^2(n-1)}{2}$. By the above discussion, $IA_n^{ab} \cong \mathbb{Z}^N$ as abelian groups, and moreover the natural projection $IA_n \to IA_n^{ab}$ is injective on $S = \{K_{ij}\} \cup \{K_{ijk} : j < k\}$ and maps $S$ to $E = \{e_i^* \otimes (e_j \wedge e_k) : j < k\}$, which is a basis of $IA_n^{ab}$. In particular, $S$ satisfies the conclusion of Lemma 2.3 for $G = IA_n$ and $K = [IA_n, IA_n]$. Now let $Z = \{K_{12}, K_{34}, K_{56}, K_{78}\}$. By Lemma 3.1, elements of $Z$ commute with each other, and it is easy to check that every element of $S$ commutes with an element of $Z$. These two properties immediately imply that the pair $(S, Z)$ is chain-centralizing. We now need to construct $\Phi$ satisfying the Regularity Hypothesis, but first we make some general observations.

Action on the space of characters. For any group $G$ we have a natural action of $\mathrm{Aut}(G)$ on the space of characters $\mathrm{Hom}(G, \mathbb{R})$ given by $(\varphi\chi)(x) = \chi(\varphi^{-1}(x))$ for any $x \in G$ and $\varphi \in \mathrm{Aut}(G)$.
Clearly, Inn(G) acts trivially, so we get an action of Out(G) = Aut(G)/Inn(G).
Next note that the centralizer of $IA_n$ in $\mathrm{Aut}(F_n)$ is trivial (since already the centralizer of $\mathrm{Inn}(F_n)$ in $\mathrm{Aut}(F_n)$ is trivial) and hence the conjugation action of $\mathrm{Aut}(F_n)$ on $IA_n$ yields an embedding of $\mathrm{Aut}(F_n)$ into $\mathrm{Aut}(IA_n)$.$^6$ This, in turn, induces an embedding of $\mathrm{Aut}(F_n)/IA_n \cong GL_n(\mathbb{Z})$ into $\mathrm{Out}(IA_n)$ and thereby an action of $GL_n(\mathbb{Z})$ on $\mathrm{Hom}(IA_n, \mathbb{R})$. It is easy to check that this is a "standard" action, dual to the action of $GL_n(\mathbb{Z})$ on $IA_n^{ab}$ discussed above, but it is important for us that it comes from an action of $\mathrm{Aut}(F_n)$ on $IA_n$.
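The claim made earlier in this section, that every element of the Magnus generating set commutes with an element of $Z = \{K_{12}, K_{34}, K_{56}, K_{78}\}$, can be checked combinatorially. The sketch below relies only on the last statement of Lemma 3.1 (generators with disjoint index sets commute); the encoding of generators by their index sets and the helper names are our own, and the rank $n = 9$ is an arbitrary illustrative choice with $n \geq 8$.

```python
from itertools import combinations


def magnus_index_sets(n):
    """Index sets {i, j} of K_ij and {i, j, k} of K_ijk for rank n."""
    indices = range(1, n + 1)
    pairs = [frozenset((i, j)) for i in indices for j in indices if i != j]
    triples = [frozenset((i, j, k)) for i in indices
               for j, k in combinations(indices, 2) if i not in (j, k)]
    return pairs + triples


# Index sets of K_12, K_34, K_56, K_78.
Z_INDEX_SETS = [frozenset(p) for p in ((1, 2), (3, 4), (5, 6), (7, 8))]


def has_disjoint_partner(index_set):
    """True if some element of Z has indices disjoint from index_set."""
    return any(index_set.isdisjoint(z) for z in Z_INDEX_SETS)


all_ok = all(has_disjoint_partner(s) for s in magnus_index_sets(9))
```

The underlying reason is a pigeonhole count: a Magnus generator involves at most three indices, so it can meet at most three of the four disjoint pairs.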
The key technical result that we will prove in this section is the following lemma:

Lemma 3.2. There exists a finite subset $\Omega$ of $GL_n(\mathbb{Z})$ with the following properties:

Proof. Let $\Phi$ be the set of all elements $\varphi^{-1}$ where $\varphi$ ranges over all lifts from the conclusion of Lemma 3.2(b). Then $\Phi$ is finite (since $\Omega$ is finite) and $B(\Phi \cup \Phi^{-1}) \leq 150$ and $A(\Phi \cup \Phi^{-1}) \leq 8100$ by Lemma 3.2(b). Now let $\chi$ be any nonzero character of $G$. By Lemma 3.2(a) there exists $g \in \Omega$ such that $|g\chi(z)| \geq M(\chi)/3$ for all $z \in Z$. Let $\varphi \in \mathrm{Aut}(F_n)$ be the lift of $g$ from Lemma 3.2(b). Since $\chi(\varphi^{-1}(x)) = (\varphi\chi)(x) = g\chi(x)$ for all $x \in G$ and $\varphi^{-1} \in \Phi$ by construction, the Regularity Hypothesis holds for this $\Phi$ and $C = 3$.

The proof of Lemma 3.2 will consist of two parts. First we will construct $g$ satisfying (a). This will be done in several steps, and $g$ will be constructed as a product of at most 9 unit transvections and at most 2 permutation matrices (this ensures that there are only finitely many possibilities for $g$). Then we will prove (b) using the specific form of $g$ constructed in the proof of (a).
In the computations below it will be convenient to use the following notation: for a character $\lambda$ of $G$ and $i, j, k \in [n]$ we set $c_{ijk}(\lambda) = \lambda(e_i^* \otimes (e_j \wedge e_k))$. Note that we can reformulate the condition on $g$ in Lemma 3.2(a) in terms of the coefficients $c_{ijk}$ as follows: for $(i, j) \in \{(1, 2), (3, 4), (5, 6), (7, 8)\}$.
Given $i, j \in [n]$ with $i \neq j$ and a permutation $\sigma$ of

Proof of Lemma 3.2(a). In each step below $M$ will denote a positive real number and $\lambda$ will denote an arbitrary character of $G$ (which will vary from step to step).
Step 1: If M (λ) ≥ M , there is a permutation matrix g 1 such that |c 112 (g By definition of M (λ) we have |c iij (λ)| ≥ M or |c ijk (λ)| ≥ M for some distinct i, j, k.
In the next step we give different arguments depending on which case occurred in Step 1. Step We have By assumption $|c_{112}(\lambda) \mp 2c_{132}(\lambda)| \geq M$ for any choice of sign and $|c_{332}(\lambda) \pm 2c_{132}(\lambda)| \geq M$ for some choice of sign, so either $E_{31}^{2}$ or $E_{31}^{-2}$ can be used as $g_2$.
Step 2B: If $|c_{112}(\lambda)| \geq M$, there exists $g \in GL_n(\mathbb{Z})$ which is either the identity matrix or 3 , then $g_2 = 1$ obviously works, so assume that $|c_{332}(\lambda)| < M/3$. For $\varepsilon = \pm 1$ by direct computation we have To do this we apply Steps 2 and 3 with indices 3 and 4 replaced by 5 and 6. Since we will be acting by matrices $E_{ij}$ with $i, j \notin \{3, 4\}$, the value of $c_{334}$ will not change.
Step Putting all the steps together, we obtain the desired g ∈ GL n (Z) given by the product g 5 g 4 g 3 g 2 g 1 (recall that g i is the matrix we acted by in Step i).
Before turning to the proof of Lemma 3.2(b), we will first explain how the lifts from the conclusion of Lemma 3.2(b) will be constructed and derive some general bounds on the constants A and B for certain maps.
Constructing lifts. Since $g$ in the proof of Lemma 3.2(a) is explicitly constructed as a product, we can obtain a lift of $g$ by simply lifting each factor. The natural lift of a transposition matrix $F_{ij}$ is the automorphism $F_{ij} \in \mathrm{Aut}(F_n)$ which swaps $x_i$ and $x_j$ and fixes the other generators. A transvection matrix $E_{ij}$ can be lifted to either the right or the left Nielsen map $R_{ji}$ or $L_{ji}$ defined by Note that we can use different lifts for different occurrences of $E_{ij}$ but this does not seem to matter for the resulting bound. It is clear that $A(F_{ij}) = B(F_{ij}) = 1$, and an explicit computation in [DP] (see Table 1 . Note that $g$ in the proof of Lemma 3.2(a) is a product of at most 2 transpositions and at most 9 unit transvections. Combining these facts with the easy observation that for any $\varphi, \psi \in \mathrm{Aut}(F_n)$, we already deduce that $g$ has a lift $\varphi$ with $A(\varphi^{\pm 1}) \leq 6^9$ and $B(\varphi^{\pm 1}) \leq 4^9$.
To improve those bounds, we will prove the following general lemma:

Lemma 3.4. Let $G$, $K$ and $S$ be as in Lemma 2.3. The following hold:

Proof. (b) follows from (a) by straightforward induction, so we will only prove (a). Fix $s \in S^{\pm 1}$. By definition of $A(\psi)$ and Observation 2.9, there exists an $S$-word $s_1 \cdots s_r$ representing $\psi(s)$ such that for all $1 \leq j \leq r$, one can write $\theta(s_1 \cdots s_j)$ as a product of at most $A(\psi)$ elements $\theta(s)$, $s \in S^{\pm 1}$. Next for each $s \in S^{\pm 1}$ choose an $S$-word $w_s$ representing $\varphi(s)$ such that $\|\theta(u)\|_1 \leq A(\varphi)$ for every $S$-prefix $u$ of $w_s$. Consider the $S$-word $w = w_{s_1} \cdots w_{s_r}$. Then $w$ represents $\varphi\psi(s)$, and any $S$-prefix of $w$ is equal to $w_{s_1} \cdots w_{s_{j-1}}u$ for some $1 \leq j \leq r$ and $S$-prefix $u$ of $w_{s_j}$. By assumption, $\theta(w_{s_1} \cdots w_{s_{j-1}})$ is the product of at most $A(\psi)$ elements $\theta(w_s)$, $s \in S^{\pm 1}$. Since $\|\theta(w_s)\|_1 \leq B(\varphi)$ by definition of $B(\varphi)$ and $\|\theta(u)\|_1 \leq A(\varphi)$, we have $\|\theta(w_{s_1} \cdots w_{s_{j-1}}u)\|_1 \leq \|\theta(w_{s_1} \cdots w_{s_{j-1}})\|_1 + \|\theta(u)\|_1 \leq A(\psi)B(\varphi) + A(\varphi)$. Thus, $A(\varphi\psi) \leq A(\psi)B(\varphi) + A(\varphi)$, as desired.
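To see how the inequality $A(\varphi\psi) \leq A(\psi)B(\varphi) + A(\varphi)$ improves on a purely multiplicative estimate, one can unroll it along a product of $k$ factors. The sketch below is only an illustration under stated assumptions: every factor is assumed to satisfy $A \leq 6$, every prefix of the product is assumed to satisfy $B \leq 150$, and $k = 9$; the function name `unrolled_bound` is hypothetical, and the bookkeeping is ours rather than that of part (b) of the lemma.

```python
def unrolled_bound(a_bounds, b_prefix_bound):
    """Iterate A(prefix * factor) <= A(factor) * B(prefix) + A(prefix).

    a_bounds       -- upper bounds on A for the successive factors
    b_prefix_bound -- a uniform upper bound on B for every proper prefix
    """
    total = a_bounds[0]
    for a in a_bounds[1:]:
        total = a * b_prefix_bound + total
    return total


# Assumed illustrative inputs: 9 factors with A <= 6, prefixes with B <= 150.
k, a_factor, b_prefix = 9, 6, 150
unrolled = unrolled_bound([a_factor] * k, b_prefix)
crude = k * a_factor * b_prefix   # every term bounded by A * B
naive = a_factor ** k             # purely multiplicative estimate
```

Under these assumptions the crude additive estimate equals $9 \cdot 6 \cdot 150 = 8100$, which appears to be consistent with the constant quoted in this section and is far below the multiplicative estimate $6^9$.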
We can now finish the proof of Lemma 3.2.

Proof of Lemma 3.2(b). First note that for $\varphi \in \mathrm{Aut}(F_n)$, the constant $B(\varphi)$ depends only on the image of $\varphi$ in $GL_n(\mathbb{Z})$, so we can talk about $B(g)$ for $g \in GL_n(\mathbb{Z})$.
We will consider the case when $g$ constructed in the proof of part (a) is equal to It is not hard to check that this $g$ represents the worst-case scenario. It is also easy to see that $B(g) \geq B(h)$ for any prefix $h$ of $g$ and the same is true for $g^{-1}$.
To estimate $B(g^{-1})$ we first compute the action of $g$ on the basis elements of $V$ and $V^*$. We have where $k \leq 9$ and $A(\varphi_i) \leq 6$ for all $i$. We just argued that B j i=1

We are finally ready to prove Theorem 1.1. In fact, we will prove a slightly stronger statement:

Proof. We start by proving that $[IA_n, IA_n]$ is generated by all elements of the form (***), which is precisely the assertion of Theorem 1.1.
First we compute the radius $R$ in Theorem 2.13 applied to $G = IA_n$ and $K = [IA_n, IA_n]$. Corollary 3.3 shows that for a suitable $\Phi$ we have $B \leq 150$, $A \leq 8100$ and can take $C = 3$. Hence the preimage of the $l_\infty$-ball of radius $R = 16(BC + 1)^2(AB + 3A + 3) \leq 16 \cdot 451^2 \cdot (8100 \cdot 153 + 3) < 5 \cdot 10^{12}$ is connected. Now Theorem 1.1 follows directly from Theorem 2.16 (note that for $G = IA_n$ we have $S = S_1$ in the notations from that theorem).
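The numerical estimate for $R$ is easy to double-check with exact integer arithmetic. The following short sketch simply substitutes the extreme values $A = 8100$, $B = 150$ and $C = 3$ into the formula $16(BC+1)^2(AB+3A+3)$; the variable names are ours.

```python
# Extreme values of the constants allowed by Corollary 3.3.
A, B, C = 8100, 150, 3

# The radius formula from Theorem 2.13, evaluated exactly.
R_bound = 16 * (B * C + 1) ** 2 * (A * B + 3 * A + 3)

# The same quantity written as in the text: 16 * 451^2 * (8100 * 153 + 3).
R_as_in_text = 16 * 451 ** 2 * (8100 * 153 + 3)
```

Both expressions agree because $AB + 3A + 3 = A(B + 3) + 3$, and the resulting value is below $5 \cdot 10^{12}$.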
Let us now prove the full statement of Theorem 3.5. As before, we identify $IA_n^{ab}$ with $\mathbb{Z}^N$ via the map For a point $(a_1, \dots, a_N) \in \mathbb{Z}^N$ define its support as $\mathrm{supp}((a_1, \dots, a_N)) = \{i : a_i \neq 0\}$. Let $F$ be the set of all $(a_1, \dots, a_N) \in \mathbb{Z}^N$ with $|a_i| \leq R$ for each $i$ and $|\mathrm{supp}((a_1, \dots, a_N))| \leq 8n^2$. Clearly, $F$ is a Schreier set. Thus, Theorem 2.16 reduces Theorem 3.5 to showing that $\theta^{-1}(F)$ is connected (where $\theta : IA_n \to \mathbb{Z}^N$ is the natural projection). So take any $x, y \in IA_n$ with $\theta(x), \theta(y) \in F$. We already know that $x$ and $y$ can be connected by a path $p$ which lies in the $\theta$-preimage of the $l_\infty$-ball of radius $R$. If $\theta(g) \in F$ for every vertex $g$ of $p$, we are done; otherwise, consider all vertices $g$ on $p$ such that $|\mathrm{supp}(\theta(g))|$ is largest possible, and among these vertices (if there is more than one), choose one where $\|\theta(g)\|_1$ is maximal. In particular, by assumption $|\mathrm{supp}(\theta(g))| > 8n^2$ and thus $\theta(g) \neq \theta(x), \theta(y)$.
The vertices of $p$ which precede and succeed $g$ have the form $gy_1$ and $gy_2$ for some $y_1, y_2 \in S^{\pm 1}$. Write $\theta(g) = (a_1, \dots, a_N)$. Since $|\mathrm{supp}(\theta(g))| > 8n^2$, there exist more than $8n^2$ indices $m$ such that $a_m \neq 0$, and an easy calculation using Lemma 3.1 shows that there exists $1 \leq m \leq N$ such that $a_m \neq 0$, $s_m \neq y_1^{\pm 1}, y_2^{\pm 1}$ and $s_m$ commutes with both $y_1$ and $y_2$. Without loss of generality, we can assume that $a_m > 0$. Now modify $p$ replacing the segment $(gy_1, g, gy_2)$ of $p$ by Another easy calculation shows that for every vertex $v$ on this segment we have $|\mathrm{supp}(\theta(v))| \leq |\mathrm{supp}(\theta(g))|$ and $\|\theta(v)\|_1 < \|\theta(g)\|_1$. Thus, after applying such a modification finitely many times, we will obtain a path connecting $x$ to $y$ which lies in $\theta^{-1}(F)$, as desired.
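The set $F$ used in the argument above is described by two simple conditions, which can be phrased as a membership test. This is a minimal sketch with hypothetical names `support` and `in_F`; the sample vectors and the small values of $R$ and $n$ used afterwards are purely illustrative and far from the actual parameters.

```python
def support(a):
    """Set of indices at which the integer vector a is nonzero."""
    return {i for i, x in enumerate(a) if x != 0}


def in_F(a, R, n):
    """Test the two conditions: sup-norm at most R, support of size at most 8n^2."""
    sup_norm = max((abs(x) for x in a), default=0)
    return sup_norm <= R and len(support(a)) <= 8 * n * n


# Toy checks with R = 5 and n = 1 (so the support bound is 8).
small_support = in_F((1,) * 8 + (0,), 5, 1)
large_support = in_F((1,) * 9, 5, 1)
large_entry = in_F((6, 0), 5, 1)
```

Since $N = n^2(n-1)/2$, the support condition only becomes a genuine restriction once $N > 8n^2$, that is, for $n \geq 18$.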

Effective finite generation of $[\mathcal{I}_n^1, \mathcal{I}_n^1]$ and the Johnson kernel
4.1. Preliminaries. Throughout this section we fix an integer $n \geq 0$ and let $\Sigma = \Sigma_n^1$ be an orientable surface of genus $n$ with 1 boundary component. We start by introducing some curves and subsurfaces on $\Sigma$ that will be used throughout the proof.
First, it will be convenient to think of Σ as a (closed) disk with n handles attached; let us number the handles from 1 to n.We also fix a point p 0 on the boundary ∂Σ (it will serve as the base point for all the fundamental groups considered below).
For each $1 \leq i \leq n$, choose a point $p_i$ on the $i$th handle and curves $a_i$ and $b_i$ passing through $p_i$ as shown in Figure 3. Also choose (oriented) paths $\gamma_i$ from $p_0$ to $p_i$ as in Figure 3. In particular, we require the sets $\gamma_i \setminus \{p_0\}$ to be disjoint.
Next define the curves α i and β i by -note that these are closed curves based at p 0 (see Figure 3).For simplicity we will also use the notations α i and β i for the corresponding classes in the fundamental group π 1 (Σ, p 0 ).
Let $(\partial\Sigma)_{p_0}$ denote the boundary of $\Sigma$ considered as a closed path from $p_0$ to itself oriented clockwise. It is easy to check that $(\partial\Sigma)_{p_0}$ is homotopic to $\prod_{i=1}^{n}[\alpha_i, \beta_i]$.

The mapping class group $\mathrm{Mod}_n^1 = \mathrm{Mod}(\Sigma)$ is defined as the group of orientation-preserving homeomorphisms of $\Sigma$ which fix the boundary $\partial\Sigma$ pointwise, modulo the isotopies which fix $\partial\Sigma$ pointwise. The action of $\mathrm{Mod}(\Sigma)$ on $\Sigma$ induces an action on $\pi = \pi_1(\Sigma, p_0)$ and thus we obtain a homomorphism $\iota : \mathrm{Mod}(\Sigma) \to \mathrm{Aut}(\pi)$. The group $\pi$ is free of rank $2n$ with generators $\alpha_1, \beta_1, \dots, \alpha_n, \beta_n$, and since $(\partial\Sigma)_{p_0}$ is fixed under the action, the image of $\iota$ stabilizes $\prod_{i=1}^{n}[\alpha_i, \beta_i]$ by property (ii) above. In fact, a stronger statement holds:

Theorem 4.1. The map $\iota$ is injective and $\mathrm{Im}\,\iota$ is equal to the full stabilizer of $\prod_{i=1}^{n}[\alpha_i, \beta_i]$.

Theorem 4.1 is proved, e.g., in [ZiVC]: the above map $\iota$ is injective by [ZiVC, Theorem 5.13.2] and surjective (that is, $\mathrm{Im}\,\iota$ is the full stabilizer) by [ZiVC, Theorem 5.7.1]. The surjectivity part is originally due to Zieschang [Zi].
Remark. (a) The surjectivity part of Theorem 4.1 can be rephrased by saying that $\mathrm{Mod}(\Sigma)$ acts transitively on the set of (ordered) bases δ Recall that such bases were called natural in the introduction.
(b) Theorem 4.1 is a variation of the classical Dehn-Nielsen-Baer theorem which asserts that for a closed surface $\Sigma$ of genus $n$, the mapping class group $\mathrm{Mod}(\Sigma)$ is isomorphic to an index 2 subgroup of the outer automorphism group of a surface group on $2n$ generators.
Johnson filtration. For each $k \in \mathbb{N}$ define $\mathcal{I}_n^1(k)$ to be the kernel of the induced map $\mathrm{Mod}_n^1 \to \mathrm{Aut}(\pi/\gamma_{k+1}\pi)$. The filtration $\{\mathcal{I}_n^1(k)\}_{k=1}^{\infty}$ is called the Johnson filtration. The first term of this filtration $\mathcal{I}_n^1 = \mathcal{I}_n^1(1)$ is the Torelli subgroup of $\mathrm{Mod}_n^1$. It can also be defined as the set of elements of $\mathrm{Mod}_n^1$ acting trivially on $H_1(\Sigma_n^1)$. The second term of the Johnson filtration $\mathcal{K}_n^1 = \mathcal{I}_n^1(2)$ is known as the Johnson kernel. One can characterize $\mathcal{K}_n^1$ purely topologically as the subgroup generated by Dehn twists about separating curves. The equivalence of these two definitions of $\mathcal{K}_n^1$ is a deep theorem of Johnson [Jo3].

Recall that we consider $n$ as being fixed, and for the rest of the section we will use the simplified notations $\mathcal{M} = \mathrm{Mod}_n^1$, $\mathcal{I} = \mathcal{I}_n^1$ and $\mathcal{K} = \mathcal{K}_n^1$. Occasionally we will also use the notations $\mathcal{I}(\Omega)$ and $\mathcal{K}(\Omega)$ for the Torelli subgroup (resp. Johnson kernel) of the mapping class group of a surface $\Omega$.

4.2. Generators for the mapping class group. It is well known that the mapping class group $\mathrm{Mod}_n^1$ is generated by Dehn twists. The minimal number of Dehn twists needed to generate $\mathrm{Mod}_n^b$ for $b = 0, 1$ is $2n + 1$ as proved by Humphries [Hu] for $b = 0$ and by Johnson [Jo2] for $b = 1$ (the generating set in [Jo2] is a natural analogue of the one in [Hu]). More specifically, $\mathrm{Mod}_n^1$ is generated by the Dehn twists about the curves $c_1, c_2, \dots, c_{2n}, b$ defined in [Jo2, p. 428, Figure 5].
Usually, in the definition of the Dehn twist T γ one assumes that γ is an essential simple closed curve, but for the discussion below it will be convenient to introduce the following convention: If γ is a closed curve on Σ which is not simple, but freely homotopic to some essential simple closed curve γ ′ , we set T γ = T γ ′ .The right-hand side is well defined since two freely homotopic curves on a surface are isotopic, and the Dehn twist T γ ′ is determined by the isotopy class of γ ′ .
With this convention, we can relate the Humphries-Johnson generating set to the curves $\alpha_i, \beta_i$ introduced in § 4.1. It is not hard to see that $c_{2i}$ is freely homotopic to $\beta_i$ for $1 \leq i \leq n$, that $c_{2i-1}$ is freely homotopic to $\alpha_i\alpha_{i-1}^{-1}$ for $2 \leq i \leq n$ and that $c_1$ and $b$ are freely homotopic to $\alpha_1$ and $\alpha_2$, respectively. Thus, [Jo2, Theorem 3] can be restated as follows:

Theorem 4.2. The mapping class group $\mathrm{Mod}_n^1$ is generated by the following Dehn twists:

We will not explicitly refer to Theorem 4.2 in this paper, but we will use several results whose proof relies on Theorem 4.2.

4.3. Generators for the Torelli subgroup and subsurfaces $\Sigma_I$. It is a celebrated theorem of Johnson [Jo2] that the Torelli group $\mathcal{I} = \mathcal{I}_n^1$ is finitely generated for $n \geq 3$. Johnson's generating set from [Jo2] is explicit, but it lacks a key feature of Magnus' generating set for $IA_n$ (the fact that most generating pairs commute) that is essential for our purposes. A generating set for $\mathcal{I}$ which has the latter property was constructed by Church and Putman [CP] using an earlier work of Putman [Pu1].
Recall the curves $\alpha_i$ and $\beta_i$, $1 \leq i \leq n$, defined in § 4.1. For each $I \subseteq [n]$ choose a subsurface $\Sigma_I \subseteq \Sigma$ satisfying the following properties:
(i) The curves $\alpha_i$ and $\beta_i$ lie on $\Sigma_I$ for all $i \in I$.
(ii) $\Sigma_I$ has genus $|I|$ and 1 boundary component.
(iii) The boundary of $\Sigma_I$ is homotopic to $\prod_{i \in I}[\alpha_i, \beta_i]$ (where the product is taken in increasing order).
(iv) $\Sigma_I \cap \partial\Sigma$ is an interval which contains $p_0$ and does not depend on $I$.
For a longer but more transparent definition of $\Sigma_I$ see [CP, § 4] or [EH, § 7]. For an illustration see Figure 4. The subsurfaces $\Sigma_I$ are uniquely defined up to isotopy and satisfy the following properties (1)-(4). Properties (1)-(3) follow immediately from the definitions and (4) can be proved by a standard application of the change of coordinates principle [FM, 1.1.3].
Observation 4.3.The following hold: (1) Σ (3) If I and J are disjoint and uncrossed (as defined below), there exist subsurfaces there exists an orientation-preserving homeomorphism g of Σ acting trivially on ∂Σ such that g(Σ I ) = Σ J .
Definition 4.4.Let I and J be disjoint subsets of [n].We will say that I and J are crossed if there exist i 1 , i 2 ∈ I and j 1 , j 2 ∈ J such that i 1 < j 1 < i 2 < j 2 or j 1 < i 1 < j 2 < i 2 .
Otherwise I and J will be called uncrossed.Clearly, if I consists of consecutive integers, then I is uncrossed with any subset J disjoint from it.
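Definition 4.4 translates directly into a finite check. The sketch below is a straightforward brute-force implementation; the function name `are_crossed` is our own, and the sample sets are illustrative.

```python
def are_crossed(I, J):
    """Return True if the disjoint sets I and J are crossed.

    I and J are crossed if i1 < j1 < i2 < j2 or j1 < i1 < j2 < i2
    for some i1, i2 in I and j1, j2 in J.
    """
    I, J = set(I), set(J)
    if I & J:
        raise ValueError("I and J must be disjoint")
    # The second pass swaps the roles of I and J to cover both patterns.
    for first, second in ((I, J), (J, I)):
        for x1 in first:
            for y1 in second:
                for x2 in first:
                    for y2 in second:
                        if x1 < y1 < x2 < y2:
                            return True
    return False


interleaved = are_crossed({1, 3}, {2, 4})     # 1 < 2 < 3 < 4
separated = are_crossed({1, 2}, {3, 4})
nested = are_crossed({1, 4}, {2, 3})
consecutive = are_crossed({2, 3}, {1, 4})
```

In particular, a set of consecutive integers is never crossed with a disjoint set, because no element of the other set can lie strictly between two of its elements.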
Remark. The technical condition (iv) in the definition of $\Sigma_I$ (which was not imposed in [CP] or [EH]) is needed to ensure that property (4) in Observation 4.3 holds. Note that an easier way to achieve (4) would be to require that $\Sigma_I \cap \partial\Sigma = \emptyset$. However, the latter would prevent us from considering the fundamental groups of $\Sigma_I$ as subgroups of $\pi_1(\Sigma, p_0)$, something that is essential for our purposes.

A deep result of Church and Putman [CP, Proposition 4.5] asserts that if $n \geq 3$, then $\mathcal{I} = \langle \mathcal{I}_I : |I| = 3 \rangle$. This fact played a key role in the proof of finite generation of $\mathcal{K}_n^1$ in [EH], but in [EH] there was no need to work with a specific finite generating set inside $\bigcup_{|I|=3}\mathcal{I}_I$. In this paper we will need to choose such a generating set $S$ as follows: we will start with some finite generating set $S_{\{1,2,3\}}$ of $\mathcal{I}_{\{1,2,3\}}$ and then add to it the images of $S_{\{1,2,3\}}$ under carefully chosen isomorphisms between $\mathcal{I}_{\{1,2,3\}}$ and $\mathcal{I}_I$ for every 3-element subset $I$ of $[n]$. The details of this construction will be given later in this section.

4.4. Abelian quotients of the Torelli subgroup. Let $V = H_1(\Sigma)$. Then $V$ is a free abelian group of rank $2n$, and it is well known that the algebraic intersection form on $V$ is symplectic. Clearly $\mathcal{M}/\mathcal{I}$ acts on $V$ preserving this form, so there is a canonical group homomorphism $\mathcal{M}/\mathcal{I} \to Sp(V)$ where $Sp(V)$ is the group of automorphisms of $V$ preserving this form. It is also well known that this homomorphism is an isomorphism, which enables us to identify $\mathcal{M}/\mathcal{I}$ with $Sp(V)$.
From the definition of the Johnson filtration it is easy to see that the quotient $\mathcal{I}/\mathcal{K} = \mathcal{I}_n^1(1)/\mathcal{I}_n^1(2)$ is abelian and torsion-free. A complete description of the abelianization of $\mathcal{I}$ for $n \geq 3$ was obtained in a series of Johnson's papers [Jo1, Jo3, Jo4]. Below we collect some specific results about abelian quotients of $\mathcal{I}$ that will be needed in this paper:

Theorem 4.6. Assume that $n \geq 3$. The following hold:
(a) $\mathcal{I}/\mathcal{K}$ is the largest torsion-free abelian quotient of $\mathcal{I}$.
(b) There is a canonical isomorphism of $Sp(V)$-modules 1 + 2n 0 .

Remark. We briefly comment on how (a), (b) and (c) follow from the results of [Jo1, Jo3, Jo4]. In [Jo1], Johnson constructed
• an epimorphism $\tau : \mathcal{I} \to \wedge^3 V$ such that $\mathcal{K} \subseteq \mathrm{Ker}\,\tau$ and the induced map $\mathcal{I}/\mathcal{K} \to \wedge^3 V$ is a homomorphism of $Sp(V)$-modules and
• a group epimorphism $\sigma : \mathcal{I} \to B_3$ where $B_3$ is an elementary abelian 2-group of rank (there is also a natural $Sp(V)$-module structure on $B_3$, but it is not essential for our purposes).
In [Jo3] it was proved that $\mathcal{K} = \mathrm{Ker}\,\tau$, which yields (b). One of the main results of [Jo4] is that $[\mathcal{I}, \mathcal{I}] = \mathcal{K} \cap \mathrm{Ker}\,\sigma$. This implies that $\mathcal{I}/[\mathcal{I}, \mathcal{I}]$ embeds into $\mathcal{I}/\mathcal{K} \oplus B_3$, which yields (a). The other main result of [Jo4] is that $B_3$ is the largest exponent 2 quotient of $\mathcal{I}$. Combined with (a) and (b) this implies (c).
We proceed with the description of $\mathcal{I}/\mathcal{K}$. Recall the curves $a_i$ and $b_i$ on $\Sigma$ introduced in § 4.1. By slight abuse of notation below we will use the notations $a_i$ and $b_i$ for the corresponding homology classes in Recall that we have an isomorphism of $Sp(V)$-modules $\mathcal{I}/\mathcal{K} \to \wedge^3 V$. For every $I \subseteq [n]$ let $\varphi_I : \mathcal{I}_I \to \wedge^3 V$ be the map obtained by precomposing the isomorphism $\mathcal{I}/\mathcal{K} \to \wedge^3 V$ with the natural projection $\mathcal{I} \to \mathcal{I}/\mathcal{K}$ and the inclusion $\mathcal{I}_I \to \mathcal{I}$.
The following result is an immediate consequence of Johnson's paper [Jo4]: Lemma 4.7.For every I ⊆ [n] we have ϕ I (I I ) = V I .
We can now construct a generating set for $\mathcal{I}$ with certain nice properties. First order the chosen basis of $V$ as follows: $a_1 < b_1 < a_2 < \dots < b_n$, and let $B$ be the set of all wedges $x \wedge y \wedge z$ with $x, y, z \in \{a_i, b_i\}$ and $x < y < z$. Clearly $B$ is a basis for $\wedge^3 V$.

Proof. By [Jo2], the group $\mathcal{I}_I \cong \mathcal{I}_3^1$ has a generating set $S_I^0$ with 42 elements. By Lemma 4.7, the image of $\mathcal{I}_I$ in $\wedge^3 V$ is equal to $V_I$, which is a free abelian group of rank 20 with basis $B \cap V_I$. As in the proof of Lemma 2.3, applying suitable Nielsen moves to $S_I^0$ (replacing $x$ by $xy^{\pm 1}$ where $x$ and $y$ are distinct generators), we obtain another generating set $S_I$ of $\mathcal{I}_I$ with 42 elements whose image in $\wedge^3 V$ is equal to $(B \cap V_I) \sqcup \{0\}$.
The set $S = \bigcup_{|I|=3} S_I$ generates $\mathcal{I}$ by [CP, Proposition 4.5]. Also the image of $S$ in $\wedge^3 V$ is equal to $\bigcup_{|I|=3}(B \cap V_I) \sqcup \{0\} = B \sqcup \{0\}$, so $S$ indeed satisfies the conclusion of Lemma 2.3 (for the desired $G$, $K$ and $E$).
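The counting facts used in this proof (rank 20 for $|I| = 3$, and the union of the sets $B \cap V_I$ being all of $B$) can be verified by enumeration. The sketch below rests on an assumption that should be read loudly: we take $B \cap V_I$ to consist of those wedges in $B$ all of whose factors are $a_i$ or $b_i$ with $i \in I$. The helper names and the encoding of $a_i$, $b_i$ as pairs are ours.

```python
from itertools import combinations
from math import comb


def homology_basis(n):
    """Ordered basis a_1 < b_1 < ... < a_n < b_n, encoded as (handle, letter)."""
    return [(i, x) for i in range(1, n + 1) for x in ("a", "b")]


def wedge_basis(n):
    """All wedges x ^ y ^ z with x < y < z, as ordered triples."""
    return list(combinations(homology_basis(n), 3))


def wedges_in_V_I(n, I):
    """Wedges whose three factors all come from handles in I (assumed V_I)."""
    return [w for w in wedge_basis(n) if all(i in I for (i, _) in w)]


n = 4
B = set(wedge_basis(n))
three_subsets = list(combinations(range(1, n + 1), 3))
ranks = [len(wedges_in_V_I(n, set(I))) for I in three_subsets]
union = set()
for I in three_subsets:
    union.update(wedges_in_V_I(n, set(I)))
```

Each wedge involves at most three handles, which is why the 3-element subsets already cover $B$ when $n \geq 3$. With 42 generators per subset, the resulting generating set has at most $42\binom{n}{3}$ elements.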
Our effective generation procedure described in Section 2 can be applied to any generating set $S$ constructed in Lemma 4.8. However, some additional compatibility assumptions on $\{S_I\}$ will be needed in order to explicitly estimate the constants $A$ and $B$ from Theorem 2.13. We postpone this discussion until the end of this section.

4.5. Some elements of $\mathrm{Mod}(\Sigma)$. In this subsection we will prove the existence of certain elements of $\mathrm{Mod}(\Sigma)$ with a prescribed induced action on $\pi_1(\Sigma, p_0)$. These results will play a key role in estimating the constant $A$ later in this section.
Moreover, suppose that f ′ ∈ Mod(Σ) is another element satisfying (a) and (b).Then f ′ coincides with f on Σ [k] , and therefore, Proof.We first construct an element f satisfying (a) and (b).By Observation 4.3(4), there exists g ∈ Mod(Σ) such that some representative of g maps Σ [k] to Σ I .
Since $\pi_1(\Sigma_I, p_0)$ is free of rank $2k$ and the automorphism group of a free group $F$ acts transitively on the set of bases of $F$, there exists $\varphi \in \mathrm{Aut}(\pi_1(\Sigma_I, p_0))$ such that $\varphi(\delta_j) = g_*(\alpha_j)$ and $\varphi(\delta'_j) = g_*(\beta_j)$ for all $j \in [k]$. Then (where the last equality holds by (4.1)), so by Theorem 4.1 applied to the surface $\Sigma_I$, there exists $h \in \mathrm{Mod}(\Sigma_I)$ such that $h_* = \varphi$. If we extend $h$ to $\mathrm{Mod}(\Sigma)$ by letting it act trivially on $\Sigma \setminus \Sigma_I$, then clearly $f = h^{-1}g$ satisfies both (a) and (b).
Let us now prove the 'moreover' part. Suppose that $f' \in \mathrm{Mod}(\Sigma)$ satisfies both (a) and (b), and let $h = (f')^{-1}f$. By (a) some representative of $h$ stabilizes $\Sigma_{[k]}$. By (b) $h_*$ acts trivially on $\pi_1(\Sigma_{[k]})$, and hence by the injectivity part of Theorem 4.1 $h$ acts trivially on $\Sigma_{[k]}$, so $hs = sh$, which yields the last assertion.
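The following corollary describes a key special case of Lemma 4.9.

Proof. The first assertion is a special case of Lemma 4.9, so we only need to prove the 'moreover' part. So take any $f_I \in \mathrm{Mod}(\Sigma)$ satisfying (a) and (b). Then (4.2) Thus, if $\widetilde{f}_I$ is any orientation-preserving homeomorphism of $\Sigma$ which fixes $\partial\Sigma$ pointwise and represents $f_I$, then $\widetilde{f}_I(\partial\Sigma_J)$ is homotopic (and hence isotopic) to $\partial\Sigma_{I_J}$. By [FM, Prop. 1.11], an isotopy between $\widetilde{f}_I(\partial\Sigma_J)$ and $\partial\Sigma_{I_J}$ can be extended to an isotopy of $\Sigma$ (acting trivially on $\partial\Sigma$). Hence after perturbing $\widetilde{f}_I$ by an isotopy of $\Sigma$, we can assume that $\widetilde{f}_I(\partial\Sigma_J) = \partial\Sigma_{I_J}$ and hence $\widetilde{f}_I(\Sigma \setminus \partial\Sigma_J) = \Sigma \setminus \partial\Sigma_{I_J}$.

The topological spaces $\Sigma \setminus \partial\Sigma_J$ and $\Sigma \setminus \partial\Sigma_{I_J}$ both have two connected components: $\Sigma_J \setminus \partial\Sigma_J$ and $\Sigma \setminus \Sigma_J$ (resp. $\Sigma_{I_J} \setminus \partial\Sigma_{I_J}$ and $\Sigma \setminus \Sigma_{I_J}$). Since $\widetilde{f}_I$ acts trivially on $\partial\Sigma$ and since the intersection $(\Sigma \setminus \Sigma_J) \cap (\Sigma \setminus \Sigma_{I_J}) \cap \partial\Sigma$ is non-empty by construction, $\widetilde{f}_I$ must map $\Sigma \setminus \Sigma_J$ to $\Sigma \setminus \Sigma_{I_J}$. Thus $\widetilde{f}_I(\Sigma_J \setminus \partial\Sigma_J) = \Sigma_{I_J} \setminus \partial\Sigma_{I_J}$ and hence $\widetilde{f}_I(\Sigma_J) = \Sigma_{I_J}$.

4.6. An analogue of Lemma 3.2. In this subsection we will establish an analogue of Lemma 3.2 for mapping class groups (see Lemma 4.11 below). The proof of the second part of Lemma 4.11 will be postponed till the next subsection, as estimation of the constants $A$ and $B$ in the mapping class group case requires more work.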
Since I/K is the maximal torsion-free abelian quotient of I, we have Hom(I, R) = Hom(I/K, R).By the same logic as in Section 3, there is a natural embedding of M = Mod(Σ) into Aut(I) and a natural action of Sp(V ) = M/I on Hom(I/K, R).
The symplectic group $Sp(V) \cong Sp_{2n}(\mathbb{Z})$ is generated by the elements $w_i, \tau_i$ for $i \in [n]$ and $\tau_{ij}$ for $i \neq j \in [n]$ defined as follows (all basis elements whose image is not specified are fixed): In addition, for each $1 \leq i \neq j \leq n$ let $f_{ij} \in Sp(V)$ be the element which swaps $a_i$ and $a_j$ and swaps $b_i$ and $b_j$. We will refer to the elements $f_{ij}$ as transpositions.

Proof. Unlike the case of $\mathrm{Aut}(F_n)$, the set $Z$ will depend on $\chi$, but there will be only boundedly many possibilities, and in each case $Z$ will contain 4 elements, exactly one element from each of the sets $S \cap G_I$ for $I = \{1, 2\}, \{3, 4\}, \{5, 6\}$ and $\{7, 8\}$. For each such $Z$ the pair $(S, Z)$ is chain-centralizing; this follows immediately from Observation 4.5(iii).
The proof of Lemma 4.11(a) is very similar to that of Lemma 3.2(a), so we will just list the main steps and skip the details of the computations.As in the proof of Lemma 3.2, in each step M denotes a positive real number and λ is an arbitrary character of G. Step Below we will consider the case where z in Step 1 lies in V {1,2} ; the other case is similar.Without loss of generality we can assume that z = a 1 ∧ b 1 ∧ b 2 .
(b) For an essential simple closed curve α, the action of T α on H 1 (Σ) is given by the following formula (see, e.g., [FM,Prop. 6.3] A direct computation using this formula shows that (i) The Dehn twist Below we exhibit the calculation for (iii) (assuming the result for (ii)).Recall that a In the case of transpositions f ij it will be more convenient to define lifts directly instead of expressing them as products of Dehn twists.Given distinct i < j ∈ [n], let F ij be the unique element of Mod(Σ) which is supported on Σ {i,j} and acts on π 1 (Σ {i,j} ) as follows: .
Such an element $F_{ij}$ exists (and is unique) by Lemma 4.9. It is clear that $F_{ij}$ is a lift of $f_{ij}$. Now any $g$ in part (a) has a lift $\varphi$ which can be written as a product of at most 3 transposition-lifts $F_{ij}$ and at most 135 Dehn twists $T_{\alpha_i}$, $T_{\beta_i}$ or $T_{\alpha_i\alpha_j^{-1}}$ or their inverses. By (3.4), in order to get an absolute bound for $A(\varphi^{\pm 1}, S)$, it suffices to prove the following proposition.
Proposition 4.12. For a suitable choice of $S$ there exists an absolute constant

Remark. The assertion of Proposition 4.12 does not appear to be obvious even if we restrict ourselves to, say, $g = T_{\alpha_i}$ with $i$ fixed but $n$ tending to infinity.
Proposition 4.12 will be proved in the next subsection.
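The homology action of Dehn twists used in this subsection can be experimented with numerically. The following is a minimal Python sketch, assuming the standard transvection form of the formula in [FM, Prop. 6.3], namely $T_c(x) = x + \langle c, x\rangle c$, and assuming the normalization $\langle a_i, b_i\rangle = 1$ for the symplectic basis; the sign convention and the function names `omega` and `twist` are assumptions made for illustration and are not taken from the paper. It also checks the homological shadow of the conjugation formula $f T_c f^{-1} = T_{f(c)}$.

```python
def omega(x, y):
    """Intersection pairing in the basis a_1, b_1, ..., a_n, b_n (assumed <a_i, b_i> = 1)."""
    return sum(x[2 * i] * y[2 * i + 1] - x[2 * i + 1] * y[2 * i]
               for i in range(len(x) // 2))


def twist(c, x, k=1):
    """Image of x under the k-th power of the transvection along c: x + k*<c, x>*c."""
    s = k * omega(c, x)
    return [xi + s * ci for xi, ci in zip(x, c)]


# Genus 2 example: coordinates are (a_1, b_1, a_2, b_2).
a1, b1, a2, b2 = [1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]
basis = [a1, b1, a2, b2]

# Homology class of the curve alpha_1 * alpha_2^{-1} is a_1 - a_2.
c = [1, 0, -1, 0]

# The transvection preserves the pairing on all pairs of basis vectors.
preserves_form = all(
    omega(twist(c, x), twist(c, y)) == omega(x, y)
    for x in basis for y in basis
)

# Conjugating the transvection along c by the transvection along m
# gives the transvection along the image of c.
m = b1
conjugation_ok = all(
    twist(m, twist(c, twist(m, x, -1))) == twist(twist(m, c), x)
    for x in basis
)
```

Under these assumptions the twist along $a_1 - a_2$ fixes $a_1$ and $a_2$ and changes $b_1$ and $b_2$ by multiples of $a_1 - a_2$, which is the kind of "direct computation" referred to above.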

4.7. Estimating the A constants. Recall that by Corollary 4.10, for every non-empty subset $I = \{i_1 < i_2 < \dots < i_k\}$ of $[n]$ there exists an element $f_I \in \mathrm{Mod}(\Sigma)$ which maps $\Sigma_{[k]}$ to $\Sigma_I$ and satisfies $f_I^*(\alpha_j) = \alpha_{i_j}$ and $f_I^*(\beta_j) = \beta_{i_j}$ for all $1 \le j \le k$. From now on we will fix such an element $f_I$ for every $I$.
Let us record one more simple observation, which is an immediate consequence of the moreover parts of Lemma 4.9 and Corollary 4.10.
Observation 4.13. Let $I = \{i_1 < i_2 < \dots < i_k\}$ be a subset of $[n]$, let $J$ be a subset of $[k]$, and let $I_J = \{i_j : j \in J\}$. Then the elements $f_{I_J}$ and $f_I f_J$ coincide on $\Sigma_{[|J|]}$ and hence

Proof. Let $t = |J|$, and write $J = \{j_1 < \dots < j_t\}$. Recall that some representative of $f_J$ maps $\Sigma_{[k]}$ to $\Sigma_J$, and by Corollary 4.10 some representative of $f_I$ maps $\Sigma_J$ to $\Sigma_{I_J}$. Thus, if $f = f_I f_J$, the following hold: (i) Some representative of $f$ maps $\Sigma$ then both (i) and (ii) also hold by construction. Hence, the assertion of Observation 4.13 follows from the moreover part of Lemma 4.9.
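At the level of indices, Observation 4.13 rests on the fact that the order-preserving relabelling $[|J|] \to J \to I_J$ agrees with the direct relabelling $[|J|] \to I_J$. The Python sketch below checks only this index bookkeeping (the combinatorial part of condition (b) in Corollary 4.10), not the statement about mapping classes; the function names are illustrative and do not come from the paper.

```python
def sub_index_set(I, J):
    """Return I_J = {i_j : j in J}, where i_1 < i_2 < ... lists I in increasing order."""
    ordered = sorted(I)
    return {ordered[j - 1] for j in J}  # positions are 1-based, as in the text


def label_map(I):
    """Order-preserving bijection [k] -> I, j |-> i_j, stored as a dict."""
    return {j: i for j, i in enumerate(sorted(I), start=1)}


def compose(outer, inner):
    """Composition m |-> outer(inner(m))."""
    return {m: outer[inner[m]] for m in inner}


I = {2, 5, 7, 9}          # a subset of [n], so k = 4
J = {1, 3}                # a subset of [k]
I_J = sub_index_set(I, J)

# Index shadow of f_I f_J versus f_{I_J} on the first |J| handles.
via_composition = compose(label_map(I), label_map(J))
direct = label_map(I_J)
```

Here `via_composition` and `direct` are the same dictionary, mirroring the way $f_I f_J$ and $f_{I_J}$ send $\alpha_m, \beta_m$ to the same curves for $1 \le m \le |J|$.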
We are now ready to prove Proposition 4.12.

Proof of Proposition 4.12. Define a generating set $S$ for $\mathcal I$ as follows: for some $u \neq v$. Since the Dehn twist does not depend on the orientation of the curve, without loss of generality we can assume that $u < v$. Below we will consider the subcase $g$ be the elements of $I$ listed in increasing order. Then, in the notations of Observation 4.13 we have $L$ By the conjugation formula for Dehn twists (see, e.g., [FM, Fact 3.7]) we have We claim that each factor in the latter product lies in $S$. This would imply that $\|gxg^{-1}\|_S \le m \le C$ and thus finish the proof.
Indeed, for each $k$ as above we have Hence, using Observation 4.13 again we have

Case 2: $g = T_{\alpha_u}^{\pm 1}$ or $T_{\beta_u}^{\pm 1}$ for some $u$. This case is similar to (and easier than) Case 1.

Case 3: $g = F_{uv}^{\pm 1}$ for some $u < v$. The argument in this case is mostly similar to Case 1, so we will just outline the differences. First, as in Case 1, without loss of generality we can assume that $g = F_{uv}$.
Fix $L \subseteq [n]$, and let , and we just need to find an appropriate expression for the conjugate $f_I^{-1} F_{uv} f_I$. Let $i_1 < i_2 < \dots < i_{|I|}$ be the elements of $I$ listed in increasing order, and let $r, s \in [|I|]$ be such that $u = i_r$ and $v = i_s$. By definition of $f_I$, some representative of $f_I$ sends $\Sigma_{\{r,s\}}$ to $\Sigma_{\{u,v\}}$. Since $F_{uv}$ is trivial on the complement of $\Sigma_{\{u,v\}}$, we conclude that $f_I^{-1} F_{uv} f_I$ is trivial on the complement of $\Sigma_{\{r,s\}}$. The action on $\Sigma_{\{r,s\}}$ is determined by direct computation below. We have

Proof of Theorem 1.2. Let us first prove Theorem 1.2(1). Let $S$ be a generating set for $\mathcal I$ constructed in Proposition 4.12. This set need not satisfy the conclusion of Theorem 1.2(1). First we decompose $S = S_1 \sqcup S_2 \sqcup S_3$ as in the setup introduced before Theorem 2.16. Thus, $S_3 = S \cap \mathcal K$, $S_1 = \{s_1, \dots, s_N\}$ is a subset of $S \setminus S_3$ such that the natural projection $\theta : \mathcal I \to \mathcal I/\mathcal K$ maps $S_1$ bijectively onto $B$ (and $S_2 = S \setminus (S_1 \sqcup S_3)$). Also recall that for $s \in S_2$ we denote by $d(s)$ the unique integer such that $\theta(s) = \theta(s_{d(s)})$.
Define $S^{(1)} = S_1$ and $S^{(4)} = S_3 \cup \{s^{-1}s_{d(s)} : s \in S_2\}$. Then $\langle S^{(1)} \cup S^{(4)} \rangle = \langle S_1 \sqcup S_2 \sqcup S_3 \rangle = \mathcal I$. Also note that $S^{(4)}$ lies in $\mathcal K$ and the image of $S^{(4)}$ in $\mathcal I^{ab}$ generates $\mathcal K/[\mathcal I, \mathcal I]$ (the latter holds since the image of $S^{(1)} \cup S^{(4)}$ generates $\mathcal I^{ab}$ and the images of elements of $S^{(1)}$ are linearly independent modulo $\mathcal K$). Since $\mathcal K/[\mathcal I, \mathcal I]$ is a vector space over $\mathbb F_2$, we can choose a subset $S^{(2)}$ of $S^{(4)}$ whose image in $\mathcal K/[\mathcal I, \mathcal I]$ is a basis of $\mathcal K/[\mathcal I, \mathcal I]$. Finally, we can multiply each element of $S^{(4)} \setminus S^{(2)}$ by a product of elements of $S^{(2)}$ (on either side) so that the obtained element lies in $[\mathcal I, \mathcal I]$, and let $S^{(3)}$ be the set of all such elements. Since $S^{(1)} \sqcup S^{(2)} \sqcup S^{(3)}$ is obtained from $S$ by Nielsen transformations, it clearly generates $\mathcal I$, and the additional properties asserted in Theorem 1.2(1) hold by construction.

Now let $\theta : \mathcal I \to \mathcal I/\mathcal K$ be the natural projection. By Theorem 2.13 and the remark after Lemma 4.11, the set $\theta^{-1}(B_\infty(R))$ is connected where $R$ is bounded by an absolute constant. By Theorem 2.16, $\mathcal K$ is generated by elements of the form This generating set is almost the same as the set in Theorem 1.2(2); the only difference is that the condition $x \in S^{(4)}$ above is replaced by $x \in S^{(2)} \sqcup S^{(3)}$. But by construction the subgroup generated by $S^{(4)}$ is equal to the subgroup generated by $S^{(2)}$ and $S^{(3)}$, so the set described in Theorem 1.2(2) also generates $G$. This proves Theorem 1.2(2).
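The selection of $S^{(2)}$ and the correction of the remaining elements of $S^{(4)}$ is linear algebra over $\mathbb F_2$. The following Python sketch illustrates it under the assumption that the image of each element of $S^{(4)}$ in $\mathcal K/[\mathcal I, \mathcal I]$ is already known and encoded as a bitmask (a hypothetical input format; the paper does not provide such data). It returns the indices of a basis and, for every other element, the basis elements whose product would move it into $[\mathcal I, \mathcal I]$.

```python
def split_basis_f2(vectors):
    """Gaussian elimination over F_2 on bitmask-encoded vectors.

    Returns (basis_idx, combos): basis_idx lists indices whose vectors form a
    basis of the span; combos[k], for every other index k, is the set of basis
    indices whose vectors sum to vectors[k].
    """
    pivots = {}      # leading bit -> (reduced vector, basis indices summing to it)
    basis_idx = []
    combos = {}
    for k, v in enumerate(vectors):
        r, used = v, set()
        while r:
            p = r.bit_length() - 1
            if p not in pivots:
                break
            reduced, idxs = pivots[p]
            r ^= reduced
            used ^= idxs          # symmetric difference tracks sums mod 2
        if r:
            # vectors[k] is independent of the earlier ones: new basis element
            pivots[r.bit_length() - 1] = (r, used ^ {k})
            basis_idx.append(k)
        else:
            # vectors[k] equals the sum of the vectors indexed by `used`
            combos[k] = used
    return basis_idx, combos


# Toy images in a 3-dimensional F_2-vector space (illustrative data only).
images = [0b011, 0b101, 0b110, 0b000, 0b111]
basis_idx, combos = split_basis_f2(images)
```

In this toy example the elements with indices in `basis_idx` play the role of $S^{(2)}$, and each remaining element multiplied by the product of the basis elements listed in `combos` would have trivial image, playing the role of an element of $S^{(3)}$. Since the quotient has exponent 2, no inverses are needed.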
Finally, Theorem 1.2(3) follows from Theorem 1.2(2) by a standard application of the Reidemeister-Schreier process and straightforward computations involving basic commutator identities.
Estimating the constant R. Let us now briefly address the problem of explicitly estimating the constant $R$ in Theorem 1.2. Let $S^0_{[3]}$ be the generating set of $\mathcal I^1_{[3]}$ with 42 elements constructed in Johnson's paper [Jo2]. We can algorithmically construct another generating set $S$ $S_I$. Now the key step is estimating the constant $C_1$ from the proof of Proposition 4.12. To do this, we need to explicitly express each conjugate $yty^{-1}$, where $t \in S_{[5]}$ and $y = T^{\pm 1}_{\alpha_i\alpha_j^{-1}}$, $T^{\pm 1}_{\alpha_i}$ or $T^{\pm 1}_{\beta_i}$, $1 \le i \neq j \le 5$, as a product of elements of $S_{[5]}$. An algorithm for obtaining such expressions is given in the Ph.D. thesis of Stylianakis [Sty]; however, some of the arguments in [Sty] rely on certain computations in [Jo2].
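The finite counts that control $C_1$ can be tabulated directly. The short Python sketch below reproduces the counts quoted in the proof of Proposition 4.12 (at most $\binom{5}{2} = 10$ conjugating elements, at most $420$ choices of $t$, at most $4200$ choices of $y$) and the cubic bound $42\binom{n}{3}$ on the size of $S$. The variable and function names are illustrative, and all values are upper bounds, since the sets $S_I$ for different $I$ are not assumed to be disjoint.

```python
from math import comb

JOHNSON_GENUS_3 = 42  # size of the genus 3 generating set, as in Lemma 4.8


def size_bound(n):
    """Upper bound 42 * C(n, 3) on |S|, where S is the union of S_I over |I| = 3."""
    return JOHNSON_GENUS_3 * comb(n, 3)


# Counts inside Mod_[5], as quoted in the proof of Proposition 4.12.
conjugators = comb(5, 2)             # choices of the pair {r, s} in [5]
choices_for_t = size_bound(5)        # C(5, 3) * 42
choices_for_y = conjugators * choices_for_t
```

Each of the at most `choices_for_y` elements has to be written as a word in $S_{[5]}$, and $C_1$ is the maximal length of such a word; this is why an explicit value of $C_1$ requires a finite but sizeable computation.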
Finally, it is easy to express the constant $R$ in terms of $C_1$ following the steps of the proof of Theorem 1.2(2).

4.9. Proof of Theorem 1.3. Throughout this subsection $S$ will denote the generating set for $\mathcal I$ constructed in Proposition 4.12 (not the modified set from the proof of Theorem 1.2). Let us recall some notations from the introduction. We let $\omega = \{\alpha_i, \beta_i\}_{i=1}^n$; also for $1 \le l \le n$ we set $\omega_l = \{\alpha_i, \beta_i\}_{i=1}^l$. Given $m \in \mathbb N$ we denote by $T_{sc}(m)$ the set of all Dehn twists $T_\gamma$ where $\gamma \in \pi_1(\Sigma, p_0)$ can be represented by a separating curve and $\|\gamma\|_\omega \le m$.
Theorem 1.3 will be deduced from Theorem 1.2 (or rather its proof) and the following technical lemma.
Lemma 4.14. The following hold:
(a) Let $w$ be any word in the free group on 2 generators. There exists a constant $C_w$ depending only on $w$ with the following property: for every $s, t \in S$ such that $w(s, t) \in \mathcal K$, the element $w(s, t)$ lies in $\langle T_{sc}(C_w) \rangle$, the subgroup generated by $T_{sc}(C_w)$.
(b) There exists an absolute constant $C_2$ such that $\|s^*(\alpha)\|_\omega \le C_2 \|\alpha\|_\omega$ for all $\alpha \in \pi_1(\Sigma, p_0)$ and $s \in S^{\pm 1}$.
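For a single automorphism of a free group, a bound of the type in part (b) is elementary: the length of the image of a word is at most the length of the word times the longest image of a generator. The content of part (b) is that one constant works for all $s \in S^{\pm 1}$, independently of $n$. The Python sketch below illustrates only the elementary single-automorphism bound; the encoding of generators as nonzero integers and the sample map are assumptions chosen for illustration and do not describe an element of $S$.

```python
def reduce_word(word):
    """Freely reduce a word; generator g is encoded as +g and its inverse as -g."""
    out = []
    for g in word:
        if out and out[-1] == -g:
            out.pop()
        else:
            out.append(g)
    return out


def apply_endo(images, word):
    """Apply the endomorphism sending generator g to images[g], then reduce."""
    result = []
    for g in word:
        img = images[abs(g)]
        result.extend(img if g > 0 else [-h for h in reversed(img)])
    return reduce_word(result)


# Hypothetical example on a free group of rank 2: x1 -> x1 x2, x2 -> x2.
images = {1: [1, 2], 2: [2]}
stretch = max(len(img) for img in images.values())

word = [1, -2, 1]                 # the word x1 x2^{-1} x1
image = apply_endo(images, word)  # reduced image of the word
```

In this example `len(image) <= stretch * len(word)`, the single-automorphism analogue of $\|s^*(\alpha)\|_\omega \le C_2 \|\alpha\|_\omega$.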
Before proving Lemma 4.14, we establish a simple auxiliary result.

Proof. Let $\pi = \pi_1(\Sigma, p_0)$ and $\pi_I = \pi_1(\Sigma_I, p_0)$. Using Theorem 4.1, we can identify $\mathrm{Mod}(\Sigma)$ and $\mathrm{Mod}_I = \mathrm{Mod}(\Sigma_I)$ with subgroups of $\mathrm{Aut}(\pi)$ and $\mathrm{Aut}(\pi_I)$, respectively. Under this identification, $\mathcal K(\Sigma_I)$ consists of all elements of $\mathrm{Aut}(\pi_I)$ which act trivially modulo $\gamma_3\pi_I$ and lie in $\mathrm{Mod}_I$, while $\mathcal K \cap \mathrm{Mod}_I$ consists of all elements of $\mathrm{Aut}(\pi_I)$ which act trivially modulo $\gamma_3\pi \cap \pi_I$ and lie in $\mathrm{Mod}_I$. Thus, proving the equality $\mathcal K(\Sigma_I) = \mathcal K \cap \mathrm{Mod}_I$ reduces to showing that $\gamma_3\pi_I = \gamma_3\pi \cap \pi_I$. The latter holds since $\pi$ is free and $\pi_I$ is a free factor of $\pi$ and hence a retract of $\pi$.
Remark. Claim 4.15 is a special case of [Ch, Theorem 4.6]; however, we decided to give a direct proof since we are dealing with a much more specific situation compared to [Ch].
Also note that $s'$ and $t'$ are obtained from $s$ and $t$ by conjugation by the same element of $\mathrm{Mod}(\Sigma)$. Since $w(s, t) \in \mathcal K$ and $\mathcal K$ is normal in $\mathrm{Mod}(\Sigma)$, it follows that $w(s', t') \in \mathcal K \cap \mathrm{Mod}_{[l]} = \mathcal K(\Sigma_{[l]})$ where the last equality holds by Claim 4.15.
Thus we can write $w(s', t') = \prod_{i=1}^{k} T_{\gamma_i}^{\pm 1}$ where each $\gamma_i \in \pi_1(\Sigma_{[l]}, p_0)$ is separating with

Finally, by definition of $f_L$ we have $f_L^*(\omega_l) \subseteq \omega$ and therefore for any $\gamma_i$ in (4.7) we have $\|f_L^*(\gamma_i)\|_\omega \le \|\gamma_i\|_{\omega_l} \le C_w$, and hence $w(s, t)$ lies in $\langle T_{sc}(C_w) \rangle$ by (4.7).
Discussion of the problem. Given non-negative integers $n$ and $b$, let $\Sigma_n^b$ be an orientable surface of genus $n$ with $b$ boundary components, and let $\mathrm{Mod}_n^b = \mathrm{Mod}(\Sigma_n^b)$ be its mapping class group. The corresponding Torelli group $\mathcal I_n^b$ is the subgroup of $\mathrm{Mod}_n^b$ b n - the commutator subgroup $[\mathcal I_n^b, \mathcal I_n^b]$ and the Johnson kernel $\mathcal K_n^b$. One can define the Johnson kernel algebraically, as the second term of the Johnson filtration of $\mathcal I_n^b$ (see § 4.1 for the definition of the Johnson filtration in the case $b = 1$; the definition in the case $b = 0$ is similar), or topologically, as the subgroup of $\mathrm{Mod}_n^b$ generated by Dehn twists about separating curves. It also follows from work of Johnson that $\mathcal I_n^b/\mathcal K_n^b$ is the largest quotient of $\mathcal I_n^b$ which is abelian and torsion-free. This, together with finite generation of $\mathcal I_n^b$, immediately implies that $\mathcal K_n^b$ contains $[\mathcal I_n^b, \mathcal I_n^b]$ as a finite-index subgroup for $n \ge 3$.
and $|S^{(3)}| = 42\binom{n}{3} - |S^{(1)}| - |S^{(2)}|$.
(2) There exists an absolute constant $R$ such that $\mathcal K$ is generated by elements of the form
(a) $[s_i, s_j]$ $i < j \le N$ and $|a_m| \le R$ for all $m$;
(b) $x$ s a $x \in S^{(2)} \cup S^{(3)}$ and $|a_m| \le R$ for all $m$.

Theorem 3.5. Let $n \ge 8$. Let $N = n\binom{n}{2}$, and let $S = \{s_1, \dots, s_N\}$ be the Magnus generating set for $IA_n$. Then $[IA_n, IA_n]$ is generated by elements of the form
$$[s_i, s_j]^{s_i^{a_i} s_{i+1}^{a_{i+1}} \cdots s_N^{a_N}} \quad \text{where } 1 \le i < j \le N \text{ and } 0 \le |a_m| < 5 \cdot 10^{12} \text{ for each } m. \qquad (***)$$
In fact, $[IA_n, IA_n]$ is generated by elements of this form with the additional property that $|\{m : a_m \neq 0\}| \le 8n^2$.
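The value $N = n\binom{n}{2}$ can be checked against the usual description of the Magnus generators. The Python sketch below assumes the standard description (not restated in this section) in which the generating set consists of the conjugations $x_i \mapsto x_j^{-1} x_i x_j$ for $i \neq j$ together with the maps $x_i \mapsto x_i [x_j, x_k]$ for $j < k$ and $i \notin \{j, k\}$; the function names are illustrative.

```python
from math import comb


def magnus_generator_count(n):
    """Count Magnus generators of IA_n, assuming the standard two families."""
    conjugations = n * (n - 1)            # ordered pairs (i, j) with i != j
    commutator_type = n * comb(n - 1, 2)  # i, and an unordered pair {j, k} avoiding i
    return conjugations + commutator_type


def commutator_pairs(n):
    """Number of pairs i < j, i.e. of basic commutators [s_i, s_j] before conjugation."""
    return comb(magnus_generator_count(n), 2)


N_8 = magnus_generator_count(8)   # smallest n covered by Theorem 3.5
pairs_8 = commutator_pairs(8)
```

Under this assumption the count agrees with $n\binom{n}{2}$ for every $n \ge 2$; the number of elements of the form $(***)$ is then bounded by the number of pairs $i < j$ times the number of admissible exponent vectors.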

For each $I \subseteq [n]$ define $\mathrm{Mod}_I$ to be the subgroup of $\mathrm{Mod}(\Sigma)$ consisting of mapping classes which have a representative supported on $\Sigma_I$, and let $\mathcal I_I = \mathrm{Mod}_I \cap \mathcal I$. Parts (1)-(3) of Observation 4.3 have obvious group-theoretic consequences:

Observation 4.5. The following hold:
(i) $\mathrm{Mod}_{[n]} = \mathrm{Mod}(\Sigma)$.
(ii) If $I \subseteq J$, then $\mathrm{Mod}_I \subseteq \mathrm{Mod}_J$.
(iii) If $I$ and $J$ are disjoint and uncrossed, then $\mathrm{Mod}_I$ and $\mathrm{Mod}_J$ commute.
3 ) as a group.

(c) $\mathcal K/[\mathcal I, \mathcal I]$ (which by (a) is the torsion part of $\mathcal I/[\mathcal I, \mathcal I]$) has exponent 2 and rank 2n 2 + 2n

Lemma 4.8. Suppose that $n \ge 3$. Then for every $I \subseteq [n]$ with $|I| = 3$ there exists a generating set $S_I$ for $\mathcal I_I$ with $|S_I| = 42$ whose image in $\mathcal I/\mathcal K$ is equal to $(B \cap V_I) \sqcup \{0\}$. Moreover, if we let $S = \bigcup_{|I|=3} S_I$, then $S$ is a generating set for $\mathcal I$ which satisfies the conclusion of Lemma 2.3 for $G = \mathcal I$, $K = \mathcal K$ and $E = B$.
Corollary 4.10. Let $I$ be a subset of $[n]$, let $k = |I|$, and let $i_1 < i_2 < \dots < i_k$ be the elements of $I$ listed in increasing order. Then there exists an element $f_I \in \mathrm{Mod}(\Sigma)$ such that
(a) some representative of $f_I$ maps $\Sigma_{[k]}$ onto $\Sigma_I$;
(b) the induced action of $f_I$ on $\pi_1(\Sigma)$ satisfies $f_I^*(\alpha_j) = \alpha_{i_j}$ and $f_I^*(\beta_j) = \beta_{i_j}$ for all $1 \le j \le k$.
Moreover, if $f_I \in \mathrm{Mod}(\Sigma)$ is any element satisfying (a) and (b), then for any subset $J$ of $[k]$, some representative of $f_I$ maps $\Sigma_J$ onto $\Sigma_{I_J}$ where $I_J = \{i_j : j \in J\}$.
Lemma 4.11. Let $S$ be as in Lemma 4.8, let $\chi$ be a nonzero character of $G = \mathcal I$ and let $M = M(\chi)$. The following hold:
(a) There exists $g \in \mathrm{Sp}_{2n}(\mathbb Z)$ and $Z \subset S$ with the following properties:
(i) $(S, Z)$ is chain-centralizing.
(ii) $|g\chi(z)| \ge M$ for all $z \in Z$.
(iii) $g$ is a product of at most 15 elements of the form $\tau_{ij}$ and at most 3 transpositions $f_{ij}$.
(b) There exists an absolute constant $C_0$ (independent of $n$) such that any $g$ in part (a) admits a lift $\varphi \in \mathrm{Mod}_n^1$ with $A(\varphi^{\pm 1}) \le C_0$.

Remark. Arguing exactly as in the proof of Corollary 3.3, we deduce from Lemma 4.11 that the pair $(G, K) = (\mathcal I, \mathcal K)$ satisfies the Regularity Hypothesis for $C = 1$ and some finite $\Phi$ with $B(\Phi) \le A(\Phi) \le C_0$.
Definition of $S$: Choose any generating set $S_{[3]}$ of $\mathcal I_{[3]}$ satisfying the requirement of Lemma 4.8. For every $I \subseteq [n]$ with $|I| = 3$ define $S_I = f_I S_{[3]} f_I^{-1}$ and let $S = \bigcup_{|I|=3} S_I$. It is clear that each $S_I$ (and hence $S$) also satisfies the requirement of Lemma 4.8.

By definition, $A(g, S) = A(g, \bigcup_{|L|=3} S_L) = \max_{|L|=3} A(g, S_L)$, so it is enough to bound $A(g, S_L)$ with $|L| = 3$. From now on we fix $L$ with $|L| = 3$.

Case 1: $g = T^{\pm 1}_{\alpha_u\alpha_v^{-1}}$

where $r, s \in [|I|]$ are the unique indices such that $i_r = u$ and $i_s = v$. It follows that $f_I^{-1} g f_I$ is an element of $\mathrm{Mod}_{[|I|]}$, for which we have only boundedly many (in fact, at most $\binom{5}{2} = 10$) possibilities. Hence we also have boundedly many possibilities for $y = (f_I^{-1} g f_I)\, t\, (f_I^{-1} g f_I)^{-1}$ (since $t \in S_J$ and $J \subset [5]$ with $|J| = 3$, we have at most $\binom{5}{3} |S_{[3]}| = 420$ possibilities for $t$ and hence at most 4200 possibilities for $y$). Note that $y$ also lies in $\mathcal I_{[|I|]}$ and thus we can write $y = t_1^{\pm 1} \cdots t_m^{\pm 1}$ where each $t_j \in S_{[|I|]}$ and $m \le C_1$ for some absolute constant $C_1$. By (4.4) we have $gxg^{-1}$
$i < j \le n$ and $|a_j| \le R$ for all $j$; (b) $x$ s a $S^{(4)}$ and $|a_j| \le R$ for all $j$.
[3] of $\mathcal I^1_{[3]}$, also with 42 elements, satisfying the conclusion of Lemma 4.8. Next for each $I \subseteq [5]$ with $|I| = 3$ we choose $f_I$ satisfying the conclusion of Corollary 4.10 (it is easy to do this explicitly) and define $S_I = f_I S_{[3]} f_I^{-1}$. Also define $S_{[5]} = \bigcup_{|I|=3,\, I \subset [5]}$
b n ) might grow polynomially as well. We are not aware of analogous results dealing with $[IA_n, IA_n]$. Finally, it is natural to ask if the assertions of Theorem 1.1 and Theorem 1.2(2) would remain true if the condition $|a_m| \le C$ on the exponents is replaced by the much more restrictive condition $|a_m| \le C$ for some absolute constant $C$. Clearly, if this stronger version of Theorem 1.1 (resp. Theorem 1.2(2)) holds, it would immediately imply polynomial growth for $d([IA_n$