Thick subsets that do not contain arithmetic progressions

We adapt the construction of subsets of {1, 2, ..., N} that contain no k-term arithmetic progressions to give a relatively thick subset of an arbitrary set of N integers. Particular examples include a thick subset of {1, 4, 9, ..., N^2} that does not contain a 3-term AP, and a positive relative density subset of a random set (contained in {1, 2, ..., n} and having density c n^{-1/(k-1)}) that is free of k-term APs.


Introduction
For a finite set $\mathcal{N}$ (whose cardinality we denote by $N$), set $r_k(\mathcal{N})$ to be the largest possible size of a subset of $\mathcal{N}$ that contains no $k$-term arithmetic progressions ($k$-APs), and write $[N] = \{1, 2, \dots, N\}$. A little-known result of Komlós, Sulyok, and Szemerédi [KSS75] implies that
$$r_k(\mathcal{N}) \ge C \cdot r_k([N]) \qquad (1)$$
for an explicit positive constant $C$. Abbott [Abb90] reports that their proof gives $C = 2^{-15}$, and indicates some refinements that yield $C = 1/34$. In this work, we focus on the situation when $\mathcal{N}$ itself contains few progressions, where we can give bounds on $r_k(\mathcal{N})$ that are much stronger than those implied by Eq. (1) and the currently best bounds on $r_k([N])$. In particular, we adapt the Behrend-type construction [O'B10] of subsets of $[N]$ without $k$-APs to arbitrary finite sets $\mathcal{N}$. We draw particular attention to subsets of the squares and to subsets of random sets. As the statement of our theorem requires some notation and terminology, we first give two corollaries.
Our first corollary brings attention to the fact that while the squares contain many 3-APs, they also contain unusually large subsets that do not. Here and throughout this paper, $\exp(x) = 2^x$ and $\log x = \log_2 x$. For comparison, $r_3([N]) \ge N \exp\bigl(-2\sqrt{2}\sqrt{\log N} + \tfrac14 \log\log N\bigr)$.
Corollary 1. There is an absolute constant $C > 0$ such that for every $N$ there is a subset of $\{1, 4, 9, \dots, N^2\}$ with cardinality at least
$$C \cdot N \cdot \exp\bigl(-2\sqrt{2}\sqrt{\log\log N} + \tfrac14 \log\log\log N\bigr)$$
that does not contain any 3-term arithmetic progressions.
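To get a feel for how much thicker the subset of the squares is, the following sketch evaluates the two density factors numerically, that is, the bounds divided by $N$ with the unspecified constant $C$ omitted. The function names are ours, chosen for illustration, and only the relative sizes of the two factors are meaningful.

```python
import math

def density_factor_interval(N: int) -> float:
    """exp(-2*sqrt(2)*sqrt(log N) + (1/4) log log N), with exp(x) = 2^x, log = log_2."""
    L = math.log2(N)
    return 2.0 ** (-2 * math.sqrt(2) * math.sqrt(L) + 0.25 * math.log2(L))

def density_factor_squares(N: int) -> float:
    """The same expression with log N replaced by log log N (requires N > 2)."""
    LL = math.log2(math.log2(N))
    return 2.0 ** (-2 * math.sqrt(2) * math.sqrt(LL) + 0.25 * math.log2(LL))

if __name__ == "__main__":
    for e in (16, 64, 256):
        N = 2 ** e
        print(f"N = 2^{e}: interval {density_factor_interval(N):.3e}, "
              f"squares {density_factor_squares(N):.3e}")
```

The factor for the squares decays far more slowly, because $\log N$ is replaced by $\log\log N$ inside the exponent.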
Our second corollary identifies sets that have subsets with no $k$-APs and with positive relative density.
Corollary 2. For every real $\psi$ and integer $k \ge 3$, there is a real $\delta > 0$ such that every sufficiently large $\mathcal{N} \subseteq \mathbb{Z}$ that has fewer than $\psi|\mathcal{N}|$ arithmetic progressions of length $k$ contains a subset that is free of $k$-term arithmetic progressions and has relative density at least $\delta$. In particular, for each $\delta > 0$, if $n$ is sufficiently large and $\mathcal{N} \subseteq \{1, 2, \dots, n\}$ is formed by including each element independently with probability $c n^{-1/(k-1)} > 0$, then with high probability $\mathcal{N}$ contains a subset $A$ with relative density $\delta$ and no $k$-term arithmetic progressions.
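The sampling probability $c n^{-1/(k-1)}$ is the scale at which the expected number of $k$-APs in the random set is comparable to the expected size of the set. The sketch below makes this concrete by counting the $k$-APs in $\{1, \dots, n\}$ exactly and multiplying by the probability that all $k$ terms are kept. The function names are ours, and only progressions with positive difference are counted.

```python
def count_k_aps(n: int, k: int) -> int:
    """Number of k-term APs a, a+d, ..., a+(k-1)d in {1, ..., n} with d >= 1."""
    m = (n - 1) // (k - 1)              # largest admissible difference d
    # sum over d = 1..m of (n - (k-1)*d) choices for the first term a
    return m * n - (k - 1) * m * (m + 1) // 2

def expected_size_and_aps(n: int, k: int, c: float):
    """Expected size of the random set and expected number of k-APs inside it."""
    p = c * n ** (-1.0 / (k - 1))       # inclusion probability of each element
    return n * p, count_k_aps(n, k) * p ** k

if __name__ == "__main__":
    for n in (10 ** 4, 10 ** 6, 10 ** 8):
        size, aps = expected_size_and_aps(n, 3, 1.0)
        print(n, size, aps, aps / size)
```

For fixed $k$ and $c$ the ratio of the two expectations stays bounded as $n$ grows, which is the hypothesis of the first sentence of Corollary 2 in expectation.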
The structure of the proof requires us to consider a generalization of arithmetic progressions. A $k$-term $D$-progression is a nonconstant sequence $a_1, \dots, a_k$ whose $(D+1)$-st differences are all zero:
$$\sum_{i=0}^{D+1} (-1)^i \binom{D+1}{i} a_{i+v} = 0 \qquad (1 \le v \le k - D - 1).$$
Equivalently, $a_1, \dots, a_k$ is a $k$-term $D$-progression if there is a nonconstant polynomial $Q(j)$ with degree at most $D$ and $Q(j) = a_j$ for $j \in [k]$. Clarifying examples of 5-term 2-progressions of integers are 1, 2, 3, 4, 5 (from $Q(j) = j$), and 4, 1, 0, 1, 4 (from $Q(j) = (j-3)^2$), and 1, 3, 6, 10, 15 (from $Q(j) = \tfrac12 j + \tfrac12 j^2$).
Let $Q(j) = \sum_{i=0}^{D'} q_i j^i$ be a polynomial with degree $D' \ge 1$, so that $Q(1), Q(2), \dots, Q(k)$ is a $k$-term $D$-progression for all $D \ge D'$. The quantity $D'!\,q_{D'}$, which is necessarily nonzero, is called the difference of the sequence, and $(D', Q(1), D'!\,q_{D'})$ is the type of the sequence. Note that different progressions can have the same type: both 1, 4, 9, 16, 25 and 1, 5, 11, 19, 29 have type (2, 1, 2). For any set $\mathcal{N}$, we let $\mathrm{Type}_{k,D}(\mathcal{N})$ be the number of types of $k$-term $D$-progressions contained in $\mathcal{N}$. The proof of [O'B10, Lemma 4] shows that $\mathrm{Type}_{k,D}(\mathcal{N}) \ll |\mathcal{N}| \operatorname{diam}(\mathcal{N})$. Since the type of a $k$-term $D$-progression is determined by its first $D+1$ elements, we also have $\mathrm{Type}_{k,D}(\mathcal{N}) \le N^{D+1}$. We define $r_{k,D}(\mathcal{N})$ to be the largest possible size of a subset of $\mathcal{N}$ that contains no $k$-term $D$-progressions. We can now state our main theorem.
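The definitions of $D$-progression and type can be checked mechanically with forward differences. The following is a minimal sketch for integer sequences. The function names are ours, and the degree $D'$ is taken to be that of the minimal-degree interpolating polynomial, which is read off as the last order at which the differences are not all zero.

```python
def forward_difference(seq):
    """The sequence of consecutive differences, one term shorter than seq."""
    return [b - a for a, b in zip(seq, seq[1:])]

def progression_type(seq):
    """Return (D', first term, D'! * q_{D'}) for a nonconstant sequence."""
    if len(set(seq)) < 2:
        raise ValueError("sequence must be nonconstant")
    degree, current = 0, list(seq)
    while True:
        nxt = forward_difference(current)
        if not any(nxt):            # all zeros (or no terms left)
            break
        degree, current = degree + 1, nxt
    # current is now the constant sequence of D'-th differences
    return (degree, seq[0], current[0])

def is_D_progression(seq, D):
    """True if seq is nonconstant and its (D+1)-st differences all vanish."""
    return len(set(seq)) >= 2 and progression_type(seq)[0] <= D

if __name__ == "__main__":
    print(progression_type([1, 4, 9, 16, 25]))
    print(progression_type([1, 5, 11, 19, 29]))
```

Both example sequences from the text produce the same triple, which illustrates that a type does not determine the progression.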
Theorem 1. Let $k \ge 3$, $n \ge 2$, $D \ge 1$ be integers satisfying $k > 2^{n-1}D$. Let $\Psi(N)$ be any function that is at least 2. There is a constant $C = C(k, D, \Psi)$ such that for all $\mathcal{N} \subseteq \mathbb{Z}$ with […]
Corollary 2 is now straightforward: set $D = 1$ and $\Psi(N) = \max\{\psi, 2\}$ and take […] to arrive at the first sentence. Considering the random set $\mathcal{N}$ described in the second sentence of Corollary 2, for each pair $(a, a+d)$ of elements of $\mathcal{N}$ the likelihood of the next $k-2$ elements $a+2d, \dots, a+(k-1)d$ of the arithmetic progression being in $\mathcal{N}$ is $(c n^{-1/(k-1)})^{k-2}$. Consequently, the expected number of $k$-term arithmetic progressions in $\mathcal{N}$ is at most $n^2 (c n^{-1/(k-1)})^{k} = c^k n^{(k-2)/(k-1)}$, and the expected size of $\mathcal{N}$ is $N = n \cdot c n^{-1/(k-1)} = c n^{(k-2)/(k-1)}$. We can take $\Psi(N)$ to be a constant with high probability, and so Corollary 2 follows from Theorem 1.
Corollary 1 is only a bit more involved. It is known (perhaps since Fermat, see [Con08, Con07, vdP07, BFS03, FO04, KK05, McR10] for a history and for the results we use here) that while the squares do not contain any 4-term arithmetic progressions, the 3-term arithmetic progressions $a^2, b^2, c^2$ are parameterized by […] with $s, t, u \ge 1$ and $\gcd(s, t) = 1$. Merely observing that $s, t, u \ge 1$, $b \le N$ yields that there are fewer than $2\pi N \log N$ triples $(s, t, u)$ with $a, b, c$ in $[N]$, i.e., $\mathrm{Type}_{3,1}(\{1, 4, 9, \dots, N^2\}) < 2\pi N \log N$. Now, setting $k = 3$, $n = 2$, $D = 1$, $\Psi(N) = 2\pi \log N$ in Theorem 1 produces Corollary 1.
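The count of 3-term progressions among the squares can be sanity-checked by brute force for small $N$. The sketch below enumerates increasing triples $a < b < c \le N$ with $a^2 + c^2 = 2b^2$ directly, without using any parameterization. The function name is ours, and the comparison with $2\pi N \log N$ is only an illustration for small $N$, not a proof of the bound.

```python
import math

def square_3aps(N: int):
    """All (a, b, c) with 1 <= a < b < c <= N and a^2, b^2, c^2 in arithmetic progression."""
    found = []
    for b in range(1, N + 1):
        for a in range(1, b):
            c2 = 2 * b * b - a * a      # forced value of c^2; note c2 > b^2, so c > b
            c = math.isqrt(c2)
            if c <= N and c * c == c2:
                found.append((a, b, c))
    return found

if __name__ == "__main__":
    for N in (20, 100, 500):
        count = len(square_3aps(N))
        print(N, count, 2 * math.pi * N * math.log2(N))
```

For $N = 20$ the only triples are $(1, 5, 7)$, $(2, 10, 14)$ and $(7, 13, 17)$, far fewer than the roughly $N^2$ triples that a generic set of $N$ integers could contain.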
Section 2 gives a short outline of the construction behind Theorem 1, which is given in greater detail in Section 3. We conclude in Section 4 with some unresolved questions.

Overview of construction proving Theorem 1
Throughout this work we fix three integers, $k \ge 3$, $n \ge 2$, $D \ge 1$, that satisfy $k > 2^{n-1}D$; in other words, one may take $n = \lceil \log(k/D) \rceil$.
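The choice of $n$ can be computed without floating-point logarithms. The following small sketch returns the largest $n$ with $2^{n-1}D < k$, which agrees with $\lceil \log_2(k/D) \rceil$. The function name is ours, and the guard $k > 2D$ is an assumption needed so that some $n \ge 2$ is admissible.

```python
def largest_admissible_n(k: int, D: int) -> int:
    """Largest integer n >= 2 with 2^(n-1) * D < k, using integer arithmetic only."""
    if k <= 2 * D:
        raise ValueError("need k > 2*D so that some n >= 2 is admissible")
    n = 2
    while (2 ** n) * D < k:     # is n + 1 still admissible?
        n += 1
    return n

if __name__ == "__main__":
    for k, D in [(3, 1), (4, 1), (5, 1), (9, 2), (100, 3)]:
        print(k, D, largest_admissible_n(k, D))
```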
In this section, we outline the construction, suppressing as much technical detail as possible. In the following sections, all definitions are made precisely and all arguments are given full rigor.
Fix $\Psi(N)$, and take $\mathcal{N} \subseteq \mathbb{Z}$ with $|\mathcal{N}| = N$, and so that $\mathcal{N}$ contains fewer than $N\Psi(N)$ types of $k$-term $D$-progressions. The parameters $N_0$, $d$, $\delta$ are chosen at the end for optimal effect.
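Let $A_0 = R_{k,2D}(N_0)$ be a subset of $[N_0]$ without $k$-term $2D$-progressions, and […]. Consider $\omega, \alpha$ in $\mathbb{T}^d$ (we average over all choices of $\omega, \alpha$ later in the argument), and set
$$A := A(\omega, \alpha) = \{ n \in \mathcal{N} : n\omega + \alpha \bmod 1 \in \mathrm{Annuli} \},$$
where Annuli is a union of thin annuli in $\mathbb{R}^d$ with thickness $\delta$ whose radii are affinely related to elements of $A_0$. Set
$$T := T(\omega, \alpha) = \{ a \in \mathcal{N} : A \text{ contains a } k\text{-term progression of type } (D', a, b) \text{ for some } D' \in [D] \text{ and some } b \}.$$
Then $A \setminus T$ is free of $k$-term $D$-progressions, and so $r_{k,D}(\mathcal{N}) \ge |A \setminus T| = |A| - |T|$, and more usefully
$$r_{k,D}(\mathcal{N}) \ge \mathbb{E}[|A| - |T|] = \mathbb{E}[|A|] - \mathbb{E}[|T|],$$
with the expectation referring to choosing $\omega, \alpha$ uniformly from the torus $\mathbb{T}^d$. We have
$$\mathbb{E}[|A|] = N \operatorname{vol}(\mathrm{Annuli}).$$
We also have
$$\mathbb{E}[|T|] \le \sum \mathbb{E}[E(D', a, b)],$$
where $E(D', a, b)$ is 1 if $A$ contains a $k$-term progression of type $(D', a, b)$, and is 0 otherwise, and the summation has $\mathrm{Type}_{k,D}(\mathcal{N})$ summands. Using the assumption that $A_0$ is free of $k$-term $2D$-progressions, we are able to bound $\mathbb{E}[E(D', a, b)]$ efficiently in terms of the volume of Annuli and the volume of a small sphere. We arrive at […].
Given $x \in \mathbb{R}^d$, we denote the unique element $y$ of $\mathrm{Box}_0$ with $x - y \in \mathbb{Z}^d$ as $x \bmod 1$.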
Let $A_0$ be a subset of $[N_0]$ with cardinality $r_{k,2D}([N_0])$ that does not contain any $k$-term $2D$-progression, and assume $2\delta N_0 \le 2^{-2D}$. We define Annuli in the following manner: […] where $z \in \mu \pm \sigma$ is chosen to maximize the volume of Annuli. Geometrically, Annuli is the union of $|A_0|$ spherical shells, intersected with $\mathrm{Box}_D$. From [O'B10, Lemma 3], the Berry-Esseen central limit theorem and the pigeonhole principle yield:
Lemma 1 (Annuli has large volume). If $d$ is sufficiently large, $A_0 \subseteq [N_0]$, and $2\delta \le 1/n$, then the volume of Annuli is at least $\tfrac25\, 2^{-dD} |A_0| \delta$.

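Set
$$A := A(\omega, \alpha) = \{ n \in \mathcal{N} : n\omega + \alpha \bmod 1 \in \mathrm{Annuli} \},$$
which we will show is typically (with respect to $\omega, \alpha$ being chosen uniformly from $\mathrm{Box}_0$) a set with many elements and few types of $D$-progressions. After removing one element from $A$ for each type of progression it contains, we will be left with a set that has large size and no $k$-term $D$-progressions. Define $T := T(\omega, \alpha)$ to be the set
$$\{ a \in \mathcal{N} : A(\omega, \alpha) \text{ contains a } k\text{-term progression of type } (D', a, b) \text{ for some } D' \in [D] \text{ and some } b \},$$
which is contained in $A(\omega, \alpha)$. Observe that $A \setminus T$ is a subset of $\mathcal{N}$ and contains no $k$-term $D$-progressions, and consequently $r_{k,D}(\mathcal{N}) \ge |A \setminus T| = |A| - |T|$ for every $\omega, \alpha$. In particular,
$$r_{k,D}(\mathcal{N}) \ge \mathbb{E}_{\omega,\alpha}[|A| - |T|] = \mathbb{E}_{\omega,\alpha}[|A|] - \mathbb{E}_{\omega,\alpha}[|T|].$$
First, we note that
$$\mathbb{E}_{\omega,\alpha}[|A|] = \sum_{n \in \mathcal{N}} \mathbb{P}_{\omega,\alpha}[\, n\omega + \alpha \bmod 1 \in \mathrm{Annuli} \,] = N \operatorname{vol}(\mathrm{Annuli}).$$
Lemma 2. Suppose that $p(j)$ is a polynomial with degree $D'$, with $D'$-th coefficient $p_{D'}$, and set $x_j := \omega\, p(j) + \alpha \bmod 1$. If $x_1, x_2, \dots, x_k$ are in $\mathrm{Box}_D$ and $k \ge D + 2$, then there is a vector polynomial $P(j) = \sum_{i=0}^{D'} P_i j^i$ with $P(j) = x_j$ for $j \in [k]$, and $D'!\,P_{D'} = D'!\,p_{D'}\,\omega \bmod 1$.
Thus, the $x_i$ are a $D'$-progression in $\mathbb{R}^d$, say $P(j) = \sum_{i=0}^{D'} P_i j^i$ has $P(j) = x_j$ and $D'!\,P_{D'} = D'!\,p_{D'}\,\omega \bmod 1 = b\,\omega \bmod 1$. Recalling that $z$ was chosen in the definition of Annuli, by elementary algebra […] is a degree $2D'$ polynomial in $j$ (with real coefficients), and since $P(j) = x_j \in \mathrm{Annuli}$ for $j \in [k]$, we know that […] for all $j \in [k]$, and also $Q(1), \dots, Q(k)$ is a $2D'$-progression. Define the real numbers $a_j \in A_0$, $\epsilon_j \in \pm\delta$ by […].
For a finite sequence $(a_i)_{i=1}^{k}$, we define the forward difference $\Delta(a_i)$ to be the slightly shorter finite sequence $(a_{v+1} - a_v)_{v=1}^{k-1}$. The formula for repeated differencing is
$$\Delta^m(a_i) = \Bigl( \sum_{i=0}^{m} \binom{m}{i} (-1)^{m-i} a_{i+v} \Bigr)_{v=1}^{k-m}.$$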
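The deletion step, removing the first term of every progression, can be illustrated in the simplest case $D = 1$, where a type is just a pair (first term, nonzero difference). The sketch below is ours: it takes an arbitrary finite set in place of $A(\omega, \alpha)$, and the function names are hypothetical. It shows only why $A \setminus T$ is progression-free and why $|T|$ is at most the number of types.

```python
def k_ap_types(A, k):
    """Types (first term, difference) of k-term APs contained in the set A."""
    A = set(A)
    types = set()
    for a in A:
        for second in A:
            d = second - a
            if d != 0 and all(a + j * d in A for j in range(k)):
                types.add((a, d))
    return types

def delete_first_terms(A, k):
    """Return A minus T, where T is the set of first terms of k-term APs in A."""
    T = {a for (a, _d) in k_ap_types(A, k)}
    return set(A) - T

if __name__ == "__main__":
    A = {1, 2, 3, 5, 8, 9, 10, 11, 20, 31}
    B = delete_first_terms(A, 3)
    print(sorted(B), len(k_ap_types(B, 3)))
```

Any progression surviving in $A \setminus T$ would also be a progression in $A$, so its first term would lie in $T$, which is impossible. Because negative differences are allowed, both endpoints of each progression are removed here.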
We note that a nonconstant sequence $(a_i)$ with at least $2D + 1$ terms is a $2D$-progression if and only if $\Delta^{2D+1}(a_i)$ is a sequence of zeros. If $a_i = p(i)$, with $p$ a polynomial with degree $2D$ and lead term $p_{2D} \ne 0$, then $\Delta^{2D}(a_i) = ((2D)!\,p_{2D})$, a nonzero-constant sequence. Note also that $\Delta$ is a linear operator. Finally, we make use of the fact, provable by induction for $1 \le m \le k$, that
$$|\Delta^m(a_i)(v)| \le 2^{m-1} \bigl( \max_i a_i - \min_i a_i \bigr).$$
We need to handle two cases separately: either the sequence $(a_i)$ is constant or it is not. Suppose first that it is not constant. Since $a_i \in A_0$, a set without $k$-term $2D$-progressions, we know that $\Delta^{2D+1}(a_i) \ne (0)$, and since $(a_i)$ is a sequence of integers, for some $v$
$$|\Delta^{2D+1}(a_i)(v)| \ge 1.$$
Since $|\epsilon_i| < \delta$, we find that […] and since we assumed that $2\delta N_0 \le 2^{-2D}$, we arrive at the impossibility […].
Now assume that $(a_i)$ is a constant sequence, say $a := a_i$, so that […]. This translates to […]. Clearly a degree $2D'$ polynomial, such as $\|P(j)\|_2^2$, cannot have the same value at $2D' + 1$ different arguments; we pull now another lemma from [O'B10, Lemma 1] that quantifies this.
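The facts about repeated differencing used here are easy to confirm numerically. The sketch below compares iterated forward differences with the standard closed form $\sum_{i=0}^{m} (-1)^{m-i}\binom{m}{i} a_{i+v}$ and checks that a degree-4 polynomial sequence has constant fourth differences equal to $4!$ times its lead coefficient. The function names and the sample polynomial are ours.

```python
from math import comb, factorial

def forward_difference(seq):
    """Consecutive differences (a_{v+1} - a_v), one term shorter than seq."""
    return [b - a for a, b in zip(seq, seq[1:])]

def iterated_difference(seq, m):
    """Apply the forward difference operator m times."""
    out = list(seq)
    for _ in range(m):
        out = forward_difference(out)
    return out

def binomial_difference(seq, m):
    """m-th differences via the alternating binomial sum (0-based indices)."""
    return [
        sum((-1) ** (m - i) * comb(m, i) * seq[v + i] for i in range(m + 1))
        for v in range(len(seq) - m)
    ]

if __name__ == "__main__":
    seq = [3 * i ** 4 - i + 2 for i in range(1, 9)]   # degree 4, lead coefficient 3
    print(binomial_difference(seq, 4), factorial(4) * 3)
    print(binomial_difference(seq, 5))
```

The fifth differences vanish identically, in line with the criterion that a sequence is a $2D$-progression exactly when its $(2D+1)$-st differences are zero (here $2D = 4$).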
Using Lemma 3, the lead coefficient $P_{D'}$ of $P(j)$ satisfies […] where $F$ is an explicit constant. We have deduced that $E(D', a, b) = 1$ only if $a\omega + \alpha \bmod 1 \in \mathrm{Annuli}$ and $\|b\omega \bmod 1\|_2 \le \sqrt{F\sigma\delta}$.
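Since $\alpha$ is chosen uniformly from $\mathrm{Box}_0$, we notice that
$$\mathbb{P}_{\alpha}[\, a\omega + \alpha \bmod 1 \in \mathrm{Annuli} \,] = \operatorname{vol}(\mathrm{Annuli}),$$
independent of $\omega$. Also, we notice that the event $\{ \|b\omega \bmod 1\|_2 \le \sqrt{F\sigma\delta} \}$ is independent of $\alpha$, and that since $b$ is an integer, $\omega \bmod 1$ and $b\omega \bmod 1$ are identically distributed. Therefore, the event […] and so […]. Equations (3), (4), and (5) now give us […].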