On the Distribution of Critical Points of a Polynomial

This paper proves that if points $Z_1, Z_2, \ldots$ are chosen independently from a common probability measure $\mu$ on the unit circle in the complex plane, and $p_n(z) = (z-Z_1)(z-Z_2)\cdots(z-Z_n)$, then the empirical distribution of the critical points of $p_n$ converges weakly to $\mu$.


Introduction
Across many fields of mathematics, one of the fundamental questions about a function is the location of its zeros. Entire fields such as algebraic geometry and the emergent study of stable functions have locations of zeros as their focus.
The relation between the zeros of a function and the zeros of its derivative (the critical points) is interesting and not always obvious. In the case where all zeros are real, Rolle's theorem tells us that the zeros of the derivative interlace the zeros of the function itself. In the case of complex polynomials the analogous result is the Gauss-Lucas theorem, which states that the zeros of the derivative of $f$ must lie in the convex hull of the zeros of $f$, and gives a representation of the zeros of $f'$ as convex combinations of the zeros of $f$. A corollary of this is that differentiating preserves stability. Differentiation is also known never to increase the number of non-real zeros of a polynomial.
Two famous conjectures in this area are the conjectures of Sendov and Smale. The former, made by Blagovest Sendov during the 1950s, states that if the roots $z_1, z_2, \ldots, z_n$ of a polynomial all lie inside the closed unit disc, then for each root of the polynomial, the closed unit disc centered at the root must contain at least one critical point. The latter, made by Steve Smale, states that if $f$ is a polynomial of degree $n$ with $f(0) = 0$ and $f'(0) \neq 0$, then
$$\min\left\{ \left| \frac{f(\zeta)}{\zeta f'(0)} \right| : f'(\zeta) = 0 \right\} \le K,$$
where $K = 1$ or $\frac{n-1}{n}$. Sendov's conjecture has been proven for the case when $z_1, z_2, \ldots, z_n$ all lie on the unit circle, whereas Smale's conjecture has been proven for the case when $f$ has all its roots, save 0, on the unit circle. The most general forms of these conjectures are still unsolved. More information on these conjectures and proofs of some of the special cases can be found in [RS02].
Recent work in random matrix theory has put forward numerous connections between the zeros and critical points of the Riemann zeta function and those of the characteristic polynomial of a unitary matrix in the Circular Unitary Ensemble. While Keating and Snaith [KS00] conjectured values for all even moments of the Riemann zeta function on the critical line, Dueñez et al. [DFFHMP10] compared the horizontal distribution of critical points of the Riemann zeta function to the radial distribution of critical points of the characteristic polynomial of a random unitary matrix.
A probabilistic study of the roots of derivatives of polynomials was carried out by Pemantle and Rivin in [PR12]. Let $f$ be a polynomial with $n$ roots that are chosen independently from a probability measure $\mu$ on the complex plane. They conjectured that the empirical distribution of the roots of $f'$ converges weakly to $\mu$ as $n \to \infty$. They prove this in the special case when $\mu$ has finite 1-energy, namely when $\mu$ satisfies
$$\int\!\!\int \frac{1}{|z - w|} \, d\mu(z) \, d\mu(w) < \infty.$$
This condition cannot hold, however, when µ is supported on any set of dimension 1 or less. The aim of the present paper is to extend their result to the case of any measure supported on the unit circle.
The author would like to mention that while this paper was being refereed, a proof of the Pemantle-Rivin conjecture in the general case was found in [Za12], along very different lines from the approach taken here.

Notations and Background
Let $Z_1, Z_2, \ldots$ be a sequence of points chosen i.i.d. from some distribution $\mu$ on the unit circle. Write $Z_k = \exp(2\pi i \theta_k)$, so that $\{\theta_k\}$ is a collection of i.i.d. random variables whose common law, which we denote by $\nu$, is supported on $[0, 1]$.
We shall write D for the open unit disc, and C for the unit circle.
In [PR12], Pemantle and Rivin conjectured that, for any distribution $\mu$ on the closed unit disc, $Z(p_n')$ converges weakly to $\mu$, where $Z(f)$ denotes the empirical distribution of the roots of a polynomial $f$. That paper also proves the following proposition.
Proposition 2.1. Let $\mu$ be the uniform measure on $C$. Then $Z(p_n')$ converges to $C$ in probability, that is, $P(Z(p_n')(S) \ge \epsilon) \to 0$ for any $\epsilon > 0$ and any closed set $S \subset D$, disjoint from $C$.
In this note, we generalize this to the following.
Lemma 2.2. For any distribution $\mu$ on $C$, $Z(p_n')$ converges to $C$ in probability. In fact, if $\mu$ is not uniform on $C$, the convergence is almost sure.
The above leads us to our main result, which is a special case of the aforementioned conjecture in [PR12].
Theorem 2.3. For any distribution $\mu$ on $C$, $Z(p_n')$ converges weakly to $\mu$ on $C$.
The proof, as shall be seen in the forthcoming sections, can be divided into two parts, the latter following a pattern similar to the proof of Weyl's equidistribution criterion (see, for example, [Ch68]). The former requires the following theorem (proved both in [KR01] and in [CN06]) regarding a companion matrix of the critical points.
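As a minimal numerical sketch of the companion-matrix result, assume it takes the form suggested by the matrix used in Section 3: for roots $z_1, \ldots, z_n$, with $D = \mathrm{diag}(z_1, \ldots, z_{n-1})$ and $I$, $J$ the identity and all-ones matrices of order $n-1$, the characteristic polynomial of $D(I - \frac{1}{n}J) + \frac{z_n}{n}J$ is $p'(z)/n$. The function names below are illustrative only and do not come from [KR01] or [CN06].

```python
import numpy as np


def companion_matrix(roots):
    """Assumed companion matrix D(I - J/n) + (z_n/n) J of order n-1.

    `roots` holds z_1, ..., z_n; D is built from the first n-1 of them.
    """
    z = np.asarray(roots, dtype=complex)
    n = z.size
    D = np.diag(z[:-1])
    I = np.eye(n - 1)
    J = np.ones((n - 1, n - 1))
    return D @ (I - J / n) + (z[-1] / n) * J


def derivative_coeffs(roots):
    """Coefficients (highest degree first) of the monic polynomial p'(z)/n."""
    p = np.poly(roots)               # coefficients of prod (z - z_j)
    return np.polyder(p) / len(roots)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    roots = np.exp(2j * np.pi * rng.random(8))   # 8 points on the unit circle
    char_poly = np.poly(companion_matrix(roots))  # characteristic polynomial
    print(np.max(np.abs(char_poly - derivative_coeffs(roots))))
```

Under this assumption the eigenvalues of the matrix are exactly the critical points, which is what lets power sums of critical points be written as traces of matrix powers.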

Proofs of Lemma 2.2 and Theorem 2.3
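We begin by proving a small lemma.
Lemma 3.1. Suppose $\nu$ is not uniform on $[0, 1]$, and let $c_k = E(Z_1^{-k}) = E(\exp(-2\pi i k \theta_1))$ for $k \ge 1$. Then $c_k \neq 0$ for at least one $k \ge 1$.
Proof. Suppose, on the contrary, that $c_k = 0$ for all $k \ge 1$. By the Strong Law of Large Numbers, almost surely
$$\frac{1}{n} \sum_{j=1}^{n} \exp(-2\pi i k \theta_j) \to c_k = 0$$
for all $k \ge 1$, and so by Weyl's criterion, for any $0 \le a < b \le 1$,
$$\frac{1}{n} \, \#\{ j \le n : \theta_j \in [a, b] \} \to b - a$$
almost surely. But, again by the Strong Law of Large Numbers,
$$\frac{1}{n} \, \#\{ j \le n : \theta_j \in [a, b] \} \to \nu([a, b])$$
almost surely. Since $\nu$ is not uniform on $[0, 1]$, we have $\nu([a, b]) \neq b - a$ for some $a < b$, and we have arrived at a contradiction. So, there must exist at least one non-zero $c_k$.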
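To make the role of the coefficients concrete, here is a small sketch that estimates them from samples. It assumes the reading $c_k = E(\exp(-2\pi i k \theta_1))$ and uses a hypothetical non-uniform law with density $1 + \cos(2\pi\theta)$ on $[0, 1]$, for which a direct integration gives $c_1 = 1/2$ and $c_k = 0$ for $k \ge 2$. Both the density and the function names are illustrative choices, not taken from the paper.

```python
import numpy as np


def sample_tilted(n, rng):
    """Draw n angles in [0, 1) with density 1 + cos(2*pi*theta).

    Plain rejection sampling against the uniform law (the density is <= 2).
    """
    accepted = []
    while len(accepted) < n:
        theta = rng.random(n)
        u = 2.0 * rng.random(n)
        accepted.extend(theta[u < 1.0 + np.cos(2 * np.pi * theta)])
    return np.array(accepted[:n])


def empirical_coefficient(theta, k):
    """Sample average of exp(-2*pi*i*k*theta), an estimate of c_k."""
    return np.mean(np.exp(-2j * np.pi * k * theta))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    tilted = sample_tilted(20000, rng)
    uniform = rng.random(20000)
    for k in (1, 2, 3):
        print(k, empirical_coefficient(tilted, k), empirical_coefficient(uniform, k))
```

For the uniform sample every estimate should be close to zero, while the tilted sample shows a first coefficient near $1/2$, in line with the lemma.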
We proceed to use this fact for the proof of Lemma 2.2.
Proof of Lemma 2.2. Assume $\mu$ is not the uniform distribution on the circle (the uniform case has been taken care of in [PR12]). Then, as mentioned above, there is at least one non-zero $c_k$. Since $|c_k| \le 1$ for all $k$, the power series $f(z) = \sum_{k=0}^{\infty} c_{k+1} z^k$ converges at every point $z \in D$ and is analytic there. As $f$ is not identically zero, it has only finitely many zeros inside any $r$-ball $B_r(0)$ with $r < 1$.
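Let
$$V_n(z) = \frac{p_n'(z)}{n \, p_n(z)} = \frac{1}{n} \sum_{j=1}^{n} \frac{1}{z - Z_j}.$$
$V_n$ has $n - 1$ zeros, which are exactly the zeros of $p_n'(z)$, and $n$ poles, which are exactly the zeros of $p_n(z)$. Since the poles all lie on $C$, $V_n(z)$ is analytic inside $D$. We shall show that as $n \to \infty$, $V_n$ converges inside the disc to $-f$, uniformly over compact sets. To see this, note that for $z \in D$,
$$V_n(z) = -\frac{1}{n} \sum_{j=1}^{n} \frac{1}{Z_j} \cdot \frac{1}{1 - z/Z_j} = -\sum_{k=0}^{\infty} a_{k+1}^{(n)} z^k,$$
where we write $a_{k+1}^{(n)}$ for the power sum average
$$a_{k+1}^{(n)} = \frac{1}{n} \sum_{j=1}^{n} Z_j^{-(k+1)}.$$
By the Strong Law of Large Numbers, $a_k^{(n)} \to c_k$ almost surely for each $k \ge 1$. Let $0 < r < 1$. Given any $\delta > 0$, there exists $K \ge 1$ such that
$$2 \sum_{k \ge K} r^k < \frac{\delta}{2}.$$
Corresponding to the chosen $K$, almost surely there exists an $N \ge 1$ such that
$$\left| a_{k+1}^{(n)} - c_{k+1} \right| < \frac{\delta}{2K}$$
for all $n \ge N$ and all $k = 0, 1, \ldots, K - 1$. Therefore, for all $n \ge N$ and all $z \in B_r(0)$,
$$|V_n(z) + f(z)| \le \sum_{k=0}^{K-1} \left| a_{k+1}^{(n)} - c_{k+1} \right| r^k + \sum_{k \ge K} 2 r^k < \frac{\delta}{2} + \frac{\delta}{2} = \delta,$$
which proves uniform convergence of $V_n$ to $-f$ over compact sets.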
Using Hurwitz's theorem (see [Co78]), given any $0 < r < 1$, there exists an $M \ge 1$ for which $V_n$ and $f$ have the same number of zeros inside $B_r(0)$ for all $n \ge M$. That is, $p_n'$ and $f$ have the same number of zeros inside $B_r(0)$ for all $n \ge M$. But, as discussed above, $f$ has only finitely many zeros inside $B_r(0)$, whereas $p_n'$ has $n - 1$ zeros in total. Hence the proportion of critical points lying in $B_r(0)$ tends to 0, and $Z(p_n')$ converges to the unit circle almost surely.
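The conclusion of Lemma 2.2 can be explored numerically. The sketch below is an illustration under stated assumptions, not part of the proof: it takes a hypothetical non-uniform $\mu$ (uniform on the upper half of the circle) and computes the critical points as eigenvalues of the matrix $D(I - \frac{1}{n}J) + \frac{z_n}{n}J$, assuming the companion-matrix result has that form. All function names are made up for the example.

```python
import numpy as np


def critical_points(roots):
    """Critical points of prod (z - roots[j]).

    Computed as eigenvalues of D(I - J/n) + (z_n/n) J, the assumed
    companion-matrix form; this avoids expanding the polynomial.
    """
    z = np.asarray(roots, dtype=complex)
    n = z.size
    J = np.ones((n - 1, n - 1))
    M = np.diag(z[:-1]) @ (np.eye(n - 1) - J / n) + (z[-1] / n) * J
    return np.linalg.eigvals(M)


def sample_roots(n, rng):
    """n i.i.d. points, uniform on the upper half arc of the unit circle."""
    theta = 0.5 * rng.random(n)
    return np.exp(2j * np.pi * theta)


def fraction_inside(n, r, rng):
    """Proportion of critical points with modulus below r."""
    y = critical_points(sample_roots(n, rng))
    return float(np.mean(np.abs(y) < r))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    for n in (50, 100, 200, 400):
        print(n, fraction_inside(n, 0.5, rng))
```

If the lemma's picture is right, the printed proportions should be small and shrink as $n$ grows, since only a bounded number of critical points can stay inside a fixed smaller disc.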
Our main result, Theorem 2.3, will be a consequence of the following proposition.
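Proof. The result holds for $k = 1$, because the average of the critical points is exactly equal to the average of the roots (compare the coefficient of $z^{n-1}$ in $p_n(z)$ with that of $z^{n-2}$ in $p_n'(z)$). To prove the result for general $k$, we use a result of [KR01] (which also appears in [CN06]), mentioned as a proposition in Section 2, to see that for $k \ge 2$,
$$\sum_{j=1}^{n-1} \left( y_j^{(n)} \right)^k = \mathrm{Tr}\left[ D\left(I - \tfrac{1}{n} J\right) + \tfrac{z_n}{n} J \right]^k,$$
where $y_1^{(n)}, \ldots, y_{n-1}^{(n)}$ are the critical points of $p_n$. Note that the expansion of $\left[ D\left(I - \tfrac{1}{n} J\right) + \tfrac{z_n}{n} J \right]^k$ is the sum of all terms such as
$$D^{l_1} \left(-\tfrac{1}{n} D J\right)^{l_2} \left(\tfrac{z_n}{n} J\right)^{l_3} \cdots D^{l_{3k-2}} \left(-\tfrac{1}{n} D J\right)^{l_{3k-1}} \left(\tfrac{z_n}{n} J\right)^{l_{3k}}, \qquad (2)$$
where the exponents $l_1, l_2, \ldots, l_{3k}$ are non-negative integers, with $l_{3j-2} + l_{3j-1} + l_{3j} = 1$ for all $j = 1, 2, \ldots, k$. Clearly the number of such terms is $3^k$, which does not depend on $n$, and so, if the normalized trace $\frac{1}{n-1}\mathrm{Tr}$ of the matrix in expression (2) converges as $n \to \infty$ to $a_{l_1, l_2, \ldots, l_{3k}}$, then the normalized trace of $\left[ D\left(I - \tfrac{1}{n} J\right) + \tfrac{z_n}{n} J \right]^k$ converges to the sum of the $a_{l_1, l_2, \ldots, l_{3k}}$ over all admissible exponents.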
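The above, together with the identity $J D^m J = \mathrm{Tr}(D^m) \, J$, tells us that there exist $p, q, s_0, s_1, s_2, \ldots, s_{k-1} \ge 0$ such that term (2) is of the form
$$\frac{(-1)^p z_n^q}{n^{p+q}} \prod_{i=0}^{k-1} \left( \mathrm{Tr}(D^i) \right)^{s_i} M, \qquad (3)$$
where the numbers $p, q, s_0, s_1, \ldots, s_{k-1}$ are determined solely by the $l_i$'s (and so are independent of $n$).
Also, $M$ can only be one of the following terms: $D^k$, or $D^m J$, or $D^{m_1} J D^{m_2}$ for some $m, m_1, m_2 \ge 0$, which are fixed, $\le k$, and dependent only on the $l_i$'s. Furthermore, the scalar coefficient in (3) is always $O(1)$: the diagonal entries of $D$, namely $Z_1, \ldots, Z_{n-1}$, lie on $C$, so each $\mathrm{Tr}(D^i)$ has modulus at most $n - 1$, while $|z_n| = 1$ and $s_0 + \cdots + s_{k-1} \le p + q$.
Observe that, if $M = D^k$, then the scalar coefficient in (3) is equal to 1 and
$$\frac{1}{n-1} \mathrm{Tr}(M) = \frac{1}{n-1} \sum_{j=1}^{n-1} Z_j^k.$$
In every other case, $p + q \ge 1$ and $s_0 + \cdots + s_{k-1} = p + q - 1$, so the scalar coefficient is $O(1/n)$, while $\mathrm{Tr}(M) = \mathrm{Tr}(D^{m_1 + m_2})$ has modulus at most $n - 1$; hence $\frac{1}{n-1}$ times the trace of such a term tends to 0. Thus,
$$\frac{1}{n-1} \sum_{j=1}^{n-1} \left( y_j^{(n)} \right)^k - \frac{1}{n} \sum_{j=1}^{n} Z_j^k \to 0.$$
We now have all the tools required to prove our main result, namely Theorem 2.3.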
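The following sketch checks numerically how close the power-sum averages of the critical points are to those of the roots. It rests on two assumptions that should be read as such: that the proposition being proved compares $\frac{1}{n-1}\sum_j (y_j^{(n)})^k$ with $\frac{1}{n}\sum_j Z_j^k$, and that the critical points are the eigenvalues of $D(I - \frac{1}{n}J) + \frac{z_n}{n}J$. The sampling law and the function names are hypothetical.

```python
import numpy as np


def critical_points(roots):
    """Eigenvalues of D(I - J/n) + (z_n/n) J (assumed companion-matrix form)."""
    z = np.asarray(roots, dtype=complex)
    n = z.size
    J = np.ones((n - 1, n - 1))
    M = np.diag(z[:-1]) @ (np.eye(n - 1) - J / n) + (z[-1] / n) * J
    return np.linalg.eigvals(M)


def power_mean_gap(roots, k):
    """|mean of k-th powers of critical points - mean of k-th powers of roots|."""
    z = np.asarray(roots, dtype=complex)
    y = critical_points(z)
    return float(abs(np.mean(y ** k) - np.mean(z ** k)))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    for n in (50, 100, 200, 400):
        # Roots uniform on the upper half arc: an arbitrary non-uniform example.
        roots = np.exp(2j * np.pi * 0.5 * rng.random(n))
        print(n, [power_mean_gap(roots, k) for k in (1, 2, 3)])
```

For $k = 1$ the gap should vanish up to rounding, matching the coefficient comparison in the proof; for larger $k$ it should decay roughly like $1/n$.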
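Proof of Theorem 2.3. Write the critical points of $p_n$ in polar form as $y_j^{(n)} = r_j^{(n)} \exp(2\pi i \phi_j^{(n)})$, $j = 1, 2, \ldots, n - 1$. The proof will consist of three major segments. Our first task is to prove that
$$\frac{1}{n-1} \sum_{j=1}^{n-1} \left( 1 - r_j^{(n)} \right) \to 0 \quad \text{in probability}.$$
In fact, unless $\mu$ is uniform on the circle, we will show that this convergence is almost sure. Next, we shall use the above information to show that, for every positive integer $k$,
$$\frac{1}{n-1} \sum_{j=1}^{n-1} \exp(2k\pi i \phi_j^{(n)}) \to E(\exp(2k\pi i \Theta)),$$
where $\Theta$ denotes a random variable with law $\nu$. (Again, the convergence is almost sure, unless $\mu$ is uniform on $C$.) Finally, using arguments analogous to those in the proof of Weyl's equidistribution criterion, we shall arrive at our final result.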
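Assume, initially, that $\mu$ is not the uniform law on $C$. For the first task as noted above, observe that, by Lemma 2.2, given any $\epsilon > 0$,
$$\frac{1}{n-1} \, \#\{ j : r_j^{(n)} < 1 - \epsilon \} \to 0 \quad \text{almost surely}.$$
Since $0 \le r_j^{(n)} \le 1$ for every $j$ by the Gauss-Lucas theorem, the average of the $1 - r_j^{(n)}$ is at most $\epsilon$ plus the proportion above. Clearly then, a simple squeeze theorem argument gives us
$$\frac{1}{n-1} \sum_{j=1}^{n-1} \left( 1 - r_j^{(n)} \right) \to 0 \quad \text{almost surely}. \qquad (5)$$
Now, from Proposition 3.2 (together with the Strong Law of Large Numbers applied to the $Z_j^k$), for any positive integer $k$,
$$\frac{1}{n-1} \sum_{j=1}^{n-1} \left( r_j^{(n)} \right)^k \exp(2k\pi i \phi_j^{(n)}) \to E(\exp(2k\pi i \Theta)) \quad \text{almost surely}, \qquad (6)$$
where $\Theta$ denotes a random variable with law $\nu$. Note that (5) gives us that
$$\frac{1}{n-1} \sum_{j=1}^{n-1} \left( 1 - \left( r_j^{(n)} \right)^k \right) \to 0 \quad \text{almost surely},$$
since $1 - r^k \le k(1 - r)$ for $0 \le r \le 1$. So, (6) gives
$$\frac{1}{n-1} \sum_{j=1}^{n-1} \cos(2k\pi \phi_j^{(n)}) \to E(\cos(2k\pi \Theta)) \quad \text{and} \quad \frac{1}{n-1} \sum_{j=1}^{n-1} \sin(2k\pi \phi_j^{(n)}) \to E(\sin(2k\pi \Theta))$$
almost surely.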
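Then, for any trigonometric polynomial $q(x)$,
$$\frac{1}{n-1} \sum_{j=1}^{n-1} q(\phi_j^{(n)}) \to E(q(\Theta)) \quad \text{almost surely}. \qquad (7)$$
Let $f$ be a continuous real-valued function on $[0, 1]$ and fix $\epsilon > 0$. By the Stone-Weierstrass theorem ([St48]), there exists a trigonometric polynomial $q$ such that $|f - q| < \epsilon$. So,
$$\left| \frac{1}{n-1} \sum_{j=1}^{n-1} f(\phi_j^{(n)}) - E(f(\Theta)) \right| \le \left| \frac{1}{n-1} \sum_{j=1}^{n-1} (f - q)(\phi_j^{(n)}) \right| + \left| \frac{1}{n-1} \sum_{j=1}^{n-1} q(\phi_j^{(n)}) - E(q(\Theta)) \right| + \left| E(q(\Theta)) - E(f(\Theta)) \right|.$$
The first and third terms on the right hand side are each $< \epsilon$, while the second term goes to 0 almost surely, by (7). Hence
$$\frac{1}{n-1} \sum_{j=1}^{n-1} f(\phi_j^{(n)}) \to E(f(\Theta)) \quad \text{almost surely}$$
for any $f$ continuous on $[0, 1]$, and this holds for complex-valued continuous functions as well (which is easily seen by comparing the real and imaginary parts). Thus, the empirical distribution of $\phi_j^{(n)}$, $j = 1, 2, \ldots, n - 1$, converges weakly to $\nu$, which means that the empirical distribution of $\exp(2\pi i \phi_j^{(n)})$, $j = 1, 2, \ldots, n - 1$, converges weakly to $\mu$. Lemma 2.2 then gives us the desired result when $\mu$ is not uniform.
Now suppose that $\mu$ is the uniform law on $C$. By Proposition 2.1, given any $\epsilon > 0$, the proportion of indices $j$ with $r_j^{(n)} < 1 - \epsilon$ tends to 0 in probability, and the same squeeze argument gives
$$\frac{1}{n-1} \sum_{j=1}^{n-1} \left( 1 - r_j^{(n)} \right) \to 0 \quad \text{in probability}.$$
Note that the above is a slightly weaker version of (5), since the convergence is now in probability, and not almost sure.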
For the rest of the proof, we can follow the same arguments as in the non-uniform case, except that the almost sure convergence in each of the statements will be replaced by convergence in probability. Thus we shall arrive at
$$\frac{1}{n-1} \sum_{j=1}^{n-1} f(\phi_j^{(n)}) \to E(f(\Theta)) \quad \text{in probability}$$
for any continuous function $f : [0, 1] \to \mathbb{C}$. Then, as before, the empirical distribution of $\phi_j^{(n)}$, $j = 1, 2, \ldots, n - 1$, converges weakly to $\nu$ (which is the uniform law on $[0, 1]$), and so the empirical distribution of $\exp(2\pi i \phi_j^{(n)})$, $j = 1, 2, \ldots, n - 1$, converges weakly to the uniform law on $C$. Lemma 2.2 then gives us the desired result.