On a bound of Hoeffding in the complex case

It was proved by Hoeffding in 1963 that a real random variable $X$ confined to $[a, b]$ satisfies $E\,e^{X - E X} \le e^{(b-a)^2/8}$. We generalise this to complex random variables.


Introduction
A celebrated concentration inequality of Hoeffding relies on the following bound.
Lemma 1.1 ([4]). Let $X$ be a real random variable such that $a \le X \le b$. Then $1 \le E\,e^{X - E X} \le e^{(b-a)^2/8}$.
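As a quick numerical illustration of Lemma 1.1, the following minimal sketch samples random five-point distributions on $[a, b]$ and checks both inequalities. The function names, the choice of five support points and the floating-point tolerance are assumptions made for this example only; a numerical check of this kind is not a proof.

```python
import math
import random

def centered_mgf(values, probs):
    """Return E exp(X - E X) for a finitely supported real random variable."""
    mean = sum(p * v for v, p in zip(values, probs))
    return sum(p * math.exp(v - mean) for v, p in zip(values, probs))

def hoeffding_bound(a, b):
    """Upper bound exp((b - a)^2 / 8) from Lemma 1.1."""
    return math.exp((b - a) ** 2 / 8)

def check_random_distributions(a, b, trials=1000, seed=0):
    """Test 1 <= E exp(X - E X) <= bound on random 5-point laws in [a, b]."""
    rng = random.Random(seed)
    bound = hoeffding_bound(a, b)
    for _ in range(trials):
        values = [rng.uniform(a, b) for _ in range(5)]
        weights = [rng.random() + 1e-12 for _ in range(5)]  # avoid zero total
        total = sum(weights)
        probs = [w / total for w in weights]
        m = centered_mgf(values, probs)
        # Small tolerance absorbs floating-point rounding only.
        if not (1 - 1e-12 <= m <= bound + 1e-12):
            return False
    return True
```

For instance, `check_random_distributions(-1, 2)` exercises the lemma on the interval $[-1, 2]$.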
Given the hundreds of references to this inequality in the literature, we believe that a similarly tight bound for complex random variables may also have application. Indeed, our ongoing investigation of complex martingales related to multidimensional asymptotics encounters such a need. Our aim, therefore, is to find a complex analogue of Lemma 1.1.
There are several possible complex replacements for the real bounds $a \le X \le b$. A natural choice would be to confine $Z$ to a disk of given radius, but we will use the weaker condition that the support of $Z$ has bounded diameter. This measure of spread naturally arises in the study of separately Lipschitz functions [2], also called functions satisfying the bounded difference condition [1]. For a complex random variable $Z$, define the diameter of $Z$ to be
$$\operatorname{diam} Z = \inf\{c \in \mathbb{R} : P(|Z_1 - Z_2| > c) = 0\},$$
where $Z_1, Z_2$ are independent copies of $Z$, with the infimum of an empty set taken to be $\infty$. Since Hoeffding's bound states that $E\,e^{X - E X}$ is concentrated near 1, the natural complex analogue is to bound the distance of $E\,e^{Z - E Z}$ from 1. This is the nature of our main theorem.
* Research supported by the Australian Research Council.
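For a finitely supported complex random variable the diameter is simply the largest distance between two support points, and the quantity $|E\,e^{Z - E Z} - 1|$ can be evaluated directly. The sketch below assumes that every listed support point has positive probability; the function names are hypothetical.

```python
import cmath
from itertools import combinations

def diameter(support):
    """Largest distance between two support points of a finite law."""
    if len(support) < 2:
        return 0.0
    return max(abs(z - w) for z, w in combinations(support, 2))

def centered_exp_distance(support, probs):
    """Return |E exp(Z - E Z) - 1| for a finitely supported complex Z."""
    mean = sum(p * z for z, p in zip(support, probs))
    value = sum(p * cmath.exp(z - mean) for z, p in zip(support, probs))
    return abs(value - 1)
```

A constant random variable gives distance 0, and the uniform law on $\{-1, +1\}$ gives $\cosh 1 - 1$.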
Since |Z − E Z| ≤ α implies diam Z ≤ 2α, Theorem 1.1 has a simple consequence, which is a complex version of an inequality used many times in proving Azuma-type inequalities.
Note that in both the theorem and its corollary, distributions supported on $\{-1, +1\}$ are enough to show that the constants (respectively […]).
Hoeffding actually found the best possible bound on $E\,e^{X - E X}$ in the real case, as we recall in the next section. In Section 2.1 we show that for $d \le 3.12$ the same tighter bound holds in the complex case too, making use of a lemma showing that only random variables with support of at most three points need to be considered. Then in Section 2.2 we complete the proof of Theorem 1.1 for all $d$.
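The role of two-point symmetric distributions can be illustrated on the real bound of Lemma 1.1. For $X$ uniform on $\{-t, +t\}$ we have $E\,e^{X - E X} = \cosh t$ and $b - a = 2t$, so the ratio of $\log \cosh t$ to $(b-a)^2/8 = t^2/2$ measures how much of the exponent is used. The sketch below (the scaling by $t$ and the function name are assumptions of this example) shows the ratio approaching 1 as $t \to 0$, which is why the constant $1/8$ in Lemma 1.1 cannot be reduced.

```python
import math

def exponent_ratio(t):
    """log E exp(X) divided by (b - a)^2 / 8 for X uniform on {-t, +t}."""
    return math.log(math.cosh(t)) / ((2 * t) ** 2 / 8)

# The ratio stays below 1 and increases towards 1 as t shrinks.
ratios = [exponent_ratio(t) for t in (1.0, 0.1, 0.01)]
```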

Results
Hoeffding's paper [4] used the convexity of the exponential function to find the tightest possible bound in the real case. For $d > 0$, consider the random variable $X_d$ supported on $\{0, d\}$ with […] We can now state Hoeffding's bound in its strongest form.
Lemma 2.1. Let $X$ be a real random variable such that $a \le X \le b$. Then […] where the first inequality holds with equality if and only if $X = X_{b-a} + a$ almost surely.

The complex case: tight bound for small diameter
Let $Z_1, \ldots, Z_n$ be complex random variables and let $c_1, \ldots, c_n$ be nonnegative real numbers with $c_1 + \cdots + c_n = 1$. Define the mixture $Z = \operatorname{Mix}_{c_1,\ldots,c_n}(Z_1, \ldots, Z_n)$ by
$$P(Z \in A) = \sum_{j=1}^{n} c_j\, P(Z_j \in A)$$
for every measurable set $A \subseteq \mathbb{C}$. A standard property of mixtures is that
$$E\,F(Z) = \sum_{j=1}^{n} c_j\, E\,F(Z_j) \tag{2.1}$$
for any measurable function $F : \mathbb{C} \to \mathbb{C}$ for which the expectations exist.
Let $\mathcal{Z}_d$ be the class of all complex random variables $Z$ with $E Z = 0$ and $\operatorname{diam} Z \le d$, and let $\mathcal{Z}_d^{(k)}$ be the subclass of $\mathcal{Z}_d$ consisting of those variables supported on at most $k$ points.
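The linearity of expectations under mixtures can be checked on a small example. In this sketch a finite law is represented as a list of (point, probability) pairs; that representation, the helper names and the two sample laws are all assumptions made for illustration.

```python
import cmath

def expectation(F, law):
    """E F(Z) for a finite law given as a list of (point, probability) pairs."""
    return sum(p * F(z) for z, p in law)

def mixture(weights, laws):
    """Law of the mixture: each component's probabilities scaled by its weight."""
    return [(z, c * p) for c, law in zip(weights, laws) for z, p in law]

Z1 = [(-1 + 0j, 0.5), (1 + 0j, 0.5)]
Z2 = [(2j, 0.25), (-1j, 0.5), (0j, 0.25)]
weights = [0.3, 0.7]

F = lambda z: cmath.exp(z) - 1
# Expectation under the mixture versus the weighted sum of expectations.
lhs = expectation(F, mixture(weights, [Z1, Z2]))
rhs = sum(c * expectation(F, law) for c, law in zip(weights, [Z1, Z2]))
```

The two values `lhs` and `rhs` agree up to rounding error.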
Proof. This is an example of the "Carathéodory Principle", see for example [3,6,7]. Since we didn't find a statement in the literature that exactly matches our needs, we outline the proof.
First, by a simple induction, any $Z \in \mathcal{Z}_d$ with finite support can be written as a mixture of members of $\mathcal{Z}_d^{(3)}$. This is true because, for any finite set of points in $\mathbb{C}$ having the origin in its convex hull, there is a subset of three or fewer points having the origin in its convex hull. By (2.1), this implies that the lemma holds when $Z$ has finite support.
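The planar Carathéodory fact used here can be illustrated by a brute-force search. The sketch below is not the induction of the proof; it simply tries all subsets of size at most three and returns the first one whose convex hull contains the origin. The function names and the tolerance `tol` are assumptions of this example.

```python
from itertools import combinations

def cross(a, b):
    """Signed area of the parallelogram spanned by complex numbers a and b."""
    return a.real * b.imag - a.imag * b.real

def contains_origin(points, tol=1e-12):
    """Test whether 0 lies in the hull of 1, 2 or 3 non-degenerate points."""
    if len(points) == 1:
        return abs(points[0]) <= tol
    if len(points) == 2:
        a, b = points
        # 0 is on the segment iff a and b are anti-parallel (or one is 0).
        return abs(cross(a, b)) <= tol and (a * b.conjugate()).real <= tol
    a, b, c = points
    if abs(cross(b - a, c - a)) <= tol:
        return False  # collinear triple: covered by the two-point test
    s = [cross(a, b), cross(b, c), cross(c, a)]
    return all(v >= -tol for v in s) or all(v <= tol for v in s)

def small_witness(points):
    """Return a subset of size <= 3 with 0 in its convex hull, else None."""
    pts = [complex(p) for p in points]
    for k in (1, 2, 3):
        for subset in combinations(pts, k):
            if contains_origin(subset):
                return subset
    return None
```

If the origin is in the convex hull of the input points, the search always succeeds; `None` signals that the origin is outside the hull.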
For general $Z \in \mathcal{Z}_d$, we can use the continuity of $F$ to show that for any […] Allowing $\varepsilon$ to tend to 0 completes the proof.
We now return to Hoeffding's bound using Lemma 2.2 for $F(z) = e^z - 1$. Obviously $|E\,e^{Z - E Z} - 1| = 0$ if $Z$ is constant, so we need to consider the cases of 2-point and 3-point supports. Since a random variable supported on 3 collinear points is a mixture of two random variables supported on 2 points, in the case of 3 points only the non-collinear case needs to be considered.
Proof. The case of real $Z$ is treated in Lemma 2.1. More generally, since $E Z = 0$, $Z = e^{i\theta} X$ where $X$ is real. Since $E X = 0$, there are $x, x' \ge 0$ such that $X$ takes the value $-x$ with probability $x'/(x + x')$ and the value $x'$ with probability $x/(x + x')$. For any odd $k$ we can calculate that
$$E X^k = \frac{x x' \bigl((x')^{k-1} - x^{k-1}\bigr)}{x + x'}.$$
This shows that either $X$ or $-X$ has only nonnegative moments. By adding $\pi$ to $\theta$ if necessary, we assume that the former holds. Now, recalling that $E Z = E X = 0$, we can calculate […]
The case of a support of three points is considerably more difficult. To begin, define […] Since $\Re e^z \ge 0$ for $|z| \le \pi/2$ we see that $d_0 \ge \pi/2$. The actual value is almost twice as large.
Proof. By applying Lemma 2.2 to F (z) = e d − z, we see that only supports of two points or three non-collinear points need to be considered. For supports of two points, we have by definition […] There appears to be no closed form for the infimum. However, careful numerical computation shows that the infimum crosses 0 at $d = d_2 \approx 3.120491233$, which occurs when […] $= d$, $x \approx 0.636527202$ and $\theta \approx 1.9198934984$.
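The two-point computation can be approximated by a coarse grid search. The sketch below rests on several assumptions that are not taken from the text: the infimum is read as that of $\Re\, E\,e^{Z}$ over mean-zero two-point variables whose support points are at distance $d$; such a variable is parametrised by $t \in [0, 1]$ and $\theta \in [0, \pi]$, taking the value $-t d e^{i\theta}$ with probability $1 - t$ and $(1 - t) d e^{i\theta}$ with probability $t$; and the grid size and bisection bracket are arbitrary. A grid minimum only bounds the true infimum from above, so the result is a rough estimate and not a substitute for the careful computation mentioned above.

```python
import cmath
import math

def real_part_two_point(d, t, theta):
    """Re E exp(Z) for mean-zero Z on {-t*d*u, (1-t)*d*u}, u = exp(i*theta)."""
    u = cmath.exp(1j * theta)
    value = (1 - t) * cmath.exp(-t * d * u) + t * cmath.exp((1 - t) * d * u)
    return value.real

def grid_minimum(d, n=120):
    """Smallest grid value of Re E exp(Z) over t in [0,1], theta in [0,pi]."""
    return min(
        real_part_two_point(d, i / n, math.pi * j / n)
        for i in range(n + 1)
        for j in range(n + 1)
    )

def estimate_crossing(lo=3.0, hi=3.2, steps=14):
    """Bisect on d for the sign change of the grid minimum."""
    for _ in range(steps):
        mid = (lo + hi) / 2
        if grid_minimum(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2
```

Calling `estimate_crossing()` returns a value close to the quoted $d_2$; a finer grid or a local optimiser would be needed to reproduce more digits.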
By the compactness of […] $Z$ with a support of three non-collinear points $\{z_1, z_2, z_3\}$. Define $Z_{(x,y)}$ to be the random variable with the same support, but mean at $x + iy$ (thus $Z = Z_{(0,0)}$). As is well known, $x, y$ determine the probabilities at $z_1, z_2, z_3$ linearly, so for some complex constants $A, B, C$ we have
$$E\,e^{Z_{(x,y)} - E Z_{(x,y)}} = (Ax + By + C)\,e^{-x-iy} \tag{2.2}$$
valid whenever $x + iy$ lies in the convex hull of $\{z_1, z_2, z_3\}$.
In order for $Z$ to be a local minimum for the real part, the coefficients of $x$ and $y$ in (2.2) must be purely imaginary. Therefore, for some $v, w \in \mathbb{R}$, $A = C + iv$ and $B = iC + iw$. Substituting in these values and writing $C = c_0 + ic_1$ we find […] which is clearly negative for $c_0 > 0$. Thus $(x, y) = (0, 0)$ is not a local minimum for $E\,e^{Z_{(x,y)} - E Z_{(x,y)}}$, contrary to our assumption. This proves that $d_0 = d_2$.
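The second-order behaviour behind this step can be made explicit by a short expansion. What follows is a sketch under the assumption that (2.2) has the form $E\,e^{Z_{(x,y)} - E Z_{(x,y)}} = (Ax + By + C)\,e^{-x-iy}$, which is consistent with the stated conditions $A = C + iv$ and $B = iC + iw$; the abbreviation $\zeta = x + iy$ is introduced here only for convenience.

```latex
\begin{align*}
(Ax + By + C)\,e^{-\zeta}
  &= \bigl(C + C\zeta + i(vx + wy)\bigr)
     \bigl(1 - \zeta + \tfrac12 \zeta^2 - \cdots\bigr) \\
  &= C + i(vx + wy) - \tfrac12 C \zeta^2 - i(vx + wy)\,\zeta + O(|\zeta|^3), \\
\Re\bigl[(Ax + By + C)\,e^{-\zeta}\bigr]
  &= c_0 - \tfrac12 c_0\, x^2 + (c_1 + v)\, xy
     + \bigl(\tfrac12 c_0 + w\bigr) y^2 + O(|\zeta|^3).
\end{align*}
```

Along $y = 0$ the quadratic term is $-\tfrac12 c_0 x^2$, which is negative when $c_0 > 0$, so under this assumption the real part decreases in the $x$ direction and $(0, 0)$ cannot be a local minimum.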
Proof. We can rely on continuity to assume that $d < d_0$.
By […] where the support of $Z$ consists of three non-collinear points. Let $Z_{(x,y)}$ be the random variable with the same support as $Z$, but mean at $x + iy$. As in the proof of Lemma 2.4, (2.2) holds. Since we are assuming this to be a local maximum for $|E\,e^{Z} - 1|$, $A - C$ and $B - iC$ must be orthogonal to $C - 1$, i.e. real multiples of $i(C - 1)$. That is, for some $v, w \in \mathbb{R}$, $A = C + iv(C - 1)$ and $B = iC + iw(C - 1)$. Substituting in these values, writing $C = c_0 + ic_1$ for $c_0, c_1 \in \mathbb{R}$, and defining $\Delta = |C - 1|$ we can expand […]
We now show that $Q$ cannot be negative semidefinite. If $\Delta = 0$ then $E\,e^{Z} - 1 = 0$, which is clearly not a maximum, so assume $\Delta > 0$. The trace of $Q$ is $\Delta(v^2 + w^2 - 1)$, which is impossible for a negative semidefinite matrix if $v^2 + w^2 > 1$, so […] which is negative (since $d < d_0$ implies $c_0 > 0$), which is also impossible for a negative semidefinite matrix. Therefore, there is no local maximum here and the proof is complete.
We have no reason to believe that Theorem 2.1 requires the condition $d \le d_0$, and we expect that it is true for all $d$. However, the same proof is insufficient, since local maxima can occur for supports of three points. Nevertheless, we can now complete the proof of Theorem 1.1 for all $d$.

Proof of Theorem 1.1
We need two technical bounds whose uninteresting proofs are omitted, and a standard result on planar sets.
Lemma 2.6. Let $Z$ be a bounded complex random variable. Then $Z$ is almost surely confined to some closed disk of radius $\frac{1}{\sqrt{3}} \operatorname{diam} Z$.
Proof. This follows from a standard result on convex sets, see [5, Thm. 12.3] for example. An equilateral triangle shows that the constant cannot be reduced.
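The extremal role of the equilateral triangle is easy to check numerically: its smallest enclosing disk is the circumscribed disk, whose radius is the side length divided by $\sqrt{3}$. The sketch below uses the standard formula $R = abc/(4 \cdot \text{area})$; the function name and the sample side length are assumptions of this example.

```python
import math

def circumradius(a, b, c):
    """Circumradius of the triangle with complex vertices a, b, c."""
    ab, bc, ca = abs(a - b), abs(b - c), abs(c - a)
    # Triangle area from the cross product of two edge vectors.
    area = abs(((b - a).conjugate() * (c - a)).imag) / 2
    return ab * bc * ca / (4 * area)

d = 2.0  # side length, which is also the diameter of the vertex set
triangle = [0j, d + 0j, d * complex(0.5, math.sqrt(3) / 2)]
radius = circumradius(*triangle)
```

Here `radius` agrees with `d / math.sqrt(3)` up to rounding error.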
Proof of Theorem 1.1. For $d \le 3$, the theorem follows from Theorem 2.1, so we can assume that $d \ge 3$. By Lemma 2.6, there is some $a \in \mathbb{C}$ such that $|Z - a| \le \frac{1}{\sqrt{3}} d$. We will find two bounds on $|E\,Z e^{Z}|$. First we argue that […] where $\alpha, \beta$ are defined by $|a| = \frac{1}{\sqrt{3}} \alpha d$ and $|E\,e^{Z}| = 1.65\,\beta e^{d^2/8}$, and we have used that […] We can now complete the proof. We have

Conclusions
In conclusion, we note some questions that we have not answered. Our prediction that Theorem 2.1 holds for all $d$ is one of them. We can also ask for an exact description of the […] d are not always equal. However, it could be that the parts of them in the right half-plane are equal.
By considering mixtures with an identically-zero random variable, we find that the region $S_d$ is star-like from the point 1, and therefore simply connected. Lemma 2.2 applied to functions of the form $F(z) = e^{z - i\theta}$ shows that $S_d$ and S