Symmetrization of Bernoulli

Let X be a random variable. We call an independent random variable Y a symmetrizer for X if X + Y is symmetric around zero. A random variable X is said to be symmetry resistant if the variance of every symmetrizer Y is never smaller than the variance of X itself. We prove that a Bernoulli(p) random variable is symmetry resistant if and only if p ≠ 1/2. This is an old problem, proved in 1999 by Kagan, Mallows, Shepp, Vanderbei and Vardi using linear programming techniques. We reprove it here by purely probabilistic means, using the Skorokhod embedding and Itô's rule.


Introduction
Let X be a random variable. We call an independent random variable Y a symmetrizer for X if X + Y has a symmetric distribution around zero. Two simple cases immediately come to mind. One is when X itself is symmetric; in that case the constant −E(X) is a symmetrizer for X. On the other hand, for a general X, an independent random variable Y which has the same law as −X is obviously a symmetrizer. The difference between these two cases is that the symmetrizer in the former case has zero variance, while in the latter case it has the same variance as X. Thus we are led to the question: given a random variable X which is not symmetric, can one find a symmetrizer whose variance is less than that of X? If no such symmetrizer can be found, the random variable X is said to be symmetry resistant. Symmetry resistance is an interesting property which seems to be surprisingly difficult to prove, even in the simplest of cases. For example, let X be a Bernoulli(p) random variable. If p = 1/2, it is immediate that the degenerate random variable Y ≡ −1/2 is a symmetrizer for X; hence X is not symmetry resistant. However, we shall show that if p ≠ 1/2, then for any symmetrizer Y we have

    Var(Y) ≥ pq,    (1)

where q = 1 − p. Since Var(X) = pq, it is immediate from this inequality that X is symmetry resistant and that the minimum variance symmetrizer has the same variance as −X.
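The two simple symmetrizers above can be checked by direct enumeration. The following Python sketch (an illustration only; the helper names are ours) verifies that a Y with the law of −X symmetrizes a Bernoulli(p) with variance pq, and that the degenerate Y ≡ −1/2 symmetrizes the fair Bernoulli with zero variance:

```python
from fractions import Fraction
from itertools import product

def convolve(px, py):
    """Distribution of X + Y for independent X, Y given as {value: prob} dicts."""
    out = {}
    for (x, prx), (y, pry) in product(px.items(), py.items()):
        out[x + y] = out.get(x + y, 0) + prx * pry
    return out

def is_symmetric(pz):
    """Check that a law on finitely many points is symmetric around zero."""
    return all(pz.get(-z, 0) == prob for z, prob in pz.items())

def variance(pz):
    mean = sum(z * prob for z, prob in pz.items())
    return sum((z - mean) ** 2 * prob for z, prob in pz.items())

p = Fraction(3, 4)                       # an asymmetric Bernoulli parameter
X = {1: p, 0: 1 - p}

minus_X = {-1: p, 0: 1 - p}              # Y with the law of -X: always a symmetrizer
assert is_symmetric(convolve(X, minus_X))
assert variance(minus_X) == p * (1 - p)  # Var(-X) = pq

half = Fraction(1, 2)
X_fair = {1: half, 0: half}
const = {-half: Fraction(1)}             # degenerate symmetrizer, works only for p = 1/2
assert is_symmetric(convolve(X_fair, const))
assert variance(const) == 0
```

Exact rational arithmetic (`fractions.Fraction`) is used so that the symmetry checks are equalities rather than floating-point comparisons.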
The last result is the main content of a paper by Kagan, Mallows, Shepp, Vanderbei and Vardi (see [Kagan et al., 1999]), to which the reader may turn for the motivation and history of this problem. We merely reprove the result here. The previous authors discuss why characteristic functions are not helpful in determining symmetry resistance, and how symmetry resistance is independent of the decomposability of the random variable into symmetric components. Ultimately the authors use the duality theory of linear programming to prove inequality (1). In fact, the correct solution of the linear programming problem had to be first guessed from the output of linear programming software. The novelty in this paper is that we use purely probabilistic techniques, as ubiquitous as Itô's rule, to prove (1). This avoids the technicalities of linear programming and adds to the collection of weapons for attacking discrete problems through stochastic calculus. I would like to thank Prof. Larry Shepp for communicating the problem to me and for his encouragement during this work.

Proof
Proof of inequality (1). Let X be a Bernoulli(p) random variable with p ≠ 1/2, and let Y be any symmetrizer for X with finite variance. By the Skorokhod embedding of mean zero, finite variance random variables in Brownian motion, there is a stopping time τ such that, for any standard Brownian motion (i.e., starting from zero), the stopped process has the distribution of Y − E(Y). That is to say, if W is a standard Brownian motion,

    W_τ =d Y − E(Y).    (2)

Here and throughout, "=d" refers to equality in distribution. On a suitable probability space, construct a process {B_t, t ∈ [0, ∞)} such that

    B_t = B_0 + W_t,  t ≥ 0,

and

    B_0 =d X − E(X),    (3)

where B_0 is independent of the standard Brownian motion W_t. Then clearly B_t is a Brownian motion which has the initial distribution of X − E(X). Also, by equation (2), we have

    B_τ =d (X − E(X)) + (Y − E(Y)).

But, since Y is a symmetrizer of X, we should have E(X + Y) = 0, and hence

    B_τ =d X + Y.    (4)

Now, let ρ be any smooth odd function with bounded derivatives on the real line. By Itô's rule, we have

    ρ(B_t) = ρ(B_0) + M_t + (1/2) ∫_0^t ρ′′(B_s) ds,    (5)

where M_t is a martingale. Thus, by the optional sampling theorem, we have

    E(ρ(B_τ)) = E(ρ(B_0)) + (1/2) E ∫_0^τ ρ′′(B_s) ds.    (6)

Electronic Communications in Probability
Now, since ρ is an odd function and X + Y is symmetric around zero, by equation (4), E(ρ(B_τ)) = 0. Thus (6) reduces to

    E(ρ(B_0)) = −(1/2) E ∫_0^τ ρ′′(B_s) ds.    (7)

Let us now look at the RHS of (7). By conditioning on B_0, which is independent of the pair (W, τ), we have

    E ∫_0^τ ρ′′(B_s) ds = p E ∫_0^τ ρ′′(q + W_s) ds + q E ∫_0^τ ρ′′(W_s − p) ds.    (8)

Let us now impose the following restrictions on ρ (an example will soon follow):

1. |ρ′′| ≤ 1, and
2. ρ(1 + x) = −ρ(x) for all x,

in addition to the fact that ρ is odd. Thus, for any x, we have

    ρ′′(1 + x) = −ρ′′(x)  and  ρ(1 − x) = −ρ(−x) = ρ(x).

Then, since W_s − p = (q + W_s) − 1, equation (8) gives

    E ∫_0^τ ρ′′(B_s) ds = (p − q) E ∫_0^τ ρ′′(q + W_s) ds.

Also, by equation (3),

    E(ρ(B_0)) = p ρ(q) + q ρ(−p) = p ρ(q) − q ρ(p) = (p − q) ρ(q).

Substituting these values in equation (6) gives

    (p − q) ρ(q) + (1/2)(p − q) E ∫_0^τ ρ′′(q + W_s) ds = 0.    (9)

We now assume that q ≠ p; the other case, q = p, has already been discussed in the previous section. Dividing (9) by (p − q) and using the fact that |ρ′′| ≤ 1, we conclude

    ρ(q) = −(1/2) E ∫_0^τ ρ′′(q + W_s) ds ≤ (1/2) E(τ).

Thus, from (9), we get E(τ) ≥ 2ρ(q).
However, by the Skorokhod embedding, E(τ) = Var(Y). Hence

    Var(Y) ≥ 2ρ(q).    (10)

Finally, we have to exhibit such a ρ. For 0 ≤ x ≤ 1, define

    ρ(x) = x(1 − x)/2.

Extend it to the entire positive axis by the property ρ(1 + x) = −ρ(x). That is to say,

    ρ(x) = (−1)^⌊x⌋ ρ(x − ⌊x⌋),  x ≥ 0.

And extend it to the entire negative axis by the oddness of ρ. That is,

    ρ(x) = −ρ(−x),  x < 0.

The function ρ does not have a continuous second derivative. However, the set of discontinuities is just the countable set of integers, and this is sufficient for the usual Itô rule to go through; see, for example, [Karatzas and Shreve, 1991, page 219]. More importantly, |ρ′′(x)| ≤ 1 whenever x is not an integer. Hence, by equation (10), we get

    Var(Y) ≥ 2ρ(q) = q(1 − q) = pq,

which is what we claimed. This proves the result.
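Assuming the quadratic choice of ρ above, its structural properties (oddness, the antiperiodicity ρ(1 + x) = −ρ(x), and |ρ′′| ≤ 1 off the integers), together with the value 2ρ(q) = pq, can be spot-checked numerically. The following sketch is an illustration only, not part of the proof:

```python
import math

def rho(x):
    """x(1-x)/2 on [0,1], extended by rho(1+x) = -rho(x) on the positive
    axis and by oddness on the negative axis."""
    if x < 0:
        return -rho(-x)
    n = math.floor(x)
    frac = x - n
    base = frac * (1 - frac) / 2
    return base if n % 2 == 0 else -base

# Spot-check oddness and antiperiodicity on a grid.
xs = [k / 100 for k in range(-500, 500)]
for x in xs:
    assert abs(rho(x) + rho(-x)) < 1e-12       # odd
    assert abs(rho(1 + x) + rho(x)) < 1e-12    # rho(1+x) = -rho(x)

# Second difference quotient as a proxy for rho'' away from the integers.
h = 1e-4
for x in [0.1, 0.37, 0.8, 1.3, 2.6, -0.55, -1.9]:
    second = (rho(x + h) - 2 * rho(x) + rho(x - h)) / h**2
    assert abs(second) <= 1 + 1e-5             # |rho''| <= 1

# The bound (10) specializes to Var(Y) >= 2*rho(q) = q(1-q) = pq.
p = 0.3
q = 1 - p
assert abs(2 * rho(q) - p * q) < 1e-9
```

The grid checks are of course no substitute for the elementary verification done by hand above; they merely confirm that the extension formulas fit together at the integer break points.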
It is interesting to see how every inequality above becomes an equality only when τ is the Skorokhod embedding for Y = −X. This proves the uniqueness of the minimum variance symmetrizer.
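The equality case can also be seen concretely. For Y with the law of −X, the variable Y − E(Y) takes the two values −q and p with means balancing to zero, and its Skorokhod embedding is the exit time of W from the interval (−q, p), whose expectation is exactly pq. A crude random-walk approximation (step size, seed, and trial count are our choices) illustrates this numerically:

```python
import random

rng = random.Random(0)  # fixed seed for reproducibility

def exit_time(a, b, dt=1e-4):
    """First exit time of a random-walk approximation of standard Brownian
    motion (started at 0) from the interval (-a, b); steps are +/- sqrt(dt)."""
    step = dt ** 0.5
    w, t = 0.0, 0.0
    while -a < w < b:
        w += step if rng.random() < 0.5 else -step
        t += dt
    return t

p = 0.3
q = 1 - p

# Skorokhod embedding of Y - E(Y) for Y with the law of -X:
# the exit time of W from (-q, p).
trials = 500
mean_tau = sum(exit_time(q, p) for _ in range(trials)) / trials

# E(tau) should be close to pq = Var(-X), where inequality (1) is an equality.
print(round(mean_tau, 3), p * q)
```

The Monte Carlo average of τ hovers around pq = 0.21 here, matching the variance of the minimum variance symmetrizer.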