ELECTRONIC COMMUNICATIONS in PROBABILITY

A QUESTION ABOUT THE PARISI FUNCTIONAL

We conjecture that the Parisi functional in the Sherrington-Kirkpatrick model is convex in the functional order parameter. We prove a partial result that shows convexity along "one-sided" directions. An interesting consequence of this result is the log-convexity of the L^m norm for a class of random variables.


A problem and some results.
Let M be the set of all nondecreasing and right-continuous functions m : [0, 1] → [0, 1]. Let us consider two smooth convex functions Φ, ξ : R → R, both symmetric, Φ(−x) = Φ(x) and ξ(−x) = ξ(x), with Φ(0) = ξ(0) = 0. We will also assume that Φ is of moderate growth so that all integrals below are well defined. Given m ∈ M, consider a function Φ(q, x) for q ∈ [0, 1], x ∈ R such that Φ(1, x) = Φ(x) and

∂Φ/∂q = −(ξ''(q)/2) ( ∂^2Φ/∂x^2 + m(q) (∂Φ/∂x)^2 ).   (1.1)

Let us consider the functional P : M → R defined by P(m) = Φ(0, h) for some fixed h ∈ R. Main question: Is P a convex functional on M?
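For constant m the equation (1.1) can be linearized by an exponential (Cole-Hopf) substitution; we record the one-line verification here, assuming the standard form of the Parisi equation (1.1) stated above:

```latex
% For constant m, let g(q,x) = e^{m\Phi(q,x)}. Then
\[
  g_x = m\,\Phi_x\, g, \qquad
  g_{xx} = m\bigl(\Phi_{xx} + m\,\Phi_x^2\bigr) g, \qquad
  g_q = m\,\Phi_q\, g,
\]
% so (1.1) holds for $\Phi$ if and only if $g$ satisfies the linear equation
\[
  \frac{\partial g}{\partial q} \;=\; -\,\frac{\xi''(q)}{2}\,\frac{\partial^2 g}{\partial x^2}.
\]
```

This is the observation used below to solve (1.1) explicitly when m is a step function.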
The same question was asked in [7]. Unfortunately, despite considerable effort, we were not able to give a complete answer to this question. In this note we present a partial result that shows convexity along the directions λm + (1 − λ)n when m(q) ≥ n(q) for all q ∈ [0, 1]. It is possible that the answer to this question lies in some general principle that we are not aware of. A good starting point would be to find an alternative proof of the simplest case of constant m given in Corollary 1 below.
The functional P arises in the Sherrington-Kirkpatrick mean field model where, with the choice of Φ(x) = log cosh x, the Parisi formula

inf_{m ∈ M} ( log 2 + P(m) − (1/2) ∫_0^1 m(q) q ξ''(q) dq )   (1.2)

gives the free energy of the model. A rigorous proof of this result was given by Michel Talagrand in [5]. Since the last term is a linear functional of m, convexity of P(m) would imply the uniqueness of the functional order parameter m(q) that minimizes (1.2). A particular case of ξ(x) = β^2 x^2/2 for β > 0 would also be of interest since it corresponds to the original SK model [2].
In the case when m is a step function, the solution of (1.1) can be written explicitly, since for a constant m the function g(q, x) = exp mΦ(q, x) satisfies the heat equation

∂g/∂q = −(ξ''(q)/2) ∂^2 g/∂x^2.

Given k ≥ 1, let us consider two sequences

0 = q_0 ≤ q_1 ≤ … ≤ q_k ≤ q_{k+1} = 1 and 0 ≤ m_0 ≤ m_1 ≤ … ≤ m_k ≤ 1.

We will denote m = (m_0, …, m_k) and q = (q_0, …, q_{k+1}). Let us define a function m ∈ M by

m(q) = m_l for q_l ≤ q < q_{l+1}, 0 ≤ l ≤ k.   (1.3)

For this step function, P(m) can be defined as follows. Let us consider a sequence of independent Gaussian random variables (z_l)_{0 ≤ l ≤ k} such that

E z_l^2 = ξ'(q_{l+1}) − ξ'(q_l).

Define Φ_{k+1}(x) = Φ(x) and, recursively for l = k, k − 1, …, 0, define

Φ_l(x) = (1/m_l) log E_l exp m_l Φ_{l+1}(x + z_l),   (1.4)

where E_l denotes the expectation in (z_i)_{i ≥ l}, and in the case of m_l = 0 this means Φ_l(x) = E_l Φ_{l+1}(x + z_l). Then P(m) for m in (1.3) is given by

P(m) = P_k(m, q) = Φ_0(h).   (1.5)

For simplicity of notations, we will sometimes omit the dependence of P_k on q and simply write P_k(m). Let us consider another sequence n = (n_0, …, n_k) such that 0 ≤ n_0 ≤ n_1 ≤ … ≤ n_k ≤ 1. The following is our main result.
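The recursion (1.4)-(1.5) is easy to evaluate numerically. The following sketch (not from the paper; the choices Φ(x) = log cosh x and ξ(x) = β^2 x^2/2 are the illustrative assumptions named in the text, so that E z_l^2 = β^2 (q_{l+1} − q_l)) computes P_k(m, q) with Gauss-Hermite quadrature for the Gaussian expectations E_l:

```python
import numpy as np
from numpy.polynomial.hermite import hermgauss

# Nodes/weights for integrals of the form \int e^{-t^2} f(t) dt.
NODES, WEIGHTS = hermgauss(40)

def log_cosh(x):
    # Numerically stable log cosh(x).
    return abs(x) + np.log1p(np.exp(-2.0 * abs(x))) - np.log(2.0)

def gauss_expect(f, var):
    """E f(z) for z ~ N(0, var), by Gauss-Hermite quadrature."""
    if var == 0.0:
        return f(0.0)
    s = np.sqrt(2.0 * var)
    return sum(w * f(s * t) for w, t in zip(WEIGHTS, NODES)) / np.sqrt(np.pi)

def parisi_P(m, q, h, beta=1.0):
    """P_k(m, q) = Phi_0(h) via the recursion (1.4), for
    m = (m_0, ..., m_k) and q = (q_0, ..., q_{k+1})."""
    def phi(l, x):
        if l == len(m):                       # Phi_{k+1}(x) = Phi(x)
            return log_cosh(x)
        var = beta**2 * (q[l + 1] - q[l])     # E z_l^2 = xi'(q_{l+1}) - xi'(q_l)
        if m[l] == 0.0:                       # m_l = 0: plain expectation
            return gauss_expect(lambda z: phi(l + 1, x + z), var)
        return np.log(gauss_expect(
            lambda z: np.exp(m[l] * phi(l + 1, x + z)), var)) / m[l]
    return phi(0, h)
```

As a sanity check, two consecutive equal values m_l = m_{l+1} collapse to a single level (by Gaussian additivity), and for m = (1,) one gets the closed form log cosh h + β^2/2.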
Theorem 1 If n_j ≤ m_j for all j, or n_j ≥ m_j for all j, then for all λ ∈ [0, 1],

P_k(λm + (1 − λ)n) ≤ λ P_k(m) + (1 − λ) P_k(n).   (1.6)

Remark. In Theorem 1 one does not have to assume that the coordinates of the vectors m and n are bounded by 1 or arranged in increasing order. The proof requires only slight modifications, which for simplicity will be omitted.
Since the functional P is uniformly continuous on M with respect to the L^1 norm (see [1] or [7]), approximating any function by step functions implies that P is convex along the directions λm + (1 − λ)n when m(q) ≥ n(q) for all q ∈ [0, 1]. Of course, (1.6) implies that P_k(m) is convex in each coordinate. This yields an interesting consequence for the simplest case of a constant function m(q) = m, which formally corresponds to the case of k = 2, 0 = m_0 ≤ m ≤ m_2 = 1 and 0 = q_0 = q_1 ≤ q_2 = q_3 = 1.
In this case,

P(m) = f(m) := (1/m) log E exp mΦ(h + σz),

where z is a standard Gaussian random variable. Here σ^2 = ξ'(1) can be made arbitrary by the choice of ξ. (1.6) implies the following.

Corollary 1 The function f(m) = m^{-1} log E exp mΦ(h + σz) is convex in m.
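Corollary 1 is easy to probe numerically. The sketch below (an illustration, not part of the paper; Φ = log cosh and the values of h and σ are arbitrary choices) checks that the second differences of f(m) on a uniform grid are nonnegative:

```python
import numpy as np
from numpy.polynomial.hermite import hermgauss

nodes, weights = hermgauss(60)

def f(m, h=0.5, sigma=1.2):
    """f(m) = m^{-1} log E exp(m * log cosh(h + sigma z)), z standard Gaussian,
    computed by Gauss-Hermite quadrature."""
    x = h + sigma * np.sqrt(2.0) * nodes      # quadrature points for N(h, sigma^2)
    vals = np.cosh(x) ** m                    # exp(m * log cosh x)
    return np.log(np.dot(weights, vals) / np.sqrt(np.pi)) / m

ms = np.linspace(0.05, 1.0, 20)
fs = np.array([f(m) for m in ms])
# Second differences of a convex function on a uniform grid are nonnegative.
second_diffs = fs[2:] - 2.0 * fs[1:-1] + fs[:-2]
print(second_diffs.min())
```

For m = 1, h = 0, σ = 1 the quadrature reproduces the closed form f(1) = log E cosh(z) = 1/2, which serves as an accuracy check.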
Corollary 1 implies that the L^m norm of exp Φ(h + σz) is log-convex in m. This is a stronger statement than the well-known consequence of Hölder's inequality that the L^m norm is always log-convex in 1/m. At this point it does not seem obvious how to give an easier proof, even in the simplest case of Corollary 1, than the one we give below. For example, it is not clear how to show directly that f''(m) ≥ 0. Finally, let us note some interesting consequences of the convexity of f(m). First, f''(0) ≥ 0 implies that the third cumulant of η = Φ(h + σz) is nonnegative,

E(η − Eη)^3 ≥ 0.

Another interesting consequence of Corollary 1 is the following. If we define by continuity f(0) = Eη = EΦ(h + σz) and write λ = λ · 1 + (1 − λ) · 0, then convexity of f(m) implies

f(λ) ≤ λf(1) + (1 − λ)f(0), i.e. log E exp λ(η − Eη) ≤ λ^2 log E exp(η − Eη).   (1.9)

If A = log E exp(η − Eη) < ∞ then Chebyshev's inequality and (1.9) imply that

P(η − Eη ≥ t) ≤ exp(λ^2 A − λt) for all λ ∈ [0, 1],

and minimizing over λ ∈ [0, 1] we get

P(η − Eη ≥ t) ≤ exp(−t^2/(4A)) for 0 ≤ t ≤ 2A.   (1.10)

This result can be slightly generalized.
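The minimization over λ in Chebyshev's bound is a short computation; we record it here for completeness (a standard sketch, with the case split for large t an assumption about the intended final form):

```latex
% For t \ge 0 and \lambda \in [0,1], Chebyshev's inequality and (1.9) give
%   P(\eta - E\eta \ge t) \le e^{-\lambda t}\, E e^{\lambda(\eta - E\eta)}
%                         \le \exp(\lambda^2 A - \lambda t).
% The exponent is minimized at \lambda = t/(2A), which lies in [0,1] iff t \le 2A:
\[
P(\eta - E\eta \ge t) \;\le\;
\begin{cases}
\exp\!\bigl(-t^2/(4A)\bigr), & 0 \le t \le 2A \quad (\lambda = t/(2A)),\\[2pt]
\exp(A - t) \le \exp(-t/2), & t > 2A \quad (\lambda = 1,\ \text{using } A \le t/2).
\end{cases}
\]
```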
Corollary 2 If η = Φ(|h + z|) for some h ∈ R^n and standard Gaussian z ∈ R^n then the function m^{-1} log E exp mη is convex in m and, thus, (1.9) and (1.10) hold.
The proof follows along the lines of the proof of Corollary 1 (or Theorem 1 in the simplest case of Corollary 1) and will be omitted.
The proof of Theorem 1 will be based on the following observations. First of all, we will compute the derivative of P_k with respect to q_l. We will need the following notations: let

Z_l = h + z_0 + … + z_{l−1} and X_l = Φ_l(Z_l) for 0 ≤ l ≤ k + 1,

and let W_l = exp m_l (X_{l+1} − X_l) for 0 ≤ l ≤ k, so that E_l W_l = 1. Then the following holds.
Proof. The proof can be found in Lemma 3.6 in [7] (with slightly different notations).
It turns out that the function U_l is nondecreasing in each m_j, and this is the main ingredient in the proof of Theorem 1.

Theorem 2 For all l and j, ∂U_l/∂m_j ≥ 0.
First, let us show how Lemma 1 and Theorem 2 imply Theorem 1.
Proof of Theorem 1. Let us assume that n_j ≤ m_j for all j ≤ k; the opposite case can be handled similarly. If we define

m^l = (n_0, …, n_l, m_{l+1}, …, m_k),

we will prove that (2.4) holds, which, obviously, will prove Theorem 1. Let us consider the vectors

m^l_+ = (n_0, …, n_l, m_l, m_{l+1}, …, m_k) and q^l(t) = (q_0, …, q_l, q_{l+1}(t), q_{l+1}, …, q_{k+1}),

where q_{l+1}(t) = q_l + t(q_{l+1} − q_l). Notice that we inserted one coordinate into the vectors m^l and q. For 0 ≤ t ≤ 1, we consider ϕ(t) = P_{k+1}(m^l_+, q^l(t)).
It is easy to see that ϕ(t) interpolates between ϕ(1) = P_k(m^l) and ϕ(0) = P_k(m^{l−1}). By Lemma 1, ϕ'(t) can be expressed in terms of the function U_{l+1} defined in terms of m^l_+ and q^l(t). Next, let us consider the perturbed vector m^l_ε and the corresponding function ϕ_ε(t) = P_{k+1}(m^l_ε, q^l(t)). Again, by Lemma 1, ϕ'_ε(t) is expressed in terms of U^ε_{l+1}, defined in terms of m^l_ε and q^l(t). It is obvious that for ε ∈ [0, 1] each coordinate of m^l_ε is not smaller than the corresponding coordinate of m^l_+ and, therefore, Theorem 2 implies that U^ε_{l+1} ≥ U_{l+1}, which gives the corresponding comparison between ϕ'_ε(t) and ϕ'(t). Letting ε → 0 implies (2.4) and this finishes the proof of Theorem 1.

Proof of Theorem 2.
Let us start by proving some preliminary results. Consider two classes of (smooth enough) functions:

C = { f : f(−x) = f(x), f(x) ≥ 0, and f'(x) ≥ 0 for x ≥ 0 },

the nonnegative symmetric functions nondecreasing on [0, ∞), and

C' = { f : f(−x) = −f(x) and f'(x) ≥ 0 },

the antisymmetric nondecreasing functions. The next lemma describes several facts that will be useful in the proof of Theorem 2.
Proof. (a) Since Φ_{k+1} is convex, symmetric and nonnegative, Φ_l(x) is convex, symmetric and nonnegative by induction on l in (1.4). Convexity is a consequence of Hölder's inequality, and the symmetry follows from the symmetry of Φ_{l+1} and the symmetry of the Gaussian distribution. Obviously, this implies that Φ_l(x) ∈ C and Φ'_l(x) ∈ C'.
(b) Let z'_l be an independent copy of z_l and, for simplicity of notations, let σ^2 = Ez_l^2. Since E_l V_l = 1 (i.e. we can think of V_l as a change of density), we can write the quantity of interest in the form (3.3). If we make the change of variables s = x + z_l and t = x + z'_l, then the right-hand side of (3.3) can be written as the double integral (3.4). We will split the region of integration {s ≥ t} = Ω_1 ∪ Ω_2 in the last integral into two disjoint sets

Ω_1 = {s ≥ t, s + t ≥ 0} and Ω_2 = {s ≥ t, s + t < 0}.

In the integral over Ω_2 we make the change of variables s = −v, t = −u, so that for (s, t) ∈ Ω_2 we have (u, v) ∈ Ω_1 and ds dt = du dv. Also, since Φ_l is symmetric by (a), f_1 ∈ C, f_2 ∈ C and, therefore, the integral over Ω_2 can be combined with the integral over Ω_1, and (3.4) can be rewritten as (3.5); the latter holds because x ≥ 0 and s + t ≥ 0 on Ω_1. This proves that (3.5), (3.4) and, therefore, the right-hand side of (3.3) are nonnegative.
(d) Take f ∈ C. Positivity of E_l V_l f(x + z_l) is obvious, and symmetry follows as in (b) from the symmetry of Φ_l, Φ_{l+1}, f and the Gaussian distribution. The derivative in x can be written as a sum of two terms, I + II. By (a), Φ'_{l+1} ∈ C', and since f ∈ C, (b) implies that II ≥ 0. The fact that I ≥ 0 for x ≥ 0 follows from (c), because f'(−x) = −f'(x) and f'(x) ≥ 0 for x ≥ 0.
(e) Take f ∈ C'. Antisymmetry of E_l V_l f(x + z_l) follows from the symmetry of V_l and the antisymmetry of f. As in (d), the derivative can be written as a sum of two terms, I + II. First of all, I ≥ 0 because f' ≥ 0 for f ∈ C'. As in (3.3), we can write II as a symmetrized expectation over two independent copies. But both f and Φ'_{l+1} are in the class C' and, therefore, both are nondecreasing which, obviously, implies that they are similarly ordered, i.e. for all a, b ∈ R,

(f(a) − f(b))(Φ'_{l+1}(a) − Φ'_{l+1}(b)) ≥ 0,

and as a result II ≥ 0. (f) Symmetry of g(x) = E_l V_l log V_l follows as above, and positivity follows from Jensen's inequality, the convexity of x log x and the fact that E_l V_l = 1. Since Φ_{l+1} ∈ C and Φ'_{l+1} ∈ C', (b) implies that g'(x) ≥ 0 for x ≥ 0 and, therefore, g ∈ C.
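The "similarly ordered" steps above all rest on Chebyshev's correlation inequality: if f and g are similarly ordered, then under any change of density V with E V = 1 one has E V f g ≥ (E V f)(E V g), by symmetrizing over two independent copies exactly as in (3.3). A quick Monte Carlo sanity check (the density and the test functions below are arbitrary illustrative choices, not objects from the proof):

```python
import numpy as np

rng = np.random.default_rng(0)
z = rng.normal(size=200_000)

# An arbitrary positive change of density V, normalized so that E V = 1.
V = np.cosh(z) ** 0.7
V /= V.mean()

# Two nondecreasing (hence similarly ordered) functions.
f = np.tanh(z)
g = z ** 3

# Symmetrization gives E V(s)V(t)(f(s)-f(t))(g(s)-g(t)) >= 0, which is
# equivalent to the correlation inequality below.
lhs = np.mean(V * f * g)
rhs = np.mean(V * f) * np.mean(V * g)
print(lhs >= rhs)
```

The same mechanism, applied conditionally level by level, is what drives the estimates in the proof of Theorem 2 below.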
Proof of Theorem 2.
We will consider two separate cases.

Case 1. j ≤ l − 1. First of all, using Lemma 2 (a) we can rewrite U_l in a form suitable for differentiation in m_j. For p ≤ j the corresponding factors depend on m_j, while for p > j, X_p does not depend on m_j. This leads to the representation (3.9) of ∂U_l/∂m_j. Let us first show that (3.10) holds. Since X_j does not depend on z_j and E_j W_j = 1, this is equivalent to an inequality involving f_j and X_{j+1}, which are both functions of Z_{j+1} = Z_j + z_j. Since, by (3.8), f_l(Z_l) seen as a function of Z_l is in C, applying Lemma 2 (d) inductively we get that f_j(Z_{j+1}) seen as a function of Z_{j+1} is also in C. By Lemma 2 (a), X_{j+1} seen as a function of Z_{j+1} is also in C. Therefore, f_j and X_{j+1} are similarly ordered,
and, therefore, using the same trick as in (3.3), we get (3.11) and, hence, (3.10). By Lemma 2 (d), E_j W_j f_j(Z_{j+1}) seen as a function of Z_j is in C, and by Lemma 2 (f), E_j W_j (X_{j+1} − X_j) = m_j^{-1} E_j W_j log W_j seen as a function of Z_j is also in C. Therefore, they are similarly ordered and the same argument applies again. Combining this with (3.10) implies that the corresponding terms are nonnegative. Since m_j = Σ_{p ≤ j} (m_p − m_{p−1}), this and (3.9) imply that ∂U_l/∂m_j ≥ 0, which completes the proof of Case 1.
Case 2. j ≥ l. In this case a straightforward calculation, similar to the one leading to (3.9), gives the representation (3.12) of ∂U_l/∂m_j. To show that it is nonnegative, we will show that the last term, with the factor 2m_j, is bigger than all the other terms with negative factors. If we denote by h the function appearing in these terms then, since Φ' ∈ C', using Lemma 2 (e) inductively, we get that h(Z_{j+1}) seen as a function of Z_{j+1} is in C'. Each term in the third line of (3.12) (without the factor 2(m_p − m_{p−1})) can be rewritten as (3.13); the term in the second line of (3.12) (without the factor (2m_l − m_{l−1})) is equal to (3.13) for p = l, and the term in the fourth line (without 2m_j) can be written as (3.14). We will show that (3.14) is bigger than (3.13) for l ≤ p ≤ j. This is rather straightforward using Lemma 2. Notice that g_l = g_l(Z_l) seen as a function of Z_l is in C' by Lemma 2 (a). If we define, for l ≤ p ≤ j, the inner expectations r_p in (3.13) and r in (3.14), then the difference of (3.14) and (3.13) is (3.15). Using an argument similar to (3.6) (and several other places above), it should be clear that r_p(−Z_l) = −r_p(Z_l), since the X_i's are symmetric and h is antisymmetric. Similarly, r(−Z_l) = −r(Z_l). Therefore, if we can show that

r(Z_l) − r_p(Z_l) ≥ 0 for Z_l ≥ 0,   (3.16)

then, since g_l ∈ C', we would get that g_l(Z_l)(r(Z_l) − r_p(Z_l)) ≥ 0 for all Z_l, and this would prove that (3.15) is nonnegative. Let us first show that (3.16) holds for p = j.
In this case, since X_j does not depend on z_j and, therefore, E_j W_j X_j = X_j, (3.16) is equivalent to (3.17) for Z_l ≥ 0. Let us define the function Δ_j accordingly. As above, Δ_j(−Z_j) = −Δ_j(Z_j) and, by Lemma 2 (b), Δ_j(Z_j) ≥ 0 for Z_j ≥ 0, since h ∈ C' and X_{j+1} ∈ C. Therefore, by Lemma 2 (c), the functions Δ_i(Z_i) := E_i W_i Δ_{i+1}(Z_i + z_i), defined by induction for i ≥ l, satisfy Δ_i(−Z_i) = −Δ_i(Z_i) and Δ_i(Z_i) ≥ 0 for Z_i ≥ 0. For i = l this proves (3.17) and, therefore, (3.16) for p = j. Next, we will show that (3.18) holds for all l ≤ p < j, and this, of course, will prove (3.16). If we define

f_1(Z_{p+1}) = E_{p+1} W_{p+1} ⋯ W_j h(Z_{j+1}) and f_2(Z_{p+1}) = E_{p+1} W_{p+1} ⋯ W_j (X_{j+1} − X_j),

then (3.18) can be rewritten in terms of f_1 and f_2. Since h(Z_{j+1}) ∈ C', recursive application of Lemma 2 (e) implies that f_1(Z_{p+1}) ∈ C'. Since E_j W_j (X_{j+1} − X_j) = m_j^{-1} E_j W_j log W_j, seen as a function of Z_j, is in C by Lemma 2 (f), recursive application of Lemma 2 (d) implies that f_2(Z_{p+1}) ∈ C. If we now define Δ_p accordingly then, as above, Δ_p(−Z_p) = −Δ_p(Z_p) and, by Lemma 2 (b), Δ_p(Z_p) ≥ 0 for Z_p ≥ 0, since f_1 ∈ C' and f_2 ∈ C. Therefore, by Lemma 2 (c),

Δ_{p−1}(Z_{p−1}) := E_{p−1} W_{p−1} Δ_p(Z_{p−1} + z_{p−1}) ≥ 0 if Z_{p−1} ≥ 0,

and, easily, Δ_{p−1}(−Z_{p−1}) = −Δ_{p−1}(Z_{p−1}). Therefore, if for i ≥ l we define Δ_i(Z_i) := E_i W_i Δ_{i+1}(Z_i + z_i), we can proceed by induction to show that Δ_i(−Z_i) = −Δ_i(Z_i) and Δ_i(Z_i) ≥ 0 for Z_i ≥ 0. For i = l this proves (3.18). Thus, we finally proved that (3.14) is bigger than (3.13) for l ≤ p ≤ j. To prove that (3.12) is nonnegative, it remains to show that each term in the first line of (3.12) (without the factor −(m_p − m_{p−1})) is smaller than (3.14). Clearly, it is enough to show that

E W_1 ⋯ W_{l−1} f_l E_p W_p ⋯ W_j (X_{j+1} − X_j) ≤ E W_1 ⋯ W_{l−1} f_l E_l W_l ⋯ W_j (X_{j+1} − X_j),   (3.19)

since the right-hand side of (3.19) is equal to (3.13) for p = l, which was already shown to be smaller than (3.14). The proof of (3.19) can be carried out using the same argument as in the proof of (3.10) in Case 1, and this finishes the proof of Case 2.