Free Exponential Families as Kernel Families

Free exponential families have been previously introduced as a special case of the q-exponential family. We show that free exponential families arise also from a procedure analogous to the definition of exponential families by using the Cauchy-Stieltjes kernel instead of the exponential kernel. We use this approach to re-derive several known results and to study further similarities with exponential families and reproductive exponential models.


Introduction
Since the seminal work of Voiculescu [23], there has been a flurry of activity on how properties of free convolution µ ⊞ ν of probability measures are similar to and how they differ from properties of classical convolution µ * ν. In particular, free probability analogues of the Central Limit Theorem, of the Poisson limit theorem, and of the Lévy-Khinchin representation of ⊞-infinitely divisible laws are now known, see [13]. Further analogies between free and classical probability are developed in [4,5]. In this paper we study a free probability analogue of the concept of exponential family.
Free exponential families were introduced in [10, Definition 4.1] as part of a study of the relations between approximation operators, classical exponential families and their q-deformations. An alternative approach to free exponential families, which we adopt in this paper, emphasizes similarities to classical exponential families and is based on the idea of a kernel family introduced in [25]. We show that the two approaches are closely related, and that every non-degenerate compactly supported probability measure generates a free exponential family, see Theorem 3.1. We then relate variance functions of free exponential families to free cumulants. This relation is simpler than the corresponding relation for classical exponential families and is expressed by a concise formula. We apply the formula to compute free cumulants of the "free gamma" law, which were stated without proof in [9], to derive simple necessary conditions for a smooth function to be the variance function of a free exponential family, and to investigate similarities with classical dispersion models [15].

Cauchy-Stieltjes Kernel Families
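According to Wesołowski [25], the kernel family generated by a kernel $k(x,\theta)$ consists of the probability measures
$$\left\{ \frac{k(x,\theta)}{L(\theta)}\,\nu(dx) : \theta\in\Theta \right\},$$
where $L(\theta) = \int k(x,\theta)\,\nu(dx)$ is the normalizing constant, and $\nu$ is the generating measure.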
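Definition 2.1. Suppose $\nu$ is a compactly supported non-degenerate (i.e. not a point mass) probability measure. Let
$$M(\theta) = \int \frac{1}{1-\theta x}\,\nu(dx).$$
The Cauchy-Stieltjes family generated by $\nu$ is the family of probability measures
$$\mathcal{K}(\nu;\Theta) = \left\{ P_\theta(dx) = \frac{1}{M(\theta)(1-\theta x)}\,\nu(dx) : \theta\in\Theta \right\}, \tag{2.1}$$
where $\Theta \ni 0$ is an open set on which $M(\theta)$ is well defined, strictly positive and $\theta\,\mathrm{supp}(\nu) \subset (-\infty, 1)$. (We shall only consider $\Theta = (-\varepsilon, \varepsilon)$ with $\varepsilon > 0$ small enough.)

Our first goal is to show that the Cauchy-Stieltjes family is essentially the same concept as the concept of free exponential family introduced in [10]. We begin with a suitable reparametrization of $\mathcal{K}(\nu;\Theta)$.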
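2.1. Parameterizations by the mean. From (2.1) we compute the mean $m(\theta) = \int x\,P_\theta(dx)$. Since $P_0 = \nu$ we get $m(0) = \int x\,\nu(dx) = m_0$, and for $\theta \ne 0$ a calculation gives
$$m(\theta) = \frac{M(\theta) - 1}{\theta M(\theta)}.$$
Since $M(0) = 1$ and $M(\theta)$ is analytic at $\theta = 0$, we see that $m(\theta)$ is analytic for $|\theta|$ small enough. We have
$$m'(\theta) = \frac{M(\theta) + \theta M'(\theta) - (M(\theta))^2}{\theta^2 (M(\theta))^2}.$$
Since $\nu$ is non-degenerate,
$$M(\theta) + \theta M'(\theta) = \int \frac{1}{(1-\theta x)^2}\,\nu(dx) > \left( \int \frac{1}{1-\theta x}\,\nu(dx) \right)^2 = (M(\theta))^2$$
for all $|\theta| > 0$ small enough. Thus the function $\theta \mapsto m(\theta)$ is increasing on an open interval containing 0. Denoting by $\psi$ the inverse function, we are thus led to a parametrization of a subset of $\mathcal{K}(\nu;\Theta)$ by the mean,
$$\left\{ Q_m(dx) = P_{\psi(m)}(dx) : m \in (A,B) \right\}. \tag{2.4}$$
The variance function of the Cauchy-Stieltjes family (2.4) is
$$V(m) = \int (x-m)^2\,Q_m(dx).$$

3. Relation to free exponential families

The following generalizes slightly [10, Section 4]; note that this definition is not constructive: for a given $V$, the corresponding free exponential family may fail to exist, see Example 3.2.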
Definition 3.1. The free exponential family with variance function $V$ generated by a compactly supported measure $\nu$ with mean $m_0 \in (A,B)$ is a family of probability measures
$$\mathcal{F}(V) = \left\{ \frac{V(m)}{V(m) + (m-m_0)(m-x)}\,\nu(dx) : m \in (A,B) \right\}.$$

The next result shows that Cauchy-Stieltjes kernel families under parametrization by the mean are essentially the same as free exponential families, thus providing an existence argument for free exponential families. Furthermore, the generating measure $\nu$ is determined uniquely by $m_0$ and the variance function $V(m)$; the latter is an analog of the classical uniqueness theorem for exponential families, see [15, Theorem 2.11] or [17, Proposition 2.2].
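The claimed equivalence can be checked by hand on a finitely supported measure. The following is a minimal sketch, not taken from the paper: the three-atom measure and the value of θ are arbitrary illustrative choices. It assumes that the kernel density is $1/(M(\theta)(1-\theta x))$ with $M(\theta) = \int (1-\theta x)^{-1}\nu(dx)$, and that the density in Definition 3.1 is $V(m)/(V(m) + (m-m_0)(m-x))$. Exact rational arithmetic is used so that the identities can be compared with `==`.

```python
from fractions import Fraction as F

# Hypothetical generating measure nu (atoms and weights chosen only for illustration).
atoms = [F(-1), F(1, 2), F(2)]
weights = [F(3, 10), F(1, 5), F(1, 2)]

def integrate(f, density=lambda x: F(1)):
    """Integral of f(x) * density(x) against the discrete measure nu."""
    return sum(w * density(x) * f(x) for x, w in zip(atoms, weights))

# theta must satisfy theta * x < 1 on the support, here -1 < theta < 1/2.
theta = F(1, 5)
M = integrate(lambda x: 1 / (1 - theta * x))

def kernel_density(x):
    """Assumed Cauchy-Stieltjes kernel density dP_theta / dnu."""
    return 1 / (M * (1 - theta * x))

m0 = integrate(lambda x: x)                                  # mean of nu
total_mass = integrate(lambda x: F(1), kernel_density)       # should be 1
m = integrate(lambda x: x, kernel_density)                   # mean of P_theta
V = integrate(lambda x: (x - m) ** 2, kernel_density)        # variance of P_theta

def free_exp_density(x):
    """Assumed density of the free exponential family at mean m."""
    return V / (V + (m - m0) * (m - x))

mean_formula = (M - 1) / (theta * M)
variance_formula = (m - m0) * (1 / theta - m)
densities_agree = all(kernel_density(x) == free_exp_density(x) for x in atoms)
```

In this sketch `m` coincides with `mean_formula`, `V` with `variance_formula`, and the two densities agree at every atom, which is the finite-dimensional content of the reparametrization by the mean.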
Recall that the Cauchy-Stieltjes transform of a probability measure $\nu$ is
$$G_\nu(z) = \int \frac{1}{z-x}\,\nu(dx).$$
If $\nu$ is compactly supported then $G_\nu$ is analytic in a neighborhood of $\infty$ in the complex plane; in particular, compactly supported measures are determined uniquely by $G_\nu(z)$ for large enough real $z$.
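Conversely, if $V$ is analytic and strictly positive in a neighborhood of $m_0$, and there is a probability measure $\nu$ with mean $m_0$ such that the (positive) measures
$$Q_m(dx) = \frac{V(m)}{V(m) + (m-m_0)(m-x)}\,\nu(dx)$$
are probability measures for all $m$ in a neighborhood of $m_0$, then $\nu$ is compactly supported, non-degenerate, and is determined uniquely by
$$G_\nu(z) = \frac{m-m_0}{V(m)}, \qquad z = m + \frac{V(m)}{m-m_0}. \tag{3.5}$$

Proof. We first calculate the variance
$$v(\theta) = \int (x - m(\theta))^2\,P_\theta(dx) = (m(\theta) - m_0)\left( \frac{1}{\theta} - m(\theta) \right).$$
Since $m(\theta)$ is analytic at $\theta = 0$ and $m(0) = m_0$, this shows that $v(\theta)$ is analytic at $\theta = 0$. Let $V(m) = v(\psi(m))$ denote the variance function in parametrization of (a subset of) $\mathcal{K}$ by the mean; clearly $V$ is an analytic function in a neighborhood of $m = m_0$.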
To prove the converse implication, note that for $m \ne m_0$ such that $V(m) > 0$ we can re-write $\int Q_m(dx) = 1$ as
$$\int \frac{1}{m + \frac{V(m)}{m-m_0} - x}\,\nu(dx) = \frac{m-m_0}{V(m)}.$$
Thus with
$$z = m + \frac{V(m)}{m-m_0}$$
we get (3.5). Since $\lim_{m\to m_0^{\pm}} V(m)/(m-m_0) = \pm\infty$, this shows that the Cauchy-Stieltjes transform $G_\nu(z)$ is defined for all real $z$ with $|z|$ large enough. This implies that $\nu$ has compact support, with moments that are uniquely determined from the corresponding moment generating function $M(z) = \frac{1}{z} G_\nu(1/z)$ for $z$ small enough. (Compactness of support is also proved more directly in the proof of Theorem 3.3.) Finally, $\nu$ is non-degenerate as its variance is $V(m_0) > 0$.
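The relation between $G_\nu$ and $V$ used in this proof can be tested on a discrete example. The sketch below is illustrative only: the measure is a hypothetical three-atom law, and the code assumes the identity $G_\nu(z) = (m-m_0)/V(m)$ at $z = m + V(m)/(m-m_0)$, with $m$ and $V$ computed from the kernel family at several values of θ.

```python
from fractions import Fraction as F

# Hypothetical generating measure nu, used only to illustrate the identity.
atoms = [F(-1), F(1, 2), F(2)]
weights = [F(3, 10), F(1, 5), F(1, 2)]
m0 = sum(w * x for x, w in zip(atoms, weights))

def mean_and_variance(theta):
    """Mean and variance of P_theta(dx) = nu(dx) / (M(theta) (1 - theta x))."""
    M = sum(w / (1 - theta * x) for x, w in zip(atoms, weights))
    p = [w / (M * (1 - theta * x)) for x, w in zip(atoms, weights)]
    m = sum(pi * x for x, pi in zip(atoms, p))
    V = sum(pi * (x - m) ** 2 for x, pi in zip(atoms, p))
    return m, V

def cauchy_stieltjes(z):
    """G_nu(z) = integral of 1/(z - x) nu(dx) for the discrete measure."""
    return sum(w / (z - x) for x, w in zip(atoms, weights))

# Nonzero parameters with theta * x < 1 on the support, i.e. -1 < theta < 1/2.
thetas = [F(-1, 2), F(-1, 10), F(1, 5), F(2, 5)]
identity_holds = []
z_is_reciprocal = []
for theta in thetas:
    m, V = mean_and_variance(theta)
    z = m + V / (m - m0)
    identity_holds.append(cauchy_stieltjes(z) == (m - m0) / V)
    z_is_reciprocal.append(z == 1 / theta)
```

In this example the point $z$ comes out equal to $1/\theta$, which matches the relation $M(z) = \frac{1}{z}G_\nu(1/z)$ between the normalizing constant and the Cauchy-Stieltjes transform.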
Thus a necessary condition for V to be a variance function is that

3.1. Free exponential families with quadratic variance function. In this section we recall [10, Theorem 4.2]; since manuscript [10] is available in preprint form only and we have already set up all identities needed for the proof, we include the argument, which is taken from [10]. The corresponding result for classical exponential families is [

where the discrete part of ν is absent except for the following cases:
with the sign opposite to the sign of a.
(iii) if −1 ≤ b < 0 then there are two atoms at

This Cauchy-Stieltjes transform corresponds to the free Meixner law (3.8), see [1,21].
(v) the free analog of hyperbolic type law if b > 0 and a² < 4b; see [1, Theorem 4];
(vi) the free binomial type law if −1 ≤ b < 0; see [21, Example 3.4] and [9, Proposition 2.1].

The laws in (i)-(v) are infinitely divisible with respect to free additive convolution (we recall the definition near (3.17)). In [1, Theorem 4] they appear in connection to martingale polynomials with respect to free Lévy processes; free infinite divisibility is analyzed also in [21]; [2] studies further free probability aspects of this family; in [9, Theorem 3.2] the same laws appear as a solution to a quadratic regression problem in free probability; in [11, Theorem 4.3] these laws occur in a "classical regression" problem.

Free Cumulants and Variance Functions.
Recall that if $\nu$ is a compactly supported measure with the Cauchy-Stieltjes transform $G_\nu$, then the inverse function $K_\nu(z) = G_\nu^{-1}(z)$ exists for small enough $z \ne 0$, see [24]. The R-transform is defined as
$$R_\nu(z) = K_\nu(z) - \frac{1}{z}$$
and is analytic at $z = 0$,
$$R_\nu(z) = \sum_{n=1}^{\infty} c_n(\nu)\, z^{n-1}.$$
The coefficients $c_n = c_n(\nu)$ are called free cumulants of measure $\nu$, see [22].

Proof. Suppose that $V$ determines the free exponential family generated by a compactly supported measure $\nu$. For $m \ne m_0$ close enough to $m_0$, from (3.5) and (3.7) we get
$$K_\nu\!\left( \frac{m-m_0}{V(m)} \right) = m + \frac{V(m)}{m-m_0}.$$
Thus, (3.10) says that the R-transform of $\nu$ satisfies
$$R_\nu\!\left( \frac{m-m_0}{V(m)} \right) = m. \tag{3.13}$$
Suppose now that a probability measure $\nu$ satisfies (3.12) and $\int x\,\nu(dx) = m_0$. Then the variance $c_2(\nu) = V(m_0) > 0$, so $\nu$ is non-degenerate. We first verify that $\nu$ has compact support. Since $V$ is analytic, (3.12) implies that $|c_n(\nu)| \le M^n$ for some $M < \infty$ and all $n \ge 1$, and $\nu$ has compact support. From $\mathrm{supp}(\nu) \subset [-4M, 4M]$ we deduce that the Cauchy-Stieltjes transform $G_\nu(z)$ is analytic for $|z| > 4M$, and the R-series is analytic for all $|z|$ small enough.
Since $V(m) > 0$ for $m$ close enough to $m_0$, taking the derivative we see that $z \mapsto (z - m_0)/V(z)$ is increasing in a neighborhood of $z = m_0$. Denoting by $h$ the inverse, we have
$$\frac{d^n}{dz^n} h(z)\Big|_{z=0} = \frac{d^{n-1}}{dx^{n-1}} (V(x))^n \Big|_{x=m_0}, \quad n \ge 1. \tag{3.14}$$
From $c_1(\nu) = m_0$ we see that $R(0) = m_0 = h(0)$. By (3.14), we see that all derivatives of $h$ at $z = 0$ match the derivatives of $R$. Thus $h(z) = R(z)$ and (3.13) holds for all $m$ in a neighborhood of $m_0$. For analytic $G_\nu$, the latter is equivalent to (3.4) holding for all $m$ close enough to $m_0$. Thus $V(m)$ is the variance function of a free exponential family generated by $\nu$ with $m \in (m_0 - \delta, m_0 + \delta)$ for some $\delta > 0$.
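The relation between $V$ and the free cumulants can be illustrated with exact arithmetic. The sketch below makes two assumptions that should be read as such: that (3.12) is the Lagrange inversion formula $c_{n+1}(\nu) = \frac{1}{n!}\frac{d^{n-1}}{dx^{n-1}}(V(x))^n\big|_{x=m_0}$, which is the coefficient of $t^{n-1}$ in $V(m_0+t)^n$ divided by $n$, and that moments are recovered from free cumulants through the standard recursion over non-crossing partitions. The helper names are hypothetical. The test case is $V(m) = 1 - m^2$ with $m_0 = 0$, which is the variance function of the Cauchy-Stieltjes family generated by the symmetric two-point law on $\{-1, 1\}$, since there $m(\theta) = \theta$ and $v(\theta) = 1 - \theta^2$.

```python
from fractions import Fraction

def series_mul(p, q, order):
    """Product of two power series (coefficient lists), truncated at t**order."""
    out = [Fraction(0)] * (order + 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            if i + j <= order:
                out[i + j] += a * b
    return out

def cumulants_from_variance(v_coeffs, m0, n_max):
    """Return [c_1, ..., c_{n_max}], assuming c_{n+1} = [t^(n-1)] V(m0 + t)^n / n.

    v_coeffs are the Taylor coefficients of V around m0.
    """
    cumulants = [Fraction(m0)]
    power = [Fraction(1)]
    for n in range(1, n_max):
        power = series_mul(power, v_coeffs, n_max)   # V^n, truncated
        cumulants.append(power[n - 1] / n)
    return cumulants

def moments_from_free_cumulants(cumulants, n_max):
    """Return [m_0, ..., m_{n_max}] from free cumulants c_1, c_2, ...

    Uses m_n = sum_{s=1..n} c_s * [z^(n-s)] (sum_i m_i z^i)^s with m_0 = 1.
    """
    moments = [Fraction(1)]
    for n in range(1, n_max + 1):
        total = Fraction(0)
        power = [Fraction(1)]
        for s in range(1, n + 1):
            power = series_mul(power, moments, n)    # (moment series)^s
            total += cumulants[s - 1] * power[n - s]
        moments.append(total)
    return moments

# V(m) = 1 - m^2 around m0 = 0.
c = cumulants_from_variance([1, 0, -1], 0, 8)
moments = moments_from_free_cumulants(c, 8)
```

Under these assumptions the recovered moments alternate between 0 and 1, as they should for the symmetric law on $\{-1, 1\}$.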
We now use (3.12) to relate certain free cumulants to Catalan numbers.
This fact was stated without proof in [9, Remark 5.7]; the approach indicated there leads to a relatively long proof.

Corollary 3.5. $V(m) = 1/(1-m)$ is a variance function of a free exponential family generated by the centered ⊞-infinitely divisible measure $\nu$ with free cumulants
$$c_{k+1}(\nu) = \frac{1}{k}\binom{2k-2}{k-1}, \quad k \ge 1.$$
Proof. From (3.12),
$$c_{k+1}(\nu) = \frac{1}{k!}\frac{d^{k-1}}{dx^{k-1}}\frac{1}{(1-x)^{k}}\Big|_{x=0} = \frac{1}{k}\binom{2k-2}{k-1}.$$
It is well known that Catalan numbers are even moments of the semicircle law, dx corresponds to ⊞-infinitely divisible law, see [13, Theorem 3.3.6]. Thus Catalan numbers $c_{k+1}$ with $c_1 = 0$ are indeed free cumulants of some ⊞-infinitely divisible measure $\nu$.
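The Catalan numbers in Corollary 3.5 can be recovered mechanically. The sketch below again assumes that (3.12) reads $c_{n+1} = [t^{n-1}]\,V(t)^n / n$ for $m_0 = 0$, and works with the truncated power series of $V(m) = 1/(1-m)$, whose Taylor coefficients are all equal to 1. The variable names are illustrative.

```python
from fractions import Fraction
from math import comb

def series_mul(p, q, order):
    """Product of two power series (coefficient lists), truncated at t**order."""
    out = [Fraction(0)] * (order + 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            if i + j <= order:
                out[i + j] += a * b
    return out

order = 8
v_coeffs = [Fraction(1)] * (order + 1)       # 1/(1 - t) = 1 + t + t^2 + ...

# cumulants[k] holds c_k; the measure is centered, so c_1 = 0.
cumulants = {1: Fraction(0)}
power = [Fraction(1)]
for n in range(1, order + 1):
    power = series_mul(power, v_coeffs, order)   # V^n, truncated
    cumulants[n + 1] = power[n - 1] / n          # assumed form of (3.12)

# Catalan numbers C_0, C_1, ..., computed independently from binomials.
catalan = [comb(2 * k, k) // (k + 1) for k in range(order)]
matches = all(cumulants[n + 1] == catalan[n - 1] for n in range(1, order + 1))
```

With this indexing $c_{n+1}$ equals the Catalan number $C_{n-1}$, so that $c_2 = V(0) = 1$.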
It is known that not every function V is a variance function of a natural exponential family. It is therefore not surprising that not every analytic function V can serve as the variance function of a free exponential family.

(i) There exists a centered ⊞-infinitely divisible probability measure ν such that V is the variance function of a free exponential family generated by ν.
(ii) There exists a compactly supported probability measure ω such that

The Cauchy-Schwarz inequality applied to the right hand side of the Lévy-Khinchin formula (3.18) implies $(V^3)''/6 \ge ((V^2)')^2/4$. This gives a simple necessary condition.
Corollary 3.8. If V is analytic at 0, V (0) = 1, V ′′ (0) < 0 then V cannot be the variance function of a free exponential family generated by a centered ⊞-infinitely divisible measure.
We remark that the bound is sharp: from Theorem 3.2 we see that V (m) = 1 is a variance function of the free exponential family generated by the semicircle law; all of its members are infinitely divisible, see Example 4.1.

Example 3.3 (Compare [21, Theorem 3.2]). If $b < 0$ then $V(m) = 1 + am + bm^2$ cannot be the variance function of a free exponential family generated by a centered ⊞-infinitely divisible measure.
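The necessary condition $(V^3)''/6 \ge ((V^2)')^2/4$ can be evaluated directly for a quadratic $V$. The following sketch, with hypothetical helper names, computes both sides at $m = 0$ from the polynomial coefficients of $V(m) = 1 + am + bm^2$; the difference of the two sides reduces to $b$, which is why the condition fails exactly when $b < 0$.

```python
from fractions import Fraction

def poly_mul(p, q):
    """Product of two polynomials given as coefficient lists (lowest degree first)."""
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def condition_sides(a, b):
    """Return ((V^3)''(0)/6, ((V^2)'(0))^2/4) for V(m) = 1 + a m + b m^2."""
    V = [Fraction(1), Fraction(a), Fraction(b)]
    V2 = poly_mul(V, V)
    V3 = poly_mul(V2, V)
    lhs = 2 * V3[2] / 6          # second derivative at 0 is 2 * [m^2] V^3
    rhs = V2[1] ** 2 / 4         # first derivative at 0 is [m^1] V^2
    return lhs, rhs

lhs_neg, rhs_neg = condition_sides(1, Fraction(-1, 2))   # b < 0
lhs_pos, rhs_pos = condition_sides(1, Fraction(1, 2))    # b > 0
```

For $b = -1/2$ the left-hand side falls below the right-hand side, in line with Example 3.3, while for $b = 1/2$ the inequality holds.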
3.3. Reproductive property. Natural exponential families have two "reproductive" properties. The first one is usually not named, and says that if a compactly supported measure ν generates a natural exponential family F = F(ν) and µ ∈ F(ν), then F(µ) = F. This is usually interpreted as a statement that the natural exponential family F is determined solely by the variance function V and can have many generating measures.
The analog of this property fails for free exponential families due to the fact that the generating measure is determined uniquely by the variance function and parameter $m_0$. For example, a free exponential family F generated by the centered semicircle law consists of the affine transformations of the Marchenko-Pastur laws, and for $m_0 \ne 0$ the free exponential family generated by µ ∈ F with mean $m_0$ contains no other measures in common with F except for µ.
The second property, which in [15, (3.16)] is indeed called the reproductive property of an exponential family, states that if $\mu \in F(V)$, then for all $n \in \mathbb{N}$ the law of the sample mean, $D_n(\mu^{*n})$, is in $F(V/n)$. Here $D_r(\mu)(U) := \mu(rU)$ denotes the dilation of measure $\mu$ by a number $r \ne 0$; in probabilistic language, if $\mathcal{L}(X) = \mu$ then $\mathcal{L}(X/r) = D_r(\mu)$.
Our goal is to prove an analogue of this result for the Cauchy-Stieltjes families.
Let $\mu^{\boxplus r}$ denote the r-fold free additive convolution of µ with itself. In contrast to classical convolution, this operation is well defined for all real r ≥ 1, see [19]. Moreover, if for each λ > 0, there is a neighborhood of $m_0$ such that V/λ is a variance function of some free exponential family, then ν is ⊞-infinitely divisible.
We note that in contrast to classical natural exponential families, the neighborhood of m 0 where m → V (m)/λ is a variance function may vary with λ, see Example 4.1.
Proof. Combining (3.12) with $R_{aX+b}(z) = b + aR_X(az)$, we see that the free cumulants of $\nu_\lambda$ are $c_1(\nu_\lambda) = c_1(\nu) = m_0$ and for $n \ge 1$
$$c_{n+1}(\nu_\lambda) = \frac{1}{\lambda^{n}}\, c_{n+1}(\nu).$$
Theorem 3.3 implies that $V/\lambda$ is the variance function of the free exponential family generated by $\nu_\lambda$. If $\nu_{1/n}$ exists for all $n \in \mathbb{N}$, then the first part of the proposition together with the uniqueness theorem (Theorem 3.1) implies that $\nu = (D_n(\nu_{1/n}))^{\boxplus n}$, proving ⊞-infinite divisibility.
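The cumulant bookkeeping in this proof can be sketched as follows. The code relies on three assumptions that are not spelled out in the text above and are labeled here as such: that $\nu_\lambda$ stands for $D_\lambda(\nu^{\boxplus\lambda})$, that $\lambda$-fold free convolution multiplies every free cumulant by $\lambda$, and that (3.12) reads $c_{n+1} = [t^{n-1}]\,V(m_0+t)^n / n$. The polynomial used for $V$ is a hypothetical choice made only to exercise the algebra.

```python
from fractions import Fraction

def series_mul(p, q, order):
    """Product of two power series (coefficient lists), truncated at t**order."""
    out = [Fraction(0)] * (order + 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            if i + j <= order:
                out[i + j] += a * b
    return out

def cumulants_from_variance(v_coeffs, m0, n_max):
    """Return [c_1, ..., c_{n_max}], assuming c_{n+1} = [t^(n-1)] V(m0 + t)^n / n."""
    cumulants = [Fraction(m0)]
    power = [Fraction(1)]
    for n in range(1, n_max):
        power = series_mul(power, v_coeffs, n_max)
        cumulants.append(power[n - 1] / n)
    return cumulants

lam = Fraction(3)
m0 = Fraction(1, 3)
# Hypothetical Taylor coefficients of V around m0 (illustration only).
v_coeffs = [Fraction(1), Fraction(1, 2), Fraction(-1, 4)]
n_max = 7

c = cumulants_from_variance(v_coeffs, m0, n_max)

# Route 1: cumulants computed from the rescaled variance function V / lam.
from_scaled_variance = cumulants_from_variance([a / lam for a in v_coeffs], m0, n_max)

# Route 2: convolution power multiplies c_k by lam, and the dilation X -> X / lam
# divides c_k by lam**k, as follows from R_{aX+b}(z) = b + a R_X(a z).
from_convolution = [lam * ck / lam ** k for k, ck in enumerate(c, start=1)]
```

Both routes give the same list, with the first cumulant left at $m_0$ and $c_{n+1}$ divided by $\lambda^n$.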

Marchenko-Pastur Approximation
Let
$$\omega_{a,\sigma}(dx) = \frac{\sqrt{4\sigma^2 - (x-a)^2}}{2\pi\sigma^2}\, 1_{|x-a| \le 2\sigma}\, dx$$
denote the semicircle law of mean $a$ and variance $\sigma^2$. Up to affine transformations, this is the free Meixner law which appears in Theorem 3.2 as the law which generates the free exponential family $\mathcal{F}_a(V)$ with the variance function $V \equiv \sigma^2$. Following the analogy with natural exponential families, family $\mathcal{F}_0(\sigma^2)$ can be thought of as a free exponential analog of the normal family. Somewhat surprisingly, this family does not contain all semicircle laws, but instead it contains affine transformations of the (absolutely continuous) Marchenko-Pastur laws.
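A quick numerical sanity check of the constant variance function is possible with the standard library alone. The sketch below assumes that the members of the family generated by the centered semicircle law have density $\sigma^2/(\sigma^2 + m(m-x))$ with respect to that law, which is the form of Definition 3.1 with $m_0 = 0$ and $V \equiv \sigma^2$. It integrates with a midpoint rule after the substitution $x = 2\sigma\cos t$; the function names and the number of nodes are illustrative choices.

```python
import math

def semicircle_integral(f, sigma=1.0, n=4000):
    """Midpoint-rule integral of f against the centered semicircle law of variance sigma^2.

    With x = 2 sigma cos(t), the law becomes (2 / pi) sin(t)^2 dt on (0, pi).
    """
    h = math.pi / n
    total = 0.0
    for k in range(n):
        t = (k + 0.5) * h
        x = 2.0 * sigma * math.cos(t)
        total += f(x) * (2.0 / math.pi) * math.sin(t) ** 2
    return total * h

sigma = 1.0
m = 0.5                      # needs |m| < sigma so that the density stays positive

def density(x):
    """Assumed density of the family member with mean m, relative to the semicircle law."""
    return sigma ** 2 / (sigma ** 2 + m * (m - x))

mass = semicircle_integral(density, sigma)
mean = semicircle_integral(lambda x: x * density(x), sigma)
variance = semicircle_integral(lambda x: (x - m) ** 2 * density(x), sigma)
```

Under these assumptions the total mass is 1, the mean is $m$ and the variance stays equal to $\sigma^2$, up to quadrature error.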
We remark that Biane [7] analyzes $f \mapsto g(m) := \int f(x)\,\pi_{m,\lambda}(dx)$ as a mapping of the appropriate Hilbert spaces for complex $m$.
We have the following analogue of [15,Theorem 3.4].
To prove Theorem 4.1 we will use the following analogue of Mora's Theorem, see [

Proof. Let $\nu_n$ be the generating measure for $\mathcal{F}_{m_0}(V_n)$. Since $V_n(z) \to V(z)$ uniformly in a neighborhood of $m_0$, from (3.15) we see that the cumulants $c_{k+1}(\nu_n)$ converge as $n \to \infty$ and $\sup_n |c_{k+1}(\nu_n)| \le M^k$ for some $M < \infty$. Therefore the R-transforms of $\nu_n$ converge to the R-transform of a compactly supported measure $\nu$. Thus $\nu_n \xrightarrow{\mathcal{D}} \nu$, and the supports of $\nu_n$ are uniformly bounded in $n$, i.e., $\mathrm{supp}(\nu_n) \subset [-A, A]$ for some $0 < A < \infty$. By decreasing the value of $\delta$ we can also ensure that the densities in (3.4) are bounded as functions of $x \in [-A, A]$ uniformly in $n$. So the integrals converge, and $\nu$ indeed generates a free exponential family with variance $V$ in a neighborhood of $m_0$.
Of course, every compactly supported mean-zero measure $\nu$ is an element of the Cauchy-Stieltjes family that it generates. Since $\pi_{0,1/\sigma^2} = \omega_{0,\sigma}$ is the semicircle law, combining Proposition 3.9 with Theorem 4.1 we get the following Free Central Limit Theorem; see [8,23].