Multidimensional q-Normal and related distributions - Markov case

We define and study distributions in R^d that we call q-Normal. For q = 1 they are ordinary multidimensional Normal; for q ∈ (−1, 1) they have densities and compact support, and many of their properties resemble those of the ordinary multidimensional Normal distribution. We also consider some generalizations of these distributions, indicate a close relationship between them and the Askey-Wilson weight function, i.e. the weight with respect to which the Askey-Wilson polynomials are orthogonal, and prove some properties of this weight function. In particular we prove a generalization of the Poisson-Mehler expansion formula.


Introduction
The aim of this paper is to define, analyze and possibly 'accustom' new distributions in R^d. They are defined with the help of two one-dimensional distributions that appeared relatively recently, partially in a noncommutative context, and are defined through infinite products. That is why it is difficult to analyze them directly using ordinary calculus; one has to refer, to some extent, to the notation and results of so-called q-series theory.
However, the distributions we are going to define and examine have a purely commutative, classical probabilistic meaning. They first appeared in an excellent paper of Bożejko et al. [4] as a by-product of the analysis of a certain non-commutative model. Later they also appeared in the purely classical context of so-called one-dimensional random fields, first analyzed by W. Bryc et al. in [1] and [3]. From these papers we can deduce much information about these distributions. In particular we are able to indicate the sets of polynomials that are orthogonal with respect to the measures defined by these distributions. These are the so-called q-Hermite and Al-Salam-Chihara polynomials, generalizations of well-known families of polynomials. Thus in particular we know all moments of the discussed one-dimensional distributions.
What is interesting about the distributions discussed in this paper is that many of their properties resemble the corresponding properties of the normal distribution. As stated in the title, we consider three families of distributions; however, the properties of one of them, called multidimensional q-Normal, are the main subject of the paper. The properties of the remaining two are in fact only sketched.
All distributions considered in this paper have densities and are parametrized by several parameters. One of these parameters, called q, belongs to (−1, 1], and for q = 1 the distributions considered in this paper become ordinary normal. Two out of the three families of distributions defined in this paper have the property that all their marginals belong to the same class as the joint distribution, which is one of the important properties of the normal distribution. The conditional distributions considered in this paper have the property that the conditional expectation of a polynomial is again a polynomial of the same order, one of the basic properties of normal distributions. The distributions considered in this paper satisfy the Gebelein inequality, a property discovered first in the normal-distribution context. Furthermore, as in the normal case, lack of correlation between the components of the random vectors considered in the paper leads to independence of these components. Finally, the conditional distribution f_C(x|y, z) considered in this paper can be expanded in a series of the form f_C(x|y, z) = f_M(x) Σ_{i=0}^∞ h_i(x) g_i(y, z), where f_M is a marginal density, {h_i} are the orthogonal polynomials of f_M, and the g_i(y, z) are also polynomials. In particular, if instead of the conditional distribution of X given (Y, Z) we consider only the conditional distribution of X given Y, then g_i(y) is proportional to h_i(y). In this case the expansion formula is the so-called Poisson-Mehler formula, a generalization of the formula with h_i being the ordinary Hermite polynomials and f_M(x) = exp(−x²/2)/√(2π), which appeared first in the normal-distribution context.
On the other hand, one of the conditional distributions that can be obtained with the help of the distributions considered in this paper is in fact a re-scaled and normalized (that is, multiplied by a constant so that its integral is equal to 1) Askey-Wilson weight function. Hence we are able to prove some properties of this Askey-Wilson density. In particular we will obtain a generalization of the Poisson-Mehler expansion formula for this density.
To define briefly and swiftly the one-dimensional distributions that will later be used to construct multidimensional generalizations of normal distributions, let us define the following sets:

S(q) = [−2/√(1−q), 2/√(1−q)] for −1 < q < 1, S(1) = R, S(−1) = {−1, 1}.

Let us set also, for m = (m_1, …, m_d) ∈ R^d, m + S(q) = ×_{i=1}^d [m_i − 2/√(1−q), m_i + 2/√(1−q)]. Sometimes, to simplify notation, we will use so-called indicator functions I_A(x), equal to 1 for x ∈ A and to 0 otherwise. The two one-dimensional distributions (in fact families of distributions) are given by their densities. The first one has density

f_N(x|q) = (√(1−q) (q)_∞ / (2π)) √(4 − (1−q)x²) ∏_{k=1}^∞ ((1+q^k)² − (1−q)x²q^k) I_{S(q)}(x),

defined for |q| < 1, x ∈ R, where (q)_∞ = ∏_{k=1}^∞ (1−q^k). We will set also f_N(x|1) = exp(−x²/2)/√(2π). For q = −1 the considered distribution does not have a density; it is discrete, with two equal mass points at S(−1). Since this case leads to non-continuous distributions, we will not analyze it in the sequel.
The fact that this definition is reasonable, i.e. that the distribution defined by f_N(x|q) tends to the normal N(0, 1) distribution as q → 1−, will be justified in the sequel. The distribution defined by f_N(x|q), −1 < q ≤ 1, will be referred to as the q-Normal distribution.
The second distribution has density

f_CN(x|y, ρ, q) = f_N(x|q) ∏_{k=0}^∞ (1 − ρ²q^k) / ((1 − ρ²q^{2k})² − (1−q)ρq^k(1 + ρ²q^{2k})xy + (1−q)ρ²q^{2k}(x² + y²)),

defined for |q| < 1, |ρ| < 1, x ∈ R, y ∈ S(q). It will be referred to as the (y, ρ, q)-Conditional Normal distribution. For q = 1 we set f_CN(x|y, ρ, 1) = exp(−(x−ρy)²/(2(1−ρ²)))/√(2π(1−ρ²)) (in the sequel we will justify this choice). Notice that we have f_CN(x|y, 0, q) = f_N(x|q) for all y ∈ S(q). The simplest example of a multidimensional density that can be constructed from these two distributions is the two-dimensional density g(x, y|ρ, q) = f_CN(x|y, ρ, q) f_N(y|q), which will be referred to in the sequel as N_2(0, 0, 1, 1, ρ|q). Below we give some examples of plots of these densities. One can see from these pictures how large and versatile this family of distributions is. It has compact support equal to S(q) × S(q) and two parameters, one playing a rôle similar to that of the parameter ρ in the two-dimensional Normal distribution. The other parameter, q, has a different rôle: in particular it is responsible for the modality of the distribution and of course it defines its support.
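The two densities are easy to handle numerically by truncating their infinite products. The sketch below is ours (truncation level, grid size and names are illustrative, not from the paper), and writes out the product forms of f_N and f_CN explicitly as we read them from [4] and [1]; it verifies that both integrate to 1 over S(q) and that E(X|Y = y) = ρy, the regression property used throughout the paper.

```python
import numpy as np

K = 80  # truncation level for the infinite products; q^K is negligible for |q| <= 0.8

def f_N(x, q):
    """q-Normal density on S(q) = [-2/sqrt(1-q), 2/sqrt(1-q)] (truncated product)."""
    x = np.asarray(x, dtype=float)
    s = np.clip(4.0 - (1.0 - q) * x**2, 0.0, None)  # vanishes outside S(q)
    prod_, qinf = np.ones_like(x), 1.0
    for k in range(1, K + 1):
        qk = q**k
        qinf *= 1.0 - qk                      # builds (q)_infinity
        prod_ *= (1.0 + qk)**2 - (1.0 - q) * x**2 * qk
    return np.sqrt(1.0 - q) * qinf / (2.0 * np.pi) * np.sqrt(s) * prod_

def f_CN(x, y, rho, q):
    """(y, rho, q)-Conditional Normal density (truncated product)."""
    x, y = np.broadcast_arrays(np.asarray(x, dtype=float), np.asarray(y, dtype=float))
    ratio = np.ones_like(x)
    for k in range(K):
        qk, q2k = q**k, q**(2 * k)
        den = ((1.0 - rho**2 * q2k)**2
               - (1.0 - q) * rho * qk * (1.0 + rho**2 * q2k) * x * y
               + (1.0 - q) * rho**2 * q2k * (x**2 + y**2))
        ratio = ratio * (1.0 - rho**2 * qk) / den
    return f_N(x, q) * ratio

q, rho, y = 0.5, 0.4, 0.7
a = 2.0 / np.sqrt(1.0 - q)
xs, dx = np.linspace(-a, a, 4000, endpoint=False, retstep=True)
xs = xs + dx / 2                              # midpoint rule over S(q)
mass_N = float(np.sum(f_N(xs, q)) * dx)       # should be close to 1
mass_CN = float(np.sum(f_CN(xs, y, rho, q)) * dx)
cond_mean = float(np.sum(xs * f_CN(xs, y, rho, q)) * dx)  # should be close to rho*y
```

With these sample parameters all three checks agree with the theory to several decimal places; the denominator in f_CN is strictly positive on S(q) × S(q) for |ρ| < 1, so no special handling is needed.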
As stated above, the distribution defined by f_N(x|q) appeared in 1997 in [4], in an essentially non-commutative context. It turns out to be important both for classical and noncommutative probabilists as well as for physicists. This distribution has been 'accustomed', i.e. an equivalent form of the density and methods of simulating i.i.d. sequences drawn from it are presented e.g. in [18]. The distribution f_CN, although known earlier in a nonprobabilistic context, appeared (as an important probability distribution) in the paper of W. Bryc [1], in a classical context, as a conditional distribution of a certain Markov sequence. In the following section we will briefly recall basic properties of these distributions as well as of the so-called q-Hermite polynomials (a generalization of the ordinary Hermite polynomials). To do this we have to refer to the notation and some of the results of q-series theory.
The paper is organized as follows. In Section 2, after recalling some results of q-series theory, we present the definition of the multivariate q-Normal distribution. The following section presents the main results. The last section contains the lengthy proofs of the results of the previous section.
2. Definition of multivariate q-Normal and some related distributions

2.1. Auxiliary results. We will use the traditional notation of q-series theory, i.e.

(a; q)_0 = 1, (a; q)_n = ∏_{k=0}^{n−1} (1 − aq^k), (a; q)_∞ = ∏_{k=0}^∞ (1 − aq^k), (q)_n = (q; q)_n,

[n]_q = 1 + q + … + q^{n−1}, [n]_q! = ∏_{k=1}^n [k]_q, and the q-binomial coefficient (n choose k)_q = [n]_q!/([k]_q! [n−k]_q!).
It is easy to notice that (q)_n = (1 − q)^n [n]_q!. Let us also introduce two functionals defined on functions g : R → C,

‖g‖_L² = ∫_{S(q)} g²(x) f_N(x|q) dx, ‖g‖_CL² = ∫_{S(q)} g²(x) f_CN(x|y, ρ, q) dx,

and the sets L(q) = {g : ‖g‖_L < ∞} and CL(y, ρ, q) = {g : ‖g‖_CL < ∞}. The spaces (L(q), ‖·‖_L) and (CL(y, ρ, q), ‖·‖_CL) are Hilbert spaces with the usual definition of the scalar product. Let us also define the following two sets of polynomials:

- the q-Hermite polynomials defined by

(2.1)   H_{n+1}(x|q) = x H_n(x|q) − [n]_q H_{n−1}(x|q),

for n ≥ 0, with H_{−1}(x|q) = 0, H_0(x|q) = 1, and

- the so-called Al-Salam-Chihara polynomials defined by the relationship, for n ≥ 0:

P_{n+1}(x|y, ρ, q) = (x − ρy q^n) P_n(x|y, ρ, q) − (1 − ρ²q^{n−1}) [n]_q P_{n−1}(x|y, ρ, q),

with P_{−1}(x|y, ρ, q) = 0, P_0(x|y, ρ, q) = 1.
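Both families can be evaluated directly from their three-term recurrences, H_{n+1} = xH_n − [n]_q H_{n−1} and P_{n+1} = (x − ρyq^n)P_n − (1 − ρ²q^{n−1})[n]_q P_{n−1}, in the normalization used in [1] and [3]. A minimal sketch (helper names are ours):

```python
def qnum(n, q):
    """[n]_q = 1 + q + ... + q^(n-1)."""
    return sum(q**k for k in range(n))

def H(n, x, q):
    """q-Hermite H_n(x|q): H_{n+1} = x H_n - [n]_q H_{n-1}, H_{-1} = 0, H_0 = 1."""
    h_prev, h = 0.0, 1.0
    for k in range(n):
        h_prev, h = h, x * h - qnum(k, q) * h_prev
    return h

def P(n, x, y, rho, q):
    """Al-Salam-Chihara P_n(x|y,rho,q):
    P_{n+1} = (x - rho*y*q^n) P_n - (1 - rho^2 q^(n-1)) [n]_q P_{n-1}."""
    p_prev, p = 0.0, 1.0
    for k in range(n):
        # for k = 0 the second term vanishes since [0]_q = 0
        coef = (1.0 - rho**2 * q**(k - 1)) * qnum(k, q) if k >= 1 else 0.0
        p_prev, p = p, (x - rho * y * q**k) * p - coef * p_prev
    return p
```

At q = 1 the first recurrence reduces to that of the probabilists' Hermite polynomials (e.g. H_3(x|1) = x³ − 3x), while P_1(x|y, ρ, q) = x − ρy, consistent with the regression property E(X|Y = y) = ρy.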
Polynomials (2.1) satisfy the following very useful identity, originally formulated for the so-called continuous q-Hermite polynomials h_n (it can be found e.g. in [7], Thm. 13.1.5) and presented below for the polynomials H_n using the relationship H_n(x|q) = h_n(√(1−q) x/2 | q)/(1−q)^{n/2}:

(2.4)   H_n(x|q) H_m(x|q) = Σ_{j=0}^{min(n,m)} (n choose j)_q (m choose j)_q [j]_q! H_{n+m−2j}(x|q).

It is known (see e.g. [1]) that the q-Hermite polynomials constitute an orthogonal base of L(q), while from [3] one can deduce that {P_n(x|y, ρ, q)}_{n≥−1} constitutes an orthogonal base of CL(y, ρ, q). Thus in particular 0 = ∫_{S(q)} P_1(x|y, ρ, q) f_CN(x|y, ρ, q) dx = E(X|Y = y) − ρy. Consequently, if Y also has a q-Normal distribution, then EXY = ρ.
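The linearization identity (2.4), H_nH_m = Σ_j (n choose j)_q (m choose j)_q [j]_q! H_{n+m−2j}, is easy to check numerically at an arbitrary point; the sketch below (helper names ours) compares both sides for all n, m ≤ 4:

```python
from math import prod

def qnum(n, q):
    """[n]_q = 1 + q + ... + q^(n-1)."""
    return sum(q**k for k in range(n))

def qfact(n, q):
    """[n]_q! with [0]_q! = 1."""
    return prod(qnum(k, q) for k in range(1, n + 1))

def qbinom(n, j, q):
    """q-binomial coefficient (n choose j)_q."""
    return qfact(n, q) / (qfact(j, q) * qfact(n - j, q))

def H(n, x, q):
    """H_n(x|q) via the recurrence H_{n+1} = x H_n - [n]_q H_{n-1}."""
    h_prev, h = 0.0, 1.0
    for k in range(n):
        h_prev, h = h, x * h - qnum(k, q) * h_prev
    return h

def linearization_rhs(n, m, x, q):
    """Right-hand side of identity (2.4)."""
    return sum(qbinom(n, j, q) * qbinom(m, j, q) * qfact(j, q) * H(n + m - 2 * j, x, q)
               for j in range(min(n, m) + 1))

x0, q0 = 1.234, 0.3
checks = [(H(n, x0, q0) * H(m, x0, q0), linearization_rhs(n, m, x0, q0))
          for n in range(5) for m in range(5)]
```

For instance, at n = m = 1 the identity reads x² = H_2(x|q) + 1, which holds since H_2(x|q) = x² − 1.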

Multidimensional q−Normal and related distributions.
Before we present the definition of the multidimensional q-Normal and related distributions, let us generalize the two one-dimensional distributions discussed above by introducing the (m, σ², q)-Normal distribution as the distribution with density f_N((x − m)/σ|q)/σ. Similarly, let us extend the definition of the (y, ρ, q)-Conditional Normal distribution by introducing, for m ∈ R, σ > 0, q ∈ (−1, 1], |ρ| < 1, the (m, σ², y, ρ, q)-Conditional Normal distribution as the distribution whose density is equal to f_CN((x − m)/σ|y, ρ, q)/σ.
Let m, σ ∈ R^d and ρ ∈ (−1, 1)^{d−1}, q ∈ (−1, 1]. Now we are ready to introduce the multidimensional q-Normal distribution N_d(m, σ², ρ|q) as the distribution with density

f(x_1, …, x_d) = f_N((x_d − m_d)/σ_d | q)/σ_d ∏_{i=1}^{d−1} f_CN((x_i − m_i)/σ_i | (x_{i+1} − m_{i+1})/σ_{i+1}, ρ_i, q)/σ_i.

As an immediate consequence of the definition we see that supp(N_d(m, σ², ρ|q)) = m + S(q). One can also easily see that m is a shift parameter and σ is a scale parameter; hence in particular EX = m. In the sequel we will mostly be concerned with the distributions N_d(0, 1, ρ|q).
In other words, "lack of correlation means independence" in the case of multidimensional q-Normal distributions. More generally, if the sequence ρ = (ρ_1, …, ρ_{d−1}) contains, say, r zeros at, say, positions t_1, …, t_r, then the distribution N_d(0, 1, ρ|q) is a product of r + 1 independent multidimensional q-Normal distributions. Thus in the sequel all considered vectors ρ will be assumed to contain only nonzero elements.
Let us introduce the generating functions of the two families of polynomials, Σ_{n≥0} (t^n/[n]_q!) H_n(x|q) and Σ_{n≥0} (t^n/[n]_q!) P_n(x|y, ρ, q). The basic properties of the discussed distributions are collected in the following Lemma, which contains facts taken mostly from [7] and the paper [3].
ii) For n ≥ 0: … iii) For n, m ≥ 0: … and the convergence is absolute in t, y and x, and uniform in x and y.
we reduce our considerations to the case N_d(0, 1, ρ|q). First let us consider the (d−1)-dimensional marginal distributions. The assertion of the Corollary is then obviously true, since we have assertion iv) of Lemma 1. We can repeat this reasoning and deduce that all (d−2)-, (d−3)-, …, 2-dimensional marginal distributions are multidimensional q-Normal. The fact that the 1-dimensional marginal distributions are q-Normal follows from the fact that f_CN(y|x, ρ, q) is a one-dimensional density and integrates to 1.
Thus in particular this density depends only on X_{j_k} and X_{j_m}.
Proof. i) As before, by a suitable change of variables we can work with the distribution N_d(0, 1, ρ|q). Then, following assertion iii) of Lemma 1 and the facts that the m-dimensional marginal with respect to which we have to integrate is also multidimensional q-Normal and that the last factor in the product representing the density of this distribution is f_CN(x_i | x_{j_m}, ∏_{k=j_m}^{i−1} ρ_k, q), we get i). ii) First of all notice that the conditional distribution of X_i given (X_{j_1}, …, X_{j_k}, X_{j_m}, …, X_{j_h}) depends only on x_{j_k}, x_i, x_{j_m}, since the sequence X_i, i = 1, …, n, is Markov. It is also obvious that the density of this distribution exists and can be found as the ratio of the joint density of (X_{j_k}, X_i, X_{j_m}) and the joint density of (X_{j_k}, X_{j_m}). Keeping in mind that X_{j_k}, X_i, X_{j_m} have the same marginal f_N, and because of assertion iv) of Lemma 1, we get the postulated form.
Proof. Firstly observe that (2.11) holds, which is elementary to prove. We will use a modification of formula (2.11), obtained from it by dividing both sides by f_N(x|q). Now we use (2.5) and assertion v) of Lemma 1 and get, for all x, y ∈ S(q), the asserted bound for every g ∈ L(q); thus g ∈ CL(y, ρ, q). Conversely, take a function g ∈ CL(y, ρ, q). Since the relevant expression is a quadratic function of x, we deduce that it attains its maximum over x ∈ S(q) at the endpoints of S(q). Since for all y ∈ S(q) and |ρ|, |q| < 1 the resulting bound is finite, we see that g ∈ L(q).

Remark 2.
Notice that the assertion of Proposition 2 is not true for q = 1, since then the respective densities are those of N(0, 1) and N(ρy, 1 − ρ²).
Remark 3. Using the assertion of Proposition 2 we can rephrase Corollary 2 in terms of the contraction R(ρ, q) (defined by (2.12) below). For g ∈ L(q) we have E(g(X_i)|X_{j_1}, …, X_{j_m}) = (R(∏_{k=j_m}^{i−1} ρ_k, q) g)(X_{j_m}), where R(ρ, q) is a contraction on the space L(q) defined by the formula (using the polynomials H_n, for |ρ|, |q| < 1):

(2.12)   R(ρ, q)g = Σ_{i≥0} c_i ρ^i H_i(·|q)  for g = Σ_{i≥0} c_i H_i(·|q) ∈ L(q).

Incidentally, it is known that R is not only a contraction but also an ultracontraction, i.e. it maps L² into L^∞ (Bożejko).
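In terms of q-Hermite coefficients the operator R(ρ, q) is diagonal, so it is trivial to implement; a small illustrative sketch (names ours), assuming the expansion g = Σ c_iH_i and the orthogonality relation E H_i² = [i]_q!, which makes ‖g‖_L² = Σ c_i²[i]_q!; scaling c_i by ρ^i with |ρ| < 1 can only decrease this norm, which is the contraction property:

```python
from math import prod

def qnum(n, q):
    return sum(q**k for k in range(n))

def qfact(n, q):
    return prod(qnum(k, q) for k in range(1, n + 1))

def apply_R(coeffs, rho):
    """R(rho, q) acting on g = sum_i coeffs[i] H_i(.|q):
    the i-th q-Hermite coefficient is simply multiplied by rho^i."""
    return [c * rho**i for i, c in enumerate(coeffs)]

def norm_sq(coeffs, q):
    """||g||_L^2 = sum_i c_i^2 [i]_q!  (orthogonality of the H_i)."""
    return sum(c**2 * qfact(i, q) for i, c in enumerate(coeffs))

q, rho = 0.6, 0.5
g = [1.0, -2.0, 0.5, 3.0]   # coefficients of g in the basis {H_i}
Rg = apply_R(g, rho)
```

Note that the constant term c_0 is left unchanged, as it must be since R preserves expectations.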
We also have the following almost obvious observation that follows, in fact, from assertion iii) of Lemma 1.
i) If E(g(X_i)|X_{j_1}, …, X_{j_m}) is a polynomial of degree at most n in X_{j_m}, then the function g must also be a polynomial of degree at most n.
ii) If additionally Eg(X_i) = 0, then …

Proof. i) The fact that E(g(X_i)|X_{j_1}, …, X_{j_m}) is a function of X_{j_m} only is obvious. Since g ∈ L(q) we can expand it in the series g(x) = Σ_{s≥0} c_s H_s(x|q). By Corollary 2 we know that E(g(X_i)|X_{j_1}, …, X_{j_m}) = Σ_{s≥0} c_s r^s H_s(X_{j_m}|q) for r = ∏_{k=j_m}^{i−1} ρ_k. Now, since c_s r^s = 0 for s > n and r ≠ 0, we deduce that c_s = 0 for s > n. ii)

Remark 4.
As follows from the above-mentioned definition, the multidimensional q-Normal distribution is not a complete generalization of the n-dimensional Normal law N_n(m, Σ). It is a generalization of the distribution N_n(m, Σ) with a very specific matrix Σ, namely with entries equal to σ_ii = σ_i², σ_ij = σ_i σ_j ∏_{k=i}^{j−1} ρ_k for i < j, and σ_ij = σ_ji for i > j, where σ_i, i = 1, …, n, are some positive numbers and |ρ_i| < 1, i = 1, …, n − 1.
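The covariance structure described in this Remark is easy to generate; a sketch (names and sample parameters ours) building Σ with σ_ij = σ_iσ_j ∏_{k=i}^{j−1} ρ_k and confirming that the resulting matrix is a valid (positive definite) covariance matrix:

```python
import numpy as np

def q_normal_cov(sigma, rho):
    """Covariance matrix with sigma_ij = sigma_i sigma_j prod_{k=i}^{j-1} rho_k (i <= j)."""
    n = len(sigma)
    S = np.empty((n, n))
    for i in range(n):
        for j in range(i, n):
            r = float(np.prod(rho[i:j]))  # empty product = 1 when j == i
            S[i, j] = S[j, i] = sigma[i] * sigma[j] * r
    return S

S = q_normal_cov([1.0, 2.0, 1.5], [0.5, -0.3])
```

This is exactly the covariance of a Markov (AR-type) chain: the correlation between coordinates i and j is the product of the one-step correlations between them, which guarantees positive definiteness for |ρ_k| < 1.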
We have an immediate observation that follows from assertion vi) of Lemma 1.
Proof. i) Using assertions vi) and viii) of Lemma 1, and then utilizing assertion ii) of the same Lemma, we get i). To get ii) we utilize assertion vi) of Lemma 1.
Assertion i) of Proposition 4 leads to a generalization of the multidimensional q-Normal distribution that allows different one-dimensional and other marginals.
We have an immediate observation that follows from assertion vii) of Lemma 1. Proposition 6. Let X ∼ (y, ρ, t, q)-MCN. Then for n ∈ N: E(P_n(X|y, ρ, q)) = ρ^{2n} t^n. Hence in particular one can state the following Corollary.
We can define two formulae for densities of multidimensional distributions in R^d. Namely, one of them would have a density of the form … and the other of the form …. However, finding the marginals of such families of distributions is a challenge and an open question. In particular, are they also of modified Conditional Normal type?

Main Results
In this section we are going to study properties of the 3-dimensional case of the multidimensional q-Normal distribution. To simplify notation we will consider the vector (Y, X, Z) having distribution N_3((0, 0, 0), (1, 1, 1), (ρ_1, ρ_2)|q), that is, having density f_CN(y|x, ρ_1, q) f_CN(x|z, ρ_2, q) f_N(z|q). We start with the following obvious result: Proof. It is in fact a rewritten version of the proof of assertion ii) of Corollary 2.
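The Markov structure behind this density can be checked numerically through the Chapman-Kolmogorov identity ∫_{S(q)} f_CN(y|x, ρ_1, q) f_CN(x|z, ρ_2, q) dx = f_CN(y|z, ρ_1ρ_2, q), which is exactly what makes such a triple Markov. The sketch below is ours; it assumes the product forms of f_N and f_CN as we read them from [4] and [1], with an illustrative truncation level:

```python
import numpy as np

K = 80  # truncation level for the infinite products

def f_N(x, q):
    """q-Normal density on S(q) (truncated product)."""
    x = np.asarray(x, dtype=float)
    s = np.clip(4.0 - (1.0 - q) * x**2, 0.0, None)
    prod_, qinf = np.ones_like(x), 1.0
    for k in range(1, K + 1):
        qk = q**k
        qinf *= 1.0 - qk
        prod_ *= (1.0 + qk)**2 - (1.0 - q) * x**2 * qk
    return np.sqrt(1.0 - q) * qinf / (2.0 * np.pi) * np.sqrt(s) * prod_

def f_CN(x, y, rho, q):
    """(y, rho, q)-Conditional Normal density (truncated product)."""
    x, y = np.broadcast_arrays(np.asarray(x, dtype=float), np.asarray(y, dtype=float))
    ratio = np.ones_like(x)
    for k in range(K):
        qk, q2k = q**k, q**(2 * k)
        den = ((1.0 - rho**2 * q2k)**2
               - (1.0 - q) * rho * qk * (1.0 + rho**2 * q2k) * x * y
               + (1.0 - q) * rho**2 * q2k * (x**2 + y**2))
        ratio = ratio * (1.0 - rho**2 * qk) / den
    return f_N(x, q) * ratio

q, r1, r2, y, z = 0.4, 0.5, -0.3, 0.6, -0.9
a = 2.0 / np.sqrt(1.0 - q)
xs, dx = np.linspace(-a, a, 4000, endpoint=False, retstep=True)
xs = xs + dx / 2  # midpoint rule over S(q)
lhs = float(np.sum(f_CN(y, xs, r1, q) * f_CN(xs, z, r2, q)) * dx)
rhs = float(f_CN(y, z, r1 * r2, q))
```

The identity follows from the Poisson-Mehler expansion of f_CN/f_N and the orthogonality of the q-Hermite polynomials, so the two numbers should agree up to discretization and truncation error.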
Remark 9. Assertion i) of Theorem 1 is in fact a generalization of the Poisson-Mehler formula (that is, of assertion viii) of Lemma 1) to the Askey-Wilson density.
Remark 10. Notice also that for q = 1, φ is a density of a normal distribution, and it is obvious that the expectation of any polynomial with respect to it is a polynomial in the conditioning variables. Hence it turns out that this is true for all q-Normal distributions, q ∈ (−1, 1]. As a Corollary we have the following result, with constants (depending only on n, q, ρ and the numbers i, j_k, j_m) A^{(n)}_{r,s}, r = 0, …, ⌊n/2⌋, s = −⌊n/2⌋ + r, …, −⌊n/2⌋ + r + n − 2r.
In particular: Remark 11. Notice that in general the conditional variance var(X_i|X_1, …, X_{i−1}, X_{i+1}, …, X_d) is not nonrandom, indicating that the q-Normal distribution does not behave like the Normal one in this respect; however, setting q = 1 makes it nonrandom, as it should be in the normal case.
Notice that, examining the form of the coefficients A^{(n)}_{m,k} for n = 1, …, 4, we can formulate the following hypothesis concerning their general form: each of them is a polynomial in q with coefficients depending only on r and l.

Proofs
The proof of Theorem 1 is based on the properties of the following function:

G_{l,k}(y, z, t|q) = Σ_{m≥0} (t^m/[m]_q!) H_{m+l}(y|q) H_{m+k}(z|q).

We will need some of its properties; namely, we will prove the following Proposition, which is in fact a generalization and reformulation (in terms of the polynomials H_n) of an old result of Carlitz. Proposition 7. i) For all k, l ≥ 0: G_{k,l}(y, z, t|q) = G_{l,k}(z, y, t|q); ii) for 1 ≤ j ≤ k: … iii) … where

τ_{k,i}(y, z, t|q) = H_{k−i}(y|q) G_{0,i}(y, z, t|q) + (−1)^k q^{k(k−1)/2} t^k H_{k−i}(z|q) G_{i,0}(y, z, t|q).
Proof. i) is obvious. iii) Take j = k and l = 0 in ii).
ii) To prove (4.1) we will use the formula

H_{n+m}(x|q) = H_n(x|q) H_m(x|q) − Σ_{j=1}^{min(n,m)} (n choose j)_q (m choose j)_q [j]_q! H_{n+m−2j}(x|q).

We have … Hence let us assume that (4.1) is true for j = 1, 2, …, m. We have, after applying the just obtained formula (for G_{k,l}) with k → k − m and l → m + l: … iv) For k = 0 the claim is obviously true. Now let us iterate (4.2) once, applied however to G_{0,k}. We will then get (4.3). Thus we see that, since for all i ≤ k − 1 the functions G_{i,0} and G_{0,i} are of the claimed form, it follows from (4.3) that G_{k,0} is of the claimed form. Now we are ready to present the proof of Theorem 1.
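The function G_{l,k} is easy to tabulate from its truncated series, which gives a quick numerical check of assertion i) of Proposition 7, the symmetry G_{k,l}(y, z, t|q) = G_{l,k}(z, y, t|q); the sketch below (truncation level and names ours) evaluates it at a sample point:

```python
from math import prod

def qnum(n, q):
    return sum(q**j for j in range(n))

def qfact(n, q):
    return prod(qnum(j, q) for j in range(1, n + 1))

def H(n, x, q):
    """H_n(x|q) via H_{n+1} = x H_n - [n]_q H_{n-1}."""
    h_prev, h = 0.0, 1.0
    for j in range(n):
        h_prev, h = h, x * h - qnum(j, q) * h_prev
    return h

def G(l, k, y, z, t, q, terms=60):
    """Truncated G_{l,k}(y,z,t|q) = sum_{m>=0} t^m/[m]_q! H_{m+l}(y|q) H_{m+k}(z|q)."""
    return sum(t**m / qfact(m, q) * H(m + l, y, q) * H(m + k, z, q)
               for m in range(terms))

y0, z0, t0, q0 = 0.8, -0.5, 0.3, 0.6
```

For |t| < 1 and arguments inside S(q) the series converges geometrically, so 60 terms are far more than enough at these sample values.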
Proof of Theorem 1. To prove i) we will use formula viii) of Lemma 1, that is, the Poisson-Mehler expansion formula. Following (3.1) we see that … First, let us concentrate on the quantity: … We will apply identity (2.4), distinguish the two cases n + m even and n + m odd, denote n + m − 2j = 2k or n + m − 2j = 2k + 1 depending on the case, and sum over the set {(n, m) : n + m − 2k ≤ 2 min(n, m), m, n ≥ 0} ∪ {(n, m) : n + m − 2k − 1 ≤ 2 min(n, m), m, n ≥ 0}. We have … We then get … Using the function G_{l,k}(y, z, t|q) = Σ_{m≥0} (t^m/[m]_q!) H_{m+l}(y|q) H_{m+k}(z|q) introduced in Proposition 7, we can express both of these quantities in the form

(H_n(x|q)/[n]_q!) Σ_{i=0}^n (n choose i)_q ρ_1^i ρ_2^{n−i} G_{i,n−i}(y, z, ρ_1ρ_2|q).

Our Theorem will be proved if we are able to show that for all l, k ≥ 0: G_{l,k}(y, z, t|q) = G_{0,0}(y, z, t|q) Θ_{l,k}(y, z, t|q), where Θ_{l,k} is a polynomial of degree l in y and degree k in z. This fact follows by induction from formula (4.3) of assertion iv) of Proposition 7, since that formula expresses G_{k,0} in terms of the k functions G_{i,0} and G_{0,i}, i = 0, …, k − 1, and since all G_{l,k} can be expressed through the G_{i,0} and G_{0,i}, i ≤ k + l.
Proof of Corollary 5. By Theorem 1 we know that the regression E(H_n(X_i|q)|X_{j_1}, …, X_{j_k}, X_{j_m}, …, X_{j_h}) is a polynomial in X_{j_k} and X_{j_m} of order at most n. To analyze the structure of this polynomial, let us present it in the form Σ_{s=0}^n a_{s,n} H_s(X_{j_m}|q), where the coefficients a_{s,n} are some polynomials in X_{j_k}. Now let us take the conditional expectation with respect to X_{j_1}, …, X_{j_k} of both sides. On one hand we get … Since the a_{s,n} are polynomials in X_{j_k} of order at most n, we can present them in the form a_{s,n} = Σ_{t=0}^n β_{t,s} H_t(x_{j_k}|q).
Thus we have the equality: … Now we use identity (2.4) and get … Hence we deduce that β_{t,s} = 0 for t + s > n and for t + s = n − 1, n − 3, …. To count the number of coefficients A^{(n)}_{j,k}, observe that we have n + 1 coefficients A^{(n)}_{0,k}, since k ranges from −⌊n/2⌋ to −⌊n/2⌋ + n; n − 1 coefficients A^{(n)}_{1,k}, where k ranges from −⌊n/2⌋ + 1 to −⌊n/2⌋ + n − 1; and so on.
Proof of Corollary 6. The proof is based on the idea of writing down a system of ⌊(n+2)/2⌋ ⌊(n+3)/2⌋ (n = 1, …, 4) linear equations satisfied by the coefficients A^{(n)}_{m,k}. These equations are obtained according to a similar pattern. Namely, we multiply both sides of identity (3.2) by H_m(X_{i−1}) and H_k(X_i) and calculate the conditional expectation of both sides with respect to F_{<i} or with respect to F_{>i}. We expand both sides with respect to H_s, s = n + m + k − 2t, t = 0, …, ⌊(n + m + k)/2⌋. On the way we utilize (2.4) and compare the coefficients standing by H_s on both sides. Thus each obtained equation involving the coefficients A^{(n)}_{i,j} can be indexed by s, m, j and r if the conditional expectation is calculated with respect to F_{<i}, or by l if it is calculated with respect to F_{>i}. Of course, if s = 0 then r and l lead to the same result. Formulae for A^{(n)}_{i,j} for n = 1 are obtained by taking s = 1, m = 0, j = 0 and applying r and l. For n = 2, first we consider m = 0, j = 0 and s = 2, applying r and l, and then m = 0, j = 0 and s = 0. In this way we get 3 equations. The fourth one is obtained by taking m = 0, j = 1, s = 1 and r. Formulae for A^{(n)}_{m,k} for n = 1, 2 can also be obtained from formulae scattered in the literature, e.g. [1], [12] or [19]. To get the equations satisfied by the coefficients A^{(n)}_{i,j} for n = 3, 4: the first n + 1 equations are obtained by taking m = 0, k = 0 and s = 3, 1 if n = 3, and s = 4, 2, 0 if n = 4, and then applying the operations r and l. Then, in order to get the remaining 2 (in the case n = 3) or 4 (in the case n = 4) equations, one has to be more careful, since it often turns out that many equations obtained for some m and k are linearly dependent on the previously obtained ones. In the case n = 3, to get the remaining two linearly independent equations we took m = 2, k = 0, s = 3 and applied the operations r and l. In this way we obtained a system of 6 linear equations in the 6 unknown coefficients A^{(3)}_{m,k}.
For n = 4 the remaining 4 equations were obtained by taking: (m = 1, k = 4, s = 3, r), (m = 4, k = 1, s = 1, r), (m = 4, k = 2, s = 4, r) and (m = 2, k = 4, s = 4, r). Recall that in this case we have 9 equations, so the matrix of the system has 81 entries. That is why we skip writing down the whole system of equations. To give a sense of how complicated these equations are, we present one of them. For n = 4, the equation referring to the case m = 4, k = 1, s = 1, r reads:

[2]_q² (1 + q²[3]_q)² ρ_1³ ρ_2³ (1 + q + q² + q³ + (1 + q + q² + q³ + q⁴) ρ_{i−1}² ρ_i²) A^{(4)}_{…} = …