Multivariate records based on dominance

We consider three types of multivariate records in this paper and derive the mean and the variance of their numbers for independent and uniform random samples from two prototype regions: hypercubes $[0,1]^d$ and $d$-dimensional simplex. Central limit theorems with convergence rates are established when the variance tends to infinity. Effective numerical procedures are also provided for computing the variance constants to high degree of precision.


Introduction
While the one-dimensional records (or record-breakings, left-to-right maxima, outstanding elements, etc.) of a given sample have been the subject of research and development for more than six decades, considerably less is known for multidimensional records. One simple reason being that there is no total ordering for multivariate data, implying no unique way of defining records in higher dimensions. We study in this paper the stochastic properties of three types of records based on the dominance relation under two representative prototype models. In particular, central limit theorems with convergence rates are proved for the number of multivariate records when the variance tends to infinity, the major difficulty being the asymptotics of the variance.
Dominance and maxima. A point p ∈ R d is said to dominate another point q ∈ R d if p − q has only positive coordinates, where the dimensionality d ≥ 1. Write q ≺ p or p ≻ q. The nondominated points in the set {p 1 , . . . , p n } are called maxima. Maxima represent one of the most natural and widely used partial orders for multidimensional samples when d ≥ 2, and have been thoroughly investigated in the literature under many different guises and names (such as admissibility, Pareto optimality, elites, efficiency, skylines, . . . ); see [1,4] and the references therein.
Chain records. Yet another type of records of multi-dimensional samples introduced in [14] is the chain record p 1 ≺ p i 1 ≺ p i 2 ≺ · · · ≺ p i k , where 1 < i 1 < i 2 < · · · < i k and there are no p j ≻ p ia with i a < j < i a+1 or i a < j ≤ n. See Figure 1 for an illustration of the three different types of records. Some known results and comparisons. If we drop the restriction of order, then the largest subset of indices such that is equal to the number of maximal layers (maxima being regarded as the first layer, the maxima of the remaining points being the second, and so on). Assuming that {p 1 , . . . , p n } are iud in the hypercube [0, 1] d , Gnedin [14] proved that the number of chain records Y n is asymptotically Gaussian with mean and variance asymptotic to see Theorem4 for an improvement. The author also derived exact and asymptotic formulas for the probability of a chain record P(Y n > Y n−1 ) and discussed some point-process scaling limits. The behavior of the record sequence (1) in R 2 are studied in Goldie and Resnick [16], Deuschel and Zeitouni [8]. The position of the points converges in probability to a (or a set of) deterministic curve(s). Deuschel and Zeitouni [8] also proved a weak law of large number for the longest increasing subsequence, extending a result by Vershik and Kerov [21] to a non-uniform setting; see also the breakthrough paper [3]. A completely different type of multivariate records based on convex hulls was discussed in [20].
Chain records can in some sense be regarded as uni-directional Parero records, and thus lacks the multi-directional feature of Pareto records. The asymptotic analysis of the moments is in general simpler than that for the Pareto records. On the other hand, it is also this aspect that the chain records reflect better the properties exhibited by the one-dimensional records.
Interestingly, the chain records correspond to the "left-arm" (starting from the root by always choosing the subtree corresponding to the first quadrant) of quadtrees; see [5,10]  A summary of results. We consider in the paper the distributional aspect of the above three types of records in two typical cases when the p i 's are iud in the hypercube [0, 1] d and in the d-dimensional simplex, respectively. Briefly, hypercubes correspond to situations when the coordinates are independent, while the d-dimensional simplex to that when the coordinates are to some extent negatively correlated. The hypercube case has already been studied in [14]; we will discuss this briefly by a very different approach. In addition to the asymptotic normality for the number of Pareto records in the d-dimensional simplex, our main results are summarized in the following table, where we list the asymptotics of the mean (first entry) and the variance (second entry) in each case.
X X X X X X X X X X X X  (4), and both (15) and (16) are bounded in n and in d; see Figure 3. From this table, we see clearly that the three types of records behave very differently, although they coincide when d = 1. Roughly, the number of dominating records is bounded (indeed less than two on average) in both models, while the chain records have a typical logarithmic quantity; and it is the Pareto records that reflect better the variations of the underlying models. Organization of the paper. We derive asymptotic approximations to the mean and the variance for the number of Pareto records in the next section. Since the expression for the leading coefficient of the asymptotic variance is very messy, we then address in Section 3 the numerical aspect of this constant. The tools we used turn out to be also useful for several other constants of similar nature, which we briefly discuss. We then discuss the chain records and the dominating records.

Asymptotics of the number of Pareto records
denote the d-dimensional simplex, where x := x 1 + · · ·+ x d . Assume that p 1 , . . . , p n are iud in S d . Let X n denote the number of Pareto records of {p 1 , . . . , p n }. We derive in this section asymptotic approximations to the mean and the variance and a Berry-Esseen bound for X n . The same method of proof also applies to the number of maxima, denoted by M n , which we will briefly discuss. Let q 1 , . . . , q n be iud in S d × [0, 1]. As discussed in Introduction, the distribution of X n is equivalent to the distribution of the number of maxima of {q 1 , . . . , q n }.
For notational convenience, denote by a n ≃ b n if a n = b n + O n −1/d .

Theorem 1
The mean and the variance of the number of Pareto maxima X n in random samples from the d-dimensional simplex satisfy Proof. The method of proof is similar to that given in [1], but the technicalities are more involved. We start with the expected value of X n . Let This proves (2). For the variance, we start from the second moment, which is given by . Let A be the region in R d × [0, 1] such that q 1 and q 2 are incomparable (neither dominating the other). Write q 1 = (x, u), q 2 = (y, v), x * := ( x ∧ 1) and x ∨ y := (x 1 ∨ y 1 , · · · , x d ∨ y d ) .
Then by standard majorization techniques (see [1]) Consider first J n,ℓ , 1 ≤ ℓ < d. We proceed by four changes of variables to simplify the integral starting from We then perform the change of variables Finally, we "linearize" the integrals by the change of variables since the change of variables produces the factors Proceeding in a similar manner for J n,d , we deduce that Similarly, for J n,0 , we get The change of variables x → ξ i , y → η i then yields This completes the proof of the theorem.
Remark. By the same arguments, we derive the following asymptotic estimates for the number of maxima in S d . (4)

Theorem 2 The number of Pareto records in iud samples from d-dimensional simplex is asymptotically normal with a rate given by
where Φ(x) denotes the standard normal distribution.
Proof. Define the region Let X n denote the number of maxima in D n and X n the number of maxima of a Poisson process on D n with intensity d!n. Then .
We prove that the four terms on the right-hand side of (6) all satisfy the O-bound in (5). For the first term, we consider the probability P X n = X n ≤ nP (q 1 / ∈ D n and q 1 is a maxima) For the second term on the right-hand side of (6), we use a Poisson process approximation To bound the third term, we use Stein's method similar to the proof for the case of hypercube given in [1] and deduce that where Q n is the error term resulted from the dependence between the cells decomposed and Finally, the last term in (6) is bounded above as follows.

Remark. By defining
instead and by applying the same arguments, we deduce the Berry-Esseen bound for the number of maxima in iud samples from S d

Numerical evaluations of the leading constants
The leading constants v d (see (3)) andṽ d (see (4)) appearing in the asymptotic approximations to the variance of X n and to that of M n are not easily computed via existing softwares. We discuss in this section more effective means of computing their numerical values to high degree of precision. Our approach is to first apply Mellin transforms (see [9]) and derive series representations for the integrals by standard residue calculations and then convert the series in terms of the generalized hypergeometric functions The resulting linear combinations of hypergeometric functions can then be computed easily to high degree of precision by any existing symbolic softwares even with a mediocre laptop.
The leading constant v d of the asymptotic variance of the d-dimensional Pareto records.
We consider the following integrals We start from the simplest one, I d,0 and use the integral representation for the exponential function where c > 0, ℜ(t) > 0 and the integration path (c) is the vertical line from c − i∞ to c + i∞. Substituting this representation into I d,0 , we obtain Making the change of variables y → xy yields where 1 < c < 1 + 1 d . Moving the integration path to the right, one encounters the simple poles at s = 1 + 1 d + j for j = 0, 1, . . . . Summing over all residues of these simple poles and proving that the remainder integral tends to zero, we get where the terms converge at the rate j −d−1+ 1 d . This can be expressed easily in terms of the generalized hypergeometric functions.
An alternative integral representation can be derived for I d,0 as follows.
which can also be derived directly from the original multiple integral representation and successive changes of variables (first u, then x, then v, and finally y). In particular, for d = 2, Now we turn to I d,d .
By the same arguments used above, we have where c > 1 and To evaluate I ′ d,d , assume first that 1 d < ℜ(s) < 1, so that Now the right-hand side is well-defined for 1 d < ℜ(s) < 2. Substituting this into I d,d , we obtain where 1 < c < 1 + 1 d . For computational purpose, we use the functional equation for Gamma function Γ(1 − s)Γ(s) = π sin πs , so that ds.
In this case, we have simples poles at s = j + 1/d for both integrands and s = j for the first integrand to the right of ℜ(s) = 1 for j = 2, 3, . . . . Thus summing over all the residues and proving that the remainder integral goes to zero, we obtain .
A similar argument as that used for I d,0 gives the alternative integral representation In particular, for d = 2, Now we consider I d,m for 1 ≤ m < d.
Note that each term has no pole at s = 1 − ℓ d . Thus We then deduce that the integral equals the sum of the residues at s = j + 1 d and s = j d,m,ℓ + I [2] d,m,ℓ + I [3] d,m,ℓ .
Accordingly, C [2] d = 0 for odd values of d. For the other two sums containing I [1] d,m,ℓ and I [3] d,m,ℓ , we use the identity Then Note that for large j, so that the series is absolutely convergent. Similarly, Since In particular, v 2 has a closed-form expression The leading constantṽ d of the asymptotic variance of the d-dimensional maxima. Let Then (see (4))ṽ Consider first J d,0 . By expanding (1 + x d ) −1− 1 d , interchanging and evaluating the integrals, we obtain the general terms converging at the rate O(j −d− 1 d ). The convergence rate can be accelerated as follows.
the convergence rate being now exponential. In terms of the generalized hypergeometric functions, we have The integrals J d,k can be simplified as follows.
Yet another constant in [6]. A similar but simpler integral to (4) appeared in [6], which is of the form . By Mellin inversion formula for e −t , we obtain Expanding the factor (u + w) d−2 , we obtain Here Thus we obtain . This readily gives, by converting the above series into hypergeometric functions, the numerical values of the first few K d , These are consistent with those given in Chiu and Quine (1997). In particular, K 2 = 1 4 √ π log 2.
Further simplification of this formula can be obtained as above, but the resulting integral expression is not much simpler than

Asymptotics of the number of chain records
We consider in this section the number of chain records of random samples from d-dimensional simplex; the tools we use are different from [14] and apply also to chain records for hypercube random samples, which will be briefly discussed. For other types of results, see [14].

Chain records of random samples from d-dimensional simplex
Assume that p 1 , . . . , p n are iud in the d-dimensional simplex S d . Let Y n denote the number of chain records of this sample. Then Y n satisfies the recurrence with Y 0 := 0, where for 0 ≤ k < n. An alternative expression for the probability distribution π n,k is which is more useful from a computational point of view.
where the λ ℓ 's are all complex ( ∈ R), except when d is even (in that case, −d − 1 is the unique real zero among {λ 1 , . . . , λ d−1 }). Interestingly, an essentially the same equation also arises in the analysis of random increasing k-trees; see [7].

Theorem 3
The number of chain records Y n for random samples from d-dimensional simplex is asymptotically normally distributed in the following sense where µ s := 1/(dH d ) and σ s := H The mean and the variance are asymptotic to Exact solution of the general recurrence. In general, consider the recurrence a n = b n + 0≤k<n π n,k a k (n ≥ 1), with a 0 = 0. Then the same approach used above leads to the recurrencẽ ã n +b n +b n+1 , which by iteration gives by defining b 0 =b 0 = 0. Then we obtain the closed-form solution a n = 1≤k≤n n k ã k .
A similar theory of "d-analogue" to that presented in [10] can be developed (by replacing 2 d /j d there by d!/((dj + 1) · · · (dj + d))). However, this type of calculations becomes more involved for higher moments.
Asymptotics of µ n . We now look at the asymptotics of µ n . To that purpose, we need a better expression for the finite product in the sum-expression (13).
In terms of the zeros λ j 's of the equation (z + 1) · · · (z + d) − d!, we have The zeros λ j 's are distributed very regularly as showed in Figure 4. Now we apply the integral representation for the n-th finite difference (called Rice's integrals; see [11]) and obtain Note that φ(s) is well defined and has a simple pole at s = 0. The integrand then has a double pole at s = 0; standard calculations (moving the line of integration to the left and summing the residue of the pole encountered) then lead to where the O-term can be made more explicit if needed. Here H n = 1≤j≤n 1/j denotes the harmonic numbers, and ψ(z) denotes the derivative of log Γ(z). Note that to get this expression, we used the identity The probability generating function. Let P n (y) := E[y Yn ]. Then P 0 (y) = 1 and for n ≥ 1 P n (y) = y 0≤k<n π n,k P k (y).
A very similar analysis as above then leads to a Berry-Esseen bound for Y n as follows. The asymptotic normality (without rate) was already established in [14].
In the special case when d = 2, more explicit expressions are available for n ≥ 1.

Dominating records in the d-dimensional simplex
We consider the mean and the variance of the number of dominating records in this section. Let Z n denote the number of dominating records of n iud points p 1 , . . . , p n in the ddimensional simplex S d .

Theorem 5
The mean and the variance of the number of dominating records for iud random samples from the d-dimensional simplex are given by and we obtain (16). For large n, the right-hand side of (16) converges to at an exponential rate, which, for large d, is asymptotic to 3 √ πd 4 −d . This explains the curves corresponding to Z n in Figure 3.
The proof for the dominating records in hypercubes is similar and omitted.